Object Recognition in Images Slides originally created by Bernd Heisele.

48
Object Recognition in Images Slides originally created by Bernd Hei

Transcript of Object Recognition in Images Slides originally created by Bernd Heisele.

Object Recognition in Images

Slides originally created by Bernd Heisele

Pattern Recognition for VisionFall 2002

Object Detection Object Identification

Where is Jane?Where is a Face?Is there a face in the image?

Who is it?Is it Jane or Erik?

Object Recognition

Pattern Recognition for VisionFall 2002

Applicable to many classes of objects

Invariance:•“External parameters”

•Pose•Illumination

•“Internal parameters”•Person identity•Facial expression

General Problems of Recognition

Pattern Recognition for VisionFall 2002

Task

Given an input image, determine if there areobjects of a given class (e.g. faces, people, cars..)in the image and where they are located.

Object Detection

Pattern Recognition for VisionFall 2002

Detection—Problems

1. Classifier must generalize over all exemplars of one class.

2. Negative class consists of everything else.

3. High accuracy (small FP rate) required for most applications.

Pattern Recognition for VisionFall 2002

Feature Extraction

Search for faces atdifferent resolutions and locations

Feature vector (x1, x2 ,…, xn)

Classifier

Off-line training

Face examples

Non-face examples Pixel pattern

Classification Result

Face Detection – basic scheme

Pattern Recognition for VisionFall 2002

Training Set

Train Classifier

Labeled Test Set

False Positive

Correct

Classify

Sensitivity

Training and Testing

Pattern Recognition for VisionFall 2002

Gray

Histogram Equalization

Masking19x19

283 [0, 1]

Histogram Equalization

Gradient

x-y-Sobel Filtering

17x17 [0, 1]

Haar WaveletTransform

1740 [0, 1]Wavelets

Normalization

Image Features

Pattern Recognition for VisionFall 2002

ROC Image Features

Pattern Recognition for VisionFall 2002

Classifiers

Pattern Recognition for VisionFall 2002

Real2900 faces + 2900 mirrored faces

3D Morphing:

Synthetic (T. Vetter, Univ. of Freiburg)

Illumination

Rotation

Identity

+ =

Positive Training Data

Pattern Recognition for VisionFall 2002

Real vs. Synthetic

Pattern Recognition for VisionFall 2002

Problem: 1 face in 116.440 examined windows

Initial set of 25.000 non-faces

Retrain Classifier

Add totraining set

Determine false-positives on large set of non-face images

Bootstrapping

Negative Training Data

Pattern Recognition for VisionFall 2002

6.7 FP/image0.9 FP/image

Bootstrapping

Pattern Recognition for VisionFall 2002

Performance of Global Face Detectors

Pattern Recognition for VisionFall 2002

Rotation in the image plane

• Rotation invariant features• Apply 2D rotation to image

Rotation out of image plane

• Component-based classification• Train on rotated faces

Rotation

Pattern Recognition for VisionFall 2002

Single template

Component templates

Global vs. Components

Pattern Recognition for VisionFall 2002

Eyes Nose Mouth

1st Level:Components

classify

2nd Level:Geometrical relation between components

maximum responseof each component classifier + x, y location

classify

Component-based Detection

Pattern Recognition for VisionFall 2002

Synthetic faces:• 7 different 3-D head models• 2,500 faces Rotation: -30o to + 30o

• 3-D correspondences for automatic location of components

Components:• discriminatory• robust against changes in pose and illumination

Learning Components

Pattern Recognition for VisionFall 2002

Start with small initial regions

Expand into one of four directions

Extract new components from images

Train SVM classifiers

Choose best expansion according to error bound of SVMs

Learning Components—One Way To Do It

Pattern Recognition for VisionFall 2002

x*1

x*2

MR

Feature Space

M

Bound on errorE < c(R / M)

2

Margin, Radius and Expected Error

Cross Validation might be better

Pattern Recognition for VisionFall 2002

Some Examples

Pattern Recognition for VisionFall 2002

• About 40,000 faces

• 68 people

• 13 poses

• 43 illumination conditions

• 4 different expressions

Faces have been manually labeled (only –45o to 45o of rotation)

Test on CMU PIE Database

Pattern Recognition for VisionFall 2002

Face Detection: Component-based vs. Global Approach(5,000 faces 25, 000 non-faces)

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2

FP / inspected window

Co

rrec

t

14 learned components

whole face

Heisele, B., T. Serre, M. Pontil, T. Vetter and T. Poggio. Categorization by Learning and Combining Object Parts. In: Advances in Neural Information Processing Systems (NIPS'01), Vancouver, Canada

ROC Component vs. Global

Pattern Recognition for VisionFall 2002

Components are small, and prone to false detection, even within the face.

Advances on Component-base Face Detection Stan Bileschi

Stan Bileschi

Pattern Recognition for VisionFall 2002

Use the remainder of the face in the negative training set

Positive Negative

Training on Faces

Pattern Recognition for VisionFall 2002

Red:Trained only with faces.

Training on Faces Only

Blue:Trained on facesand non-faces.

Pattern Recognition for VisionFall 2002

Errors

Often, many components classify correctly, with only a few errors

Pattern Recognition for VisionFall 2002

Using Models of Pair-wise Positions

Pattern Recognition for VisionFall 2002

Pair-wise Biasing Leads to Tightened Result Images

Pattern Recognition for VisionFall 2002

Pair-wise Biasing

Pattern Recognition for VisionFall 2002

Application: Eye Detection

Pattern Recognition for VisionFall 2002

Task: Given an image of an object of a particular class (e.g. face) identify which exemplar it is.

B

B

D

DC

A

C

A

Identification

Pattern Recognition for VisionFall 2002

Identification—Problems

1. Multi-class problem

2. Classifier must distinguish between exemplars that might look very similar.

3. Classifier has to reject exemplars that were not in the training database.

Pattern Recognition for VisionFall 2002

Limited information in a single face image

Illumination Rotation

Problems in Face Identification

Pattern Recognition for VisionFall 2002

Face Image

Gray, Gradient, Wavelets, …

Feature extraction

Support Vector Machine, ….

Classifier

Training DataA

Identification Result

System Architecture

Pattern Recognition for VisionFall 2002

Multi-class Classification with SVM

A

Bottom-Up 1vs1

A or B or C or D

C DB

A or BA or B C or DC or D

Training: L (L-1) / 2Classification : L-1

A

B A,C,D

C A,B,D

D A,B,C

1 vs. All

B,C,D

Training: LClassification : L

Pattern Recognition for VisionFall 2002

Global Approach

Detect and extract face

Feed gray values into N SVMs

Classify based on maximum output

SVMA

SVMB

SVMC

SVMD

Max Operation

Identification result

Pattern Recognition for VisionFall 2002

Global Approach with Clustering

SVMC

SVMC

SVMC

SVMC

Max Operation

Identification result

Partition training images of each person into viewpoint-specificclusters

Train a linear SVM on each cluster

Take maximum over all SVM outputs

Pattern Recognition for VisionFall 2002

Component-based Approach

Detect and extract components

Feed gray values of components to N SVMs

Take max. over all SVM outputs

SVMA

SVMB

SVMC

SVMD

Max Operation

Identification Results

Pattern Recognition for VisionFall 2002Component-based Face Recognition with 3D Morphable Models

Feature 1 != Feature 2

Feature 1 = Feature 2

Why Components for Identification?

Jennifer Huang

Pattern Recognition for VisionFall 2002

Heisele, B., P. Ho and T. Poggio. Face Recognition with Support Vector Machines: Global Versus Component-based Approach, International Conference on Computer Vision (ICCV'01), Vancouver, Canada, Vol. 2, 688-694, 2001.

More ROC Curves

Pattern Recognition for VisionFall 2002

3D morphable models

A Morphable Model for the Synthesis of 3D Faces. Blanz, V. and Vetter, T.   SIGGRAPH'99 Conference Proceedings, pp. 187-194

Training data for component-basedface recognition

Morphable Models for Face Identification, Jennifer Huang

Jennifer Huang

Pattern Recognition for VisionFall 2002

Generation of 3D head model from two images

See class on Morphable Models

Morphable Model

Jennifer Huang

Pattern Recognition for VisionFall 2002

Synthetic images are easily rendered from 3D head model under varying illuminations and rotations in depth

Some Training Images

Jennifer Huang

Pattern Recognition for VisionFall 2002

Preliminary Results on Synthetic Images

Jennifer Huang

Pattern Recognition for VisionFall 2002

Problems Encountered:

Detection

Inaccurate Component detection

Recognition

Accuracy of 3D models

Choice of Illumination and Pose

Current Work – Testing on Real Images

Jennifer Huang

Pattern Recognition for VisionFall 2002

Literature

B. Heisele, A. Verri and T. Poggio: Learning and Vision Machines. Proceedings of the IEEE, Visual Perception: Technology and Tools, Vol. 90, No. 7, pp. 1164-1177, 2002.

See also CBCL Web page