Object Recognition from Local Scale-Invariant Features David G. Lowe Presented by Ashley L. Kapron.

Object Recognition from Local Scale-Invariant Features

David G. Lowe

Presented by Ashley L. Kapron

Introduction

• Object Recognition
  – Recognize known objects in unknown configurations

Previous Work

• Zhang et al.
  – Harris corner detection
  – Detect peaks in local image variation

• Schmid and Mohr
  – Harris corner detection
  – Local image descriptor at each interest point, built from an orientation-invariant vector of derivative-of-Gaussian image measurements

Motivation

• Limitations of previous work:
  – Examine image only on a single scale

• Current paper addresses this concern by identifying stable key locations in scale space

• Identify features that are invariant

Invariance

• Illumination

• Scale

• Rotation

• Affine

Scale Space

• Different scales are appropriate for describing different objects in the image, and we may not know the correct scale/size ahead of time.

Difference of Gaussian

1. A = Convolve image with vertical and horizontal 1D Gaussians, σ=sqrt(2)

2. B = Convolve A with vertical and horizontal 1D Gaussians, σ=sqrt(2) (effective smoothing of σ=2 relative to the input)

3. DOG (Difference of Gaussian) = A – B

4. Resample B with bilinear interpolation at a pixel spacing of 1.5 (each new pixel is a linear combination of 4 adjacent pixels); the result is the input to the next pyramid level
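
The four steps above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's implementation: the function names, the kernel truncation at about 3σ, and the edge-replicating border handling are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(sigma):
    # Sampled 1D Gaussian, truncated at about 3 sigma (assumption), normalized to sum to 1.
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def blur(image, sigma=np.sqrt(2)):
    # Separable blur: a horizontal 1D pass over rows, then a vertical pass over columns.
    k = gaussian_kernel(sigma)
    padded = np.pad(image, len(k) // 2, mode="edge")  # border handling is an assumption
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def resample(image, spacing=1.5):
    # Bilinear resampling: each output pixel combines its 4 nearest input pixels.
    h, w = image.shape
    ys = np.arange(0, h - 1, spacing)
    xs = np.arange(0, w - 1, spacing)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    fy = (ys - y0)[:, None]
    fx = (xs - x0)[None, :]
    top = image[y0][:, x0] * (1 - fx) + image[y0][:, x0 + 1] * fx
    bottom = image[y0 + 1][:, x0] * (1 - fx) + image[y0 + 1][:, x0 + 1] * fx
    return top * (1 - fy) + bottom * fy

def dog_level(image):
    a = blur(image)            # step 1
    b = blur(a)                # step 2
    return a - b, resample(b)  # steps 3 and 4

def dog_pyramid(image, levels=3):
    dogs = []
    for _ in range(levels):
        dog, image = dog_level(image)
        dogs.append(dog)
    return dogs
```

Each call to `dog_level` returns one DOG image plus the resampled B image that feeds the next level, so the images shrink by a factor of 1.5 per level.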

Difference of Gaussian Pyramid

[Figure: Input Image, Blur to A1, Blur to B1, Downsample, Blur to A2, Blur to B2, Downsample, Blur to A3, Blur to B3; DOG1 = A1-B1, DOG2 = A2-B2, DOG3 = A3-B3]

Pyramid Example

[Figure: example A, B and DOG images for pyramid levels 1 to 3]

Feature detection

• Find maxima and minima of scale space

• For each point on a DOG level:
  – Compare to 8 neighbors at same level
  – If max/min, identify corresponding point at pyramid level below
  – Determine if the corresponding point is max/min of its 8 neighbors
  – If so, repeat at pyramid level above

• Repeat for each DOG level

• Those that remain are key points
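
The search above can be sketched as below, under stated assumptions. The sketch reads the cross-level test as requiring the candidate's value to exceed (or fall below) the closest pixel at the adjacent level and that pixel's 8 neighbors, which is one interpretation of the slide. Levels are indexed so that a higher index means a coarser image, and the 1.5 resampling factor maps coordinates between levels. The function names are hypothetical.

```python
import numpy as np

def beats_adjacent(dogs, level, other, y, x, value, sign, spacing):
    # Levels outside the pyramid impose no constraint (an assumption of this sketch).
    if other < 0 or other >= len(dogs):
        return True
    # Closest pixel at the other level, accounting for the 1.5x resampling.
    scale = spacing ** (level - other)
    h, w = dogs[other].shape
    oy = min(max(int(round(y * scale)), 1), h - 2)
    ox = min(max(int(round(x * scale)), 1), w - 2)
    patch = dogs[other][oy - 1:oy + 2, ox - 1:ox + 2]
    # The candidate must exceed (or fall below) that pixel and its 8 neighbors.
    return bool(np.all(sign * value > sign * patch))

def find_keys(dogs, spacing=1.5):
    keys = []
    for level, dog in enumerate(dogs):
        h, w = dog.shape
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                value = dog[y, x]
                # The 8 neighbors at the same level (center pixel removed).
                neighbors = np.delete(dog[y - 1:y + 2, x - 1:x + 2].ravel(), 4)
                if value > neighbors.max():
                    sign = 1
                elif value < neighbors.min():
                    sign = -1
                else:
                    continue
                if all(beats_adjacent(dogs, level, other, y, x, value, sign, spacing)
                       for other in (level + 1, level - 1)):
                    keys.append((level, y, x))
    return keys
```

Most pixels fail the cheap same-level test, so the cross-level checks run only for a small fraction of locations.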

Identifying Max/Min

[Figure: adjacent pyramid levels DOG L-1, DOG L and DOG L+1]

Refining Key List: Illumination

• For all levels, use the “A” smoothed image to compute
  – Gradient magnitude

• Threshold gradient magnitudes:
  – Remove all key points with M_ij less than 0.1 times the max gradient value

• Motivation: low-contrast locations are generally less reliable as feature points than high-contrast ones
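
A minimal sketch of this filter follows. It assumes the gradient is taken from simple pixel differences on A, and it interprets "max gradient value" as the largest magnitude found in the image; both are assumptions for illustration. The function names and the `(y, x)` key format are hypothetical.

```python
import numpy as np

def gradients(a):
    # Pixel differences on the smoothed image A (last row and column are dropped).
    dy = a[:-1, :-1] - a[1:, :-1]   # A[i, j] - A[i+1, j]
    dx = a[:-1, 1:] - a[:-1, :-1]   # A[i, j+1] - A[i, j]
    magnitude = np.sqrt(dx ** 2 + dy ** 2)
    orientation = np.arctan2(dy, dx)  # radians
    return magnitude, orientation

def filter_keys(keys, magnitude, fraction=0.1):
    # Keep only key points whose gradient magnitude reaches the threshold.
    threshold = fraction * magnitude.max()  # "max gradient value": assumed to be the image maximum
    return [(y, x) for (y, x) in keys if magnitude[y, x] >= threshold]
```

For example, a key point sitting on a strong edge survives, while one in a flat region is removed.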

Assigning Canonical Orientation

• For each remaining key point:
  – Choose surrounding N x N window at DOG level it was detected

[Figure: DOG image]

Assigning Canonical Orientation

• For all levels, use the “A” smoothed image to compute
  – Gradient orientation

[Figure: Gaussian Smoothed Image, with its Gradient Orientation and Gradient Magnitude maps]

Assigning Canonical Orientation

• Gradient magnitude weighted by 2D Gaussian

[Figure: Gradient Magnitude * 2D Gaussian = Weighted Magnitude]

Assigning Canonical Orientation

• Accumulate in histogram based on orientation

• Histogram has 36 bins with 10° increments

[Figure: Weighted Magnitude accumulated into a histogram; x-axis: Gradient Orientation, y-axis: Sum of Weighted Magnitudes]

Assigning Canonical Orientation

• Identify peak and assign orientation and sum of magnitude to key point

[Figure: the same histogram with the peak marked; x-axis: Gradient Orientation, y-axis: Sum of Weighted Magnitudes]
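
The canonical-orientation steps (window, Gaussian weighting, 36-bin histogram, peak) can be sketched as below. The window size `n` and the Gaussian `sigma` are placeholder values chosen for illustration, the function name is hypothetical, and the key point is assumed to lie at least `n // 2` pixels from the image border. The returned angle is the lower edge of the peak bin.

```python
import numpy as np

def canonical_orientation(magnitude, orientation, y, x, n=9, sigma=3.0, bins=36):
    half = n // 2
    window = np.s_[y - half:y + half + 1, x - half:x + half + 1]
    # 2D Gaussian centered on the key point, built from a 1D profile.
    offsets = np.arange(-half, half + 1)
    g = np.exp(-(offsets ** 2) / (2 * sigma ** 2))
    weighted = magnitude[window] * np.outer(g, g)
    # Orientation (radians) to bin index, 10 degrees per bin when bins=36.
    degrees = np.degrees(orientation[window]) % 360
    index = (degrees // (360 / bins)).astype(int) % bins
    hist = np.bincount(index.ravel(), weights=weighted.ravel(), minlength=bins)
    peak = int(hist.argmax())
    # Peak bin's starting angle in degrees, and its sum of weighted magnitudes.
    return peak * (360 / bins), hist[peak]
```

The second return value is the "sum of magnitude" assigned to the key point, which a later threshold can use to discard weak keys.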

Refining Key List: Rotation

• The user may choose a threshold to exclude key points based on their assigned sum of magnitudes.

Example of Refinement

Max/mins from DOG pyramid

Filter for illumination

Filter for edge orientation

Local Image Description

• SIFT keys each assigned:
  – Location
  – Scale (analogous to level it was detected)
  – Orientation (assigned in previous canonical orientation steps)

• Now: Describe local image region invariant to the above transformations

SIFT key example

Local Image Description

For each key point:

1. Identify 8x8 neighborhood (from DOG level it was detected)

2. Align orientation to x-axis

Local Image Description

3. Calculate gradient magnitude and orientation map

4. Weight by Gaussian

Local Image Description

5. Calculate histogram of each 4x4 region. 8 bins for gradient orientation. Tally weighted gradient magnitude.

Local Image Description

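6. This histogram array is the image descriptor. (The example here is a vector of length 8*4 = 32: four 4x4 regions with 8 bins each. Best suggestion: a 128-element vector for a 16x16 neighborhood, i.e. sixteen 4x4 regions with 8 bins each.)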
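
A simplified sketch of the descriptor steps is given below. It makes the orientations relative to the key point's canonical angle but does not rotate the sampling grid itself, which a fuller implementation would do; that simplification, the Gaussian `sigma`, and the function name are assumptions. The key point is assumed to lie far enough from the border for the window to fit.

```python
import numpy as np

def describe(magnitude, orientation, y, x, key_angle, size=8, cell=4, bins=8, sigma=4.0):
    half = size // 2
    window = np.s_[y - half:y + half, x - half:x + half]
    # Orientations (radians) measured relative to the key point's canonical angle.
    relative = (orientation[window] - key_angle) % (2 * np.pi)
    # Gaussian weighting of the gradient magnitudes across the window.
    offsets = np.arange(size) - (size - 1) / 2
    g = np.exp(-(offsets ** 2) / (2 * sigma ** 2))
    weighted = magnitude[window] * np.outer(g, g)
    # One orientation histogram per cell x cell region, concatenated.
    index = (relative // (2 * np.pi / bins)).astype(int) % bins
    histograms = []
    for cy in range(0, size, cell):
        for cx in range(0, size, cell):
            histograms.append(np.bincount(
                index[cy:cy + cell, cx:cx + cell].ravel(),
                weights=weighted[cy:cy + cell, cx:cx + cell].ravel(),
                minlength=bins))
    return np.concatenate(histograms)
```

With the defaults this yields the 32-element vector of the slide's example; passing `size=16` yields 128 elements.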

Database Creation

• Index all key points of reference model image(s)
  – Store key point descriptor vectors in database

Image Matching

• Find all key points identified in target image
  – Each key point will have 2D location, scale and orientation, as well as invariant descriptor vector

• For each key point, find similar descriptor vectors in reference image database.
  – A descriptor vector may match entries from more than one reference image
  – The key point “votes” for image(s)

• Use best-bin-first algorithm for the nearest-neighbor search
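
The matching and voting idea can be illustrated with the sketch below. Note the substitution: it uses an exhaustive nearest-neighbor search rather than best-bin-first, which is an approximate search that avoids comparing against every database entry. The distance cutoff `max_distance`, the function name, and the image labels are hypothetical.

```python
import numpy as np
from collections import Counter

def match_and_vote(target_descriptors, db_descriptors, db_image_ids, max_distance=0.5):
    # Brute-force stand-in for best-bin-first: compare against every database entry.
    db = np.asarray(db_descriptors, dtype=float)
    votes = Counter()
    matches = []
    for t_index, descriptor in enumerate(np.asarray(target_descriptors, dtype=float)):
        distances = np.linalg.norm(db - descriptor, axis=1)
        nearest = int(distances.argmin())
        if distances[nearest] <= max_distance:
            matches.append((t_index, nearest))
            votes[db_image_ids[nearest]] += 1  # the key point votes for that reference image
    return matches, votes
```

The `matches` list keeps the pairing between target and database key points, which the later clustering and verification stages need.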

Hough Transform Clustering

• Create 4D Hough Transform (HT) Space for each reference image
  1. Orientation bin = 30° bin
  2. Scale bin = factor of 2
  3. X location bin = 0.25*ref image width
  4. Y location bin = 0.25*ref image height

• If key point “votes” for reference image, tally its vote in 4D HT Space.
  – This gives estimate of location and pose
  – Keep list of which key points vote for a bin
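
A rough sketch of the 4D tally follows. It assumes each matched key point has already been converted into a pose prediction (rotation difference in degrees, scale ratio, and x, y location), and it places each vote in a single bin for simplicity. All names are hypothetical.

```python
import math
from collections import defaultdict

def hough_bin(d_theta_deg, scale_ratio, x, y, ref_width, ref_height):
    return (
        int((d_theta_deg % 360) // 30),            # 30 degree orientation bins
        int(math.floor(math.log2(scale_ratio))),   # factor-of-2 scale bins
        int(math.floor(x / (0.25 * ref_width))),   # x bins of 0.25 * reference width
        int(math.floor(y / (0.25 * ref_height))),  # y bins of 0.25 * reference height
    )

def tally(poses, ref_width, ref_height):
    # poses maps a key point id to (d_theta_deg, scale_ratio, x, y).
    bins = defaultdict(list)
    for key_id, pose in poses.items():
        bins[hough_bin(*pose, ref_width, ref_height)].append(key_id)
    return bins
```

Storing key point ids per bin, rather than only counts, preserves the list of voters needed for the verification step.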

Verification

• Identify bins with largest votes (must have at least 3).

• Using list of key points which voted for a cell, compute affine transformation parameters (m, t)
  – Use corresponding coordinates of reference model (x,y) and target image (u,v).
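
For reference, the affine model relating a model point (x, y) to a target point (u, v) can be written as below, together with the stacked linear system in the unknown parameters. The component names m_1..m_4 and t_x, t_y are notation assumed here for m and t. Each match contributes two equations and there are six unknowns, which is why at least 3 matches are required.

```latex
\begin{bmatrix} u \\ v \end{bmatrix}
=
\begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
+
\begin{bmatrix} t_x \\ t_y \end{bmatrix}
\qquad\Longrightarrow\qquad
\begin{bmatrix}
x & y & 0 & 0 & 1 & 0 \\
0 & 0 & x & y & 0 & 1 \\
  &   & \vdots &  &  &
\end{bmatrix}
\begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_x \\ t_y \end{bmatrix}
=
\begin{bmatrix} u \\ v \\ \vdots \end{bmatrix}
```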

Verification

• If more than three points, solve in least-squares sense
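
A least-squares fit of the affine parameters can be sketched with NumPy's `numpy.linalg.lstsq`, as below. The function name and the parameter ordering (four matrix entries of m followed by the two entries of t) are choices made for this example.

```python
import numpy as np

def fit_affine(model_xy, target_uv):
    # Solve for m (2x2) and t (2,) such that uv ~= xy @ m.T + t in the least-squares sense.
    xy = np.asarray(model_xy, dtype=float)
    uv = np.asarray(target_uv, dtype=float)
    A = np.zeros((2 * len(xy), 6))
    A[0::2, 0:2] = xy   # u rows: m1*x + m2*y + tx
    A[0::2, 4] = 1.0
    A[1::2, 2:4] = xy   # v rows: m3*x + m4*y + ty
    A[1::2, 5] = 1.0
    b = uv.ravel()      # interleaved as [u0, v0, u1, v1, ...]
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params[:4].reshape(2, 2), params[4:]
```

With exactly three non-collinear matches the system is square and the fit is exact; additional matches average out localization noise.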

Verification: Remove Outliers

• After applying affine transformation to key points, determine difference between calculated location and actual target image location

• Throw out a key point if it disagrees with the fitted transformation by more than:
  – 15° in orientation
  – A factor of sqrt(2) in scale
  – 0.2*model size in X,Y location

• Repeat least-squares solution until no points are thrown out
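
The fit-and-discard loop can be sketched as follows. For brevity this version checks only the X,Y location criterion (0.2 times the model size) and omits the orientation and scale checks; that simplification and the function names are assumptions of the example.

```python
import numpy as np

def fit_affine(xy, uv):
    # Least-squares affine fit: uv ~= xy @ m.T + t.
    A = np.zeros((2 * len(xy), 6))
    A[0::2, 0:2] = xy
    A[0::2, 4] = 1.0
    A[1::2, 2:4] = xy
    A[1::2, 5] = 1.0
    p = np.linalg.lstsq(A, uv.ravel(), rcond=None)[0]
    return p[:4].reshape(2, 2), p[4:]

def verify(model_xy, target_uv, model_size, tolerance=0.2, min_points=3):
    xy = np.asarray(model_xy, dtype=float)
    uv = np.asarray(target_uv, dtype=float)
    keep = np.ones(len(xy), dtype=bool)
    while keep.sum() >= min_points:
        m, t = fit_affine(xy[keep], uv[keep])
        # Distance between each predicted location and its actual target location.
        error = np.linalg.norm(xy @ m.T + t - uv, axis=1)
        inliers = keep & (error <= tolerance * model_size)
        if inliers.sum() == keep.sum():
            return m, t, keep   # no points thrown out: the solution is stable
        keep = inliers
    return None                 # fewer than min_points remain: reject the match
```

Refitting after each rejection matters because a single gross outlier can bias the first solution enough to inflate the residuals of good points.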

SIFT Example

Advantages of SIFT

• Numerous keys can be generated for even small objects

• Partial occlusion/image clutter is OK because dozens of SIFT keys may be associated with an object, but only 3 need to be found

• Object models can undergo limited affine projection.
  – Planar shapes can be recognized at up to a 60 degree rotation away from the camera.

• Individual features can be matched to a large database of objects

Limitations of SIFT

• Fully affine transformations require additional steps

• Many parameters are “engineered” for a specific application and may need to be evaluated on a case-by-case basis

Thank you!