EECS 274 Computer Vision Local Invariant Features.

101
EECS 274 Computer Vision Local Invariant Features

Transcript of EECS 274 Computer Vision Local Invariant Features.

Page 1: EECS 274 Computer Vision Local Invariant Features.

EECS 274 Computer Vision

Local Invariant Features

Page 2: EECS 274 Computer Vision Local Invariant Features.

:Local features

• Matching with Harris Detector• Scale-invariant Feature Detection• Scale Invariant Image Descriptors• Affine-invariant Feature Detection• Object Recognition• SIFT Features

• Reading: S Chapter 4

Page 3: EECS 274 Computer Vision Local Invariant Features.

Examples

Page 4: EECS 274 Computer Vision Local Invariant Features.

Features

Page 5: EECS 274 Computer Vision Local Invariant Features.

What local features to use?

Page 6: EECS 274 Computer Vision Local Invariant Features.

Aperture problem

Ambiguity of 1-dimensional motion perception

Stripes moved upward 6 pixels

Stripes moved left 5 pixels

Page 7: EECS 274 Computer Vision Local Invariant Features.

Introduction

Local invariant photometric descriptors

( )local descriptor

Local : robust to occlusion/clutter + no segmentationPhotometric : distinctiveInvariant : to image transformations + illumination changes

Page 8: EECS 274 Computer Vision Local Invariant Features.

History - Recognition

Color histogram [Swain 91]

b

g

r

Each pixel is describedby a color vector

Distribution of color vectorsis described by a histogram

not robust to occlusion, not invariant, not distinctive

Page 9: EECS 274 Computer Vision Local Invariant Features.

History - Recognition

Eigenimages [Turk 91]• Each face vector is represented in the eigenimage space

– eigenvectors with the highest eigenvalues = eigenimages

• The new image is projected into the eigenimage space– determine the closest face

. .1v.

3v

2v.

not robust to occlusion, requires segmentation, not invariant, discriminant

Page 10: EECS 274 Computer Vision Local Invariant Features.

History - Recognition

Geometric invariants [Rothwell 92]• Function with a value independent of the

transformation

• Invariant for image rotation : distance of two points

• Invariant for planar homography : cross-ratio local and invariant, not discriminant, requires sub-pixel extraction of primitives

tt yxTyxyxfyxf ),(),( where),(),(

Page 11: EECS 274 Computer Vision Local Invariant Features.

History - Recognition

Problems : occlusion, clutter, image transformations, distinctiveness

Solution : recognition with local photometric invariants

[ Local greyvalue invariants for image retrieval, C. Schmid and R. Mohr, PAMI 1997 ]

Page 12: EECS 274 Computer Vision Local Invariant Features.

Approach

1) Extraction of interest points (characteristic locations)

2) Computation of local descriptors

3) Determining correspondences

4) Selection of similar images

( )local descriptor

Page 13: EECS 274 Computer Vision Local Invariant Features.

Matching with interest points

• Extraction of interest points with the Harris detector

• Comparison of points with cross-correlation

• Verification with the fundamental matrix

Page 14: EECS 274 Computer Vision Local Invariant Features.

Moravec corner detector

• Developed for Stanford Cart in 1977

Page 15: EECS 274 Computer Vision Local Invariant Features.

Moravec corner detectorChange of intensity for the shift [u,v]:

2,

( , ) ( , ) ( , ) ( , )x y

E u v w x y I x u y v I x y

IntensityShifted intensity

Window function

Four shifts: (u,v) = (1,0), (1,1), (0,1), (-1, 1)Look for local maxima in min(E)

Page 16: EECS 274 Computer Vision Local Invariant Features.

Problems of Moravec detector• Noisy response due to a binary window function• Only a set of shifts at every 45 degree is

considered• Only minimum of E is taken into account

Harris corner detector (1988) solves these problems.

Page 17: EECS 274 Computer Vision Local Invariant Features.

Harris detector

Based on the idea of auto-correlation

Important difference in all directions interest point

Page 18: EECS 274 Computer Vision Local Invariant Features.

Interest points

Geometric featuresrepeatable under transformations

2D characteristics of the signalhigh informational content

Comparison of different detectors [Schmid98] Harris detector

Page 19: EECS 274 Computer Vision Local Invariant Features.

Harris detectorAuto-correlation function for a point and a shift

Discrete shifts can be avoided with the auto-correlation matrix

),( yx ),(),( yxvu

x y

vyuxIyxIyxwvuE 2)),(),()(,(),(

v

uyxIyxIyxIvyuxI yx )),(),((),(),(with

2

2

,

2

,

,,

2

2

*

)),()(,(),(),(*),(

),(),(*),()),((*),(

),(),(),(),(

yyx

yxx

T

yxy

yxyx

yxyx

yxx

yyx

x

III

IIIwA

A

v

u

yxIyxwyxIyxIyxw

yxIyxIyxwyxIyxw

vu

v

uyxIyxIyxwvuE

uu

Page 20: EECS 274 Computer Vision Local Invariant Features.

Comparison of detectors: Rotation

[Comparing and Evaluating Interest Points, Schmid, Mohr & Bauckhage, ICCV 98]

repeatability = #good matches/mean(#points)

Page 21: EECS 274 Computer Vision Local Invariant Features.

Comparison of detectors: Perspective

[Comparing and Evaluating Interest Points, Schmid, Mohr & Bauckhage, ICCV 98]

repeatability = #good matches/mean(#points)

Page 22: EECS 274 Computer Vision Local Invariant Features.

Harris corner detectorIntensity change in shifting window: eigenvalue analysis

1, 2 – eigenvalues of A

direction of the slowest change

direction of the fastest change

(max)-1/2

(min)-1/2

Ellipse E(u,v) = const

2

2

*

),()(

yyx

yxx

T

III

IIIwA

AvuEE uuu

uncertainty ellipse

Shi and Tomasi use min(1, 2)

to locate good features to track

Page 23: EECS 274 Computer Vision Local Invariant Features.

Harris corner detector

1

2

Corner1 and 2 are large, 1 ~ 2;E increases in all directions

1 and 2 are small;E is almost constant in all directions

edge 1 >> 2

edge 2 >> 1

flat

Classification of image points using eigenvalues of M:

Page 24: EECS 274 Computer Vision Local Invariant Features.

Harris detection• Auto-correlation matrix

– captures the structure of the local neighborhood

– measure based on eigenvalues of this matrix• 2 strong eigenvalues => interest point• 1 strong eigenvalue => contour• 0 eigenvalue => uniform region

• Interest point detection– threshold on the eigenvalues– local maximum for localization

Page 25: EECS 274 Computer Vision Local Invariant Features.

Harris corner detectorMeasure of corner response:

(k – empirical constant, k = 0.04-0.06)

22121

2

)(

))(()det(

k

AtracekAR

Page 26: EECS 274 Computer Vision Local Invariant Features.

Example

Page 27: EECS 274 Computer Vision Local Invariant Features.

Example

Page 28: EECS 274 Computer Vision Local Invariant Features.

Example

Page 29: EECS 274 Computer Vision Local Invariant Features.

Good features

Using auto-correlation or Hessian matrix

Page 30: EECS 274 Computer Vision Local Invariant Features.

Local descriptors

Descriptors characterize the local neighborhood of a point

local descriptor

( )

Page 31: EECS 274 Computer Vision Local Invariant Features.

Local jet

Convolution of image I with Gaussian derivatives

)(*),(

)(*),(

)(*),(

)(*),(

)(),(

)(),(

),(

yy

xy

xx

y

x

GyxI

GyxI

GyxI

GyxI

GyxI

GyxI

yxv

)2

),(exp(

2

1),),((

),(),()(),(

2

2

2

tt yx

yxG

ydxdyyxxIyxGGyxI

Page 32: EECS 274 Computer Vision Local Invariant Features.

N-Jet, local jet

nsor epsilon te ricantisymmet theis where

2

2

)(

ij

yyyyxyxyxxxx

yyxx

yyyyyxxyxxxx

yyxx

kjiijk

lkijklij

ljiijkkkjiij

llijkklkijklij

ijij

ii

jiji

ii

LLLLLL

LL

LLLLLLLL

LLLL

L

LLLL

LLLL

LLLLLLLL

LLLLLLLL

LL

L

LLL

LL

L

• Invariance to image rotation : differential invariants [Koen87]

Page 33: EECS 274 Computer Vision Local Invariant Features.

Local descriptors

Robustness to illumination changes

In case of an affine transformation

2

2

2

2

2/1

2/3

)(

)(

)(

)

)(

)(

)(

)(

ii

kjiijk

ii

lkijklij

ii

kjiijkkkjiij

ii

llijkklkijklij

ii

jiij

ii

ii

ii

jiji

LL

LLLL

LL

LLLL

LL

LLLLLLLL

LL

LLLLLLLL

LL

LL

LL

L

LL

LLL

baII )()( 21 xx

or normalization of the image patch with mean and variance

Page 34: EECS 274 Computer Vision Local Invariant Features.

Determining correspondences

Vector comparison using the Mahalanobis distance

=?

)()(),( 1 qpqpqp TMdist

( )( )

Page 35: EECS 274 Computer Vision Local Invariant Features.

Selection of similar images

• In a large database – voting algorithm– additional constraints

• Rapid acces with an indexing mechanism

Page 36: EECS 274 Computer Vision Local Invariant Features.

Voting algorithm

local characteristicsvector of

( )

Page 37: EECS 274 Computer Vision Local Invariant Features.

Voting algorithm

} }

1 1 02 1 1

I is the corresponding model image1

2 1 1

Page 38: EECS 274 Computer Vision Local Invariant Features.

Additional constraints

• Semi-local constraints– neighboring points should match– angles, length ratios should be similar

• Global constraints– robust estimation of the image transformation (homogaphy,

epipolar geometry)

11

2

3

2

3

Page 39: EECS 274 Computer Vision Local Invariant Features.

Results

database with ~1000 images

Page 40: EECS 274 Computer Vision Local Invariant Features.

Results

Page 41: EECS 274 Computer Vision Local Invariant Features.

Harris detector

Interest points extracted with Harris (~ 500 points)

Page 42: EECS 274 Computer Vision Local Invariant Features.

Cross-correlation matching

Initial matches (188 pairs)

Page 43: EECS 274 Computer Vision Local Invariant Features.

Global constraints

Robust estimation of the fundamental matrix

99 inliers 89 outliers

Page 44: EECS 274 Computer Vision Local Invariant Features.

Summary of Harris detector

• Very good results in the presence of occlusion and clutter– local information– discriminant greyvalue information – invariance to image rotation and illumination

• Not invariance to scale and affine changes

• Solution for more general view point changes– local invariant descriptors to scale and rotation– extraction of invariant points and regions

Page 45: EECS 274 Computer Vision Local Invariant Features.

Scale Invariant Feature Detection• Consider two images of the same scene,

related by a scale change (i.e. zooming)• How are their scale space representations

related? Scale Space Theory (Lindeberg ’98):

Normalized derivatives have the same value atcorresponding relative scales

handat task for theparameter free a is

),,()',','('

'

)','(),(

),()','(

, s,derivative normalized

)1(

2

2/2/

tyxLstyxL

tst

yxIyxI

sysxyx

tt

m

yx

Page 46: EECS 274 Computer Vision Local Invariant Features.

Harris detector + scale changes

Page 47: EECS 274 Computer Vision Local Invariant Features.

Scale Adapted Harris Detector

Many corresponding points at which the scale factor correspondsto scale change between images

Page 48: EECS 274 Computer Vision Local Invariant Features.

Scale invariant Harris points• Multi-scale extraction of Harris interest points

• Selection of points at characteristic scale in scale space

Laplacian

Chacteristic scale :

- maximum in scale space

- scale invariant

Page 49: EECS 274 Computer Vision Local Invariant Features.

Scale invariant interest points

invariant points + associated regions [Mikolajczyk & Schmid’01] Harris-Laplacian Feature

multi-scale Harris points

selection of points

at the characteristic scale

with Laplacian

Page 50: EECS 274 Computer Vision Local Invariant Features.

Harris detector – adaptation to scale

Page 51: EECS 274 Computer Vision Local Invariant Features.

Evaluation of scale invariant detectors

repeatability – scale changes

Page 52: EECS 274 Computer Vision Local Invariant Features.

SIFT: Overview

• 1999• Generates image features,

“keypoints”– invariant to image scaling and rotation– partially invariant to change in

illumination and 3D camera viewpoint– many can be extracted from typical

images– highly distinctive

Page 53: EECS 274 Computer Vision Local Invariant Features.

Algorithm overview

1. Scale-space extrema detection– Uses difference-of-Gaussian function

2. Keypoint localization– Sub-pixel location and scale fit to a

model

3. Orientation assignment– 1 or more for each keypoint

4. Keypoint descriptor– Created from local image gradients

Page 54: EECS 274 Computer Vision Local Invariant Features.

Scale space

• Definition: where

),(),,(),,( yxIyxGyxL 222 2/)(

22

1),,(

yxeyxG

Page 55: EECS 274 Computer Vision Local Invariant Features.

Scale space

• Keypoints are detected using scale-space extrema in difference-of-Gaussian function D

• D definition:

• Efficient to compute

),,(),,(

),()),,(),,((),,(

yxLkyxL

yxIyxGkyxGyxD

Page 56: EECS 274 Computer Vision Local Invariant Features.

Relationship of D to

• Close approximation to scale-normalized Laplacian of Gaussian,

• Diffusion equation:

• Approximate ∂G/∂σ:– giving,

• When D has scales differing by a constant

factor it already incorporates the σ2 scale normalization required for scale-invariance

• e.g.,

G22

G22

GG 2

k

yxGkyxGG ),,(),,(

Gk

yxGkyxG 2),,(),,(

GkyxGkyxG 22)1(),,(),,(

2k

Page 57: EECS 274 Computer Vision Local Invariant Features.

Scale space construction

2k2σ

2kσ

σ

2kσ

σ

Page 58: EECS 274 Computer Vision Local Invariant Features.

Scale space• A collection of images obtained by progressively smoothing

the input image• Analogous to gradually reducing image resolution • See Vedaldi’s implementation (http://www.vlfeat.org/ ) for

details• Discretized scales

)),(min(logO 3,S

octaves of # : octave,per scales of # :

1,,,1,,0,6.1

,2),(

2

minmin0

),()1,(),()/(

0

hw

soosoosooSs

II

Os

OoooSs

IIDoGos

Page 59: EECS 274 Computer Vision Local Invariant Features.

Scale space images

first octave

second octave

third octave

fourth octave

Page 60: EECS 274 Computer Vision Local Invariant Features.

Difference-of-Gaussian images

first octave

second octave

third octave

fourth octave

Page 61: EECS 274 Computer Vision Local Invariant Features.

Frequency of sampling

• There is no minimum• Best frequency determined experimentally

Page 62: EECS 274 Computer Vision Local Invariant Features.

Prior smoothing for each octave

• Increasing σ increases robustness, but costs• σ = 1.6 a good tradeoff• Doubling the image initially increases

number of keypoints

Page 63: EECS 274 Computer Vision Local Invariant Features.

Finding extrema

• Sample point D(x,y,σ) is selected only if it is a minimum or a maximum of these points

DoG scale space

Extrema in this image

Page 64: EECS 274 Computer Vision Local Invariant Features.

Localization

• 3D quadratic function is fit to the local sample points

• Start with Taylor expansion with sample point as the origin– where

• Take the derivative with respect to X, and set it to 0, giving

• is the location of the keypoint• This is a 3x3 linear system

2

2

2

1)(

DDDD T

T

Tyx ),,(

XX

D

X

D ˆ02

2

DD2

12

ˆ

Page 65: EECS 274 Computer Vision Local Invariant Features.

Localization

• Hessian and derivative approximated by finite differences,– example:

• If X is > 0.5 in any dimension, process repeated

x

Dy

D

D

x

y

x

D

yx

D

x

Dyx

D

y

D

y

Dx

D

y

DD

2

222

2

2

22

22

2

2

4

)()(

1

2

2

,11

,11

,11

,11

2

,1

,,1

2

2

,1

,1

jik

jik

jik

jik

jik

jik

jik

jik

jik

DDDD

y

D

DDDD

DDD

XX

D

X

D ˆ02

2

Page 66: EECS 274 Computer Vision Local Invariant Features.

Filtering

• Contrast (use prev. equation):– If |D(X)| < 0.03, throw it out

• Edge-iness:– Use ratio of principal curvatures to throw out

poorly defined peaks– Curvatures come from Hessian:– Ratio of Trace(H)2 and Determinant(H)

– If ratio > (r+1)2/(r), throw it out (SIFT uses r=10)

XD

DDT

ˆ2

1)ˆ(

yyxy

xyxx

DD

DDH

22

2 )(

)(

)(,)()(

)(

HDet

HTrratioDDDHDet

DDHTr

xyyyxx

yyxx

Page 67: EECS 274 Computer Vision Local Invariant Features.

Orientation assignment

• Descriptor computed relative to keypoint’s orientation achieves rotation invariance

• Precomputed along with mag. for all levels (useful in descriptor computation)

• Multiple orientations assigned to keypoints from an orientation histogram– Significantly improve stability of matching

))),1(),1(/())1,()1,(((2tan),(

))1,()1,(()),1(),1((),( 22

yxLyxLyxLyxLayx

yxLyxLyxLyxLyxm

Page 68: EECS 274 Computer Vision Local Invariant Features.

Keypoint images

Page 69: EECS 274 Computer Vision Local Invariant Features.

Keypoint Selection

Finding extremawith DoG

Removing|D(X)| < 0.03

Removing|D(X)| < 0.03

Page 70: EECS 274 Computer Vision Local Invariant Features.

Select canonical orientation• Create histogram of local

gradient directions computed at selected scale

• Assign canonical orientation at peak of smoothed histogram

• Each key specifies stable 2D coordinates (x, y, scale, orientation)

0 2

Page 71: EECS 274 Computer Vision Local Invariant Features.

Descriptor

• Descriptor has 3 dimensions (x,y,θ)• Orientation histogram of gradient magnitudes• Position and orientation of each gradient sample

rotated relative to keypoint orientation

Page 72: EECS 274 Computer Vision Local Invariant Features.

Descriptor

• Weight magnitude of each sample point by Gaussian weighting function

• Distribute each sample to adjacent bins by trilinear interpolation (avoids boundary effects)

Page 73: EECS 274 Computer Vision Local Invariant Features.

Descriptor• Best results achieved with 4x4x8 =

128 descriptor size• Normalize to unit length

– Reduces effect of illumination change

• Cap each element to 0.2, normalize again– Reduces non-linear illumination changes– 0.2 determined experimentally

Page 74: EECS 274 Computer Vision Local Invariant Features.

Object detection

• Create a database of keypoints from training images

• Match keypoints to a database– Nearest neighbor

search

Page 75: EECS 274 Computer Vision Local Invariant Features.

PCA-SIFT

• Different descriptor (same keypoints)• Apply PCA to the gradient patch• Descriptor size is 20 (instead of 128)• More robust, faster

Page 76: EECS 274 Computer Vision Local Invariant Features.

SIFT: Summary

• Scale space• Difference-of-Gaussian• Localization• Filtering• Orientation assignment• Descriptor, 128 elements

Page 77: EECS 274 Computer Vision Local Invariant Features.

Object recognition

• Definition: Identify an object and determine its pose and model parameters

• Commercial object recognition– $4 billion/year industry for inspection and

assembly– Almost entirely based on template matching

• Upcoming applications– Mobile robots, toys, user interfaces– Location recognition– Digital camera panoramas, 3D scene modeling

Page 78: EECS 274 Computer Vision Local Invariant Features.

Invariant local features• Image content => local feature coordinates invariant to translation, rotation, scale• Technical details regarding

– Keypoint matching (using 1st and 2nd nearest neighbors)– Efficient nearest neighbor indexing – Clustering with Hough transform (model location, orientation, scale)– Account for affine distortion

Features

Page 79: EECS 274 Computer Vision Local Invariant Features.

Advantages of invariant local features

• Locality: features are local, so robust to occlusion and clutter (no prior segmentation)

• Distinctiveness: individual features can be matched to a large database of objects

• Quantity: many features can be generated for even small objects

• Efficiency: close to real-time performance

Page 80: EECS 274 Computer Vision Local Invariant Features.
Page 81: EECS 274 Computer Vision Local Invariant Features.
Page 82: EECS 274 Computer Vision Local Invariant Features.

Experimental evaluation

Page 83: EECS 274 Computer Vision Local Invariant Features.

Scale change (factor 2.5)

Harris-Laplace DoG

Page 84: EECS 274 Computer Vision Local Invariant Features.

Viewpoint change (60 degrees)

Harris-Affine (Harris-Laplace)

Page 85: EECS 274 Computer Vision Local Invariant Features.

Descriptors - conclusion

• SIFT + steerable perform best

• Performance of the descriptor independent of the detector

• Errors due to imprecision in region estimation, localization

Page 86: EECS 274 Computer Vision Local Invariant Features.

Image retrieval

…> 5000images

change in viewing angle

Page 87: EECS 274 Computer Vision Local Invariant Features.

Matches

22 correct matches

Page 88: EECS 274 Computer Vision Local Invariant Features.

Image retrieval

…> 5000images

change in viewing angle

+ scale change

Page 89: EECS 274 Computer Vision Local Invariant Features.

Matches

33 correct matches

Page 90: EECS 274 Computer Vision Local Invariant Features.

Multiple panoramas from an unordered image set

Page 91: EECS 274 Computer Vision Local Invariant Features.

Image registration and blending

Page 92: EECS 274 Computer Vision Local Invariant Features.

Location recognition

Page 93: EECS 274 Computer Vision Local Invariant Features.

Robot localization• Joint work with Stephen Se, Jim Little

Page 94: EECS 274 Computer Vision Local Invariant Features.

Recognizing panoramas• Matthew Brown and David Lowe• Recognize overlap from an unordered set of

images and automatically stitch together• SIFT features provide initial feature matching• Image blending at multiple scales hides the

seams

Panorama automatically assembled from 143 images

Page 95: EECS 274 Computer Vision Local Invariant Features.

Map continuously built over time

Page 96: EECS 274 Computer Vision Local Invariant Features.

Planar recognition• Planar surfaces can be

reliably recognized at a rotation of 60° away from the camera

• Affine fit approximates perspective projection

• Only 3 points are needed for recognition

Page 97: EECS 274 Computer Vision Local Invariant Features.

3D object recognition• Extract

outlines with background subtraction

Page 98: EECS 274 Computer Vision Local Invariant Features.

3D object recognition

• Only 3 keys are needed for recognition, so extra keys provide robustness

• Affine model is no longer as accurate

Page 99: EECS 274 Computer Vision Local Invariant Features.

Recognition under occlusion

Page 100: EECS 274 Computer Vision Local Invariant Features.

Comparison to template matching

• Costs of template matching– 250,000 locations x 30 orientations x 4 scales =

30,000,000 evaluations– Does not easily handle partial occlusion and other

variation without large increase in template numbers– Viola & Jones cascade must start again for each

qualitatively different template

• Costs of local feature approach– 3000 evaluations (reduction by factor of 10,000)– Features are more invariant to illumination, 3D rotation,

and object variation– Use of many small subtemplates increases robustness to

partial occlusion and other variations

Page 101: EECS 274 Computer Vision Local Invariant Features.

More local features

• Harris Laplace• Harris Affine• Hessian detector• Hessian Laplace• Hessian Affine• MSER (Maximally Stable Extremal

Regions)• SURF (Speeded-Up Robust Feature)