Lecture 10: Multi-view geometry - Artificial...

77
Lecture 10 - Fei-Fei Li Lecture 10: Multi-view geometry Professor Fei-Fei Li Stanford Vision Lab 25-Oct-12 1

Transcript of Lecture 10: Multi-view geometry - Artificial...

Page 1: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li

Lecture 10: Multi-view geometry

Professor Fei-Fei Li Stanford Vision Lab

25-Oct-12 1

Page 2: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li

What we will learn today?

• Review for stereo vision • Correspondence problem (Problem Set 2 (Q3)) • Active stereo vision systems • Structure from motion

25-Oct-12 2

Reading: [HZ] Chapters: 4, 9, 11 [FP] Chapters: 10

Page 3: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 3

Recovering structure from a single view Pinhole perspective projection

C

Ow

P p

Calibration rig

Scene

Camera K

Why is it so difficult?

Intrinsic ambiguity of the mapping from 3D to image (2D)

Page 4: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 4

Two eyes help!

O2 O1

p2

p1

?

This is called triangulation

K2 =known K1 =known

R, T

Page 5: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 5

Epipolar geometry

• Epipolar Plane • Epipoles e1, e2

• Epipolar Lines • Baseline

O1 O2

p2

P

p1

e1 e2

= intersections of baseline with image planes = projections of the other camera center = vanishing points of camera motion direction

Page 6: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 6

Triangulation

O O’

p

p’

P

0pFpT =′

F = Fundamental Matrix (Faugeras and Luong, 1992)

Page 7: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 7

Why is F useful?

- Suppose F is known - No additional information about the scene and camera is given - Given a point on left image, how can I find the corresponding point on right image?

l’ = FT p p

Page 8: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 8

Estimating F

O O’

p p’

P The Eight-Point Algorithm

(Hartley, 1995)

(Longuet-Higgins, 1981)

Lsq. solution by SVD!

• Rank 8 A non-zero solution exists (unique)

• Homogeneous system

• If N>8 F̂

f

1=f

W 0=f

W

Page 9: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 9

Rectification

p’

P

p

O O’

• Make two camera images “parallel” • Correspondence problem becomes easier

Page 10: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 10

Application: view morphing S. M. Seitz and C. R. Dyer, Proc. SIGGRAPH 96, 1996, 21-30

Page 11: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li

What we will learn today?

• Review for stereo vision • Correspondence problem (Problem Set 2 (Q3)) • Active stereo vision systems • Structure from motion

25-Oct-12 11

Reading: [HZ] Chapters: 4, 9, 11 [FP] Chapters: 10

Page 12: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 12

Stereo-view geometry

• Correspondence: Given a point in one image, how

can I find the corresponding point p’ in another one?

• Camera geometry: Given corresponding points in two images, find camera matrices, position and pose.

• Scene geometry: Find coordinates of 3D point from its projection into 2 or multiple images.

Previous lecture (#9)

Page 13: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 13

Stereo-view geometry

• Correspondence: Given a point in one image, how

can I find the corresponding point p’ in another one?

• Camera geometry: Given corresponding points in two images, find camera matrices, position and pose.

• Scene geometry: Find coordinates of 3D point from its projection into 2 or multiple images.

This lecture (#10)

Page 14: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 14

Depth estimation Pinhole perspective projection

Subgoals: - Solve the correspondence problem - Use corresponding observations to triangulate

p’

P

p

O1 O2

Page 15: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 15

Correspondence problem Pinhole perspective projection Pinhole perspective projection

p’

P

p

O1 O2

Given a point in 3D, discover corresponding observations in left and right images [also called binocular fusion problem]

Page 16: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 16

Correspondence problem Pinhole perspective projection Pinhole perspective projection

•A Cooperative Model (Marr and Poggio, 1976)

•Correlation Methods (1970--)

•Multi-Scale Edge Matching (Marr, Poggio and Grimson, 1979-81)

[FP] Chapters: 11

Page 17: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 17

Correlation Methods Pinhole perspective projection

p

• Pick up a window around p(u,v)

Page 18: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 18

Correlation Methods Pinhole perspective projection

• Pick up a window around p(u,v) • Build vector W • Slide the window along v line in image 2 and compute w’ • Keep sliding until w∙w’ is maximized.

Page 19: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 19

Correlation Methods Pinhole perspective projection

Normalized Correlation; maximize: )ww)(ww()ww)(ww(′−′−′−′−

Page 20: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 20

Correlation Methods Pinhole perspective projection Left Right

scanline

Norm. corr Credit slide S. Lazebnik

Page 21: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 21

Correlation Methods Pinhole perspective projection

– Smaller window - More detail - More noise

– Larger window

- Smoother disparity maps - Less prone to noise

Window size = 3 Window size = 20

Credit slide S. Lazebnik

Page 22: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 22

Issues with Correlation Methods Pinhole perspective projection

•Fore shortening effect

•Occlusions

O O’

O O’

It is desirable to have small B/z ratio!

Page 23: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 24

Issues with Correlation Methods Pinhole perspective projection

•Homogeneous regions

Hard to match pixels in these regions

Page 24: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 25

Issues with Correlation Methods Pinhole perspective projection

•Repetitive patterns

Page 25: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 26

Results with window search Pinhole perspective projection Data

Ground truth Window-based matching

Credit slide S. Lazebnik

Page 26: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 27

Improving correspondence: Non-local constraints

• Uniqueness – For any point in one image, there should be at most one

matching point in the other image

Page 27: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 28

Improving correspondence: Non-local constraints

• Uniqueness – For any point in one image, there should be at most one

matching point in the other image

• Ordering – Corresponding points should be in the same order in both

views

Courtesy J. Ponce

Page 28: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 29

Dynamic Programming Pinhole perspective projection (Baker and Binford, 1981)

•Nodes = matched feature points (e.g., edge points). • Arcs = matched intervals along the epipolar lines. • Arc cost = discrepancy between intervals. Find the minimum-cost path going monotonically down and right from the top-left corner of the graph to its bottom-right corner.

[Uses ordering constraint]

Courtesy J. Ponce

Page 29: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 30

Dynamic Programming (Baker and Binford, 1981)

(Ohta and Kanade, 1985)

Page 30: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 31

Improving correspondence: Non-local constraints

• Uniqueness – For any point in one image, there should be at most one

matching point in the other image

• Ordering – Corresponding points should be in the same order in both

views

• Smoothness Disparity is typically a smooth function of x (except in occluding boundaries)

Page 31: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 32

Stereo matching as energy minimization Pinhole perspective projection I1 I2 D

• Energy functions of this form can be minimized using graph cuts

Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 01

W1(i ) W2(i+D(i )) D(i )

)(),,( smooth21data DEDIIEE βα +=

( )∑ −=ji

jDiDE,neighbors

smooth )()(ρ( )221data ))(()(∑ +−=i

iDiWiWE

Page 32: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 33

Stereo matching as energy minimization Pinhole perspective projection

Energy minimization Ground truth Window-based matching

Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 01

Page 33: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li

What we will learn today?

• Review for stereo vision • Correspondence problem (Problem Set 2 (Q3)) • Active stereo vision systems (see Friday Oct 28

TA Section) • Structure from motion

25-Oct-12 34

Reading: [HZ] Chapters: 4, 9, 11 [FP] Chapters: 10

Page 34: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 35

Active stereo (point)

Replace one of the two cameras by a projector - Single camera - Projector geometry calibrated - What’s the advantage of having the projector?

P

O2

p’

Correspondence problem solved!

projector

Page 35: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 36

Active stereo (stripe)

O -Projector and camera are parallel - Correspondence problem solved!

projector

Page 36: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 37

Active stereo (stripe)

• Optical triangulation – Project a single stripe of laser light – Scan it across the surface of the object – This is a very precise version of structured light scanning

Digital Michelangelo Project http://graphics.stanford.edu/projects/mich/

Page 37: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 38

Active stereo (stripe)

Digital M

ichelangelo Project

http://graphics.stanford.edu/projects/mich/

Page 38: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 39

Active stereo (shadows)

Light source O

- 1 camera,1 light source - very cheap setup - calibrated light source

J. Bouguet & P. Perona, 99 S. Savarese, J. Bouguet & Perona, 00

Page 39: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 40

Active stereo (shadows) J. Bouguet & P. Perona, 99 S. Savarese, J. Bouguet & Perona, 00

Page 40: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 41

Active stereo (color-coded stripes)

- Dense reconstruction - Correspondence problem again - Get around it by using color codes

projector O

L. Zhang, B. Curless, and S. M. Seitz 2002 S. Rusinkiewicz & Levoy 2002

Page 41: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 42

Active stereo (color-coded stripes)

L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002

Rapid shape acquisition: Projector + stereo cameras

Page 42: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li

What we will learn today?

• Review for stereo vision • Correspondence problem (Problem Set 2 (Q3)) • Active stereo vision systems • Structure from motion

25-Oct-12 43

Reading: [HZ] Chapters: 4, 9, 11 [FP] Chapters: 10

Page 43: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 44

Multiple view geometry - the structure from motion problem

p1j

p2j

pmj

Pj

M1

M2

Mm

Given m images of n fixed 3D points; pij = Mi Pj , i = 1, … , m, j = 1, … , n

Page 44: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 45

Courtesy of Oxford Visual Geometry Group

Multiple view geometry - the structure from motion problem

Page 45: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li

From the mxn correspondences pij, estimate: •m projection matrices Mi •n 3D points Pj

25-Oct-12 46

Multiple view geometry - the structure from motion problem

p1j

p2j

pmj

Pj

M1

M2

Mm

motion structure

Page 46: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li

From the mxn correspondences pij, estimate: •m projection matrices Mi •n 3D points Pj

25-Oct-12 47

Multiple view geometry - the structure from motion problem

motion structure

• If 2 cameras and 2 frames, then back to the stereopsis case: Essential & Fundamental matrices. • But SfM is usually good for a generic scenario: no camera information, multi- ple images, etc.

• E.g. a video recorded from hand- held camcorder

Page 47: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 48

Structure from motion ambiguity

[ ]iiii TRKM =jij PMp =

jPH 1j HM −

• SFM can be solved up to a N-degree of freedom ambiguity

• In the general case (nothing is known) the ambiguity is expressed by an arbitrary affine or projective transformation

( )( )jijij PHHMPMp -1 ==

Page 48: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 49

Affine ambiguity

Page 49: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 50

Prospective ambiguity

Page 50: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 51

Structure from motion ambiguity

• A scale (similarity) ambiguity exists even for calibrated cameras • For calibrated cameras, the similarity ambiguity is the only ambiguity • A reconstruction up to scale is called metric

[Longuet-Higgins ’81]

Page 51: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li

vTvtAProjective

15dof

Affine 12dof

Similarity 7dof

Euclidean 6dof

Preserves intersection and tangency

Preserves parallellism, volume ratios

Preserves angles, ratios of length

10tA

T

10tR

T

s

10tR

TPreserves angles, lengths

• With no constraints on the camera calibration matrix or on the scene, we get a projective reconstruction

• Need additional information to upgrade the reconstruction to affine, similarity, or Euclidean

Types of ambiguity

Page 52: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 53

Multiple view geometry - overview

1. Recover camera and geometry up to ambiguity • Use affine approximation if possible (affine SFM) • Algebraic methods • Factorization methods

2. Metric upgrade to obtain solution up to scale (remove perspective or affine ambiguity) – Self-calibration

3. Bundle adjustment (optimize solution across all observations)

Page 53: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 54

Affine structure from motion - a simpler problem

Image

Image

From the mxn correspondences pij, estimate: •m projection matrices Mi (affine cameras) •n 3D points Pj

p

p’

M = camera matrix

P

Page 54: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 55

Affine structure from motion - a simpler problem

P p

p’

;1

=+=

=

PbAPp M

vu [ ]bAM =

M = camera matrix

Image

Image

Page 55: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 56

Given m images of n fixed points Pj we can write

Problem: estimate the m 2×4 matrices Mi and the n positions Pj from the m×n correspondences pij .

2m × n equations in 8m+3n unknowns

Two approaches: - Algebraic approach (affine epipolar geometry; estimate F; cameras; points) - Factorization method

How many equations and how many unknown?

m cameras n points

Affine structure from motion - a simpler problem

Page 56: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 57

Factorization method C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method, 1992.

Two steps: • Centering the data • Factorization

( )

ji

n

kkji

n

kikiiji

n

kikijij

PPn

P

Pn

Ppn

pp

ˆ1

11ˆ

1

11

AA

bAbA

=

−=

+−+=−=

∑∑

=

==

Pk

pik pi ^

•Centering: subtract the centroid of the image points • Assume that the origin of the world coordinate system is at the centroid of the 3D points

Page 57: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 58

Factorization method C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method, 1992.

Two steps: • Centering the data • Factorization

jiij Pp A=ˆ

After centering, each normalized point pij is related to the 3D point Pi by

Page 58: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 59

Factorization method • Let’s create a 2m ×n data (measurement) matrix:

=

mnmm

n

n

ppp

pppppp

ˆˆˆ

ˆˆˆˆˆˆ

21

22221

11211

D cameras (2 m )

points (n )

Page 59: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 60

Factorization method • Let’s create a 2m × n data (measurement) matrix:

[ ]n

mmnmm

n

n

PPP

ppp

pppppp

212

1

21

22221

11211

ˆˆˆ

ˆˆˆˆˆˆ

=

=

A

AA

D

cameras (2 m × 3)

points (3 × n )

The measurement matrix D = M S has rank 3 (it’s a product of a 2mx3 matrix and 3xn matrix)

(2 m × n) M

S

Page 60: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 61

Factorization method

Source: M. Hebert

Page 61: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 62

Factorization method

Source: M. Hebert

• Singular value decomposition of D:

Page 62: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 63

Factorization method • Singular value decomposition of D:

Since rank (D)=3, there are only 3 non-zero singular values

Source: M. Hebert

Page 63: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 64

Factorization method • Obtaining a factorization from SVD:

Motion (cameras)

structure

What is the issue here? D has rank>3 because of - measurement noise - affine approximation

Source: M. Hebert

Page 64: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 65

Factorization method • Obtaining a factorization from SVD:

Motion (cameras) structure

Page 65: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 66

Factorization method

Source: M. Hebert

1. Given: m images and n features pij

2. For each image i, center the feature coordinates 3. Construct a 2m × n measurement matrix D:

– Column j contains the projection of point j in all views – Row i contains one coordinate of the projections of all the n

points in image i

4. Factorize D: – Compute SVD: D = U W VT

– Create U3 by taking the first 3 columns of U – Create V3 by taking the first 3 columns of V – Create W3 by taking the upper left 3 × 3 block of W

5. Create the motion and shape matrices: – M = M = U3 and S = W3V3

T (or U3W3½ and S = W3

½ V3T )

Page 66: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 67

Multiple view geometry - overview

• Recover camera and geometry up to ambiguity • Use affine approximation if possible (affine SFM) • Algebraic methods • Factorization methods

• Metric upgrade to obtain solution up to scale (remove perspective or affine ambiguity) – Self-calibration

• Bundle adjustment (optimize solution across all observations)

Page 67: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 68

Self-calibration

Condition N. Views •Constant internal parameters 3

•Aspect ratio and skew known •Focal length and offset vary

4*

•Aspect ratio and skew constant •Focal length and offset vary

5*

•skew =0, all other parameters vary 8*

•Prior knowledge on cameras or scene can be used to add constraints and remove ambiguities •Obtain metric reconstruction (up to scale)

[HZ] Chapters: 18

Page 68: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 69

Multiple view geometry - overview

• Recover camera and geometry up to ambiguity • Use affine approximation if possible (affine SFM) • Algebraic methods • Factorization methods

• Metric upgrade to obtain solution up to scale (remove ambiguity) – Self-calibration

• Bundle adjustment (optimize solution across all observations)

Page 69: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 70

Bundle adjustment

p1j

p2j

p3j

pj

P1

P2

P3

M1Pj

M2Pj M3Pj

?

• Non-linear method for refining structure and motion • Minimizing re-projection error • It can be used before or after metric upgrade

( )2

1 1,),( ∑∑

= =

=m

i

n

jjiij PMpDPME

[HZ] Chapters: 18

Page 70: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 71

Bundle adjustment

• Advantages • Handle large number of views • Handle missing data

• Limitations • Large minimization problem (parameters grow with number of views)

• Requires good initial condition

• Non-linear method for refining structure and motion • Minimizing re-projection error • It can be used before or after metric upgrade

( )2

1 1,),( ∑∑

= =

=m

i

n

jjiij PMpDPME

[HZ] Chapters: 18

Page 71: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 72

Stereo SDK vision software development kit A. Criminisi, A. Blake and D. Robertson

Page 72: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 73

Foreground/background segmentation using stereo V. Kolmogorov, A. Criminisi, A. Blake, G. Cross and C. Rother. Bi-layer segmentation of binocular stereo video CVPR 2005

http://research.microsoft.com/~antcrim/demos/ACriminisi_Recognition_CowDemo.wmv

Page 73: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 74

3D Urban Scene Modeling using a stereo system 3D Urban Scene Modeling Integrating Recognition and Reconstruction, N. Cornelis, B. Leibe, K. Cornelis, L. Van Gool, IJCV 08.

http://www.vision.ee.ethz.ch/showroom/index.en.html#

Page 74: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 75

Photo synth Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring photo collections in 3D," ACM Transactions on Graphics (SIGGRAPH Proceedings),2006,

http://phototour.cs.washington.edu/bundler/

Bundler is a structure-from-motion (SfM) system for unordered image collections (for instance, images from the Internet) written in C and C++.

Page 75: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 76

Additional references D. Nistér, PhD thesis ’01 See also 5-point algorithm

M. Pollefeys et al 98---

M. Brown and D. G. Lowe, 05

Page 76: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li 25-Oct-12 77

Two-frame stereo correspondence algorithms

http://vision.middlebury.edu/stereo/

Page 77: Lecture 10: Multi-view geometry - Artificial …vision.stanford.edu/.../lecture10_multi_view_cs231a.pdfLecture 10 - Fei-Fei Li 29 25-Oct-12 Dynamic Programming Pinhole perspective

Lecture 10 -

Fei-Fei Li

What we have learned today

• Review for stereo vision • Correspondence problem (Problem Set 2 (Q3)) • Active stereo vision systems (see Friday Oct 28

TA Section) • Structure from motion

25-Oct-12 78

Reading: [HZ] Chapters: 4, 9, 11 [FP] Chapters: 10