Structure from Motion - PUC-Riovisao/Structure from Motion.pdf · from the mn correspondences p ij....

Structure from Motion Raul Queiroz Feitosa


Structure from Motion

Raul Queiroz Feitosa


21-Jun-17 Structure from Motion 2

Objectives

State the Structure from Motion (SFM) problem,

Introduce SFM techniques for small relief (affine), and

Introduce SFM techniques for large relief (projective).


Contents

Introduction

Affine SFM

Projective SFM


Structure from motion

Automatic recovery of camera motion and scene structure from two or more images.

Also called automatic camera tracking or matchmoving.



Applications of SFM

Computer vision:

multiple-view shape reconstruction,

novel view synthesis,

autonomous vehicle navigation.

Film production:

insertion of computer-generated imagery (CGI) into live-action backgrounds.


Stereopsis × SFM

Stereopsis

Intrinsic camera parameters known (calibrated cameras), and

Extrinsic camera parameters known (camera pose relative to a fixed world coordinate system known).

Structure from Motion

Camera parameters are (partially) unknown and possibly change over time.


SFM problem formulation

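Given the images $p_{ij}$ of n fixed 3D points $P_j$ (in homogeneous coordinates) seen by m cameras,

\[ p_{ij} = \frac{1}{z_{ij}} M_i P_j, \qquad i = 1, \dots, m, \quad j = 1, \dots, n, \]

estimate the m projection matrices $M_i$ and the n 3D points $P_j$ from the mn correspondences $p_{ij}$.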


Structure from Motion

Contents:

Introduction

Affine SFM

Projective SFM


Affine approximation

If the scene's relief is small compared with the distance $z_j$ to the camera, the perspective projection may be approximated by affine models.

[Figure: as the focal length and the distance from the camera increase, the projection center moves to infinity.]


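Affine model

For affine cameras (see chapter on camera models), the previous equation takes the (nonhomogeneous) form

\[ p_{ij} = M_i \begin{pmatrix} P_j \\ 1 \end{pmatrix} = \mathcal{A}_i P_j + b_i, \]

where $M_i = (\mathcal{A}_i \;\; b_i)$ is the 2×4 affine projection matrix, $\mathcal{A}_i$ is 2×3, $P_j$ is 3×1, and $p_{ij}$ and $b_i$ are 2×1, for i = 1, …, m and j = 1, …, n.

Dropping i and j from the notation above yields

\[ p = \mathcal{A} P + b \quad\Longleftrightarrow\quad \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}. \]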
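As an illustration of the affine model, the NumPy sketch below checks numerically that the homogeneous form M (P; 1) and the nonhomogeneous form A P + b give the same image point. The matrix entries and the 3D point are arbitrary values chosen for this example; they are not data from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical affine camera: A is 2x3 and b is a 2-vector (illustrative values).
A = rng.normal(size=(2, 3))
b = rng.normal(size=2)
M = np.hstack([A, b[:, None]])        # M = (A b), a 2x4 matrix

P = np.array([0.5, -1.0, 2.0])        # one 3D point (x, y, z)

p_affine = A @ P + b                  # nonhomogeneous form: p = A P + b
p_matrix = M @ np.append(P, 1.0)      # homogeneous form: p = M (P; 1)
```

Both expressions produce the same 2-vector (u, v), which is why the slides can switch freely between the two notations.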


Affine model

Thus

\[ p_{ij} = M_i \begin{pmatrix} P_j \\ 1 \end{pmatrix} = \mathcal{A}_i P_j + b_i \]

involves 2mn equations in 8m + 3n unknowns.

Hence, for a sufficient number of views (m) and a sufficient number of points (n), camera motion ($M_i$) and scene structure ($P_j$) may be recovered.


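Affine ambiguity

If $M_i$ and $P_j$ are solutions to the affine SFM problem, so are $M'_i$ and $P'_j$, where

\[ M'_i = M_i Q \quad\text{and}\quad \begin{pmatrix} P'_j \\ 1 \end{pmatrix} = Q^{-1} \begin{pmatrix} P_j \\ 1 \end{pmatrix}, \]

and Q is an arbitrary affine transformation matrix, that is, it can be written as

\[ Q = \begin{pmatrix} C & d \\ 0^T & 1 \end{pmatrix} \quad\text{with}\quad Q^{-1} = \begin{pmatrix} C^{-1} & -C^{-1} d \\ 0^T & 1 \end{pmatrix}, \]

where C is a nonsingular 3×3 matrix and $d \in \mathbb{R}^3$.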


Affine ambiguity

The affine SFM problem can only be defined up to an affine transformation ambiguity.

Taking into account the 12 parameters of the general affine transformation, a solution may exist as long as

\[ 2mn \ge 8m + 3n - 12, \]

where 8m is the number of unknowns in the $M_i$ and 3n is the number of unknowns in the $P_j$.

Thus, for 2 views, we need 4 point correspondences.
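The counting argument can be checked with a few lines of plain Python. The helper name `min_points_affine` is made up for this sketch; it only evaluates the inequality 2mn ≥ 8m + 3n - 12 and says nothing about degenerate configurations, so the count is a necessary condition rather than a guarantee.

```python
def min_points_affine(m: int) -> int:
    """Smallest n satisfying 2mn >= 8m + 3n - 12 for m >= 2 views."""
    if m < 2:
        raise ValueError("at least two views are needed")
    n = 1
    while 2 * m * n < 8 * m + 3 * n - 12:
        n += 1
    return n


points_for_two_views = min_points_affine(2)
points_for_three_views = min_points_affine(3)
```

For m = 2 the loop stops at n = 4, which matches the figure quoted on the slide.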


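Offset to the center of mass

Translate the origin of the image coordinate system to the center of mass $p_{i0}$ of the observed points ($p_{ij} \to p_{ij} - p_{i0}$), by doing

\[ p_{ij} - \frac{1}{n} \sum_{k=1}^{n} p_{ik} = \mathcal{A}_i P_j + b_i - \frac{1}{n} \sum_{k=1}^{n} \left( \mathcal{A}_i P_k + b_i \right) = \mathcal{A}_i \left( P_j - \frac{1}{n} \sum_{k=1}^{n} P_k \right). \]

For simplicity, the origin of the world coordinate system is moved to the centroid $P_0$ of the 3D points ($P_j \to P_j - P_0$).

This yields

\[ p_{ij} = \mathcal{A}_i P_j. \]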
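A small numerical check of the centering step, assuming NumPy and randomly generated affine cameras and points (all values are synthetic): after subtracting the per-image centroid, the offsets b_i drop out and the centered image points equal A_i applied to the centered 3D points.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 6

A = rng.normal(size=(m, 2, 3))     # synthetic A_i, one 2x3 block per camera
b = rng.normal(size=(m, 2))        # synthetic offsets b_i
P = rng.normal(size=(3, n))        # synthetic 3D points, one per column

p = A @ P + b[:, :, None]          # p[i, :, j] = A_i P_j + b_i

p_centered = p - p.mean(axis=2, keepdims=True)   # subtract image centroid p_i0
P_centered = P - P.mean(axis=1, keepdims=True)   # subtract 3D centroid P_0

expected = A @ P_centered          # A_i (P_j - P_0), with no b_i term
```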


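The measurement matrix

Let's create a 2m × n data (measurement) matrix D, with one pair of rows per camera and one column per point:

\[ D = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mn} \end{pmatrix} = \begin{pmatrix} \mathcal{A}_1 \\ \mathcal{A}_2 \\ \vdots \\ \mathcal{A}_m \end{pmatrix} \begin{pmatrix} P_1 & P_2 & \cdots & P_n \end{pmatrix} = \mathcal{A} \mathcal{P}, \]

where D is 2m×n, $\mathcal{A}$ is 2m×3, and $\mathcal{P}$ is 3×n.

The measurement matrix $D = \mathcal{A}\mathcal{P}$ must have rank 3 (in the noise-free, non-degenerate case)!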


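Tomasi-Kanade factorization

Applying SVD to the matrix D gives $D = U W V^T$.

Since rank(D) = 3 (noise free),

\[ U = \begin{pmatrix} U_3 & U_{n-3} \end{pmatrix}, \qquad W = \begin{pmatrix} W_3 & 0 \\ 0 & 0 \end{pmatrix}, \qquad V^T = \begin{pmatrix} V_3^T \\ V_{n-3}^T \end{pmatrix}. \]

Therefore

\[ D = U_3 W_3 V_3^T = \mathcal{A} \mathcal{P}, \]

where D is 2m×n, $U_3$ is 2m×3, $W_3$ is 3×3, and $V_3^T$ is 3×n.

In practice, due to noise, inaccuracy in the location of the interest points, and the mere fact that cameras are not affine, the matrix D is full rank.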


Tomasi-Kanade factorization algorithm

1. Compute the singular value decomposition $D = U W V^T$.

2. Construct the matrices $U_3$, $V_3$, and $W_3$: $U_3$ and $V_3$ are formed by the three leftmost columns of the matrices U and V (the ones corresponding to the largest singular values), and $W_3$ is the corresponding 3×3 submatrix of W.

3. Define

\[ \mathcal{A}_0 = U_3 W_3^{1/2} \quad\text{and}\quad \mathcal{P}_0 = W_3^{1/2} V_3^T; \]

the 2m×3 matrix $\mathcal{A}_0$ is an estimate of the camera motion, and the 3×n matrix $\mathcal{P}_0$ is an estimate of the scene structure.

It can be proven that $\mathcal{A}_0 \mathcal{P}_0 = U_3 W_3 V_3^T$ is the closest rank-3 approximation of D (in the Frobenius norm).
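The three steps above can be sketched in NumPy as follows. This is a minimal illustration on synthetic, noise-free data, not a reference implementation; the function name `factorize` and the test matrices are assumptions made for the example. `np.linalg.svd` returns the singular values in descending order, so the first three columns and rows are the ones the algorithm asks for.

```python
import numpy as np

def factorize(D):
    """Rank-3 factorization D ~ A0 @ P0 of a centered 2m x n measurement matrix."""
    U, w, Vt = np.linalg.svd(D, full_matrices=False)   # D = U diag(w) Vt
    U3, w3, Vt3 = U[:, :3], w[:3], Vt[:3, :]           # keep the 3 largest singular values
    sqrt_W3 = np.diag(np.sqrt(w3))
    A0 = U3 @ sqrt_W3                                  # 2m x 3 motion estimate
    P0 = sqrt_W3 @ Vt3                                 # 3 x n structure estimate
    return A0, P0

# Synthetic, noise-free example: a rank-3 measurement matrix.
rng = np.random.default_rng(2)
m, n = 4, 10
A_true = rng.normal(size=(2 * m, 3))
P_true = rng.normal(size=(3, n))
P_true -= P_true.mean(axis=1, keepdims=True)           # centered 3D points
D = A_true @ P_true

A0, P0 = factorize(D)
```

In this noise-free setting `A0 @ P0` reproduces D, while `A0` and `P0` themselves generally differ from `A_true` and `P_true` by an unknown 3×3 matrix.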


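Recalling the affine ambiguity

The decomposition

\[ \mathcal{A}_0 = U_3 W_3^{1/2}, \qquad \mathcal{P}_0 = W_3^{1/2} V_3^T \]

is not unique.

For any nonsingular 3×3 matrix C,

\[ \mathcal{A} = \mathcal{A}_0 C \quad\text{and}\quad \mathcal{P} = C^{-1} \mathcal{P}_0 \]

is also a solution (affine ambiguity).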
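The non-uniqueness is easy to confirm numerically. In the sketch below (NumPy, with arbitrary stand-in matrices for the two factors and an arbitrary nonsingular C picked for the example), the product of the transformed factors is unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)
A0 = rng.normal(size=(8, 3))       # stand-in for the 2m x 3 motion factor
P0 = rng.normal(size=(3, 10))      # stand-in for the 3 x n structure factor

# Arbitrary nonsingular 3x3 matrix (determinant 7), chosen only for illustration.
C = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])

A = A0 @ C                         # transformed motion
P = np.linalg.solve(C, P0)         # C^{-1} P0 without forming the inverse explicitly
```

Since A @ P = A0 C C^{-1} P0 = A0 @ P0, the image measurements alone cannot distinguish the two solutions.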


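Affine camera models

Summary of affine mappings: $p = M\,{}^{W}\!P$, where each model's matrix M is built on

\[ \begin{pmatrix} r_1^T & t_x \\ r_2^T & t_y \\ 0^T & 1 \end{pmatrix}. \]

orthographic: 5 dof

scaled orthographic: 6 dof

weak-perspective: 7 dof

general affine: 8 dof

Each model after the orthographic one adds one degree of freedom: an overall scale (scaled orthographic), a separate scale for the second image axis (weak-perspective), and a skew angle θ that enters through cot θ and 1/sin θ (general affine).

Note that in all cases the 3rd row is (0 0 0 1)!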


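Eliminating the affine ambiguity

From a previous slide,

\[ p_{ij} = \mathcal{A}_i P_j, \]

so the image coordinates $p_{ij}$ are the projections of $P_j$ onto the vectors $a_{i1}^T$ and $a_{i2}^T$, the rows of $\mathcal{A}_i$.

Consider scaled orthographic projection: image axes are perpendicular and equally scaled.

\[ \begin{cases} a_{i1}^T a_{i2} = 0 \\ a_{i1}^T a_{i1} = a_{i2}^T a_{i2} \end{cases} \quad\Longleftrightarrow\quad \begin{cases} \hat{a}_{i1}^T C C^T \hat{a}_{i2} = 0 \\ \hat{a}_{i1}^T C C^T \hat{a}_{i1} = \hat{a}_{i2}^T C C^T \hat{a}_{i2} \end{cases} \]

Here $\hat{a}_{i1}^T$ and $\hat{a}_{i2}^T$ are the rows of $\mathcal{A}_0$ that belong to camera i, so that $a_{ik}^T = \hat{a}_{ik}^T C$.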


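Eliminating the affine ambiguity

Scaled orthographic: image axes are perpendicular.

\[ \begin{cases} a_{i1}^T a_{i2} = 0 \\ a_{i1}^T a_{i1} = a_{i2}^T a_{i2} \end{cases} \quad\Longleftrightarrow\quad \begin{cases} \hat{a}_{i1}^T C C^T \hat{a}_{i2} = 0 \\ \hat{a}_{i1}^T C C^T \hat{a}_{i1} = \hat{a}_{i2}^T C C^T \hat{a}_{i2} \end{cases} \]

This translates into 2m quadratic equations in the elements of C.

Solve for C.

Update $\mathcal{A}$ and $\mathcal{P}$: $\mathcal{A} = \mathcal{A}_0 C$ and $\mathcal{P} = C^{-1} \mathcal{P}_0$.

Note that the solution is defined up to an arbitrary rotation.

Common practice: first camera frame ≡ world frame.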
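The slides only say "solve for C". One common route, used here as an assumption rather than as the author's method, is to notice that the constraints are linear in the symmetric matrix L = C Cᵀ: solve the homogeneous linear system for the 6 distinct entries of L, then recover a C by Cholesky factorization. The sketch below does this in NumPy on synthetic scaled-orthographic cameras; `quad_row` and `metric_upgrade` are hypothetical helper names, at least 3 views are assumed, and with noisy data L may fail to be positive definite, in which case the Cholesky step raises an error.

```python
import numpy as np

def quad_row(x, y):
    """Coefficients of x^T L y in the 6 distinct entries of a symmetric 3x3 L."""
    return np.array([
        x[0] * y[0],
        x[0] * y[1] + x[1] * y[0],
        x[0] * y[2] + x[2] * y[0],
        x[1] * y[1],
        x[1] * y[2] + x[2] * y[1],
        x[2] * y[2],
    ])

def metric_upgrade(A0):
    """Return C such that each row pair of A0 @ C is perpendicular with equal norms."""
    rows = []
    for a1, a2 in zip(A0[0::2], A0[1::2]):
        rows.append(quad_row(a1, a2))                     # a1^T L a2 = 0
        rows.append(quad_row(a1, a1) - quad_row(a2, a2))  # a1^T L a1 = a2^T L a2
    _, _, Vt = np.linalg.svd(np.array(rows))
    l = Vt[-1]                                            # null vector of the system
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    if np.trace(L) < 0:                                   # fix the sign ambiguity
        L = -L
    return np.linalg.cholesky(L)                          # lower-triangular C, C C^T = L

# Synthetic test: scaled-orthographic cameras distorted by a known affine map H.
rng = np.random.default_rng(4)
blocks = []
for i in range(4):
    R, _ = np.linalg.qr(rng.normal(size=(3, 3)))          # random orthogonal matrix
    blocks.append(rng.uniform(0.5, 2.0) * R[:2])          # scale times two rows of R
A_true = np.vstack(blocks)
H = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]])
A0 = A_true @ H                                           # affinely distorted motion

C = metric_upgrade(A0)
A = A0 @ C                                                # upgraded motion matrix
```

The recovered C matches the true correction only up to an overall scale and a rotation, which is the residual ambiguity the slide mentions.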


Tomasi-Kanade algorithm summary

Given: m images and n features pij

For each image i, center the feature coordinates

Construct a 2m × n measurement matrix D:

Column j has the projection of point Pj in all views

Row i has one coordinate of the projections of all n points in image i

Factorize D :

Compute SVD: D = U W VT

Create U3 by taking the first 3 columns of U

Create V3 by taking the first 3 columns of V

Create W3 by taking the upper left 3 × 3 block of W

Create the motion and shape matrices:

$\mathcal{A}_0 = U_3 W_3^{1/2}$ and $\mathcal{P}_0 = W_3^{1/2} V_3^T$ (or $\mathcal{A}_0 = U_3$ and $\mathcal{P}_0 = W_3 V_3^T$)

Eliminate affine ambiguity.

Source: M. Hebert


Contents

Introduction

Affine SFM

Projective SFM


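Projective model

For projective cameras we have

\[ u_{ij} = \frac{m_{i1}^T P_j}{m_{i3}^T P_j}, \quad v_{ij} = \frac{m_{i2}^T P_j}{m_{i3}^T P_j} \qquad\text{or}\qquad p_{ij} = \begin{pmatrix} u_{ij} \\ v_{ij} \\ 1 \end{pmatrix} = \frac{1}{z_{ij}} M_i P_j \]

for i = 1, …, m and j = 1, …, n, where $m_{i1}^T$, $m_{i2}^T$, $m_{i3}^T$ are the rows of a 3×4 matrix $M_i$ and $P_j$ denotes the homogeneous coordinate vector of the point $P_j$ in some world coordinate system.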


Projective ambiguity

\[ u_{ij} = \frac{m_{i1}^T P_j}{m_{i3}^T P_j}, \quad v_{ij} = \frac{m_{i2}^T P_j}{m_{i3}^T P_j} \qquad\text{or}\qquad p_{ij} = \begin{pmatrix} u_{ij} \\ v_{ij} \\ 1 \end{pmatrix} = \frac{1}{z_{ij}} M_i P_j \]

According to Theorem 1 (see chapter on Camera Model), $M_i$ is an arbitrary rank-3 3×4 matrix.

Hence, if $M_i$ and $P_j$ are solutions for the equation above, so are

\[ M'_i = M_i Q \quad\text{and}\quad P'_j = Q^{-1} P_j, \]

where Q is an arbitrary projective transformation matrix, i.e., any nonsingular 4×4 matrix.
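A short NumPy check of the projective ambiguity, using a made-up camera matrix, point, and nonsingular Q (all values are illustrative): the transformed pair projects to exactly the same image point.

```python
import numpy as np

def project(M, P):
    """Image point (u, v) of homogeneous 3D point P under the 3x4 matrix M."""
    x = M @ P
    return x[:2] / x[2]

rng = np.random.default_rng(5)
M = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])]) \
    + 0.05 * rng.normal(size=(3, 4))               # illustrative projection matrix
P = np.array([0.4, -0.3, 1.2, 1.0])                # illustrative homogeneous point

# Arbitrary nonsingular 4x4 matrix (determinant 4), chosen only for the example.
Q = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 2.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 3.0]])

M_prime = M @ Q                                    # M' = M Q
P_prime = np.linalg.solve(Q, P)                    # P' = Q^{-1} P

p = project(M, P)
p_prime = project(M_prime, P_prime)
```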


Projective ambiguity

The 4×4 matrix Q is defined up to a scale, since multiplying it by a nonzero scalar amounts to applying inverse scalings to $M_i$ and $P_j$.

Thus, the projective SFM problem can only be defined up to a projective transformation ambiguity.

Taking into account the projective ambiguity, the problem has a finite number of solutions as soon as

\[ 2mn \ge 11m + 3n - 15, \]

where 11m is the number of unknowns in the $M_i$ and 3n is the number of unknowns in the $P_j$.

For two views, we need seven point correspondences.
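The two-view figure follows directly from the inequality. Each 3×4 matrix has 12 entries defined up to scale (11 unknowns), and the 4×4 matrix Q has 16 entries defined up to scale (15 parameters). The arithmetic, with the three-view case added for comparison, is:

```latex
\begin{align*}
m = 2:&\quad 4n \ge 22 + 3n - 15 \;\Longrightarrow\; n \ge 7,\\
m = 3:&\quad 6n \ge 33 + 3n - 15 \;\Longrightarrow\; 3n \ge 18 \;\Longrightarrow\; n \ge 6.
\end{align*}
```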


Bilinear projection SFM

As in algebraic reconstruction, we do

\[ z_{ij}\, p_{ij} = M_i P_j \quad\Longleftrightarrow\quad p_{ij} \times M_i P_j = 0. \]

So, we look for a solution that minimizes

\[ E = \sum_{i,j} \left\| p_{ij} \times M_i P_j \right\|^2. \]

This can be done by alternating two steps: the $P_j$ are kept constant while the $M_i$ are estimated, and then the $M_i$ are kept constant while the $P_j$ are estimated.
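One half of the alternation can be sketched as follows, assuming NumPy: with the matrices M_i held fixed, the constraint p_ij × M_i P_j = 0 is linear in P_j, so each point can be estimated from the null vector of a small stacked system. The function name `estimate_point` and the synthetic cameras are assumptions made for this example; the other half (estimating each M_i with the points fixed) is linear in the same way.

```python
import numpy as np

def estimate_point(Ms, uv):
    """Estimate one homogeneous point P from its images in m views.

    Ms: (m, 3, 4) projection matrices, uv: (m, 2) image coordinates.
    Uses two independent components of p x (M P) = 0 per view.
    """
    rows = []
    for M, (u, v) in zip(Ms, uv):
        rows.append(u * M[2] - M[0])      # u (m3 . P) - m1 . P = 0
        rows.append(v * M[2] - M[1])      # v (m3 . P) - m2 . P = 0
    _, _, Vt = np.linalg.svd(np.array(rows))
    P = Vt[-1]                            # right singular vector of the smallest singular value
    return P / P[3]                       # normalize the last coordinate to 1

# Synthetic check with three illustrative cameras and one known point.
rng = np.random.default_rng(7)
Ms = np.stack([
    np.hstack([np.eye(3), np.array([[float(i)], [0.0], [5.0]])])
    + 0.05 * rng.normal(size=(3, 4))
    for i in range(3)
])
P_true = np.array([0.3, -0.2, 1.0, 1.0])
proj = Ms @ P_true                        # shape (m, 3)
uv = proj[:, :2] / proj[:, 2:3]           # exact image coordinates

P_est = estimate_point(Ms, uv)
```

With exact measurements the estimate coincides with the true point; with noise it minimizes the algebraic error E for that point only.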


Bundle Adjustment

Given estimates for the matrices $M_i$ (i = 1, …, m) and vectors $P_j$ (j = 1, …, n), refine them by using nonlinear least squares to minimize the global reprojection error E:

\[ E = \frac{1}{mn} \sum_{i,j} \left[ \left( u_{ij} - \frac{m_{i1}^T P_j}{m_{i3}^T P_j} \right)^2 + \left( v_{ij} - \frac{m_{i2}^T P_j}{m_{i3}^T P_j} \right)^2 \right]. \]

Expensive, but encompasses a physically significant error measure.

As initial guesses, use the $P_j$ from affine SFM and compute the $M_i$ from them.
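The error measure itself is simple to evaluate. The NumPy sketch below computes E for given cameras, points, and observations; the function name `reprojection_error`, the array layout, and the synthetic data are assumptions made for the example. An actual bundle adjustment would pass the corresponding residuals to a nonlinear least-squares solver, which is not shown here.

```python
import numpy as np

def reprojection_error(Ms, Ps, uv):
    """Mean squared reprojection error E.

    Ms: (m, 3, 4) projection matrices, Ps: (4, n) homogeneous points,
    uv: (m, 2, n) observed image coordinates.
    """
    proj = Ms @ Ps                                  # (m, 3, n) homogeneous projections
    pred = proj[:, :2, :] / proj[:, 2:3, :]         # divide by m_i3^T P_j
    m, _, n = uv.shape
    return float(np.sum((uv - pred) ** 2) / (m * n))

# Synthetic data: illustrative cameras looking at points near the origin.
rng = np.random.default_rng(6)
m, n = 3, 8
base = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
Ms = base + 0.05 * rng.normal(size=(m, 3, 4))
Ps = np.vstack([rng.uniform(-1.0, 1.0, size=(3, n)), np.ones((1, n))])

proj = Ms @ Ps
uv_exact = proj[:, :2, :] / proj[:, 2:3, :]
E_exact = reprojection_error(Ms, Ps, uv_exact)      # perfect observations

uv_shifted = uv_exact.copy()
uv_shifted[:, 0, :] += 0.1                          # shift every u by 0.1
E_shifted = reprojection_error(Ms, Ps, uv_shifted)
```

Shifting every u coordinate by 0.1 adds 0.1² to each term of the sum, so the mean error becomes 0.01, which illustrates that E is measured in squared image units.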


Eliminating the projective ambiguity

Let us assume that the matrices $M_i$ (i = 1, …, m) and vectors $P_j$ (j = 1, …, n) have been estimated.

If $\hat{M}_i$ and $\hat{P}_j$ are a reconstruction in a particular Euclidean coordinate system, there exists a 4×4 matrix Q such that

\[ \hat{M}_i = M_i Q \quad\text{and}\quad \hat{P}_j = Q^{-1} P_j. \]

The next slides show two methods of computing the Euclidean upgrade matrix Q when (some of) the intrinsic parameters are known.


Eliminating the projective ambiguity

Method 1 (calibrated cameras):

$\hat{M}_i$ is defined up to a scale, so

\[ \hat{M}_i = \rho_i K_i \begin{pmatrix} R_i & t_i \end{pmatrix}, \]

where $\rho_i$ accounts for the scale and $K_i$ is the calibration matrix.

Writing $Q = (Q_3 \;\; q_4)$, where $Q_3$ is a 4×3 matrix and $q_4$ is a vector in $\mathbb{R}^4$, yields

\[ M_i Q_3 = \rho_i K_i R_i. \]

If $K_i$ is known (calibrated),

\[ K_i^{-1} M_i Q_3 = \rho_i R_i \]

can be treated as a scaled rotation matrix. Thus …


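Eliminating the projective ambiguity

Method 1 (cont.): … the rows of $K_i^{-1} M_i Q_3$ are perpendicular vectors with equal norms. Writing $m_{ij}^T$ (j = 1, 2, 3) for the rows of $K_i^{-1} M_i$, we get

\[ \begin{cases} m_{i1}^T Q_3 Q_3^T m_{i2} = 0 \\ m_{i2}^T Q_3 Q_3^T m_{i3} = 0 \\ m_{i3}^T Q_3 Q_3^T m_{i1} = 0 \\ m_{i1}^T Q_3 Q_3^T m_{i1} - m_{i2}^T Q_3 Q_3^T m_{i2} = 0 \\ m_{i2}^T Q_3 Q_3^T m_{i2} - m_{i3}^T Q_3 Q_3^T m_{i3} = 0 \end{cases} \]

where $Q_3$ is defined up to a scale. To determine it uniquely, we assume that the world coordinate system coincides with the first camera's frame, that is, $K_1^{-1} \hat{M}_1 = [\mathrm{Id} \;\; 0]$.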


Eliminating the projective ambiguity

Method 1 (cont.):

Given m images, we obtain 5(m-1) quadratic equations in the coefficients of Q that can be solved using nonlinear least squares.

The vector $q_4$ of the matrix $Q = (Q_3 \;\; q_4)$ can be determined by assuming (arbitrarily) that the origin of the world coordinate system and the origin of the first camera's frame coincide.


Eliminating the projective ambiguity

Method 2:

The previous equations are linear in the 10 coefficients of the symmetric matrix $S = Q_3 Q_3^T$ (linear least squares).

To enforce the rank 3 of S, we do

\[ S = U \Lambda U^T, \]

where Λ is the diagonal matrix with the eigenvalues of S and U is the orthogonal matrix formed by its eigenvectors.


Eliminating the projective ambiguity

Method 2 (cont.):

To enforce rank 3 of $S = Q_3 Q_3^T$, we do

\[ Q_3 = U_3 \Lambda_3^{1/2}, \]

where $U_3$ is formed by the 3 columns of U associated with the 3 largest eigenvalues of S and $\Lambda_3$ is the corresponding submatrix of Λ.

Vector $q_4$ can be computed as in the previous method.


Eliminating the projective ambiguity

These methods can be adapted to the case where only some of the intrinsic camera parameters are known.

Since the $R_i$ are orthogonal matrices,

\[ M_i S M_i^T = \rho_i^2 K_i K_i^T, \qquad S = Q_3 Q_3^T, \]

so each image provides a set of constraints between the entries of $K_i$ and S.


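Eliminating the projective ambiguity

Assuming, for example, that the center of the image is known for each camera, we can take $u_0 = v_0 = 0$ and

\[ K_i K_i^T = \begin{pmatrix} \alpha_i^2 \dfrac{1}{\sin^2\theta_i} & -\alpha_i \beta_i \dfrac{\cos\theta_i}{\sin^2\theta_i} & 0 \\ -\alpha_i \beta_i \dfrac{\cos\theta_i}{\sin^2\theta_i} & \beta_i^2 \dfrac{1}{\sin^2\theta_i} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

The zero entries of $K_i K_i^T$ provide 2 independent linear equations,

\[ m_{i1}^T S\, m_{i3} = 0 \quad\text{and}\quad m_{i2}^T S\, m_{i3} = 0, \]

in the 10 coefficients of the symmetric 4×4 matrix $S = Q_3 Q_3^T$, where $m_{ij}^T$ denotes row j of $M_i$. With m ≥ 5 images, these parameters can be estimated via linear least squares.

What if zero skew can be assumed?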


Readings

C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. International Journal of Computer Vision, 9(2):137-154, November 1992.

F. Rothganger, S. Lazebnik, C. Schmid, and J. Ponce. Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects. PAMI 2007.

S. Mahamud and M. Hebert. Iterative Projective Reconstruction from Multiple Views. Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2000, pp. II, 430-437.

D. Forsyth and J. Ponce. Computer Vision: A Modern Approach, 2nd ed., 2012, chapter 8.


Next Topic

Kalman Filter