Transcript of Lecture 12, Andrew Fitzgibbon: 3D Vision in a Changing World
PEOPLE
Unwrap Mosaics: A New Representation for Video Editing. SIGGRAPH 2008. Alex Rav-Acha, Pushmeet Kohli, Carsten Rother and Andrew Fitzgibbon.
What Shape Are Dolphins? Building 3D Morphable Models from 2D Images. PAMI 2013. Tom Cashman, Andrew Fitzgibbon.
User-Specific Hand Modeling from Monocular Depth Sequences. CVPR 2014. Jonathan Taylor, Richard Stebbing, Varun Ramakrishna, Cem Keskin, Jamie Shotton, Shahram Izadi, Andrew Fitzgibbon, Aaron Hertzmann.
Real-Time Non-Rigid Reconstruction Using an RGB-D Camera. SIGGRAPH 2014. Michael Zollhöfer, Matthias Nießner, Shahram Izadi, Christoph Rehmann, Christopher Zach, Matthew Fisher, Chenglei Wu, Andrew Fitzgibbon, Charles Loop, Christian Theobalt, Marc Stamminger.
GOAL
Recover 3D shape from one or more images or videos in dynamic scenes, without dense (>50) point correspondences.
EARLY WORK
! 1998: we computed a decent 3D reconstruction of a 36-frame sequence
! Giving 3D super-resolution
! And set ourselves the goal of solving a 1500-frame sequence
! Leading to…
[FCZ98] Fitzgibbon, Cross & Zisserman, SMILE 1998
I’M NOT GOING TO TELL YOU WHAT YOU WILL INVENT…
…I AM GOING TO TRY TO TELL YOU HOW TO DO IT.
The future of computer vision: YOU!
HOW TO INVENT THE FUTURE
Say WHAT you want to do, not HOW you're going to do it.
Know the difference between MODEL and ALGORITHM.
MODEL, NOT ALGORITHM
Model: describes how your data came into being.
  For $i = 1{:}n$: $\nu_i \sim \mathrm{Gaussian}(0,1)$, $x_i = c + \nu_i$
  i.e. $p(x_i) = (2\pi)^{-1/2} \exp\!\big({-\tfrac12 (x_i - c)^2}\big)$
Algorithm: describes how you will find the model parameters.
  "Compute the mean": $c = \frac{1}{n}\sum_i x_i$
MODEL, NOT ALGORITHM
Model: describes how your data came into being.
  For $i = 1{:}n$: $\nu_i \sim \mathrm{Gaussian}(0,\sigma)$, $x_i = c + \nu_i$
  i.e. $p(x_i) = (2\pi\sigma^2)^{-1/2} \exp\!\big({-\tfrac{1}{2\sigma^2} (x_i - c)^2}\big)$
Algorithm: describes how you will find the model parameters.
  "Mean and variance": $c = \frac{1}{n}\sum_i x_i$, $\sigma = \big(\tfrac{1}{n}\sum_i (x_i - c)^2\big)^{1/2}$
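The model/algorithm split on this slide can be made concrete in a few lines of Python (a minimal sketch; the helper name `fit_gaussian` is ours, not from the lecture). The generative loop is the model; the closed-form estimator is the algorithm.

```python
import math
import random

def fit_gaussian(xs):
    """Maximum-likelihood estimate of (c, sigma) for the model
    x_i = c + nu_i, nu_i ~ Gaussian(0, sigma): the sample mean
    and the (biased) sample standard deviation."""
    n = len(xs)
    c = sum(xs) / n
    sigma = math.sqrt(sum((x - c) ** 2 for x in xs) / n)
    return c, sigma

# Simulate data from the model, then recover its parameters.
random.seed(0)
data = [3.0 + random.gauss(0.0, 0.5) for _ in range(10000)]
c, sigma = fit_gaussian(data)
```

With 10,000 samples the estimates land close to the true $c = 3$ and $\sigma = 0.5$.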
GAUSSIAN MIXTURE
Model: describes how your data came into being.
  For $i = 1{:}n$: $k \sim \mathrm{Categorical}(\boldsymbol{\alpha})$, $\boldsymbol{x}_i \sim \mathrm{Gaussian}(\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$
  i.e. $p(\boldsymbol{x}_i) = \sum_k \alpha_k\, |2\pi\Sigma_k|^{-1/2} \exp\!\big({-\tfrac12 (\boldsymbol{x}_i - \boldsymbol{\mu}_k)^\top \Sigma_k^{-1} (\boldsymbol{x}_i - \boldsymbol{\mu}_k)}\big)$
Algorithm: describes how you will find the model parameters.
  "EM"
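"EM" on this slide can be sketched for a 1-D, two-component mixture in plain Python (an illustrative toy, not the lecture's implementation; a serious version would guard against component collapse and use log-space arithmetic):

```python
import math
import random

def em_gmm_1d(xs, iters=200):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch).
    Returns (alphas, mus, sigmas)."""
    # Crude initialisation from the data range.
    mus = [min(xs), max(xs)]
    sigmas = [1.0, 1.0]
    alphas = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibilities r[i][k] = p(k | x_i).
        r = []
        for x in xs:
            w = [alphas[k] / (sigmas[k] * math.sqrt(2 * math.pi))
                 * math.exp(-0.5 * ((x - mus[k]) / sigmas[k]) ** 2)
                 for k in range(2)]
            s = sum(w)
            r.append([wk / s for wk in w])
        # M-step: weighted mean, variance, and mixing weights.
        for k in range(2):
            nk = sum(ri[k] for ri in r)
            mus[k] = sum(ri[k] * x for ri, x in zip(r, xs)) / nk
            sigmas[k] = math.sqrt(
                sum(ri[k] * (x - mus[k]) ** 2 for ri, x in zip(r, xs)) / nk)
            alphas[k] = nk / len(xs)
    return alphas, mus, sigmas

# Two well-separated clusters are recovered from unlabelled samples.
random.seed(1)
xs = ([random.gauss(-2.0, 0.5) for _ in range(500)]
      + [random.gauss(3.0, 1.0) for _ in range(500)])
alphas, mus, sigmas = em_gmm_1d(xs)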
WHAT, WHAT, HOW

| Model $p(x_i)$ | Objective | Algorithm |
| $\mathcal{N}(x_i;\, c, 1)$ | $\min_c \sum_i (x_i - c)^2$ | mean ("closed form") |
| $\mathcal{N}(x_i;\, c, \sigma)$ | $\min_{c,\sigma} \sum_i \big( (x_i - c)^2/\sigma^2 + \log \sigma^2 \big)$ | mean & variance ("closed form") |
| $\sum_k \alpha_k \mathcal{N}(x_i;\, \mu_k, \Sigma_k)$ | $\max_{\alpha_{1..K},\, \mu_{1..K},\, \Sigma_{1..K}} \sum_i \log p(x_i)$ | GMM-EM, Bishop '96 ("just do it") |
HOW TO MINIMIZE $E(c,\sigma)$
$E(c,\sigma) = \sum_i \big( (x_i - c)^2/\sigma^2 + \log \sigma^2 \big)$
! Set $\partial E/\partial c = 0$, $\partial E/\partial \sigma = 0$
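Worked out, the two stationarity conditions recover exactly the "mean and variance" algorithm from the earlier slide:

```latex
\frac{\partial E}{\partial c} = -\frac{2}{\sigma^2}\sum_{i=1}^{n} (x_i - c) = 0
\;\Longrightarrow\; c = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
\frac{\partial E}{\partial \sigma} = -\frac{2}{\sigma^3}\sum_{i=1}^{n} (x_i - c)^2 + \frac{2n}{\sigma} = 0
\;\Longrightarrow\; \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - c)^2
```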
AH, BUT I KNOW MORE: PRIOR ON $\sigma$
Model: $\sigma \sim \mathrm{Exponential}(\lambda = 0.5)$; for $i = 1{:}n$: $\nu_i \sim \mathrm{Gaussian}(0,\sigma)$, $x_i = c + \nu_i$
$E(c,\sigma) = \sum_i \big( (x_i - c)^2/\sigma^2 + \log \sigma^2 \big) + \lambda\sigma$
HOW BIG CAN N BE?
! Problem sizes can range from a handful of parameters to about 1 million.
! First consider examples for which N=2 to make things easy to visualize.
! Extension to higher dimensions follows trivially, although not all algorithms work well for all problem sizes.
DERIVATIVES
! Can be computed:
" symbolically: use Maple/Mathematica
" numerically, by "finite differences":
  $\frac{\partial}{\partial x} f(x,y) = \lim_{\delta \to 0} \frac{f(x+\delta,\, y) - f(x,y)}{\delta}$
" Numerical derivatives are surprisingly accurate, but generally very expensive: use them only to check the symbolic ones.
! If you can't compute derivatives, use Powell's direction set method.
" Probably best not to use downhill simplex (fminsearch) unless the function is very noisy.
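The "check the symbolic ones" advice can be sketched as a small Python checker (hypothetical helper `fd_check`, not from the lecture), here using central differences, which are more accurate than the one-sided formula above:

```python
import math

def fd_check(f, grad, x, delta=1e-6, tol=1e-4):
    """Check a symbolic gradient against central finite differences.
    f: R^n -> R, grad: R^n -> list of n partials, x: list of floats."""
    for i in range(len(x)):
        xp = list(x); xp[i] += delta
        xm = list(x); xm[i] -= delta
        fd = (f(xp) - f(xm)) / (2 * delta)  # central difference
        if abs(fd - grad(x)[i]) > tol * max(1.0, abs(fd)):
            return False
    return True

# Example: the slide's E for a single datum x0, as a function of (c, sigma).
x0 = 2.0
f = lambda p: (x0 - p[0]) ** 2 / p[1] ** 2 + math.log(p[1] ** 2)
grad = lambda p: [-2 * (x0 - p[0]) / p[1] ** 2,
                  -2 * (x0 - p[0]) ** 2 / p[1] ** 3 + 2 / p[1]]
ok = fd_check(f, grad, [0.5, 1.5])
```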
CONDITIONING
! Be aware that these algorithms make some assumptions about the behaviour of your function:
" try to ensure the minimum is in the unit cube, and start at a corner
" keep all numbers "around 1": don't use pixel values of O(500)
OPTIMIZERS IN MATLAB
! lsqnonlin, fminunc, fmincon
! Use the most specific solver that fits your problem
! Check options.LargeScale!
"CLOSED FORM"
Direct Least Square Fitting of Ellipses. A. Fitzgibbon, M. Pilu, R. B. Fisher. Pattern Analysis and Machine Intelligence, 1999. (~1800)
Simultaneous Linear Estimation of Multiple View Geometry and Lens Distortion. A. Fitzgibbon. Computer Vision and Pattern Recognition, 2001. (~200)
TWO WRONG THINGS
! The real world is not Gaussian in many ways…
" Perhaps most crucially for us, in robust estimation
[Figure: Cauchy and Laplace densities]
WHAT, WHAT, HOW

| Model $p(x_i)$ | Objective | Algorithm |
| $\mathcal{N}(x_i;\, c, 1)$ | $\sum_i (x_i - c)^2$ | mean |
| $\mathrm{Cauchy}(x_i;\, c, \gamma)$ | $\sum_i \log\big(1 + (x_i - c)^2/\gamma^2\big) + f(\gamma)$ | ? |
| $\mathrm{Laplace}(x_i;\, c)$ | $\sum_i |x_i - c|$ | ? |
WHAT, WHAT, HOW

| Model | Algorithm |
| $x_i = c + n$ | mean, median, trimmed mean |
| $x_i = R y_{c(i)} + T$ | "ICP" |
| $\boldsymbol{x}_i = A\boldsymbol{v}_i,\ \boldsymbol{x}_i \in \mathbb{R}^N,\ \boldsymbol{v}_i \in \mathbb{R}^d$ | Principal Components Analysis |
| $\boldsymbol{x}_i = \mathrm{diag}(\mathrm{weights}_i)\, A\boldsymbol{v}_i$ | PCA with missing data, "imputation" |
| $\boldsymbol{x}_i = f(\theta_i)\, A\boldsymbol{v}_i$ | Transformed Components Analysis |
| GMM | "EM" |
HOW DO I DO IT?
Non-Rigid Structure from Motion. C. Bregler, L. Torresani, A. Hertzmann, H. Biermann. CVPR 2000 – PAMI 2008.
Derive $S = PX$, and factorize.

$\begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T1} & x_{T2} & \cdots & x_{Tm} \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_m \end{bmatrix}$, with each $P_i$: $2\times 4$ and each $X_j$: $4\times 1$.

Rows are indexed by frame number (1 to $2T$), columns by track number (1 to ntracks). (For this example: ntracks = 1135, T = 227.)
AFFINE STRUCTURE FROM MOTION
! 3D point $X_j = [x_j\ y_j\ z_j\ 1]^\top$
! Affine camera $P_i = \begin{bmatrix} p_{i11} & p_{i12} & p_{i13} & p_{i14} \\ p_{i21} & p_{i22} & p_{i23} & p_{i24} \end{bmatrix}$
! 2D point $x_{ij} = P_i X_j$
AFFINE STRUCTURE FROM MOTION
! Stacking one track over all frames gives a $2T$-D vector:
  $\begin{bmatrix} x_{1j} \\ x_{2j} \\ \vdots \\ x_{Tj} \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} X_j$
! Stacking all $m$ tracks gives the measurement matrix:
  $\begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T1} & x_{T2} & \cdots & x_{Tm} \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_m \end{bmatrix}$
! Matrix factorization: $S = PX$, the same equations as PCA.
! Hint 1: Do not use SVD.
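One reason not to use SVD is that real track matrices have missing entries, and alternation handles those naturally. A minimal Python sketch (rank-1 for brevity; affine SfM uses rank 3–4, and `rank1_factorize` is our hypothetical helper, not the lecture's code):

```python
def rank1_factorize(S, mask, iters=200):
    """Alternating least squares for a rank-1 factorization S ~ p q^T,
    using only the observed entries (mask[i][j] True). Plain SVD cannot
    restrict itself to observed entries like this."""
    T, m = len(S), len(S[0])
    p, q = [1.0] * T, [1.0] * m
    for _ in range(iters):
        for i in range(T):   # fix q: 1-D least squares for each p[i]
            num = sum(S[i][j] * q[j] for j in range(m) if mask[i][j])
            den = sum(q[j] ** 2 for j in range(m) if mask[i][j])
            p[i] = num / den
        for j in range(m):   # fix p: 1-D least squares for each q[j]
            num = sum(S[i][j] * p[i] for i in range(T) if mask[i][j])
            den = sum(p[i] ** 2 for i in range(T) if mask[i][j])
            q[j] = num / den
    return p, q

# Rank-1 ground truth with a third of the entries hidden; the hidden
# entries are recovered by the factorization.
pt = [1.0, 2.0, 3.0, 0.5, 1.5, 2.5]
qt = [1.0, 0.5, 2.0, 1.5, 3.0, 0.25, 1.25, 0.75]
S = [[a * b for b in qt] for a in pt]
mask = [[(i + j) % 3 != 0 for j in range(8)] for i in range(6)]
p, q = rank1_factorize(S, mask)
err = max(abs(p[i] * q[j] - S[i][j]) for i in range(6) for j in range(8))
```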
LINEAR SHAPE BASIS
Derive $M = P\big(\sum_k \alpha_k B_k\big)$, and factorize:

$\begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T1} & x_{T2} & \cdots & x_{Tm} \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_m \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} \times_2 \begin{bmatrix} \alpha_1^1 & \cdots & \alpha_1^K \\ \vdots & & \vdots \\ \alpha_m^1 & \cdots & \alpha_m^K \end{bmatrix} \times_? \begin{bmatrix} B_1 & B_2 & \cdots & B_K \end{bmatrix}$

$\mathcal{X}_i = \alpha_{i0}\mathcal{B}_0 + \alpha_{i1}\mathcal{B}_1 + \alpha_{i2}\mathcal{B}_2 + \cdots$
EMBEDDING
$S_{:,i} = \pi(X_i)$, $\pi : \mathbb{R}^r \mapsto \mathbb{R}^{2T}$
Orthographic: linear (in $X$) embedding in $\mathbb{R}^4$. Perspective: (slightly) nonlinear embedding in $\mathbb{R}^3$. Previous work on the nonrigid case: embed into $\mathbb{R}^{3K}$.
INTERLUDE: WHAT IS A SURFACE?
! Surface: a mapping $M(\boldsymbol{u})$ from $\mathbb{R}^2 \mapsto \mathbb{R}^3$
" E.g. cylinder $M(u,v) = (\cos u, \sin u, v)$
! Probably not all of $\mathbb{R}^2$, but a subset $\Omega$
" E.g. square $\Omega = [0, 2\pi) \times [0, H]$
! And we'll look at parameterised surfaces $M(\boldsymbol{u};\Theta)$
" E.g. cylinder $M(u,v;R,H) = (R\cos u, R\sin u, Hv)$ with $\Omega = [0,2\pi)\times[0,1]$
*The surface is actually the set $\{M(\boldsymbol{u};\Theta) \mid \boldsymbol{u}\in\Omega\}$.
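The parameterised cylinder above can be written directly as code (a trivial sketch, but it shows the pattern: a surface is just a function of $(u, v)$ and shape parameters $\Theta$):

```python
import math

def cylinder(u, v, R, H):
    """Parameterised cylinder M(u, v; R, H) = (R cos u, R sin u, H v),
    with domain Omega = [0, 2*pi) x [0, 1]."""
    return (R * math.cos(u), R * math.sin(u), H * v)

# Every point of the surface sits at distance R from the z-axis,
# with height between 0 and H.
R, H = 2.0, 5.0
pts = [cylinder(u, v, R, H)
       for u in [0.0, 1.0, 3.0] for v in [0.0, 0.5, 1.0]]
radii = [math.hypot(x, y) for x, y, _ in pts]
```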
EMBEDDING
$S_{:,i} = \pi(X_i)$, $\pi : \mathbb{R}^r \mapsto \mathbb{R}^{2T}$
Orthographic: linear (in $X$) embedding in $\mathbb{R}^4$. Perspective: (slightly) nonlinear embedding in $\mathbb{R}^3$. Previous work on the nonrigid case: embed into $\mathbb{R}^{3K}$.
Our big idea: surfaces are mappings $\mathbb{R}^2 \mapsto \mathbb{R}^3$, so embed (nonlinearly) into $\mathbb{R}^2$.
RIGID MODEL
It is easy to make a rough rigid model. But we need to: 1. match it to the images; 2. learn how it moves.
MODEL REPRESENTATION
$\mathcal{X}_i = \sum_{k=0}^{K} \alpha_{ik} \mathcal{B}_k$
$\mathcal{X}_i = \alpha_{i0}\mathcal{B}_0 + \alpha_{i1}\mathcal{B}_1 + \alpha_{i2}\mathcal{B}_2 + \cdots$
Linear blend shapes: image $i$ is represented by the coefficient vector $\boldsymbol{\alpha}_i = [\alpha_{i1}, \ldots, \alpha_{iK}]$.
DATA TERMS
Image $i$: silhouette points and normals $\boldsymbol{s}_{ij}, \boldsymbol{n}_{ij}$
Linear blend shapes model: $\boldsymbol{X}_i = \sum_k \alpha_{ik} \boldsymbol{B}_k$
Silhouette: $E_i^{\mathrm{sil}} = \sum_{j=1}^{S_i} \|\boldsymbol{s}_{ij} - \pi(\theta_i, M(\boldsymbol{u}_{ij}, \boldsymbol{X}_i))\|^2$
Normal: $E_i^{\mathrm{norm}} = \sum_{j=1}^{S_i} \left\| \begin{bmatrix}\boldsymbol{n}_{ij}\\ 0\end{bmatrix} - R_i\, N(\boldsymbol{u}_{ij}, \boldsymbol{X}_i) \right\|^2$
APPLICATIONS
Curve/surface fitting, parameter estimation, "bundle adjustment". (Video from our friends at G)
! Given some data
" 2D points, 3D points, silhouettes, shading, …
! Fit a 3D surface
" under appropriate priors, e.g. spatial/temporal smoothness
" and possibly other parameters: camera/light positions, BRDF, …
GOAL

SURFACE
! Surface: a mapping $M(\boldsymbol{u})$ from $\mathbb{R}^2 \mapsto \mathbb{R}^3$
" E.g. cylinder $M(u,v) = (\cos u, \sin u, v)$
! Probably not all of $\mathbb{R}^2$, but a subset $\Omega$
" E.g. square $\Omega = [0, 2\pi) \times [0, H]$
! And we'll look at parameterised surfaces $M(\boldsymbol{u};\Theta)$
" E.g. cylinder $M(u,v;R,H) = (R\cos u, R\sin u, Hv)$ with $\Omega = [0,2\pi)\times[0,1]$
*The surface is actually the set $\{M(\boldsymbol{u};\Theta) \mid \boldsymbol{u}\in\Omega\}$.
AND ESTIMATING IT WELL GETS US CLOSE…
$M(u;\Theta) = \begin{pmatrix} \theta_1 \cos u + \theta_2 \sin u + \theta_3 \\ \theta_4 \cos u + \theta_5 \sin u + \theta_6 \end{pmatrix}$
SURFACE FITTING: KNOWN CORRESPONDENCES
$M(u;\Theta) = \begin{pmatrix} \theta_1 \cos u + \theta_2 \sin u + \theta_3 \\ \theta_4 \cos u + \theta_5 \sin u + \theta_6 \end{pmatrix}$
! Given a parametric surface model $M(\boldsymbol{u};\Theta)$
! And data samples $\{\boldsymbol{s}_i\}_{i=1}^{m} \subset \mathbb{R}^3$
! And known correspondences $\{\boldsymbol{u}_i\}_{i=1}^{m} \subset \Omega$
! Compute $\Theta^* = \arg\min_\Theta \sum_i \|\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\|$
! For various "norms" $\|\cdot\|$
KNOWN CORRESPONDENCES: EASY
Problem:
! Parametric surface model $M(\boldsymbol{u};\Theta)$, data samples $\{\boldsymbol{s}_i\}_{i=1}^{m} \subset \mathbb{R}^3$, correspondences $\{\boldsymbol{u}_i\}_{i=1}^{m} \subset \Omega$
! $\Theta^* = \arg\min_\Theta \sum_i \|\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\|^2$
Solution:
! Derivatives: $\frac{\partial E}{\partial \Theta} = -2\sum_i \big(\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\big) \cdot \frac{\partial}{\partial\Theta} M(\boldsymbol{u}_i;\Theta)$
! And solve $\frac{\partial E}{\partial \Theta} = 0$
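For this particular $M$, the model is linear in $\Theta$, so with known correspondences "solve $\partial E/\partial\Theta = 0$" is just linear least squares. A self-contained Python sketch (our own hypothetical helper, solving each coordinate's 3×3 normal equations by Gaussian elimination):

```python
import math

def fit_known_correspondences(us, ss):
    """Fit theta in M(u; theta) = (t1 cos u + t2 sin u + t3,
                                   t4 cos u + t5 sin u + t6)
    to 2-D samples ss with known correspondences us. Each coordinate is
    an independent 3-parameter linear least-squares problem."""
    def solve3(rows, rhs):
        # Normal equations A^T A x = A^T b, then Gaussian elimination.
        A = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
             for i in range(3)]
        b = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(3)]
        for i in range(3):                 # forward elimination
            for k in range(i + 1, 3):
                f = A[k][i] / A[i][i]
                for j in range(3):
                    A[k][j] -= f * A[i][j]
                b[k] -= f * b[i]
        x = [0.0] * 3
        for i in (2, 1, 0):                # back substitution
            x[i] = (b[i] - sum(A[i][j] * x[j]
                               for j in range(i + 1, 3))) / A[i][i]
        return x
    rows = [[math.cos(u), math.sin(u), 1.0] for u in us]
    tx = solve3(rows, [s[0] for s in ss])
    ty = solve3(rows, [s[1] for s in ss])
    return tx + ty

# Noiseless samples from a known curve are recovered exactly.
theta_true = [2.0, 0.0, 1.0, 0.0, 1.0, -3.0]
us = [0.1 * i for i in range(40)]
ss = [(2.0 * math.cos(u) + 1.0, math.sin(u) - 3.0) for u in us]
theta = fit_known_correspondences(us, ss)
```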
KNOWN CORRESPONDENCES: EASY
Input:
! Parametric surface model $M(\boldsymbol{u};\Theta)$, data samples $\{\boldsymbol{s}_i\}_{i=1}^{m} \subset \mathbb{R}^3$, correspondences $\{\boldsymbol{u}_i\}_{i=1}^{m} \subset \Omega$
Compute:
! $\Theta^* = \arg\min_\Theta \sum_i \|\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\|$
! Assuming smooth $M$… (see later)

interface Surface {
  Vec3   Eval_M (Vec2 u, VecN Θ);
  // Derivatives
  Vec3   Eval_Mu(Vec2 u, VecN Θ);
  Vec3   Eval_Mv(Vec2 u, VecN Θ);
  Matrix Eval_MΘ(Vec2 u, VecN Θ);
};

struct Problem {
  Vec3 s[m];
  Vec2 u[m];
  Surface M;
  Vec3m residuals(VecN Θ) -> out {
    for (i…) out[3i:3i+2] = s[i] - M.Eval_M(u[i], Θ);
  }
  Matrix3mxN jacobian(VecN Θ) -> J {
    … J[3i][p] = … M.Eval_Mu … M.Eval_MΘ …
  }
};

Problem prob;
VecN Θ = some_generic_initializer(…);
Θ = LevenbergMarquardt(prob, Θ);
SURFACE FITTING: UNKNOWN CORRESPONDENCES
! Given a parametric surface model $M(\boldsymbol{u};\Theta)$
! And data samples $\{\boldsymbol{s}_i\}_{i=1}^{m} \subset \mathbb{R}^3$
! But unknown correspondences
! Minimize the sum of closest-point distances: $\Theta^* = \arg\min_\Theta \sum_i \min_{\boldsymbol{u}} \|\boldsymbol{s}_i - M(\boldsymbol{u};\Theta)\|$
BAD SOLUTION: "ALTERNATION" OR "ICP"
ICP, a bad 1st-order method.
Problem: find $\Theta^* = \arg\min_\Theta \sum_{i=1}^{m} \min_u \|\boldsymbol{s}_i - M(u;\Theta)\|^2$
Solution:
- Get initial $\Theta_0$
- Repeat:
  - $\forall i:\ u_i := \arg\min_u \|\boldsymbol{s}_i - M(u;\Theta)\|^2$
  - $\Theta := \arg\min_\Theta \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2$
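The alternation above can be sketched in Python on a toy model, a translated unit circle rather than the slide's six-parameter curve, so that both steps are closed form (the helper `icp_fit_circle` is ours, purely illustrative):

```python
import math
import random

def icp_fit_circle(ss, iters=50):
    """'Alternation'/ICP sketch for fitting a translated unit circle
    M(u; c) = (cos u + c1, sin u + c2) to 2-D samples ss.
    Alternates closest-point correspondences u_i with a closed-form c."""
    c = [0.0, 0.0]  # crude initial translation
    for _ in range(iters):
        # Correspondence step: closest circle point to each sample.
        us = [math.atan2(sy - c[1], sx - c[0]) for sx, sy in ss]
        # Model step: with u fixed, the optimal translation is a mean.
        c = [sum(sx - math.cos(u) for (sx, _), u in zip(ss, us)) / len(ss),
             sum(sy - math.sin(u) for (_, sy), u in zip(ss, us)) / len(ss)]
    return c

# Samples on a unit circle centred at (4, -1) are recovered.
random.seed(3)
ss = [(4.0 + math.cos(u), -1.0 + math.sin(u))
      for u in [random.uniform(0, 2 * math.pi) for _ in range(200)]]
c = icp_fit_circle(ss)
```

On this noiseless toy the alternation converges; the slide's point is that on real problems its first-order behaviour makes it slow and prone to poor minima.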
JOINT MINIMIZATION OF $u$ AND $\Theta$
$E(\Theta) = \sum_{i=1}^{m} \min_{u_i} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2 = \min_{u_{1..m}} \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2$
$E(\{u_1, \ldots, u_m\}, \Theta) = \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2$
! Joint minimization of $\{u_1, \ldots, u_m\}$ and $\Theta$ using a non-linear optimiser.
" Move the closest-point computation "out" into the overall minimization problem.
BETTER SOLUTION: JUST USE LSQNONLIN
lsqnonlin, $m+6$ params, slowed down 10x
Problem: find $\Theta^* = \arg\min_{\Theta,\, u_1, \ldots, u_m} \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2$
Solution:
- Get initial $\Theta_0$
- $\forall i:\ u_i := \arg\min_u \|\boldsymbol{s}_i - M(u;\Theta_0)\|^2$
- $\Theta^* = \mathrm{lsqnonlin}(E, [\Theta_0, u_1, \ldots, u_m])$
[Or au_levmarq on http://awful.codeplex.com]
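The "stack $u$ into the unknowns" idea can be sketched without lsqnonlin, using plain gradient descent with backtracking as a stand-in generic optimiser (our toy: a circle with unknown radius and centre; everything here is an assumption-laden sketch, not the lecture's code):

```python
import math
import random

def energy_and_grad(p, ss):
    """Joint energy E(r, cx, cy, u_1..u_m) = sum_i ||s_i - M(u_i)||^2 for
    M(u; r, cx, cy) = (r cos u + cx, r sin u + cy), plus its gradient."""
    r, cx, cy = p[0], p[1], p[2]
    E, g = 0.0, [0.0] * len(p)
    for i, ((sx, sy), u) in enumerate(zip(ss, p[3:])):
        rx = sx - r * math.cos(u) - cx
        ry = sy - r * math.sin(u) - cy
        E += rx * rx + ry * ry
        g[0] += -2 * (rx * math.cos(u) + ry * math.sin(u))
        g[1] += -2 * rx
        g[2] += -2 * ry
        g[3 + i] = 2 * r * (rx * math.sin(u) - ry * math.cos(u))
    return E, g

def minimize(p, ss, iters=2000):
    """Gradient descent with backtracking line search: a crude stand-in
    for a real Levenberg-Marquardt routine such as lsqnonlin."""
    E, g = energy_and_grad(p, ss)
    for _ in range(iters):
        step = 1.0
        while True:
            q = [pi - step * gi for pi, gi in zip(p, g)]
            Eq, gq = energy_and_grad(q, ss)
            if Eq < E:
                break
            step *= 0.5
            if step < 1e-16:   # no descent direction left: converged
                return p, E
        p, E, g = q, Eq, gq
    return p, E

random.seed(4)
ss = [(3.0 + 2.0 * math.cos(u), 1.0 + 2.0 * math.sin(u))
      for u in [random.uniform(0, 2 * math.pi) for _ in range(50)]]
# Initialise u_i by closest point from a crude circle, then optimise all
# m+3 unknowns jointly.
p0 = [1.0, 2.0, 0.0] + [math.atan2(sy - 0.0, sx - 2.0) for sx, sy in ss]
p, E = minimize(p0, ss)
```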
CONVERGENCE CURVES
[Figure: error vs. time (sec), log-log axes]
AH, BUT WHAT ABOUT TEST ERROR?
[Figure: error vs. time (sec), log-log axes]
NOT $O(m^3)$
[Figure: timings for an n-point ellipse fit, log-log, with O(n) and O(n²) reference slopes. Legend: fdgrad cp simplex marg 5+n ad lbfgs 5+n ad noHess 5+n ad Hess 5+n LSQ lev-marq Hsampson sampson]
MINIMIZATION STRATEGIES: SUMMARY
$\min_{\Theta,\, \boldsymbol{u}_{1..m}} \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\|$
! Say $\Theta \in \mathbb{R}^N$ for $N = 600$, and $m = 40\mathrm{K}$
! Option 1: Alternate min over $\Theta$ and $u$
" Very fast per iteration
" Requires lots of iterations
! Option 2: Simultaneous optimization
" Needs smooth $M$ to compute derivatives
" Requires few iterations
" Very slow per iteration unless we make good use of sparsity
[Figure: Jacobian sparsity pattern over $\Theta, \boldsymbol{u}_1, \ldots, \boldsymbol{u}_m$]
FITTING A CUBIC B-SPLINE
Contour: linear in control vertices $X \in \mathbb{R}^{2\times 4}$, piecewise cubic in $u$. With $i = \lfloor u \rfloor$ and $\bar u = u - i$:
$M(u;X) = \boldsymbol{x}_i \big({-\tfrac{\bar u^3}{6}} + \tfrac{\bar u^2}{2} - \tfrac{\bar u}{2} + \tfrac16\big) + \boldsymbol{x}_{i\oplus 1}\big(\tfrac{\bar u^3}{2} - \bar u^2 + \tfrac23\big) + \boldsymbol{x}_{i\oplus 2}\big({-\tfrac{\bar u^3}{2}} + \tfrac{\bar u^2}{2} + \tfrac{\bar u}{2} + \tfrac16\big) + \boldsymbol{x}_{i\oplus 3}\big(\tfrac{\bar u^3}{6}\big)$
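The basis polynomials above are the standard uniform cubic B-spline weights, and they sum to 1 for every $\bar u$ (affine invariance). A direct transcription in Python (treating $\oplus$ as indexing modulo the number of control points):

```python
def bspline_point(u, X):
    """Evaluate the closed uniform cubic B-spline contour M(u; X) from
    the slide: X is a list of 2-D control vertices, indexed mod len(X)."""
    i = int(u) % len(X)
    t = u - int(u)   # the "u bar" of the slide
    basis = [-t**3 / 6 + t**2 / 2 - t / 2 + 1 / 6,
             t**3 / 2 - t**2 + 2 / 3,
             -t**3 / 2 + t**2 / 2 + t / 2 + 1 / 6,
             t**3 / 6]
    x = sum(b * X[(i + k) % len(X)][0] for k, b in enumerate(basis))
    y = sum(b * X[(i + k) % len(X)][1] for k, b in enumerate(basis))
    return (x, y)

# Because the weights sum to 1, a contour whose control points all
# coincide collapses to that single point, for any u.
X = [(2.0, 3.0)] * 4
pts = [bspline_point(0.1 * j, X) for j in range(40)]
```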
FITTING A CUBIC B-SPLINE
[Figure: fit after iterations #1, #10, #40]
• Fix $X$ and solve for $u_i = \arg\min_{u\in\Omega} \|\boldsymbol{s}_i - M(u;X)\|^2$
• This is itself a nonlinear minimization
• Then fix $u_{1..m}$ and solve for $X$
JOINT MINIMIZATION OF $u$ AND $X$
$E(X) = \sum_{i=1}^{m} \min_{u_i} \|\boldsymbol{s}_i - M(u_i;X)\|^2 \ \to\ E(\{u_1, \ldots, u_m\}, X) = \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;X)\|^2$
! Joint minimization of $\{u_1, \ldots, u_m\}$ and $X$ using a non-linear optimiser.
" Move the closest-point computation "out" into the overall minimization problem.
! Advantages
" Typically faster and better convergence
" Greater model flexibility (e.g. ability to share correspondence points, pairwise terms)
! But you need to compute derivatives…
LIMIT SURFACE
Control mesh vertices $V \in \mathbb{R}^{3\times n}$, here $n = 16$. The blue surface is $\{M(\boldsymbol{u};V) \mid \boldsymbol{u}\in\Omega\}$; $\Omega$ is the grey surface.

CONTROL VERTICES DEFINE THE SHAPE
SUBDIVISION SURFACE: PARAMETRIC FORM
! Mostly, $M$ is quite simple: $M(\boldsymbol{u};X) = M(t, u, v; \boldsymbol{x}_1, \ldots, \boldsymbol{x}_n) = \sum_{i+j\le 4,\ k=1..n} A^t_{ijk}\, u^i v^j \boldsymbol{x}_k$
" Integer triangle id $t$
" Quartic in $u, v$
" Linear in $X$
" Easy derivatives
! But…
" 2nd derivatives unbounded, although normals are well defined
" Piecewise parameter domain
DATA TERMS
Image $i$: 2D silhouette points $\boldsymbol{s}_{ij}$ and 2D normals $\boldsymbol{n}_{ij}$
$\boldsymbol{u}_{ij}$: contour-generator preimage in $\Omega$ (unknown); the contour-generator point in 3D is $M(\boldsymbol{u}_{ij}; \boldsymbol{X}_i)$
Linear blend shapes model: $\boldsymbol{X}_i = \sum_k \alpha_{ik} \boldsymbol{B}_k$
Silhouette: $E_i^{\mathrm{sil}} = \sum_{j=1}^{S_i} \|\boldsymbol{s}_{ij} - \pi(\theta_i, M(\boldsymbol{u}_{ij}, \boldsymbol{X}_i))\|^2$
Normal: $E_i^{\mathrm{norm}} = \sum_{j=1}^{S_i} \left\| \begin{bmatrix}\boldsymbol{n}_{ij}\\ 0\end{bmatrix} - R_i\, N(\boldsymbol{u}_{ij}, \boldsymbol{X}_i) \right\|^2$
OPTIMIZATION
! Alternation++:
" Dynamic programming on discretized $u$ parameters
" Quasi-Newton on all parameters
! NOT:
" Fit shapes, perform PCA, repeat
CONTINUOUS OPTIMIZATION
! Can focus on this term to understand the entire optimization.
" Total number of residuals $n$ = number of silhouette points, say $300N$ ($N$ = number of images) ≈ 10,000
" Total number of unknowns $2n + KN + m$, where $m \approx 3K \times$ number of vertices ≈ 3,000
PARAMETER SENSITIVITY
$E = \underbrace{\sum_{i=1}^{n} \big(E_i^{\mathrm{sil}} + E_i^{\mathrm{norm}} + E_i^{\mathrm{con}}\big)}_{\text{"pixel" terms: noise-level params}} + \underbrace{\sum_{i=1}^{n} \big(E_i^{\mathrm{cg}} + E_i^{\mathrm{reg}}\big)}_{\text{"dimensionless" terms}} + \underbrace{\xi_0^2\, E_0^{\mathrm{tp}} + \xi_{\mathrm{def}}^2 \sum_{i=1}^{n} E_i^{\mathrm{tp}}}_{\text{"smoothness" terms}}$
CONCLUSIONS ETC
! For textured scenes, 3D is relatively easy
" Some algorithms look like PCA
" But don't use SVD; use lsqnonlin
! When freed from concerns of "linearity", we can use much more powerful tools (e.g. subdivision surfaces)
" But you must allow correspondences to vary
" And you must exploit sparsity
! Future work:
" Video; more/less user intervention
COMPUTING $p(x \mid \Theta)$
! Not a great approximation?
! Think: what is the real noise model?
! Let's work with this for a while...
DISCUSSION POINTS: OPTIMIZATION
! "I tried fminX and it didn't work"
" Matlab's implementations are not necessarily the best
" You must supply derivatives
" They must be correct
" You must take care of sparseness
" You should choose a good parametrization
! All parameters have a similar effect
! All parameters "around 1"
! What about LBFGS?
! What about netscale?