Transcript of Lecture 12, Andrew Fitzgibbon: 3D Vision in a Changing World
PEOPLE
Unwrap Mosaics: A New Representation for Video Editing. SIGGRAPH 2008. Alex Rav-Acha, Pushmeet Kohli, Carsten Rother and Andrew Fitzgibbon.
What Shape Are Dolphins? Building 3D Morphable Models from 2D Images. PAMI 2013. Tom Cashman, Andrew Fitzgibbon.
User-Specific Hand Modeling from Monocular Depth Sequences. CVPR 2014. Jonathan Taylor, Richard Stebbing, Varun Ramakrishna, Cem Keskin, Jamie Shotton, Shahram Izadi, Andrew Fitzgibbon, Aaron Hertzmann.
Real-Time Non-Rigid Reconstruction Using an RGB-D Camera. SIGGRAPH 2014. Michael Zollhöfer, Matthias Nießner, Shahram Izadi, Christoph Rehmann, Christopher Zach, Matthew Fisher, Chenglei Wu, Andrew Fitzgibbon, Charles Loop, Christian Theobalt, Marc Stamminger.
GOAL
Recover 3D shape from one or more images or videos in dynamic scenes, without dense (>50) point correspondences.
EARLY WORK
! 1998: we computed a decent 3D reconstruction of a 36-frame sequence
! Giving 3D super-resolution
! And set ourselves the goal of solving a 1500-frame sequence
! Leading to…
[FCZ98] Fitzgibbon, Cross & Zisserman, SMILE 1998
I’M NOT GOING TO TELL YOU WHAT YOU WILL INVENT…
…I AM GOING TO TRY TO TELL YOU HOW TO DO IT.
The future of computer vision: YOU!
HOW TO INVENT THE FUTURE
Say WHAT you want to do, not HOW you're going to do it.
Know the difference between MODEL and ALGORITHM.
MODEL, NOT ALGORITHM
Model: describes how your data came into being.
  For $i = 1{:}n$: $\nu_i \sim \mathrm{Gaussian}(0,1)$, $x_i = c + \nu_i$
  i.e. $p(x_i) = (2\pi)^{-1/2} \exp\!\big({-\tfrac12 (x_i - c)^2}\big)$
Algorithm: describes how you will find the model parameters.
  "Compute the mean": $c = \frac{1}{n}\sum_i x_i$
MODEL, NOT ALGORITHM
Model: describes how your data came into being.
  For $i = 1{:}n$: $\nu_i \sim \mathrm{Gaussian}(0,\sigma)$, $x_i = c + \nu_i$
  i.e. $p(x_i) = (2\pi\sigma^2)^{-1/2} \exp\!\big({-\tfrac{1}{2\sigma^2} (x_i - c)^2}\big)$
Algorithm: describes how you will find the model parameters.
  "Mean and variance": $c = \frac{1}{n}\sum_i x_i$, $\sigma = \big(\tfrac{1}{n}\sum_i (x_i - c)^2\big)^{1/2}$
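The model/algorithm split on this slide can be made concrete in a few lines of Python (a minimal sketch; the helper name `fit_gaussian` is ours, not from the lecture). The generative loop is the model; the closed-form estimator is the algorithm.

```python
import math
import random

def fit_gaussian(xs):
    """Maximum-likelihood estimate of (c, sigma) for the model
    x_i = c + nu_i, nu_i ~ Gaussian(0, sigma): the sample mean
    and the (biased) sample standard deviation."""
    n = len(xs)
    c = sum(xs) / n
    sigma = math.sqrt(sum((x - c) ** 2 for x in xs) / n)
    return c, sigma

# Simulate data from the model, then recover its parameters.
random.seed(0)
data = [3.0 + random.gauss(0.0, 0.5) for _ in range(10000)]
c, sigma = fit_gaussian(data)
```

With 10,000 samples the estimates land close to the true $c = 3$ and $\sigma = 0.5$.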
GAUSSIAN MIXTURE
Model: describes how your data came into being.
  For $i = 1{:}n$: $k \sim \mathrm{Categorical}(\boldsymbol{\alpha})$, $\boldsymbol{x}_i \sim \mathrm{Gaussian}(\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$
  i.e. $p(\boldsymbol{x}_i) = \sum_k \alpha_k\, |2\pi\Sigma_k|^{-1/2} \exp\!\big({-\tfrac12 (\boldsymbol{x}_i - \boldsymbol{\mu}_k)^\top \Sigma_k^{-1} (\boldsymbol{x}_i - \boldsymbol{\mu}_k)}\big)$
Algorithm: describes how you will find the model parameters.
  "EM"
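"EM" on this slide can be sketched for a 1-D, two-component mixture in plain Python (an illustrative toy, not the lecture's implementation; a serious version would guard against component collapse and use log-space arithmetic):

```python
import math
import random

def em_gmm_1d(xs, iters=200):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch).
    Returns (alphas, mus, sigmas)."""
    # Crude initialisation from the data range.
    mus = [min(xs), max(xs)]
    sigmas = [1.0, 1.0]
    alphas = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibilities r[i][k] = p(k | x_i).
        r = []
        for x in xs:
            w = [alphas[k] / (sigmas[k] * math.sqrt(2 * math.pi))
                 * math.exp(-0.5 * ((x - mus[k]) / sigmas[k]) ** 2)
                 for k in range(2)]
            s = sum(w)
            r.append([wk / s for wk in w])
        # M-step: weighted mean, variance, and mixing weights.
        for k in range(2):
            nk = sum(ri[k] for ri in r)
            mus[k] = sum(ri[k] * x for ri, x in zip(r, xs)) / nk
            sigmas[k] = math.sqrt(
                sum(ri[k] * (x - mus[k]) ** 2 for ri, x in zip(r, xs)) / nk)
            alphas[k] = nk / len(xs)
    return alphas, mus, sigmas

# Two well-separated clusters are recovered from unlabelled samples.
random.seed(1)
xs = ([random.gauss(-2.0, 0.5) for _ in range(500)]
      + [random.gauss(3.0, 1.0) for _ in range(500)])
alphas, mus, sigmas = em_gmm_1d(xs)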
WHAT, WHAT, HOW

| Model $p(x_i)$ | Objective | Algorithm |
| $\mathcal{N}(x_i;\, c, 1)$ | $\min_c \sum_i (x_i - c)^2$ | mean ("closed form") |
| $\mathcal{N}(x_i;\, c, \sigma)$ | $\min_{c,\sigma} \sum_i \big( (x_i - c)^2/\sigma^2 + \log \sigma^2 \big)$ | mean & variance ("closed form") |
| $\sum_k \alpha_k \mathcal{N}(x_i;\, \mu_k, \Sigma_k)$ | $\max_{\alpha_{1..K},\, \mu_{1..K},\, \Sigma_{1..K}} \sum_i \log p(x_i)$ | GMM-EM, Bishop '96 ("just do it") |
HOW TO MINIMIZE $E(c,\sigma)$
$E(c,\sigma) = \sum_i \big( (x_i - c)^2/\sigma^2 + \log \sigma^2 \big)$
! Set $\partial E/\partial c = 0$, $\partial E/\partial \sigma = 0$
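Worked out, the two stationarity conditions recover exactly the "mean and variance" algorithm from the earlier slide:

```latex
\frac{\partial E}{\partial c} = -\frac{2}{\sigma^2}\sum_{i=1}^{n} (x_i - c) = 0
\;\Longrightarrow\; c = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
\frac{\partial E}{\partial \sigma} = -\frac{2}{\sigma^3}\sum_{i=1}^{n} (x_i - c)^2 + \frac{2n}{\sigma} = 0
\;\Longrightarrow\; \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - c)^2
```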
AH, BUT I KNOW MORE: PRIOR ON $\sigma$
Model: $\sigma \sim \mathrm{Exponential}(\lambda = 0.5)$; for $i = 1{:}n$: $\nu_i \sim \mathrm{Gaussian}(0,\sigma)$, $x_i = c + \nu_i$
$E(c,\sigma) = \sum_i \big( (x_i - c)^2/\sigma^2 + \log \sigma^2 \big) + \lambda\sigma$
HOW BIG CAN N BE?
! Problem sizes can range from a handful of parameters to about 1 million.
! First consider examples for which N=2 to make things easy to visualize.
! Extension to higher dimensions follows trivially, although not all algorithms work well for all problem sizes.
DERIVATIVES
! Can be computed:
" symbolically: use Maple/Mathematica
" numerically, by "finite differences":
  $\frac{\partial}{\partial x} f(x,y) = \lim_{\delta \to 0} \frac{f(x+\delta,\, y) - f(x,y)}{\delta}$
" Numerical derivatives are surprisingly accurate, but generally very expensive: use them only to check the symbolic ones.
! If you can't compute derivatives, use Powell's direction set method.
" Probably best not to use downhill simplex (fminsearch) unless the function is very noisy.
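The "check the symbolic ones" advice can be sketched as a small Python checker (hypothetical helper `fd_check`, not from the lecture), here using central differences, which are more accurate than the one-sided formula above:

```python
import math

def fd_check(f, grad, x, delta=1e-6, tol=1e-4):
    """Check a symbolic gradient against central finite differences.
    f: R^n -> R, grad: R^n -> list of n partials, x: list of floats."""
    for i in range(len(x)):
        xp = list(x); xp[i] += delta
        xm = list(x); xm[i] -= delta
        fd = (f(xp) - f(xm)) / (2 * delta)  # central difference
        if abs(fd - grad(x)[i]) > tol * max(1.0, abs(fd)):
            return False
    return True

# Example: the slide's E for a single datum x0, as a function of (c, sigma).
x0 = 2.0
f = lambda p: (x0 - p[0]) ** 2 / p[1] ** 2 + math.log(p[1] ** 2)
grad = lambda p: [-2 * (x0 - p[0]) / p[1] ** 2,
                  -2 * (x0 - p[0]) ** 2 / p[1] ** 3 + 2 / p[1]]
ok = fd_check(f, grad, [0.5, 1.5])
```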
CONDITIONING
! Be aware that these algorithms make some assumptions about the behaviour of your function:
" try to ensure the minimum is in the unit cube, and start at a corner
" keep all numbers "around 1": don't use pixel values of O(500)
OPTIMIZERS IN MATLAB
! lsqnonlin, fminunc, fmincon
! Use the most specific solver that fits your problem
! Check options.LargeScale!
"CLOSED FORM"
Direct Least Square Fitting of Ellipses. A. Fitzgibbon, M. Pilu, R. B. Fisher. Pattern Analysis and Machine Intelligence, 1999. (~1800)
Simultaneous Linear Estimation of Multiple View Geometry and Lens Distortion. A. Fitzgibbon. Computer Vision and Pattern Recognition, 2001. (~200)
TWO WRONG THINGS
! The real world is not Gaussian in many ways…
" Perhaps most crucially for us, in robust estimation
[Figure: Cauchy and Laplace densities]
WHAT, WHAT, HOW

| Model $p(x_i)$ | Objective | Algorithm |
| $\mathcal{N}(x_i;\, c, 1)$ | $\sum_i (x_i - c)^2$ | mean |
| $\mathrm{Cauchy}(x_i;\, c, \gamma)$ | $\sum_i \log\big(1 + (x_i - c)^2/\gamma^2\big) + f(\gamma)$ | ? |
| $\mathrm{Laplace}(x_i;\, c)$ | $\sum_i |x_i - c|$ | ? |
WHAT, WHAT, HOW

| Model | Algorithm |
| $x_i = c + n$ | mean, median, trimmed mean |
| $x_i = R y_{c(i)} + T$ | "ICP" |
| $\boldsymbol{x}_i = A\boldsymbol{v}_i,\ \boldsymbol{x}_i \in \mathbb{R}^N,\ \boldsymbol{v}_i \in \mathbb{R}^d$ | Principal Components Analysis |
| $\boldsymbol{x}_i = \mathrm{diag}(\mathrm{weights}_i)\, A\boldsymbol{v}_i$ | PCA with missing data, "imputation" |
| $\boldsymbol{x}_i = f(\theta_i)\, A\boldsymbol{v}_i$ | Transformed Components Analysis |
| GMM | "EM" |
HOW DO I DO IT?
Non-Rigid Structure from Motion. C. Bregler, L. Torresani, A. Hertzmann, H. Biermann. CVPR 2000 – PAMI 2008.
Derive $S = PX$, and factorize.

$\begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T1} & x_{T2} & \cdots & x_{Tm} \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_m \end{bmatrix}$, with each $P_i$: $2\times 4$ and each $X_j$: $4\times 1$.

Rows are indexed by frame number (1 to $2T$), columns by track number (1 to ntracks). (For this example: ntracks = 1135, T = 227.)
AFFINE STRUCTURE FROM MOTION
! 3D point $X_j = [x_j\ y_j\ z_j\ 1]^\top$
! Affine camera $P_i = \begin{bmatrix} p_{i11} & p_{i12} & p_{i13} & p_{i14} \\ p_{i21} & p_{i22} & p_{i23} & p_{i24} \end{bmatrix}$
! 2D point $x_{ij} = P_i X_j$
AFFINE STRUCTURE FROM MOTION
! Stacking one track over all frames gives a $2T$-D vector:
  $\begin{bmatrix} x_{1j} \\ x_{2j} \\ \vdots \\ x_{Tj} \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} X_j$
! Stacking all $m$ tracks gives the measurement matrix:
  $\begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T1} & x_{T2} & \cdots & x_{Tm} \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_m \end{bmatrix}$
! Matrix factorization: $S = PX$, the same equations as PCA.
! Hint 1: Do not use SVD.
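One reason not to use SVD is that real track matrices have missing entries, and alternation handles those naturally. A minimal Python sketch (rank-1 for brevity; affine SfM uses rank 3–4, and `rank1_factorize` is our hypothetical helper, not the lecture's code):

```python
def rank1_factorize(S, mask, iters=200):
    """Alternating least squares for a rank-1 factorization S ~ p q^T,
    using only the observed entries (mask[i][j] True). Plain SVD cannot
    restrict itself to observed entries like this."""
    T, m = len(S), len(S[0])
    p, q = [1.0] * T, [1.0] * m
    for _ in range(iters):
        for i in range(T):   # fix q: 1-D least squares for each p[i]
            num = sum(S[i][j] * q[j] for j in range(m) if mask[i][j])
            den = sum(q[j] ** 2 for j in range(m) if mask[i][j])
            p[i] = num / den
        for j in range(m):   # fix p: 1-D least squares for each q[j]
            num = sum(S[i][j] * p[i] for i in range(T) if mask[i][j])
            den = sum(p[i] ** 2 for i in range(T) if mask[i][j])
            q[j] = num / den
    return p, q

# Rank-1 ground truth with a third of the entries hidden; the hidden
# entries are recovered by the factorization.
pt = [1.0, 2.0, 3.0, 0.5, 1.5, 2.5]
qt = [1.0, 0.5, 2.0, 1.5, 3.0, 0.25, 1.25, 0.75]
S = [[a * b for b in qt] for a in pt]
mask = [[(i + j) % 3 != 0 for j in range(8)] for i in range(6)]
p, q = rank1_factorize(S, mask)
err = max(abs(p[i] * q[j] - S[i][j]) for i in range(6) for j in range(8))
```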
LINEAR SHAPE BASIS
Derive $M = P\big(\sum_k \alpha_k B_k\big)$, and factorize:

$\begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T1} & x_{T2} & \cdots & x_{Tm} \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_m \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_T \end{bmatrix} \times_2 \begin{bmatrix} \alpha_1^1 & \cdots & \alpha_1^K \\ \vdots & & \vdots \\ \alpha_m^1 & \cdots & \alpha_m^K \end{bmatrix} \times_? \begin{bmatrix} B_1 & B_2 & \cdots & B_K \end{bmatrix}$

$\mathcal{X}_i = \alpha_{i0}\mathcal{B}_0 + \alpha_{i1}\mathcal{B}_1 + \alpha_{i2}\mathcal{B}_2 + \cdots$
EMBEDDING
$S_{:,i} = \pi(X_i)$, $\pi : \mathbb{R}^r \mapsto \mathbb{R}^{2T}$
Orthographic: linear (in $X$) embedding in $\mathbb{R}^4$. Perspective: (slightly) nonlinear embedding in $\mathbb{R}^3$. Previous work on the nonrigid case: embed into $\mathbb{R}^{3K}$.
INTERLUDE: WHAT IS A SURFACE?
! Surface: a mapping $M(\boldsymbol{u})$ from $\mathbb{R}^2 \mapsto \mathbb{R}^3$
" E.g. cylinder $M(u,v) = (\cos u, \sin u, v)$
! Probably not all of $\mathbb{R}^2$, but a subset $\Omega$
" E.g. square $\Omega = [0, 2\pi) \times [0, H]$
! And we'll look at parameterised surfaces $M(\boldsymbol{u};\Theta)$
" E.g. cylinder $M(u,v;R,H) = (R\cos u, R\sin u, Hv)$ with $\Omega = [0,2\pi)\times[0,1]$
*The surface is actually the set $\{M(\boldsymbol{u};\Theta) \mid \boldsymbol{u}\in\Omega\}$.
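The parameterised cylinder above can be written directly as code (a trivial sketch, but it shows the pattern: a surface is just a function of $(u, v)$ and shape parameters $\Theta$):

```python
import math

def cylinder(u, v, R, H):
    """Parameterised cylinder M(u, v; R, H) = (R cos u, R sin u, H v),
    with domain Omega = [0, 2*pi) x [0, 1]."""
    return (R * math.cos(u), R * math.sin(u), H * v)

# Every point of the surface sits at distance R from the z-axis,
# with height between 0 and H.
R, H = 2.0, 5.0
pts = [cylinder(u, v, R, H)
       for u in [0.0, 1.0, 3.0] for v in [0.0, 0.5, 1.0]]
radii = [math.hypot(x, y) for x, y, _ in pts]
```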
EMBEDDING
$S_{:,i} = \pi(X_i)$, $\pi : \mathbb{R}^r \mapsto \mathbb{R}^{2T}$
Orthographic: linear (in $X$) embedding in $\mathbb{R}^4$. Perspective: (slightly) nonlinear embedding in $\mathbb{R}^3$. Previous work on the nonrigid case: embed into $\mathbb{R}^{3K}$.
Our big idea: surfaces are mappings $\mathbb{R}^2 \mapsto \mathbb{R}^3$, so embed (nonlinearly) into $\mathbb{R}^2$.
RIGID MODEL
It is easy to make a rough rigid model. But we need to: 1. match it to the images; 2. learn how it moves.
MODEL REPRESENTATION
$\mathcal{X}_i = \sum_{k=0}^{K} \alpha_{ik} \mathcal{B}_k$
$\mathcal{X}_i = \alpha_{i0}\mathcal{B}_0 + \alpha_{i1}\mathcal{B}_1 + \alpha_{i2}\mathcal{B}_2 + \cdots$
Linear blend shapes: image $i$ is represented by the coefficient vector $\boldsymbol{\alpha}_i = [\alpha_{i1}, \ldots, \alpha_{iK}]$.
DATA TERMS
Image $i$: silhouette points and normals $\boldsymbol{s}_{ij}, \boldsymbol{n}_{ij}$
Linear blend shapes model: $\boldsymbol{X}_i = \sum_k \alpha_{ik} \boldsymbol{B}_k$
Silhouette: $E_i^{\mathrm{sil}} = \sum_{j=1}^{S_i} \|\boldsymbol{s}_{ij} - \pi(\theta_i, M(\boldsymbol{u}_{ij}, \boldsymbol{X}_i))\|^2$
Normal: $E_i^{\mathrm{norm}} = \sum_{j=1}^{S_i} \left\| \begin{bmatrix}\boldsymbol{n}_{ij}\\ 0\end{bmatrix} - R_i\, N(\boldsymbol{u}_{ij}, \boldsymbol{X}_i) \right\|^2$
APPLICATIONS
Curve/surface fitting, parameter estimation, "bundle adjustment". (Video from our friends at G)
! Given some data
" 2D points, 3D points, silhouettes, shading, …
! Fit a 3D surface
" under appropriate priors, e.g. spatial/temporal smoothness
" and possibly other parameters: camera/light positions, BRDF, …
GOAL

SURFACE
! Surface: a mapping $M(\boldsymbol{u})$ from $\mathbb{R}^2 \mapsto \mathbb{R}^3$
" E.g. cylinder $M(u,v) = (\cos u, \sin u, v)$
! Probably not all of $\mathbb{R}^2$, but a subset $\Omega$
" E.g. square $\Omega = [0, 2\pi) \times [0, H]$
! And we'll look at parameterised surfaces $M(\boldsymbol{u};\Theta)$
" E.g. cylinder $M(u,v;R,H) = (R\cos u, R\sin u, Hv)$ with $\Omega = [0,2\pi)\times[0,1]$
*The surface is actually the set $\{M(\boldsymbol{u};\Theta) \mid \boldsymbol{u}\in\Omega\}$.
AND ESTIMATING IT WELL GETS US CLOSE…
$M(u;\Theta) = \begin{pmatrix} \theta_1 \cos u + \theta_2 \sin u + \theta_3 \\ \theta_4 \cos u + \theta_5 \sin u + \theta_6 \end{pmatrix}$
SURFACE FITTING: KNOWN CORRESPONDENCES
$M(u;\Theta) = \begin{pmatrix} \theta_1 \cos u + \theta_2 \sin u + \theta_3 \\ \theta_4 \cos u + \theta_5 \sin u + \theta_6 \end{pmatrix}$
! Given a parametric surface model $M(\boldsymbol{u};\Theta)$
! And data samples $\{\boldsymbol{s}_i\}_{i=1}^{m} \subset \mathbb{R}^3$
! And known correspondences $\{\boldsymbol{u}_i\}_{i=1}^{m} \subset \Omega$
! Compute $\Theta^* = \arg\min_\Theta \sum_i \|\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\|$
! For various "norms" $\|\cdot\|$
KNOWN CORRESPONDENCES: EASY
Problem:
! Parametric surface model $M(\boldsymbol{u};\Theta)$, data samples $\{\boldsymbol{s}_i\}_{i=1}^{m} \subset \mathbb{R}^3$, correspondences $\{\boldsymbol{u}_i\}_{i=1}^{m} \subset \Omega$
! $\Theta^* = \arg\min_\Theta \sum_i \|\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\|^2$
Solution:
! Derivatives: $\frac{\partial E}{\partial \Theta} = -2\sum_i \big(\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\big) \cdot \frac{\partial}{\partial\Theta} M(\boldsymbol{u}_i;\Theta)$
! And solve $\frac{\partial E}{\partial \Theta} = 0$
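For this particular $M$, the model is linear in $\Theta$, so with known correspondences "solve $\partial E/\partial\Theta = 0$" is just linear least squares. A self-contained Python sketch (our own hypothetical helper, solving each coordinate's 3×3 normal equations by Gaussian elimination):

```python
import math

def fit_known_correspondences(us, ss):
    """Fit theta in M(u; theta) = (t1 cos u + t2 sin u + t3,
                                   t4 cos u + t5 sin u + t6)
    to 2-D samples ss with known correspondences us. Each coordinate is
    an independent 3-parameter linear least-squares problem."""
    def solve3(rows, rhs):
        # Normal equations A^T A x = A^T b, then Gaussian elimination.
        A = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
             for i in range(3)]
        b = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(3)]
        for i in range(3):                 # forward elimination
            for k in range(i + 1, 3):
                f = A[k][i] / A[i][i]
                for j in range(3):
                    A[k][j] -= f * A[i][j]
                b[k] -= f * b[i]
        x = [0.0] * 3
        for i in (2, 1, 0):                # back substitution
            x[i] = (b[i] - sum(A[i][j] * x[j]
                               for j in range(i + 1, 3))) / A[i][i]
        return x
    rows = [[math.cos(u), math.sin(u), 1.0] for u in us]
    tx = solve3(rows, [s[0] for s in ss])
    ty = solve3(rows, [s[1] for s in ss])
    return tx + ty

# Noiseless samples from a known curve are recovered exactly.
theta_true = [2.0, 0.0, 1.0, 0.0, 1.0, -3.0]
us = [0.1 * i for i in range(40)]
ss = [(2.0 * math.cos(u) + 1.0, math.sin(u) - 3.0) for u in us]
theta = fit_known_correspondences(us, ss)
```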
KNOWN CORRESPONDENCES: EASY
Input:
! Parametric surface model $M(\boldsymbol{u};\Theta)$, data samples $\{\boldsymbol{s}_i\}_{i=1}^{m} \subset \mathbb{R}^3$, correspondences $\{\boldsymbol{u}_i\}_{i=1}^{m} \subset \Omega$
Compute:
! $\Theta^* = \arg\min_\Theta \sum_i \|\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\|$
! Assuming smooth $M$… (see later)

interface Surface {
  Vec3   Eval_M (Vec2 u, VecN Θ);
  // Derivatives
  Vec3   Eval_Mu(Vec2 u, VecN Θ);
  Vec3   Eval_Mv(Vec2 u, VecN Θ);
  Matrix Eval_MΘ(Vec2 u, VecN Θ);
};

struct Problem {
  Vec3 s[m];
  Vec2 u[m];
  Surface M;
  Vec3m residuals(VecN Θ) -> out {
    for (i…) out[3i:3i+2] = s[i] - M.Eval_M(u[i], Θ);
  }
  Matrix3mxN jacobian(VecN Θ) -> J {
    … J[3i][p] = … M.Eval_Mu … M.Eval_MΘ …
  }
};

Problem prob;
VecN Θ = some_generic_initializer(…);
Θ = LevenbergMarquardt(prob, Θ);
SURFACE FITTING: UNKNOWN CORRESPONDENCES
! Given a parametric surface model $M(\boldsymbol{u};\Theta)$
! And data samples $\{\boldsymbol{s}_i\}_{i=1}^{m} \subset \mathbb{R}^3$
! But unknown correspondences
! Minimize the sum of closest-point distances: $\Theta^* = \arg\min_\Theta \sum_i \min_{\boldsymbol{u}} \|\boldsymbol{s}_i - M(\boldsymbol{u};\Theta)\|$
BAD SOLUTION: "ALTERNATION" OR "ICP"
ICP, a bad 1st-order method.
Problem: find $\Theta^* = \arg\min_\Theta \sum_{i=1}^{m} \min_u \|\boldsymbol{s}_i - M(u;\Theta)\|^2$
Solution:
- Get initial $\Theta_0$
- Repeat:
  - $\forall i:\ u_i := \arg\min_u \|\boldsymbol{s}_i - M(u;\Theta)\|^2$
  - $\Theta := \arg\min_\Theta \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2$
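The alternation above can be sketched in Python on a toy model, a translated unit circle rather than the slide's six-parameter curve, so that both steps are closed form (the helper `icp_fit_circle` is ours, purely illustrative):

```python
import math
import random

def icp_fit_circle(ss, iters=50):
    """'Alternation'/ICP sketch for fitting a translated unit circle
    M(u; c) = (cos u + c1, sin u + c2) to 2-D samples ss.
    Alternates closest-point correspondences u_i with a closed-form c."""
    c = [0.0, 0.0]  # crude initial translation
    for _ in range(iters):
        # Correspondence step: closest circle point to each sample.
        us = [math.atan2(sy - c[1], sx - c[0]) for sx, sy in ss]
        # Model step: with u fixed, the optimal translation is a mean.
        c = [sum(sx - math.cos(u) for (sx, _), u in zip(ss, us)) / len(ss),
             sum(sy - math.sin(u) for (_, sy), u in zip(ss, us)) / len(ss)]
    return c

# Samples on a unit circle centred at (4, -1) are recovered.
random.seed(3)
ss = [(4.0 + math.cos(u), -1.0 + math.sin(u))
      for u in [random.uniform(0, 2 * math.pi) for _ in range(200)]]
c = icp_fit_circle(ss)
```

On this noiseless toy the alternation converges; the slide's point is that on real problems its first-order behaviour makes it slow and prone to poor minima.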
JOINT MINIMIZATION OF $u$ AND $\Theta$
$E(\Theta) = \sum_{i=1}^{m} \min_{u_i} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2 = \min_{u_{1..m}} \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2$
$E(\{u_1, \ldots, u_m\}, \Theta) = \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2$
! Joint minimization of $\{u_1, \ldots, u_m\}$ and $\Theta$ using a non-linear optimiser.
" Move the closest-point computation "out" into the overall minimization problem.
BETTER SOLUTION: JUST USE LSQNONLIN
lsqnonlin, $m+6$ params, slowed down 10x
Problem: find $\Theta^* = \arg\min_{\Theta,\, u_1, \ldots, u_m} \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;\Theta)\|^2$
Solution:
- Get initial $\Theta_0$
- $\forall i:\ u_i := \arg\min_u \|\boldsymbol{s}_i - M(u;\Theta_0)\|^2$
- $\Theta^* = \mathrm{lsqnonlin}(E, [\Theta_0, u_1, \ldots, u_m])$
[Or au_levmarq on http://awful.codeplex.com]
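The "stack $u$ into the unknowns" idea can be sketched without lsqnonlin, using plain gradient descent with backtracking as a stand-in generic optimiser (our toy: a circle with unknown radius and centre; everything here is an assumption-laden sketch, not the lecture's code):

```python
import math
import random

def energy_and_grad(p, ss):
    """Joint energy E(r, cx, cy, u_1..u_m) = sum_i ||s_i - M(u_i)||^2 for
    M(u; r, cx, cy) = (r cos u + cx, r sin u + cy), plus its gradient."""
    r, cx, cy = p[0], p[1], p[2]
    E, g = 0.0, [0.0] * len(p)
    for i, ((sx, sy), u) in enumerate(zip(ss, p[3:])):
        rx = sx - r * math.cos(u) - cx
        ry = sy - r * math.sin(u) - cy
        E += rx * rx + ry * ry
        g[0] += -2 * (rx * math.cos(u) + ry * math.sin(u))
        g[1] += -2 * rx
        g[2] += -2 * ry
        g[3 + i] = 2 * r * (rx * math.sin(u) - ry * math.cos(u))
    return E, g

def minimize(p, ss, iters=2000):
    """Gradient descent with backtracking line search: a crude stand-in
    for a real Levenberg-Marquardt routine such as lsqnonlin."""
    E, g = energy_and_grad(p, ss)
    for _ in range(iters):
        step = 1.0
        while True:
            q = [pi - step * gi for pi, gi in zip(p, g)]
            Eq, gq = energy_and_grad(q, ss)
            if Eq < E:
                break
            step *= 0.5
            if step < 1e-16:   # no descent direction left: converged
                return p, E
        p, E, g = q, Eq, gq
    return p, E

random.seed(4)
ss = [(3.0 + 2.0 * math.cos(u), 1.0 + 2.0 * math.sin(u))
      for u in [random.uniform(0, 2 * math.pi) for _ in range(50)]]
# Initialise u_i by closest point from a crude circle, then optimise all
# m+3 unknowns jointly.
p0 = [1.0, 2.0, 0.0] + [math.atan2(sy - 0.0, sx - 2.0) for sx, sy in ss]
p, E = minimize(p0, ss)
```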
CONVERGENCE CURVES
[Figure: error vs. time (sec), log-log axes]
AH, BUT WHAT ABOUT TEST ERROR?
[Figure: error vs. time (sec), log-log axes]
NOT $O(m^3)$
[Figure: timings for an n-point ellipse fit, log-log, with O(n) and O(n²) reference slopes. Legend: fdgrad cp simplex marg 5+n ad lbfgs 5+n ad noHess 5+n ad Hess 5+n LSQ lev-marq Hsampson sampson]
MINIMIZATION STRATEGIES: SUMMARY
$\min_{\Theta,\, \boldsymbol{u}_{1..m}} \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(\boldsymbol{u}_i;\Theta)\|$
! Say $\Theta \in \mathbb{R}^N$ for $N = 600$, and $m = 40\mathrm{K}$
! Option 1: Alternate min over $\Theta$ and $u$
" Very fast per iteration
" Requires lots of iterations
! Option 2: Simultaneous optimization
" Needs smooth $M$ to compute derivatives
" Requires few iterations
" Very slow per iteration unless we make good use of sparsity
[Figure: Jacobian sparsity pattern over $\Theta, \boldsymbol{u}_1, \ldots, \boldsymbol{u}_m$]
FITTING A CUBIC B-SPLINE
Contour: linear in control vertices $X \in \mathbb{R}^{2\times 4}$, piecewise cubic in $u$. With $i = \lfloor u \rfloor$ and $\bar u = u - i$:
$M(u;X) = \boldsymbol{x}_i \big({-\tfrac{\bar u^3}{6}} + \tfrac{\bar u^2}{2} - \tfrac{\bar u}{2} + \tfrac16\big) + \boldsymbol{x}_{i\oplus 1}\big(\tfrac{\bar u^3}{2} - \bar u^2 + \tfrac23\big) + \boldsymbol{x}_{i\oplus 2}\big({-\tfrac{\bar u^3}{2}} + \tfrac{\bar u^2}{2} + \tfrac{\bar u}{2} + \tfrac16\big) + \boldsymbol{x}_{i\oplus 3}\big(\tfrac{\bar u^3}{6}\big)$
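The basis polynomials above are the standard uniform cubic B-spline weights, and they sum to 1 for every $\bar u$ (affine invariance). A direct transcription in Python (treating $\oplus$ as indexing modulo the number of control points):

```python
def bspline_point(u, X):
    """Evaluate the closed uniform cubic B-spline contour M(u; X) from
    the slide: X is a list of 2-D control vertices, indexed mod len(X)."""
    i = int(u) % len(X)
    t = u - int(u)   # the "u bar" of the slide
    basis = [-t**3 / 6 + t**2 / 2 - t / 2 + 1 / 6,
             t**3 / 2 - t**2 + 2 / 3,
             -t**3 / 2 + t**2 / 2 + t / 2 + 1 / 6,
             t**3 / 6]
    x = sum(b * X[(i + k) % len(X)][0] for k, b in enumerate(basis))
    y = sum(b * X[(i + k) % len(X)][1] for k, b in enumerate(basis))
    return (x, y)

# Because the weights sum to 1, a contour whose control points all
# coincide collapses to that single point, for any u.
X = [(2.0, 3.0)] * 4
pts = [bspline_point(0.1 * j, X) for j in range(40)]
```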
FITTING A CUBIC B-SPLINE
[Figure: fit after iterations #1, #10, #40]
• Fix $X$ and solve for $u_i = \arg\min_{u\in\Omega} \|\boldsymbol{s}_i - M(u;X)\|^2$
• This is itself a nonlinear minimization
• Then fix $u_{1..m}$ and solve for $X$
JOINT MINIMIZATION OF $u$ AND $X$
$E(X) = \sum_{i=1}^{m} \min_{u_i} \|\boldsymbol{s}_i - M(u_i;X)\|^2 \ \to\ E(\{u_1, \ldots, u_m\}, X) = \sum_{i=1}^{m} \|\boldsymbol{s}_i - M(u_i;X)\|^2$
! Joint minimization of $\{u_1, \ldots, u_m\}$ and $X$ using a non-linear optimiser.
" Move the closest-point computation "out" into the overall minimization problem.
! Advantages
" Typically faster and better convergence
" Greater model flexibility (e.g. ability to share correspondence points, pairwise terms)
! But you need to compute derivatives…
LIMIT SURFACE
Control mesh vertices $V \in \mathbb{R}^{3\times n}$, here $n = 16$. The blue surface is $\{M(\boldsymbol{u};V) \mid \boldsymbol{u}\in\Omega\}$; $\Omega$ is the grey surface.

CONTROL VERTICES DEFINE THE SHAPE
SUBDIVISION SURFACE: PARAMETRIC FORM
! Mostly, $M$ is quite simple: $M(\boldsymbol{u};X) = M(t, u, v; \boldsymbol{x}_1, \ldots, \boldsymbol{x}_n) = \sum_{i+j\le 4,\ k=1..n} A^t_{ijk}\, u^i v^j \boldsymbol{x}_k$
" Integer triangle id $t$
" Quartic in $u, v$
" Linear in $X$
" Easy derivatives
! But…
" 2nd derivatives unbounded, although normals are well defined
" Piecewise parameter domain
DATA TERMS
Image $i$: 2D silhouette points $\boldsymbol{s}_{ij}$ and 2D normals $\boldsymbol{n}_{ij}$
$\boldsymbol{u}_{ij}$: contour-generator preimage in $\Omega$ (unknown); the contour-generator point in 3D is $M(\boldsymbol{u}_{ij}; \boldsymbol{X}_i)$
Linear blend shapes model: $\boldsymbol{X}_i = \sum_k \alpha_{ik} \boldsymbol{B}_k$
Silhouette: $E_i^{\mathrm{sil}} = \sum_{j=1}^{S_i} \|\boldsymbol{s}_{ij} - \pi(\theta_i, M(\boldsymbol{u}_{ij}, \boldsymbol{X}_i))\|^2$
Normal: $E_i^{\mathrm{norm}} = \sum_{j=1}^{S_i} \left\| \begin{bmatrix}\boldsymbol{n}_{ij}\\ 0\end{bmatrix} - R_i\, N(\boldsymbol{u}_{ij}, \boldsymbol{X}_i) \right\|^2$
OPTIMIZATION
! Alternation++:
" Dynamic programming on discretized $u$ parameters
" Quasi-Newton on all parameters
! NOT:
" Fit shapes, perform PCA, repeat
CONTINUOUS OPTIMIZATION
! Can focus on this term to understand the entire optimization.
" Total number of residuals $n$ = number of silhouette points, say $300N$ ($N$ = number of images) ≈ 10,000
" Total number of unknowns $2n + KN + m$, where $m \approx 3K \times$ number of vertices ≈ 3,000
PARAMETER SENSITIVITY
$E = \underbrace{\sum_{i=1}^{n} \big(E_i^{\mathrm{sil}} + E_i^{\mathrm{norm}} + E_i^{\mathrm{con}}\big)}_{\text{"pixel" terms: noise-level params}} + \underbrace{\sum_{i=1}^{n} \big(E_i^{\mathrm{cg}} + E_i^{\mathrm{reg}}\big)}_{\text{"dimensionless" terms}} + \underbrace{\xi_0^2\, E_0^{\mathrm{tp}} + \xi_{\mathrm{def}}^2 \sum_{i=1}^{n} E_i^{\mathrm{tp}}}_{\text{"smoothness" terms}}$
CONCLUSIONS ETC
! For textured scenes, 3D is relatively easy
" Some algorithms look like PCA
" But don't use SVD; use lsqnonlin
! When freed from concerns of "linearity", we can use much more powerful tools (e.g. subdivision surfaces)
" But you must allow correspondences to vary
" And you must exploit sparsity
! Future work:
" Video; more/less user intervention
COMPUTING $p(x \mid \Theta)$
! Not a great approximation?
! Think: what is the real noise model?
! Let's work with this for a while...
DISCUSSION POINTS: OPTIMIZATION
! "I tried fminX and it didn't work"
" Matlab's implementations are not necessarily the best
" You must supply derivatives
" They must be correct
" You must take care of sparseness
" You should choose a good parametrization
! All parameters have a similar effect
! All parameters "around 1"
! What about LBFGS?
! What about netscale?