Mixtures of Eigenfeatures for Real−Time Structure from Texturejebara/papers/iccv.showcase.pdf ·...
Transcript of Mixtures of Eigenfeatures for Real−Time Structure from Texturejebara/papers/iccv.showcase.pdf ·...
Mixtures of Eigenfeaturesfor Real−Time Structurefrom Texture
Tony Jebara Kenneth RussellAlex Pentland
Vision & ModelingPerceptual ComputingM.I.T. Media Laboratory
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
MOTIVATION− Recover full 3D information from live video:
rotation, translation, focal length, structure, texture− Structure is underconstrained from a single frame usually constraints / regularization compensate:
a) local smoothness (shape from shading)b) motion (structure from motion)c) multiple images (stereo techniques)d) knowledge of a class of objectse) global vs. local structure constraints
− [CVPR 97]: Real−time feedback 3D tracker with SfMSfM only recovers sparse point field (40 points @ 30Hz) (unless many features or optical flow −> Not Real Time)
− How much structure info in a static frame?− Real time structure estimation −> temporal analysis− Fully reconstruct 3D face compactly (video−conf)− Can use global linear analysis vs. local non−linear
redundancies and regularities in global data
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
SYSTEM ARCHITECTURE− Learn the mapping between texture and structureprobabilistically to avoid cumbersome low−level operationsand underconstrained formulations
− Use pre−processing and tracking to focus learningresources where needed.
VIDEO
DETECTION &3D TRACKING NORMALIZATION STATISTICAL 3D
FACE MODELING
LOCALIZATION MUGSHOTTEXTURE
ESTIMATEDSTRUCTURE
RECOVERED STRUCTURE FEEDBACK
−Want a compact real−time temporal representation of facial 3D pose,translation, structure and texture
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
3D TRAINING DATA−Knowledge Based Statistical ApproachLearn mapping between texture and structure fromCyberware 3D face scans.
−360 Degree Cylindrical (R,G,B,radius) value for each(theta,y) value. (very large dimensionality)
−Database of over 300 Scans:Wright Patterson Air Force, Lab Personnel,Headus (Cyberware affiliate), Other...
RANGE MAPUNKNOWN
TEXTURE MAPKNOWN
COMBINED& RENDERED
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
PRE−PROCESSING− Remove unwanted variations and non−linearities (suchas illumination, background and 3D pose) to uncoverinteresting correlations between texture & structure
− Segmentation and 3D alignment (semi−rigid transformation)
−Illumination Correction (RGB not independent)Eigenspace Basis over facial colors for histogramming
Histogram fitting via cdf’s
T (i) = c (c (i))s−>d d s
−1
0 20 40 60 80 100 120 140 0
50
100
0
10
20
30
40
50
60
70
80
R
G
B
6070
8090
100110 40
50
60
70
80
15
20
25
30
35
40
45
50 60 70 80 90 100 110 120 20
40
60
80
10
15
20
25
30
35
40
45
50
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
EIGENFEATURES− Form a rank−defficient Gaussian model over thedata (of dims M >> N samples)
− PCA sensitive to scaling (but not rotation):
−Focus PCA resources where needed using modulareigenspace with a weight for each pixel
−Estimate several smaller M using N samples−Span a larger space
−6 −4 −2 0 2 4 6−5
−4
−3
−2
−1
0
1
2
3
4
−6 −4 −2 0 2 4 6−20
−15
−10
−5
0
5
10
15
−6 −4 −2 0 2 4 6−20
−15
−10
−5
0
5
10
15
EIGENSPACEMODULAR
SAMPLE #1 SAMPLE #2TRAININGTRAINING
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
MIXTURES OF EIGENFEATURES− Decouple image spatially into multiple eigenspaceseach with its own region of applicability (weight)forms a mask for each eigenspace
−Make a more generalizable space than full PCA− Simple iterative algorithm based on assumption that
the Modules are binary and don’t overlap
Inject regularizers (blur, epsilon, symmetry)
Estimate eigenspaces over training set
Compute eigenspaces error over x−validation set
Partition image into modules according to errors
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
− Initialize the model pseudo−randomly and iterateuntil convergence (min. cross−validation error)
FINDING THE EIGENFEATURES− Decouples face into multiple regions with their ownlinear models and modes.
− Initialized both randomly andmanually using Laplace’sequation for interpolation
− Converges to a local min error, reduces Error up to 70% after 25 iterations
− Resulting Masks & Spatial Errors
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
EIGENVECTORS, PROJECTION & FILTERING− Modular Eigenvectors resulting from training show coupling over image regions
Compact Coefficient Representation4x14 Floats, storage size < 100 bytes
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
Original10,000+ DOF
Recon
stru
ctio
n
Proj
ecti
on
<100 DOF Filtered10,000+ DOF
REGRESSION
− Tricky in higher dims due to rank defficiency and cost− Use trained model,condition on known & estimate unknown
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
3D FACE TRACKING AND SfM− In Real−Time, Detect, Track and Normalize a facein 3D by computing 3D pose and coarse 3D structure
− 3D GLOBAL ADAPTIVE COUPLING OF 2D FEATURESLocal 2D Feature <−> Global Rigid 3D StructureTracking Holistic Texture
3D EIGENHEAD
FROM MOTIONSTRUCTURE
TRACKINGFEATURE
MODELING
30 HZ
TRACKINGDETECTION
DFFS < TDFFS > T
0.5 HZ
3D NORMALIZATIONOPERATIONSSYMMETRY
CLASSIFICATIONSKIN
FACE SPACEDISTANCE FROM
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
DETECTION AND DFFS
− YUV Mixture of Gaussians skin color model (EM)− Symmetry Operations (Radial and Axial)− Geometric Constraints
− Refining and Pruning initial localization with DFFSon 3D Frontally Warped Mug−Shot Images
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
2D FEATURE TRACKING
−Get 2D motion, u = (Tx, Ty, θ, scale) via linear model of SSD−2D motion is also represented as motion of 2 anchor points
Also get sensitivity of model or 4x4covariance of u (over SSD) byartificially perturbing each tracker.
C4x4 (u,SSD)
Initialize 8 Local 2D Feature Trackers
SSD NORMALIZED CORRELATION
TRANSLATION SCALEROTATION
x
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
FEEDBACK STRUCTURE FROM MOTION
T
α 1...N
β
X
Y
^
^
y2D Features
Residuals Constrained 2D Featuresy
X
R AdaptiveNoise Matrix
X̂
H
ωxyzxy
EIGENSPACE
KALMANEXTENDED
FILTER
SfM
POINT-WISESTRUCTURE
zyz
xyzx
ωT
α 1...N
β
− Get 2D Features from anchor points on each patch− Form block−diagonal R = diag(C1, C2, ... C8)− Form pointwise eigenspace of structure (α) from Cyberware
− Use feedback of constrained, rigid SfMto re−initialize the 2D trackers forthe next frame.
− GLOBAL ADAPTIVE 3D FUSION OF 2D TRACKERS
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
TRACKING ACCURACY & STABILITY
− Robust due to 3D SfM based integration of multiple trackers− Recover 3D Translation, Rotation, Focal Length
Point−Wise Structure Estimate (20 points)− 30 Hz Tracking (1 Hz detection)− Stable in tracking mode for extended periods of time− Robust to <65 degrees rot
large occlusionfast motion
−Vision vs Polhemus HeadTracking and Struture Est.
RMS differences 0.11 units 2.35 degrees Scale 10−12cm * 0.11
Within polhemus error [Azarbayejani & Pentland, 95]
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
TEXTURE STABILIZATION & 3D REGRESSION
− Use estimates of Tx, Ty, Tz and quaternion to align aaverage Cyberware head to the face and texture map it
Regress unknown components of sparse cylindrical map:
1) Continuous Texture (interpolate texture)2) 3D Range Data Structure
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
Sparse Cylindrical Map
Synthesized FrontalMug−Shot
Tracking
3D Model
TEMPORAL DESCRIPTION
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
Co
effi
cien
tM
agn
itu
des
Time
Static Face Motion
During Static Tracking:
Avg %SSD differencefrom Cyberware = 18 % Magnitudescan (in eigenspace)
SPATIO−TEMPORAL INTEGRATION (KF)CURRENT & FUTURE WORK:
− Spatial Structure IntegrationAccumulation of texture dataover whole cylindricalstructure when trackinglarge pose variations
−Tempral Structure Integration:Recursive integration of eigenspace coefficients viaKalman Filter
− More flexible non−linear models
− More detailed temporal integrationand intra−personal &extra−personal analysis
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab
ICCV ’98 Mixtures of Eigenfeatures for Real−Time Structure from Texture
Tony Jebara − Kenneth Russell − Alex Pentland − MIT Media Lab