Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9...
-
Upload
sharlene-rodgers -
Category
Documents
-
view
218 -
download
0
Transcript of Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9...
![Page 1: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/1.jpg)
Learning to be a Depth Camera for Close-Range Human Capture and
Interaction
Sean Ryan Fanello*^ (+9 other guys*)
*Microsoft Research ^iCub Facility – Istituto Italiano di Tecnologia
Presented by Supasorn Suwajanakorn
![Page 2: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/2.jpg)
2D Camera -> Depth sensing
![Page 3: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/3.jpg)
Camera’s modification
![Page 4: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/4.jpg)
Camera’s modification
![Page 5: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/5.jpg)
Machine Learning
IR intensity Depth valuePer pixel
IR reflection is skin-tone invariant [Simpson et al. 1998]
![Page 6: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/6.jpg)
Multi-layered Decision Forest
IR Image pixel
Probability
Quantized depth
0-25 cm25-50 cm50-75 cm75-100 cm
23cm 40cm 60cm 76cm
23 x + 40 x + 60 x + 76 xOutput =
Layer 1Classification Forest
Layer 2Regression Forests
![Page 7: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/7.jpg)
Features
Tree splitting parameter
Left branch Right branch
yes no
u
v
Invariant to additive illumination
x
![Page 8: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/8.jpg)
Training
Layer 1 (output quantized depth posterior)Shanon Entropy
Maximize information gain
Layer 2 (output depth value)Differential Entropy
( Fitting Gaussian to )
![Page 9: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/9.jpg)
Training Data
Real
• Photos taken with calibrated depth camera + IR camera
Synthetic
• 100K Renders of hand and face • Models illumination fall off, noise,
vignetting, subsurface scattering• Face variations thru 100+
Blendshapes• Hand variations thru 26-dof model• Random global transformations
![Page 10: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/10.jpg)
Hyper-parameter
Number of quantized depth
Error
Optimal : 4 quantized depths, 3 trees / forest, 25 max tree depth
![Page 11: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/11.jpg)
![Page 12: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/12.jpg)
3D Fused
![Page 13: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/13.jpg)
Generalization
Across Different People
Across Different Cameras
![Page 14: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/14.jpg)
vs Single-Layer Regressor
Use single-tree forest for all Claimed benefits
1. Multi-layer can infer useful intermediate results that simplify primary task, which in turn increases the accuracy of the model.
2. Shallower tree depth with comparable accuracy. Low memory usage.
![Page 15: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/15.jpg)
Video Demo
![Page 16: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/16.jpg)
Conclusion
• Low-cost technique to convert 2D camera to real-time depth sensor.
• Discriminative approach implicitly handles variations e.g. light physics, skin color, geometry
• Multi-layer can achieve comparable accuracy to deeper single-layer forest
![Page 17: Learning to be a Depth Camera for Close- Range Human Capture and Interaction Sean Ryan Fanello*^ (+9 other guys*) * Microsoft Research ^iCub Facility –](https://reader030.fdocuments.net/reader030/viewer/2022032600/56649dc85503460f94abd82c/html5/thumbnails/17.jpg)
Limitations & Discussion
• Fails to generalize beyond face and hand– Strong shape prior / bias in the forest
• Works only on uniform albedo (skin)– Feature too simple? Why does it even work?
• Output is noisy even if IR image is smooth– No spatial or temporal smoothness
• Baseline method too simple?– Interesting to compare to single view modeling