Automatic Photo Pop-up
Transcript of Automatic Photo Pop-up
![Page 1: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/1.jpg)
Automatic Photo Pop-up
Derek Hoiem, Alexei A. Efros, Martial Hebert (Carnegie Mellon University)
![Page 2: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/2.jpg)
Abstract
This paper presents a fully automatic method for creating a 3D model from a single photograph.
Our algorithm labels regions of the input image into coarse categories: "ground", "sky", and "vertical".
Because of the inherent ambiguity of the problem and the statistical nature of the approach, the algorithm is not expected to work on every image.
![Page 3: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/3.jpg)
Overview
Image to Superpixels (superpixel: a nearly uniform region)
Superpixels to Multiple Constellations
Multiple Constellations to Superpixel Labels
Superpixel Labels to 3D Model
![Page 4: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/4.jpg)
Features for Geometric Classes
![Page 5: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/5.jpg)
Features for Geometric Classes
Color: valuable for identifying the material of a surface.
Texture: provides additional information about the material of a surface.
Location: provides strong cues for distinguishing between ground, vertical structures, and sky.
3D Geometry: helps determine the 3D orientation of a surface.
![Page 6: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/6.jpg)
Features for Geometric Classes
Num: the number of variables in each set
Used: how many variables from each set are actually used in the classifier
![Page 7: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/7.jpg)
Features for Geometric Classes
![Page 8: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/8.jpg)
Horizon Position
We estimate the horizon position from the intersections of nearly parallel lines by finding the position that minimizes the L1 or L2 distance from all of the intersection points in the image.
This often provides a reasonable estimate, since these scenes contain many lines parallel to the ground plane.
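If the horizon is assumed to be a horizontal line and distance is measured vertically, the L1 and L2 minimizations have closed-form solutions: the median and the mean of the intersection points' y-coordinates, respectively. A minimal sketch under those assumptions, with the intersection points taken as precomputed inputs:

```python
import numpy as np

def estimate_horizon(intersections, norm="L1"):
    """Estimate the horizon row from intersection points of nearly
    parallel line pairs (toy sketch; the (x, y) points are assumed to
    have been computed already from detected lines).

    For a horizontal horizon y = h, the sum of L1 distances is
    minimized by the median of the y-coordinates, and the sum of
    squared (L2) distances by their mean.
    """
    ys = np.asarray([p[1] for p in intersections], dtype=float)
    return float(np.median(ys)) if norm == "L1" else float(np.mean(ys))

# Hypothetical intersection points clustered near row 120, plus an outlier
points = [(10, 118), (300, 121), (150, 119), (420, 500)]
print(estimate_horizon(points, "L1"))  # → 120.0 (median is robust to the outlier)
```

The L1 (median) estimate is typically preferred when some line intersections are spurious, since a single bad intersection can drag the mean far from the true horizon.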
![Page 9: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/9.jpg)
Labeling the Image
![Page 10: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/10.jpg)
Labeling the Image
1. Obtaining Superpixels
2. Forming Constellations
3. Geometric Classification
![Page 11: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/11.jpg)
![Page 12: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/12.jpg)
Labeling the Image: Obtaining Superpixels
Superpixels correspond to small, nearly uniform regions in the image. Our implementation uses the over-segmentation technique of [Felzenszwalb and Huttenlocher 2004]. The use of superpixels improves the computational efficiency of our algorithm.
![Page 13: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/13.jpg)
![Page 14: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/14.jpg)
Labeling the Image: Forming Constellations
We group superpixels that are likely to share a common geometric label into a "constellation".
To form constellations, we initialize by assigning one randomly selected superpixel to each of Nc constellations. We then iteratively assign each remaining superpixel to the constellation most likely to share its label, maximizing the average pairwise log-likelihood with the other superpixels in the constellation:
![Page 15: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/15.jpg)
Labeling the Image: Forming Constellations
Nc: the number of constellations
nk: the number of superpixels in constellation Ck
Y: the label of a superpixel
Z: the features of a superpixel
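The greedy grouping step can be sketched as follows. The pairwise same-label likelihoods P(yi = yj | zi, zj) are assumed to be precomputed and passed in as a matrix, and the seed superpixels are passed explicitly rather than drawn at random, so this is an illustrative sketch rather than the paper's implementation:

```python
import math

def form_constellations(n_superpixels, seed_indices, same_label_prob):
    """Greedily group superpixels into constellations (sketch).

    same_label_prob[i][j] is an assumed precomputed pairwise likelihood
    that superpixels i and j share a geometric label; the learned
    likelihood function itself is not reproduced here.
    """
    constellations = [[s] for s in seed_indices]
    for i in range(n_superpixels):
        if i in seed_indices:
            continue
        # assign i to the constellation maximizing the average
        # pairwise log-likelihood with its current members
        def avg_loglik(members):
            return sum(math.log(same_label_prob[i][j]) for j in members) / len(members)
        best = max(constellations, key=avg_loglik)
        best.append(i)
    return constellations
```

With two well-separated groups of superpixels (high same-label likelihood within a group, low across groups), each remaining superpixel joins the constellation seeded from its own group.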
![Page 16: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/16.jpg)
![Page 17: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/17.jpg)
Labeling the Image
Geometric Classification
For each constellation, we estimate:
1. Label likelihood: the confidence in each geometric label.
2. Homogeneity likelihood: whether all superpixels in the constellation have the same label.
![Page 18: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/18.jpg)
Labeling the Image
Geometric Classification
Next, we estimate the likelihood of a superpixel label by marginalizing over the constellation likelihoods:
Si: the i-th superpixel
Yi: the label of Si
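One way to read this marginalization is as a sum, over every constellation hypothesis containing Si, of that constellation's label likelihood weighted by its homogeneity likelihood. A toy sketch under that reading, where all likelihood values are hypothetical inputs:

```python
import numpy as np

def superpixel_label_likelihood(memberships, label_lik, homog_lik):
    """Marginalize constellation evidence down to superpixels (sketch).

    memberships[i] lists the constellations (across multiple grouping
    hypotheses) that contain superpixel S_i; label_lik[k][y] and
    homog_lik[k] are assumed label and homogeneity likelihoods for
    constellation C_k.
    """
    n_labels = len(label_lik[0])
    out = []
    for cons in memberships:
        scores = np.zeros(n_labels)
        for k in cons:
            # homogeneous constellations contribute their label confidence
            scores += homog_lik[k] * np.asarray(label_lik[k])
        out.append(scores / scores.sum())  # normalize to a distribution
    return out
```

A superpixel that appears in several constellations thus inherits a blend of their label confidences, with unreliable ("mixed") constellations down-weighted by their low homogeneity likelihood.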
![Page 19: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/19.jpg)
Training
![Page 20: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/20.jpg)
Training: Training Data
The likelihood functions used to group superpixels and label constellations are learned from training images. Each training image is over-segmented into superpixels, and each superpixel is given a ground-truth label according to its geometric class.
![Page 21: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/21.jpg)
Training
![Page 22: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/22.jpg)
Training: Superpixel Same-Label Likelihoods
To learn the likelihood that two superpixels have the same label, we sample 2500 same-label and different-label pairs of superpixels from our training data. The pairwise likelihood function is estimated using the logistic regression version of AdaBoost [Collins 2002] with weak learners based on eight-node decision trees [Friedman 2000].
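A minimal stand-in for this kind of learner is sketched below, using decision stumps instead of eight-node trees and the standard logistic link P(y = 1 | x) = 1 / (1 + e^(-2F(x))) for a boosted score F. This is an illustrative sketch of boosting with a logistic output, not the paper's implementation:

```python
import numpy as np

def fit_stump(X, y, w):
    # exhaustive search over (feature, threshold, polarity) for the
    # decision stump with minimum weighted error; y is in {-1, +1}
    best = None
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, f] - t) > 0, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, f, t, pol)
    return best

def adaboost(X, y, rounds=5):
    n = len(y)
    w = np.full(n, 1.0 / n)          # uniform initial sample weights
    model = []
    for _ in range(rounds):
        err, f, t, pol = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(pol * (X[:, f] - t) > 0, 1, -1)
        w *= np.exp(-alpha * y * pred)  # up-weight misclassified pairs
        w /= w.sum()
        model.append((alpha, f, t, pol))
    return model

def prob_same(model, x):
    # logistic link: boosted score F maps to P(same) = 1 / (1 + e^(-2F))
    F = sum(a * (1 if pol * (x[f] - t) > 0 else -1) for a, f, t, pol in model)
    return 1.0 / (1.0 + np.exp(-2 * F))
```

Here x would be the pairwise features computed from a superpixel pair (e.g. differences of their feature vectors); the logistic link is what turns the boosted classifier's score into the same-label likelihood used for grouping.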
![Page 23: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/23.jpg)
Training
Z1, Z2: the features of a pair of superpixels
Y1, Y2: the labels of the superpixels
nf: the number of features
The likelihood function is obtained using kernel density estimation [Duda 2000] over the m-th weighted distribution.
![Page 24: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/24.jpg)
Training: Constellation Label and Homogeneity Likelihoods
To learn the label and homogeneity likelihoods, we form multiple sets of constellations for the superpixels in our training images using the learned pairwise function. Each constellation is then labeled as "ground", "vertical", "sky", or "mixed", according to the ground truth.
Each decision tree weak learner selects the best features to use and estimates the confidence in each label based on those features.
![Page 25: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/25.jpg)
Training: Constellation Label and Homogeneity Likelihoods
The boosted decision tree estimator outputs a confidence for each of "ground", "vertical", "sky", and "mixed", which are normalized to sum to 1.
The product of the label and homogeneity likelihoods for a particular geometric label is then given by the normalized confidences in "ground", "vertical", "sky", and "mixed".
![Page 26: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/26.jpg)
Creating the 3D Model
![Page 27: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/27.jpg)
Creating the 3D Model: Cutting and Folding
Our model = ground plane + planar objects. We need to partition the vertical regions into a set of objects and determine where each object meets the ground.
Preprocess: set any superpixel that is labeled as ground or sky and is completely surrounded by non-ground or non-sky pixels to the most common label of its neighboring superpixels.
![Page 28: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/28.jpg)
Creating the 3D Model: Cutting and Folding
1. We divide the vertically labeled pixels into disconnected or loosely connected regions using the connected components algorithm.
2. For each region, we fit a set of line segments to the region's boundary with the labeled ground using the Hough transform [Duda 1972].
3. Next, within each region, we join the disjoint line segments into a set of polylines.
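Steps 1 and 2 can be sketched as follows, with scipy's connected-components labeling and a least-squares line fit standing in for the paper's Hough transform; the binary masks and the "ground directly below" contact test are simplifying assumptions:

```python
import numpy as np
from scipy import ndimage

def ground_contact_lines(vertical_mask, ground_mask):
    """Sketch of cutting steps 1-2: split vertical pixels into connected
    regions, then fit a line to each region's boundary with the ground.
    A least-squares fit stands in for the Hough transform of the paper.
    """
    labels, n = ndimage.label(vertical_mask)   # step 1: connected components
    fits = []
    for r in range(1, n + 1):
        region = labels == r
        # contact pixels: vertical pixels whose neighbor one row below is ground
        below = np.zeros_like(region)
        below[:-1, :] = ground_mask[1:, :]
        ys, xs = np.nonzero(region & below)
        if len(xs) >= 2 and np.ptp(xs) > 0:
            slope, intercept = np.polyfit(xs, ys, 1)  # y = slope * x + intercept
            fits.append((slope, intercept))
    return fits
```

Each fitted ground-contact line corresponds to one "fold" of the pop-up: the object plane is stood up along that line, perpendicular to the ground.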
![Page 29: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/29.jpg)
Creating the 3D Model: Cutting and Folding
We treat each polyline as a separate object, modeled with a set of connected planes that are perpendicular to the ground plane.
![Page 30: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/30.jpg)
Creating the 3D Model: Camera Parameters
To obtain true 3D world coordinates, we would need to know the two sets of camera parameters: 1. intrinsic, 2. extrinsic.
Cited from www.csie.ntu.edu.tw/~cyy/courses/vfx/05spring/lectures/scribe/07scribe.pdf
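As an illustration of how intrinsics interact with the estimated horizon: for a camera with zero tilt and roll, focal length f in pixels, and a known height above the ground, each ground pixel below the horizon backprojects to a unique 3D point. The function below is a hedged sketch under those assumptions (the parameter names are illustrative, not the paper's):

```python
def backproject_ground(u, v, f, cx, horizon_v, cam_height=1.0):
    """Backproject a ground-plane pixel to 3D (sketch), assuming an
    untilted, unrolled camera with focal length f (pixels), horizontal
    principal point cx, and the horizon at image row horizon_v.
    Units follow cam_height. Only pixels below the horizon
    (v > horizon_v) lie on the ground plane.
    """
    z = f * cam_height / (v - horizon_v)  # depth along the optical axis
    x = (u - cx) * z / f                  # lateral offset
    return x, z
```

This is why a reasonable horizon estimate matters: pixels near the horizon map to very large depths, so small horizon errors produce large distortions in the resulting model.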
![Page 31: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/31.jpg)
Failure Cases
![Page 32: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/32.jpg)
Failure
![Page 33: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/33.jpg)
Failure
![Page 34: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/34.jpg)
Failure
Examples of failures:
1. Labeling error
2. Polyline fitting error
3. Modeling assumptions
4. Occlusion in the image
5. Poor estimation of the horizon position
![Page 35: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/35.jpg)
Result
![Page 36: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/36.jpg)
Result
![Page 37: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/37.jpg)
Result
![Page 38: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/38.jpg)
Result
![Page 39: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/39.jpg)
Conclusion
![Page 40: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/40.jpg)
Conclusion
Future work:
1. Use a segmentation technique [Li 2004, Lazy Snapping].
2. Estimate the orientation of vertical regions from the image data, allowing a more robust polyline fit.
3. Extend the system to indoor scenes.
![Page 41: Automatic Photo Pop-up](https://reader036.fdocuments.net/reader036/viewer/2022081603/56813b4c550346895da4397a/html5/thumbnails/41.jpg)
Thanks for listening!