Slides on Photosynth.net, from my MSc at Imperial

3D browsing of a photos dataset Uncovering Photosynth.net Markou Nikolas, Romain Dossin, Kevin Keraudren November 29, 2010

Transcript of Slides on Photosynth.net, from my MSc at Imperial

Page 1: Slides on Photosynth.net, from my MSc at Imperial

3D browsing of a photos dataset

Uncovering Photosynth.net

Markou Nikolas, Romain Dossin, Kevin Keraudren

November 29, 2010

Page 2: Slides on Photosynth.net, from my MSc at Imperial

Introduction The Bundler Pipeline Photo Explorer Rendering Running the code Conclusion

Introduction

Flickr search "Rome Coliseum"

34,169 results

Markou Nikolas, Romain Dossin, Kevin Keraudren () 3D browsing of a photos dataset November 29, 2010 2 / 22


Page 4: Slides on Photosynth.net, from my MSc at Imperial


Introduction

Huge amount of data on the Web: Flickr > 5 billion photos

How can we browse such an amount of data?

What can we learn from it?

What if we could turn 2D into 3D?

→ Photosynth.net (University of Washington + Microsoft Research)

Page 5: Slides on Photosynth.net, from my MSc at Imperial

1 The Bundler Pipeline
    - Extract the focal length from the EXIF tags (extract_focal.pl)
    - Find feature points in each image using SIFT
    - Match keypoint descriptors between each pair of images
    - Structure from motion: recover a set of camera parameters and a 3D location for each track

2 Photo Explorer Rendering
    - Render the scene
    - Transitions
        - View Interpolation

3 Running the code

Page 6: Slides on Photosynth.net, from my MSc at Imperial

Extract the focal length from the EXIF tags (extract_focal.pl)

Jhead

ImageMagick: identify -format %[exif:*] image.jpg

$$f_{\text{pixels}} = \text{X resolution} \times \frac{f_{\text{mm}}}{\text{CCD width}_{\text{mm}}}$$

→ used later to initialize the bundle adjustment
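The conversion above can be sketched in a few lines of Python. This is only an illustration of the formula, not the code of extract_focal.pl; the function name and the sample camera values are assumptions made for the example.

```python
def focal_length_in_pixels(image_width_px, focal_mm, ccd_width_mm):
    """Convert a focal length from millimetres to pixels.

    image_width_px: horizontal resolution of the image (X resolution)
    focal_mm:       focal length read from the EXIF tags
    ccd_width_mm:   physical width of the sensor
    """
    if ccd_width_mm <= 0:
        raise ValueError("CCD width must be positive")
    return image_width_px * (focal_mm / ccd_width_mm)


# Hypothetical camera: 3072-pixel-wide image, 7.4 mm focal length, 7.18 mm sensor.
print(focal_length_in_pixels(3072, 7.4, 7.18))
```

The sensor width is usually not stored in the EXIF tags themselves, so a script like this needs a lookup table of camera models or a value supplied by the user.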

Page 7: Slides on Photosynth.net, from my MSc at Imperial

Find feature points in each image using SIFT

SIFT - Scale Invariant Feature Transform

From scale space to feature space

SIFT transforms an image into a large collection of local feature vectors, each of which is invariant to:

image translation

scaling

rotation

and partially invariant to :

illumination changes

affine projections

3D projections

Page 8: Slides on Photosynth.net, from my MSc at Imperial

Find feature points in each image using SIFT

SIFT (continued)

It is based on the highly successful Gaussian pyramid and the simple-to-implement Difference of Gaussians (DoG) technique.

Source: http://fourier.eng.hmc.edu

At each level, the maxima and minima are kept, and an orientation histogram is built from the pixels around each of them for extra robustness.

These features can then be matched against other images. Objects can also be described as a set of features.
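To make the Difference of Gaussians idea concrete, here is a deliberately simplified sketch on a 1D signal. Real SIFT works on 2D images and searches for extrema across neighbouring scales as well; the function names, the 3-sigma truncation and the scale ratio k = 1.6 are assumptions chosen for this illustration.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalised 1D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve a 1D signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [
        sum(k * signal[min(max(i + j - radius, 0), last)]
            for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def dog_extrema(signal, sigma=1.0, k=1.6):
    """Indices where the Difference of Gaussians has a strict local extremum."""
    fine, coarse = blur(signal, sigma), blur(signal, k * sigma)
    dog = [c - f for f, c in zip(fine, coarse)]
    return [
        i for i in range(1, len(dog) - 1)
        if (dog[i] > dog[i - 1] and dog[i] > dog[i + 1])
        or (dog[i] < dog[i - 1] and dog[i] < dog[i + 1])
    ]


# A flat signal with a single bump: the bump appears as an extremum of the DoG.
signal = [0.0] * 20 + [1.0] + [0.0] * 20
print(dog_extrema(signal))
```

A perfectly flat signal produces no extrema, which matches the intuition that keypoints are only found where the image content changes.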

Page 9: Slides on Photosynth.net, from my MSc at Imperial

Find feature points in each image using SIFT

Output format of ./sift

<number of keypoints> <descriptor length>
<subpixel row> <subpixel column> <scale> <orientation>
<invariant descriptor vector>
<subpixel row> <subpixel column> <scale> <orientation>
<invariant descriptor vector>
...
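A file in this layout can be read with a short parser. The sketch below assumes that all values are whitespace-separated numbers and that a descriptor may wrap over several lines; the function name and the dictionary keys are hypothetical, and the toy input uses a descriptor length of 3 only to keep the example short.

```python
def parse_sift_output(text):
    """Parse the keypoint layout described above into a list of dicts."""
    tokens = text.split()
    count, length = int(tokens[0]), int(tokens[1])
    keypoints = []
    pos = 2
    for _ in range(count):
        # Four header values, then `length` descriptor values.
        row, col, scale, orientation = (float(t) for t in tokens[pos:pos + 4])
        descriptor = [float(t) for t in tokens[pos + 4:pos + 4 + length]]
        if len(descriptor) != length:
            raise ValueError("truncated descriptor")
        keypoints.append({
            "row": row,
            "column": col,
            "scale": scale,
            "orientation": orientation,
            "descriptor": descriptor,
        })
        pos += 4 + length
    return keypoints


# Toy input: 2 keypoints with a descriptor of length 3.
sample = """2 3
1.5 2.5 1.0 0.1
1 2 3
4.0 5.0 2.0 -0.2
4 5 6
"""
for keypoint in parse_sift_output(sample):
    print(keypoint["row"], keypoint["column"], keypoint["descriptor"])
```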

Page 10: Slides on Photosynth.net, from my MSc at Imperial

Match keypoint descriptors between each pair of images

Approximate nearest neighbors to match keypoints between each pair of images

For two images I and J, the SIFT keypoints of J are stored in a kd-tree.

For each keypoint in I, look for its nearest neighbor in J.

Source: Wikipedia
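The kd-tree lookup can be sketched as follows. Note that this is an exact nearest-neighbour search written for readability, whereas the slide refers to an approximate search; the function names are hypothetical, and the 2D points stand in for real SIFT descriptors.

```python
def build_kdtree(points, depth=0):
    """Recursively build a kd-tree; a node is (point, left, right) or None."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, depth=0, best=None):
    """Return the stored point closest to `query`."""
    if node is None:
        return best
    point, left, right = node
    if best is None or squared_distance(point, query) < squared_distance(best, query):
        best = point
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    # Only cross the splitting plane if a closer point could lie beyond it.
    if diff * diff < squared_distance(best, query):
        best = nearest(far, query, depth + 1, best)
    return best


# Keypoints of image J go into the tree; each keypoint of image I is a query.
keypoints_j = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
keypoints_i = [(9, 2), (3, 4)]
tree = build_kdtree(keypoints_j)
for query in keypoints_i:
    print(query, "->", nearest(tree, query))
```

Building the tree once for J and querying it for every keypoint of I avoids comparing every pair of descriptors directly.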

Page 16: Slides on Photosynth.net, from my MSc at Imperial

Match keypoint descriptors between each pair of images

The fundamental matrix

Corresponding points within stereo-pair images are connected by the fundamental matrix.

Given a set of corresponding points $x_i \leftrightarrow x'_i$ in two images, $F$ is the fundamental matrix $\iff \forall i,\ x_i'^{\top} F x_i = 0$.

This is a linear equation in the unknown entries of $F$. If $x = (x, y, 1)^{\top}$ and $x' = (x', y', 1)^{\top}$, then:

$$x'x f_{11} + x'y f_{12} + x' f_{13} + y'x f_{21} + y'y f_{22} + y' f_{23} + x f_{31} + y f_{32} + f_{33} = 0$$
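The constraint is easy to check numerically. The sketch below evaluates x'ᵀ F x for one correct and one incorrect match; the matrix F used here is an assumed example (a camera translated purely horizontally, with identity intrinsics), and the function name is hypothetical.

```python
def epipolar_residual(F, x, x_prime):
    """Return x'^T F x for homogeneous image points x and x'."""
    Fx = [sum(F[r][c] * x[c] for c in range(3)) for r in range(3)]
    return sum(x_prime[r] * Fx[r] for r in range(3))


# Assumed example: fundamental matrix of a pure horizontal translation.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

x = (120.0, 80.0, 1.0)
x_match = (95.0, 80.0, 1.0)   # same image row: the constraint holds
x_wrong = (95.0, 60.0, 1.0)   # different row: the constraint is violated

print(epipolar_residual(F, x, x_match))
print(epipolar_residual(F, x, x_wrong))
```

With noisy data the residual of a correct match is small rather than exactly zero, which is why a tolerance is needed when deciding whether a match agrees with a candidate F.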

Page 17: Slides on Photosynth.net, from my MSc at Imperial

Match keypoint descriptors between each pair of images

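Fundamental matrix estimation using the 8-point algorithm and RANSAC

With n point matches, this can be rewritten:

$$A f = \begin{pmatrix} x'_1 x_1 & x'_1 y_1 & x'_1 & y'_1 x_1 & y'_1 y_1 & y'_1 & x_1 & y_1 & 1 \\ \vdots & & & & & & & & \vdots \\ x'_n x_n & x'_n y_n & x'_n & y'_n x_n & y'_n y_n & y'_n & x_n & y_n & 1 \end{pmatrix} f = 0$$

where $f$ is the 9-vector made up of the entries of $F$ in row-major order.

A non-zero solution exists $\iff \operatorname{rank}(A) \le 8$; it is unique ($f$ determined up to scale) in the case of equality.

If $\operatorname{rank}(A) > 8$ (e.g. because of noise): take the least-squares solution, or run RANSAC and keep the best-fitting model.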
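As an illustration of how the system A f = 0 is assembled, the sketch below builds one row of A per match and multiplies it by a known f. The helper names are hypothetical, and the vector f is an assumed example (a pure horizontal translation, flattened in row-major order). Solving for an unknown f, typically with a singular value decomposition, is not shown.

```python
def constraint_row(x, x_prime):
    """Row of A contributed by one match (x, y) <-> (x', y')."""
    (u, v), (up, vp) = x, x_prime
    return [up * u, up * v, up, vp * u, vp * v, vp, u, v, 1.0]


def build_A(matches):
    """Stack one constraint row per point match."""
    return [constraint_row(x, xp) for x, xp in matches]


def times(A, f):
    """Matrix-vector product A f."""
    return [sum(a * b for a, b in zip(row, f)) for row in A]


# Assumed F for a pure horizontal translation, flattened in row-major order.
f = [0.0, 0.0, 0.0, 0.0, 0.0, -1.0, 0.0, 1.0, 0.0]

matches = [
    ((120.0, 80.0), (95.0, 80.0)),
    ((10.0, 33.0), (4.0, 33.0)),
]
print(times(build_A(matches), f))
```

Each match adds one equation, so at least 8 matches are needed to determine the 9 entries of f up to scale.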

Page 18: Slides on Photosynth.net, from my MSc at Imperial

Match keypoint descriptors between each pair of images

RANSAC - Random Sample Consensus

During any matching procedure we are stuck with erroneous matches. These mismatched points are called outliers and are usually catastrophic when trying to fit a model to the data.

Source: Wikipedia

Page 19: Slides on Photosynth.net, from my MSc at Imperial

Match keypoint descriptors between each pair of images

RANSAC - Random Sample Consensus

Method:

Randomly choose a number of points

Try to fit a model to them

Check how many other points are in consensus with the model

This is repeated, and the best fit is kept as the solution. All points not fitting this solution (outliers) are usually removed from the data set. This process filters out most of the large errors.
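The three steps above can be sketched on a simpler model than the fundamental matrix: fitting a 2D line to points that contain outliers. The function names, the tolerance, the number of iterations and the data are all assumptions made for this example.

```python
import random


def fit_line(p, q):
    """Line y = a*x + b through two points, or None if they share the same x."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    a = (y2 - y1) / (x2 - x1)
    return a, y1 - a * x1


def ransac_line(points, iterations=200, tolerance=0.5, seed=0):
    """Return (model, inliers) for the line with the largest consensus set."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # 1. Randomly choose the minimal number of points (two for a line).
        model = fit_line(*rng.sample(points, 2))
        if model is None:
            continue
        # 2. The model is the line through them.
        a, b = model
        # 3. Count the points in consensus with the model.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# 20 points on y = 2x + 1, plus 5 gross outliers.
inlier_points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outlier_points = [(3.0, 40.0), (7.0, -20.0), (12.0, 90.0), (15.0, 0.0), (18.0, -35.0)]
model, inliers = ransac_line(inlier_points + outlier_points)
print(model, len(inliers))
```

For the fundamental matrix the same loop applies, with the 8-point algorithm as the model fit and the epipolar constraint as the consensus check.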

Page 20: Slides on Photosynth.net, from my MSc at Imperial

Match keypoint descriptors between each pair of images

Organize the matches into tracks

Source: "Modeling the World from Internet Photo Collections"

Page 21: Slides on Photosynth.net, from my MSc at Imperial

Structure from motion: recover a set of camera parameters and a 3D location for each track

Start with the two cameras (images) that best match
- Estimate their parameters (focal length from EXIF tags, 5-point algorithm)
- Recover the 3D position of the points they both observe through a bundle adjustment

Then take the camera that observes the most of the same points
- Estimate its parameters using Direct Linear Transformation
- Run a bundle adjustment adding only the already known points: only the new camera parameters can change
- Run another bundle adjustment adding the points observed by another camera

Iterate with a new camera

Page 22: Slides on Photosynth.net, from my MSc at Imperial


Structure from motion: recover a set of camera parameters and a 3D location for each track

Bundle adjustment:

n cameras parametrized by $\Theta_i$

m tracks parametrized by the 3D points $X_j$

$q_{ij}$: the observed projection of the j-th track in the i-th camera

$P(\Theta, X)$: mapping between a 3D point $X$ and its 2D projection in a camera with parameters $\Theta$

$w_{ij}$: 1 if camera i observes point j, 0 otherwise

Minimize (the unknowns are the camera parameters $\Theta_i$ and the 3D points $X_j$):

$$\sum_{i=1}^{n} \sum_{j=1}^{m} w_{ij} \, \lVert q_{ij} - P(\Theta_i, X_j) \rVert$$
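The quantity being minimized can be computed directly once a projection model is chosen. The sketch below assumes a very simplified pinhole camera (a focal length and a centre, no rotation, looking along the +z axis) and encodes w_ij by only listing the observed (i, j) pairs; all names and values are hypothetical. It evaluates the cost as written on the slide; practical implementations usually minimize the squared norms instead, and the optimization itself is not shown.

```python
import math


def project(camera, X):
    """Simplified pinhole projection P(Theta, X): no rotation, looking along +z."""
    focal = camera["focal"]
    cx, cy, cz = camera["center"]
    x, y, z = X[0] - cx, X[1] - cy, X[2] - cz
    return (focal * x / z, focal * y / z)


def reprojection_error(cameras, points, observations):
    """Sum of ||q_ij - P(Theta_i, X_j)|| over the observed (i, j) pairs.

    observations maps (camera index i, point index j) to the measured 2D
    position q_ij; pairs that are absent correspond to w_ij = 0.
    """
    total = 0.0
    for (i, j), q in observations.items():
        px, py = project(cameras[i], points[j])
        total += math.hypot(q[0] - px, q[1] - py)
    return total


cameras = [
    {"focal": 100.0, "center": (0.0, 0.0, 0.0)},
    {"focal": 100.0, "center": (1.0, 0.0, 0.0)},
]
points = [(0.0, 0.0, 10.0), (2.0, 1.0, 20.0)]
observations = {
    (0, 0): (0.0, 0.0),
    (0, 1): (10.0, 5.0),
    (1, 0): (-10.0, 0.0),
    (1, 1): (8.0, 9.0),   # deliberately off from the true projection
}
print(reprojection_error(cameras, points, observations))
```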

Page 23: Slides on Photosynth.net, from my MSc at Imperial

Render the scene

As frusta

Images

Points and lines

3D rendering

Sources: Wikipedia & "Modeling the World from Internet Photo Collections"

Page 24: Slides on Photosynth.net, from my MSc at Imperial

Transitions

Representation accuracy of the real scene

Camera motion
- Linear interpolation
- Timing
- Twinkle

View interpolation
- Triangulated Morphs
- Planar Morphs
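The linear interpolation of the camera motion can be sketched as follows. This only covers the camera position; interpolating the orientation and the timing of the transition are not handled here, and the function names are hypothetical.

```python
def lerp(a, b, t):
    """Linearly interpolate between two 3D positions for t in [0, 1]."""
    return tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))


def camera_path(start, end, steps):
    """Positions from start to end, both included, at evenly spaced times."""
    if steps < 2:
        raise ValueError("a path needs at least two positions")
    return [lerp(start, end, k / (steps - 1)) for k in range(steps)]


# Move the virtual camera between the centres of two photos in 5 frames.
for position in camera_path((0.0, 0.0, 0.0), (4.0, 2.0, 0.0), 5):
    print(position)
```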

Page 25: Slides on Photosynth.net, from my MSc at Imperial

Transitions

Triangulated Morphs

Method

- Projection of the points onto each image
- 2D Delaunay triangulation, with edge constraints
- Projection of the triangulation onto an average plane
- Creation of a 3D mesh
- Display depending on camera location

Rendering

- Good geometry
- Artifacts

Page 26: Slides on Photosynth.net, from my MSc at Imperial

Transitions

Planar Morphs

Method

- Projection onto a common plane
- Display depending on camera location

Rendering

- Lower quality of the geometry
- Fewer artifacts

Page 27: Slides on Photosynth.net, from my MSc at Imperial

Running the code

One full run: Bundler → Poisson reconstruction

82 photos from Flickr, big images, 44 h, 115,196 points recovered at the Bundler stage...

Figure: Point cloud obtained from Bundler, and Poisson surface reconstruction done after PMVS

Page 28: Slides on Photosynth.net, from my MSc at Imperial

Conclusion

Photosynth is only a beginning...

The University of North Carolina aims to reconstruct famous sites in a day on a "normal machine"

Figure: Reconstruction from 2D photos

Page 29: Slides on Photosynth.net, from my MSc at Imperial


Quiz

1 SIFT is used on images to find key features. To which transformations is SIFT invariant, and when does it not perform so well? How does this affect the clusters generated afterwards?

2 If you were given the parameters of a camera (rotation matrix, focal length, position of the center), the associated image and the 3D point cloud it observes, where would you place the 2D image in 3D space?

3 As Photosynth is a web application that must be able to display high-resolution photos to a lot of people simultaneously, how do you think Microsoft optimized this system in order to avoid large data transfers?
