Lecture 11: Stereo and optical flow CS6670: Computer Vision Noah Snavely.
Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.
-
date post
15-Jan-2016 -
Category
Documents
-
view
227 -
download
0
Transcript of Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.
![Page 1: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/1.jpg)
Lecture 11: Structure from motion, part 2
CS6670: Computer VisionNoah Snavely
![Page 2: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/2.jpg)
Announcements
• Project 2 due tonight at 11:59pm– Artifact due tomorrow (Friday) at 11:59pm
• Questions?
![Page 3: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/3.jpg)
Final project
• Form your groups and submit a proposal
• Work on final project
• Final project presentations the last few classes
• Final report due the last day of classes
![Page 4: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/4.jpg)
Final project
• One-person project: implement a recent research paper
![Page 5: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/5.jpg)
Example one person projects• Seam carving
– SIGGRAPH 2007
![Page 6: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/6.jpg)
Example one person projects
• What does the sky tell us about the camera?– Lalonde, et al, ECCV 2008
![Page 7: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/7.jpg)
Other ideas
• Will post more ideas online
• Look at recent ICCV/CVPR/ECCV/SIGGRAPH papers
![Page 8: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/8.jpg)
Example two/three person projects
• Automatic calibration of your camera’s clock
• In roughly what year was a given photo taken?
• Matching historical images to modern-day images
• Matching images on news sites / blogs for article matching
• Will post others online soon…
![Page 9: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/9.jpg)
Final projects
• Think about your groups and project ideas
• Proposals will be due on Tuesday, Oct. 27
![Page 10: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/10.jpg)
Scene reconstruction
Feature detectionFeature detection
Pairwisefeature matching
Pairwisefeature matching
Incremental structure
from motion
Incremental structure
from motion
Correspondence estimation
Correspondence estimation
![Page 11: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/11.jpg)
Feature detectionDetect features using SIFT [Lowe, IJCV 2004]
![Page 12: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/12.jpg)
Feature detectionDetect features using SIFT [Lowe, IJCV 2004]
![Page 13: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/13.jpg)
Feature detectionDetect features using SIFT [Lowe, IJCV 2004]
![Page 14: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/14.jpg)
Feature matchingMatch features between each pair of images
![Page 15: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/15.jpg)
Feature matchingRefine matching using RANSAC [Fischler & Bolles 1987] to estimate fundamental matrices between pairs
![Page 16: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/16.jpg)
Image connectivity graph
(graph layout produced using the Graphviz toolkit: http://www.graphviz.org/)
![Page 17: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/17.jpg)
Incremental structure from motion
![Page 18: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/18.jpg)
Incremental structure from motion
![Page 19: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/19.jpg)
Incremental structure from motion
![Page 20: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/20.jpg)
Problem size
• Trevi Fountain collection 466 input photos + > 100,000 3D points = very large optimization problem
![Page 21: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/21.jpg)
Photo Tourism overview
Scene reconstruction
Scene reconstruction
Photo ExplorerPhoto
ExplorerInput photographs
![Page 22: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/22.jpg)
Photo Explorer
![Page 23: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/23.jpg)
Can we reconstruct entire cities?
![Page 24: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/24.jpg)
![Page 25: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/25.jpg)
![Page 26: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/26.jpg)
Gigantic matching problem
• 1,000,000 images 500,000,000,000 pairs– Matching all of these on a 1,000-node cluster would
take more than a year, even if we match 10,000 every second
– And involves TBs of data
• The vast majority (>99%) of image pairs do not match
• There are better ways of finding matching images (more on this later)
![Page 27: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/27.jpg)
Gigantic SfM problem
• Largest problem size we’ve seen:– 15,000 cameras– 4 million 3D points– more than 12 million parameters– more than 25 million equations
• Huge optimization problem• Requires sparse least squares techniques
![Page 28: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/28.jpg)
Building Rome in a Day
Rome, Italy. Reconstructed 150,000 in 21 hours on 496 machines
Colosseum
St. Peter’s Basilica
Trevi Fountain
![Page 29: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/29.jpg)
Dubrovnik, Croatia. 4,619 images (out of an initial 57,845).Total reconstruction time: 23 hoursNumber of cores: 352
Dubrovnik, Croatia. 4,619 images (out of an initial 57,845).Total reconstruction time: 23 hoursNumber of cores: 352
![Page 30: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/30.jpg)
Dubrovnik
Dubrovnik, Croatia. 4,619 images (out of an initial 57,845).Total reconstruction time: 23 hoursNumber of cores: 352
![Page 31: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/31.jpg)
San Marco Square
San Marco Square and environs, Venice. 14,079 photos, out of an initial 250,000. Total reconstruction time: 3 days. Number of cores: 496.
![Page 32: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/32.jpg)
![Page 33: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/33.jpg)
![Page 34: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/34.jpg)
![Page 36: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/36.jpg)
Questions?
![Page 37: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/37.jpg)
A few more preliminaries• Let’s revisit homogeneous coordinates• What is the geometric intuition?
– a point in the image is a ray in projective space
• Each point (x,y) on the plane is represented by a ray (sx,sy,s)– all points on the ray are equivalent: (x, y, 1) (sx, sy, s)
(0,0,0)
(sx,sy,s)
image plane
(x,y,1)y
x-z
![Page 38: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/38.jpg)
Projective lines• What does a line in the image correspond to in
projective space?
• A line is a plane of rays through origin– all rays (x,y,z) satisfying: ax + by + cz = 0
z
y
x
cba0 :notationvectorin
• A line is also represented as a homogeneous 3-vector ll p
![Page 39: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/39.jpg)
l
Point and line duality– A line l is a homogeneous 3-vector– It is to every point (ray) p on the line: l p=0
p1p2
What is the intersection of two lines l1 and l2 ?
• p is to l1 and l2 p = l1 l2
Points and lines are dual in projective space• given any formula, can switch the meanings of points
and lines to get another formula
l1
l2
p
What is the line l spanned by rays p1 and p2 ?
• l is to p1 and p2 l = p1 p2
• l is the plane normal
.
![Page 40: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/40.jpg)
Ideal points and lines
• Ideal point (“point at infinity”)– p (x, y, 0) – parallel to image plane– It has infinite image coordinates
(sx,sy,0)-y
x-z image plane
• Ideal line• l (a, b, 0) – parallel to image plane
(a,b,0)-y
x-z image plane
• Corresponds to a line in the image (finite coordinates)– goes through image origin (principle point)
![Page 41: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/41.jpg)
Two-view geometry
• Where do epipolar lines come from?
epipolar plane
epipolar lineepipolar lineepipolar lineepipolar line
0
3d point lies somewhere along r
(projection of r)
![Page 42: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/42.jpg)
Fundamental matrix
• This epipolar geometry is described by a special 3x3 matrix , called the fundamental matrix
• Epipolar constraint on corresponding points:
• The epipolar line of point p in image J is:
epipolar plane
epipolar lineepipolar lineepipolar lineepipolar line
0
(projection of ray)
![Page 43: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/43.jpg)
Fundamental matrix
• Two special points: e1 and e2 (the epipoles): projection of one camera into the other
epipolar plane
epipolar lineepipolar lineepipolar lineepipolar line
0
(projection of ray)
![Page 44: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/44.jpg)
Fundamental matrix
• Two special points: e1 and e2 (the epipoles): projection of one camera into the other
0
![Page 45: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/45.jpg)
Rectified case
• Images have the same orientation, t parallel to image planes
• Where are the epipoles?
![Page 46: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/46.jpg)
Epipolar geometry demo
![Page 47: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/47.jpg)
Fundamental matrix
• Why does F exist?• Let’s derive it…
0
![Page 48: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/48.jpg)
Fundamental matrix – calibrated case
0
: intrinsics of camera 1 : intrinsics of camera 2
: rotation of image 2 w.r.t. camera 1
: ray through p in camera 1’s (and world) coordinate system
: ray through q in camera 2’s coordinate system
![Page 49: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/49.jpg)
Fundamental matrix – calibrated case
• , , and are coplanar• epipolar plane can be represented as
0
![Page 50: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/50.jpg)
Fundamental matrix – calibrated case
0
![Page 51: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/51.jpg)
Fundamental matrix – calibrated case
• One more substitution:– Cross product with t can be represented as a 3x3 matrix
0
![Page 52: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/52.jpg)
Fundamental matrix – calibrated case
0
![Page 53: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/53.jpg)
Fundamental matrix – calibrated case
0
{the Essential matrix
![Page 54: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/54.jpg)
Fundamental matrix – uncalibrated case
0
the Fundamental matrix
![Page 55: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely.](https://reader035.fdocuments.net/reader035/viewer/2022062314/56649d435503460f94a1f79b/html5/thumbnails/55.jpg)
Properties of the Fundamental Matrix
• is the epipolar line associated with
• is the epipolar line associated with
• and
• is rank 2
• How many parameters does F have?55
T