Stereo Vision ECE 847: Digital Image Processing Stan Birchfield Clemson University.

Stereo Vision

ECE 847: Digital Image Processing

Stan Birchfield, Clemson University

Modeling from multiple views

[Figure: capture configurations arranged by time and number of cameras: photograph, binocular stereo, trinocular stereo, multi-baseline stereo, camcorder, human vision, camera dome, two frames, ...]

Stereo: Greek for solid

Stereoscope

Invented by Wheatstone in 1838

Modern version

Can you fuse these?

[Figure: left and right images]

No special instrument needed

Just relax your eyes

Random dot stereogram

Invented by Bela Julesz in 1959

http://www.magiceye.com/faq.htm

Autostereogram

Do you see the shark?

http://en.wikipedia.org/wiki/Autostereogram

Can you cross-fuse these?

[Figure: right and left images]

Note: Cross-fusion is necessary if the distance between the images is greater than the inter-ocular distance

[Diagram labels: L R (impossible); R L (instead, trick the brain)]

Tsukuba stereo images courtesy of Y. Ohta and Y. Nakamura at the University of Tsukuba

Human stereo geometry

http://webvision.med.utah.edu/space_perception.html

[Figure labels: fixation point, corresponding points, aL, aR, disparity]

Horopter

• Horopter: surface where disparity is zero

• For round retina, the theoretical horopter is a circle (Vieth-Muller circle)

http://webvision.med.utah.edu/space_perception.html

Cyclopean image

http://webvision.med.utah.edu/space_perception.html

http://bearah718.tripod.com/sitebuildercontent/sitebuilderpictures/cyclops.jpg

Panum’s fusional area (volume)

• Human visual system is only capable of fusing the two images within a narrow range of disparities around the fixation point

• This area (volume) is Panum’s fusional area

• Outside this area we get double-vision (diplopia)

http://www.allaboutvision.com/conditions/double-vision.htm

Human visual pathway

Prey and predator

Cheetah: more accurate depth estimation

Antelope: larger field of view

Photos courtesy California Academy of Science

Stereo geometry for pinhole cameras with flat retinas

C, C’, x, x’ and X are coplanar

[Figure labels: left camera, right camera, world point, center of projection, epipolar plane, epipolar line for x, epipole, baseline]

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Epipolar geometry

Epipoles e, e’
= intersection of baseline with image plane
= projection of projection center in other image
= vanishing point of camera motion direction

An epipolar plane = plane containing the baseline (1-D family)

An epipolar line = intersection of an epipolar plane with the image (always come in corresponding pairs)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Epipolar geometry

What if only C, C’, x are known?

[Figure labels: epipole, center of projection, baseline]

All points on the epipolar plane π project onto l and l’

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Epipolar geometry

Family of planes and lines l and l’, intersecting in e and e’

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Example: Converging cameras

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Example: Forward motion

[Figure labels: epipoles e and e’]

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Epipolar geometry and fundamental matrix

epipolar line

(epipole: intersection of all epipolar lines)

(computed with 20 points)

Epipolar geometry and fundamental matrix

epipolar line

(epipole: intersection of all epipolar lines)

(computed with 50 points)

Fundamental matrix

[Equation labels: point in image 1, fundamental matrix, point in image 2]
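As a sketch of the relation these labels annotate: assuming the convention used elsewhere in this lecture (xᵀF is the epipolar line in image 2 and Fx’ is the epipolar line in image 1), with (x, y) a point in image 1 and (x’, y’) the corresponding point in image 2, both written in homogeneous coordinates, the epipolar constraint can be written as follows. The exact layout on the original slide is an assumption.

```latex
\begin{bmatrix} x & y & 1 \end{bmatrix}
F
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = 0,
\qquad F \in \mathbb{R}^{3 \times 3}
```

Under this convention, the row vector to the left of the primed point gives the coefficients of a line in image 2, and the column vector to the right of the unprimed point gives the coefficients of a line in image 1.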

Epipolar lines (1)

epipolar line in image 2 associated with (x,y) in image 1

Epipolar lines (2)

epipolar line in image 1 associated with (x’,y’) in image 2

Computing the fundamental matrix

[Equation labels: known, unknown]

Computing the fundamental matrix

1. Construct A (n x 9) from correspondences

2. Compute SVD of A: A = U Σ V^T

3. Unit-norm solution of Af = 0 is given by v_n (the right-most singular vector of A)

4. Reshape f into F1

5. Compute SVD of F1: F1 = U_F Σ_F V_F^T

6. Set Σ_F(3,3) = 0 to get Σ_F’ (this enforces rank(F) = 2)

7. Reconstruct F = U_F Σ_F’ V_F^T

8. Now x^T F is the epipolar line in image 2, and F x’ is the epipolar line in image 1
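The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the course's reference code: the function name `eight_point` and the row layout of A are assumptions, chosen so that [x y 1] F [x’ y’ 1]^T = 0 with unprimed points in image 1 and primed points in image 2. Coordinate normalization, which is commonly recommended before building A, is omitted for brevity.

```python
import numpy as np

def eight_point(pts1, pts2):
    """Estimate F from n >= 8 correspondences (hypothetical helper).

    pts1, pts2: sequences of (x, y) in image 1 and (x', y') in image 2.
    Returns a rank-2 matrix F with [x y 1] F [x' y' 1]^T ~ 0.
    """
    p = np.column_stack([pts1, np.ones(len(pts1))])  # n x 3, image 1
    q = np.column_stack([pts2, np.ones(len(pts2))])  # n x 3, image 2
    # Step 1: row k of A is the outer product p_k q_k^T flattened row-major,
    # so that A @ F.reshape(9) equals p_k^T F q_k for every correspondence.
    A = np.einsum('ni,nj->nij', p, q).reshape(len(p), 9)
    # Steps 2-3: the unit-norm minimizer of |A f| is the last right singular vector.
    _, _, Vt = np.linalg.svd(A)
    f = Vt[-1]
    # Step 4: reshape into a 3x3 matrix.
    F1 = f.reshape(3, 3)
    # Steps 5-7: zero the smallest singular value to enforce rank 2.
    Uf, Sf, Vft = np.linalg.svd(F1)
    Sf[2] = 0.0
    return Uf @ np.diag(Sf) @ Vft
```

With F estimated this way, `np.append(pt1, 1) @ F` gives the line coefficients in image 2 for a point `pt1` in image 1, matching step 8.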

Example: Motion parallel to image plane

(simple for stereo rectification)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Example: Motion parallel to image scanlines

Epipoles are at infinity

Scanlines are the epipolar lines

In this case, the images are said to be “rectified”

Tsukuba stereo images courtesy of Y. Ohta and Y. Nakamura at the University of Tsukuba

Standard (rectified) stereo geometry

pure translation along X-axis

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Perspective projection

[Figure labels: X, x]

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Rectified geometry

[Figure labels: xL, xR, XL, -XR, b]

Standard stereo geometry

• disparity is inversely proportional to depth
• stereo vision is less useful for distant objects

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
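A small numeric sketch of why inverse proportionality makes stereo less useful at a distance. It assumes the standard rectified relation Z = f·b/d, where the focal length `f` (in pixels), baseline `b` and disparity `d` are illustrative parameter names and the values are made up.

```python
def depth_from_disparity(d, f, b):
    """Depth Z = f * b / d for rectified cameras (assumed relation).

    d: disparity in pixels (must be positive)
    f: focal length in pixels
    b: baseline, in the unit the depth should come out in
    """
    if d <= 0:
        raise ValueError("disparity must be positive")
    return f * b / d

# Depth change caused by a one-pixel disparity change, near versus far.
f, b = 500, 0.1  # hypothetical focal length (pixels) and baseline (meters)
near_step = depth_from_disparity(49, f, b) - depth_from_disparity(50, f, b)
far_step = depth_from_disparity(4, f, b) - depth_from_disparity(5, f, b)
print(near_step, far_step)
```

The same one-pixel matching error costs far more depth accuracy for the distant point than for the near one.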

Binocular rectified stereo

[Figure panels: left, right, disparity map, depth discontinuities; label: epipolar constraint]

Matching scanlines

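[Figure: intensity of the left (L) and right (R) scanlines plotted against pixel, and the resulting disparity plotted against pixel, with the lamp and wall labeled; left and right images shown]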

Stereo matching

• Search is limited to epipolar line (1D)
• Look for most similar pixel

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Aggregation

• Use more than one pixel
• Assume neighbors have similar disparities*
  – Use correlation window containing pixel
  – Allows use of SSD, ZNCC, Census, etc.

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
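A minimal sketch of window-based matching along a scanline, using SSD as the window cost. The function names, the list-of-rows image representation, and the sign convention (left pixel x matches right pixel x - d, with d >= 0) are assumptions made for illustration.

```python
def ssd(left, right, x, y, d, w):
    """SSD between the (2w+1)x(2w+1) window at (x, y) in the left image
    and the window at (x - d, y) in the right image."""
    total = 0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            diff = left[y + dy][x + dx] - right[y + dy][x + dx - d]
            total += diff * diff
    return total

def block_match(left, right, w, max_disp):
    """Winner-take-all disparity for every pixel whose window fits in the image.
    Border pixels are left at disparity 0."""
    height, width = len(left), len(left[0])
    disp = [[0] * width for _ in range(height)]
    for y in range(w, height - w):
        for x in range(w, width - w):
            best_d, best_cost = 0, None
            # Limit d so the right-image window stays inside the image.
            for d in range(0, min(max_disp, x - w) + 1):
                cost = ssd(left, right, x, y, d, w)
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            disp[y][x] = best_d
    return disp
```

This brute-force version recomputes every window sum from scratch; it is meant to show the search, not to be fast.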

Block matching

Dissimilarity measures

Connection between SSD and cross correlation:

Also normalized correlation, rank, census, sampling-insensitive ...

Most common:

More efficient implementation

Key idea: Summation over window is correlation with box filter, which is separable

Running sum improves efficiency even more

Note: w is half-width
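The key idea can be sketched as a 1D running sum applied first along rows and then along columns. The function names are hypothetical, and values outside the image are treated as zero, which is one of several possible border choices.

```python
def box_sum_1d(values, w):
    """Sum of values over the window [i - w, i + w] for every i,
    using a running sum (one add and one subtract per step)."""
    n = len(values)
    out = [0] * n
    running = sum(values[0:w + 1])  # window for i = 0 covers indices 0..w
    for i in range(n):
        out[i] = running
        # Slide the window one step to the right.
        if i + 1 + w < n:
            running += values[i + 1 + w]  # element entering the window
        if i - w >= 0:
            running -= values[i - w]      # element leaving the window
    return out

def box_sum_2d(image, w):
    """Separable box filter: rows first, then columns."""
    rows = [box_sum_1d(r, w) for r in image]
    cols = [box_sum_1d(list(c), w) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

print(box_sum_1d([1, 2, 3, 4, 5], 1))  # [3, 6, 9, 12, 9]
```

Applied to a per-pixel cost image for one candidate disparity, `box_sum_2d` yields the aggregated window cost at every pixel with a cost per pixel that does not depend on w.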

Comparing image regions

Compare intensities pixel-by-pixel

I(x,y) I´(x,y)

Dissimilarity measures

Sum of Square Differences

Note: SAD is a fast approximation (replace square with absolute value)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Dissimilarity measures

If energy does not change much, then minimizing SSD equals maximizing cross-correlation:
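One way to see this is to expand the square. The sums below are assumed to run over the pixels of the comparison window, with I and I´ the two image regions being compared.

```latex
\sum_{x,y} \bigl(I(x,y) - I'(x,y)\bigr)^2
= \sum_{x,y} I(x,y)^2 + \sum_{x,y} I'(x,y)^2
- 2 \sum_{x,y} I(x,y)\, I'(x,y)
```

The first two terms are the energies of the two windows. If they stay roughly constant across candidate matches, the SSD is smallest exactly where the last sum, the cross-correlation, is largest.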

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Zero-mean Normalized Cross Correlation

Similarity measures

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
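A minimal sketch of zero-mean normalized cross-correlation between two windows given as flat lists of intensities. The function name and the choice to return 0.0 for a constant window are assumptions.

```python
import math

def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length windows.
    Returns a value in [-1, 1]; 0.0 if either window has no variation."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [v - mean_a for v in a]  # subtract the mean (removes offset)
    db = [v - mean_b for v in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0  # dividing by energy removes gain
```

Because the mean is subtracted and the result is divided by the window energies, a window and a brightened, contrast-stretched copy of it, such as `[1, 2, 3, 4]` and `[12, 14, 16, 18]`, still score as a perfect match.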

Similarity measures

Census

125 126 125        0 0 0
127 128 130   =>   0   1
129 132 135        1 1 1

only compare bit signature

(Real-time chip from TYZX based on Census)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
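The 3x3 example on this slide can be reproduced with a short sketch. It assumes each bit is 1 when the neighbor is brighter than the center pixel, which is consistent with the numbers shown, and that two signatures are compared by Hamming distance. The function names are illustrative.

```python
def census(window):
    """Census signature of a square window: one bit per non-center pixel,
    1 if that pixel is brighter than the center, else 0 (row-major order)."""
    cr, cc = len(window) // 2, len(window[0]) // 2
    center = window[cr][cc]
    bits = []
    for r, row in enumerate(window):
        for c, value in enumerate(row):
            if (r, c) != (cr, cc):
                bits.append(1 if value > center else 0)
    return bits

def hamming(a, b):
    """Number of positions where two bit signatures differ."""
    return sum(x != y for x, y in zip(a, b))

window = [[125, 126, 125],
          [127, 128, 130],
          [129, 132, 135]]
print(census(window))  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Only the ordering of intensities relative to the center matters, so adding a constant offset to every pixel leaves the signature unchanged.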

Sampling-Insensitive Pixel Dissimilarity

[Figure: intensity functions IL and IR, with pixels xL and xR and the one-sided quantity d̄(xL,xR)]

Our dissimilarity measure: d(xL,xR) = min{ d̄(xL,xR), d̄(xR,xL) }

[Birchfield & Tomasi 1998]
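A sketch of how the measure can be computed on two scanlines, assuming linear interpolation to the half-pixel positions on either side of each sample. The function names and the clamping at the scanline ends are assumptions, not taken from the slide.

```python
def one_sided(Ia, xa, Ib, xb):
    """How far Ia[xa] lies outside the range spanned by the linearly
    interpolated Ib over the interval [xb - 1/2, xb + 1/2]."""
    center = Ib[xb]
    # Interpolated values half a pixel to either side (clamped at the ends).
    minus = 0.5 * (center + Ib[max(xb - 1, 0)])
    plus = 0.5 * (center + Ib[min(xb + 1, len(Ib) - 1)])
    lo = min(minus, plus, center)
    hi = max(minus, plus, center)
    return max(0.0, Ia[xa] - hi, lo - Ia[xa])

def bt_dissimilarity(IL, xL, IR, xR):
    """Symmetric measure: the minimum of the two one-sided comparisons."""
    return min(one_sided(IL, xL, IR, xR), one_sided(IR, xR, IL, xL))

# The same ramp sampled half a pixel apart: absolute difference is 5,
# but the sampling-insensitive measure is 0.
IL = [0, 10, 20, 30]
IR = [5, 15, 25, 35]
print(abs(IL[1] - IR[1]), bt_dissimilarity(IL, 1, IR, 1))
```

Because each pixel is compared against a half-pixel neighborhood of the other signal rather than a single sample, the cost does not penalize a match merely because the two cameras sampled the scene at slightly different positions.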

Dissimilarity Measure Theorems

Given: An interval A such that [xL – ½ , xL + ½] ⊆ A, and [xR – ½ , xR + ½] ⊆ A

Theorem 1: If | xL – xR | ≤ ½, then d(xL,xR) = 0 (when A is convex or concave)

Theorem 2: | xL – xR | ≤ ½ iff d(xL,xR) = 0 (when A is linear)

Aggregation window sizes

Small windows
• disparities similar
• more ambiguities
• accurate when correct

Large windows
• larger disp. variation
• more discriminant
• often more robust
• use shiftable windows to deal with discontinuities

(Illustration from Pascal Fua)

Occlusions

(Slide from Pascal Fua)

Left-right consistency check

• Search left-to-right, then right-to-left
• Retain disparity only if they agree

[Figure axes: xL, d]

Do minima coincide?
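The check can be sketched for one scanline as follows. It assumes integer disparities and the convention that left pixel x with disparity d matches right pixel x - d; the function name, the `tol` parameter, and the use of `None` for rejected pixels are illustrative choices.

```python
def left_right_check(disp_left, disp_right, tol=0):
    """Keep a left-image disparity only if the right-image disparity at the
    matched pixel agrees with it (within tol); otherwise mark it None."""
    checked = []
    for x, d in enumerate(disp_left):
        xr = x - d  # where this left pixel lands in the right scanline
        if 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tol:
            checked.append(d)
        else:
            checked.append(None)  # inconsistent: likely occluded or mismatched
    return checked

print(left_right_check([0, 1, 1, 2, 2], [1, 1, 2, 0, 0]))
# [None, 1, 1, None, 2]
```

Pixels that fail the check are typically those visible in only one image, so the rejected set doubles as a rough occlusion map.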

Results: correlation

[Figure panels: left image, disparity map, disparity map with left-right consistency check]

Constraints

• Epipolar – match must lie on epipolar line
• Piecewise constancy – neighboring pixels should usually have same disparity
• Piecewise continuity – neighboring pixels should usually have similar disparity
• Disparity – impose allowable range of disparities (Panum’s fusional area)
• Disparity gradient – restricts slope of disparity
• Figural continuity – disparity of edges across scanlines
• Uniqueness – each pixel has no more than one match (violated by windows and mirrors)
• Ordering – disparity function is monotonic (precludes thin poles)

Exploiting scene constraints

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Ordering constraint

[Figure: a surface slice and the same surface as a path, with points 1, 2, 3, 4,5, 6 along one scanline matched to points 1, 2,3, 4, 5, 6 along the other; occlusion right and occlusion left are marked]

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Uniqueness constraint

• In an image pair each pixel has at most one corresponding pixel
  – In general one corresponding pixel
  – In case of occlusion there is none

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Disparity constraint

[Figure: a surface slice and the same surface as a path, showing the disparity band, the bounding box, and constant disparity surfaces]

use reconstructed features to determine bounding box

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Figural continuity constraint

[Figure panels: right, left]

[University of Tsukuba]

Cooperative algorithm

Disparity space image

Dynamic Programming: 1D Search

[Figure: stereo matching as a path through the LEFT versus RIGHT scanline grid, giving the disparity map, with occlusion and depth discontinuity labeled]

string editing: c a t versus c a r t

penalties: mismatch = 1, insertion = 1, deletion = 1

       c  a  r  t
t   3  2  1  1  1
a   2  1  0  1  2
c   1  0  1  2  3
    0  1  2  3  4
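The string editing example (c a t versus c a r t, with unit penalties for mismatch, insertion and deletion) can be reproduced with a short dynamic programming sketch. The function name is hypothetical; rows are indexed from the empty prefix upward, so the slide's table reads bottom to top.

```python
def edit_distance_table(a, b):
    """T[i][j] = minimum cost of editing a[:i] into b[:j]
    with mismatch = insertion = deletion = 1."""
    T = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        T[i][0] = i  # delete all i characters
    for j in range(len(b) + 1):
        T[0][j] = j  # insert all j characters
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            mismatch = 0 if a[i - 1] == b[j - 1] else 1
            T[i][j] = min(T[i - 1][j - 1] + mismatch,  # match or mismatch
                          T[i - 1][j] + 1,             # deletion
                          T[i][j - 1] + 1)             # insertion
    return T

for row in reversed(edit_distance_table("cat", "cart")):
    print(row)
```

In the stereo analogy, the two strings are the left and right scanlines, a mismatch cost becomes a pixel dissimilarity, and insertions and deletions play the role of occlusions.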

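Minimizing a 2D Cost Functional

Minimize: E = E_data + E_smoothness, with the data term d(p, ·) summed over pixels p and the smoothness term u(l_{p,q}) summed over neighboring pairs {p,q} in N

Discontinuity penalty: u(l_{p,q})

minimum cut = disparity surface

[Equation partly lost: the minimum cut solves the case where u(l_{p,q}) is expressed in terms of l_{p,q}]

[Figure: disparity versus pixel for the 1D and 2D formulations, with labels GLOBAL, LOCAL, Global, Local, (GOOD), (BAD)]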

Multiway-Cut: 2D Search

[Figure: two graphs with axes pixels and labels]

[Boykov, Veksler, Zabih 1998]

Multiway-Cut Algorithm

Minimizes the sum of two costs:
– g(x, f(x)), summed over pixels x (cost of assigning label to pixel)
– a term in (x, x') and [f(x), f(x')], summed over pairs (x, x') (cost of label discontinuity)

minimum cut

[Figure: graph over pixels and labels, with a source label and a sink label]

Energy minimization

(Slide from Pascal Fua)

Graph Cut

(Slide from Pascal Fua)

(general formulation requires multi-way cut!)

Simplified graph cut

(Roy and Cox ICCV‘98)

(Boykov et al ICCV‘99)

Correspondence as Segmentation

• Problem: disparities (fronto-parallel) O( ), surfaces (slanted) O( 2 n) => computationally intractable!

• Solution: iteratively determine which labels to use

[Diagram: loop between "label pixels" (multiway-cut, Expectation) and "find affine parameters of regions" (Newton-Raphson, Maximization)]

Stereo Results (Dynamic Programming)

Stereo Results (Multiway-Cut)

Stereo Results on Middlebury Database

[Figure rows: image; Birchfield-Tomasi 1999; Hong-Chen 2004]

Untextured regions remain a challenge

[Figure panels: Multiway-cut, Dynamic programming]

Results: dynamic programming

disparity map

[Bobick & Intille]

left

Results: multiway cut

[Figure panels: left image, disparity map]

[Kolmogorov & Zabih]

Results: multiway cut (untextured)

disparity map

Multi-camera configurations

Okutomi and Kanade

(illustration from Pascal Fua)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Example: Tsukuba

Tsukuba dataset

Real-time stereo on GPU

• Computes Sum-of-Square-Differences (use pixel shader)
• Hardware mip-map generation for aggregation over window
• Trade-off between small and large support window

(Yang and Pollefeys, CVPR2003)

290M disparity hypotheses/sec (Radeon 9800 Pro), e.g. 512x512x36 disparities at 30Hz

GPU is great for vision too!

Stereo matching

Optimal path (dynamic programming)

Similarity measure (SSD or NCC)

Constraints
• epipolar
• ordering
• uniqueness
• disparity limit

Trade-off
• Matching cost (data)
• Discontinuities (prior)

Consider all paths that satisfy the constraints; pick the best using dynamic programming

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
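A minimal sketch of scanline matching by dynamic programming, under stated assumptions: the matching cost is the absolute intensity difference (standing in for SSD or NCC on windows), every unmatched pixel pays a fixed penalty `occ`, and disparity is defined as xL - xR. Monotonic paths through the cost table enforce ordering and uniqueness by construction. The function name and parameters are illustrative.

```python
def dp_scanline(left, right, occ):
    """Match two scanlines; returns one disparity per left pixel,
    or None where the pixel is left unmatched (occluded)."""
    n, m = len(left), len(right)
    # C[i][j]: best cost of explaining left[:i] and right[:j].
    C = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        C[i][0] = i * occ
    for j in range(1, m + 1):
        C[0][j] = j * occ
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = C[i - 1][j - 1] + abs(left[i - 1] - right[j - 1])
            C[i][j] = min(match, C[i - 1][j] + occ, C[i][j - 1] + occ)
    # Backtrack from the end to recover the optimal path.
    disp = [None] * n
    i, j = n, m
    while i > 0 and j > 0:
        if C[i][j] == C[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]):
            disp[i - 1] = (i - 1) - (j - 1)  # matched pair
            i, j = i - 1, j - 1
        elif C[i][j] == C[i - 1][j] + occ:
            i -= 1  # left pixel unmatched
        else:
            j -= 1  # right pixel unmatched
    return disp
```

The penalty `occ` is the trade-off knob: a small value lets the path jump freely at discontinuities, while a large value favors smooth, diagonal paths even where the data disagree.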

Hierarchical stereo matching

[Figure: image pyramid with arrows labeled "Downsampling (Gaussian pyramid)" and "Disparity propagation"]

Allows faster computation

Deals with large disparity ranges

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Disparity map

[Figure panels: image I(x,y), disparity map D(x,y), image I´(x´,y´)]

(x´,y´)=(x+D(x,y),y)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
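The mapping (x´,y´) = (x + D(x,y), y) can be used to rebuild one image by looking up pixels in the other. The sketch below assumes integer disparities and images stored as lists of rows; the function name and the `fill` value for pixels that map outside the image are illustrative.

```python
def reconstruct_from_neighbor(other, disparity, fill=None):
    """Rebuild image I from image I' using I(x, y) = I'(x + D(x, y), y)."""
    height, width = len(disparity), len(disparity[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            xs = x + disparity[y][x]  # column in the other image
            if 0 <= xs < width:
                out[y][x] = other[y][xs]
    return out

print(reconstruct_from_neighbor([[10, 20, 30, 40]], [[1, 1, 1, 1]]))
# [[20, 30, 40, None]]
```

Pixels left at `fill` correspond to scene points with no counterpart inside the other image.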

Example: reconstruct image from neighboring images

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Stereo matching with general camera configuration

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Image pair rectification

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Planar rectification

Bring two views to standard stereo setup

(moves epipole to ∞) (not possible when in/close to image)

~ image size (calibrated)

Distortion minimization (uncalibrated)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

[Figure panels: original image pair, planar rectification, polar rectification]

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Stereo camera configurations

(Slide from Pascal Fua)

More cameras

Multi-baseline stereo

[Okutomi & Kanade]