Vision-based SLAM



Vision-based SLAM

Simon Lacroix, Robotics and AI group, LAAS/CNRS, Toulouse

With contributions from: Anthony Mallet, Il-Kyun Jung, Thomas Lemaire and Joan Sola

Benefits of vision for SLAM ?

• Cameras: low cost, lightweight and power-saving

• Perceive data:
  – in a volume
  – very far
  – very precisely

  e.g. 1024 x 1024 pixels over a 60° x 60° FOV → 0.06° pixel resolution, i.e. about 1.0 cm at 10.0 m

• Stereovision: 2 cameras provide depth

• Images carry a vast amount of information

• A vast know-how exists in the computer vision community

• The way humans perceive depth

[Figures: stereo camera, stereo image pair, stereo images viewer]

0. A few words on stereovision

• Very popular in the early 20th century

• Anaglyphs (red/blue), polarization

Principle of stereovision

In 2 dimensions (two linear cameras): the left and right cameras are separated by a baseline b; a point observed at angle α by the left camera and angle β by the right camera projects into the two images at positions whose difference is the disparity d. The depth z of the point then satisfies

    b / z = tan(α) + tan(β)

so the disparity directly encodes the depth.

Principle of stereovision

In 3 dimensions (two usual matrix cameras):

1. Establish the geometry of the system (off line)
2. Establish matches between the two images, compute the disparity
3. From the matches' disparity, compute the 3D coordinates
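For a rectified pair, step 3 reduces to a one-line triangulation. A minimal sketch, assuming hypothetical calibration values (focal length f in pixels, baseline b in metres, principal point (cu, cv) are all toy numbers, not a real calibration):

```python
# Triangulation for a rectified stereo pair -- a minimal sketch.
# f, b, cu, cv are assumed (hypothetical) calibration values.

def triangulate(u_left, v, disparity, f=500.0, b=0.35, cu=512.0, cv=512.0):
    """Return the (x, y, z) coordinates of a pixel match, camera frame.

    For rectified cameras, depth follows directly from the disparity:
        z = b * f / d
    and the lateral coordinates from the pinhole model.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a valid match")
    z = b * f / disparity
    x = (u_left - cu) * z / f
    y = (v - cv) * z / f
    return x, y, z

# A pixel seen 10 px further right in the left image than in the right one:
print(triangulate(u_left=612.0, v=512.0, disparity=10.0))  # → (3.5, 0.0, 17.5)
```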

Geometry of stereovision

[Figure: two pinhole cameras with optical centres Ol and Or, each with axes (x, y, z); a 3D point P projects to pl in the left image and pr in the right image. A point pl alone may match several candidates pr1, pr2 in the right image, corresponding to 3D points P1, P2 along the ray through Ol and pl.]

Epipolar geometry

[Figure: all candidate matches of pl lie on a single line in the right image, the epipolar line; the epipoles are the images of each optical centre in the other camera.]

Stereo images rectification

Goal: transform the images so that epipolar lines are parallel

Benefit: reduces the computational cost of the matching process (correspondences are searched along image rows)

Dense pixel-based stereovision

Problem: « for each pixel in the left image, find its corresponding pixel in the right image »

Left line:  … 3 6 3 7 9 2 8 7 6 8 9 6 4 9 0 9 9 0 …
Right line: … 3 5 7 4 9 6 3 9 6 5 8 6 3 0 1 9 7 5 …

Single-pixel values are ambiguous: which occurrence matches?

The matches are computed on windows

Several ways to compare windows: “SAD”, “SSD”, “ZNCC”, Hamming distance on census-transformed images…
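The window comparison can be sketched with the simplest of those measures, SAD, scanning one rectified scanline; window size, disparity range and the pixel values below are toy assumptions:

```python
# Window-based matching along rectified scanlines -- a minimal SAD sketch.
# Real systems use ZNCC or census matching, sub-pixel refinement and
# left/right consistency checks; all values here are toy data.

def sad(a, b):
    """Sum of absolute differences between two equal-length windows."""
    return sum(abs(x - y) for x, y in zip(a, b))

def match_pixel(left, right, u, half=2, max_disp=8):
    """Find the disparity of left[u] by scanning windows right[u-d]."""
    ref = left[u - half : u + half + 1]
    best_d, best_score = None, float("inf")
    for d in range(0, max_disp + 1):
        if u - d - half < 0:
            break
        cand = right[u - d - half : u - d + half + 1]
        score = sad(ref, cand)
        if score < best_score:
            best_d, best_score = d, score
    return best_d

# The right line is the left line shifted by 3 pixels:
left = [3, 6, 3, 7, 9, 2, 8, 7, 6, 8, 9, 6, 4, 9]
right = left[3:] + [0, 0, 0]
print(match_pixel(left, right, u=8))  # → 3
```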

Dense pixel-based stereovision

[Figures: original image, disparity map, 3D image]

Outline

0. A few words on stereovision

0-bis. Visual odometry
   (pipeline: 1. stereovision → 2. pixel selection → 3. pixel tracking → stereovision → 4. motion estimation)

Visual odometry principle

[Video]

Visual odometry

• Fairly good precision (up to 1% error over 100 m trajectories)

• But:
  – depends on odometry (to track pixels)
  – no error model available

Visual odometry

• Applied on the Mars Exploration Rovers (terrain causing up to 50% slip)

Outline

0. A few words on stereovision

0-bis. Visual odometry

1. Stereovision SLAM

What kind of landmarks ?

Interest points = sharp peaks of the autocorrelation function

Harris detector (precise version [Schmidt 98])

Auto-correlation matrix (s: scale of the detection):

    M = G(s) ⊗ [ Ix²    Ix·Iy
                 Ix·Iy  Iy²  ]

The principal curvatures are defined by the two eigenvalues λ1, λ2 of the matrix.
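The detector can be sketched as follows: build the auto-correlation (structure tensor) matrix from image gradients and score each pixel with the common det − k·trace² response. A toy pure-Python sketch; the plain box window and k = 0.04 are assumptions, where the slides smooth with a Gaussian G(s):

```python
# Harris-style interest point response on a tiny synthetic image -- a
# minimal sketch; a real detector smooths with a Gaussian G(s) rather
# than the plain box window used here.

def gradients(img):
    """Central-difference gradients Ix, Iy (borders left at zero)."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            ix[r][c] = (img[r][c + 1] - img[r][c - 1]) / 2.0
            iy[r][c] = (img[r + 1][c] - img[r - 1][c]) / 2.0
    return ix, iy

def harris_response(img, r, c, half=1, k=0.04):
    """det(M) - k * trace(M)^2 from the local auto-correlation matrix M."""
    ix, iy = gradients(img)
    a = b = d = 0.0                      # M = [[a, b], [b, d]]
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            gx, gy = ix[r + dr][c + dc], iy[r + dr][c + dc]
            a += gx * gx
            b += gx * gy
            d += gy * gy
    return a * d - b * b - k * (a + d) ** 2

# A bright square on a dark background: its corner scores higher than a
# point in the middle of one of its edges.
img = [[10.0 if (2 <= r <= 5 and 2 <= c <= 5) else 0.0 for c in range(8)]
       for r in range(8)]
corner = harris_response(img, 2, 2)
edge = harris_response(img, 4, 2)
```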

Landmarks: interest points

• Landmark matching: which point in one image corresponds to which in the other?

Interest points stability

Interest point repeatability:

    Repeatability = repeated points / detected points

    e.g. 70% repeatability = 7 repeated points out of 10 detected points

Interest point similarity: resemblance measure of the two principal curvatures of repeated points:

    S1(x, x') = min(λ1, λ1') / max(λ1, λ1')
    S2(x, x') = min(λ2, λ2') / max(λ2, λ2')

Maximum point similarity: 1

Interest points stability

Repeatability and point similarity evaluated with known artificial rotation and scale changes

Interest points matching

Principle: combine signal and geometric information to match groups of points [Jung ICCV 01]

[Figures: consecutive images; large viewpoint change; small overlap]

Landmark matching results

[Figures: 1.5 scale change; 3.0 scale change]

Landmark matching results

Landmark matching results (cont'd)

[Figures: detected points, matched points; another example]

Stereovision SLAM

Generic SLAM steps → implementation:
– Landmark detection → vision: interest points
– Relative observations (measures):
  • of the landmark positions → stereovision
  • of the robot motions → visual motion estimation
– Observation associations → interest points matching
– Refinement of the landmark and robot positions → extended Kalman filter

Dense stereovision is actually not required: interest point matching is applied to the stereo frames (even easier!)


Visual motion estimation

1. Stereovision

2. Interest point detection

3. Interest points matching

4. Stereovision

5. Motion estimation

Stereovision SLAM

Generic SLAM steps → implementation:
– Landmark detection → vision: interest points OK
– Relative observations (measures):
  • of the landmark positions → stereovision OK
  • of the robot motions → visual motion estimation OK
– Observation associations → interest points matching OK
– Refinement of the landmark and robot positions → extended Kalman filter

Setting up the Kalman filter

• System state:
  x(k) = [x_p, m_1, …, m_N], with x_p = [φ, θ, ψ, t_x, t_y, t_z] and m_i = [x_i, y_i, z_i]

• System equation:
  x(k+1) = f(x(k), u(k+1)) + v(k+1),  v with covariance P_v(k)
  with u(k+1) = (Δφ, Δθ, Δψ, Δt_x, Δt_y, Δt_z)

• Observation equation:
  z(k) = h(x(k)) + w(k),  w with covariance P_w(k)
  innovation: ν_i(k+1) = z_i(k+1) − ẑ_i(k+1|k)

• Prediction: motion estimates
• Landmark "discovery": stereovision
• Observation: matching + stereovision

State covariance:

  P(k) = [ P_pp(k)   P_pm(k)
           P_pm(k)ᵀ  P_mm(k) ]
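The filter cycle above can be sketched on a toy one-dimensional problem: state [x_p, m] with a single landmark and the relative observation z = m − x_p. With linear models the EKF update reduces to the plain Kalman form; all noise values below are illustrative assumptions:

```python
# One-dimensional SLAM with a single landmark -- a minimal sketch of the
# predict / observe / update cycle.  State x = [x_p, m], covariance P 2x2.

def predict(x, P, u, q):
    """Prediction: only the robot moves, so only P[0][0] grows by q."""
    x = [x[0] + u, x[1]]
    P = [[P[0][0] + q, P[0][1]], [P[1][0], P[1][1]]]
    return x, P

def update(x, P, z, r):
    """Observation z = m - x_p: innovation, gain, state/covariance update."""
    h = [-1.0, 1.0]                       # Jacobian of the observation
    nu = z - (x[1] - x[0])                # innovation
    s = r + (h[0] * (h[0] * P[0][0] + h[1] * P[1][0])
             + h[1] * (h[0] * P[0][1] + h[1] * P[1][1]))
    k = [(P[0][0] * h[0] + P[0][1] * h[1]) / s,
         (P[1][0] * h[0] + P[1][1] * h[1]) / s]
    x = [x[0] + k[0] * nu, x[1] + k[1] * nu]
    P = [[P[0][0] - k[0] * (h[0] * P[0][0] + h[1] * P[1][0]),
          P[0][1] - k[0] * (h[0] * P[0][1] + h[1] * P[1][1])],
         [P[1][0] - k[1] * (h[0] * P[0][0] + h[1] * P[1][0]),
          P[1][1] - k[1] * (h[0] * P[0][1] + h[1] * P[1][1])]]
    return x, P

# Robot at 0; landmark "discovered" at 5 m with large uncertainty:
x, P = [0.0, 5.0], [[0.01, 0.0], [0.0, 1.0]]
x, P = predict(x, P, u=1.0, q=0.1)    # move forward 1 m
x, P = update(x, P, z=4.1, r=0.05)    # re-observe the landmark
```

After the update, both the robot and the landmark estimates move toward the observation, and both variances shrink: this coupling through the cross-covariance P_pm is what makes it SLAM rather than independent filters.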

Need to estimate the errors

Error estimates (1)

• Errors on the disparity estimates: empirical study gives σ_d = f(c)

• Errors on the 3D coordinates, maximal errors:
  0.4 m baseline: σ_x ≤ 10⁻³ x²
  1.2 m baseline: σ_x ≤ 3·10⁻⁴ x²

• Online estimation of the errors, stereovision error:

    x = α / d  ⇒  σ_x = (x² / α) σ_d
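The online error model can be sketched directly: propagating the disparity standard deviation σ_d to depth through x = α/d yields the quadratic growth quoted for the two baselines. α (baseline times focal length) and σ_d below are toy values, not the calibrated ones:

```python
# First-order propagation of the disparity error to the depth estimate --
# a sketch of x = alpha / d  =>  sigma_x = (x**2 / alpha) * sigma_d.
# alpha and sigma_d are assumed toy values.

def depth_sigma(d, sigma_d, alpha=200.0):
    """Depth and its 1-sigma error from a disparity d (pixels)."""
    x = alpha / d                        # depth
    sigma_x = (x * x / alpha) * sigma_d  # first-order propagation
    return x, sigma_x

# The error grows quadratically with depth: doubling the depth
# quadruples the standard deviation.
for d in (40.0, 20.0, 10.0):
    x, s = depth_sigma(d, sigma_d=0.5)
    print(f"depth {x:5.1f} m  ->  sigma {s:6.4f} m")
```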

Error estimates (2)

• Interest point matching error (not mis-matching):
  correlation surface built with a rotation- and scale-adaptive correlation, fitted with a Gaussian distribution
  [Figures: correlation surface and fitted Gaussian distribution (1 pixel)]

• Combination of matching and stereo errors: driven by the 8 neighbouring 3D points X_k (weights w_k) around X_0, projecting the one-sigma covariance ellipse onto the 3D surface:

    σ_X² = Σ_{k=1..8} w_k² [ (X_0 − X_k)² + σ_0² + σ_k² ]

  σ_0², σ_k²: variances of the stereovision error

Error estimates (3)

Visual motion estimation error

• Propagating the uncertainty of the set of matched 3D points to the optimal motion estimate:
  – matched 3D point set: Q̂ = Q + ΔQ = [X_1, …, X_N, X′_1, …, X′_N]
  – optimal motion estimate: û = u + Δu = (Θ̂, Φ̂, Ψ̂, t̂_x, t̂_y, t̂_z)
  – cost function:

    J(û, Q̂) = Σ_{n=1..N} ‖ X′_n − R(Θ̂, Φ̂, Ψ̂) X_n − t̂ ‖²

• Covariance of the random perturbation Δu: propagated using a Taylor series expansion of the Jacobian of the cost function around (û, Q̂)
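The least-squares motion estimate minimising J can be sketched in 2D for brevity: the slide's 3D case minimises the same cost over a rotation matrix and translation, and the closed form below (centroids plus atan2) is the 2D analogue, run on toy data:

```python
# Least-squares rigid motion from matched point sets -- a 2D sketch of
# minimising sum ||p'_n - R p_n - t||^2.  Toy data, closed-form solution.

import math

def estimate_motion(pts, pts_prime):
    """Recover the rotation angle and translation between two point sets."""
    n = len(pts)
    cx = sum(p[0] for p in pts) / n              # centroids
    cy = sum(p[1] for p in pts) / n
    cx_p = sum(p[0] for p in pts_prime) / n
    cy_p = sum(p[1] for p in pts_prime) / n
    # Rotation from cross- and dot-products of the centred points:
    s_sin = s_cos = 0.0
    for (x, y), (xp, yp) in zip(pts, pts_prime):
        ax, ay = x - cx, y - cy
        bx, by = xp - cx_p, yp - cy_p
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)
    # Translation maps the rotated centroid onto the new one:
    tx = cx_p - (math.cos(theta) * cx - math.sin(theta) * cy)
    ty = cy_p - (math.sin(theta) * cx + math.cos(theta) * cy)
    return theta, tx, ty

# Points rotated by 30 degrees and shifted by (1, 2):
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
th = math.radians(30.0)
dst = [(math.cos(th) * x - math.sin(th) * y + 1.0,
        math.sin(th) * x + math.cos(th) * y + 2.0) for x, y in src]
theta, tx, ty = estimate_motion(src, dst)
```

On noiseless data the true motion is recovered exactly; with noisy points the residual of J and its Jacobian feed the covariance propagation described above.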

[Video]

Results

70 m loop, altitude from 25 to 30 m, 90 stereo pairs processed

[Figures: landmark error ellipses (x40); trajectory and landmarks; position and attitude variances]

Results

70 m loop, altitude from 25 to 30 m, 90 stereo pairs processed

Frame 1/90

|    | Reference | Ref. std. dev. | VME result | VME abs. error | SLAM result | SLAM std. dev. | SLAM abs. error |
| Θ  | 6.19°     | 0.18°          | 11.93°     | 5.74°          | 6.01°       | 0.16°          | 0.18°           |
| Φ  | 2.31°     | 0.66°          | 4.00°      | 1.69°          | 1.42°       | 0.55°          | 0.89°           |
| Ψ  | -105.94°  | 0.06°          | -105.52°   | 0.41°          | -106.03°    | 0.08°          | 0.09°           |
| tx | 3.17 m    | 0.26 m         | 5.31 m     | 2.14 m         | 3.13 m      | 0.09 m         | 0.04 m          |
| ty | 0.61 m    | 0.07 m         | 2.01 m     | 1.40 m         | 0.26 m      | 0.19 m         | 0.35 m          |
| tz | -1.52 m   | 0.04 m         | -3.25 m    | 1.73 m         | -1.51 m     | 0.03 m         | 0.01 m          |

Results (cont'd)

270 m loop, altitude from 25 to 30 m, 400 stereo pairs processed, 350 landmarks mapped

[Figures: landmark error ellipses (x30); trajectory and landmarks; position and attitude variances]

Results (cont'd)

270 m loop, altitude from 25 to 30 m, 400 stereo pairs processed, 350 landmarks mapped

Frame 1/400

|    | Reference | Ref. std. dev. | VME result | VME abs. error | SLAM result | SLAM std. dev. | SLAM abs. error |
| Θ  | -0.12°    | 0.87°          | -0.13°     | 0.01°          | -3.68°      | 0.38°          | 3.56°           |
| Φ  | 2.87°     | 1.14°          | -4.99°     | 7.86°          | 5.54°       | 0.40°          | 1.64°           |
| Ψ  | 105.44°   | 0.23°          | 101.82°    | 3.62°          | 104.32°     | 0.19°          | 1.12°           |
| tx | -4.73 m   | 0.57 m         | 5.45 m     | 10.38 m        | -3.98 m     | 0.21 m         | 0.95 m          |
| ty | 0.14 m    | 0.46 m         | 3.04 m     | 2.90 m         | -2.16 m     | 0.22 m         | 2.12 m          |
| tz | 3.89 m    | 0.15 m         | 19.81 m    | 15.94 m        | 3.46 m      | 0.11 m         | 0.43 m          |

Application to ground rovers

[Figure: landmark uncertainty ellipses (x5)]

• 110 stereo pairs processed, 60 m loop

Application to ground rovers

Frame 1/100

|    | Reference | Ref. std. dev. | VME result | VME abs. error | SLAM result | SLAM std. dev. | SLAM abs. error |
| Θ  | 0.52°     | 0.31°          | 2.75°      | 2.23°          | 0.88°       | 0.98°          | 0.36°           |
| Φ  | 0.36°     | 0.25°          | -0.11°     | 0.47°          | 0.72°       | 0.74°          | 0.36°           |
| Ψ  | -0.14°    | 0.16°          | 1.89°      | 2.03°          | 1.24°       | 1.84°          | 1.38°           |
| tx | -0.012 m  | 0.010 m        | 0.057 m    | 0.069 m        | -0.077 m    | 0.069 m        | 0.065 m         |
| ty | -0.243 m  | 0.019 m        | -1.018 m   | 0.775 m        | -0.284 m    | 0.064 m        | 0.041 m         |
| tz | 0.019 m   | 0.015 m        | 0.144 m    | 0.125 m        | 0.018 m     | 0.019 m        | 0.001 m         |

• 110 stereo pairs processed, 60 m loop

Application to indoor robots

About 30 m long trajectory, 1300 stereo image pairs

Application to indoor robots

About 30 m long trajectory, 1300 stereo image pairs

[Figure: covariance ellipses (x10)]

Application to indoor robots

About 30 m long trajectory, 1300 stereo image pairs

[Figures: covariance ellipses (x10) at the beginning, middle and end of the loop]

Application to indoor robots

About 30 m long trajectory, 1300 stereo image pairs

[Figure: estimated Phi, Theta and elevation of the camera along the trajectory: the two rotation angles (Phi, Theta) and the elevation must be zero]

Outline

0. A few words on stereovision

0-bis. Visual odometry

1. Stereovision SLAM

2. Monocular (bearing-only) SLAM

Bearing-only SLAM

Reminder: generic SLAM steps and their stereovision SLAM implementation:
– Landmark detection → vision: interest points
– Relative observations (measures):
  • of the landmark positions → stereovision
  • of the robot motions → visual motion estimation
– Observation associations → interest points matching
– Refinement of the landmark and robot positions → extended Kalman filter

Bearing-only SLAM

Generic SLAM steps and their monocular SLAM implementation:
– Landmark detection → vision: interest points
– Relative observations (measures):
  • of the landmark positions → « multi-view stereovision »
  • of the robot motions → INS, motion model, GPS…
– Observation associations → interest points matching
– Refinement of the landmark and robot positions → particle filter + extended Kalman filter (« observation filter » ≈ Gaussian particles)

1. Landmark initialisation

2. Landmark observations
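The initialisation step can be sketched as follows: the first bearing fixes a viewing ray but no depth, so depth hypotheses are spread along the ray (here as a geometric series, in the spirit of Gaussian-sum or particle initialisation schemes) and re-weighted as further bearings arrive. Geometry and noise values below are toy assumptions:

```python
# Bearing-only landmark initialisation -- a minimal sketch.  The first
# observation is taken along the x axis from the origin; hypotheses are
# re-weighted by the likelihood of a later bearing from a new robot pose.

import math

def init_hypotheses(d_min=1.0, ratio=1.4, count=8):
    """Depth hypotheses along the viewing ray, uniform initial weights."""
    return [(d_min * ratio ** i, 1.0 / count) for i in range(count)]

def reweight(hyps, robot_xy, bearing, sigma=0.02):
    """Re-weight each (depth, weight) hypothesis given a new bearing."""
    out = []
    for depth, w in hyps:
        lx, ly = depth, 0.0                       # hypothesis on the ray
        predicted = math.atan2(ly - robot_xy[1], lx - robot_xy[0])
        err = bearing - predicted
        out.append((depth, w * math.exp(-0.5 * (err / sigma) ** 2)))
    total = sum(w for _, w in out)
    return [(d, w / total) for d, w in out]

# True landmark at (3.84, 0); after moving to (0, -1) the robot measures
# the bearing it would see for that point:
true_bearing = math.atan2(1.0, 3.84)
hyps = reweight(init_hypotheses(), robot_xy=(0.0, -1.0), bearing=true_bearing)
best = max(hyps, key=lambda h: h[1])[0]   # hypothesis nearest the true depth
```

Once the weights concentrate on a single hypothesis, the landmark can be switched to a plain Gaussian in the EKF map.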

Bearing-only SLAM

Overview of the whole algorithm

Bearing-only SLAM

Comparison stereo / bearing-only

Mapped landmarks (bearing-only case)

Looking forward / looking sideways

[Figures: stereovision; bearing-only]

Using panoramic vision

Data association is still an issue

« View-based » qualitative navigation can help to focus the search

View-based navigation

Indexing with global attributes:
• local characteristics histograms based on Gaussian derivatives
• color histograms
• texture histograms
→ Local Characteristics Histograms Family (LCHF)

View-based navigation

Empirical relation between image distance and Cartesian distance
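Indexing with global attributes can be sketched with a single histogram and an intersection score. The slides use families of local-characteristic, colour and texture histograms; a grey-level histogram with toy pixel values stands in for them here:

```python
# View indexing by histogram comparison -- a minimal sketch using
# histogram intersection as the resemblance measure.  All pixel values
# are toy data.

def histogram(values, bins=4, lo=0, hi=256):
    """Normalised histogram of pixel values."""
    h = [0.0] * bins
    width = (hi - lo) / bins
    for v in values:
        h[min(int((v - lo) / width), bins - 1)] += 1.0
    return [x / len(values) for x in h]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

view_a = histogram([10, 20, 200, 210, 220, 90, 95, 100])
view_b = histogram([12, 25, 205, 215, 225, 92, 96, 101])      # similar view
view_c = histogram([240, 250, 245, 241, 250, 252, 249, 243])  # different view

sim_ab = intersection(view_a, view_b)   # high: same place, small changes
sim_ac = intersection(view_a, view_c)   # low: different place
```

Ranking stored views by such scores gives the qualitative "am I back near a known place?" signal that focuses the data association search.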

Closing the loop

1. Image processing at each image acquisition

Closing the loop

2. SLAM processes at each image acquisition

Closing the loop

[Video]

Outline

0. A few words on stereovision

0-bis. Visual odometry

1. Stereovision SLAM

2. Monocular (bearing-only) SLAM

3. Bearing-only SLAM using line segments

Using line segments

Initializing line segment landmarks

Line segment representation: Plücker coordinates

[Figure: illustration in 2 dimensions]
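The Plücker representation can be sketched as follows: a 3D line through points A and B is encoded by its direction u = B − A and moment n = A × B, which always satisfy the constraint u · n = 0 (the helper names below are illustrative):

```python
# Plücker coordinates of a 3D line -- a minimal sketch: direction and
# moment of the line through two points, with the Plücker constraint.

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def plucker(a, b):
    """Direction u = b - a and moment n = a x b of the line through a, b."""
    u = tuple(bi - ai for ai, bi in zip(a, b))
    n = cross(a, b)
    return u, n

# A vertical line through (1, 0, 0):
u, n = plucker((1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
# For any valid line, the constraint dot(u, n) == 0 holds.
```

The moment n is independent of which two points on the line are chosen, which is what makes the representation convenient for filtering on lines rather than endpoints.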

Bearing-only SLAM with line segments

[Video]

Bearing-only SLAM with line segments

Summary

0. A few words on stereovision

0-bis. Visual odometry

1. Stereovision SLAM

2. Monocular (bearing-only) SLAM

3. Bearing-only SLAM using line segments