Vision-based SLAM



Vision-based SLAM

Simon Lacroix, Robotics and AI group, LAAS/CNRS, Toulouse

With contributions from: Anthony Mallet, Il-Kyun Jung, Thomas Lemaire and Joan Sola

Benefits of vision for SLAM ?

• Cameras: low cost, lightweight and power-saving

• Perceive data:
  – in a volume
  – very far
  – very precisely

  e.g. 1024 x 1024 pixels over a 60° x 60° FOV → 0.06° pixel resolution, i.e. about 1.0 cm at 10.0 m

• Stereovision: 2 cameras provide depth

• Images carry a vast amount of information

• A vast know-how exists in the computer vision community

• The way humans perceive depth

[Figures: stereo camera, stereo image pair, stereo images viewer]

0. A few words on stereovision

• Very popular in the early 20th century

• Anaglyphs (red/blue), polarization

Principle of stereovision

In 2 dimensions (two linear cameras): the left and right cameras are separated by a baseline b; a point observed at angle α by the left camera and angle β by the right camera projects into the two images at positions whose difference is the disparity d. The depth z of the point then satisfies

    b / z = tan(α) + tan(β)

so the disparity directly encodes the depth.

Principle of stereovision

In 3 dimensions (two usual matrix cameras):

1. Establish the geometry of the system (off line)
2. Establish matches between the two images, compute the disparity
3. From the matches' disparity, compute the 3D coordinates
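For a rectified pair, step 3 reduces to a one-line triangulation. A minimal sketch, assuming hypothetical calibration values (focal length f in pixels, baseline b in metres, principal point (cu, cv) are all toy numbers, not a real calibration):

```python
# Triangulation for a rectified stereo pair -- a minimal sketch.
# f, b, cu, cv are assumed (hypothetical) calibration values.

def triangulate(u_left, v, disparity, f=500.0, b=0.35, cu=512.0, cv=512.0):
    """Return the (x, y, z) coordinates of a pixel match, camera frame.

    For rectified cameras, depth follows directly from the disparity:
        z = b * f / d
    and the lateral coordinates from the pinhole model.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a valid match")
    z = b * f / disparity
    x = (u_left - cu) * z / f
    y = (v - cv) * z / f
    return x, y, z

# A pixel seen 10 px further right in the left image than in the right one:
print(triangulate(u_left=612.0, v=512.0, disparity=10.0))  # → (3.5, 0.0, 17.5)
```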

Geometry of stereovision

[Figure: two pinhole cameras with optical centres Ol and Or, each with axes (x, y, z); a 3D point P projects to pl in the left image and pr in the right image. A point pl alone may match several candidates pr1, pr2 in the right image, corresponding to 3D points P1, P2 along the ray through Ol and pl.]

Epipolar geometry

[Figure: all candidate matches of pl lie on a single line in the right image, the epipolar line; the epipoles are the images of each optical centre in the other camera.]

Stereo images rectification

Goal: transform the images so that epipolar lines are parallel

Benefit: reduces the computational cost of the matching process (correspondences are searched along image rows)

Dense pixel-based stereovision

Problem: « for each pixel in the left image, find its corresponding pixel in the right image »

Left line:  … 3 6 3 7 9 2 8 7 6 8 9 6 4 9 0 9 9 0 …
Right line: … 3 5 7 4 9 6 3 9 6 5 8 6 3 0 1 9 7 5 …

Single-pixel values are ambiguous: which occurrence matches?

The matches are computed on windows

Several ways to compare windows: “SAD”, “SSD”, “ZNCC”, Hamming distance on census-transformed images…
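The window comparison can be sketched with the simplest of those measures, SAD, scanning one rectified scanline; window size, disparity range and the pixel values below are toy assumptions:

```python
# Window-based matching along rectified scanlines -- a minimal SAD sketch.
# Real systems use ZNCC or census matching, sub-pixel refinement and
# left/right consistency checks; all values here are toy data.

def sad(a, b):
    """Sum of absolute differences between two equal-length windows."""
    return sum(abs(x - y) for x, y in zip(a, b))

def match_pixel(left, right, u, half=2, max_disp=8):
    """Find the disparity of left[u] by scanning windows right[u-d]."""
    ref = left[u - half : u + half + 1]
    best_d, best_score = None, float("inf")
    for d in range(0, max_disp + 1):
        if u - d - half < 0:
            break
        cand = right[u - d - half : u - d + half + 1]
        score = sad(ref, cand)
        if score < best_score:
            best_d, best_score = d, score
    return best_d

# The right line is the left line shifted by 3 pixels:
left = [3, 6, 3, 7, 9, 2, 8, 7, 6, 8, 9, 6, 4, 9]
right = left[3:] + [0, 0, 0]
print(match_pixel(left, right, u=8))  # → 3
```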

Dense pixel-based stereovision

[Figures: original image, disparity map, 3D image]

Outline

0. A few words on stereovision

0-bis. Visual odometry
   (pipeline: 1. stereovision → 2. pixel selection → 3. pixel tracking → stereovision → 4. motion estimation)

Visual odometry principle

[Video]

Visual odometry

• Fairly good precision (up to 1% error over 100 m trajectories)

• But:
  – depends on odometry (to track pixels)
  – no error model available

Visual odometry

• Applied on the Mars Exploration Rovers (terrain causing up to 50% slip)

Outline

0. A few words on stereovision

0-bis. Visual odometry

1. Stereovision SLAM

What kind of landmarks ?

Interest points = sharp peaks of the autocorrelation function

Harris detector (precise version [Schmidt 98])

Auto-correlation matrix (s: scale of the detection):

    M = G(s) ⊗ [ Ix²    Ix·Iy
                 Ix·Iy  Iy²  ]

The principal curvatures are defined by the two eigenvalues λ1, λ2 of the matrix.
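The detector can be sketched as follows: build the auto-correlation (structure tensor) matrix from image gradients and score each pixel with the common det − k·trace² response. A toy pure-Python sketch; the plain box window and k = 0.04 are assumptions, where the slides smooth with a Gaussian G(s):

```python
# Harris-style interest point response on a tiny synthetic image -- a
# minimal sketch; a real detector smooths with a Gaussian G(s) rather
# than the plain box window used here.

def gradients(img):
    """Central-difference gradients Ix, Iy (borders left at zero)."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            ix[r][c] = (img[r][c + 1] - img[r][c - 1]) / 2.0
            iy[r][c] = (img[r + 1][c] - img[r - 1][c]) / 2.0
    return ix, iy

def harris_response(img, r, c, half=1, k=0.04):
    """det(M) - k * trace(M)^2 from the local auto-correlation matrix M."""
    ix, iy = gradients(img)
    a = b = d = 0.0                      # M = [[a, b], [b, d]]
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            gx, gy = ix[r + dr][c + dc], iy[r + dr][c + dc]
            a += gx * gx
            b += gx * gy
            d += gy * gy
    return a * d - b * b - k * (a + d) ** 2

# A bright square on a dark background: its corner scores higher than a
# point in the middle of one of its edges.
img = [[10.0 if (2 <= r <= 5 and 2 <= c <= 5) else 0.0 for c in range(8)]
       for r in range(8)]
corner = harris_response(img, 2, 2)
edge = harris_response(img, 4, 2)
```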

Landmarks: interest points

• Landmark matching: which point in one image corresponds to which in the other?

Interest points stability

Interest point repeatability:

    Repeatability = repeated points / detected points

    e.g. 70% repeatability = 7 repeated points out of 10 detected points

Interest point similarity: resemblance measure of the two principal curvatures of repeated points:

    S1(x, x') = min(λ1, λ1') / max(λ1, λ1')
    S2(x, x') = min(λ2, λ2') / max(λ2, λ2')

Maximum point similarity: 1

Interest points stability

Repeatability and point similarity evaluated with known artificial rotation and scale changes

Interest points matching

Principle: combine signal and geometric information to match groups of points [Jung ICCV 01]

[Figures: consecutive images; large viewpoint change; small overlap]

Landmark matching results

[Figures: 1.5 scale change; 3.0 scale change]

Landmark matching results

Landmark matching results (cont'd)

[Figures: detected points, matched points; another example]

Stereovision SLAM

Generic SLAM steps → implementation:
– Landmark detection → vision: interest points
– Relative observations (measures):
  • of the landmark positions → stereovision
  • of the robot motions → visual motion estimation
– Observation associations → interest points matching
– Refinement of the landmark and robot positions → extended Kalman filter

Dense stereovision is actually not required: interest point matching is applied to the stereo frames (even easier!)


Visual motion estimation

1. Stereovision

2. Interest point detection

3. Interest points matching

4. Stereovision

5. Motion estimation

Stereovision SLAM

Generic SLAM steps → implementation:
– Landmark detection → vision: interest points OK
– Relative observations (measures):
  • of the landmark positions → stereovision OK
  • of the robot motions → visual motion estimation OK
– Observation associations → interest points matching OK
– Refinement of the landmark and robot positions → extended Kalman filter

Setting up the Kalman filter

• System state:
  x(k) = [x_p, m_1, …, m_N], with x_p = [φ, θ, ψ, t_x, t_y, t_z] and m_i = [x_i, y_i, z_i]

• System equation:
  x(k+1) = f(x(k), u(k+1)) + v(k+1),  v with covariance P_v(k)
  with u(k+1) = (Δφ, Δθ, Δψ, Δt_x, Δt_y, Δt_z)

• Observation equation:
  z(k) = h(x(k)) + w(k),  w with covariance P_w(k)
  innovation: ν_i(k+1) = z_i(k+1) − ẑ_i(k+1|k)

• Prediction: motion estimates
• Landmark "discovery": stereovision
• Observation: matching + stereovision

State covariance:

  P(k) = [ P_pp(k)   P_pm(k)
           P_pm(k)ᵀ  P_mm(k) ]
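The filter cycle above can be sketched on a toy one-dimensional problem: state [x_p, m] with a single landmark and the relative observation z = m − x_p. With linear models the EKF update reduces to the plain Kalman form; all noise values below are illustrative assumptions:

```python
# One-dimensional SLAM with a single landmark -- a minimal sketch of the
# predict / observe / update cycle.  State x = [x_p, m], covariance P 2x2.

def predict(x, P, u, q):
    """Prediction: only the robot moves, so only P[0][0] grows by q."""
    x = [x[0] + u, x[1]]
    P = [[P[0][0] + q, P[0][1]], [P[1][0], P[1][1]]]
    return x, P

def update(x, P, z, r):
    """Observation z = m - x_p: innovation, gain, state/covariance update."""
    h = [-1.0, 1.0]                       # Jacobian of the observation
    nu = z - (x[1] - x[0])                # innovation
    s = r + (h[0] * (h[0] * P[0][0] + h[1] * P[1][0])
             + h[1] * (h[0] * P[0][1] + h[1] * P[1][1]))
    k = [(P[0][0] * h[0] + P[0][1] * h[1]) / s,
         (P[1][0] * h[0] + P[1][1] * h[1]) / s]
    x = [x[0] + k[0] * nu, x[1] + k[1] * nu]
    P = [[P[0][0] - k[0] * (h[0] * P[0][0] + h[1] * P[1][0]),
          P[0][1] - k[0] * (h[0] * P[0][1] + h[1] * P[1][1])],
         [P[1][0] - k[1] * (h[0] * P[0][0] + h[1] * P[1][0]),
          P[1][1] - k[1] * (h[0] * P[0][1] + h[1] * P[1][1])]]
    return x, P

# Robot at 0; landmark "discovered" at 5 m with large uncertainty:
x, P = [0.0, 5.0], [[0.01, 0.0], [0.0, 1.0]]
x, P = predict(x, P, u=1.0, q=0.1)    # move forward 1 m
x, P = update(x, P, z=4.1, r=0.05)    # re-observe the landmark
```

After the update, both the robot and the landmark estimates move toward the observation, and both variances shrink: this coupling through the cross-covariance P_pm is what makes it SLAM rather than independent filters.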

Need to estimate the errors

Error estimates (1)

• Errors on the disparity estimates: empirical study gives σ_d = f(c)

• Errors on the 3D coordinates, maximal errors:
  0.4 m baseline: σ_x ≤ 10⁻³ x²
  1.2 m baseline: σ_x ≤ 3·10⁻⁴ x²

• Online estimation of the errors, stereovision error:

    x = α / d  ⇒  σ_x = (x² / α) σ_d
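The online error model can be sketched directly: propagating the disparity standard deviation σ_d to depth through x = α/d yields the quadratic growth quoted for the two baselines. α (baseline times focal length) and σ_d below are toy values, not the calibrated ones:

```python
# First-order propagation of the disparity error to the depth estimate --
# a sketch of x = alpha / d  =>  sigma_x = (x**2 / alpha) * sigma_d.
# alpha and sigma_d are assumed toy values.

def depth_sigma(d, sigma_d, alpha=200.0):
    """Depth and its 1-sigma error from a disparity d (pixels)."""
    x = alpha / d                        # depth
    sigma_x = (x * x / alpha) * sigma_d  # first-order propagation
    return x, sigma_x

# The error grows quadratically with depth: doubling the depth
# quadruples the standard deviation.
for d in (40.0, 20.0, 10.0):
    x, s = depth_sigma(d, sigma_d=0.5)
    print(f"depth {x:5.1f} m  ->  sigma {s:6.4f} m")
```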

Error estimates (2)

• Interest point matching error (not mis-matching):
  correlation surface built with a rotation- and scale-adaptive correlation, fitted with a Gaussian distribution
  [Figures: correlation surface and fitted Gaussian distribution (1 pixel)]

• Combination of matching and stereo errors: driven by the 8 neighbouring 3D points X_k (weights w_k) around X_0, projecting the one-sigma covariance ellipse onto the 3D surface:

    σ_X² = Σ_{k=1..8} w_k² [ (X_0 − X_k)² + σ_0² + σ_k² ]

  σ_0², σ_k²: variances of the stereovision error

Error estimates (3)

Visual motion estimation error

• Propagating the uncertainty of the set of matched 3D points to the optimal motion estimate:
  – matched 3D point set: Q̂ = Q + ΔQ = [X_1, …, X_N, X′_1, …, X′_N]
  – optimal motion estimate: û = u + Δu = (Θ̂, Φ̂, Ψ̂, t̂_x, t̂_y, t̂_z)
  – cost function:

    J(û, Q̂) = Σ_{n=1..N} ‖ X′_n − R(Θ̂, Φ̂, Ψ̂) X_n − t̂ ‖²

• Covariance of the random perturbation Δu: propagated using a Taylor series expansion of the Jacobian of the cost function around (û, Q̂)
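The least-squares motion estimate minimising J can be sketched in 2D for brevity: the slide's 3D case minimises the same cost over a rotation matrix and translation, and the closed form below (centroids plus atan2) is the 2D analogue, run on toy data:

```python
# Least-squares rigid motion from matched point sets -- a 2D sketch of
# minimising sum ||p'_n - R p_n - t||^2.  Toy data, closed-form solution.

import math

def estimate_motion(pts, pts_prime):
    """Recover the rotation angle and translation between two point sets."""
    n = len(pts)
    cx = sum(p[0] for p in pts) / n              # centroids
    cy = sum(p[1] for p in pts) / n
    cx_p = sum(p[0] for p in pts_prime) / n
    cy_p = sum(p[1] for p in pts_prime) / n
    # Rotation from cross- and dot-products of the centred points:
    s_sin = s_cos = 0.0
    for (x, y), (xp, yp) in zip(pts, pts_prime):
        ax, ay = x - cx, y - cy
        bx, by = xp - cx_p, yp - cy_p
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)
    # Translation maps the rotated centroid onto the new one:
    tx = cx_p - (math.cos(theta) * cx - math.sin(theta) * cy)
    ty = cy_p - (math.sin(theta) * cx + math.cos(theta) * cy)
    return theta, tx, ty

# Points rotated by 30 degrees and shifted by (1, 2):
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
th = math.radians(30.0)
dst = [(math.cos(th) * x - math.sin(th) * y + 1.0,
        math.sin(th) * x + math.cos(th) * y + 2.0) for x, y in src]
theta, tx, ty = estimate_motion(src, dst)
```

On noiseless data the true motion is recovered exactly; with noisy points the residual of J and its Jacobian feed the covariance propagation described above.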

[Video]

Results

70 m loop, altitude from 25 to 30 m, 90 stereo pairs processed

[Figures: landmark error ellipses (x40); trajectory and landmarks; position and attitude variances]

Results

70 m loop, altitude from 25 to 30 m, 90 stereo pairs processed

Frame 1/90

|    | Reference | Ref. std. dev. | VME result | VME abs. error | SLAM result | SLAM std. dev. | SLAM abs. error |
| Θ  | 6.19°     | 0.18°          | 11.93°     | 5.74°          | 6.01°       | 0.16°          | 0.18°           |
| Φ  | 2.31°     | 0.66°          | 4.00°      | 1.69°          | 1.42°       | 0.55°          | 0.89°           |
| Ψ  | -105.94°  | 0.06°          | -105.52°   | 0.41°          | -106.03°    | 0.08°          | 0.09°           |
| tx | 3.17 m    | 0.26 m         | 5.31 m     | 2.14 m         | 3.13 m      | 0.09 m         | 0.04 m          |
| ty | 0.61 m    | 0.07 m         | 2.01 m     | 1.40 m         | 0.26 m      | 0.19 m         | 0.35 m          |
| tz | -1.52 m   | 0.04 m         | -3.25 m    | 1.73 m         | -1.51 m     | 0.03 m         | 0.01 m          |

Results (cont'd)

270 m loop, altitude from 25 to 30 m, 400 stereo pairs processed, 350 landmarks mapped

[Figures: landmark error ellipses (x30); trajectory and landmarks; position and attitude variances]

Results (cont'd)

270 m loop, altitude from 25 to 30 m, 400 stereo pairs processed, 350 landmarks mapped

Frame 1/400

|    | Reference | Ref. std. dev. | VME result | VME abs. error | SLAM result | SLAM std. dev. | SLAM abs. error |
| Θ  | -0.12°    | 0.87°          | -0.13°     | 0.01°          | -3.68°      | 0.38°          | 3.56°           |
| Φ  | 2.87°     | 1.14°          | -4.99°     | 7.86°          | 5.54°       | 0.40°          | 1.64°           |
| Ψ  | 105.44°   | 0.23°          | 101.82°    | 3.62°          | 104.32°     | 0.19°          | 1.12°           |
| tx | -4.73 m   | 0.57 m         | 5.45 m     | 10.38 m        | -3.98 m     | 0.21 m         | 0.95 m          |
| ty | 0.14 m    | 0.46 m         | 3.04 m     | 2.90 m         | -2.16 m     | 0.22 m         | 2.12 m          |
| tz | 3.89 m    | 0.15 m         | 19.81 m    | 15.94 m        | 3.46 m      | 0.11 m         | 0.43 m          |

Application to ground rovers

[Figure: landmark uncertainty ellipses (x5)]

• 110 stereo pairs processed, 60 m loop

Application to ground rovers

Frame 1/100

|    | Reference | Ref. std. dev. | VME result | VME abs. error | SLAM result | SLAM std. dev. | SLAM abs. error |
| Θ  | 0.52°     | 0.31°          | 2.75°      | 2.23°          | 0.88°       | 0.98°          | 0.36°           |
| Φ  | 0.36°     | 0.25°          | -0.11°     | 0.47°          | 0.72°       | 0.74°          | 0.36°           |
| Ψ  | -0.14°    | 0.16°          | 1.89°      | 2.03°          | 1.24°       | 1.84°          | 1.38°           |
| tx | -0.012 m  | 0.010 m        | 0.057 m    | 0.069 m        | -0.077 m    | 0.069 m        | 0.065 m         |
| ty | -0.243 m  | 0.019 m        | -1.018 m   | 0.775 m        | -0.284 m    | 0.064 m        | 0.041 m         |
| tz | 0.019 m   | 0.015 m        | 0.144 m    | 0.125 m        | 0.018 m     | 0.019 m        | 0.001 m         |

• 110 stereo pairs processed, 60 m loop

Application to indoor robots

About 30 m long trajectory, 1300 stereo image pairs

Application to indoor robots

About 30 m long trajectory, 1300 stereo image pairs

[Figure: covariance ellipses (x10)]

Application to indoor robots

About 30 m long trajectory, 1300 stereo image pairs

[Figures: covariance ellipses (x10) at the beginning, middle and end of the loop]

Application to indoor robots

About 30 m long trajectory, 1300 stereo image pairs

[Figure: estimated Phi, Theta and elevation of the camera along the trajectory: the two rotation angles (Phi, Theta) and the elevation must be zero]

Outline

0. A few words on stereovision

0-bis. Visual odometry

1. Stereovision SLAM

2. Monocular (bearing-only) SLAM

Bearing-only SLAM

Reminder: generic SLAM steps and their stereovision SLAM implementation:
– Landmark detection → vision: interest points
– Relative observations (measures):
  • of the landmark positions → stereovision
  • of the robot motions → visual motion estimation
– Observation associations → interest points matching
– Refinement of the landmark and robot positions → extended Kalman filter

Bearing-only SLAM

Generic SLAM steps and their monocular SLAM implementation:
– Landmark detection → vision: interest points
– Relative observations (measures):
  • of the landmark positions → « multi-view stereovision »
  • of the robot motions → INS, motion model, GPS…
– Observation associations → interest points matching
– Refinement of the landmark and robot positions → particle filter + extended Kalman filter (« observation filter » ≈ Gaussian particles)

1. Landmark initialisation

2. Landmark observations
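The initialisation step can be sketched as follows: the first bearing fixes a viewing ray but no depth, so depth hypotheses are spread along the ray (here as a geometric series, in the spirit of Gaussian-sum or particle initialisation schemes) and re-weighted as further bearings arrive. Geometry and noise values below are toy assumptions:

```python
# Bearing-only landmark initialisation -- a minimal sketch.  The first
# observation is taken along the x axis from the origin; hypotheses are
# re-weighted by the likelihood of a later bearing from a new robot pose.

import math

def init_hypotheses(d_min=1.0, ratio=1.4, count=8):
    """Depth hypotheses along the viewing ray, uniform initial weights."""
    return [(d_min * ratio ** i, 1.0 / count) for i in range(count)]

def reweight(hyps, robot_xy, bearing, sigma=0.02):
    """Re-weight each (depth, weight) hypothesis given a new bearing."""
    out = []
    for depth, w in hyps:
        lx, ly = depth, 0.0                       # hypothesis on the ray
        predicted = math.atan2(ly - robot_xy[1], lx - robot_xy[0])
        err = bearing - predicted
        out.append((depth, w * math.exp(-0.5 * (err / sigma) ** 2)))
    total = sum(w for _, w in out)
    return [(d, w / total) for d, w in out]

# True landmark at (3.84, 0); after moving to (0, -1) the robot measures
# the bearing it would see for that point:
true_bearing = math.atan2(1.0, 3.84)
hyps = reweight(init_hypotheses(), robot_xy=(0.0, -1.0), bearing=true_bearing)
best = max(hyps, key=lambda h: h[1])[0]   # hypothesis nearest the true depth
```

Once the weights concentrate on a single hypothesis, the landmark can be switched to a plain Gaussian in the EKF map.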

Bearing-only SLAM

Overview of the whole algorithm

Bearing-only SLAM

Comparison stereo / bearing-only

Mapped landmarks (bearing-only case)

Looking forward / looking sideways

[Figures: stereovision; bearing-only]

Using panoramic vision

Data association is still an issue

« View-based » qualitative navigation can help to focus the search

View-based navigation

Indexing with global attributes:
• local characteristics histograms based on Gaussian derivatives
• color histograms
• texture histograms
→ Local Characteristics Histograms Family (LCHF)

View-based navigation

Empirical relation between image distance and Cartesian distance
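Indexing with global attributes can be sketched with a single histogram and an intersection score. The slides use families of local-characteristic, colour and texture histograms; a grey-level histogram with toy pixel values stands in for them here:

```python
# View indexing by histogram comparison -- a minimal sketch using
# histogram intersection as the resemblance measure.  All pixel values
# are toy data.

def histogram(values, bins=4, lo=0, hi=256):
    """Normalised histogram of pixel values."""
    h = [0.0] * bins
    width = (hi - lo) / bins
    for v in values:
        h[min(int((v - lo) / width), bins - 1)] += 1.0
    return [x / len(values) for x in h]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

view_a = histogram([10, 20, 200, 210, 220, 90, 95, 100])
view_b = histogram([12, 25, 205, 215, 225, 92, 96, 101])      # similar view
view_c = histogram([240, 250, 245, 241, 250, 252, 249, 243])  # different view

sim_ab = intersection(view_a, view_b)   # high: same place, small changes
sim_ac = intersection(view_a, view_c)   # low: different place
```

Ranking stored views by such scores gives the qualitative "am I back near a known place?" signal that focuses the data association search.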

Closing the loop

1. Image processing at each image acquisition

Closing the loop

2. SLAM processes at each image acquisition

Closing the loop

[Video]

Outline

0. A few words on stereovision

0-bis. Visual odometry

1. Stereovision SLAM

2. Monocular (bearing-only) SLAM

3. Bearing-only SLAM using line segments

Using line segments

Initializing line segment landmarks

Line segment representation: Plücker coordinates

[Figure: illustration in 2 dimensions]
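The Plücker representation can be sketched as follows: a 3D line through points A and B is encoded by its direction u = B − A and moment n = A × B, which always satisfy the constraint u · n = 0 (the helper names below are illustrative):

```python
# Plücker coordinates of a 3D line -- a minimal sketch: direction and
# moment of the line through two points, with the Plücker constraint.

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def plucker(a, b):
    """Direction u = b - a and moment n = a x b of the line through a, b."""
    u = tuple(bi - ai for ai, bi in zip(a, b))
    n = cross(a, b)
    return u, n

# A vertical line through (1, 0, 0):
u, n = plucker((1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
# For any valid line, the constraint dot(u, n) == 0 holds.
```

The moment n is independent of which two points on the line are chosen, which is what makes the representation convenient for filtering on lines rather than endpoints.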

Bearing-only SLAM with line segments

[Video]

Bearing-only SLAM with line segments

Summary

0. A few words on stereovision

0-bis. Visual odometry

1. Stereovision SLAM

2. Monocular (bearing-only) SLAM

3. Bearing-only SLAM using line segments