
What is Visual Odometry

• Visual odometry was initially applied mainly in robotics, to solve the problem of autonomous localization and navigation of a mobile robot in an unknown environment.

• Its core function is to analyze a captured image sequence and determine from it the current position and orientation of the camera. From the camera pose at each frame, the trajectory of the whole system can be obtained.

[Figure: two-view geometry labeling the baseline, the epipolar plane, the epipoles, and the epipolar lines]

The epipolar geometry

C, C′, x, x′ and X are coplanar

The epipolar geometry

All points on the epipolar plane π project onto l and l′

The epipolar geometry

Family of planes π and lines l and l′

Intersection in e and e’

The epipolar geometry

epipoles e,e’

= intersection of baseline with image plane

= projection of the other camera's projection center

an epipolar plane = plane containing baseline (1-D family)

an epipolar line = intersection of epipolar plane with image

(always come in corresponding pairs)

Example: converging cameras

Example: motion parallel with image plane

Example: forward motion


Essential Matrix and Fundamental Matrix

The relationship between corresponding points in the left and right images of the same scene captured by two cameras can be expressed by the Essential matrix or the Fundamental matrix.

Matrix form of cross product

$$ a \times b = [a]_\times\, b = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix} $$

with $a \cdot (a \times b) = 0$ and $b \cdot (a \times b) = 0$.
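A minimal numpy sketch of this identity (the function name `skew` is illustrative):

```python
import numpy as np

def skew(a):
    """Return [a]_x, the 3x3 skew-symmetric matrix with skew(a) @ b == a x b."""
    return np.array([[0.0,  -a[2],  a[1]],
                     [a[2],  0.0,  -a[0]],
                     [-a[1], a[0],  0.0]])

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
assert np.allclose(skew(a) @ b, np.cross(a, b))
assert np.isclose(a @ (skew(a) @ b), 0.0)  # a . (a x b) = 0
```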

Calibrated Camera

With $p = (u, v, 1)^T$ and $p' = (u', v', 1)^T$ in camera coordinates, the epipolar constraint is

$$ p'^{\,T}\,[t \times (R\,p)] = 0 \quad\Longrightarrow\quad p'^{\,T} E\, p = 0 \quad\text{with}\quad E = S\,R,\; S = [t]_\times $$

Essential matrix.

Uncalibrated Camera

The constraint $p'^{\,T} E\, p = 0$ holds in camera coordinates. Let $\hat p$ and $\hat p'$ be the points in pixel coordinates corresponding to $p$ and $p'$, i.e. $p = M_{\mathrm{int}}^{-1}\,\hat p$ and $p' = (M'_{\mathrm{int}})^{-1}\,\hat p'$. Then

$$ \hat p'^{\,T} F\, \hat p = 0 \quad\text{with}\quad F = \big(M'_{\mathrm{int}}\big)^{-T} E\, \big(M_{\mathrm{int}}\big)^{-1} $$

Fundamental matrix.

Properties of the fundamental and essential matrix

• The matrix is 3 × 3

• Transpose: if F is the essential matrix of the camera pair (P, P′), then F^T is the essential matrix of the pair (P′, P)

• Epipolar lines: think of p and p′ as points in the projective plane; then F p is a projective line in the right image, that is, l′ = F p and l = F^T p′ (a sketch follows below)

• The essential matrix encodes information about the extrinsic parameters only
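A small numpy sketch of the epipolar-line property above (function name and inputs are illustrative):

```python
import numpy as np

def epipolar_lines(F, p, p_prime):
    """For corresponding homogeneous points p (left) and p' (right),
    l' = F p is the epipolar line of p in the right image and
    l = F^T p' is the epipolar line of p' in the left image.
    A line (a, b, c) is the set of pixels with a*u + b*v + c = 0."""
    l_prime = F @ p
    l = F.T @ p_prime
    return l, l_prime

# For a true correspondence, p' lies on l', i.e. p' . (F p) = 0.
```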

Least squares approach

Minimize
$$ \sum_{i=1}^{n} \big(p_i'^{\,T} F\, p_i\big)^2 $$
under the constraint $\|F\|^2 = 1$.

We have a homogeneous system $A f = 0$. The least-squares solution is the singular vector corresponding to the smallest singular value of $A$, i.e. the last column of $V$ in the SVD $A = U D V^T$.
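A sketch of this SVD solution, assuming (n, 3) arrays of homogeneous correspondences; the final rank-2 enforcement is standard practice for F, though it is not stated on the slide:

```python
import numpy as np

def fit_fundamental(pts, pts_prime):
    """Linear least-squares fit of F from n >= 8 correspondences.
    pts, pts_prime: (n, 3) homogeneous points p_i, p'_i.
    Minimizes sum_i (p'_i^T F p_i)^2 subject to ||F|| = 1."""
    # Each correspondence gives one row of A: p'^T F p is linear in vec(F).
    A = np.array([np.kron(pp, p) for p, pp in zip(pts, pts_prime)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)        # singular vector of smallest singular value
    # Enforce rank 2 (a valid F is singular): zero the smallest singular value.
    U, S, Vt2 = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt2
```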

3D Reconstruction

• Stereo: we know the viewing geometry (extrinsic parameters) and the intrinsic parameters: Find correspondences exploiting epipolar geometry, then reconstruct

• Structure from motion (with calibrated cameras): Find correspondences, then estimate extrinsic parameters (rotation and direction of translation), then reconstruct.

• Uncalibrated cameras: Find correspondences, compute projection matrices (up to a projective transformation), then reconstruct up to a projective transformation.

Point reconstruction

$$ x = M X, \qquad x' = M' X $$

Geometric error

Reconstruct matches in projective frame

by minimizing the reprojection error

Non-iterative optimal solution

Reconstruction for intrinsically calibrated cameras

• Compute the essential matrix E using normalized points.

• Select $M = [I \,|\, 0]$ and $M' = [R \,|\, T]$; then $E = [T]_\times R$

• Find T and R using the SVD of E (a sketch follows below)
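The SVD-based decomposition can be sketched as follows. This is the standard factorization yielding four (R, T) candidates; the physically valid one is chosen by triangulating a point and keeping the solution with positive depth in both cameras, a step only summarized in a comment here:

```python
import numpy as np

def decompose_essential(E):
    """Return the four candidate (R, t) pairs from E = [t]_x R.
    Translation is recovered only up to sign and scale."""
    U, _, Vt = np.linalg.svd(E)
    # Ensure proper rotations (determinant +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]
    # The valid pair is the one placing triangulated points in front of both cameras.
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```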

Reconstruction from uncalibrated cameras

Reconstruction problem: given $x_i \leftrightarrow x'_i$, compute $M$, $M'$ and $X_i$ with

$$ x_i = M X_i, \qquad x'_i = M' X_i \quad\text{for all } i $$

Without additional information, this is possible only up to a projective ambiguity.

Projective Reconstruction Theorem

• Assume we determine matching points xi and xi’. Then we can compute a unique Fundamental matrix F.

• The camera matrices M, M’ cannot be recovered uniquely

• Thus the reconstruction (Xi) is not unique

• Any two such reconstructions are related by a projective transformation H:

$$ X_i^{(2)} = H X_i^{(1)}, \qquad M^{(2)} = M^{(1)} H^{-1}, \qquad M'^{(2)} = M'^{(1)} H^{-1} $$

Reconstruction ambiguity: projective

$$ x_i = M X_i = \big(M H_P^{-1}\big)\big(H_P X_i\big) $$

Structure from Motion

or

Simultaneous Localization and Mapping (SLAM)

or

Visual Odometry

Camera calibration

• Determine camera parameters from known 3D points or calibration object(s)

1. internal or intrinsic parameters, such as focal length, optical center, aspect ratio: what kind of camera?

2. external or extrinsic (pose) parameters: where is the camera?

• How can we do this?

Coordinate Systems

• World Coordinate System: a known reference coordinate system with respect to which we calibrate the camera.

• Camera Coordinate System: a coordinate system with its origin at the optical center of the camera.

• Pixel Coordinate System: the 2D coordinate system of the image, measured in pixels.

Camera: Geometry Involved

• Mathematical definition: a camera is a mapping between a 3D world (object space) and a 2D image.

• Calibration: the objective of calibration is to calculate the intrinsic and/or extrinsic parameters of a camera, given a set of images taken using the camera.

Camera Models

• Perspective: $x' = f\,\dfrac{x}{z}, \qquad y' = f\,\dfrac{y}{z}$

• Orthographic: $x' = x, \qquad y' = y$

Note: We will deal only with perspective projection.

Intrinsic Parameters

• Let $(x, y, z)$ be the coordinates of a point in 3D. Its projection on the image plane is given by:

Ideal camera, square pixels:
$$ u = f\,\frac{x}{z}, \qquad v = f\,\frac{y}{z} $$

Ideal camera, rectangular pixels:
$$ u = k f\,\frac{x}{z}, \qquad v = l f\,\frac{y}{z} $$

Displaced center:
$$ u = k f\,\frac{x}{z} + u_0, \qquad v = l f\,\frac{y}{z} + v_0 $$

Rotated coordinate axes (pixel coordinates $(u, v)$):
$$ u = \alpha\,\frac{x}{z} - \alpha\cot(\theta)\,\frac{y}{z} + u_0, \qquad v = \frac{\beta}{\sin(\theta)}\,\frac{y}{z} + v_0 $$

Intrinsic Parameters

In matrix form:

$$ z\,p = K P, \qquad p = \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}, \qquad P = \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}, \qquad K = \begin{bmatrix} \alpha & -\alpha\cot(\theta) & u_0 & 0 \\ 0 & \beta/\sin(\theta) & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} $$

Hence, the 5 intrinsic parameters of a camera are: $\alpha,\ \beta,\ \theta,\ u_0,\ v_0$.
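A small numpy sketch of this K applied to a camera-frame point (all parameter values below are illustrative; θ = π/2 gives the usual rectangular-pixel case):

```python
import numpy as np

def intrinsic_matrix(alpha, beta, theta, u0, v0):
    """3x4 intrinsic matrix K as on the slide: z * (u, v, 1)^T = K (x, y, z, 1)^T."""
    return np.array([[alpha, -alpha / np.tan(theta), u0, 0.0],
                     [0.0,    beta / np.sin(theta),  v0, 0.0],
                     [0.0,    0.0,                   1.0, 0.0]])

K = intrinsic_matrix(800.0, 800.0, np.pi / 2, 320.0, 240.0)  # skew term ~0 at pi/2
P = np.array([0.5, -0.2, 2.0, 1.0])   # homogeneous 3D point in the camera frame
p = K @ P
u, v = p[0] / p[2], p[1] / p[2]       # divide by z to obtain pixel coordinates
```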

Extrinsic Parameters

• The camera frame ( C ) can be different from the world frame (W).

$$ P_c = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix} P_w, \qquad P_c = \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix}, \qquad P_w = \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} $$

where $R = \begin{bmatrix} r_1^T \\ r_2^T \\ r_3^T \end{bmatrix}$ is a 3×3 rotation matrix and $t = (t_x, t_y, t_z)^T$ is a 3×1 translation vector.

Hence, we have 6 extrinsic parameters.

Perspective Projection Matrix

• M is a 3×4 matrix. Taking into consideration both the intrinsic and extrinsic parameters:

$$ M = K\,[R \;|\; t] = \begin{bmatrix} \alpha r_1^T - \alpha\cot(\theta)\, r_2^T + u_0 r_3^T & \alpha t_x - \alpha\cot(\theta)\, t_y + u_0 t_z \\ \dfrac{\beta}{\sin(\theta)}\, r_2^T + v_0 r_3^T & \dfrac{\beta}{\sin(\theta)}\, t_y + v_0 t_z \\ r_3^T & t_z \end{bmatrix} $$

With $m_1, m_2, m_3$ the rows of M, $z\,p = M P$ gives

$$ u = \frac{m_1 \cdot P}{m_3 \cdot P}, \qquad v = \frac{m_2 \cdot P}{m_3 \cdot P} $$
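A brief numpy sketch of $M = K[R \,|\, t]$ and the projection formulas above, using the 3×3 intrinsic block of K (function names are illustrative):

```python
import numpy as np

def projection_matrix(K3, R, t):
    """M = K [R | t]: K3 is the 3x3 intrinsic block, R a 3x3 rotation,
    t a 3-vector. Returns the 3x4 perspective projection matrix."""
    return K3 @ np.hstack([R, t.reshape(3, 1)])

def project(M, Pw):
    """Apply z p = M P: u = (m1.P)/(m3.P), v = (m2.P)/(m3.P)."""
    P = np.append(Pw, 1.0)           # homogeneous world point
    m1, m2, m3 = M                   # the three rows of M
    return m1 @ P / (m3 @ P), m2 @ P / (m3 @ P)
```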

Rotation and Translation

Camera matrix

• Fold the intrinsic calibration matrix K and the extrinsic pose parameters (R, t) together into a camera matrix:

• M = K [R | t]

• (fixing the lower right entry to 1 leaves 11 degrees of freedom)

Camera matrix calibration

• Directly estimate the 11 unknowns in the matrix M using known 3D points (Xi, Yi, Zi) and measured feature positions (ui, vi)

Camera matrix calibration

• Linear regression:

– Bring the denominator over and solve the resulting (over-determined) set of linear equations. How?

– Least squares (pseudo-inverse); a sketch follows below
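A sketch of that linear-regression step as a direct linear transform: bringing the denominator over turns each correspondence into two homogeneous equations, stacked into A m = 0 and solved by SVD (function name and array shapes are assumptions):

```python
import numpy as np

def calibrate_dlt(X, uv):
    """Estimate the 3x4 camera matrix M from n >= 6 known 3D points X (n, 3)
    and measured pixel positions uv (n, 2). Each point yields two equations:
    m1.P - u * (m3.P) = 0 and m2.P - v * (m3.P) = 0."""
    rows = []
    for (x, y, z), (u, v) in zip(X, uv):
        P = np.array([x, y, z, 1.0])
        rows.append(np.hstack([P, np.zeros(4), -u * P]))
        rows.append(np.hstack([np.zeros(4), P, -v * P]))
    A = np.array(rows)               # shape (2n, 12)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)      # least-squares solution, up to scale
```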

Projective structure from motion

• Given: m images of n fixed 3D points

• xij = Pi Xj , i = 1,… , m, j = 1, … , n

• Problem: estimate m projection matrices Pi and n 3D points Xj from the mn corresponding points xij

[Figure: fixed points Xj observed as x1j, x2j, x3j by cameras P1, P2, P3]

Slides from Lana Lazebnik

Bundle adjustment

• Non-linear method for refining structure and motion

• Minimizes the reprojection error:

$$ E(P, X) = \sum_{i=1}^{m} \sum_{j=1}^{n} D\big(x_{ij},\, P_i X_j\big)^2 $$

[Figure: reprojections P1Xj, P2Xj, P3Xj compared against the measured points x1j, x2j, x3j for cameras P1, P2, P3]
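A sketch of the objective being minimized, with illustrative data structures; an actual bundle adjuster would minimize this over all P_i and X_j with a non-linear solver such as Levenberg-Marquardt:

```python
import numpy as np

def reprojection_error(cameras, points, observations):
    """E(P, X) = sum_ij D(x_ij, P_i X_j)^2.
    cameras: list of 3x4 matrices P_i; points: (n, 4) homogeneous X_j;
    observations: dict mapping (i, j) -> observed pixel (u, v)."""
    err = 0.0
    for (i, j), x_obs in observations.items():
        x = cameras[i] @ points[j]
        x = x[:2] / x[2]             # perspective division
        err += np.sum((x - np.asarray(x_obs)) ** 2)
    return err
```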

Existing attempts: SLAM

• SLAM (Simultaneous Localization and Mapping) can use many different types of sensors to acquire the observation data used in building the map, such as laser rangefinders, sonar sensors, and cameras.

– Well-established in robotics (using a rich array of sensors)

– Demonstrated with a single hand-held camera by Davison in 2003 (MonoSLAM)

– MonoSLAM was applied to an AR system in 2004

Existing attempts : Model based tracking

• Model-based tracking is

– More robust

– More accurate

– Proposed by Lepetit et al. at ISMAR 2003

Frame-by-Frame SLAM

• Why? Is SLAM fundamentally harder?

[Timeline of one frame: find features → update the camera pose and the entire map (many DOF) → draw graphics]

Frame-by-Frame SLAM

• SLAM

– Updating the entire map every frame is too expensive!

– Needs a "sparse map of high-quality features" - A. Davison

• Proposed approach

– Use a dense map (of lower-quality features)

– Don't update the map every frame: use key frames

– Split tracking and mapping into two threads

Parallel Tracking and Mapping

• Proposed method: split tracking and mapping into two threads

[Timeline of one frame. Thread #1 (tracking): find features → update camera pose only (simple and easy) → draw graphics. Thread #2 (mapping): update the map in the background]

Parallel Tracking and Mapping

Tracking thread:

• Responsible for estimating the camera pose and rendering the augmented graphics

• Must run at 30 Hz

• Made as robust and accurate as possible

Mapping thread:

• Responsible for providing the map

• Can take lots of time per key frame

• Made as rich and accurate as possible

Tracking thread

• Overall flow: pre-process the frame; then, against the map, run a coarse stage and a fine stage, each of which projects points, measures them, and updates the camera pose; finally draw the graphics.

Pre-process frame → [coarse stage: project points → measure points → update camera pose] → [fine stage: project points → measure points → update camera pose] → draw graphics

Pre-process frame

• Build a four-level image pyramid: 640×480, 320×240, 160×120, 80×60

• Detect FAST corners at each level

– E. Rosten et al. (ECCV 2006)

Project Points

• Use a motion model to update the camera pose

– Constant velocity model:

$$ V_t = \frac{P_t - P_{t-1}}{\Delta t}, \qquad P_{t+1} = P_t + \Delta t'\, V_t $$

where $P_{t-1}$ and $P_t$ are the previous poses, $\Delta t$ is the time between them, and $\Delta t'$ is the time to the predicted pose $P_{t+1}$ (a sketch follows below).
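A minimal sketch of this prediction step, treating the pose as a plain vector for illustration; a full tracker would apply the same idea on the SE(3) pose manifold via the log/exp maps:

```python
import numpy as np

def predict_pose(P_prev, P_curr, dt, dt_next):
    """Constant-velocity prediction as on the slide:
    V_t = (P_t - P_{t-1}) / dt, then P_{t+1} = P_t + dt' * V_t."""
    V = (P_curr - P_prev) / dt
    return P_curr + dt_next * V
```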

Project Points

• Choose a subset to measure

– ~50 biggest features for the coarse stage

– 1000 randomly selected for the fine stage

Measure points

• Generate an 8×8 matching template (warped from the source key-frame in the map)

• Search a fixed radius around the projected position

– Use zero-mean SSD (a sketch follows below)

– Only search at FAST corner points
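A sketch of the zero-mean SSD score used to compare the 8×8 warped template against a candidate patch (array types are assumptions):

```python
import numpy as np

def zmssd(template, patch):
    """Zero-mean sum of squared differences between two 8x8 patches.
    Subtracting each patch's mean gives invariance to additive
    brightness changes before comparing."""
    t = template.astype(np.float64) - template.mean()
    p = patch.astype(np.float64) - patch.mean()
    return np.sum((t - p) ** 2)

# The tracker evaluates this only at FAST corners within the fixed search
# radius around the projected map-point position, keeping the minimum score.
```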

Update camera pose

• A 6-DOF problem

– Obtained by SfM (three-point algorithm)

Mapping thread

• Overall flow: stereo initialization, then a loop driven by the tracker: wait for a new key frame, add new map points, optimize the map, and perform map maintenance.

Stereo Initialization

• Use the five-point pose algorithm

– D. Nistér et al., 2006

• Requires a pair of frames and feature correspondences

• Provides initial map

• User input required:

– Two clicks for two key-frames

– Smooth motion for feature correspondence

Wait for new key frame

• Key frames are only added if:

– There is a sufficient baseline to the other key frames

– Tracking quality is good

– (A key frame stores the 4-level pyramid images and their corners)

• When a key frame is added :

– The mapping thread stops whatever it is doing

– All points in the map are measured in the keyframe

– New map points are found and added to the map

Add new map points

• Want as many map points as possible

• Check all maximal FAST corners in the key frame:

– Check the corner score

– Check whether the point is already in the map

• Epipolar search in a neighboring key frame

• Triangulate matches and add them to the map (a sketch follows below)

• Repeat in all four image pyramid levels
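A sketch of linear triangulation for one match, in the usual DLT formulation (names and shapes are assumptions):

```python
import numpy as np

def triangulate(M1, M2, x1, x2):
    """Linear (DLT) triangulation of one match: x1, x2 are pixel coordinates
    in two key frames with camera matrices M1, M2. From x ~ M X, each image
    contributes two rows of A X = 0; the solution is the last right
    singular vector of A."""
    u1, v1 = x1
    u2, v2 = x2
    A = np.array([u1 * M1[2] - M1[0],
                  v1 * M1[2] - M1[1],
                  u2 * M2[2] - M2[0],
                  v2 * M2[2] - M2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]              # dehomogenize to a 3D point
```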

Optimize map

• Use batch SFM method: Bundle Adjustment

• Adjusts map point positions and key frame

poses

• Minimize reprojection error of all points in all

keyframes (or use only last N key frames)

Map maintenance

• When camera is not exploring, mapping thread

has idle time

• Data association in bundle adjustment is

reversible

• Re-attempt outlier measurements

• Try to measure new map features in all old key frames

Comparison to EKF-SLAM

• More accurate

• More robust

• Faster tracking

[Figure: SLAM-based AR vs. the proposed AR]

System and Results • Environment

– Desktop PC (Intel Core 2 Duo 2.66 GHz)

– OS : Linux

– Language : C++

• Tracking speed (per frame):

Key frame preparation:   2.2 ms
Feature projection:      3.5 ms
Patch search:            9.8 ms
Iterative pose update:   3.7 ms
Total:                  19.2 ms

System and Results • Mapping scalability and speed

– Practical limit

• 150 key frames

• 6000 points

– Bundle adjustment timing:

Key frames:                2-49     50-99    100-149
Local bundle adjustment:   170 ms   270 ms   440 ms
Global bundle adjustment:  380 ms   1.7 s    6.9 s

Demonstration

Remaining problems

• Outlier management

• Still brittle in some scenarios

– Repeated texture

– Passive stereo initialization

• Occlusion problem

• Relocalization problem