Guillaume CERNIER
2nd year ENSIMAG
Summer 2006 (3 months from 19/06 to 15/09)

2nd year internship report

"Structure from motion" algorithms for the Cycab robot

Internship tutor: Christophe BRAILLON
GRAVIR Laboratory, e-Motion team
INRIA Rhône-Alpes
655 avenue de l'Europe
Montbonnot
38334 SAINT ISMIER Cedex, France




Contents

1 Context
1.1 The laboratory
1.1.1 INRIA Presentation
1.1.2 INRIA Rhône-Alpes organization
1.2 The e-Motion team
1.2.1 Team presentation
1.2.2 Team composition
1.2.3 My internship tutor(s)

2 The internship
2.1 Internship subject
2.1.1 The cycab vision platform and its graphs
2.1.2 "Structure from motion" algorithms
2.1.3 A camera calibration GUI
2.1.4 Technical choices and precisions
2.2 Internship organization
2.2.1 Planning
2.2.2 Publication
2.3 Relationship skills acquired

Bibliography

3 Appendix


Technical summary

One of the projects of the e-Motion team (GRAVIR lab at INRIA Rhône-Alpes), with which I collaborated, is the design of the Cycab robot. This robot is an autonomous vehicle intended to move in urban spaces, especially for people with reduced mobility.
More information on this large R&D project is available on the internet (1, 2).

Figure 1: Cycabs in front of Rhone-Alpes INRIA unit

Such a vehicle therefore has to be able to avoid obstacles such as pedestrians or other vehicles. There are several ways to do so:

• The first, which has already been dealt with, is a laser telemeter installed at the front of the robot. This kind of sensor has disadvantages, such as an accuracy limited to a 2D horizontal plane.

• The second is to use the camera, also mounted at the front of the vehicle, to try to estimate the 3D shapes of the surrounding objects.

• The best would be to fuse the first two approaches; this is the solution currently being designed in the e-Motion team.

My work during these three months, however, was to implement the second solution, and especially "structure from motion" algorithms.

These algorithms had to be fully integrated into what the team calls the Cycab platform, the large piece of software that will be embedded in the vehicle (on a laptop in the current prototype version of the Cycab robot).

1 http://www-lara.inria.fr/cycaba/
2 http://www.inria.fr/rrrt/rt-0229.html


Acknowledgments

I would first like to thank Christophe Braillon, my internship tutor (and also a former ENSIMAG student), for having believed in me during these 3 months and for giving me many of the elements I needed to succeed in this experience.
I would also like to thank Amaury Negre for giving me some crucial theoretical and mathematical elements about my work.
I will mention them again in the "e-Motion team" section.
More generally, I thank the whole e-Motion team of INRIA Rhône-Alpes, and also Axel Ladilas (who also carried out a 3-month internship in this team), for all the great moments spent this summer, especially the outing to Lake Annecy, the football tournament, the volleyball sessions and above all the many card games.


Chapter 1

Context

1.1 The laboratory

1.1.1 INRIA Presentation

The "Institut National de Recherche en Informatique et en Automatique" (INRIA) is a French national research institution focusing on computer science, control theory and applied mathematics. Created in 1967 at Rocquencourt near Paris, INRIA (the National Institute for Research in Computer Science and Control) is a public scientific and technological establishment (EPST) under the joint supervision of the French Research Ministry and the Ministry of Economy, Finance and Industry.
It consists of 6 research units distributed throughout France:

• INRIA Futurs (Lille, Saclay, Bordeaux) 1

• INRIA Lorraine (LORIA - in association with CNRS and local universities) 2

• INRIA Rennes (IRISA - in association with CNRS and local universities) 3

• INRIA Rhone-Alpes (near Grenoble) 4

• INRIA Rocquencourt (a suburb of Paris) 5

• INRIA Sophia-Antipolis (near Nice) 6

[source : Wikipedia 7]

I carried out my summer internship in the Rhône-Alpes research unit, so I will briefly describe it.

1 http://www-futurs.inria.fr/
2 http://www.loria.fr/INRIA/
3 http://www.irisa.fr/
4 http://www.inrialpes.fr/
5 http://www-rocq.inria.fr/
6 http://www-sop.inria.fr/
7 http://en.wikipedia.org/wiki/INRIA


1.1.2 INRIA Rhone-Alpes organization

Figure 1.1: Rhone-Alpes INRIA unit

There are 24 research teams at the Rhône-Alpes research unit, distributed across 6 research themes.

Communicating systems

• ARES - Architectures of networks of services

• COMPSYS - Compilation and embedded systems

• PLANETE - Protocols and applications for the Internet

• POP ART - Programming languages, Operating Systems, Parallelism, and Aspects for Real-Time

• SARDES - System architecture for reflective distributed computing environments

• VASY - System validation - Research and applications

Cognitive systems

• ARTIS - Acquisition, representation and transformations for image synthesis

• EVASION - Virtual environments for animation and image synthesis of natural objects

• LEAR - Learning and recognition in vision

• PRIMA - Perception, recognition and integration for observation of activity

• I3D - 3 dimensional interaction

• MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems


Symbolic systems

• ARENAIRE - Computer arithmetic

• EXMO - Computer mediated exchange of structured knowledge

• WAM - Web, adaptation and multimedia

Numerical systems

• BIPOP - Modelling, Simulation and Control of Non-Smooth Dynamical Systems

• E-MOTION - Geometry and Probability for Motion and Action

• GRAAL - Algorithms and Scheduling for Distributed Heterogeneous Platforms

• MESCAL - Middleware efficiently scalable

• MOAIS - PrograMming and scheduling design fOr Applications in Interactive Simulation

• MOISE - Modelling, Observations, Identification for Environmental Sciences

• OPALE - Optimization and control, numerical algorithms and integration of complex multidiscipline systems governed by PDE

• RESO - Protocols and softwares for very high-performance network

Biological systems

• HELIX - Computer science and genomics

The teams marked in red are those which belong to the GRAVIR laboratory (GRaphics, VIsion and Robotics). This laboratory, which includes the e-Motion team, is in fact a joint unit of INPG, INRIA, CNRS and UJF.

1.2 The e-Motion team

1.2.1 Team presentation

e-Motion stands for "Geometry and Probability for Motion and Action".

Project-team presentation overview

The project-team e-Motion aims at developing models and algorithms for building "artificial systems" that include advanced sensorimotor loops and exhibit behaviors efficient and robust enough to operate in open and dynamic environments (i.e. in partially known environments, where time and dynamics play a major role), leading to varied interactions with humans.
Recent technological progress in embedded computational power, sensor technologies and miniaturised mechatronic systems makes the required technological breakthroughs potentially possible (including from the scalability point of view).


Approach and research themes

To reach this objective, the team proposes to combine the respective advantages of computational geometry and probability theory, while working in cooperation with neurophysiologists to apply and experiment with some biological models.
This approach leads the team to study, from these different points of view, three strongly correlated fundamental research themes:

• Perception and Multimodal modelling of space and motion

• Motion planning for the physical world

• Probabilistic inference for decision and modelling of living mechanisms

Some applications are therefore in robotics, and the Cycab robot is an example of the team's work.

The team has also founded a startup named Probayes, located near the research unit. This startup now develops software such as the C++ Bayesian library ProBT, on which two other 2nd year ENSIMAG students (Olivier CARRE, Olivier SAINGRE) worked this summer.

1.2.2 Team composition

Christian Laugier is the head of the project-team.
The administrative assistant is Anne Pasteur.

The rest of the staff is divided into INRIA and CNRS members:

INRIA staff: there are two CRs (chargés de recherche, i.e. research scientists), Thierry Fraichard (who was at the research unit during my internship) and Sepanta Sekhavat, who has left for Iran for 2 years.

CNRS staff: the two members of the CNRS staff are Emmanuel Mazer and Pierre Beissiere; both are involved in the startup Probayes (Emmanuel Mazer is Probayes' CEO).

Like many research project-teams, e-Motion hosts many Ph.D. students and student interns, especially from UJF (DEA or master's degree).

During my internship, there were also some Chinese and Korean collaborators from Toyota, working in the context of the R&D collaboration between Toyota and INRIA Rhône-Alpes 8.

8 http://www.ccifj.or.jp/lm/article.php3?id_article=1297&lang=fr


1.2.3 My internship tutor(s)

Christophe BRAILLON was my main internship tutor but, as I said in the "Acknowledgments" section, Amaury Negre also supervised my work.
They are both PhD students.

In this picture, they are in Brazil for robotics lectures.

Figure 1.2: Christophe Braillon & Amaury Negre


Chapter 2

The internship

2.1 Internship subject

My internship consisted of developing several tools around the "structure from motion" process.
The work I carried out can be divided into three parts:

• The cycab vision platform and its graphs

• ”Structure from motion” algorithms

• A camera calibration GUI

2.1.1 The cycab vision platform and its graphs

I did not really take part in the design of the cycab vision platform, but it was very important to know well how it works before beginning my contribution.

The platform is a piece of software which manages the execution of objects called processors, each of which performs one part of a computation.
These processors have inputs and outputs, so they have to be synchronized in order to exchange data (images, matrices, numbers, ...) correctly.

For the platform to know which processors to launch, in which order, and so on, we have to tell it how. The choice made by the platform designers (especially my internship tutor, Christophe BRAILLON) is to create graphs describing the processors, their inputs/outputs and the relationships between them.
The advantage of using graphs is that a big computation (for example, the "structure from motion" process is quite complex) can be cut into smaller parts, and that these smaller parts can be reused in other processes (for example, a processor which interacts with the user to load an image file can be reused in many processes).

I will give an example of a graph implementing a "structure from motion" process after explaining this process.
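To make the idea of a processor graph more concrete, here is a minimal sketch in Python. The real platform is written in C++ and its API is not described in this report, so the `Processor` tuple, the `run_graph` function and the three toy processors below are hypothetical illustrations only. Each processor declares named inputs and outputs, and the scheduler runs a processor as soon as all of its inputs are available.

```python
from collections import namedtuple

# Hypothetical processor description: a name, a function, and the names of
# the data slots it reads (inputs) and writes (outputs).
Processor = namedtuple("Processor", "name func inputs outputs")


def run_graph(processors, initial_data):
    """Run every processor once, each as soon as all of its inputs exist."""
    data = dict(initial_data)
    pending = list(processors)
    order = []
    while pending:
        ready = [p for p in pending if all(i in data for i in p.inputs)]
        if not ready:
            raise ValueError("unsatisfied inputs or cyclic graph")
        for p in ready:
            results = p.func(*(data[i] for i in p.inputs))
            data.update(zip(p.outputs, results))
            order.append(p.name)
            pending.remove(p)
    return data, order


# Toy graph, deliberately listed out of order; real processors would
# exchange images and matrices instead of small lists.
graph = [
    Processor("Mean", lambda g: (sum(g) / len(g),), ["gray"], ["mean"]),
    Processor("RGB2Gray", lambda rgb: ([sum(px) / 3 for px in rgb],), ["rgb"], ["gray"]),
    Processor("Load", lambda: ([(30, 60, 90), (0, 0, 0)],), [], ["rgb"]),
]
data, order = run_graph(graph, {})
print(order, data["mean"])
```

The execution order is derived from the declared inputs and outputs rather than from the order in which the processors are listed, which is what makes small processors reusable in several graphs.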


2.1.2 ”Structure from motion” algorithms

”Structure from motion” presentation

All my work is inspired by Marco Zucchelli's thesis on "Structure From Motion" [Zuc].

What is the "structure from motion" process?
Currently, one solution for estimating the structure of objects is to use 2 cameras and proceed by triangulation (projective geometry).
This solution is often called stereo-vision, but only one camera will be embedded on the Cycab robot.
So the challenge is to do the same with only one camera.
This is possible provided there is a good way to know the motion of the camera between two of its snapshots. The problem then becomes equivalent to stereo-vision.

So the "structure from motion" process consists of two main sub-processes:

The first step consists of computing a vector field called "optical flow" from two successive pictures.
The second step consists of estimating both the motion of the camera between the two snapshots and the 3D structure of the surrounding scene.


The first step can be summed up by this picture:

[Diagram: the camera MOTION separates PICTURE 1 (t) and PICTURE 2 (t − dt); both pictures feed a COMPUTATION which outputs the OPTICAL FLOW]


and the second step by this picture:

[Diagram: the OPTICAL FLOW feeds a COMPUTATION which outputs the 3D STRUCTURE]


This is only an overview: "structure from motion" may seem simple, but in fact it requires many algorithms related to projective geometry and relying on algebra.

Technical details are given in the appendix, but the theoretical aspects involve algorithms such as:

• GoodFeaturesToTrack algorithm: selects only the interesting points in the pictures in order to lighten the computation

• Lucas-Kanade pyramidal algorithm: this is the algorithm chosen to compute the optical flow

• projective geometry aspects: fundamental and intrinsic matrices ...

• SVD and least-squares methods to find these matrices and to compute the ego-motion of the camera


Below is a graph (like those discussed in the previous section) implementing a "structure from motion" process on the platform:

[Figure: processor graph of the "structure from motion" process. The processors and the port labels readable in the figure are listed below.]

• HScale (Min double=1.0, Max double=10.0, Step double=1.0): O1
• HScale (Min double=100.0, Max double=10000.0, Step double=100.0): val max
• CamCalibLoadFromXML: intrinsic_matrix, distortion_coeffs
• ImgLoadMultiple: step, O1
• RGB2Gray: I1, O1
• UndistortFromIntrinsics: image, camera_matrix, distortion_coeffs, image
• DrawSparseVectorField: image, vector_field, image
• DelayImg: I1, O1
• OpticalFlowPyrLK: img1, img2, max, max_init, fundamental_matrix, vf
• DrawProcessor: image, image
• EgoMotion: fundamental_matrix, intrinsic_matrix, rotation, translation
• ControlPlatform: image
• _3DReconstruction: intrinsic_matrix, rotation, translation, SparseVF, _3DPoints
• DisplayMatrix (string=rotation): matrix
• DisplayMatrix (string=translation): matrix
• OpenGL_View3DPoints: _3DPoints


Results

As you can see in this picture, the results are very promising: the 3D scene in the video stream was well detected (the poster, the desk, the person (Iker :) ), the chair, ...).

Christophe BRAILLON suggested an improvement of the first model (based on 2 pictures), which consists of using 4 pictures per graph cycle. Only the first and the fourth pictures would be used to compute the optical flow (and so the fundamental matrix), but we would have 3 estimates of each 3D point (from the 1st-2nd, 2nd-3rd and 3rd-4th pictures). We can then take their mean or choose the best one according to some criterion. This would amount to filtering the reconstruction and would make the model more robust.

Because I was ahead of schedule, I began working on this improvement 2 weeks before the end of the internship, but the results were not promising.
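The fusion of several estimates of the same 3D points can be sketched as follows. This is only an illustration under assumptions that the report does not state: the three reconstructions are supposed to be already expressed in a common reference frame, the function name `fuse_estimates` is hypothetical, and the coordinate-wise median is just one possible "best estimate" criterion, chosen here because it ignores a single outlier.

```python
import numpy as np


def fuse_estimates(estimates, use_median=False):
    """Fuse k reconstructions of the same n points.

    estimates has shape (k, n, 3); the result has shape (n, 3).
    """
    estimates = np.asarray(estimates, dtype=float)
    if use_median:
        return np.median(estimates, axis=0)   # robust to one bad estimate
    return estimates.mean(axis=0)             # plain average of the estimates


# One 3D point estimated three times; the third estimate has a wrong depth.
estimates = [[[1.0, 2.0, 5.0]], [[1.0, 2.0, 5.1]], [[1.0, 2.0, 50.0]]]
mean_point = fuse_estimates(estimates)
median_point = fuse_estimates(estimates, use_median=True)
print(mean_point, median_point)
```

With the mean, a single wrong depth pulls the fused point far from the other two estimates, whereas the median keeps it close to them.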


2.1.3 A camera calibration GUI

An important point to remember when running "structure from motion" algorithms is that each camera renders pictures differently. The deformations this implies can be summed up in a matrix called the "intrinsic matrix" of the camera.
Other camera-dependent parameters are the distortion coefficients. It is not possible to begin a "structure from motion" process without undistorting the pictures; otherwise, the computations might be quite imprecise.
This means that the camera has to be calibrated (that is to say, its intrinsic matrix and distortion coefficients have to be found).

That is why I also developed a camera calibration GUI.

This is a picture of the final version of this GUI.

It took more than 3 weeks to develop it from scratch.
It was quite difficult because I had to use a 3D computer vision library (OpenCV 1) which contains some bugs and is not very easy to use.

As you can see, a camera calibration requires several pictures (often around ten), and the usual practice is to use pictures of a chessboard.
Indeed, OpenCV tries to find the intrinsic matrix and the distortion coefficients by computing the difference between the corner positions it detects in the pictures and the positions they would have if they were aligned (in the picture, you can notice the camera distortion because the straight red lines do not match the corners in the center of the chessboard).

1 http://opencvlibrary.sourceforge.net/
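To illustrate what the intrinsic matrix and the distortion coefficients represent, the sketch below projects a 3D point (given in the camera frame) to pixel coordinates. It assumes the usual radial and tangential distortion model with four coefficients (k1, k2, p1, p2), which is the model OpenCV documents; the numerical values of K and of the coefficients are invented for the example, and the function name `project` is hypothetical.

```python
import numpy as np


def project(point_3d, K, dist):
    """Project a 3D point in the camera frame to pixel coordinates (u, v)."""
    k1, k2, p1, p2 = dist
    # Normalised image coordinates (pinhole model, before distortion).
    x, y = point_3d[0] / point_3d[2], point_3d[1] / point_3d[2]
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2 * r2
    # Radial and tangential distortion.
    xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    # The intrinsic matrix maps distorted normalised coordinates to pixels.
    u, v, _ = K @ np.array([xd, yd, 1.0])
    return u, v


K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
ideal = project((0.5, 0.25, 1.0), K, (0.0, 0.0, 0.0, 0.0))
distorted = project((0.5, 0.25, 1.0), K, (-0.2, 0.0, 0.0, 0.0))
print(ideal, distorted)
```

Calibration is the inverse problem: from detected chessboard corners, it searches for the K and the coefficients that make such projections match the observations, and undistortion then removes the effect of the coefficients from each picture.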

2.1.4 Technical choices and precisions

The technical choices were very limited because the whole cycab-vision platform was developed in C++.
As mentioned just before, I also had to use the OpenCV library because it fitted my work well.

All the plugins (or processors in the graph) I designed also had to be developed in C++, as dynamic libraries (.so files), so that the platform can load them.

However, I was quite free in the design of the camera calibration GUI, so I could put into practice the GUI lessons I attended during the 2nd year at ENSIMAG.

2.2 Internship organization

2.2.1 Planning

June 15th – July 13th: Reading documents about "structure from motion"; design and development of the first processors for the platform (optical flow computation); learning how to use the OpenCV library

July 24th – August 4th: Design and development of the camera calibration GUI

August 7th – September 1st: Bug fixing on the first two parts, and last step of the "structure from motion" process: design and development of the ego-motion and 3D reconstruction processors for the platform

September 4th – September 15th: Trying to extend the "structure from motion" model to more than 2 pictures, and co-writing an article with Christophe BRAILLON for the ICRA 2006 conference

It is important to note that I reached the initial goal (implementing simple "structure from motion" algorithms) 2 weeks before the end of the internship, so I could carry out some additional work (the improvement of the model, the co-writing of the article).

2.2.2 Publication

As I said in the planning section, I was lucky enough to contribute to the writing of an article for the ICRA 2006 conference 2.

I succeeded in writing a section about my work but, because the other co-authors lacked time to finish, the article will not be published before the end of the year.

However, you can find what I wrote in the appendix, which also gives technical and theoretical details for this report.

2 http://www.icra2006.org


2.3 Relationship skills acquired

I am very satisfied with the relationship skills I acquired while working at INRIA Rhône-Alpes.

I really felt that all my work was useful for the team and its design of the Cycab robot, especially because my internship tutors explained to me how they will use my work in the near future.

Before the internship, working in research seemed to me to be an individual task, but I came to realize how the other students in the team (PhD or not) were always ready to help me and answer my questions on several topics.
The organisation of a research unit or a project team is quite different from that of a classical company, but I can say that I improved my relationship skills during this experience.
This may be due to working in a team, and I hope it will help me in my future internships and jobs.


Bibliography

[ST94] Jianbo Shi and Carlo Tomasi: Good features to track. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'94), Seattle, June 1994.

[Stu01] Peter Sturm: Auto-calibrage de caméras et reconstruction 3-d à partir d'images. In Actes des Journées Nationales de la Recherche en Robotique, Hyères, France, October 2001.

[TK91] Carlo Tomasi and Takeo Kanade: Detection and tracking of point features. Technical report CMU-CS-91-132, Carnegie Mellon University, April 1991.

[Zuc] Marco Zucchelli: Optical flow based structure from motion.


Chapter 3

Appendix

ICRA 2006 article excerpt

The article was entitled "Stereo and structure from motion based occupancy grid".

The co-authors were Christophe Braillon 1, Guillaume Cernier 1, Kane Usher 2, James L. Crowley 1 and Christian Laugier 1.

1 Laboratoire GRAVIR, INRIA Rhône-Alpes, 655 avenue de l'Europe, 38334 Saint Ismier Cedex, France. Email: [email protected]
2 CSIRO ICT Centre, Autonomous Systems Lab, 1 Technology Court, Pullenvale QLD 4069, Australia. Email: [email protected]

The article has not yet been submitted to ICRA 2006.

This excerpt clarifies and details the different steps of "structure from motion".

Optical flow obstacle detection

The optical flow-based obstacle detection consists of estimating the 3D structure of the shapes surrounding the camera by tracking interesting points in 2 (or more) pictures taken at distinct locations (because of the ego-motion of the camera).
The optical flow between 2 pictures $I_1$ and $I_2$ (that is to say a set of point pairs) makes it possible to evaluate both the ego-motion of the camera between the 2 snapshots and, thanks to the intrinsic parameters of the camera ($3 \times 3$ matrix $K$, assumed known and constant during the motion), the set of 3D points corresponding to each pair in the original pictures.


Optical flow computation

The optical flow computation consists of establishing point pairs between only 2 pictures (the result can thus be represented as a vector field). This computation therefore has to be repeated whenever a new picture comes from the camera.

An interesting way to lighten the computation of the optical flow is to track only the features which carry the most information. The "Good Features to Track" method used in our work was described by Jianbo Shi and Carlo Tomasi [ST94].
One of the advantages of their algorithm is that the maximal number of features to track can be set, which gives control over the trade-off between computation cost and performance.

The features thus found are then tracked in the next image by a pyramidal optical flow method. The one used in our work is the Lucas & Kanade method. After this computation, some filtering is also needed because not all the points of the first picture are found in the second.

Therefore, the vector field thus obtained (vectors linking 1st picture points to 2nd picture points) is sparse, not because the resolution of a dense field was reduced but because only the motion of the best features was selected. This brings the two advantages of a faster and more accurate computation of the "structure from motion".
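The principle behind feature selection and tracking can be sketched with NumPy alone. The function below performs a single Lucas-Kanade step at a single resolution for one window: it solves the 2 × 2 system $G d = b$ built from the image gradients, and it returns the smallest eigenvalue of $G$, which is the quantity the Shi-Tomasi criterion uses to decide whether a feature is good to track. This is not the implementation used in our work (which relies on OpenCV and on a pyramidal, iterated version of this step); the function name, the window size and the synthetic images are assumptions made for the example.

```python
import numpy as np


def lucas_kanade_step(img1, img2, x, y, half_win=7):
    """One Lucas-Kanade step for the window centred on pixel (x, y).

    Returns (d, score): d is the estimated displacement (dx, dy) of the
    feature from img1 to img2, score is the smallest eigenvalue of the
    gradient matrix G (Shi-Tomasi "good feature" measure).
    """
    gy, gx = np.gradient(img1)              # derivatives along rows (y) and columns (x)
    win = (slice(y - half_win, y + half_win + 1),
           slice(x - half_win, x + half_win + 1))
    ix, iy = gx[win].ravel(), gy[win].ravel()
    it = (img1[win] - img2[win]).ravel()    # temporal difference I1 - I2
    G = np.array([[ix @ ix, ix @ iy],
                  [ix @ iy, iy @ iy]])
    b = np.array([ix @ it, iy @ it])
    d = np.linalg.solve(G, b)               # least-squares displacement
    score = np.linalg.eigvalsh(G)[0]        # eigenvalues come in ascending order
    return d, score


def pattern(x, y):
    """Smooth synthetic texture used instead of a real picture."""
    return np.sin(0.3 * x) + np.cos(0.25 * y) + 0.5 * np.sin(0.2 * (x + y))


# Second image = first image whose content moved by a known sub-pixel motion.
ys, xs = np.mgrid[0:40, 0:40].astype(float)
true_d = np.array([0.2, -0.1])
img1 = pattern(xs, ys)
img2 = pattern(xs - true_d[0], ys - true_d[1])

d, score = lucas_kanade_step(img1, img2, x=20, y=20)
print(d, score)
```

A window with a very small score has a nearly singular $G$, so its displacement cannot be estimated reliably; this is why only the best features are kept. The linearisation is only valid for small motions, which is what the pyramidal scheme compensates for.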

Structure from motion

Once the optical flow has been computed, the "structure from motion" process can be divided into several steps:

Fundamental matrix computation
The fundamental matrix $F$ is defined by:

\[ p'^{T} F p = 0 \]

for all point pairs $(p \in I_1, p' \in I_2)$ detected by the previous optical flow computation.

$F$ is a $3 \times 3$ matrix, $F = (f_{i,j})_{(i,j) \in [1,3]^2}$.
Assuming that we have $n$ point pairs $(p^i, p'^i)$, we can find the 9 coefficients $f_{i,j}$ by solving the system

\[ A f = 0 \]

where $f$ is the vector of the unknown components of $F$, $f = (f_{1,1}, f_{1,2}, f_{1,3}, f_{2,1}, f_{2,2}, f_{2,3}, f_{3,1}, f_{3,2}, f_{3,3})^T$, and $A$ is the $n \times 9$ matrix whose $i$-th row ($i \in [1,n]$) is:

\[ A_i = \left( p^i_1 p'^i_1,\; p^i_2 p'^i_1,\; p^i_3 p'^i_1,\; p^i_1 p'^i_2,\; p^i_2 p'^i_2,\; p^i_3 p'^i_2,\; p^i_1 p'^i_3,\; p^i_2 p'^i_3,\; p^i_3 p'^i_3 \right) \]

This system is solved by a least squares method, that is to say $f$ is chosen such that:

\[ f = \arg\min_f \|Af\|^2 \quad \text{subject to } \|f\| = 1 \]
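A minimal NumPy sketch of this computation is given below. It builds $A$ row by row as defined above and takes as $f$ the right singular vector of $A$ associated with its smallest singular value, which is the unit vector minimising $\|Af\|$. The function name and the synthetic scene (camera matrix, motion, random points) are assumptions made for the example; refinements such as coordinate normalisation or outlier rejection are not shown.

```python
import numpy as np


def fundamental_from_pairs(p, p_prime):
    """Estimate F from n >= 8 pairs of homogeneous points, arrays of shape (n, 3)."""
    # Row i of A holds the products p'_j * p_k, ordered like f = (f11, f12, ..., f33).
    A = np.einsum("ij,ik->ijk", p_prime, p).reshape(len(p), 9)
    # The unit vector minimising ||A f|| is the last right singular vector of A.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 3)


# Synthetic scene: 3D points seen by two cameras P1 = K [I | 0], P2 = K R [I | -t].
rng = np.random.default_rng(0)
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
angle = 0.05
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.3, 0.0, 0.1])
x1 = (K @ X.T).T
x2 = (K @ R @ (X - t).T).T
p = x1 / x1[:, 2:]              # homogeneous pixel coordinates (u, v, 1)
p_prime = x2 / x2[:, 2:]

F = fundamental_from_pairs(p, p_prime)
residuals = np.einsum("ij,jk,ik->i", p_prime, F, p)   # p'^T F p for each pair
print(np.abs(residuals).max())
```

On noise-free data the residuals $p'^T F p$ vanish up to rounding errors; with real optical flow they do not, which is why a least squares formulation is needed.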


Ego-motion estimation
Once we have computed the fundamental matrix $F$, and because we know the intrinsic matrix $K$, we can evaluate the ego-motion of the camera between the two snapshots in order to reconstruct 3D points as in the two-camera case.
A location and orientation of the camera may be described by a pair $(R, t)$ made of a rotation matrix and a translation vector.
Computing the ego-motion is equivalent to finding the rotation matrix $R = R_2$ and the translation vector $t = t_2$ of the camera between the 1st picture $I_1$ and the 2nd one $I_2$ (the 1st snapshot can be taken as the point of view such that $R_1 = I_3$ and $t_1 = 0$ in a fixed reference frame).

To find $R$ and $t$, we use the essential matrix $E$, which is defined by $E = K_2^T F K_1$, so that in our case:

\[ E = K^T F K \]

because the intrinsic parameters are constant during the motion.
Like any real matrix, $E$ admits a singular value decomposition (SVD). We denote by $U$ and $V$ the orthogonal matrices such that there exists a diagonal matrix $D = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3)$ with $\lambda_1 \geq \lambda_2 \geq \lambda_3$ and:

\[ E = U D V^T \]

Ego-motion methods then state that, mathematically speaking, there are two solutions for the $(R, t)$ pair. Indeed, the two possible values for $R$ are:

\[ R^{(1)} = U \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} V^T \quad \text{and} \quad R^{(2)} = U \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} V^T \]

For $t$ there is a single candidate:

\[ t = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \]

where $x$, $y$, $z$ are deduced from the matrix $T$ such that

\[ T = V \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} V^T = \begin{pmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{pmatrix} \]

The problem now is how to choose the matrix $R$ among the candidates $R^{(1)}$ and $R^{(2)}$.


The solution is to choose the candidate nearest (according to the L2 norm) to the $3 \times 3$ identity matrix $I$.
We choose it because the other candidate is always very different from the identity matrix $I$ and cannot fit our problem, given the smallness of the motion between two snapshots (and especially the very small amplitude of the rotations).
However, the computation sometimes shows that neither candidate is near enough to the identity (for example when $\min(\|R^{(1)} - I\|, \|R^{(2)} - I\|) > 0.5$). This means that one of them is quite near $-I$, so we replace $R^{(1)}$, $R^{(2)}$ and also $T$ by their opposites, and choose as $R$ the element of $\{R^{(1)}, R^{(2)}\}$ which minimizes the distance to $I$.
After this step, knowing the ego-motion of the camera between the two snapshots and the projection of each point pair on the 2 pictures, it is now possible to estimate the 3D position of the tracked features.
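The ego-motion step can be sketched as follows with NumPy. The sketch follows the selection rule described above (nearest candidate to the identity, with a sign flip when both candidates are farther than 0.5), under some assumptions: the essential matrix is formed as $E = K^T F K$, which is the form consistent with $p'^T F p = 0$ written in pixel coordinates; the Frobenius norm is used as the matrix distance; and the function name and the synthetic motion are invented for the example. The translation is recovered only as a unit vector, and its sign is not resolved here.

```python
import numpy as np

W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
Z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])


def ego_motion(F, K, threshold=0.5):
    """Recover (R, t) from F and K, for cameras K [I | 0] and K R [I | -t]."""
    E = K.T @ F @ K
    U, _, Vt = np.linalg.svd(E)
    candidates = [U @ W.T @ Vt, U @ W @ Vt]
    T = Vt.T @ Z @ Vt                        # skew-symmetric matrix built from t
    dist = [np.linalg.norm(Rc - np.eye(3)) for Rc in candidates]
    if min(dist) > threshold:                # both far from I: take the opposites
        candidates = [-Rc for Rc in candidates]
        T = -T
        dist = [np.linalg.norm(Rc - np.eye(3)) for Rc in candidates]
    R = candidates[int(np.argmin(dist))]
    t = np.array([T[2, 1], T[0, 2], T[1, 0]])
    return R, t


def skew(v):
    """Matrix of the cross product by v."""
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])


# Synthetic check: small rotation about the vertical axis, mostly sideways motion.
angle = 0.05
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([0.3, 0.0, 0.1])
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K)
F = K_inv.T @ R_true @ skew(t_true) @ K_inv   # exact F for this motion

R, t = ego_motion(F, K)
print(R, t)
```

Because $F$ is only defined up to scale, the length of the translation cannot be recovered from the images alone: the reconstructed scene is known up to a global scale factor.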

3D scene reconstruction
Because detecting obstacles in front of the camera is the main aim, the 3D scene reconstruction is always carried out in a reference frame attached to the camera. More precisely, each two-picture-based 3D reconstruction is carried out in a reference frame attached to the camera at its first location.

The final problem raised by the "structure from motion" process can be summarized as follows. Because we know the camera positions at the two snapshots and the intrinsic parameters, we know the 2 projection matrices (linking real object points to their projections in the pictures) $P_1$ and $P_2$:

\[ P_1 = K_1 R_1 \begin{pmatrix} I_{3 \times 3} & -t_1 \end{pmatrix} = K \begin{pmatrix} I_{3 \times 3} & 0 \end{pmatrix} \]

(because we choose $R_1 = I_{3 \times 3}$ and $t_1 = 0$ in the reference frame attached to the first camera position), and

\[ P_2 = K_2 R_2 \begin{pmatrix} I_{3 \times 3} & -t_2 \end{pmatrix} = K R \begin{pmatrix} I_{3 \times 3} & -t \end{pmatrix} \]

(with the notations of the previous paragraph).

How can we find the unknown 3D point $Q$ corresponding to its projections $q_1$ and $q_2$ on the two pictures ($Q$, $q_1$ and $q_2$ are in projective space, that is to say $Q \in \mathbb{R}^4$ and $q_i \in \mathbb{R}^3$)?
The problem is therefore to find $Q$ satisfying:

\[ P_1 Q \sim q_1 \quad (1) \]
\[ P_2 Q \sim q_2 \quad (2) \]


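$q_i$ ($i = 1, 2$) can be written as

\[ q_i = \begin{pmatrix} s_i u_i \\ s_i v_i \\ s_i \end{pmatrix} \]

(where $s_i$ represents the scale factor in projective space).
So the previous equations can be rewritten as:

\[ \begin{cases} (P_i Q)_1 = s_i u_i \\ (P_i Q)_2 = s_i v_i \\ (P_i Q)_3 = s_i \end{cases} \;\Rightarrow\; \begin{cases} (P_i Q)_1 = u_i (P_i Q)_3 \\ (P_i Q)_2 = v_i (P_i Q)_3 \end{cases} \]

Writing the rows of $P_i$ as

\[ P_i = \begin{pmatrix} p_i^{1\,T} \\ p_i^{2\,T} \\ p_i^{3\,T} \end{pmatrix} \]

the problem is strictly equivalent to:

\[ A Q = 0_{\mathbb{R}^4} \]

where

\[ A = \begin{pmatrix} u_1 p_1^{3\,T} - p_1^{1\,T} \\ v_1 p_1^{3\,T} - p_1^{2\,T} \\ u_2 p_2^{3\,T} - p_2^{1\,T} \\ v_2 p_2^{3\,T} - p_2^{2\,T} \end{pmatrix} \]

Once again, we use a least squares method to find $Q$ minimizing the quantity $\|AQ\|^2$ subject to $\|Q\| = 1$.

NB: The vector $Q$ has to be divided by its scale factor ($Q_4$) so that $(Q_1, Q_2, Q_3)^T$ represents the real 3D coordinates of the reconstructed point.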
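This last step can be sketched with NumPy as follows. The function builds the $4 \times 4$ matrix $A$ from the two projection matrices and the two pixel positions, takes the right singular vector of $A$ associated with its smallest singular value as $Q$, and divides by $Q_4$. The function name and the synthetic cameras are assumptions made for the example, which uses exact projections of a known point.

```python
import numpy as np


def triangulate(P1, P2, q1, q2):
    """Linear triangulation of one point from its pixel positions q1=(u1, v1), q2=(u2, v2)."""
    A = np.array([
        q1[0] * P1[2] - P1[0],     # u1 * p1^3 - p1^1
        q1[1] * P1[2] - P1[1],     # v1 * p1^3 - p1^2
        q2[0] * P2[2] - P2[0],     # u2 * p2^3 - p2^1
        q2[1] * P2[2] - P2[1],     # v2 * p2^3 - p2^2
    ])
    _, _, vt = np.linalg.svd(A)
    Q = vt[-1]                     # unit vector minimising ||A Q||
    return Q[:3] / Q[3]            # divide by the scale factor Q4


# Synthetic cameras: P1 = K [I | 0] and P2 = K R [I | -t].
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
angle = 0.05
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.3, 0.0, 0.1])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ R @ np.hstack([np.eye(3), -t.reshape(3, 1)])

# Project a known point in both pictures, then reconstruct it.
Q_true = np.array([0.5, -0.2, 5.0])
h1 = P1 @ np.append(Q_true, 1.0)
h2 = P2 @ np.append(Q_true, 1.0)
q1, q2 = h1[:2] / h1[2], h2[:2] / h2[2]

Q_est = triangulate(P1, P2, q1, q2)
print(Q_est)
```

The reconstructed coordinates are expressed in the reference frame of the first camera position, as stated above. With a translation recovered only up to scale, the reconstructed depths share that unknown global scale factor.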