Combination of a low cost GPS with visual localization based on a previous map for
outdoor navigation
Cristina Gamallo, Pablo Quintía, Roberto Iglesias-Rodríguez
Department of Electronics and Computer Science
Universidade de Santiago de Compostela
Santiago de Compostela, Spain
{cristina.gamallo; pablo.quintia; roberto.iglesias.rodriguez}@usc.es
Jacobo V. Lorenzo, Carlos V. Regueiro
Department of Electronics and Systems
Universidade da Coruña
A Coruña, Spain
Abstract—In this work we propose the combination of a low cost GPS with a vision based localization system. Both types of localization complement each other, providing better precision and robustness than using them separately. The visual localization compares traversable regions detected by a camera with regions previously labeled in a map. We create this map by hand from images taken from Google Maps by labelling those regions passable by the robot (e.g. pavement). For the integration of both localization systems we propose the use of a particle filter. We obtained promising preliminary results taken in the surroundings of the Faculty of Computer Science of the University of A Coruña.
Keywords-Visual localization; particle filter; mobile robotics
I. INTRODUCTION
One of the main objectives in mobile robotics is to obtain
autonomous vehicles capable of intelligent navigation, aimed
at reducing human intervention in routine tasks or under
conditions that are dangerous or impossible for humans (e.g. toxicity,
extreme climate). Accurate robot localization is essential
to achieve this goal. A great variety of localization
techniques have been proposed in recent years. Their approaches
vary significantly depending on the environment the robot
will operate in, the sensors available, the knowledge about
the environment and the task.
In general, the position can be obtained from internal
and external sensors. The former provide a continuous and
instantaneous localization relative to the initial position of the
robot and without any knowledge about the environment;
this group includes the odometry of the robot and
inertial navigation systems. External sensors provide an
absolute positioning by comparing their measurements with
external elements in the environment such as beacon systems,
active or passive marks and satellites. The GPS is the most
widely used of these external sensors.
This work studies the feasibility of localizing a robot
in outdoor environments using low cost
sensors: a GPS and a camera. We will use a particle filter
to estimate the position of the robot from the information
provided by both sensors.
Of all the information contained in the images from
the camera, what is relevant for our system is to
distinguish the floor, passable by the robot, from the rest
of the image. Our process takes advantage of the characteristic
appearance of these regions (e.g. pavement) compared with
the appearance of those areas where the robot cannot run.
The image is segmented using a thresholding technique.
Compared to other proposed localization techniques, our
system is focused on low cost and simplicity. For example,
Real-Time Kinematic GPS provides centimeter accuracy, but
at hundreds of times the price of our system. Furthermore,
such precision is unnecessary for robot navigation.
II. LOCALIZATION SYSTEM
Our system is based on a particle filter, implemented
following the Monte Carlo Localization algorithm proposed
by Thrun in [1] (Alg. 1). Particle filters are probabilistic
techniques that allow the combination of the
sensor readings with information related to the movement
of the robot.
Algorithm 1 MCL Algorithm
for all m do
    x_t^m = motion_model(u_t, x_{t-1}^m)
    w_t^m = observation_model(z_t, x_t^m, Map)
    X_t = X_t + ⟨x_t^m, w_t^m⟩
end for
X_t = resample_model(X_t)
return X_t
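As a reference, a minimal Python sketch of one iteration of this loop follows. The motion_model, observation_model and resample arguments are placeholders for the models described in the rest of this section; this is an illustration of the algorithm structure, not our exact implementation.

```python
import numpy as np

def mcl_step(particles, u_t, z_t, grid_map,
             motion_model, observation_model, resample):
    """One iteration of the MCL loop of Alg. 1 (sketch).

    particles : (M, 3) array of poses (x, y, theta)
    u_t       : odometry reading between t-1 and t
    z_t       : current observation (camera grid and GPS fix)
    grid_map  : binary map of traversable regions
    The three model functions are placeholders for the models
    described in the remainder of this section.
    """
    M = len(particles)
    new_particles = np.empty_like(particles)
    weights = np.empty(M)
    for m in range(M):
        # Propagation: sample a new pose from the motion model
        new_particles[m] = motion_model(u_t, particles[m])
        # Anchoring: weight the hypothesis against the current observation
        weights[m] = observation_model(z_t, new_particles[m], grid_map)
    weights /= weights.sum()  # normalize weights
    # Resampling: draw a new particle set according to the weights
    return resample(new_particles, weights)
```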
The particle filter represents the posterior density with a
discrete set of M particles (x^1, ..., x^M) and their associated
weights (w^1, ..., w^M). At each time instant t, the M
particles are updated according to the motion u_t (propagation
stage) and the current observation z_t (anchoring stage). In
a third stage, a resampling of the particle set takes place
(Figure 1), after which the process is repeated and the estimated
result is generated. The advantage of these algorithms is that they are able
to filter out the noise present in the different sources of information.
Figure 1: General schema with the main stages in a particle filter.
Figure 2: Odometry motion model: robot motion is approximated in three steps: rotation δrot1, translation δtrans and rotation δrot2.
One of the main advantages over other, faster filters (e.g.
the Kalman filter) is that particle filters can deal with ambiguous
and/or inaccurate situations, for example when several
results are possible and only with time and more information
can the uncertainty be reduced and the correct result found.
In our case, each particle represents a position (x, y) and orientation (θ) in the map. If the initial position is
known, the particles are distributed around this position;
otherwise they are randomly distributed in the environment,
each with probability 1/M. The first stage of the algorithm is to
apply the motion model $p(x_t \mid u_t, x_{t-1})$ to each one of the M
particles (Figure 1). The newly generated particles represent
the prediction of the state variable according to the executed
motion and without considering the observation. We tested
both the velocity motion model and the odometry motion
model [1]. The latter (Figure 2) provided the best results
both in simulation tests and in real experiments.
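For illustration, a sketch of the odometry motion model of [1] is shown below. The alpha noise parameters are illustrative values, not the ones tuned in our experiments.

```python
import numpy as np

def sample_odometry_motion_model(u_t, pose, alphas=(0.05, 0.05, 0.05, 0.05)):
    """Sample a new pose given u_t = (previous odometry pose, current odometry pose).

    The motion is decomposed into an initial rotation, a translation and a
    final rotation, each perturbed with zero-mean Gaussian noise [1].
    The alpha noise parameters are illustrative, not the tuned values."""
    (x0, y0, th0), (x1, y1, th1) = u_t
    a1, a2, a3, a4 = alphas

    d_rot1 = np.arctan2(y1 - y0, x1 - x0) - th0
    d_trans = np.hypot(x1 - x0, y1 - y0)
    d_rot2 = th1 - th0 - d_rot1

    # Perturb each component; the noise grows with the amount of motion
    rot1 = d_rot1 + np.random.normal(0.0, a1 * abs(d_rot1) + a2 * d_trans)
    trans = d_trans + np.random.normal(0.0, a3 * d_trans + a4 * (abs(d_rot1) + abs(d_rot2)))
    rot2 = d_rot2 + np.random.normal(0.0, a1 * abs(d_rot2) + a2 * d_trans)

    # Apply the perturbed motion to the particle pose
    x, y, th = pose
    x += trans * np.cos(th + rot1)
    y += trans * np.sin(th + rot1)
    th += rot1 + rot2
    return np.array([x, y, th])
```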
The second stage of a particle filter, the observation
model $p(z_t \mid x_t)$ (Figure 1), consists in evaluating the predictions
provided by the motion model against the observed
information. In our system we have two available sensors
(vision and GPS), hence our observation model will be a
combination of the models of each sensor. The objective is
to use the GPS information to filter out those hypotheses that are
clearly invalid, and leave the fine tuning to the information
provided by the vision system. The final model is the
product of both.
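A minimal sketch of this combination is given below, assuming each per-sensor model (sketched further on in this section) returns a likelihood for a particle pose; the z_t fields are hypothetical names for the two readings.

```python
def observation_model(z_t, pose, grid_map):
    """Combine both sensors (sketch): the GPS term prunes clearly invalid
    hypotheses, while the vision term provides the fine adjustment.
    gps_observation_model and vision_observation_model are sketched
    further below; z_t.gps and z_t.camera_grid are assumed field names."""
    w_gps = gps_observation_model(z_t.gps, pose)
    w_vision = vision_observation_model(z_t.camera_grid, pose, grid_map)
    return w_gps * w_vision
```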
Figure 3: Schema of the vision observation model. Left, predicted image for a particle according to the previous map of the environment. Right, image from the camera. The big square (in blue) highlights the pixels that will be processed and the grid (in red) shows the discretization to 16 values used in the system.
The vision observation model is shown in Figure 3. Using
the map of the environment, and the position and orientation
codified in each particle, we can generate a synthetic image
of what the robot should see from the predicted position
(Figure 3(a)). In order to increase the robustness and the
performance of our system, we only consider the part
of the image with the lowest distortion, in particular the
grid of 16 values located at the bottom of the image (blue
box in Figure 3(b)). Comparing the predicted image with
the real observation (Figure 3(b)), the model calculates the
weight of each particle. As the images are binary (white
represents traversable terrain, black non-traversable terrain)
the comparison between the 16 values of each image is
straightforward. In the example of Figure 3 there would be
a mismatch between the image predicted from the particle
and the real image obtained, so the weight of
the particle should be diminished. The building of the map
and the processing of the real images, stages that are prior
to this process, are explained in the next section.
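A minimal sketch of this grid comparison follows, assuming the 16 cells form a 4×4 binary array (1 = traversable); render_expected_grid stands in for the synthetic-image step and eps is an assumed smoothing constant to avoid zero weights.

```python
import numpy as np

def vision_observation_model(observed_grid, pose, grid_map, eps=0.05):
    """Weight a particle by comparing the 16-cell binary grid predicted
    from the map at the particle pose with the grid observed by the camera.
    render_expected_grid is a placeholder for the synthetic-image step."""
    predicted_grid = render_expected_grid(grid_map, pose)  # 4x4 binary array
    matches = np.sum(predicted_grid == observed_grid)      # 0..16 matching cells
    # More matching cells -> higher weight; eps keeps weights strictly positive
    return eps + (1.0 - eps) * matches / observed_grid.size
```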
For the observation model of the GPS we use the
absolute position readings provided by the GPS receiver.
We compare them to the position of each particle,
assigning the highest weights to those particles that are
closest to the position provided by the GPS.
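For instance, a Gaussian weighting on the distance between each particle and the GPS fix behaves this way; the 3 m standard deviation is an assumed figure for a low cost receiver, not a value measured in this work.

```python
import numpy as np

def gps_observation_model(gps_xy, pose, sigma=3.0):
    """Weight a particle by its Euclidean distance to the GPS fix, already
    converted to map coordinates (metres). sigma is an assumed receiver error."""
    dist = np.hypot(pose[0] - gps_xy[0], pose[1] - gps_xy[1])
    return np.exp(-0.5 * (dist / sigma) ** 2)
```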
The resampling stage creates a new set of
particles from the distribution of the weights. Particles
with small weights are likely to be undersampled, while
those with bigger weights will probably be sampled several
times. By doing this, the most valuable part of the state space
becomes more densely populated while the a priori less promising
parts are abandoned. To avoid an excessive reduction in the number
of different particles, the resampling only takes place
when the particle distribution has deteriorated by a certain
amount, specifically when $M_{eff} < \text{threshold}$, where $M_{eff} = 1/\sum_{i=1}^{M} (w_t^i)^2$. The threshold is set to $\frac{5}{6}M$. In this implementation
we keep the total number of particles constant,
although it could be changed according to the characteristics of
the weight distribution (the Kullback-Leibler divergence method,
or KLD) and random particles could be generated [2], [3].
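A sketch of this selective resampling follows; systematic (low variance) resampling is one common choice and is used here only for illustration, not necessarily the exact variant of our implementation.

```python
import numpy as np

def resample_if_needed(particles, weights):
    """Resample only when the effective sample size
    M_eff = 1 / sum_i (w_i)^2 falls below the 5M/6 threshold."""
    M = len(weights)
    weights = weights / weights.sum()          # make sure weights are normalized
    m_eff = 1.0 / np.sum(weights ** 2)
    if m_eff >= (5.0 / 6.0) * M:
        return particles, weights              # distribution still healthy
    # Systematic resampling: larger weights are drawn proportionally more often
    positions = (np.arange(M) + np.random.uniform()) / M
    idx = np.searchsorted(np.cumsum(weights), positions)
    return particles[idx].copy(), np.full(M, 1.0 / M)
```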
The estimation of the position $x_t$ is calculated as the weighted mean of all particles, where each position and orientation is weighted by the particle weight: $x_t = \left(\sum_{i=1}^{M} w_t^i\, x_t^i\right) / \left(\sum_{i=1}^{M} w_t^i\right)$.
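Since orientations wrap around, the weighted mean of θ is best computed on the unit circle; a sketch, assuming particle poses stored as (x, y, θ):

```python
import numpy as np

def estimate_pose(particles, weights):
    """Weighted mean of the particle set; theta is averaged on the unit
    circle to avoid wrap-around problems near +/- pi."""
    w = weights / weights.sum()
    x = np.sum(w * particles[:, 0])
    y = np.sum(w * particles[:, 1])
    theta = np.arctan2(np.sum(w * np.sin(particles[:, 2])),
                       np.sum(w * np.cos(particles[:, 2])))
    return np.array([x, y, theta])
```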
III. VISION MODEL
The application of the vision observation model summarized
in Figure 3 requires building a prior map in
which the passable regions are stored, and processing
the acquired images to detect those regions.
The first step to obtain the map for the visual localization
model is to take an aerial photograph from Google Maps
at the maximum available resolution. Secondly, we
process the image to remove irrelevant information (e.g.
using a Canny edge detection filter) and binarize it by hand,
so that traversable areas (e.g. pavement) are white and the
rest black. This process is hard to automate, since in
most cases there are occlusions that make the extraction of
the map difficult, although it is fairly simple for a human
familiar with the environment. This method was chosen
because it allows us to take any map of the world from
Google Maps and adapt it to our system in a few steps.
Moreover, since the localization system does not require
very accurate maps, it would even be possible to work with
hand-made maps. The only important requirement is that the map
allows the position of the robot to be anchored to its environment.
This is the main advantage of using the camera instead of
the GPS, even if it implies a loss of accuracy. Figure
4 shows all the steps in the map extraction process. The
environment corresponds to a square next to the Faculty
of Computer Science of the University of A Coruña (lat.
43.332603, lon. -8.411884), the size of which is 30×35 m.
To convert the GPS coordinates (in degrees)
to map coordinates (in meters) we need to calibrate the
system. Several points were selected in the environment
and their coordinates were obtained both from Google Maps
and from the map. The final conversion factors are that
0.0000125 degrees of longitude or 0.0000090415 degrees of
latitude correspond to 1.0 meter in the map. This stage
also corrects errors that can be found in the map used (in
some cases Google Maps has errors of several meters).
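This conversion reduces to a scaling around a reference point; a sketch with the factors above, where (lat0, lon0) is whichever map origin was chosen during calibration (a parameter here, not a value reported in the paper):

```python
def gps_to_map(lat, lon, lat0, lon0):
    """Convert a GPS fix (degrees) to local map coordinates (metres) using
    the calibration of this environment: 0.0000125 deg of longitude and
    0.0000090415 deg of latitude per metre. (lat0, lon0) is the map origin."""
    x = (lon - lon0) / 0.0000125      # metres east of the origin
    y = (lat - lat0) / 0.0000090415   # metres north of the origin
    return x, y
```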
The different appearance of those regions that can be traversed
by the robot compared with the appearance of those
areas that are not traversable simplifies the detection of
the passable areas, so a thresholding technique can be used.
We tried several classic techniques [4]: application of
a fixed threshold, the Otsu method [5] and
dynamic thresholding (Figure 5). The best results
were obtained with a fixed threshold of 127; lowering the
threshold to 85 we start to observe anomalous behaviours (the
image goes black).
As can be seen in Figure 5, the biggest problem was the
strong and sudden changes in light, mainly those caused by
shadows (the experiment was performed on a sunny day at
midday; for example, see the first, second, sixth and seventh
rows in Figure 5). The fact that the robot traverses different
types of pavement with different light reflection properties
made this situation worse (for example, see the first, fourth and
fifth rows in Figure 5).

Figure 5: Seven examples of images in the test environment (rows). In the columns, from left to right: original image, thresholding at 85, thresholding at 127, dynamic thresholding, and Otsu method.
Because the processing of the image provides no clear
differentiation between what is traversable and what is not (Figure
3(b)), only a small region of the image is considered.
This region is discretized into a grid of 16 cells. Each cell
is considered free if it contains more white pixels than
black pixels; otherwise it is considered occupied.
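As an illustration, the fixed threshold at 127 and the 16-cell discretization could be written as follows with OpenCV, assuming the 16 cells form a 4×4 grid; the region-of-interest coordinates are placeholders, since the exact crop depends on the camera mounting.

```python
import cv2
import numpy as np

def image_to_grid(gray_image, roi, threshold=127):
    """Binarize the image with a fixed threshold and reduce the region of
    interest to a 4x4 grid of free/occupied cells.

    roi = (x, y, w, h) is the low-distortion region at the bottom of the
    image; its exact values depend on the camera mounting (placeholder)."""
    x, y, w, h = roi
    _, binary = cv2.threshold(gray_image, threshold, 255, cv2.THRESH_BINARY)
    patch = binary[y:y + h, x:x + w]
    grid = np.zeros((4, 4), dtype=np.uint8)
    for i in range(4):
        for j in range(4):
            cell = patch[i * h // 4:(i + 1) * h // 4,
                         j * w // 4:(j + 1) * w // 4]
            # A cell is free (1) if white (traversable) pixels are the majority
            grid[i, j] = 1 if np.mean(cell) > 127 else 0
    return grid
```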
Figure 4: Extraction of the map for visual localization. a) original image; b) Canny filter; c) map.
Table I: Error in the position estimate according to the size of the particle set.

# particles                      θ (rad)   Position (m)
200,000   Mean                   0.043     0.424
          Std. deviation         0.039     0.221
          Max                    0.222     0.854
20,000    Mean                   0.883     0.433
          Std. deviation         0.406     0.293
          Max                    0.235     0.955
2,000     Mean                   0.087     0.417
          Std. deviation         0.446     0.281
          Max                    0.262     0.957
200       Mean                   0.074     0.360
          Std. deviation         0.365     0.257
          Max                    0.325     0.951
Table II: Execution time according to the size of the particle set.
# particles Time/Iteration (s)
200,000 0.951
20,000 0.095
2,000 0.014
200 0.002
IV. SIMULATION TESTS
We performed several preliminary experiments using the
Stage simulator [6] with the aim of validating the filter and
adjusting its parameters. The use of the simulator allows
us to perform experiments under controlled conditions,
so that the exact path of the robot is known and can be
compared with the simulated odometry and the filter output.
These tests are only meant to check the applicability of the
algorithm before testing it in a real environment.
In Tables I and II we can see the results for different sizes
of the particle set. We have focused on determining the
execution time of each iteration and the accuracy of the
estimate according to the size of the particle set. For
each measurement we performed 10 experiments. From the
results obtained we can infer that the optimal size of the
particle set is 2,000 particles, since it provides a good
balance between the mean error and the execution time.

Figure 6: Mobile robot used in the experiments: a Pioneer P3-DX equipped with two STOC cameras and a home-made 3D laser.
V. OUTDOOR EXPERIMENTAL RESULTS
In the experiments described in this section we used
the mobile robot Pioneer P3-DX from MobileRobots, the
747 A+ GPS Trip Recorder GPS receiver, and a STOC stereo
camera from Videre Design (the right camera that can be seen
in Figure 6). This camera is turned 40° towards the right (in
relation to the forward motion of the robot) and points
towards the ground. We plan to use the cameras to implement a
visual obstacle avoidance behaviour.
For this experiment the robot was moved around the
environment presented earlier in this paper; the robot was
driven by a human operator with a joystick at a maximum
speed of 0.3 m/s. The route traveled is the path shown in
Figure 7(a). Data were taken at 1 Hz (every second
we take a GPS measurement and an image).
In Figure 7(b) we can see the trajectory according to the
odometry of the robot (blue). If we apply the localization
system using only the information from the GPS (Figure 7(b),
green) there is no significant improvement; this is because
the particle filter adapts to the path described by the GPS
without improving on it.

Figure 7: Trajectories: left) robot, middle) GPS, and right) combining GPS and vision. Odometry measurements are in blue, estimated trajectories are in red, GPS information is shown in green, particles are the small dots in the figure.

Figure 8: Some examples at the beginning of the experiment that show how visual information corrects the GPS data. Odometry measurements are in blue, estimated trajectories are in red, GPS information is shown in green, particles are the small dots in the figure.

Figure 9: Three snapshots that show how our system deals with ambiguous information at the region marked as d in Figure 7(c). Odometry measurements are in blue, estimated trajectories are in red, GPS information is shown in green, particles are the small dots in the figure.
When we combine the GPS with the information from the
camera the improvement is significant (Figures 7(c) and 8,
red), which is very relevant considering that the
processing of the image is very simple (just a thresholding to
detect the passable regions: pavement), and the observation
model of the camera consists only of detecting the regions
of pixels that are mainly white (traversable areas).
The system still makes some errors, mainly as a result of
the simplifications applied to the algorithms. At position
a in Figure 7(c) the localization system cannot correct the
errors of the GPS positioning because the robot is at the very
center of a small square (where the whole visual region is
passable). We can see an example in the 7th row of Figure 5.
Something similar happens at the position marked as d:
the turn of the robot to the left is compatible (from the
point of view of the camera) with both the real trajectory
of the robot (blue) and the trajectory predicted by our filter
(red). When the estimated position is no longer compatible
with the GPS measurement, the system calculates a new
position that is valid for both the vision and the GPS information.
This sequence can be seen in Figure 9. It is in these
cases that the complementarity of both sensors becomes
apparent.
At positions b and c in Figure 7(c) we can see that
when the robot stops the filter may change the estimated
position by reweighting the different hypotheses. Those problems
can be mitigated if a more complex observation model
of the camera is used. Additionally, at position b there are
big shadows (see the 6th row in Figure 5) and the pavement is
not detected correctly. On the other hand, at position c the
GPS data indicate a movement perpendicular to the real
movement along the pavement.
VI. CONCLUSIONS
A low cost GPS has an error in the position estimate
that can hardly be lower than 2 m; in fact, the real error
committed is usually higher. This precision is clearly not
enough if we want to keep the robot on a narrow path, e.g.
a pavement.
In this work we have shown that it is possible to integrate
the GPS with a camera and a given map of the
environment. This map was obtained directly from Google
Maps with simple processing (edge detection and correction
of occlusions) carried out by a human operator to determine
those regions traversable by the robot.
We have performed several experiments in an area of the
Elviña campus next to the Faculty of Computer Science of
the University of A Coruña, Spain. In these experiments
we have checked that both sensors (GPS and vision) are
complementary, so the advantages of one of them make up
for the drawbacks of the other. The error in the positioning
is significantly reduced when the robot operates outdoors
and the robustness of the localization is considerably improved.
Even so, the proposed system makes some errors. When
the robot stops it is difficult for the particle filter to detect
that the robot is standing still. Another problem occurs when
the robot is at the center of a big traversable zone or finds a
fork in the path. In both cases, the very simple observation
model applied to the camera is not enough to keep a precise
localization and avoid errors.
To avoid those errors it is essential to improve the vision
observation model and to increase the number of cameras,
with the aim of obtaining more information about the
environment and, hence, more accurate and robust results.
To increase robustness against sudden changes in lighting
conditions we will use the method proposed in [7] to adjust
the camera parameters. Another line of future research consists
in removing the need for a previous map by using SLAM
techniques [8], together with two STOC stereo cameras and a 3D laser
to detect obstacles and thus automatically classify which
regions are traversable by the robot and which are not.
ACKNOWLEDGMENT
This work has been funded by the research grants
TIN2009-07737, INCITE08PXIB262202PR, and TIN2008-
04008/TSI.
REFERENCES
[1] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics, ser. Intelligent Robotics and Autonomous Agents. MIT Press, 2005.
[2] C. Gamallo, C. V. Regueiro, P. Quintía, and M. Mucientes, "Omnivision-based KLD-Monte Carlo Localization," Robotics and Autonomous Systems, vol. 58, no. 3, pp. 295–305, 2010.
[3] ——, "Monte Carlo localization for a guide mobile robot in a crowded environment based on omnivision," in Towards Autonomous Robotic Systems, Edinburgh (UK), 2008, pp. 1–8.
[4] R. Gonzalez and R. Woods, Digital Image Processing. Addison-Wesley, 1992.
[5] N. Otsu, "A threshold selection method from grey level histograms," IEEE Trans. on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
[6] R. Vaughan, "Massively multi-robot simulations in Stage," Swarm Intelligence, vol. 2, no. 2–4, pp. 189–208, 2008.
[7] F. Garcia Ruiz, "Adaptive exposure for dynamic light variations in agricultural applications," Ph.D. dissertation, University of Copenhagen, Faculty of Life Sciences, 2008.
[8] C. Gamallo, M. Mucientes, and C. V. Regueiro, "Visual FastSLAM through omnivision," in Towards Autonomous Robotic Systems, Londonderry (UK), 2009, pp. 128–135.