Combination of a low cost GPS with visual localization based on a previous map for
outdoor navigation
Cristina Gamallo, Pablo Quintía, Roberto Iglesias-Rodríguez
Department of Electronics and Computer Science
Universidade de Santiago de Compostela
Santiago de Compostela, Spain
{cristina.gamallo; pablo.quintia; roberto.iglesias.rodriguez}@usc.es
Jacobo V. Lorenzo, Carlos V. Regueiro
Department of Electronics and Systems
Universidade da Coruña
A Coruña, Spain
Abstract—In this work we propose the combination of a low cost GPS with a vision based localization system. Both types of localization complement each other, providing better precision and robustness than using them separately. The visual localization compares traversable regions detected by a camera with regions previously labeled in a map. We create this map by hand from images taken from Google Maps by labelling those regions passable by the robot (e.g. pavement). For the integration of both localization systems we propose the use of a particle filter. We obtained promising preliminary results taken in the surroundings of the Faculty of Computer Science of the University of A Coruña.
Keywords-Visual localization; particle filter; mobile robotics
I. INTRODUCTION
One of the main objectives in mobile robotics is to obtain
autonomous vehicles capable of intelligent navigation, aimed
at reducing human intervention in routine tasks or under
conditions that are dangerous or impossible for humans (e.g. toxicity,
extreme climate). Accurate robot localization is essential
to achieve this goal. A great variety of localization
techniques have been proposed in recent years. Their approaches
vary significantly depending on the environment the robot
will operate in, the sensors available, the knowledge about
the environment and the task.
In general, the position can be obtained from internal
and external sensors. The former provide a continuous and
instantaneous localization relative to the initial position of the
robot and without any knowledge about the environment;
this group includes the odometry of the robot and
inertial navigation systems. External sensors provide an
absolute positioning by comparing their measurements with
external elements in the environment such as beacon systems,
active or passive marks and satellites. The GPS is the most
widely used of these external sensors.
This work studies the feasibility of localizing a robot
in outdoor environments using low cost
sensors: a GPS and a camera. We will use a particle filter
to estimate the position of the robot from the information
provided by both sensors.
Of all the information contained in the images from
the camera, what is relevant for our system is to
distinguish the floor, passable by the robot, from the rest
of the image. Our process takes advantage of the characteristic
appearance of these regions (e.g. pavement) compared with
the appearance of those areas where the robot cannot run.
The image is segmented using a thresholding technique.
Compared to other proposed localization techniques, our
system is focused on low cost and simplicity. For example,
Real-Time Kinematic GPS provides centimeter accuracy, but
at hundreds of times the price of our system. Furthermore,
such precision is unnecessary for robot navigation.
II. LOCALIZATION SYSTEM
Our system is based on a particle filter, implemented
following the Monte Carlo Localization algorithm proposed
by Thrun in [1] (Alg. 1). Particle filters are probabilistic
techniques that allow the combination of the
sensor readings with information related to the movement
of the robot.
Algorithm 1 MCL Algorithm
for all m do
    x_t^m = motion_model(u_t, x_{t-1}^m)
    w_t^m = observation_model(z_t, x_t^m, Map)
    X_t = X_t + ⟨x_t^m, w_t^m⟩
end for
X_t = resample_model(X_t)
return X_t
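As a reference, a minimal Python sketch of one iteration of this loop follows. The motion_model, observation_model and resample arguments are placeholders for the models described in the rest of this section; this is an illustration of the algorithm structure, not our exact implementation.

```python
import numpy as np

def mcl_step(particles, u_t, z_t, grid_map,
             motion_model, observation_model, resample):
    """One iteration of the MCL loop of Alg. 1 (sketch).

    particles : (M, 3) array of poses (x, y, theta)
    u_t       : odometry reading between t-1 and t
    z_t       : current observation (camera grid and GPS fix)
    grid_map  : binary map of traversable regions
    The three model functions are placeholders for the models
    described in the remainder of this section.
    """
    M = len(particles)
    new_particles = np.empty_like(particles)
    weights = np.empty(M)
    for m in range(M):
        # Propagation: sample a new pose from the motion model
        new_particles[m] = motion_model(u_t, particles[m])
        # Anchoring: weight the hypothesis against the current observation
        weights[m] = observation_model(z_t, new_particles[m], grid_map)
    weights /= weights.sum()  # normalize weights
    # Resampling: draw a new particle set according to the weights
    return resample(new_particles, weights)
```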
The particle filter represents the posterior density with a
discrete set of M particles (x^1, ..., x^M) and their associated
weights (w^1, ..., w^M). At each time instant t, the M
particles are updated according to the motion u_t (propagation
stage) and the current observation z_t (anchoring stage). In
a third stage, a resampling of the particle set takes place
(Figure 1), after which the process is repeated and the estimated
result is generated. The advantage of these algorithms is that they are able
to filter out the noise present in the different sources of information.
Figure 1: General schema with the main stages in a particle filter.
Figure 2: Odometry motion model: robot motion is approximated in three steps: rotation δrot1, translation δtrans and rotation δrot2.
One of the main advantages over other, faster filters (e.g.
the Kalman filter) is that particle filters can deal with ambiguous
and/or inaccurate situations, for example when several
results are possible and only with time and more information
can the uncertainty be reduced and the correct result found.
In our case, each particle represents a position (x, y) and orientation (θ) in the map. If the initial position is
known, the particles are distributed around this position;
otherwise they are randomly distributed in the environment,
each with probability 1/M. The first stage of the algorithm is to
apply the motion model $p(x_t \mid u_t, x_{t-1})$ to each one of the M
particles (Figure 1). The newly generated particles represent
the prediction of the state variable according to the executed
motion and without considering the observation. We tested
both the velocity motion model and the odometry motion
model [1]. The latter (Figure 2) provided the best results
both in simulation tests and in real experiments.
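For illustration, a sketch of the odometry motion model of [1] is shown below. The alpha noise parameters are illustrative values, not the ones tuned in our experiments.

```python
import numpy as np

def sample_odometry_motion_model(u_t, pose, alphas=(0.05, 0.05, 0.05, 0.05)):
    """Sample a new pose given u_t = (previous odometry pose, current odometry pose).

    The motion is decomposed into an initial rotation, a translation and a
    final rotation, each perturbed with zero-mean Gaussian noise [1].
    The alpha noise parameters are illustrative, not the tuned values."""
    (x0, y0, th0), (x1, y1, th1) = u_t
    a1, a2, a3, a4 = alphas

    d_rot1 = np.arctan2(y1 - y0, x1 - x0) - th0
    d_trans = np.hypot(x1 - x0, y1 - y0)
    d_rot2 = th1 - th0 - d_rot1

    # Perturb each component; the noise grows with the amount of motion
    rot1 = d_rot1 + np.random.normal(0.0, a1 * abs(d_rot1) + a2 * d_trans)
    trans = d_trans + np.random.normal(0.0, a3 * d_trans + a4 * (abs(d_rot1) + abs(d_rot2)))
    rot2 = d_rot2 + np.random.normal(0.0, a1 * abs(d_rot2) + a2 * d_trans)

    # Apply the perturbed motion to the particle pose
    x, y, th = pose
    x += trans * np.cos(th + rot1)
    y += trans * np.sin(th + rot1)
    th += rot1 + rot2
    return np.array([x, y, th])
```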
The second stage of a particle filter, the observation
model $p(z_t \mid x_t)$ (Figure 1), consists in evaluating the predictions
provided by the motion model against the observed
information. In our system we have two available sensors
(vision and GPS), hence our observation model will be a
combination of the models of each sensor. The objective is
to use the GPS information to filter out those hypotheses that are
clearly invalid, and leave the fine tuning to the information
provided by the vision system. The final model is the
product of both.
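A minimal sketch of this combination is given below, assuming each per-sensor model (sketched further on in this section) returns a likelihood for a particle pose; the z_t fields are hypothetical names for the two readings.

```python
def observation_model(z_t, pose, grid_map):
    """Combine both sensors (sketch): the GPS term prunes clearly invalid
    hypotheses, while the vision term provides the fine adjustment.
    gps_observation_model and vision_observation_model are sketched
    further below; z_t.gps and z_t.camera_grid are assumed field names."""
    w_gps = gps_observation_model(z_t.gps, pose)
    w_vision = vision_observation_model(z_t.camera_grid, pose, grid_map)
    return w_gps * w_vision
```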
Figure 3: Schema of the vision observation model. Left, predicted image for a particle according to the previous map of the environment. Right, image from the camera. The big square (in blue) highlights the pixels that will be processed and the grid (in red) shows the discretization to 16 values used in the system.
The vision observation model is shown in Figure 3. Using
the map of the environment, and the position and orientation
codified in each particle, we can generate a synthetic image
of what the robot should see from the predicted position
(Figure 3(a)). In order to increase the robustness and the
performance of our system, we only consider the part
of the image with the lowest distortion, in particular the
grid of 16 values located at the bottom of the image (blue
box in Figure 3(b)). Comparing the predicted image with
the real observation (Figure 3(b)), the model calculates the
weight of each particle. As the images are binary (white
represents traversable terrain, black non-traversable terrain)
the comparison between the 16 values of each image is
straightforward. In the example of Figure 3 there would be
a mismatch between the image predicted from the particle
and the real image obtained, so the weight of
the particle should be diminished. The building of the map
and the processing of the real images, stages that are prior
to this process, are explained in the next section.
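A minimal sketch of this grid comparison follows, assuming the 16 cells form a 4×4 binary array (1 = traversable); render_expected_grid stands in for the synthetic-image step and eps is an assumed smoothing constant to avoid zero weights.

```python
import numpy as np

def vision_observation_model(observed_grid, pose, grid_map, eps=0.05):
    """Weight a particle by comparing the 16-cell binary grid predicted
    from the map at the particle pose with the grid observed by the camera.
    render_expected_grid is a placeholder for the synthetic-image step."""
    predicted_grid = render_expected_grid(grid_map, pose)  # 4x4 binary array
    matches = np.sum(predicted_grid == observed_grid)      # 0..16 matching cells
    # More matching cells -> higher weight; eps keeps weights strictly positive
    return eps + (1.0 - eps) * matches / observed_grid.size
```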
For the observation model of the GPS we use the
absolute position readings provided by the GPS receiver.
We compare them to the position of each particle,
assigning the highest weights to those particles that are
closest to the position provided by the GPS.
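For instance, a Gaussian weighting on the distance between each particle and the GPS fix behaves this way; the 3 m standard deviation is an assumed figure for a low cost receiver, not a value measured in this work.

```python
import numpy as np

def gps_observation_model(gps_xy, pose, sigma=3.0):
    """Weight a particle by its Euclidean distance to the GPS fix, already
    converted to map coordinates (metres). sigma is an assumed receiver error."""
    dist = np.hypot(pose[0] - gps_xy[0], pose[1] - gps_xy[1])
    return np.exp(-0.5 * (dist / sigma) ** 2)
```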
The resampling stage creates a new set of
particles from the distribution of the weights. Particles
with small weights are likely to be undersampled, while
those with bigger weights will probably be sampled several
times. By doing this, the most valuable part of the state space
becomes more densely populated while the a priori less promising
parts are abandoned. To avoid an excessive reduction in the number
of different particles, the resampling only takes place
when the particle distribution has deteriorated by a certain
amount, specifically when $M_{eff} < \text{threshold}$, where $M_{eff} = 1/\sum_{i=1}^{M} (w_t^i)^2$. The threshold is set to $\frac{5}{6}M$. In this implementation
we keep the total number of particles constant,
although it could be changed according to the characteristics of
the weight distribution (the Kullback-Leibler divergence method,
or KLD) and random particles could be generated [2], [3].
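A sketch of this selective resampling follows; systematic (low variance) resampling is one common choice and is used here only for illustration, not necessarily the exact variant of our implementation.

```python
import numpy as np

def resample_if_needed(particles, weights):
    """Resample only when the effective sample size
    M_eff = 1 / sum_i (w_i)^2 falls below the 5M/6 threshold."""
    M = len(weights)
    weights = weights / weights.sum()          # make sure weights are normalized
    m_eff = 1.0 / np.sum(weights ** 2)
    if m_eff >= (5.0 / 6.0) * M:
        return particles, weights              # distribution still healthy
    # Systematic resampling: larger weights are drawn proportionally more often
    positions = (np.arange(M) + np.random.uniform()) / M
    idx = np.searchsorted(np.cumsum(weights), positions)
    return particles[idx].copy(), np.full(M, 1.0 / M)
```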
The estimation of the position $x_t$ is calculated as the weighted mean of all particles, where each position and orientation is weighted by the particle weight: $x_t = \left(\sum_{i=1}^{M} w_t^i\, x_t^i\right) / \left(\sum_{i=1}^{M} w_t^i\right)$.
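Since orientations wrap around, the weighted mean of θ is best computed on the unit circle; a sketch, assuming particle poses stored as (x, y, θ):

```python
import numpy as np

def estimate_pose(particles, weights):
    """Weighted mean of the particle set; theta is averaged on the unit
    circle to avoid wrap-around problems near +/- pi."""
    w = weights / weights.sum()
    x = np.sum(w * particles[:, 0])
    y = np.sum(w * particles[:, 1])
    theta = np.arctan2(np.sum(w * np.sin(particles[:, 2])),
                       np.sum(w * np.cos(particles[:, 2])))
    return np.array([x, y, theta])
```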
III. VISION MODEL
The application of the vision observation model summarized
in Figure 3 requires building a prior map in
which the passable regions are stored, and processing
the acquired images to detect those regions.
The first step to obtain the map for the visual localization
model is to take an aerial photograph from Google Maps
at the maximum available resolution. Secondly, we
process the image to remove irrelevant information (e.g.
using a Canny edge detection filter) and binarize it by hand,
so that traversable areas (e.g. pavement) are white and the
rest black. This process is hard to automate, since in
most cases there are occlusions that make the extraction of
the map difficult, although it is fairly simple for a human
familiar with the environment. This method was chosen
because it allows us to take any map of the world from
Google Maps and adapt it to our system in a few steps.
Moreover, since the localization system does not require
very accurate maps, it would even be possible to work with
hand-made maps. The only important requirement is that the map
allows the position of the robot to be anchored to its environment.
This is the main advantage of using the camera instead of
the GPS, even if it implies a loss of accuracy. Figure
4 shows all the steps in the map extraction process. The
environment corresponds to a square next to the Faculty
of Computer Science of the University of A Coruña (lat.
43.332603, lon. -8.411884), the size of which is 30×35 m.
To convert the GPS coordinates (in degrees)
to map coordinates (in meters) we need to calibrate the
system. Several points were selected in the environment
and their coordinates were obtained both from Google Maps
and from the map. The final conversion factors are that
0.0000125 degrees of longitude or 0.0000090415 degrees of
latitude correspond to 1.0 meter in the map. This stage
also corrects errors that can be found in the map used (in
some cases Google Maps has errors of several meters).
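This conversion reduces to a scaling around a reference point; a sketch with the factors above, where (lat0, lon0) is whichever map origin was chosen during calibration (a parameter here, not a value reported in the paper):

```python
def gps_to_map(lat, lon, lat0, lon0):
    """Convert a GPS fix (degrees) to local map coordinates (metres) using
    the calibration of this environment: 0.0000125 deg of longitude and
    0.0000090415 deg of latitude per metre. (lat0, lon0) is the map origin."""
    x = (lon - lon0) / 0.0000125      # metres east of the origin
    y = (lat - lat0) / 0.0000090415   # metres north of the origin
    return x, y
```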
The different appearance of those regions that can be traversed
by the robot compared with the appearance of those
areas that are not traversable simplifies the detection of
the passable areas, so a thresholding technique can be used.
We tried several classic techniques [4]: application of
a fixed threshold, the Otsu method [5] and
dynamic thresholding (Figure 5). The best results
were obtained with a fixed threshold of 127; lowering the
threshold to 85 we start to observe anomalous behaviours (the
image goes black).
As can be seen in Figure 5, the biggest problem was the
strong and sudden changes in light, mainly those caused by
shadows (the experiment was performed on a sunny day at
midday; for example, see the first, second, sixth and seventh
rows in Figure 5). The fact that the robot traverses different
types of pavement with different light reflection properties
made this situation worse (for example, see the first, fourth and
fifth rows in Figure 5).

Figure 5: Seven examples of images in the test environment (rows). In the columns, from left to right: original image, thresholding at 85, thresholding at 127, dynamic thresholding, and Otsu method.
Because the processing of the image provides no clear
differentiation between what is traversable and what is not (Figure
3(b)), only a small region of the image is considered.
This region is discretized into a grid of 16 cells. Each cell
is considered free if it contains more white pixels than
black pixels; otherwise it is considered occupied.
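As an illustration, the fixed threshold at 127 and the 16-cell discretization could be written as follows with OpenCV, assuming the 16 cells form a 4×4 grid; the region-of-interest coordinates are placeholders, since the exact crop depends on the camera mounting.

```python
import cv2
import numpy as np

def image_to_grid(gray_image, roi, threshold=127):
    """Binarize the image with a fixed threshold and reduce the region of
    interest to a 4x4 grid of free/occupied cells.

    roi = (x, y, w, h) is the low-distortion region at the bottom of the
    image; its exact values depend on the camera mounting (placeholder)."""
    x, y, w, h = roi
    _, binary = cv2.threshold(gray_image, threshold, 255, cv2.THRESH_BINARY)
    patch = binary[y:y + h, x:x + w]
    grid = np.zeros((4, 4), dtype=np.uint8)
    for i in range(4):
        for j in range(4):
            cell = patch[i * h // 4:(i + 1) * h // 4,
                         j * w // 4:(j + 1) * w // 4]
            # A cell is free (1) if white (traversable) pixels are the majority
            grid[i, j] = 1 if np.mean(cell) > 127 else 0
    return grid
```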
Figure 4: Extraction of the map for visual localization. a) original image; b) Canny filter; c) map.
Table I: Error in the position estimate according to the size of the particle set.

# particles                      θ (rad)   Position (m)
200,000   Mean                   0.043     0.424
          Std. deviation         0.039     0.221
          Max                    0.222     0.854
20,000    Mean                   0.883     0.433
          Std. deviation         0.406     0.293
          Max                    0.235     0.955
2,000     Mean                   0.087     0.417
          Std. deviation         0.446     0.281
          Max                    0.262     0.957
200       Mean                   0.074     0.360
          Std. deviation         0.365     0.257
          Max                    0.325     0.951
Table II: Execution time according to the size of the particle set.
# particles Time/Iteration (s)
200,000 0.951
20,000 0.095
2,000 0.014
200 0.002
IV. SIMULATION TESTS
We performed several preliminary experiments using the
Stage simulator [6] with the aim of validating the filter and
adjusting its parameters. The use of the simulator allows
us to perform experiments under controlled conditions,
so that the exact path of the robot is known and can be
compared with the simulated odometry and the filter output.
These tests are only meant to check the applicability of the
algorithm before testing it in a real environment.
In Tables I and II we can see the results for different sizes
of the particle set. We have focused on determining the
execution time of each iteration and the accuracy of the
estimate according to the size of the particle set. For
each measurement we performed 10 experiments. From the
results obtained we can infer that the optimal size of the
particle set is 2,000 particles, since it provides a good
balance between the mean error and the execution time.

Figure 6: Mobile robot used in the experiments: a Pioneer P3-DX equipped with two STOC cameras and a home-made 3D laser.
V. OUTDOOR EXPERIMENTAL RESULTS
In the experiments described in this section we used
the mobile robot Pioneer P3-DX from MobileRobots, the
747 A+ GPS Trip Recorder GPS receiver, and a STOC stereo
camera from Videre Design (the right camera that can be seen
in Figure 6). This camera is turned 40° towards the right (in
relation to the forward motion of the robot) and points
towards the ground. We plan to use the cameras to implement a
visual obstacle avoidance behaviour.
For this experiment the robot was moved around the
environment presented earlier in this paper; the robot was
driven by a human operator with a joystick at a maximum
speed of 0.3 m/s. The route traveled is the path shown in
Figure 7(a). Data were taken at 1 Hz (every second
we take a GPS measurement and an image).
In Figure 7(b) we can see the trajectory according to the
odometry of the robot (blue). If we apply the localization
system using only the information from the GPS (Figure 7(b),
green) there is no significant improvement; this is because
the particle filter adapts to the path described by the GPS
without improving on it.

Figure 7: Trajectories: left) robot, middle) GPS, and right) combining GPS and vision. Odometry measurements are in blue, estimated trajectories are in red, GPS information is shown in green, particles are the small dots in the figure.

Figure 8: Some examples at the beginning of the experiment that show how visual information corrects the GPS data. Odometry measurements are in blue, estimated trajectories are in red, GPS information is shown in green, particles are the small dots in the figure.

Figure 9: Three snapshots that show how our system deals with ambiguous information at the region marked as d in Figure 7(c). Odometry measurements are in blue, estimated trajectories are in red, GPS information is shown in green, particles are the small dots in the figure.
When we combine the GPS with the information from the
camera the improvement is significant (Figures 7(c) and 8,
red), which is very relevant considering that the
processing of the image is very simple (just a thresholding to
detect the passable regions: pavement), and the observation
model of the camera consists only of detecting the regions
of pixels that are mainly white (traversable areas).
The system still makes some errors, mainly as a result of
the simplifications applied to the algorithms. At position
a in Figure 7(c) the localization system cannot correct the
errors of the GPS positioning because the robot is at the very
center of a small square (where the whole visual region is
passable). We can see an example in the 7th row of Figure 5.
Something similar happens at the position marked as d:
the turn of the robot to the left is compatible (from the
point of view of the camera) with both the real trajectory
of the robot (blue) and the trajectory predicted by our filter
(red). When the estimated position is no longer compatible
with the GPS measurement, the system calculates a new
position that is valid for both the vision and the GPS information.
This sequence can be seen in Figure 9. It is in these
cases that the complementarity of both sensors becomes
apparent.
At positions b and c in Figure 7(c) we can see that
when the robot stops the filter may change the estimated
position by reweighting the different hypotheses. Those problems
can be mitigated if a more complex observation model
of the camera is used. Additionally, at position b there are
big shadows (see the 6th row in Figure 5) and the pavement is
not detected correctly. On the other hand, at position c the
GPS data indicate a movement perpendicular to the real
movement along the pavement.
VI. CONCLUSIONS
A low cost GPS has an error in the position estimate
that can hardly be lower than 2 m; in fact, the real error
committed is usually higher. This precision is clearly not
enough if we want to keep the robot on a narrow path, e.g.
a pavement.
In this work we have shown that it is possible to integrate
the GPS with a camera and a given map of the
environment. This map was obtained directly from Google
Maps with simple processing (edge detection and correction
of occlusions) carried out by a human operator to determine
those regions traversable by the robot.
We have performed several experiments in an area of the
Elviña campus next to the Faculty of Computer Science of
the University of A Coruña, Spain. In these experiments
we have checked that both sensors (GPS and vision) are
complementary, so the advantages of one of them make up
for the drawbacks of the other. The error in the positioning
is significantly reduced when the robot operates outdoors
and the robustness of the localization is considerably improved.
Even so, the proposed system makes some errors. When
the robot stops it is difficult for the particle filter to detect
that the robot is standing still. Another problem occurs when
the robot is at the center of a big traversable zone or finds a
fork in the path. In both cases, the very simple observation
model applied to the camera is not enough to keep a precise
localization and avoid errors.
To avoid those errors it is essential to improve the vision
observation model and to increase the number of cameras,
with the aim of obtaining more information about the
environment and, hence, more accurate and robust results.
To increase robustness against sudden changes in lighting
conditions we will use the method proposed in [7] to adjust
the camera parameters. Another line of future research consists
in removing the need for a previous map by using SLAM
techniques [8], together with two STOC stereo cameras and a 3D laser
to detect obstacles and thus automatically classify which
regions are traversable by the robot and which are not.
ACKNOWLEDGMENT
This work has been funded by the research grants
TIN2009-07737, INCITE08PXIB262202PR, and TIN2008-
04008/TSI.
REFERENCES
[1] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics, ser. Intelligent Robotics and Autonomous Agents. MIT Press, 2005.
[2] C. Gamallo, C. V. Regueiro, P. Quintía, and M. Mucientes, "Omnivision-based KLD-Monte Carlo Localization," Robotics and Autonomous Systems, vol. 58, no. 3, pp. 295–305, 2010.
[3] ——, "Monte Carlo localization for a guide mobile robot in a crowded environment based on omnivision," in Towards Autonomous Robotic Systems, Edinburgh (UK), 2008, pp. 1–8.
[4] R. Gonzalez and R. Woods, Digital Image Processing. Addison-Wesley, 1992.
[5] N. Otsu, "A threshold selection method from grey level histograms," IEEE Trans. on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
[6] R. Vaughan, "Massively multi-robot simulations in Stage," Swarm Intelligence, vol. 2, no. 2–4, pp. 189–208, 2008.
[7] F. Garcia Ruiz, "Adaptive exposure for dynamic light variations in agricultural applications," Ph.D. dissertation, University of Copenhagen, Faculty of Life Sciences, 2008.
[8] C. Gamallo, M. Mucientes, and C. V. Regueiro, "Visual FastSLAM through omnivision," in Towards Autonomous Robotic Systems, Londonderry (UK), 2009, pp. 128–135.