Characters Extraction for
Traffic Sign Destination
boards in video and still
images
Qiu Peng
2010.9.30
Master
Thesis
Computer
Engineering
Nr:E3986D
DEGREE PROJECT
Computer Engineering
Programme: Masters Programme in Computer Engineering - Applied Artificial Intelligence
Reg number: E3986D
Extent: 15 ECTS
Name of student: Qiu Peng
Year-Month-Day: 2010.9.30
Supervisor: Hasan Fleyeh
Examiner:
Company/Department: Computer Science
Supervisor at the Company/Department: Hasan Fleyeh
Title: Recognition characters on the Destination board
Keywords: RGB, HSV, extract character, SVM
Abstract
Traffic control signs and destination boards on roadways offer significant information for
drivers. Regulation signs give instructions about speed, turns, etc.; warning signs warn
drivers of conditions ahead to help them avoid accidents; destination signs show distances
and directions to various locations; service signs display the location of hospitals, gas
stations, rest areas, etc. Because the signs are so important and there is always a certain
distance between them and the driver, drivers should be able to get the information clearly and
easily even in bad weather or other difficult situations. The idea is to develop software that
collects useful information from a special camera mounted at the front of a moving car, extracts
the important information, and finally shows it to the driver. For example, when a frame contains
a destination sign board with text such as "Linkoping 50", the software should extract every
character of "Linkoping 50" and compare each one with the already known character data in the
database. If an extracted character matches an entry in the database, such as "k", the software
outputs the destination name and shows it to the driver. In this project C++ is used to write the
code for this software.
ACKNOWLEDGMENT
First, I would like to thank my advisor, Mr. Hasan Fleyeh. The decision he made last fall
to take me as one of his graduate assistants gave me the opportunity to do the research I
am interested in. I have been grateful for his support and advice ever since. He created a
wonderful and dynamic environment for me to learn in and gave me the freedom to explore
interesting problems in the fields of Computer Vision and Digital Image Processing.
TABLE OF CONTENTS
1. Chapter 1 Introduction
1.1 The background
1.2 Application of road sign recognition system
1.3 Aim
1.4 Contents arranged
2. Chapter 2 Image processing theory
2.1 Image acquisition
2.2 The HSV color model
2.2.1 Theory details
2.2.2 HSV color model definition
2.3 Image segmentation
2.4 Shadow and highlight invariant color segmentation
2.4.1 Theory details
2.5 The noise problem
2.5.1 Problem with noise filters
3. Chapter 3 Support vector machine
3.1 Introduction
3.2 Machine learning
3.3 Statistical learning theory
3.4 Support vector machine
3.5 Two situations
3.5.1 Linearly separable problem
3.5.2 Non-linearly separable problem
3.6 Kernel function
3.7 Use kernel function to solve non-linear problem
4. Chapter 4 The implementation
4.1 Real time traffic signs recognition flowchart
4.2 Application component
4.2.1 Implementation of background extraction module
4.2.2 Apply shadow and highlight invariant segmentation algorithm
4.2.3 Algorithm implementation
4.2.4 Extraction area implementation
4.2.5 Second time image processing module
4.2.6 Character extraction module implementation
4.2.7 Training and testing with SVM
5. Chapter 5 Analysis and result
5.1 Analysis of the application
5.1.1 Analysis of the character extraction part
5.1.2 Analysis of HSV color model image
5.1.3 Analysis of the character extraction algorithm
5.1.4 Analysis of the noise filter algorithm
5.1.5 Result
5.1.6 Character recognition
5.1.7 SVM applied here
5.1.8 Test with Linear function
5.1.9 Test with Polynomial function
5.1.10 Test with RBF function
5.1.11 Test with Sigmoid
Qiu Peng Masters September, 2010
E3986D
Dalarna University Tel: +46(0)23 7780000 Röda vägen 3S-781 88 Fax: +46(0)23 778080 Borlänge Sweden http://www.du.se
6. Chapter 6 Conclusion and future works
6.1 Conclusion
6.2 Future works
Chapter 1 Introduction
1.1 The background
We live in an environment full of all kinds of traffic signs and city names. They play an
important role in regulating traffic and warn drivers against certain actions, for their own
safety and for the safety of their passengers.
Road signs use colors, shapes, and markings to communicate messages to drivers on the road.
Without such information the motion of traffic would be disorderly and unpredictable. It's
crucial for drivers to identify road signs at the right time and at the right place, but when
we expect everything to be perfect (from others, of course), we tend to forget the inherent
imperfection of mankind. Noticing these safety precaution signs on the road greatly depends
on the physical and mental health of the drivers. Their visual perception can be
affected by stress, tension and physical illness, and sometimes by a lack of knowledge about
road signs. According to a recent poll conducted by the motoring website New Car Net, one in
three motorists fails to recognize even the most basic road signs. For these reasons,
autonomous, robust, real-time road sign recognition systems have attracted interest over the
last two decades. The very first paper appeared in 1984 and aimed at testing various computer
vision methods for the detection of objects in outdoor scenes. Since then many research groups
and companies have taken an interest and conducted research in the field. Computer
vision has been applied to a wide variety of intelligent transport systems (ITS)[1] such as
traffic monitoring systems, traffic-related parameter estimation and intelligent vehicles, and an
important part of intelligent vehicles is the detection and recognition of road signs. A robust,
real-time, automatic road sign detection and recognition system can support and
unburden drivers by giving information in good time; it can increase driving efficiency, save
lives and provide driving comfort.
1.2 Application of Road Sign Recognition System
Road Sign Recognition is a field of applied computer vision research concerned with the
automatic detection and classification of road signs in traffic scene images acquired from a
moving car. The result of this research effort will be a subsystem of a Driver Support System
(DSS). The aim is to give the DSS the ability to understand its surrounding
environment and so permit advanced driver support such as collision prediction and
avoidance.
Employing computer vision technology in smart vehicle design calls for consideration of all
its advantages and disadvantages. Firstly, a vision subsystem incorporated into the DSS can
exploit all the information processed by human drivers without requiring any new
traffic infrastructure devices (installing which would be a very hard and expensive task).
Smart cars equipped with vision-based systems will be able to adapt themselves to operate in
different countries (with often quite dissimilar traffic devices).
As the integration of various technologies in the field of traffic engineering (ITS) has
progressed, the convenience of using computer vision has become more obvious. We
can observe this trend, e.g., in the proceedings of the annual IEEE International Conference on
Intelligent Vehicles (IVS). More than 50% of the papers focus on Image Processing and
Computer Vision methods.
Obviously, the vision-based approach also has disadvantages. Smart vehicles will
operate in real traffic conditions on the road, so the algorithms must be robust enough to give
good results even under adverse illumination and weather conditions. Although this
requirement may seem easy to meet, it is the real challenge for algorithm developers.
For example, Fridtjof Stein, main project manager of the Cleopatra project (Clusters of embedded
parallel time-critical applications), said that "reliable optical detection is the biggest hurdle the
project must overcome".
Absolute system reliability cannot be assured, and the system will not be "fail-safe",
by the very definition of an individual transportation system. The aim is to provide a level of
safety similar to or higher than that of human drivers. For example, the system could alert
drivers to signs they did not recognize before passing them. Specifically, speed limit sign
recognition could inform the driver of the present speed limit and give an alert if the car is
driven faster than that limit.
In the future, autonomous vehicles will have to be controlled with the help of automatic road
sign recognition. As with any vehicle, an autonomous vehicle driving on public roads must obey
the rules of the road. Many of these rules are conveyed through road signs,
so an autonomous vehicle must be able to detect and recognize signs and change its behavior
accordingly.
1.3 Aim
The aim of this research project is to present an Intelligent Road Sign Recognition System based
on a state-of-the-art technique, the Support Vector Machine, and on image processing.
The project is an extension of an already known system that can recognize traffic signs. This
application can extract every character on the destination board and then output the city name.
1.4 Contents arranged
Chapter 2 Image processing
Image acquisition (introduces how the images were captured and what types of images were used)
HSV color model (introduces the step from the RGB color model to the HSV color model, and the
advantage HSV has over RGB in the field of segmentation)
Shadow And Highlight Invariant Color Segmentation Algorithm (shows how a color is extracted and
how it can be distinguished from other colors)
Character extraction (shows how the characters are extracted)
Character normalization (normalizes the extracted characters to 30*30 pixels)
Chapter 3 SVM theory
Introduction (introduces the SVM theory)
Machine learning (shows the origin of SVM)
Statistical Learning Theory (another part of the origin of SVM)
Support Vector Machine (what SVM is and how it works)
Two situations (the linear and non-linear problems)
Kernel Function (introduces the kernel formula)
Chapter 4 Implementation (shows step by step how the theory is applied to real-life problems)
Chapter 5 Analysis (analyzes the application based on these theories, and shows how the
application works and the results obtained with it)
Chapter 2 Theoretical background
2.1 Image acquisition
Image acquisition is the first step of traffic sign recognition. An input image can be
taken either from the live stream of the camera mounted on the vehicle's deck or from
a recorded video for experimental purposes. The video must be in AVI format, which the OpenCV
platform accepts. Each frame of the video is an RGB image. The application sets the
dimensions of the captured image to 400 x 600 pixels.
The figure below shows an example.
Figure 2.1 sample image from video stream
2.2 The HSV color model
1. The image acquired by the camera is in RGB format, which is highly sensitive to chromatic
variation of the daylight. The three color coordinates are highly correlated. As a
result, any variation in the ambient light intensity affects the RGB system by
shifting the cluster of colors towards the white or the black corner, which makes
the object hard to recognize.
2. HSV is the preferred color model for the recognition problem since it decouples the
chromatic and achromatic notions of light. It is also preferable because the
Hue feature is invariant to shadows and highlights.
3. HSV represents colors in a way similar to how the human eye senses color.
2.2.1 Theory details
Every color in this space is represented by three components:
1. Hue (H): the apparent light color (determined by dominant wavelength).
2. Saturation (S): the purity of light.
3. Value (V): the total light across all frequencies.
The HSV model is illustrated as a conical object. The cone is usually represented in
three-dimensional form. The hue is represented by the circular part of the cone. The saturation is
calculated using the radius of the cone, and the value is the height of the cone. The advantage of
the conical model is that it is able to represent the HSV color space in a single object.
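The three components can be computed from an RGB pixel with the standard hexcone conversion. The following is a minimal sketch of that conversion, not the thesis code: the names `Hsv` and `rgbToHsv` are hypothetical, and all three outputs are scaled to [0, 255], which is an assumption chosen to match the normalization used later in the segmentation step.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hue, saturation and value, each scaled to [0, 255] (assumed scaling).
struct Hsv {
    std::uint8_t h;
    std::uint8_t s;
    std::uint8_t v;
};

// Convert one 8-bit RGB pixel to HSV with the standard hexcone formulas.
Hsv rgbToHsv(std::uint8_t r8, std::uint8_t g8, std::uint8_t b8) {
    const double r = r8 / 255.0;
    const double g = g8 / 255.0;
    const double b = b8 / 255.0;
    const double maxC = std::max({r, g, b});
    const double minC = std::min({r, g, b});
    const double delta = maxC - minC;

    // Hue is the angle around the cone; it is undefined for grays, so 0 is used.
    double hueDeg = 0.0;
    if (delta > 0.0) {
        if (maxC == r) {
            hueDeg = 60.0 * std::fmod((g - b) / delta, 6.0);
        } else if (maxC == g) {
            hueDeg = 60.0 * ((b - r) / delta + 2.0);
        } else {
            hueDeg = 60.0 * ((r - g) / delta + 4.0);
        }
        if (hueDeg < 0.0) hueDeg += 360.0;
    }
    // Saturation is the radius of the cone relative to its height (value).
    const double sat = (maxC > 0.0) ? delta / maxC : 0.0;

    Hsv out;
    out.h = static_cast<std::uint8_t>(std::lround(hueDeg / 360.0 * 255.0));
    out.s = static_cast<std::uint8_t>(std::lround(sat * 255.0));
    out.v = static_cast<std::uint8_t>(std::lround(maxC * 255.0));
    return out;
}
```

With this scaling, pure red maps to hue 0 and pure blue (240 degrees) to hue 170, while any gray pixel has saturation 0 regardless of its brightness.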
2.2.2 HSV color model definition
Figure 2.2 HSV color model
The hue
Red, Green and Blue (RGB) are the three primary colors used by computer monitors. 180
degrees away from a primary on the hue wheel, none of that primary is mixed in. These colors are
the complementary hues, i.e. Cyan, Magenta and Yellow. The next level of colors, which lie between
the secondary and primary colors, are called the tertiary hues. This process continues, creating
a solid ring of colors around the primaries. This definition describes just one dimension of
color, namely hue. Hue is more specifically described by the dominant wavelength. It is a
dimension of color readily experienced by the eye, and hence the dimension of color interpreted
by the human brain.
The value
Value is the brightness of the color; it ranges from 0 to 100% and varies with color saturation.
When the value is 0, the color is completely black. In terms of a spectral definition
of color, value describes the overall intensity or strength of the light. If hue is
thought of as a dimension going around a wheel, then value is a linear axis running
through the middle of the wheel, as shown in the figure above.
The saturation
Saturation refers to the dominance of hue in the color. On the outer edge of the hue wheel are
the 'pure' hues. Towards the center of the wheel, the hue used to describe the color dominates
less and less. Exactly in the center of the wheel, no hue dominates. The colors directly on the
central axis are considered de-saturated; they constitute the gray scale, running from white to
black with all of the intermediate grays in between. Saturation ranges from 0 to 100% and is
measured perpendicular to the Value axis. In terms of a spectral definition of color, saturation
is the ratio of the dominant wavelength to other wavelengths of color. White light is white
because it contains an even balance of all wavelengths.
Here are two images, one in the RGB color space and one in the HSV color space:
Figure 2.3 RGB and HSV image
2.3 Image segmentation
1. Image segmentation is the process by which specific objects in the image are
distinguished from the background. Based on the color information, the candidate traffic sign
needs to be separated from the rest of the image.
2. By segmenting the image into a binary image, only two types of pixels are left to be
processed: white and black. This reduces the complexity of the image
processing for traffic sign recognition.
3. The processing time is reduced too, because only two intensity levels are
used for processing the image.
2.4 Shadow And Highlight Invariant Color Segmentation Algorithm
Most of the time, weather conditions cause serious problems when extracting traffic signs;
for example, strong sunshine can make some of the colors of a traffic sign disappear.
The figures are shown below.
Figure 2.4 original image
Figure 2.5 color segmentation with better algorithm
2.4.1 Theory details
The color segmentation algorithm works on RGB images taken with a digital camera
mounted on a moving car. The images are converted to the HSV color space, and the hue,
saturation, and value are normalized into [0,255]. The HSV color space is chosen because the Hue
feature is invariant to shadows and highlights.
While normalized Hue is used as a priori knowledge for the algorithm, normalized Saturation
and Value are used to specify and avoid the achromatic subspaces in the HSV color space. When
the hue value of a pixel in the input image is within the color range
specified in the figure below, and the pixel is not in the achromatic area, the
corresponding pixel in the output image is set to white. The output image is then divided into
a number of 16x16 pixel sub-images, which are used to calculate the seeds for the region growing
algorithm. A seed is initiated if the number of white pixels in a sub-image is above a
certain threshold level. The region growing algorithm is then applied to find all the objects in
the output image that are big enough to initiate at least one seed. Noise and other small objects
are rejected by the region growing algorithm. This has the advantage that no further
filtering is needed to delete these objects, and the remaining objects are only the ones that
can be used for recognition.
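The thresholding and seed steps described above can be sketched as follows. This is an illustrative sketch, not the thesis implementation: the type and function names are hypothetical, and every numeric threshold in `SegmentationParams` (hue range, achromatic limits, seed threshold) is an assumed placeholder, since the text does not state the values. The region growing step itself is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// HSV image stored as three row-major planes, each normalized to [0, 255].
struct HsvImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> h, s, v;
};

// Hypothetical thresholds; the actual values are not given in the text.
struct SegmentationParams {
    std::uint8_t hueMin = 140;
    std::uint8_t hueMax = 175;
    std::uint8_t minSaturation = 40;  // below this a pixel counts as achromatic
    std::uint8_t minValue = 30;       // below this a pixel counts as achromatic
    int blockSize = 16;               // 16x16 sub-images, as in the text
    int seedThreshold = 60;           // white pixels needed to initiate a seed
};

// Set a pixel to white (255) when its hue is in range and it is chromatic.
std::vector<std::uint8_t> segmentByHue(const HsvImage& img,
                                       const SegmentationParams& p) {
    std::vector<std::uint8_t> mask(img.h.size(), 0);
    for (std::size_t i = 0; i < img.h.size(); ++i) {
        const bool chromatic = img.s[i] >= p.minSaturation && img.v[i] >= p.minValue;
        const bool inRange = img.h[i] >= p.hueMin && img.h[i] <= p.hueMax;
        if (chromatic && inRange) mask[i] = 255;
    }
    return mask;
}

struct Seed { int x; int y; };  // top-left corner of the block that fired

// Count white pixels per block and emit a seed where the count is high enough.
std::vector<Seed> findSeeds(const std::vector<std::uint8_t>& mask,
                            int width, int height,
                            const SegmentationParams& p) {
    std::vector<Seed> seeds;
    for (int by = 0; by < height; by += p.blockSize) {
        for (int bx = 0; bx < width; bx += p.blockSize) {
            int white = 0;
            for (int y = by; y < by + p.blockSize && y < height; ++y) {
                for (int x = bx; x < bx + p.blockSize && x < width; ++x) {
                    if (mask[static_cast<std::size_t>(y) * width + x] == 255) ++white;
                }
            }
            if (white > p.seedThreshold) seeds.push_back({bx, by});
        }
    }
    return seeds;
}
```

Because a seed needs many white pixels inside a single block, isolated noise pixels never start a region, which is the filtering effect the paragraph describes.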
2.5 The extraction of every character (traffic sign)
2.5.1 Character extraction algorithm
Studying traffic sign boards yields one very important observation: every traffic
board in the world has a background, on which the city name and other information are
placed.
The background is painted because its color is chosen very carefully to be totally
different from the surrounding environment and easy to see, so the
background draws people's attention to the fact that there is information there. If there were no
such background and the characters simply hung in the air, people could easily ignore them.
The algorithm is built on this observation, combined with the background color that is
used in Sweden, light blue.
First, the application gets an image from the video stream.
Then the HSV algorithm is applied to process the image: the light blue color is extracted, the
light blue area is turned white, and the rest is turned black.
Next, the image is rescanned pixel by pixel from four directions: top to bottom,
bottom to top, left to right, and right to left. Each scan stops as soon as the pointer meets
a white pixel, and the position where it met that pixel is saved.
After the rescan there are four position values: topmost, bottommost,
leftmost, and rightmost. These points define the rectangle that will be the area used to
extract the characters.
The topmost and leftmost values are combined to give the top-left bound, and the
rightmost and bottommost values are used to give the bottom-right bound.
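The bounding rectangle can be sketched as below. Note the simplification, which is an assumption of this sketch: instead of four separate directional scans that each stop at the first white pixel, it makes a single pass and tracks the four extremes, which yields the same topmost, bottommost, leftmost and rightmost positions. The names `Box` and `findWhiteBounds` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Inclusive bounds of the white region; valid is false if no white pixel exists.
struct Box {
    int top;
    int bottom;
    int left;
    int right;
    bool valid;
};

// Find the extreme white pixels of a row-major binary mask (255 = white).
Box findWhiteBounds(const std::vector<std::uint8_t>& mask, int width, int height) {
    Box box{height, -1, width, -1, false};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (mask[static_cast<std::size_t>(y) * width + x] != 255) continue;
            box.top = std::min(box.top, y);
            box.bottom = std::max(box.bottom, y);
            box.left = std::min(box.left, x);
            box.right = std::max(box.right, x);
            box.valid = true;
        }
    }
    return box;
}
```

The pair (left, top) then plays the role of the top-left bound and (right, bottom) that of the bottom-right bound of the extraction area.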
Once the extraction area has been determined, the characters can be extracted.
Boards commonly carry characters in two colors, white and
black.
First, for the white characters, the HSV algorithm is applied to turn every white pixel in the
image pure white and the rest black. The image is then rescanned pixel by pixel inside the area,
from the top-left bound to the bottom-right bound. When a white pixel is found on a line, a
matrix is started; scanning continues, and when a line contains no white pixels, the
matrix is closed and kept in an array.
Second, for the black characters, the HSV algorithm is applied to turn every black pixel in the
image white and the rest black, so that the same scan for white pixels can be used. The image is
then rescanned pixel by pixel inside the area, from the top-left bound to the bottom-right bound.
When a white pixel is found on a line, a matrix is started; scanning continues, and when a
line contains no white pixels, the matrix is closed and kept
in the same array as the white characters.
Next, every matrix is popped from the array and its position values are calculated. Within those
bounds, every row is scanned from left to right, from top to bottom. If a color
change is found, a new matrix is started to save that part and is put at the bottom of the array.
If no color change is found, the matrix holds a single character and is put into another array
that only keeps characters after separation.
When all the characters have been separated, the final array holds every character.
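A simplified version of this separation can be sketched with row and column occupancy runs. This is an assumption-laden simplification rather than the exact queue-based procedure described above: it first splits the mask into horizontal bands of consecutive rows that contain white pixels, then splits each band into column runs, and treats each run as one character. The names `Rect`, `findRuns` and `extractCharacters` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Rect { int left; int top; int right; int bottom; };  // inclusive bounds

// Return [start, end] index pairs for each run of consecutive true flags.
std::vector<std::pair<int, int>> findRuns(const std::vector<bool>& occupied) {
    std::vector<std::pair<int, int>> result;
    const int n = static_cast<int>(occupied.size());
    int start = -1;
    for (int i = 0; i < n; ++i) {
        if (occupied[i] && start < 0) start = i;
        if (!occupied[i] && start >= 0) {
            result.push_back({start, i - 1});
            start = -1;
        }
    }
    if (start >= 0) result.push_back({start, n - 1});
    return result;
}

// Split a binary mask (255 = character pixel) into per-character rectangles.
std::vector<Rect> extractCharacters(const std::vector<std::uint8_t>& mask,
                                    int width, int height) {
    auto isWhite = [&](int x, int y) {
        return mask[static_cast<std::size_t>(y) * width + x] == 255;
    };
    // Step 1: rows that contain at least one white pixel form text bands.
    std::vector<bool> rowHasWhite(height, false);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (isWhite(x, y)) rowHasWhite[y] = true;

    std::vector<Rect> chars;
    for (const auto& band : findRuns(rowHasWhite)) {
        // Step 2: inside a band, columns with white pixels form characters.
        std::vector<bool> colHasWhite(width, false);
        for (int y = band.first; y <= band.second; ++y)
            for (int x = 0; x < width; ++x)
                if (isWhite(x, y)) colHasWhite[x] = true;
        for (const auto& col : findRuns(colHasWhite))
            chars.push_back({col.first, band.first, col.second, band.second});
    }
    return chars;
}
```

Each returned rectangle could then be cropped and normalized before classification; characters that touch each other would end up in one rectangle, which this sketch does not attempt to resolve.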
The figures are shown below:
Figure 2.6 original image
Figure 2.7 image process result
Chapter 3 Support vector machine
3.1 Introduction
Support vector machines are widely used for pattern classification because of their good
generalization ability compared with conventional classifiers. In support vector machines the
input space is mapped to a higher-dimensional space called the Feature Space. The aim is to
find an optimal hyperplane in this higher-dimensional feature space that separates the data in
the best way possible. Since training a support vector machine is formulated as a quadratic
optimization problem with the number of variables equal to the number of training data,
a globally optimal solution can be achieved. Among the training data, the instances
necessary for constructing the decision function are the ones closest to the class
boundary. These are called the Support Vectors.
3.2 Machine Learning
Machine Learning is a broad field of Artificial Intelligence concerned with the
development of algorithms and techniques that allow computers to learn. It has a wide
spectrum of applications including object recognition, medical diagnosis, speech and
handwriting recognition, robot locomotion, computer vision and many more. More
specifically, the goal of machine learning is to reproduce the learning and adaptation abilities
of living species in computers; put differently, to program computers to use past experience to
solve a given problem. Machine learning underwent a great deal of advancement in the late eighties
and nineties with the active research done in the fields of Artificial Intelligence and Neural
Networks. These advancements will help researchers understand the
learning behavior of humans and animals. Systems like the I-Swarm robots, which imitate the
behavior of ant colonies and perform tasks that are too difficult or unsafe for humans to
perform, and the success of the DARPA Grand Challenge show the achievements and
upcoming challenges in this field. Learning can be categorized into various types, some of which
are as follows:
Supervised learning
• Learning from examples.
• Learning by taking advice.
Unsupervised learning
• Competitive learning.
• Clustering.
• Reinforcement learning.
In the context of object recognition, machine learning aims at finding a pattern of similarity or
structure in a data set that leads to generalization of the learning system and consequently to
the identification of unknown data.
3.3 Statistical Learning Theory
Support vector algorithms are considered the first practical spin-off of statistical learning
theory. Therefore, it's important to have some insight into statistical learning theory
before going into the details of the Support Vector Machine. Statistical learning theory
addresses the fundamental issue of how to control the generalization ability of a neural network
in mathematical terms. Since SVM is a set of supervised learning algorithms, statistical
learning theory is only reviewed in that context.
There are three basic components, interrelated with each other, in a supervised learning model.
These are the environment, the supervisor, and the learning system.
The feasibility of the system depends on how much information the training set,
generated by the joint probability distribution function of environment and supervisor R(x, d),
contains for the learning system to achieve good generalization. The supervised learning problem
can be viewed as an approximation problem.
3.4 Support Vector Machine
The Support Vector Machine is a linear classifier rooted in statistical learning theory and
the very powerful kernel function, and it is increasingly used for solving classification
and regression problems. It's a linear machine closely related to classical Neural Networks;
in fact, a support vector machine with a sigmoid kernel function acts as a two-layer
feed-forward neural network. SVM is based on the concept of decision planes that define the
decision boundaries. Perhaps the easiest way to explain the main idea of a support vector
machine is to take the scenario of separating patterns that arises in the context of pattern
classification. In that case the role of the support vector machine is to draw a decision
surface, called the Hyperplane, such that the distance between the closest samples
and the hyperplane is maximized. This distance between the closest sample and the
hyperplane is known as the Margin, and the closest samples with respect to which the margin is
calculated are called the Support Vectors.
Finding a hyperplane with maximum margin is very important. It helps prevent
over-fitting and enables the system to classify unknown samples from the testing set that
come close to the hyperplane. A hyperplane with maximum margin is called the Optimal
Hyperplane.
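To make the notion of margin concrete, the sketch below computes the margin of a given hyperplane over a labeled data set as the smallest signed distance label * (w·x + b) / ||w||. It only evaluates candidate hyperplanes; it does not train an SVM. The names `Sample`, `dot` and `geometricMargin`, and the use of labels -1 and +1, are assumptions of this illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// One labeled sample; label is assumed to be -1 or +1.
struct Sample {
    std::vector<double> x;
    int label;
};

double dot(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

// Smallest signed distance of any sample to the hyperplane w.x + b = 0.
// A positive result means every sample lies on its correct side.
double geometricMargin(const std::vector<double>& w, double b,
                       const std::vector<Sample>& samples) {
    const double norm = std::sqrt(dot(w, w));
    double margin = std::numeric_limits<double>::infinity();
    for (const Sample& s : samples) {
        const double distance = s.label * (dot(w, s.x) + b) / norm;
        margin = std::min(margin, distance);
    }
    return margin;
}
```

For two samples at x = 2 (label +1) and x = -1 (label -1) on the first axis, the hyperplane through the origin has margin 1, while the one shifted to x = 0.5 has margin 1.5, so the second is the better separator in the maximum-margin sense.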
Any classification task consists of data instances divided into two sets:
• Training set: used to train the system.
• Testing set: used to test the learning of the system.
Each instance in the training set has one "target value", called the Class Label, along with
several "attributes", called Features. The task of selecting the most suitable features for
learning and testing is called Feature Selection. It's these features that help the learning
system define the hyperplane.
3.5 Two situations
3.5.1 Linearly separable problem
For this problem, all the data can be separated by a line, as the figure below shows.
Figure 3.1 linear separation model
The SVM can easily find straight lines that separate them.
According to the image:
Consider a finite set of samples from the input space
(3.1)
generated through a probability distribution function.
Xi represents a data instance from the input space X.
Di represents the corresponding output for that instance, taken from { -1, +1 }.
Optimal Margin Hyperplane:
Figure 3.2 optimal margin hyperplane
In neural terms, a hyperplane separating linearly separable data is represented by the following
equation:
(3.2)
w is the weight vector orthogonal to the hyperplane (decision surface), controlling the angular
movement (orientation) of the hyperplane.
b is the bias, controlling the movement of the hyperplane parallel to itself, relative to the origin.
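For orientation, the definitions above match the conventional textbook notation sketched below. This is an assumed standard formulation offered as a reading aid for (3.1) and (3.2), not a transcription of the original equations: the symbols \mathcal{T} for the training set and N for the number of samples are introduced here, and x_i, d_i stand for Xi and Di.

```latex
% Training set: N instances x_i from the input space X with labels d_i
\mathcal{T} = \{(x_i, d_i)\}_{i=1}^{N}, \qquad x_i \in X, \quad d_i \in \{-1, +1\}

% Separating hyperplane with weight vector w and bias b
w^{T} x + b = 0
```

In this notation a sample is classified by the sign of w^T x + b, so the two classes lie on opposite sides of the hyperplane.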
The figure below illustrates this:
Figure 3.3 calculate separation line
The formal equation
Figure 3.4 equation of separation line
To emphasize the effect of choosing the decision surface with maximum margin, let's take two
hyperplanes whose orientations allow one to have a greater margin than the other.
Training sets
Figure 3.5 separation line with training sets
Testing sets
Figure 3.6 separation line with testing sets
From this, we can easily see that the data points in GREEN are the points that come
inside the margin but are still distinguished by the hyperplane, while the data points in BLUE
are the ones that are not recognized by the hyperplane.[6]
So, we can conclude from the example figure above that some data instances came too close to
the hyperplane. The left-side hyperplane, the one with the greater margin, was able to classify
them because of its flexibility, but the hyperplane with the small margin, the one on the right
side, wasn't able to classify some of the data instances because they lie on the hyperplane. Such
a flexible hyperplane is called the Optimal Hyperplane, giving the optimal results on both the
training and the testing set.
3.5.2 Non-linearly separable problem
Most real-world problems are non-linear.[11][12]
These kinds of problems require a non-linear dividing line for separating the instances into two
classes, such as the one shown in the figure below:
Figure 3.7 non-linear separation model
This is the point where a more advanced technique for handling the situation is required, and
this is where the concept of the Kernel comes in handy. Rather than fitting a nonlinear curve to
the data set, the Support Vector Machine uses the kernel function to map the data into a
different space where a linear hyperplane can be used as the dividing line.
This higher-dimensional mapping space is called the Feature Space. The functional concept of
kernel mapping is very important and powerful. It allows SVM models to perform separations
even on data sets having very complex boundaries by using N-dimensional hyperplanes.
3.6 Kernel function Kernel defined the function to map the classes from a space that is non-liner separated to
another space that will be liner-separated.
Based on the kernel function, we can easily do training samples to get template, and input
data then get finally result.
3.7 Using a kernel function to solve the non-linear problem
A Support Vector Machine (SVM) performs classification by constructing an N-dimensional
hyperplane that optimally separates the data into two categories. SVM models are closely
related to neural networks. In fact, an SVM model using a sigmoid kernel function is
equivalent to a two-layer perceptron neural network, and SVM models are a close cousin of
classical multilayer perceptron neural networks. Using a kernel function, SVMs are an
alternative training method for polynomial, radial basis function and multi-layer perceptron
classifiers in which the weights of the network are found by solving a quadratic programming
problem with linear constraints, rather than by solving a non-convex, unconstrained
minimization problem as in standard neural network training. In the parlance of SVM
literature, a predictor variable is called an attribute, and a transformed attribute that is
used to define the hyperplane is called a feature. The task of choosing the most suitable
representation is known as feature selection. A set of features that describes one case (i.e., a
row of predictor values) is called a vector. So the goal of SVM modeling is to find the optimal
hyperplane that separates clusters of vectors in such a way that cases with one category of the
target variable are on one side of the plane and cases with the other category are on the other
side of the plane. The vectors near the hyperplane are the support vectors. The figure below
presents an overview of the SVM process.
Figure 3.8 A two-dimensional example
If all analyses consisted of two-category target variables with two predictor variables, and the
cluster of points could be divided by a straight line, life would be easy. Unfortunately, this is
not generally the case, so SVM must deal with (a) more than two predictor variables, (b)
separating the points with non-linear curves, (c) handling the cases where clusters cannot be
completely separated, and (d) handling classifications with more than two categories.
The kernel mapping functions mentioned in the previous chapter can be used, and many more
exist, probably an infinite number. But a few kernel functions have been found to work well for
a wide variety of applications. The default and recommended kernel function is the Radial Basis
Function (RBF).
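As an illustration of the recommended kernel, the following is a minimal C++ sketch of the RBF kernel in its usual textbook form, K(x, y) = exp(-gamma * ||x - y||^2). The function name rbfKernel and the parameter gamma are assumptions made for this example; the text does not state which gamma value was used.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Radial Basis Function kernel: K(x, y) = exp(-gamma * ||x - y||^2).
// gamma > 0 is a tuning parameter (hypothetical here, not taken from the thesis).
double rbfKernel(const std::vector<double>& x,
                 const std::vector<double>& y,
                 double gamma) {
    if (x.size() != y.size()) {
        throw std::invalid_argument("feature vectors must have equal length");
    }
    // Accumulate the squared Euclidean distance between the two vectors.
    double squaredDistance = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] - y[i];
        squaredDistance += d * d;
    }
    return std::exp(-gamma * squaredDistance);
}
```

Identical vectors give a kernel value of 1, and the value decays toward 0 as the vectors move apart, which is what lets a linear hyperplane in the feature space act as a curved boundary in the input space.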
Chapter 4 The Implementation
4.1 Real-time traffic signs recognition flowchart
The system is based upon four main steps (including sub-steps), plus one more step, 'Tracking',
which speeds up the search by predicting the next search region. The flow chart in the
figure depicts the final design of the real-time traffic sign recognition system:
Figure 4.1 Flow chart of the project
4.2 Application component
The image below shows the structure of the character recognition system.
Figure 4.2 The system processing flowchart
4.2.1 Implementation of background extraction module
In this part, the job is to extract the traffic sign background using HSV color segmentation.
When the characters are extracted directly, the environment contributes a lot of noise, and
this noise is very difficult to remove, or removing it takes a lot of CPU processing
resources. Extracting the traffic sign background color first produces less noise and makes it
easy to define a region in which the characters are then extracted in gray mode only.
The background color is extracted because the destination board background plays a very
important role in the character extraction part. So for every image, the background is
extracted first, as shown below, and then the image segmentation is done.
Because the board is designed to attract people's attention, it is very easy to extract the
light blue board background from the image. The method and implementation are shown in Chapter 4.2.4.
To get every image on the board, three methods are given here, which work with almost every
destination board. An array is used to save the matrices, which are then popped, checked and
separated. If the count for a matrix is 2, the result has been obtained; if not, popping,
checking and separating continue until the count is 2.
Figure 4.3 Original image
Figure 4.4 The traffic sign background color extracted using color segmentation
The color segmentation algorithm is shown below:
Calculation formula:
HSV is defined mathematically by transformations of the r, g, and b coordinates. Let r,
g, b ∈ [0, 1] be the red, green, and blue coordinates in RGB color space. Let max be the
greatest of r, g, and b, and min the least of r, g, and b. To find the hue angle h ∈ [0, 360]
for HSV, compute the following equation:
(4.2)
Figure 4.5 R image
Figure 4.6 G image
Figure 4.7 B image
Figure 4.8 RGB and H image
Figure 4.9 RGB and S image
Figure 4.10 RGB and V image
After the image has been converted from the RGB color mode to the HSV color mode, the color
range to extract can easily be found from the HSV color model.
Figure 4.11 HSV color model for finding the color range
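The RGB to HSV conversion used in this section can be sketched in C++ as follows. This is a minimal sketch of the standard max/min definition of HSV, not necessarily the exact code of the application; the struct Hsv and the function rgbToHsv are names chosen for this example only.

```cpp
#include <algorithm>

// One pixel in HSV form.
struct Hsv {
    double h;  // hue angle in degrees, in [0, 360)
    double s;  // saturation, in [0, 1]
    double v;  // value, in [0, 1]
};

// Converts r, g, b in [0, 1] to HSV using the standard max/min definition.
Hsv rgbToHsv(double r, double g, double b) {
    const double maxC = std::max({r, g, b});
    const double minC = std::min({r, g, b});
    const double delta = maxC - minC;

    Hsv out{0.0, 0.0, maxC};  // hue stays 0 for gray pixels (max == min)
    if (maxC > 0.0) {
        out.s = delta / maxC;
    }
    if (delta > 0.0) {
        if (maxC == r) {
            out.h = 60.0 * (g - b) / delta;
        } else if (maxC == g) {
            out.h = 60.0 * (b - r) / delta + 120.0;
        } else {
            out.h = 60.0 * (r - g) / delta + 240.0;
        }
        if (out.h < 0.0) {
            out.h += 360.0;  // wrap negative angles into [0, 360)
        }
    }
    return out;
}
```

For example, a pure blue pixel (r = 0, g = 0, b = 1) maps to a hue of 240 degrees with full saturation and value, which is the region of the color model where a blue board background would be searched for.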
4.2.2 Apply Shadow And Highlight Invariant Color Segmentation Algorithm
The Swedish National Road Administration defines the colors used for the signs in CMYK
color space. These values are converted into Normalized Hue and Normalized Saturation as
shown in the table below.
The color segmentation algorithm takes RGB images captured by a digital camera
mounted on a moving car. The images are converted to HSV color space. The hue, saturation,
and value are normalized into [0,255]. The HSV color space is chosen because the Hue
feature is invariant to shadows and highlights.
While normalized Hue is used as a priori knowledge by the algorithm, normalized Saturation
and Value are used to specify and avoid the achromatic subspaces in HSV color space.
When the hue value of a pixel in the input image is within the color range specified in the
table below, and the pixel is not in the achromatic area, the corresponding value in the
output image is set to white.
Table 4.1 Color space relationship with different conditions
Table 4.2 specified color table
4.2.3 Algorithm Implementation
Step 1. Convert the RGB image into HSV color space.
Step 2. Normalize the grey level of every pixel in the H image from [0,360] to [0,255].
Step 3. Normalize the grey level of every pixel in the S image from [0,1] to [0,255].
Step 4. Normalize the grey level of every pixel in the V image from [0,1] to [0,255].
Step 5. For all pixels in the H image
If (H pixel value > 240 AND H pixel value <= 255) OR (H pixel value >= 0 AND
H pixel value < 10) Then H pixel value = 255
Step 6. For all pixels in the S image
If corresponding S pixel value < 40 Then H pixel value = 0
Step 7. For all pixels in the V image
If corresponding (V pixel value < 30) OR (V pixel value > 230) Then H pixel value = 0
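The steps above can be sketched for a single pixel as follows. This is a minimal sketch under stated assumptions: Step 5 is read as keeping the two hue bands (240, 255] and [0, 10), pixels outside those bands are assumed to be set to 0 (the steps do not say so explicitly), and the function name segmentPixel is hypothetical.

```cpp
#include <cstdint>

// Applies Steps 2 to 7 to one pixel and returns 255 (white) or 0 (black).
// Inputs: hue in degrees [0, 360], saturation and value in [0, 1].
std::uint8_t segmentPixel(double hueDegrees, double saturation, double value) {
    // Steps 2-4: normalize every channel to [0, 255].
    const double h = hueDegrees / 360.0 * 255.0;
    const double s = saturation * 255.0;
    const double v = value * 255.0;

    // Step 5: keep hues in the two bands at the ends of the hue scale.
    // Assumption: hues outside the bands give 0.
    const bool hueInRange = (h > 240.0 && h <= 255.0) || (h >= 0.0 && h < 10.0);

    // Step 6: low saturation means the pixel is achromatic.
    const bool tooGray = s < 40.0;

    // Step 7: very dark or very bright pixels are achromatic as well.
    const bool tooDarkOrBright = (v < 30.0) || (v > 230.0);

    const bool keep = hueInRange && !tooGray && !tooDarkOrBright;
    return static_cast<std::uint8_t>(keep ? 255 : 0);
}
```

Running this function over every pixel of the H, S and V images yields the binary output image in which the sign color appears white. To target a different sign color, only the hue band in Step 5 would need to change.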
Application result:
Figure 4.12 applied better algorithm
Figure 4.13 result with better algorithm
4.2.5 Extraction area implementation
The destination board may stand in a very complex environment, and that environment may be
white, the same color as the characters on the board; this happens especially in Sweden.
Extracting the characters directly in such a case gives too much noise and may cause the
extraction to fail, so the background of the destination board becomes very important.
A blue object attracts human sight, and the algorithm depends on this idea.
First the blue background color is extracted to determine the area, then the coordinates of the
blue area are recorded, and finally the scanning is done inside that area. In this way the
characters can easily be found with less noise.
See the figures below:
Figure 4.14 The image after extracting the blue background
Figure 4.15 The extraction area found
In this step, the 4 coordinates are kept in an array. They are top, bottom, right and left.
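Finding the four coordinates can be sketched as a bounding-box search over the binary image produced by the color segmentation. This is a minimal sketch, assuming the mask stores 255 for blue background pixels and 0 elsewhere; the names Region and findExtractionArea are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bounding coordinates of the extraction area (inclusive pixel indices).
struct Region {
    int top;
    int bottom;
    int left;
    int right;
    bool found;  // false when the mask contains no white pixel
};

// Scans a binary mask (255 = blue background detected, 0 = everything else)
// and returns the smallest rectangle that contains every white pixel.
Region findExtractionArea(const std::vector<std::vector<std::uint8_t>>& mask) {
    Region area{0, 0, 0, 0, false};
    for (std::size_t y = 0; y < mask.size(); ++y) {
        for (std::size_t x = 0; x < mask[y].size(); ++x) {
            if (mask[y][x] != 255) continue;
            const int row = static_cast<int>(y);
            const int col = static_cast<int>(x);
            if (!area.found) {
                area = Region{row, row, col, col, true};
            } else {
                area.top = std::min(area.top, row);
                area.bottom = std::max(area.bottom, row);
                area.left = std::min(area.left, col);
                area.right = std::max(area.right, col);
            }
        }
    }
    return area;
}
```

A plain bounding box is sensitive to isolated noise pixels outside the board, so in practice the mask would probably be filtered before this search.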
4.2.4 Second Time Image Processing Module
Using the HSV color mode to extract the characters directly may give a lot of noise, which
causes big trouble for the character extraction module, so here a gray image is used to
extract the characters.
Figure 4.16 Gray image
Extract black characters
(define a threshold number; a value bigger than that number gives black, otherwise white)
Figure 4.17 The black characters found
Extract white characters
(define a threshold number; a value bigger than that number gives black, otherwise white)
Figure 4.17 The white characters found
In these two images we can still find a lot of noise, which causes trouble for the
character extraction, especially in 4.13. But in the last chapter we already defined a region,
so now we just need to apply the 4 coordinates, and it is easy to find the region shown as the
red line area on the image.
So in the next character extraction module, the application only needs to scan inside the
red line area, which gives only a little noise and reduces the calculation time and CPU
resources.
4.2.5 Character extraction module implementation
1. For the first category: the texts on the image are well arranged, row by row or line by line.
Figure 4.14 Destination board suitable for height scan
Step.1
(1). Scan every pixel from left to right and from top to bottom.
Figure 4.15 Scan the image from L to R
Figure 4.16 Scan the image from T to B
(2). If the color changes, increase the count.
(3). For this image, the width scan count is 3 and the height scan count is 1.
(4). The width count is more than the height count.
Step.2
(1). Use the width scan to separate the image and find the matrices, see the figure below:
Figure 4.17 Image after separation
(2). Put all the matrices into the array one by one, as shown below:
Figure 4.18 Prepare the array
Figure 4.19 How the image was put into the array
Figure 4.20 Put image rects into container
Step.3 Get the first matrix from the array and redo step 1 and step 2, see the figure below:
Figure 4.21 Get the first matrix
Figure 4.22 Redo step.1 and step.2 and get new matrices
Step.4 Put these new matrices into the array again, as shown below:
Figure 4.23 Put these matrices into the array again
Figure 4.24 Third round of scanning and image separation
Step.5 Then get them one by one and redo step 1. When the scanning count is equal to 1, the
matrix cannot be separated any more. If not, separate it as in step 2 and put the pieces
into the array again. When no image can be separated any more, the result is obtained.
The result is shown below:
Figure 4.25 Results
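The scan, separate and re-queue procedure of the first category can be sketched as follows. This is a simplified sketch and not the application's code: it assumes a rectangular binary image in which true marks a character pixel, counts runs of non-empty rows and columns instead of raw color changes, and uses the hypothetical names Image, Rect, findRuns and splitCharacters.

```cpp
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

using Image = std::vector<std::vector<bool>>;  // true = character pixel

struct Rect {
    int top, bottom, left, right;  // inclusive bounds
};

// Finds maximal runs of consecutive rows (alongRows = true) or columns
// (alongRows = false) inside rect that contain at least one character pixel.
std::vector<std::pair<int, int>> findRuns(const Image& img, const Rect& rect,
                                          bool alongRows) {
    std::vector<std::pair<int, int>> runs;
    const int first = alongRows ? rect.top : rect.left;
    const int last = alongRows ? rect.bottom : rect.right;
    const int crossFirst = alongRows ? rect.left : rect.top;
    const int crossLast = alongRows ? rect.right : rect.bottom;
    int runStart = -1;
    for (int i = first; i <= last; ++i) {
        bool hasInk = false;
        for (int j = crossFirst; j <= crossLast && !hasInk; ++j) {
            hasInk = alongRows ? img[i][j] : img[j][i];
        }
        if (hasInk && runStart < 0) runStart = i;
        if (!hasInk && runStart >= 0) {
            runs.emplace_back(runStart, i - 1);
            runStart = -1;
        }
    }
    if (runStart >= 0) runs.emplace_back(runStart, last);
    return runs;
}

// Splits the image into rectangles that cannot be separated any further.
std::vector<Rect> splitCharacters(const Image& img) {
    std::vector<Rect> results;
    if (img.empty() || img[0].empty()) return results;
    std::deque<Rect> pending;  // the "array" of matrices waiting to be checked
    pending.push_back(Rect{0, static_cast<int>(img.size()) - 1,
                           0, static_cast<int>(img[0].size()) - 1});
    while (!pending.empty()) {
        const Rect rect = pending.front();
        pending.pop_front();
        const auto rowRuns = findRuns(img, rect, true);
        const auto colRuns = findRuns(img, rect, false);
        if (rowRuns.empty()) continue;  // blank region, nothing to keep
        if (rowRuns.size() == 1 && colRuns.size() == 1) {
            // Count is 1 in both directions: this piece is a final result.
            results.push_back(Rect{rowRuns[0].first, rowRuns[0].second,
                                   colRuns[0].first, colRuns[0].second});
            continue;
        }
        // Separate along the direction with the larger count and re-queue.
        if (rowRuns.size() >= colRuns.size()) {
            for (const auto& run : rowRuns)
                pending.push_back(Rect{run.first, run.second, rect.left, rect.right});
        } else {
            for (const auto& run : colRuns)
                pending.push_back(Rect{rect.top, rect.bottom, run.first, run.second});
        }
    }
    return results;
}
```

Because every split produces strictly smaller rectangles, the loop always terminates. Characters made of disconnected parts would be split apart by this plain version, which is one reason a threshold is needed for the other board categories.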
2. For the second category: texts on the image together with some crosses and arrows that
interfere with the text separation.
Figure 4.26 Destination board suitable for width scan
Step.1
(1). Scan from left to right and from top to bottom.
Figure 4.27 Width scan
Figure 4.28 Height scan
(2). If the color changes, increase the count.
(3). For this image, the height scan count is 3 and the width scan count is 1.
(4). The height count is more than the width count.
Step.2
(1). Use the height scan to separate the image, see the figure below:
Figure 4.29 Image after separation
Step.3
(1). An image like this cannot give the right city name after processing.
(2). Add a threshold: if the gap between regions of different color is less than a given number,
(3). do not separate them.
Figure 4.30 Shows the threshold
Figure 4.31 After merging the bounds
Step.4
(1). Use the method introduced in the last section to get the result:
Figure 4.32 Result
3. For the third category: the image is not a normal image but an unusual one, so we should
define a threshold in the image to help separate the texts and output the correct city name.
All the texts are linked together and have different colors. The scanning method alone can
only find one big chunk. The image is shown below:
Step.1
(1). Width or height scan, but with a threshold:
Figure 4.33 Image not well arranged
Step.2
(1). To deal with this kind of image, a threshold should be applied at the beginning.
(2). For example, if the number of black pixels in a row is less than the threshold, set the
whole row to white. See the figures below:
Figure 4.34 Find threshold
Figure 4.35 Calculate according to the threshold
Step.3
(1). Apply the method shown in section 1; it should then be very easy to find the city name.
Figure 4.36 Final output of the figure shown in 4.9
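The row threshold used for the third category can be sketched as follows. This is a minimal sketch, assuming a binary image in which true marks a black pixel; the name clearSparseRows and the parameter minBlackPerRow are hypothetical, and the threshold value itself is not given in the text.

```cpp
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<bool>>;  // true = black pixel

// Sets every row whose number of black pixels is below minBlackPerRow to
// white, so that thin links between text blocks no longer connect them.
void clearSparseRows(Image& img, std::size_t minBlackPerRow) {
    for (auto& row : img) {
        std::size_t black = 0;
        for (bool pixel : row) {
            if (pixel) ++black;
        }
        if (black < minBlackPerRow) {
            row.assign(row.size(), false);
        }
    }
}
```

After this step the cleared rows act as blank gaps, so the scanning method of the first category can separate the text blocks again.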
4.2.6 Training and testing with SVM
In this part, I train on the images coming from the image processing module and test
them with the application, which links to an SVM library downloaded from the internet, using
four different kernel functions. The results are shown in chapter 5.
The figure shows how an image is processed into a feature vector.
But first, it is very important to build a character transfer table, because this SVM library
only accepts numbers for its calculations, not characters.
The table is shown below:
Number  Character
0       noises
1       a
2       b
3       c
4       d
5       e
6       f
7       g
8       h
9       i
10      j
11      k
12      l
13      m
14      n
15      o
16      p
17      q
18      r
19      s
20      t
21      u
22      v
24      w
25      x
26      y
27      z
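The transfer table can be expressed as a pair of small lookup functions. This sketch follows the table exactly as printed, in which the numbers run from 1 to 22 for a to v and from 24 to 27 for w to z, so 23 is unused. The function names are hypothetical, and the code assumes an ASCII-compatible character set.

```cpp
// Returns the number used for a lowercase letter in the transfer table,
// or 0 (noises) for anything that is not a lowercase letter.
int labelForCharacter(char c) {
    if (c < 'a' || c > 'z') return 0;
    const int position = c - 'a' + 1;  // a -> 1, b -> 2, ...
    // The printed table skips 23, so w..z are shifted up by one.
    return position <= 22 ? position : position + 1;
}

// Returns the letter for a number from the table, or '\0' for 0 (noises)
// and for numbers that do not appear in the table.
char characterForLabel(int label) {
    if (label >= 1 && label <= 22) return static_cast<char>('a' + label - 1);
    if (label >= 24 && label <= 27) return static_cast<char>('a' + label - 2);
    return '\0';
}
```

With these two functions the training data can be labeled with numbers, and the numbers predicted by the SVM can be turned back into characters for display.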
Figure 4.39 Image transferred to feature vector
Three steps:
1. Prepare input data for training (samples).
2. Train on these data and get the result file.
3. Predict based on the result file and get the output.
Step 1. Training data
If the problem is a linear problem, we can apply the linear function.
Because the data were not linearly separable, a kernel function was applied to separate them.
(4.3)
Table: the kernel functions (RBF is the most common method and was applied here)
Figure 4.40 Training data based on kernel function
The images are transferred into feature data for training, as shown in the figure below:
Figure 4.41 Training data
The template obtained after processing the training data:
Figure 4.42 The template
Figure 4.43 Template file summary
Step 2. Input data
The images are transferred into feature data for testing, as shown in the figure below:
Figure 4.44 Testing data
Step 3. Get the output file that shows the result
Figure 4.45 Output (0 means noises, 1 means character a)
Chapter 5 Analysis and result
5.1 Analysis of the application
5.1.1 Analysis of the character extraction part
5.1.2 Analysis of HSV color model image
1. Advantages of converting to HSV
The HSV color space is quite similar to the way in which humans perceive color. The other
models define color in relation to the primary colors. The colors used in HSV are clearly
defined by human perception, which is not always the case with RGB or CMYK.
Hue plays the central role in the color detection because it is invariant to variations in
light conditions: it is scale invariant, shift invariant and invariant under saturation changes.
The HSV model has been very helpful in resolving the problems of shadows and highlights and
the chromatic variation of daylight. For example, a faded image is considered as one
with low saturation; the saturation value of that color can be tuned according to the weather
conditions. Therefore it is able to preserve the maximum image information.
2. Problems with hue in HSV Color Space
The hue coordinate is unstable, and small changes in RGB cause strong variations in hue.
It suffers from three problems, as stated by Fleyeh:
When the intensity is very low or very high, the hue is meaningless.
When the saturation is very low, the hue is meaningless.
When the saturation is less than the threshold value, the hue becomes unstable.
5.1.3 Analysis of the character extraction algorithm
The environment is special, especially in Sweden: the characters on the destination board
are normally white, but in the Swedish winter everything is white. So using the HSV
method directly to extract the white characters gives a lot of noise. In this application,
all the extracted images are of size 30*30, and each is then normalized to fill the whole image.
Here comes a problem: for the character l, the full image is a black square, but for many small
black spots (noise) it is also a black square, which causes a recognition problem.
So it is very important to keep the noise as low as possible. Extracting the blue
background to determine the area to be extracted still leaves noise, so here we use
region growing to filter it a second time.
The extraction is divided into two parts.
The first part extracts the area with the full city name.
Shown below:
Figure 5.1 First part extraction
After these areas are obtained, they are put into an array.
Second part
The areas are taken out of the array and split recursively into individual characters.
The processing time of the first part is linear; it is easy to calculate and estimate.
The time consumption is shown below:
Figure 5.2 First processing time growing
For the second part, nobody can know the structure of the city names in advance, so
recursion is used to finish the splitting. The time spent is therefore non-linear: if there
are too many characters inside a city name, or arrows, crosses or similar objects, processing
them consumes a lot of time.
The time consumption is shown below:
Figure 5.3 Second processing time growing
5.1.4 Analysis of the noise filter algorithm
Here I have applied the noise filter algorithm. This algorithm does filter out a lot of noise.
Its time consumption is linear and easy to control. For one testing image, the
difference can be seen in the figure shown below:
Figure 5.4 Noise filter
The red columns show the noise that exists in the image. From the image it is easy to see that
the noise is reduced a lot when this filter is applied. Because the time consumption is linear,
users can decide to filter the image more times and with different sizes, which can give a
better result.
5.1.5 Result
1. Image result
The algorithm used in this part can separate every character with good accuracy.
But for some images this algorithm still needs a threshold to be defined, because the
distance at which the camera captures the destination board varies. When the destination
board is far away, the threshold may give a wrong result.
So under complex destination board conditions, there may be only one or two frames in the
whole video stream for which the application gets a perfect result.
The application is tested here with ten images:
Image1:
Figure 5.1 image processing
Figure 5.2 character result
Image2:
Figure 5.3 image processing
Figure 5.4 character result
Image3:
Figure 5.5 image processing
Figure 5.6 character result
Image4:
Figure 5.7 image processing
Figure 5.8 character result
Image5:
Figure 5.9 image processing
Figure 5.10 character result
Image6:
Figure 5.11 image processing
Figure 5.12 character result
Image7:
Figure 5.13 image processing
Figure 5.14 character result
Image8:
Figure 5.15 image processing
Figure 5.16 character result
Image9:
Figure 5.17 image processing
Figure 5.18 character result
Image10:
Figure 5.19 image processing
Figure 5.20 character result
2. Result analysis
This character application runs on:
CPU:
OPERATING SYSTEM: Windows 7 64-bit
The processing time of the algorithm:
Because every image is different, the data that the application processes differ as well, so
some images and some parts of the video stream are very slow or take a long time to
process. The average time of the core calculation of this application is 4.8 ms, and the
time complexity of the algorithm should be O(n log n).
Image Nr   Time     Total  Extract  Failed  Rate
Image 1    0.578s   54     54       5       0.90
Image 2    0.328s   72     72       11      0.84
Image 3    0.429s   24     24       0       1.00
Image 4    0.414s   11     5        6       0.54
Image 5    0.371s   51     51       12      0.76
Image 6    0.489s   15     7        8       0.46
Image 7    0.398s   18     11       7       0.61
Image 8    0.542s   9      9        0       1.00
Image 9    0.372s   39     15       24      0.38
Image 10   0.401s   25     19       6       0.76
The accuracy rate of the extracted characters:
Overall, the extraction accuracy rate for normal destination boards can reach about 70%.
For complex destination boards, the extraction depends on the threshold.
This application uses the threshold to make sure that every character can be
extracted even when there are arrows and crosses among them. So in some special cases it
fails to extract them, or gives an inaccurate result.
The figures are shown below:
Figure 5.21 The slope background
For the image shown, it is very difficult for the application to extract ENKOPING. This is
because, counting black pixels from the top down to the characters ENKOPING, the structure is
different but the total black pixel counts are close, so the application cannot extract
ENKOPING.
Figure 5.22 I and l character problem
From the figure shown here it is easy to see that, because the application extracts the
characters and resizes them to the full region (30*30 in this application), the characters
I and l become a totally black square, which is the same as some noise extracted from
the image. So this causes a big problem for the character recognition part.
Figure 5.23 The problem the threshold makes
From the image above, people would easily take it for a character O, but in fact it is not a
character O; it is a Swedish character, which should be an O with two dots, but the two
dots are missing. This problem is caused by the splitting threshold we set.
So there should be a balance that makes both cases work; if not, the system
gives errors.
Figure 5.24 the light problems
Due to the environment lights, maybe this problem is not produced by this processing
application, but I still should motion them here, that is the light, the image showed up, was a
dark image, and the white character color gray level is different from the background, so it is
no problem to extract, but for the black characters, the color gray level is closed to the
background color level, so extract the black characters, was very difficult, that can leads to
two different situations, one is the character extracted was very fuzzy, even human with good
logical was difficult to recognize them, and another situation is will make them a totally white
-
Qiu Peng Masters September, 2010
E3986D
Dalarna University Tel: +46(0)23 7780000 Röda vägen 3S-781 88 Fax: +46(0)23 778080 Borlänge Sweden http://www.du.se
66
chunk, no one knows what was that.(the figures shows how the image processed was before
this section figure 5.17)
Figure 2.25 special character combination
As the image above shows, unusual spellings and special characters, such as V and A placed
next to each other, can cause problems. This application has no method to solve this kind of
problem directly. Training such a pair as a special combination and then outputting two
characters might solve it.
5.1.6 Character recognition
5.1.7 SVM applied here
SVM was applied in this application because:
1. For empirical risk minimization, SVM is better than NN.
2. For an NN it is very difficult to decide the number of hidden layers, whereas for SVM the
kernel functions are given and need no further changes.
Figure 5.25 characters for testing from the image
Every image shown above is scanned from top to bottom and from left to right. Each pixel is
given a number, starting with the first pixel; if the pixel value is 0, a 0 is put after the
number, and if it is 255, a 1 is put after it.
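The conversion can be sketched as follows. The sketch assumes that the pixel numbers start at 1, that number and value are joined as number:value, and that each line starts with a numeric class label; these details, the Image type and the function name are assumptions and are not taken from the application.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical image type: each pixel is either 0 (black) or 255 (white).
using Image = std::vector<std::vector<int>>;

// Builds one text line "label 1:v 2:v ..." by scanning the rows from top to
// bottom and each row from left to right. A pixel of 255 is written as 1,
// a pixel of 0 as 0.
std::string toFeatureLine(const Image& img, int label) {
    std::ostringstream out;
    out << label;
    int index = 1;  // assumed: numbering starts at 1
    for (const auto& row : img)
        for (int pixel : row) {
            out << ' ' << index << ':' << (pixel == 255 ? 1 : 0);
            ++index;
        }
    return out.str();
}
```

For a 30*30 character image this gives 900 numbered values per line, one line per training or testing sample.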
Figure 5.26 training sets
Training then produces the template file, which is used to predict from the testing sets, as
the figure below shows.
Figure 5.27 testing sets
Then we get the result. Here the label '1' stands for 'p'.
The accuracy rate is tested with all four kernel functions.
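For reference, the four kernel functions can be written as the short sketch below. It uses the forms commonly documented for SVM libraries such as LIBSVM; the parameter names gamma, coef0 and degree are assumptions, since the parameter values used in the tests are not stated here.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dot product of two vectors of the same length.
double dot(const Vec& u, const Vec& v) {
    double sum = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) sum += u[i] * v[i];
    return sum;
}

// Linear kernel: u . v
double linearKernel(const Vec& u, const Vec& v) {
    return dot(u, v);
}

// Polynomial kernel: (gamma * u . v + coef0)^degree
double polynomialKernel(const Vec& u, const Vec& v, double gamma, double coef0, int degree) {
    return std::pow(gamma * dot(u, v) + coef0, degree);
}

// RBF kernel: exp(-gamma * |u - v|^2)
double rbfKernel(const Vec& u, const Vec& v, double gamma) {
    double dist2 = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) {
        const double d = u[i] - v[i];
        dist2 += d * d;
    }
    return std::exp(-gamma * dist2);
}

// Sigmoid kernel: tanh(gamma * u . v + coef0)
double sigmoidKernel(const Vec& u, const Vec& v, double gamma, double coef0) {
    return std::tanh(gamma * dot(u, v) + coef0);
}
```

Only the linear kernel keeps the decision boundary linear in the original pixel space; the other three make a non-linear boundary possible.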
5.1.8 Test with linear function
Image Nr    270 image correctness    570 image correctness
Image 1     32%                      27%
Image 2     35%                      38%
Image 3     46%                      43%
Image 4     34%                      36%
Image 5     41%                      32%
Image 6     43%                      44%
Image 7     41%                      41%
Image 8     32%                      33%
Image 9     27%                      19%
Image 10    33%                      33%
5.1.9 Test with polynomial function
Image Nr    270 image correctness    570 image correctness
Image 1     68%                      72%
Image 2     62%                      64%
Image 3     72%                      77%
Image 4     69%                      71%
Image 5     73%                      73%
Image 6     73%                      78%
Image 7     75%                      62%
Image 8     77%                      79%
Image 9     69%                      72%
Image 10    64%                      62%
5.1.10 Test with RBF function
Image Nr    270 image correctness    570 image correctness
Image 1     72%                      77%
Image 2     74%                      75%
Image 3     76%                      79%
Image 4     72%                      75%
Image 5     80%                      80%
Image 6     73%                      73%
Image 7     75%                      78%
Image 8     79%                      82%
Image 9     72%                      75%
Image 10    77%                      79%
5.1.11 Test with sigmoid function
Image Nr    270 image correctness    570 image correctness
Image 1     62%                      64%
Image 2     67%                      69%
Image 3     64%                      65%
Image 4     63%                      55%
Image 5     62%                      54%
Image 6     63%                      71%
Image 7     67%                      62%
Image 8     70%                      70%
Image 9     69%                      75%
Image 10    77%                      78%
The results show that, because the character images are neither very big nor very clear, the
accuracy is not very high. Image recognition is a non-linear problem, so training and
predicting with the linear kernel function gives mostly wrong output. Of the other three
kernel functions (RBF, polynomial and sigmoid), RBF gives the best output here.
As for the number of training images, the results show that training with more sample vectors
sometimes gives lower accuracy. The kernel function maps the non-linear space into a
higher-dimensional space and tries to find a linear separation there; with too many sample
vectors it can become very difficult to find an accurate linearly separating space, so the
accuracy can be lower than with fewer training vectors.
So, choosing the kernel function that suits the type of problem, together with the best number
of training vectors for that kernel function, will give a very good result.
Chapter 6 Conclusion and future works
6.1 Conclusion
To conclude the thesis report, the real-time city name recognition system presented here
has been a success. The aim, to output the right city name, which is the necessary
parameter for the real-time environment, has been accomplished. The average accuracy
achieved by the application is about 60%.
1. Image segmentation was found to be the most critical task of the whole project, because of
the illumination conditions, especially highlights and shadows. The segmentation
parameters always need to be tuned during the process. However, the image segmentation
done in the project was satisfactory.
2. Noise filtering with the multiple median filter has been very efficient, as it reduces the
number of objects to be treated as candidate traffic signs for recognition. The multiple
median filter does not harm the internal information carried by the traffic sign, which can
be used for the classification. It is therefore well suited to a traffic sign recognition
system in real time.
3. Character extraction is also an important part of my thesis. Using a width and height scan
to extract the characters from the image only works with regular images like those shown
before. If the image is very complex, some thresholds should be added, or only the useful
frames from the video stream should be processed, not all of them.
4. The character extraction part still has problems with noise and with some images,
especially those with a white frame inside the blue background. If such an image is slanted,
the extraction of every character may fail and a big black chunk is output instead.
5. The support vector machine is very powerful for classifying characters and gives a high
recognition rate. One disadvantage compared with neural networks is that the noise images
must also be trained and given a feature space, so that the system can recognize them as
noise.
6. And for this SVM algorithm, first I have used another algorithm, that is save all the image in an array, and