Carnegie Mellon University Pittsburgh
Clear Path Detection
A thesis submitted in partial satisfaction of the requirements for the degree of
Master of Science in the Department of Electrical and
Computer Engineering
By
Qi Wu
2008
This is to certify that the thesis of Qi Wu has been approved.

____________________________________
Prof. Tsuhan Chen, Advisor

____________________________________
Prof. Marios Savvides
Carnegie Mellon University, Pittsburgh
2008
Acknowledgments
I would like to express my sincere gratitude to my advisor, Prof. Tsuhan Chen, for his
guidance, encouragement, and support during my graduate study. His knowledge, kindness,
patience, enthusiasm, and vision have brought out the best in me during my M.S. study
and provided me with lifelong benefits. He taught me how to create valuable ideas,
perform research, communicate efficiently, and work with team members. I feel very
fortunate to have joined his research group and to be one of his students at CMU.
I would also like to extend my thanks to General Motors for providing me the opportunity
to work on this challenging project, and to Dr. Wende Zhang for his continuous
encouragement and support of my GM internship and M.S. thesis work. His guidance
and support have played a crucial role in the completion of my work.
I also thank my wonderful AMP lab mates at CMU: Kate Shim, David Liu, Amy Lu, Devi
Parikh, Andrew Gallagher, Yao‐Jen Chang, Yimeng Zhang, Wei Yu, and Congcong Li; and my
best friends, Yi Zhang, Le Xie, Xiaoqian Jiang, and Sicun Gao, for supporting my research
and sharing the happy times.
Finally, I express my sincere love and appreciation to my parents, for their constant
encouragement, support, and sacrifice throughout what have been two wonderful years
of my life.
Abstract
According to statistics from the Department of Transportation, most accidents
are caused by the carelessness of drivers who are unaware of obstacles
ahead. A warning system that indicates a clear path in front of the vehicle can assist
drivers in avoiding potential dangers and prevent accidents. In this project, a
novel, robust algorithm is proposed to detect the clear path in real time, with the
specific goal of incorporating such smart technology into General Motors
cars so as to alert drivers to any obstacle within a safe frontal distance while
driving.
Unlike traditional approaches, which try to build various
detectors to catch all types of obstacles on the road, the proposed framework
focuses on indicating the clear path ahead. We assume the video camera is
calibrated (intrinsic and extrinsic parameters known) and that the vehicle
information (speed and yaw angle) is available at all times. Making good
use of this prior knowledge of the scene, we first generate perspective
patches, inverse-projected from world coordinates, for extracting features
instead of traditional 2D patches on the frame. Second, the proposed
framework exploits each patch's spatial and temporal constraints
to build a probabilistic refinement on top of the initial statistical
learning from training data, which enhances both performance and
speed.
Finally, the proposed framework is verified through experimental study,
demonstrating its robustness and efficiency. Even in challenging situations
such as shadows and illumination changes, it achieves good performance.
Table of Contents

1. Introduction
   1.1 Motivation
   1.2 Related Work
   1.3 Overview of Proposed Work
   1.4 Thesis Structure
2. Perspective Patch Generation
   2.1 2D Patch Generation
   2.2 Perspective Patch Generation
       2.2.1 Pinhole Camera Model
       2.2.2 Generate Perspective Patches
3. Feature Extraction
   3.1 Filter Bank
       3.1.1 Leung-Malik (LM) Filter Bank
       3.1.2 Gabor Filter Bank
   3.2 Feature Representation
4. Feature Selection
   4.1 Weak Classifier
   4.2 Adaboost Feature Selection
5. Learning Algorithm
6. Patch‐based Refinement
   6.1 Spatial Patch Smoothing
   6.2 Temporal Patch Smoothing
   6.3 Patch‐based Refinement
7. Experiments
   7.1 Data Description
   7.2 Feature Preparation
   7.3 Data Result
8. Conclusion and Future Work
Reference
1. Introduction
1.1 Motivation
Every day, many traffic accidents involve cars or pedestrians being hit. The majority of
these accidents occur when either a pedestrian appears suddenly or a car is traveling
too fast to stop while a frontal obstacle is close. In either case, the driver's inability
to notice the obstacle in time causes the trouble. It is therefore very important
to keep a safe following distance clear. Doing so not only enables the driver to react to a
problem ahead without a panic stop, which could cause a following driver to
crash, but also reduces the probability of crashing into a frontal car that stops or
slows suddenly. The Pennsylvania driver's manual identifies a 4-second following distance
as the safe following distance (Figure 1.1(A)). Under normal driving conditions, this predefined
distance ahead of the car allows drivers to steer or brake to avoid a hazard
safely. Hence, a system (Figure 1.1(B)) that detects any obstacle within this safe distance can
assist drivers in avoiding potential dangers and reduce accidents caused by
carelessness, cell phone usage, drowsiness, and drinking.
Figure 1.1 Safe following distance. (A) Safe distance defined by the PA driver's manual. (B) Clear path detection system.
1.2 Related Work
In previous work, many obstacle detection methods (e.g., pedestrian
detection, vehicle detection) have been built to predict potential dangers (Figure 1.2).
Figure 1.2 Pedestrian detection and vehicle detection
The most common approaches to obstacle detection use active sensors [1], such as
radar (millimeter-wave) [2][3][4] and laser (Lidar) [5][6][7]. Radar
transmits radio waves into the atmosphere, and obstacles reflect some of the power
back to the radar's receiver. Lidar (Light Detection and Ranging) likewise transmits and
receives electromagnetic radiation, in the ultraviolet, visible, and infrared regions.
Prototype vehicles employing active sensors have shown promising results in this work.
Furthermore, in the nationwide autonomous vehicle competition, the DARPA Urban
Challenge 2007, radar and Lidar were used by all teams and performed well
for obstacle detection and road navigation, which demonstrates their robustness and efficiency
[8].
However, active sensors have several drawbacks: low resolution, short range, potential
interference with each other, and considerable expense. On the other hand, passive sensors
such as cameras, combined with computer vision algorithms, offer a more affordable
solution and can track obstacles more effectively. In the last decade, various
approaches have been reported in the literature. For vehicle detection, Bertozzi et al.
[9] and Zhao et al. [10] used stereo-vision-based methods to detect vehicles and obstacles.
In Matthews et al. [11], PCA was used for feature extraction and neural networks for
detection. Goerick et al. [12] used a method called Local Orientation Coding to extract
edge information to feed a neural network classifier. Betke [13] used motion and edge
information to hypothesize vehicle locations and template matching for detection. In
the work of Schneiderman [14], the statistics of obstacle appearance and non-obstacle
appearance were represented using two histograms, each representing the
distribution of wavelet-coefficient codewords.
For pedestrian detection, wavelet responses [15] and histograms of oriented gradients
[16] have been used to learn shape-based models for detecting humans. Viola et
al. [17] adopted spatial-temporal filters based on shifted frame differences to augment
pedestrian detection using spatial filters alone, thus using both motion and appearance for
detection. Fablet and Black [18] used optical flow to learn a generative human-motion
model, while Hedvig [19] trained a Support Vector Machine for detection.
However, generic detection that covers all kinds of obstacles is a hard problem in
computer vision. The intra-class variability of the obstacle class makes the detection
process very challenging: obstacles can appear in different colors, textures, and shapes,
and their movement through the perspective scene typically produces notable variability
in appearance and articulation. Variations in lighting conditions and fluctuations of
weather further complicate the problem. Therefore, due to this variability, it is difficult
to use a uniform detector for all obstacles. Figure 1.3(a) illustrates this: in feature
space, the features extracted from different kinds of obstacles (e.g., pedestrians, vehicles)
are so dispersed that several boundaries are needed to separate them. The clear path,
by contrast, has more clustered features, needs only one boundary to be separated from
everything else, and is easy to classify.
Figure 1.3 Feature spaces: (a) the feature space of previous obstacle detection; (b) the feature space of clear path detection.
Thus, in our approach, instead of designing various obstacle detectors, we build a
clear path detector to determine the clear path ahead of the car directly. As shown
in Figure 1.3(b), the path features, having less variation in color, texture, and shape,
are more clustered and easy to separate from the obstacle features.
1.3 Overview of Proposed work
Figure 1.4 The overview of clear path detection
The clear path detection system that we propose learns to differentiate the clear path from
other obstacles. Since the camera moves with the host vehicle, it is not easy to distinguish
the two from motion alone. Thus, we use appearance, namely texture and shape, as the
features to detect the clear path in a single frame. Figure 1.4 gives an overview of the system,
comprising several components: perspective patch generation, feature extraction
and selection, SVM learning, and patch-based refinement. Compared to traditional
detection algorithms, our proposed method has two novelties:
1) Perspective Patches
Since the camera satisfies the perspective projection model, it is improper to build
patches on the 2D frame directly. In our method, we generate 3D patches in
world coordinates as shown in Figure 2.3(A); the pinhole camera model and
calibration parameters are used to inverse-perspective-map them onto the 2D frame.
These patches not only save computation but also reduce the
ambiguity within each patch, which improves the overall performance.
2) Patch‐based refinement with spatial and temporal consideration
Within a frame, neighboring patches constrain each other because of spatial
continuity. Between neighboring frames, a patch in the current frame is constrained
by its corresponding region in the previous frame because of temporal continuity.
Hence, we can use these constraints to refine the initial learning result. In our
method, two types of patch smoothing, considering spatial and temporal
information separately, are proposed. Spatial patch smoothing enforces
the constraint that neighboring patches with similar textures should
have similar probabilities of being clear path or obstacle. Temporal patch
smoothing enforces a patch's appearance constraint and probability-consistency
constraint between the current frame and previous frames. Finally, combining these
two smoothings, a maximum likelihood estimator optimizes all the factors and
improves the overall performance.
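Chapter 6 develops the actual estimator; purely to illustrate the spatial-smoothing idea above, here is a minimal sketch in which each patch's clear-path probability is pulled toward a texture-similarity-weighted average of its 4-neighbours. The function name, the Gaussian similarity weight, and the iteration scheme are illustrative assumptions, not the thesis's formulation.

```python
import numpy as np

def spatial_smooth(prob, feats, sigma_f=1.0, n_iter=5):
    """Iteratively blend each patch's clear-path probability with its
    4-neighbours, weighting neighbours by texture similarity (sketch).

    prob:  (R, C) initial clear-path probabilities per patch.
    feats: (R, C, F) texture features per patch.
    """
    R, C = prob.shape
    p = prob.copy()
    for _ in range(n_iter):
        new = p.copy()
        for r in range(R):
            for c in range(C):
                num, den = p[r, c], 1.0
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < R and 0 <= cc < C:
                        # weight each neighbour by texture similarity
                        d2 = np.sum((feats[r, c] - feats[rr, cc]) ** 2)
                        wgt = np.exp(-d2 / (2 * sigma_f ** 2))
                        num += wgt * p[rr, cc]
                        den += wgt
                new[r, c] = num / den
        p = new
    return p

probs = np.full((3, 3), 0.9)
probs[1, 1] = 0.1                  # one outlier patch
feats = np.zeros((3, 3, 4))        # identical textures everywhere
smoothed = spatial_smooth(probs, feats)
# With identical textures, the outlier is pulled toward its neighbours.
```

With identical textures the weights are all 1 and the update reduces to plain neighbourhood averaging; dissimilar textures (e.g. a road/obstacle boundary) would get small weights and be smoothed less.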
1.4 Thesis Structure
The rest of the thesis is organized as follows. Chapter 2 outlines the process of
perspective patch generation using the pinhole camera model and calibration parameters.
Chapters 3 and 4 introduce feature extraction and feature selection, whose goal is
to find the most discriminative features for representing the patches. Chapter 5 describes
the learning process, and Chapter 6 proposes a novel refinement using spatial and
temporal information to improve the overall performance. Chapter 7 details the
experimental results, followed by a discussion and future work in Chapter 8.
2. Perspective Patch Generation
In this chapter, we describe the method of generating detection patches from the input
frames. Compared to using pixels directly, patches as the detection unit
improve computational efficiency. There are two well-known kinds of patches on 2D
images that do not consider any perspective information: fixed-grid patches and dynamic-size
patches. In our proposed method, however, since the calibration information is known, we can
build our grid patches in 3D world coordinates and inverse-perspective-map them
into 2D for extracting texture from the input frame. This focuses the detection region
on the frontal path without the influence of other parts of the scene, which greatly
increases the speed and accuracy of our algorithm.
2.1 2D Patch Generation
Figure 2.1 Fixed-size patches. (A) Detection region with small patches. (B) Detection region with large patches. The blue circled patch covers both clear path and obstacles.
As shown in Figure 2.1, in previous work [14] the input image is divided into
small patches of equal size; features are extracted from each patch and fed to the
classifiers. However, the patch size is difficult to define. If the patches are too small,
as in Figure 2.1(A), computation increases greatly and fewer
discriminative features can be extracted from each patch. But if the patches are too large,
as in Figure 2.1(B), each covers too much information, even crossing different classes
(in Figure 2.1(B), the blue circled patch covers both clear path and obstacles). That makes
the patches too ambiguous to classify correctly.
Figure 2.2 Dynamic size patch.
To address the limitation of the fixed grid, previous work such as [17]
proposed scanning the whole image with sub-windows (patches) of dynamic size, as shown in
Figure 2.2. To detect the presence of a clear path in a given image sequence, each
frame is scanned with sub-windows of different sizes at various positions, and every
candidate sub-window is tested against the learned classifier. However, this wastes much
computation and time testing improperly sized sub-windows, redundant
overlapping patches, and parts of the frame unnecessary for clear path detection, such as the sky and
objects beside the road.
2.2 Perspective Patch Generation

In clear path detection, the detection region lies on the ground ahead of the
camera, which satisfies the perspective projection model: the far-off road and
obstacles appear smaller, while those close to the camera appear much bigger. Given this,
it is improper to build patches of a single fixed size, and, as discussed above, it is
inefficient to build dynamic-size patches. In our proposed
method, instead of building patches on the 2D image directly, we construct the patches in
3D world coordinates as shown in Figure 2.3. We first define the detection region in
world coordinates and divide this region into small patches. Then, the calibration
parameters and the pinhole camera model are used to inverse-perspective-map them onto the 2D
image as a prior. Finally, we construct the 2D patches on the frame based on this
prior; these are called perspective patches.
Figure 2.3 Perspective patches. (A) Patches in world coordinates. Note the 3D patches are not of equal size. (B) 3D patches inverse-mapped onto the 2D frame. (C) Perspective patches.
2.2.1 Pinhole Camera Model

In the general pinhole camera model (Figure 2.4), the camera is assumed to perform a
perfect perspective transformation. Let $m = [u, v, 1]^T$ be the projective image point of the
world point $M = [x, y, z, 1]^T$, both expressed in homogeneous coordinates. They satisfy:

$$\mu m = P M$$

where $P$ is a $3 \times 4$ projection matrix describing the perspective projection process and $\mu$ is
an arbitrary scale factor. The projection matrix can also be decomposed as:

$$P = K [R \mid T]$$
Here $K$ is the matrix of intrinsic parameters, composed of the focal length $f$ and the camera
optical center (principal point) $[u_0, v_0]$:

$$K = \begin{bmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix}$$

and $[R \mid T]$ denotes the orientation and position of the camera with respect to the world
coordinate system:

$$R = R_x(\alpha)\, R_y(\beta)\, R_z(\gamma) =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix}
\begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}
\begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
T = \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}$$

In our case, with both the intrinsic and extrinsic calibration parameters known, we project the
grid patches in 3D coordinates onto the 2D frame to synthesize the perspective patches.
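As a concrete illustration of the projection $\mu m = K[R \mid T]M$, a minimal sketch follows; the calibration numbers here are made up for the example, not the thesis's actual parameters.

```python
import numpy as np

def project_point(M_world, K, R, T):
    """Project a homogeneous 3D world point onto the image plane.

    Implements mu * m = K [R | T] M from the pinhole model above.
    Returns pixel coordinates (u, v).
    """
    P = K @ np.hstack([R, T.reshape(3, 1)])   # 3x4 projection matrix
    m = P @ M_world                           # homogeneous image point
    return m[:2] / m[2]                       # divide out the scale mu

# Illustrative calibration (assumed values):
f, u0, v0 = 800.0, 320.0, 240.0
K = np.array([[f, 0, u0],
              [0, f, v0],
              [0, 0, 1.0]])
R = np.eye(3)                  # camera aligned with world axes
T = np.array([0.0, 0.0, 0.0])  # camera at the world origin

u, v = project_point(np.array([1.0, 0.5, 10.0, 1.0]), K, R, T)
# A point 10 m ahead and 1 m to the side lands offset from the
# principal point by f * x / z = 80 pixels horizontally.
```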
Figure 2.4 Pinhole Camera Model
2.2.2 Generate Perspective Patches

Because the scene in the frame is always perspective, obstacles that are too far from
the camera project to regions too small to be detected. Therefore,
as shown in Figure 2.3(A), we first define the detection region in the world coordinate system as
a rectangle 9 meters wide by 25 meters long in front of the camera. Second, this
detection region is divided into small patches. It is interesting to note that the patches
are not of equal size: the farther patches are much bigger than the nearer ones (compare
the 8-meter length of the farthest row to the 2-meter length of the nearest). This is because we must keep the
projected 2D patches large enough to be detected (at least a 10-pixel square, from which we can
extract enough discriminative features). Then, the calibration parameters and the pinhole camera
model are used to inverse-map the 3D patches onto the 2D frame, as shown in Figure 2.3(B).
However, the projected 2D patches on the frame are arbitrary quadrilaterals, and it is time-consuming
to extract their features. Thus, we normalize the projected arbitrary quadrilateral
patches (Figure 2.5) and synthesize grid patches from them, as shown in Figure 2.3(C);
these are called the perspective patches.
Figure 2.5 Normalizing arbitrary quadrilateral patches and synthesizing perspective patches.
Using the proposed perspective patches, we not only reduce the computation and
ambiguity of each detection patch, but also make it possible to consider the
temporal relationship between neighboring frames and to build a refinement on top of it.
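To make the construction concrete, a sketch of the two steps, dividing the ground region into rows that lengthen with distance and projecting ground points through a simple pinhole camera, follows. The row boundaries, camera height, and focal length are illustrative assumptions, not the thesis's calibration; the real system uses the full K[R|T] model of Section 2.2.1.

```python
import numpy as np

def ground_grid(width=9.0, depths=(5.0, 7.0, 9.0, 12.0, 17.0, 25.0), cols=3):
    """Divide a ground rectangle ahead of the camera into 3D patches.

    Rows grow with distance (roughly 2 m long nearby up to 8 m at the
    far end, as in the thesis) so every projected 2D patch stays large
    enough for feature extraction. Depth breakpoints are illustrative.
    Returns patches as 4 corner points (x, z) on the ground plane,
    camera looking along +z.
    """
    xs = np.linspace(-width / 2, width / 2, cols + 1)
    patches = []
    for z0, z1 in zip(depths[:-1], depths[1:]):
        for x0, x1 in zip(xs[:-1], xs[1:]):
            patches.append([(x0, z0), (x1, z0), (x1, z1), (x0, z1)])
    return patches

def project_ground(pt, f=800.0, u0=320.0, v0=240.0, cam_height=1.2):
    """Map a ground point into the image (simple pinhole with the
    optical axis parallel to the ground; assumed geometry)."""
    x, z = pt
    return (u0 + f * x / z, v0 + f * cam_height / z)

patches_3d = ground_grid()
patches_2d = [[project_ground(c) for c in p] for p in patches_3d]
# Far patches, though larger in metres, project to smaller quadrilaterals.
```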
3. Feature Extraction
This chapter introduces the feature extraction method. As in the human visual system,
texture, color, and shape are usually extracted from each patch. In our case, however, both the
clear path and obstacles can appear in all kinds of colors under various lighting
conditions, so we do not adopt any color features. Instead, the texture and shape
of the clear path and of obstacles differ enough to serve as good
discriminative features. We therefore characterize texture and shape by their responses
to a set of orientation- and spatial-frequency-selective linear filters (a filter bank).
3.1 Filter Bank
3.1.1 Leung-Malik (LM) Filter Bank

The Leung-Malik (LM) filter set is a multi-scale, multi-orientation filter bank which
contains first and second derivatives of Gaussian filters (edge and bar filters), Laplacian of
Gaussian (LOG) filters, and Gaussian filters (spot filters), 48 filters in total. In our
implementation, we extend the original filter bank with more orientations for the
Gaussian derivatives and more scales for the LOG filters. As shown in Figure 3.1, our
filter bank consists of first and second derivatives of Gaussians at 9 orientations and
3 basic scales $\sigma \in \{\sqrt{2}, 2, 2\sqrt{2}\}$ with an elongation factor of 3 (i.e. $\sigma_x = \sigma$ and
$\sigma_y = 3\sigma$), for a total of 54 filters. Moreover, with the basic scales
$\sigma \in \{\sqrt{2}, 2, 2\sqrt{2}, 4\}$, the Gaussians and LOG filters occur at scales from $\sigma$ to $3\sigma$ separately, for 24 more
filters. Therefore, there are 78 filters in our extended LM filter bank. Figure 3.1(A)
demonstrates a visualization of the entire extended LM filter bank.
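A sketch of how the oriented Gaussian-derivative (edge and bar) filters of such a bank can be generated; the filter size and zero-mean normalization are assumptions, and the Gaussian and LOG spot filters of the full bank are omitted here.

```python
import numpy as np

def oriented_gaussian_derivative(size, sigma, theta, order):
    """First- or second-order Gaussian derivative filter, elongated by
    a factor of 3 (sigma_x = sigma, sigma_y = 3*sigma) and rotated by
    theta, as in the extended LM bank described above."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    X = x * np.cos(theta) + y * np.sin(theta)
    Y = -x * np.sin(theta) + y * np.cos(theta)
    sx, sy = sigma, 3.0 * sigma
    g = np.exp(-(X**2 / sx**2 + Y**2 / sy**2) / 2.0)
    if order == 1:                       # edge filter: d/dX of the Gaussian
        h = -X / sx**2 * g
    else:                                # bar filter: d^2/dX^2 of the Gaussian
        h = (X**2 / sx**4 - 1.0 / sx**2) * g
    return h - h.mean()                  # make the filter zero-mean

# 2 derivative orders x 3 basic scales x 9 orientations = 54 filters
scales = [np.sqrt(2), 2.0, 2.0 * np.sqrt(2)]
thetas = [k * np.pi / 9 for k in range(9)]
bank = [oriented_gaussian_derivative(31, s, t, o)
        for o in (1, 2) for s in scales for t in thetas]
```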
3.1.2 Gabor Filter Bank

A Gabor filter is a linear filter whose impulse response is a harmonic function
multiplied by a Gaussian function. It has received considerable attention because the
characteristics of object shape can be approximated by these filters. In addition, these
filters have been shown to possess optimal localization properties in both the spatial and
frequency domains.
The two-dimensional Gabor function is:

$$G(x, y) = \exp\left(-\frac{X^2 + \gamma^2 Y^2}{2\sigma^2}\right) \sin\left(\frac{2\pi X}{\lambda}\right)$$

$$X = x\cos\theta + y\sin\theta, \qquad Y = -x\sin\theta + y\cos\theta$$
where $\lambda$ is the wavelength of the sinusoidal factor of the Gabor kernel, $\theta$ specifies the
orientation, and $\gamma$, called the spatial aspect ratio, represents the ellipticity of the support of
the Gabor function. To generate a variety of Gabor filters for extracting different
features, our Gabor filter bank keeps the spatial aspect ratio at 1 and 0.5 separately and
varies the wavelength over the set [3, 5, 7, 10, 13] at 9 orientations each, for 90 filters
in total. Figure 3.1(B) demonstrates a visualization of all Gabor filters.
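The bank can be generated directly from the formula above; a sketch follows, where the filter size and the envelope width $\sigma$ (which the text does not specify) are assumptions.

```python
import numpy as np

def gabor(size, wavelength, theta, gamma, sigma=None):
    """2D Gabor filter following the formula above:
    G(x, y) = exp(-(X^2 + gamma^2 Y^2) / (2 sigma^2)) * sin(2 pi X / wavelength).
    Tying sigma to the wavelength is an assumption made for this sketch."""
    if sigma is None:
        sigma = 0.5 * wavelength
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    X = x * np.cos(theta) + y * np.sin(theta)
    Y = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(X**2 + gamma**2 * Y**2) / (2 * sigma**2))
            * np.sin(2 * np.pi * X / wavelength))

# 2 aspect ratios x 5 wavelengths x 9 orientations = 90 filters
bank = [gabor(31, lam, k * np.pi / 9, g)
        for g in (1.0, 0.5)
        for lam in (3, 5, 7, 10, 13)
        for k in range(9)]
```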
Figure 3.1 Filter banks. (A) Visualization of the extended LM filter bank with a mix of edge, bar, and spot filters at multiple orientations and scales. (B) Visualization of the Gabor filter
bank at 9 orientations with different parameters.
3.2 Feature Representation
Figure 3.2 Feature representation
As shown in Figure 3.2, after cropping the detection region from the input frame, each
pixel is transformed into a 168-dimensional vector of filter-bank responses. We then
sum all absolute responses corresponding to each filter within a patch. Since the size
of a patch changes with perspective, we normalize the features by dividing by the area
of each patch. Finally, the normalized values are concatenated to form a
168-dimensional feature vector, which can be treated as a histogram over the 168 filters.
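Assuming the per-pixel responses have already been computed (e.g. by convolving the cropped frame with the 168 filters), the per-patch aggregation and area normalization just described can be sketched as:

```python
import numpy as np

def patch_descriptor(responses, mask):
    """Aggregate per-pixel filter responses into one patch feature.

    responses: (H, W, F) array of F filter responses at every pixel
               (F = 168 for the bank described above).
    mask:      (H, W) boolean array selecting the pixels of one patch.
    Returns an F-dim vector: sum of absolute responses over the patch,
    divided by the patch area so near and far patches are comparable
    despite their different sizes.
    """
    area = mask.sum()
    return np.abs(responses[mask]).sum(axis=0) / area

# Toy check: 4x4 image, 3 filters, one 2x2 patch in the corner.
resp = np.ones((4, 4, 3))
m = np.zeros((4, 4), dtype=bool)
m[:2, :2] = True
feat = patch_descriptor(resp, m)   # each entry: 4 * |1| / area 4 = 1
```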
4. Feature Selection

After extracting features from the input frame, this chapter introduces feature selection.
There are two reasons to select features: 1) some filters in the LM filter bank
are similar to filters in the Gabor bank and therefore provide redundant features; 2) not all features
are discriminative enough to feed the classifier, so we should select the good
features obtained from the distinctive filters and discard the noisy variation. In our
proposed method, we first build a weak classifier on each feature; then AdaBoost is
adopted to choose the most discriminative features, which improves computational
efficiency and reduces running time.
4.1 Weak Classifier
For each patch, whether clear path or obstacle, there are 78 features obtained from the
LM filters and another 90 features obtained from the Gabor filters, for a total of 168
features that act as a pool for selection. Taking one feature column x_j from the
training data x, we find the optimum threshold θ_j that minimizes the overall
classification error, which yields a weak classifier h_j:

    h_j(x) = { clear path, if x_j < θ_j
             { obstacles,  if x_j > θ_j
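A minimal sketch of fitting one such threshold classifier, assuming the label convention above (1 = clear path when the feature value falls below the threshold, 0 = obstacles otherwise); the helper name is hypothetical.

```python
# Fit a single-feature threshold weak classifier h_j: scan candidate
# thresholds on one feature column and keep the threshold minimizing
# the weighted classification error.

def fit_weak_classifier(xs, ys, ws):
    """xs: one feature column; ys: labels (1 = clear path, 0 = obstacles);
    ws: per-sample weights. Returns (weighted error, optimal threshold)."""
    best = (float("inf"), None)
    for theta in sorted(set(xs)):
        err = sum(w for x, y, w in zip(xs, ys, ws)
                  if (1 if x < theta else 0) != y)
        if err < best[0]:
            best = (err, theta)
    return best
```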
4.2 AdaBoost Feature Selection
AdaBoost is a boosting algorithm designed to improve performance using a combination of
weak classifiers. The procedure for choosing the most discriminative features is
motivated by the face detection algorithm of Viola and Jones [17]; the details are
described in Table 1. Initially, every training sample is weighted equally. From the
pool of 168 features, the one with minimum classification error is selected without
replacement. The samples are re-weighted in proportion to the misclassification error
before the next feature, with its corresponding optimum weak classifier, is chosen from
the updated pool. Each selected feature (classifier) also has an associated confidence
α_j that is inversely proportional to its misclassification error and hence measures its
strength. The process of choosing classifiers continues until all the features have been
selected.
Table 1: AdaBoost Feature Selection

Given: training data (x_1, y_1), …, (x_N, y_N), where x_i is the i-th feature
corresponding to the i-th filter and y_i ∈ {0, 1} indicates whether the patch is
"obstacles" or "clear path".
Aim: train a weight for each feature and select features by weight.

• Initialize the weights w_{1,i} = 1/(2l) for y_i = 0 and w_{1,i} = 1/(2m) for
  y_i = 1, where l and m are the numbers of obstacle and clear-path examples.
• For t = 1, …, T:
  o Normalize the weights: w_{t,i} ← w_{t,i} / Σ_{j=1}^{n} w_{t,j}
  o Select the best weak classifier h_t with respect to the weighted error:
        ε_t = min_j Σ_i w_i |h_j(x_i) − y_i|
  o Update the weights:
        w_{t+1,i} = w_{t,i} β_t^{1−e_i}
    where e_i = 0 if example x_i is correctly classified by h_t and e_i = 1
    otherwise, and β_t = ε_t / (1 − ε_t).
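The loop in Table 1 can be sketched as below. This is an illustrative simplification, not the thesis implementation: each feature is reduced to a fixed-threshold stump at 0.5 so that the ranking loop itself stays visible, and a small guard keeps β numerically safe when the error is 0 or 1.

```python
# Hedged sketch of Table 1: repeatedly pick the weak classifier with
# minimum weighted error and down-weight correctly classified samples
# by beta = eps / (1 - eps); selection is without replacement.

def adaboost_rank(columns, ys, T):
    n = len(ys)
    ws = [1.0 / n] * n                      # uniform start (text splits by class)
    order = []
    pool = set(range(len(columns)))
    for _ in range(min(T, len(pool))):
        total = sum(ws)
        ws = [w / total for w in ws]        # normalize the weights
        best_j, best_err, best_preds = None, float("inf"), None
        for j in pool:
            preds = [1 if x < 0.5 else 0 for x in columns[j]]  # toy stump
            err = sum(w for w, p, y in zip(ws, preds, ys) if p != y)
            if err < best_err:
                best_j, best_err, best_preds = j, err, preds
        order.append(best_j)
        pool.discard(best_j)                # without replacement
        beta = max(best_err, 1e-12) / max(1.0 - best_err, 1e-12)
        ws = [w * (beta if p == y else 1.0)  # keep weight on the mistakes
              for w, p, y in zip(ws, best_preds, ys)]
    return order
```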
Figure 4.1 visualizes all the filters selected by this algorithm from the original
extended LM filter bank and Gabor filter bank, ranked by selection order. Figure 4.2
shows the filter response images generated by the top 3 and last 3 ranked filters. From
Figure 4.1, it is interesting that the selection of the most discriminative features
follows a trend: edge and bar filters at certain scales, oriented roughly between 45 and
135 degrees, rank higher than the other filters. This is confirmed by Figure 4.2: the
top 3 ranked filters produce responses on clear path that are distinct from those on
obstacles. In contrast, the results from the last 3 ranked filters show many noisy
responses generated by ill-suited filters, creating ambiguity between clear path and
obstacles.

Figure 4.1 Visualization of all ranked filters (ranked from top to bottom, left to right)

Figure 4.2 Filter response images generated by selected filters. (A) Original frame
(B) Top 3 ranked filters (C) Last 3 ranked filters
These results stem from the fact that clear path is usually flat with little texture,
while obstacles such as cars and pedestrians have complex texture with a variety of
edges and lines. This is the key property that lets us distinguish clear path from
obstacles. Only filters of the right scale and shape generate the maximum filter
responses. In addition, the captured frame is perspective, as mentioned before.
Therefore, the boundaries and texture of both clear path and obstacles run at
characteristic angles (around 45 degrees on the left and 135 degrees on the right)
toward the horizon line. Filters with these orientations yield the most discriminative
features.
5. Learning Algorithm
The Support Vector Machine (SVM), developed by Vladimir Vapnik [20], is arguably the
most popular off-the-shelf discriminative classifier. It provides state-of-the-art
performance in many real-world applications such as text categorization, biological
sequence analysis, and object classification. It is also known as the maximum-margin
classifier, as it learns the decision boundary separating the positive and negative
classes (assuming a binary classification problem) that maximizes the margin from the
data.
Figure 5.1 Support Vector Machines when the data is linearly separable. The decision boundary is chosen so as to maximize the margin.
Consider the training data set {(x_i, y_i)}, i = 1, 2, …, N, where x_i is a feature
vector with label y_i equal to −1 or +1, denoting obstacles and clear path respectively.
The feature vectors are assumed to be normalized to zero mean and unit variance to
prevent any particular dimension(s) from dominating the decision boundary. SVM strives
to find the hyperplane ω^T x − b = 0 that best separates the training data in terms of
distance from this hyperplane. If the data is linearly separable, the problem reduces to
maximizing the margin 2/||ω|| between the parallel hyperplanes on either side of the
decision boundary, such that the following conditions hold for all samples:
    ω^T x_i − b ≥ +1, if y_i = +1
    ω^T x_i − b ≤ −1, if y_i = −1

These can be written as:

    y_i (ω^T x_i − b) ≥ 1, ∀i
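A small numeric check of this constraint and the margin 2/||ω|| on a toy 2-D linearly separable set; the hyperplane and points are illustrative, not from the thesis data.

```python
# Verify y_i (w^T x_i - b) >= 1 for every sample, and compute the
# margin 2/||w|| between the two parallel hyperplanes.
import math

w, b = [1.0, 0.0], 0.0                     # toy hyperplane: x1 = 0
data = [([2.0, 1.0], +1), ([1.5, -1.0], +1),
        ([-2.0, 0.5], -1), ([-1.0, 2.0], -1)]

def satisfies_margin(w, b, data):
    return all(y * (sum(wi * xi for wi, xi in zip(w, x)) - b) >= 1
               for x, y in data)

margin = 2.0 / math.sqrt(sum(wi * wi for wi in w))   # = 2/||w||
```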
As shown in Figure 5.1, the sample points that lie on the margin hyperplanes are known
as support vectors, as they are the ones that determine the decision boundary. If the
data is not linearly separable, SVM maps the vectors into a high-dimensional feature
space with an appropriate kernel function K(x, x_i) so that the linear-separability
constraint can be satisfied. In our implementation, unlike the classical SVM, which uses
the signum function and has a binary output (+1 or −1), we need the posterior
probability of each class as prior information for later processing. Thus, we first
obtain the binary posterior probability distribution, which can be written as:
    f(x) = Σ_{i=1}^{N} α_i y_i K(x, x_i)

    p(y = ±1 | ω, x) = 1 / (1 + exp(∓ f(x) / ||ω||))
where α_i denotes the Lagrange multipliers and K(x, x_i) is the kernel function, for
which we use the RBF (Radial Basis Function) kernel. As shown in Figure 5.2, given the
features we classify each patch as "clear path" or "obstacles". Additionally, the SVM
classifier provides the probabilities of both classes, which are used to refine the
results in the next section.
Figure 5.2 Initial Type Estimation using SVM
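The probability output above can be sketched as below. This is a minimal illustration, not the trained model: the support set, multipliers, and the gamma value are placeholders, and the sigmoid scaling by 1/||ω|| follows the formula in the text.

```python
# Squash the SVM decision value f(x) = sum_i alpha_i y_i K(x, x_i)
# through a sigmoid scaled by 1/||w|| to get p(y = +1 | x).
import math

def rbf(x, z, gamma=0.0313):
    """RBF kernel exp(-gamma * ||x - z||^2); gamma value illustrative."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def posterior(x, support, w_norm):
    """support: list of (alpha_i, y_i, x_i); returns p(y = +1 | x)."""
    f = sum(a * y * rbf(x, sv) for a, y, sv in support)
    return 1.0 / (1.0 + math.exp(-f / w_norm))
```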
6. Patch-based Refinement
In our prior classification, each patch is classified into two classes: "clear path" and
"obstacles". However, like any other machine learning algorithm, SVM cannot guarantee
100 percent classification accuracy. Sometimes texture ambiguity leads SVM to the wrong
decision. Two classic errors are shown in Figure 6.1:
1) Spatial error. Within the same frame, a clear-path patch is wrongly classified as
"obstacles" while surrounded by correctly classified clear-path patches.
2) Temporal error. Between neighboring frames, a patch in the current frame is
classified as "obstacles" while its corresponding regions in the previous frame,
computed from the vehicle's movement, are classified as "clear path", creating a
conflict.
In fact, these kinds of errors can be corrected if we consider the relationships among
patches using their spatial and temporal information.
Figure 6.1 Classification errors; the circled patches are classified into the wrong classes. (A) Spatial error (B) Temporal error
Therefore, in this section we propose two types of patch smoothing that consider spatial
and temporal information separately. Spatial patch smoothing enforces a smoothness
constraint: neighboring patches with similar textures (e.g., color) should have similar
probabilities of being clear path or obstacles. It thus updates a patch's status under
the influence of its neighboring patches' statuses. Temporal patch smoothing enforces
appearance and probability-consistency constraints between the current frame and
previous frames. It thus updates a patch's status in the current frame under the
influence of its corresponding region's status in previous frames. Finally, combining
these two smoothings, a maximum likelihood estimator optimizes all the factors and
improves overall performance.
6.1 Spatial Patch Smoothing
Figure 6.2 Spatial patch smoothing. The detecting patch s_j has 6 adjoining neighboring patches.
Within one frame, neighboring patches have relationships that can help determine the
class of the detecting patch. Therefore, we define the spatial smoothness coefficient
n_j(c) of patch j, which enforces the constraint that neighboring patches with similar
texture should have a similar class c.
Let s_l denote one of the detecting patch s_j's neighboring patches, with associated
initial probability P_l^0(c) obtained from the SVM and maximum likelihood estimate ĉ_l:

    ĉ_l = argmax_c P_l^0(c)

The class of patch s_j is modeled by a contaminated Gaussian distribution with mean ĉ_l
and variance σ_l². We define the spatial smoothness coefficient n_j(c) to be:
    n_j(c) = Π_{s_l} N(c; ĉ_l, σ_l²) + ε,    c = 1 (clear path), 2 (obstacles)

where N(x; mean, σ²) is the Gaussian distribution and ε is a small constant
(e.g., 10⁻¹⁰).
We evaluate the variance σ_l² using texture similarity, neighboring connectivity, and
s_l's initial probability obtained from the SVM:
1. Texture similarity Δ_{j,l}, which measures the texture difference between patches
s_j and s_l. We estimate it as follows. Assuming texture is represented by color, two
patches should have similar color distributions if their textures are similar. We first
compute the color histograms of the two patches as their color distributions (C_j and
C_l). The Kullback–Leibler divergence then measures the difference between these two
distributions:

    D_KL(C_j || C_l) = Σ_i C_j(i) log( C_j(i) / C_l(i) )

Since the KL divergence is not symmetric, we sum both divergences to measure the
distance between the two distributions:

    Δ_{j,l} = D_KL(C_j || C_l) + D_KL(C_l || C_j)

Therefore, if the textures of two patches are identical, their texture similarity
Δ_{j,l} is zero, while Δ_{j,l} grows as the two patches become more different.
2. The neighboring connectivity b_{j,l}, the fraction of patch s_j's border shared with
patch s_l. As shown in Figure 6.2, neighboring patch 2 shares a longer border with
detecting patch j than neighboring patch 1 does, so patch 2 has more influence on the
detecting patch j and its neighboring connectivity is larger.
3. The initial probability P_l^0(c) of patch s_l obtained from the SVM.
Therefore, the variance σ_l² is defined as
    σ_l² = v / ( P_l^0(ĉ_l) · b_{j,l} · N(Δ_{j,l}; 0, σ_Δ²) )

where v and σ_Δ² are constants (v = 8 and σ_Δ² = 20 in our experiments). Thus, if patch
s_j and its neighboring patch s_l have similar textures, and if patch s_j's class is
consistent with its neighbors' status estimates (both classified as obstacles or both as
clear path), we expect the spatial smoothness coefficient n_j(c) to be large.
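The spatial coefficient can be sketched as below. This is an illustrative reading of the formulas above, not the thesis code: helper names are hypothetical, a small ε guards the logarithms and the product, and the constants follow the text (v = 8, σ_Δ² = 20).

```python
# Spatial smoothing coefficient n_j(c): symmetric KL divergence between
# color histograms gives the texture distance Delta; together with the
# border fraction b and the neighbor's SVM probability it sets the
# variance of a Gaussian vote centered on the neighbor's class estimate.
import math

def sym_kl(cj, cl, eps=1e-10):
    d = lambda p, q: sum(pi * math.log((pi + eps) / (qi + eps))
                         for pi, qi in zip(p, q))
    return d(cj, cl) + d(cl, cj)            # Delta_{j,l}

def gaussian(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def spatial_coeff(c, neighbors, v=8.0, var_delta=20.0, eps=1e-10):
    """neighbors: list of (c_hat, p0, border_frac, delta) per neighbor."""
    n_j = 1.0
    for c_hat, p0, b, delta in neighbors:
        var = v / (p0 * b * gaussian(delta, 0.0, var_delta) + eps)
        n_j *= gaussian(c, c_hat, var) + eps
    return n_j
```

A neighbor with identical texture (delta = 0) voting for class 1 makes n_j(1) exceed n_j(2), which is exactly the smoothing behavior described in the text.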
6.2 Temporal Patch Smoothing
Figure 6.3 Temporal patch smoothing. The patches in the current frame are projected back
to the previous frame given the known speed and yaw angle of the vehicle.
Because the camera on the vehicle is moving forward, temporally neighboring frames are
also related. As shown in Figure 6.3, knowing the speed and yaw angle of the vehicle, we
can calculate where the patches in the current frame (blue rectangles in (B)) were
located in the previous k frames (red rectangles in (A), for k = 1). Usually, a patch
and its corresponding regions in previous frames should have similar texture and be
classified into the same class. However, it is important to note that several kinds of
occlusion can leave a patch without corresponding patches in the previous frames, such
as:
a) An obstacle ahead keeps the same speed as our vehicle. A patch in the current frame
is "clear path" but its corresponding region in the last frame was occluded by the
obstacle (Figure 6.4(A)).
b) An obstacle suddenly appears from the side of our vehicle. A patch in the current
frame is "obstacles" but did not inherit that class from the last frame (Figure 6.4(B)).
Figure 6.4 Two types of occlusion
In these two cases, a patch is not consistent with its temporal regions. We call such a
patch not visible and set its consistency likelihood to a fixed probability instead of
computing the temporal consistency.
Therefore, in our implementation, the temporal smoothing coefficient c_{j,k}(c), which
ensures that patch s_j's estimate is consistent with its corresponding estimates in the
previous k frames, is computed from temporal consistency, visibility, and the patch's
initial probability obtained from the SVM.
1. Temporal consistency.
Given the vehicle speed v_T, yaw angle θ_T, and video frame rate f, we first calculate
that the vehicle has moved a distance v_T / f between neighboring frames. Second, the
region of each patch in world coordinates in the previous frame can be computed by:
    [X_prev]   [X_cur]           [sin θ_T]
    [Y_prev] = [Y_cur] − (v_T/f)·[cos θ_T]
    [Z_prev]   [Z_cur]           [   0   ]
Then, combining the change of position with the calibration information and camera
model, we project patch s_j onto its neighboring frame. Finally, since the projected
region may cover several patches in the previous frame, we compute patch s_j's projected
distribution b_{j,k}^0(c) from the probability distribution in the projected previous
frame k, to estimate the temporal consistency in the absence of occlusion:

    b_{j,k}^0(c) = (1 / num_{S_j}) Σ_{x ∈ S_j} P_{r(k,x)}^0(c)

As shown in Figure 6.3(A), r(k, x) is the index of the patch at time k containing the
pixel corresponding to pixel position x of patch s_j, and num_{S_j} is the number of
pixels in patch s_j. If the projected region's status is consistent with patch s_j's
status, we expect b_{j,k}^0(c) to be large when patch s_j is visible in the previous
frame.
2. Visibility v_{j,k}.
Due to the possible occlusions discussed above, a patch might not have corresponding
pixels in the previous frames. We estimate the overall visibility likelihood v_{j,k} to
measure how visible the patch is. This factor is modeled as a Gaussian and computed from
the texture similarity defined above:

    v_{j,k} = N(Δ_{j,k}; 0, σ_Δ²)

If patch s_j is visible in the previous frame, as shown in Figure 6.3, we can find its
corresponding region by searching the previous frames, and the temporal consistency
yields large values of b_{j,k}^0(c). If patch s_j is occluded in the previous frame, as
shown in Figure 6.4, we cannot find its corresponding region, and no solution yields a
large b_{j,k}^0(c) value. Therefore, we use v_{j,k} as a robust and computationally
efficient measure of a patch's visibility.
3. The initial probability P_j^0(c) of patch s_j obtained from the SVM.
Now we combine the visible and occluded cases. If a patch is fully visible, c_{j,k}(c)
is given by the visible consistency likelihood b_{j,k}^0(c) P_j^0(c); otherwise, its
occluded consistency likelihood is a fixed prior P_0 (e.g., 1/2). Therefore,

    c_{j,k}(c) = v_{j,k} b_{j,k}^0(c) P_j^0(c) + (1 − v_{j,k}) P_0,
        c = 1 (clear path), 2 (obstacles)

In our case, for computational efficiency, we only compute the influence from the
previous frame (k = 1).
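The temporal step can be sketched as below. This is an illustrative reading of the formulas above, assuming b_{j,k}^0(c) has already been computed from the projected region; the backward projection follows the translation equation in the text, and P_0 = 1/2 is the fixed occlusion prior.

```python
# Project a world point back by the vehicle motion (v_T / f along the
# yaw direction), then blend visible and occluded consistency cases by
# the visibility v_{j,k}.
import math

def project_back(x, y, z, v_t, f, theta):
    """One-frame backward translation: subtract (v_T/f)*(sin t, cos t, 0)."""
    step = v_t / f                          # distance moved between frames
    return (x - step * math.sin(theta),
            y - step * math.cos(theta),
            z)

def temporal_coeff(b0_c, p0_c, visibility, prior=0.5):
    """c_{j,k}(c) = v * b0(c) * P0(c) + (1 - v) * prior."""
    return visibility * b0_c * p0_c + (1.0 - visibility) * prior
```

With full visibility the coefficient reduces to b_{j,k}^0(c) P_j^0(c); with zero visibility it falls back to the fixed prior, matching the two occlusion cases described above.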
6.3 Patch-based Refinement
Finally, we refine the detecting patch s_j's initial probability P_j^0(c) using both its
neighboring patches and its corresponding regions in the previous frames. The
probability of patch s_j is updated iteratively as follows:

    P_j^{t+1}(c) = n_j(c) Π_k c_{j,k}(c) / Σ_{c'∈{1,2}} n_j(c') Π_k c_{j,k}(c'),
        c = 1 (clear path), 2 (obstacles),  t = 0, 1, 2, …

where n_j(c) is the spatial smoothing coefficient and c_{j,k}(c) is the temporal
smoothing coefficient for each projected region in the previous k frames (k = 1).
Moreover, for efficiency, we update the probability of patch s_j only once (t = 1).
Thus, after refinement, the final maximum likelihood estimate ĉ_j of patch s_j is:

    ĉ_j = argmax_c P_j^t(c)
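The one-step update above can be sketched as follows, taking the precomputed spatial and temporal coefficients as inputs (names are illustrative, and only a single previous frame k = 1 is assumed, as in the text).

```python
# One refinement step: multiply each class's spatial coefficient n_j(c)
# by its temporal coefficient c_{j,1}(c), normalize over the two
# classes, and take the argmax as the final estimate.

def refine(n_j, c_jk):
    """n_j, c_jk: dicts mapping class c in {1, 2} to its coefficient."""
    scores = {c: n_j[c] * c_jk[c] for c in (1, 2)}
    total = sum(scores.values())
    probs = {c: s / total for c, s in scores.items()}
    c_hat = max(probs, key=probs.get)       # final ML estimate
    return probs, c_hat
```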
7. Experiments
7.1 Data Description
The experimental data, provided by General Motors, consists of real-time videos captured
from a running vehicle. The videos run at 5 fps, with 4500 frames at 320×240 pixels in
total, and cover various conditions. Some typical situations are shown in Figure 7.1:
urban, highway, shadow, and illumination change. Some cases are challenging because of
ambiguity in texture and shape: the texture of a highway fence is similar to that of
clear path; the appearance of shadow on the clear path is easily confused with that of
obstacles; and sudden illumination changes also greatly affect texture and appearance.
Nevertheless, our experiments confirm that we achieve reasonable performance in these
situations.
Figure 7.1 Typical situations in the experiments. (A) Urban (B) Illumination change (C) Shadow (D) Highway
We then calibrated the camera's intrinsic parameters (focal length and optical center)
offline with checkerboard patterns, and its extrinsic parameters (the translation vector
and rotation matrix) with markers on the ground, using Zhang's method [62].
The real-time movement parameters corresponding to each frame, such as vehicle speed and
yaw angle, are read from sensors on the vehicle.
7.2 Feature Preparation
Figure 7.2 Feature selection. (A) 5-fold cross-validation (B) ROC curve
As described in Chapter 2, each frame is divided into patches in world coordinates and
inverse-mapped to perspective patches in 2D. Hence, 30 patches varying from 6×7 to 23×15
pixels in size are generated per frame. We labeled 33201 patches (6224 obstacle patches
and 23986 clear-path patches) extracted from 1007 frames covering all four typical
situations discussed above. Convolved with the extended LM filters and Gabor filters,
168 features are initially obtained from each patch. AdaBoost feature selection was
trained on the full set of clear-path and obstacle data to rank the features. Then, we
trained the SVM-based status estimator on the whole training set, adding the ranked
features incrementally. Figure 7.2(A) shows the 5-fold cross-validation results as
ranked features are added. We also adjusted the SVM parameters and plot the ROC curve in
Figure 7.2(B). Clearly, using more features improves performance. We observed the
variation in false alarm rate and detection rate for different numbers of features as
the SVM parameters changed, and chose the configuration providing a good balance between
speed and performance. In this project, we choose the top 50 features with their
corresponding filters. The SVM model uses an RBF kernel with parameters C = 32 and
gamma = 0.0313.
7.3 Results
In the test procedure, every frame is represented by 30 perspective patches, with 50
features extracted from each. The initial classification is produced by the SVM model
with probability output. Figure 7.3 shows some initial detection results in the urban
and highway cases. The results show that, even without any refinement, our algorithm can
already identify the clear path well against different types of obstacles (e.g., cars or
roadsides) of various sizes, shapes, and orientations.
Figure 7.3 Result of initial detection using SVM
Nevertheless, SVM can make wrong classifications because of its imperfect performance or
the ambiguity of the patches, such as the landmark in the clear-path patch shown in
Figure 7.4(A)1. Furthermore, in challenging cases like Figure 7.4(A)3 and 4, the
bridge's shadow and the illumination change greatly alter the patches' appearance and
texture, leading to wrong classifications. Hence, spatial and temporal information is
used to refine the initial classification. Figure 7.4(B) demonstrates the results after
refinement. The error patch in Figure 7.4(A)1 is corrected by the constraint of its
neighboring patches. The error patches in Figure 7.4(A)3 and 4 are corrected by the
constraints of both temporal and spatial neighboring patches.
Figure 7.4 Patch-based refinement
Table 7.1 shows the performance before and after refinement. Compared with the initial
SVM classification, which achieves 92.23% accuracy with 4.6% FAR and 5.1% FRR,
patch-based refinement improves accuracy to 94.57% and sharply reduces FAR to 3.2% and
FRR to 4.5%. We show more examples in Figure 7.5. Figure 7.6 shows some extreme cases we
cannot yet handle: huge illumination changes and over-exposure completely destroy the
texture, and the wrong constraints from spatial and temporal information amplify the
errors. We can improve our algorithm by extending the training set to cover these
situations or by finding more discriminative features that are invariant to illumination
change.

                     Accuracy   FAR    FRR
    SVM              92.23%     4.6%   5.1%
    SVM + Refinement 94.57%     3.2%   4.5%

Table 7.1 Test results
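The reported metrics follow the usual confusion-matrix definitions; a small sketch with made-up counts (not the thesis data) shows how they relate.

```python
# Accuracy, false acceptance rate (obstacle patch accepted as clear
# path), and false rejection rate (clear-path patch rejected as
# obstacle) from confusion-matrix counts.

def rates(tp, tn, fp, fn):
    """tp/fn count clear-path patches, tn/fp count obstacle patches."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    far = fp / (fp + tn)      # obstacles wrongly accepted as clear path
    frr = fn / (fn + tp)      # clear path wrongly rejected as obstacle
    return accuracy, far, frr
```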
Figure 7.5 More test results
Figure 7.6 Extreme cases that cannot yet be handled
8. Conclusion and Future Work
Unlike traditional approaches that build various detectors to detect all types of
obstacles on the road, the proposed framework focuses on indicating the clear path
ahead. We assume the video camera is calibrated (intrinsic and extrinsic parameters) and
that the vehicle information (speed and yaw angle) is known at all times. Making good
use of this prior knowledge of the scene, we first generate perspective patches
projected from world coordinates for feature extraction, instead of the traditional 2D
patches on the frame. The proposed framework then exploits each patch's spatial and
temporal constraints to build a probabilistic refinement on top of the initial
statistical learning from training data. This prior enhances both performance and speed.
Finally, our proposed framework is verified through experimental study, demonstrating
its robustness and efficiency. Even in challenging situations such as shadow and
illumination changes, we still obtain good performance after the proposed refinement.
In the future, we will further improve the learning of prior knowledge with more
training samples and better learning algorithms. Moreover, we will investigate tracking
using our perspective-projection prior, for example predicting where tracked obstacles
are located in world coordinates and projecting them onto subsequent frames using
motion, vehicle speed, and calibration parameters.
References
1. Hebert, M., Active and Passive Range Sensing for Robotics. Proc. IEEE Int'l Conf. Robotics and Automation, 2000. 1: p. 102-110.
2. K. Kaliyaperumal, S.L., K. Kluge, An algorithm for detecting roads and obstacles in radar images. IEEE Transactions on Vehicular Technology, 2001. 50(1): p. 170-182.
3. S. Sugimoto, H.T., H. Takahashi, M. Okutomi, Obstacle detection using millimeter-wave radar and its visualization on image sequence. ICPR, 2004.
4. S. Park, T.K., S. Kang, and K. Heon, A Novel Signal Processing Technique for Vehicle Detection Radar. 2003.
5. A. Ewald, V.W., Laser scanners for obstacle detection in automotive applications. Proc. IEEE Intelligent Vehicles Symposium, 2000.
6. J. Hancock, M.H., C. Thorpe, Laser intensity-based obstacle detection. Int'l Conf. on Intelligent Robots and Systems, 1998.
7. C. Wang, C.T., and A. Suppe, Ladar-Based Detection and Tracking of Moving Objects from a Ground Vehicle at High Speeds. Proc. IEEE Intelligent Vehicles Symp., 2003.
8. http://www.tartanracing.org/
9. N. Bertozzi, A.B., Real-time lane and obstacle detection on the GOLD system. IEEE Intelligent Vehicles Symp., 1996: p. 213-218.
10. G. Zhao, Y.S., Obstacle detection by vision system for autonomous vehicle. IEEE Intelligent Vehicles Symp., 2003: p. 31-36.
11. N. Matthews, P.A., D. Charnley, and C. Harris, Vehicle detection and recognition in greyscale imagery. Control Engineering Practice, 1996. 4: p. 473-479.
12. C. Goerick, N.D. and M.W., Artificial neural networks in real time car detection and tracking applications. Pattern Recognition Letters, 1996: p. 335-343.
13. M. Betke, E.H. and L. Davis, Multiple vehicle detection and tracking in hard real time. IEEE Intelligent Vehicles Symp., 1996: p. 351-356.
14. H. Schneiderman, T.K., Probabilistic modeling of local appearance and spatial relationships for object recognition. CVPR, 1998: p. 45-51.
15. M. Oren, C.P., P. Sinha, E. Osuna and T. Poggio, Pedestrian detection using wavelet templates. CVPR, 1997: p. 193-199.
16. N. Dalal, B. Triggs, Histograms of Oriented Gradients for Human Detection. CVPR, 2005.
17. P. Viola and M. Jones, Rapid Object Detection using a Boosted Cascade of Simple Features. CVPR, 2001.
18. R. Fablet, M.J. Black, Automatic Detection and Tracking of Human Motion with a View-based Representation. ECCV, 2002. 1: p. 476-491.
19. Sidenbladh, H., Detecting Human Motion with Support Vector Machines. ICPR, 2004. 2: p. 188-191.
20. Burges, C.J.C., A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 1998. 2(2): p. 121-167.