Neural network-based detection and tracking of maneuvering targets in clutter for radar applications.
Item Type text; Dissertation-Reproduction (electronic)
Authors Amoozegar, Seyed Farid.
Publisher The University of Arizona.
Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Download date 24/04/2021 09:11:15
Link to Item http://hdl.handle.net/10150/186824
INFORMATION TO USERS
This manuscript has been reproduced from the microfilm master. UMI
films the text directly from the original or copy submitted. Thus, some
thesis and dissertation copies are in typewriter face, while others may
be from any type of computer printer.
The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality
illustrations and photographs, print bleedthrough, substandard margins,
and improper alignment can adversely affect reproduction.
In the unlikely event that the author did not send UMI a complete
manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate
the deletion.
Oversize materials (e.g., maps, drawings, charts) are reproduced by
sectioning the original, beginning at the upper left-hand corner and
continuing from left to right in equal sections with small overlaps. Each
original is also photographed in one exposure and is included in
reduced form at the back of the book.
Photographs included in the original manuscript have been reproduced
xerographically in this copy. Higher quality 6" x 9" black and white
photographic prints are available for any photographs or illustrations
appearing in this copy for an additional charge. Contact UMI directly
to order.
U-M-I University Microfilms International
A Bell & Howell Information Company 300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA
313/761-4700 800/521-0600
Order Number 9502624
Neural network-based detection and tracking of maneuvering targets in clutter for radar applications
Amoozegar, Seyed Farid, Ph.D.
The University of Arizona, 1994
U-M-I 300 N. Zeeb Rd., Ann Arbor, MI 48106
NEURAL NETWORK-BASED DETECTION AND TRACKING OF
MANEUVERING TARGETS IN CLUTTER FOR RADAR APPLICATIONS
by
Seyed Farid Amoozegar
A Dissertation Submitted to the Faculty of the
ELECTRICAL AND COMPUTER ENGINEERING DEPARTMENT
In Partial Fulfillment of the Requirements For the Degree of
DOCTOR OF PHILOSOPHY
WITH A MAJOR IN ELECTRICAL ENGINEERING
In the Graduate College
THE UNIVERSITY OF ARIZONA
1994
THE UNIVERSITY OF ARIZONA GRADUATE COLLEGE
As members of the Final Examination Committee, we certify that we have
read the dissertation prepared by Farid Amoozegar
entitled Neural Network-Based Detection and Tracking of Maneuvering Targets in Clutter for Radar Applications
and recommend that it be accepted as fulfilling the dissertation
requirement for the Degree of Ph.D. in Electrical Engineering.
Dr. Malur K. Sundareshan Date
Dr. Hal Tharp Date
Dr. Larry Schooley Date
Final approval and acceptance of this dissertation is contingent upon the candidate's submission of the final copy of the dissertation to the Graduate College.
I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.
Date
STATEMENT BY AUTHOR
This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the library.
Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgement of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his or her judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.
SIGNED:
DEDICATION
To my parents for raising me
and my wife and two children for their patience
ACKNOWLEDGEMENTS
First and foremost, I wish to express my sincere appreciation to my
dissertation director, Professor Malur K. Sundareshan, for his advice and
guidance regarding both research and professional development throughout the
entire period of this study.
I also wish to thank the other members of the examining committee, Dr.
Larry Schooley and Dr. Hal Tharp, for their review of this dissertation.
I would also like to express my special appreciation to the following people
to whom I owe my achievements: my father Abrahim, who set high goals and
standards for me and supported me throughout his entire life; my mother, who
encouraged me at all times; my wife Afsaneh, who gave me the courage; and my
children Hanieh Sadat and Seyed Mohamad, who gave me the energy and drive to
succeed.
Finally, I owe a particular debt of gratitude to two of my friends: Ali Notash,
for his vision and support in technical writing, and Seyed Hossein Sadati, for
his critiques of and feedback on the earlier versions of this dissertation.
TABLE OF CONTENTS
LIST OF ILLUSTRATIONS
LIST OF TABLES
ABSTRACT
CHAPTER 1. INTRODUCTION
1.1 Basics of Radar Signal Processing
1.2 Components of a Radar System
1.2.1 Constant False Alarm Rate Processing
1.2.2 Moving Target Indicator
1.2.3 Tracking Filter
1.3 Multi-Target Tracking Applications and History
1.4 The Neural Network Approach
1.4.1 What is Neural Computing?
1.4.2 Network Operation
1.4.3 Information Processing
1.4.4 Neural Network vs. Artificial Intelligence
1.5 Mathematical Preliminaries of Neural Networks
1.6 Organization of the Dissertation
1.7 Contributions of this Dissertation
CHAPTER 2. REVIEW OF MULTIPLE TARGET TRACKING THEORY
2.1 Overview of Linear Filtering
2.1.1 Radar Signal Representation
2.1.2 Statistical Description of Signals
2.2 Random Processes
2.3 Matched Filtering
2.4 Neyman-Pearson Criterion
2.5 Noise in the Radar Receiver
2.6 Radar Clutter
2.6.1 Clutter Statistics
2.7 Target Modeling
2.8 Overview of MTI & CFAR Processors
2.8.1 More on Doppler Effect
2.8.2 Radar Pulses with Doppler Shifts
2.8.3 Delay Line Cancellers
2.8.4 Adaptive MTI
2.9 Review of Current Methods in Target Tracking
2.9.1 Unknown Input Model
2.9.2 Multiple Model Approach
2.9.3 Multiple Hypothesis Testing (MHT) Method
2.9.4 Colored Noise Modeling of Maneuver
2.9.5 Variable Dimension Filter (VDF)
2.9.6 Input Estimation Model (IE)
2.10 Parallelism in Target Tracking
2.11 Sources of Nonlinearity and Their Problems
2.12 Data Association
2.12.1 Nearest-Neighborhood vs. All-Neighborhood Approach
2.12.2 Probability Data Association Filter (PDAF)
CHAPTER 3. A ROBUST NEURAL NETWORK SCHEME FOR CFAR DETECTION
3.1 Introduction
3.2 Development of NN-CFAR Scheme
3.2.1 Framework for Neural Network Training
3.2.2 Training with Optimum Detector as the NN Teacher
3.2.3 Selection of Input Features
3.2.4 Neural Network Architecture and Training
3.3 Robustness Evaluations of NN-CFAR
3.4 Conclusions
CHAPTER 4. NEURAL NETWORK IMPLEMENTATION OF THE MOVING TARGET INDICATOR
4.1 Introduction
4.2 Some Basics on MTI Designs
4.2.1 Current Approaches to MTI
4.2.2 The Radar Ambiguity Function
4.2.3 Transversal Filters
4.2.4 The MTI Improvement Factor
4.2.5 The Optimum MTI Processing Theory
4.3 Why Neural Networks for Implementation of MTI (NN-MTI)?
4.4 Neural Network Architectures for MTI
4.4.1 NN-MTI Doppler Shift Extraction from Pulse Series
4.4.2 Implementation of Pulse Canceller with Neural Networks
4.4.3 Analysis of NN-MTI Design with PRF Switching
4.5 Conclusions
CHAPTER 5. TARGET TRACKING BY NEURAL NETWORK MANEUVER MODELING
5.1 Introduction
5.2 Neural Network Implementation of Maneuver Modeling
5.2.1 Problem Formulation
5.3 The First Input Parameter
5.3.1 Statistical Properties of the Innovation Process
5.3.2 Estimation of States Using the Innovation Process
5.4 Optimum Bias Detection
5.5 The Second Input Parameter
5.5.1 Formulation of the Heading Estimate
State Equations and the Heading Estimate
5.6 The Third Input Parameter
5.6.1 Velocity Innovation Parameter
5.6.2 Quantization of the Noise Process
5.7 Generation of the Training Vectors
5.8 Neural Network Architecture and Training Data
5.9 Performance Evaluation
CHAPTER 6. CONCLUSION AND FUTURE WORK
6.1 Introduction
6.2 Specific Contributions
6.3 Directions for Further Research
REFERENCES
LIST OF ILLUSTRATIONS
Figure
3.1 Schematic diagram of CFAR detector
3.2 Neural network architecture for NN-CFAR
3.3 Performance of NN-CFAR in Experiment # 3
3.4a Comparison of NN-CFAR and CA-CFAR for N=33
3.4b Comparison of NN-CFAR and CA-CFAR for N=25
3.4c Comparison of NN-CFAR and CA-CFAR for N=17
3.4d Comparison of NN-CFAR and CA-CFAR for N=9
3.5 Target between two clutter patches
4.1 Schematic diagram of pulse cancelers
4.2 Amplitude response of pulse cancelers
4.3 Transversal filter for MTI processing
4.4 Frequency response of pulse cancelers with distinct PRFs
4.5 Radar pulse train
4.6 NN-MTI response for variable step sizes
4.7 Comparison of the NN-PC with a conventional canceler
4.8 Performance of NN-PC in separating slow and false targets
4.9 NN-PC and PRF switching with binomially weighted pulses
4.10 NN-PC and PRF switching with unweighted pulses
4.11 NN-PC performance in presence of heavy clutter
5.1 Magill bank of N parallel filters
5.2 Block diagram of the optimum bias detector
5.3 The Neural Network Maneuver Detector
5.4 Geometry for the target radial velocity measurement
5.5 Adaptivity to target maneuver through neural networks
5.6 A flowchart of the adaptivity to target maneuver
5.7 The two maneuver indicators together
5.8 Response of position innovation to maneuver
5.9 Indicators in case of consecutive maneuvers
5.10a NN & IE x-coordinate position errors
5.10b NN & IE y-coordinate position errors
5.10c NN & IE x-coordinate velocity errors
5.11 NN & IE y-coordinate velocity errors
5.12 Mean filtering errors for a variable acceleration profile
5.13 Mean filtering errors for consecutive maneuvers
5.14 An acceleration profile
5.15a Filtering error for IE method
5.15b Filtering error for NN method
5.16 Mean filtering errors for Singer models
5.17 Mean prediction errors for NN and Singer models
5.18 Mean filtering errors for a radial trajectory
5.19 The standard deviations for NN and Singer models
5.20 The mean filtering error for a non-radial trajectory
LIST OF TABLES
Table
3-1 Comparison of ADT for CA-CFAR with optimal detector
3-2a Comparison of ADT for NN-CFAR in Experiment # 2
3-2b Performance of NN-CFAR
3-3 Comparison of NN-CFAR and CA-CFAR
3-4a Performance of NN-CFAR with N=25
3-4b Performance of NN-CFAR with N=17
3-4c Performance of NN-CFAR with N=9
3-5 Performance of NN-CFAR in edge clutter
3-6 Performance of NN-CFAR in Experiment # 6
4-1 NN-MTI classification of doppler shift with 5 m/s
4-2 NN-MTI classification of doppler shift with 6 m/s
4-3 NN-MTI classification for 10 independent pulses
4-4 NN-MTI classification for 10 independent noisy pulses
4-5 Two-step classification of slow and fast targets
4-6 Low resolution doppler shift extraction by NN-MTI
4-7 High resolution doppler shift extraction by NN-MTI
4-8 Performance of the NN-MTI for noisy pulses
4-9 NN-MTI classification in the presence of clutter
5-1-1 Summary of Data for Experiment # 1
5-1-2 Summary of Performance for Experiment # 1
5-2-1 Summary of Data for Experiment # 2
5-2-2 Summary of Performance for Experiment # 2
5-3 Summary of Performance for Experiment # 3
5-4 Summary of Performance for Experiment # 4
ABSTRACT
Until the recent past, almost all proposed methods for detection and tracking
of maneuvering targets in clutter have followed the algorithmic path. For most
multi-target tracking problems, however, the algorithmic approach generally
requires a speed and a degree of parallelism that are far beyond the
capabilities of available computational resources.
This dissertation investigates the development of neural network-based
methods for detection and tracking of maneuvering targets in a cluttered
background and focuses on three major operations required for this overall
task. A detection scheme is developed by utilizing the pattern classification
ability of a trained neural network, which helps in a better representation of
the clutter and the targets. Utilizing the mapping property of neural networks,
a higher probability of detection is achieved while preserving a constant rate
of false alarm. The second unit is a Moving Target Indicator (MTI) which is
trained through examples in order to integrate a series of noisy radar pulses
and provide estimates of target radial velocity.
For the problem of tracking a maneuvering target, conventional algorithms
employ a Kalman filter which provides estimates of the target position and
velocity. While a Kalman filter is the most powerful linear estimator for
continuous random variables, it may fail to converge in the presence of sharp
measurement discontinuities which may be caused by clutter or sudden target
maneuvers. A multilayer feedforward neural network in conjunction with a Kalman
filter can better resolve the discontinuity in the measurement sequence. In the
new approach proposed here, a neural network is trained to provide an on-line
estimate of the necessary artificial noise components which will help
neutralize the corresponding bias in Kalman filter estimates of target
kinematic parameters.
CHAPTER 1
INTRODUCTION
1.1. Basics of Radar Signal Processing
Radar is a system which is used for detection as well as location of objects.
The word radar is an acronym derived from the phrase radio detection and ranging.
It has many useful engineering applications in which several extremely sophisticated
algorithms as well as engineering design complexities may be encountered. Radar
is an important remote sensing instrument. It is also an essential requirement for
surveillance systems that may employ more than one sensor. Sensors report both
target and background information. Background information occasionally includes
clutter, noise, intelligent interference, as well as false alarms that may obscure the
true target information. These unwanted signals, together with internal sensor
noise, add to the uncertainties in the kinematics of the target. The problem gets
even more complicated in scenarios where more than one sensor and more than one
target are present. Furthermore, parts of the background clutter may be stationary
while some other parts may be moving with some speed.
The primary objective in radar signal processing is to partition the sensor
data into sets of observations, or tracks produced from the same source.
Observations from sensors may be received either in the form of quantitative
measurements, such as the number of existing targets, estimates of target
velocity, future predicted position, or as higher level observations such as
target classification, shape, and other attributes.
Radar signals in particular carry a lot more information than can be
extracted by today's digital processing methods. There are many reasons for
this shortcoming. First, radar signals are analog in nature, and conversion to
digital representation poses some problems. Particularly in the non-uniform
cluttered environment where the clutter may have a large dynamic range in
amplitude variation, linear filters fail to keep up with the resolution
requirements. The limitation of A/D (i.e., analog-to-digital conversion)
technology further complicates the problem. The limited resolution of A/D
converters does not usually allow the full speed capability of current digital
processors to be used. As we move higher in the electromagnetic spectrum, such
as laser radar or infrared detectors, the problem gets even worse. The effect
of a sudden change in clutter amplitude is sensed by the radar receiver as a
large step function at the input of a linear filter that can mask the detection
of other slowly varying targets. The second major limitation of current digital
methods is the conflict between the need for more reference cells to sample the
background information and the limited processing power available for such high
data rates. Therefore, the computational requirement is the major reason that
in most detection schemes the target background is rejected before further
processing (e.g., tracking, identification, situation assessment,
classification) on the target itself is started. Neural networks, on the other
hand, provide a means of processing more information such that the target and
the background clutter may well be processed together in a more efficient way.
1.2. Components of a Radar System
Depending on the particular application, a radar system is composed of
many components and subsystems. One of the major applications of radar
systems is in the surveillance of an environment. Surveillance includes
detection and tracking of multiple targets in a cluttered background. A
complete system that performs this task is referred to as a Multiple Target
Tracking (MTT) system. In MTT systems, one or a collection of sensors are used
to gather information about the targets of interest. This information is then
filtered through complex signal processing units and valid target returns are
then used for further data processing to extract features of interest in the
set of observations.
There are three main units in an MTT system. The primary function of the
first stage, which is called the Constant False Alarm Rate (CFAR) processing unit,
involves detection of the target against the background clutter. After the target
has been detected, the next stage is to determine whether the target is stationary
or moving with some velocity. This is referred to as MTI, which stands for Moving
Target Indicator. For a stationary target, the challenge is to suppress the clutter
from a fixed background, such as the ground clutter in the detection of a stationary
vehicle. On the other hand, in the detection of a moving target, the background
clutter could be due to an aggregation of birds or other interfering targets that
move in the neighborhood of the target. The final stage of the MTT system is to
maintain the track of each individual target. Each track file may include target
position, velocity, as well as other features attributed to the target.
As we move from detection to tracking, the problem gets progressively more
complex. This increased complexity is due to the fact that at each level of
processing, there are always some clutter data that remain unfiltered and they
further corrupt the input to the next unit. A target track may totally get lost
during a maneuvering period if the target's sudden acceleration input is not
well estimated by the tracking filter. As more clutter data leak through the
other units into the tracking filter, the process of estimating the target
acceleration will become more difficult. We now briefly describe the role of
each unit. Further details on each of these units will be given in Chapter 2.
1.2.1. Constant False Alarm Rate Processing
A false alarm is defined as the detection of a noise signal instead of the
true target signal. It takes place when the noise amplitude crosses the
threshold which is set for detection of the target signal. We may have false
alarms in each unit. However, it is primarily the job of the detection units
(i.e., CFAR or MTI) to preserve a constant false alarm rate. That is, the
received pulses from the target are integrated and the threshold at the output
of a receiver is set to achieve a desired probability of false alarm. The
receiver noise is generally modeled with a Gaussian distribution, whereas the
clutter distribution is non-homogeneous and non-Gaussian. Conventional target
detection schemes are based on the matched filtering technique, which attempts
to optimize the signal-to-noise ratio. The presence of unknown clutter,
however, degrades the optimality of matched filtering because correlated
clutter is non-white, and matched filters are optimal if and only if the
background is white Gaussian noise and the signal shape is known. In usual
practice, a pre-whitening filter is used to decorrelate the interference from
the signal. The signal is then passed through a CFAR processor for final
processing. The CFAR detector observes the noise or clutter background in the
vicinity of the target and adjusts the threshold in accordance with the
measured background.
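The adaptive-threshold idea described above can be illustrated with a minimal cell-averaging (CA) CFAR sketch. This is not the detector developed in this dissertation; it is a textbook CA-CFAR written under stated assumptions: the input is a one-dimensional array of square-law detector outputs, the background power is exponentially distributed, and the function name `ca_cfar` and its parameters (`num_ref`, `num_guard`, `pfa`) are hypothetical names chosen for illustration.

```python
import numpy as np

def ca_cfar(power, num_ref=16, num_guard=2, pfa=1e-3):
    """Cell-averaging CFAR over a 1-D array of detector output powers.

    Assumes exponentially distributed background power and an even
    number of reference cells split equally on both sides of the cell
    under test (illustrative assumptions, not taken from the text).
    """
    power = np.asarray(power, dtype=float)
    half = num_ref // 2
    # Threshold multiplier that gives the desired false alarm probability
    # when the background level is estimated from num_ref cells.
    alpha = num_ref * (pfa ** (-1.0 / num_ref) - 1.0)
    detections = np.zeros(power.size, dtype=bool)
    for i in range(half + num_guard, power.size - half - num_guard):
        # Reference cells on each side, skipping the guard cells.
        lead = power[i - num_guard - half : i - num_guard]
        lag = power[i + num_guard + 1 : i + num_guard + 1 + half]
        noise_level = (lead.sum() + lag.sum()) / num_ref
        # The threshold follows the measured background.
        detections[i] = power[i] > alpha * noise_level
    return detections

# Example: flat background of unit power with one strong return.
background = np.ones(64)
background[32] = 100.0
hits = ca_cfar(background)
```

Because the threshold is a multiple of the locally measured background, raising the overall clutter level raises the threshold by the same factor, which is what keeps the false alarm rate constant in a homogeneous background.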
Detection is a general problem and is not specific to radar signal
processing. For example, signal detection has many medical applications. The
major requirement in such applications is a form of parallel or sequential
processing that looks at several items in the background before a decision can
be made about an abnormal signal that could represent some sort of a disease.
Moreover, signals that are reflected from natural objects such as the heart,
blood vessels in the human body, trees or a group of birds in the environment
are more complicated to model than echo signals from man-made objects.
Therefore, novel detection schemes that can intelligently utilize the parallel
distributed processing power of neural networks seem to be necessary in order
to perform such complex tasks.
1.2.2. Moving Target Indicator
A Moving Target Indicator (MTI) makes use of an important physical property
of radar electromagnetic pulses. Each time a pulse hits a moving target, it
is either compressed or stretched in wavelength depending on whether the target
is approaching or moving away from the radar. The MTI processor requires more
signal processing than a CFAR unit. In fact, a complete CFAR processor should
include doppler processing as well. This is primarily due to the fact that each
pulse may be corrupted by false doppler shifts due to clutter motion in the
vicinity of the target and there are several pulses that have to be considered.
The presence of correlated clutter can make the integration of pulses a rather
difficult task. That is, the integration of pulses is useful only if the
clutter has an independent effect on the pulses.
Depending on the wavelength of operation, MTI has different zones of blind
speed. That is, an MTI unit suppresses the clutter at certain velocity ranges.
Therefore, a target might be moving at some velocity which is not detected by the
MTI unit. Since the returned pulse amplitudes are modulated by the corresponding
doppler shift of the target, the job of an MTI unit is to calculate a set of optimal
weights for the pulses such that the weighted combination of pulses would result
in useful information about the target radial velocity.
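The relation between radial velocity, doppler shift, and blind speeds can be made concrete with a short sketch. It uses the standard relations for a monostatic radar, f_d = 2 v_r / lambda, and for a single delay-line (two-pulse) canceller, whose response has nulls at multiples of the pulse repetition frequency (PRF). The two-pulse canceller and all function names are assumptions chosen for illustration; the text above does not commit to a particular MTI filter.

```python
import math

def doppler_shift_hz(radial_velocity_mps, wavelength_m):
    """Doppler shift of a monostatic radar echo: f_d = 2 * v_r / lambda."""
    return 2.0 * radial_velocity_mps / wavelength_m

def blind_speeds_mps(wavelength_m, prf_hz, count=3):
    """First `count` blind speeds, v_n = n * lambda * PRF / 2."""
    return [n * wavelength_m * prf_hz / 2.0 for n in range(1, count + 1)]

def two_pulse_canceller_gain(doppler_hz, prf_hz):
    """Amplitude response |2 sin(pi * f_d / PRF)| of a two-pulse canceller."""
    return abs(2.0 * math.sin(math.pi * doppler_hz / prf_hz))

# Example with assumed values: 10 cm wavelength and 1 kHz PRF.
wavelength = 0.1
prf = 1000.0
first_blind_speed = blind_speeds_mps(wavelength, prf, count=1)[0]
gain_at_blind_speed = two_pulse_canceller_gain(
    doppler_shift_hz(first_blind_speed, wavelength), prf
)
```

A target moving at a blind speed produces a doppler shift equal to a multiple of the PRF, so the canceller response is essentially zero and the target is suppressed together with the stationary clutter.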
1.2.3. Tracking Filter
The tracking filter is the final stage of the MTT system. There are many
noisy samples together with clutter data that enter the set of measurement
inputs for the tracking filter. A measurement set is defined as the set of all
candidate target returns. For example, if a total of 20 pulses are transmitted
by radar, some of them may hit the target and return with some noise added to
them. Some other pulses may just hit different objects in the neighborhood of
the target as the radar scans through the target. Therefore, we may have
several returns from a target that have to be processed in order to come up
with a more accurate estimate of the target features (e.g., position and
velocity). Tracking in a multitarget, cluttered, multisensor environment is
characterized by uncertainty in the origin of the measurements. The tracking
function consists of: a) filtering, which is an estimation of the current state
of the target and b) accuracy, which is represented by the covariance matrix of
the predicted estimates. These two functions are complicated due to the
uncertainties in the dynamical models of the targets and the noisy measurements
of the desired states (i.e., position and velocity).
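As an illustration of the filtering and covariance functions just described, the following is a minimal sketch of one predict/update cycle of a Kalman filter. It assumes a one-dimensional constant-velocity target model with position-only measurements; the model, the noise levels `q` and `r`, and the function name `kalman_step` are illustrative assumptions and are not the tracking filter developed later in this dissertation.

```python
import numpy as np

def kalman_step(x, P, z, dt=1.0, q=0.01, r=1.0):
    """One predict/update cycle for state x = [position, velocity].

    Assumes a constant-velocity model and a position-only measurement z
    (illustrative assumptions).
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])            # state transition
    H = np.array([[1.0, 0.0]])                       # measurement matrix
    Q = q * np.array([[dt**4 / 4, dt**3 / 2],
                      [dt**3 / 2, dt**2]])           # process noise covariance
    R = np.array([[r]])                              # measurement noise covariance
    # Prediction of the state and of its covariance.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Innovation: difference between the measurement and its prediction.
    innovation = np.atleast_1d(z) - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)              # Kalman gain
    # Filtered state estimate and its covariance.
    x_new = x_pred + K @ innovation
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new, innovation

# Example: noise-free positions of a target moving at 2 units per step.
x_est = np.zeros(2)
P_est = 100.0 * np.eye(2)
for k in range(1, 51):
    x_est, P_est, _ = kalman_step(x_est, P_est, 2.0 * k)
```

The returned covariance quantifies the accuracy of the estimate, and the innovation sequence is the quantity that grows when the assumed target model stops matching the measurements, for example during a maneuver.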
1.3. Multi-Target Tracking Application & History
Multiple target tracking systems are designed for a variety of applications.
The applications include surveillance and tracking of targets, instrumentation,
air traffic control, ground mapping, detection and recognition of objects,
robotic vision systems, and infrared vision homing missile systems, to name a
few. Radar tracking has a twenty-year history behind it. Operators of radar
systems used to connect "blips" on radar screens. In 1955, Wax [1] noticed the
similarity between the radar tracking problem and a fundamental problem in
nuclear physics, where it is required that the path of an actual particle be
identified against a background of random noise. He proposed that the elements
of initial track formulation (birth), track maintenance (life), and track
deletion (death) were common to all problems of multiple target tracking.
Modeling of radar signals as time series of stochastic processes was introduced
soon afterward.
The next major breakthrough in MTT theory came in 1964 with a paper
published by Sittler [2] on Bayesian formulation. The use of Bayesian theory was
a first step in the recursive estimation of states of interest. Sittler's work
occurred before the Kalman filtering approach was widely adopted for
recursive state estimation and prediction of the target position. Thus, it was
not until the early 1970's that the MTT theory became a major topic of interest.
The papers by Bar-Shalom and Singer [3,4] heralded the development of modern MTT
techniques that combined correlation and Kalman filtering theory.
An eminent scientist who introduced the stochastic filtering theory is
Kolmogorov. He was among the first to study minimum mean-square estimation
in stochastic processes during the late 1930's. He also stated a fundamental
theorem that has been used in explaining the mapping properties of multilayer
perceptrons, which forms the basis for trajectory interpolation and estimation
used in this dissertation [5]. The relevant work by Wiener [6] introduced and
solved the problems of linear filtering and prediction for stationary random
processes which can be described in terms of their power spectral densities.
The underlying mathematical difficulties facing Wiener's solution generated
additional research that led to the development of the recursive time domain
approach by Levinson and later by Kalman and Bucy [1]. Kalman and Bucy used the
state-transition models for dynamic systems. The theory was then extended into
filtering and prediction in nonlinear dynamic systems as well as to adaptive
estimation problems, which are the focus of this dissertation.
1.4. Neural Network Approach
Parallel distributed processing is a new architecture for fast processors that
provides powerful computational capabilities required for most target tracking
problems. In tracking multiple maneuvering targets in a cluttered environment,
the requirement of a large bandwidth for multiple sensors puts a high load on the
computational efficiency of the tracking system. The computational effort required
for tracking n targets can grow exponentially with the number of sensors observing
the scene or the number n of targets present.
There is an inherent parallelism in the nature of tracking multiple targets
that, if used efficiently, can render the tracking system less dependent on the
number of targets present. Neural network architectures can remove some of the
limitations of model-based maneuver tracking filters. With regard to the
background clutter, the trainability of a neural network and the distributed
nature of its memory allow it to store the various clutter distribution
parameters through training, which results in a better method of representing
the target and the clutter fluctuations.
1.4.1. What is Neural Computing?
Neurocomputing is a computational process inspired by the human cognition
processes. Simple neural structures can handle complex combinatorial functions
that even the most powerful digital computers cannot. The neural network is an
analogy to the human brain. Each neuron is a simple processing unit which
receives signals as inputs and performs a weighted summation followed by a
nonlinear thresholding on them. When the total signal magnitude is large enough
to pass the threshold, the neuron is said to fire, producing an output signal.
The human brain consists of billions of neurons that are densely
interconnected. Artificial neural networks are also based on this structure and
the elements are usually organized into groups called layers. A typical network
consists of a sequence of layers which are fully or partially connected. The
connection may be only in a forward direction, or the network may include
lateral as well as feedback connections. The way the processing elements are
connected determines the architecture of the neural network. Each network
structure is able to learn a class of problems and training for each network
may be different.
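The single-neuron operation described above, a weighted summation followed by thresholding, can be sketched in a few lines. The hard-limiting output, the function name `neuron_output`, and the example weights are illustrative assumptions.

```python
import numpy as np

def neuron_output(inputs, weights, threshold):
    """Threshold neuron: fires (returns 1.0) when the weighted sum of
    the inputs reaches the threshold, otherwise returns 0.0."""
    activation = float(np.dot(weights, inputs))
    return 1.0 if activation >= threshold else 0.0

# Example with assumed weights: the neuron fires only when both inputs are on.
both_on = neuron_output([1.0, 1.0], [0.6, 0.6], threshold=1.0)
one_on = neuron_output([1.0, 0.0], [0.6, 0.6], threshold=1.0)
```

With these weights the neuron behaves like a logical AND of its two inputs; changing the weights or the threshold changes the function it computes, which is what training adjusts.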
1.4.2. Network Operation
There are two main phases of operation in neural networks, namely learning
and recall. Learning, by definition, is the process of adapting or modifying the
connection weights in response to input vectors. If the desired response is
provided at the output to compare with the actual network output, then the
learning process is called supervised learning [5]. If no desired output is
shown to the network, then the learning is called unsupervised learning [5].
There are other forms of learning that lie between these two types of learning.
Whatever the learning procedure, there must be some sort of learning rule that
specifies how the weights are adapted in response to new examples. The required
training for a neural network is generally a long process and may require
several thousands of examples to be presented to the network many times. The
learning parameters may change over time to modify the learning speed. The
control of the learning parameters is called a learning schedule [5].
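A minimal sketch of supervised learning with a learning rule and a learning schedule is given below. It uses the delta (least-mean-squares) rule on a single linear neuron, with a learning rate that decays over the training epochs. The choice of rule, the decay formula, and the names `train_delta_rule`, `lr0`, and `decay` are assumptions made for illustration and are not the training procedure used in later chapters.

```python
import numpy as np

def train_delta_rule(inputs, targets, epochs=200, lr0=0.1, decay=0.01):
    """Train a single linear neuron with the delta (LMS) rule.

    The learning rate follows a simple decaying schedule,
    lr = lr0 / (1 + decay * epoch), which is an illustrative choice.
    """
    inputs = np.asarray(inputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    weights = np.zeros(inputs.shape[1])
    for epoch in range(epochs):
        lr = lr0 / (1.0 + decay * epoch)      # learning schedule
        for x, desired in zip(inputs, targets):
            error = desired - weights @ x     # desired minus actual output
            weights += lr * error * x         # learning rule (weight update)
    return weights

# Example: recover the mapping y = 2*x1 - x2 from four training examples.
examples = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
desired_outputs = examples @ np.array([2.0, -1.0])
learned = train_delta_rule(examples, desired_outputs)
```

The same examples are presented to the neuron many times, and each presentation moves the weights a small step in the direction that reduces the output error.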
An incoming input vector mayor may not have been presented to the net
work before. IT the learning process has been completed successfully, then the
network will be able to recall the desired response to any given input vector. The
simplest form of a network is one which has no feedback connections from one
layer to another, or from one neuron to itself. This kind of network is called a
Feedforward Network [5]. In feedforward networks, the information is passed from
the input buffer through the hidden layers (the intermediate layers) to the output
layer. Feedforward networks are powerful mapping tools since they use nonlinear
transformation functions.
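The flow of information through a feedforward network can be illustrated with a minimal sketch. The layer sizes, the weight values, the bias terms, and the choice of a hyperbolic tangent transfer function are assumptions made only for illustration; they are not taken from the networks developed later in this dissertation.

```python
import math

def forward(x, layers):
    """Propagate an input vector through a list of (weights, biases) layers.

    Each layer forms a weighted sum per processing element and passes it
    through a tanh transfer function (an assumed choice of nonlinearity).
    """
    activation = list(x)
    for weights, biases in layers:
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation

# Hypothetical network: 2 inputs -> 3 hidden elements -> 1 output.
hidden = ([[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]], [0.0, 0.1, -0.1])
output = ([[0.7, -0.2, 0.4]], [0.05])
y = forward([1.0, -1.0], [hidden, output])
print(y)
```

Because there are no feedback connections, a single pass from the input buffer to the output layer is enough to compute the response.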
1.4.3. Information Processing
Neural networks offer new information processing capabilities. Their architectures
are quite different from those of conventional digital computers. There
are parallel processors based on serial machines, but the neural network parallel
processing is different. Conventional computers process inputs one item at a time
and, in a sense, they lose the overall picture. Sequential processing has always had
a difficult time detecting the patterns hidden in the information presented due to
an inability to gain a view of the whole picture. Neural networks process many
inputs at once and work toward reaching a stable picture. For this reason, neural
networks can do a lot of tasks that are nontrivial for conventional computers with
considerably less effort. Tracking the state of a dynamic system, as an example,
is very time consuming and expensive for a sequential machine, while a human
observer can very well identify an object and track its trajectory as well as making
higher level decisions about the object.
Tracking demands an immense amount of computational power. It is a very
complex process whose success depends on the availability of computer resources
and their processing speed. As such, even powerful computers will soon reach their
capability limits in typical tracking problems. As massively parallel processors,
neural networks have proven to be extremely useful in many problems of practical
interest with tracking. Furthermore, using conventional parallel processing requires
complex programming skills to take full advantage of the conventional parallel
machines. Sequential processing is good for procedural tasks that can be put in
ordered steps and parallel processing with this architecture is extremely difficult
for the class of problems that can't be put in an exact mathematical or procedural
form.
There are a number of technical problems for which neural networks have
demonstrated a strong potential for providing efficient solutions. These problems
include but are not limited to pattern recognition, classification, speech processing,
image understanding, radar processing, robotic control, target tracking, and missile
guidance. Once the network is well-trained, it can respond to the problems almost
instantly. The same architecture and hardware can be applied to a variety of
other problems as well. Although most of the neural network research requiring
simulations is still done on sequential machines, analog neural network circuits have
already been built and tested [5]. The processing elements are only a first order
approximation of biological neurons. Biological neurons are known to perform a lot
more than the simple operation of summation and thresholding in artificial neurons.
Therefore, neural network computing (neurocomputing) is about new architectures
for computing machines that complement the serial processing machines.
Neurocomputing is inspired by the surprisingly massive parallelism of the
brain. As an example, neurobiology has discovered that even in the human brain,
certain architectures exist that do particular tasks and this knowledge has shed
some light on the state of the neural network research. Therefore, instead of using
a large network to learn a complex problem, it is better to use a number of smaller
networks such that each network is trained with a specialized task. As neurobiology
discovers new frontiers about the human brain, more will be known about
neurocomputing. Neural network technology is a multidisciplinary field which is
what makes it grow so fast.
1.4.4. Neural Networks vs. Artificial Intelligence
In Artificial Intelligence and expert systems, knowledge is made explicit
through the formation of rules. Every rule can be defined by some sort of mathematical
function. The function may be arbitrarily complex and highly nonlinear.
A function however is a metaphor and transforms data into another form. If rules
can be defined by functions, the problem can be presented as a pattern-matching
problem. The nonlinear transfer functions of processing elements in a neural network
allow representation of any arbitrary nonlinear function. The learning of the
function by a neural network means that the network has learned the rule. Training
can be performed by examples and the network can extract the rules through
the examples. The main function of a neural network is pattern recognition.
Traditional expert systems and statistical systems also use pattern recognition
as a scheme. Although neural networks do the same, they do it more efficiently.
Pattern recognition requires the ability to match large amounts of information simultaneously
(in parallel) and then produce a categorized output. Learning, on
the other hand, is the extraction of rules and generalization of the rules to similar
classes of problems. Neural networks can internally build the structures and features
pertinent to a problem. This is in comparison to the statistical techniques
that require more resources and information processing before they can solve classification
problems. Neural networks can organize the data and extract higher order
statistics and learn from data with minimal external intervention.
Knowledge representation is useless unless efficient recallability of the knowledge
is also available. Related data bases and knowledge matching are the basic
requirements of AI units. Once knowledge is represented, it should be efficiently
updated, which means knowledge is preserved as modules that need to be refreshed
or fused to new pieces of information. Processing knowledge with structured programming
techniques is a very difficult task since it is a distributed process. As
such, many unlike pieces of information need to be put together to produce a
high level decision. It is also very expensive to do knowledge processing. Neural
network associative memory and real time pattern matching provide extremely
powerful tools for approaching fuzzy and inexact algorithmically complex problems.
Knowledge is represented in a neural network through connection weights.
The weights are the memory units of a neural network and are well distributed
throughout the network.
The fault tolerance of a neural network is due to the distribution of information
among the weights. If some processing elements are destroyed, or some
connections are altered, the performance of the network would degrade gracefully
and this is because, unlike traditional computing systems, information is not contained
in one place [5]. The fault tolerant characteristics of neural networks make
neural computing systems extremely well suited for applications in which failure in
control or damage to memory units can cause disastrous results such as in nuclear
power plants, air traffic control, space operations, missile guidance, and the like.
1.5. Mathematical Preliminaries for Study of Neural Networks
Each processing element of a neural network has a transfer function [5,93].
Transfer functions are generally nonlinear functions such as a sigmoid, or a hyperbolic
tangent function. They have a fundamental role in the adaptation of the
network and determine the sensitivity of the processing element to its input from
previous modes. Processing elements can receive inputs belonging to multiple input
classes. Separate inputs to each mode may play the role of an activation signal
derived from a common scheduling process to activate a set of processing elements
synchronously. The parallelism in updating provides a means for parallelization
of neurocomputing hardware. A typical processing element input from a class of
inputs is a weighted sum in the following form
$$ I_k = \sum_{j \in K} w_{kj} x_{kj} \tag{1.1} $$
where $w_{kj}$ is the weighting factor for class $k$ inputs on the $j$th input of class $k$.
The weighting coefficient $w_{kj}$ is called the local memory of the processing element
associated with class $k$. The mathematical data type of these weights associated
with the connections of each class of inputs can be defined as desired. The processing
element output is usually a signal in the range $\{-1, +1\}$ and can be either
digital or analog. For processing elements with $n$ outputs, the output vector is
$x = (x_1, x_2, \ldots, x_n)^T$ where $x_i$ represents the output signal of the $i$th element.
The domain of the vectors is an n-dimensional cube
As mentioned earlier, weights are adaptive coefficients within the network
that determine the intensity of the input signal. They are adjusted according to a
learning rule and the network learns by adjusting its weights according to the rule.
Some initial weight vectors should be assumed to start the learning. Assuming
there are $N$ processing elements in a layer, the weight vector for that layer may
be represented by $W = (w_1, w_2, \ldots, w_N)^T$. Assuming that there are $n$ weights
associated with each processing element, then $w_i$ would be the weight vector for
each processing element. Further details can be found in [5].
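The idea of a learning rule adjusting the weights of a processing element can be sketched as follows. The delta (least-mean-square) rule, the learning rate, the linear output, and the two training examples are all assumptions chosen for illustration; the text above does not commit to any particular rule.

```python
def weighted_sum(weights, inputs):
    """Weighted sum of the inputs, in the form of equation (1.1)."""
    return sum(w * x for w, x in zip(weights, inputs))

def delta_rule_step(weights, inputs, target, rate=0.1):
    """One update of the (assumed) delta rule: w <- w + rate * error * x."""
    error = target - weighted_sum(weights, inputs)
    return [w + rate * error * x for w, x in zip(weights, inputs)]

# Start from an assumed initial weight vector and present the examples repeatedly.
weights = [0.0, 0.0]
examples = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0)]   # hypothetical (input, desired output) pairs
for _ in range(200):
    for inputs, target in examples:
        weights = delta_rule_step(weights, inputs, target)
print(weights)
```

After repeated presentation of the examples the weights settle near the values that reproduce the desired outputs, which is the sense in which the weights act as the memory of the element.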
1.6. Organization of the Dissertation
This dissertation mainly focuses on the employment of static multilayer
neural networks for the adaptive detection and tracking of a maneuvering target in
a cluttered environment. To show the diversity of available algorithms for detection
and tracking of a maneuvering target in clutter, some of the schemes that are often
discussed in the literature are briefly outlined in Chapter 2. We will also specifically
describe the mathematical preliminaries of the MTT subsystems in Chapter 2.
Chapter 3 concentrates on the design and implementation of a Neural
Network-based Constant False Alarm Rate (NN-CFAR) processor. We shall
demonstrate the superior performance of the NN-CFAR processor over the traditional
Cell Averaging Constant False Alarm Rate (CA-CFAR) processor. We
will also discuss the statistical properties of the input parameters that are carefully
selected for a good representation of the background clutter. Although the
discussion is basically over the training of the neural network in a homogeneous
background, it will be shown through several simulation runs that the method can
be extended to a non-homogeneous background as well. It will be shown further
that the use of additional input parameters to the neural network can greatly improve
the CFAR detection performance particularly when the number of reference
cells is less than about 30.
A neural network implementation of a Moving Target Indicator (NN-MTI)
will be discussed in Chapter 4. Different neural network architectures will be presented
with different sets of inputs to demonstrate the neural network properties
in application to Coherent Pulse Integration (CPI). We will show how the neural
network can compute the optimal weights for the received pulses in order to extract
the Doppler information. This will be shown both in the absence and in the
presence of clutter in the vicinity of the target. The extraction of the Doppler shift
directly through amplitude distribution of the pulses can eliminate many complex
processings required by the traditional methods.
In Chapter 5, use is made of the nonlinear mapping property of neural networks
to implement a hybrid maneuver detector and compensator. We shall discuss
how a combined set of parameters may reflect sufficient information about a target
maneuver. We will then use these parameters as inputs to a neural network. The
two outputs are then used to compensate for the bias which has been accumulated
during the previous sampling period. We also discuss how this scheme can
be extended to other types of target maneuvers (e.g., circular). We will perform
an in-depth analysis of the mathematics involved in the compensation property of
this unit. Comparison with some of the powerful traditional maneuver detector
and compensator techniques will illustrate that the neural network can be used
in conjunction with the Kalman filter for realizing an on-line adaptive tracking
filter, particularly when the duration of acceleration is comparable to the sampling
period.
The dissertation is concluded in Chapter 6 which summarizes the specific
contributions and outlines some possible extensions for further research. These
suggestions include the use of recurrent networks for some typical radar detection
and tracking problems as well as integration of target tracking subsystems with
guidance units.
1.7. Contribution of the Dissertation
Radar signals are processed and analyzed through time series representation
of the amplitude and phase of the returned echoes. In recent years stochastic
models of signals have provided a major mathematical framework for a good
representation of random processes such as echoes from natural objects or other
man-made fluctuating targets. However, analysis and design of signals have been
formidable tasks when more than one or two parameters are involved in signal
representation or other parameter estimations related to the random process are
required. Elimination of the requirement for a precise and detailed mathematical
model of radar signals together with the parallel processing capabilities useful for
on-line implementation of detection and tracking algorithms make neural networks
very valuable, particularly in dynamical situations [94-106]. In this dissertation, we
have concentrated on three principal directions that play important roles in almost
any complete radar system. In this regard, the contribution of the dissertation can
be outlined as follows.
1) Performance of conventional Constant False Alarm Rate (CFAR) processors
degrades sharply as the number of reference cells (i.e., clutter samples
in the vicinity of the target) decreases. The need for more reference cells
in turn is due to the statistical requirements for the parameters that are
used in designing these processors. In reality, however, only a limited number
of reference cells is usually available. This may be either due to radar
constraints such as resolution and sampling time or the presence of other
interfering targets as well as clutter patches in the vicinity of the primary
target. In this research, a novel adaptive autodetection technique is developed
through the Neural Network implementation of Constant False Alarm
Rate processor (NN-CFAR) that combines a number of parameters and
uses them in a single processor. A multilayer feedforward network with a backpropagation
training scheme is used. The Optimum Detector is used as the
main source of training examples. The NN-CFAR, which is designed for the
homogeneous background, enhances the performance of the traditional CA-CFAR
by the use of additional parameters. Performance of NN-CFAR is
then compared with the CA-CFAR processor. It is shown that NN-CFAR
performs better than the conventional CA-CFAR processor particularly when
the number of reference cells decreases. This is a very important feature in
practical implementations where the number of resolution cells might be limited.
Other advantages are the speed of response and the ease of hardware
implementation.
2) The Neural Network implementation of a Moving Target Indicator (NN-MTI)
is also presented as an alternative solution to the conventional pulse
cancellation techniques. Radar pulse amplitudes are modulated by the Doppler
frequency shift which is due to a physical property of electromagnetic interaction
of pulses with a moving target. We make use of a multilayer neural
network to calculate the optimal weights that are necessary for Coherent
Pulse Integration (CPI) in order to extract the Doppler shifts directly without
further complex electronic hardware.
3) A hybrid approach to maneuver modeling is presented for tracking a maneuvering
target using a multilayer feedforward neural network that works
in conjunction with a Kalman filter. The network is trained for a straight
line motion of the target executing sudden longitudinal accelerations. This
approach provides a framework for a fundamentally different method of maneuver
modeling compared to those which employ purely statistical techniques.
In particular, these latter techniques require more samples from the
target to compensate for the bias which is induced by target acceleration.
Furthermore, inclusion of more features in the estimation process for a more
accurate maneuver modeling results in an increase in computational complexity,
particularly in situations involving short term accelerations. With
neural network maneuver modeling, on the other hand, we can minimize
the required number of samples by utilizing more features. It is shown that
the neural network maneuver modeling scheme performs better than purely
statistical techniques particularly when the duration of acceleration is
comparable to the period of measurement.
CHAPTER 2
REVIEW OF MULTIPLE TARGET TRACKING THEORY
2.1. Overview of Linear Filtering
In a radar environment, the data rate is generally higher than the rate that
can be efficiently handled by the available processing power of digital computers.
Several detection schemes based on statistical linear signal processing techniques
have been proposed in the literature [15-17]. In almost all radar detection mechanisms,
decisions are made based on comparing the output of the receiver with
some threshold level. If the envelope of the receiver output is greater than the
threshold, a signal is said to be present. The decision, however, depends on the
rate of false alarm that can be tolerated. Preserving a constant false alarm rate is
peculiar to radar detection. A false alarm means that the detector declares that a
target is present while the true detected signal is due to noise only. In the literature
of statistics, this is called a Type I error [8]. There is also another type of error,
the Type II error, which is due to missed detection and occurs when the signal amplitude
is below the threshold setting and the signal is therefore declared as noise. It is not
feasible to minimize both types of errors simultaneously. Hence, in radar detection, one
attempts to minimize the Type II error at the cost of accepting some Type I error (i.e., false alarm).
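The trade-off between the two error types can be illustrated numerically. The sketch below assumes a simplified model that is not taken from the text: a single observation that is Gaussian with unit variance, with mean 0 under noise only and mean 1 when a target is present. The function name and parameter values are hypothetical.

```python
from statistics import NormalDist

def error_probabilities(threshold, signal=1.0, sigma=1.0):
    """Return (P_false_alarm, P_miss) for a simple threshold test.

    Assumed model: observation ~ N(0, sigma^2) under noise only and
    ~ N(signal, sigma^2) when the target is present.
    """
    noise_only = NormalDist(0.0, sigma)
    signal_plus_noise = NormalDist(signal, sigma)
    p_false_alarm = 1.0 - noise_only.cdf(threshold)   # Type I error
    p_miss = signal_plus_noise.cdf(threshold)         # Type II error
    return p_false_alarm, p_miss

for threshold in (0.0, 0.5, 1.0, 1.5, 2.0):
    p_fa, p_miss = error_probabilities(threshold)
    print(f"threshold={threshold:.1f}  P_fa={p_fa:.3f}  P_miss={p_miss:.3f}")
```

Raising the threshold lowers the false alarm probability and raises the miss probability, so one of the two has to be fixed before the other can be minimized.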
A decision criterion which has been widely accepted by the radar community
is the Neyman-Pearson observer. In this technique, the threshold is fixed by
allowing a certain false alarm probability. This is called Constant False Alarm
Rate (CFAR) processing. The signal-to-noise ratio of a single pulse, however, is
not always enough to reliably accept the decisions and the probability of detection
can be increased by utilizing multiple pulses as opposed to a single pulse. If the
shape of the signal were known as well as the background, then a matched filter
preceded by a whitening filter could be developed to result in an optimal decision.
Since there are so many random parameters affecting both the signal and the background,
samples from the background are required to help in classifying the target
and the clutter.
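How background samples set a threshold can be sketched with the conventional cell-averaging scheme (CA-CFAR) named in Chapter 1. The sketch assumes square-law detected samples whose noise power is exponentially distributed in a homogeneous background, for which the scale factor N(P_fa^(-1/N) - 1) is the standard result; the function names and sample values are hypothetical and this is not the NN-CFAR processor of Chapter 3.

```python
def ca_cfar_threshold(reference_cells, p_fa):
    """Adaptive threshold from N reference cells for a desired false alarm probability.

    Assumes exponentially distributed (square-law detected) noise samples.
    """
    n = len(reference_cells)
    scale = n * (p_fa ** (-1.0 / n) - 1.0)      # threshold multiplier
    noise_estimate = sum(reference_cells) / n   # cell-averaged background level
    return scale * noise_estimate

def detect(cell_under_test, reference_cells, p_fa=1e-3):
    """Declare a target when the test cell exceeds the adaptive threshold."""
    return cell_under_test > ca_cfar_threshold(reference_cells, p_fa)

background = [1.0] * 16                      # hypothetical reference cell powers
print(ca_cfar_threshold(background, 1e-3))   # threshold with 16 reference cells
print(ca_cfar_threshold([1.0] * 4, 1e-3))    # larger threshold with only 4 cells
print(detect(10.0, background), detect(5.0, background))
```

The multiplier grows as the number of reference cells shrinks, which is one way to see why detection performance suffers when few background samples are available.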
2.1.1. Radar Signal Representation
Radar performance measures are generally expressed in probabilistic terms.
No precise definition of signals is available. This has led to the extensive use of
statistical detection theory [9] as the primary procedure for processor design in
radar systems. In general, statistical detection theory is an abstract process based
on hypotheses testing. Processing is optimized from the detection point of view.
There are generally two hypotheses. One is that the target is present ($H_1$) and
the other is that the target is absent ($H_0$). This can be carried out for every individual
resolution cell. A sequence of observations gives rise to a set of conditional
probability density functions for every hypothesis. A likelihood ratio test is then
formed and a decision threshold is chosen based on some optimality criterion. The
optimality criterion for a Constant False Alarm Rate (CFAR) processor is to maximize
the detection probability under a constant false alarm probability. To further
analyze the processing performed on the signal, we must first represent the various
forms of signals, clutter, and noise as they propagate through the processor.
2.1.2. Statistical Description of Signals
Consider an ensemble of records of the same random process. If the records
are sampled at the same instant of time, then each sample will represent a different
value. This range of values satisfies the definition of a random variable and can
be described by a probability density function. For $n$ samples, at time instants
$t_1, t_2, \ldots, t_n$, we have the $n$-dimensional density function
which statistically defines the random process. For a stationary random process,
the statistical moments (e.g., mean, variance) are constant with respect to time,
and P(x,t) reduces to
Therefore, we can determine the statistics of the process from a single time record
because the ensemble statistics and time statistics are identical. A purely random
process is one for which the probability density function is given by
$$ P(x) = P(x_1) P(x_2) \cdots P(x_n). $$
This model is used extensively in the Marcum-Swerling [8] approach to radar detection
problems.
2.2. Random Processes
Random processes are stochastic models of time variations of signals. In
simple words, a random process is composed of a sequence of time-varying random
variables. As an example, the measured voltage across a noisy resistor at different
instants of time corresponds to a random process. As another example, the sequence
of radar pulse amplitude returns from a target which is fluctuating is also
a random process. One of the most important random process models which has
been used extensively in practice is the Gaussian process.
A random process $\{x_i\}$ is Gaussian if the variables $x_1, x_2, \ldots, x_n$ have a
joint ($n$-dimensional) Gaussian probability density function for any set of values
$i_1, i_2, \ldots, i_n$ and any value of $n$. The Gaussian density function of a joint process
is given by
$$ P(x_1, \ldots, x_n) = \frac{1}{(2\pi)^{n/2} \sqrt{|C|}} \exp\left[ -\frac{1}{2} (x - m)^T C^{-1} (x - m) \right] \tag{2.1} $$
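where $x$ and $m$ are the $n$-dimensional vectors
$$ x = (x_1, x_2, \ldots, x_n)^T, \qquad m = (m_1, m_2, \ldots, m_n)^T $$
where $m_i$ corresponds to the mean of each random variable $x_i$. The covariance
matrix $C$ is defined by
$$ C = [C_{ij}], \qquad i, j = 1, \ldots, n $$
where
$$ C_{ij} = E[(x_i - m_i)(x_j - m_j)] $$
and the correlation coefficient $\rho_{ij}$ is given by
$$ \rho_{ij} = \frac{C_{ij}}{\sigma_i \sigma_j}. $$
Assuming a stationary Gaussian process, we have
$$ \sigma_i = \sigma_j = \sigma $$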
and
If the sampling times $t_1, \ldots, t_n$ have been chosen such that $x_1, \ldots, x_n$ are uncorrelated,
then
$$ C = \begin{bmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_n^2 \end{bmatrix} $$
and hence
(2.2)
which is the product of the $n$ first-order density functions, since $x_1, \ldots, x_n$ are
independent.
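The factorization into first-order densities can be sketched from (2.1) as a worked step. The sketch assumes only that the covariance matrix is diagonal with entries $\sigma_i^2$; the exact form printed as (2.2) in the original may differ in notation.

```latex
% Assumption: C = diag(\sigma_1^2, ..., \sigma_n^2), i.e. uncorrelated samples.
\begin{align*}
|C| &= \prod_{i=1}^{n} \sigma_i^2 , \qquad
(x - m)^T C^{-1} (x - m) = \sum_{i=1}^{n} \frac{(x_i - m_i)^2}{\sigma_i^2} , \\
P(x_1, \ldots, x_n)
  &= \frac{1}{(2\pi)^{n/2} \prod_{i=1}^{n} \sigma_i}
     \exp\left[ -\frac{1}{2} \sum_{i=1}^{n} \frac{(x_i - m_i)^2}{\sigma_i^2} \right] \\
  &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\, \sigma_i}
     \exp\left[ -\frac{(x_i - m_i)^2}{2 \sigma_i^2} \right]
   = \prod_{i=1}^{n} P(x_i) .
\end{align*}
```

For Gaussian variables, being uncorrelated is therefore enough to make the joint density separate, which is why the samples are also independent.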
There are two other random process models that are of prime importance
in target detection and tracking, namely the Markov and the Wiener processes.
A Markov random process is a process in which the present value is dependent
only on the value of the process at the previous sample time and on a transition
probability. A Markov process is expressed by
$$ P(x) = P(x_1, x_2, \ldots, x_n) = P(x_n | x_{n-1}) \cdots P(x_2 | x_1) P(x_1) = P(x_1) \prod_{i=2}^{n} P(x_i | x_{i-1}). $$
The transition probabilities are called conditional probabilities and reflect the basic
mechanism of a Markov process. The order of the Markov process depends on the
conditional density functions and the number of previous samples that relate to the
present sample. These processes are generally used in the target tracking literature
and are basically for modeling the target maneuvers. One example is Singer's model
of target acceleration which makes essential use of the properties of the Markov
process.
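The first-order Markov factorization can be made concrete with a small discrete example. The two states, their names, and all probability values below are hypothetical; the text deals with density functions, and a discrete chain is used here only because it keeps the arithmetic visible.

```python
def sequence_probability(states, initial, transition):
    """P(x_1, ..., x_n) = P(x_1) * product over i >= 2 of P(x_i | x_{i-1})."""
    probability = initial[states[0]]
    for previous, current in zip(states, states[1:]):
        probability *= transition[previous][current]
    return probability

# Hypothetical two-state model of target behavior.
initial = {"cruise": 0.9, "maneuver": 0.1}
transition = {
    "cruise": {"cruise": 0.95, "maneuver": 0.05},
    "maneuver": {"cruise": 0.30, "maneuver": 0.70},
}
p = sequence_probability(
    ["cruise", "cruise", "maneuver", "maneuver"], initial, transition
)
print(p)  # 0.9 * 0.95 * 0.05 * 0.7
```

Only the initial probability and the one-step transition probabilities are needed, which is the property that makes such models convenient for describing maneuvers.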
The Wiener random process (or Brownian Motion) is a limiting form of the
random walk which is a sum of independent steps of size $s$, equiprobable in every
direction, taken at intervals $\delta$. The limiting process, when $s \to 0$ and $\delta \to 0$ such
that $s^2/\delta = \alpha$ is constant, yields a random process $w(t)$ with the following Probability
Density Function (PDF)
$$ P[w(t)] = \mathcal{N}[w(t); 0, \alpha t] $$
i.e., it is normal with zero mean and variance $\alpha t$. Note that the Wiener process is
nonstationary. It relates to white noise, denoted here as $n(t)$, by $\dot{w}(t) = n(t)$.
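The random-walk limit can be checked with a short calculation. The sketch below computes the exact variance of a symmetric random walk from the binomial distribution of its steps, holding $s^2/\delta$ fixed at a hypothetical value $\alpha = 2$, which is the combination that keeps the variance equal to $\alpha t$ as the step shrinks. The function name and parameter values are illustrative assumptions.

```python
from math import comb, sqrt

def random_walk_variance(t, s, delta):
    """Exact variance at time t of a walk with steps of +/- s every delta seconds."""
    n = round(t / delta)                  # number of steps taken by time t
    second_moment = 0.0
    for k in range(n + 1):                # k = number of steps in the positive direction
        probability = comb(n, k) * 0.5 ** n
        position = (2 * k - n) * s
        second_moment += probability * position ** 2
    return second_moment                  # the mean is zero by symmetry

alpha = 2.0                               # hypothetical constant s**2 / delta
for delta in (0.1, 0.01, 0.0025):
    s = sqrt(alpha * delta)
    print(delta, random_walk_variance(1.0, s, delta))
```

The variance stays at $\alpha t$ for every step size and grows linearly with $t$, which is the nonstationary behavior noted above.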
2.3. Matched Filtering
A matched filter is a network whose frequency response function maximizes
the ratio of output peak signal power to mean noise power. The frequency response of this filter
is $H(f) = S^*(f)$ where $H(f)$ is the spectrum of the corresponding matched filter
and $S^*(f)$ is the conjugate of the input signal spectrum. The output of a matched
filter is not a replica of the input signal, and therefore the shape of the signal is
not preserved. The output of the matched filter is proportional to the input signal
cross correlated with a replica of the transmitted signal with a time delay. The
time delay is required for the filter to observe all of the signal before matching is
performed.
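The cross-correlation view of the matched filter can be sketched in discrete time. The four-sample pulse code, the echo amplitude, and the delay below are hypothetical, the signals are real-valued (so conjugation has no effect), and noise is omitted to keep the result easy to follow.

```python
def cross_correlate(received, replica):
    """Slide the replica along the received samples and sum the products at each lag."""
    n, m = len(received), len(replica)
    return [
        sum(received[lag + i] * replica[i] for i in range(m))
        for lag in range(n - m + 1)
    ]

replica = [1.0, 1.0, -1.0, 1.0]            # hypothetical transmitted pulse code
delay = 5                                  # hypothetical echo delay in samples
received = [0.0] * 16
for i, value in enumerate(replica):
    received[delay + i] += 0.8 * value     # attenuated, delayed echo

output = cross_correlate(received, replica)
estimated_delay = max(range(len(output)), key=output.__getitem__)
print(estimated_delay, output[estimated_delay])
```

The output peaks at the lag where the replica lines up with the echo, and its shape is the autocorrelation of the pulse rather than a copy of the pulse itself.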
2.4. Neyman-Pearson Criterion
While matched filtering is used for the detection of a known signal in the
background of known statistics (i.e., white Gaussian noise), we need a more general
method for the detection of random signals. There are two major schemes for
signal detection based on hypothesis testing and these are the Bayes criterion and
the Neyman-Pearson criterion. The Bayes criterion is based on the probabilities
about the source of information (i.e., the signal) and some cost functions. The
Neyman-Pearson test, on the other hand, maximizes the probability of detection
subject to the constraint that the false alarm probability does not exceed some
preassigned value. Therefore, the Neyman-Pearson criterion is directly applicable
to radar and sonar, while the Bayes criterion is mainly used in communication systems.
Both criteria consist of two steps. First, the ratio of the probability densities under
the two hypotheses forms the likelihood ratio. Second, this ratio is compared against a
threshold to make the decision about the presence of a target signal.
The Neyman-Pearson test reduces to matched filtering if the signal is known
and the noise is white Gaussian. In radar and sonar systems however, the
correlation matrix of noise in the observed data is unknown and adaptive techniques
are required to set the appropriate threshold in each reference cell (i.e., samples
from the background). In a typical radar detection system, between 10 and 20 pulses
may be passed through each resolution cell and the returned echoes from each cell
form a time series of samples which represent the object in that particular cell.
The target detection problem is further complicated due to the lack of information
about the statistical description of clutter. The likelihood ratio is of the form
where $P_1$ and $P_0$, respectively, correspond to the hypotheses $H_1$ and $H_0$, and $\gamma$ is a
predefined threshold. The Neyman-Pearson criterion is a special case of the Bayes criterion.
Example
Let $S$ be the target signal and $n$ the background noise. Assume that the
Gaussian random variable $n$ has zero mean and variance $\sigma^2 = 2$, and $S$ is a constant
equal to either 0 or 1. That is, if the target is present then $S = 1$ and in the absence
of the target $S = 0$. We form the two hypotheses
$$ H_0 : S = 0 $$
$$ H_1 : S = 1. $$
Then using a Neyman-Pearson test with $P(D_1|H_0) = 0.1$ we want to form an
optimum decision rule. Assuming that the probability density function of the noise
is Gaussian, the probability density functions for the two hypotheses are given by
$$ P_0(y) = \frac{1}{2\sqrt{\pi}} e^{-y^2/4} \quad \text{and} \quad P_1(y) = \frac{1}{2\sqrt{\pi}} e^{-(y-1)^2/4} $$
where $P_0$ is the Gaussian density function with zero mean and $P_1$ is the Gaussian
density function with the mean equal to 1 (i.e., the density function for the signal
plus noise).
The likelihood ratio is then given by
$$ \lambda(y) = \frac{P_1(y)}{P_0(y)} = e^{(2y-1)/4}. $$
The Neyman-Pearson test is to choose $H_1$ if
$$ e^{(2y-1)/4} \geq \lambda_0 $$
or equivalently choose $H_1$ if $y \geq \gamma$. The new threshold $\gamma$ is obtained by taking the
natural logarithm of both sides of the above inequality. To determine the threshold,
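the false alarm probability is
$$ P(D_1|H_0) = \int_{\gamma}^{\infty} P_0(y)\, dy = 0.1. $$
By referring to the statistical tables, $\gamma$ can be determined to satisfy the above as
$\gamma = 1.8$. Then,
$$ P(D_1|H_1) = \int_{1.8}^{\infty} P_1(y)\, dy. $$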
This is the probability of detection based on the single observation $y$. With $\gamma = 1.8$,
the original threshold $\lambda_0 \approx 1.9$ and the decision rule is then: choose $H_1$ if
$\lambda(y) \geq 1.9$, otherwise choose $H_0$.
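The numbers in this example can be reproduced without statistical tables. The sketch below uses the standard-library normal distribution with the example's noise variance of 2 and false alarm probability of 0.1; the variable names are illustrative, and the detection probability it prints is computed here rather than quoted from the original text.

```python
from math import exp, sqrt
from statistics import NormalDist

sigma = sqrt(2.0)                       # noise standard deviation (variance = 2)
h0 = NormalDist(0.0, sigma)             # density of y under H0 (noise only)
h1 = NormalDist(1.0, sigma)             # density of y under H1 (signal plus noise)

p_fa = 0.1                              # allowed false alarm probability
gamma = h0.inv_cdf(1.0 - p_fa)          # threshold on y with P(y >= gamma | H0) = p_fa
lambda_0 = exp((2.0 * gamma - 1.0) / 4.0)   # equivalent threshold on the likelihood ratio
p_d = 1.0 - h1.cdf(gamma)               # probability of detection for a single observation

print(f"gamma = {gamma:.3f}, lambda_0 = {lambda_0:.3f}, P_d = {p_d:.3f}")
```

The computed thresholds agree with the rounded values 1.8 and 1.9 used above, and the resulting single-observation detection probability is a little under 0.3.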
2.5. Noise in the Radar Receiver
A good model for narrowband receiver noise is
$$ n(t) = a(t) \cos[\omega_0 t + \theta(t)] $$
where $\omega_0$ is the carrier frequency, $a(t)$ is the amplitude of the envelope modulation,
and $\theta(t)$ is the phase modulation. The basic assumption in a narrowband noise is
that the bandwidth of the noise process $B_n \ll \omega_0/2\pi$, which is always satisfied in
radar receivers. Most of the significant noise in a radar receiver is at the front end
of the receiver near the antenna. The narrowest bandwidth of amplifiers determines
the bandwidth of the noise. Band-limited noise is, of course, correlated and this
further complicates the detection process.
The envelope of the receiver thermal noise is almost always assumed to have a Rayleigh
distribution
$$ P(a(t)) = \frac{a(t)}{\sigma_n^2} \exp\left[ -\frac{a^2(t)}{2\sigma_n^2} \right] \tag{2.3} $$
where $\sigma_n^2$ is the noise variance and $P(\theta(t)) = 1/2\pi$. The phase distribution
is almost always assumed uniform. Yet another convenient mathematical
representation of the noise process is
$$ n(t) = x_n(t) \cos\omega_0 t - y_n(t) \sin\omega_0 t $$
where $x_n = a \cos\theta$ and $y_n = a \sin\theta$. In this case, both $x_n$ and $y_n$ are Gaussian-distributed
with zero mean and are independent. Their probability density functions
are given by
$$ P(x_n) = \frac{1}{(2\pi)^{1/2} \sigma_n} \exp\left[ -\frac{x_n^2}{2\sigma_n^2} \right] $$
$$ P(y_n) = \frac{1}{(2\pi)^{1/2} \sigma_n} \exp\left[ -\frac{y_n^2}{2\sigma_n^2} \right]. \tag{2.4} $$
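The link between the Gaussian quadrature components and the Rayleigh envelope can be checked by simulation. The sketch below draws independent zero-mean Gaussian pairs, forms the envelope as the square root of the sum of their squares, and compares the sample mean with the Rayleigh mean $\sigma_n \sqrt{\pi/2}$. The sample count, seed, and unit variance are arbitrary assumptions.

```python
import math
import random

def envelope_samples(sigma_n, count, seed=0):
    """Envelope a = sqrt(x_n^2 + y_n^2) for independent N(0, sigma_n^2) components."""
    rng = random.Random(seed)
    return [
        math.hypot(rng.gauss(0.0, sigma_n), rng.gauss(0.0, sigma_n))
        for _ in range(count)
    ]

sigma_n = 1.0
samples = envelope_samples(sigma_n, count=20000)
sample_mean = sum(samples) / len(samples)
rayleigh_mean = sigma_n * math.sqrt(math.pi / 2.0)
print(sample_mean, rayleigh_mean)
```

The two values agree closely, consistent with the envelope of narrowband Gaussian noise following the Rayleigh density of (2.3).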
The signal seen at the receiver is the combination of signals received from the
target, noise, and clutter. For a non-fluctuating target, the returned radar pulse
$r(t)$ with a maximum amplitude of $C$ will be
$$ r(t) = C \cos(\omega_0 t + \phi). $$
Therefore, the received signal $S(t)$ will be of the form
$$ S(t) = C \cos(\omega_0 t + \phi) + a(t) \cos[\omega_0 t + \theta(t)] $$
$$ S(t) = [C \cos\phi + x_n(t)] \cos\omega_0 t - [C \sin\phi + y_n(t)] \sin\omega_0 t $$
and the resulting density function of $S(t)$ is given by [9]
(2.5)
where $C^2/2\sigma_n^2$ represents the signal-to-noise ratio (SNR).
2.6. Radar Clutter
The radar clutter echoes consist of radar returns from unwanted reflectors
and they often obscure the signal from targets of interest. Clutter is generally
caused by such things as rain, sea, clouds, chaff, or mountains. There are occasionally
other interferences due to jamming which may have some intelligence, or
may be totally random in nature. Clutter, however, is a term used for the natural
kind of interference in the target background. Examples of radar targets are ships,
aircraft, satellites, and missiles. Clutter returns are often much stronger than
target returns and the processing requirement is to increase the signal-to-clutter
ratio (S/C). Depending on the type of target and its dynamics, several different
techniques are available. High resolution radars, for example, have the advantage
of small resolution cells and therefore see less clutter. Clutter is usually distributed
over a larger area than the target and is generally correlated both temporally and
spatially from one resolution cell to another.
The processors that reject clutter are divided into two major categories,
namely the CFAR processors and the MTI (i.e., Moving Target Indicators). While
CFAR processors are generally used for stationary targets, the MTI processors are
used for moving targets. MTI is a generalization of CFAR and includes Doppler
resolution cells in addition to range and azimuth cells. In this dissertation, the
implementation of both detectors using neural networks will be investigated.
2.6.1 The Clutter Statistics
The main objective of the CFAR and MTI designs is to suppress the background
clutter. Clutter distributions are generally unknown prior to the receiver
design but certain distribution families are used for quite a number of practical
situations. Clutter fluctuations are represented by amplitude statistics as well as
frequency spectra. The amplitude statistics give information about the percentage
of time during which the returns have a given range of values. The frequency
spectra, on the other hand, represent how rapidly the amplitude values change.
A classical radar clutter distribution is the Rayleigh distribution, which represents
a background with a large number of equal-size scatterers with uniformly distributed
phases. If one of the scatterers is dominant and is much larger than the other
scatterers, then the distribution would change to a Rician distribution [9]. In the
higher frequency radar systems, the resolution cells (i.e., the minimum dimensions
that can be resolved by the radar) and therefore the individual scatterers are of
smaller size and a larger deviation from the mean scattering value may occur which
results in longer tails in the probability density functions such as Log-normal and
Weibull density functions [9]. These kinds of densities occur in the low depression
angles and result from large scatterers that are shadowed most of the time but are
observed occasionally. The Log-normal clutter distribution of Vi is given by
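$$P(V_i) = \frac{1}{\sqrt{2\pi}\,\sigma_c V_i}\exp\left\{-\frac{1}{2\sigma_c^2}\left[\ln\left(\frac{V_i}{\mu_c}\right)\right]^2\right\} \qquad (2.6)$$
where $V_i \ge 0$, $\mu_c$ and $\sigma_c$ are the two parameters of the distribution, and $V_i$ is
the target echo in each resolution cell. If we let $\xi_i = \ln V_i$, the resulting distribution
would be
$$P(\xi_i) = \frac{1}{\sqrt{2\pi}\,\sigma_c}\exp\left\{-\frac{(\xi_i - \ln\mu_c)^2}{2\sigma_c^2}\right\} \qquad (2.7)$$
which is a Gaussian distribution. This means that an ideal logarithmic amplifier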
in front of the CFAR detector would make the CFAR processing easier and this
scheme has been used in practice. The Weibull clutter density is given by
It should be noted that the Rayleigh density is a special case of the Weibull density.
Note also that Log-normal and Weibull represent families of distributions. The
target and the clutter can be classified efficiently provided that the test statistics
can be extracted such that they represent the correct member of the family based
on the two parameters estimated by CFAR.
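The remark above about a logarithmic amplifier can be checked numerically. The following is a minimal sketch, assuming that $\mu_c$ in Eq. (2.6) is the median of the clutter amplitude and $\sigma_c$ is the standard deviation of its logarithm; the function names and sample sizes are illustrative only and are not part of the original work.

```python
import math
import random

def lognormal_pdf(v, mu_c, sigma_c):
    """Log-normal density, assuming mu_c is the median and sigma_c the
    standard deviation of ln(v) (an assumed reading of Eq. (2.6))."""
    if v <= 0.0:
        return 0.0
    z = math.log(v / mu_c)
    return math.exp(-z * z / (2.0 * sigma_c ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma_c * v)

def log_amplifier_stats(mu_c, sigma_c, n=50000, seed=1):
    """Draw log-normal clutter samples, pass them through an ideal
    logarithmic amplifier, and return the sample mean and std. dev."""
    rng = random.Random(seed)
    logs = [math.log(rng.lognormvariate(math.log(mu_c), sigma_c))
            for _ in range(n)]
    mean = sum(logs) / n
    var = sum((x - mean) ** 2 for x in logs) / (n - 1)
    return mean, math.sqrt(var)
```

Under these assumptions the amplifier output should have a mean close to $\ln\mu_c$ and a standard deviation close to $\sigma_c$, which is the Gaussian behavior that makes the subsequent CFAR processing easier.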
Both frequency and autocorrelation functions of clutter are generally used to
describe it, but frequency spectra are more often used. The clutter autocorrelation
function is usually assumed exponential and its spectrum is given by
$$P(f) = \frac{A}{1 + (f/f_c)^n}$$
where A is the mean value of the power density.
2.7. Target Modeling
Target amplitude fluctuation can greatly modify the signal-to-noise ratio
required to achieve a high probability of detection [90,92]. The more the target
characteristics are known, the better separability from clutter will occur. Since
clutter from natural objects is difficult to model analytically due to their large
varieties, a priori knowledge of the clutter model is not normally available. Despite
a lot of research in this area, the statistical models for radar environment and
clutter information still lack the desired performance. The target modeling, on the
other hand, has been easier because more is known about targets of interest and
there are only a few classes of targets of interest in each radar detection scenario.
Furthermore, one has to be concerned about the mathematical complexities of
these models. The analysis of signals gets more intricate as they propagate through
different units of the radar processor. That is why most of the target and/or clutter
models in the literature are from the same family of distributions to further simplify
the analysis and the design of the radar processors (e.g., CFAR).
Most targets of interest in radar detection scenarios are mainly man-made
objects which are more structured and result in some specific features in the reflected
signal. The challenge is then to design receivers that can extract these structured
signals buried in background noise and clutter. This is exactly what inspired this
research for neural network-based implementation of conventional CFAR and MTI
detectors. Neural networks are able to extract features in the structured target
echoes with some details that are hidden to conventional statistical receivers. As
an example, much research has been done to detect maneuvers when they are just
about to happen. Each target begins some preliminary actions to prepare for a maneuver,
like banking its wings, or lifting the aircraft's nose. These features are very
difficult to extract from a moving target with conventional detectors. Furthermore,
these signals are generally distorted by eclipsing of pulses and other scintillation
parameters which are due to the relative motion of the target with respect to the
radar.
Targets are either fluctuating or non-fluctuating. The model [9] used for
non-fluctuating targets is called the Marcum model and is mainly used for distant
targets where fluctuation does not affect the SNR significantly. For higher resolution
radars, however, the target is always fluctuating due to rapid changes in
the aspect angle. The well-known four Swerling models [8,9] have long been used
for statistical target modeling [90,92]. The Swerling models are of two classes, the
Rayleigh and Chi-square models [9]. Each model then includes two other models
which correspond to slow and fast fluctuations. The Rayleigh distribution is a
special case of the Chi-square distribution with one degree of freedom. It is given
by
(2.8)
where $\bar{\gamma}$ is the average radar cross section (RCS) of the target and $\gamma$ is the instantaneous
RCS. The case of slow fluctuation is defined as a situation where pulses are
correlated in every individual scan, but are independent from one scan to another,
whereas rapid fluctuation is the case where the pulses are independent even within
a single scan. The Chi-square distribution with two degrees of freedom represents
a target with a single dominant scatterer which is non-fluctuating and is surrounded
by smaller individual scatterers. The Chi-square distribution has two cases as
well: the slow fluctuation, in which pulses are correlated in individual scans, and
the fast fluctuation, with independent pulses in every scan. The probability density
function for the Chi-square distribution is given by [9]
Assuming square law detection with N independent pulses and Rayleigh
density (non-coherent integration), we get a Chi-square density function with 2N
degrees of freedom. The output of the square law detector would be
$$z = \sum_{i=0}^{N-1} x_i^2 \qquad (2.9)$$
where $x_i$ has Gaussian distribution with zero mean and variance of $\sigma^2$. The threshold
for detection is derived in [9] and is given by
(2.10)
where $T_m$ is called the threshold multiplier which is calculated in a different way
for each particular CFAR detection algorithm [9] and $\mu$ is the statistical mean of
the signal amplitudes in the reference cells. This threshold description is applied
in a system with Rayleigh pulse amplitude distribution and non-coherent pulse
integration. When a theoretical threshold is not known exactly, a greater SNR is
required to produce the desired $P_d$ (probability of detection) for an allowed $P_{fa}$
(probability of false alarm). This increase in SNR is called the CFAR loss.
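As an illustration of how a threshold multiplier and a reference-cell mean combine, the sketch below implements a cell-averaging CFAR test. It assumes that the threshold has the product form $T_m\mu$ (the printed form of Eq. (2.10) is not legible here) and that the reference cells contain square-law detected samples with an exponential distribution, for which the multiplier $N(P_{fa}^{-1/N} - 1)$ is the standard cell-averaging result. The function names are hypothetical.

```python
def ca_cfar_multiplier(n_ref, pfa):
    """Threshold multiplier for cell-averaging CFAR, assuming exponentially
    distributed (square-law) reference samples: Pfa = (1 + Tm/N)^(-N)."""
    return n_ref * (pfa ** (-1.0 / n_ref) - 1.0)

def ca_cfar_detect(test_cell, reference_cells, pfa):
    """Compare the test cell against Tm times the reference-cell mean.
    Returns (detection flag, threshold)."""
    n_ref = len(reference_cells)
    mu = sum(reference_cells) / n_ref          # statistical mean of the reference cells
    threshold = ca_cfar_multiplier(n_ref, pfa) * mu
    return test_cell > threshold, threshold
```

With sixteen reference cells and $P_{fa} = 10^{-4}$, for example, the multiplier is roughly 12.5, so a test cell must exceed the local clutter mean by that factor before a detection is declared.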
The Swerling I, II models are generally used for a closed form analysis of
CFAR detectors due to the fact that these are the more standard mathematical
models. Furthermore, these models are often used for comparison of different CFAR
schemes. In the Swerling I model with pulse-to-pulse correlation and independent
scan-to-scan Rayleigh reflection, the probability density is $P(x) = \frac{1}{\bar{x}}e^{-x/\bar{x}}U(x)$,
where $x$ is the signal-to-noise ratio and $\bar{x}$ is the average signal-to-noise ratio. The
probability of detection with a single pulse, i.e., $N = 1$, is given by [7]
where $Y$ is the detection threshold. For $N$ pulses, $P_d$ is given by
$$P_d = P(N-1, Y) + \left(\frac{z+1}{z}\right)^{N-1} e^{-Y/(z+1)}\left[1 - P\left(N-1, \frac{Yz}{z+1}\right)\right], \quad N \ge 2$$
where
$$P(N, Y) = \sum_{m=0}^{N-1}\frac{Y^m}{m!}e^{-Y}$$
and
$z = N\bar{x}$; the statistic,
$\bar{x}$ = average signal-to-noise ratio on each sample.
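The detection curves mentioned next can be generated directly from these expressions. The sketch below is one possible evaluation, assuming $P(N, Y) = \sum_{m=0}^{N-1} Y^m e^{-Y}/m!$ and $z = N\bar{x}$; for the single-pulse case it assumes the standard Swerling I result $P_d = \exp[-Y/(1+\bar{x})]$, since the printed single-pulse equation is not legible here. Function and parameter names are illustrative.

```python
import math

def poisson_sum(n, y):
    """P(n, y) = sum_{m=0}^{n-1} y^m / m! * exp(-y)."""
    term, total = 1.0, 0.0
    for m in range(n):
        total += term
        term *= y / (m + 1)
    return total * math.exp(-y)

def swerling1_pd(n_pulses, y, snr_avg):
    """Probability of detection for a Swerling I target.
    y is the detection threshold, snr_avg the average per-sample SNR."""
    if n_pulses == 1:
        # Assumed standard single-pulse form.
        return math.exp(-y / (1.0 + snr_avg))
    z = n_pulses * snr_avg
    return (poisson_sum(n_pulses - 1, y)
            + ((z + 1.0) / z) ** (n_pulses - 1)
            * math.exp(-y / (z + 1.0))
            * (1.0 - poisson_sum(n_pulses - 1, y * z / (z + 1.0))))
```

Sweeping `snr_avg` for a fixed threshold `y` (itself chosen from the allowed false alarm probability) traces out one detection curve.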
Based on these equations, the detection curves are generated for different levels
of SNR for each specified level of $P_{fa}$. It should be mentioned that most of these
curves are generated for Swerling models only. This is merely due to the availability
of closed form expressions for the probability of detection. Swerling case one and
two are more suitable for aircraft, while case three and four are suitable for rockets,
missiles, and satellites [8].
2.8. Overview of MTI & CFAR Processors
There are three major resolution cells that are involved in radar processing of
received pulses. These cells are referred to as range cells, angle cells, and doppler
cells. The objective of a CFAR processor is to preserve a constant rate of false
alarm. As mentioned earlier, constant false alarm rate means that the detection
threshold should be adjusted such that the probability of detecting a false target
is kept constant. To do this, a complete CFAR scheme is required which includes
doppler resolution cells as well as position cells (i.e., range and angle).
Due to the inherent design complexities present, the detection schemes for
stationary targets and for moving targets will be discussed separately in more detail
in Chapters 3 and 4 respectively. The term CFAR ordinarily refers to the concept
of extending the detection process to the neighborhood of the target (i.e., use of
auxiliary cells) for a better representation of the background clutter. On the other
hand, MTI refers to the processing of a sequence of pulses such that the moving
targets are separated from the background clutter. Once an MTI filter is designed,
there can be one MTI unit used for each position cell in the neighborhood of the
target cell (i.e., test cell). That is, for each range cell, there are a number of pulses
that have to be processed in order to decide whether the radar return is due to a
moving target or clutter. The moving target is further classified as slow, fast, etc.
In Chapter 3 we focus on the amplitude fluctuations of the returned pulses
without regard to the frequency information carried by the pulses. Note that radar
pulse amplitudes are modulated by two major factors. One is due to the target
scintillation and eclipsing which is caused by target fluctuations (e.g., change in
aspect angles) as well as range gating of the pulses and timing jitters. We have
already discussed the available models for the target fluctuations. The other factor
is modulation of the pulse amplitudes due to the doppler effect. These two effects
are separately accounted for in a CFAR design and are the subjects of discussion
in the next two chapters.
Targets of interest in MTI systems are moving targets with a radial velocity
which is usually higher than the velocity of the background clutter. The radial
velocity has direct relation with the phase change of the radar returns. This phase
rate of change will cause a shift in frequency which is called the doppler shift and
can be used by the Moving Target Indicator (MTI) to remove the clutter.
MTI can be operated from a fixed or moving platform such as a ship, an
aircraft, or a satellite. There are a wide range of applications for MTI in surveillance
and detection of low altitude targets. MTI also has a number of applications in Air
Traffic Control (ATC) due to the presence of birds and other slow moving objects.
The slow moving objects add to the complexity of MTI simply because clutter is
also moving slowly and a lot of overlap occurs between the unwanted spectrum and
the spectrum of the desired targets.
MTI and CFAR processors generally require many compromises in the radar
processor design and they add to the cost and complexity of the design of detection
and tracking algorithms. According to Skolnik [8], the basic concepts of MTI were
introduced during World War II. Since then, limitations of the processing techniques
have always been a major factor in the complexity of MTI processors. The
advent of digital processing technology resolved many of the design constraints in
MTI. However, since the reliability and speed of processors are of prime importance
in radar design principles, new frontiers have yet to be discovered in the
neural network applications to MTI as a nonlinear fast processor with capabilities
far beyond the linear processing techniques that have been used in MTI since 1970.
2.8.1. More on Doppler Effect
The electromagnetic pulse of the radar incident wave undergoes a physical
phenomenon that causes a compression in, or extension of, the radar wave with
change in the radial velocity of the target or clutter with respect to the radar.
Doppler magnitude is computed by the derivative of the phase of the returned
pulse. The phase is given by $\phi = \frac{4\pi R}{\lambda}$, where $R$ is the range and $\lambda$ is the apparent
wavelength. It follows that the doppler shift is given by $f_d = \frac{1}{2\pi}\cdot\frac{d\phi}{dt} = \frac{2v_r}{\lambda}$, where
$v_r$ is the radial velocity of the target with respect to the radar.
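A short numerical check of this relation is sketched below, assuming a two-way phase of $4\pi R/\lambda$ and a range that changes linearly at the radial velocity. The function names and the finite-difference step are illustrative choices, not part of the original text.

```python
import math

def doppler_shift_hz(radial_velocity_mps, wavelength_m):
    """Doppler shift f_d = 2 v_r / lambda."""
    return 2.0 * radial_velocity_mps / wavelength_m

def doppler_from_phase(radial_velocity_mps, wavelength_m, dt=1e-6):
    """Same quantity from the phase derivative, f_d = (1/2pi) dphi/dt,
    using a finite difference on phi = 4 pi R / lambda with R(t) = v_r t."""
    def phase(t):
        return 4.0 * math.pi * (radial_velocity_mps * t) / wavelength_m
    return (phase(dt) - phase(0.0)) / dt / (2.0 * math.pi)
```

For a radial velocity of 300 m/s and a 3 cm wavelength, both routes give a shift of about 20 kHz.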
Since there are different scatterers on the target, the doppler shift is usually
spread over a range of values rather than at a single value. For example, the rotation
of the propeller blades in an aircraft engine introduces additional doppler shift
that may obscure the exact doppler shift of the aircraft body. Typical examples of
clutter include weather clutter which is composed of extended different parts (each
moving with some speed) or a group of birds where each bird has a slightly different
velocity. The change in the aspect angle of the target also causes additional spread
in the doppler shift. It is this spread and its a priori unknown characteristics that
make the MTI design a challenging problem.
The MTI function is composed of two major steps. Doppler filtering is first
performed to separate the target signal from that of the target's surroundings.
Secondly, position measurements of the target are extracted. The main function of
the MTI is detection of moving targets, which hence requires additional processing
to obtain the accurate position and velocity of each target (e.g., Kalman filtering).
2.8.2. Radar Pulses With Doppler Shifts
For a typical detection process, multiple pulses are required in order to
achieve a high signal-to-noise ratio. Doppler resolution is also inversely related to
the pulse width ($\tau$), which is the primary factor for range resolution. If $f_d > 1/\tau$, the
doppler signal may easily be distinguished from a single pulse. However, if $f_d < 1/\tau$,
pulses will be modulated in amplitude and many pulses are needed to extract the
doppler shift. This is shown in the following signal modeling. Let the input pulse
to the MTI be represented by $a(t)e^{j\omega_d t}$, where $\omega_d$ represents the doppler shift in
terms of radian frequency and $a(t)$ is the pulse amplitude. Assuming that the next
pulse is received $\tau$ seconds later with equal characteristics, the MTI output of a
single pulse canceller will be
$$v_o = a(t)e^{j\omega_d t} - a(t-\tau)e^{j\omega_d(t-\tau)}$$
where $a(t - \tau) = a(t)$. Hence,
$$|v_o|^2 = 2|a(t)|^2(1 - \cos\omega_d\tau)$$
Note that the MTI output is zero for $\omega_d\tau = 2n\pi$, which is what causes the
blind speed zones in the MTI output signal. When the mean doppler frequency
associated with clutter exceeds the radar's pulse repetition frequency, target data
will be mixed up with the clutter data due to this overlap in their frequency spectra.
Targets may also be aliased due to the same effect and therefore multiple targets
may be declared for each individual target. Aliasing is a side effect of uniform
sampling in linear processing. Multiple PRFs (Pulse Repetition Frequencies) are
used for reducing the aliasing effect at the cost of some loss in MTI performance.
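The blind zones of the single pulse canceller discussed in this section can be seen by evaluating the canceller output directly. The sketch below assumes a unit-amplitude pulse and subtracts two successive pulses separated by a fixed delay; the function name and the example values are illustrative.

```python
import cmath
import math

def single_canceller_gain(doppler_hz, delay_s):
    """Magnitude of the single-canceller output for a unit-amplitude input:
    |exp(j w t) - exp(j w (t - delay))| = |1 - exp(-j w delay)|."""
    w = 2.0 * math.pi * doppler_hz
    return abs(1.0 - cmath.exp(-1j * w * delay_s))
```

With a delay of 1 ms, the gain vanishes at doppler shifts of 0 Hz, 1 kHz, 2 kHz and so on, and peaks midway between them; a target whose doppler shift falls on one of these zeros is cancelled together with the stationary clutter.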
2.8.3. Delay Line Cancelers
The simplest delay line canceler is a two pulse canceler which is a time
domain filter. The delay time is equal to a pulse repetition interval T which is on
the order of a few milliseconds for typical air-surveillance radars. The advantage of
a time domain MTI over frequency domain filters is that a single network operates
at all ranges and separate filters for each range resolution cell are not required as
in the doppler filter banks. It is more efficient, however, to have a combination
of time domain and frequency domain filters. The video signal received from the
target at range Ro is given by
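$$V_1 = k\sin(2\pi f_d t - \phi_0). \qquad (2.11a)$$
The signal from the previous sample is delayed by $T$, therefore
$$V_2 = k\sin[2\pi f_d(t - T) - \phi_0]. \qquad (2.11b)$$
The MTI output $V_o$ is then the difference of the two signals $V_1$ and $V_2$ and can be
expressed as
$$V_o = V_1 - V_2 = 2k\sin(\pi f_d T)\cos\left[2\pi f_d\left(t - \frac{T}{2}\right) - \phi_0\right]. \qquad (2.11c)$$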
The blind speed zones occur when the response of the single delay line canceler is
zero. This corresponds to $f_d = n/T = n\,\mathrm{PRF}$. The relative target velocities resulting
in zero response from MTI are called blind speeds and are given by $v_n = \frac{n\lambda}{2T}$, where
$n$ is an integer.
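As a worked example, the blind speeds can be tabulated for a given wavelength and pulse repetition interval. The sketch assumes $v_n = n\lambda/(2T)$, which follows from setting the doppler shift $2v_r/\lambda$ equal to $n/T$; the function name and the numbers are illustrative.

```python
def blind_speeds(wavelength_m, pri_s, count):
    """First `count` blind speeds v_n = n * lambda / (2 T), in m/s."""
    return [n * wavelength_m / (2.0 * pri_s) for n in range(1, count + 1)]
```

For a 10 cm wavelength and a 1 ms repetition interval, the first three blind speeds are approximately 50, 100, and 150 m/s. Shortening the interval pushes the first blind speed higher, at the price of a shorter unambiguous range.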
The presence of blind speed zones in the MTI response reduces the detection
performance of MTI. We have already discussed one common method which uses
pulse staggering to reduce the number of blind speed zones. For a single delay
line, one may put the first blind speed outside the range of the expected doppler
frequencies. This, however, causes an ambiguity in range measurements. The
residue $W_i$ of the clutter sample at the $i$th sample is
$$W_i = S_i - S_{i-1}.$$
Broader clutter rejection nulls may be achieved simply by adding more delays or by
cascading two MTI filters based on the two-pulse canceler. In this case, the MTI
response will be the square of the response of a single canceler and is therefore
proportional to $\sin^2(\pi f_d T)$. For a three-pulse (double) canceler, the residual clutter
at the output of the MTI is given by
$$W_i = S_i - 2S_{i-1} + S_{i-2},$$
where the coefficients of the above expression are similar to the coefficients in the
polynomial expansion of $(x - y)^2$. Since these multipliers are the weights of the
MTI filter, higher order MTIs employ higher order polynomial coefficients.
In general, the coefficients correspond to the coefficients of the polynomial
expansion of $(x - y)^n$. As another example, a four-pulse canceller would have as its
coefficients 1, -3, +3, -1 which corresponds to $(x - y)^3$. An $N$-delay, binomially
weighted MTI is able to cancel amplitude samples of an $(N-1)$th order polynomial.
The cancellation does not start until all pulses are received.
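The binomial weighting and its polynomial-cancelling property can be illustrated with a few lines of code. This is a minimal sketch with hypothetical function names; the filter produces no output until as many samples as weights have been received, in line with the remark above.

```python
import math

def binomial_mti_weights(n_delays):
    """Weights of an n-delay canceller: coefficients of (x - y)^n."""
    return [(-1) ** k * math.comb(n_delays, k) for k in range(n_delays + 1)]

def mti_filter(samples, weights):
    """Apply the transversal MTI filter W_i = sum_k w_k * S_{i-k}.
    Output starts only once len(weights) samples are available."""
    n = len(weights)
    return [sum(w * samples[i - k] for k, w in enumerate(weights))
            for i in range(n - 1, len(samples))]
```

For instance, a two-delay canceller (weights 1, -2, 1) removes any linearly varying clutter amplitude completely, while a three-delay canceller (weights 1, -3, 3, -1) is needed to remove a quadratic variation.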
2.8.4. Adaptive MTI
Current MTI processors are followed by Fast Fourier Transform doppler
filter banks. The bandwidth of each filter is matched to the duration of the dwell
time of the main beam on the target, that is, $B_D = 1/T_D$ where $T_D$ is the dwell
time. This would result in $N = \mathrm{PRF}/B_D$ returns that are coherently integrated. The
general MTI structure is basically a linear transversal filter followed by an envelope
detector (or with coherent integrators as described above).
Moving clutter such as sea, weather, chaff, groups of birds, and other interfering
moving targets, however, require more intelligent ways of calculation for the
MTI coefficients. Adaptive MTI is the suggested structure in the current literature
for these changing situations. The basic assumption generally made in Adaptive
MTI (AMTI) is that target returns are spatially concentrated in a few resolution
cells (usually one or two) but clutter tends to be diffused in a number of resolution
cells extended around the target. If the cell being observed is considered as the test
cell, then the neighborhood cells that are usually limited in number to anywhere
from twenty to forty samples are used to estimate the clutter spectrum.
One method for the estimation of the clutter spectrum is the Maximum
Entropy Method (MEM) which is based on the prediction of clutter samples such
that maximum information (randomness) results. The MTI coefficients can then
be adjusted based on the estimated spectra. The power spectral density of an
Autoregressive model (AR) is given by [9,89]
$$S(f) = \frac{P_c T}{\left|1 + \sum_{k=1}^{P} a_k \exp(-j2\pi f k T)\right|^2}$$
where Pc is the clutter power, T is the sampling time, and P denotes the order of
the process which depends on the amount of peak-to-peak variations. The MEM
algorithm can be used to estimate the {ak} coefficients. This will produce the
flattest spectrum of all possible spectra with an autocorrelation function similar
to the measurement data. This is still a least-square fit of the AR model to the
clutter data and no prescribed methodology is available for the determination of P
(model order). If a correct model order P is not chosen, the MEM algorithm will
not be useful in the sense that the estimate may be too smooth or too rough with
spurious details embedded in the spectrum.
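Once the AR coefficients have been estimated, the clutter spectrum follows by direct evaluation. The sketch below assumes the all-pole form $S(f) = P_c T / |1 + \sum_{k=1}^{P} a_k e^{-j2\pi f k T}|^2$ and takes the coefficients as given; it does not implement the MEM estimation itself, and the function name is hypothetical.

```python
import cmath
import math

def ar_psd(freq_hz, coeffs, clutter_power, sample_time_s):
    """Power spectral density of an AR model of order len(coeffs).
    coeffs[k-1] holds a_k for k = 1..P."""
    denom = 1.0 + sum(
        a * cmath.exp(-2j * math.pi * freq_hz * (k + 1) * sample_time_s)
        for k, a in enumerate(coeffs))
    return clutter_power * sample_time_s / abs(denom) ** 2
```

A first-order model with $a_1 = -0.9$, for example, concentrates the clutter power near zero doppler, whereas an empty coefficient list gives the flat spectrum of white noise. Evaluating the spectrum for several candidate orders is one informal way to see the over-smoothing or spurious detail described above.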
2.9. Review of Current Methods in Target Tracking
There have been several different approaches to modeling a maneuver performed
by a target being tracked. Maneuver is what makes target tracking a
challenging task. Even a single target can be lost by an evasive maneuver. The
background clutter, when present, adds to the complexity of the problem. This is
due to the fact that false returns from the clutter may introduce sharp changes in
target returns that can trigger a false declaration of a maneuver. When a target
maneuvers, it is actually trying to escape the tracking window which has a calculated
volume. This tracking window, which is usually elliptic or rectangular (though
other shapes may also be considered), is generally based on a priori knowledge of
the standard deviation of the background noise and the maximum expected speed
of the target.
In order to track the target under a maneuvering condition, the tracking
window size (i.e., the expected region of the target position and velocity) has to
be larger. However, as this window increases in size, more clutter will enter into
the measurement data. Some clutter data may be falsely accepted as target data
and cause a loss of track during the maneuver. Therefore, efficient methods of
maneuver modeling are required in order to make the tracking more robust to
different clutter distributions. In the following sections, we shall review some of
the existing methods for maneuver modeling.
2.9.1. Unknown Input Model
Since radar cannot measure accelerations, a priori models are necessary to
account for the sudden accelerations. In the Unknown Input model, it is assumed
that the maneuver command input is unknown [10,77]. Therefore, it is modeled as
a random process that is referred to as process noise. Depending on how harsh the
maneuver is, a certain level of white noise can be used in the target dynamic model
to represent the unknown acceleration. The target dynamics in a maneuvering
situation can be represented in the form
$$x(k+1) = Fx(k) + Gu(k) + v_i(k) \qquad (2.12)$$
where $u(k)$ represents the actual target input acceleration (i.e., maneuver command
input) and $v_i(k)$ represents the $i$th process noise level. In the Unknown Input
model, the term u( k) is set equal to zero because it is unknown and instead several
different levels of vi(k) are used. These levels are chosen based on the algorithm
developed. For example, the transition between the noise levels is provided either
by a transition probability calculation or by some sort of threshold settings on the
innovation sequence. As another example, a harsh maneuver would result in a
large magnitude of the innovation. We may assign a noise process Vi with a large
standard deviation as long as the magnitude of the innovation sequence is large.
Each noise level provides a certain amount of increase in the size of the
tracking window. There are different schemes for choosing the right noise level (i.e.,
the mean and the standard deviation of a Gaussian noise process). It should be
noted that an exact model for the actual target acceleration is practically impossible
due to the uncertainty in the exact time of acceleration. However, as will be
discussed later, it is possible to estimate u( k) at some time after the maneuver has
taken place and continue the estimation process after that time. Therefore, in such
methods not only should the time of occurrence of the maneuver be estimated, but
also a certain level of $v_i(k)$ is still required to safeguard against the uncertainty
in the estimation of the input u( k). To summarize, in all maneuvering target
tracking problems, the uncertainty of target input acceleration is compensated by
the inclusion of an artificial noise. The challenge is then to integrate the available
information about the target to devise a noise process that leads to the convergence
of the tracking filter. Furthermore, convergence of the filter should take place before
another maneuver is initiated.
Innovation is defined as the difference between the current measurement
data and the measurement predicted from the previous scan, i.e.,
$$\nu(k) = z(k) - H\hat{x}(k|k-1), \qquad (2.13)$$
where $H$ is the measurement matrix. Then the normalized innovation is defined as
$$\epsilon_\nu(k) = \nu^T(k)S^{-1}(k)\nu(k),$$
where $S(k)$ is the covariance matrix of the innovation. $S(k)$ includes the uncertainty
in the new information which is due either to the noise in the sensor measurement
or to prediction error and is given by
$$S(k) = H(k)P(k|k-1)H^T(k) + R(k), \qquad (2.14)$$
where $R(k)$ is the measurement noise covariance matrix. Since $\nu(k)$ is
Gaussian distributed, $\epsilon_\nu(k)$ will have a Chi-square distribution. If the maximum
expected innovation induced by the maneuver is denoted by $\epsilon_{max}$, the probability
$P\{\epsilon_\nu(k) \le \epsilon_{max}\} = 1 - \alpha$ will indicate the probability that the target is not maneuvering.
Upon exceeding the threshold $\epsilon_{max}$, the covariance of the process noise
will be increased until $\epsilon_\nu(k)$ is reduced below this threshold. One should remember
that a maneuver is not a stochastic process. A maneuver is a deterministic process,
which is unknown for tracking purposes. The confusion about the nature of
a maneuver could arise because modeling uncertain events, such as a maneuver, is
popularly handled using the concepts and tools from the theory of probability and
stochastic processes.
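The threshold logic of this section can be sketched for a scalar state. The code below assumes a normalized innovation of the form $\nu^2/S$ with $S = HPH + R$ and $P(k|k-1) = FPF + Q$, and inflates the process noise variance by a fixed factor until the normalized innovation drops below the threshold. The doubling rule, the iteration limit, and all names are illustrative assumptions rather than the scheme of any particular reference.

```python
def inflate_process_noise(z, x_pred, h, f, p_post, q, r, eps_max,
                          growth=2.0, max_iter=50):
    """Increase the process noise variance q until the normalized
    innovation falls below eps_max (scalar case).
    Returns (adjusted q, final normalized innovation)."""
    def normalized_innovation(q_now):
        p_pred = f * p_post * f + q_now      # predicted state variance
        s = h * p_pred * h + r               # innovation variance, as in Eq. (2.14)
        nu = z - h * x_pred                  # innovation, as in Eq. (2.13)
        return nu * nu / s

    q_now = q
    eps = normalized_innovation(q_now)
    steps = 0
    while eps > eps_max and steps < max_iter:
        q_now *= growth
        eps = normalized_innovation(q_now)
        steps += 1
    return q_now, eps
```

For a scalar measurement, a threshold of about 3.84 corresponds to $1 - \alpha = 0.95$, since that is the 95 percent point of the Chi-square distribution with one degree of freedom. A larger process noise widens the tracking window, which is the trade-off against clutter discussed earlier.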
2.9.2. Multiple Model Approach
This approach is similar to the Unknown Input model approach, except that
instead of adjusting the covariance of process noise, the models are already set up
and corresponding to each model a probability or a likelihood function is obtained
for the correct model. Let Mj be the event that model j is correct with prior
probability,
$$P(M_j) = \mu_j(0), \quad j = 1, \ldots, r.$$
Then the likelihood ratio at time k given that model j is correct is given by
where $\nu_j(i)$ is the innovation vector under the assumption that model $j$ is correct
at time i. Assuming Gaussian noise in measurement, the PDF (i.e. Probability
Density Function) of the innovation function of tracking filter j is
(2.15)
Using Bayes's rule,
where $\mu_j(0)$ can be assumed as a lower bound for the initial model probability.
The measurement set for each scan period can be represented by
where $m_k$ corresponds to the number of the received measurements inside the
tracking window and $z_i(k)$ refers to each individual measurement. The cumulative
set of measurements is represented by
where Z(j) represents the set of validated (i.e., gated) measurements in scan j that
have fallen in the validation gate. The state estimates will be the weighted average
of states conditioned on the model, i.e.,
$$E\{x(k)|Z^k\} = \sum_{j=1}^{r} E\{x(k)|M_j, Z^k\}\,P\{M_j|Z^k\}$$
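The probability update and the weighted state estimate can be sketched for scalar innovations and states. The code assumes a Gaussian innovation likelihood for each model and a one-step recursive form of Bayes's rule, in which the current model probabilities act as priors; this recursive form and all function names are illustrative assumptions.

```python
import math

def gaussian_likelihood(nu, s):
    """Gaussian density of a scalar innovation nu with variance s."""
    return math.exp(-0.5 * nu * nu / s) / math.sqrt(2.0 * math.pi * s)

def update_model_probabilities(priors, innovations, variances):
    """Bayes update: posterior_j is proportional to prior_j * likelihood_j."""
    weighted = [mu * gaussian_likelihood(nu, s)
                for mu, nu, s in zip(priors, innovations, variances)]
    total = sum(weighted)
    return [w / total for w in weighted]

def combined_estimate(model_estimates, probabilities):
    """State estimate as the probability-weighted average over the models."""
    return sum(x * p for x, p in zip(model_estimates, probabilities))
```

A model whose filter produces small innovations gains probability at each scan, so the combined estimate drifts toward the output of the filter that best matches the current target behavior.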
2.9.3. Multiple Hypothesis Testing (MHT) Method
One can always create new hypotheses for each received set of data and add
to the complexity and accuracy of the tracking filter model. Upon large deviations
of innovation function $\nu(i)$, one can define a maneuver hypothesis and open a new
file for the target tracks. There are several ways to do this; however, all methods
depend on detecting the maneuver in its early stages. In order to speed up the
maneuver detection, the measurement space may be restricted to a subspace of the
target velocity which undergoes the dominant change when the target prepares for
a maneuver. As an example, if $Z(k) = [x(k), f(k)]^T$ represents the measurement of
position and frequency of the carrier signal, one may restrict the innovation process
to a normalized change in velocity, which in turn is proportional to the change in
carrier frequency (i.e., doppler shift). That is,
$$\epsilon_\nu(k) = \frac{[f(k) - \hat{f}(k|k-1)]^2}{S(k)}$$
The addition of one hypothesis for every time that a maneuver is detected is not
without a cost. Each single hypothesis will result in a new tree of more hypotheses
in later scans which may have to be pruned at a later time.
2.9.4. Colored Noise Modeling of Maneuver
A more realistic model of a maneuver is a correlated (colored) noise process
rather than a white noise. In this approach, the target acceleration $a(t)$ is modeled
as a zero-mean random process with exponential autocorrelation given by
$$R(\tau) = E[a(t)a(t+\tau)] = \sigma_m^2 e^{-\alpha|\tau|}$$
where $\sigma_m^2$ is the variance of target acceleration and $1/\alpha$ is the autocorrelation time
constant. Singer [11] has presented a probability density function for this kind of
maneuver model. Based on this probability density function, the variance of target
acceleration can be calculated and is given by
$$\sigma_m^2 = \frac{A_{max}^2}{3}\left[1 + 4P_{max} - P_0\right]. \qquad (2.16)$$
A correlated noise process for a maneuver that uses target acceleration may be
modeled as a first-order Markov process of the form
$$a(k+1) = \rho a(k) + \sqrt{1-\rho^2}\,\sigma_m r(k) \qquad (2.17)$$
where the correlation coefficient $\rho$ is defined in terms of the sampling interval $T$
and the correlation time constant $\tau$, and $r(k)$ is a white Gaussian process with unit
variance. The correlation coefficient is assumed to be of the form of an exponential
function given by
$$\rho = e^{-T/\tau}.$$
The time constant $\tau$ is referred to as the approximate time duration of target
maneuver and is generally between 10 and 60 seconds for a typical target of interest.
Longer time constants (e.g., 400 seconds and more) are typical for the slower
maneuvering targets.
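The correlated acceleration model can be simulated in a few lines. The sketch below assumes the recursion $a(k+1) = \rho a(k) + \sqrt{1-\rho^2}\,\sigma_m r(k)$ with $\rho = e^{-T/\tau}$. The variance helper uses the factor of one third found in Singer's published expression, with $P_{max}$ read as the probability of maximum acceleration and $P_0$ as the probability of no acceleration; both readings are assumptions here, as the printed equation is only partly legible. All names are illustrative.

```python
import math
import random

def singer_variance(a_max, p_max, p_0):
    """Acceleration variance (A_max^2 / 3) * (1 + 4 P_max - P_0).
    The one-third factor is assumed from Singer's published form."""
    return a_max ** 2 / 3.0 * (1.0 + 4.0 * p_max - p_0)

def simulate_acceleration(sigma_m, tau_s, sample_time_s, n_steps, seed=0):
    """Generate a first-order Markov (colored noise) acceleration sequence."""
    rng = random.Random(seed)
    rho = math.exp(-sample_time_s / tau_s)        # correlation coefficient
    drive = math.sqrt(1.0 - rho ** 2) * sigma_m   # keeps the variance at sigma_m^2
    a = rng.gauss(0.0, sigma_m)                   # start in the stationary state
    sequence = []
    for _ in range(n_steps):
        sequence.append(a)
        a = rho * a + drive * rng.gauss(0.0, 1.0)
    return sequence
```

The scaling $\sqrt{1-\rho^2}$ on the driving noise is what keeps the variance of the sequence equal to $\sigma_m^2$ regardless of the sampling interval, while $\rho$ alone controls how long a simulated maneuver persists.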
2.9.5. Variable Dimension Filter (VDF)
The VDF approach, which was developed by Bar-Shalom [10], suggests two general
modes of operation, i.e., the quiescent mode, and the maneuver mode. In the
case of the quiescent mode, the trajectory is assumed to be a straight line with
constant velocity. Only position and velocity are estimated in the quiescent mode
while in the second operational mode the acceleration can be added as an additional
state. Some radars with adaptive sampling increase the sampling rate upon
detection of the maneuver. However, experience has shown that the problem is not
resolved just by faster sampling [1]. The two filters for the two modes of operation
will have the forms :
$$x = [x \;\; \dot{x} \;\; y \;\; \dot{y}]'$$
which is the state vector for the quiescent model and
for the maneuvering model. Upon detection of a maneuver, the state vector is
augmented by the acceleration state. The acceleration state itself can be modeled
by any one of the available methods.
2.9.6. Input Estimation Model (IE)
In this section, we describe the acceleration model proposed by Bogler [41],
which is referred to as the Input Estimation (IE) method. This model is primarily developed
to account for fast maneuvers of the target and serves as a good procedure
for comparing with the neural network-based model presented in this dissertation.
Bogler's algorithm will be briefly outlined in the following.
Consider a system with one dimensional state equation
x(k + 1) = Fx(k) + GU(k) + V(k) (2.18)
where U is an unknown input modeling the target maneuver and V is a white noise
process assumed zero-mean with covariance Q. The observation sequence is given
by
z(k) = Hx(k) + W(k)
where the observation noise W is also a zero-mean white noise with covariance
R and is independent of the process noise V(k). In the absence of the target
maneuver, the estimation of the state is performed by using the model without
input (i.e. non-maneuvering model) which is given by
x(k + 1) = Fx(k) + V(k).
From the innovations of the Kalman filter based on the non-maneuvering model,
the input U(k) is to be detected, estimated and used to correct the state estimate.
Assume that the target starts maneuvering at time k. Its unknown inputs
during the time interval [k, ... , k + s] are U(i), i = k, ... , k + s -1. The estimates
from the (now mismatched) filter based on the non-maneuvering model will be
denoted by an asterisk. The one-step prediction will be
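$$\begin{aligned}\hat{x}^*(i+1|i) &= F[I - W(i)H]\,\hat{x}^*(i|i-1) + FW(i)z(i) \\ &= \Phi(i)\,\hat{x}^*(i|i-1) + FW(i)z(i), \qquad i = k, \ldots, k+s-1\end{aligned}$$
where $\Phi(i) = F[I - W(i)H]$, $W(i)$ here denotes the filter gain, and the initial
condition for the predicted state is
$$\hat{x}^*(k|k-1) = \hat{x}(k|k-1).$$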
The recursion equation for the one-step prediction in terms of the initial
condition is given by
$$\hat{x}^*(i+1|i) = \Big[\prod_{j=k}^{i}\Phi(j)\Big]\hat{x}(k|k-1) + \sum_{j=k}^{i}\Big[\prod_{m=j+1}^{i}\Phi(m)\Big]FW(j)z(j)$$
for $i = k, \ldots, k+s-1$. Now if the inputs (i.e., $U(k)$ and $V(k)$) were known,
the correct filter based on the input model would yield estimates according to the
recursion
$$\begin{aligned}\hat{x}(i+1|i) &= \Phi(i)\hat{x}(i|i-1) + FW(i)z(i) + GU(i) \\ &= \Big[\prod_{j=k}^{i}\Phi(j)\Big]\hat{x}(k|k-1) + \sum_{j=k}^{i}\Big[\prod_{m=j+1}^{i}\Phi(m)\Big][FW(j)z(j) + GU(j)].\end{aligned}$$
Note that the only difference is the last term containing the inputs. The
corresponding innovations based on the two filters are
$$\nu(i+1) = z(i+1) - H\hat{x}(i+1|i)$$
and
$$\nu^*(i+1) = z(i+1) - H\hat{x}^*(i+1|i).$$
These innovations can be related as
$$\nu^*(i+1) = \nu(i+1) + H\sum_{j=k}^{i}\Big[\prod_{m=j+1}^{i}\Phi(m)\Big]GU(j).$$
Assume a constant input over the time interval $[k, \ldots, k+s]$, i.e.,
$$U(j) = U, \qquad j = k, \ldots, k+s-1$$
which yields
$$\nu^*(i+1) = \Psi(i+1)U + \nu(i+1), \qquad i = k, \ldots, k+s-1$$
where
$$\Psi(i+1) = H\sum_{j=k}^{i}\Big[\prod_{m=j+1}^{i}\Phi(m)\Big]G.$$
It can be seen from the expression for $\nu^*(i+1)$ that the innovation $\nu^*$ of
the non-maneuvering filter is a "linear measurement" of the input (maneuver) $U$
in the presence of the additive "white noise" $\nu$. It then follows that the input can
be estimated using a least-squares criterion from
$$y = \Psi U + \epsilon$$
where
$$y = \begin{pmatrix}\nu^*(k+1)\\ \vdots\\ \nu^*(k+s)\end{pmatrix} \quad\text{and}\quad \Psi = \begin{pmatrix}\Psi(k+1)\\ \vdots\\ \Psi(k+s)\end{pmatrix}$$
are the stacked "measurement" vector and matrix, respectively, and the "noise"
$$\epsilon = \begin{pmatrix}\nu(k+1)\\ \vdots\\ \nu(k+s)\end{pmatrix}$$
is zero-mean with block-diagonal covariance matrix $S$, which was given in
equation (2.14).
The estimation can be done in a batch form as
$$\hat{U} = (\Psi^T S^{-1}\Psi)^{-1}\Psi^T S^{-1}y \tag{2.19a}$$
with the resulting covariance matrix
$$L = (\Psi^T S^{-1}\Psi)^{-1}. \tag{2.19b}$$
Based on this estimate, a maneuver is declared "detected" only if it is statistically
significant. The test for significance for the vector estimate $\hat{U}$ is
$$d(\hat{U}) = \hat{U}^T L^{-1}\hat{U} \ge c$$
where $c$ is a threshold.
The choice of the threshold is as follows. If the input is zero, then
$$\hat{U} \sim N(0, L) \tag{2.20}$$
i.e., the estimate is a normal random variable with zero mean and covariance $L$.
The statistic $d$ is then chi-square distributed with $n_u$ degrees of freedom ($n_u$ is
the dimension of the vector $U$), and $c$ is chosen such that the probability of an
incorrect decision (declaring a maneuver when the input is zero) is
$$P\{d(\hat{U}) \ge c\} = \alpha,$$
with $\alpha = 10^{-2}$ or any other desired level.
When a maneuver is detected, the state has to be corrected as follows:
$$\hat{x}^u(k+s+1|k+s) = \hat{x}^*(k+s+1|k+s) + M\hat{U} \tag{2.21}$$
where $\hat{x}^u$ is the new corrected state with input modeling. In this equation, the
matrix $M$, called the propagation matrix, is given by
$$M = \sum_{j=k}^{k+s}\Big[\prod_{m=j+1}^{k+s}\Phi(m)\Big]G \tag{2.22}$$
and the covariance associated with the new estimate $\hat{x}^u$ is
$$P^u(k+s+1|k+s) = P(k+s+1|k+s) + MLM^T. \tag{2.23}$$
A maneuver is considered finished when the input estimate based on measurements
from the sliding window of length $s$ becomes insignificant. The length
$s$ is a design parameter. In cases where the duration of a maneuver is short
relative to a sample interval, a window size of $s = 1$ or $2$ sampling periods is
appropriate. However, in most practical cases, it will be necessary to consider data
over a longer period in order to produce a reliable estimate. This is the general
requirement for every statistical parameter estimation.
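The batch estimation and significance test of this section can be sketched
numerically as follows. This is a minimal illustration and not Bogler's
implementation: it assumes the standard weighted least-squares solution for
$y = \Psi U + \epsilon$ with noise covariance $S$, a scalar input ($n_u = 1$), and
the tabulated chi-square value 6.635 for $\alpha = 10^{-2}$ with one degree of
freedom. The function name `estimate_input` and the toy numbers are hypothetical.

```python
import numpy as np

def estimate_input(y, Psi, S):
    """Weighted least-squares estimate of a constant input U from y = Psi U + eps.

    y   : stacked innovations of the non-maneuvering filter, shape (s*nz,)
    Psi : stacked "measurement" matrix, shape (s*nz, nu)
    S   : covariance of the stacked noise eps, shape (s*nz, s*nz)
    """
    S_inv = np.linalg.inv(S)
    L = np.linalg.inv(Psi.T @ S_inv @ Psi)        # covariance of the estimate
    U_hat = L @ Psi.T @ S_inv @ y                 # batch estimate of the input
    d = float(U_hat @ np.linalg.solve(L, U_hat))  # test statistic U^T L^-1 U
    return U_hat, L, d

# Toy example (hypothetical numbers): scalar input, window of s = 5 samples.
Psi = 0.5 * np.arange(1, 6, dtype=float).reshape(-1, 1)
S = np.eye(5)
U_true = np.array([2.0])
y = Psi @ U_true                 # noise-free innovations, for illustration only

U_hat, L, d = estimate_input(y, Psi, S)
c = 6.635                        # chi-square threshold, 1 dof, alpha = 0.01
maneuver_detected = d >= c
```

With noise-free innovations the estimate recovers the true input exactly; with
real innovations the statistic $d$ fluctuates, which is why the window length $s$
matters for a reliable decision.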
2.10. Parallelism in Target Tracking
The ultimate goal in parallel processing of the target tracks is to come up
with an algorithm that is independent of the number of targets to be tracked.
The lowest level of parallelism starts at the instruction level of the processor. For a
complex algorithm, each task has to be described at the lowest level of description
to make use of the maximum parallelism inherent in the structure of the algorithm.
This is not always practical with complex tasks. For multi-sensor multi-target
tracking algorithms, we need to describe the target dynamics in the time, space, and
feature domains, some of which are not efficiently described by the low-level
instruction sets of current digital computers.
According to Pattipati et al. [12], the most important tasks in target
tracking can be broken into five steps: (1) track prediction, (2) gating, (3)
track update, (4) clustering, and (5) formation of the global hypothesis. The
optimum use of parallelism in these non-uniform tasks is to distribute the tasks
such that the algorithm executes a uniform number of operations for all steps. It is
only through such a balanced distribution of the number of operations that optimal
use of this inherent parallelism can be made.
2.11. Sources of Nonlinearity and their Problems
There are two major sources of nonlinearity in tracking problems. One is
due to the measurement being made in one coordinate system and filtered in another.
As an example, for airborne radars, an inertial (non-rotating) reference or fixed
coordinate system will be the preferred one for tracking, whereas for ground-based
radars the Cartesian coordinate system is used. A Cartesian coordinate system is
more convenient for track prediction. The second source of nonlinearity is the target
acceleration. When target acceleration is added to the state vector, the dynamical
equations will also become nonlinear. The available theory, however, is limited
to linear filtering, which is only optimum for Gaussian noise in the measurement
process. Further processing is required in conjunction with a linear filter in order to
compensate for the bias in the state estimates caused by a sudden
acceleration of the target.
2.12. Data Association
Gating and data association are at the heart of target tracking. It is through
this step that all infeasible hypotheses about the correlation of target returns are
dropped. The updating of tracks starts with gating around predicted positions. It
is then at the discretion of the algorithm designer, subject to the limitations of
processing speed and available time, to allow more than one measurement return per
target in each gate. It is also important to have a strategy for the case of an overlap in
the measurement sets for different targets. The overlap occurs when targets stay
too close to one another for one or more scan periods.
2.12.1. Nearest-Neighbor vs. All-Neighbor Approach
In the nearest-neighbor approach to data association, at most one observation
can be associated with the corresponding target and this will be based on a
distance measure. In this approach, a given observation can be used only once. A
distance measure is minimized over all possible cases. In a multiple target situation,
there is a large probability of error with this approach. This is due to a large
number of observations in each gate with equal probability of occurrence around
the predicted position. The likelihood function for association of observation $j$ to
track $i$ is given by [10]
$$g_{ij} = \frac{\exp(-d_{ij}^2/2)}{(2\pi)^{M/2}\sqrt{|S_i|}}$$
where $S_i$ is the residual covariance matrix for track $i$ and $d_{ij}$ denotes the statistical
distance of the received observation data from the predicted position. The statistical
distance is given by
$$d_{ij}^2 = \nu_{ij}^T S_i^{-1}\nu_{ij}$$
where $\nu_{ij}$ is the residual vector from observation $j$ to track $i$. The product
$\pi\gamma^{M/2}\sqrt{|S_i|}$ is the total volume of the gate centered around the expected target
position. The statistical parameter $\gamma$ determines the confidence level, and $M$
is the dimension of the state space. The assignment matrix (i.e., assignment of the
observations to the tracks) can be modeled as an optimization problem to minimize
an overall distance function.
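As a small illustration of gating followed by nearest-neighbor selection for a
single track, consider the sketch below. The function name `nearest_neighbor`,
the gate size and the toy observations are assumptions made for the example
only; the gate value 9.21 corresponds to a 99% chi-square gate in two dimensions.

```python
import numpy as np

def nearest_neighbor(z_pred, S, observations, gamma):
    """Return (index, d2) of the gated observation closest to z_pred.

    Closeness is measured by the statistical distance d2 = nu^T S^-1 nu.
    Returns (None, None) when no observation falls inside the gate d2 <= gamma.
    """
    S_inv = np.linalg.inv(S)
    best_idx, best_d2 = None, None
    for j, z in enumerate(observations):
        nu = np.asarray(z, dtype=float) - z_pred   # residual vector
        d2 = float(nu @ S_inv @ nu)                # squared statistical distance
        if d2 <= gamma and (best_d2 is None or d2 < best_d2):
            best_idx, best_d2 = j, d2
    return best_idx, best_d2

# Hypothetical predicted measurement, residual covariance and observations.
z_pred = np.array([10.0, 5.0])
S = np.diag([4.0, 1.0])
obs = [np.array([12.0, 5.0]), np.array([10.0, 6.5]), np.array([30.0, 5.0])]
idx, d2 = nearest_neighbor(z_pred, S, obs, gamma=9.21)
```

In this example the first observation is selected even though the second one is
closer in the Euclidean sense, because the residual covariance is larger along the
first coordinate; the third observation falls outside the gate.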
Another problem with the nearest-neighbor approach is that when several
equiprobable observations have fallen in the validation gate, the algorithm just
takes the closest one without paying attention to the probability of the observation
being correct. This is because the error covariance matrix does not account for
the probability of an incorrect measurement being processed. The other approach
is the all-neighbor approach, in which all observations within the gate are considered
with some probabilities and a given observation can be used again to update
multiple tracks. It remains to calculate the probability of association of each
individual observation and then to average the observations probabilistically. This
approach has been very effective for single targets with one or more sensors. Based on
the second approach, a filter has been developed by Bar-Shalom [13,14]. This filter is
used in this dissertation for the purpose of training the neural network in Chapter
5 and hence will be reviewed in the next section.
2.12.2. Probabilistic Data Association Filter (PDAF)
In this approach, the probability of each event is calculated before the event
is considered. This probability calculation assigns the uncertainty to each event.
The PDAF decomposes the estimation with respect to the origin of each element
of the latest set of measurements $Z(k) = \{z_i(k)\}$. However, it is assumed that
there is only one target of interest modeled by the dynamical equation
$$x(k+1) = F(k)x(k) + v(k)$$
$$z(k) = H(k)x(k) + W(k)$$
where $v$ and $W$ are zero-mean, mutually independent, white Gaussian noise processes
with covariances $Q(k)$ and $R(k)$, respectively. By the assumption of one
target, it is meant that only one observation in the validation gate belongs to the
target and all other observations are assumed to be from the residual clutter.
The term residual clutter refers to the remaining clutter that has not been filtered
by the CFAR or MTI processors. These extraneous observations are modeled as
identically distributed random variables with uniform spatial distributions.
The PDA filter has two cases of interest: the optimal and the
suboptimal. Since the probabilities are calculated based on the measurement sets
received up to time $(k-1)$, the optimal PDA recalculates the new sequence of
probabilities from the beginning up to the arrival of the new set of measurements.
This exhaustive batch processing approach is normally replaced with probability
calculations based on the latest measurement set only, leading to a suboptimal
filter. The basic assumption is that
$$p[x(k)|Z^{k-1}] = N[x(k); \hat{x}(k|k-1), P(k|k-1)] \tag{2.24}$$
which means that the true target state (e.g., position) is assumed to be normally
distributed around the predicted state.
The following events need to be defined next:
$$\theta_i(k) = \{z_i(k) \text{ is the target-originated measurement}\}, \qquad i = 1, \ldots, m_k$$
$$\theta_0(k) = \{\text{none of the measurements at time } k \text{ is target-originated}\}.$$
Then we define
$$\beta_i(k) = P\{\theta_i(k)|Z^k\}, \qquad i = 0, 1, \ldots, m_k$$
where $\beta_i(k)$ is the probability of each event being correct. The events are mutually
exclusive and exhaustive, and hence
$$\sum_{i=0}^{m_k}\beta_i(k) = 1.$$
The state estimate is then a weighted average over these events, which can be
computed as
$$\begin{aligned}\hat{x}(k|k) &= E[x(k)|Z^k] \\ &= \sum_{i=0}^{m_k}E[x(k)|\theta_i(k), Z^k]\,P\{\theta_i(k)|Z^k\} \\ &= \sum_{i=0}^{m_k}\hat{x}_i(k|k)\beta_i(k).\end{aligned}$$
It may be noted that $\hat{x}_i(k|k) = E[x(k)|\theta_i(k), Z^k]$ is the state estimate under the
assumption that the event $\theta_i(k)$ is correct. Furthermore, $\hat{x}_i(k|k)$ is given by
$$\hat{x}_i(k|k) = \hat{x}(k|k-1) + W(k)\nu_i(k), \qquad i = 1, \ldots, m_k \tag{2.25}$$
where
$$W(k) = P(k|k-1)H^T(k)S^{-1}(k)$$
and
$$S(k) = H(k)P(k|k-1)H^T(k) + R(k).$$
The term $W(k)$ is the weighting for the innovation, or the new information contained
in each event $\theta_i(k)$, and $S(k)$ is the covariance matrix of the innovation vector.
The innovation component corresponding to each new measurement is
$$\nu_i(k) = z_i(k) - \hat{z}(k|k-1). \tag{2.26}$$
Observe that when $\theta_0(k)$ is considered, the filtered state is set equal to the predicted
state, i.e.,
$$\hat{x}_0(k|k) = \hat{x}(k|k-1)$$
which means that if we do not receive any measurement within the predicted gate
volume, then there is no need for filtering and the filtered estimate is set equal to
the predicted estimate. Combining the equations we get
$$\hat{x}(k|k) = \hat{x}(k|k-1) + W(k)\nu(k)$$
$$\nu(k) = \sum_{i=1}^{m_k}\beta_i(k)\nu_i(k).$$
This filter is highly nonlinear since the corresponding covariance matrix,
unlike that of the standard Kalman filter, is dependent on the data. This is due to
the uncertainty in the origin of the measurement assumed earlier. The event
probabilities are then derived by Bar-Shalom [3] as
i = 1, .. . ,mk
with
and
where $P_D$ is the probability that the target is detected by the radar, $P_G$ refers
to the probability that the target is detected inside the predicted region (gate),
and $\lambda$ is the spatial density of false measurements in a Poisson-distributed clutter
environment and is given by
$m_k$ is the total number of measurements received and $V_k$ denotes the volume of the
two-dimensional elliptical (i.e., Gaussian-based) validation region centered around
the predicted state and is given by
Various PDA filters can be developed by different ways of calculating the
association probabilities $\beta_i(k)$. However, they are all based on the underlying
assumption that only one valid target exists in the validation gate. Due to this
assumption, when there is actually another target present in the validation gate,
the PDA filter will be confused and will pick only one of the targets at the time of
the crossing trajectories. The PDAF can track multiple targets as long as they are not
so close that they overlap within the validation gate.
The PDA filter focuses on one validation gate at a time. This is the nature
of PDA, which was originally designed for tracking a single target in clutter. The
dependence of the association probabilities on the innovation is such that the larger
the innovation is, the less likely the data is to be associated with the target.
Furthermore, a larger innovation indicates that the data is farther away from
$\hat{z}(k|k-1)$, which is the expected measurement vector of the target return. There are
quite a number of other algorithms that have been developed, such as those given in
[4,74,84,94]; however, we have summarized the major assumptions and mathematical
background common to most tracking algorithms in the current literature.
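The PDAF state update reviewed in this section can be sketched as follows. This
is an illustrative sketch only. The association probabilities use the commonly
quoted parametric form, $e_i = \exp(-\nu_i^T S^{-1}\nu_i/2)$ and
$b = \lambda|2\pi S|^{1/2}(1 - P_D P_G)/P_D$, which is an assumption here and may
differ in detail from the expressions of [3]. The function name `pdaf_update`, the
default parameter values and the toy numbers are hypothetical, and the covariance
update is omitted.

```python
import numpy as np

def pdaf_update(x_pred, P_pred, H, R, measurements,
                P_D=0.9, P_G=0.99, clutter_density=1e-3):
    """One PDAF state update (state only; covariance update omitted).

    Returns the updated state, beta_0 (no measurement is target-originated)
    and the array of beta_i for the validated measurements.
    """
    S = H @ P_pred @ H.T + R                     # innovation covariance
    S_inv = np.linalg.inv(S)
    W = P_pred @ H.T @ S_inv                     # filter gain
    z_pred = H @ x_pred                          # expected measurement
    nus = [np.asarray(z, dtype=float) - z_pred for z in measurements]

    # Assumed parametric form of the association probabilities.
    e = np.array([np.exp(-0.5 * nu @ S_inv @ nu) for nu in nus])
    b = (clutter_density * np.sqrt(np.linalg.det(2.0 * np.pi * S))
         * (1.0 - P_D * P_G) / P_D)
    denom = b + e.sum()
    beta0, betas = b / denom, e / denom

    # Combined innovation: probability-weighted sum of individual innovations.
    nu_comb = np.zeros(H.shape[0])
    for beta, nu in zip(betas, nus):
        nu_comb += beta * nu
    return x_pred + W @ nu_comb, beta0, betas

# Hypothetical example with state [x, xdot, y, ydot] and position measurements.
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
x_pred = np.array([0.0, 1.0, 0.0, 1.0])
P_pred = np.eye(4)
R = 0.1 * np.eye(2)
x_upd, beta0, betas = pdaf_update(x_pred, P_pred, H, R,
                                  [[0.1, -0.1], [2.0, 2.0]])
```

The measurement with the smaller innovation receives the larger association
probability, and with an empty measurement list the update reduces to the
predicted state.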
CHAPTER 3
A ROBUST NEURAL NETWORK SCHEME FOR CFAR DETECTION
3.1. Introduction
The complexity in target detection by radar systems generally arises from
the fact that the return signal to the radar (echoes) at any particular scan of the
antenna may consist of the signal from the target to be detected, the background
clutter and some thermal noise, all of which may be highly correlated. Detection of
stationary radar targets in nonstationary noise and clutter offers more challenges
compared to situations when the target is moving, because in the latter
case the differences in the Doppler spectral characteristics (MTI techniques) can
be exploited. Historically, the problem of stationary target detection has been handled
as a statistical detection problem by treating the clutter as interfering background.
Most of the modern work in target detection by radar signal processing is
inspired by the pioneering work of Finn [15,16], who employed statistical modeling
of clutter and noise and used a false alarm probability regulation mechanism
which involved making the detection threshold proportional to a spatially sampled
maximum likelihood estimate of the output variance of the cell under test (where
this variance is due to the clutter environment). The goal of Finn's approach was
to develop a Constant False Alarm Rate (CFAR) processor which maximizes the
probability of target detection $P_d$ while maintaining the probability of false alarm
$P_{fa}$ below a prescribed value. By comparing the processed voltage signal from each
resolution cell to an adaptive threshold, which is obtained from estimates of the
mean level of the interference over the adjacent range cells, automatic detection
of targets in nonstationary clutter and noise background is obtained, while achieving
a constant rate of false alarms when the interference is homogeneous over the
reference cells.
In the most basic CFAR detection scheme, called the Cell Averaging-CFAR
(CA-CFAR), the threshold for detection is set adaptively by computing the arithmetic
mean of the outputs from a number of adjacent cells [16,17]. The detection
probability, $P_d$, in this scheme improves as the number of reference cells, $N$,
increases, and in the limit as $N \to \infty$, $P_d$ approaches the detection probability of the
optimum detector (i.e., the classical Neyman-Pearson detector), which is based on
a fixed threshold determined from a priori knowledge of the mean level of
interference. A serious degradation in detection probability, however, results from a
reduction in the number of available reference cells. Several factors, such as
limitations of the radar system in use (in terms of resolution and sampling time) and
the presence of interfering targets and clutter patches in the vicinity of the primary
target, may contribute to the reduction in the number of reference cells.
For operation in a variable clutter environment, the performance of several
CFAR processors proposed in the literature [18,19,45] also depends on the efficiency
of the clutter classification scheme employed [47,76], which in turn depends on the
number of independent data samples that can be processed during every scan. A
critical problem with decision-making using this approach is the correct estimation
of the filter parameters (those of the whitening and matched filters), which in
turn depends on the selection of a model (AR or ARMA) of appropriate order for
representing the time-series data.
It is widely acknowledged that the design of a CFAR processor which is
capable of delivering a consistently high level of performance in all situations that
may include not only homogeneous background but also various forms of
nonhomogeneities, caused by clutter edges, clutter patches, multiple interfering targets,
etc., is not feasible. This is due to the fact that the inherent assumption in the
design of the CA-CFAR processor, viz. that the statistics of interference at each
reference cell are the same as the statistics of the test cell, is violated. This has
prompted a flurry of activity in this area leading to several modifications of the basic
CA-CFAR algorithm, the differences mainly stemming from the selection logic used for
extracting the signal that will be compared with the signal from the test cell to
perform detection. For a brief description of the underlying processing, consider the
schematic diagram of a typical CFAR detector shown in Fig. 3.1. The reference window of
width $N = 2n + 1$ is split into a leading part (of width $n$) and a lagging part (of
width $n$) symmetrically about the cell under test, and the square-law detected
samples from the adjacent leading and lagging reference cells are summed individually
and processed by the selection logic. The processed signal $Y_s$ is multiplied by a
threshold multiplier $T_m$ and is compared with the sample $Y_{n+1}$ from the test cell,
and a "target detected" or "target absent" decision is made depending on whether
or not $Y_{n+1}$ exceeds the threshold value $T_m Y_s$. It is in the specific algorithm used
by the selection logic that the various CFAR processing schemes differ. In the basic
CA-CFAR scheme, $Y_s$ is obtained as $Y_{lead} + Y_{lag}$, where $Y_{lead} = \sum_{i=n+2}^{2n+1} Y_i$ (i.e., the
sum of the samples from the reference cells leading the target cell) and
$Y_{lag} = \sum_{i=1}^{n} Y_i$ (i.e., the sum of the samples from the lagging reference cells). In a
variation of this scheme, called the Greatest Of-CFAR (GO-CFAR) [20,21,46], $Y_s$ is
selected as $\max(Y_{lead}, Y_{lag})$, whereas the selection $Y_s = \min(Y_{lead}, Y_{lag})$ results in yet
another modification called the Smallest Of-CFAR (SO-CFAR) [22,23].
Fig. 3.1 A Schematic Diagram of CFAR Detector
Several other modifications such as the Ordered Statistics-CFAR (OS-CFAR) [24],
the Trimmed Mean CFAR (TM-CFAR) [25] and the Censored Mean Level detector
(CMLD) [26] also exist in the literature. A detailed performance evaluation of these
schemes can be found in [27].
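The selection logic of the three mean-level schemes can be illustrated with a
short sketch. It is a simplified illustration of the processing in Fig. 3.1 and
not an implementation from the cited references; the function name
`cfar_decision`, the threshold multiplier and the sample values are hypothetical.

```python
import numpy as np

def cfar_decision(window, t_m, mode="CA"):
    """Mean-level CFAR test on a window of N = 2n + 1 square-law samples.

    The centre cell is the cell under test; the n cells before it form the
    lagging window and the n cells after it form the leading window.
    Returns True for "target detected" and False for "target absent".
    """
    window = np.asarray(window, dtype=float)
    n = (len(window) - 1) // 2
    y_lag = window[:n].sum()
    y_lead = window[n + 1:].sum()
    if mode == "CA":
        y_s = y_lead + y_lag          # cell averaging
    elif mode == "GO":
        y_s = max(y_lead, y_lag)      # greatest of
    elif mode == "SO":
        y_s = min(y_lead, y_lag)      # smallest of
    else:
        raise ValueError("mode must be 'CA', 'GO' or 'SO'")
    return bool(window[n] > t_m * y_s)

# Hypothetical samples (n = 3) with an interfering target in the leading window.
cells = [1.0, 1.0, 1.0, 10.0, 1.0, 1.0, 8.0]
decisions = {mode: cfar_decision(cells, 1.0, mode) for mode in ("CA", "GO", "SO")}
```

With these numbers the interfering return raises the CA and GO thresholds enough
to mask the test cell, while the SO selection still declares a detection.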
When the background is homogeneous and the reference cells contain
independent and identically distributed (iid) observations governed by an exponential
distribution, the basic CA-CFAR yields optimum target detection performance.
However, the performance degrades (increase in $P_{fa}$ and/or increase in detection
threshold leading to a lower $P_d$) when these assumptions are violated, particularly
in nonhomogeneous background situations. It is precisely to compensate for
these performance degradations in the various cases of background nonhomogeneities
that different modifications of CA-CFAR have been developed. For instance,
the selection logic used in GO-CFAR is tailored to overcome the performance loss
when step increases in the background noise level (such as those produced at clutter
edges) are present, while that in SO-CFAR is tailored to yield good performance
in multiple target environments for resolving two closely spaced targets (such as
when an interfering target lies within the reference cells of the primary target to
be detected). Despite the underlying differences stemming from the selection logic
employed, a common requirement for all of these mean-level CFAR detectors
(CA-CFAR and its various modifications) is the availability of an adequate number
of samples from the reference cells, or a fairly large-sized reference window
$N$. This is due to the statistical requirements for the parameters that are used
for representing the target fluctuations and the clutter background. The detection
performance hence degrades very sharply when the size of the reference window
is less than about 30. Furthermore, each modification of CA-CFAR is developed
to specifically handle the performance loss arising from a specific situation that
violates the generally held assumptions about the environment and may not offer
any benefits (and in some cases may further degrade the performance) if the situation
encountered is a different one. For instance, as noted by several investigators
[23,27,48,49], while the GO-CFAR detector efficiently regulates the false alarm rate
in the presence of edge clutter, it may indeed worsen the performance in multiple
target environments (such as when an interfering target with strength equal to the
primary target appears in the reference window).
In this chapter, we shall present a novel neural network-based CFAR detection
scheme (referred to as the NN-CFAR scheme for brevity) that offers a robust
performance in the face of loss of reference cells and also other nonideal conditions
corresponding to nonhomogeneous background environments. This scheme employs
a multilayer feedforward neural network trained by the error backpropagation approach
[28] using the optimal detector as the teacher. The excellent pattern classification
capabilities of trained neural networks are exploited in this application to efficiently
counter performance degradations due to reduced reference window sizes and other
nonidealities.
Artificial neural networks are emerging as very attractive alternatives to
traditional methods (maximum likelihood techniques, nearest-neighbor classification,
etc.) in the development of computer-based pattern classification algorithms, since
they can learn to perform the required classification without the assumption of
probabilistic models for the input patterns. Pattern classifiers are mappings that
define partitions of feature space into regions corresponding to class membership.
Classification problems that are not linearly separable and require nonlinear
decision boundaries can be solved using multilayered neural networks with neurons
having nonlinear transfer characteristics. This area has witnessed an explosion
of research in the recent past and one of the important results that has come
out is based on the celebrated theorem of Kolmogorov. This result states that
any continuous nonlinear mapping can be approximated as closely as desired by a
multilayered neural network with a feedforward topology and sigmoidal nonlinear
functions [31-33].
The basic idea underlying the present work is the employment of a neural
network for a better representation of the target and the background, such that the
samples from a smaller sized window can be used without significant CFAR
performance loss. For a brief description, it may be noted that the information loss due
to the reduced reference window size can be compensated by the use of additional
parameters. The conventional mean-level CFAR detectors primarily use one or
two parameters for representation of the background (for instance, average clutter
power used to set the threshold) and any attempts to use more parameters generally
result in significant increases in design and implementational complexities, thus
neutralizing any possible performance gains. On the other hand, a neural network
implementation of the CFAR detection scheme provides a convenient approach
for accommodating more input features without a corresponding increase in design
complexities, owing to the parallel processing capabilities of the neural network.
The fault tolerant properties of the neural network-based design also need
a particular emphasis. When the actual clutter distribution encountered deviates
from the ones used for training the NN-CFAR detector, the level of detection
performance is maintained due to the utilization of more input features. Furthermore,
in highly nonhomogeneous situations, such as when the returns from some
of the reference cells are defective (particularly when some dead cells are present
on both sides of the test cell), the NN-CFAR scheme offers a better performance
than the conventional methods. Performance evaluations of the presently developed
scheme in these and other interesting scenarios will be described in a later
section of this chapter.
The primary emphasis in this chapter will be on describing the input features
used for training the neural network and on demonstrating the viability of this
approach for target detection in diverse background environments. Consequently,
to keep the discussion simple, we will limit ourselves to providing performance
comparisons with the basic CA-CFAR scheme and will highlight the advantages of
employing the NN-CFAR scheme in these scenarios. Expanding the approach to
more efficiently handle a particular scenario for which a specific modification of the
CA-CFAR (such as GO-CFAR or SO-CFAR) has been developed is straightforward
and will require more training examples to be selected from that specific scenario.
3.2. Development of NN-CFAR Scheme
Targets of interest in radar detection usually result in specific features in the
reflected signal which are however buried in thermal noise and clutter. Due to its
pattern classification ability [29,50], a carefully trained neural network can more
efficiently distinguish features in the structured target echoes by utilizing some
details that are generally hidden to conventional statistical receivers. The primary
objective of the present design is to employ a neural network scheme in order to
enhance the performance of conventional mean-level CFAR processors, particularly
when the number of reference cells is reduced and/or other nonideal conditions are
present in the detection environment.
3.2.1 Framework For Neural Network Training
In this section, we shall briefly describe the assumptions used in establishing
a framework for generating the training examples and also for the simulation
exercises that will be discussed in the next section. These assumptions are, however,
common to most existing CFAR design procedures and hence facilitate a fair
comparison of performance later. It should however be emphasized that the NN-CFAR
design does not really require these assumptions, nor is it limited to environments
satisfying all of these assumptions. In other words, the training vectors can
indeed be generated and the neural network can be satisfactorily trained with these
vectors even when the assumptions are not all satisfied.
(i) Square-law detection will be used (Fig. 3.1) and samples are sent serially
through a shift register of size $N = 2n + 1$, where $n$ is an integer.
(ii) The cell at the center (viz. the $(n+1)$th cell) is used for the primary target,
such that the leading and the lagging windows are of equal width.
(iii) Only range cells will be used (however, an extension of this approach to two
dimensions is straightforward) and the output of each range cell is assumed
to be exponentially distributed with probability density function given by
$$f(p) = \frac{1}{2q}\exp(-p/2q), \qquad p \ge 0. \tag{3.1}$$
(iv) Targets in the reference window (both primary and any interfering targets)
have only temporal fluctuation and the amplitude fluctuation is according
to the Swerling-I model. Thus, under the null hypothesis $H_0$ (no target
present in a range cell), $q$ in (3.1) is the average power of the total clutter
plus thermal noise, which will be denoted by $\mu$. Under the alternative
hypothesis $H_1$ (target present in the range cell), $q$ refers to the average
power due to all three returns (clutter, noise and reflection from target) and
is represented by $q = \mu(1 + S)$, where $S$ denotes the average signal-to-noise
ratio (SNR) of the primary target. For any interfering target with power $I$,
$I/S$ represents the interference to target power ratio, which will be used in
the performance analysis. Thus, for a reference cell containing an interfering
target, $q = \mu(1 + I)$ represents the average signal power.
(v) The clutter and noise residues in the range cells are assumed to be independent
and identically distributed (iid). It must be emphasized that although
this assumption will correspond to a rather specialized case for training, as
mentioned earlier our primary interest in this chapter is to compare the
performance of the NN-CFAR scheme with that of the basic CA-CFAR (which is
particularly designed for a homogeneous background). Once the NN-CFAR
is trained with the selected examples, the performance will be evaluated to
highlight the robustness to the loss of reference cells as well as to the
deviations of the actual clutter distribution from the original distributions used
for training.
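Under assumptions (i) to (v), a reference window can be simulated as sketched
below. This is only one possible way of generating such samples and is not taken
from the dissertation; the function name `sample_window`, the random seed and the
parameter values are hypothetical. Note that the density in (3.1) is an
exponential distribution with mean $2q$.

```python
import numpy as np

def sample_window(n, mu, S=None, rng=None):
    """Draw one reference window of N = 2n + 1 square-law detected cells.

    Each cell is exponential with pdf f(p) = exp(-p / 2q) / 2q, i.e. mean 2q.
    Reference cells use q = mu. The centre (test) cell uses q = mu * (1 + S)
    when a target is present (S is the linear SNR) and q = mu when S is None.
    """
    rng = np.random.default_rng() if rng is None else rng
    window = rng.exponential(scale=2.0 * mu, size=2 * n + 1)
    if S is not None:
        window[n] = rng.exponential(scale=2.0 * mu * (1.0 + S))
    return window

rng = np.random.default_rng(0)
h0 = sample_window(n=16, mu=1.0, rng=rng)           # noise-only window
h1 = sample_window(n=16, mu=1.0, S=10.0, rng=rng)   # target in the test cell
```

Interfering targets could be added in the same way by redrawing a reference
cell with $q = \mu(1 + I)$.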
3.2.2. Training with Optimum Detector as Teacher
In the development of the NN-CFAR scheme, the neural network will be trained
for a number of distinct false alarm rates (i.e., $P_{fa}$ values) using the decisions from
the optimum detector as examples, and the detection performance (after completion
of training) will be compared with that of the CA-CFAR scheme in diverse
operational scenarios, such as when the size of the reference window is progressively
reduced. In order to establish a precise framework for executing these tasks
and for stating the performance metrics used in the comparison, we shall briefly
identify some performance quantities of interest for the optimum detector and the
CA-CFAR detector.
Under the assumptions of homogeneous background and knowledge of the
total noise power $\mu$, the probability of false alarm in the optimum detector is given
by
$$P_{fa} = \Pr[x > X_0 \,|\, H_0] = \exp\left(\frac{-X_0}{2\mu}\right) \tag{3.2}$$
where $x$ denotes the signal from the test cell, $X_0$ is the fixed threshold and $H_0$
denotes the null hypothesis (noise only). Under the other (signal present) hypothesis
$H_1$, the optimum detection probability $P_d^{opt}$ is
$$P_d^{opt} = \Pr[x > X_0 \,|\, H_1] = \exp\left(\frac{-X_0}{2\mu(1+S)}\right) \tag{3.3}$$
where $S$ denotes the signal-to-noise ratio. In the case of CA-CFAR, the corresponding
quantities can be evaluated by observing that the signal from the test
cell $x$ is compared with a variable threshold $T_m Y_s$ (see Fig. 3.1). Hence,
(3.4)
which can be expressed [21] as
$$P_{fa} = M\left(\frac{T_m}{2\mu}\right) \tag{3.5}$$
where $M(\cdot)$ denotes the moment generating function (mgf) of the random variable
$Y_s$. Similarly, the detection probability is
(3.6)
which can be expressed [21] as
(3.7)
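For the optimum detector, the fixed threshold and the resulting decisions can be
computed directly. The sketch below assumes the exponential cell model of (3.1),
under which the threshold follows from (3.2) as $X_0 = -2\mu\ln P_{fa}$ and the
detection probability for a target of average SNR $S$ is
$\exp(-X_0/2\mu(1+S))$. The function names and the example values are
hypothetical, and `teacher_label` only indicates how the decisions of the optimum
detector could serve as training targets.

```python
import math

def optimum_threshold(p_fa, mu):
    """Fixed threshold X0 that gives the prescribed false alarm probability."""
    return -2.0 * mu * math.log(p_fa)

def optimum_pd(p_fa, mu, snr):
    """Detection probability of the optimum detector for linear SNR `snr`."""
    x0 = optimum_threshold(p_fa, mu)
    return math.exp(-x0 / (2.0 * mu * (1.0 + snr)))

def teacher_label(x, p_fa, mu):
    """Decision of the optimum detector for test-cell sample x (1 = target)."""
    return 1 if x > optimum_threshold(p_fa, mu) else 0

x0 = optimum_threshold(1e-4, mu=1.0)
pd = optimum_pd(1e-4, mu=1.0, snr=10.0)
```

Since the threshold depends on $\mu$, this teacher presumes knowledge of the total
noise power, which is exactly the information the CFAR schemes have to estimate
from the reference cells.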
As noted in [21] and [22], the average detection threshold (ADT) provides a
convenient mechanism for estimating the loss of detection performance (due to the
finite reference window size) of various CFAR processing algorithms and is defined
as
(3.8)
For the CA-CFAR, under the assumption of exponentially distributed homogeneous
noise background, this simplifies [27] to
(3.9)
which is independent of $\mu$. For the optimum detector, however, the threshold is
fixed and hence, using (3.2), the ADT can be evaluated as
$$ADT_{opt} = \frac{X_0}{2\mu} = -\ln\left(P_{fa}\right). \tag{3.10}$$
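As a quick numerical check of (3.2) and (3.10), the following sketch evaluates the normalized threshold and the detection probability of the optimum detector. The function names are illustrative, and the SNR is assumed to be given as a linear ratio (not in db); the relation $P_d^{opt} = [P_{fa}]^{1/(1+S)}$ is the one used for the optimum detector in Experiment # 1.

```python
import math

def optimum_adt(p_fa):
    """Normalized fixed threshold X0 / (2 * mu) of the optimum detector, eq. (3.10)."""
    return -math.log(p_fa)

def optimum_pd(p_fa, snr):
    """Detection probability of the optimum detector for a linear SNR value."""
    return p_fa ** (1.0 / (1.0 + snr))

# Tabulate the threshold and the detection probability at S = 10 (linear).
for p_fa in (1e-4, 1e-6, 1e-8):
    print(f"P_fa={p_fa:.0e}  ADT_opt={optimum_adt(p_fa):.2f}  "
          f"P_d(S=10)={optimum_pd(p_fa, 10.0):.3f}")
```

With S = 0 the detection probability collapses to the false alarm probability, which is a convenient sanity check on the formula.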
Of particular interest to our work is the change in $ADT_{CFAR}$ as the size of the
reference window N decreases. The detection probability $P_d^{CFAR}$ and the detection
threshold $T_m$ can be evaluated in this case as
$$P_d^{CFAR} = \left[1 + \frac{T_m}{2\mu(1+S)}\right]^{-N} \tag{3.11}$$
and
which clearly illustrate the effects of reducing N. In particular, as N decreases,
$T_m$ increases, which consequently results in a lower probability of detection for
CA-CFAR compared to that of the optimum detector. This, of course, is the price for
keeping $P_{fa}$ constant.
The variation of ADT as a function of N has been studied in the literature
for different CFAR processors. Table 3-1 gives a quantitative comparison of ADT
for the CA-CFAR processor for different values of N with the optimum detector
threshold computed for several $P_{fa}$ values. This table is also given in [27].
Table 3-1 Comparison of ADT for CA-CFAR with threshold for optimum detector

            Optimum    N = 9          N = 17         N = 25         N = 33
  P_fa      ADT        T      ADT     T      ADT     T      ADT     T      ADT
  10^-4      9.21      2.1    17.3    0.78   12.4    0.47   11.2    0.33   10.6
  10^-6     13.8       4.6    37.0    1.37   21.9    0.78   18.6    0.54   17.2
  10^-8     18.4       9.0    72.0    2.16   34.6    1.15   27.7    0.77   24.9
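The CA-CFAR entries of Table 3-1 can be checked numerically. The sketch below rests on assumptions inferred from the tabulated values rather than stated in the text: a window of size N is taken to contain N - 1 reference cells plus the test cell, the threshold multiplier is taken to satisfy $P_{fa} = (1 + T)^{-(N-1)}$, and the average detection threshold is taken to be ADT = (N - 1) T. The function names are illustrative.

```python
import math

def ca_cfar_multiplier(p_fa, window):
    """Threshold multiplier T, assuming window - 1 exponential reference cells."""
    cells = window - 1
    return p_fa ** (-1.0 / cells) - 1.0

def ca_cfar_adt(p_fa, window):
    """Average detection threshold, assumed to equal (window - 1) * T."""
    return (window - 1) * ca_cfar_multiplier(p_fa, window)

for p_fa in (1e-4, 1e-6, 1e-8):
    cols = [f"optimum ADT={-math.log(p_fa):5.2f}"]
    for window in (9, 17, 25, 33):
        cols.append(f"N={window}: T={ca_cfar_multiplier(p_fa, window):.2f} "
                    f"ADT={ca_cfar_adt(p_fa, window):.1f}")
    print(f"P_fa={p_fa:.0e}  " + "  ".join(cols))
```

Under these assumptions the computed values agree with the table to within rounding, and the CA-CFAR ADT approaches the optimum value from above as N grows.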
Evidently, reducing the number of averaging cells progressively raises the
threshold, and consequently the target is masked when N is very small.
Furthermore, as N tends to infinity, the CA-CFAR detection threshold
matches that of the optimum detector (if the background stays homogeneous). This
explains the rationale for using the optimum detector as the teacher for training
NN-CFAR. Table 3-1 also contains another useful piece of information. Observe that for
N = 33, the ADT for CA-CFAR gets reasonably close to that of the optimum
detector, while progressively worsening as N becomes smaller. Hence, for designing
the NN-CFAR, we use the same number of cells (i.e., N = 33) for training (the
performance of the trained network will, however, be evaluated for N < 33 and
also for nonhomogeneous background situations). Another reason for selecting N
= 33 during the training phase is due to the statistical requirements on the input
features used; most of these signals need to have a sample size of at least 30 in
order to give an unbiased estimate of the mean or the variance that will be used. Of
course, there is no upper bound on the size of N, and the larger the value of N selected
during the training phase, the better the representation of the background clutter to
which the neural network will be exposed. Thus, N = 33 is a representative selection and
this will be used for all of the further development.
From the analysis of the CA-CFAR scheme given above, it is evident that
the average power in the reference cells should be included in the set of input
parameters simply because it serves to represent the background power. Thus in
the training process, we expose the average power of the cells to the network for
several distinct values of $P_{fa}$. Of course, in addition to this parameter, we will
use a few more parameters such that the network continues to have a recall of the
actual target and background distributions even when the reference window size
gets reduced. Details of the training process will be given in a later section.
3.2.3. Selection of Input Features
The NN-CFAR detector is trained to make decisions based on features derived
from the radar data from the N = 2n + 1 reference cells collected during
the course of a single antenna scan. Of fundamental importance for satisfactory
training of the neural network, and for the pattern classification performance of the
trained network, is the selection of an appropriate set of input features. In
this section, we shall describe the specific input features that are used to represent
the target and the clutter fluctuations in the NN-CFAR scheme, and also briefly
describe the motivations for their selection. An obvious input feature
is the output of the test cell (the center cell of the reference
window), $y_{n+1}$. Additional parameters that provide statistical characterization of
the samples from the reference cells on either side of the test cell will be used to
complete the input feature set.
The motivation for using more parameters for training the neural network
comes from the observation that statistical parameters generally lose their effec-
tiveness as the number of samples is reduced. To compensate for this loss in the
face of reduced reference window size, we employ several parameters that attempt
to characterize the same statistical properties of the available sample set. It must
be noted that most of these parameters have been individually used in the design
of different types of CFAR algorithms earlier. The ability of the neural network for
simultaneously processing these various signals (i.e., the signal fusion capability)
permits all of them to be used together in this application for obtaining a better
representation of the background clutter.
Statistical Mean Over the Reference Window:
The first feature included in the input set is the statistical mean $\mu_T$, which
reflects the total average power in the reference cells (including the cell under
test). Since during training we are using N = 33 resolution cells, statistically this
constitutes a sufficient number to compute the sample mean, particularly when
each cell has an independent distribution. If the outputs of the range cells (i.e.,
output of the square law detector), $y_i$, $i = 1, 2, \ldots, N$, are independent exponentially
distributed random variables, the mean $\mu_T$ given by
$$\mu_T = \frac{1}{N} \sum_{i=1}^{N} y_i$$
is also exponentially distributed.
Average Powers of the Leading and the Lagging Windows:
Evidently, in the presence of a target, the return from the test cell ($y_{n+1}$)
affects the parameter $\mu_T$ considerably. In order to provide a sense of the background
by itself, we use the average powers of the leading and the lagging sides of the
reference window as two input features. These signals, $Y_{lead}$ and $Y_{lag}$, are given by
$$Y_{lead} = \frac{1}{n} \sum_{i=n+2}^{2n+1} y_i$$
and
$$Y_{lag} = \frac{1}{n} \sum_{j=1}^{n} y_j,$$
where $n = \frac{N-1}{2}$. It is evident that $Y_{lead}$ and $Y_{lag}$ are equal on average if the
background is homogeneous, and exposing the neural network to these features helps the
network learn the distributions in the cases when the background is no longer
homogeneous (due to a clutter edge or due to the presence of interfering targets, for
instance), causing $Y_{lead}$ and $Y_{lag}$ to differ significantly. As noted in Section 3.1,
these features have been used in certain modifications of the basic CA-CFAR scheme to
enhance the detection performance in nonhomogeneous background scenarios.
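The three averaging features introduced so far can be sketched as follows. This is a minimal illustration, assuming the window is stored as an array of N = 2n + 1 square-law detector outputs with the lagging cells first and the test cell at the center; the function name and the simulated amplitudes are hypothetical.

```python
import numpy as np

def window_means(y):
    """Return (mu_T, Y_lead, Y_lag) for a window of N = 2n + 1 cell outputs."""
    y = np.asarray(y, dtype=float)
    n = (y.size - 1) // 2
    mu_t = y.mean()             # all N cells, including the test cell
    y_lag = y[:n].mean()        # cells 1 .. n
    y_lead = y[n + 1:].mean()   # cells n+2 .. 2n+1
    return mu_t, y_lead, y_lag

rng = np.random.default_rng(0)
window = rng.exponential(scale=1.0, size=33)   # homogeneous exponential background
window[16] += 20.0                             # hypothetical strong target in the test cell
print(window_means(window))
```

Because the test cell is excluded from both side averages, a target in the test cell raises mu_T but leaves Y_lead and Y_lag unchanged.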
Variance of the Leading and Lagging Windows:
Target and clutter fluctuations affect the accuracy of the features discussed
so far. To represent these fluctuations, we use $\sigma^2_{lead}$ and $\sigma^2_{lag}$, which are the
variances of the leading and the lagging windows, respectively. Use of both these signals
would enable the NN-CFAR to detect deviations from Gaussian behavior
by comparing the variances. For illustration, consider the case when the average
power of the leading window is higher than that of the lagging window. This could
correspond either to a clutter edge, with each side uniform but a sharp difference in
amplitude between them, or to the presence of some interfering targets on one side
that contribute to the average power. The use of the variance on each side would help the
NN-CFAR to distinguish these situations and determine whether there is a clutter
edge or an interfering target, or whether the background is still homogeneous.
The t-statistic:
The features discussed thus far mainly reflect the background magnitude
and dispersion with respect to the target. For representing the target fluctuation
itself, one could attempt to use intelligent parameters that statistically relate the
output of the test cell $y_{n+1}$ to the returns from the reference cells. One such
parameter is the t-statistic defined by Goldstein [18]
$$t = \frac{y_{n+1} - \frac{1}{N}\sum_{i=1}^{N} y_i}{\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \frac{1}{N}\sum_{j=1}^{N} y_j\right)^2}} \tag{3.12}$$
As noted in [18], a subtraction of the maximum likelihood estimate of the
mean of $y_{n+1}$ appears in the numerator of this expression, while the denominator
performs a normalization by dividing by the maximum likelihood estimate of the
standard deviation of $y_{n+1}$. In [18], this parameter was used (more exactly, a
modified parameter "log-t" obtained by replacing the outputs $y_i$ in the expression
(3.12) by their logarithms was used) to automatically adjust the detection threshold
for maintaining false alarm regulation in log-normal clutter and in Weibull clutter.
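A minimal sketch of the t-statistic of (3.12) is given below, assuming the window is an array of N = 2n + 1 outputs with the test cell at the center. The function name is illustrative; the "log-t" variant is obtained simply by passing the logarithms of the outputs.

```python
import numpy as np

def t_statistic(y):
    """t-statistic of the center (test) cell relative to the whole window."""
    y = np.asarray(y, dtype=float)
    n = (y.size - 1) // 2
    mean = y.mean()                            # ML estimate of the mean
    std = np.sqrt(np.mean((y - mean) ** 2))    # ML estimate: divides by N, not N - 1
    return (y[n] - mean) / std

rng = np.random.default_rng(0)
window = rng.exponential(scale=1.0, size=33)
window[16] += 20.0                             # hypothetical target in the test cell
print("t     =", t_statistic(window))
print("log-t =", t_statistic(np.log(window)))
```

Note that the statistic is undefined for a perfectly constant window, since the estimated standard deviation is then zero.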
Median of Clutter Samples:
One parameter that has not been used extensively in the literature on CFAR
processing is the median of clutter samples. This parameter, however, has been
extensively used in digital image processing applications and we will use it for
NN-CFAR training. For an ordered sample set $X = \{x_1, x_2, \ldots, x_N\}$ of size N, where the
samples are arranged in increasing order of magnitude and N is odd, the median
is defined by the statistic
$$\eta = x_{\left(\frac{N+1}{2}\right)}.$$
While the mean and the median together give a good representation of the behavior
of the sample set X, the median indicates some useful statistical characteristics
not reflected in the mean. For instance, while the mean provides a measure of
the central tendency of the sample set, it can be significantly affected by extreme
values, such as the ones resulting from the presence of an interfering target with
a high interference-to-signal (I/S) ratio. Evidently, such extreme values influence
the computed average power, and the presence of interfering targets hence results in a
raising of the detection threshold if not taken care of. Some suggested modifications
in the literature, such as the Censored-ofCFAR (CO-CFAR) [23], attempt to
isolate the interfering targets and use only the remaining cells for threshold
calculation. It is easy to see how this operation further reduces the number of available
cells.
Use of the median as an input feature for NN-CFAR training has the
advantages of simple computation and of providing a better representation of the
background clutter when interfering targets are present. However, in dealing with
samples from a population such as radar samples in each scan period from an
extended background, the sample mean does not vary as much from sample to sample
as does the median. This implies that the sample mean is more stable than the
median for estimation of average clutter power and is more suitable when
consistent interference does not exist. Consequently, using both parameters one could
obtain some indication of the presence of an interfering target (i.e., if the two
parameters differ significantly then the presence of an interfering target is strongly
indicated, whereas if the two parameters are close then most probably there will
be no interfering target).
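The different sensitivity of the mean and the median to an interfering target can be illustrated with a small simulation. This is only a sketch under assumed conditions: a unit-power exponential background of 33 cells and an interferer of hypothetical amplitude 100 added to cell #4.

```python
import numpy as np

def mean_and_median(y):
    """Return the sample mean and the sample median of the window."""
    y = np.asarray(y, dtype=float)
    return y.mean(), np.median(y)

rng = np.random.default_rng(1)
clutter = rng.exponential(scale=1.0, size=33)
mean_clean, med_clean = mean_and_median(clutter)

contaminated = clutter.copy()
contaminated[3] += 100.0   # hypothetical strong interferer in cell #4
mean_int, med_int = mean_and_median(contaminated)

print(f"mean:   {mean_clean:.3f} -> {mean_int:.3f}")
print(f"median: {med_clean:.3f} -> {med_int:.3f}")
```

The mean shifts by the interferer amplitude divided by N, whereas the median moves by at most one position in the ordered sample, so a large gap between the two is a simple indicator of contamination.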
Correlation Between Leading and Lagging Cells:
Yet another useful input feature to be used in NN-CFAR training is the
correlation coefficient between the leading and the lagging portions of the
reference window. The correlation coefficient plays an important role in bivariate data
analysis problems and in the present case will help identify and parameterize any
deviations from independence of clutter data on the two sides of the test cell. Thus,
from the samples $y_1, y_2, \ldots, y_n$ from the lagging cells and $y_{n+2}, y_{n+3}, \ldots, y_{2n+1}$
from the leading cells (see Fig. 3.1), the correlation coefficient is calculated as
$$corr = \frac{\sum_{i=1}^{n} y_i\, y_{n+i+1} - \frac{1}{n} \sum_{i=1}^{n} y_i \cdot \sum_{j=1}^{n} y_{n+j+1}}{n \cdot \sigma_{lead} \cdot \sigma_{lag}}$$
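where $\sigma_{lead}$ and $\sigma_{lag}$ represent the standard deviations of the two sides and are
given by
$$\sigma^2_{lead} = \frac{1}{(n-1)} \sum_{i=n+2}^{2n+1} \left(y_i - Y_{lead}\right)^2$$
and
$$\sigma^2_{lag} = \frac{1}{(n-1)} \sum_{j=1}^{n} \left(y_j - Y_{lag}\right)^2.$$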
To illustrate the usefulness of this parameter, observe that a value of corr =
0.6 would indicate that approximately 60% of the clutter data on the two sides of
the test cell are linearly related. This in turn can suggest the existence of some
kind of correlated background, such as edge clutter. Specifically, for a reference
window of size N = 21, n = 10 reference cells are on each side of the test cell and
in the case of an edge clutter, the clutter amplitude in 6 of the leading cells might
be a scalar multiple of the average clutter amplitude in the corresponding 6 cells
on the lagging side. Since we are using a static neural network (i.e., feedforward
processing only), only the spatial correlation is considered in the computation of
corr.
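The lead/lag correlation feature can be sketched as below. The pairing of lagging cell i with leading cell n + i + 1 follows the text; the normalization by (n - 1) together with sample standard deviations is an assumption chosen so that the result coincides with the ordinary Pearson coefficient and lies in [-1, 1]. The function name is illustrative.

```python
import numpy as np

def lead_lag_correlation(y):
    """Correlation between lagging cells 1..n and leading cells n+2..2n+1."""
    y = np.asarray(y, dtype=float)
    n = (y.size - 1) // 2
    lag, lead = y[:n], y[n + 1:]
    # Cross-product term minus the product of sums scaled by 1/n.
    cross = np.sum(lag * lead) - lag.sum() * lead.sum() / n
    sigma_lag = lag.std(ddof=1)     # sample standard deviation, 1/(n-1) form
    sigma_lead = lead.std(ddof=1)
    return cross / ((n - 1) * sigma_lead * sigma_lag)   # assumed normalization

rng = np.random.default_rng(2)
window = rng.exponential(scale=1.0, size=33)
print("independent sides:", lead_lag_correlation(window))

edge = window.copy()
edge[17:] = 3.0 * edge[:16]         # leading side a scalar multiple of the lagging side
print("scaled copy      :", lead_lag_correlation(edge))
```

When the leading side is an exact scalar multiple of the lagging side the coefficient equals one, while independent sides give a value scattered around zero.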
3.2.4. Neural Network Architecture and Training
Fundamental to the employment of a neural network in any specific application
is its function approximation capability and, as mentioned in Section
3.1, several powerful analytical results [31-33] have been established to confirm
the existence of a multilayer neural network with feedforward topology and
sigmoidal nonlinear functions that provides such an approximation. However, as is
true with other function approximation procedures (such as the use of polynomials,
Fourier series and general orthogonal functions), these results do not give a
procedure for estimating the number of terms needed, i.e., the number of layers and
the number of nodes in each layer, for achieving a desired degree of approximation.
These have to be determined by trial and error for the specific problem at hand.
The basic processing element (neuron) in these function approximating
networks has an input-output characteristic which is obtained by forming a weighted
sum of the several inputs received and producing an output which is a nonlinear
function of this weighted sum, according to the relation
$$v(t) = f\left(\sum_{i=1}^{m} w_i u_i(t)\right)$$
where $u_i(\cdot): \mathbb{R} \to \mathbb{R}$, $i = 1, 2, \ldots, m$, are the inputs, $v(\cdot): \mathbb{R} \to \mathbb{R}$ is the output and
$w_i \in \mathbb{R}$, $i = 1, 2, \ldots, m$, are the weights. $f(\cdot): \mathbb{R} \to \mathbb{R}$ is an appropriately selected
nonlinear activation function that satisfies the following conditions:
(i) $x f(x) > 0$ for all $x \in \mathbb{R}$ (first and third quadrant function)
(ii) $\lim_{|x| \to \infty} f(x) = k\,\mathrm{sgn}(x)$, $k > 0$ (saturating function)
(iii) $|f(x_1)| \le |f(x_2)|$ for all $|x_1| \le |x_2|$ (nondecreasing function).
Commonly used activation functions are the sigmoid characteristics (e.g., $f(x) = \tanh(\lambda x)$ or $f(x) = (1 + e^{-x})^{-1}$).
In order to accept the nine input features described in the last section,
viz. $\{y_{n+1}, \mu_T, Y_{lead}, Y_{lag}, \sigma^2_{lead}, \sigma^2_{lag}, t, \eta, corr\}$, a network with an input layer
consisting of nine nodes was employed. Since no further preprocessing of these
input features is needed, these nodes serve only to fan out the signals. Two hidden
layers with 14 nodes in each layer were selected with the sigmoidal activation
function $f(x) = [1 + e^{-x}]^{-1}$. Since the output of the neural network represents
the decision on the presence or the absence of a target, one output node, whose
output is a linear combination of the outputs from the previous layer, was used to
complete the network architecture. A specified number $d_1$ is used at the output
node to indicate the presence of the target and the absence of the target is indicated by
another specified number $d_2$, with the decision threshold set appropriately between
$d_1$ and $d_2$ for separating the two cases (see Fig. 3.2).
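The forward pass of the 9-14-14-1 architecture described above can be sketched in a few lines. This is an illustration only: the random weight initialization, the bias terms and the function names are assumptions that the text does not specify, and the input vector is a placeholder rather than a real feature set.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_network(rng, sizes=(9, 14, 14, 1)):
    """Random weights and zero biases (both assumed) for a 9-14-14-1 network."""
    return [(rng.normal(0.0, 0.5, size=(m, k)), np.zeros(k))
            for m, k in zip(sizes[:-1], sizes[1:])]

def forward(params, features):
    a = np.asarray(features, dtype=float)
    for w, b in params[:-1]:
        a = sigmoid(a @ w + b)       # two sigmoidal hidden layers
    w, b = params[-1]
    return (a @ w + b).item()        # linear output node

def decide(output, d1, d2):
    """Declare a target when the output exceeds the midpoint between d1 and d2."""
    return output > 0.5 * (d1 + d2)

rng = np.random.default_rng(0)
params = init_network(rng)
features = rng.exponential(size=9)   # placeholder for the nine input features
output = forward(params, features)
print(output, decide(output, d1=8.0, d2=1.0))
```

An untrained network of course produces arbitrary outputs; the point is only the data flow from nine inputs through two sigmoidal layers to a single linear output that is compared with a threshold.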
Training of this network by error backpropagation was conducted following
the generalized delta rule with momentum [28] and using the decision of the
optimal detector for computing the error in each case. Data from a homogeneous
background with 33 reference cells that include samples from the background as
well as the target were used for training. The motivation for using a reference
window of size 33 during training comes from the statistical effectiveness of the
parameters used in the input feature set, as discussed earlier. Examples from
nonhomogeneous background situations were purposely not used in network training, since
our primary emphasis in this study is to compare the performance losses sustained by
NN-CFAR with those of CA-CFAR in various cases when the background deviates
from being homogeneous.
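The weight update of the generalized delta rule with momentum can be sketched as follows. The learning rate and momentum values are hypothetical, and a simple quadratic error surface stands in for the backpropagated network error; only the form of the update is the point of the example.

```python
import numpy as np

def momentum_step(weights, grad, prev_delta, lr=0.1, momentum=0.9):
    """One update: delta = -lr * grad + momentum * previous delta."""
    delta = -lr * grad + momentum * prev_delta
    return weights + delta, delta

# Minimize the stand-in error E(w) = ||w||^2, whose gradient is 2 * w.
w = np.array([4.0, -3.0])
delta = np.zeros_like(w)
for _ in range(200):
    grad = 2.0 * w
    w, delta = momentum_step(w, grad, delta, lr=0.05, momentum=0.5)
print(w)
```

Setting the momentum coefficient to zero reduces the rule to plain gradient descent, which is consistent with the later remark that the momentum term was hardly needed.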
Fig. 3.2 Neural Network Architecture (diagram labels: Input Layer, Hidden Layers, Output Layer, Threshold)
For generating the training vectors, four different levels of false alarm rates
that were two orders of magnitude apart (i.e., $P_{fa} = 10^{-2}, 10^{-4}, 10^{-6}, 10^{-8}$) were
considered. For each $P_{fa}$ value, six different levels of SNR ranging from 1 db to 20 db
were used. The target and the background were considered to follow an exponential
distribution such that they yield iid samples in each resolution cell. Target, clutter
and noise samples were generated independently and added together according to
the assumed distribution. Training vectors were generated such that there would
be 30 examples from each combination of $P_{fa}$ and SNR levels, in order to ensure
statistical independence of samples. Thus a total of 4 × 6 × 30 = 720 training vectors
were generated.
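The bookkeeping of the training set can be sketched as a simple grid. The four false alarm rates and the count of 30 examples per combination come from the text; the six SNR levels are only stated to range from 1 db to 20 db, so their even spacing here is an assumption, as are all the names.

```python
import itertools
import numpy as np

P_FA_LEVELS = (1e-2, 1e-4, 1e-6, 1e-8)
SNR_LEVELS_DB = np.linspace(1.0, 20.0, 6)   # assumed spacing of the six SNR levels
EXAMPLES_PER_COMBINATION = 30

def training_grid():
    """List one (P_fa, SNR in db, example index) tuple per training vector."""
    return [(p_fa, snr_db, k)
            for p_fa, snr_db in itertools.product(P_FA_LEVELS, SNR_LEVELS_DB)
            for k in range(EXAMPLES_PER_COMBINATION)]

grid = training_grid()
print(len(grid), "training vectors")
```

Each tuple would then be turned into one nine-element feature vector and one teacher decision from the optimum detector.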
The training examples were presented to the network in the batch processing
mode [34], i.e., by computing the error accumulated after each cycle, which is one
complete sweep of the 720 training vectors. The training was rather smooth and
there was no significant need for using the momentum term. The algorithm was
run for 800 cycles, by which point steady-state conditions were attained (i.e., the
error had reduced to appreciably small values), and the training was terminated. For
establishing a decision threshold at the output node of the network, the values
$d_1 = 8$ and $d_2 = 1$ were used and any response above $(d_1 + d_2)/2 = 4.5$ was classified as
"target present". The exact values selected for these parameters are of no particular
significance and the indicated values are only a representative selection which were
maintained in conducting performance evaluation tests that will be described in
the next section.
3.3. Robustness Evaluations of NN-CFAR
Our objective in this section is to evaluate the performance of NN-CFAR
trained as above in a variety of different scenarios and operating conditions that
deviate significantly from the conditions for which the network is trained. The
primary focus in this study is to establish the robustness of the NN-CFAR scheme
to deviations caused by SNR values outside the training range, high clutter levels
beyond those used in training, reduction in the size of the reference window, and
nonhomogeneous background conditions due to the presence of clutter edges, interfering
targets and dead cells, to which the network is not exposed during training. For
establishing a benchmark level of performance, we shall employ the basic CA-CFAR
scheme and compare the NN-CFAR performance with this in each case.
Because analytical methods are not available for evaluating
the performance of neural network-based schemes, the comparisons were done by
simulation experiments. For ensuring fairness in the comparison, in each
experiment the average of 100 different runs executed independently for different $P_{fa}$
values, different clutter distributions etc., was evaluated. These experiments have
revealed that NN-CFAR consistently provides a superior performance in each of
these scenarios. Brief descriptions of a few of the more interesting experiments will
be given in the following.
For conciseness, the results will be given either in a tabular form or in
a graphical form. The following notations will be used to describe the various
quantities:
S = Average SNR of the primary target (in the test cell);
I = Average SNR of any interfering target;
ASNR = Average of SNR in 100 independent runs (in db);
ACNR = Average of CNR (clutter-to-noise ratio) in 100 independent runs (in db);
PDCA = Percentage of detection for CA-CFAR scheme;
PDNN = Percentage of detection for NN-CFAR scheme;
% CCA = Percentage of correct classification for CA-CFAR;
% CNN = Percentage of correct classification for NN-CFAR;
Merit = A figure of merit which indicates whether NN-CFAR performed better or
worse than CA-CFAR in the particular test. A "+" sign indicates higher probability
of detection ($P_d$) together with less increase in false alarm rate ($P_{fa}$) for NN-CFAR
compared to CA-CFAR. (Evidently, any increase in $P_d$ for one detector is worthy
of notice only if $P_{fa}$ has not increased beyond that for the other detector.) Thus,
in all of the performance comparisons given here, we will be interested only in this
overall performance.
It should be emphasized that the performance variations depicted in the
tables and graphs may not appear smooth (as intuition would suggest). This is
only due to the fact that the average from a finite number (100, to be exact) of
runs, with parameter values selected by a random number generator, has been
calculated in each case.
Experiment # 1
The first experiment was directed at testing the generalization performance
of NN-CFAR, i.e., the detection performance when processing test vectors not included
in the training patterns to which the network was exposed during the training
phase.
A total of 120 test vectors, generated by a random selection of false alarm
rates in the range $10^{-6} \le P_{fa} \le 10^{-2}$, were used in this experiment. The input
feature set $\{y_{n+1}, \mu_T, Y_{lead}, Y_{lag}, \sigma^2_{lead}, \sigma^2_{lag}, t, \eta, corr\}$ was generated in each case
with the usual assumptions on the distributions and provided the test vector for
processing by NN-CFAR. More specifically, for each selection of $P_{fa}$, a random value
for the average SNR, S, in the range $1 \le S \le 20$ db was generated. A separate
random number generator was used for the clutter-to-noise ratio (CNR) and the
clutter data was transformed into exponential distribution using this average CNR
value. This was done independently for each cell.
Our objective in this experiment is to compare the detection performance of
NN-CFAR with that of the optimum detector for which the probability of detection
is computed from $P_d^{opt} = [P_{fa}]^{1/(1+S)}$, using the values for $P_{fa}$ and S. For simulating
the presence or the absence of the target and its detection with the optimum
detector, a uniform random number $P_0$ between 0 and 1 was generated and, if $P_0 \le
P_d^{opt}$, the target was declared "present" and its radar return was then generated
following an exponential distribution with the selected values of S (average SNR).
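The labeling procedure with the optimum detector as teacher can be sketched as below. Two points are assumptions not fixed by the text: the SNR drawn in db is converted to a linear ratio before it enters the formula for $P_d^{opt}$, and the false alarm rate is drawn uniformly on a logarithmic scale between $10^{-6}$ and $10^{-2}$. The function names are illustrative.

```python
import numpy as np

def db_to_linear(x_db):
    return 10.0 ** (x_db / 10.0)

def teacher_label(p_fa, snr_db, rng):
    """Return (target_present, P_d_opt) for one simulated test vector."""
    s = db_to_linear(snr_db)                  # assumed: S enters as a linear ratio
    p_d_opt = p_fa ** (1.0 / (1.0 + s))
    p0 = rng.uniform(0.0, 1.0)                # uniform random number P_0
    return p0 <= p_d_opt, p_d_opt

rng = np.random.default_rng(0)
for _ in range(5):
    p_fa = 10.0 ** rng.uniform(-6.0, -2.0)    # assumed log-uniform selection
    snr_db = rng.uniform(1.0, 20.0)
    present, p_d = teacher_label(p_fa, snr_db, rng)
    print(f"P_fa={p_fa:.1e}  S={snr_db:4.1f} db  P_d_opt={p_d:.3f}  present={present}")
```

Over many draws at fixed $P_{fa}$ and S, the fraction of vectors labeled "present" approaches $P_d^{opt}$.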
For the 120 input vectors tested, NN-CFAR produced a correct
decision 102 times (i.e., 85% correct classification), which indicates that NN-CFAR
is a very capable processor. It should be emphasized that this performance level
is only a representative one and by no means constitutes the best performance
possible with NN-CFAR. Indeed, with additional training effort (i.e., training with
more examples) and attempts to tune the architecture of the neural network, it is
possible to realize further improved performance levels. Since our primary focus in
this work is only on a proof-of-concept demonstration (that is, to demonstrate the
feasibility of a neural network-based algorithm for CFAR detection) and further to
demonstrate the robustness characteristics of this algorithm, no additional work
for optimizing the NN-CFAR performance was conducted.
The conclusion from this test, that NN-CFAR has been adequately
trained to follow the distribution of background clutter, does not come as a
surprise, since the testing conditions involved a homogeneous background and all the
necessary parameters for representation of homogeneous background are present
in the input feature set presented to the neural network. In fact, the analysis of
the CA-CFAR scheme [23,27] indicates that the specific input feature $\mu_T$ by itself is
sufficient to represent a homogeneous background. This may lead one to question
the need for the other features that NN-CFAR uses as inputs. It should be noted,
however, that when the background conditions deviate from being homogeneous
and/or other nonideal conditions (such as reduction of the reference cells etc.) are
present, the CA-CFAR scheme does not ensure the same high level of performance.
It is the robustness of the NN-CFAR in the face of such deviations from ideal
conditions that will be established in the sequence of experiments to follow.
In order to test the performance of NN-CFAR when the input feature
set was reduced in size, we also conducted several experiments by training
neural networks with smaller numbers of input nodes (specifically, 8 input nodes
and 7 input nodes while maintaining the rest of the architecture unchanged)
and by selectively dropping one or two input features from the feature set
$\{Y_{lead}, Y_{lag}, \sigma^2_{lead}, \sigma^2_{lag}, t, \eta, corr\}$. It should be noted that the return from the test
cell $y_{n+1}$ and the average power $\mu_T$ were retained as inputs in each of these
experiments. Tests conducted over a wide range of ASNR and ACNR levels and for
different $P_{fa}$ levels indicated a degradation of detection performance (reduction in
percentage of correct classification % CNN) ranging from about 1 to 4%. While
the reduced feature set may therefore not appear unattractive under the present test
conditions involving homogeneous background scenarios, the greater fault tolerance
properties induced by the additional inputs will equip the NN-CFAR to continue to
offer a high level of performance in the case of nonhomogeneous background scenarios
and when reduced window sizes are encountered.
Experiment # 2
To evaluate the performance of NN-CFAR when an interfering target is
present, in this experiment we maintained the size of the reference window at
N = 33. A Swerling-1 interfering target with I/S = 1 (i.e., average fluctuating
amplitude of the interfering target at the same level as that of the primary target)
was placed in cell #4. For a value of $P_{fa} = 10^{-5}$, the input features were generated
for different selections of S by a random number generator such that the average
of SNR over 100 independent tests, ASNR, was maintained at a certain level.
The results of processing these inputs by NN-CFAR and a comparison with the
performance of the CA-CFAR algorithm in these cases are summarized in Table 3-2a. It
must be noted that each entry of the table refers to the average of 100 independent
tests conducted with distinct SNR values selected such that the average ASNR
value is at the indicated level. The consistently better performance offered by
the NN-CFAR scheme compared to the CA-CFAR is clearly evident. This also
underscores the usefulness of the median parameter ($\eta$) included in the input feature
set, which alerts the neural network to the presence of the interfering target (note
in contrast that the CA-CFAR scheme is totally blind to this new situation which
causes a deviation from homogeneous background).
To further test the robustness of the NN-CFAR scheme, the above
experiment was repeated with values of ASNR considerably outside the range of SNR
values used in the training phase (viz., 1 db - 20 db). Despite the two mismatches
(i.e., presence of interfering target and different SNR values) from the training
conditions, NN-CFAR maintained a high level of performance, as depicted in Table
3-2b. The superior performance offered by NN-CFAR despite the fact that the
SNR values used in the test were up to 11 db out of range (i.e., beyond the training
levels) indicates that the background clutter distribution is more important for
the NN-CFAR performance than the actual magnitude of the SNR. It may also
be noted that the use of the t-statistic helps reduce the dynamic range of the
clutter.
Table 3-2a Comparisons of NN-CFAR in Experiment # 2 (Interfering Target Present)

  ASNR    PDCA    PDNN    %CCA    %CNN    MERIT
  21.7    0.53    0.63    53      65      +
  20.5    0.48    0.64    55      71      +
  20.1    0.44    0.61    47      67      +
  19.4    0.47    0.62    62      77      +
  19.1    0.51    0.53    52      58      +
  18.3    0.33    0.51    49      68      +
  17.8    0.30    0.45    40      57      +
  17.2    0.21    0.37    44      57      +
  16.7    0.23    0.39    43      60      +
  16.4    0.18    0.35    41      56      +
  15.0    0.15    0.28    50      56      +
  13.2    0.04    0.22    48      57      +
   9.7    0.01    0.05    73      75      +
Table 3-2b Performance of NN-CFAR for SNR outside the training range in Experiment # 2

  ASNR    PDCA    PDNN    %CCA    %CNN    MERIT
  31.6    0.55    0.88    54      88      +
  31.2    0.65    0.86    65      86      +
  30.5    0.55    0.77    55      78      +
  29.9    0.60    0.79    61      80      +
  29.4    0.61    0.80    61      80      +
  29.0    0.63    0.86    62      86      +
  28.8    0.50    0.80    49      80      +
  28.5    0.63    0.82    60      82      +
  28.0    0.57    0.77    55      78      +
  27.7    0.56    0.77    56      77      +
  27.3    0.56    0.81    56      81      +
  26.6    0.55    0.78    56      81      +
  25.9    0.61    0.82    60      82      +
  25.1    0.67    0.78    69      79      +
  23.7    0.63    0.82    64      85      +
  22.7    0.56    0.74    61      77      +
  20.9    0.47    0.60    46      64      +
Experiment # 3
In this experiment we examined the robustness of NN-CFAR for increasing
clutter levels. With all conditions of the test identical to that in the previous
experiment except that the interfering target is removed, the detection performance
of NN-CFAR was evaluated in comparison with that of CA-CFAR for various SNR
levels and various CNR (clutter-to-noise ratio) levels. The results of the test are
summarized in Table 3-3 (where each entry of the table once again corresponds
to the average from 100 independent runs). It is particularly noteworthy that
despite the increase of clutter average amplitude above 10 db beyond the levels
used for training, the statistical pattern of the clutter has been quite well picked
up by the NN-CFAR. This once again confirms our conclusion from the previous
experiment that the input features are well tailored to provide the information on
clutter distribution during the training process.
One particular aspect that may not be apparent from the stated results is
the role of the sigmoidal activation function in the neural network to reduce the
sensitivity of NN-CFAR to out of range signal amplitudes. The saturation at the
hidden layers for higher SNR values helps keep the performance similar to the cases
in the lower SNR ranges, and this is an attractive feature in detection schemes since
detection algorithms generally do not perform well when the receiver is saturated.
For instance, if the individual sensors in the CA-CFAR resolution cells become
saturated due to limitation of the sensor dynamic range, the CA-CFAR algorithm
would not have any inherent cure to compensate for it.
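The saturating behavior of the sigmoidal activation can be seen directly from a few evaluations. The input values below are arbitrary stand-ins for the weighted sum reaching a hidden node; they are not taken from the experiments.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical weighted sums spanning more than two orders of magnitude.
inputs = np.array([1.0, 5.0, 50.0, 500.0])
for x, y in zip(inputs, sigmoid(inputs)):
    print(f"input {x:6.1f} -> hidden node output {y:.6f}")
```

Beyond a moderate input level the node output is essentially constant, so a hundredfold increase in amplitude changes the hidden-layer response by less than one percent.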
Table 3-3 Comparison of NN-CFAR and CA-CFAR for variation in clutter level (Experiment # 3)

  ASNR    ACNR    PDCA    PDNN    %CCA    %CNN    MERIT
  30.7    30.2    0.91    0.98    92      99      +
  30.4    30.2    0.90    0.98    90      98      +
  29.9    29.7    0.90    0.98    90      98      +
  29.6    29.8    0.86    0.95    87      95      +
  29.1    29.5    0.89    0.97    90      98      +
  28.8    29.4    0.86    0.96    86      96      +
  27.9    30.0    0.87    0.96    89      97      +
  27.8    28.1    0.86    0.96    87      96      +
  27.4    26.9    0.87    0.95    87      95      +
  27.0    26.1    0.81    0.90    83      92      +
  26.3    25.9    0.84    0.89    86      89      +
  26.1    26.2    0.80    0.94    80      94      +
  25.5    25.5    0.83    0.90    86      90      +
  24.1    24.5    0.81    0.88    86      90      +
  22.8    22.5    0.86    0.93    88      93      +
  22.1    21.6    0.83    0.88    86      90      +
  20.6    20.9    0.80    0.87    85      90      +
  19.7    18.4    0.74    0.83    84      87      +
  19.4    19.4    0.73    0.77    83      83      +
  18.2    16.7    0.62    0.69    72      76      +
  17.0    16.7    0.67    0.72    82      82      +
  16.4    15.6    0.69    0.71    78      80      +
  15.6    15.7    0.73    0.74    79      79      +
  12.4    14.3    0.51    0.53    73      70      =
  11.8    11.5    0.41    0.39    63      65      =
  10.3    10.3    0.51    0.46    65      71      =
   8.0     8.8    0.31    0.26    49      61      =
   4.6     3.4    0.22    0.17    49      89      +
   0.79    0.41   0.14    0.03    36      86      -
Investigation of the activation function effects on the NN-CFAR sensitivity
to out-of-range target SNR is also of interest for another reason. In a typical
operational scenario, one may have to contend with jamming signals with SNR
beyond the range that can be tolerated by the detector. Due to the requirements
of the interface electronics (e.g., use of linear amplifiers), almost all detectors are
operated in the linear region of the signal amplifier and hence it is vital for the
detection algorithm to have the least sensitivity to signal amplitude level, particularly
at high and low levels. Fig. 3.3 depicts the variation of probability of detection ($P_d$)
against SNR for both NN-CFAR and CA-CFAR. While the superior performance
of NN-CFAR for SNR levels above 14 db is clearly evident, it should be emphasized
that its apparently lower performance for SNR levels below 14 db is misleading,
because the false alarm rate was also much higher for CA-CFAR
than for NN-CFAR in this range. This is evident from the results in Table 3-3
(the same values were used for sketching Fig. 3.3), where the percentages of correct
classification (% CCA and % CNN) are also given in addition to the probabilities
of detection (PDCA and PDNN). For example, note that for ASNR = 10.8,
PDCA = 0.51 and PDNN = 0.46 (as Fig. 3.3 depicts), whereas % CCA = 65 is below
% CNN = 71. For levels of SNR above 14 db, both PDNN and % CNN are consistently
higher than PDCA and % CCA respectively.
Fig. 3.3 Performance of NN-CFAR in Experiment # 3 (Increased clutter levels)
[Plot of probability of detection versus SNR (dB) for NN-CFAR and CA-CFAR.]
Experiment # 4
The focus in this experiment is to evaluate the robustness of NN-CFAR when
the reference window gets reduced in size. With all other conditions remaining the
same as in the previous experiment, two different tests were conducted to study
the amount of degradation in detection performance caused by the loss of reference
cells. The results obtained with a reference window of size N = 25 are summarized
in Table 3-4a for the case when the target is in the clutter region (i.e., the sample
from the test cell includes both target and clutter returns) and the average SNR and
CNR values are varied. These results are to be compared with the entries in Table 3-3,
which give the performance of NN-CFAR under identical conditions except with
N = 33. It is evident that NN-CFAR maintains a high level of detection accuracy
despite the loss of reference cells and it is only in the lower end of the table, when
the signal levels are very low, that the detection performance is compromised.
For a neural network-based scheme, this only indicates that during the training
phase the network is not exposed to an adequate number of examples reflecting
that particular range of operation, and in the present case, including more training
examples emphasizing lower signal levels will help overcome the performance loss in
this range of SNR values. Experiments were also conducted with further reduction
of the size of the reference window. For illustration, only a few of the results obtained
for N = 17 and N = 9 are given in Tables 3-4b and 3-4c * to indicate the general
trend, which is sufficient to appreciate the excellent robustness characteristics of
NN-CFAR.
* In comparing the entries of these tables it should be noted that the variation of detection
probability with respect to the average SNR is not smooth, due to the averaging over a finite
number of independent runs (viz., 100 runs) and also due to the different clutter levels used in
these runs, which change the signal-to-clutter ratio.
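The sensitivity of CA-CFAR to the reference window size can be illustrated with the textbook closed-form expressions for an idealized case. The following is only a sketch and is not the simulation used in these experiments: it assumes a homogeneous background of exponentially distributed (square-law detected) samples and a Swerling I target, and the function names and the design false alarm probability of 1e-4 are illustrative assumptions.

```python
def ca_cfar_scale(n_ref: int, pfa: float) -> float:
    """Threshold multiplier applied to the MEAN of n_ref reference cells.

    Assumes exponential (square-law) background samples, for which
    Pfa = (1 + t) ** (-n_ref); solving for t gives the expression below.
    """
    return pfa ** (-1.0 / n_ref) - 1.0


def ca_cfar_pd(n_ref: int, pfa: float, snr_db: float) -> float:
    """Detection probability of CA-CFAR for a Swerling I target (assumed model)."""
    snr = 10.0 ** (snr_db / 10.0)
    t = ca_cfar_scale(n_ref, pfa)
    return (1.0 + t / (1.0 + snr)) ** (-n_ref)


if __name__ == "__main__":
    # Window sizes taken from Experiment # 4; Pfa and SNR are illustrative.
    for n in (33, 25, 17, 9):
        print(n, ca_cfar_pd(n, 1e-4, 20.0))
```

Under these assumptions the detection probability at a fixed false alarm probability falls as N is reduced, which is the qualitative trend seen in the PDCA columns of Tables 3-4a to 3-4c.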
Table 3-4a Performance of NN-CFAR with a reduced reference window, N = 25, for different clutter levels
(Experiment # 4)

ASNR   ACNR   PDCA   PDNN   %CCA   %CNN   MERIT
30.8   31.1   .79    .94    79     94     +
30.5   30.8   .74    .95    76     95     +
29.9   28.7   .80    .94    82     94     +
29.8   30.2   .79    .94    79     94     +
29.4   28.9   .87    .96    88     97     +
28.6   28.9   .79    .94    79     94     +
28.2   28.6   .78    .93    80     94     +
27.3   27.1   .80    .95    83     98     +
26.5   25.9   .83    .95    85     96     +
24.9   23.7   .72    .91    76     93     +
23.8   23.7   .79    .90    83     91     +
22.3   22.3   .70    .84    79     87     +
21.2   20.0   .67    .86    73     87     +
18.8   18.6   .60    .74    73     80     +
16.7   14.6   .59    .67    79     78     +
11.0   10.8   .40    .40    68     63     -
3.4    4.1    .17    .03    58     81     -

Table 3-4b Performance of NN-CFAR with a reduced reference window, N = 17 (Experiment # 4)

ASNR   ACNR   PDCA   PDNN   %CCA   %CNN   MERIT
30.7   30.4   .64    .97    64     97     +
29.8   29.5   .61    .93    62     93     +
29.1   28.2   .58    .91    61     91     +
28.3   28.4   .60    .94    62     94     +
27.8   27.6   .64    .91    68     91     +
25.8   25.9   .56    .88    62     90     +
24.0   24.8   .57    .89    61     89     +
21.2   21.0   .51    .83    57     84     +
20.2   20.6   .52    .84    61     87     +
18.7   19.2   .49    .79    62     85     +
18.1   19.8   .50    .82    61     87     +
14.8   14.2   .55    .59    77     77     +
12.3   12.2   .38    .48    75     76     +
4.6    3.8    .13    .12    70     76     +

Table 3-4c Performance of NN-CFAR with a reduced reference window, N = 9 (Experiment # 4)

ASNR   ACNR   PDCA   PDNN   %CCA   %CNN   MERIT
21.7   22.4   .51    .84    61     89     +
27.5   27.9   .08    .82    11     83     +
25.4   25.7   .05    .83    11     85     +
24.9   24.5   .07    .80    14     81     +
24.3   23.9   .09    .82    14     84     +
21.6   22     .02    .75    12     80     +
The test described above assumes that the target is in the clutter region,
and in this case an increase in the probability of detection Pd occurs at the cost
of an increase in the false alarm rate. We also conducted another test with the target
in the clear region while the adjacent reference cells are from the clutter region.
In this case, the false alarm rate is reduced together with a reduction in Pd
when the number of available reference cells goes down. The results of this test for
progressively reduced values of N = 33, 25, 17 and 9 are shown in Figs. 3.4a, b, c
and d, where the performance of NN-CFAR is compared with that of CA-CFAR in
each case. It is of interest to observe that CA-CFAR completely loses its detection
capability for smaller values of N (N = 9 and smaller). The superior performance
delivered by NN-CFAR is clearly evident from these graphs.
Fig. 3.4a Comparison of NN-CFAR and CA-CFAR for a window size of N = 33
Fig. 3.4b Comparison of NN-CFAR and CA-CFAR for a window size of N = 25
Fig. 3.4c Comparison of NN-CFAR and CA-CFAR for a window size of N = 17
Fig. 3.4d Comparison of NN-CFAR and CA-CFAR for a window size of N = 9
[Each panel plots probability of detection versus SNR (dB) for NN-CFAR and CA-CFAR.]
Experiment # 5
In this experiment, we evaluate the performance of NN-CFAR for the case
of edge clutter, a situation in which the average powers in the
leading and the lagging portions of the reference window differ significantly.
Under such circumstances, if the target is in the clear region or in the region of
lower clutter levels, serious target masking could result (due to an unnecessary
raising of the threshold caused by the heavy clutter on the side of the test cell),
whereas if the target lies in the region with higher clutter level with some of the
reference cells in the clear, the false alarm rate could increase very sharply [23].
For the performance evaluation, we set the target in the clear region (which
offers the more challenging case in the presence of edge clutter) and conducted
various runs with different average SNR values and also different values of N. A few
illustrative results are summarized in Table 3-5, where the Y_lead and Y_lag
values are included to indicate the differences in the average powers used to
represent the clutter edge in these tests. In appreciating the robustness of NN-CFAR,
it should be kept in mind that the clutter edge situation was not included in the
examples used to train NN-CFAR. Thus, there are three mismatches from
the training conditions, viz., presence of a nonhomogeneous background, reduced
reference window size, and SNR values outside the training range. The superior
performance offered by NN-CFAR is hence highly noteworthy. Also, the role of the
input features Y_lead and Y_lag in signalling to the NN-CFAR that the background has
deviated from the homogeneous case with which it was trained needs particular
emphasis.
Table 3-5 NN-CFAR performance in edge clutter (Experiment # 5)

N    ASNR   Y_lead   Y_lag   PDCA   PDNN   %CCA   %CNN   MERIT
33   28.5   29.9     26.0    .75    .83    78     86     +
27   23.7   26.0     20.2    .37    .42    48     51     +
19   22.8   26.0     20.0    .27    .46    42     61     +
17   19.1   21.9     17.9    .23    .51    42     70     +
As noted in the Introduction, the GO-CFAR, which is a specialized modification
of the basic CA-CFAR scheme, has been specifically designed for handling edge
clutter situations and has been shown to provide an improved level of performance
in these situations. Hence the question may be raised why the performance of
NN-CFAR is not compared with this specialized scheme. The reasons are twofold. On
the one hand, the NN-CFAR has not been trained for any type of nonhomogeneous
background, and hence such a comparison may not lead to any valid conclusions.
On the other hand, it is well known that GO-CFAR, being a specialized design
for edge clutter, does not perform as well as the basic CA-CFAR in the other
cases (for instance, homogeneous background, presence of interfering targets, etc.)
[23], and our present interest is to establish the robustness characteristics of
NN-CFAR against a variety of diverse types of deviations in the operating conditions.
It is evident that a specialized NN-CFAR that offers the best performance in the
face of a particular type of deviation can be designed very simply by including a
large number of examples depicting that specific type of operating condition in
the neural network training set.
Experiment # 6
A case of special interest is when some of the reference cells in the clutter
region are defective on both sides of the test cell and there is no return from these
cells. Such a situation is illustrated in Fig. 3.5 and is of particular importance in
fault-tolerant detection. This also represents a highly nonhomogeneous situation
and, to the best of our knowledge, has not been addressed in the literature on CFAR
detection. It is clear that a GO-CFAR algorithm would have been appropriate
if the dead cells were on one side of the test cell only, in which case the
GO-CFAR would select the side with greater clutter power and simply ignore the dead
cells. However, when the dead cells are located on both sides of the test cell, a
more challenging situation is encountered. The performance of NN-CFAR in such
situations was evaluated for different average SNR and CNR values and also for
various window sizes. For conducting these tests, the samples in the reference cells
were generated for the specific SNR and CNR levels, and samples from four of these
cells (two on each side of the test cell) were killed by replacing them with zeros.
Some illustrative performance results from these tests are summarized in Table 3-6,
once again underscoring the superiority of the NN-CFAR scheme.
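The effect of dead cells on the conventional background estimates can be seen in a small deterministic sketch. It is an illustration only: the window length of 32 (16 cells on each side of the test cell), the 1-based cell numbering, the unit clutter power, and the function names are assumptions made here, not details taken from the experiments.

```python
def kill_cells(window, dead_cells):
    """Return a copy of the reference window with the listed cells set to zero.

    dead_cells uses 1-based positions (an assumption about the numbering).
    """
    dead = set(dead_cells)
    return [0.0 if (i + 1) in dead else x for i, x in enumerate(window)]


def ca_estimate(window):
    """Cell-averaging estimate: mean over the whole reference window."""
    return sum(window) / len(window)


def go_estimate(window):
    """Greatest-of estimate: larger of the leading and lagging half means."""
    half = len(window) // 2
    lead, lag = window[:half], window[half:]
    return max(sum(lead) / len(lead), sum(lag) / len(lag))


if __name__ == "__main__":
    clutter = [1.0] * 32                       # unit clutter power in every cell
    one_side = kill_cells(clutter, [4, 8])     # dead cells on the leading side only
    both_sides = kill_cells(clutter, [4, 8, 19, 25])
    print(go_estimate(one_side), ca_estimate(one_side))
    print(go_estimate(both_sides), ca_estimate(both_sides))
```

With dead cells on one side only, the greatest-of rule falls back on the intact half and recovers the true level, whereas with dead cells on both sides both estimates are biased low, which lowers the threshold relative to its design value.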
Table 3-6 NN-CFAR performance in Experiment # 6 (Target between two clutter patches with defective reference cells)

N    Dead Cells   ACNR   ASNR   PDCA   PDNN   %CCA   %CNN   MERIT
33   4,8,19,25    26.9   28.9   .77    .79    82     84     +
27   4,8,19,25    26.4   28.7   .79    .83    81     85     +
17   2,6,12,14    17.6   18.8   .38    .41    63     66     +
15   1,5,10,14    23.2   25.7   .55    .51    61     67     +

Fig. 3.5 Target between two clutter patches with defective reference cells (Experiment # 6)
3.4. Conclusions
The major contributions of this chapter are the development of a neural
network scheme for CFAR detection (NN-CFAR) and the establishment of its
robustness characteristics. The details of employing a neural network in this
application, specifically the selection of the input features and the network training using
these inputs, were described. The performance of the NN-CFAR scheme in a variety
of operating scenarios, some of which correspond to significant deviations from the
training conditions, was quantitatively evaluated. A comparison with the
performance expected from the CA-CFAR scheme was given in each case to establish the
superiority of the presently developed scheme.
The following major conclusions can be drawn from the performance
evaluation results of the experiments reported in this chapter. In
homogeneous background scenarios with an adequately large number of available
reference cells (typically N ≥ 33), NN-CFAR matches the performance offered
by CA-CFAR; as the size of the reference window is reduced, NN-CFAR maintains a
high level of performance, significantly better than that of CA-CFAR. In scenarios where
the background deviates from being homogeneous, as in the case of a clutter edge,
the presence of defective cells on both sides of the test cell, and interfering targets,
NN-CFAR continues to deliver superior performance over a wide range of SNR and
CNR levels. The underlying reason for this robustness is the ability of the neural
network to follow the statistical variations of the target and the clutter significantly
better than CA-CFAR, which in turn is facilitated by its capability to process more
statistical parameters as input features for target detection. This characteristic,
together with the parallel distributed architecture of the neural network, which
offers considerable benefits for hardware implementation, makes the NN-CFAR
scheme a highly attractive and viable procedure for CFAR detection.
As mentioned at various places in this chapter, the primary emphasis in
this study has been to demonstrate the robustness of NN-CFAR to deviations in
the operating scenarios from the conditions used for training the neural network
and consequently the neural network was trained for homogeneous background
cases only. Evidently, there is no reason to limit the training examples to any
specific case or a set of cases only, and by appropriate training for nonhomogeneous
background cases (edge clutter, interfering target etc.), a further increased level of
performance together with robustness to loss of reference cells and SNR variations
can be expected. Thus the NN-CFAR scheme offers the potential for integrating
the strong points of several specialized CFAR processors (such as the GO-CFAR,
SO-CFAR etc.) all in one single processor.
CHAPTER 4
NEURAL NETWORK IMPLEMENTATION OF
THE MOVING TARGET INDICATOR
4.1. Introduction
As described in Chapter 1, Moving Target Indicator (MTI) is one of the
most important functions of a high quality radar system [51-63]. One of the major
applications of MTI processing is in Air Traffic Control (ATC) systems where
several different objects are moving in the vicinity of the aircraft being tracked.
Furthermore, detection of a group of birds which may be flying near the aircraft
engine during the take off or the landing phase is of prime importance. Therefore,
in this chapter the term clutter refers to the unwanted moving targets which have
to be suppressed. Our objective is to conduct a series of detailed experiments
on the processing of radar pulses using a neural network. We will discuss how
a multilayer feedforward neural network with backpropagation learning may be
employed in order to perform the functions of the MTI processor without excessive
complexity in the receiver design. The Neural Network-based MTI (NN-MTI) is
trained through examples to integrate a series of noisy radar pulses and
provide estimates of the target radial velocity in an on-line fashion. The mapping
property of a trained neural network is utilized to extract this information from
the radar pulse amplitude distribution. The key advantages over the traditional
methods are the speed of response, ease of hardware implementation, and flexibility in
designing a variable-bandwidth doppler filter bank (which is not offered by the
classical methods).
4.2. Some Basics on MTI Designs
The principal functions of an MTI processor are to utilize the doppler
frequency shift produced by a moving target in order to: i) determine the relative
velocity of a target and ii) separate a desired moving target from undesired
stationary objects (clutter) [8,64,66]. The doppler frequency shift is related to the
radial velocity of the target by

$$f_d = \frac{1}{2\pi}\frac{d\phi}{dt} = \frac{2 v_r}{\lambda}$$

where $v_r$ is the radial velocity of the target with respect to the radar, $\lambda$ is the
wavelength of the signal, and $\phi$ is the phase shift of the signal after it hits a
moving target. The input to an MTI includes a sequence of pulses which are either
returned from the target of interest or from its neighborhood objects, i.e., clutter.
The output of the MTI is either in the form of a decision about the target (i.e.,
moving or stationary) or it can include more precise information about the velocity
of the target. Obviously, for more information about the velocity of the target, one
has to go through more complex design procedures [65,67,68]. Once the pulses are
received by an MTI, each pulse goes through a certain amount of delay in time
which is an integer multiple of the period between the pulses and then they are
integrated (i.e., added in some fashion). The result of this pulse integration is
either compared with a threshold or further processing might be needed to extract
additional information about the target velocity.
We described the mathematical representations of the input and the output
of the MTI filter in Section 2.8. As an example, equations (2.11a, b) represent the
two inputs of a simple MTI filter (i.e., a two-pulse canceler) and the corresponding
output is given by equation (2.11c). Fig. 4.1 illustrates two basic designs of an
MTI filter and their frequency responses. Fig. 4.2 further illustrates the frequency
responses for a number of MTI filters with different numbers of delays. For example,
a three-pulse canceler is one that waits until three pulses are received and uses two
delay elements (e.g., a shift register) to process the pulses. Also note that
other configurations are possible by rearranging the delay elements (e.g., in cascade).
Another type of MTI, which has been used in this dissertation for comparison with a
trained neural network, is depicted in Fig. 4.3a. This is a more general architecture
for an MTI filter, and the weights $w_i$ can be calculated through different methods.
Despite the differences in the various architectures, the basic function of an MTI
can be stated as follows:
Given a series of N pulses $p(i)$, where $i = 1, \ldots, N$, that are returned from a
moving target, what are the best weights $\{w_i\}$ that must be multiplied by these
pulses such that the weighted sum $z = \sum_{i=1}^{N} w_i\, p(i)$ gives a representation of
the target velocity (or indicates that the target is moving)?
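The weighted-sum formulation can be made concrete with a short sketch. The complex-exponential pulse model, the pulse repetition interval of 1 ms, and the function names below are illustrative assumptions; the weights (1, -1) are those of the two-pulse canceler mentioned earlier in this section.

```python
import cmath
import math


def doppler_pulses(n_pulses, fd_hz, prt_s, amplitude=1.0):
    """Idealized noise-free echoes p(i) from a target with doppler shift fd_hz.

    Each pulse is modeled as a complex sample whose phase advances by
    2*pi*fd*T from one pulse to the next (an assumed signal model).
    """
    return [amplitude * cmath.exp(2j * math.pi * fd_hz * prt_s * i)
            for i in range(n_pulses)]


def mti_output(weights, pulses):
    """Weighted sum z = sum_i w_i * p(i)."""
    return sum(w * p for w, p in zip(weights, pulses))


if __name__ == "__main__":
    prt = 1e-3                      # assumed pulse repetition interval (PRF = 1 kHz)
    two_pulse = [1.0, -1.0]         # two-pulse canceler weights
    for fd in (0.0, 250.0, 500.0):
        z = mti_output(two_pulse, doppler_pulses(2, fd, prt))
        print(fd, abs(z))
```

In this model a stationary return (zero doppler) is cancelled completely, while the output magnitude grows with doppler shift up to half the PRF, so the magnitude of z carries information about whether, and how fast, the target is moving.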
The neural network-based MTI is inspired by this definition and, as will be
discussed in the following sections, provides several advantages over the classical
architectures of MTI. The number of pulses to be processed in an MTI filter is
an important design parameter. This is in turn dictated by the Pulse Repetition
Frequency (PRF), which is the reciprocal of the period T of the pulse train.
Furthermore, as we discussed in Section 2.8.3, the doppler shift is also related to
the PRF, i.e.,

$$f_d = \frac{n}{T} = n\,\mathrm{PRF}$$

where n is an integer. It can be seen that for a higher doppler sensitivity, which in
turn corresponds to a better velocity resolution, one needs to choose the PRF as high
as possible. On the other hand, if the PRF is too high, the pulses will be closer
together, which results in ambiguities in range measurements. There are quite a
number of other reasons that constrain the choice of PRF in practice. Some of
these constraints are the following:
1) The radar has to have a certain amount of dwell time on each target and
several pulses have to be received in order to obtain a high signal-to-noise
ratio. On the other hand, the dwell time is restricted by the scanning rate
of the radar.
2) Each pulse is required to have a certain amount of power and hence is
limited by its amplitude and duration. Furthermore, the width of a pulse
determines the range resolution of a radar. The pulse width also relates to
the bandwidth of a radar, which is usually a constraint by itself.
3) The blind speeds, which are integer multiples of a certain velocity that
coincide with the MTI notches (as defined in Section 2.8.3) in the frequency
domain, put another constraint on the choice of PRF (see Fig. 4.2). In
other words, the output of the MTI is zero when the doppler shift is an
integer multiple of the PRF, i.e., $f_d = n/T = n\,\mathrm{PRF}$. Furthermore, the target
velocities resulting in zero response from the MTI are given by $v_n = n\lambda/(2T)$, where
n is an integer.
4) Any limits on the computational resources can in turn limit the data rate
that can be received by an MTI.
5) The flexibility of a transmitter in producing pulses with certain duty cycles
(i.e., $\tau/T$, where $\tau$ is the pulse width and T is the time interval between the
pulses) might be limited due to size restrictions of the radar for a particular
application.
6) Other external requirements, such as jamming (interference by a hostile
source), Electronic Counter Measures (ECM) etc., may also restrict the
number of pulses.
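The trade-off between blind speeds and range ambiguity described above can be quantified with two elementary relations: the blind speeds are the velocities whose doppler shift equals an integer multiple of the PRF, and the unambiguous range is the distance covered by a round trip within one pulse period. The sketch below is illustrative; the 10 cm wavelength, the PRF values, and the function names are assumptions chosen for the example.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def blind_speeds(wavelength_m, prf_hz, count):
    """First `count` blind speeds v_n = n * wavelength * PRF / 2, in m/s."""
    return [n * wavelength_m * prf_hz / 2.0 for n in range(1, count + 1)]


def unambiguous_range_m(prf_hz):
    """Maximum range whose echo returns before the next pulse is sent."""
    return SPEED_OF_LIGHT / (2.0 * prf_hz)


if __name__ == "__main__":
    wavelength = 0.1                     # assumed 10 cm wavelength
    for prf in (500.0, 1000.0, 4000.0):  # assumed PRF values in Hz
        first = blind_speeds(wavelength, prf, 1)[0]
        print(prf, first, unambiguous_range_m(prf) / 1000.0)
```

Raising the PRF pushes the first blind speed to a higher velocity but shortens the unambiguous range in the same proportion, which is why the PRF cannot be chosen on doppler considerations alone.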
Ideally, it is desirable that a radar designer combine several different
functions in a single radar. Such radars are called multi-function radars and are
generally more difficult to design due to several conflicting requirements. Our objective
in this chapter is to demonstrate that the MTI functions can be implemented by
a well-trained neural network that offers greater flexibility in design, which is
quite advantageous for a multi-function radar system for air surveillance and other
applications.
4.2.1. Current Approaches to MTI
Since the advent of digital computers there have been two major classes of
MTI design procedures. The first method is based on the pulse cancellation approach,
which is implemented through a linear transversal filter, the
dominant method for the implementation of linear signal processing algorithms.
Linear signal processing, however, puts severe limitations on the flexibility
of the MTI design. When an MTI is designed using a transversal filter with a fixed
number of delays, it only accounts for as many pulses as there are delay units,
and additional received pulses are essentially wasted. Linear operation in these
processors also makes it necessary to limit the dynamic range of clutter amplitude
levels, and such limiting spreads the clutter spectrum. The spread in clutter spectrum degrades
the improvement factor (i.e., the gain in signal-to-clutter ratio) of the MTI. The second method
is based on the Fast Fourier Transform (FFT). In the first method the processing
is done in the time domain, while with the FFT method the MTI functions are
implemented in the frequency domain [8,57].
Most of the current MTI techniques can be represented through appropriate
linear constant-coefficient difference equations, for which simple Z-transform
techniques are available for design purposes. In scenarios characterized by
non-Gaussian clutter distribution, however, non-uniform sampling is required to avoid
blind speed regions. A method that is generally used to resolve the ambiguity in
velocity (i.e., blind speeds) is called pulse staggering, which is another name for
using pulse series with different PRFs. This, however, limits the use of linear
transform techniques in the analysis and design of linear MTI processors. Additionally,
the desired MTI response is one with flat passbands. A flat passband indicates
that the MTI response is uniform for all target velocities that are of interest. Fig.
4.4 illustrates the response of two MTI filters with pulse staggering. However,
lack of sufficient work on the theory of non-uniform sampling limits the optimal
calculation of the MTI filter weights, and one has to sacrifice the improvement factor
(defined in Section 4.2.4) in order to obtain a flat frequency response. The MTI
weights are also called coefficients since, in a digital implementation of an MTI,
which is represented by a difference equation, the coefficients in the equation
serve as the weights of the delay elements. In the following discussions, we will
refer to MTI filter weights and coefficients interchangeably.
Computational complexity is another limiting factor in the computation of
the MTI filter coefficients. Furthermore, the number of coefficients is also limited
(since the number of coefficients is exactly equal to the number of pulses used), which
in turn limits the flexibility of design. Some recently developed MTI techniques
make use of data association and state estimation methods through linear
prediction theory [35]. Although these methods perform well in situations where
the signal-to-noise ratio (SNR) is high, they do very poorly in cases of low
SNR. In summary, the linear MTI techniques have almost reached their theoretical
limits and despite additional research in linear MTI techniques, certain problems
such as those mentioned above still remain. These problems are mostly due to the
underlying assumptions of linear processing. Therefore, it is of particular interest
to direct the MTI research towards the use of nonlinear processing techniques
provided by the use of neural networks. In the following sections, the general theory
of optimal MTI processing as well as the underlying mathematics are reviewed and
the proposed NN-MTI with its performance evaluations will be described.
4.2.2. The Radar Ambiguity Function
Performance of an MTI is very much constrained by the resolution
requirements for range and velocity. The separation of two closely spaced targets as seen
by the radar depends on the resolution of the pulses. The pulses have to be as
narrow as possible in order to have a high spatial resolution. Radar state variables of
interest are range, azimuth, and radial velocity. The radar ambiguity function is the
complex envelope of the response of a matched filter receiver to the radar
transmitted wave which has been reflected from a point target [9]. This function represents
the radar ambiguity in the time and frequency domains. The choice of PRF as well
as the width of the pulses are critical design parameters due to the conflicting
requirements of range and doppler resolutions.
As mentioned before, multiple pulses are needed in order to achieve the
required signal-to-noise ratio for the detection process. Multiple pulses also cause
a long transient response in feedforward pulse cancelers, with the result that the
MTI processing may not be able to take full advantage of the clutter correlation.
That is, by the time the filter reaches steady-state operation, the pulses may be
decorrelated from clutter. The doppler shift ($f_d$) must also be compared with the
reciprocal of the pulse width ($\tau$), which is the primary factor for the range
resolution. If $f_d > 1/\tau$, the
doppler signal may easily be distinguished from a single pulse. However, if $f_d < 1/\tau$,
the pulses will be modulated in amplitude and many pulses are needed to extract the
doppler (see Fig. 4.5) [9,71].
The weights of an MTI filter are optimized according to some a priori
assumptions made about the doppler frequency. To separate noise and clutter from
the moving target, particular points in the doppler frequency domain have to be
removed without narrowing the MTI bandwidth too much. The magnitude
response of the MTI looks like that of a comb filter, which has stopbands at regions of
heavy clutter and passbands, with minimal attenuation, in regions where the target
doppler spectrum is expected. Linear MTI performance degrades even with a
small degree of skewness in noise or clutter from the assumed Gaussian
distribution. Due to this shortcoming, adaptive schemes are needed that use spectral
estimation of the clutter before cancellation is performed.
4.2.3. Transversal Filters
An MTI with a transversal filter has a frequency response proportional to
$\sin^n(\pi f_d T)$, where n is the number of delay lines used. The corresponding weights
are given [8] by

$$w_i = (-1)^{i-1}\,\frac{n!}{(n-i+1)!\,(i-1)!}, \qquad i = 1, 2, \ldots, n+1,$$

which are the binomial coefficients (weights) with alternating signs. The average ratio
$I = (S/C)_{out}/(S/C)_{in}$, which is the ratio of the signal-to-clutter ratio at the output
to that at the input, is defined as the MTI improvement factor. The improvement factor
provides a measure of performance for MTI systems; it is independent of the target
velocity and depends only on the weights $w_i$, the clutter autocorrelation function, and
the number of processed pulses. In general, there is only a small difference (less than 2 dB) in
improvement factor between the optimum weights and the binomial weights [8].
Delay line cancelers with amplitude responses of the form $\sin^n(\pi f_d T)$
are optimum in the sense that they approximately
maximize the average clutter attenuation and the probability of target detection at the
midband doppler frequency and its harmonics. Narrowing the passband too much
reduces the number of detectable targets. Furthermore, as more delay lines are
used, the notches at dc and the PRF harmonics become too broad, which limits the
passband.
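The binomial weights and the resulting $\sin^n$ response can be checked numerically. The sketch below generates the alternating-sign binomial weights for n delay lines and evaluates the magnitude response of the corresponding canceler, which equals $(2|\sin(\pi f_d T)|)^n$; the function names and the 1 ms pulse interval are illustrative assumptions.

```python
import cmath
import math


def binomial_weights(n_delays):
    """Alternating-sign binomial weights of an (n_delays + 1)-pulse canceler."""
    return [(-1) ** i * math.comb(n_delays, i) for i in range(n_delays + 1)]


def canceler_response(weights, fd_hz, prt_s):
    """Magnitude of the canceler output for a unit doppler tone at fd_hz."""
    return abs(sum(w * cmath.exp(-2j * math.pi * fd_hz * prt_s * i)
                   for i, w in enumerate(weights)))


if __name__ == "__main__":
    prt = 1e-3  # assumed pulse repetition interval
    for n in (1, 2, 3):
        w = binomial_weights(n)
        direct = canceler_response(w, 300.0, prt)
        closed = (2.0 * abs(math.sin(math.pi * 300.0 * prt))) ** n
        print(n, w, direct, closed)
```

Because the response is the n-th power of a single sine, adding delay lines deepens and widens the notches at dc and at multiples of the PRF, which is the passband narrowing noted above.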
A transversal filter with N outputs can be used to form a bank of frequency
filters covering the range from dc to the maximum desired PRF. Define the weights applied
to the outputs of the N taps as

$$w_{ik} = e^{-j[2\pi (i-1)k/N]}, \qquad i = 1, 2, \ldots, N, \quad k = 0, \ldots, N-1,$$

where each value of k corresponds to a different set of N weights and a different
doppler filter response. The impulse response of the corresponding transversal
filter is then given by

$$h_k(t) = \sum_{i=1}^{N} \delta[t - (i-1)T]\, e^{-j2\pi(i-1)k/N}$$

and the corresponding Fourier transform is

$$H_k(f) = e^{-j2\pi f t} \sum_{i=1}^{N} e^{j2\pi(i-1)[fT - k/N]}.$$

Hence,

$$|H_k(f)| = \left|\sum_{i=1}^{N} e^{j2\pi(i-1)[fT - k/N]}\right| = \left|\frac{\sin[\pi N(fT - k/N)]}{\sin[\pi(fT - k/N)]}\right|.$$

It can be seen that, for k = 0, the peak response of the filter occurs at f = 0, 1/T, 2/T, etc.
This kind of filter bank leads to a coherent integration and good SNR performance.
We argue that a neural network will be far more efficient in coherent integration
because even in a staggered PRF situation, which is needed to mitigate
blind speeds, the peaks of the doppler filter banks and their bandwidths can
be shaped through training. In contrast, the linear doppler filter bank implemented
with the FFT and transversal digital filters can only provide a uniform set of doppler
filters with equal bandwidths and fixed nulls.
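The uniform spacing and fixed nulls of such a filter bank can be verified numerically. The sketch below applies the weights $e^{-j2\pi(i-1)k/N}$ to a unit complex tone sampled at the N taps and compares the magnitude of the sum with the closed-form ratio of sines; the function names and the choice N = 8 are illustrative assumptions.

```python
import cmath
import math


def bank_response(n_taps, k, f_times_t):
    """|sum_i w_ik * exp(j*2*pi*(i-1)*f*T)| for filter k of an N-tap bank."""
    total = 0j
    for i in range(n_taps):          # i here plays the role of (i - 1) in the text
        weight = cmath.exp(-2j * math.pi * i * k / n_taps)
        total += weight * cmath.exp(2j * math.pi * i * f_times_t)
    return abs(total)


def closed_form(n_taps, k, f_times_t):
    """|sin(pi*N*(fT - k/N)) / sin(pi*(fT - k/N))|, with the peak value N at the pole."""
    x = math.pi * (f_times_t - k / n_taps)
    denom = math.sin(x)
    if abs(denom) < 1e-12:
        return float(n_taps)
    return abs(math.sin(n_taps * x) / denom)


if __name__ == "__main__":
    n = 8
    for k in range(n):
        # Response of filter k at its own center and at the next filter's center.
        print(k, bank_response(n, k, k / n), bank_response(n, k, (k + 1) / n))
```

Each filter peaks with gain N at fT = k/N and has a null at the center of every other filter, so the centers and nulls are fixed by N and T alone, which is the rigidity the text contrasts with a trained network.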
4.2.4. The MTI Improvement Factor
As defined earlier, the MTI Improvement Factor is the signal-to-clutter ratio
at the output of the MTI system divided by the signal-to-clutter ratio at the input,
averaged over all target radial velocities of interest. This is given by

$$I_c = (S_o/C_o)/(S_i/C_i)$$

where S and C represent the signal and clutter power, respectively. This equation
can be rearranged as

$$I_c = (S_o/S_i)(C_i/C_o)$$

where $(S_o/S_i)$ is called the MTI gain, which is the ratio of the average signal at the
output of the MTI to the average signal at the input. The term $(C_i/C_o)$ represents
the clutter attenuation factor. This factor is independent of the target velocity
and depends only on the MTI weights, the power spectrum of the clutter,
and the number of pulses used in the process. The clutter power may well
overshadow the target (e.g., it may be 10,000 times stronger than the target power).
The improvement factor in terms of the covariance matrices and the
complex weight vector is given by

$$I_f = \frac{W^T M_s W^*}{W^T M_c W^*}$$

where W represents the vector of complex weights and $M_s$ and $M_c$ are the signal
and interference covariance matrices, respectively. We can calculate the
improvement factor of a coherent MTI using the equation [9]

$$I_c = \frac{\sum_{j=0}^{n-1} w_j^2}{\sum_{j=0}^{n-1}\sum_{k=0}^{n-1} w_j w_k\, \rho_c(j-k)}$$

where the numerator is the MTI gain, $w_j$ denotes the jth MTI weight,
$\rho_c$ is the clutter correlation coefficient, and n is the number of pulses processed by
the MTI. The clutter correlation coefficient for a Gaussian clutter density is given
by
where T is the pulse interval period and (j~ is the variance of the clutter distribution.
Therefore, the improvement factor depends on the MTI weights and hence the
optimum weights that maximize the improvement factor can be calculated. We may
also use binomial weights as mentioned before since the difference in performance
is only on the order of 2 dB.
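As a rough numerical illustration of the improvement-factor expression above, the following sketch evaluates it for binomial weights. It assumes the commonly used Gaussian form $\rho_c(j-k) = \exp[-2\pi^2\sigma_c^2 (j-k)^2 T^2]$ for the clutter correlation coefficient, which is an assumption here rather than a formula taken from this text; the function names and the numerical values of `sigma_c` and `T` are hypothetical.

```python
import math

def binomial_weights(n):
    """Alternating-sign binomial weights, e.g. n=3 gives [1, -2, 1]."""
    return [(-1) ** j * math.comb(n - 1, j) for j in range(n)]

def rho_c(lag, sigma_c, T):
    """ASSUMED Gaussian clutter correlation: exp(-2 * (pi * sigma_c * lag * T)^2)."""
    return math.exp(-2.0 * (math.pi * sigma_c * lag * T) ** 2)

def improvement_factor(weights, sigma_c, T):
    """I_c = sum_j w_j^2 / sum_j sum_k w_j w_k rho_c(j - k)."""
    gain = sum(w * w for w in weights)
    residue = sum(wj * wk * rho_c(j - k, sigma_c, T)
                  for j, wj in enumerate(weights)
                  for k, wk in enumerate(weights))
    return gain / residue

def to_db(ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(ratio)

if __name__ == "__main__":
    sigma_c, T = 10.0, 1e-3  # hypothetical: 10 Hz clutter spread, 1 ms pulse interval
    for n in (2, 3, 4):
        w = binomial_weights(n)
        print(n, w, round(to_db(improvement_factor(w, sigma_c, T)), 1), "dB")
```

For the two-pulse case with weights [1, -1], the expression reduces to $1/(1-\rho_c(1))$, which provides a simple check on the double sum.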
4.2.5. The Optimum MTI Processing Theory
The MTI filter, as discussed thus far, provides a comb filter that ideally has
a flat passband in the expected target regions and stopbands in the heavy clutter
regions. As mentioned before, the MTI is effective only in improving the signal-to-clutter
ratio and has no capability in improving the system signal-to-noise ratio.
Therefore, radar detection theory is used only for single pulse MTI analysis. For the
SNR enhancement, an appropriate integration is needed following the MTI. This
integration can be implemented by coherent or incoherent integration methods.
One way to do a coherent integration is to use an FFT algorithm to form a bank
of filters [9,67]. The whole system of the MTI followed by an integrator acts as a
matched filter which is matched to the target spectrum. This is the case for uniform
doppler frequencies (i.e., assuming that the targets are distributed uniformly across
the doppler frequency band). An alternative method to achieve integration is to
cascade MTI with an incoherent integrator. The incoherent integration, however,
causes a detection loss due to the fact that the MTI receiver noise is correlated
incoherently. This loss increases as more pulses are integrated.
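The coherent integration step described above can be sketched as a discrete Fourier transform across the pulse sequence of one range cell. This is only a minimal illustration, assuming complex (I/Q) pulse samples and a constant-velocity target; the function names and the chosen PRF are hypothetical, and the MTI canceler stage that would precede the filter bank is omitted.

```python
import cmath
import math

def coherent_pulses(n, doppler_hz, prf_hz, amplitude=1.0):
    """Complex (I/Q) samples of a constant-velocity target, one per pulse."""
    return [amplitude * cmath.exp(2j * math.pi * doppler_hz * m / prf_hz)
            for m in range(n)]

def doppler_filter_bank(pulses):
    """Magnitude at the output of each of the n DFT doppler filters."""
    n = len(pulses)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * m / n)
                    for m, x in enumerate(pulses)))
            for k in range(n)]

def filter_centers(n, prf_hz):
    """Center frequencies of the n filters; the spacing is uniform (prf_hz / n)."""
    return [k * prf_hz / n for k in range(n)]

if __name__ == "__main__":
    bank = doppler_filter_bank(coherent_pulses(8, 250.0, 1000.0))
    peak = max(range(len(bank)), key=bank.__getitem__)
    print("peak filter:", peak, "center (Hz):", filter_centers(8, 1000.0)[peak])
```

With n pulses the filters are spaced uniformly at PRF/n, which is the uniform-bandwidth property of the FFT filter bank referred to in this chapter.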
4.3. Why Neural Network For Implementation of MTI (NN-MTI)?
The linear representation of time series, in general, is more constrained
than the regression capability provided by the neural networks. In almost any filter
design problem, including the MTI, the main idea is to shape the filter response for
any arbitrary condition without long transient responses. Other polynomial filters,
such as a Chebyshev filter, can also be used. However, this will cause ripples in
the passband and still a large number of delay lines are needed for a highly shaped
filter response. For example, Chebyshev design results in a wider passband but
only at the cost of a lower improvement factor. If only a few pulses are available,
the MTI response is very hard to shape. This is another issue that
we will address in the NN-MTI design.
Nonrecursive transversal filters provide N zeros for synthesizing the MTI
response. However, as mentioned earlier, this requires a large number of zeros for
a highly shaped filter. An alternative design is to make use of recursive filters.
The presence of feedback loops, however, causes a very poor oscillatory transient
response. The additional degree of freedom in a recursive transversal filter is due
to more connectivity among the delay units. A multilayer feedforward architecture
offers the required larger connectivity without needing the feedback loops and
there is no transient response for NN-MTI because everything is done in parallel.
Hence, a desirable steady-state response can be readily achieved with NN-MTI.
Also the poor transient response due to the presence of feedback loops in recursive
transversal filters results in a severe ringing when large clutter returns are received,
which effectively act like a step input to the MTI. Ringing is undesirable because
it causes a masking of the target signal until the transient response fades away.
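The contrast between the two filter types can be illustrated with their responses to a step input, which stands in for a sudden large clutter return. The sketch below is a toy example: the single feedback coefficient `k` and its value are hypothetical and do not represent any particular recursive MTI design.

```python
def fir_canceler(x):
    """Nonrecursive two-pulse canceler: y[n] = x[n] - x[n-1]."""
    y, prev = [], 0.0
    for s in x:
        y.append(s - prev)
        prev = s
    return y

def recursive_canceler(x, k):
    """Canceler with one feedback loop: y[n] = x[n] - x[n-1] + k * y[n-1]."""
    y, prev_x, prev_y = [], 0.0, 0.0
    for s in x:
        out = s - prev_x + k * prev_y
        y.append(out)
        prev_x, prev_y = s, out
    return y

if __name__ == "__main__":
    step = [1.0] * 12  # step input standing in for a strong clutter return
    print("nonrecursive:", fir_canceler(step))
    print("recursive   :", [round(v, 3) for v in recursive_canceler(step, -0.9)])
```

The nonrecursive output settles after one sample, whereas the recursive output alternates in sign and decays only as $k^n$, so a target return arriving during that interval would be masked.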
Digital MTI designs have another major limitation which is the restriction
on the dynamic range imposed by the analog to digital (A/D) converter. The A/D
converter must operate at a speed high enough to preserve the information content
of the radar signal and the number of bits into which it quantizes the signal must
be sufficient for the precision required. The number of bits in the A/D converter
determines the maximum improvement factor that the MTI radar can achieve
[8,61]. A limiter is generally used to make sure that the A/D converter covers
the peak excursion of the detector output. Therefore, the practical constraints on
the speed and dynamic range of A/D converter pose major drawbacks in a digital
implementation of the MTI processors. This problem does not exist in a neural
network implementation of MTI simply because the nonlinear transfer function of
the hidden layers will do the required normalization through training.
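To give a feel for how the number of A/D bits bounds the improvement factor, the sketch below uses a textbook approximation for the quantization-noise limit, $I \approx 20\log_{10}[(2^N - 1)\sqrt{0.75}]$ dB. This expression is not stated in this text and is assumed here only for illustration; the function name is hypothetical.

```python
import math

def quantization_limit_db(bits):
    """ASSUMED textbook approximation: 20*log10((2**bits - 1) * sqrt(0.75)) in dB."""
    return 20.0 * math.log10((2 ** bits - 1) * math.sqrt(0.75))

if __name__ == "__main__":
    for bits in (8, 10, 12):
        print(bits, "bits:", round(quantization_limit_db(bits), 1), "dB")
```

Under this approximation each additional bit raises the limit by roughly 6 dB.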
The residual clutter can compound the incoherent integration which totally
destroys all improvement produced by integration. We will show that the nonlinear
processing capability of a neural network can successfully combine the doppler
processing and integration performed by MTI and its coprocessor (i.e., integrator).
This means that the MTI implementation with this method reduces the need for
an effective number of independent pulses. There are several other conflicting
requirements for the optimum MTI design where the algorithmic procedures may
not be as efficient. In summary, shaping the magnitude frequency response of MTI
demands the flexibility offered by the neural network.
The connectivity of the processor elements in the neural network creates a
number of different weight combinations that can be optimized based on a certain
number of pulses and interference distribution. In NN-MTI, the weights are optimized
by feeding back the error in an iterative process until a desired performance is
achieved. The feedback feature is peculiar to the proposed neural network architecture
(i.e., multilayer feedforward with backpropagation learning). The steady-state
response starts after the training is completed, therefore the ringing effect and poor
transient response observed in recursive delay-line filters are not of any concern in
a neural network implementation of MTI. The weights are optimized subject to the
condition that the signal-to-clutter ratio be maximized. The linear MTI is only
optimized for one kind of distribution, mainly Gaussian, while the neural network
has much more memory and can be optimized for several different distributions,
not just one. It is therefore possible to add more features to result in a single and
more compact design by utilizing the flexibility offered by the neural network.
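The iterative weight optimization described above can be sketched as one backpropagation update on a small multilayer feedforward network. This is a minimal sketch under stated assumptions, not the implementation used in this work: tanh is assumed as the two-sided activation, the learning rate and initial weight range are hypothetical, and the momentum term often included in the generalized delta rule is omitted.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """One weight row per unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(x, w_hidden, w_out):
    """Forward pass with tanh units (outputs bounded by [-1, +1])."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    y = [math.tanh(sum(w * v for w, v in zip(row, h + [1.0]))) for row in w_out]
    return h, y

def train_step(x, target, w_hidden, w_out, lr=0.05):
    """One backpropagation update; returns the squared error before the update."""
    h, y = forward(x, w_hidden, w_out)
    # Output deltas: error times the tanh derivative (1 - y^2).
    d_out = [(t - o) * (1.0 - o * o) for t, o in zip(target, y)]
    # Hidden deltas: output deltas fed back through the output weights.
    d_hid = [(1.0 - hj * hj) * sum(d_out[k] * w_out[k][j] for k in range(len(w_out)))
             for j, hj in enumerate(h)]
    for k, row in enumerate(w_out):
        for j, v in enumerate(h + [1.0]):
            row[j] += lr * d_out[k] * v
    for j, row in enumerate(w_hidden):
        for i, v in enumerate(x + [1.0]):
            row[i] += lr * d_hid[j] * v
    return sum((t - o) ** 2 for t, o in zip(target, y))

if __name__ == "__main__":
    rng = random.Random(0)
    w_h, w_o = init_layer(5, 10, rng), init_layer(10, 1, rng)
    pulses, desired = [1.0, 0.6, -0.3, -0.9, -0.8], [0.5]  # hypothetical sample
    errors = [train_step(pulses, desired, w_h, w_o) for _ in range(500)]
    print("first error:", errors[0], "last error:", errors[-1])
```

Once training stops, only `forward` is used, so the feedback path exists during learning but not during operation.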
Traditionally, as we discussed in the previous chapter, for the detection of
a stationary target one needs to integrate the amplitude fluctuations of the radar
pulse sequence and map the information into the correct decision about the target
with regard to its absence or presence. We demonstrated in Chapter 3 that the
neural network mapping property provides an efficient methodology for integrating
several different parameters to well identify the presence of the target in clutter
with robustness to the loss of resolution cells. In the light of the same concept
one may rely on the NN-MTI to do the required processing with fewer number
of pulses. Radar pulse series can be either random in phase (i.e., each pulse has
a different phase angle) or they can be generated with the same starting point
(i.e., phase angle) for each pulse. Furthermore, the pulse series may have a defined
phase modulation to enhance the detection process. The random phase pulses are
referred to as incoherent pulses whereas the in-phase or phase modulated pulses
are called coherent pulses. Coherent pulses can be utilized to extract information
about the target motion based on the doppler shift. This is in turn reflected by
the rate of change of the phase angles (i.e., frequency shift) with respect to the
reference wave.
The underlying mechanisms in the NN-MTI and MTI are totally different.
The NN-MTI outputs represent information other than the residual clutter as well
as the processed signal. The output of the NN-MTI is the decision and declaration
about targets, not the signal itself. In other words, whereas the output of an MTI
processor gives a map of the target and the background which has reduced clutter,
the NN-MTI helps in the final decision about the separation of the target from
clutter and makes no mapping of the clutter itself. This is due to the nature of
neural network processing that a residual signal at its output does not have any
meaning other than what is interpreted. Therefore, the NN-MTI that we are going
to introduce in this chapter is designed only for separating the moving targets
from clutter as well as providing the target radial velocity. Furthermore, this will
be done without the need for the Fast Fourier Transform (FFT) methods and other
hardware complexities such as precise timing of the pulses.
As mentioned before, clutter peaks occur in the MTI input which may not fit
in the dynamic range of the MTI processor, which is in turn due to the limitation
of the A/D devices. The nonlinear operation of the limiter creates additional
harmonics of clutter, which cause a spread in the clutter spectrum and result in
a reduction of the improvement factor. Distortion of the clutter statistics due to
the A/D limiting causes a deviation of the MTI weights from their optimal values.
Therefore, a nonlinear processor that can optimize its weights from the beginning
based on this nonlinear effect will definitely provide a greater improvement. This
feature is provided by NN-MTI since the first hidden layer will always normalize
the input vector accordingly and each sample is normalized in parallel so that the
clutter variation may have a far greater dynamic range.
From the neural network standpoint, the function of MTI processing is to
calculate the weighted combination of the pulse amplitudes in order to demodulate
the effect of doppler on the corresponding pulse sequence. To perform this
task, many different methods have been proposed for calculation of these weights
[9,68]. We have already discussed the deficiencies of the existing methods as
well as the flexibilities offered by the neural network architectures. Since a neural
network provides a tool for the calculation of these weights through training,
one can concentrate more on shaping the frequency response of the MTI filter as
well as more efficient presentation of the pulse sequence to the processor. In other
words, the use of other parameters as an aid to better code and decode the pulse
sequence becomes possible. This is a very important feature for the neural network
method of implementation of MTI, since there may not be as much restriction on
the waveform design and coding of the pulses.
To summarize, with the algorithmic methods and linear transversal filter
implementation techniques, the flexibility of design is much more limited as discussed
in this section. Furthermore, one can think of the neural network as a more
general Fourier transform method which has been one of the major tools for the
analysis of radar signals. The Fourier transform, which is a linear transformation
based on eigenfunction expansion of the signal, has only two degrees of freedom,
i.e., the amplitude and frequency of each harmonic. On the other hand, the neural
network activation functions can be of different forms for each layer with different
sensitivities as well as different learning parameters. The eigenfunctions for
the Fourier transform are of periodic nature which serves as a major drawback of
this technique in several radar processing problems such as those characterized by
the presence of the blind range or blind speed zones. Moreover, the analysis of
non-periodic pulse sequences (i.e., non-uniform sampling) with Fourier method is a
formidable task. Therefore, exploration of new implementation techniques through
the employment of nonlinear transform methods, such as that provided by a neural
network, holds out particular attraction in handling the problems mentioned in this
section.
4.4. Neural Network Architectures of the MTI
In this section we shall investigate several different design structures for
the MTI and analyze their performance compared to the classical Pulse Canceler
method. As will be seen in the following discussions, the NN-MTI has far more
flexibilities than the pulse cancellation method which is implemented by a linear
transversal filter. We will move in a step by step manner through a sequence of
simulation studies towards the goal of obtaining better design procedures for MTI.
4.4.1. NN-MTI Doppler Shift Extraction From Pulse Series
The fundamental problem to study in the NN-MTI design is to see how well
a neural network extracts the doppler shift from a series of pulses in the absence of
clutter. In the following experiments we will study this property through training
of several different neural networks.
Experiment # 1
For this purpose (i.e., doppler processing by neural networks), we generated
a series of 5 pulses with a peak amplitude of 100 from a moving target with a radial
velocity range of [0 m/s, 75 m/s]. A 3-layer neural network with 5 input nodes, 10
hidden nodes, and 1 output node was trained with 300 training vectors which were
generated as follows:
1) A velocity step size of 5 m/s was used. As will be discussed in the following
experiments, the choice of the step size for the velocity increments in
generating the training data has a significant effect on the robustness of the
resulting NN-MTI scheme.
2) A doppler sensitivity of 6.7 Hz/m/s was used, which is the sensitivity for
a 1 GHz carrier frequency. This quantity was chosen based on the radial
velocity range to avoid blind speed zones in the training data, since the neural
network will be misled by an inconsistency in the training set.
3) Uniform pulse intervals were used.
4) The output was divided into three different classes of doppler shifts, i.e.,
(0 Hz, 120 Hz), (120 Hz, 300 Hz), and (300 Hz, 500 Hz).
5) Generalized Delta Rule was used for the weight adjustments.
6) The activation function was the two sided sigmoid function, bounded by
[-1,+1].
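The data-generation steps listed above can be sketched as follows. The exact pulse model and the PRF are not given in the text, so the cosine amplitude modulation and the value of `PRF_HZ` below are assumptions made purely for illustration, as are all function names. The sketch produces one vector per velocity level and does not attempt to reproduce the 300 training vectors used in the experiment.

```python
import math

DOPPLER_SENSITIVITY = 6.7  # Hz per m/s, from the text (1 GHz carrier)
PEAK_AMPLITUDE = 100.0     # from the text
N_PULSES = 5               # from the text
PRF_HZ = 1200.0            # hypothetical; the PRF is not stated in the source

# (upper doppler edge in Hz, desired output code); 30 = slow, 20 = medium, 10 = fast
CLASSES = [(120.0, 30.0), (300.0, 20.0), (500.0, 10.0)]

def pulse_amplitudes(velocity):
    """ASSUMED model: the i-th pulse amplitude is A * cos(2*pi*f_d*i/PRF)."""
    fd = DOPPLER_SENSITIVITY * velocity
    return [PEAK_AMPLITUDE * math.cos(2.0 * math.pi * fd * i / PRF_HZ)
            for i in range(N_PULSES)]

def desired_output(velocity):
    """Class code of the doppler band that contains the target's doppler shift."""
    fd = DOPPLER_SENSITIVITY * velocity
    for upper, code in CLASSES:
        if fd <= upper:
            return code
    return CLASSES[-1][1]  # shifts just above the last edge fall in the fast class

def normalize(pulses):
    """Scale by the maximum magnitude in the sequence, as done during training."""
    peak = max(abs(p) for p in pulses)
    return [p / peak for p in pulses]

def training_set(step=5.0, v_max=75.0):
    """One (input vector, desired output) pair per velocity level."""
    velocities = [i * step for i in range(int(v_max / step) + 1)]
    return [(normalize(pulse_amplitudes(v)), desired_output(v)) for v in velocities]

if __name__ == "__main__":
    for inputs, code in training_set()[:4]:
        print([round(p, 2) for p in inputs], code)
```

Calling `training_set(step=1.0)` gives the finer velocity grid whose effect on robustness is examined in the later experiments.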
Note that this resembles a bank of three doppler filters with non-uniform
bandwidths! The Fast Fourier Transform (FFT) method cannot generate non-uniform
bandwidths for the doppler filter bank. The reason for this is the periodic
nature of the FFT. That is, for each range cell of the radar, one can generate N
uniform (and only uniform) bandwidth doppler filters with the FFT algorithm,
where N depends on the number of pulses as well as the duration of each pulse
[8,9]. Similarly, the Pulse Canceler methods only provide a wideband MTI filter
with a non-flat response. The non-flat response is due to the small number of
pulses. The Pulse Canceler method cannot separate the velocity ranges which is a
drawback in the case of multiple target situations.
In the following tables, $P(i)$ denotes the peak amplitude of the ith pulse
which is in the range [-100,100]. Note that during the training phase, the pulse
amplitudes are normalized with respect to the maximum magnitude of the pulses
in the sequence. The quantity termed "desired solution" refers to the desired value
that has been used for training the neural network. For instance, the numbers
30.0, 20.0, and 10.0 in Table 4-1 and the other tables represent certain classes of
target velocities selected (e.g., 30.0 stands for a slow target and 10.0 stands for a
fast target). The last column denoted "NN solution" indicates the response of the
trained neural network in each scenario to a set of test samples that were different
from the samples used during the training phase. For brevity, only a few tests (out
of 100 conducted tests) are shown in these tables. The rate of correct classification
and the average velocity error that was incorporated for testing the trained neural
network are indicated at the top of each table. In order to generate the test data, we
selected 100 different samples of target velocities and we added a uniform random
number to each velocity. Furthermore, this average error (i.e., mean of the uniform
random numbers) was either equal to or greater than the velocity step size used for
training the neural networks. Then we simulated the doppler shifts corresponding
to these random target velocities which resulted in new pulse amplitudes other
than those which were used in training. In describing the results, we will also refer
to the test samples as "test vectors" which denote the group of pulses that were
generated as such for testing and evaluation.
Table 4-1 illustrates the response of the NN-MTI for the test data after the
training was completed in Experiment # 1. The percentage of correct classification
was 89%. Note that the neural network solution (i.e., output) in this case
is compared with a threshold which is defined as the average of the numbers
representing two adjacent classes. Hence, the number 29.9 in Table 4-1 is compared
with (30.0+20.0)/2 which is 25.0. Now since the number 29.9 is greater than 25.0,
we classify it as a slow target which means that the neural network response was
correct for the test vector in the corresponding row (i.e., for the specified values of
$P(1), P(2), \ldots, P(5)$).
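The threshold rule just described amounts to assigning each network output to the nearest class code, since the thresholds are the midpoints between adjacent codes. A minimal sketch, with hypothetical names:

```python
CLASS_LABELS = {30.0: "slow", 20.0: "medium", 10.0: "fast"}

def classify(nn_output, codes=(30.0, 20.0, 10.0)):
    """Return the class code nearest to the network output.

    Equivalent to comparing with the midpoint thresholds 25.0 and 15.0.
    """
    return min(codes, key=lambda code: abs(nn_output - code))

if __name__ == "__main__":
    for value in (29.9, 23.6, 13.8, 7.8):
        code = classify(value)
        print(value, "->", code, CLASS_LABELS[code])
```

An output of 29.9 lies above the threshold of 25.0 and is labeled slow, whereas an output of 13.8 for a slow target falls below both thresholds and is counted as an error.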
Table 4-2 shows the performance for a set of 10 test vectors which were
generated with an average error of 6 m/s in the radial velocities of the target. The
percentage of correct classification in this case reduced to 76%. This indicates
that teaching the NN-MTI by direct use of pulse amplitude distribution is not
sufficient with a step size of 5 m/s for velocity increments in the training vectors
if more noise immunity is desired. In other words, the trained neural network
has learned to generalize as long as the average noise in velocity is less than or
equal to the increments used in training. For example, the number 13.8 in the
first row of Table 4-2 which denotes the neural network (NN) solution in this case
is well below the threshold value (i.e., 25.0) that was mentioned above. Therefore,
it results in a false classification of the target velocity. On the other hand, the
numbers 29.6, 18.1, 19.1, 18.2, 19.1, 23.6, 7.8, and 9.2 represent correct classification
since they are close enough to the correct number which represents that class (i.e.,
30.0, 20.0, 10.0 referring to slow, medium, and fast). It may be noted that using
the neural network not only do we get an indication of target movement but also
we can further classify the speed range of the target.
It may also be noted that, in the absence of clutter, one can easily train
the neural network with smaller step sizes in radial velocity and make use of the
associative memory property of neural networks and simply store the responses
corresponding to each pulse amplitude distribution. However, in the presence of
clutter, we need to parameterize the amplitude modulation of the pulses such that
the noisy pulse amplitude distribution can be trained to the network.
The effect of an increase in the number of pulses was studied by considering
a case with 10 pulses. The neural network architecture was the same except for
the number of inputs which was 10 in this case. The number of training vectors,
learning rates, and the activation function were maintained the same as before.
The radial velocity step size was also 5m/s. The training was stopped when the
error reached a steady state. As can be seen from the three samples in Table 4-3,
the NN-MTI can imitate a doppler filter bank (of 3 in this case) with variable
bandwidths. The percentage of correct classification increased to 92% giving a 3%
improvement compared to the case discussed earlier (as represented by Table 4-1).
Obviously, the use of more pulses accounts for this improvement.
Despite the increase in the number of pulses, the test vectors for the 6 m/s
velocity error reveal that with a slight change in the radial velocity error beyond
the training step size, the response of NN-MTI deviates considerably (i.e., 14%
decrease) from the desired classification rate as illustrated in Table 4-4. Therefore,
regardless of the number of pulses used for the training, the NN-MTI cannot correctly
classify the radial velocities except for the velocity values for which it has
been trained with some variation (i.e., 5 m/s). Note in comparison that the Pulse
Canceler method cannot classify the target radial velocity at all. That is, it simply
indicates whether the target is moving or not by looking at the pulse amplitude
distribution. On the other hand, with the Fast Fourier Transform method (FFT),
one can extract the radial velocities with an algorithmic implementation of the
uniform doppler filter bank. However, the net amplitude response of an FFT-based
filter is not flat. From the entries in Tables 4-1 through 4-4, it is evident that we
can further improve the performance of the NN-MTI to do as well as the FFT
algorithm in an on-line fashion with a variable bandwidth doppler filter bank such
that the amplitude response stays flat. Recall that the variable bandwidth filter
bank is particular to the neural network-based MTI and is not feasible with the
FFT method (at least not in an efficient way).
Table 4-5 shows how the NN-MTI with a direct presentation of pulses at
the input nodes (without any additional preprocessing) has learned to perform a
classification of the moving target. This time only two classes were considered,
i.e., the class represented with the numerical value of 10 is considered as slow and
the class represented by the numerical value of 0 is considered as fast. A similar
test experiment illustrated that the NN-MTI with direct presentation of the pulses
learns to classify the radial velocities within a slight variation from that of the
training vectors.
Experiment # 2
To demonstrate that the neural network can efficiently learn the nonlinear
functional relationship between the doppler shift and the pulse amplitude distribution,
we trained a neural network with a similar structure as discussed above and
arranged for the training data such that there were 10 examples for every 1 m/s
increment in velocity. Furthermore, we directly presented the 5 amplitude modulated
input pulses to the network with the one output representing the doppler
shift that has caused the corresponding pulse amplitude modulation. We performed
this experiment for a velocity range of [0 m/s, 70 m/s] with a total of 700 training
vectors (i.e., 10 examples for each increment of 1 m/s). The network learned the
correspondence between the input pulse sequence and output doppler shift exactly.
We conducted these experiments on the pulse sequences without any preprocessing
and without any noise or clutter samples in the pulse series. From these,
we can conclude that in the absence of noise and clutter, neural networks can be
used for an efficient on-line mapping of modulated pulses onto the doppler shift.
This is a task which ordinarily requires a large amount of storage locations as well
as much computational effort for implementing with the use of the FFT algorithm.
The performance evaluations conducted here indicate that the error can be reduced
by further training with smaller step sizes on the noise-free data.
Adapting the training data to the required resolution is a subtle point in
the design of multi-spectral sensor systems which require the fusion of data from
different sources with different resolutions. Table 4-6 illustrates the training data
(i.e., the sequence of 5 input pulses and the corresponding desired output which
is the doppler shift) that correspond to a step size of 5 m/s for training which
resulted in a low resolution of doppler shifts (i.e., comparing the difference in the
last two columns in Table 4-6). On the other hand, Table 4-7 shows how well
the neural network learned this nonlinear functional relationship for the velocity
increments (i.e., step size) of 1 m/s which is a rather small value in terms of radar
measurements in the microwave regions. The entries in Table 4-7 indicate that,
as long as the average error in velocity is less than 1 m/s, the neural network can
produce the correct doppler shifts within two decimal digits. It is interesting to
note that a similar network structure and similar training algorithm may be used
for millimeter wave radars as well as radars which operate at optical frequencies.
Having a unified electronic circuitry that is capable of providing efficient operation
in such a wide spectrum of wavelengths is a very attractive feature of the neural
network application to radar design.
One of the most important parameters in radar detection (this is true for
other detection schemes as well), which is based on the transmission of a series of
pulses, is the actual probability of detection for each individual pulse. In all of the
above experiments we assumed that the probability of detection (Pd) for each pulse
was equal to 1. That is, training vectors were generated based on the assumption
that Pd = 1. In practice, however, depending on the probability of detection, some
of the returning pulses may be either too much corrupted with noise or they miss
the moving target and hit a stationary target in its neighborhood. The end result
is that some of the pulses could be very different from those actually available at
the time of training the NN-MTI.
Note that in this experiment we are disturbing the pulse amplitude pattern
in a totally different way such that only one or two pulses out of a total of five are
completely distorted, whereas in the previous experiments all pulses were slightly
affected by the addition of a uniform random noise to the target velocity. In Table
4-8 only some of the entries (denoted by an asterisk) corresponding to the pulses
in Table 4-7 have been corrupted with noise. As an example, the second pulse
in the pulse series of Table 4-7 that corresponds to 0.67 of doppler shift has been
changed from -90.4 to +100. This is a rather large deviation compared to the
maximum amplitude of each pulse which is 100. Note that in Table 4-8 only one
of the pulses is significantly different from the correct ones used for training. A
similar experiment with two corrupted pulses revealed that, if training is based on
noise-free pulses, satisfactory results can be achieved. Although training with noisy
pulses is the ultimate objective of this research, this experiment illustrates the level
of neural network tolerance to deviations of pulses from the correct pattern.
Another important observation in this experiment is that if we train a neural
network with a very fine resolution (i.e., less than 1 m/s of velocity step sizes), then
more noise in the measurement can be tolerated. The significance of this observation
can be further emphasized by noting that in generating the training data we
did not take into account any kind of noise or clutter. This example illustrates that
lowering the step size in training brings the reward of more immunity to deviations
from the true pulse sequence. Furthermore, this benefit of noise immunity can be
realized with only a small number of pulses (5 in these experiments).
Experiment # 3
We will now attempt to train the neural network with the inclusion of clutter
in the training data. A network with a similar structure to what was used in the
previous experiments was trained with 300 vectors. Once again only 5 pulses were
used. However, in this experiment, the pulses were all corrupted with a uniform
clutter distribution characterized by a mean value of 5 m/s, which is equal to the
step size of the target velocity. The main objective of performing this experiment
was to observe the behavior of the neural network to the presentation of corrupted
training data. Note that in all of the traditional methods for MTI processing,
particularly the FFT-based methods, the coincidence of the clutter velocity with
the target velocity resolution has been a major problem. From this experiment we
want to analyze how the neural network will respond to this situation, in which
pulses are directly used as inputs without any preprocessing. Furthermore, we
would like to study the effect of the training set data that has been corrupted
with clutter samples. The results summarized in Table 4-9 show that the neural
network does not have any problem with learning from the corrupted training data.
However, the performance was reduced to 82% (correct classification) as compared
with the situation represented in Table 4-1 (that yielded 89% correct classification).
Therefore, training with clean data results in a better performance.
We now decrease the doppler sensitivity from 6.7 Hz/m/s to 0.7 Hz/m/s in
order to extend the range of the velocities. Recall that for lower doppler sensitivity
we have a wider range of velocities before we run into the aliasing problem (i.e.,
blind speed zones) *. We generated 300 training vectors which were composed of
variable step sizes in velocity. That is, instead of using a constant step size as
before, we used 10 levels with 1 m/s increments, 5 levels with 8 m/s increments,
and 5 other levels with 15 m/s increments. A total of 20 examples were generated
for each velocity level. The network consisted of 5 input nodes which received the
5 input pulses, 6 hidden nodes, and one output node for representing the doppler.
The results of this training are illustrated in Fig. 4.6. The average error for
doppler shift was 43.5 Hz, which corresponds to 43.5/0.7 ≈ 62.1 m/s in the target
velocity. The vertical axis in Fig. 4.6 denotes the normalized doppler shift of
each group of pulses which is caused by the velocity that they represent. That is,
each point in this figure refers to a series of pulses that were processed in parallel.
The performance of the NN-MTI in providing a close approximation of the doppler
shift underscores the capability of neural networks in MTI applications. It must
be noted in comparison that the traditional pulse cancelers can only tell whether
the target is moving or not without providing any information about the target
velocity. Therefore, despite the apparent deviation from the desired performance,
this is still much better than the response of an ordinary pulse canceler.
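The conversion between doppler error and velocity error used above, and the relation between carrier frequency and doppler sensitivity, can be checked with a few lines. The sketch assumes the standard two-way doppler relation $f_d = 2 v f_c / c$; the function names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def doppler_sensitivity(carrier_hz):
    """Doppler shift per unit radial velocity, in Hz per m/s (two-way path)."""
    return 2.0 * carrier_hz / SPEED_OF_LIGHT

def velocity_error(doppler_error_hz, sensitivity):
    """Radial-velocity error (m/s) corresponding to a doppler error (Hz)."""
    return doppler_error_hz / sensitivity

if __name__ == "__main__":
    print("sensitivity at 1 GHz:", round(doppler_sensitivity(1e9), 2), "Hz per m/s")
    print("velocity error:", round(velocity_error(43.5, 0.7), 1), "m/s")
```

A 1 GHz carrier gives approximately the 6.7 Hz/m/s used in Experiment # 1, and the 43.5 Hz average doppler error at 0.7 Hz/m/s corresponds to the velocity error of about 62.1 m/s quoted above.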
The particular situation where the clutter velocity corresponds to the
doppler resolution of the MTI filter has attracted particular attention. Even the
FFT-based algorithms perform very poorly in this situation. One can see that
* Presence of blind speed zones in the training data keeps the neural network from learning.
One way to get around this problem (i.e., confusion of the network in the blind zones) is to
manually assign the correct decision to resolve the conflict in training data by slightly perturbing
the data in a deterministic fashion. We leave this for a later time.
the parallel processing of the pulses provided by the neural network helps in recognizing
the pulse amplitude pattern corresponding to each velocity even in the
presence of clutter at the input during the training phase. In comparison
to training with a noise-free data set, the results indicate that training a neural
network with noise or clutter samples superimposed on the training vectors will be
more efficient only if some additional parameters are included in the input. We
will discuss this aspect in greater detail in the other experiments to follow.
Before we close this section, we conclude the following from the series of
experiments that were conducted thus far. First, it was demonstrated that there
is sufficient information for the neural network to extract the doppler shifts from
the pulse amplitude distribution. In other words without the need to employ the
FFT algorithm, a neural network-based procedure can be designed to offer an
alternative solution which is much faster and applies to more general forms of
signals received. (By more general forms of signals we mean that one can perform
more sophisticated coding and make use of more complex modulation techniques in
order to provide immunity for the pulse sequences against noise and clutter and yet
perform an on-line extraction of the doppler shift with simpler hardware design).
Secondly, by using arbitrarily smaller step sizes in the training data, one can gain
more robustness against the variation of pulse amplitudes due to noise or any other
source of unwanted pulse modulation (e.g., eclipsing of pulses at the receiver). The
other interesting observation was that it is possible to achieve a variable bandwidth
doppler filter bank (as opposed to the FFT algorithm which provides a uniform
bandwidth filter bank). The advantage of a variable bandwidth filter bank is that
different patterns of clutter spectrum can be removed. It remains to make use
of some methodology to cross-correlate the pulses such that any residual error
due to clutter is decorrelated from the pulse sequence before the pulses are used
for training. This can be done in a number of ways [53,69,70] such as the use
of a) modulation techniques, b) statistical parameters, and c) waveform coding.
Furthermore, it appears that some of these methods can themselves be implemented
by neural networks (e.g., Pulse Coding).
4.4.2. Implementation of Pulse Canceler With Neural Networks
As was discussed previously in this chapter, the digital Pulse Canceler is a
classical time-domain approach to the MTI design as compared to the FFT algorithm,
which is a frequency-domain approach. The operation of the Pulse Canceler
is based on the integration of binomially weighted pulses. The output of a digital
Pulse Canceler is a signal on the radar screen that indicates the moving target. As
more pulses are used in a Pulse Canceler, a more uniform (i.e., flat) MTI response
will be achieved. A uniform response indicates that targets moving at different
velocities are equally detected. The problems arising from the presence of noise and
clutter must be differentiated here. A high signal-to-clutter ratio (S/C) indicates
that less extraneous scattered data will appear on the radar screen while a high
signal-to-noise ratio (S/N) is manifested as a brighter spot at each instant that
the target is detected. With a Pulse Canceler, the clutter around the dc level (i.e.,
slowly varying clutter) will be canceled. The problem with the Pulse Canceler is
that it does not provide the doppler shift. Moreover, the Pulse Canceler is not
as efficient when some of the pulses are missing (i.e., Pd < 1). That is, it only
enhances the signal-to-clutter ratio (S/C) and not the signal-to-noise ratio. While
the clutter spread is more concentrated near the dc range, the noise spectrum may be
spread over the entire spectrum. In the last section we mentioned that a modulation
of some kind is required to preprocess the pulses before they are used as the
inputs to the neural network.
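The binomial weighting described above can be illustrated with a short sketch. It assumes the usual alternating-sign binomial coefficients of an N-pulse canceler; the function names and the sample pulse values are illustrative only and are not taken from the experiments.

```python
from math import comb


def binomial_weights(n_pulses):
    """Alternating-sign binomial coefficients for an n-pulse canceler."""
    order = n_pulses - 1
    return [(-1) ** k * comb(order, k) for k in range(n_pulses)]


def pulse_canceler(pulses):
    """Weighted sum of one group of pulses (transversal-filter form)."""
    weights = binomial_weights(len(pulses))
    return sum(w * p for w, p in zip(weights, pulses))


# Stationary (dc) clutter gives identical pulses, so the weighted sum cancels.
clutter_only = [40.0] * 5
# A moving target modulates the pulse amplitudes, so a residue survives.
moving_target = [30.9, 58.8, 80.9, 95.1, 100.0]

print(binomial_weights(5))            # [1, -4, 6, -4, 1]
print(pulse_canceler(clutter_only))   # 0.0
print(pulse_canceler(moving_target))  # small nonzero residue
```

Because the weights sum to zero, any constant component is removed, which is the clutter-rejection property referred to in the text. The size of the residue depends on the doppler modulation, which is why the output by itself does not give the target speed.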
Experiment # 4
To examine the neural network response to binomially weighted pulses, we
provided 500 training vectors that included 25 different velocity levels covering the
range [0,125] with 20 examples from each. The step size for the velocity increments
was 5 m/s and the average clutter velocity was set to 0. The time interval between
the pulses was 1 millisecond. The neural network architecture comprises 5 input
nodes, 10 hidden nodes with a nonlinear activation function, which was selected as
a two-sided sigmoid function, and one output node, which performed the weighted
sum of the outputs of the hidden nodes. A similar network with only 6 hidden
nodes learned with about the same degree of effort and resulted in almost the same
minimum error. One may think that a neural network is just learning to be a
summer in this case. Although this is true, there are more advantages than just
adding the pulses if some classification process takes place at the same time. We
did not include clutter samples in the training data for the reasons discussed before.
Inclusion of clutter data in training requires coding, modulation, or some statistical
treatment. The main objective of this experiment was to train the neural network
to perform the function of a Pulse Canceler. The primary desired feature of the neural
network here was to achieve robustness to noise as we discussed in the previous
experiments.
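The 5-10-1 architecture described above can be sketched as a plain forward pass. This is only an illustration under stated assumptions: the "two-sided sigmoid" is taken here to be tanh, the weights are random rather than trained, and the input scaling by 1/100 is a convenience that is not specified in the text.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights plus one bias (last entry) for each of n_out nodes."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(pulses, hidden, output):
    """5 inputs -> tanh hidden layer -> one linear (weighted-sum) output node."""
    activations = [
        math.tanh(sum(w * x for w, x in zip(row, pulses)) + row[-1])
        for row in hidden
    ]
    return sum(w * a for w, a in zip(output, activations)) + output[-1]


rng = random.Random(0)                  # fixed seed so the sketch is repeatable
hidden = init_layer(5, 10, rng)         # 10 hidden nodes, 5 inputs each
output = init_layer(10, 1, rng)[0]      # one output node fed by the 10 hidden nodes

pulses = [30.9, 58.8, 80.9, 95.1, 100.0]
scaled = [p / 100.0 for p in pulses]    # assumed scaling, not from the text
y = forward(scaled, hidden, output)
print(y)
```

Training (for example by backpropagation) would adjust `hidden` and `output`; only the structure of the mapping from a group of five pulses to a single output is shown here.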
The binomially weighted pulses will provide an improvement in the
signal-to-clutter ratio, while the neural network properties of associative memory and
parallel processing of the pulses will help it tolerate magnitude distortion
of the pulses in the presence of noise (i.e., when Pd ≤ 1), which improves the
signal-to-noise ratio as well. It must be emphasized that in a Pulse Canceler which
is implemented by a transversal filter, the interrelation of pulses as a group is
not accounted for. On the other hand, the trained neural network can play an
important role in providing a graceful recovery of the doppler information which
is embedded in the amplitude distribution of the pulse train. That is, the group
of pulses that hold the doppler information constitute a pattern and hence when
a pulse is missing in the sequence or if its magnitude has been distorted by noise,
which is usually the case in practice, the neural network implementation of MTI will
outperform the classical pulse cancellation methods. It must also be emphasized
that with the neural network approach, processing the pulses in parallel will further
help relating each pulse to all other pulses through the connection weights of the
hidden nodes.
Another advantage of the neural network implementation of Pulse Canceler
(NN-PC) is the achievement of fast on-line response. We reserve the term NN-PC
for a class of NN-MTI which is trained to function like an ordinary pulse canceler
while the term NN-MTI will be used with a more general meaning. To demonstrate
the capability of a trained neural network pulse canceler, we generated a series
of undistorted pulses which were amplitude modulated by doppler effect and used
them as inputs to NN-PC (which was trained to merely indicate the target motion)
as well as the conventional PC (i.e., the one with binomially weighted pulses as
depicted in Fig. 4.3). As can be seen in Fig. 4.7, the two schemes identically
responded to each group of pulses (5 pulses in each group) and provided the correct
indication that the target is moving. However, any variation in the target speed
cannot be classified by the conventional PC since it does not have a flat frequency
response (see Section 2.8.3). Therefore, the vertical changes in Fig. 4.7 do not
correspond to the actual target speed. The NN-PC, however, can be trained to
classify the target speed. To illustrate this feature of NN-PC, we conducted a
separate test on another NN-PC which was trained to classify a slow target and a
fast target by setting the output to a numerical value of 10 for the velocity range
of [20 m/s, 120 m/s] and zero otherwise. The performance is illustrated in Fig. 4.8,
where the response of NN-PC is compared to the true (i.e., desired) response.
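The desired response used for this two-level test can be written down directly from the description above. The sketch below is illustrative: the function name is hypothetical, and the list of velocities (0 to 120 m/s in 5 m/s steps) is an assumption borrowed from Experiment # 4 rather than a stated property of this test.

```python
def slow_fast_target(velocity_mps, low=20.0, high=120.0, fast_level=10.0):
    """Desired output: 10 inside the [20, 120] m/s band, zero otherwise."""
    return fast_level if low <= velocity_mps <= high else 0.0


# Assumed velocity grid: 25 levels with a 5 m/s step.
velocities = [5.0 * i for i in range(25)]
targets = [slow_fast_target(v) for v in velocities]

for v, t in zip(velocities[:6], targets[:6]):
    print(v, t)
```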
Another experiment with the binomially weighted pulses in the conventional
PC revealed that one cannot extract doppler shifts from these pulses, whereas this
can be done by direct input of the unweighted pulses to the neural network as
was discussed previously. It is an important observation that a neural network
does not learn doppler shifts from the binomially weighted pulses, while it performs
more efficiently when the pulses are fed directly as received. We conclude that the
role of binomially weighted pulses, when used in the neural network training, is
merely to cancel any existing clutter and that doppler shifts cannot be extracted
from them in the way that they are extracted in ordinary pulse cancelers. As
we will show in another experiment, if the sum of the binomially weighted pulses
is used as an input to the neural network, a more efficient processing can result,
particularly when other parameters are also used as inputs in conjunction with
it. Note that the binomial pulses have been amplitude modulated two times, once
with the doppler variation and then weighted again by the binomial coefficients
to achieve robustness to clutter. This is why training a neural network to extract
doppler shifts from binomially weighted pulses is more difficult than training with
the unweighted pulses.
4.4.3. Analysis of NN-MTI Design With PRF Switching
Up to this point, our main objective was to study the mapping of a set of
coherent pulses that have been amplitude modulated with doppler shifts onto the
corresponding doppler shifts. Use of a single PRF (i.e., uniform time interval
between pulses) has the disadvantage that the range of velocities that can be detected
without ambiguity is limited. The ambiguity is due to the aliasing problem which
is a consequence of uniform sampling. The PRF switching has been a traditional
method for overcoming this ambiguity. However, there exist some problems with
non-uniform sampling. Some of these are: a) the underlying theory is not well
developed for a clear procedural design of the MTI filter, which means that the
analysis of the PRF switching with Fourier transform methods is very difficult, and
b) the frequency response of the multiple PRF pulse sequence is not flat, which
leaves a lot of room for further development. As will be seen in this section, the
flexibility offered by a neural network implementation can provide a new way of
forming a desired MTI frequency response when multiple PRFs are employed.
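The ambiguity that PRF switching is meant to relieve can be made concrete with a small calculation. The sketch below uses the two pulse intervals of Experiment # 5 (1 ms and 0.863 ms) and the doppler sensitivity of 6.67 Hz per m/s quoted there; it assumes the standard relation that blind speeds occur where the doppler shift equals an integer multiple of the PRF. The function and constant names are illustrative.

```python
DOPPLER_SENSITIVITY_HZ_PER_MPS = 6.67  # value quoted in Experiment # 5


def blind_speeds(pulse_interval_s, count=3):
    """Velocities at which the doppler shift equals a multiple of the PRF."""
    prf_hz = 1.0 / pulse_interval_s
    return [n * prf_hz / DOPPLER_SENSITIVITY_HZ_PER_MPS for n in range(1, count + 1)]


first_prf = blind_speeds(1.0e-3)      # 1 ms pulse interval
second_prf = blind_speeds(0.863e-3)   # 0.863 ms pulse interval

for a, b in zip(first_prf, second_prf):
    print(round(a, 1), round(b, 1))
```

Under these assumptions the first blind speed for the 1 ms interval is close to 150 m/s, and the blind speeds of the two intervals do not coincide over the first few multiples, which is the property that staggering relies on.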
Experiment # 5
A neural network with 9 input nodes and 10 hidden nodes with nonlinear
activation functions similar to those used in earlier experiments was trained. The
training examples were generated based on two sets of pulse sequences. Each pulse
sequence was composed of 3 individual pulses. The only difference between the two
sets of pulse sequences was the PRF; specifically, the time interval between the first
set of pulses was 1 millisecond whereas it was 0.863 milliseconds for the second set.
Furthermore, the pulses in each set were multiplied by binomial weights. Recall
that the multiplication of the pulse amplitudes by the binomial weights adds to the
clutter visibility and is used in the traditional methods (e.g., the pulse canceler).
However, as it was illustrated in an earlier section, direct use of the binomially
weighted pulses as inputs to the neural network does not result in a satisfactory
training.
We have already discussed some reasons for the failure of neural networks
in learning directly from the binomially weighted pulses. Despite this, we included
binomial weighting in this experiment in order to further enhance the robustness of
NN-MTI to clutter. In this experiment we also provided some additional information
to further evaluate the efficacy of using binomial weighting. This additional
information consisted of three parameters, viz., 1) the mean of the pulses which
was calculated over the total of 6 pulses, 2) the mean of the pulses for the first
sequence, and 3) the mean of the pulses for the second sequence. We conducted
this experiment both with and without the binomial weightings. Doppler sensitivity
was set to 6.67 Hz/(m/s). A total of
25 different velocity levels were generated with 5 m/s increments and 20 examples
from each level were included in the training set. This gave rise to a total of 500
training examples. The output was composed of three different classes of radial
velocity, i.e., slow, medium, and fast moving targets.
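The nine-element input vector described above (six pulses plus three means) can be assembled as in the following sketch. Two points are assumptions and not taken from the text: the three-pulse binomial weights are given alternating signs (1, -2, 1), and the means are computed after the weighting is applied. The function names are hypothetical.

```python
def binomial_weighted(pulses):
    """Apply assumed three-pulse binomial weights (1, -2, 1)."""
    weights = [1.0, -2.0, 1.0]
    return [w * p for w, p in zip(weights, pulses)]


def build_input(seq_a, seq_b, use_binomial=True):
    """Six pulses followed by the overall mean and the two per-sequence means."""
    if use_binomial:
        seq_a, seq_b = binomial_weighted(seq_a), binomial_weighted(seq_b)
    all_pulses = list(seq_a) + list(seq_b)
    mean_all = sum(all_pulses) / len(all_pulses)
    mean_a = sum(seq_a) / len(seq_a)
    mean_b = sum(seq_b) / len(seq_b)
    return all_pulses + [mean_all, mean_a, mean_b]


# Illustrative pulse values only; one sequence per PRF.
vector = build_input([30.9, 58.8, 80.9], [27.0, 51.9, 72.8])
print(len(vector))  # 9 inputs, matching the 9 input nodes
```

Calling `build_input(..., use_binomial=False)` gives the unweighted variant, corresponding to the comparison made in the experiment.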
In these experiments, we did not include clutter in the training data. The
results of training with and without binomial weighting are illustrated in Figs.
4.9 and 4.10, which show no difference in the performance of the two methods.
However, what is remarkable in this experiment is that despite the fact that the
neural network could not learn from direct presentation of binomially weighted
pulses (as illustrated earlier), we have found a way to include them in training.
This is through the inclusion of the arithmetic mean of each set of pulses and the
overall mean of all the pulses which serve to create a deterministic relation among
the two sources of modulation hence preventing any possible conflict in the training
data. There are two important achievements here which need to be emphasized.
One is the flat frequency response that can be obtained by the neural network
scheme regardless of the multiple PRFs used in the pulse sequence. The other is a
possible increase in clutter visibility which will be discussed next.
In order to underscore the use of binomial weighting for some advantage, we
repeated this test with the inclusion of clutter in the test vectors. We added moving
clutter samples to the pulses such that each clutter sample imposed a random
doppler shift in addition to the true doppler shift of each pulse. The average
velocity of clutter samples was set at 30m/s. A performance comparison of the
NN-PC trained with binomially weighted pulses (as described above) with the case
when the pulses were not weighted is illustrated in Fig. 4.11. This is where it does
make a difference to include a clutter visibility feature (i.e., additional weighting of
pulses with binomial coefficients). The advantage of combining the clutter visibility
feature of pulse cancelers with the noise immunity, flat frequency response, as well
as many other design flexibilities offered by a trained neural network is clear from
these experiments.
4.5. Conclusion
Biological systems (e.g., insects, birds, flies) have capabilities beyond those
of the conventional MTI processors. Insects, for example, can easily detect a moving
target as it approaches them. Furthermore, the size of an insect is not even
comparable to the size of the simplest MTI filter. It is hence of interest to the
radar community to explore how the functions of an MTI processor can be modeled
by Artificial Neural Networks (ANN). Our objective in this chapter was to
initiate a thorough investigation of radar pulse processing with neural networks.
Although techniques of radar signal processing, including MTI, have been vastly
improved by the availability of digital computers in recent years, these methods are
generally based on complex mathematical procedures which make the engineering
and design of radar receivers rather costly and vulnerable to electronic faults (e.g.,
loose connections, short circuits). Work reported in this chapter has provided some
evidence that doppler filter banks can easily be implemented with neural networks
even in situations where a limited number of pulses are available for processing. It
was shown that training a neural network for MTI filtering is more successful when
clean data is used.
We showed that, in contrast to the ordinary pulse cancelers which are based
on optimally weighting the input pulses, the neural network can do the weighting
within the process of its training and make use of binomial weighting as an
additional factor to further enhance its performance in clutter. Furthermore, it
is possible to shape the frequency response of the NN-MTI as desired without
needing the complex process of pole placement, which is traditionally required in
both digital and analog filter design procedures. A rather important feature that
was explored during the course of these experiments is that non-uniform sampling
(multiple PRFs) can be efficiently handled with a trained neural network. Also,
a variable-bandwidth doppler filter bank is much simpler to implement with neural
networks when compared to the case with linear transversal filters. Since the
fault tolerance of neural networks in the face of loss of connection weights has
already been established in the literature [5,83], we did not conduct any further
comparisons with the traditional MTI filters when some of their connections are
lost.
As a final note in this chapter, the Fast Fourier Transform (FFT) technique
has established itself as a valuable tool for digital processing of radar signals.
Although the FFT-based MTI filters were not thoroughly addressed in this chapter,
we outlined some of the outstanding features of the neural network-based methods
which are not easily available in FFT-based methods. Our experiments with neural
network processing of coherent radar pulses have revealed that neural networks
provide convenient mechanisms for alternative modulation and pulse transform
techniques with more attractive engineering design features. A good example is
the parallel processing of pulses in time, space, and frequency domain which is in
demand for future distributed sensor systems.
Fig 4.1 (a) Two-pulse canceler; (b) three-pulse canceler
Fig 4.2a Relative frequency response of the single-pulse canceler (solid curve) and the two-pulse canceler (dashed curve). Horizontal axis: frequency.
Fig 4.2b Amplitude response for (1) three-pulse canceler, (2) five-pulse canceler, (3) 15-pulse canceler. Horizontal axis: frequency.
Fig 4.3 General form of a transversal filter for MTI processing
Fig 4.4 Frequency response of pulse cancelers with two distinct PRF (a) single-pulse; (b) five-pulse. Curves: fixed prf, staggered prf. Horizontal axis: target velocity relative to first blind velocity at fixed prf.
Fig 4.5 (a) Radar pulse train; (b) video pulse train for doppler frequency f_d > 1/τ; (c) video pulse train for doppler frequency f_d < 1/τ.
Table 4-1 NN-MTI classification of doppler shift with 5 m/s average error in test vectors with 89% correct classification
 P(1)    P(2)    P(3)    P(4)    P(5)   Desired Solution   NN Solution
 30.9    58.8    80.9    95.1   100.0        30.0             29.9
 95.1    58.8   -58.8   -95.1     0.0        20.0             19.9
 30.9   -58.8    80.9   -95.1   100.0        10.0             14.1
Table 4-2 NN-MTI classification of doppler shift with 6 m/s average error in test vectors with 76% correct classification
 P(1)    P(2)    P(3)    P(4)    P(5)   Desired Solution   NN Solution
 24.9   -48.2    68.4   -84.4    95.1        30.0             13.8
-48.2   -84.4   -99.8   -90.5   -58.8        30.0             29.6
 68.5   -99.8    77.1   -12.5   -58.8        20.0             18.1
-84.4   -90.5   -12.6    77.1    95.1        20.0             19.9
 95.1   -58.8   -58.8    95.1     0.0        20.0             18.2
-99.8   -12.5    98.2    24.8   -95.1        20.0             19.1
 98.2    36.9   -84.5   -68.5    58.8        20.0             23.6
-90.5    77.1    24.9   -98.2    58.8        10.0              7.8
 77.0    98.2    48.2   -36.7   -95.0        10.0             30.0
-58.8    95.1   -95.1    58.8     0.0        10.0              9.2
Table 4-3 NN-MTI classification for 10 independent pulses with 92% correct classification
 P(1)    P(2)    P(3)    P(4)    P(5)    P(6)    P(7)    P(8)    P(9)   P(10)   Desired    NN
 30.~    58.7    80.9    95.1   100.0    95.1    80.9    58.7    30.8     0.0     30.0    29.9
 95.1    58.8   -58.7   -95.1     0.0    95.1    58.7   -58.7   -95.1     0.0     20.0    19.9
 30.9   -58.7    80.9   -95.1   100.0   -95.1    80.9   -58.7    30.9     0.0     10.0    13.0
Table 4-4 NN-MTI classification for 10 pulses with 78% correct classification
 P(1)    P(2)    P(3)    P(4)    P(5)    P(6)    P(7)    P(8)    P(9)   P(10)   Desired    NN
 24.8   -48.1    68.4   -84.4    95.1   -99.8    98.2   -90.4    77.0   -58.7     30.2    15.0
-48.1   -84.4   -99.8   -90.4   -58.7   -12.5    36.8    77.0    98.2    95.1     30.0   -0.15
 68.4   -99.8    77.2   -12.5   -58.7    98.2   -84.4    24.8    48.1   -95.1     20.0    18.3
-84.4   -90.4   -12.5    77.0    95.1    24.9   -68.4   -98.2   -36.8    58.7     20.0    -1.9
 95.1   -58.7   -58.7    95.1     0.0   -95.1    58.7    58.7   -95.1     0.0     20.0    18.6
-99.8   -12.5    98.2    24.8   -95.1   -36.8    90.4    48.1   -84.3   -58.7     20.0     9.5
 98.2    36.8   -84.4   -68.5    58.7    90.4   -25.0   -99.8   -12.4    95.1     20.0    13.5
-90.4    77.0    24.9   -98.2    58.7    48.2   -99.8    37.0    68.5   -95.1     10.0     9.6
 77.0    98.2    48.1   -36.6   -95.0   -84.4   -12.6    68.2    99.8    59.1     10.0     8.9
-58.7    95.1   -95.1    58.7     0.0   -58.7    95.1   -95.1    58.7     0.0     10.0     9.8
Table 4-5 Two step classification of slow & fast moving targets with 94% correct classification
 P(1)    P(2)    P(3)    P(4)    P(5)   Desired Solution   NN Solution
 30.9    58.7    80.9    95.1   100.0         0.0            0.001
 58.7    95.1    95.1    58.7     0.0         0.0            0.001
 80.9    95.1    30.9   -58.7  -100.0         0.0            0.001
 95.1    58.7   -58.7   -95.1     0.0         0.0            0.001
100.0     0.0  -100.0     0.0   100.0         0.0            0.03
 95.1   -58.7   -58.7    95.1     0.0        10.0           10.0
 80.9   -95.1    30.9    58.7  -100.3        10.0            9.9
 58.7   -95.1    95.1   -58.7     0.0        10.0            9.9
 30.9   -58.7    80.9   -95.1   100.0        10.0           10.0
Table 4-6 Low resolution doppler shift extraction by NN-MTI with step size = 5 m/s
 P(1)    P(2)    P(3)    P(4)    P(5)   Desired Solution   NN Solution
 53.5   -90.5    99.2   -77.0    30.9        0.67             0.53
-90.5   -77.0    24.9    98.2    58.8        1.34             1.56
 99.2    24.9   -92.9   -48.2    80.9        2.01             1.58
-77.0    98.2   -48.1   -36.8    95.1        2.68             2.4
 30.9    58.8    80.9    95.1   100.0        3.35             3.1
 24.9   -48.2    68.4   -84.4    95.1        4.02             3.79
-72.9   -99.8   -63.7    12.5    80.9        4.69             4.12
 98.2   -36.8   -84.4    68.4    58.8        5.36             5.16
-92.9    68.4    42.6   -99.8    30.9        6.03             6.45
 58.8    95.1    95.1    58.8     0.0        6.7              6.37
Table 4-7 High resolution doppler shift extraction by NN-MTI with step size = 1 m/s
 P(1)    P(2)    P(3)    P(4)    P(5)   Desired Solution   NN Solution
 53.5   -90.4    99.2   -77.0    30.9        0.67             0.67
-90.4   -77.0    24.9    98.2    58.8        1.34             1.34
 99.2    24.9   -92.9   -48.2    80.9        2.01             2.01
-77.0    98.2   -48.2   -36.8    95.1        2.68             2.68
 30.9    58.7    80.9    95.1   100.0        3.35             3.35
 24.9   -48.2    68.4   -84.4    95.1        4.02             4.02
-72.9   -99.8   -63.7    12.5    80.9        4.69             4.69
 98.2   -36.8   -84.4    68.4    58.7        5.36             5.36
-92.9    68.4    42.6    -9.9    30.9        6.03             6.03
 58.8    95.1    95.1    58.8     0.0        6.70             6.70
Table 4-8 NN-MTI performance for a probability of detection less than one (* indicates that the pulse is noisy)
 P(1)    P(2)    P(3)    P(4)    P(5)   Desired Solution   NN Solution
 53.5   100.0*   99.2   -77.0    30.9        0.67             0.88
-90.4   -77.0     0.0*   98.2    58.8        1.34             1.57
  0.0*   24.9   -92.9   -48.2    80.9        2.01             1.84
-77.0    98.2   -48.2   -36.8     0.0*       2.68             3.68
 30.9    58.7    80.9     0.0*  100.0        3.35             2.76
  0.0*  -48.2    68.4   -84.4    95.1        4.02             4.12
-72.9     0.0*  -63.7    12.5    80.9        4.69             4.27
 98.2   -36.8     0.0*    0.0*   58.7        5.36             4.57
-92.9    68.4    42.6    -9.9     0.0*       6.03             6.20
 58.8    95.1     0.0    58.7     0.0*       6.70             6.47
Table 4-9 NN-MTI classification in presence of clutter with 82% correct classification
  P(1)    P(2)    P(3)    P(4)    P(5)   Desired Solution   NN Solution
  30.9    58.8    80.9    95.1   100.0        30.0            30.0
  58.8    95.1    95.1    58.8     0.0        30.0            30.0
  80.9    95.1    30.9   -58.8  -100.0        30.0            29.9
  95.1    58.8   -58.8   -95.1     0.0        20.0            19.9
 100.0     0.0  -100.0     0.0   100.0        20.0            18.9
  95.1   -58.8   -58.8    95.1     0.0        20.0            18.8
  80.9   -95.1    30.9    58.8  -100.0        20.0            18.9
  58.8   -95.1    95.1   -58.8     0.0        20.0            18.8
  30.9   -58.8    80.9   -95.1   100.0        10.0            14.6
   0.0     0.0     0.0     0.0     0.0        10.0             9.9
 -30.9    58.8   -80.9    95.1  -100.0        10.0            12.7
 -58.8    95.1   -95.1    58.8     0.0        10.0             9.18
 -80.9    95.1   -30.9   -58.8   100.0        10.0             9.11
 -95.1    58.8    58.8   -95.1     0.0        10.0             9.14
-100.0     0.0   100.0     0.0  -100.0        10.0             9.08
Fig. 4.6 NN-MTI response for variable step sizes in training data and identical velocity error for pulse groups in the test data. Curves: desired MTI response, NN-MTI response. Horizontal axis: coherent pulse group.
Fig. 4.7 Comparison of the NN-PC with a conventional Pulse Canceler for indicating the target motion only. Curves: Pulse Canceler response, NN-PC response. Horizontal axis: coherent pulse group.
Fig. 4.8 Performance of NN-PC in separation of slow and fast targets. Curves: desired response, NN-PC response. Horizontal axis: coherent pulse group.
Fig. 4.9 NN-PC performance with PRF switching and binomially weighted pulses (no clutter). Curves: desired response, NN-PC response. Horizontal axis: coherent pulse group.
Fig. 4.10 NN-PC performance with PRF switching and unweighted pulses (no clutter). Curves: desired response, NN-PC response. Horizontal axis: coherent pulse group.
Fig. 4.11 Performance of NN-PC with and without use of the binomial weights for target velocity classification in presence of heavy clutter. Curves: binomial weighting, no binomial weighting. Horizontal axis: coherent pulse group.
CHAPTER 5
TARGET TRACKING BY
NEURAL NETWORK MANEUVER MODELING
5.1. Introduction
Target tracking systems that operate in a track-while-scan mode have great
difficulties in maintaining the track when the target performs unpredictable
maneuvers. A maneuver is a sudden change in acceleration which can take place
in different directions depending on the capabilities of the target being tracked.
Tracking a single target would be a simple task if there were no maneuvers. Target
maneuvers add significant complexities to the signal processing required for
tracking. The complexity arises due to the lack of measurements of the target
acceleration. Radar measurements are limited to range, angle, and sometimes the
radial velocity. These measurements are further corrupted by noise and background
clutter.
Ordinarily a Kalman filter is used to estimate the true position of the target
in an optimal fashion as long as the measurement noise is Gaussian. Sudden
accelerations, however, cause a bias in the measurement sequence. Unless this bias
is compensated for, the filter will diverge and the true track will be lost. Tracking
a maneuvering target in a cluttered background constitutes a difficult problem
that has been addressed in the literature for many years. As outlined in Chapter
2, several different methods have been introduced to model target accelerations.
The classical methods are mainly based on one statistical parameter that is used
to detect the presence of the maneuver. Upon detection of the maneuver, an
artificial noise is generated to substitute for the true acceleration in obtaining
future estimates.
Use of a single parameter often requires that the error be propagated over
several samples in the past for correction of the previous estimates. This is due
to the fact that the artificial noise cannot be correctly generated unless enough
samples are received. The waiting time for more samples can however result in a
total loss of the track since the target can begin a new maneuver. If the target
begins a new maneuver before the first one is compensated for, the filter will never
converge. Therefore, most of the proposed algorithms in the current literature have
the disadvantage of losing the target in situations of short-term accelerations, in
which the duration of acceleration is comparable to the time period between the
measurements. One method to resolve this problem appears to be the use of more
features in the estimation process so that fewer samples would be required. This is
a formidable task for the current algorithms because of the computational effort
involved and because maneuver handling is a real-time problem. The issue of the practical implementation
of tracking algorithms has also attracted some attention by itself in the literature
on target tracking. For example, Fitzgerald [37] and [101-103] discusses some of
the computational requirements needed by the current algorithms and addresses
the limitations of existing microprocessors for practical implementations. With the
advent of the neural network technology, some of these difficulties can be removed
in an efficient way. This makes the neural network approach more appealing to
target tracking problems.
The objective of the research reported in this chapter is to design and evaluate
a neural network-based maneuver modeling scheme that requires only a small
number of samples for the compensation of bias in the measurement sequence.
A multilayer feedforward network with backpropagation learning is used.
For simplicity in the illustration of basic ideas, only longitudinal acceleration will
be considered, although the approach can be readily extended to other types of
maneuvers. The neural network uses three input parameters in order to identify
the presence of a longitudinal acceleration. Upon detection of the acceleration, the
amount of noise required to compensate for the bias is generated by the network.
The correct estimates of the target position and velocity are then recalculated using
the output of the neural network only for one time step in the past. The proposed
design has the following primary advantages:
1) It has a quick response for short-term accelerations.
2) Detection and compensation for the maneuver are done in one step.
3) The neural network controller works in conjunction with a Kalman filter
giving rise to a hybrid tracking system. Therefore, the Kalman filter can be
kept simple with only position and velocity as the states.
5.2. Neural Network Implementation of Maneuver Modeling
The principal idea behind the use of a neural network in this application can
be described as follows. In tracking a target using a Kalman filter, the main source
of divergence is the bias which is introduced by the target maneuver, especially
when the tracking filter has reached a steady state and the filter gain has low
values. We use a neural network as an adaptive mechanism to help adjust the
filter gains in the presence of accelerations. This provides a hybrid approach to
compensate for the bias in the Kalman filter estimation of states. That is, we
retain the Kalman filter as the main filter for tracking while the neural network
is employed to detect the presence of an acceleration and make up for the bias.
The estimation of states is still performed by the Kalman filter and the neural
network helps adjust the target dynamical model only when a maneuver (i.e.,
sudden acceleration) is detected. Furthermore, detection of a maneuver is done by
the neural network through the use of a normalized innovation parameter that is
generated by the Kalman filter.
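As a rough illustration of what a normalized innovation looks like, the scalar sketch below divides the squared innovation by its predicted variance. The text does not spell out the exact normalization used, so this particular form (the commonly used normalized innovation squared, with a position-only measurement and H = 1) is an assumption, as are the function name and the numbers.

```python
def normalized_innovation(z, x_pred, p_pred, r):
    """Scalar case: innovation and its square divided by the innovation variance."""
    innovation = z - x_pred            # measurement minus predicted measurement
    s = p_pred + r                     # innovation variance for H = 1
    return innovation, innovation * innovation / s


# Illustrative numbers: predicted position 100 m, measured 105 m.
nu, d = normalized_innovation(z=105.0, x_pred=100.0, p_pred=4.0, r=5.0)
print(nu, d)
```

A large value of `d` relative to its usual level suggests that the measurements are no longer consistent with the assumed motion model, which is the kind of cue a maneuver detector can use.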
Most of the existing algorithms traditionally use the innovation sequence
for the estimation of this noise [38,104,105]. It turns out, however, that the innovation
sequence is only one of several parameters that can be used for obtaining
an indication of maneuver. There are other parameters each of which is capable
of indicating a different characteristic of the target maneuver and these have been
used separately by different algorithms, such as the Heading Assisted Filter [39]
which relies on the target heading estimate. In our approach, we use two other
parameters in addition to the innovation sequence to further improve the detection
and compensation of the bias in the Kalman filter.
Classifying the type of maneuver performed by the target using more than
one parameter is what makes the neural network quite appealing. Our approach is
to employ some of these parameters as inputs to a multilayer neural network that
is trained to generate the required compensating noise signal. For simplicity, we
confine our discussion to the longitudinal acceleration only. However, our study
shows that other types of maneuvers (e.g., circular, sinusoidal, etc.) can also be
modeled with this approach.
The target dynamical model, as will be described below, is adjusted through
adding the noise components u_x and u_y generated by the neural network. It is this
noise that different algorithms such as the ones proposed by Singer [40] and Bogler
[41,72] try to model. We have already discussed in Chapter 2 some of the major
problems with these methods. The goal here is to employ a neural network
to overcome these problems which can potentially enhance the performance with
these methods. It is important to include the coupling of the acceleration components
in the model for target acceleration. These components are coupled due to
the fact that during the transformation of range and angular measurements from
the polar to cartesian coordinates, the measurement errors are no longer independent.
Bogler's method, for example, lacks the coupling effect of the acceleration
components. According to Bogler [72], other methods have been proposed in the
literature, however, they are computationally inefficient and are. With a neural
network-based procedure on the other hand, both components can be generated
through the same network and hence they can be used directly to update the tar
get dynamical model. That is, a single network is used for estimation of both
components and no further processing is needed to include the effect of coupling.
5.2.1. Problem Formulation
For a precise description of the problem and the parameters that will be used
in the neural network-based maneuver modeling, let us consider a two dimensional
tracking situation in which the state vector consisting of the positions and velocities
of the target in the two coordinates is given as
$$\mathbf{x}^T(k) = [\,x(k)\;\; \dot{x}(k)\;\; y(k)\;\; \dot{y}(k)\,] \tag{5.1}$$
and the state equation is represented by the dynamical model
x(k + 1) = Fx(k) + Gu(k) + v(k). (5.2)
The second and the third terms in the above model refer to an acceleration
input and an additional correction factor, respectively. The acceleration input,
however, is unknown to us and has to be estimated with some uncertainty. The
matrix F is termed the transition matrix and G the noise matrix. These matrices,
when multiplied by the vectors x(k) and u(k) in the above equation, represent
the equations of the target motion in one sampling period T [43], i.e.,
$$x(k+1) = x(k) + \dot{x}(k)\,T + \tfrac{1}{2}T^2 u_x(k)$$
and
$$y(k+1) = y(k) + \dot{y}(k)\,T + \tfrac{1}{2}T^2 u_y(k).$$
The matrix G is called the "noise matrix" because it is multiplied by the vector
u(k), which is unknown and whose components need to be estimated. Expressing the
above equations in the form (5.2), one obtains the F and G matrices given by
$$F = \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad G = \begin{bmatrix} T^2/2 & 0 \\ T & 0 \\ 0 & T^2/2 \\ 0 & T \end{bmatrix}. \tag{5.3}$$
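As a minimal sketch of one step of the model (5.2) without the correction term
v(k), the following Python fragment builds F and G for the state ordering of
Eq. (5.1) and propagates a state. The function names are hypothetical, and the
velocity rows (velocity changes by T times the acceleration) are an assumption
consistent with an acceleration held constant over one sampling period.

```python
def make_model(T):
    """Return (F, G) for state [x, x_dot, y, y_dot] and input [u_x, u_y].

    Assumption: acceleration is constant over one sampling period T.
    """
    F = [[1.0, T, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, T],
         [0.0, 0.0, 0.0, 1.0]]
    G = [[0.5 * T * T, 0.0],
         [T, 0.0],
         [0.0, 0.5 * T * T],
         [0.0, T]]
    return F, G


def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def step(x, u, T):
    """One noise-free step x(k+1) = F x(k) + G u(k)."""
    F, G = make_model(T)
    return [a + b for a, b in zip(matvec(F, x), matvec(G, u))]
```

For example, a target at the origin moving at 10 m/s along x with a 2 m/s²
longitudinal acceleration and T = 1 s advances to x = 11 m with a velocity of
12 m/s.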
The unknown input u( k - 1) for modeling the target maneuver is to be
estimated by the neural network. The term v( k) is a zero-mean white noise process
with covariance Q, i.e.,
$$E\{v_k v_j^T\} = \begin{cases} Q, & k = j \\ 0, & k \neq j \end{cases} \tag{5.4}$$
where $v_j$ and $v_k$ are two samples of the noise process v(k). The observation
sequence is given by
z(k) = Hx(k) +w(k) (5.5)
where w( k) represents the measurement noise, which is assumed zero-mean with
covariance R and is independent of the process noise v( k).
The filter designed on the basis of the non-maneuvering model (i.e., u( k) =
0) would cause the innovation sequence (see Eq. (2.13)) to build up in magnitude
(when $u \neq 0$). Using the innovation sequence with other indicators (which will be
discussed later), the corresponding input noise level can be estimated through the
neural network. Filter gains are then adjusted through covariance matrices of the
Kalman filter which incorporates the variance of error due to the neural network
estimate in the form
(5.6)
where P(k|k) is the covariance matrix of the state estimates, which reflects the
Kalman filter errors in its estimates of states, and $\sigma_{nn}^2$ is a term which represents
the neural network estimation error. It is computed off-line during
training when there is no maneuver (i.e., u(k) = 0). In the presence of a maneuver,
the neural network estimate $\hat{u}(k-1)_{nn} = [u_x \;\; u_y]^T$ would be used until the filter
reaches the steady state. Appropriate conditions for reaching steady state rapidly
can be incorporated in the training examples. In other words, the precision of the
algorithm can be enhanced by making use of smaller step sizes of acceleration to
generate the training examples. Since the neural network is trained to estimate
the acceleration one step in the past, the prediction correction is done based on
the equation
$$\hat{x}_{nc}(k|k-1) = F\,\hat{x}(k-1|k-1) + G\,\hat{u}_{nn}(k-1|k-1) \tag{5.7}$$
where $\hat{x}_{nc}$ is the neural network correction for the prediction of the states in the
previous step. The filtered estimate is then
$$\hat{x}_{nc}(k|k) = \hat{x}_{nc}(k|k-1) + K(k)\,[\,z(k) - H\hat{x}_{nc}(k|k-1)\,] \tag{5.8}$$
where K( k) is the Kalman filter gain which is given by
K(k) = FP(klk)HT [HP(klk)HT +R] +Q. (5.9)
The covariance matrix of prediction will change from the non-maneuvering
case (i.e., zero process noise) to
$$P(k+1|k) = F\,P(k|k)\,F^T + G\,Q\,G^T \tag{5.10}$$
where
(5.11)
This approach to the compensation of bias induced by the maneuver does
not require a long waiting time for several measurements, as in the scheme proposed
by Bogler [72], which requires at least 5 to 6 measurements. Computation
of the propagation matrix M (Eq. (2.22) in Chapter 2) from the estimated time
of maneuver up to time k is not required since the update takes place at every
step during the period of acceleration. The calculation of the propagation matrix
represents the primary processing load in Bogler's method. It may be noted that
in this method estimation of u( k) is merely based on the residual information of
the nominal Kalman filter. With the present neural network method, however,
we make use of additional parameters to represent the maneuver which helps in
identifying the acceleration in a shorter period of time. Consequently, there is no
need for a propagation matrix as required by Bogler's method.
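The prediction correction of Eq. (5.7) and the filtered estimate of Eq. (5.8)
can be sketched for a single coordinate as follows. This is only an
illustration under stated assumptions, not the implementation used in this
work: the function names are hypothetical, the measurement is taken to be
position only (H = [1 0]), the gain is passed in as a precomputed pair, and the
velocity update assumes the acceleration is constant over one sampling period.

```python
def predict_with_nn(x_prev, u_nn, T):
    """Eq. (5.7) for one coordinate.

    x_prev = [position, velocity] filtered at step k-1,
    u_nn   = acceleration estimate supplied by the network (assumed given).
    """
    pos, vel = x_prev
    return [pos + vel * T + 0.5 * T * T * u_nn, vel + T * u_nn]


def update(x_pred, z, K):
    """Eq. (5.8) with H = [1, 0]: z is a position measurement,
    K = [k_pos, k_vel] is a gain assumed to be computed elsewhere."""
    innovation = z - x_pred[0]
    return [x_pred[0] + K[0] * innovation, x_pred[1] + K[1] * innovation]
```

With x_prev = [0, 10], u_nn = 2 and T = 1 the corrected prediction is
[11, 12]; a measurement z = 12 with the illustrative gain [0.5, 0.25] then
gives the filtered estimate [11.5, 12.25].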
5.3. The First Input Parameter
In effect, the neural network replaces a bank of N parallel filters which would
have been required otherwise. Instead of matching each filter to a separate quantity
of the residual information (i.e., innovation), we let the neural network learn
the nonlinear relationship that exists between the acceleration and the residual
information. As mentioned earlier, the idea of using a bank of parallel filters was
first introduced by Magill [41,94] (Fig. 5.1). He developed an expression for the
posterior probability that the nth Kalman filter is the correct one to use.
The first input parameter that we use is the innovation term, which is multiplied
by the Kalman filter gain to smooth out the predicted estimate, that is
$$\nu(k) = z(k) - H\,\hat{x}(k|k-1). \tag{5.12}$$
This expression is representative of the actual newness of information about the
position measurements. The change in residual information is sensitive to changes
in velocity and therefore different initial velocities should be included in the training
set (which will be discussed in greater detail in a later section). In terms of the
duration of the acceleration, however, each estimate is updated after a sampling
period. In this method, the target maneuver is jointly detected and corrected in
a single continuous operation without much computational complexity. The other
methods in the literature that have the same feature are computationally involved
and have long delays due to the sequential nature of processing.
The changes in position innovation were normalized with respect to the
covariance of innovation S(k), which is
$$S(k) = H\,P(k|k-1)\,H^T + R \tag{5.13}$$
where R is the measurement covariance matrix. The components of innovation
were normalized separately and their summation was used as the input parameter
which is given by
(5.14)
where the terms $S_{11}(k)$ and $S_{22}(k)$ are the diagonal components of the covariance
matrix S(k). Note that we have combined the two components to keep the input
feature set minimal. We could use $\bar{\nu}_x$ and $\bar{\nu}_y$ (i.e., the normalized $\nu_x$ and $\nu_y$
components) independently as input features together with some higher level relations
among them such as $\nu_x/\nu_y$ or $\nu_x \nu_y$. For the time being, we want to show that the
neural network produces satisfactory results if appropriate features of the maneuver
are presented as inputs. Features should be descriptive of the maneuver class and
be focused on the incremental changes rather than the exact instantaneous values.
Reducing the features to represent incremental changes will result in much less
training effort.
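A minimal sketch of the combined input feature is given below. The text states
only that the innovation components are normalized separately and then summed;
dividing each component by the square root of the corresponding diagonal entry
of S(k) is an assumption of this example, as are the function name and the
numerical values.

```python
import math


def normalized_innovation(nu_x, nu_y, s11, s22):
    """Sum of separately normalized innovation components.

    Assumption: each component is scaled by the standard deviation taken
    from the matching diagonal entry of the innovation covariance S(k).
    """
    if s11 <= 0.0 or s22 <= 0.0:
        raise ValueError("diagonal entries of S(k) must be positive")
    return nu_x / math.sqrt(s11) + nu_y / math.sqrt(s22)
```

For innovations of 6 and -4 with variances 9 and 16, the two normalized
components are 2 and -1, so the feature value is 1.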
5.3.1. Statistical Properties of the Innovation Process
Let the vector $\hat{z}(k|k-1)$ denote the minimum mean-square estimate of the
observed data z(k) at time k, given all the past values of the observed data up to
time k - 1. This is actually the Kalman filter estimate. As mentioned before, the
innovation process associated with z(k) is defined as
$$\nu(k|k-1) = z(k) - \hat{z}(k|k-1), \qquad k = 1, 2, \ldots \tag{5.15}$$
where the vector $\nu(k|k-1)$ represents the new information in the observed data
z(k). The innovation process has the following properties:
1) The innovation process $\nu(k|k-1)$ corresponding to the observed data z(k)
at time k is orthogonal to all the past observations z(1), z(2), ..., z(k-1),
i.e.,
$$E[\nu(k)\,z^T(j)] = 0, \qquad j = 1, 2, \ldots, (k-1).$$
2) The innovation process consists of a sequence of vector random variables
that are orthogonal to each other, as shown by
$$E[\nu(k)\,\nu^T(j)] = 0, \qquad j = 1, 2, \ldots, (k-1).$$
3) There is a one-to-one correspondence between the sequence of vector random
variables z(1), z(2), ..., z(k) representing the observed data and the
sequence of vector random variables $\nu(1), \nu(2), \ldots, \nu(k)$ representing the
innovation process, and therefore one sequence may be obtained from the
other by means of a linear transformation without loss of information. This
can be stated as
$$\{z(1), z(2), \ldots, z(k)\} \Leftrightarrow \{\nu(1), \nu(2), \ldots, \nu(k)\}. \tag{5.16}$$
5.3.2. Estimation of States using the Innovation Process
In the classical approaches, the state estimates may be expressed in the form
of a linear combination of the sequence of innovation process v(l), v(2), ... , v( k).
In previous sections we argued that these procedures are generally slow and require
many samples of innovation sequence whereas when a maneuver takes place it has
to be detected and compensated within a minimum number of sampling intervals.
There is no procedure in the current literature that uses parameters other than $\nu(k)$
in the estimation process without adding more dimensions to the state vector of the
Kalman filter. This is primarily due to the nonlinear relationship between these
parameters. If we call these additional parameters additional feature vectors
that identify the maneuver, then estimation with an augmented state vector
through the Kalman filter may seem appealing. However, the Kalman filter, as a linear
minimum mean square estimator, will fail to converge when there is a bias in any of
these parameters since the orthogonality assumption fails. In addition, due
to the extra computation, adding more states will cause additional lag
time in reaching the steady state. In the next section we shall discuss the process
of bias detection and show how the bias induced by a maneuver may be detected
with the neural network. However, before we discuss that, let us relate the neural
network method that we have used with a linear estimation technique using the
innovation process.
Consider the state u(k) and its estimate $\hat{u}(j|Z_k)$, where j is the time index
for the state and $Z_k$ represents the observation sequence up to time k. Then the
estimation using the innovation process is expressed as a linear combination of the
samples in the sequence $\nu(k)$ in the form
$$\hat{u}(j|Z_k) = \sum_{i=1}^{k} B_j(i)\,\nu(i). \tag{5.17}$$
The set $\{B_j(i)\}$ represents the matrix sequence that has to be determined. According
to the principle of orthogonality, the predicted state-error vector should
be orthogonal to the innovation process.
The neural network architecture can play an important role in the orthogonalization
of the sequence. Classical techniques use parallel filter banks to find the
filter that gives the minimum residual error. This approach is quantitatively limited
to a finite set of discrete values of $\nu(m)$. With the use of a neural network, however,
we extend the quantity $\nu(m)$ over its magnitude rather than the time index. That
is, we fix $m = k_0$ (i.e., the instant that the maneuver is detected) and quantize $\nu_q(k_0)$,
with q denoting the number of quantizations desired, which depends on the training
effort. An interesting property now is the interpolation capability afforded by
neural network architectures, which extends the power of a neural network
implementation beyond that possible from ordinary parallel architectures. By appropriate
generation of examples of the innovation sequence, the training process reduces to
minimizing the average cumulative error, which is $E\{[u(j) - \hat{u}(j|Z_k)]\,\nu_q(k_0)\}$ in the
linear estimation case.
5.4. Optimum Bias Detection
In the previous section we discussed the statistical properties of the innovation
sequence and it was mentioned that the innovation sequence (or residual
sequence) that is used in the Kalman filter should be a white noise process with
zero mean. In the presence of a maneuver or any other consistent interference, there
will be a nonzero mean in this sequence. If this mean is not removed from the
filter, it will propagate through the filter parameters and the covariance matrices
and result in a deviation from the true track. The purpose of the adaptive scheme
is to detect this bias as soon as it appears and to generate appropriate signals to
compensate for the bias. Both the maneuver and the unwanted clutter contribute
to the bias. The adaptive scheme should have the following properties
1) It should detect the bias as early as possible. This is called the maneuver
detection [99] process.
2) The performance of the maneuver detector should not be affected by clutter.
Clutter data can introduce an additional bias term which may result in a
false maneuver detection.
3) Correction for the bias should take place before the next target return is
received by the radar.
According to McAuly [42], the innovation sequence can be modeled as
$$\nu(k) = \nu_0(k) + m(k - k_0) \tag{5.18}$$
where $\nu_0(k)$ is a zero-mean, white noise process with variance $N_0$. The second
term in the above equation is due to the bias which has occurred at time $k \geq k_0$.
We assume that the maneuver corresponds to the introduction of a constant
acceleration in the interval of estimation (i.e., from the time the bias is detected back
to the time the maneuver was first started). This bias will then manifest itself as a
quadratic function of $(k - k_0)$ in the measurement sequence since the position and
acceleration are related as such. Therefore, the bias term could be approximated
as
$$m(k - k_0) \approx \mu\,[(k - k_0)\,T]^2, \qquad k \geq k_0 \tag{5.19}$$
where T is the sampling time, $k_0 T$ denotes the unknown time at which the maneuver
was initiated, and $\mu$ is a parameter related to the magnitude of the acceleration
or the nonlinearity of the model. We can describe [42] the process of maneuver
detection as follows:
Detection of a maneuver is equivalent to detecting the presence of a
deterministic signal of unknown amplitude and time of arrival in a
background of zero-mean white noise.
Assuming that the measurement noise is Gaussian, the generalized likelihood
ratio test will be to declare the presence of a maneuver if
$$L(k) = \sum_{n=0}^{k} \nu^2(n) - \min_{\mu} \sum_{n=0}^{k} [\,\nu(n) - \mu S(n - k_0)\,]^2 \geq \lambda, \tag{5.20}$$
where
$$S(i) = (iT)^2\,u(i),$$
and
$$u(i) = \begin{cases} 0, & i < 0; \\ 1, & i \geq 0. \end{cases}$$
Now, define the functions
$$E(k, k_0) = \sum_{n=0}^{k} S^2(n - k_0)$$
$$\rho(k, k_0) = \sum_{n=0}^{k} \nu(n)\,S(n - k_0),$$
and use in equation (5.20) to give the likelihood ratio as
$$L(k) = \max_{\mu}\,[\,2\mu\,\rho(k, k_0) - \mu^2 E(k, k_0)\,].$$
Since $k_0$ (the actual starting time of the maneuver) is unknown, many different
discrete values are assumed and the corresponding bias terms are calculated
as
$$\hat{\mu}(k_0) = \frac{\rho(k, k_0)}{E(k, k_0)},$$
which reduces the likelihood ratio to
$$L(k) = \frac{\rho^2(k, k_0)}{E(k, k_0)}.$$
Given that we are currently at time kT, and that we are testing for a bias which
may have been initiated at any time (k - 1)T, (k - 2)T, (k - 3)T, ..., there is only
a finite number of the past measurements that need to be processed. This yields
$$k_0 = k - j, \qquad j = 1, 2, \ldots, M$$
where M is the number of the past sampling times used for estimating the magnitude
of the maneuver. The likelihood ratio now becomes
$$L(k) = \max_{j = 1, 2, \ldots, M} \frac{\rho^2(k, k - j)}{E(k, k - j)}$$
where
$$\rho(k, k - j) = \sum_{n=0}^{k} \nu(n)\,S(n - k + j),$$
and
$$E(k, k - j) = \sum_{n=0}^{k} S^2(n - k + j).$$
However, S(n) = 0 for n < 0 and therefore
$$E(j) = \sum_{n=0}^{j} S^2(n).$$
Now, we define a bank of M filters where each filter has an impulse response
$$h_j(i) = \frac{S(j - i)\,u(i)}{\sqrt{E(j)}}, \qquad j = 1, 2, \ldots, M$$
then
$$\rho(k, k - j) = \sqrt{E(j)} \sum_{n=0}^{k} \nu(n)\,h_j(k - n).$$
The time of initiation of the maneuver can be estimated as shown in [42] based on the
following decision, where the maximizing index j gives the estimated time jT:
$$\max_{j = 1, \ldots, M} \left[\, \sum_{n=0}^{k} \nu(n)\,h_j(k - n) \right]^2 \geq \lambda. \tag{5.22}$$
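The matched-filter test above can be sketched in a few lines of Python. The
fragment below evaluates the ratio of the squared correlation to the template
energy for each candidate start time k - j and returns the largest value
together with the maximizing j. It is a minimal illustration under stated
assumptions: the function names are hypothetical, the residuals are scalar, and
the comparison against the threshold is left to the caller.

```python
def S(i, T):
    """Quadratic bias shape S(i) = (i*T)^2 for i >= 0, and 0 otherwise."""
    return (i * T) ** 2 if i >= 0 else 0.0


def glr_statistic(nu, M, T):
    """Return (L, j_hat) for scalar residuals nu[0..k].

    L is the maximum over j = 1..M of rho(k, k-j)^2 / E(k, k-j);
    j_hat is the maximizing j (maneuver assumed to start at step k - j).
    """
    k = len(nu) - 1
    best_L, best_j = 0.0, None
    for j in range(1, M + 1):
        template = [S(n - k + j, T) for n in range(k + 1)]
        rho = sum(v * s for v, s in zip(nu, template))
        energy = sum(s * s for s in template)
        if energy > 0.0:
            L = rho * rho / energy
            if best_j is None or L > best_L:
                best_L, best_j = L, j
    return best_L, best_j
```

For a noise-free residual sequence that follows the quadratic bias exactly
from step 7 of 10 (unit bias magnitude, T = 1), the statistic peaks at j = 3,
as expected from the Cauchy-Schwarz inequality.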
To summarize, the residual sequence $\nu(n)$ generated by the Kalman filter is passed
through a bank of filters, each of which is matched to a time jT. The magnitude of the
quadratic bias function is then compared against a fixed threshold. If the threshold
is exceeded, a maneuver is declared. When the maneuver is declared to have taken
place at time jT in the past, the bias compensation process is started based on the
information provided by the maneuver detector (i.e., magnitude of the acceleration
as well as the time of its occurrence). A block diagram representation of the optimal
bias detection is shown in Fig. 5.2. The neural network bias detection scheme that
we have used follows the same path except for the following differences.
1) Instead of quantizing the time of occurrence of the maneuver to M different
values, we quantize the threshold $\lambda$ to a finite set of values $\{\lambda_q\}$.
2) We maximize the likelihood ratio L(k) with respect to $\lambda_q$. Therefore, the
problem reduces to the following:
    Assuming that the maneuver started at time k - 1, what should the
    magnitude of the input acceleration be so as to come up with the threshold
    of $\lambda_q$?
Therefore, instead of fixing the threshold, we quantize it to N discrete values
$\{\lambda_q;\ q = 1, \ldots, N\}$. This will change the likelihood ratio to
$$L_{nn}(k) = \max_{q} \left\{ \frac{\rho_q^2(k, k - 1)}{E_q(k, k - 1)}, \quad q = 1, \ldots, N \right\} \tag{5.23}$$
where the subscript nn refers to the neural network likelihood ratio. The problem
is now reduced to a single nonlinear filter (i.e., neural network) which is matched
to the computed bias in one sampling time. As can be seen in Fig. 5.3a, the bank
of N parallel filters is replaced with a single neural network. Note that $\nu(k)$ refers
to the innovation vector with the two components $\nu_1(k)$ and $\nu_2(k)$, which are the
first and second input parameters (i.e., the position and velocity innovation). Fig.
5.3b illustrates the inputs and outputs of the neural network. There are several
advantages to this scheme which are summarized below.
1) Several parallel filters are replaced by one nonlinear system, the neural
network. The on-line adaptation of this scheme should be emphasized. Since
the training is performed off-line, the response is calculated almost
instantaneously.
2) It is important to avoid a false detection of a maneuver, particularly when
clutter is present. A clutter sample may falsely indicate the new position
of the target, which may look like a sudden change in the position measurements.
Using a large number of quantized levels N for the threshold settings
reduces the chances of losing track in the cases when a maneuver is falsely
detected.
3) The ratio of the filter time constant $\tau$ and sampling time T (i.e., $\tau/T$) is
not critical for this design.
It may appear that as N increases, more adaptivity is gained. We found
this not to be true. This is due to the fact that as the acceleration reduces in
magnitude, clutter will be the major factor in introducing the bias in the residual
sequence, which may result in a false detection of a maneuver. Therefore, the
resolution is chosen such that the maximum error equals the standard deviation
of the measurements (i.e., $\sigma_R$ of the radar sensor). Given the expected range of
accelerations (e.g., 0 - 20 m/s²), we quantized this range into step sizes that
correspond to $\sigma_R$. For example, assuming a maximum target acceleration of 20 m/s²
and a maximum scan time of 10 seconds, the required precision for the neural network
estimate should be within 1 m/s² in order to keep the position error below
the resolution of the radar. This error in the acceleration estimate corresponds to
100 meters in the position per scan period. The sampling time of 10 seconds is a
nominal value and the neural network response is based on the assumption that
the actual sampling period (i.e., radar scan period) is less than 10 seconds.
5.5. The Second Input Parameter
In the previous sections it was noted that in order to capture the maneuver
as soon as it occurs, additional input parameters are needed. One may note that in
most of the current algorithms, the two processes of detection and compensation are
performed in two distinct phases. In contrast, with a neural network-based scheme,
we do not need to separate the two processes. The way to accomplish this is to
find appropriate additional input parameters such that they are generated through
independent processes and contain useful information about the maneuver. We
suggest using independent parameters for the following reasons.
1) We want to keep the filter simple for a high speed of response. Therefore,
we use the neural network as an adjunct device to reduce the load off of the
Kalman filter. Without such a strategy, each individual feature parameter
has to be defined as a separate state in the overall state vector increasing
the complexity of the Kalman filter.
2) It is better to use different estimators for the parameters instead of including
all in one state vector of the Kalman filter. This will keep the errors in the
estimation process independent from each other particularly in the presence
of a bias.
3) As mentioned before, it will be difficult to compensate for the bias due to
the coupling effects of the errors in the components of the acceleration. We
suggested earlier that there should be at least three parameters that contribute
to the actual estimation of the acceleration input. These parameters
should at least contain the following information about the target maneuver:
a) the intensity of the acceleration, b) a sense of the direction of the
acceleration, and c) the clutter visibility.
As mentioned at the beginning of this chapter, we are using the neural network
scheme for tracking the longitudinal acceleration. However, we chose to take
a general approach for training in order to be able to extend the training process
for a circular maneuver as well. Therefore, we still need estimates that identify
the heading of the target in the presence of noise and clutter. The estimation of
the heading from noisy position measurements is in itself a problem of considerable
practical importance. The common approach to this problem is to measure
the x and y positions simultaneously. If one could assume a constant velocity, the
heading estimate could be evaluated as the ratio $\dot{y}/\dot{x}$, where $\dot{x}$ and $\dot{y}$ are derived
from a least squares estimator. However, the problem with this approach is that it
assumes constant velocity, which is not the case in our problem, where the speed
profile is totally unknown. This estimator is very sensitive to changes in speed and
in a noisy environment (particularly clutter) it gives highly erroneous results. This
is due to the fact that the heading is mainly due to some correlation between the
(x, y) measurements in the cartesian coordinates. The use of $\dot{x}$ and $\dot{y}$ causes rapid
changes in the heading. Also, it should be noted that the use of $\dot{x}$ and $\dot{y}$ from the
Kalman filter output is not highly appropriate because they are already corrupted
by the bias. Therefore, a good estimate is one that uses a line fit through all the
measurements.
The importance of the heading stems from the fact that its first derivative
is the angular velocity, which can be used instead of the heading to describe the
motion of a turning aircraft. It is well known that tracking systems using heading
estimates exhibit a very stable performance in the event of long sampling intervals.
5.5.1. Formulation of the Heading Estimate
Methods for tracking of targets with constant heading and variable speed
in a fixed direction fall into two categories. The first category refers to those with
prior assumptions about the target speed profile whereas the other category consists
of those that do not make such assumptions. However, in both cases maximum
likelihood estimators are used with different sets of assumptions. For a precise
discussion of these estimators, let us briefly state the problem of interest.
Given the measurements $(x_m(t), y_m(t))$ at times $t = t_i$, $i = 1, \ldots, N$, we
are looking for the maximum likelihood estimate of the target heading H(t). The
constant-heading trajectory satisfies the conditions
$$\dot{H}(t) = 0$$
$$H(0) = H_0.$$
The measurements in cartesian coordinates are
$$x_m(t) = x(t) + n_x(t), \qquad y_m(t) = y(t) + n_y(t)$$
where the measurement noises $n_x$ and $n_y$ are Gaussian random variables with zero
mean and variances $\sigma_x^2$ and $\sigma_y^2$. Let us define
$$M = \frac{y(t) - y_0}{x(t) - x_0} = \tan H_0 \tag{5.24}$$
where $(x_0, y_0)$ is an unknown coordinate at t = 0. M is called the heading, and it
is this quantity that we wish to estimate in order to use it as
our second input parameter to the neural network.
There are a few approaches in the literature for obtaining this estimate [39]
of which we discuss the two most related to our application. These two methods
are described as cases (a) and (b) as follows.
Case (a): $\sigma_x^2$ and $\sigma_y^2$ are assumed unknown. In this case, the method of least
triangles, which is given in [39], provides the heading estimate as
$$M_{LT} = \mu \sqrt{\frac{\sum_{i=1}^{N} (y_m(i) - \bar{y})^2}{\sum_{i=1}^{N} (x_m(i) - \bar{x})^2}} \tag{5.25}$$
where
$$\mu = \operatorname{sgn} \sum_{i=1}^{N} (y_m(i) - \bar{y})(x_m(i) - \bar{x}).$$
Case (b): Both $\sigma_x^2$ and $\sigma_y^2$ are known. In this case the maximum likelihood
solution (MLE), given in [39], may be used.
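The least-triangles estimate of case (a) can be sketched as follows. The
fragment is a minimal illustration: the function name is hypothetical, the
barred quantities are taken to be the sample means of the measurements, and
the slope is computed as the signed square root of the ratio of the summed
squared deviations, which is the usual form of this line fit and is an
assumption of the example.

```python
import math


def heading_least_triangles(xs, ys):
    """Signed slope M_LT from position measurements (xs[i], ys[i]).

    Assumptions: x_bar and y_bar are sample means; the magnitude is the
    square root of the ratio of summed squared deviations.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0.0:
        raise ValueError("x measurements are identical; slope is undefined")
    sign = (sxy > 0) - (sxy < 0)  # sgn of the cross-deviation sum
    return sign * math.sqrt(syy / sxx)
```

Three collinear points with slope 2, such as (0, 1), (1, 3), (2, 5), return
2.0; reversing the order of the y values returns -2.0.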
It should be noted that the Kalman filter computes $\dot{x}$ and $\dot{y}$, and a value for
the heading estimate can be determined by $M = \dot{y}/\dot{x}$. We have already discussed
some of the deficiencies of the $\dot{y}/\dot{x}$ estimate; specifically, we do not wish to use
the Kalman estimates because once the maneuver is initiated these estimates are
biased. Also, in order to eliminate the need for any a priori assumptions, we rule
out the MLE estimate, since it requires the knowledge of $\sigma_x^2$ and $\sigma_y^2$, which are
based on the assumption of the noise being Gaussian in the measurements. We do
this mainly because we are dealing with clutter data which may have unknown
statistics inside the validation gate. Furthermore, we limit N (i.e., the number of
the data points to be used in the estimation of the heading) to three scan periods
since we want to use the change in heading as an input parameter as opposed to the
heading itself. Note that in each scan period we may have several measurements.
This feature is more general in the sense that lateral accelerations can be included
in the training set as well. However, for the time being we limit the training to the
longitudinal acceleration only.
In the computation of the heading estimate by MLT in equation (5.25), we
use the set of data given by
(5.26)
That is, we use the filtered estimates for the time step k - 2. This approach will
make use of the Bayesian association and puts more emphasis on the correction
from the time k -1 to time k.
Angular velocity measurements can be obtained from the difference between
the new measurement and the last valid heading estimate. Determination of an
acceptable threshold for the turn rate is a critical issue here. The usual practice
in the current literature is to set the threshold based on a rough estimate of the
velocity. This is due to the relation $\omega = a_n / v$, where $\omega$ is the turn rate, v is the
tangential velocity, and $a_n$ is the lateral acceleration. If a threshold of 0.0025
rad/sec is set for the turn rate of a straight line trajectory, a large combination of $(a_n, v)$
may satisfy this relation with a nontrivial lateral acceleration. A constant angular
velocity threshold corresponds to different lateral acceleration thresholds for an
aircraft flying at different velocities. That is, a target heading may be changing
slowly such that, even though the turn rate is small, the target may be deviating
from the straight line. In contrast, in our approach, examples are generated by
assuming different speed profiles. We provide the change in heading with two other
parameters, one of which has already been discussed in detail. In other words,
the position innovation together with the other parameter (i.e., doppler change)
provides a measure of change in the velocity. We then let the neural network learn
the different quantization levels and associate each turn rate to the expected change
in velocity.
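To make the dependence of a fixed turn-rate threshold on speed concrete, the
relation between turn rate, tangential velocity and lateral acceleration can be
evaluated directly. The function name and the speeds used below are
hypothetical; only the 0.0025 rad/sec threshold is taken from the text.

```python
def lateral_acceleration(turn_rate, speed):
    """Lateral acceleration a_n = omega * v implied by a turn rate (rad/s)
    at a given tangential speed (m/s)."""
    return turn_rate * speed


# Illustrative speeds only: the same threshold maps to different
# lateral accelerations depending on how fast the target is flying.
THRESHOLD = 0.0025  # rad/s, value quoted in the text
for speed in (100.0, 200.0, 400.0):
    a_n = lateral_acceleration(THRESHOLD, speed)
    print(f"v = {speed:5.0f} m/s -> a_n = {a_n:.3f} m/s^2")
```

At 100 m/s the threshold corresponds to 0.25 m/s², while at 400 m/s it
corresponds to 1 m/s², which is why a single turn-rate threshold cannot
serve all speed profiles.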
This adaptive thresholding for the maneuver is helpful in extending the
training to the lateral acceleration as well. However, we have limited the training
to straight line trajectories only. The plant noise (i.e., the acceleration input) to
be estimated by the neural network has to be in conjunction with these maneuver
detection thresholds so that an aircraft executing slow maneuvers with angular
velocity under the threshold can be tracked with the straight line assumption. We
assume that the aircraft is subject to random zero-mean accelerations uncorrelated
from sample to sample and constant during each sampling interval. This is in
contrast to the assumptions made by Singer [40], since he assumes some correlation
among the samples. However, even though this is true in most practical cases, it
results in a slow adaptation since the amount of correlation is not known until
several samples are processed. Therefore, fast maneuvers cannot be corrected within
a sampling period.
5.5.2. State Equations and the Heading Estimate
In the previous section we discussed the importance of the heading estimate
in modeling the target maneuver. We shall now discuss the effect of including the
heading estimate more precisely in regard to the modeling of longitudinal
acceleration. Heading assisted maneuver tracking has been investigated in the literature
and has been shown to have some advantages [39]. It can tolerate longer sampling
times (i.e., fewer samples) as well as maneuvers of irregular shapes. These
properties confirm our belief that the heading estimate is an appropriate
parameter for our purpose and contains valuable information about the target
while it is performing a maneuver. In contrast with other heading assisted tracking
filters, which employ the estimate obtained from the output of the Kalman filter
(which could be biased), we will use the heading estimate $M_{LT}$ obtained from the
method of least triangles as described earlier.
Another reason motivating the use of $M_{LT}$ comes from the following observation.
Note that one other possible method is to include the angular velocity $\omega_k$
in the overall state vector of the filter and use $\omega_k T$ for the change in heading. The
state equations will then be represented by
$$\dot{x} = v_x + \text{noise}$$
$$\dot{v}_x = \omega v_y + \text{noise}$$
$$\dot{y} = v_y + \text{noise}$$
$$\dot{v}_y = -\omega v_x + \text{noise}$$
$$\dot{\omega} = 0 + \text{noise}$$
and the measurement equation becomes
$$\begin{bmatrix} x_m \\ y_m \\ \omega_m \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ v_x \\ y \\ v_y \\ \omega \end{bmatrix} + \text{noise}.$$
These equations represent a nonlinear system and require the use of an extended
Kalman filter. The needed manipulation of 5 x 5 matrices for implementing this
algorithm is unacceptable in real-time tracking. This once again confirms the
efficacy of using the $M_{LT}$ estimate.
We calculate the change in the target heading by
(5.27)
where MLT is given by equation (5.25). It must be particularly emphasized that the
angular velocity state is no longer needed in the state equations, thereby reducing
the burden on the Kalman filter. Thus, there are two main advantages in using this
parameter (i.e., change in heading estimate) other than longer sampling period, viz.
the state equations are simpler and errors in the input parameters to the neural
network will be independent with this approach.
5.6. The Third Input Parameter
We now investigate the role and significance of the third input parameter in
the training. In the last section it was pointed out that there are three fundamental
issues that need to be considered in obtaining an acceleration estimate. These are:
1) the intensity of acceleration,
2) direction of tangential velocity,
3) initial velocity at the time of acceleration.
Each of these factors can be significantly affected by the presence of clutter. The
intensity and direction have already been discussed in detail. These parameters
should meet the following objectives:
1) Maneuver has to be detected as soon as possible (i.e., within two to three
scan periods).
2) At least one of the parameters must reflect the needed information about
the changes in velocity. Since doppler shift can be measured within each
scan period, this can provide information about the target preparation for
a maneuver and hence can aid in an early detection of the maneuver.
Sensitivity of this parameter, however, is not without a cost, since more false
detections of maneuvers may occur. This explains why we may need all of
these parameters so that together they can reliably identify the maneuver.
For example, the velocity innovation is more sensitive to the starting and
ending times of the maneuver while the position innovation is more sensitive
to the intensity of the acceleration components. If we use the velocity
innovation by itself as an indication of the maneuver, it may cause a false
detection of the maneuver, since a slight change in the velocity, if not
coupled with information on the intensity level, might indicate falsely that a
maneuver is initiated.
3) Upon detection of the maneuver, we need to quantize the intensity of the
maneuver which is mainly the role of the first parameter.
5.6.1. Velocity Innovation Parameter
As described in earlier chapters, the echo from a moving target produces a
shift in the radar carrier frequency, which is the doppler effect. Doppler shift takes
place when the wave radiated from a point source is compressed in the direction of
motion or when it is spread out in the opposite direction and is directly related to
the target radial velocity through the relation
\[ f_d(k) = f_0(k) - \hat{f}_0(k|k-1) \tag{5.28} \]
where $f_d$ is the doppler shift in frequency and $f_0$ is the center frequency of the
transmitted wave. Since each radar pulse is a modulated wave of some kind (e.g.,
sinusoidal), the shift in frequency could be measured on a pulse-to-pulse basis.
Therefore, one can take an average of these changes for each pulse sequence per
scan, which is the task of an MTI processor (whose design was discussed in Chapter
4).
As shown in Fig. 5.4, the radial velocity $\dot{R}$ is related to the tangential
velocity $V_T$ according to
\[ \dot{R} = V_T \cos L \tag{5.29} \]
where $L$ is the angle between the line of sight and the target heading (i.e., direction
of tangential velocity). It should be noted that as this angle approaches 90°,
the doppler shift will vanish and no measure of target velocity will be available.
Furthermore, the target velocity is scaled by $V_T = \dot{R}/\cos L$ and hence unless the
value of angle $L$ is somehow provided with the measurement of $\dot{R}$, there will be
no useful information in providing the velocity innovation. Using the heading
estimate $M$ (obtained by MLT as discussed earlier) the expression for the velocity
will become $V_T = \dot{R}/\cos L$ in which $L = \theta + \phi$, where angles $\theta$ and $\phi$ are given by
\[ \theta = \tan^{-1}\left(\frac{y}{x}\right), \qquad \phi = \tan^{-1}(M). \]
Using these equations, $V_T$ can be expressed in the form
\[ V_T = \frac{(x\dot{x} + y\dot{y})/\sqrt{x^2 + y^2}}{\cos\left[\tan^{-1}\left(\frac{y}{x}\right) + \tan^{-1}(M)\right]} \]
Thus $V_T$ can be expressed as $V_T = \Psi(x, \dot{x}, y, \dot{y}, \phi)$ where $\Psi(\cdot)$ is a nonlinear
function representing the relation of the tangential velocity to the states and the
heading estimate. The corresponding change in tangential velocity due to the
change in $x$, $\dot{x}$, $y$, $\dot{y}$ and heading is described by another nonlinear function
$\delta V_T = \Xi(\delta x, \delta\dot{x}, \delta y, \delta\dot{y}, \delta\phi)$ where
$\delta\phi = \tan^{-1} M(k) - \tan^{-1} M(k-1)$ is the angular change
in the heading estimate. One can see that, without providing the heading change
estimate (i.e., $\delta\phi$), the correct scale for $V_T$ cannot be extracted by the network.
The variance of the doppler shift is related to the variance of the range rate
which is given by [43]
where
\[ \sigma_{\dot{R}}^2 = \frac{\lambda \Delta f_s}{4\sqrt{SNR}} \]
and $\Delta f_s$ is the doppler filter bandwidth. We shall assume that doppler
measurement is available. It should be noted that we do not include it in the overall state
equations, but merely use the normalized measurement directly for generating a
neural network input. This is to ensure not only that the input parameters have
independent sources but also that the filter is kept simple. Thus, the third parameter
(neural network input) is independently measured and normalized with respect to
a worst-case clutter variation (e.g., a standard deviation of 30 m/s). We shall call this parameter
$\nu_2$, which is given by
(5.30)
where $\lambda$ is the wavelength of the transmitted wave.
5.6.2. Quantization of the Noise Process
For a precise description of the quantitative approach that we are taking in
the maneuver modeling, in this section we shall discuss how a numerical quantity
is computed to scale the covariance matrix of the input process noise. Consider
the target state equations
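\[
\begin{aligned}
x(k+1) &= F x(k) + v(k) \\
z(k) &= H x(k) + w(k), \qquad k = 0, 1, 2, \ldots
\end{aligned}
\]
where $x = [x \;\; \dot{x} \;\; y \;\; \dot{y}]^T$,
\[
z = \begin{bmatrix} x_m \\ y_m \end{bmatrix}, \qquad
H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad
F = \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix}
\]
and $v(k)$ and $w(k)$ are zero-mean white noise processes with covariance matrices
$Q$ and $R$. Hence
\[ E[v(k)\, v^T(j)] = Q\,\delta(k,j) \]
and
\[ v(k) = G u(k) \]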
where $u(k)$ is the acceleration noise process, which is what we are trying to estimate
with the neural network. That is, we estimate $\hat{u}(k-1)$ first and then recompute
$\hat{x}(k|k)$ based on the fact that usually a maneuver is detected at least two to three
samples later. This dictates that we go back in time to correct for the bias. The
quantitative approach in the literature, however, does it as follows [10]. With the
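noise matrix $G$ given by
\[
G = \begin{bmatrix} \tfrac{1}{2}T^2 & 0 \\ T & 0 \\ 0 & \tfrac{1}{2}T^2 \\ 0 & T \end{bmatrix}
\]
and
\[ E[u(k)\, u(j)] = \sigma_u^2 \delta_{kj}, \]
we can write
where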
The scale factor q is the power spectrum of the process noise which is unknown.
It is important to realize that in contrast with our neural network approach, the
noise samples considered here are independent and identically distributed. The
covariance matrix scaled by q is given by
It is very difficult to select the scale factor $q$ off-line. As one can see, there
is no a priori information on $\sigma_a^2$. The quantity $q$ and its related parameters are
given by
$\tau_m \triangleq$ maneuver time constant
$\sigma_a^2 \triangleq$ variance of acceleration
It must be noted that $\tau_m$ and $a_{max}$ are unknown and are selected prior to
tracking. The situation for this approach becomes more complicated still when
the target performs a few short maneuvers with different time durations as well
as different acceleration levels. In such cases, Bogler's and Singer's methods will
result in a crude estimate of $\sigma_a$. Bogler used the Input Estimation (IE) method [72]
to estimate $\sigma_a$ (i.e., equation 2.19b). However, as we argued before, using several
samples (e.g., N = 6) for the estimation of $\sigma_a$ is appropriate only if $\tau_m \gg T$
(that is, if the duration of acceleration is much longer than one sampling period).
Typical values for $\tau_m$ are in the range of [5, 200] seconds and the sampling period
for a track-while-scan radar is in the range of [5, 10] seconds.
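As an illustration of how the scale factor enters the filter, the sketch below builds the constant-velocity matrices F, H and G for the state [x, x-dot, y, y-dot] and forms a process noise covariance as Q = q G G^T. That product is the usual piecewise-constant white-acceleration form and is an assumption here, since the scaled covariance expression itself is not legible in this copy; the function name is likewise hypothetical.

```python
def constant_velocity_model(T, q):
    """Return (F, H, G, Q) for scan period T and process noise scale q."""
    F = [[1.0, T, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, T],
         [0.0, 0.0, 0.0, 1.0]]
    H = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    G = [[0.5 * T * T, 0.0],
         [T, 0.0],
         [0.0, 0.5 * T * T],
         [0.0, T]]
    # Assumed form: Q = q * G * G^T (one acceleration input per axis).
    Q = [[q * sum(G[i][k] * G[j][k] for k in range(2)) for j in range(4)]
         for i in range(4)]
    return F, H, G, Q
```

With T = 10 s the position variance term of Q grows as T^4/4 = 2500 times q, which shows how strongly a poorly chosen q inflates or starves the filter gain at track-while-scan sampling rates.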
5.7. Generation of the Training Vectors
Based on an expected range of the target velocity we generated a series of
examples such that the longitudinal acceleration was included at different initial
velocities. The expected range of velocity was assumed to be 200 m/s to 700 m/s.
The expected range of acceleration was assumed to be zero to 20 m/s^2. Only
longitudinal acceleration was considered in the example set. It should be noted
that all three of the input parameters are incremental values, which further prepares
the neural network for a sudden change in acceleration. The examples are based on
the maximum tolerance for the tracking error per scan. That is, the incremental
change in acceleration magnitude is 1 m/s^2. Over one 10-second scan this corresponds
to a 10 m/s error in speed, which is typical of an error induced by clutter variations.
To prepare for the worst case, we assume a doppler uncertainty of 30 m/s in the velocity
innovation, which results in a 15% error for the minimum velocity. The straight line trajectories
were generated for $0 \le \alpha \le 45°$ with an angular separation of one degree. It may
be recalled from the earlier discussions that the choice of $\sigma_{\delta\phi} = 1°$ is based on the
fact that in air traffic control systems the practical heading estimate transmitted
by the beacon system has a one degree uncertainty. Also, a turn rate of 3°/sec is
typical of a slow turn, and hence, as a rule of thumb, we use 1/3 of this value for
the tolerance of the heading change estimate. The third parameter is computed
from equation (5.30).
For a better doppler sensitivity, a shorter wavelength must be used. The
millimeter wave radar has a high doppler sensitivity, e.g., 233.3 Hz/(m/s). That is,
for each 1 m/s change in the closing rate $\dot{R}$, the frequency of the echo will shift
by 233.3 Hz. This is a rather high sensitivity which can detect early changes in target
radial velocity. Therefore, one can see how important this factor is in the pattern
recognition of a maneuver.
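The quoted sensitivity can be checked with the standard two-way doppler relation, in which the shift equals twice the radial velocity divided by the wavelength. The short sketch below assumes that relation and the wavelength of 8.57 x 10^-3 m used for the simulations; the function name is illustrative.

```python
def doppler_shift_hz(radial_velocity_mps, wavelength_m):
    """Two-way doppler shift of a monostatic radar echo, in Hz."""
    return 2.0 * radial_velocity_mps / wavelength_m


WAVELENGTH_M = 8.57e-3  # wavelength used in the simulations

# Shift produced by a 1 m/s change in closing rate.
sensitivity = doppler_shift_hz(1.0, WAVELENGTH_M)
print(f"{sensitivity:.1f} Hz per m/s")
```

Under the same relation, the assumed worst-case doppler uncertainty of 30 m/s would correspond to a frequency spread of roughly 7 kHz at this wavelength.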
In the simulation results, we will demonstrate the performance of the proposed
neural network method in comparison with classical techniques that make
use of doppler returns and we will point out some interesting observations. For
simulation purposes, we used a wavelength which is commonly used in precision target
tracking systems, viz. $\lambda = 8.57 \times 10^{-3}$ m. The range of values for the position
innovation sequence can be approximated by the maximum size of the validation
gate [10] which corresponds to
where $n_z$ is the dimension of the measurement space (which is in this case 2) and
The parameter $\gamma$ corresponds to the 99% probability region, which is obtained from
the Chi-square distribution tables [14,91] and has the value 16.
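For a two-dimensional measurement space the Chi-square distribution has a closed-form cumulative distribution, so the probability mass enclosed by a given gate threshold can be computed directly rather than read from tables. The sketch below assumes n_z = 2 only; the function names are illustrative.

```python
import math


def gate_probability_2d(gamma):
    """Chi-square CDF with 2 degrees of freedom, evaluated at gamma."""
    return 1.0 - math.exp(-gamma / 2.0)


def gate_threshold_2d(p_gate):
    """Inverse of gate_probability_2d: threshold enclosing probability p_gate."""
    return -2.0 * math.log(1.0 - p_gate)


p_gate = gate_probability_2d(16.0)
print(f"gamma = 16 encloses probability {p_gate:.5f}")
```

A threshold of 16 therefore encloses a probability of about 0.9997, comfortably above the 99% level.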
Recall that the weighted sum of the innovation sequence has a Chi-square
distribution with the number of degrees of freedom equal to $n_z$. Also note that $\gamma$ is
a preselected value and is kept constant for most applications. The gate probability
$P_G$ is related to $\gamma$ and, as shown in [10], is given by
\[ P_G = 1 - e^{-\gamma/2} \tag{5.31} \]
where $P_G$ is the probability that the target is inside the gate. Then the
probability that the target is detected inside the gate is $P_D P_G$. The probability that
all other targets detected inside the gate are false targets (i.e., clutter returns) is
$1 - P_D P_G$. This choice for $\gamma$ (i.e., $\gamma = 16$) corresponds to a rather heavy clutter
environment. As the number of clutter returns increases in the validation gate,
the magnitude of the innovation increases and is further adjusted by the scale
factor $q_2(\lambda V_k, P_D)$ which is introduced by Bar-Shalom [10] and can be provided as a
look-up table. Therefore, the first input parameter lies somewhere in the range
$0.5 \le \nu_1(k) \le 2$. Note that the innovation sequence moves toward a smaller value
as the clutter increases; that is, less probability is assigned to each measurement originating
from the target. As each measurement falls farther away from the predicted position, the
corresponding magnitude of the innovation increases but its Bayesian probability
of having originated from the target decreases. That is
j = 1, ..., m (5.32a)
where $P_j(k)$ is the probability that the measurement $z_j$ is from the target, and
$P_0(k)$ is the probability that all other measurements are false and is given by
\[ P_0(k) = b(k) \left[ b(k) + \sum_{j=1}^{m(k)} e_j(k) \right]^{-1} \tag{5.32b} \]
where
and
(5.32c)
Recall that $m_k$ is the number of measurements that fall inside the gate (including
the false measurements from the clutter) and is given by
(5.33)
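The normalization in (5.32b) can be sketched as follows. The expression for the per-measurement probabilities is the standard probabilistic data association form (each e_j divided by the same denominator) and is assumed here, since (5.32a) and the definitions of e_j and b are not legible in this copy; the function name and the numerical values are purely illustrative.

```python
def association_probabilities(e, b):
    """Return (P0, [P1, ..., Pm]) from likelihood terms e_j and clutter term b.

    P0 follows (5.32b); the P_j use the same denominator (assumed
    standard PDAF form), so that all probabilities sum to one.
    """
    denom = b + sum(e)
    p0 = b / denom
    p = [ej / denom for ej in e]
    return p0, p


# Three validated measurements; a larger b stands for heavier clutter.
p0, p = association_probabilities([0.6, 0.3, 0.1], b=1.0)
```

Raising b, or adding further clutter returns with non-negligible e_j, lowers the probability assigned to each individual measurement, which is the effect described above.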
We computed the average magnitude of the normalized innovation in the
absence of an acceleration (i.e., constant velocity) over 100 simulation runs with
50 samples per run as
\[ \bar{e}_\nu = \left(\frac{1}{100}\right)\left(\frac{1}{50}\right) \sum_{i=1}^{100} \sum_{k=1}^{50} e_\nu^i(k) \tag{5.34a} \]
where
(5.34b)
and
(5.34c)
such that $q_2$, as mentioned above, is the correction factor for the combined
innovation. We have to use the $q_2$ scale factor since we do not have the correct
innovation in PDAF (which was discussed in Section 2.9); rather, we have the
combined weighted innovation which results from the combination of all changes
due to each data point (both from the target and the clutter) inside the gate. This
is how the extraneous data (i.e., those from clutter returns) get normalized. When
the acceleration takes place, $e_\nu$ is allowed to rise as high as 16. If the maneuver
is not corrected by the time the innovation reaches this magnitude, the model fails
to compensate for the sudden acceleration input. We preset the value of $\gamma = 16$
for the neural network estimate but for the simulation of the Input Estimation
method we use a window of N = 4 with $\gamma_1 = 10$, $\gamma_2 = 16$, $\gamma_3 = 20$, and $\gamma_4 = 50$.
The validation gate for the Input Estimation (IE) method has to be kept large
enough to make sure that the target is still within the gate for at least N
sampling periods after the occurrence of the maneuver. Corresponding to these values,
thresholds of 2.0, 2.5, and 2.7 were assigned. Note that in simulating the neural
network scheme we do not need to make these assumptions and there is only one
gate because N = 1. That is, once the maneuver is declared, the neural network
tries to compensate for the induced bias in just one sampling period. If the bias
still exists (as indicated by the parameters), it repeats the process. The ranges of
input values for all of the parameters are given below.
\[ 0.2 \le M(k) \le 1.5 \]
\[ 0.5 \le \nu_1(k) \le 2 \]
\[ 0.0 \le \nu_2(k) \le 80 \]
and the output ranges are
\[ 0.5 \le u_y \le 20. \]
Since there are three input features, in each backward pass in the application
of the backpropagation algorithm, the sensitivity of the error in acceleration is
calculated with respect to each parameter. This new error is called the scaled local
error at each processing element in the output layer and is given by
\[
e_j^{(o)} = -\frac{\partial E}{\partial I_j^{(h)}}
= -\frac{\partial E}{\partial \hat{u}_j}\,\frac{\partial \hat{u}_j}{\partial I_j^{(h)}}
= \left(u_j(k-1) - \hat{u}_j(k-1)\right) f'\!\left(I_j^{(h)}\right) \tag{5.35a}
\]
where the term $E$ represents the cumulative sum of the squared errors after each
sweep of the N training examples and
\[ f(z) = \frac{1}{1 + e^{-z}} \tag{5.35b} \]
represents the activation function for the nodes in the hidden layer. With this
function, the derivative $f'(z)$ used in equation (5.35a) becomes
\[ f'(z) = f(z)\left(1 - f(z)\right) \]
and
(5.35c)
where $I_j^{(h)}$ is the output of a node in the hidden layer, the superscripts $h$ and $h-1$
denote the hidden and input layers in that order, and $x_i^{(h-1)}$ is the input from node
$i$ of the input layer.
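The derivative identity used in (5.35a) is convenient because it reuses the activation value already computed in the forward pass. A minimal sketch follows; the function names are illustrative, and the scaled local error is written for a single node with scalar target and estimate.

```python
import math


def sigmoid(z):
    """Logistic activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    """f'(z) = f(z) * (1 - f(z)), computed from the activation itself."""
    s = sigmoid(z)
    return s * (1.0 - s)


def scaled_local_error(target, estimate, net_input):
    """(target - estimate) * f'(net_input), the form of equation (5.35a)."""
    return (target - estimate) * sigmoid_derivative(net_input)
```

The derivative peaks at 0.25 for a zero net input and decays toward zero for large inputs of either sign, so a saturated node passes back very little error.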
The training algorithm can be summarized as follows.
1) Run Kalman Filter with the first set of initial conditions.
2) Generate the first example by accelerating the target with minimum step
size (e.g., 1 m/s^2).
3) Calculate the three input parameters to the neural network (i.e., $\nu_1$, $\nu_2$, $\delta\phi$).
4) Forward pass to the neural network and calculate the error in acceleration.
5) Repeat the process until maximum error bound (e.g., 100 m) is reached.
6) Backward pass the error to adjust the neural network weights.
7) Repeat steps 1-6 until the desired number of levels of acceleration, heading,
and initial velocities are generated.
8) Calculate the residual training error standard deviation $\sigma_{nn}$ and use it to
further adjust the Kalman filter covariance matrices.
Figs. 5.5 and 5.6 show how the trained neural network is used for adaptation to
target maneuver. The threshold $c$ in Fig. 5.6 (also see Section 5.4) is set during the
training and depends on the required sensitivity of the maneuver detection scheme
(e.g., for $\sigma_R = 100$ m, $c = 1$ m/s^2).
5.8. Neural Network Architecture and the Training Data
Two different neural networks were designed: one for the estimation of the
input acceleration and the other for information reduction on the innovation
sequence, referred to here as $NN_a$ and $NN_q$. While the training of $NN_a$ is more
challenging, training of $NN_q$ is a relatively simple task. The $NN_q$ simply stores a
scale factor $q_2$, which is a function of $\lambda V_k$ and $P_D$ and serves as a look-up table.
Use of the $NN_q$ provides an efficient way of storing the values of the $q_2$ factor which
is necessary for tracking in the presence of clutter and is a normalizing parameter
in the PDAF filter [10]. In the absence of clutter, the inputs to the $NN_a$ network
do not need to be scaled by $q_2$.
There are three input nodes in the maneuver modeling neural net
architecture (i.e., $NN_a$) and one hidden layer with 14 nodes. The activation function for
the nonlinear hidden nodes was selected as $f(z) = 1/(1 + e^{-z})$, while the output
nodes were chosen as linear. The starting learning rates for the hidden and output
layers were selected as 0.003 and 0.18, respectively. We employed the Generalized
Delta learning rule with momentum for adjusting the weights. For the total of 800
training vectors, the cumulative error was only 71.13. As mentioned before,
after the training is completed, we include this residual training error of the neural
network in the covariance matrix Q of the Kalman filter as $\sigma_{nn}$ in equation (5.11).
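A minimal sketch of a network with this shape (three inputs, 14 sigmoid hidden nodes, one linear output) trained by the generalized delta rule with momentum is given below. Only the layer sizes, the activation function and the two starting learning rates come from the text; the momentum coefficient, the bias terms, the weight initialization range and the class and method names are assumptions made for illustration, and the inputs are taken to be normalized to a range of order one.

```python
import math
import random

N_IN, N_HID = 3, 14            # three inputs, one hidden layer of 14 nodes
LR_HID, LR_OUT = 0.003, 0.18   # starting learning rates quoted in the text
MOMENTUM = 0.5                 # hypothetical; the text gives no value


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ManeuverNet:
    """3-14-1 network: sigmoid hidden layer, linear output node."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is a bias term (an assumption).
        self.w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(N_IN + 1)]
                      for _ in range(N_HID)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HID + 1)]
        self.dw_hid = [[0.0] * (N_IN + 1) for _ in range(N_HID)]
        self.dw_out = [0.0] * (N_HID + 1)

    def forward(self, x):
        xb = list(x) + [1.0]
        hid = [sigmoid(sum(w * v for w, v in zip(row, xb)))
               for row in self.w_hid]
        out = sum(w * v for w, v in zip(self.w_out, hid + [1.0]))
        return out, hid

    def train_step(self, x, target):
        """One delta-rule update with momentum; returns half the squared error."""
        out, hid = self.forward(x)
        err = target - out
        xb, hb = list(x) + [1.0], hid + [1.0]
        # Hidden deltas are computed before the output weights change.
        delta_hid = [err * self.w_out[j] * hid[j] * (1.0 - hid[j])
                     for j in range(N_HID)]
        for j in range(N_HID + 1):
            self.dw_out[j] = LR_OUT * err * hb[j] + MOMENTUM * self.dw_out[j]
            self.w_out[j] += self.dw_out[j]
        for j in range(N_HID):
            for i in range(N_IN + 1):
                self.dw_hid[j][i] = (LR_HID * delta_hid[j] * xb[i]
                                     + MOMENTUM * self.dw_hid[j][i])
                self.w_hid[j][i] += self.dw_hid[j][i]
        return 0.5 * err * err
```

The much larger learning rate on the linear output layer lets the output weights settle quickly, while the small hidden-layer rate keeps the sigmoid nodes from being driven into saturation early in training.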
5.9. Performance Evaluation
In this section we demonstrate by illustrative examples that the performance
of the proposed neural network-based maneuver modeling scheme is superior to
that provided by the existing techniques, particularly in the event of a short-term
longitudinal acceleration. A short-term acceleration is one that has duration $\tau$
comparable to the sampling period $T$ (i.e., $\tau \approx T$). This is in contrast to the
conventional methods which usually require either $\tau \ll T$ or $\tau \gg T$ for giving a
good performance. When $\tau \ll T$, the acceleration is too short and a random noise
process modeling is usually adequate. Also, for the case when the maneuvering
period is much longer than the sampling period ($\tau \gg T$), a correlated noise process
such as the one generated in Singer's method [11,40] may be used to model the
maneuver and then to compensate for it. The more difficult case, which is where the
capabilities of the neural network-based maneuver modeling scheme are definitively
established, is when the acceleration is neither short enough to be considered trivial
nor long enough to be correctly modeled by purely statistical methods.
Before describing the details of the various simulation experiments
conducted, we shall give an illustration of the effects of a sudden target acceleration on the
quantities used as inputs to the neural network.
1) It is important to note the combined effect of the input parameters $\nu_1$
and $\nu_2$ in the detection of a maneuver. As an example, an acceleration of 5 m/s^2
was performed by a target at t = 40 seconds with a heading of 45° away from the
origin with an initial velocity of 300 m/s. Duration of this maneuver was 50 seconds
and the probability of false alarm was $P_{fa} = 0.00055$ with uniform clutter density.
Fig. 5.7 illustrates how the normalized velocity innovation changes according to
this acceleration. Note how $\nu_2$ (i.e., the normalized velocity innovation) responds
one sampling instant later. The top part of the curve is not flat because of the
disturbance by clutter.
2) In Fig. 5.8, another example depicting a similar situation is shown
except that the acceleration was introduced at t = 20 seconds for only one sampling
instant. Note that the first peak is generated by the true acceleration while the
second peak is due to a clutter return. Recall that a clutter return in the validation
gate will cause an increase in the combined magnitude of the normalized position
innovation $\nu_1$. This means that we definitely need another indicator that will
neutralize the second false maneuver.
3) Using $\nu_1$ and $\nu_2$ together will reduce the effect of clutter because both
$\nu_1$ and $\nu_2$ are sensitive to the true maneuver. This example shows why the combined
use of $\nu_1$ and $\nu_2$ gives the neural network a sense for the maneuver intensity. A
harsh maneuver will result in a longer duration of $\nu_2$ and a larger peak for $\nu_1$.
4) Fig. 5.9 illustrates the role of the maneuver indicators. The first
maneuver is performed at t = 50 seconds and lasts 10 seconds which is rather short (i.e.,
$\tau/T = 1$). Recall that we use a sampling period of 10 seconds which is typical for
track-while-scan radar systems. The second maneuver starts one sampling period
later, i.e., at t = 60 seconds, and lasts 30 seconds. Note that the $\nu_1$ parameter is
not responsive to the second maneuver until one to two scans later. Therefore, it
is the role of the second parameter (heading estimate) to provide a confirmation of
target direction and to declare that this sharp change in the position is not due to
a change of heading. With the present method, the maneuver will still be captured
one scan later and it will be compensated.
Experiment # 1
The first experiment involves a short duration acceleration which is small
in magnitude. For the purpose of comparison of the performance of the present
approach with that from an existing one, a sampling window of length N = 2 was
used for the Input Estimation method proposed by Bogler [72]. The target is
assumed to follow a straight path from the initial position of (100 m, 100 m) with
respect to the radar with an initial speed of 250 m/s and radially moves away from
the radar with a heading of 45°. The scan period was assumed to be T = 10
seconds and there was no clutter in this experiment. The standard deviation of the
measurement error was assumed to be dependent on the range with a maximum
of 100 meters (which is typical for radar measurements). Therefore, as the target
moves away from the radar, $\sigma_R$ increases to a maximum of 100 meters. For this case,
we considered five different values as the target moved away from the radar (e.g.,
5 m, 20 m, 40 m, 60 m, and 100 m). The azimuth standard deviation was assumed
to be $\sigma_\theta = 0.003$ radian. The size of the correlation gate was set to be $\gamma = 16$ for
the neural network. For the IE method, $\gamma_1 = 16$ and $\gamma_2 = 20$ were used for two
consecutive scans.
An acceleration of 5 m/s^2 was initiated at t = 40 seconds and lasted for 10
seconds which is one sampling period only. The simulation was averaged over 100
runs. A summary of the main input data for each simulation run is given in Table
5-1-1. Table 5-1-2 summarizes the results of the track statistics. The mean filtering
errors of target position and velocity for the neural network-based method (NN)
and for the IE method are shown in Figs. 5.10a-5.10c. The overall probability
of detection of target along the path was assumed to be 100%. It may be noted
that the shorter track life * for the IE method is due to a small sampling window
of length N = 2 (i.e., two sets of measurement data). Also, the velocity error is
relatively very high for the IE method. This seems to be due to the short duration
of acceleration as discussed earlier. Note that we defined a track life to be complete
if the filter converged. Otherwise the track life was assumed incomplete as long as
the filter error was less than 250 meters. Other definitions may be used for both
methods.
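The noise-free target path of this experiment can be sketched as follows: a straight 45° track from (100 m, 100 m) at 250 m/s, sampled every 10 seconds, with a longitudinal acceleration of 5 m/s^2 applied for one scan starting at t = 40 seconds. The function name and argument list are illustrative, and the acceleration is assumed constant over the scan in which it is active; measurement noise is not modeled.

```python
import math


def simulate_track(x0, y0, speed, heading_deg, T, n_scans,
                   accel, t_start, duration):
    """Return a list of (t, x, y, speed) samples along a straight path."""
    ux = math.cos(math.radians(heading_deg))
    uy = math.sin(math.radians(heading_deg))
    x, y, t = x0, y0, 0.0
    track = [(t, x, y, speed)]
    for _ in range(n_scans):
        # Longitudinal acceleration is active only during the maneuver.
        a = accel if t_start <= t < t_start + duration else 0.0
        distance = speed * T + 0.5 * a * T * T
        x += distance * ux
        y += distance * uy
        speed += a * T
        t += T
        track.append((t, x, y, speed))
    return track


track = simulate_track(100.0, 100.0, 250.0, 45.0, T=10.0, n_scans=10,
                       accel=5.0, t_start=40.0, duration=10.0)
```

The single-scan maneuver raises the speed from 250 m/s to 300 m/s, a change that an estimator with a two-scan window sees in only one of its samples.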
* Track life is defined as the number of consecutive sampling periods that the target
kinematic parameters are estimated within a predefined accuracy [10].
Experiment # 2
In this experiment, the target starts a maneuver at t = 40 seconds with
an acceleration of 20 m/s^2. The target path is the same as that in the first run.
However, a clutter region is now present which extends between the 10th and 20th
scan. The $P_{fa}$ in this clutter region is assumed to be 0.6 in a gate of 5 km radius
around the predicted target position. The radar coverage is assumed to be 40 km,
and $P_{fa}$ in the clear region is set to 0.000001. The probability of detection is a
function of the signal-to-noise ratio. The measurement uncertainty for the range is
assumed to be dependent on the range value and range values of 5, 10, 15, 20, 25, 30
meters were considered. The use of lower values is facilitated by using a high
resolution radar (e.g., millimeter wave) with a high doppler sensitivity of 233 Hz/(m/s).
Note that the IE method uses the position innovation only for the detection and
the correction of the maneuver and it makes no use of the doppler measurement.
In contrast, one of the inputs of the neural network-based model is a normalized
change in the doppler shift. It is well established in the literature [43,75,85] that
doppler information significantly enhances the tracking performance. However,
there are several problems with the way it is commonly used as an additional state
of the Kalman filter. These will be discussed in greater detail under Experiment
#6.
In the present experiment, the size of the correlation gate was kept the
same for the neural network (i.e., $\gamma = 16$ for all simulation runs) whereas for the
IE method we used a window of length N = 4 with $\gamma_1 = 10$, $\gamma_2 = 16$, $\gamma_3 = 20$,
and $\gamma_4 = 50$. The standard deviation of the radial velocity was assumed to be
3 m/s and the clutter data included the spread of the doppler clutter spectrum
of 30 m/s. A summary of the data used in this experiment is given in Table
5-2-1 and the performance is summarized in Table 5-2-2. The track statistics, as
summarized in Table 5-2-2, show that an average $P_d$ of 94.2% was achieved for
the target detection along the path and only 1.2% of the target data was rejected
for the neural network scheme. The clutter rejection was not as expected but
clutter data was reduced to 37% inside the correlation gate, which is still much
better than 65% in the gate for the other scheme. It must be emphasized that the
doppler shift, which is utilized in the third input parameter in the neural network
scheme, helps avoid the nonlinear filter which would result if the radial velocity were
used as an additional state variable. The percentage of the rejected target plots
was low for both filters but the mean track life for the neural network scheme was
again higher than that for the other scheme. A plot of the mean filtering error is
shown in Fig. 5.11 which clearly confirms the superior tracking performance of the
present scheme.
Experiment # 3
The trajectory is maintained to be the same as in Experiment 2. The
target initial position is again (x, y) = (100 m, 100 m) with an initial velocity of
200 m/s. The first maneuver takes place at t = 60 seconds with an acceleration
input of 5 m/s^2 and lasts for one sampling period. The second maneuver takes
place at t = 90 seconds with an acceleration of 10 m/s^2, two sampling periods after
the first maneuver. A window of length N = 2 was used for the IE method. The
measurement uncertainty values are the same as those in Experiment 2. The results,
as summarized in Table 5-3, indicate that the mean track life is considerably less
for the IE method when compared to that of the neural network scheme. This
is due to the short interval between the two maneuvers and the fact that both
maneuvers have short durations. Fig. 5.12 illustrates the mean filtering errors for
both methods.
Experiment # 4
In this experiment the scenario is similar to that considered in the previous
experiment except that the duration of the second maneuver is longer and there is
only one sampling period of difference between the two maneuvers. As can be
seen from the data in Table 5-4, the mean track life has dropped slightly for the IE
method. This is because the second acceleration is longer in duration and hence
it is modeled better. The short interval between the two accelerations, however,
causes a degradation in the overall filter performance. The sharp peak at scan
25 is due to the bias that was not compensated for earlier at the time the first
acceleration took place. Therefore, it causes an increase in the mean error on top
of what is due to the first acceleration.
The IE method easily fails as the interval between the two accelerations is
reduced. In contrast, with the neural network scheme, both maneuvers are well
compensated since the first acceleration is compensated for even before the second
one is initiated, whereas with the IE method the second acceleration starts before
the first one is fully corrected. This situation arises due to the short interval
between the two maneuvers and that it takes longer for the IE method to do the
corrections. In the next experiment we will see how the tracking error of the IE
method increases without converging as the interval between the two accelerations
decreases to less than one sampling period. The mean filtering errors are depicted
in Fig. 5.13 which clearly demonstrates that the neural network scheme offers a
better performance.
Experiment # 5
This time we include a range measurement error of $\sigma_R = 100$ m. The
acceleration profile is shown in Fig. 5.14. The peak for the IE filter indicates that the
filter sees the two accelerations as just one incident which occurs around t = 54
seconds. The IE filter detects the first maneuver right at the middle of the interval
during which the second maneuver is taking place. The second maneuver has a
different magnitude and a different duration from the first one. All tracks
were lost in the 100 trials (i.e., filter errors were beyond 250 meters before the 15th
scan) for the IE filter. The neural network scheme, on the other hand, responded
more faithfully, as can be seen in Figs. 5.15a and 5.15b.
A close look at the acceleration profile and the neural network scheme reveals
that the neural network sees the two accelerations as one and they are both of a
short duration with a total time of 25 seconds. The neural network response time
is within approximately 40 seconds. The clutter effect is reduced considerably
due to the higher velocity and doppler property of the second input parameter.
This example illustrates that at higher velocities and for more sudden and short
accelerations, the performance of the neural network is superior to that of the
IE technique. A lower initial velocity of 200 m/s with similar
conditions was tested and a larger steady state error for a longer time was observed
for the neural network together with some ringing effect. This shows that the
proposed neural network technique performs well in most conditions which
can be handled by the IE method.
As a summary of the overall performance, for short-duration maneuvers the
neural network scheme converges with a longer track life compared with the IE
method. The best performance from the neural network scheme is achieved under
the following conditions:
1) the clutter is uniform (which is what it has been trained for),
2) the acceleration is large in magnitude,
3) a sudden change in the acceleration profile occurs, and
4) the target initial velocity at the time of maneuver is higher than the
maximum clutter velocity.
Experiment # 6
In this experiment we perform a comparison of the neural network scheme
with a tracking algorithm that incorporates the radial velocity measurement. We
have already discussed the effect of the doppler information on the tracking
accuracy in the previous sections. In this example we give a more precise description
of the use of the doppler information which results in a nonlinear filter (i.e., the
Extended Kalman Filter). In general, the radial velocity information is used to
improve the tracking performance at the following stages:
1) Initialization;
2) Estimation of the track parameters;
3) Plot-to-track association in a dense environment.
The radial velocity information speeds up the initialization phase because
it requires only one plot to indicate the target speed instead of the two or more plots
needed by the position measurements. A more accurate calculation of the track
parameters improves tracking in the sharp acceleration situations, which is the
primary objective in this chapter. Unfortunately, the radial velocity measurement
gives only limited information about the target velocity, hence losing accuracy
as the target path deviates from the radial approach to the radar. With the neural
network model, however, we combine the target heading information so that the
relevance of the doppler information to the actual target speed is learned by the
neural network. Therefore, the performance of the neural network scheme does
not degrade severely as the target approaches the radar in a nonradial path. A
considerable improvement is achieved by the related tracking filter when $\dot{\rho}$ (i.e., the
radial velocity) is related to the target heading. Thus a neural network can play
a significant role in relating the actual target velocity (i.e., tangential velocity) to
the doppler information.
Traditional filters that incorporate the doppler information lack this feature,
which implies that the doppler information is wasted in most trajectory patterns
that are different from the radial path. By including the doppler measurement, the
neural network not only eliminates the need for the nonlinear Kalman filtering but
also it provides a more efficient use for this parameter. Once again, the reasons
that doppler information is usually not combined with the heading estimate in the
traditional filters are:
1) the computational constraints,
2) the longer delay to reach the steady state, and
3) the large coupling errors.
The dynamical equations for the Singer model with which the performance
of the neural network scheme will be compared are given in [40]. The performance
is evaluated in a clear region with the parameters
$\sigma_\rho = 150\ \mathrm{m}$, $\sigma_\theta = 0.003\ \mathrm{rad}$, $\sigma_{\dot{\rho}} = 22\ \mathrm{m/s}$.
The target initial position was (x, y) = (10 km, 10 km) with an initial velocity
of 350 m/s along a radial trajectory specified by a = 45°. The radar scan period
was assumed to be 5 seconds. The longitudinal acceleration started at t = 75
seconds with a magnitude of 20 m/s² and lasted 150 seconds. The expected
duration of the target acceleration was assumed to be 120 seconds for the Singer
model. The probability of target maximum acceleration for the Singer model was
set equal to 0.01 and the probability for uniform straight line motion was set equal
to 0.9. A uniform false alarm probability of $P_{fa}$ = 0.000001 was assumed for clutter
data. For the Singer model, the variance of acceleration was found according to
$\sigma_a^2 = \frac{a_{\max}^2}{3}\,(1 + 4P_{\max} - P_0)$, and the size of the correlation gate (for the Singer model) was
set to 100.
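The variance expression above appears to be the usual Singer form, sigma_a^2 = (a_max^2 / 3)(1 + 4 P_max - P_0). A minimal sketch of the arithmetic follows. The probabilities 0.01 and 0.9 come from the text; taking a_max equal to the 20 m/s² acceleration magnitude is an assumption made here only for illustration.

```python
def singer_accel_variance(a_max, p_max, p_zero):
    """Acceleration variance of the Singer model (standard textbook form).

    a_max  : maximum acceleration magnitude (assumed value, m/s^2)
    p_max  : probability of maximum acceleration
    p_zero : probability of uniform straight-line motion
    """
    return (a_max ** 2 / 3.0) * (1.0 + 4.0 * p_max - p_zero)

variance = singer_accel_variance(a_max=20.0, p_max=0.01, p_zero=0.9)
print(round(variance, 2))  # variance in (m/s^2)^2
```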
It may be noted that in the Singer model the processes of detection and
estimation of acceleration are done in two separate steps. For the maneuver detection
by the Singer model, we set a threshold of 2.6 for the innovation sequence. The
performance results are illustrated in Figs. 5.16-5.19 for the Singer filter with and
without doppler measurement and are compared to the performance resulting from
the proposed neural network scheme. Fig. 5.16 illustrates a significant improvement
in the tracking error when the doppler measurement is used. Note that this
improvement is achieved through a complex nonlinear processing of the doppler
measurement. Despite the computational complexity of the conventional nonlinear
Kalman filter using doppler measurement as an input feature, this parameter (i.e.,
the doppler shift) loses efficiency for a nonradial target path. Note that, as
expected, the proposed neural network scheme reflects the error a few samples faster
than the Singer model. This is due to its capability to process more
parameters for the detection of the maneuver.
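The threshold test mentioned above can be sketched as follows. The text gives only the threshold value (2.6); how the innovation is normalized is not stated in this section, so scaling by the innovation standard deviation is an assumption, and the function name is hypothetical.

```python
import math

def maneuver_detected(innovation, innovation_variance, threshold=2.6):
    """Flag a maneuver when the normalized innovation exceeds the threshold.

    Normalizing by the innovation standard deviation is an assumption;
    only the threshold value 2.6 is taken from the text.
    """
    normalized = abs(innovation) / math.sqrt(innovation_variance)
    return normalized > threshold

# Example with a 150 m range standard deviation (illustrative use only).
print(maneuver_detected(500.0, 150.0 ** 2))
print(maneuver_detected(300.0, 150.0 ** 2))
```

A detector of this kind only announces that a maneuver has begun; estimating its size is a separate step in the Singer scheme, which is the two-step structure the paragraph describes.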
The standard deviation of the error for the neural network, however, slightly
increased over that for the Singer model. This is because the neural
network scheme is primarily designed to model short-term accelerations, whereas
the Singer model is designed specifically for accelerations of longer duration.
However, the improvement achieved by the neural network scheme is
consistent for nonradial trajectories as well. When the simulation was repeated for a
heading angle of a = 20°, the performance of the neural network
scheme degraded gracefully while the Singer model with the doppler measurement
produced larger errors. This is because $\dot{\rho}$ does not completely reflect the true change
in the target tangential velocity in the absence of the heading information in the
Singer model, whereas the target heading information is reflected in the third input
parameter supplied to the neural network. The results obtained are depicted in
Fig. 5.20.
This example illustrates that in general the neural network compensation
method can eliminate the need for a nonlinear extension of the Kalman filter.
Furthermore, the use of additional parameters that can help in the maneuver pattern
recognition does not add to the computational time of the neural network,
particularly when the training is performed off-line. It should be mentioned that the
Singer model of acceleration is generally more appropriate for accelerations of long
duration due to the correlation between samples. Therefore, the performance of
the neural network model is almost comparable to this filter, with a slight decrease
in the mean filtering error and a slight increase in the standard deviation of the
error. The standard deviation of the error as shown in Fig. 5.19 increases for the
later part of the track, which indicates that the neural network model performs
best for short-term accelerations. The Singer model, on the other hand, does not
perform satisfactorily for short-term accelerations [41]. Clearly, a model that does
not assume any correlation between the samples is more efficient for accelerations
of short duration. For the neural network model, we do not assume any correlation
among the samples. Such assumptions are only applicable when larger samples are
available. The neural network model for maneuver detection and compensation
has one unique advantage over all classical models: its fast on-line response
and more efficient hardware implementation.
5.10. Conclusion
The primary objective of the research reported in this chapter was to develop
a neural network architecture that generates the required artificial noise signal in
order to compensate for the bias in the Kalman filter which is caused by a sudden
target acceleration. We introduced a hybrid approach to tracking a maneuvering
target in clutter by means of a neural network which was employed as an aid to
the Kalman filter. Furthermore, we showed that an efficient use of the inherent
parallelism of the neural networks can eliminate the need for several filters running
in parallel. We conclude that the following issues can be addressed by the neural
network-based model in a more efficient way if a hybrid approach is used:
a) Detector delay time for the maneuver pattern recognition,
b) quantization of the maneuver intensity,
c) coupling of the projected acceleration components,
d) maneuver classification and the choice of appropriate noise model, and
e) correlation coefficient of the samples in the maneuvering period as in the
Singer model.
Although the multilayer feedforward architecture may not be the best candidate
to replace the Kalman filter (due to its static processing nature), as we have shown
in this chapter, it can work as a highly efficient coprocessor with the Kalman filter.
Therefore, we propose the hybrid approach to replace the Kalman filter working
alone. We showed that a single neural network can replace several parallel filters
with identical performance and can perform better in cases of sharp discontinuities
in the acceleration profile. We chose the IE paradigm proposed by Bogler [72] to
generate the training vectors because we agree with Bogler and others [72,73] and
[78-82] that this method is the "optimal" approach for modeling sharp maneuvers.
Furthermore, we showed how the discontinuity in the target acceleration can be
modeled by the neural network. Although we used a one-step backward estimation
of the acceleration input components (i.e., estimating $u_j(k-1)$), the approach
can be extended to more samples in the past. In other words, the CHP algorithm,
which is the basis for Bogler's model, can be more efficiently implemented with the
neural network approach.
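To make the role of the one-step-back acceleration estimate concrete, the sketch below shows how an acceleration input for time k - 1 could enter the state prediction for a single coordinate. It uses ordinary constant-acceleration kinematics over one scan and is not taken from the dissertation's equations; the function and argument names are hypothetical.

```python
def predict_state(position, velocity, accel_input, scan_period):
    """One-coordinate prediction from time k-1 to k with an acceleration input.

    accel_input stands in for the acceleration estimate that the neural
    network (or an input-estimation filter) would supply; zero recovers
    the ordinary constant-velocity prediction.
    """
    new_position = (position + scan_period * velocity
                    + 0.5 * scan_period ** 2 * accel_input)
    new_velocity = velocity + scan_period * accel_input
    return new_position, new_velocity

# Without and with a 5 m/s^2 acceleration input over a 10-second scan.
print(predict_state(0.0, 100.0, 0.0, 10.0))
print(predict_state(0.0, 100.0, 5.0, 10.0))
```

The gap between the two predictions is the bias that an uncompensated constant-velocity filter would accumulate over a single scan.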
The use of a neural network in the present application provides the means
for a quantitative approach to maneuver modeling. It can serve as a fast
compensator for the bias in the innovation sequence which is caused by a sudden
acceleration. The neural network can also help to classify different types of
maneuvers. A problem with the neural network scheme that was designed here is that
it may not perform well when the target initial velocity is low. This drawback
is due to the fact that at lower velocities clutter data may be confused with the
target data. Since the traditional methods use a higher sampling period, clutter is
rejected more efficiently at lower velocities. Despite this disadvantage, the present
neural network-based approach offers significant performance benefits in modeling
the maneuvers for target tracking in clutter. In this chapter we have established
the many advantages offered by this approach, which the traditional methods do
not provide efficiently. In conclusion, the parallel distributed processing of neural
networks can remove many of the complexities of the current tracking algorithms
when used in conjunction with the Kalman filter.

Table 5-1-1 Summary of Data for Experiment # 1
Data for Radar Sensor
  Detection Probability                          100% at all ranges
  Probability of False Alarm                     0.000001
  Radar Scan Period                              10.00 seconds
  Standard deviation of range measurement        5, 20, 40, 60, 100 meters
  Standard deviation of azimuth measurement      0.003 radians
Clutter Data
  Probability of false data                      N/A
  Correlation coefficient                        N/A
  Size of clutter patch                          N/A
Tracking Filter Data
  Acceleration                                   5 m/s²
  Duration of maneuver                           1 scan
  Size of normalized correlation gate (IE)       16, 20
  Size of normalized correlation gate (NN)       16 (fixed)
Table 5-1-2 Summary of Performance in Experiment # 1
      % Detection of target   % Detection out of   % Detection of clutter   Mean track life
      along the path          correlation gate     inside the gate          in # of scans
IE    100                     1.6                  N/A                      16/20
NN    100                     1.0                  N/A                      19/20
Table 5-2-1 Summary of Data for Experiment # 2
Data for Radar Sensor
  Detection Probability                          90% at 40 km
  Probability of False Alarm                     0.000001
  Radar Scan Period                              10.0 seconds
  Standard deviation of range measurement        5, 10, 15, 25, 30 m
  Standard deviation of azimuth measurement      0.003 radians
Clutter Data
  Probability of false data                      60%
  Correlation coefficient                        0.6
  Size of clutter patch                          5 km rectangular
    (centered around predicted position)
Tracking Filter Data
  Acceleration                                   20 m/s²
  Duration of maneuver                           3 scans
  Size of normalized correlation gate (IE)       10, 16, 20, 50
  Size of normalized correlation gate (NN)       16 (fixed)
Table 5-2-2 Summary of Performance in Experiment # 2
      % Detection of target   % Detection out of   % Detection of clutter   Mean track life
      along the path          correlation gate     inside the gate          in # of scans
IE    95.5                    2.5                  65                       14/20
NN    94.2                    1.2                  37                       17/20
Table 5-3 Summary of Performance in Experiment # 3
      % Detection of target   % Detection out of   % Detection of clutter   Mean track life
      along the path          correlation gate     inside the gate          in # of scans
IE    82.4                    10.0                 65                       9/20
NN    93.2                    2.6                  39                       16/20
Table 5-4 Summary of Performance in Experiment # 4
      % Detection of target   % Detection out of   % Detection of clutter   Mean track life
      along the path          correlation gate     inside the gate          in # of scans
IE    85.0                    19.6                 63                       8/20
NN    90.2                    2.8                  41                       16/20
Fig. 5.1 Magill bank of N parallel filters. The magnitude of the innovation
sequence is discretized into N different values. This can be done more
efficiently with the neural network.
Fig. 5.2 Block Diagram of the optimum bias detector. K(k) is the
Kalman filter gain, $\Phi$ is the transition matrix, H is the measurement transformation
matrix, y(k) is the measurement vector, and R(k) is the innovation sequence.
a) Neural network in the Kalman filter loop.
b) The inputs and outputs of the neural network. $\nu_1$ is the combined position
innovation of both x and y co-ordinates, $\nu_2$ is the normalized incremental doppler
shift, and $\delta h$ is the change in heading angle.
Fig. 5.3 The Neural Network Maneuver Detector can replace the bank
of parallel filters. It can perform the detection and identification of maneuver in
one step.
Fig. 5.4 $\dot{\rho} = V_T \cos L$ is the relation for the target radial velocity and
tangential velocity. We include the target heading information (i.e., $\delta h$) in the
input parameters in order to relate the incremental doppler shift $\nu_2$ to the actual
change in target tangential velocity $V_T$.
Fig. 5.5 Adaptivity to target maneuver through neural network. Upon
detection of maneuver, the filter state $\hat{x}(k|k)$ is recalculated using the neural
network generated noise model $[u_x(k-1), u_y(k-1)]$.
Fig. 5.6 A flowchart of the Adaptivity to target maneuver through
neural network. The output of the neural network is $\hat{u}(k-1)$. This is then used in
the new dynamical equations of the target for time k and $\hat{x}_{NN}(k|k)$ is calculated.
Fig. 5.7 The two maneuver indicators together can prevent divergence
of the validation gate. Note that the second input parameter (i.e., the velocity
innovation) stays constant for the actual duration of acceleration. Together they
reflect the intensity and duration of the acceleration.
Fig. 5.8 The increase in the magnitude of innovation is due to the
widening of the validation gate in order to capture the target after it is known to
have initiated a maneuver. A wide validation gate collects more clutter. The first
peak is due to the actual target maneuver whereas the second peak is due to
clutter data.
Fig. 5.9 Position and velocity maneuver indicators for a case with two
consecutive accelerations with $a_1$ = 5 m/s² and $a_2$ = 20 m/s². The duration of $a_1$
is 10 seconds whereas $a_2$ lasts 30 seconds.
[Plot: horizontal axis, sampling time; legend: IE, Neural Network.]
Fig. 5.10a Neural Network and Input Estimation x co-ordinate position
errors. The acceleration is 5 m/s² which lasts for one sampling period T = 10
seconds.
[Plot: horizontal axis, sampling time; legend: Neural Network, IE.]
Fig. 5.10b Neural Network and Input Estimation y co-ordinate position
errors.
[Plot: horizontal axis, sampling time; legend: Neural Network, IE.]
Fig. 5.10c Neural Network and Input Estimation x co-ordinate velocity
errors.
[Plot: legend: IE, Neural Network.]
Fig. 5.11 Neural Network and Input Estimation x co-ordinate position
errors for an acceleration of a = 20 m/s² which lasts for $\tau$ = 30 seconds. Note the
ringing of the neural network error in the first cycle which is caused by the clutter
data.
[Plot: horizontal axis, sampling time; legend: IE, Neural Network.]
Fig. 5.12 Mean filtering errors for Neural Network and Input Estimation
methods for a variable acceleration profile. The first acceleration is $a_1$ = 5 m/s²
which lasts for $\tau_1$ = 10 seconds. The second acceleration is $a_2$ = 10 m/s² with $\tau_2$ = 10
seconds. The sampling period is T = 10 seconds.
[Plot: horizontal axis, sampling time; legend: IE, Neural Network.]
Fig. 5.13 Mean filtering errors for the Neural Network and Input
Estimation models when $a_1$ = 5 m/s² and $\tau_1$ = 10 seconds followed by a larger
acceleration of $a_2$ = 20 m/s² with duration $\tau_2$ = 20 seconds.
[Plot: horizontal axis, time in seconds.]
Fig. 5.14 The acceleration profile for simulation example 5. The first
acceleration takes place with $a_1$ = 10 m/s² and $\tau_1$ = 5 seconds. The second
acceleration starts 5 seconds later and lasts for 10 seconds. The time between the two
accelerations is only T/2.
[Plot: horizontal axis, sampling time.]
Fig. 5.15a The mean filtering error of the Input Estimation (IE) method
for the acceleration profile in Fig. 5.14. The IE method fails to compensate for
this profile since the sample size is too small for the noise estimate provided by
this method.
[Plot: horizontal axis, sampling time.]
Fig. 5.15b The Neural Network mean filtering error for the acceleration
profile in Fig. 5.14. Since the maneuvering period is always equal to one sampling
period for the Neural Network noise model, the Neural Network noise estimate is not
calculated statistically; rather, it is picked up through training.
[Plot: legend: with doppler, without doppler.]
Fig. 5.16 Mean filtering errors for the x co-ordinate position with and
without doppler processing for Singer model. Doppler processing significantly
improves the tracking accuracy for radial trajectories.
[Plot: horizontal axis, time (sec); legend: Singer model, Neural Network model.]
Fig. 5.17 Mean prediction errors for neural network and Singer models
in a radial trajectory.
[Plot: horizontal axis, time (sec); legend: Singer model, Neural Network model.]
Fig. 5.18 Mean filtering errors for Neural Network and Singer model in
a radial trajectory.
[Plot: horizontal axis, time (sec); legend: Singer model, Neural Network model.]
Fig. 5.19 The standard deviation of the filtering error for Neural
Network and Singer model.
Fig. 5.20 The mean filtering error for a non-radial trajectory.
Traditional methods lose doppler efficiency as the trajectory deviates from a radial path.
CHAPTER 6
SUMMARY, CONCLUSIONS & SUGGESTIONS
FOR FUTURE RESEARCH
6.1. Summary
The parallel processing capabilities of neural networks together with their
on-line mapping properties make neural networks a powerful means for various
radar signal processing applications. Estimation of the parameters which are
nonlinearly related to a set of received imprecise data has generally been a difficult
problem in the modeling and analysis of the stochastic processes which have been
used for radar signal analysis. A neural network provides a convenient tool for
extracting useful information from a set of noisy radar measurements. On the other
hand, classical methods such as Kalman filtering and other parameter estimation
techniques provide appropriate mechanisms for utilizing the correlation among the
random processes characterizing the radar data. A neural network-assisted filtering
method which makes use of the available algorithms in radar applications can
hence remove many of the complexities in the analysis as well as the synthesis of existing
signal processing algorithms.
In comparison with currently available approaches to radar detection and
tracking, neural network methods have the ability to handle more information in
real time, which is of particular importance for on-line processing in most radar
applications. Furthermore, an exact mathematical modeling which is generally a
requirement for a programmed computing approach is not a prerequisite for a neural
network solution to these problems. The parallel processing method of neural
networks is significantly different from conventional parallel processing methods.
That is, while the mapping of complex algorithms to conventional parallel machines
is a difficult task and is often very inefficient, a neural network mapping, through
its parallel architecture, is fault tolerant and is more efficient. This feature allows
for a simultaneous processing of several statistically important parameters in radar
detection and tracking problems. In this dissertation we have focussed on designing
neural network-based methods for the detection and tracking of targets in clutter
environments. For this purpose, we have selected three of the major subsystems of
a complete tracking system.
In Chapter 1, we began with a definition of the problem of Multiple Target
Tracking (MTT) in clutter together with a brief discussion of neural networks and
their application to engineering problems. Some distinguishing features of neural
networks in comparison with those of current artificial intelligence systems were
also discussed. We also addressed the relation of neural networks to time series
analysis of radar signals. In Chapter 2 we briefly outlined some of the schemes
that are often discussed in the literature on target detection and tracking. We
also specifically described the mathematical preliminaries of the MTT subsystems.
Since there exists a vast and diverse number of methods in this area, we have limited
our discussion to the approaches which have received more significant attention
from researchers. Chapter 3 was devoted to the design of the Neural Network-based
Constant False Alarm Rate (NN-CFAR) processor and an evaluation of its superior
performance over the traditional auto-detection methods. A brief discussion
of optimal detection theory, which was the main source of the training examples,
was also presented in that chapter. The Neural Network implementation of a Moving
Target Indicator (NN-MTI) was presented in Chapter 4 where several neural
network structures were designed and analyzed for MTI applications. Finally, in
Chapter 5 the nonlinear mapping property of neural networks was used to implement
a hybrid maneuver detector and compensator. The performance of the neural
network-assisted Kalman filter was evaluated by comparing it with
the most powerful existing tracking algorithms for tracking a maneuvering target in
clutter. The principal contributions of the dissertation will be highlighted in the
next section.
6.2. Specific Contributions
This dissertation makes several specific contributions to current knowledge
in radar detection and tracking as well as to neural network applications to engineering
problems. Some of these contributions will be briefly highlighted in this
section.
A serious degradation in the detection probability of conventional Constant
False Alarm Rate (CFAR) processors used in the automatic detection of radar
targets results from a reduction in the number of available reference cells. Several
factors such as the radar system constraints (in terms of the resolution and sampling
time), the presence of interfering targets, and clutter patches in the vicinity of the
primary target, may contribute to the reduction in the number of reference cells.
In Chapter 3, we presented a novel neural network-based CFAR detection scheme
(referred to as NN-CFAR) that offers robust performance in the face of a loss in
the number of reference cells. This scheme employs a multilayer feedforward neural
network trained by a backpropagation approach using the optimal detector as the
teacher. The excellent pattern classification capabilities of trained neural networks
are exploited in this application to effectively counter the performance degradations
due to reduced reference window sizes. In particular, it was demonstrated that a
neural network implementation of the CFAR detection scheme provides an efficient
approach for accommodating more input parameters without increasing the design
complexity for countering the information loss due to reduced reference window
sizes.
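The sensitivity of a conventional CFAR processor to the number of reference cells can be illustrated with the textbook cell-averaging CFAR (CA-CFAR) for exponentially distributed (square-law detected) noise. This is a sketch under that assumption; the section does not state which conventional processor served as the baseline, and the function names are hypothetical.

```python
def ca_cfar_multiplier(num_ref_cells, p_fa):
    """Factor applied to the mean of the reference cells in CA-CFAR.

    Follows from P_fa = (1 + alpha / N) ** (-N) for exponential noise.
    """
    return num_ref_cells * (p_fa ** (-1.0 / num_ref_cells) - 1.0)

def ca_cfar_detect(cell_under_test, reference_cells, p_fa):
    """Declare a detection when the test cell exceeds the adaptive threshold."""
    noise_estimate = sum(reference_cells) / len(reference_cells)
    threshold = ca_cfar_multiplier(len(reference_cells), p_fa) * noise_estimate
    return cell_under_test > threshold

# Fewer reference cells force a larger multiplier for the same false alarm
# probability, which is the detection loss described in the text.
for n in (4, 8, 16, 32):
    print(n, round(ca_cfar_multiplier(n, 1e-6), 1))
```

As the number of reference cells grows, the multiplier approaches -ln(P_fa), the value for a detector that knows the noise level exactly.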
The potential application of neural networks to the processing of radar pulses
for extraction of target radial velocity was demonstrated in Chapter 4. We designed
and analyzed several neural network architectures for Coherent Pulse Integration
(CPI) of noisy radar pulses. Some very important features of the neural network
design of a Moving Target Indicator (NN-MTI), such as the flexibility of nonuniform
sampling, shaping the MTI filter frequency response in the case of pulse
staggering, and enhancing the capability for varying the doppler filter bandwidths,
were discussed. Several training guidelines for radar pulse integration with neural
networks were outlined in this chapter. The principal feature of the neural network
in this aspect is the ability to correlate pulse amplitude distributions with the
modulation which is caused by the doppler effect due to target motion. Parallel
processing of the radar pulses gives the neural network more immunity to a
misdetection of the target in some instances where the probability of detection is less
than unity.
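For orientation, the modulation referred to above can be quantified with the standard monostatic doppler relation f_d = 2 v_r / lambda. The sketch below is illustrative only: the wavelength and pulse repetition frequency are assumed values, not parameters taken from Chapter 4.

```python
import math

def doppler_shift_hz(radial_velocity_mps, wavelength_m):
    """Doppler shift of a monostatic radar echo, f_d = 2 v_r / lambda."""
    return 2.0 * radial_velocity_mps / wavelength_m

def pulse_to_pulse_phase_rad(radial_velocity_mps, wavelength_m, prf_hz):
    """Phase advance between consecutive pulses caused by target motion."""
    shift = doppler_shift_hz(radial_velocity_mps, wavelength_m)
    return 2.0 * math.pi * shift / prf_hz

# Assumed values: 0.1 m wavelength and 28 kHz pulse repetition frequency.
print(doppler_shift_hz(350.0, 0.1))
print(pulse_to_pulse_phase_rad(350.0, 0.1, 28000.0))
```

It is this pulse-to-pulse phase progression that a coherent integrator, conventional or neural, has to match in order to recover the radial velocity.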
A new approach to tracking a maneuvering target using a neural network-based
scheme was introduced in Chapter 5. The neural network models the target
maneuver and assists a Kalman filter in updating its gains in order to generate
correct estimates of the target position and the velocity. A performance evaluation
of the target tracking scheme is conducted under various interesting scenarios. The
parallel processing capabilities of trained neural nets are exploited in this application
for realistically handling more input features to correct for the bias induced by
the target maneuver. The synergistic functioning of a trained neural network with a
Kalman filter that provides estimates of the position and velocity of a maneuvering
target is the principal feature of the present approach. The feasibility of employing
neural nets for maneuver modeling and for updating the Kalman filter gains to
correct for the bias induced by target maneuvers is demonstrated through experiments
depicting several maneuver scenarios. While the performance delivered by
the present scheme is shown to exceed that of existing approaches in these
scenarios, additional performance evaluations in several other scenarios (tracking
in a cluttered environment, for instance) attest to the strength of this approach.
This work hence makes a useful contribution to the application of neural network
technology in the filtering and estimation areas. It should be emphasized that the
performance gain and superiority of NN-CFAR, NN-MTI, and maneuver modeling
over that of the conventional methods is not only in computational efficiency but
also in simplicity of design as well as hardware implementations.
6.3. Directions for Further Research
We now outline a number of possible extensions to the studies presented in
this dissertation. Since the introduction of Kalman filter theory, there have been
many efforts by researchers to either augment or modify this filter to resolve two of
the major problems arising with target tracking. The first problem is the tracking of
multiple targets in clutter. The theory of probabilistic data association developed
by Bar-Shalom [3] has gained a viable reputation in the application of Kalman
filtering to tracking multiple targets in a cluttered environment. The second major
problem that has received much attention in the target tracking research is the
modeling of target maneuvers in the presence of clutter. The primary approach to
maneuver modeling has been through the generation of a noise process produced
by an autoregressive model which utilizes the past information. Based on this
approach, linear estimation theories have been used as the primary tools for the
modeling of target maneuvers. In light of the classical linear regression analysis,
the neural network approach can be thought of as a nonlinear regression tool which
offers more flexibility in design.
In both of the problems which were mentioned above, on-line computation
is the primary limitation. We demonstrated how a multilayer feedforward neural
network can be used in the modeling of target maneuvers. A valuable extension
to this work is to make use of dynamical hidden layers such that more varieties in
the target maneuvers or clutter models can be assumed. As an example, in every
model of target acceleration, a very critical parameter that has to be estimated
for an efficient use of the model is the maneuver time constant. Therefore, more
on-line processing is required since we need to integrate even more parameters to
correctly match the timing and the intensity of the acceleration.
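The role of the maneuver time constant can be illustrated with the first-order Markov acceleration model commonly associated with Singer, in which the correlation between accelerations one scan apart is exp(-T / tau). This is a sketch of that textbook relation, not a formula quoted from this dissertation; the numerical values are illustrative.

```python
import math

def singer_correlation(scan_period_s, maneuver_time_constant_s):
    """Scan-to-scan acceleration correlation, exp(-T / tau).

    Textbook first-order Markov (Singer-type) model; tau is the maneuver
    time constant that has to be estimated for the model to be efficient.
    """
    return math.exp(-scan_period_s / maneuver_time_constant_s)

# A long time constant implies strongly correlated samples; a short one
# implies nearly independent samples from scan to scan.
for tau in (10.0, 30.0, 120.0):
    print(tau, round(singer_correlation(5.0, tau), 3))
```

A mismatch between the assumed and actual time constant changes how much weight past samples receive, which is why the text treats its estimation as a critical on-line task.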
The hybrid approach that was taken here can also be extended to the multiple
target tracking case such that a dynamical neural network assists the Kalman
filter in clustering the data at every cycle of data association. The Joint Probabilistic
Data Association Filter (JPDAF) which has been developed by Bar-Shalom [10]
has two major weaknesses. The first problem is due to the on-line computational
requirements, which limit the number of targets that can be considered,
particularly when some of the target tracks are closely crossing. The second problem
is the way the filter sees the targets. If we put a dynamical neural network in a
closed loop with the JPDA, we may be able to create more separability in order
to adaptively separate the feature maps of the target tracks and simplify the track
association problem in cases of crossing trajectories. While some researchers such
as Iltis [44] have introduced an approach to the first problem, which is simply a
way of implementing the JPDA with the Boltzmann Machine, the second problem
has not been addressed in the literature as of the date of this dissertation.
Integration of the guidance and tracking subsystems through a neural network
is another interesting extension of this research. As an example, in a missile-target
interception problem, the true target acceleration has to be estimated and
then transferred to the guidance unit such that the next missile command is
calculated. Depending on the geometry of the situation (e.g., head-to-head or
tail-chase), there are many nonlinearities involved in the estimation of the target and
missile accelerations. A trained neural network which classifies the type of engagement
may be considered in a closed loop with the Kalman filter. The maneuver
classification followed by a joint estimate of the missile-target accelerations can
bring about more reliable solutions to a variety of interesting scenarios. With
modifications, the same approaches can be applied to the design of robotic
vision-guided systems.
One possible extension to the NN-CFAR scheme is to consider the correlation
among the clutter samples. Radar detection in correlated clutter is still
a problem of considerable complexity. In the work reported in this dissertation,
we have assumed independent samples in each resolution cell both temporally and
spatially. A real target may occupy more than one resolution cell which further
complicates the estimation of the clutter spectral parameters in the neighborhood
of the primary target. Note that this situation may arise for the primary target as
well as for the interfering targets. Once again, a neural network with dynamical
hidden layers may be used to readjust the size of the resolution cells or censor a
few cells occupied by the same target before clutter estimation is started.
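As a point of reference for the censoring idea above, the following is a minimal sketch of a conventional (non-neural) cell-averaging CFAR test that discards the largest reference samples before the clutter level is estimated, in the spirit of the censored mean-level detectors of [25, 26]. It is not the NN-CFAR scheme of this dissertation. The function name, the window sizes, and the threshold multiplier `scale` are assumptions chosen only for illustration; in particular, `scale` is not tuned to any specific false alarm probability.

```python
def censored_ca_cfar(cells, index, n_ref=8, n_guard=1, n_censor=2, scale=10.0):
    """Censored cell-averaging CFAR test for one cell under test (sketch).

    cells    : list of non-negative power samples, one per resolution cell
    index    : position of the cell under test
    n_ref    : reference cells taken on each side of the cell under test
    n_guard  : guard cells skipped on each side of the cell under test
    n_censor : number of largest reference samples discarded before averaging
    scale    : threshold multiplier (hypothetical value, not tied to a Pfa)
    Returns (detected, threshold).
    """
    # Leading window: reference cells before the guard cells.
    lo = max(0, index - n_guard - n_ref)
    hi = max(0, index - n_guard)
    left = cells[lo:hi]
    # Lagging window: reference cells after the guard cells.
    start = index + n_guard + 1
    right = cells[start:start + n_ref]

    # Sort the reference samples and drop the n_censor largest ones, which
    # are the most likely to contain interfering or extended-target energy.
    reference = sorted(left + right)
    kept = reference[:max(0, len(reference) - n_censor)]
    if not kept:
        raise ValueError("no reference cells left after censoring")

    clutter_level = sum(kept) / len(kept)
    threshold = scale * clutter_level
    return cells[index] > threshold, threshold


# Uniform clutter with a target in cell 10 and an interferer in cell 12.
cells = [1.0] * 20
cells[10] = 20.0
cells[12] = 20.0
masked, _ = censored_ca_cfar(cells, 10, n_censor=0)
detected, _ = censored_ca_cfar(cells, 10, n_censor=2)
```

In this toy case the interferer raises the uncensored clutter estimate enough to mask the primary target, while censoring the two largest reference samples restores the detection. A fixed `n_censor` is exactly the kind of hand-set parameter that an adaptive scheme would need to adjust when the number of occupied cells is unknown.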
The NN-MTI processing can play an important role in future radar signal
processing methods. The parallel processing of the pulses allows more sophisticated
pulse coding and modulation techniques, particularly for a coherent integration of
the pulses. We considered several multilayer neural network architectures that take
a series of pulses and respond with the target radial velocity. Although missing
pulses were taken into account, the spread in the clutter velocity was
not studied. As we include more pulses, we can make use of some coding techniques
such that the clutter is decoupled from the pulse sequence.
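For comparison with the mapping from a pulse series to radial velocity described above, the following is a minimal sketch of a conventional pulse-pair Doppler estimate. It is a swapped-in classical technique, not the NN-MTI architecture of this dissertation. The function names, the use of `None` to mark a missing pulse, the noise-free and clutter-free signal model, and the sign convention (positive velocity gives a positive Doppler shift) are all assumptions made for illustration.

```python
import cmath
import math


def synth_pulses(velocity_mps, n_pulses, prf_hz, wavelength_m):
    """Noise-free, clutter-free complex echoes from one target (sketch)."""
    doppler_hz = 2.0 * velocity_mps / wavelength_m
    return [cmath.exp(2j * math.pi * doppler_hz * k / prf_hz)
            for k in range(n_pulses)]


def radial_velocity_pulse_pair(pulses, prf_hz, wavelength_m):
    """Estimate radial velocity from the mean pulse-to-pulse phase shift.

    pulses : list of complex samples, with None marking a missing pulse
    The estimate is unambiguous only for |v| < wavelength_m * prf_hz / 4.
    """
    acc = 0j
    for prev, curr in zip(pulses, pulses[1:]):
        # Skip any adjacent pair in which one pulse is missing.
        if prev is None or curr is None:
            continue
        acc += curr * prev.conjugate()
    if acc == 0:
        raise ValueError("no usable adjacent pulse pairs")
    # Phase advance per pulse interval -> Doppler frequency -> velocity.
    doppler_hz = cmath.phase(acc) * prf_hz / (2.0 * math.pi)
    return doppler_hz * wavelength_m / 2.0


pulses = synth_pulses(30.0, 16, prf_hz=10000.0, wavelength_m=0.03)
pulses[5] = None  # one missing pulse removes two adjacent pairs
estimate = radial_velocity_pulse_pair(pulses, 10000.0, 0.03)
```

Because the estimator uses only adjacent pairs, a missing pulse reduces the number of averaged products but does not bias the result in this idealized setting. The sketch does not model clutter at all, so it says nothing about the clutter velocity spread noted above as unstudied.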
For more than three decades, FFT-based algorithms have been the major
tools for the frequency analysis of signals and systems. However, as discussed in
Chapter 4 of this dissertation, the FFT is limited to linear filter design and lacks the
flexibility required for multiple sensor applications. As an example,
future surveillance systems will be based on quite a number of different sensors
operating in various bands of the electromagnetic spectrum. Furthermore, these
sensors may need to be connected through complex networks with several distorting
factors such as nonlinearity of the communication channels and different false alarm
rates of the sensors. Therefore, new approaches to signal modeling are needed and
the parallel processing of the time domain and frequency domain representations of
pulses will be an interesting problem to investigate. In conclusion, neural networks
provide a novel approach to parallel computing that suits computationally
intensive problems such as target tracking in a cluttered environment.
REFERENCES
[1] S. S. Blackman, "Multi-Target Tracking With Radar Applications", Dedham,
MA: Artech House, 1986.
[2] R. W. Sittler, "An Optimal Data Association Problem in Surveillance Theory",
IEEE Trans. on Military Electronics, Vol. MIL-8, pp 125-139, April 1964.
[3] Y. Bar-Shalom and E. Tse, "Tracking in a Cluttered Environment with
Probabilistic Data Association", Proceedings of the 4th Symposium on Nonlinear
Estimation, Sept. 1973.
[4] R. A. Singer and K. W. Behnke, "Real-Time Tracking Filter Evaluation and
Selection For Tactical Applications", IEEE Trans. on Aerospace and Electronic
Systems, Vol. AES-7, pp 100-110, January 1971.
[5] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning Internal
Representations by Error Propagation", Parallel Distributed Processing, D.
Rumelhart and J. McClelland (Eds), Vol. 1, MIT Press, Cambridge, MA, 1986.
[6] T. Kailath, "An Innovation Approach to Least-Square Estimation", IEEE
Trans. on Automatic Control, Vol. AC-13, pp. 646-655, December 1968.
[7] P. Swerling, "Probability of Detection of Fluctuating Targets", IRE Trans.
on Information Theory, Vol. IT-6, pp 269-308, April 1960.
[8] M. Skolnik, "Introduction To Radar Systems", McGraw-Hill, Second Edition,
Chapter 4, pp 101-148, 1980.
[9] J. L. Eaves and E. K. Reedy, "Principles of Modern Radar", Van Nostrand
Reinhold, 1987.
[10] Y. Bar-Shalom and T. E. Fortmann, Tracking and Data Association, Academic
Press: San Diego, 1988.
[11] R. A. Singer and R. G. Sea, "New Results in Surveillance Systems Tracking
and Data Correlation Performance in Dense Multitarget Environment", IEEE
Trans. on Automatic Control, Vol. AC-18, pp 571-581, December 1973.
[12] Krishna R. Pattipati, T. Kurien, R. T. Lee, and P. Luh, "On Mapping a
Tracking Algorithm Onto Parallel Processors", IEEE Trans. on Aerospace
and Electronic Systems, Vol. 26, No. 5, September 1990.
[13] Y. Bar-Shalom and K. Birmiwal, "Variable dimension filter for maneuvering
target tracking", IEEE Trans. on Aerospace and Electronic Systems, Vol.
AES-18, pp 621-629, 1982.
[14] K. Birmiwal and Y. Bar-Shalom, "On Tracking a Maneuvering Target in
Clutter", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-20,
No. 5, September 1984.
[15] H. M. Finn, "Adaptive Detection in Clutter", Proc. Nat'l Electronics
Conference, Vol. 22, pp 562-567, 1966.
[16] H. M. Finn and R. S. Johnson, "Adaptive detection mode with threshold
control as a function of sampled clutter level estimates", RCA Review, Vol.
29, pp 414-464, 1968.
[17] R. Nitzberg, "Analysis of the arithmetic mean CFAR normalizer for
fluctuating targets", IEEE Trans. on Aerospace and Electronic Systems, Vol.
AES-14, pp 44-47, 1978.
[18] G. B. Goldstein, "False Alarm Regulation in Log-normal and Weibull
Clutter", IEEE Trans. on AES, Vol. AES-9, pp 84-92, Jan. 1973.
[19] A. Mahmoodi and M. K. Sundareshan, "An adaptive scheme for optimal
target detection in variable clutter environment", Proc. 20th IEEE Conf.
on Decision and Control, San Diego, CA, Dec. 1981.
[20] V. G. Hansen, "Constant false alarm rate processing in search radars", Proc.
of IEEE 1973 International Radar Conf., London, pp 325-332, 1973.
[21] V. G. Hansen and J. H. Sawyers, "Detectability loss due to greatest of
selection in a cell averaging CFAR", IEEE Trans. on Aerospace and Electronic
Systems, Vol. AES-16, pp 115-118, 1980.
[22] G. V. Trunk, "Range resolution of targets using automatic detectors", IEEE
Trans. on Aerospace and Electronic Systems, Vol. AES-14, pp 750-755,
1978.
[23] M. Weiss, "Analysis of some modified cell-averaging CFAR processors in
multiple target situations", IEEE Trans. on Aerospace and Electronic
Systems, Vol. AES-18, pp 102-113, 1982.
[24] H. Rohling, "New CFAR processor based on ordered statistic", Proc. of
IEEE 1984 International Radar Conf., Paris, pp 38-42, 1984.
[25] J. T. Rickard and G. M. Dillard, "Adaptive detection algorithm for multiple
target situations", IEEE Trans. on Aerospace and Electronic Systems, Vol.
AES-13, pp 338-343, 1977.
[26] J. A. Ritcey, "Censored mean-level detector analysis", IEEE Trans. on
Aerospace and Electronic Systems, Vol. AES-22, pp 443-454, 1986.
[27] P. P. Gandhi and S. A. Kassam, "Analysis of CFAR processors in
nonhomogeneous background", IEEE Trans. on Aerospace and Electronic Systems,
Vol. 24, pp 427-445, 1988.
[28] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning Internal
Representations by Error Propagation", Parallel Distributed Processing, D.
Rumelhart and J. McClelland (Eds), Vol. 1, MIT Press, Cambridge, MA,
1986.
[29] R. P. Lippmann, "An Introduction to computing with Neural Nets", IEEE
ASSP Magazine, Vol. 4, pp 4-22, April 1987.
[30] H. Rohling, "Radar CFAR thresholding in clutter and multiple target
situations", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-19,
pp 608-621, 1983.
[31] K. Hornik, M. Stinchcombe and H. White, "Multilayer feedforward
networks are universal approximators", Neural Networks, Vol. 2, pp 359-366,
1989.
[32] G. Cybenko, "Continuous value neural networks with two hidden layers are
sufficient", Math. Controls, Signals and Systems, Vol. 2, pp 303-314, 1989.
[33] K. Funahashi, "On the approximate realization of continuous mappings by
neural networks", Neural Networks, Vol. 2, pp 183-192, 1989.
[34] S. I. Sudharsanan and M. K. Sundareshan, "Training of a recurrent neural
network for nonlinear input-output mapping", Proc. 1991 Int. Joint Conf.
on Neural Networks (IJCNN-91), Seattle, July 1991 (Also to appear in
International Journal of Neural Systems).
[35] Michal Tuszynski, "Adaptive MTI Filters For Uniform and Staggered
Sampling", IEEE Trans. on Aerospace and Electronic Systems, Vol. 27,
No. 5, September 1991.
[36] A. Farina and A. Protopapa, "New Results on Linear Prediction For Clutter
Cancellation", IEEE Trans. on Aerospace and Electronic Systems, Vol. 24,
No. 3, May 1988.
[37] R. J. Fitzgerald, "Development of Practical PDA Logic For Multitarget
Tracking by Microprocessor", Proceedings of the American Control
Conference, Seattle, Washington, pp 889-898, 1986.
[38] A. Gelb, "Applied Optimal Estimation", Cambridge, MA: M.I.T. Press,
1974.
[39] S. R. Rogers, "Tracking Targets With Constant Heading and Variable
Speed", IEEE Trans. on Aerospace and Electronic Systems, Vol. 26, No. 3,
May 1990.
[40] R. A. Singer, "Estimating optimal tracking filter performance for manned
maneuvering targets", IEEE Trans. on Aerospace and Electronic Systems,
Vol. AES-6, No. 4, July 1970.
[41] P. L. Bogler, Radar Principles with Applications to Tracking Systems,
Wiley: New York, 1990.
[42] R. J. McAulay and E. Denlinger, "A decision-Directed Adaptive Tracker",
IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-9, No. 2,
March 1973.
[43] A. Farina and F. A. Studer, "Radar Data Processing", Vol. 1 & 2,
Letchworth: Research Studies Press, 1985.
[44] D. Sengupta and R. Iltis, "Neural solution to the multitarget tracking data
association problem", IEEE Trans. on Aerospace and Electronic Systems,
Vol. 25, pp 96-108, 1989.
[45] P. Swerling, "Recent Developments in Target Models For Radar Detection",
AGARD Avionics Technical Symposium in Advanced Radar Systems,
Istanbul, Turkey, May 1970.
[46] F. Aluffi Pentini, A. Farina, and F. Zirilli, "Radar Detection of Targets
Located in a Coherent K Distributed Clutter Background", IEE Proceedings
F, Vol. 139, No. 3, June 1992.
[47] W. Stehwein and S. Haykin, "A Statistical Radar Clutter Classifier", IEEE
Int. Radar Conference, Dallas, TX, March 1989.
[48] D. E. Schmieder and M. R. Weathersby, "Detection Performance in
Clutter With Variable Resolution", IEEE Trans. on Aerospace and Electronic
Systems, Vol. AES-19, No. 4, July 1983.
[49] Arie Berman and Amnon Hammer, "False Alarm Effects On Estimation in
Multitarget Trackers", IEEE Trans. on Aerospace and Electronic Systems,
Vol. 27, No. 4, July 1991.
[50] B. G. Boone and R. A. Steinberg, "Signal Processing For Missile Guidance:
Prospects For The Future", John Hopkins Technical Digest, Vol. 9, No. 3,
1988.
[51] A. Farina and A. Russo, "Radar Detection of Correlated Targets in
Clutter", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-22, No.
5, September 1986.
[52] Andrews, G. A., "Performance of a Cascaded MTI and Coherent Integration
in a Clutter Environment", NRL Report 7533, March 1973.
[53] Brennan, L. E., and I. S. Reed, "Optimum Processing of Unequally Spaced
Radar Pulse Trains for Clutter Rejection", IEEE Trans. on Aerospace and
Electronic Systems, Vol. AES-4, No. 3, May 1968.
[54] Kendall, M. and A. Stuart, The Advanced Theory of Statistics, Vol. 2, Ch
29, London: Griffin, 1969.
[55] F. William and M. Radant, "Airborne radar and the three PRFs",
Microwave Journal, July 1983.
[56] E. Aronoff and N. Greenblatt, "Medium PRF radar design and
performance", 20th Tri-Service Radar Symposium, 1974.
[57] D. C. Schleher, "Performance of MTI and coherent doppler processors",
IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-9, No. 2,
March 1973.
[58] D. C. Schleher, "Performance of comparison of MTI and coherent doppler
processors", IEEE Int. Radar Conf., London, Nov. 1982.
[59] J. Marcum, "A statistical theory of target detection by pulsed radar", IRE
Trans. Vol. IT-6, No. 2, April 1960.
[60] G. Dillards and J. Richard, "Performance of an MTI followed by an
incoherent integrator for nonfluctuating signals", IEEE Int. Radar Conf.,
Washington, DC, April 1980.
[61] F. Kretschmer, F. Lin, and B. Lewis, "A comparison of noncoherent and
coherent MTI improvement factors", IEEE Trans. on Aerospace and
Electronics, Vol. AES-19, No. 3, May 1983.
[62] H. Ward and W. Shrader, "MTI performance degradation Caused by
limiting", IEEE EASCON, Washington, DC, Sept. 1968.
[63] G. Trunk, "MTI noise integration loss", NRL Rep. 8132, July 1977.
[64] G. Andrews, "Optimum radar doppler filtering techniques", NRL Rep. No.
7727, May 1974.
[65] G. Andrews, "Comparison of radar doppler filtering", NRL Rep. No. 7811,
Oct. 1974.
[66] R. McAulay, "A theory of optimum moving target indicator (MTI) digital
signal processing", Supplement 1, MIT Lincoln Laboratory Rep.,
Lexington, MA, Oct. 31, 1972.
[67] G. Andrews, "Performance of cascaded MTI with coherent integration filters
in a clutter environment", IEEE Radar Conf., Washington, DC, April 1975.
[68] D. C. Schleher and Schulkind, "Optimization of nonrecursive MTI", IEE Int.
Radar Conf., London, Oct. 1977.
[69] H. Thomas, N. Lutte, and M. Jelffs, "Design of filters with staggered PRF,
a pole-zero approach", IEE Proc. Vol. 121, No. 12, Dec. 1974.
[70] R. Roy and O. Lowenschuss, "Design of MTI detection filters with
nonuniform interpulse periods", IEEE Trans. Vol. CT, No. 4, Nov. 1970.
[71] P. Prinsen, "Elimination of blind velocities of MTI radar by modulating
the interpulse period", IEEE Trans. on Aerospace and Electronics Systems,
Vol. AES-9, No. 5, Sept. 1973.
[72] P. Bogler, "Tracking a maneuvering target using Input Estimation", IEEE
Trans. on Aerospace and Electronic Systems, Vol. AES-23, No. 3, May 1987.
[73] Y. T. Chan, A. G. C. Hu, and J. B. Plant, "A Kalman Filter-based Tracking
Scheme With Input Estimation", IEEE Trans. on Aerospace and Electronic
Systems, Vol. AES-15, pp 237-244, March 1979.
[74] R. L. Moose, "An Adaptive State Estimation Solution to the Maneuvering
Target Problem", IEEE Trans. on Automatic Control, Vol. AC-20, pp 359-362,
June 1975.
[75] A. Lundulf and M. Minker, "Reliability of Velocity Measurement By MTD
Radar", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-21,
No. 4, July 1985.
[76] R. Duda and P. E. Hart, "Pattern Classification and Scene Analysis", John
Wiley & Sons, New York, 1973.
[77] Y. Bar-Shalom, "Tracking Methods in a Multiobjective Environment", IEEE
Trans. on Automatic Control, Vol. AC-23, pp 618-626, August 1978.
[78] G. D. Bergland and C. F. Hunnicut, "Application of a Highly Parallel
Processor to Radar Data Processing", IEEE Trans. on Aerospace and Electronic
Systems, Vol. AES-8, pp 162-162, March 1972.
[79] S. H. Bokhari, "On The Mapping Problem", IEEE Trans. on Computers,
Vol. C-30, pp 207-214, March 1981.
[80] H. Kasahara and S. Narita, "Practical Multiprocessor Scheduling Algorithms
For Efficient Parallel Processing", IEEE Trans. on Computers, Vol. C-33,
pp 1023-1029, November 1984.
[81] R. Sethi, "Scheduling Graphs on Two Processors", SIAM Journal of
Computing, pp 73-82, 1975.
[82] M. R. Garey and D. S. Johnson, "Computers and Intractability: A Guide to
the Theory of NP-Completeness", San Francisco: W. H. Freeman & Company,
1979.
[83] R. T. Lee, "Fault-Tolerant Algorithm Mapping Onto Parallel Computing
Architectures", M.S. Thesis, Dept. of Electrical and Systems Engineering,
University of Connecticut, Storrs, 1988.
[84] D. P. Atherton, "Tracking Multiple Targets Using Parallel Processing", IEE
Proceedings, Vol. 137, No. 4, July 1990.
[85] A. Farina, A. Russo, F. A. Studer, "Advanced Models of Targets and
Disturbances and Related Radar Signal Processors", IEEE International Radar
Conference, 1985.
[86] M. J. Tsai, "Resolution of Closely Spaced Optical Targets Using MLE and
MEM", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-18,
No. 2, March 1982.
[87] J. A. Edward and M. M. Fitleson, "Notes on Maximum-Entropy
Processing", IEEE Trans. on Information Theory, Vol. IT-19, pp 232-234, March
1973.
[88] B. R. Frieden, "Restoring With Maximum Likelihood and Maximum
Entropy", Journal of the Optical Society of America, Vol. 62, pp 511-518,
1972.
[89] S. M. Kay, "Noise Compensation For Autoregressive Spectral Estimation",
IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. ASSP-28,
pp 292-303, June 1980.
[90] G. S. Sandhu and A. V. Saylor, "A Real-Time Statistical Radar Target
Model", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-21,
No. 4, July 1985.
[91] A. Papoulis, "Probability, Random Variables and Stochastic Processes",
New York: McGraw Hill, 1965.
[92] J. H. Dunn and D. D. Howard, "Radar Target Amplitude, Angle and
Scintillation From Analysis of the Echo Signal Propagating in Space", IEE
Trans. on Microwave and Information Theory, Vol. MIT-16, pp 715-728,
September 1968.
[93] J. J. Hopfield and D. W. Tank, "Neural Computation of decision in
optimization problems", Biological Cybernetics, 52, pp 141-152, 1985.
[94] D. T. Magill, "Optimal adaptive estimation of sampled stochastic process",
IEEE Trans. on Automatic Control, Vol. AC-10, pp 434-439, 1965.
[95] G. A. Ackerson and K. S. Fu, "On state estimation in switching environments",
IEEE Trans. on Automatic Control, Vol. AC-15, pp 10-17, Feb. 1970.
[96] S. Salinger and Wangsness, "Target handling capacity of a phased array
tracking radar", IEEE Trans. on Aerospace and Electronic System, Vol.
AES-8, No. 1, pp 43-50, Jan. 1972.
[97] C. Morefield, "Application of 0-1 integer programming to multitarget
tracking problem", Proc. IEEE conference on Decision and Control, pp. 428-433,
Dec. 1975.
[98] C. Morefield, "Application of Bayesian Decision Theory to multitarget
surveillance problems", NAECON, pp. 489-494, 1976.
[99] D. M. Klamer, "Non-parametric maneuver detection in Kalman filtering",
IEEE Conf. on Decision and Control, New Orleans, pp. 544-548, Dec.
1977.
[100] H. L. Wiener, A. S. Distler and J. H. Kullback, "Operational and
implementation problems of multitarget tracking problems", IEEE Conf. on Decision
and Control, Fort Lauderdale, pp. 361-367, Dec. 1979.
[101] R. J. Fitzgerald, "Simple tracking filters: steady-state filtering and
smoothing performance", IEEE Trans. on Aerospace and Electronic Systems, Vol.
AES-16, No. 6, pp. 860-864, Nov. 1980.
[102] R. J. Fitzgerald, "Simple tracking filters: position and velocity
measurements", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-18,
No. 5, pp. 531-537, Nov. 1982.
[103] R. J. Fitzgerald, "Simple tracking filters: closed form solutions", IEEE
Trans. on Aerospace and Electronic Systems, Vol. AES-17, No. 6, pp. 781-785,
Nov. 1981.
[104] Y. T. Chan, J. B. Plant, J. R. T. Bottomley, "A Kalman tracker with a simple
input estimator", IEEE Trans. on Aerospace and Electronic Systems, Vol.
AES-18, No. 2, pp. 235-241, March 1982.
[105] K. V. Ramachandra, "Position, velocity and acceleration estimates from
noisy radar measurements", IEE Proc. Communication, Radar and Signal
Processing, Vol., Part F, No. 2, pp. 167-168, April 1984.
[106] D. Lucas, K. Ekman and F. P. White, "The application of fuzzy pointers
in multisensor/multitarget environment", IEEE Conf. on Decision and
Control, San Diego, p. 1217, Jan. 1979.