Neural network-based detection and tracking of maneuvering targets in clutter for radar applications.

Item Type text; Dissertation-Reproduction (electronic)

Authors Amoozegar, Seyed Farid.

Publisher The University of Arizona.

Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.

Download date 24/04/2021 09:11:15

Link to Item http://hdl.handle.net/10150/186824

Order Number 9502624

Neural network-based detection and tracking of maneuvering targets in clutter for radar applications

Amoozegar, Seyed Farid, Ph.D.

The University of Arizona, 1994

NEURAL NETWORK-BASED DETECTION AND TRACKING OF

MANEUVERING TARGETS IN CLUTTER FOR RADAR APPLICATIONS

by

Seyed Farid Amoozegar

A Dissertation Submitted to the Faculty of the

ELECTRICAL AND COMPUTER ENGINEERING DEPARTMENT

In Partial Fulfillment of the Requirements For the Degree of

DOCTOR OF PHILOSOPHY

WITH A MAJOR IN ELECTRICAL ENGINEERING

In the Graduate College

THE UNIVERSITY OF ARIZONA

1994

THE UNIVERSITY OF ARIZONA GRADUATE COLLEGE

As members of the Final Examination Committee, we certify that we have

read the dissertation prepared by Farid Amoozegar

entitled Neural Network-Based Detection and Tracking of Maneuvering Targets in Clutter for Radar Applications

and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Ph.D. in Electrical Engineering

Dr. Malur K. Sundareshan    Date

Dr. Hal Tharp    Date

Dr. Larry Schooley    Date

Final approval and acceptance of this dissertation is contingent upon the candidate's submission of the final copy of the dissertation to the Graduate College.

I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.

Date

STATEMENT BY AUTHOR

This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the library.

Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgement of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his or her judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.

SIGNED: ____________________

DEDICATION

To my parents for raising me

and my wife and two children for their patience

ACKNOWLEDGEMENTS

First and foremost, I wish to express my sincere appreciation to my dissertation director, Professor Malur K. Sundareshan, for his advice and guidance regarding both research and professional development throughout the entire period of this study.

I also wish to thank the other members of the examining committee, Dr. Larry Schooley and Dr. Hal Tharp, for their review of this dissertation.

I would also like to express my special appreciation to the following people to whom I owe my achievements: my father Abrahim, who set high goals and standards for me and supported me throughout his entire life, my mother, who encouraged me at all times, my wife Afsaneh, who gave me the courage, and my children Hanieh Sadat and Seyed Mohamad, who gave me the energy and drive to succeed.

Finally, I owe a particular debt of gratitude to two of my friends: Ali Notash, for his vision and support in technical writing, and Seyed Hossein Sadati, for his critiques and feedback on the earlier versions of this dissertation.

TABLE OF CONTENTS

LIST OF ILLUSTRATIONS
LIST OF TABLES
ABSTRACT

CHAPTER 1. INTRODUCTION
  1.1 Basics of Radar Signal Processing
  1.2 Components of a Radar System
    1.2.1 Constant False Alarm Rate Processing
    1.2.2 Moving Target Indicator
    1.2.3 Tracking Filter
  1.3 Multi-Target Tracking Applications and History
  1.4 The Neural Network Approach
    1.4.1 What is Neural Computing?
    1.4.2 Network Operation
    1.4.3 Information Processing
    1.4.4 Neural Network vs. Artificial Intelligence
  1.5 Mathematical Preliminaries of Neural Networks
  1.6 Organization of the Dissertation
  1.7 Contributions of this Dissertation

CHAPTER 2. REVIEW OF MULTIPLE TARGET TRACKING THEORY
  2.1 Overview of Linear Filtering
    2.1.1 Radar Signal Representation
    2.1.2 Statistical Description of Signals
  2.2 Random Processes
  2.3 Matched Filtering
  2.4 Neyman-Pearson Criterion
  2.5 Noise in the Radar Receiver
  2.6 Radar Clutter
    2.6.1 Clutter Statistics
  2.7 Target Modeling
  2.8 Overview of MTI & CFAR Processors
    2.8.1 More on Doppler Effect
    2.8.2 Radar Pulses with Doppler Shifts
    2.8.3 Delay Line Cancellers
    2.8.4 Adaptive MTI
  2.9 Review of Current Methods in Target Tracking
    2.9.1 Unknown Input Model
    2.9.2 Multiple Model Approach
    2.9.3 Multiple Hypothesis Testing (MHT) Method
    2.9.4 Colored Noise Modeling of Maneuver
    2.9.5 Variable Dimension Filter (VDF)
    2.9.6 Input Estimation Model (IE)
  2.10 Parallelism in Target Tracking
  2.11 Sources of Nonlinearity and Their Problems
  2.12 Data Association
    2.12.1 Nearest-Neighborhood vs. All-Neighborhood Approach
    2.12.2 Probability Data Association Filter (PDAF)

CHAPTER 3. A ROBUST NEURAL NETWORK SCHEME FOR CFAR DETECTION
  3.1 Introduction
  3.2 Development of NN-CFAR Scheme
    3.2.1 Framework for Neural Network Training
    3.2.2 Training with Optimum Detector as the NN Teacher
    3.2.3 Selection of Input Features
    3.2.4 Neural Network Architecture and Training
  3.3 Robustness Evaluations of NN-CFAR
  3.4 Conclusions

CHAPTER 4. NEURAL NETWORK IMPLEMENTATION OF THE MOVING TARGET INDICATOR
  4.1 Introduction
  4.2 Some Basics on MTI Designs
    4.2.1 Current Approaches to MTI
    4.2.2 The Radar Ambiguity Function
    4.2.3 Transversal Filters
    4.2.4 The MTI Improvement Factor
    4.2.5 The Optimum MTI Processing Theory
  4.3 Why Neural Networks for Implementation of MTI (NN-MTI)?
  4.4 Neural Network Architectures for MTI
    4.4.1 NN-MTI Doppler Shift Extraction from Pulse Series
    4.4.2 Implementation of Pulse Canceller with Neural Networks
    4.4.3 Analysis of NN-MTI Design with PRF Switching
  4.5 Conclusions

CHAPTER 5. TARGET TRACKING BY NEURAL NETWORK MANEUVER MODELING
  5.1 Introduction
  5.2 Neural Network Implementation of Maneuver Modeling
    5.2.1 Problem Formulation
  5.3 The First Input Parameter
    5.3.1 Statistical Properties of the Innovation Process
    5.3.2 Estimation of States Using the Innovation Process
  5.4 Optimum Bias Detection
  5.5 The Second Input Parameter
    5.5.1 Formulation of the Heading Estimate
    State Equations and the Heading Estimate
  5.6 The Third Input Parameter
    5.6.1 Velocity Innovation Parameter
    5.6.2 Quantization of the Noise Process
  5.7 Generation of the Training Vectors
  5.8 Neural Network Architecture and Training Data
  5.9 Performance Evaluation

CHAPTER 6. CONCLUSION AND FUTURE WORK
  6.1 Introduction
  6.2 Specific Contributions
  6.3 Directions for Further Research

REFERENCES

LIST OF ILLUSTRATIONS

Figure
3.1 Schematic diagram of CFAR detector
3.2 Neural network architecture for NN-CFAR
3.3 Performance of NN-CFAR in Experiment # 3
3.4a Comparison of NN-CFAR and CA-CFAR for N=33
3.4b Comparison of NN-CFAR and CA-CFAR for N=25
3.4c Comparison of NN-CFAR and CA-CFAR for N=17
3.4d Comparison of NN-CFAR and CA-CFAR for N=9
3.5 Target between two clutter patches
4.1 Schematic diagram of pulse cancelers
4.2 Amplitude response of pulse cancelers
4.3 Transversal filter for MTI processing
4.4 Frequency response of pulse cancelers with distinct PRFs
4.5 Radar pulse train
4.6 NN-MTI response for variable step sizes
4.7 Comparison of the NN-PC with a conventional canceler
4.8 Performance of NN-PC in separating slow and false targets
4.9 NN-PC and PRF switching with binomially weighted pulses
4.10 NN-PC and PRF switching with unweighted pulses
4.11 NN-PC performance in presence of heavy clutter
5.1 Magill bank of N parallel filters
5.2 Block diagram of the optimum bias detector
5.3 The Neural Network Maneuver Detector
5.4 Geometry for the target radial velocity measurement
5.5 Adaptivity to target maneuver through neural networks
5.6 A flowchart of the adaptivity to target maneuver
5.7 The two maneuver indicators together
5.8 Response of position innovation to maneuver
5.9 Indicators in case of consecutive maneuvers
5.10a NN & IE x-coordinate position errors
5.10b NN & IE y-coordinate position errors
5.10c NN & IE x-coordinate velocity errors
5.11 NN & IE y-coordinate velocity errors
5.12 Mean filtering errors for a variable acceleration profile
5.13 Mean filtering errors for consecutive maneuvers
5.14 An acceleration profile
5.15a Filtering error for IE method
5.15b Filtering error for NN method
5.16 Mean filtering errors for Singer models
5.17 Mean prediction errors for NN and Singer models
5.18 Mean filtering errors for a radial trajectory
5.19 The standard deviations for NN and Singer models
5.20 The mean filtering error for a non-radial trajectory

LIST OF TABLES

Table
3-1 Comparison of ADT for CA-CFAR with optimal detector
3-2a Comparison of ADT for NN-CFAR in Experiment # 2
3-2b Performance of NN-CFAR
3-3 Comparison of NN-CFAR and CA-CFAR
3-4a Performance of NN-CFAR with N=25
3-4b Performance of NN-CFAR with N=17
3-4c Performance of NN-CFAR with N=9
3-5 Performance of NN-CFAR in edge clutter
3-6 Performance of NN-CFAR in Experiment # 6
4-1 NN-MTI classification of Doppler shift with 5 m/s
4-2 NN-MTI classification of Doppler shift with 6 m/s
4-3 NN-MTI classification for 10 independent pulses
4-4 NN-MTI classification for 10 independent noisy pulses
4-5 Two-step classification of slow and fast targets
4-6 Low resolution Doppler shift extraction by NN-MTI
4-7 High resolution Doppler shift extraction by NN-MTI
4-8 Performance of the NN-MTI for noisy pulses
4-9 NN-MTI classification in the presence of clutter
5-1-1 Summary of Data for Experiment # 1
5-1-2 Summary of Performance for Experiment # 1
5-2-1 Summary of Data for Experiment # 2
5-2-2 Summary of Performance for Experiment # 2
5-3 Summary of Performance for Experiment # 3
5-4 Summary of Performance for Experiment # 4

ABSTRACT

Until recently, almost all proposed methods for detection and tracking of maneuvering targets in clutter have followed the algorithmic path. For most multi-target tracking problems, however, the algorithmic approach generally requires a speed and a degree of parallelism that are far beyond the capabilities of available computational resources.

This dissertation investigates the development of neural network-based methods for detection and tracking of maneuvering targets in a clutter background and focuses on three major operations required for this overall task. A detection scheme is developed by utilizing the pattern classification ability of a trained neural network, which yields a better representation of the clutter and the targets. Utilizing the mapping property of neural networks, a higher probability of detection is achieved while preserving a constant false alarm rate. The second unit is a Moving Target Indicator (MTI), which is trained through examples to integrate a series of noisy radar pulses and provide estimates of target radial velocity.

For the problem of tracking a maneuvering target, conventional algorithms employ a Kalman filter which provides estimates of the target position and velocity. While a Kalman filter is the most powerful linear estimator for continuous random variables, it may fail to converge in the presence of sharp measurement discontinuities, which may be caused by clutter or sudden target maneuvers. A multilayer feedforward neural network in conjunction with a Kalman filter can better resolve the discontinuity in the measurement sequence. In the new approach proposed here, a neural network is trained to provide an on-line estimate of the necessary artificial noise components, which help neutralize the corresponding bias in the Kalman filter estimates of target kinematic parameters.

CHAPTER 1

INTRODUCTION

1.1. Basics of Radar Signal Processing

Radar is a system used for the detection and location of objects. The word radar is an acronym derived from the phrase radio detection and ranging. It has many useful engineering applications, which may involve extremely sophisticated algorithms as well as complex engineering designs. Radar is an important remote sensing instrument. It is also an essential requirement for surveillance systems that may employ more than one sensor. Sensors report both target and background information. Background information may include clutter, noise, intelligent interference, and false alarms that can obscure the true target information. These unwanted signals, together with internal sensor noise, add to the uncertainties in the kinematics of the target. The problem gets even more complicated in scenarios where more than one sensor and more than one target are present. Furthermore, parts of the background clutter may be stationary while other parts may be moving.

The primary objective in radar signal processing is to partition the sensor data into sets of observations, or tracks, produced by the same source. Observations from sensors may be received either in the form of quantitative measurements, such as the number of existing targets, estimates of target velocity, and predicted future position, or as higher level observations such as target classification, shape, and other attributes.

Radar signals in particular carry a lot more information than can be extracted by today's digital processing methods. There are many reasons for this shortcoming. First, radar signals are analog in nature, and conversion to digital representation poses some problems. Particularly in a non-uniform cluttered environment, where the clutter may have a large dynamic range in amplitude variation, linear filters fail to keep up with the resolution requirements. The limitation of A/D (i.e., analog-to-digital conversion) technology further complicates the problem. The limited resolution of A/D converters does not usually allow the full speed capability of current digital processors to be used. As we move higher in the electromagnetic spectrum, such as to laser radar or infrared detectors, the problem gets even worse. The effect of a sudden change in clutter amplitude is sensed by the radar receiver as a large step function at the input of a linear filter, which can mask the detection of other slowly varying targets. The second major limitation of current digital methods is the conflict between the need for more reference cells to sample the background information and the limited processing power available for such high data rates. The computational requirement is therefore the major reason that, in most detection schemes, the target background is rejected before further processing (e.g., tracking, identification, situation assessment, classification) of the target itself is started. Neural networks, on the other hand, provide a means of processing more information, such that the target and the background clutter may well be processed together in a more efficient way.

1.2. Components of a Radar System

Depending on the particular application, a radar system is composed of many components and subsystems. One of the major applications of radar systems is the surveillance of an environment. Surveillance includes detection and tracking of multiple targets in a cluttered background. A complete system that performs this task is referred to as a Multiple Target Tracking (MTT) system. In MTT systems, one sensor or a collection of sensors is used to gather information about the targets of interest. This information is filtered through complex signal processing units, and valid target returns are then used for further data processing to extract features of interest from the set of observations.

There are three main units in an MTT system. The primary function of the first stage, which is called the Constant False Alarm Rate (CFAR) processing unit, is detection of the target against the background clutter. After the target has been detected, the next stage is to determine whether the target is stationary or moving with some velocity. This stage is referred to as MTI, which stands for Moving Target Indicator. For a stationary target, the challenge is to suppress the clutter from a fixed background, such as the ground clutter in the detection of a stationary vehicle. In the detection of a moving target, on the other hand, the background clutter could be due to an aggregation of birds or other interfering targets that move in the neighborhood of the target. The final stage of the MTT system maintains the track of each individual target. Each track file may include target position, velocity, and other features attributed to the target.

As we move from detection to tracking, the problem gets progressively more complex. This increased complexity is due to the fact that at each level of processing some clutter data always remain unfiltered and further corrupt the input to the next unit. A target track may be lost entirely during a maneuvering period if the target's sudden acceleration input is not well estimated by the tracking filter. As more clutter data leak through the other units into the tracking filter, the process of estimating the target acceleration becomes more difficult. We now briefly describe the role of each unit. Further details on each of these units will be given in Chapter 2.

1.2.1. Constant False Alarm Rate Processing

A false alarm is defined as the detection of a noise signal instead of the true target signal. It takes place when the noise amplitude crosses the threshold which is set for detection of the target signal. False alarms may occur in each unit. However, it is primarily the job of the detection units (i.e., CFAR or MTI) to preserve a constant false alarm rate. That is, the received pulses from the target are integrated and the threshold at the output of a receiver is set to achieve a desired probability of false alarm. The receiver noise is generally modeled with a Gaussian distribution, whereas the clutter distribution is non-homogeneous and non-Gaussian. Conventional target detection schemes are based on the matched filtering technique, which attempts to optimize the signal-to-noise ratio. The presence of unknown clutter, however, degrades the optimality of matched filtering because correlated clutter is non-white, and matched filters are optimal if and only if the background is white Gaussian noise and the signal shape is known. In usual practice, a pre-whitening filter is used to decorrelate the interference from the signal. The signal is then passed through a CFAR processor for final processing. The CFAR detector observes the noise or clutter background in the vicinity of the target and adjusts the threshold in accordance with the measured background.
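The adaptive thresholding described above can be illustrated with a minimal sketch of a conventional cell-averaging CFAR (CA-CFAR) detector. This is not the neural network scheme developed in this dissertation; it assumes a square-law detector whose noise output is exponentially distributed, and the function name `ca_cfar` and the parameters `num_ref`, `num_guard`, and `pfa` are hypothetical names chosen for illustration.

```python
def ca_cfar(power, num_ref=8, num_guard=2, pfa=1e-3):
    """Cell-averaging CFAR over a list of square-law detector outputs.

    num_ref and num_guard are counts on EACH side of the cell under test.
    Returns a list of booleans (True = detection declared).
    """
    n_total = 2 * num_ref
    # Threshold multiplier under the assumed exponential noise model:
    # Pfa = (1 + alpha / N) ** -N  =>  alpha = N * (Pfa ** (-1 / N) - 1).
    alpha = n_total * (pfa ** (-1.0 / n_total) - 1.0)
    half = num_ref + num_guard
    detections = [False] * len(power)
    for i in range(half, len(power) - half):
        lead = power[i - half : i - num_guard]          # reference cells before
        lag = power[i + num_guard + 1 : i + half + 1]   # reference cells after
        noise_est = (sum(lead) + sum(lag)) / n_total    # measured background
        detections[i] = power[i] > alpha * noise_est    # adaptive threshold
    return detections


# Flat background with a single strong return in cell 100.
cells = [1.0] * 200
cells[100] = 20.0
hits = ca_cfar(cells)
# Raising the whole scene by 10 dB leaves the decisions unchanged.
louder = ca_cfar([10.0 * c for c in cells])
```

Because the threshold scales with the measured background, the decisions do not depend on the absolute noise level, which is the sense in which the false alarm rate stays constant. Cells too close to either end of the list lack a full reference window and are simply left undetected in this sketch.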

Detection is a general problem and is not specific to radar signal processing. For example, there are many medical applications of signal detection. The major requirement in such applications is a form of parallel or sequential processing that looks at several items in the background before a decision can be made about an abnormal signal that could represent some sort of disease. Moreover, signals that are reflected from natural objects, such as the heart or blood vessels in the human body, or trees or a group of birds in the environment, are more complicated to model than echo signals from man-made objects. Therefore, novel detection schemes that can intelligently utilize the parallel distributed processing power of neural networks seem to be necessary in order to perform such complex tasks.

1.2.2. Moving Target Indicator

A Moving Target Indicator (MTI) makes use of an important physical property of radar electromagnetic pulses. Each time a pulse hits a moving target, the reflected pulse is either compressed or stretched in wavelength, depending on whether the target is approaching or moving away from the radar. The MTI processor requires more signal processing than a CFAR unit. In fact, a complete CFAR processor should include Doppler processing as well. This is primarily because several pulses have to be considered, and each pulse may be corrupted by false Doppler shifts due to clutter motion in the vicinity of the target. The presence of correlated clutter can make the integration of pulses a rather difficult task. That is, the integration of pulses is useful only if the clutter has an independent effect on the pulses.

Depending on the wavelength of operation, an MTI has different zones of blind speed. That is, an MTI unit suppresses the clutter at certain velocity ranges. Therefore, a target might be moving at a velocity which is not detected by the MTI unit. Since the returned pulse amplitudes are modulated by the corresponding Doppler shift of the target, the job of an MTI unit is to calculate a set of optimal weights for the pulses such that the weighted combination of pulses yields useful information about the target radial velocity.
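A short numerical sketch may help to make the notion of blind speed concrete. It uses the standard textbook relations for a monostatic radar, namely the Doppler shift f_d = 2 v_r / λ, the blind speeds v_n = n λ f_PRF / 2, and the magnitude response 2|sin(π f_d / f_PRF)| of a two-pulse canceller. The function names and the example values (a 3 cm wavelength and a 1 kHz pulse repetition frequency) are assumptions made here for illustration only.

```python
import math


def doppler_shift(radial_velocity, wavelength):
    """Doppler shift in Hz for a target with the given radial velocity (m/s)."""
    return 2.0 * radial_velocity / wavelength


def blind_speeds(wavelength, prf, count=3):
    """First `count` radial velocities (m/s) that a pulse canceller cannot see."""
    return [n * wavelength * prf / 2.0 for n in range(1, count + 1)]


def two_pulse_canceller_gain(radial_velocity, wavelength, prf):
    """Magnitude response of a two-pulse canceller at the target's Doppler shift."""
    fd = doppler_shift(radial_velocity, wavelength)
    return 2.0 * abs(math.sin(math.pi * fd / prf))


wavelength = 0.03   # metres (assumed X-band example)
prf = 1000.0        # pulses per second (assumed)
speeds = blind_speeds(wavelength, prf)
gain_at_blind = two_pulse_canceller_gain(speeds[0], wavelength, prf)
gain_at_peak = two_pulse_canceller_gain(speeds[0] / 2.0, wavelength, prf)
```

With these assumed values the first blind speed is about 15 m/s: a target at that radial velocity is cancelled together with the stationary clutter, while a target at half that speed falls on the peak of the canceller response.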

1.2.3. Tracking Filter

The tracking filter is the final stage of the MTT system. Many noisy samples, together with clutter data, enter the set of measurement inputs for the tracking filter. A measurement set is defined as the set of all candidate target returns. For example, if a total of 20 pulses are transmitted by the radar, some of them may hit the target and return with some noise added to them. Other pulses may just hit different objects in the neighborhood of the target as the radar scans through the target. Therefore, we may have several returns from a target that have to be processed in order to come up with a more accurate estimate of the target features (e.g., position and velocity). Tracking in a multitarget, cluttered, multisensor environment is characterized by uncertainty in the origin of the measurements. The tracking function consists of: a) filtering, which is an estimation of the current state of the target, and b) accuracy, which is represented by the covariance matrix of the predicted estimates. These two functions are complicated by the uncertainties in the dynamical models of the targets and the noisy measurements of the desired states (i.e., position and velocity).
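The two parts of the tracking function, the state estimate and its covariance, can be sketched with one cycle of a textbook constant-velocity Kalman filter. This is a generic illustration under simplifying assumptions (one spatial coordinate, position-only measurements, and a diagonal process noise with the same variance `q` on both states), not the tracking filter designed in later chapters. The name `kalman_step` and all numerical values are hypothetical.

```python
def kalman_step(x, P, z, dt=1.0, q=0.01, r=1.0):
    """One predict/update cycle of a constant-velocity Kalman filter.

    x = [position, velocity], P = 2x2 covariance (list of lists),
    z = scalar position measurement, r = measurement noise variance.
    """
    # Predict: x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]]
    # and the assumed Q = diag(q, q).
    p_pred = x[0] + dt * x[1]
    v_pred = x[1]
    P00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q
    P01 = P[0][1] + dt * P[1][1]
    P10 = P[1][0] + dt * P[1][1]
    P11 = P[1][1] + q
    # Update with H = [1, 0]: the innovation is the measurement residual.
    innovation = z - p_pred
    S = P00 + r                     # innovation variance
    K0, K1 = P00 / S, P10 / S       # Kalman gain
    x_new = [p_pred + K0 * innovation, v_pred + K1 * innovation]
    P_new = [[(1.0 - K0) * P00, (1.0 - K0) * P01],
             [P10 - K1 * P00, P11 - K1 * P01]]
    return x_new, P_new, innovation


# Noise-free returns from a target moving at 2 units per scan.
x, P = [0.0, 0.0], [[10.0, 0.0], [0.0, 10.0]]
for k in range(1, 101):
    x, P, innovation = kalman_step(x, P, z=2.0 * k)
```

For a target that really moves at constant velocity, the innovation decays toward zero and the velocity estimate settles at the true value. A sudden maneuver would show up as a persistent bias in the innovation sequence.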

1.3. Multi-Target Tracking Application & History

Multiple target tracking systems are designed for a variety of applications. The applications include surveillance and tracking of targets, instrumentation, air traffic control, ground mapping, detection and recognition of objects, robotic vision systems, and infrared vision homing missile systems, to name a few. Radar tracking has a twenty-year history behind it. Operators of radar systems used to connect "blips" on radar screens. In 1955, Wax [1] noticed the similarity between the radar tracking problem and a fundamental problem in nuclear physics, where it is required that the path of an actual particle be identified against a background of random noise. He proposed that the elements of initial track formulation (birth), track maintenance (life), and track deletion (death) were common to all problems of multiple target tracking. Soon afterward, the modeling of radar signals as time series of stochastic processes was introduced.

The next major breakthrough in MTT theory came in 1964 with a paper published by Sittler [2] on Bayesian formulation. The use of Bayesian theory was a first step in the recursive estimation of states of interest. Sittler's work occurred before Kalman filtering was widely adopted as an approach to recursive state estimation and prediction of the target position. Thus, it was not until the early 1970's that MTT theory became a major topic of interest. The papers by Bar-Shalom and Singer [3,4] heralded the development of modern MTT techniques that combined correlation and Kalman filtering theory.

An eminent scientist who introduced stochastic filtering theory is Kolmogorov. He was among the first to study minimum mean-square estimation in stochastic processes during the late 1930's. He also stated a fundamental theorem that has been used in explaining the mapping properties of multilayer perceptrons, which form the basis for the trajectory interpolation and estimation used in this dissertation [5]. The relevant work by Wiener [6] introduced and solved the problems of linear filtering and prediction for stationary random processes which can be described in terms of their power spectral densities. The underlying mathematical difficulties facing Wiener's solution generated additional research that led to the development of the recursive time domain approach by Levinson and later by Kalman and Bucy [1]. Kalman and Bucy used state-transition models for dynamic systems. The theory was then extended to filtering and prediction in nonlinear dynamic systems as well as to adaptive estimation problems, which are the focus of this dissertation.

1.4. Neural Network Approach

Parallel distributed processing is a new architecture for fast processors that provides the powerful computational capabilities required for most target tracking problems. In tracking multiple maneuvering targets in a cluttered environment, the requirement of a large bandwidth for multiple sensors puts a high load on the computational efficiency of the tracking system. The computational effort required for tracking n targets can grow exponentially with the number of sensors observing the scene or the number n of targets present.

There is an inherent parallelism in the nature of tracking multiple targets that, if used efficiently, can render the tracking system less dependent on the number of targets present. Neural network architectures can remove some of the limitations of model-based maneuver tracking filters. With regard to the background clutter, the training ability of a neural network and the distributed nature of its memory allow it to store the various clutter distribution parameters through training, which results in a better method of representing the target and the clutter fluctuations.

1.4.1. What is Neural Computing?

Neurocomputing is a computational process inspired by human cognition processes. Simple neural structures can handle complex combinatorial functions that even the most powerful digital computers cannot. The neural network is an analogy to the human brain. Each neuron is a simple processing unit which receives signals as inputs and performs a weighted summation followed by a nonlinear thresholding on them. When the total signal magnitude is large enough to pass the threshold, the neuron is said to fire, producing an output signal. The human brain consists of billions of neurons that are densely interconnected. Artificial neural networks are also based on this structure, and the elements are usually organized into groups called layers. A typical network consists of a sequence of layers which are fully or partially connected. The connections may be only in a forward direction, or the network may include lateral as well as feedback connections. The way the processing elements are connected determines the architecture of the neural network. Each network structure is able to learn a class of problems, and the training for each network may be different.
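The behavior of a single processing element described above can be written in a few lines. The sketch below shows a hard-threshold neuron and, as a commonly used smooth alternative, a sigmoid neuron. The function names, weights, and threshold values are illustrative assumptions and are not taken from the networks used later in this dissertation.

```python
import math


def threshold_neuron(inputs, weights, threshold):
    """Fire (output 1) when the weighted sum of the inputs reaches the threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= threshold else 0


def sigmoid_neuron(inputs, weights, bias):
    """Smooth version: weighted sum plus bias passed through a sigmoid."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-activation))


# With these assumed weights the unit fires only when both inputs are active.
both_on = threshold_neuron([1, 1], [0.6, 0.6], threshold=1.0)
one_on = threshold_neuron([1, 0], [0.6, 0.6], threshold=1.0)
midpoint = sigmoid_neuron([0, 0], [1.0, 1.0], bias=0.0)
```

The sigmoid form is differentiable, which is what gradient-based training rules rely on.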

1.4.2. Network Operation

There are two main phases of operation ill: neural networks, namely learning

and recall. Learning, by definition, is the process of adapting or modifying the

connection weights in response to input vectors. IT the desired response is provided

at the output to compare with the actual network output, then the learning process

is called supervised learning [5]. IT no desired output is shown to the network, then

the learning is called unsupervised learning [5]. There are other forms of learning

that lie between these two types of learning. Whatever the learning procedure,

there must be some sort of learning rule that specifies how the weights are adapted

in response to new examples. The required training for a neural network is generally

a long process and may require several thousands of examples to be presented to the

network many times. The learning parameters may change over time to modify the

learning speed. The control of the learning parameters is called a learning schedule

[5].
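As a concrete instance of a supervised learning rule with a learning schedule, the sketch below uses the classical delta (least-mean-squares) rule on a single linear element. The text does not name a specific rule at this point, so the choice of the delta rule, the function name `train_delta_rule`, the decay formula for the learning rate, and the toy examples are all assumptions made for illustration.

```python
def train_delta_rule(examples, n_inputs, epochs=200, rate0=0.1, decay=0.01):
    """Adapt the weights of one linear element from (input, desired) pairs."""
    weights = [0.0] * n_inputs
    for epoch in range(epochs):
        # Learning schedule: the learning rate shrinks slowly over time.
        rate = rate0 / (1.0 + decay * epoch)
        for x, desired in examples:
            actual = sum(w * xi for w, xi in zip(weights, x))
            error = desired - actual  # supervised: compare with desired output
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights


# Toy examples generated by the (assumed) target mapping d = 2*x1 - x2.
examples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
learned = train_delta_rule(examples, n_inputs=2)
```

The examples are presented many times, as the text notes is typical, and the weights approach the values 2 and -1 that generated the desired responses.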

An incoming input vector may or may not have been presented to the network

before. If the learning process has been completed successfully, then the

network will be able to recall the desired response to any given input vector. The

simplest form of a network is one which has no feedback connections from one

layer to another, or from one neuron to itself. This kind of network is called a

Feedforward Network [5]. In feedforward networks, the information is passed from

the input buffer through the hidden layers (the intermediate layers) to the output

layer. Feedforward networks are powerful mapping tools since they use nonlinear

transformation functions.
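The recall phase of a feedforward network can be sketched as repeated application of a layer mapping, from the input buffer through the hidden layer to the output layer. The layer sizes, weight values, and function names below are hypothetical, and the hyperbolic tangent is used as one possible nonlinear transformation function.

```python
import math


def layer_forward(signal, weight_rows, biases):
    """Propagate a signal through one fully connected layer of tanh units."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, signal)) + b)
        for row, b in zip(weight_rows, biases)
    ]


def feedforward(inputs, layers):
    """Pass the input vector through each layer in turn (no feedback paths)."""
    signal = list(inputs)
    for weight_rows, biases in layers:
        signal = layer_forward(signal, weight_rows, biases)
    return signal


# Hypothetical 2-3-1 network: two inputs, three hidden units, one output.
hidden = ([[0.5, -0.4], [0.3, 0.8], [-0.7, 0.2]], [0.0, 0.1, -0.1])
output = ([[1.0, -1.0, 0.5]], [0.0])
response = feedforward([0.2, -0.6], [hidden, output])
```

Because every unit applies tanh, each output lies strictly between -1 and +1 regardless of the weights chosen.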

1.4.3. Information Processing

Neural networks offer new information processing capabilities. Their architectures

are quite different from those of conventional digital computers. There

are parallel processors based on serial machines, but the neural network parallel

processing is different. Conventional computers process inputs one item at a time

and, in a sense, they lose the overall picture. Sequential processing has always had

a difficult time detecting the patterns hidden in the information presented due to

an inability to gain a view of the whole picture. Neural networks process many

inputs at once and work toward reaching a stable picture. For this reason, neural

networks can do a lot of tasks that are nontrivial for conventional computers with

considerably less effort. Tracking the state of a dynamic system, as an example,

is very time consuming and expensive for a sequential machine, while a human

observer can very well identify an object and track its trajectory as well as making

higher level decisions about the object.

Tracking demands an immense amount of computational power. It is a very

complex process whose success depends on the availability of computer resources

and their processing speed. As such, even powerful computers will soon reach their

capability limits in typical tracking problems. As massively parallel processors,

neural networks have proven to be extremely useful in many problems of practical

interest in tracking. Furthermore, using conventional parallel processing requires

complex programming skills to take full advantage of the conventional parallel

machines. Sequential processing is good for procedural tasks that can be put in

ordered steps, and parallel processing with this architecture is extremely difficult

for the class of problems that cannot be put in an exact mathematical or procedural

form.

There are a number of technical problems for which neural networks have

demonstrated a strong potential for providing efficient solutions. These problems

include but are not limited to pattern recognition, classification, speech processing,

image understanding, radar processing, robotic control, target tracking, and missile

guidance. Once the network is well-trained, it can respond to the problems almost

instantly. The same architecture and hardware can be applied to a variety of

other problems as well. Although most of the neural network research requiring

simulations is still done on sequential machines, analog neural network circuits have

already been built and tested [5]. The processing elements are only a first order

approximation of biological neurons. Biological neurons are known to perform a lot

more than the simple operation of summation and thresholding in artificial neurons.

Therefore, neural network computing (neurocomputing) is about new architectures

for computing machines that complement the serial processing machines.

Neurocomputing is inspired by the surprisingly massive parallelism of the

brain. As an example, neurobiology has discovered that even in the human brain,

certain architectures exist that do particular tasks and this knowledge has shed

some light on the state of the neural network research. Therefore, instead of using

a large network to learn a complex problem, it is better to use a number of smaller

networks such that each network is trained for a specialized task. As neurobiology

discovers new frontiers about the human brain, more will be known about

neurocomputing. Neural network technology is a multidisciplinary field which is

what makes it grow so fast.

1.4.4. Neural Networks vs. Artificial Intelligence

In Artificial Intelligence and expert systems, knowledge is made explicit

through the formation of rules. Every rule can be defined by some sort of mathematical

function. The function may be arbitrarily complex and highly nonlinear.

A function, however, is a metaphor and transforms data into another form. If rules

can be defined by functions, the problem can be presented as a pattern-matching

problem. The nonlinear transfer functions of processing elements in a neural network

allow representation of any arbitrary nonlinear function. The learning of the

function by a neural network means that the network has learned the rule. Training

can be performed by examples and the network can extract the rules through

the examples. The main function of a neural network is pattern recognition.

Traditional expert systems and statistical systems also use pattern recognition

as a scheme. Although neural networks do the same, they do it more efficiently.

Pattern recognition requires the ability to match large amounts of information

simultaneously (in parallel) and then produce a categorized output. Learning, on

the other hand, is the extraction of rules and generalization of the rules to similar

classes of problems. Neural networks can internally build the structures and features

pertinent to a problem. This is in contrast to the statistical techniques

that require more resources and information processing before they can solve classification

problems. Neural networks can organize the data and extract higher order

statistics and learn from data with minimal external intervention.

Knowledge representation is useless unless efficient recallability of the knowledge

is also available. Related data bases and knowledge matching are the basic

requirements of AI units. Once knowledge is represented, it should be efficiently

updated, which means knowledge is preserved as modules that need to be refreshed

or fused to new pieces of information. Processing knowledge with structured programming

techniques is a very difficult task since it is a distributed process. As

such, many unlike pieces of information need to be put together to produce a

high level decision. It is also very expensive to do knowledge processing. Neural

network associative memory and real time pattern matching provide extremely

powerful tools for approaching fuzzy and inexact algorithmically complex problems.

Knowledge is represented in a neural network through connection weights.

The weights are the memory units of a neural network and are well distributed

throughout the network.

The fault tolerance of a neural network is due to the distribution of information

among the weights. If some processing elements are destroyed, or some

connections are altered, the performance of the network would degrade gracefully

and this is because, unlike traditional computing systems, information is not contained

in one place [5]. The fault tolerant characteristics of neural networks make

neural computing systems extremely well suited for applications in which failure in

control or damage to memory units can cause disastrous results such as in nuclear

power plants, air traffic control, space operations, missile guidance, and the like.

1.5. Mathematical Preliminaries for Study of Neural Networks

Each processing element of a neural network has a transfer function [5,93].

Transfer functions are generally nonlinear functions such as a sigmoid, or a hyperbolic

tangent function. They have a fundamental role in the adaptation of the

network and determine the sensitivity of the processing element to its input from

previous modes. Processing elements can receive inputs belonging to multiple input

classes. Separate inputs to each mode may play the role of an activation signal

derived from a common scheduling process to activate a set of processing elements

synchronously. The parallelism in updating provides a means for parallelization

of neurocomputing hardware. A typical processing element input from a class of

inputs is a weighted sum in the following form

$$I_k = \sum_{j \in K} w_{kj} x_{kj} \tag{1.1}$$

where the sum runs over the set $K$ of class $k$ inputs and $w_{kj}$ is the weighting factor for class $k$ inputs on the $j$th input of class $k$.

The weighting coefficient $w_{kj}$ is called the local memory of the processing element

associated with class $k$. The mathematical data type of these weights associated

with the connections of each class of inputs can be defined as desired. The processing

element output is usually a signal in the range $[-1, +1]$ and can be either

digital or analog. For processing elements with n outputs, the output vector is

$\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$ where $x_i$ represents the output signal of the ith element.

The domain of the vectors is the n-dimensional cube $[-1, +1]^n$.

As mentioned earlier, weights are adaptive coefficients within the network

that determine the intensity of the input signal. They are adjusted according to a

learning rule and the network learns by adjusting its weights according to the rule.

Some initial weight vectors should be assumed to start the learning. Assuming

there are N processing elements in a layer, the weight vector for that layer may

be represented by $\mathbf{W} = (\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_N)^T$. Assuming that there are n weights

associated with each processing element, then $\mathbf{w}_i$ would be the weight vector for

each processing element. Further details can be found in [5].

1.6. Organization of the Dissertation

This dissertation mainly focuses on the employment of static multilayer

neural networks for the adaptive detection and tracking of a maneuvering target in

a cluttered environment. To show the diversity of available algorithms for detection

and tracking of a maneuvering target in clutter, some of the schemes that are often

discussed in the literature are briefly outlined in Chapter 2. We will also specifically

describe the mathematical preliminaries of the MTT subsystems in Chapter 2.

Chapter 3 concentrates on the design and implementation of a Neural

Network-based Constant False Alarm Rate (NN-CFAR) processor. We shall

demonstrate the superior performance of the NN-CFAR processor over the traditional

Cell Averaging Constant False Alarm Rate (CA-CFAR) processor. We

will also discuss the statistical properties of the input parameters that are carefully

selected for a good representation of the background clutter. Although the

discussion is basically over the training of the neural network in a homogeneous

background, it will be shown through several simulation runs that the method can

be extended to a non-homogeneous background as well. It will be shown further

that the use of additional input parameters to the neural network can greatly improve

the CFAR detection performance, particularly when the number of reference

cells is less than about 30.

A neural network implementation of a Moving Target Indicator (NN-MTI)

will be discussed in Chapter 4. Different neural network architectures will be presented

with different sets of inputs to demonstrate the neural network properties

in application to Coherent Pulse Integration (CPI). We will show how the neural

network can compute the optimal weights for the received pulses in order to extract

the Doppler information. This will be shown both in the absence and in the

presence of clutter in the vicinity of the target. The extraction of the Doppler shift

directly through the amplitude distribution of the pulses can eliminate much of the complex

processing required by the traditional methods.

In Chapter 5, use is made of the nonlinear mapping property of neural networks

to implement a hybrid maneuver detector and compensator. We shall discuss

how a combined set of parameters may reflect sufficient information about a target

maneuver. We will then use these parameters as inputs to a neural network. The

two outputs are then used to compensate for the bias which has been accumulated

during the previous sampling period. We also discuss how this scheme can

be extended to other types of target maneuvers (e.g., circular). We will perform

an in-depth analysis of the mathematics involved in the compensation property of

this unit. Comparison with some of the powerful traditional maneuver detector

and compensator techniques will illustrate that the neural network can be used

in conjunction with the Kalman filter for realizing an on-line adaptive tracking

filter, particularly when the duration of acceleration is comparable to the sampling

period.

The dissertation is concluded in Chapter 6 which summarizes the specific

contributions and outlines some possible extensions for further research. These

suggestions include the use of recurrent networks for some typical radar detection

and tracking problems as well as integration of target tracking subsystems with

guidance units.

1.7. Contribution of the Dissertation

Radar signals are processed and analyzed through time series representation

of the amplitude and phase of the returned echoes. In recent years, stochastic

models of signals have provided a major mathematical framework for a good

representation of random processes such as echoes from natural objects or other

man-made fluctuating targets. However, analysis and design of signals have been

formidable tasks when more than one or two parameters are involved in signal

representation or other parameter estimations related to the random process are

required. Elimination of the requirement for a precise and detailed mathematical

model of radar signals together with the parallel processing capabilities useful for

on-line implementation of detection and tracking algorithms make neural networks

very valuable, particularly in dynamical situations [94-106]. In this dissertation, we

have concentrated on three principal directions that play important roles in almost

any complete radar system. In this regard, the contribution of the dissertation can

be outlined as follows.

1) Performance of conventional Constant False Alarm Rate (CFAR) processors

degrades sharply as the number of reference cells (i.e., clutter samples

in the vicinity of the target) decreases. The need for more reference cells

in turn is due to the statistical requirements for the parameters that are

used in designing these processors. In reality, however, only a limited number

of reference cells is usually available. This may be either due to radar

constraints such as resolution and sampling time or the presence of other

interfering targets as well as clutter patches in the vicinity of the primary

target. In this research, a novel adaptive autodetection technique is developed

through the Neural Network implementation of a Constant False Alarm

Rate processor (NN-CFAR) that combines a number of parameters and

uses them in a single processor. A multilayer feedforward network with a

backpropagation training scheme is used. The Optimum Detector is used as the

main source of training examples. The NN-CFAR, which is designed for the

homogeneous background, enhances the performance of the traditional

CA-CFAR by the use of additional parameters. Performance of NN-CFAR is

then compared with the CA-CFAR processor. It is shown that NN-CFAR

performs better than the conventional CA-CFAR processor, particularly when

the number of reference cells decreases. This is a very important feature in

practical implementations where the number of resolution cells might be

limited. Other advantages are the speed of response and the ease of hardware

implementation.

2) The Neural Network implementation of a Moving Target Indicator

(NN-MTI) is also presented as an alternative solution to the conventional pulse

cancellation techniques. Radar pulse amplitudes are modulated by the Doppler

frequency shift, which is due to a physical property of the electromagnetic

interaction of pulses with a moving target. We make use of a multilayer neural

network to calculate the optimal weights that are necessary for Coherent

Pulse Integration (CPI) in order to extract the Doppler shifts directly

without further complex electronic hardware.

3) A hybrid approach to maneuver modeling is presented for tracking a

maneuvering target using a multilayer feedforward neural network that works

in conjunction with a Kalman filter. The network is trained for a straight

line motion of the target executing sudden longitudinal accelerations. This

approach provides a framework for a fundamentally different method of

maneuver modeling compared to those which employ purely statistical

techniques. In particular, these latter techniques require more samples from the

target to compensate for the bias which is induced by target acceleration.

Furthermore, inclusion of more features in the estimation process for a more

accurate maneuver modeling results in an increase in computational

complexity, particularly in situations involving short term accelerations. With

neural network maneuver modeling, on the other hand, we can minimize

the required number of samples by utilizing more features. It is shown that

the neural network maneuver modeling scheme performs better than purely

statistical techniques, particularly when the duration of acceleration is

comparable to the period of measurement.

CHAPTER 2

REVIEW OF MULTIPLE TARGET TRACKING THEORY

2.1. Overview of Linear Filtering

In a radar environment, the data rate is generally higher than the rate that

can be efficiently handled by the available processing power of digital computers.

Several detection schemes based on statistical linear signal processing techniques

have been proposed in the literature [15-17]. In almost all radar detection mechanisms,

decisions are made based on comparing the output of the receiver with

some threshold level. If the envelope of the receiver output is greater than the

threshold, a signal is said to be present. The decision, however, depends on the

rate of false alarm that can be tolerated. Preserving a constant false alarm rate is

peculiar to radar detection. A false alarm means that the detector declares that a

target is present while the true detected signal is due to noise only. In the literature

of statistics, this is called a Type I error [8]. There is also another type of error,

the Type II error, which is due to missed detection and occurs when the signal amplitude is below the

threshold setting and therefore is declared as noise. It is not feasible to minimize

both types of errors simultaneously. Hence, in radar detection, one attempts to

minimize the Type II error while holding the Type I error (i.e., false alarm) at a tolerable level.

A decision criterion which has been widely accepted by the radar community

is the Neyman-Pearson observer. In this technique, the threshold is fixed by

allowing a certain false alarm probability. This is called Constant False Alarm

Rate (CFAR) processing. The signal-to-noise ratio of a single pulse, however, is

not always high enough for reliable decisions, and the probability of detection

can be increased by utilizing multiple pulses as opposed to a single pulse. If the

shape of the signal were known as well as the background, then a matched filter

preceded by a whitening filter could be developed to result in an optimal decision.

Since there are so many random parameters affecting both the signal and the background,

samples from the background are required to help in classifying the target

and the clutter.
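To make the role of the background samples concrete, the sketch below shows a conventional cell-averaging CFAR test, in which the threshold for a cell under test is scaled from the average of neighbouring reference cells. It is not the neural network processor developed later in this dissertation. It assumes square-law detected samples whose noise power is exponentially distributed in a homogeneous background, for which the textbook multiplier is N(Pfa^(-1/N) - 1); the function names, guard-cell handling, and sample data are hypothetical.

```python
def ca_cfar_multiplier(num_ref_cells, pfa):
    """Threshold multiplier giving false alarm probability pfa for N reference cells."""
    return num_ref_cells * (pfa ** (-1.0 / num_ref_cells) - 1.0)


def ca_cfar_detect(cells, index, num_ref_each_side, num_guard, pfa):
    """Compare cells[index] with a threshold scaled from the mean of its neighbours."""
    left_stop = max(0, index - num_guard)
    left_start = max(0, left_stop - num_ref_each_side)
    right_start = index + num_guard + 1
    reference = cells[left_start:left_stop] + cells[right_start:right_start + num_ref_each_side]
    if not reference:
        raise ValueError("no reference cells available for this index")
    noise_estimate = sum(reference) / len(reference)
    threshold = ca_cfar_multiplier(len(reference), pfa) * noise_estimate
    return cells[index] > threshold, threshold


# Hypothetical power samples: flat background with one strong return in cell 10.
cells = [1.0] * 21
cells[10] = 50.0
detected, threshold = ca_cfar_detect(cells, 10, num_ref_each_side=4, num_guard=1, pfa=1e-3)
```

The multiplier grows as the number of reference cells shrinks, which is one way to see why conventional CFAR performance suffers when few reference cells are available.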

2.1.1. Radar Signal Representation

Radar performance measures are generally expressed in probabilistic terms.

No precise definition of signals is available. This has led to the extensive use of

statistical detection theory [9] as the primary procedure for processor design in

radar systems. In general, statistical detection theory is an abstract process based

on hypotheses testing. Processing is optimized from the detection point of view.

There are generally two hypotheses. One is that the target is present ($H_1$) and

the other is that the target is absent ($H_0$). This test can be carried out over every individual

resolution cell. A sequence of observations gives rise to a set of conditional

probability density functions for every hypothesis. A likelihood ratio test is then

formed and a decision threshold is chosen based on some optimality criterion. The

optimality criterion for a Constant False Alarm Rate (CFAR) processor is to maximize

the detection probability under a constant false alarm probability. To further

analyze the processing performed on the signal, we must first represent the various

forms of signals, clutter, and noise as they propagate through the processor.

2.1.2. Statistical Description of Signals

Consider an ensemble of records of the same random process. If the records

are sampled at the same instant of time, then each sample will represent a different

value. This range of values satisfies the definition of a random variable and can

be described by a probability density function. For n samples, at time instants

$t_1, t_2, \ldots, t_n$, we have the n-dimensional density function

$$P(\mathbf{x}, \mathbf{t}) = P(x_1, x_2, \ldots, x_n;\, t_1, t_2, \ldots, t_n)$$

which statistically defines the random process. For a stationary random process,

the statistical moments (e.g., mean, variance) are constant with respect to time,

and $P(\mathbf{x}, \mathbf{t})$ reduces to $P(\mathbf{x})$.

Therefore, we can determine the statistics of the process from a single time record

because the ensemble statistics and time statistics are identical. A purely random

process is one for which the probability density function is given by

$$P(\mathbf{x}) = P(x_1) P(x_2) \cdots P(x_n).$$

This model is used extensively in the Marcum-Swerling [8] approach to radar detection

problems.

2.2. Random Processes

Random processes are stochastic models of time variations of signals. In

simple words, a random process is composed of a sequence of time-varying random

variables. As an example, the measured voltage across a noisy resistor at different

instants of time corresponds to a random process. As another example, the sequence

of radar pulse amplitude returns from a fluctuating target is also

a random process. One of the most important random process models which has

been used extensively in practice is the Gaussian process.

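A random process $\{x_i\}$ is Gaussian if the variables $x_1, x_2, \ldots, x_n$ have a

joint (n-dimensional) Gaussian probability density function for any set of values

$i_1, i_2, \ldots, i_n$ and any value of n. The Gaussian density function of a joint process

is given by

$$P(x_1, \ldots, x_n) = \frac{1}{(2\pi)^{n/2} \sqrt{|C|}} \exp\left[-\frac{1}{2} (\mathbf{x} - \mathbf{m})^T C^{-1} (\mathbf{x} - \mathbf{m})\right] \tag{2.1}$$

where $\mathbf{x}$ and $\mathbf{m}$ are the n-dimensional vectors

$$\mathbf{x} = (x_1, x_2, \ldots, x_n)^T, \qquad \mathbf{m} = (m_1, m_2, \ldots, m_n)^T$$

where $m_i$ corresponds to the mean of each random variable $x_i$. The covariance

matrix $C$ is defined by

$$C = E\left[(\mathbf{x} - \mathbf{m})(\mathbf{x} - \mathbf{m})^T\right] = [C_{ij}]$$

where

$$C_{ij} = E\left[(x_i - m_i)(x_j - m_j)\right]$$

and the correlation coefficient $\rho_{ij}$ is given by

$$\rho_{ij} = \frac{C_{ij}}{\sigma_i \sigma_j}.$$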

Assuming a stationary Gaussian process, we have

$$\sigma_i = \sigma_j = \sigma$$

and

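If the sampling times $t_1, \ldots, t_n$ have been chosen such that $x_1, \ldots, x_n$ are uncorrelated,

then

$$C = \begin{bmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_n^2 \end{bmatrix}$$

and hence

$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[-\frac{(x_i - m_i)^2}{2\sigma_i^2}\right] \tag{2.2}$$

which is the product of the n first-order density functions, since $x_1, \ldots, x_n$ are

independent.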
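The factorization for uncorrelated Gaussian samples can be checked numerically. The sketch below evaluates the joint density (2.1) for a diagonal covariance matrix and compares it with the product of first-order Gaussian densities. The function names and sample values are hypothetical, and only the diagonal case is handled, so the determinant and the quadratic form reduce to simple products and sums.

```python
import math
from statistics import NormalDist


def joint_gaussian_diagonal(x, m, variances):
    """Joint Gaussian density when the covariance matrix is diag(variances)."""
    n = len(x)
    det_c = math.prod(variances)  # determinant of a diagonal matrix
    quad = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, m, variances))
    return math.exp(-0.5 * quad) / ((2.0 * math.pi) ** (n / 2) * math.sqrt(det_c))


def product_of_first_order(x, m, variances):
    """Product of the n one-dimensional Gaussian densities."""
    return math.prod(
        NormalDist(mi, math.sqrt(v)).pdf(xi) for xi, mi, v in zip(x, m, variances)
    )


# Hypothetical sample point, means, and variances.
x = [0.3, -1.2, 2.0]
m = [0.0, 0.5, 1.0]
variances = [1.0, 4.0, 0.25]
joint = joint_gaussian_diagonal(x, m, variances)
product = product_of_first_order(x, m, variances)
```

The two values agree to floating-point precision, which is the numerical counterpart of the statement that uncorrelated Gaussian variables are independent.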

There are two other random process models that are of prime importance

in target detection and tracking, namely the Markov and the Wiener processes.

A Markov random process is a process in which the present value is dependent

only on the value of the process at the previous sample time and on a transition

probability. A Markov process is expressed by

$$P(\mathbf{x}) = P(x_1, x_2, \ldots, x_n) = P(x_n \mid x_{n-1}) \cdots P(x_2 \mid x_1) P(x_1) = P(x_1) \prod_{i=2}^{n} P(x_i \mid x_{i-1}).$$

The transition probabilities are called conditional probabilities and reflect the basic

mechanism of a Markov process. The order of the Markov process depends on the

conditional density functions and the number of previous samples that relate to the

present sample. These processes are generally used in the target tracking literature

and are basically for modeling the target maneuvers. One example is Singer's model

of target acceleration which makes essential use of the properties of the Markov

process.
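A first-order Markov sequence is easy to simulate, because each new sample depends only on the previous sample and a fresh random increment. The sketch below generates a discrete-time Gauss-Markov sequence of the kind commonly used as a stand-in for a correlated target acceleration. It is an illustrative discretization under assumed parameters, not the exact formulation of Singer's model; the names `rho` and `sigma` and the chosen values are hypothetical.

```python
import math
import random


def simulate_gauss_markov(n_steps, rho, sigma, seed=0):
    """Generate x[k] = rho * x[k-1] + sqrt(1 - rho^2) * sigma * w[k], w ~ N(0, 1)."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, sigma)  # start in the stationary distribution
    samples = [x]
    drive = sigma * math.sqrt(1.0 - rho ** 2)
    for _ in range(n_steps - 1):
        # The next value depends only on the present value (first-order Markov).
        x = rho * x + drive * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def lag_one_correlation(samples):
    """Sample correlation coefficient between consecutive values."""
    mean = sum(samples) / len(samples)
    num = sum((a - mean) * (b - mean) for a, b in zip(samples, samples[1:]))
    den = sum((a - mean) ** 2 for a in samples)
    return num / den


samples = simulate_gauss_markov(20000, rho=0.9, sigma=2.0)
estimated_rho = lag_one_correlation(samples)
```

The estimated lag-one correlation should be close to the assumed `rho`, reflecting the single-step memory of the process.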

The Wiener random process (or Brownian motion) is a limiting form of the

random walk, which is a sum of independent steps of size $s$, equiprobable in every

direction, taken at intervals $\delta$. The limiting process, when $s \to 0$ and $\delta \to 0$ such

that $s^2/\delta = \alpha$ is constant, yields a random process $w(t)$ with the following Probability

Density Function (PDF)

$$P[w(t)] = N[w(t);\, 0,\, \alpha t]$$

i.e., it is normal with zero mean and variance $\alpha t$. Note that the Wiener process is

nonstationary. It relates to white noise, denoted here as $n(t)$, by $\dot{w}(t) = n(t)$.
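The linear growth of the variance with time can be observed by simulating the underlying random walk. The sketch below assumes that the step size is tied to the interval by s = sqrt(alpha * delta), so that the variance after time t is alpha * t; the function names, the number of paths, and the parameter values are hypothetical choices for illustration.

```python
import math
import random


def random_walk_endpoint(total_time, delta, alpha, rng):
    """Sum equiprobable +s / -s steps, s = sqrt(alpha * delta), taken every delta."""
    step = math.sqrt(alpha * delta)
    n_steps = round(total_time / delta)
    return sum(step if rng.random() < 0.5 else -step for _ in range(n_steps))


def endpoint_variance(total_time, delta, alpha, n_paths=2000, seed=1):
    """Sample variance of the walk position at total_time over many paths."""
    rng = random.Random(seed)
    ends = [random_walk_endpoint(total_time, delta, alpha, rng) for _ in range(n_paths)]
    mean = sum(ends) / n_paths
    return sum((e - mean) ** 2 for e in ends) / (n_paths - 1)


# With alpha = 0.5 the variance at t = 2 should be near alpha * t = 1.0.
variance_at_2 = endpoint_variance(total_time=2.0, delta=0.01, alpha=0.5)
```

Because the variance depends on the elapsed time, the simulated process is nonstationary, in line with the remark above.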


2.3. Matched Filtering

A matched filter is a network whose frequency response function maximizes

the ratio of output peak signal power to mean noise power. The frequency response of this filter

is $H(f) = S^*(f)$, where $H(f)$ is the frequency response of the matched filter

and $S^*(f)$ is the complex conjugate of the input signal spectrum. The output of a matched

filter is not a replica of the input signal, and therefore the shape of the signal is

not preserved. The output of the matched filter is proportional to the input signal

cross correlated with a replica of the transmitted signal with a time delay. The

time delay is required for the filter to observe all of the signal before matching is

performed.
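In discrete time, the cross-correlation view of the matched filter can be sketched as sliding a replica of the transmitted pulse along the received samples and looking for the peak. The example below is a real-valued illustration (so the complex conjugate is trivial); the four-sample pulse, the noise level, the true delay, and the function names are hypothetical.

```python
import random


def correlate(received, replica):
    """Slide the replica along the received samples and sum the products."""
    n = len(replica)
    return [
        sum(received[k + i] * replica[i] for i in range(n))
        for k in range(len(received) - n + 1)
    ]


def estimate_delay(received, replica):
    """Return the lag at which the correlator output is largest."""
    output = correlate(received, replica)
    return max(range(len(output)), key=lambda k: output[k])


# Hypothetical scenario: a short coded pulse buried in weak Gaussian noise.
rng = random.Random(0)
pulse = [1.0, 1.0, -1.0, 1.0]
true_delay = 7
received = [rng.gauss(0.0, 0.1) for _ in range(20)]
for i, value in enumerate(pulse):
    received[true_delay + i] += value

delay = estimate_delay(received, pulse)
```

The correlator output is a peaked function rather than a copy of the pulse, which illustrates why the matched filter does not preserve the signal shape.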

2.4. Neyman-Pearson Criterion

While matched filtering is used for the detection of a known signal in the

background of known statistics (i.e., white Gaussian noise), we need a more general

method for the detection of random signals. There are two major schemes for

signal detection based on hypothesis testing and these are the Bayes criterion and

the Neyman-Pearson criterion. The Bayes criterion is based on the probabilities

about the source of information (i.e., the signal) and some cost functions. The

Neyman-Pearson test, on the other hand, maximizes the probability of detection

subject to the constraint that the false alarm probability does not exceed some

preassigned value. Therefore, the Neyman-Pearson criterion is directly applicable

to radar and sonar, while Bayes criterion is mainly used in communication systems.

Both criteria consist of two steps. First, the probability ratio forms the likelihood

function. Second, this ratio is compared against a threshold to make the decision

about the presence of a target signal.

The Neyman-Pearson test reduces to matched filtering if the signal is known

and the noise is white Gaussian. In radar and sonar systems, however, the

correlation matrix of noise in the observed data is unknown and adaptive techniques

are required to set the appropriate threshold in each reference cell (i.e., samples

from the background). In a typical radar detection system, between 10 and 20 pulses

may be passed through each resolution cell and the returned echoes from each cell

form a time series of samples which represent the object in that particular cell.

The target detection problem is further complicated due to the lack of information

about the statistical description of clutter. The likelihood ratio test is of the form

$$\lambda(y) = \frac{P_1(y)}{P_0(y)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \gamma$$

where $P_1$ and $P_0$, respectively, correspond to the hypotheses $H_1$ and $H_0$, and $\gamma$ is a

predefined threshold. The Neyman-Pearson criterion is a special case of the Bayes criterion.

Example

Let S be the target signal and n the background noise. Assume that the

Gaussian random variable n has zero mean and variance $\sigma^2 = 2$, and S is a constant

equal to either 0 or 1. That is, if the target is present then S = 1 and in the absence

of the target S = 0. We form the two hypotheses

$$H_0 : S = 0$$

$$H_1 : S = 1.$$

Then using a Neyman-Pearson test with $P(D_1 \mid H_0) = 0.1$ we want to form an

optimum decision rule. Assuming that the probability density function of the noise

is Gaussian, the probability density functions for the two hypotheses are given by

$$P_0(y) = \frac{1}{2\sqrt{\pi}} e^{-y^2/4} \qquad \text{and} \qquad P_1(y) = \frac{1}{2\sqrt{\pi}} e^{-(y-1)^2/4}$$

where $P_0$ is the Gaussian density function with zero mean and $P_1$ is the Gaussian

density function with the mean equal to 1 (i.e., the density function for the signal

plus noise).

The likelihood ratio is then given by

$$\lambda(y) = \frac{P_1(y)}{P_0(y)} = e^{(2y-1)/4}.$$

The Neyman-Pearson test is to choose $H_1$ if

$$e^{(2y-1)/4} \geq \lambda_0$$

or equivalently choose $H_1$ if $y \geq \gamma$. The new threshold $\gamma = (1 + 4 \ln \lambda_0)/2$ is obtained by taking the

natural logarithm of both sides of the above inequality. To determine the threshold,

the false alarm probability is

$$P(D_1 \mid H_0) = \int_{\gamma}^{\infty} P_0(y)\, dy = 0.1.$$

By referring to the statistical tables, $\gamma$ can be determined to satisfy the above as

$\gamma = 1.8$. Then,

$$P(D_1 \mid H_1) = \int_{\gamma}^{\infty} P_1(y)\, dy \approx 0.29.$$

This is the probability of detection based on the single observation y. With $\gamma = 1.8$,

the original threshold $\lambda_0 \approx 1.9$ and the decision rule is then: choose $H_1$ if

$\lambda(y) \geq 1.9$, otherwise choose $H_0$.

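The numbers in the example above can be reproduced without statistical tables. The sketch below uses Python's standard-library `statistics.NormalDist` to find the threshold on y for a given false alarm probability, then evaluates the resulting detection probability and the equivalent threshold on the likelihood ratio. The function name and argument names are hypothetical; the relation between the two thresholds is the one implied by the likelihood ratio exp((2y - 1)/4) of this particular example.

```python
import math
from statistics import NormalDist


def neyman_pearson_example(pfa=0.1, noise_variance=2.0, signal=1.0):
    """Return (gamma, detection probability, lambda_0) for the constant-signal example."""
    sigma = math.sqrt(noise_variance)
    noise_only = NormalDist(0.0, sigma)          # density under H0
    signal_plus_noise = NormalDist(signal, sigma)  # density under H1
    # Threshold on y such that P(y >= gamma | H0) equals the allowed pfa.
    gamma = noise_only.inv_cdf(1.0 - pfa)
    # Probability of detection: P(y >= gamma | H1).
    pd = 1.0 - signal_plus_noise.cdf(gamma)
    # Equivalent threshold on the likelihood ratio exp((2y - 1) / 4),
    # valid for signal = 1 and noise variance = 2 as in the example.
    lambda_0 = math.exp((2.0 * gamma - 1.0) / 4.0)
    return gamma, pd, lambda_0


gamma, pd, lambda_0 = neyman_pearson_example()
```

The computed threshold rounds to the tabulated value of 1.8, and the low detection probability shows how little a single observation offers at this signal-to-noise ratio.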
2.5. Noise in the Radar Receiver

A good model for narrowband receiver noise is

$$n(t) = a(t) \cos[\omega_0 t + \theta(t)]$$

where $\omega_0$ is the carrier frequency, $a(t)$ is the amplitude of the envelope modulation,

and $\theta(t)$ is the phase modulation. The basic assumption in a narrowband noise is

that the bandwidth of the noise process $B_n \ll \omega_0/2\pi$, which is always satisfied in

radar receivers. Most of the significant noise in a radar receiver is at the front end

of the receiver near the antenna. The narrowest bandwidth of amplifiers determines

the bandwidth of the noise. Band-limited noise is, of course, correlated and this

further complicates the detection process.

The envelope of the receiver thermal noise is almost always assumed to be Rayleigh distributed,

$$P(a(t)) = \frac{a(t)}{\sigma_n^2} \exp\left[-\frac{a^2(t)}{2\sigma_n^2}\right] \tag{2.3}$$

where $\sigma_n^2$ is the variance of the noise $n(t)$ and $P(\theta(t)) = 1/2\pi$. The phase

distribution is almost always assumed uniform. Yet another convenient mathematical

representation of the noise process is

$$n(t) = x_n(t) \cos \omega_0 t - y_n(t) \sin \omega_0 t$$

where $x_n = a \cos\theta$ and $y_n = a \sin\theta$. In this case, both $x_n$ and $y_n$ are Gaussian

distributed with zero mean and are independent. Their probability density functions

are given by

$$P(x_n) = \frac{1}{(2\pi)^{1/2} \sigma_n} \exp\left[-\frac{x_n^2}{2\sigma_n^2}\right], \qquad P(y_n) = \frac{1}{(2\pi)^{1/2} \sigma_n} \exp\left[-\frac{y_n^2}{2\sigma_n^2}\right]. \tag{2.4}$$
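The link between the two noise representations can be illustrated by drawing independent zero-mean Gaussian quadrature components and forming the envelope and phase from them. The sketch below checks two consequences of that model: the mean of a Rayleigh envelope is sigma_n * sqrt(pi / 2), and the phase is spread uniformly around the circle. The function name, sample count, and noise level are hypothetical.

```python
import math
import random


def simulate_envelope_and_phase(sigma_n, n_samples=20000, seed=0):
    """Draw Gaussian quadrature components and convert them to envelope and phase."""
    rng = random.Random(seed)
    envelopes, phases = [], []
    for _ in range(n_samples):
        x_n = rng.gauss(0.0, sigma_n)  # in-phase component
        y_n = rng.gauss(0.0, sigma_n)  # quadrature component
        envelopes.append(math.hypot(x_n, y_n))
        phases.append(math.atan2(y_n, x_n))
    return envelopes, phases


sigma_n = 1.5
envelopes, phases = simulate_envelope_and_phase(sigma_n)
mean_envelope = sum(envelopes) / len(envelopes)
mean_phase = sum(phases) / len(phases)
rayleigh_mean = sigma_n * math.sqrt(math.pi / 2.0)
```

The sample mean of the envelope lands close to the Rayleigh value, and the sample mean of the phase stays near zero, as expected for a phase uniform on an interval symmetric about zero.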

The signal seen at the receiver is the combination of signals received from the

target, noise, and clutter. For a non-fluctuating target, the returned radar pulse

$r(t)$ with a maximum amplitude of $C$ will be

$$r(t) = C \cos(\omega_0 t + \phi).$$

Therefore, the received signal $S(t)$ will be of the form

$$S(t) = C \cos(\omega_0 t + \phi) + a(t) \cos[\omega_0 t + \theta(t)]$$

$$S(t) = [C \cos\phi + x_n(t)] \cos \omega_0 t - [C \sin\phi + y_n(t)] \sin \omega_0 t$$

and the resulting density function of $S(t)$ is given by [9]

(2.5)

where $C^2/2\sigma_n^2$ represents the signal-to-noise ratio (SNR).

2.6. Radar Clutter

The radar clutter echoes consist of radar returns from unwanted reflectors

and they often obscure the signal from targets of interest. Clutter is generally

caused by such things as rain, sea, clouds, chaff, or mountains. There are occa

sionally other interferences due to jamming which may have some intelligence, or

may be totally random in nature. Clutter, however, is a term used for the natural

kind of interference in the target background. Examples of radar targets are ships,

aircraft, satellites, and missiles. Clutter returns are often much stronger than

target returns and the processing requirement is to increase the signal-to-clutter

ratio (S/C). Depending on the type of target and its dynamics, several different

techniques are available. High resolution radars, for example, have the advantage

of small resolution cells and therefore see less clutter. Clutter is usually distributed

over a larger area than the target and is generally correlated both temporally and

spatially from one resolution cell to another.

The processors that reject clutter are divided into two major categories,

namely the CFAR processors and the MTI (i.e., Moving Target Indicators). While

CFAR processors are generally used for stationary targets, the MTI processors are

used for moving targets. MTI is a generalization of CFAR and includes doppler

resolution cells in addition to range and azimuth cells. In this dissertation, the

implementation of both detectors using neural networks will be investigated.

2.6.1 The Clutter Statistics

The main objective of the CFAR and MTI designs is to suppress the background

clutter. Clutter distributions are generally unknown prior to the receiver

design but certain distribution families are used for quite a number of practical

situations. Clutter fluctuations are represented by amplitude statistics as well as

frequency spectra. The amplitude statistics give information about the percentage

of time during which the returns have a given range of values. The frequency

spectra, on the other hand, represent how rapidly the amplitude values change.

A classical radar distribution of clutter is the Rayleigh distribution, which represents

a background with a large number of equal-size scatterers with uniformly distributed

phases. If one of the scatterers is dominant and is much larger than the other

scatterers, then the distribution would change to a Rician distribution [9]. In the

higher frequency radar systems, the resolution cells (i.e., the minimum dimensions

that can be resolved by the radar) and therefore the individual scatterers are of

smaller size and a larger deviation from the mean scattering value may occur which

results in longer tails in the probability density functions such as Log-normal and

Weibull density functions [9]. These kinds of densities occur in the low depression

angles and result from large scatterers that are shadowed most of the time but are

observed occasionally. The Log-normal clutter distribution of $v_i$ is given by

$$P(v_i) = \frac{1}{\sqrt{2\pi}\,\sigma_c v_i}\exp\left\{-\frac{1}{2\sigma_c^2}\left[\ln\left(\frac{v_i}{\mu_c}\right)\right]^2\right\} \tag{2.6}$$

where $v_i \ge 0$, and $\mu_c$ and $\sigma_c$ are the two parameters of the distribution and $v_i$ is

the target echo in each resolution cell. If we let $\xi_i = \ln v_i$, the resulting distribution

would be

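$$P(\xi_i) = \frac{1}{\sqrt{2\pi}\,\sigma_c}\exp\left[-\frac{(\xi_i - \ln\mu_c)^2}{2\sigma_c^2}\right] \tag{2.7}$$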

which is a Gaussian distribution. This means that an ideal logarithmic amplifier

in front of the CFAR detector would make the CFAR processing easier and this

scheme has been used in practice. The Weibull clutter density is given by


It should be noted that the Rayleigh density is a special case of the Weibull density.

Note also that Log-normal and Weibull represent families of distributions. The

target and the clutter can be classified efficiently provided that the test statistics

can be extracted such that they represent the correct member of the family based

on the two parameters estimated by CFAR.
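The remark about the logarithmic amplifier can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the Log-normal density in the form where $\ln(v_i/\mu_c)$ is zero-mean Gaussian with standard deviation $\sigma_c$, and the parameter values `MU_C` and `SIGMA_C` are hypothetical, not values from the dissertation.

```python
import math
import random
import statistics


def log_amplifier(samples):
    """Ideal logarithmic amplifier: maps each amplitude v_i to ln(v_i)."""
    return [math.log(v) for v in samples]


# Hypothetical clutter parameters, chosen only for illustration.
MU_C = 2.0     # scale parameter mu_c of the Log-normal density
SIGMA_C = 0.5  # shape parameter sigma_c of the Log-normal density

rng = random.Random(0)
# lognormvariate(mu, sigma) draws exp(N(mu, sigma^2)), so mu = ln(mu_c).
clutter = [rng.lognormvariate(math.log(MU_C), SIGMA_C) for _ in range(50_000)]
logged = log_amplifier(clutter)

mean_logged = statistics.fmean(logged)  # expected to be close to ln(mu_c)
std_logged = statistics.pstdev(logged)  # expected to be close to sigma_c
```

After the logarithm, the sample mean and standard deviation estimate $\ln\mu_c$ and $\sigma_c$ directly, which is why a CFAR stage working on the logged data only has to deal with Gaussian statistics.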

Both frequency and autocorrelation functions of clutter are generally used to

describe it, but frequency spectra are more often used. The clutter autocorrelation

function is usually assumed exponential and its spectrum is given by

$$P(f) = \frac{A}{1 + (f/f_c)^n}$$

where A is the mean value of the power density.

2.7. Target Modeling

Target amplitude fluctuation can greatly modify the signal-to-noise ratio

required to achieve a high probability of detection [90,92]. The more the target

characteristics are known, the better separability from clutter will occur. Since

clutter from natural objects is difficult to model analytically due to their large

varieties, a priori knowledge of the clutter model is not normally available. Despite

a lot of research in this area, the statistical models for radar environment and

clutter information still lack the desired performance. The target modeling, on the

other hand, has been easier because more is known about targets of interest and

there are only a few classes of targets of interest in each radar detection scenario.


Furthermore, one has to be concerned about the mathematical complexities of

these models. The analysis of signals gets more intricate as they propagate through

different units of the radar processor. That is why most of the target and/or clutter

models in the literature are from the same family of distributions to further simplify

the analysis and the design of the radar processors (e.g., CFAR).

Most targets of interest in radar detection scenarios are mainly man-made

objects which are more structured and result in some specific features in the reflected

signal. The challenge is then to design receivers that can extract these structured

signals buried in background noise and clutter. This is exactly what inspired this

research for neural network-based implementation of conventional CFAR and MTI

detectors. Neural networks are able to extract features in the structured target

echoes with some details that are hidden to conventional statistical receivers. As

an example, much research has been done to detect maneuvers when they are just

about to happen. Each target begins some preliminary actions to prepare for a maneuver,

like banking its wings or lifting the aircraft's nose. These features are very

difficult to extract from a moving target with conventional detectors. Furthermore,

these signals are generally distorted by eclipsing of pulses and other scintillation

parameters which are due to the relative motion of the target with respect to the

radar.

Targets are either fluctuating or non-fluctuating. The model [9] used for

non-fluctuating targets is called the Marcum model and is mainly used for distant

targets where fluctuation does not affect the SNR significantly. For higher resolution

radars, however, the target is always fluctuating due to rapid changes in

the aspect angle. The well-known four Swerling models [8,9] have long been used

for statistical target modeling [90,92]. The Swerling models are of two classes, the


Rayleigh and Chi-square models [9]. Each model then includes two other models

which correspond to slow and fast fluctuations. The Rayleigh distribution is a

special case of the Chi-square distribution with one degree of freedom. It is given

by

(2.8)

where $\bar{\gamma}$ is the average radar cross section (RCS) of the target and $\gamma$ is the instantaneous

RCS. The case of slow fluctuation is defined as a situation where pulses are

correlated in every individual scan, but are independent from one scan to another,

whereas rapid fluctuation is the case where the pulses are independent even within

a single scan. The Chi-square distribution with two degrees of freedom represents

a target with a single dominant scatterer which is non-fluctuating and is surrounded

by smaller individual scatterers. The Chi-square distribution has two cases as

well: the slow fluctuation, in which pulses are correlated in individual scans, and

the fast fluctuation, with independent pulses in every scan. The probability density

function for the Chi-square distribution is given by [9]

Assuming square law detection with N independent pulses and Rayleigh

density (non-coherent integration), we get a Chi-square density function with 2N

degrees of freedom. The output of the square law detector would be

$$z = \sum_{i=0}^{N-1} x_i^2 \tag{2.9}$$

where $x_i$ has Gaussian distribution with zero mean and variance of $\sigma^2$. The threshold

for detection is derived in [9] and is given by

(2.10)

where $T_m$ is called the threshold multiplier which is calculated in a different way

for each particular CFAR detection algorithm [9] and $\mu$ is the statistical mean of

the signal amplitudes in the reference cells. This threshold description is applied

in a system with Rayleigh pulse amplitude distribution and non-coherent pulse

integration. When a theoretical threshold is not known exactly, a greater SNR is

required to produce the desired $P_d$ (probability of detection) for an allowed $P_{fa}$

(probability of false alarm). This increase in SNR is called the CFAR loss.

The Swerling I, II models are generally used for a closed form analysis of

CFAR detectors due to the fact that these are the more standard mathematical

models. Furthermore, these models are often used for comparison of different CFAR

schemes. In the Swerling I model with pulse-to-pulse correlation and independent

scan-to-scan Rayleigh reflection, the probability density is $P(x) = \frac{1}{\bar{x}}e^{-x/\bar{x}}U(x)$,

where $x$ is the signal-to-noise ratio and $\bar{x}$ is the average signal-to-noise ratio. The

probability of detection with a single pulse, i.e., $N = 1$, is given by [7]

where $Y$ is the detection threshold. For $N$ pulses, $P_d$ is given by

$$P_d = P(N-1, Y) + \left(\frac{z+1}{z}\right)^{N-1} e^{-Y/(z+1)}\left[1 - P\left(N-1, \frac{zY}{z+1}\right)\right], \quad N \ge 2$$

where

$$P(N, Y) = \sum_{m=0}^{N-1}\frac{Y^m}{m!}e^{-Y}$$

and $z = N\bar{x}$, with $\bar{x}$ the average signal-to-noise ratio on each sample.

Based on these equations, the detection curves are generated for different levels

of SNR for each specified level of $P_{fa}$. It should be mentioned that most of these

curves are generated for Swerling models only. This is merely due to the availability

of closed form expressions for the probability of detection. Swerling case one and

two are more suitable for aircraft, while case three and four are suitable for rockets,

missiles, and satellites [8].
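A detection curve of the kind described above can be sketched as follows. This is a minimal illustration, assuming the Swerling I expression for $N \ge 2$ has the form $P_d = P(N-1,Y) + ((z+1)/z)^{N-1} e^{-Y/(z+1)}[1 - P(N-1, zY/(z+1))]$ with $z = N\bar{x}$ and $P(N,Y)$ the finite sum defined in the text. The function names are hypothetical, and using $P(N,Y)$ as the false alarm probability is an assumption based on the zero-SNR limit of that expression.

```python
import math


def cumulative_sum_term(n, y):
    """P(N, Y) = sum_{m=0}^{N-1} Y^m / m! * exp(-Y), as defined in the text."""
    return sum(y ** m / math.factorial(m) for m in range(n)) * math.exp(-y)


def pd_swerling1(n_pulses, threshold, snr_avg):
    """Detection probability for N >= 2 integrated pulses (assumed form)."""
    if n_pulses < 2:
        raise ValueError("this expression is stated for N >= 2")
    if snr_avg <= 0.0:
        raise ValueError("average SNR must be positive")
    z = n_pulses * snr_avg
    ratio = z / (z + 1.0)
    return (cumulative_sum_term(n_pulses - 1, threshold)
            + (1.0 / ratio) ** (n_pulses - 1)
            * math.exp(-threshold / (z + 1.0))
            * (1.0 - cumulative_sum_term(n_pulses - 1, threshold * ratio)))


def pfa_limit(n_pulses, threshold):
    """Zero-SNR limit of pd_swerling1, taken here as the false alarm probability."""
    return cumulative_sum_term(n_pulses, threshold)


# One point per SNR value gives a detection curve for a fixed threshold.
curve = [(snr, pd_swerling1(4, 10.0, snr)) for snr in (0.5, 1.0, 2.0, 5.0, 10.0)]
```

Sweeping `threshold` so that `pfa_limit` matches a specified false alarm level, and then sweeping `snr_avg`, would reproduce one detection curve per level. For very small SNR the expression subtracts nearly equal numbers, so a production implementation would need a more careful evaluation.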

2.8. Overview of MTI & CFAR Processors

There are three major resolution cells that are involved in radar processing of

received pulses. These cells are referred to as range cells, angle cells, and doppler

cells. The objective of a CFAR processor is to preserve a constant rate of false

alarm. As mentioned earlier, constant false alarm rate means that the detection

threshold should be adjusted such that the probability of detecting a false target

is kept constant. To do this, a complete CFAR scheme is required which includes

doppler resolution cells as well as position cells (i.e., range and angle).

Due to the inherent design complexities present, the detection schemes for

stationary targets and for moving targets will be discussed separately in more detail

in Chapters 3 and 4 respectively. The term CFAR ordinarily refers to the concept

of extending the detection process to the neighborhood of the target (i.e., use of

auxiliary cells) for a better representation of the background clutter. On the other

hand, MTI refers to the processing of a sequence of pulses such that the moving

targets are separated from the background clutter. Once an MTI filter is designed,


there can be one MTI unit used for each position cell in the neighborhood of the

target cell (i.e., test cell). That is, for each range cell, there are a number of pulses

that have to be processed in order to decide whether the radar return is due to a

moving target or clutter. The moving target is further classified as slow, fast, etc.

In Chapter 3 we focus on the amplitude fluctuations of the returned pulses

without regard to the frequency information carried by the pulses. Note that radar

pulse amplitudes are modulated by two major factors. One is due to the target

scintillation and eclipsing which is caused by target fluctuations (e.g., change in

aspect angles) as well as range gating of the pulses and timing jitters. We have

already discussed the available models for the target fluctuations. The other factor

is modulation of the pulse amplitudes due to the doppler effect. These two effects

are separately accounted for in a CFAR design and are the subjects of discussion

in the next two chapters.

Targets of interest in MTI systems are moving targets with a radial velocity

which is usually higher than the velocity of the background clutter. The radial

velocity has direct relation with the phase change of the radar returns. This phase

rate of change will cause a shift in frequency which is called the doppler shift and

can be used by the Moving Target Indicator (MTI) to remove the clutter.

MTI can be operated from a fixed or moving platform such as a ship, an

aircraft, or a satellite. There are a wide range of applications for MTI in surveillance

and detection of low altitude targets. MTI also has a number of applications in Air

Traffic Control (ATC) due to the presence of birds and other slow moving objects.

The slow moving objects add to the complexity of MTI simply because clutter is

also moving slowly and a lot of overlap occurs between the unwanted spectrum and

the spectrum of the desired targets.


MTI and CFAR processors generally require many compromises in the radar

processor design and they add to the cost and complexity of the design of detection

and tracking algorithms. According to Skolnik [8], the basic concepts of MTI were

introduced during World War II. Since then, limitations of the processing techniques

have always been a major factor in the complexity of MTI processors. The

advent of digital processing technology resolved many of the design constraints in

MTI. However, since the reliability and speed of processors are of prime importance

in radar design principles, new frontiers have yet to be discovered in the

neural network applications to MTI as a nonlinear fast processor with capabilities

far beyond the linear processing techniques that have been used in MTI since 1970.

2.8.1. More on Doppler Effect

The electromagnetic pulse of the radar incident wave undergoes a physical

phenomenon that causes a compression in, or extension of, the radar wave with

change in the radial velocity of the target or clutter with respect to the radar.

Doppler magnitude is computed by the derivative of the phase of the returned

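pulse. The phase is given by $\phi = 4\pi R/\lambda$, where $R$ is the range to the target and $\lambda$ is the apparent

wavelength. It follows that the doppler shift is given by $f_d = \frac{1}{2\pi}\cdot\frac{d\phi}{dt} = \frac{2v_r}{\lambda}$, where

$v_r$ is the radial velocity of the target with respect to the radar.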

Since there are different scatterers on the target, the doppler shift is usually

spread over a range of values rather than at a single value. For example, the rotation

of the propeller blades in an aircraft engine introduces additional doppler shift

that may obscure the exact doppler shift of the aircraft body. Typical examples of

clutter include weather clutter which is composed of extended different parts (each

moving with some speed) or a group of birds where each bird has a slightly different

velocity. The change in the aspect angle of the target also causes additional spread


in the doppler shift. It is this spread and its a priori unknown characteristics that

make the MTI design a challenging problem.

The MTI function is composed of two major steps. Doppler filtering is first

performed to separate the target signal from that of the target's surroundings.

Secondly, position measurements of the target are extracted. The main function of

the MTI is detection of moving targets, which hence requires additional processing

to obtain the accurate position and velocity of each target (e.g., Kalman filtering).

2.8.2. Radar Pulses With Doppler Shifts

For a typical detection process, multiple pulses are required in order to

achieve a high signal-to-noise ratio. Doppler resolution is also inversely related to

the pulse width ($\tau$), which is the primary factor for range resolution. If $f_d > 1/\tau$, the

doppler signal may easily be distinguished from a single pulse. However, if $f_d < 1/\tau$,

pulses will be modulated in amplitude and many pulses are needed to extract the

doppler shift. This is shown in the following signal modeling. Let the input pulse

to the MTI be represented by $a(t)e^{j\omega_d t}$, where $\omega_d$ represents the doppler shift in

terms of radian frequency and $a(t)$ is the pulse amplitude. Assuming that the next

pulse is received $\tau$ seconds later with equal characteristics, the MTI output of a

single pulse canceller will be

$$v_o(t) = a(t)e^{j\omega_d t} - a(t-\tau)e^{j\omega_d(t-\tau)}$$

where $a(t-\tau) = a(t)$. Hence,

$$|v_o|^2 = 2|a(t)|^2(1 - \cos\omega_d\tau).$$

Note that the MTI output is zero for $\omega_d\tau = 2n\pi$, which is what causes the

blind speed zones in the MTI output signal. When the mean doppler frequency

associated with clutter exceeds the radar's pulse repetition frequency, target data

will be mixed up with the clutter data due to this overlap in their frequency spectra.

Targets may also be aliased due to the same effect and therefore multiple targets

may be declared for each individual target. Aliasing is a side effect of uniform

sampling in linear processing. Multiple PRF (Pulse Repetition Frequency) are

used for reducing the aliasing effect at the cost of some loss in MTI performance.

2.8.3. Delay Line Cancelers

The simplest delay line canceler is a two pulse canceler which is a time

domain filter. The delay time is equal to a pulse repetition interval T which is on

the order of a few milliseconds for typical air-surveillance radars. The advantage of

a time domain MTI over frequency domain filters is that a single network operates

at all ranges and separate filters for each range resolution cell are not required as

in the doppler filter banks. It is more efficient, however, to have a combination

of time domain and frequency domain filters. The video signal received from the

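target at range $R_0$ is given by

$$v_1 = k\sin(2\pi f_d t - \phi_0). \tag{2.11a}$$

The signal from the previous sample is delayed by $T$, therefore

$$v_2 = k\sin[2\pi f_d(t - T) - \phi_0]. \tag{2.11b}$$

The MTI output $v_o$ is then the difference of the two signals $v_1$ and $v_2$ and can be

expressed as

$$v_o = v_1 - v_2 = 2k\sin(\pi f_d T)\cos\left[2\pi f_d\left(t - \frac{T}{2}\right) - \phi_0\right]. \tag{2.11c}$$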

The blind speed zones occur when the response of the single delay line canceler is

zero. This corresponds to $f_d = n/T = n\,\mathrm{PRF}$. The relative target velocities resulting

in zero response from MTI are called blind speeds and are given by $v_n = n\lambda/2T$, where

n is an integer.
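The blind speed condition can be illustrated with a short numerical sketch. It assumes the single delay-line canceler subtracts two unit-amplitude complex returns separated by the pulse repetition interval $T$, so its gain is $|1 - e^{-j2\pi f_d T}|$, and it uses the doppler relation $f_d = 2v_r/\lambda$. The function names and the example radar parameters are hypothetical.

```python
import cmath
import math


def doppler_shift(radial_velocity_mps, wavelength_m):
    """Doppler shift f_d = 2 v_r / lambda in Hz."""
    return 2.0 * radial_velocity_mps / wavelength_m


def canceler_gain(doppler_hz, pri_s):
    """Gain |1 - exp(-j 2 pi f_d T)| of a single delay-line canceler."""
    return abs(1.0 - cmath.exp(-2j * math.pi * doppler_hz * pri_s))


def blind_speeds(wavelength_m, prf_hz, count):
    """First `count` blind speeds v_n = n * lambda * PRF / 2 in m/s."""
    return [n * wavelength_m * prf_hz / 2.0 for n in range(1, count + 1)]


# Hypothetical radar: 10 cm wavelength, 1 kHz PRF (T = 1 ms).
speeds = blind_speeds(0.1, 1000.0, 3)
gain_at_first_blind_speed = canceler_gain(doppler_shift(speeds[0], 0.1), 1e-3)
gain_at_half_prf = canceler_gain(500.0, 1e-3)
```

For this example the first blind speed is 50 m/s, where the canceler gain vanishes, while a target whose doppler shift equals half the PRF sees the maximum gain of 2.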

The presence of blind speed zones in the MTI response reduces the detection

performance of MTI. We have already discussed one common method which uses

pulse staggering to reduce the number of blind speed zones. For a single delay

line, one may put the first blind speed outside the range of the expected doppler

frequencies. This, however, causes an ambiguity in range measurements. The

residue $w_i$ of the clutter sample at the $i$th sample is

$$w_i = s_i - s_{i-1}.$$

Broader clutter rejection nulls may be achieved simply by adding more delays or by

cascading two MTI filters based on the two-pulse canceler. In this case, the MTI

output will be the square of the output from a single canceler and is therefore proportional

to $\sin^2 \pi f_d T$. For this double (three-pulse) canceler, the residual clutter at the output of the

MTI is given by

$$w_i = s_i - 2s_{i-1} + s_{i-2},$$

where the coefficients of the above expression are similar to the coefficients in the

polynomial expansion of $(x - y)^2$. Since these multipliers are the weights of the

MTI filter, higher order MTIs employ higher order polynomial coefficients.

In general, the coefficients correspond to the coefficients of the polynomial

expansion of $(x - y)^n$. As another example, a four-pulse canceller would have as its

coefficients 1, -3, +3, -1 which corresponds to $(x - y)^3$. An $N$ delay, binomially

weighted MTI is able to cancel amplitude samples of an $(N-1)$th order polynomial.

The cancellation does not start until all pulses are received.
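The binomial weighting can be sketched in a few lines. The example below is an illustration only: it generates the coefficients of $(x - y)^n$ for an $n$-delay canceler and applies them to a sample sequence, producing output only once $n + 1$ pulses are available. The function names and the test sequence are hypothetical.

```python
import math


def binomial_weights(n_delays):
    """Coefficients of (x - y)^n, the weights of an n-delay binomial canceler."""
    return [(-1) ** k * math.comb(n_delays, k) for k in range(n_delays + 1)]


def mti_residue(samples, n_delays):
    """w_i = sum_k c_k * s_{i-k}; output starts once n_delays + 1 pulses are in."""
    weights = binomial_weights(n_delays)
    return [sum(c * samples[i - k] for k, c in enumerate(weights))
            for i in range(n_delays, len(samples))]


# Clutter amplitude following a second-order polynomial in the pulse index.
clutter = [i * i + 2 * i + 5 for i in range(10)]
residue_two_delays = mti_residue(clutter, 2)    # constant, not cancelled
residue_three_delays = mti_residue(clutter, 3)  # fully cancelled
```

With three delays the second-order polynomial is removed completely, whereas two delays leave a constant residue, consistent with the statement that $N$ delays cancel an $(N-1)$th order polynomial.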

2.8.4. Adaptive MTI

Current MTI processors are followed by Fast Fourier Transform doppler

filter banks. The bandwidth of each filter is matched to the duration of the dwell

time of the main beam on the target, that is, $B_D = 1/T_D$ where $T_D$ is the dwell

time. This would result in $N = \mathrm{PRF}/B_D$ returns that are coherently integrated. The

general MTI structure is basically a linear transversal filter followed by an envelope

detector (or with coherent integrators as described above).

Moving clutter such as sea, weather, chaff, groups of birds, and other interfering

moving targets, however, require more intelligent ways of calculation for the

MTI coefficients. Adaptive MTI is the suggested structure in the current literature

for these changing situations. The basic assumption generally made in Adaptive

MTI (AMTI) is that target returns are spatially concentrated in a few resolution

cells (usually one or two) but clutter tends to be diffused in a number of resolution

cells extended around the target. If the cell being observed is considered as the test

cell, then the neighborhood cells that are usually limited in number to anywhere

from twenty to forty samples are used to estimate the clutter spectrum.

One method for the estimation of the clutter spectrum is the Maximum

Entropy Method (MEM) which is based on the prediction of clutter samples such

that maximum information (randomness) results. The MTI coefficients can then

be adjusted based on the estimated spectra. The power spectral density of an

Autoregressive model (AR) is given by [9,89]

$$S(f) = \frac{P_c T}{\left|1 + \sum_{k=1}^{P} a_k\exp(-j2\pi fkT)\right|^2}$$

where $P_c$ is the clutter power, $T$ is the sampling time, and $P$ denotes the order of

the process which depends on the amount of peak-to-peak variations. The MEM

algorithm can be used to estimate the {ak} coefficients. This will produce the

flattest spectrum of all possible spectra with an autocorrelation function similar

to the measurement data. This is still a least-square fit of the AR model to the

clutter data and no prescribed methodology is available for the determination of P

(model order). If a correct model order P is not chosen, the MEM algorithm will

not be useful in the sense that the estimate may be too smooth or too rough with

spurious details embedded in the spectrum.
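Once the coefficients $\{a_k\}$ are available, evaluating the AR spectrum is straightforward. The sketch below assumes the spectrum has the form $S(f) = P_c T / |1 + \sum_{k=1}^{P} a_k e^{-j2\pi fkT}|^2$. It does not implement the MEM estimation itself, and the coefficient value used in the example is hypothetical.

```python
import cmath
import math


def ar_spectrum(freq_hz, coeffs, clutter_power, sample_time):
    """S(f) = P_c T / |1 + sum_{k=1}^{P} a_k exp(-j 2 pi f k T)|^2."""
    denom = 1.0 + sum(a * cmath.exp(-2j * math.pi * freq_hz * k * sample_time)
                      for k, a in enumerate(coeffs, start=1))
    return clutter_power * sample_time / abs(denom) ** 2


# Hypothetical first-order model with a_1 = -0.9: clutter concentrated near f = 0.
T = 1e-3
spectrum = [(f, ar_spectrum(f, [-0.9], 1.0, T)) for f in (0.0, 100.0, 250.0, 500.0)]
```

With no coefficients the spectrum is flat at $P_c T$, and a single coefficient close to $-1$ produces a narrow peak at zero doppler, which is the kind of shape an adaptive MTI would place its rejection notch on.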

2.9. Review of Current Methods in Target Tracking

There have been several different approaches to modeling a maneuver performed

by a target being tracked. Maneuver is what makes target tracking a

challenging task. Even a single target can be lost by an evasive maneuver. The

background clutter, when present, adds to the complexity of the problem. This is

due to the fact that false returns from the clutter may introduce sharp changes in

target returns that can trigger a false declaration of a maneuver. When a target

maneuvers, it is actually trying to escape the tracking window which has a calculated

volume. This tracking window, which is usually elliptic or rectangular (though

other shapes may also be considered), is generally based on a priori knowledge of

the standard deviation of the background noise and the maximum expected speed

of the target.

In order to track the target under a maneuvering condition, the tracking

window size (i.e., the expected region of the target position and velocity) has to

be larger. However, as this window increases in size, more clutter will enter into

the measurement data. Some clutter data may be falsely accepted as target data


and cause a loss of track during the maneuver. Therefore, efficient methods of

maneuver modeling are required in order to make the tracking more robust to

different clutter distributions. In the following sections, we shall review some of

the existing methods for maneuver modeling.

2.9.1. Unknown Input Model

Since radar cannot measure accelerations, a priori models are necessary to

account for the sudden accelerations. In the Unknown Input model, it is assumed

that the maneuver command input is unknown [10,77]. Therefore, it is modeled as

a random process that is referred to as process noise. Depending on how harsh the

maneuver is, a certain level of white noise can be used in the target dynamic model

to represent the unknown acceleration. The target dynamics in a maneuvering

situation can be represented in the form

$$x(k+1) = Fx(k) + Gu(k) + v_i(k) \tag{2.12}$$

where $u(k)$ represents the actual target input acceleration (i.e., maneuver command

input) and $v_i(k)$ represents the $i$th process noise level. In the Unknown Input

model, the term $u(k)$ is set equal to zero because it is unknown and instead several

different levels of $v_i(k)$ are used. These levels are chosen based on the algorithm

developed. For example, the transition between the noise levels is provided either

by a transition probability calculation or by some sort of threshold settings on the

innovation sequence. As another example, a harsh maneuver would result in a

large magnitude of the innovation. We may assign a noise process $v_i$ with a large

standard deviation as long as the magnitude of the innovation sequence is large.

Each noise level provides a certain amount of increase in the size of the

tracking window. There are different schemes for choosing the right noise level (i.e.,


the mean and the standard deviation of a Gaussian noise process). It should be

noted that an exact model for the actual target acceleration is practically impossible

due to the uncertainty in the exact time of acceleration. However, as will be

discussed later, it is possible to estimate $u(k)$ at some time after the maneuver has

taken place and continue the estimation process after that time. Therefore, in such

methods not only should the time of occurrence of the maneuver be estimated, but

also a certain level of $v_i(k)$ is still required to safeguard against the uncertainty

in the estimation of the input $u(k)$. To summarize, in all maneuvering target

tracking problems, the uncertainty of target input acceleration is compensated by

the inclusion of an artificial noise. The challenge is then to integrate the available

information about the target to devise a noise process that leads to the convergence

of the tracking filter. Furthermore, convergence of the filter should take place before

another maneuver is initiated.

Innovation is defined as the difference between the current measurement

data and the predicted measurement at the previous scan, i.e.,

$$\nu(k) = z(k) - H\hat{x}(k|k-1), \tag{2.13}$$

where $H$ is the measurement matrix. Then the normalized innovation is defined as

$$\epsilon_\nu(k) = \nu^T(k)S^{-1}(k)\nu(k),$$

where $S(k)$ is the covariance matrix of the innovation. $S(k)$ includes the uncertainty

in the new information which is due either to the noise in the sensor measurement

or to prediction error and is given by

$$S(k) = H(k)P(k|k-1)H^T(k) + R(k), \tag{2.14}$$

where $R(k)$ is the measurement noise covariance matrix. Since $\nu(k)$ is

Gaussian distributed, the normalized innovation $\epsilon_\nu(k)$ will have a Chi-square distribution. If the maximum

expected innovation induced by the maneuver is denoted by $\epsilon_{max}$, the probability

$P\{\epsilon_\nu(k) \le \epsilon_{max}\} = 1 - \alpha$ will indicate the probability that the target is not maneuvering.

Upon exceeding the threshold $\epsilon_{max}$, the covariance of the process noise

will be increased until $\epsilon_\nu(k)$ is reduced below this threshold. One should remember

that a maneuver is not a stochastic process. A maneuver is a deterministic process,

which is unknown for tracking purposes. The confusion about the nature of

a maneuver could arise because modeling uncertain events, such as a maneuver, is

popularly handled using the concepts and tools from the theory of probability and

stochastic processes.
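The threshold logic of the Unknown Input model can be sketched for a scalar position measurement. This is an illustration under stated assumptions: the normalized innovation is taken as the usual quadratic form, which reduces to $\nu^2/S$ for a scalar with $H = 1$. The rule of stepping one noise level up or down per scan, the list `Q_LEVELS`, and the measurement values are all hypothetical.

```python
def innovation_test(z, x_pred, p_pred, r, eps_max):
    """Scalar gate test for a direct position measurement (H = 1).

    Returns the innovation nu, its variance s = p_pred + r, the normalized
    innovation eps = nu^2 / s, and a flag that is True when eps > eps_max.
    """
    nu = z - x_pred
    s = p_pred + r
    eps = nu * nu / s
    return nu, s, eps, eps > eps_max


def next_noise_level(level, n_levels, maneuver_flag):
    """Step the process-noise level up while flagged, down otherwise."""
    if maneuver_flag:
        return min(level + 1, n_levels - 1)
    return max(level - 1, 0)


# Hypothetical process-noise variances, smallest (quiescent) first.
Q_LEVELS = [0.01, 1.0, 25.0]
EPS_MAX = 3.84  # 95% point of a chi-square variable with one degree of freedom

level = 0
for z, x_pred, p_pred in [(5.1, 5.0, 4.0), (10.0, 5.0, 4.0), (12.0, 5.5, 6.0)]:
    _, _, eps, flag = innovation_test(z, x_pred, p_pred, r=1.0, eps_max=EPS_MAX)
    level = next_noise_level(level, len(Q_LEVELS), flag)
```

In a full tracker the selected `Q_LEVELS[level]` would be fed back into the Kalman prediction step, which enlarges $P(k|k-1)$ and therefore $S(k)$ until the normalized innovation falls below the threshold again.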

2.9.2. Multiple Model Approach

This approach is similar to the Unknown Input model approach, except that

instead of adjusting the covariance of process noise, the models are already set up

and corresponding to each model a probability or a likelihood function is obtained

for the correct model. Let Mj be the event that model j is correct with prior

probability,

$$P(M_j) = \mu_j(0), \quad j = 1, \ldots, r.$$

Then the likelihood ratio at time k given that model j is correct is given by

where $\nu_j(i)$ is the innovation vector under the assumption that model $j$ is correct

at time $i$. Assuming Gaussian noise in measurement, the PDF (i.e., Probability

Density Function) of the innovation function of tracking filter j is

(2.15)


Using Bayes's rule,

where $\mu_j(0)$ can be assumed as a lower bound for the initial model probability.

The measurement set for each scan period can be represented by

where $m_k$ corresponds to the number of the received measurements inside the

tracking window and $z_i(k)$ refers to each individual measurement. The cumulative

set of measurements is represented by

where Z(j) represents the set of validated (i.e., gated) measurements in scan j that

have fallen in the validation gate. The state estimates will be the weighted average

of states conditioned on the model, i.e.,

$$E\{x(k)|Z^k\} = \sum_{j=1}^{r} E\{x(k)|M_j, Z^k\}\,P\{M_j|Z^k\}$$
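The model-probability bookkeeping can be sketched for scalar innovations. This is a generic Bayes update written under assumptions, not a reproduction of the dissertation's equations: each model's likelihood is taken as a zero-mean Gaussian density of its innovation with variance $S$, the posterior model probability is proportional to likelihood times prior, and the combined estimate is the probability-weighted average of the model-conditioned estimates. All names and numbers are hypothetical.

```python
import math


def gaussian_likelihood(nu, s):
    """Zero-mean Gaussian density of a scalar innovation nu with variance s."""
    return math.exp(-0.5 * nu * nu / s) / math.sqrt(2.0 * math.pi * s)


def update_model_probabilities(priors, likelihoods):
    """Bayes' rule: posterior_j is proportional to likelihood_j * prior_j."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def combined_estimate(model_estimates, probabilities):
    """Weighted average of the model-conditioned state estimates."""
    return sum(x * p for x, p in zip(model_estimates, probabilities))


# Two hypothetical models: quiescent (small S) and maneuvering (large S).
priors = [0.9, 0.1]
likelihoods = [gaussian_likelihood(6.0, 4.0), gaussian_likelihood(6.0, 25.0)]
posteriors = update_model_probabilities(priors, likelihoods)
x_hat = combined_estimate([100.0, 106.0], posteriors)
```

A large innovation is far more likely under the wide maneuvering model, so its probability rises sharply after one update even though its prior was small.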

2.9.3. Multiple Hypothesis Testing (MHT) Method

One can always create new hypotheses for each received set of data and add

to the complexity and accuracy of the tracking filter model. Upon large deviations

of the innovation $\nu(i)$, one can define a maneuver hypothesis and open a new

file for the target tracks. There are several ways to do this; however, all methods

depend on detecting the maneuver in its early stages. In order to speed up the


maneuver detection, the measurement space may be restricted to a subspace of the

target velocity which undergoes the dominant change when the target prepares for

a maneuver. As an example, if $z(k) = [x(k), f(k)]^T$ represents the measurement of

position and frequency of the carrier signal, one may restrict the innovation process

to a normalized change in velocity, which in turn is proportional to the change in

carrier frequency (i.e., doppler shift). That is,

$$\epsilon_\nu(k) = \frac{[f(k) - \hat{f}(k|k-1)]^2}{S(k)}$$

The addition of one hypothesis for every time that a maneuver is detected is not

without a cost. Each single hypothesis will result in a new tree of more hypotheses

in later scans which may have to be pruned at a later time.

2.9.4. Colored Noise Modeling of Maneuver

A more realistic model of a maneuver is a correlated (colored) noise process

rather than a white noise. In this approach, the target acceleration $a(t)$ is modeled

as a zero-mean random process with exponential autocorrelation given by

$$R(\tau) = E[a(t)a(t+\tau)] = \sigma_m^2 e^{-\alpha|\tau|}$$

where $\sigma_m^2$ is the variance of target acceleration and $1/\alpha$ is the autocorrelation time

constant. Singer [11] has presented a probability density function for this kind of

maneuver model. Based on this probability density function, the variance of target

acceleration can be calculated and is given by

$$\sigma_m^2 = \frac{A_{max}^2}{3}\left[1 + 4P_{max} - P_0\right]. \tag{2.16}$$

A correlated noise process for a maneuver that uses target acceleration may be

modeled as a first-order Markov process of the form

$$a(k+1) = \rho a(k) + \sqrt{1-\rho^2}\,\sigma_m r(k) \tag{2.17}$$

where the correlation coefficient $\rho$ is defined in terms of the sampling interval $T$

and the correlation time constant $\tau$, and $r(k)$ is a white Gaussian process with unit

variance. The correlation coefficient is assumed to be of the form of an exponential

function given by

$$\rho = e^{-T/\tau}.$$

The time constant $\tau$ is referred to as the approximate time duration of target

maneuver and is generally between 10 and 60 seconds for a typical target of interest.

Longer time constants (e.g., 400 seconds and more) are typical for the slower

maneuvering targets.
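The first-order Markov acceleration model is easy to simulate. The sketch below assumes the recursion $a(k+1) = \rho a(k) + \sqrt{1-\rho^2}\,\sigma_m r(k)$ with $\rho = e^{-T/\tau}$ and unit-variance white Gaussian $r(k)$. The function names, the seed, and the numerical values are hypothetical.

```python
import math
import random


def correlation_coefficient(sample_interval, time_constant):
    """rho = exp(-T / tau)."""
    return math.exp(-sample_interval / time_constant)


def simulate_acceleration(n_steps, rho, sigma_m, seed=0):
    """Generate a(k+1) = rho * a(k) + sqrt(1 - rho^2) * sigma_m * r(k)."""
    rng = random.Random(seed)
    drive = math.sqrt(1.0 - rho * rho) * sigma_m
    a = 0.0
    history = []
    for _ in range(n_steps):
        a = rho * a + drive * rng.gauss(0.0, 1.0)
        history.append(a)
    return history


# Hypothetical target: 1 s sampling, 20 s maneuver time constant, sigma_m = 3 m/s^2.
rho = correlation_coefficient(1.0, 20.0)
accel = simulate_acceleration(200_000, rho, 3.0, seed=1)
steady = accel[1000:]  # drop the start-up transient
sample_variance = sum(a * a for a in steady) / len(steady)
```

The factor $\sqrt{1-\rho^2}$ is what keeps the stationary variance equal to $\sigma_m^2$ regardless of the sampling interval, so `sample_variance` should come out close to 9 here.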

2.9.5. Variable Dimension Filter (VDF)

The VDF approach, which was developed by Bar-Shalom [10], suggests two general

modes of operation, i.e., the quiescent mode and the maneuver mode. In the

case of the quiescent mode, the trajectory is assumed to be a straight line with

constant velocity. Only position and velocity are estimated in the quiescent mode

while in the second operational mode the acceleration can be added as an additional

state. Some radars with adaptive sampling increase the sampling rate upon

detection of the maneuver. However, experience has shown that the problem is not

resolved just by faster sampling [1]. The two filters for the two modes of operation

will have the forms :

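$$x = [x \;\; \dot{x} \;\; y \;\; \dot{y}]'$$

which is the state vector for the quiescent model and

$$x^m = [x \;\; \dot{x} \;\; y \;\; \dot{y} \;\; \ddot{x} \;\; \ddot{y}]'$$

for the maneuvering model. Upon detection of a maneuver, the state vector is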

augmented by the acceleration state. The acceleration state itself can be modeled

by any one of the available methods.

2.9.6. Input Estimation Model (IE)

In this section, we describe the acceleration model proposed by Bogler [41],

which is referred to as the Input Estimation (IE) method. This model is primarily developed

to account for fast maneuvers of the target and serves as a good procedure

for comparing with the neural network-based model presented in this dissertation.

Bogler's algorithm will be briefly outlined in the following.

Consider a system with one dimensional state equation

x(k + 1) = Fx(k) + GU(k) + V(k) (2.18)

where U is an unknown input modeling the target maneuver and V is a white noise

process assumed zero-mean with covariance Q. The observation sequence is given

by

z(k) = Hx(k) + W(k)

where the observation noise W is also a zero-mean white noise with covariance

R and is independent of the process noise V(k). In the absence of the target

maneuver, the estimation of the state is performed by using the model without

input (i.e. non-maneuvering model) which is given by

x(k + 1) = Fx(k) + V(k).


From the innovations of the Kalman filter based on the non-maneuvering model,

the input U(k) is to be detected, estimated and used to correct the state estimate.

Assume that the target starts maneuvering at time $k$. Its unknown inputs
during the time interval $[k, \ldots, k+s]$ are $U(i)$, $i = k, \ldots, k+s-1$. The estimates
from the (now mismatched) filter based on the non-maneuvering model will be
denoted by an asterisk. With $W(i)$ denoting the filter gain, the one-step prediction will be

\[ \hat{x}^*(i+1|i) = F[I - W(i)H]\,\hat{x}^*(i|i-1) + FW(i)z(i) \]
\[ \phantom{\hat{x}^*(i+1|i)} = \Phi(i)\,\hat{x}^*(i|i-1) + FW(i)z(i), \qquad i = k, \ldots, k+s-1 \]

where $\Phi(i) = F[I - W(i)H]$ and the initial condition for the predicted state is

\[ \hat{x}^*(k|k-1) = \hat{x}(k|k-1). \]

The recursion for the one-step prediction in terms of the initial condition is given by

\[ \hat{x}^*(i+1|i) = \Big[\prod_{j=k}^{i}\Phi(j)\Big]\hat{x}(k|k-1) + \sum_{j=k}^{i}\Big[\prod_{m=j+1}^{i}\Phi(m)\Big]FW(j)z(j) \]

for $i = k, \ldots, k+s-1$. Now if the inputs (i.e., $U(k)$ and $V(k)$) were known,
the correct filter based on the input model would yield estimates according to the
recursion

\[ \hat{x}(i+1|i) = \Phi(i)\,\hat{x}(i|i-1) + FW(i)z(i) + GU(i) \]
\[ \phantom{\hat{x}(i+1|i)} = \Big[\prod_{j=k}^{i}\Phi(j)\Big]\hat{x}(k|k-1) + \sum_{j=k}^{i}\Big[\prod_{m=j+1}^{i}\Phi(m)\Big]\big[FW(j)z(j) + GU(j)\big]. \]

Note that the only difference is the last term containing the inputs. The
corresponding innovations based on the two filters are

\[ \nu(i+1) = z(i+1) - H\hat{x}(i+1|i) \]

and

\[ \nu^*(i+1) = z(i+1) - H\hat{x}^*(i+1|i). \]

These innovations can be related as

\[ \nu^*(i+1) = \nu(i+1) + H\sum_{j=k}^{i}\Big[\prod_{m=j+1}^{i}\Phi(m)\Big]GU(j). \]

Assume a constant input over the time interval $[k, \ldots, k+s]$, i.e.,

\[ U(j) = U, \qquad j = k, \ldots, k+s-1 \]

which yields

\[ \nu^*(i+1) = \psi(i+1)\,U + \nu(i+1), \qquad i = k, \ldots, k+s-1 \]

where

\[ \psi(i+1) = H\sum_{j=k}^{i}\Big[\prod_{m=j+1}^{i}\Phi(m)\Big]G. \]

It can be seen from the expression for $\nu^*(i+1)$ that the innovation $\nu^*$ of
the non-maneuvering filter is a "linear measurement" of the input (maneuver) $U$
in the presence of the additive "white noise" $\nu$. It then follows that the input can
be estimated using a least-squares criterion from

\[ y = \Psi U + \epsilon \]

where

\[ y = \begin{pmatrix} \nu^*(k+1) \\ \vdots \\ \nu^*(k+s) \end{pmatrix} \qquad \text{and} \qquad \Psi = \begin{pmatrix} \psi(k+1) \\ \vdots \\ \psi(k+s) \end{pmatrix} \]

are the stacked "measurement" vector and matrix respectively, and the "noise"

\[ \epsilon = \begin{pmatrix} \nu(k+1) \\ \vdots \\ \nu(k+s) \end{pmatrix} \]

is zero-mean with block diagonal covariance matrix $S$, which was given in
equation (2.14).

The estimation can be done in batch form as

\[ \hat{U} = (\Psi^T S^{-1}\Psi)^{-1}\Psi^T S^{-1}y \tag{2.19a} \]

with the resulting covariance matrix

\[ L = (\Psi^T S^{-1}\Psi)^{-1}. \tag{2.19b} \]

Based on this estimate, a maneuver is declared "detected" only if it is statistically
significant. The test for significance of the vector estimate $\hat{U}$ is

\[ d(\hat{U}) = \hat{U}^T L^{-1}\hat{U} \ge c \]

where $c$ is a threshold.

The threshold is chosen as follows. If the input is zero, then

\[ \hat{U} \sim N(0, L) \tag{2.20} \]

i.e., the estimate is a normal random variable with zero mean and covariance $L$.
The statistic $d$ is then chi-square distributed with $n_u$ degrees of freedom ($n_u$ is
the dimension of the vector $U$), and $c$ is chosen such that the probability of an incorrect
decision is

\[ P\{d(\hat{U}) \ge c\} = \alpha, \]

with $\alpha = 10^{-2}$ or any other desired level.

When a maneuver is detected, the state has to be corrected as follows:

\[ \hat{x}^u(k+s+1|k+s) = \hat{x}^*(k+s+1|k+s) + M\hat{U} \tag{2.21} \]

where $\hat{x}^u$ is the new corrected state with input modeling. In this equation, the
matrix $M$, called the propagation matrix, is given by

\[ M = \sum_{j=k}^{k+s}\Big[\prod_{m=j+1}^{k+s}\Phi(m)\Big]G \tag{2.22} \]

and the covariance associated with the new estimate $\hat{x}^u$ is

\[ P^u(k+s+1|k+s) = P(k+s+1|k+s) + MLM^T. \tag{2.23} \]

A maneuver is considered finished when the input estimate based on measurements
from the sliding window of length $s$ becomes insignificant. The length
$s$ is a design parameter. In cases where the duration of a maneuver is short
relative to a sample interval, a window size of $s = 1$ or $2$ sampling periods is
appropriate. However, in most practical cases, it will be necessary to consider data
over a longer period in order to produce a reliable estimate. This is the general
requirement for every statistical parameter estimation.
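The batch least-squares step and the significance test described above can be sketched for the simplest case of a scalar input and scalar innovations, where the covariance $S$ reduces to a list of innovation variances. This is a hedged illustration only: the function names are hypothetical, and the default threshold 6.635 is the 0.99 quantile of the chi-square distribution with one degree of freedom, chosen here to match $n_u = 1$ and a level of $10^{-2}$.

```python
def estimate_input(psi, innovations, s_var):
    """Weighted least-squares estimate of a constant scalar input U.

    Model: innovations[i] = psi[i] * U + noise, noise variance s_var[i].
    Returns (u_hat, L) where L is the variance of the estimate.
    """
    # Scalar analogue of Psi^T S^-1 Psi
    info = sum(p * p / s for p, s in zip(psi, s_var))
    # Scalar analogue of Psi^T S^-1 y
    rhs = sum(p * y / s for p, y, s in zip(psi, innovations, s_var))
    L = 1.0 / info
    u_hat = L * rhs
    return u_hat, L


def maneuver_detected(u_hat, L, c=6.635):
    """Declare a maneuver when d = u_hat^2 / L reaches the threshold c."""
    return u_hat * u_hat / L >= c
```

For a vector input the same steps apply with matrix inverses in place of the scalar divisions, and the threshold is taken from the chi-square distribution with $n_u$ degrees of freedom.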

2.10. Parallelism in Target Tracking

The ultimate goal in parallel processing of the target tracks is to come up
with an algorithm that is independent of the number of targets to be tracked.
The lowest level of parallelism starts at the instruction level of the processor. For a
complex algorithm, each task has to be described at the lowest level of description
to make use of the maximum parallelism inherent in the structure of the algorithm.
This is not always practical with complex tasks. For multi-sensor multi-target
tracking algorithms, we need to describe the target dynamics in the time, space, and
feature domains, some of which are not efficiently described by the low-level
instruction sets of current digital computers.

According to Pattipati et al. [12], the most important tasks in target
tracking can be broken into five steps: (1) track prediction, (2) gating, (3)
track update, (4) clustering, and (5) formation of the global hypothesis. The
optimum use of parallelism in these non-uniform tasks is to distribute the tasks
such that the algorithm executes a uniform number of operations for all steps. It is
only by distributing the number of operations in this way that optimal use of the
inherent parallelism can be made.

2.11. Sources of Nonlinearity and their Problems

There are two major sources of nonlinearity in tracking problems. One is

due to the measurement done in one coordinate system and filtered in another.

As an example, for airborne radars, an inertial (non-rotating) reference or fixed

coordinate system will be the preferred one for tracking, whereas for ground-based

radars the cartesian coordinate system is used. A cartesian coordinate system is

more convenient for track prediction. The second source of nonlinearity is the target

acceleration. When target acceleration is added to the state vector, the dynamical

equations will also become nonlinear. The available theory, however, is limited

to linear filtering which is only optimum for Gaussian noise in the measurement

process. Further processing is required in conjunction with a linear filter in order to


compensate for the bias in the state estimates which has been caused by a sudden

acceleration of the target.

2.12. Data Association

Gating and data association are at the heart of target tracking. It is through

this step that all infeasible hypotheses about the correlation of target returns are

dropped. The updating of tracks starts with gating around predicted positions. It

is then at the discretion of the algorithm designer, as well as the limitations of the

speed and available time, to allow more than one measurement return per target

in each gate. It is also important to have a strategy in the case of an overlap in

the measurement sets for different targets. The overlap occurs when targets stay

too close to one another for one or more scan periods.

2.12.1. Nearest-Neighbor vs. All-Neighbor Approach

In the nearest-neighbor approach to data association, at most one observation
can be associated with the corresponding target, and this association is based on a
distance measure. In this approach, a given observation can be used only once. A
distance measure is minimized over all possible cases. In a multiple target situation,
there is a large probability of error with this approach. This is due to a large
number of observations in each gate with equal probability of occurrence around
the predicted position. The likelihood function for association of observation $j$ to
track $i$ is given by [10]

where $S_i$ is the residual covariance matrix for track $i$ and $d_{ij}$ denotes the
statistical distance of the received observation from the predicted position. The statistical
distance is given by

\[ d_{ij}^2 = \nu_{ij}^T S_i^{-1}\nu_{ij} \]

where $\nu_{ij}$ is the residual vector from observation $j$ to track $i$. The product
$\pi\gamma^{M/2}\sqrt{|S_i|}$ is the total volume of the gate centered around the expected
target position. The statistical parameter $\gamma$ determines the confidence level, and $M$
is the dimension of the state space. The assignment matrix (i.e., assignment of the
observations to the tracks) can be modeled as an optimization problem to minimize
an overall distance function.
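The gating and nearest-neighbor selection just described can be sketched for a single track with two-dimensional measurements. This is a minimal illustration under stated assumptions: the function names are hypothetical, the residual covariance is a plain 2x2 nested list, and the gate threshold `gamma` is supplied by the caller rather than derived from a confidence level.

```python
def mahalanobis_sq(residual, S):
    """Statistical distance d^2 = v^T S^-1 v for a 2-D residual and 2x2 covariance."""
    (a, b), (c, d) = S
    det = a * d - b * c
    vx, vy = residual
    # S^-1 = (1/det) * [[d, -b], [-c, a]]
    return (vx * (d * vx - b * vy) + vy * (-c * vx + a * vy)) / det


def nearest_neighbor(predicted, observations, S, gamma):
    """Return (index, d^2) of the gated observation closest to the prediction.

    Observations with d^2 > gamma fall outside the gate and are ignored.
    Returns (None, None) when the gate is empty.
    """
    best_index, best_d2 = None, None
    for j, z in enumerate(observations):
        v = (z[0] - predicted[0], z[1] - predicted[1])
        d2 = mahalanobis_sq(v, S)
        if d2 <= gamma and (best_d2 is None or d2 < best_d2):
            best_index, best_d2 = j, d2
    return best_index, best_d2
```

With `S = [[4, 0], [0, 1]]`, an observation at (3, 0) is statistically closer to a prediction at the origin than one at (0, 2), even though its Euclidean distance is larger; this is why the residual covariance, not raw distance, drives the association.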

Another problem with the nearest-neighbor approach is that when several
equiprobable observations have fallen in the validation gate, the algorithm just
takes the closest one without paying attention to the probability of the observation
being correct. This is because the covariance matrix of the error does not account for
the probability of an incorrect measurement being processed. The other approach
is the all-neighbor approach, in which all observations within the gate are considered
with some probabilities and a given observation can be used again to update
multiple tracks. It remains to calculate the probability of association of each
individual observation and then to average the observations probabilistically. This approach
has been very effective for single targets with one or more sensors. Based on the
second approach, a filter has been developed by Bar-Shalom [13,14]. This filter is
used in this dissertation for the purpose of training the neural network in Chapter
5 and hence will be reviewed in the next section.

2.12.2. Probabilistic Data Association Filter (PDAF)

In this approach, the probability of each event is calculated before the event
is considered. This probability calculation assigns the uncertainty to each event.
The PDAF decomposes the estimation with respect to the origin of each element
of the latest set of measurements $Z(k) = \{z_i(k)\}$. However, it is assumed that
there is only one target of interest, modeled by the dynamical equations

\[ x(k+1) = F(k)x(k) + v(k) \]

\[ z(k) = H(k)x(k) + W(k) \]

where $v$ and $W$ are zero-mean, mutually independent, white Gaussian noise
processes with covariances $Q(k)$ and $R(k)$, respectively. By the assumption of one
target, it is meant that only one observation in the validation gate belongs to the
target and all other observations are assumed to be from the residual clutter.
The term residual clutter refers to the remaining clutter that has not been filtered
by the CFAR or MTI processors. These extraneous observations are modeled as
identically distributed random variables with uniform spatial distributions.

The PDA filter has two cases of interest, the optimal and the suboptimal.
Since the probabilities are calculated based on the measurement sets
received up to time $(k-1)$, the optimal PDA recalculates the new sequence of
probabilities from the beginning up to the arrival of the new set of measurements.
This exhaustive batch processing approach is normally replaced with probability
calculations based on the latest measurement set only, leading to a suboptimal
filter. The basic assumption is that

\[ p[x(k)\,|\,Z^{k-1}] = N[x(k);\ \hat{x}(k|k-1),\ P(k|k-1)] \tag{2.24} \]

which means that the true target state (e.g., position) is assumed to be normally
distributed around the predicted state.

The following events need to be defined next:

\[ \theta_i(k) = \{ z_i(k) \text{ is the target-originated measurement} \}, \qquad i = 1, \ldots, m_k \]

\[ \theta_0(k) = \{ \text{none of the measurements at time } k \text{ is target-originated} \}. \]

Then we define

\[ \beta_i(k) = P[\theta_i(k)\,|\,Z^k], \qquad i = 0, 1, \ldots, m_k \]

where $\beta_i(k)$ is the probability of each event being correct. The events are mutually
exclusive and exhaustive, and hence

\[ \sum_{i=0}^{m_k}\beta_i(k) = 1. \]

The state estimate is then a weighted average over these events, which can be
computed as

\[ \hat{x}(k|k) = E[x(k)\,|\,Z^k] = \sum_{i=0}^{m_k} E[x(k)\,|\,\theta_i(k), Z^k]\,P[\theta_i(k)\,|\,Z^k] = \sum_{i=0}^{m_k}\hat{x}_i(k|k)\,\beta_i(k). \]

It may be noted that $\hat{x}_i(k|k) = E[x(k)\,|\,\theta_i(k), Z^k]$ is the state estimate under the
assumption that the event $\theta_i(k)$ is correct. Furthermore, $\hat{x}_i(k|k)$ is given by

\[ \hat{x}_i(k|k) = \hat{x}(k|k-1) + W(k)\nu_i(k), \qquad i = 1, \ldots, m_k \tag{2.25} \]

where

\[ W(k) = P(k|k-1)H^T(k)S^{-1}(k) \]

and

\[ S(k) = H(k)P(k|k-1)H^T(k) + R(k). \]

The term $W(k)$ is the weighting for the innovation, or the new information contained
in each event $\theta_i(k)$, and $S(k)$ is the covariance matrix of the innovation vector.
The innovation component corresponding to each new measurement is

\[ \nu_i(k) = z_i(k) - \hat{z}(k|k-1). \tag{2.26} \]

Observe that when $\theta_0(k)$ is considered, the filtered state is set equal to the predicted
state, i.e.,

\[ \hat{x}_0(k|k) = \hat{x}(k|k-1) \]

which means that if we do not receive any measurement within the predicted gate
volume, then there is no need for filtering and the filtered estimate is set equal to
the predicted estimate. Combining the equations we get

\[ \hat{x}(k|k) = \hat{x}(k|k-1) + W(k)\nu(k) \]

\[ \nu(k) = \sum_{i=1}^{m_k}\beta_i(k)\nu_i(k). \]

This filter is highly nonlinear since the corresponding covariance matrix,
unlike that of the standard Kalman filter, depends on the data. This is due to the
uncertainty in the origin of the measurement assumed earlier. The event probabilities
are then derived by Bar-Shalom [3] as

i = 1, .. . ,mk

with

and


where $P_D$ is the probability that the target is detected by the radar, $P_G$ refers
to the probability that the target is detected inside the predicted region (gate),
and $\lambda$ is the spatial density of false measurements in a Poisson-distributed clutter
environment and is given by

$m_k$ is the total number of measurements received and $V_k$ denotes the volume of the
two-dimensional elliptical (i.e., Gaussian-based) validation region centered around
the predicted state and is given by

Various PDA filters can be developed by different ways of calculating the
association probabilities $\beta_i(k)$. However, they are all based on the underlying
assumption that only one valid target exists in the validation gate. Due to this
assumption, when there is actually another target present in the validation gate,
the PDA filter will be confused and will pick only one of the targets at the time the
trajectories cross. The PDAF can track multiple targets as long as they are not
so close that they overlap within the validation gate.

The PDA filter focuses on one validation gate at a time. This is the nature
of the PDA, which was originally designed for tracking a single target in clutter. The
dependence of the association probabilities on the innovation is such that the larger
the innovation is, the less likely the data is to be associated with the target. Furthermore,
a larger innovation indicates that the data is farther away from $\hat{z}(k|k-1)$, which
is the expected measurement vector of the target return. Quite a number
of other algorithms have been developed, such as those given in [4,74,84,94];
here, however, we have summarized the major assumptions and mathematical background
common to most tracking algorithms in the current literature.
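To make the weighting of innovations by association probabilities concrete, the sketch below computes $\beta_0(k)$ and $\beta_i(k)$ for two-dimensional measurements and forms the combined innovation $\nu(k) = \sum_i \beta_i(k)\nu_i(k)$. The probability expressions used are the standard parametric PDA form from the textbook literature, $\beta_i = e_i/(b + \sum_j e_j)$ with $e_i = \exp(-\tfrac{1}{2}\nu_i^T S^{-1}\nu_i)$ and $b = \lambda\,|2\pi S|^{1/2}(1 - P_D P_G)/P_D$. That choice is an assumption of this example and may differ in detail from the expressions used in the dissertation. Function names are hypothetical.

```python
import math


def pda_probabilities(residuals, S, lam, p_d, p_g):
    """Association probabilities (beta_0, [beta_1..beta_m]) for 2-D residuals.

    residuals: list of (vx, vy) innovations nu_i(k) inside the gate
    S: 2x2 innovation covariance as nested lists
    lam: spatial density of false measurements
    p_d, p_g: detection and gate probabilities
    """
    (a, b_), (c, d) = S
    det = a * d - b_ * c
    e = []
    for vx, vy in residuals:
        d2 = (vx * (d * vx - b_ * vy) + vy * (-c * vx + a * vy)) / det
        e.append(math.exp(-0.5 * d2))
    # |2*pi*S|^(1/2) = 2*pi*sqrt(det S) for a 2-D measurement (assumed form)
    b = lam * 2.0 * math.pi * math.sqrt(det) * (1.0 - p_d * p_g) / p_d
    total = b + sum(e)
    return b / total, [ei / total for ei in e]


def combined_innovation(residuals, betas):
    """nu(k) = sum_i beta_i(k) * nu_i(k); zero when the gate is empty."""
    vx = sum(bi * v[0] for bi, v in zip(betas, residuals))
    vy = sum(bi * v[1] for bi, v in zip(betas, residuals))
    return (vx, vy)
```

When no measurement falls in the gate, `beta_0` equals 1 and the combined innovation is zero, which reproduces the statement above that the filtered estimate is then set equal to the predicted estimate.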


CHAPTER 3

A ROBUST NEURAL NETWORK SCHEME FOR CFAR DETECTION

3.1. Introduction

The complexity in target detection by radar systems generally arises from

the fact that the return signal to the radar (echoes) at any particular scan of the

antenna may consist of the signal from the target to be detected, the background

clutter and some thermal noise, all of which may be highly correlated. Detection of

stationary radar targets in nonstationary noise and clutter offers more challenges

compared to situations when the target is moving due to the fact that in the latter

case the differences in the Doppler spectral characteristics (MTI techniques) could

be exploited. Historically, the problem of stationary target detection is handled as

a statistical detection problem by treating the clutter as interfering background.

Most of the modern work in target detection by radar signal processing is
inspired by the pioneering work of Finn [15,16], who employed statistical modeling
of clutter and noise and used a false alarm probability regulation mechanism
which involved making the detection threshold proportional to a spatially sampled
maximum likelihood estimate of the output variance of the cell under test (where
this variance is due to the clutter environment). The goal of Finn's approach was
to develop a Constant False Alarm Rate (CFAR) processor which maximizes the
probability of target detection $P_d$ while maintaining the probability of false alarm
$P_{fa}$ below a prescribed value. By comparing the processed voltage signal from each
resolution cell to an adaptive threshold, which is obtained from estimates of the
mean level of the interference over the adjacent range cells, automatic detection
of targets in nonstationary clutter and noise background is obtained, while achieving
a constant rate of false alarms when the interference is homogeneous over the
reference cells.

In the most basic CFAR detection scheme, called the Cell Averaging-CFAR
(CA-CFAR), the threshold for detection is set adaptively by computing the arithmetic
mean of the outputs from a number of adjacent cells [16,17]. The detection
probability, $P_d$, in this scheme improves as the number of reference cells, $N$,
increases, and in the limit as $N \to \infty$, $P_d$ approaches the detection probability of the
optimum detector (i.e., the classical Neyman-Pearson detector), which is based on
a fixed threshold determined from a priori knowledge of the mean level of
interference. A serious degradation in detection probability, however, results from a
reduction in the number of available reference cells. Several factors, such as
limitations of the radar system in use (in terms of resolution and sampling time) and the
presence of interfering targets and clutter patches in the vicinity of the primary
target, may contribute to the reduction in the number of reference cells.

For operation in variable clutter environment, the performance of several

CFAR processors proposed in the literature [18,19,45] also depends on the efficiency

of the clutter classification scheme employed [47,76], which in turn depends on the

number of independent data samples that can be processed during every scan. A

critical problem with decision-making using this approach is the correct estimation

of the filter parameters (those of the whitening and matched filters), which in

turn depends on the selection of a model (AR or ARMA) of appropriate order for

representing the time-series data.


It is widely acknowledged that the design of a CFAR processor which is
capable of delivering a consistently high level of performance in all situations that
may include not only homogeneous background but also various forms of
nonhomogeneities, caused by clutter edges, clutter patches, multiple interfering targets
etc., is not feasible. This is due to the fact that the inherent assumption in the
design of the CA-CFAR processor, viz. that the statistics of interference at each reference cell
are the same as the statistics of the test cell, is violated. This has prompted a flurry
of activity in this area leading to several modifications of the basic CA-CFAR
algorithm, the differences mainly stemming from the selection logic used for extracting
the signal that will be compared with the signal from the test cell to perform
detection. For a brief description of the underlying processing, consider the schematic
diagram of a typical CFAR detector shown in Fig. 3.1. The reference window of
width $N = 2n + 1$ is split into a leading part (of width $n$) and a lagging part (of
width $n$) symmetrically about the cell under test, and the square-law detected
samples from the adjacent leading and lagging reference cells are summed individually
and processed by the selection logic. The processed signal $Y_s$ is multiplied by a
threshold multiplier $T_m$ and is compared with the sample $y_{n+1}$ from the test cell,
and a "target detected" or "target absent" decision is made depending on whether
or not $y_{n+1}$ exceeds the threshold value $T_m Y_s$. It is in the specific algorithm used
by the selection logic that the various CFAR processing schemes differ. In the basic
CA-CFAR scheme, $Y_s$ is obtained as $Y_{lead} + Y_{lag}$, where $Y_{lead} = \sum_{i=n+2}^{2n+1} y_i$ (i.e., the sum
of the samples from the reference cells leading the target cell) and $Y_{lag} = \sum_{i=1}^{n} y_i$
(i.e., the sum of the samples from the lagging reference cells). In a variation of this
scheme, called the Greatest Of-CFAR (GO-CFAR) [20,21,46], $Y_s$ is selected as
$\max(Y_{lead}, Y_{lag})$, whereas the selection $Y_s = \min(Y_{lead}, Y_{lag})$ results in yet another
modification called the Smallest Of-CFAR (SO-CFAR) [22,23].
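The three selection rules can be sketched in a few lines. The function below is a minimal illustration, not the processor evaluated in this chapter; the function name, the string labels for the schemes, and the sample values used in any call are assumptions made for the example.

```python
def cfar_decision(window, t_m, scheme="CA"):
    """Mean-level CFAR decision for a window of N = 2n + 1 square-law samples.

    window[0:n]   are the lagging reference cells (cells 1..n),
    window[n]     is the test cell (cell n+1),
    window[n+1:]  are the leading reference cells (cells n+2..2n+1).
    Returns True for "target detected", False for "target absent".
    """
    n = (len(window) - 1) // 2
    if n < 1 or len(window) != 2 * n + 1:
        raise ValueError("window length must be odd and at least 3")
    y_lag = sum(window[:n])
    y_test = window[n]
    y_lead = sum(window[n + 1:])
    if scheme == "CA":
        y_s = y_lead + y_lag
    elif scheme == "GO":
        y_s = max(y_lead, y_lag)
    elif scheme == "SO":
        y_s = min(y_lead, y_lag)
    else:
        raise ValueError("scheme must be 'CA', 'GO' or 'SO'")
    return y_test > t_m * y_s
```

For example, with the window `[1, 1, 1, 20, 1, 30, 1]` and `t_m = 2`, the strong sample in the leading half raises the CA and GO thresholds above the test cell, while the SO rule uses only the quieter lagging half and still declares a detection. In practice the multiplier would be chosen separately for each scheme to meet the design false alarm rate; a common value is used here only to isolate the effect of the selection logic.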

[Fig. 3.1 A Schematic Diagram of CFAR Detector. Blocks shown: incoming samples, square-law detector, adjacent reference cells with lagging sum $Y_{lag}$ and leading sum $Y_{lead}$, signal $y_{n+1}$ from the test cell, selection logic output $Y_s$, threshold multiplier $T_m$, comparator against $T_m Y_s$, detection decision.]

Several other modifications such as the Ordered Statistics-CFAR (OS-CFAR) [24],

the Trimmed Mean CFAR (TM-CFAR) [25] and the Censored Mean Level detector

(CMLD) [26] also exist in the literature. A detailed performance evaluation of these

schemes can be found in [27].

When the background is homogeneous and the reference cells contain
independent and identically distributed (iid) observations governed by the exponential
distribution, the basic CA-CFAR yields optimum target detection performance.
However, the performance degrades (an increase in $P_{fa}$ and/or an increase in detection
threshold leading to a lower $P_d$) when these assumptions are violated, particularly
in nonhomogeneous background situations. It is precisely to compensate for
these performance degradations in the various cases of background nonhomogeneities
that different modifications of CA-CFAR have been developed. For instance,
the selection logic used in GO-CFAR is tailored to overcome the performance loss
when step increases in the background noise level (such as those produced at clutter
edges) are present, while that in SO-CFAR is tailored to yield good performance
in multiple target environments for resolving two closely spaced targets (such as
when an interfering target lies within the reference cells of the primary target to
be detected). Despite the underlying differences stemming from the selection logic
employed, a common requirement for all of these mean-level CFAR detectors
(CA-CFAR and its various modifications) is the availability of an adequate number
of samples from the reference cells, or a fairly large-sized reference window
$N$. This is due to the statistical requirements for the parameters that are used
for representing the target fluctuations and the clutter background. The detection
performance hence degrades very sharply when the size of the reference window
is less than about 30. Furthermore, each modification of CA-CFAR is developed
to specifically handle the performance loss arising from a specific situation that
violates the generally held assumptions about the environment and may not offer
any benefits (and in some cases may further degrade the performance) if the situation
encountered is a different one. For instance, as noted by several investigators
[23,27,48,49], while the GO-CFAR detector efficiently regulates the false alarm rate
in the presence of edge clutter, it may indeed worsen the performance in multiple
target environments (such as when an interfering target with strength equal to the
primary target appears in the reference window).

In this chapter, we shall present a novel neural network-based CFAR detection
scheme (referred to as the NN-CFAR scheme) that offers robust
performance in the face of loss of reference cells and also other nonideal conditions

corresponding to nonhomogeneous background environments. This scheme employs

a multilayer feedforward neural network trained by error backpropagation approach

[28] using the optimal detector as the teacher. The excellent pattern classification

capabilities of trained neural networks are exploited in this application to efficiently

counter performance degradations due to reduced reference window sizes and other

nonidealities.

Artificial neural networks are emerging as very attractive alternatives to
traditional methods (maximum likelihood techniques, nearest-neighbor classification
etc.) in the development of computer-based pattern classification algorithms, since
they can learn to perform the required classification without the assumption of
probabilistic models for the input patterns. Pattern classifiers are mappings that
define partitions of feature space into regions corresponding to class membership.
Classification problems that are not linearly separable and require nonlinear
decision boundaries can be solved using multilayered neural networks with neurons
having nonlinear transfer characteristics. This area has witnessed an explosion
of research in the recent past, and one of the important results that has come
out is based on the celebrated theorem of Kolmogorov. This result states that
any continuous nonlinear mapping can be approximated as closely as desired by a
multilayered neural network with a feedforward topology and sigmoidal nonlinear
functions [31-33].

The basic idea underlying the present work is the employment of a neural
network for a better representation of the target and the background, such that the
samples from a smaller sized window can be used without significant CFAR
performance loss. Briefly, the information loss due
to the reduced reference window size can be compensated by the use of additional
parameters. The conventional mean-level CFAR detectors primarily use one or
two parameters for representation of the background (for instance, the average clutter
power used to set the threshold), and any attempts to use more parameters generally
result in significant increases in design and implementation complexity, thus
neutralizing any possible performance gains. On the other hand, a neural network
implementation of the CFAR detection scheme provides a convenient approach
for accommodating more input features without a corresponding increase in design
complexity, owing to the parallel processing capabilities of the neural network.

The fault tolerant properties of the neural network-based design also need
particular emphasis. When the actual clutter distribution encountered deviates
from the ones used for training the NN-CFAR detector, the level of detection
performance is maintained due to the utilization of more input features. Furthermore,
in highly nonhomogeneous situations such as when the returns from some
of the reference cells are defective (particularly when some dead cells are present
on both sides of the test cell), the NN-CFAR scheme offers better performance
than the conventional methods. Performance evaluations of the presently developed
scheme in these and other interesting scenarios will be described in a later
section of this chapter.

The primary emphasis in this chapter will be on describing the input features

used for training the neural network and on demonstrating the viability of this

approach for target detection in diverse background environments. Consequently,

to keep the discussion simple, we will limit ourselves to providing performance

comparisons with the basic CA-CFAR scheme and will highlight the advantages of

employing the NN-CFAR scheme in these scenarios. Expanding the approach to

more efficiently handle a particular scenario for which a specific modification of the

CA-CFAR (such as GO-CFAR or SO-CFAR) has been developed is straightforward

and will require more training examples to be selected from that specific scenario.

3.2. Development of NN-CFAR Scheme

Targets of interest in radar detection usually result in specific features in the

reflected signal which are however buried in thermal noise and clutter. Due to its

pattern classification ability [29,50], a carefully trained neural network can more

efficiently distinguish features in the structured target echoes by utilizing some

details that are generally hidden to conventional statistical receivers. The primary

objective of the present design is to employ a neural network scheme in order to

enhance the performance of conventional mean-level CFAR processors, particularly

when the number of reference cells is reduced and/or other nonideal conditions are

present in the detection environment.


3.2.1 Framework For Neural Network Training

In this section, we shall briefly describe the assumptions used in establishing
a framework for generating the training examples and also for the simulation
exercises that will be discussed in the next section. These assumptions are, however,
common to most existing CFAR design procedures and hence facilitate a fair
comparison of performance later. It should be emphasized that the NN-CFAR
design does not require these assumptions and is not limited to environments
satisfying all of them. In other words, the training vectors can
indeed be generated and the neural network can be satisfactorily trained with these
vectors even when the assumptions are not all satisfied.

(i) Square-law detection will be used (Fig. 3.1) and samples are sent serially
through a shift register of size $N = 2n + 1$, where $n$ is an integer.

(ii) The cell at the center (viz. the $(n+1)$th cell) is used for the primary target,
such that the leading and the lagging windows are of equal width.

(iii) Only range cells will be used (however, an extension of this approach to two
dimensions is straightforward) and the output of each range cell is assumed
to be exponentially distributed with probability density function given by

\[ f(p) = \frac{1}{2q}\exp\!\left(-\frac{p}{2q}\right), \qquad p \ge 0. \tag{3.1} \]

(iv) Targets in the reference window (both primary and any interfering targets)
have only temporal fluctuation and the amplitude fluctuation is according
to the Swerling-I model. Thus, under the null hypothesis $H_0$ (no target
present in a range cell), $q$ in (3.1) is the average power of the total clutter
plus thermal noise, which will be denoted by $\mu$. Under the alternative
hypothesis $H_1$ (target present in the range cell), $q$ refers to the average
power due to all three returns (clutter, noise and reflection from target) and
is represented by $q = \mu(1 + S)$, where $S$ denotes the average signal-to-noise
ratio (SNR) of the primary target. For any interfering target with power $I$,
$I/S$ represents the interference-to-target power ratio, which will be used in
the performance analysis. Thus, for a reference cell containing an interfering
target, $q = \mu(1 + I)$ represents the average signal power.

(v) The clutter and noise residues in the range cells are assumed to be independent
and identically distributed (iid). It must be emphasized that although
this assumption corresponds to a rather specialized case for training, as
mentioned earlier our primary interest in this chapter is to compare the
performance of the NN-CFAR scheme with that of the basic CA-CFAR (which is
particularly designed for a homogeneous background). Once the NN-CFAR
is trained with the selected examples, the performance will be evaluated to
highlight the robustness to the loss of reference cells as well as to the
deviations of the actual clutter distribution from the original distributions used
for training.
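Assumptions (i) to (v) can be turned into a small sample generator. The sketch below is only an illustration of how a window of range-cell outputs might be drawn under these assumptions; it is not the procedure used to produce the training vectors in this dissertation, and the function names and the `interferers` mapping are hypothetical.

```python
import random


def cell_sample(mu, ratio=0.0, rng=random):
    """Draw one square-law cell output with density (1/2q) exp(-p/2q).

    q = mu * (1 + ratio), where ratio is 0 for noise only, S for the primary
    target, or I for an interfering target.
    """
    q = mu * (1.0 + ratio)
    # expovariate takes the rate parameter 1/(2q); the mean of the draw is 2q
    return rng.expovariate(1.0 / (2.0 * q))


def reference_window(n, mu, snr, interferers=None, rng=random):
    """Window of N = 2n + 1 cells; list index n is the test cell (the (n+1)th cell).

    interferers maps a list index to the power ratio I of an interfering target.
    """
    interferers = interferers or {}
    window = []
    for i in range(2 * n + 1):
        ratio = snr if i == n else interferers.get(i, 0.0)
        window.append(cell_sample(mu, ratio, rng))
    return window
```

Passing `snr=0.0` produces a noise-only window for the null hypothesis $H_0$, and a call such as `reference_window(4, 1.0, 10.0, {6: 5.0})` places one interfering target in the leading half of a nine-cell window.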

3.2.2. Training with Optimum Detector as Teacher

In the development of the NN-CFAR scheme, the neural network will be trained
for a number of distinct false alarm rates (i.e., $P_{fa}$ values) using the decisions from
the optimum detector as examples, and the detection performance (after completion
of training) will be compared with that of the CA-CFAR scheme in diverse
operational scenarios, such as when the size of the reference window is progressively
reduced. In order to establish a precise framework for executing these tasks
and for stating the performance metrics used in the comparison, we shall briefly
identify some performance quantities of interest for the optimum detector and the
CA-CFAR detector.

Under the assumptions of a homogeneous background and knowledge of the total noise power $\mu$, the probability of false alarm of the optimum detector is given by

$$P_{fa} = \Pr[x > X_0 \mid H_0] = \exp\left(-\frac{X_0}{2\mu}\right), \tag{3.2}$$

where $x$ denotes the signal from the test cell, $X_0$ is the fixed threshold and $H_0$ denotes the null hypothesis (noise only). Under the other (signal present) hypothesis $H_1$, the optimum detection probability $P_d^{opt}$ is

$$P_d^{opt} = \Pr[x > X_0 \mid H_1] = \exp\left(-\frac{X_0}{2\mu(1+S)}\right), \tag{3.3}$$

where $S$ denotes the signal-to-noise ratio. In the case of CA-CFAR, the corresponding quantities can be evaluated by observing that the signal from the test cell $x$ is compared with a variable threshold $T_m Y_s$ (see Fig. 3.1). Hence,

(3.4)

which can be expressed [21] as

$$P_{fa} = M\left(\frac{T_m}{2\mu}\right), \tag{3.5}$$

where $M(\cdot)$ denotes the moment generating function (mgf) of the random variable $Y_s$. Similarly, the detection probability is

(3.6)

which can be expressed [21] as

(3.7)


As noted in [21] and [22], the average detection threshold (ADT) provides a

convenient mechanism for estimating the loss of detection performance (due to the

finite reference window size) of various CFAR processing algorithms and is defined

as

(3.8)

For the CA-CFAR, under the assumption of exponentially distributed homogeneous

noise background, this simplifies [27] to

(3.9)

which is independent of $\mu$. For the optimum detector, however, the threshold is fixed and hence, using (3.2), the ADT can be evaluated as

$$ADT_{opt} = \frac{X_0}{2\mu} = -\ln(P_{fa}). \tag{3.10}$$
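As a numerical check of (3.2), (3.3) and (3.10), the short Python sketch below evaluates the fixed threshold, the ADT and the detection probability of the optimum detector. The function names and the default $\mu = 1$ are illustrative assumptions and are not part of the original development; the SNR is taken as a linear ratio, not in dB.

```python
import math


def optimum_threshold(p_fa: float, mu: float = 1.0) -> float:
    """Fixed threshold X_0 that yields the requested P_fa, obtained by inverting (3.2)."""
    return -2.0 * mu * math.log(p_fa)


def optimum_adt(p_fa: float) -> float:
    """Average detection threshold X_0 / (2 mu) of the optimum detector, as in (3.10)."""
    return -math.log(p_fa)


def optimum_pd(p_fa: float, snr: float) -> float:
    """Detection probability for a Swerling-I target; snr is a linear power ratio."""
    return p_fa ** (1.0 / (1.0 + snr))


if __name__ == "__main__":
    for p_fa in (1e-4, 1e-6, 1e-8):
        print(f"P_fa={p_fa:.0e}  ADT_opt={optimum_adt(p_fa):.2f}  "
              f"P_d(S=10)={optimum_pd(p_fa, 10.0):.3f}")
```

The ADT values obtained this way can be compared with the "Optimum" column of Table 3-1.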

Of particular interest to our work is the change in $ADT_{CFAR}$ as the size of the reference window $N$ decreases. The detection probability $P_d^{CFAR}$ and the detection threshold $T_m$ can be evaluated in this case as

$$P_d^{CFAR} = \left[1 + \frac{T_m}{2\mu(1+S)}\right]^{-N} \tag{3.11}$$

and

which clearly illustrate the effects of reducing $N$. In particular, as $N$ decreases, $T_m$ increases, which consequently results in a lower probability of detection for CA-CFAR compared to that of the optimum detector. This, of course, is the price for keeping $P_{fa}$ constant.

The variation of ADT as a function of N has been studied in the literature

for different CFAR processors. Table 3-1 gives a quantitative comparison of ADT

for the CA-CFAR processor for different values of N with the optimum detector

threshold computed for several $P_{fa}$ values. This table is also given in [27].

Table 3-1 Comparison of ADT for CA-CFAR with threshold for optimum detector

          Optimum    N = 9          N = 17         N = 25         N = 33
P_fa      ADT        T      ADT     T      ADT     T      ADT     T      ADT
10^-4     9.21       2.1    17.3    0.78   12.4    0.47   11.2    0.33   10.6
10^-6     13.8       4.6    37.0    1.37   21.9    0.78   18.6    0.54   17.2
10^-8     18.4       9.0    72.0    2.16   34.6    1.15   27.7    0.77   24.9
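The CA-CFAR entries of Table 3-1 can be approximately reproduced with the sketch below. It assumes the standard closed-form result for CA-CFAR in homogeneous exponential noise, $T = P_{fa}^{-1/M} - 1$ with $ADT = M\,T$, and it further assumes that $M = N - 1$ cells (the window without the test cell) are averaged. Both assumptions are inferred from the tabulated numbers rather than stated in the text, and the function names are illustrative.

```python
import math


def ca_cfar_scale(p_fa: float, m: int) -> float:
    """Threshold multiplier T for a CA-CFAR that sums m reference cells."""
    return p_fa ** (-1.0 / m) - 1.0


def ca_cfar_adt(p_fa: float, m: int) -> float:
    """Average detection threshold normalised by 2*mu, i.e. m * T."""
    return m * ca_cfar_scale(p_fa, m)


if __name__ == "__main__":
    for p_fa in (1e-4, 1e-6, 1e-8):
        cells = []
        for n_window in (9, 17, 25, 33):
            m = n_window - 1  # assumed: the test cell is excluded from the average
            cells.append(f"N={n_window}: T={ca_cfar_scale(p_fa, m):.2f}, "
                         f"ADT={ca_cfar_adt(p_fa, m):.1f}")
        print(f"P_fa={p_fa:.0e}  optimum={-math.log(p_fa):.2f}  " + "  ".join(cells))
```

Under these assumptions the ADT decreases toward $-\ln P_{fa}$ as the number of averaged cells grows, which is the optimum-detector value of (3.10).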

Evidently, reducing the number of averaging cells progressively raises the threshold, and consequently the target is masked when N is very small. Furthermore, as N tends to infinity, the CA-CFAR detection threshold matches that of the optimum detector (if the background stays homogeneous). This explains the rationale for using the optimum detector as the teacher for training NN-CFAR. Table 3-1 also contains another useful piece of information. Observe that for N = 33, the ADT for CA-CFAR gets reasonably close to that of the optimum detector, while progressively worsening as N becomes smaller. Hence, for designing the NN-CFAR, we use the same number of cells (i.e., N = 33) for training (the performance of the trained network will, however, be evaluated for N < 33 and also for nonhomogeneous background situations). Another reason for selecting N = 33 during the training phase is the statistical requirements on the input features used; most of these signals need a sample size of at least 30 in order to give an unbiased estimate of the mean or the variance that will be used. Of course, there is no upper bound on the size of N, and the larger the value of N selected during the training phase, the better the representation of the background clutter to which the neural network will be exposed. Thus, N = 33 is a representative selection and will be used for all of the further development.

From the analysis of the CA-CFAR scheme given above, it is evident that

the average power in the reference cells should be included in the set of input

parameters simply because it serves to represent the background power. Thus in

the training process, we expose the average power of the cells to the network for

several distinct values of $P_{fa}$. Of course, in addition to this parameter, we will

use a few more parameters such that the network continues to have a recall of the

actual target and background distributions even when the reference window size

gets reduced. Details of the training process will be given in a later section.

3.2.3. Selection of Input Features

The NN-CFAR detector is trained to make decisions based on features derived from the radar data from the N = 2n + 1 reference cells collected during the course of a single antenna scan. Of fundamental importance for a satisfactory training of the neural network and for the pattern classification performance of the trained neural network is the selection of an appropriate set of input features. In this section, we shall describe the specific input features that are used to represent the target and the clutter fluctuations in the NN-CFAR scheme, and also briefly describe the motivations for their selection. An obvious selection of the input feature to be used is the output of the test cell (or the center cell of the reference window) $y_{n+1}$. Additional parameters that provide statistical characterization of the samples from the reference cells on either side of the test cell will be used to complete the input feature set.

The motivation for using more parameters for training the neural network

comes from the observation that statistical parameters generally lose their effectiveness as the number of samples is reduced. To compensate for this loss in the

face of reduced reference window size, we employ several parameters that attempt

to characterize the same statistical properties of the available sample set. It must

be noted that most of these parameters have been individually used in the design

of different types of CFAR algorithms earlier. The ability of the neural network for

simultaneously processing these various signals (i.e., the signal fusion capability)

permits all of them to be used together in this application for obtaining a better

representation of the background clutter.

Statistical Mean Over the Reference Window:

The first feature included in the input set is the statistical mean $\mu_T$, which reflects the total average power in the reference cells (including the cell under test). Since during training we are using N = 33 resolution cells, statistically this constitutes a sufficient number to compute the sample mean, particularly when the cells are independently distributed. If the outputs of the range cells (i.e., the outputs of the square law detector), $y_i$, $i = 1, 2, \ldots, N$, are independent exponentially distributed random variables, the mean $\mu_T$ given by

$$\mu_T = \frac{1}{N} \sum_{i=1}^{N} y_i$$

follows a gamma (Erlang) distribution when the cells are also identically distributed.

Average Powers of the Leading and the Lagging Windows:

Evidently, in the presence of a target, the return from the test cell ($y_{n+1}$) affects the parameter $\mu_T$ considerably. In order to provide a sense of the background by itself, we use the average powers of the leading and the lagging sides of the reference window as two input features. These signals $Y_{lead}$ and $Y_{lag}$ are given by

$$Y_{lead} = \frac{1}{n} \sum_{i=n+2}^{2n+1} y_i$$

and

$$Y_{lag} = \frac{1}{n} \sum_{j=1}^{n} y_j,$$

where $n = \frac{N-1}{2}$. It is evident that $Y_{lead} \approx Y_{lag}$ (the two are equal in expectation) if the background is homogeneous, and exposing the neural network to these features helps the network learn the distributions in the cases when the background is no longer homogeneous (due to a clutter edge or due to the presence of interfering targets, for instance), causing $Y_{lead}$ and $Y_{lag}$ to differ significantly. As noted in Section 3.1, these features have been used in certain modifications of the basic CA-CFAR scheme to enhance the detection performance in nonhomogeneous background scenarios.

Variance of the Leading and Lagging Windows:

Target and clutter fluctuations affect the accuracy of the features discussed so far. To represent these fluctuations, we use $\sigma_{lead}^2$ and $\sigma_{lag}^2$, which are the variances of the leading and the lagging windows, respectively. Use of both these signals would enable the NN-CFAR to detect deviations from Gaussian behavior by comparing the variances. For illustration, consider the case when the average power of the leading window is higher than that of the lagging window. This could correspond either to a scenario where a clutter edge is present (a discontinuity with a sharp difference in amplitude between the two sides, each of which is uniform), or to a scenario where some interfering targets on one side contribute to the average power. The use of the variance on each side would help the NN-CFAR to distinguish these situations and determine whether there is a clutter edge or an interfering target, or whether the background is still homogeneous.

The t-statistic:

The features discussed thus far mainly reflect the background magnitude and dispersion with respect to the target. For representing the target fluctuation itself, one could attempt to use intelligent parameters that statistically relate the output of the test cell $y_{n+1}$ to the returns from the reference cells. One such parameter is the t-statistic defined by Goldstein [18]

$$t = \frac{y_{n+1} - \frac{1}{N} \sum_{i=1}^{N} y_i}{\sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \frac{1}{N} \sum_{j=1}^{N} y_j \right)^2}} \tag{3.12}$$

As noted in [18], a subtraction of the maximum likelihood estimate of the mean of $y_{n+1}$ appears in the numerator of this expression, while the denominator performs a normalization by dividing by the maximum likelihood estimate of the standard deviation of $y_{n+1}$. In [18], this parameter was used (more exactly, a modified parameter "log-t", obtained by replacing the outputs $y_i$ in the expression (3.12) by their logarithms, was used) to automatically adjust the detection threshold for maintaining false alarm regulation in log-normal clutter and in Weibull clutter.

Median of Clutter Samples:

One parameter that has not been used extensively in the literature on CFAR processing is the median of clutter samples. This parameter, however, has been extensively used in digital image processing applications and we will use it for NN-CFAR training. For an ordered sample set $X = \{x_1, x_2, \ldots, x_N\}$ of size N, where the samples are arranged in increasing order of magnitude and N is odd, the median is defined by the statistic

$$\eta = x_{\left(\frac{N+1}{2}\right)}.$$

While the mean and the median together give a good representation of the behavior of the sample set X, the median indicates some useful statistical characteristics not reflected in the mean. For instance, while the mean provides a measure of the central tendency of the sample set, it can be significantly affected by extreme values, such as the ones resulting from the presence of an interfering target with a high interference-to-signal (I/S) ratio. Evidently, such extreme values influence the computed average power, and the presence of interfering targets hence raises the detection threshold if not taken care of. Some suggested modifications in the literature, such as the Censored-ofCFAR (CO-CFAR) [23], attempt to isolate the interfering targets and use only the remaining cells for threshold calculation. It is easy to see how this operation further reduces the number of available cells.

Use of the median as an input feature for NN-CFAR training has the advantages of simple computation and of providing a better representation of the background clutter when interfering targets are present. However, in dealing with samples from a population such as radar samples in each scan period from an extended background, the sample mean does not vary as much from sample to sample as does the median. This implies that the sample mean is more stable than the median for estimation of average clutter power and is more suitable when consistent interference does not exist. Consequently, using both parameters one could obtain some indication of the presence of an interfering target (i.e., if the two parameters differ significantly then the presence of an interfering target is strongly indicated, whereas if the two parameters are close then most probably there is no interfering target).

Correlation Between Leading and Lagging Cells :

Yet another useful input feature to be used in NN-CFAR training is the correlation coefficient between the leading and the lagging portions of the reference window. The correlation coefficient plays an important role in bivariate data analysis problems and in the present case will help identify and parameterize any deviations from independence of clutter data on the two sides of the test cell. Thus, from the samples $y_1, y_2, \ldots, y_n$ from the lagging cells and $y_{n+2}, y_{n+3}, \ldots, y_{2n+1}$ from the leading cells (see Fig. 3.1), the correlation coefficient is calculated as

$$corr = \frac{\sum_{i=1}^{n} y_i \, y_{n+i+1} - \frac{1}{n} \sum_{i=1}^{n} y_i \cdot \sum_{j=1}^{n} y_{n+j+1}}{\eta \cdot \sigma_{lead} \cdot \sigma_{lag}}$$

where $\sigma_{lead}$ and $\sigma_{lag}$ represent the standard deviations of the two sides and are given by

$$\sigma_{lead}^2 = \frac{1}{n-1} \sum_{i=n+2}^{2n+1} \left( y_i - Y_{lead} \right)^2$$

and

$$\sigma_{lag}^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( y_i - Y_{lag} \right)^2 .$$

To illustrate the usefulness of this parameter, observe that a value of corr = 0.6 would indicate that approximately 60% of the clutter data on the two sides of the test cell are linearly related. This in turn can suggest the existence of some kind of correlated background, such as edge clutter. Specifically, for a reference window of size N = 21, n = 10 reference cells are on each side of the test cell and, in the case of an edge clutter, the clutter amplitude in 6 of the leading cells might be a scalar multiple of the average clutter amplitude in the corresponding 6 cells on the lagging side. Since we are using a static neural network (i.e., feedforward processing only), only the spatial correlation is considered in the computation of corr.
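The nine features described in this section can be computed from a single window as sketched below. This is a minimal illustration under stated assumptions: the function name and dictionary keys are hypothetical, the median is taken over the whole window of N cells, and the correlation is normalised as the standard Pearson coefficient so that it lies in [-1, 1], which may differ from the normalising factor printed in the text.

```python
import numpy as np


def nn_cfar_features(y) -> dict:
    """Compute the nine input features from a window of N = 2n + 1 cell outputs.

    y[0:n] are the lagging cells, y[n] is the test cell and y[n+1:] are the
    leading cells (0-based indexing of the 1-based cells y_1 ... y_N).
    """
    y = np.asarray(y, dtype=float)
    big_n = y.size
    if big_n % 2 == 0 or big_n < 5:
        raise ValueError("window must contain an odd number (>= 5) of cells")
    n = (big_n - 1) // 2
    lag, test, lead = y[:n], y[n], y[n + 1:]

    mu_t = y.mean()                          # mean over the whole window
    y_lead, y_lag = lead.mean(), lag.mean()  # side averages
    var_lead = lead.var(ddof=1)              # 1/(n-1) normalisation
    var_lag = lag.var(ddof=1)
    t_stat = (test - mu_t) / y.std(ddof=0)   # ML (1/N) estimate of the std
    eta = np.median(y)                       # middle order statistic (N odd)
    # Assumed normalisation: Pearson coefficient between paired cells
    # y_i (lagging) and y_{n+i+1} (leading).
    cov = np.sum((lag - y_lag) * (lead - y_lead)) / (n - 1)
    corr = cov / np.sqrt(var_lead * var_lag)

    return {"y_test": test, "mu_T": mu_t, "Y_lead": y_lead, "Y_lag": y_lag,
            "var_lead": var_lead, "var_lag": var_lag,
            "t": t_stat, "eta": eta, "corr": corr}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    window = rng.exponential(2.0, 33)        # homogeneous background, mean 2*mu
    for name, value in nn_cfar_features(window).items():
        print(f"{name:9s} {value: .4f}")
```

A window with constant samples has zero variance and would need separate handling, which is omitted here for brevity.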

3.2.4. Neural Network Architecture and Training

Fundamental to the employment of a neural network in any specific application is its function approximation capability and, as mentioned in Section 3.1, several powerful analytical results [31-33] have been established to confirm the existence of a multilayer neural network with feedforward topology and sigmoidal nonlinear functions that approximates a given function to a desired accuracy. However, as is true with other function approximation procedures (such as the use of polynomials, Fourier series and general orthogonal functions), these results do not give a procedure for estimating the number of terms needed, i.e., the number of layers and the number of nodes in each layer, for achieving a desired degree of approximation. These have to be determined by trial and error for the specific problem at hand.

The basic processing element (neuron) in these function approximating networks has an input-output characteristic which is obtained by forming a weighted sum of the several inputs received and producing an output which is a nonlinear function of this weighted sum, according to the relation

$$v(t) = f\left( \sum_{i=1}^{m} w_i u_i(t) \right)$$

where $u_i(\cdot): \mathbb{R} \to \mathbb{R}$, $i = 1, 2, \ldots, m$, are the inputs, $v(\cdot): \mathbb{R} \to \mathbb{R}$ is the output and $w_i \in \mathbb{R}$, $i = 1, 2, \ldots, m$, are the weights. $f(\cdot): \mathbb{R} \to \mathbb{R}$ is an appropriately selected nonlinear activation function that satisfies the following conditions:

(i) $x f(x) > 0$ for all $x \in \mathbb{R}$, $x \neq 0$ (first and third quadrant function)

(ii) $\lim_{|x| \to \infty} f(x) = k \, \mathrm{sgn}(x)$, $k > 0$ (saturating function)

(iii) $|f(x_1)| \le |f(x_2)|$ for all $|x_1| \le |x_2|$ (nondecreasing function).

Commonly used activation functions are the sigmoid characteristics (e.g., $f(x) = \tanh(\lambda x)$ or $f(x) = (1 + e^{-x})^{-1}$).

In order to accept the nine input features described in the last section, viz. $\{y_{n+1}, \mu_T, Y_{lead}, Y_{lag}, \sigma_{lead}^2, \sigma_{lag}^2, t, \eta, corr\}$, a network with an input layer consisting of nine nodes was employed. Since no further preprocessing of these input features is needed, these nodes serve only to fan out the signals. Two hidden layers with 14 nodes in each layer were selected with the sigmoidal activation function $f(x) = [1 + e^{-x}]^{-1}$. Since the output of the neural network represents the decision on the presence or the absence of a target, one output node, whose output is a linear combination of the outputs from the previous layer, was used to complete the network architecture. A specified number $d_1$ is used at the output node to indicate the presence of the target and the absence of the target is indicated by another specified number $d_2$, with the decision threshold set appropriately between $d_1$ and $d_2$ for separating the two cases (see Fig. 3.2).
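The architecture just described (nine fan-out inputs, two sigmoidal hidden layers of 14 nodes, one linear output node) can be sketched as a forward pass in a few lines. The weight initialisation, the presence of bias terms, the function names and the default values of d1 and d2 are assumptions made for illustration only.

```python
import numpy as np


def sigmoid(x):
    """Logistic activation f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def init_network(rng, sizes=(9, 14, 14, 1), scale=0.1):
    """Small random weights and zero biases for a 9-14-14-1 network (assumed scheme)."""
    return [(scale * rng.standard_normal((m, k)), np.zeros(k))
            for m, k in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """Propagate one feature vector (or a batch of them) through the network."""
    a = np.asarray(x, dtype=float)
    for w, b in params[:-1]:
        a = sigmoid(a @ w + b)       # sigmoidal hidden layers
    w, b = params[-1]
    return a @ w + b                 # linear output node


def decide(output, d1=8.0, d2=1.0):
    """Declare 'target present' when the output exceeds the midpoint of d1 and d2."""
    return output > 0.5 * (d1 + d2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    net = init_network(rng)
    features = rng.standard_normal(9)     # placeholder for the nine input features
    out = forward(net, features)
    print(out, decide(out))
```

With these layer sizes the network has 365 adjustable parameters when biases are included.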

Training of this network by error backpropagation was conducted following the generalized delta rule with momentum [28] and using the decision of the optimal detector for computing the error in each case. Data from a homogeneous background with 33 reference cells that include samples from the background as well as the target were used for training. The motivation for using a reference window of size 33 during training is the statistical effectiveness of the parameters used in the input feature set, as discussed earlier. Examples from nonhomogeneous background situations were purposely not used in network training, since our primary emphasis in this study is to compare the performance losses sustained by NN-CFAR with those of CA-CFAR in various cases when the background deviates from being homogeneous.

[Fig. 3.2 Neural Network Architecture: input layer, hidden layers, output layer, threshold]

For generating the training vectors, four different levels of false alarm rates that were two orders of magnitude apart (i.e., $P_{fa} = 10^{-2}, 10^{-4}, 10^{-6}, 10^{-8}$) were considered. For each $P_{fa}$ value, six different levels of SNR ranging from 1 dB to 20 dB were used. The target and the background were considered to follow exponential distributions such that they yield iid samples in each resolution cell. Target, clutter and noise samples were generated independently and added together according to the assumed distribution. Training vectors were generated such that there would be 30 examples from each combination of $P_{fa}$ and SNR levels, in order to ensure statistical independence of samples. Thus a total of 4 × 6 × 30 = 720 training vectors were generated.
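The generation of the 720 training windows, with the optimum detector acting as teacher, might look like the sketch below. Several details are assumptions because the text does not fix them: the six SNR levels are spaced evenly between 1 dB and 20 dB, a target is placed in the test cell of every window, the background power is $\mu = 1$, and all names are hypothetical.

```python
import numpy as np

P_FA_LEVELS = (1e-2, 1e-4, 1e-6, 1e-8)
SNR_DB_LEVELS = np.linspace(1.0, 20.0, 6)  # assumed spacing; only the range is given
EXAMPLES_PER_COMBINATION = 30
N_CELLS = 33
MU = 1.0                                   # assumed background power


def make_training_set(rng):
    """Return (windows, teacher_labels, p_fa_values) for all P_fa / SNR combinations."""
    windows, labels, p_fas = [], [], []
    centre = N_CELLS // 2
    for p_fa in P_FA_LEVELS:
        x0 = -2.0 * MU * np.log(p_fa)      # fixed optimum threshold from (3.2)
        for snr_db in SNR_DB_LEVELS:
            snr = 10.0 ** (snr_db / 10.0)
            for _ in range(EXAMPLES_PER_COMBINATION):
                # Clutter plus noise: exponential with mean 2*mu in every cell.
                y = rng.exponential(2.0 * MU, N_CELLS)
                # Swerling-I target in the test cell: mean 2*mu*(1 + S).
                y[centre] = rng.exponential(2.0 * MU * (1.0 + snr))
                windows.append(y)
                labels.append(y[centre] > x0)   # decision of the optimum detector
                p_fas.append(p_fa)
    return np.array(windows), np.array(labels), np.array(p_fas)


if __name__ == "__main__":
    w, d, p = make_training_set(np.random.default_rng(1))
    print(w.shape, int(d.sum()), "windows labelled 'target present'")
```

The teacher labels would then be mapped to the output codes d1 and d2, and each window reduced to its input features, before being presented to the network.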

The training examples were presented to the network in the batch processing mode [34], i.e., by computing the error accumulated after each cycle, which is one complete sweep of the 720 training vectors. The training was rather smooth and there was no significant need for using the momentum term. The algorithm was run for 800 cycles, at which point steady-state conditions were attained (i.e., the error had reduced to appreciably small values), and the training was terminated. For establishing a decision threshold at the output node of the network, the values $d_1 = 8$ and $d_2 = 1$ were used and any response above $(d_1 + d_2)/2 = 4.5$ was classified as "target present". The exact values selected for these parameters are of no particular significance and the indicated values are only a representative selection, which was maintained in conducting the performance evaluation tests that will be described in the next section.
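Batch-mode backpropagation with a momentum term, as used above, can be sketched as follows. This is an illustrative implementation and not the original training code: the learning rate, momentum coefficient, weight initialisation, bias terms and the toy data in the usage example are all assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_batch(x, d, sizes=(9, 14, 14, 1), cycles=800, lr=0.05, momentum=0.5, seed=0):
    """Batch backpropagation (generalized delta rule with momentum).

    x has shape (samples, 9) and d has shape (samples, 1). Returns the trained
    [weight, bias] pairs and the mean squared error recorded at every cycle.
    """
    rng = np.random.default_rng(seed)
    params = [[0.3 * rng.standard_normal((m, k)), np.zeros(k)]
              for m, k in zip(sizes[:-1], sizes[1:])]
    velocity = [[np.zeros_like(w), np.zeros_like(b)] for w, b in params]
    history = []
    for _ in range(cycles):
        # Forward sweep over all training vectors, keeping each layer's output.
        acts = [x]
        for w, b in params[:-1]:
            acts.append(sigmoid(acts[-1] @ w + b))
        out = acts[-1] @ params[-1][0] + params[-1][1]   # linear output node
        err = out - d
        history.append(float(np.mean(err ** 2)))
        # Backward sweep; delta is the gradient of 0.5 * mean(err**2).
        delta = err / len(x)
        for layer in range(len(params) - 1, -1, -1):
            w, b = params[layer]
            grad_w = acts[layer].T @ delta
            grad_b = delta.sum(axis=0)
            if layer > 0:
                a = acts[layer]
                # Propagate through the weights (before they are updated)
                # and through the sigmoid derivative a * (1 - a).
                delta = (delta @ w.T) * a * (1.0 - a)
            vw, vb = velocity[layer]
            vw[...] = momentum * vw - lr * grad_w        # momentum update
            vb[...] = momentum * vb - lr * grad_b
            w += vw
            b += vb
    return params, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((720, 9))                    # toy inputs, not radar data
    d = np.where(x[:, :1] > 0.0, 8.0, 1.0)               # output codes d1 = 8, d2 = 1
    _, mse = train_batch(x, d)
    print(f"MSE: first cycle {mse[0]:.2f}, last cycle {mse[-1]:.2f}")
```

Setting `momentum=0.0` reduces the update to plain batch gradient descent, which corresponds to the remark that the momentum term was not strictly needed.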

3.3. Robustness Evaluations of NN-CFAR

Our objective in this section is to evaluate the performance of NN-CFAR trained as above in a variety of different scenarios and operating conditions that deviate significantly from the conditions for which the network is trained. The primary focus in this study is to establish the robustness of the NN-CFAR scheme to deviations caused by SNR values outside the training range, high clutter levels beyond those used in training, reduction in the size of the reference window, and nonhomogeneous background conditions due to the presence of clutter edges, interfering targets and dead cells, to which the network is not exposed during training. For establishing a benchmark level of performance, we shall employ the basic CA-CFAR scheme and compare the NN-CFAR performance with it in each case.

Since analytical methods are not available for evaluating the performance of neural network-based schemes, the comparisons were done by simulation experiments. For ensuring fairness in the comparison, in each experiment the average of 100 different runs executed independently for different $P_{fa}$ values, different clutter distributions etc., was evaluated. These experiments have revealed that NN-CFAR consistently provides a superior performance in each of these scenarios. Brief descriptions of a few of the more interesting experiments will be given in the following.

For conciseness, the results will be given either in a tabular form or in

a graphical form. The following notations will be used to describe the various

quantities:

S = Average SNR of the primary target (in the test cell);

I = Average SNR of any interfering target;

ASNR = Average of SNR in 100 independent runs (in dB);

ACNR = Average of CNR (clutter-to-noise ratio) in 100 independent runs (in dB);

PDCA = Percentage of detection for CA-CFAR scheme;

PDNN = Percentage of detection for NN-CFAR scheme;

% CCA = Percentage of correct classification for CA-CFAR;

% CNN = Percentage of correct classification for NN-CFAR;

Merit = A figure of merit which indicates whether NN-CFAR performed better or worse than CA-CFAR in the particular test. A "+" sign indicates a higher probability of detection ($P_d$) together with a smaller increase in false alarm rate ($P_{fa}$) for NN-CFAR compared to CA-CFAR. (Evidently, any increase in $P_d$ for one detector is noteworthy only if $P_{fa}$ has not increased beyond that for the other detector.) Thus, in all of the performance comparisons given here, we will be interested only in this overall performance.

It should be emphasized that the performance variations depicted in the

tables and graphs may not appear smooth (as intuition would suggest). This is

only due to the fact that the average from a finite number (100, to be exact) of

runs, with parameter values selected by a random number generator, has been

calculated in each case.

Experiment # 1

The first experiment was directed to testing the generalization performance of NN-CFAR, i.e., the detection performance when processing test vectors not included in the training patterns to which the network was exposed during the training phase.

A total of 120 test vectors, generated by a random selection of false alarm rates in the range $10^{-6} \le P_{fa} \le 10^{-2}$, were used in this experiment. The input feature set $\{y_{n+1}, \mu_T, Y_{lead}, Y_{lag}, \sigma_{lead}^2, \sigma_{lag}^2, t, \eta, corr\}$ was generated in each case with the usual assumptions on the distributions and provided the test vector for processing by NN-CFAR. More specifically, for each selection of $P_{fa}$, a random value for the average SNR, S, in the range $1 \le S \le 20$ dB was generated. A separate random number generator was used for the clutter-to-noise ratio (CNR) and the clutter data was transformed into an exponential distribution using this average CNR value. This was done independently for each cell.

Our objective in this experiment is to compare the detection performance of NN-CFAR with that of the optimum detector, for which the probability of detection is computed from $P_d^{opt} = [P_{fa}]^{1/(1+S)}$, using the values for $P_{fa}$ and S. For simulating the presence or the absence of the target and its detection with the optimum detector, a uniform random number $P_0$ between 0 and 1 was generated and, if $P_0 \le P_d^{opt}$, the target was declared "present" and its radar return was then generated following an exponential distribution with the selected value of S (average SNR).
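The random draw described above can be sketched as follows. The function name, the default $\mu = 1$ and the treatment of the "absent" case (a noise-only exponential return) are assumptions made for illustration.

```python
import numpy as np


def simulate_test_cell(p_fa, snr_db, rng, mu=1.0):
    """Draw one test-cell return following the procedure outlined for Experiment 1.

    Returns (target_present, cell_output, pd_opt).
    """
    snr = 10.0 ** (snr_db / 10.0)                  # dB to linear power ratio
    pd_opt = p_fa ** (1.0 / (1.0 + snr))           # optimum detection probability
    target_present = bool(rng.uniform() <= pd_opt) # uniform draw P_0 on [0, 1)
    # Assumed: noise-only return (mean 2*mu) when the target is declared absent.
    mean_power = 2.0 * mu * (1.0 + snr) if target_present else 2.0 * mu
    return target_present, float(rng.exponential(mean_power)), pd_opt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    draws = [simulate_test_cell(1e-4, 10.0, rng) for _ in range(20000)]
    fraction = np.mean([present for present, _, _ in draws])
    print(f"fraction declared present: {fraction:.3f}, P_d^opt: {draws[0][2]:.3f}")
```

Over many draws the fraction of "present" declarations approaches $P_d^{opt}$, as expected from the construction.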

For the 120 input vectors tested, NN-CFAR made the correct decision 102 times (i.e., 85% correct classification), which indicates that NN-CFAR

is a very capable processor. It should be emphasized that this performance level

is only a representative one and by no means constitutes the best performance

possible with NN-CFAR. Indeed, with additional training effort (i.e., training with

more examples) and attempts to tune the architecture of the neural network, it is

possible to realize further improved performance levels. Since our primary focus in

this work is only on a proof-of-concept demonstration (that is, to demonstrate the

feasibility of a neural network-based algorithm for CFAR detection) and further to

demonstrate the robustness characteristics of this algorithm, no additional work

for optimizing the NN-CFAR performance was conducted.

The conclusion from this test that NN-CFAR has been adequately trained to follow the distribution of background clutter does not come as a surprise, since the testing conditions involved a homogeneous background and all the necessary parameters for representation of a homogeneous background are present in the input feature set presented to the neural network. In fact, the analysis of the CA-CFAR scheme [23,27] indicates that the specific input feature $\mu_T$ by itself is sufficient to represent a homogeneous background. This may lead one to question the need for the other features that NN-CFAR uses as inputs. It should be noted, however, that when the background conditions deviate from being homogeneous and/or other nonideal conditions (such as reduction of the reference cells etc.) are present, the CA-CFAR scheme does not ensure the same high level of performance. It is the robustness of the NN-CFAR in the face of such deviations from ideal conditions that will be established in the sequence of experiments to follow.

In order to test the performance of NN-CFAR when the input feature set was reduced in size, we also conducted several experiments by training neural networks with smaller numbers of input nodes (specifically, 8 input nodes and 7 input nodes while maintaining the rest of the architecture unchanged) and by selectively dropping one or two input features from the feature set $\{Y_{lead}, Y_{lag}, \sigma_{lead}^2, \sigma_{lag}^2, t, \eta, corr\}$. It should be noted that the return from the test cell $y_{n+1}$ and the average power $\mu_T$ were retained as inputs in each of these experiments. Tests conducted over a wide range of ASNR and ACNR levels and for different $P_{fa}$ levels indicated a degradation of detection performance (reduction in percentage of correct classification % CNN) ranging from about 1 to 4%. While this may not appear highly unattractive under the present test conditions involving homogeneous background scenarios, the greater fault tolerance properties induced by the additional inputs will equip the NN-CFAR to continue to offer a high level of performance in the case of nonhomogeneous background scenarios and when reduced window sizes are encountered.

Experiment # 2

To evaluate the performance of NN-CFAR when an interfering target is present, in this experiment we maintained the size of the reference window at N = 33. A Swerling-I interfering target with I/S = 1 (i.e., average fluctuating amplitude of the interfering target at the same level as that of the primary target) was placed in cell #4. For a value of $P_{fa} = 10^{-5}$, the input features were generated for different selections of S by a random number generator such that the average of SNR over 100 independent tests, ASNR, was maintained at a certain level. The results of processing these inputs by NN-CFAR and a comparison with the performance of the CA-CFAR algorithm in these cases are summarized in Table 3-2a. It must be noted that each entry of the table refers to the average of 100 independent tests conducted with distinct SNR values selected such that the average ASNR value is at the indicated level. The consistently better performance offered by the NN-CFAR scheme compared to the CA-CFAR is clearly evident. This also underscores the usefulness of the median parameter ($\eta$) included in the input feature set, which alerts the neural network to the presence of the interfering target (note in contrast that the CA-CFAR scheme is totally blind to this new situation, which causes a deviation from a homogeneous background).
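The masking effect of an interfering target on the CA-CFAR baseline can be illustrated with a small Monte Carlo sketch. It is not the simulation used for Table 3-2a: it assumes the standard CA-CFAR multiplier $T = P_{fa}^{-1/M} - 1$ applied to the sum of the $M = N - 1$ reference cells, 1-based cell numbering, a fixed SNR instead of a randomised one, and hypothetical function and parameter names.

```python
import numpy as np


def ca_cfar_pd(snr_db, p_fa=1e-5, n_window=33, interferer_cell=None,
               interference_ratio=1.0, trials=20000, seed=0, mu=1.0):
    """Monte Carlo estimate of the CA-CFAR detection probability.

    interferer_cell is the 1-based index of a cell holding a Swerling-I
    interfering target with I/S = interference_ratio (None for no interferer).
    """
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    centre = n_window // 2
    m = n_window - 1                                   # reference cells only
    scale = p_fa ** (-1.0 / m) - 1.0                   # assumed CA-CFAR multiplier
    means = np.full(n_window, 2.0 * mu)                # homogeneous background
    means[centre] = 2.0 * mu * (1.0 + snr)             # primary target in test cell
    if interferer_cell is not None:
        means[interferer_cell - 1] = 2.0 * mu * (1.0 + interference_ratio * snr)
    y = rng.exponential(means, size=(trials, n_window))
    reference_sum = y.sum(axis=1) - y[:, centre]
    return float(np.mean(y[:, centre] > scale * reference_sum))


if __name__ == "__main__":
    clean = ca_cfar_pd(20.0)
    masked = ca_cfar_pd(20.0, interferer_cell=4)
    print(f"P_d without interferer: {clean:.3f}, with interferer in cell 4: {masked:.3f}")
```

Because the interfering return inflates the reference sum, the adaptive threshold rises and the estimated detection probability drops, which is the behaviour the NN-CFAR features are meant to counteract.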

To further test the robustness of the NN-CFAR scheme, the above experiment was repeated with values of ASNR considerably outside the range of SNR values used in the training phase (viz., 1 dB to 20 dB). Despite the two mismatches (i.e., presence of an interfering target and different SNR values) from the training conditions, NN-CFAR maintained the high level of performance depicted in Table 3-2b. The superior performance offered by NN-CFAR despite the fact that the SNR values used in the test were up to 11 dB out of range (i.e., beyond the training levels) indicates that the background clutter distribution is more important for the NN-CFAR performance than the actual magnitude of the SNR. It may also be noted that the use of the t-statistic helps to reduce the dynamic range of the clutter.

Table 3-2a Comparisons of NN-CFAR in Experiment # 2 (Interfering Target Present)

ASNR (dB)   PDCA   PDNN   %CCA   %CNN   MERIT
21.7        0.53   0.63   53     65     +
20.5        0.48   0.64   55     71     +
20.1        0.44   0.61   47     67     +
19.4        0.47   0.62   62     77     +
19.1        0.51   0.53   52     58     +
18.3        0.33   0.51   49     68     +
17.8        0.30   0.45   40     57     +
17.2        0.21   0.37   44     57     +
16.7        0.23   0.39   43     60     +
16.4        0.18   0.35   41     56     +
15.0        0.15   0.28   50     56     +
13.2        0.04   0.22   48     57     +
9.7         0.01   0.05   73     75     +

Table 3-2b Performance of NN-CFAR for SNR outside the training range in Experiment # 2

ASNR (dB)   PDCA   PDNN   %CCA   %CNN   MERIT
31.6        0.55   0.88   54     88     +
31.2        0.65   0.86   65     86     +
30.5        0.55   0.77   55     78     +
29.9        0.60   0.79   61     80     +
29.4        0.61   0.80   61     80     +
29.0        0.63   0.86   62     86     +
28.8        0.50   0.80   49     80     +
28.5        0.63   0.82   60     82     +
28.0        0.57   0.77   55     78     +
27.7        0.56   0.77   56     77     +
27.3        0.56   0.81   56     81     +
26.6        0.55   0.78   56     81     +
25.9        0.61   0.82   60     82     +
25.1        0.67   0.78   69     79     +
23.7        0.63   0.82   64     85     +
22.7        0.56   0.74   61     77     +
20.9        0.47   0.60   46     64     +

Experiment # 3

In this experiment we examined the robustness of NN-CFAR for increasing

clutter levels. With all conditions of the test identical to that in the previous

experiment except that the interfering target is removed, the detection performance

of NN-CFAR was evaluated in comparison with that of CA-CFAR for various SNR

levels and various CNR (clutter-to-noise ratio) levels. The results of the test are

summarized in Table 3-3 (where each entry of the table once again corresponds

to the average from 100 independent runs). It is particularly noteworthy that despite an increase of the average clutter amplitude of more than 10 dB beyond the levels used for training, the statistical pattern of the clutter has been quite well picked

up by the NN-CFAR. This once again confirms our conclusion from the previous

experiment that the input features are well tailored to provide the information on

clutter distribution during the training process.

One aspect that may not be apparent from the stated results is the role of the sigmoidal activation function of the neural network in reducing the sensitivity of NN-CFAR to out-of-range signal amplitudes. The saturation at the hidden layers for higher SNR values helps keep the performance similar to that in the lower SNR ranges. This is an attractive feature in a detection scheme, since detection algorithms generally do not perform well when the receiver is saturated. For instance, if the individual sensors in the CA-CFAR resolution cells become saturated because of the limited dynamic range of the sensor, the CA-CFAR algorithm has no inherent means of compensating for it.
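The saturation effect can be illustrated with a minimal sketch. It assumes the common logistic form of the sigmoid, which is an assumption here since the text only says "sigmoidal"; the net input values are hypothetical and are not taken from the trained NN-CFAR.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation (assumed form of the sigmoidal unit)."""
    return 1.0 / (1.0 + np.exp(-x))

def saturation_demo(net_inputs):
    """Return unit outputs and local slopes for the given net inputs."""
    out = sigmoid(net_inputs)
    # Derivative of the logistic function: s(x) * (1 - s(x)).
    slope = out * (1.0 - out)
    return out, slope

# Hypothetical net inputs of growing size, standing in for growing signal amplitude.
net = np.array([0.0, 2.0, 5.0, 10.0, 20.0])
out, slope = saturation_demo(net)
for x, y, s in zip(net, out, slope):
    print(f"net input {x:5.1f} -> output {y:.6f}, slope {s:.2e}")
```

Once the net input is large, doubling it barely changes the unit output, which is the mechanism the paragraph above credits for the reduced sensitivity to out-of-range amplitudes.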

Table 3-3 Comparison of NN-CFAR and CA-CFAR for variation in clutter level (Experiment # 3)

ASNR   ACNR   PDCA  PDNN  %CCA  %CNN  MERIT
30.7   30.2   .91   .98   92    99    +
30.4   30.2   .90   .98   90    98    +
29.9   29.7   .90   .98   90    98    +
29.6   29.8   .86   .95   87    95    +
29.1   29.5   .89   .97   90    98    +
28.8   29.4   .86   .96   86    96    +
27.9   30.0   .87   .96   89    97    +
27.8   28.1   .86   .96   87    96    +
27.4   26.9   .87   .95   87    95    +
27.0   26.1   .81   .90   83    92    +
26.3   25.9   .84   .89   86    89    +
26.1   26.2   .80   .94   80    94    +
25.5   25.5   .83   .90   86    90    +
24.1   24.5   .81   .88   86    90    +
22.8   22.5   .86   .93   88    93    +
22.1   21.6   .83   .88   86    90    +
20.6   20.9   .80   .87   85    90    +
19.7   18.4   .74   .83   84    87    +
19.4   19.4   .73   .77   83    83    +
18.2   16.7   .62   .69   72    76    +
17.0   16.7   .67   .72   82    82    +
16.4   15.6   .69   .71   78    80    +
15.6   15.7   .73   .74   79    79    +
12.4   14.3   .51   .53   73    70    =
11.8   11.5   .41   .39   63    65    =
10.3   10.3   .51   .46   65    71    =
8.0    8.8    .31   .26   49    61    =
4.6    3.4    .22   .17   49    89    +
0.79   0.41   .14   .03   36    86    -

Investigation of the effect of the activation function on the sensitivity of NN-CFAR to out-of-range target SNR is also of interest for another reason. In a typical operational scenario, one may have to contend with jamming signals with SNR beyond the range that can be tolerated by the detector. Owing to the requirements of the interface electronics (e.g., use of linear amplifiers), almost all detectors are operated in the linear region of the signal amplifier, and hence it is vital for the detection algorithm to have the least sensitivity to signal amplitude level, particularly at high and low levels. Fig. 3.3 depicts the variation of probability of detection (Pd) against SNR for both NN-CFAR and CA-CFAR. While the superior performance of NN-CFAR for SNR levels above 14 db is clearly evident, it should be emphasized that its apparently lower performance for SNR levels below 14 db is misleading, because the false alarm rate in this range was also much higher for CA-CFAR than for NN-CFAR. This is evident from the results in Table 3-3 (the same values were used for sketching Fig. 3.3), where the percentages of correct classification (%CCA and %CNN) are given in addition to the probabilities of detection (PDCA and PDNN). For example, note that for ASNR = 10.8, PDCA = 0.51 and PDNN = 0.46 (as Fig. 3.3 depicts), whereas %CCA = 65 is below %CNN = 71. For levels of SNR above 14 db, PDNN is consistently higher than PDCA, and %CNN is at least as high as %CCA.

[Figure 3.3: plot of Pd versus SNR (db), axis from 0.0 to 28.0, with curves for NN-CFAR and CA-CFAR]

Fig. 3.3 Performance of NN-CFAR in Experiment # 3 (Increased clutter levels)

Experiment # 4

The focus in this experiment is to evaluate the robustness of NN-CFAR when the reference window is reduced in size. With all other conditions remaining the same as in the previous experiment, two different tests were conducted to study the degradation in detection performance caused by the loss of reference cells. The results obtained with a reference window of size N = 25 are summarized in Table 3-4a for the case when the target is in the clutter region (i.e., the sample from the test cell includes both target and clutter returns) and the average SNR and CNR values are varied. These results are to be compared with the entries in Table 3-3, which give the performance of NN-CFAR under identical conditions except with N = 33. It is evident that NN-CFAR maintains a high level of detection accuracy despite the loss of reference cells, and it is only in the lower end of the table, where the signal levels are very low, that the detection performance is compromised. For a neural network-based scheme, this only indicates that during the training phase the network was not exposed to an adequate number of examples reflecting that particular range of operation; in the present case, including more training examples that emphasize lower signal levels should help overcome the performance loss in this range of SNR values. Experiments were also conducted with further reductions in the size of the reference window. For illustration, only a few of the results obtained for N = 17 and N = 9 are given in Tables 3-4b and 3-4c * to indicate the general trend, which is sufficient to appreciate the excellent robustness characteristics of NN-CFAR.

* In comparing the entries of these tables it should be noted that the variation of detection probability with respect to the average SNR is not smooth, owing to the averaging over a finite number of independent runs (viz., 100 runs) and also to the different clutter levels used in these runs, which change the signal-to-clutter ratio.

Table 3-4a Performance of NN-CFAR with a reduced reference window, N=25, for different clutter levels (Experiment # 4)

ASNR   ACNR   PDCA  PDNN  %CCA  %CNN  MERIT
30.8   31.1   .79   .94   79    94    +
30.5   30.8   .74   .95   76    95    +
29.9   28.7   .80   .94   82    94    +
29.8   30.2   .79   .94   79    94    +
29.4   28.9   .87   .96   88    97    +
28.6   28.9   .79   .94   79    94    +
28.2   28.6   .78   .93   80    94    +
27.3   27.1   .80   .95   83    98    +
26.5   25.9   .83   .95   85    96    +
24.9   23.7   .72   .91   76    93    +
23.8   23.7   .79   .90   83    91    +
22.3   22.3   .70   .84   79    87    +
21.2   20.0   .67   .86   73    87    +
18.8   18.6   .60   .74   73    80    +
16.7   14.6   .59   .67   79    78    +
11.0   10.8   .40   .40   68    63    -
3.4    4.1    .17   .03   58    81    -

Table 3-4b Performance of NN-CFAR with reduced reference window, N=17 (Experiment # 4)

ASNR   ACNR   PDCA  PDNN  %CCA  %CNN  MERIT
30.7   30.4   .64   .97   64    97    +
29.8   29.5   .61   .93   62    93    +
29.1   28.2   .58   .91   61    91    +
28.3   28.4   .60   .94   62    94    +
27.8   27.6   .64   .91   68    91    +
25.8   25.9   .56   .88   62    90    +
24.0   24.8   .57   .89   61    89    +
21.2   21.0   .51   .83   57    84    +
20.2   20.6   .52   .84   61    87    +
18.7   19.2   .49   .79   62    85    +
18.1   19.8   .50   .82   61    87    +
14.8   14.2   .55   .59   77    77    +
12.3   12.2   .38   .48   75    76    +
4.6    3.8    .13   .12   70    76    +

Table 3-4c Performance of NN-CFAR with reduced reference window, N=9 (Experiment # 4)

ASNR   ACNR   PDCA  PDNN  %CCA  %CNN  MERIT
21.7   22.4   .51   .84   61    89    +
27.5   27.9   .08   .82   11    83    +
25.4   25.7   .05   .83   11    85    +
24.9   24.5   .07   .80   14    81    +
24.3   23.9   .09   .82   14    84    +
21.6   22     .02   .75   12    80    +

The test described above assumes that the target is in the clutter region; in this case, an increase in the probability of detection Pd occurs at the cost of an increase in the false alarm rate. We also conducted another test with the target in the clear region while the adjacent reference cells are from the clutter region. In this case, the false alarm rate is reduced together with a reduction in Pd when the number of available reference cells goes down. The results of this test for progressively reduced values of N = 33, 25, 17 and 9 are shown in Figs. 3.4a, b, c and d, where the performance of NN-CFAR is compared with that of CA-CFAR in each case. It is of interest to observe that CA-CFAR completely loses its detection capability for smaller values of N (N = 9 and smaller). The superior performance delivered by NN-CFAR is clearly evident from these graphs.
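One way to see why a cell-averaging detector suffers as the window shrinks is sketched below. The sketch uses the textbook CA-CFAR threshold for a square-law detector with exponentially distributed background power, which is an assumption about the comparison detector; the design false alarm probability of 1e-3 and the function names are hypothetical and are not taken from the experiments.

```python
import numpy as np

def ca_cfar_scale(n_ref, pfa):
    """Threshold multiplier alpha such that Pfa = (1 + alpha / N) ** (-N)."""
    return n_ref * (pfa ** (-1.0 / n_ref) - 1.0)

def ca_cfar_detect(test_cell, ref_cells, pfa):
    """Declare a target if the test cell exceeds alpha times the reference mean."""
    ref_cells = np.asarray(ref_cells, dtype=float)
    threshold = ca_cfar_scale(ref_cells.size, pfa) * ref_cells.mean()
    return test_cell > threshold

PFA = 1e-3  # hypothetical design value
for n in (33, 25, 17, 9):
    print(f"N = {n:2d}: threshold multiplier = {ca_cfar_scale(n, PFA):.2f}")
```

Under these assumptions the multiplier grows as N falls, so the same target has to clear a higher threshold when fewer reference cells are available.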

[Figure 3.4a: plot versus SNR (db), axis from 0.0 to 35.0, with curves for NN-CFAR and CA-CFAR]

Fig. 3.4a Comparison of NN-CFAR and CA-CFAR for a window size of N = 33

[Figure 3.4b: plot versus SNR (db), axis from 5.0 to 35.0, with curves for NN-CFAR and CA-CFAR]

Fig. 3.4b Comparison of NN-CFAR and CA-CFAR for a window size of N = 25

[Figure 3.4c: plot versus SNR (db), axis from 0.0 to 30.0, with curves for NN-CFAR and CA-CFAR]

Fig. 3.4c Comparison of NN-CFAR and CA-CFAR for a window size of N = 17

[Figure 3.4d: plot versus SNR (db), axis from 5.0 to 30.0, with curves for NN-CFAR and CA-CFAR]

Fig. 3.4d Comparison of NN-CFAR and CA-CFAR for a window size of N = 9

Experiment # 5

In this experiment, we evaluate the performance of NN-CFAR in the presence of a clutter edge, a situation in which the average powers in the leading and lagging portions of the reference window differ significantly. Under such circumstances, if the target is in the clear region or in the region of lower clutter level, serious target masking can result (because the heavy clutter on one side of the test cell raises the threshold unnecessarily), whereas if the target lies in the region of higher clutter level with some of the reference cells in the clear, the false alarm rate can increase very sharply [23].

For the performance evaluation, we set the target in the clear region (which is the more challenging case in the presence of edge clutter) and conducted various runs with different average SNR values and different values of N. A few illustrative results are summarized in Table 3-5, where the Ylead and Ylag values are included to indicate the difference in the average powers used to represent the clutter edge in these tests. In appreciating the robustness of NN-CFAR, it should be kept in mind that the clutter edge situation was not included in the examples used to train NN-CFAR. Thus, there are three mismatches from the training conditions, viz., the presence of a nonhomogeneous background, a reduced reference window size, and SNR values outside the training range. The superior performance offered by NN-CFAR is hence highly noteworthy. The role of the input features Ylead and Ylag in signalling to the NN-CFAR that the background has deviated from the homogeneous case with which it was trained also deserves particular emphasis.

Table 3-5 NN-CFAR performance in edge clutter (Experiment # 5)

N    ASNR   Ylead  Ylag   PDCA  PDNN  %CCA  %CNN  MERIT
33   28.5   29.9   26.0   .75   .83   78    86    +
27   23.7   26.0   20.2   .37   .42   48    51    +
19   22.8   26.0   20.0   .27   .46   42    61    +
17   19.1   21.9   17.9   .23   .51   42    70    +

As noted in the Introduction, the GO-CFAR, which is a specialized modification of the basic CA-CFAR scheme, has been specifically designed for handling edge clutter situations and has been shown to provide an improved level of performance in these situations. Hence the question may be raised why the performance of NN-CFAR is not compared with this specialized scheme. The reasons are twofold. On the one hand, the NN-CFAR has not been trained for any type of nonhomogeneous background, and hence such a comparison may not lead to any valid conclusions. On the other hand, it is well known that GO-CFAR, being a specialized design for edge clutter, does not perform as well as the basic CA-CFAR in the other cases (for instance, homogeneous background, presence of interfering targets, etc.) [23], and our present interest is to establish the robustness characteristics of NN-CFAR against a variety of deviations in the operating conditions. A specialized NN-CFAR that offers the best performance in the face of a particular type of deviation can evidently be designed simply by including a large number of examples depicting that specific type of operating condition in the neural network training set.

Experiment # 6

A case of special interest is when some of the reference cells in the clutter region on both sides of the test cell are defective and there is no return from these cells. Such a situation is illustrated in Fig. 3.5 and is of particular importance in fault-tolerant detection. It also represents a highly nonhomogeneous situation and, to the best of our knowledge, has not been addressed in the literature on CFAR detection. A GO-CFAR algorithm would clearly have been appropriate if the dead cells were on one side of the test cell only, in which case the GO-CFAR would select the side with greater clutter power and simply ignore the dead cells. However, when the dead cells are located on both sides of the test cell, a more challenging situation is encountered. The performance of NN-CFAR in such situations was evaluated for different average SNR and CNR values and also for various window sizes. For these tests, the samples in the reference cells were generated for the specific SNR and CNR levels, and the samples from four of these cells (two on each side of the test cell) were killed by replacing them with zeros. Some illustrative performance results from these tests are summarized in Table 3-6, once again underscoring the superiority of the NN-CFAR scheme.
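The effect of dead cells on the background estimate can be sketched with a toy window. Everything in the sketch is illustrative: the 16-cell window of constant power, the 1-based cell indices, and the helper names are hypothetical, and the "greatest of" estimate is the usual textbook form of GO-CFAR rather than a detector taken from the experiments.

```python
import numpy as np

def kill_cells(ref_cells, dead_idx):
    """Copy the reference window and zero the listed cells (1-based indices)."""
    out = np.array(ref_cells, dtype=float)
    out[np.asarray(dead_idx) - 1] = 0.0
    return out

def ca_estimate(ref_cells):
    """Cell-averaging background estimate: mean over the whole window."""
    return ref_cells.mean()

def go_estimate(ref_cells):
    """Greatest-of estimate: larger of the leading and lagging half-window means."""
    half = ref_cells.size // 2
    return max(ref_cells[:half].mean(), ref_cells[half:].mean())

window = np.full(16, 2.0)                     # hypothetical constant clutter power
one_side = kill_cells(window, [2, 3, 5, 7])   # four dead cells, leading half only
both = kill_cells(window, [2, 5, 11, 14])     # two dead cells on each side

print("one side :", ca_estimate(one_side), go_estimate(one_side))
print("both     :", ca_estimate(both), go_estimate(both))
```

With dead cells on one side only, the greatest-of estimate falls back on the intact half and recovers the true level, whereas with dead cells on both sides neither half is intact and both estimates are biased low.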

Table 3-6 NN-CFAR performance in Experiment # 6 (Target between two clutter patches with defective reference cells)

N    Dead Cells   ACNR   ASNR   PDCA  PDNN  %CCA  %CNN  MERIT
33   4,8,19,25    26.9   28.9   .77   .79   82    84    +
27   4,8,19,25    26.4   28.7   .79   .83   81    85    +
17   2,6,12,14    17.6   18.8   .38   .41   63    66    +
15   1,5,10,14    23.2   25.7   .55   .51   61    67    +

Fig. 3.5 Target between two clutter patches with defective reference cells (Experiment # 6)


3.4. Conclusions

The major contributions of this chapter are the development of a neural network scheme for CFAR detection (NN-CFAR) and the establishment of its robustness characteristics. The details of employing a neural network in this application, specifically the selection of the input features and the network training using these inputs, were described. The performance of the NN-CFAR scheme in a variety of operating scenarios, some of which correspond to significant deviations from the training conditions, was quantitatively evaluated. A comparison with the performance expected from the CA-CFAR scheme was given in each case to establish the superiority of the presently developed scheme.

The following major conclusions can be drawn from the performance evaluation results of the experiments reported in this chapter. In homogeneous background scenarios with an adequately large number of available reference cells (typically N ≥ 33), NN-CFAR matches the performance offered by CA-CFAR, whereas as the size of the reference window is reduced NN-CFAR maintains a high level of performance, significantly better than that of CA-CFAR. In scenarios where the background deviates from being homogeneous, as in the case of a clutter edge, defective cells on both sides of the test cell, or interfering targets, NN-CFAR continues to deliver superior performance over a wide range of SNR and CNR levels. The underlying reason for this robustness is the ability of the neural network to follow the statistical variations of the target and the clutter significantly better than CA-CFAR, which in turn is facilitated by its capability to process more statistical parameters as input features for target detection. This characteristic, together with the parallel distributed architecture of the neural network, which offers considerable benefits for hardware implementation, makes the NN-CFAR scheme a highly attractive and viable procedure for CFAR detection.

As mentioned at various places in this chapter, the primary emphasis in this study has been to demonstrate the robustness of NN-CFAR to deviations in the operating scenarios from the conditions used for training the neural network, and consequently the neural network was trained for homogeneous background cases only. Evidently, there is no reason to limit the training examples to any specific case or set of cases, and with appropriate training for nonhomogeneous background cases (edge clutter, interfering target, etc.), a further increase in performance, together with robustness to loss of reference cells and SNR variations, can be expected. Thus the NN-CFAR scheme offers the potential for integrating the strong points of several specialized CFAR processors (such as the GO-CFAR, SO-CFAR, etc.) in a single processor.

CHAPTER 4

NEURAL NETWORK IMPLEMENTATION OF

THE MOVING TARGET INDICATOR

4.1. Introduction

As described in Chapter 1, the Moving Target Indicator (MTI) is one of the most important functions of a high-quality radar system [51-63]. One of the major applications of MTI processing is in Air Traffic Control (ATC) systems, where several different objects are moving in the vicinity of the aircraft being tracked. Furthermore, detection of a group of birds that may be flying near the aircraft engine during the take-off or landing phase is of prime importance. Therefore, in this chapter the term clutter refers to the unwanted moving targets which have to be suppressed. Our objective is to conduct a series of detailed experiments on the processing of radar pulses using a neural network. We will discuss how a multilayer feedforward neural network with backpropagation learning may be employed to perform the functions of the MTI processor without excessive complexity in the receiver design. The Neural Network-based MTI (NN-MTI) is trained through examples to integrate a series of noisy radar pulses and to provide estimates of the target radial velocity in an on-line fashion. The mapping property of a trained neural network is utilized to extract this information from the radar pulse amplitude distribution. The key advantages over the traditional methods are the speed of response, suitability for hardware implementation, and flexibility in designing a variable-bandwidth doppler filter bank (which is not offered by the classical methods).

4.2. Some Basics on MTI Designs

The principal functions of an MTI processor are to utilize the doppler frequency shift produced by a moving target in order to: i) determine the relative velocity of a target and ii) separate a desired moving target from undesired stationary objects (clutter) [8,64,66]. The doppler frequency shift is related to the radial velocity of the target by

$$f_d = \frac{2 v_r}{\lambda} = \frac{1}{2\pi}\,\frac{d\phi}{dt}$$

where $v_r$ is the radial velocity of the target with respect to the radar, $\lambda$ is the wavelength of the signal, and $\phi$ is the phase shift of the signal after it hits a moving target. The input to an MTI is a sequence of pulses returned either from the target of interest or from neighboring objects, i.e., clutter. The output of the MTI is either a decision about the target (i.e., moving or stationary) or more precise information about the velocity of the target. Obviously, obtaining more information about the velocity of the target requires more complex design procedures [65,67,68]. Once the pulses are received by an MTI, each pulse is delayed by an integer multiple of the period between pulses, and the delayed pulses are then integrated (i.e., added in some fashion). The result of this pulse integration is either compared with a threshold or processed further to extract additional information about the target velocity.

We described the mathematical representations of the input and the output of the MTI filter in Section 2.8. As an example, equations (2.11a, b) represent the two inputs of a simple MTI filter (i.e., a two-pulse canceler) and the corresponding output is given by equation (2.11c). Fig. 4.1 illustrates two basic designs of an MTI filter and their frequency responses. Fig. 4.2 further illustrates the frequency responses of a number of MTI filters with different numbers of delays. For example, a three-pulse canceler is one that waits until three pulses are received and uses two delay elements (e.g., a shift register) to process the pulses. Note also that other configurations are possible by rearranging the delay elements (e.g., in cascade). Another type of MTI, which has been used in this dissertation for comparison with a trained neural network, is depicted in Fig. 4.3a. This is a more general architecture for an MTI filter, and the weights $w_i$ can be calculated by different methods. Despite the differences among the various architectures, the basic function of an MTI can be stated as follows:

Given a series of N pulses $p(i)$, where $i = 1, \ldots, N$, that are returned from a moving target, what are the best weights $\{w_i\}$ that must be multiplied by these pulses such that the weighted sum $z = \sum_{i=1}^{N} w_i\, p(i)$ gives a representation of the target velocity (or indicates that the target is moving)?
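The weighted-sum statement above can be made concrete with a small sketch. The two- and three-pulse canceler weights used here are the standard textbook choices and the pulse model (a constant clutter return plus a unit-amplitude doppler-shifted target return) is a simplifying assumption; neither is taken from equations (2.11a-c), and the function names are hypothetical.

```python
import numpy as np

def mti_output(pulses, weights):
    """Weighted sum z = sum_i w_i * p(i) over the received pulses."""
    pulses = np.asarray(pulses, dtype=complex)
    weights = np.asarray(weights, dtype=complex)
    return np.sum(weights * pulses)

def pulse_train(doppler_hz, pri_s, n_pulses, clutter_amp=1.0):
    """Stationary clutter of constant amplitude plus a unit target with doppler shift."""
    t = np.arange(n_pulses) * pri_s
    return clutter_amp + np.exp(2j * np.pi * doppler_hz * t)

TWO_PULSE = [1.0, -1.0]          # single delay line canceler
THREE_PULSE = [1.0, -2.0, 1.0]   # two delay lines

# Hypothetical case: PRF = 1 kHz, target doppler at half the PRF, strong clutter.
p = pulse_train(doppler_hz=500.0, pri_s=1e-3, n_pulses=3, clutter_amp=5.0)
print(abs(mti_output(p[:2], TWO_PULSE)), abs(mti_output(p, THREE_PULSE)))
print(abs(mti_output(np.full(3, 5.0), THREE_PULSE)))  # clutter alone cancels
```

In this toy case the constant clutter term drops out of the sum whatever its amplitude, while the moving-target term survives.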

The neural network-based MTI is inspired by this definition and, as will be discussed in the following sections, provides several advantages over the classical MTI architectures. The number of pulses to be processed in an MTI filter is an important design parameter. This is in turn dictated by the Pulse Repetition Frequency (PRF), which is the reciprocal of the period T of the pulse train. Furthermore, as we discussed in Section 2.8.3, the doppler shift is also related to the PRF, i.e.,

$$f_d = \frac{n}{T} = n\,\mathrm{PRF}$$

where n is an integer. It can be seen that for a higher doppler sensitivity, which in turn corresponds to a better velocity resolution, one needs to choose the PRF as high as possible. On the other hand, if the PRF is too high then the pulses will be closer together, which results in ambiguities in range measurements. A number of other considerations constrain the choice of PRF in practice. Some of these constraints are the following:

1) The radar has to have a certain amount of dwell time on each target, and several pulses have to be received in order to obtain a high signal-to-noise ratio. On the other hand, the dwell time is restricted by the scanning rate of the radar.

2) Each pulse is required to have a certain amount of power and hence is limited by its amplitude and duration. Furthermore, the width of a pulse determines the range resolution of a radar. The pulse width also relates to the bandwidth of a radar, which is usually a constraint by itself.

3) The blind speeds, which are integer multiples of a certain velocity that coincide with the MTI notches (as defined in Section 2.8.3) in the frequency domain, put another constraint on the choice of PRF (see Fig. 4.2). In other words, the output of the MTI is zero when the doppler shift is an integer multiple of the PRF, i.e., $f_d = n/T = n\,\mathrm{PRF}$. The target velocities resulting in zero response from the MTI are given by $v_n = n\lambda/(2T)$, where n is an integer.

4) Any limits on the computational resources can in turn limit the data rate that can be received by an MTI.

5) The flexibility of producing pulses with certain duty cycles (i.e., $\tau/T$, where $\tau$ is the pulse width and T is the time interval between the pulses) of a transmitter might be limited due to size restrictions of the radar for a particular application.

6) Other external requirements, such as jamming (interference by a hostile source), Electronic Counter Measures (ECM), etc., may also restrict the number of pulses.
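The tension between blind speeds and range ambiguity in the choice of PRF can be illustrated numerically. The sketch below uses the standard relations $f_d = 2v_r/\lambda$, $v_n = n\lambda\,\mathrm{PRF}/2$ and $R_{\max} = c/(2\,\mathrm{PRF})$; the wavelength and PRF values are hypothetical and the function names are illustrative only.

```python
C = 299_792_458.0  # speed of light in m/s

def doppler_shift(radial_velocity, wavelength):
    """Doppler shift f_d = 2 * v_r / lambda, in Hz."""
    return 2.0 * radial_velocity / wavelength

def blind_speeds(wavelength, prf, count):
    """First `count` blind speeds v_n = n * lambda * PRF / 2, in m/s."""
    return [n * wavelength * prf / 2.0 for n in range(1, count + 1)]

def unambiguous_range(prf):
    """Maximum unambiguous range c / (2 * PRF), in m."""
    return C / (2.0 * prf)

wavelength = 0.1  # hypothetical, roughly a 3 GHz radar
for prf in (500.0, 1000.0, 4000.0):
    first = blind_speeds(wavelength, prf, 1)[0]
    print(f"PRF {prf:6.0f} Hz: first blind speed {first:6.1f} m/s, "
          f"unambiguous range {unambiguous_range(prf) / 1e3:6.1f} km")
```

Raising the PRF pushes the first blind speed upward but shortens the unambiguous range in the same proportion, which is the trade-off behind the constraints listed above.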

Ideally, it is desirable that a radar designer combine several different functions in a single radar. Such radars are called multi-function radars and are generally more difficult to design because of several conflicting requirements. Our objective in this chapter is to demonstrate that the MTI functions can be implemented by a well-trained neural network that offers greater flexibility in design, which is quite advantageous for a multi-function radar system for air surveillance and other applications.

4.2.1. Current Approaches to MTI

Since the advent of digital computers there have been two major classes of MTI design procedures. The first is the pulse cancellation approach, implemented through a linear transversal filter, which has been the dominant method for implementing linear signal processing algorithms. Linear signal processing, however, puts severe limitations on the flexibility of the MTI design. When an MTI is designed using a transversal filter with a fixed number of delays, it accounts for only as many pulses as there are delay units, and additional received pulses are essentially wasted. The need for linear operation in these processors also restricts the dynamic range of clutter amplitudes that can be handled, and limiting the clutter amplitude spreads the clutter spectrum. The spread in the clutter spectrum degrades the improvement factor (i.e., the gain in signal-to-clutter ratio) of the MTI. The second method is based on the Fast Fourier Transform (FFT). In the first method the processing is done in the time domain, while with the FFT method the MTI functions are implemented in the frequency domain [8,57].

Most of the current MTI techniques can be represented through appropriate linear constant-coefficient difference equations, for which simple Z-transform techniques are available for design purposes. In scenarios characterized by a non-Gaussian clutter distribution, however, non-uniform sampling is required to avoid blind speed regions. A method that is generally used to resolve the ambiguity in velocity (i.e., blind speeds) is called pulse staggering, which is another name for using pulse series with different PRFs. This, however, limits the use of linear transform techniques in the analysis and design of linear MTI processors. Additionally, the desired MTI response is one with flat passbands. A flat passband indicates that the MTI response is uniform for all target velocities that are of interest. Fig. 4.4 illustrates the response of two MTI filters with pulse staggering. Note that it is desirable to have a flat response for an MTI filter in the regions of interest. However, the lack of sufficient work on the theory of non-uniform sampling limits the optimal calculation of the MTI filter weights, and one has to sacrifice the improvement factor (defined in Section 4.2.4) in order to obtain a flat frequency response. The MTI weights are also called coefficients, since in a digital implementation of an MTI, which is represented by a difference equation, the coefficients in the equation serve as the weights of the delay elements. In the following discussions, we will refer to MTI filter weights and coefficients interchangeably.

Computational complexity is another limiting factor in the computation of the MTI filter coefficients. Furthermore, the number of coefficients is also limited (since the number of coefficients is exactly equal to the number of pulses used), which in turn limits the flexibility of design. Some recently developed MTI techniques make use of data association and state estimation methods through linear prediction theory [35]. Although these methods perform well when the signal-to-noise ratio (SNR) is high, they do very poorly in cases of low SNR. In summary, the linear MTI techniques have almost reached their theoretical limits and, despite additional research in linear MTI techniques, certain problems such as those mentioned above still remain. These problems are mostly due to the underlying assumptions of linear processing. Therefore, it is of particular interest to direct MTI research towards the nonlinear processing techniques provided by neural networks. In the following sections, the general theory of optimal MTI processing and the underlying mathematics are reviewed, and the proposed NN-MTI and its performance evaluation are described.

4.2.2. The Radar Ambiguity Function

The performance of an MTI is very much constrained by the resolution requirements for range and velocity. The separation of two closely spaced targets as seen by the radar depends on the resolution of the pulses. The pulses have to be as narrow as possible in order to have a high spatial resolution. The radar state variables of interest are range, azimuth, and radial velocity. The radar ambiguity function is the complex envelope of the response of a matched filter receiver to the transmitted radar wave after it has been reflected from a point target [9]. This function represents the radar ambiguity in the time and frequency domains. The choice of PRF as well as the width of the pulses are critical design parameters because of the conflicting requirements of range and doppler resolution.

As mentioned before, multiple pulses are needed in order to achieve the required signal-to-noise ratio for the detection process. Multiple pulses also cause a long transient response in feedforward pulse cancelers, with the result that the MTI processing may not be able to take full advantage of the clutter correlation. That is, by the time the filter reaches steady-state operation, the pulses may be decorrelated from the clutter. The doppler shift ($f_d$) is also inversely related to the pulse width ($\tau$), which is the primary factor for the range resolution. If $f_d > 1/\tau$, the doppler signal may easily be distinguished from a single pulse. However, if $f_d < 1/\tau$, the pulses will be modulated in amplitude and many pulses are needed to extract the doppler (see Fig. 4.5) [9,71].

The weights of an MTI filter are optimized according to some a priori assumptions made about the doppler frequency. To separate noise and clutter from the moving target, particular points in the doppler frequency domain have to be removed without narrowing the MTI bandwidth too much. The magnitude response of the MTI resembles that of a comb filter, with stopbands in regions of heavy clutter and passbands, with minimal attenuation, in regions where the target doppler spectrum is expected. Linear MTI performance degrades even with a small degree of skewness in the noise or clutter relative to the assumed Gaussian distribution. Because of this shortcoming, adaptive schemes are needed that use spectral estimation of the clutter before cancellation is performed.

4.2.3. Transversal Filters

An MTI with a transversal filter has a frequency response proportional to $\sin^n(\pi f_d T)$, where n is the number of delay lines used. The corresponding weights are given [8] by

$$w_i = (-1)^{i-1}\,\frac{n!}{(n-i+1)!\,(i-1)!}, \qquad i = 1, 2, \ldots, n+1,$$

which are the binomial coefficients (weights). The average ratio $I = (S/C)_{\mathrm{out}} / (S/C)_{\mathrm{in}}$, which is the ratio of the signal-to-clutter ratio at the output to that at the input, is defined as the MTI improvement factor. The improvement factor provides a measure of performance for MTI systems; it is independent of the target velocity and depends only on the weights $w_i$, the clutter autocorrelation function, and the number of processed pulses. In general, there is only a small difference (less than 2 db) in improvement factor between the optimum weights and the binomial weights [8]. Delay line cancelers with amplitude responses of the form $\sin^n(\pi f_d T)$, where n is again the number of delay lines, are optimum in the sense that they approximately maximize the average clutter attenuation and the probability of target detection at the midband doppler frequency and its harmonics. Narrowing the passband too much reduces the number of detectable targets. Furthermore, as more delay lines are used, the notches at dc and at the PRF harmonics become too broad, which limits the passband.
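The binomial weights and the resulting $\sin^n$ response can be checked numerically. The following is a minimal sketch in which the function names are hypothetical; the weights follow the binomial formula with alternating signs, and the response is evaluated directly as the magnitude of the weighted sum of delayed complex exponentials.

```python
import numpy as np
from math import comb

def binomial_weights(n_delays):
    """w_i = (-1)**(i-1) * n! / ((n-i+1)! (i-1)!) for i = 1..n+1."""
    return np.array([(-1) ** (i - 1) * comb(n_delays, i - 1)
                     for i in range(1, n_delays + 2)], dtype=float)

def canceler_response(weights, fd_times_T):
    """Magnitude response of the transversal filter at normalised doppler f_d * T."""
    k = np.arange(len(weights))
    return abs(np.sum(weights * np.exp(-2j * np.pi * fd_times_T * k)))

for n in (1, 2, 3):
    w = binomial_weights(n)
    direct = canceler_response(w, 0.25)
    closed = (2.0 * abs(np.sin(np.pi * 0.25))) ** n  # 2**n * |sin(pi f_d T)|**n
    print(n, w, round(direct, 6), round(closed, 6))
```

The direct evaluation and the closed form agree, and the notch around $f_d T = 0$ becomes wider as n grows, which is the broadening of the dc notch mentioned above.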

A transversal filter with N outputs can be used to form a bank of frequency

filters to cover from de to the maximum desired PRF. Define the weights applied

to the outputs of the N taps as

TXT. _ e-i [211"(i-l)k/NJ YYlk - , i = 1,2,· .. , N & k = 0,··· , N - 1

where each value of k corresponds to a different set of N weights, and a different set

of doppler filter responses. The impulse response of the corresponding transversal

filter is then given by

$$h_k(t) = \sum_{i=1}^{N} \delta[t - (i-1)T]\, e^{-j2\pi(i-1)k/N}$$

and the corresponding Fourier transform is

$$H_k(f) = e^{-j2\pi f t} \sum_{i=1}^{N} e^{j2\pi(i-1)[fT - k/N]}.$$

Hence,

$$|H_k(f)| = \left| \sum_{i=1}^{N} e^{j2\pi(i-1)[fT - k/N]} \right| = \left| \frac{\sin[\pi N(fT - k/N)]}{\sin[\pi(fT - k/N)]} \right|.$$

It can be seen that, for $k = 0$, the peak response of the filter occurs at $f = 0, 1/T, 2/T$, etc.

This kind of filter bank leads to a coherent integration and good SNR performance.

We argue that a neural network will be far more efficient in a coherent integration

because even in a staggered PRF situation, which is needed for the enhancement of

blind speed situations, the peaks of doppler filter banks and their bandwidths can

be shaped through training. In contrast, the linear doppler filter bank implemented

with FFT and transversal digital filters can only provide a uniform set of doppler

filters with equal bandwidths and fixed nulls.
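To make the uniform spacing of the linear doppler filter bank concrete, the following sketch evaluates the magnitude response of filter $k$ both by the direct sum over the $N$ taps and by the closed form $|\sin[\pi N(fT - k/N)]/\sin[\pi(fT - k/N)]|$. The function names and the values of $N$ and $T$ in the demonstration are illustrative assumptions.

```python
import cmath
import math


def bank_response_direct(f, k, N, T):
    """|sum_{i=1}^{N} exp(j 2 pi (i-1) (fT - k/N))| evaluated term by term."""
    x = f * T - k / N
    return abs(sum(cmath.exp(2j * math.pi * (i - 1) * x)
                   for i in range(1, N + 1)))


def bank_response_closed(f, k, N, T):
    """|sin(pi N x) / sin(pi x)| with x = fT - k/N; the limit at integer x is N."""
    x = f * T - k / N
    den = math.sin(math.pi * x)
    if abs(den) < 1e-12:
        return float(N)
    return abs(math.sin(math.pi * N * x) / den)


if __name__ == "__main__":
    N, T = 8, 1.0e-3  # hypothetical number of taps and pulse interval (s)
    for k in range(N):
        f_peak = k / (N * T)  # filter k peaks where fT - k/N is an integer
        print(k, f_peak, bank_response_closed(f_peak, k, N, T))
```

Every filter in the bank has the same shape and the peaks are spaced by $1/(NT)$, which is the uniformity referred to in the text.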

4.2.4. The MTI Improvement Factor

As defined earlier, the MTI Improvement Factor is the signal-to-clutter ratio

at the output of the MTI system divided by the signal-to-clutter ratio at the input,

averaged over all target radial velocities of interest. This is given by

$$I_c = \frac{S_o/C_o}{S_i/C_i}$$

where $S$ and $C$ represent the signal and clutter power, respectively. This equation

can be rearranged as

$$I_c = (S_o/S_i)(C_i/C_o)$$

where $(S_o/S_i)$ is called the MTI gain, which is the ratio of the signal average at the

output of the MTI to the average of the input signal. The term $(C_i/C_o)$ represents

the clutter attenuation factor. This factor is independent of the target velocity

and depends merely on the MTI weights and the power spectrum of the clutter

as well as the number of pulses used in the process. Clutter power may well be

overshadowing the target (e.g. 10,000 times stronger than target power).

The improvement factor in terms of the covariance matrix functions and the

complex weight vectors is given by

$$I_f = \frac{W^T M_s W^*}{W^T M_c W^*}$$

where $W$ represents the vector of complex weights and $M_s$ and $M_c$ are the signal

and interference covariance matrices, respectively. We can calculate the improve-

ment factor of coherent MTI using the equation [9]

$$I_c = \frac{\sum_{j=0}^{n-1} w_j^2}{\sum_{j=0}^{n-1} \sum_{k=0}^{n-1} w_j w_k \rho_c(j-k)}$$

where the term in the numerator is the MTI gain, $w_j$ denotes the MTI's weight,

$\rho_c$ is the clutter correlation coefficient, and $n$ is the number of pulses processed by

the MTI. The clutter correlation coefficient for a Gaussian clutter density is given

by

where $T$ is the pulse interval period and $\sigma_c^2$ is the variance of the clutter distribution.

Therefore, the improvement factor depends on the MTI weights and hence the

optimum weights that maximize the improvement factor can be calculated. We may

also use binomial weights as mentioned before since the difference in performance

is only on the order of 2 dB.
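The ratio of the MTI gain to the weighted sum of clutter correlations can be evaluated numerically for any weight set. The sketch below is a minimal illustration under stated assumptions: the correlation function is passed in as a callable, and the Gaussian-shaped example `gaussian_rho` with its `decay` parameter is a hypothetical stand-in, not the exact expression used in this chapter.

```python
import math


def improvement_factor(weights, rho):
    """I_c = sum_j w_j^2 / sum_j sum_k w_j w_k rho(j - k)."""
    gain = sum(w * w for w in weights)
    residue = sum(wj * wk * rho(j - k)
                  for j, wj in enumerate(weights)
                  for k, wk in enumerate(weights))
    return gain / residue


def gaussian_rho(decay):
    """Hypothetical Gaussian-shaped correlation rho(m) = exp(-decay * m^2)."""
    return lambda m: math.exp(-decay * m * m)


if __name__ == "__main__":
    rho = gaussian_rho(0.01)  # assumed value, chosen only for illustration
    for name, w in (("single canceler", [1, -1]),
                    ("double canceler", [1, -2, 1])):
        value = improvement_factor(w, rho)
        print(name, value, 10.0 * math.log10(value), "dB")
```

For the two-pulse canceler with weights $(1, -1)$ the expression reduces to $1/(1 - \rho_c(1))$, which is a convenient check on any implementation.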


4.2.5. The Optimum MTI Processing Theory

The MTI filter, as discussed thus far, provides a comb filter that ideally has

a flat passband in the expected target regions and stopbands in the heavy clutter

regions. As mentioned before, the MTI is effective only in improving the signal-to-clutter

ratio and has no capability in improving the system signal-to-noise ratio.

Therefore, radar detection theory is used only for single pulse MTI analysis. For the

SNR enhancement, an appropriate integration is needed following the MTI. This

integration can be implemented by coherent or incoherent integration methods.

One way to do a coherent integration is to use an FFT algorithm to form a bank

of filters [9,67]. The whole system of the MTI followed by an integrator acts as a

matched filter which is matched to the target spectrum. This is the case for uniform

doppler frequencies (i.e., assuming that the targets are distributed uniformly across

the doppler frequency band). An alternative method to achieve integration is to

cascade MTI with an incoherent integrator. The incoherent integration, however,

causes a detection loss due to the fact that the MTI receiver noise is correlated

incoherently. This loss increases as more pulses are integrated.

4.3. Why Neural Network For Implementation of MTI (NN-MTI)?

The linear representation of time series, in general, is more constrained

than the regression capability provided by the neural networks. In almost any filter

design problem, including the MTI, the main idea is to shape the filter response for

any arbitrary condition without long transient responses. Other polynomial filters,

such as a Chebyshev filter, can also be used. However, this will cause ripples in

the passband and still a large number of delay lines are needed for a highly shaped

filter response. For example, Chebyshev design results in a wider passband but

only at the cost of a lower improvement factor. If only a few pulses are available,

the MTI response is very hard to shape. This is another issue that

we will address in the NN-MTI design.

Nonrecursive transversal filters provide N zeros for synthesizing the MTI

response. However, as mentioned earlier, this requires a large number of zeros for

a highly shaped filter. An alternative design is to make use of recursive filters.

The presence of feedback loops, however, causes a very poor oscillatory transient

response. The additional degree of freedom in a recursive transversal filter is due

to more connectivity among the delay units. A multilayer feedforward architecture

offers the required larger connectivity without needing the feedback loops and

there is no transient response for NN-MTI because everything is done in parallel.

Hence, a desirable steady-state response can be readily achieved with NN-MTI.

Also the poor transient response due to the presence of feedback loops in recursive

transversal filters results in a severe ringing when large clutter returns are received,

which effectively act like a step input to the MTI. Ringing is undesirable because

it causes a masking of the target signal until the transient response fades away.

Digital MTI designs have another major limitation which is the restriction

on the dynamic range imposed by the analog to digital (A/D) converter. The A/D

converter must operate at a speed high enough to preserve the information content

of the radar signal and the number of bits into which it quantizes the signal must

be sufficient for the precision required. The number of bits in the A/D converter

determines the maximum improvement factor that the MTI radar can achieve

[8,61]. A limiter is generally used to make sure that the A/D converter covers

the peak excursion of the detector output. Therefore, the practical constraints on

the speed and dynamic range of A/D converter pose major drawbacks in a digital

implementation of the MTI processors. This problem does not exist in a neural


network implementation of MTI simply because the nonlinear transfer function of

the hidden layers will do the required normalization through training.

The residual clutter can compound the incoherent integration which totally

destroys all improvement produced by integration. We will show that the nonlinear

processing capability of a neural network can successfully combine the doppler

processing and integration performed by MTI and its coprocessor (i.e., integrator).

This means that the MTI implementation with this method reduces the need for

an effective number of independent pulses. There are several other conflicting

requirements for the optimum MTI design where the algorithmic procedures may

not be as efficient. In summary, shaping the magnitude frequency response of MTI

demands the flexibility offered by the neural network.

The connectivity of the processor elements in the neural network creates a

number of different weight combinations that can be optimized based on a certain

number of pulses and interference distribution. In NN-MTI, the weights are optimized

by feeding back the error in an iterative process until a desired performance is

achieved. The feedback feature is peculiar to the proposed neural network architecture

(i.e., multilayer feedforward with backpropagation learning). The steady-state

response starts after the training is completed, therefore the ringing effect and poor

transient response observed in recursive delay-line filters are not of any concern in

a neural network implementation of MTI. The weights are optimized subject to the

condition that the signal-to-clutter ratio be maximized. The linear MTI is only

optimized for one kind of distribution, mainly Gaussian, while the neural network

has much more memory and can be optimized for several different distributions,

not just one. It is therefore possible to add more features to result in a single and

more compact design by utilizing the flexibility offered by the neural network.
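The iterative error feedback described above can be sketched with a very small multilayer feedforward network trained by backpropagation. This is only a minimal illustration, not the network used in the experiments: the class name `TinyMLP`, the layer sizes, the learning rate, the tanh units, and the toy data are all assumptions made for the example.

```python
import math
import random


class TinyMLP:
    """One-hidden-layer feedforward network with tanh units."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # One row per hidden unit; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        # Output weights, with the bias as the last entry.
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
             for row in self.w1]
        y = math.tanh(sum(w * v for w, v in zip(self.w2, h)) + self.w2[-1])
        return h, y

    def train_step(self, x, target, lr=0.1):
        """One backpropagation update; returns the squared error before it."""
        h, y = self.forward(x)
        err = target - y
        delta_out = err * (1.0 - y * y)  # tanh'(a) = 1 - tanh(a)^2
        # Hidden deltas use the output weights before they are updated.
        delta_hid = [delta_out * self.w2[j] * (1.0 - h[j] * h[j])
                     for j in range(len(h))]
        for j, hj in enumerate(h):
            self.w2[j] += lr * delta_out * hj
        self.w2[-1] += lr * delta_out
        for j, row in enumerate(self.w1):
            for i, xi in enumerate(x):
                row[i] += lr * delta_hid[j] * xi
            row[-1] += lr * delta_hid[j]
        return 0.5 * err * err


def mean_error(net, data):
    """Average of 0.5 * (target - output)^2 over the data set."""
    return sum(0.5 * (t - net.forward(x)[1]) ** 2 for x, t in data) / len(data)


def train(net, data, epochs=300, lr=0.1):
    """Present every training vector once per epoch and feed the error back."""
    for _ in range(epochs):
        for x, target in data:
            net.train_step(x, target, lr)


if __name__ == "__main__":
    # Toy data (hypothetical): learn target = 0.5 * x on a few points.
    data = [([x / 2.0], 0.25 * x) for x in range(-2, 3)]
    net = TinyMLP(n_in=1, n_hidden=3)
    print("error before training:", mean_error(net, data))
    train(net, data)
    print("error after training: ", mean_error(net, data))
```

Once training stops, the network is a fixed feedforward mapping, which is why no recursive transient appears in operation.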


Traditionally, as we discussed in the previous chapter, for the detection of

a stationary target one needs to integrate the amplitude fluctuations of the radar

pulse sequence and map the information into the correct decision about the target

with regard to its absence or presence. We demonstrated in Chapter 3 that the

neural network mapping property provides an efficient methodology for integrating

several different parameters to reliably identify the presence of the target in clutter

with robustness to the loss of resolution cells. In light of the same concept,

one may rely on the NN-MTI to do the required processing with fewer

pulses. Radar pulse series can be either random in phase (i.e., each pulse has

a different phase angle) or they can be generated with the same starting point

(i.e., phase angle) for each pulse. Furthermore, the pulse series may have a defined

phase modulation to enhance the detection process. The random phase pulses are

referred to as incoherent pulses whereas the in-phase or phase modulated pulses

are called coherent pulses. Coherent pulses can be utilized to extract information

about the target motion based on the doppler shift. This is in turn reflected by

the rate of change of the phase angles (i.e., frequency shift) with respect to the

reference wave.

The underlying mechanisms in the NN-MTI and MTI are totally different.

The NN-MTI outputs represent information other than the residual clutter as well

as the processed signal. The output of the NN-MTI is the decision and declaration

about targets, not the signal itself. In other words, whereas the output of an MTI

processor gives a map of the target and the background which has reduced clutter,

the NN-MTI helps in the final decision about the separation of the target from

clutter and makes no mapping of the clutter itself. This is due to the nature of

neural network processing that a residual signal at its output does not have any


meaning other than what is interpreted. Therefore, the NN-MTI that we are going

to introduce in this chapter is designed only for separating the moving targets

from clutter as well as providing the target radial velocity. Furthermore, this will

be done without the need for the Fast Fourier Transform (FFT) methods and other

hardware complexities such as precise timing of the pulses.

As mentioned before, clutter peaks occur in the MTI input which may not fit

in the dynamic range of the MTI processor, which is in turn due to the limitation

of the A/D devices. The nonlinear operation of the limiter creates additional

harmonics of clutter which causes a spread in the clutter spectrum and results in

a reduction of the improvement factor. Distortion of the clutter statistics due to

the A/D limiting causes a deviation of the MTI weights from their optimal values.

Therefore, a nonlinear processor that can optimize its weights from the beginning

based on this nonlinear effect will definitely provide a greater improvement. This

feature is provided by NN-MTI since the first hidden layer will always normalize

the input vector accordingly and each sample is normalized in parallel so that the

clutter variation may have a far more dynamic range.

From the neural network standpoint, the function of MTI processing is to

calculate the weighted combination of the pulse amplitudes in order to demodulate

the effect of doppler on the corresponding pulse sequence. To perform this

task, many different methods have been proposed for calculation of these weights

[9,68]. We have already discussed the deficiencies of the existing methods as

well as the flexibilities offered by the neural network architectures. Since a neural

network provides a tool for the calculation of these weights through training,

one can concentrate more on shaping the frequency response of the MTI filter as

well as more efficient presentation of the pulse sequence to the processor. In other


words, the use of other parameters as an aid to better code and decode the pulse

sequence becomes possible. This is a very important feature for the neural network

method of implementation of MTI, since there may not be as much restriction on

the waveform design and coding of the pulses.

To summarize, with the algorithmic methods and linear transversal filter

implementation techniques, the flexibility of design is much more limited, as discussed

in this section. Furthermore, one can think of the neural network as a more

general Fourier transform method which has been one of the major tools for the

analysis of radar signals. The Fourier transform, which is a linear transformation

based on eigenfunction expansion of the signal, has only two degrees of freedom,

i.e., the amplitude and frequency of each harmonic. On the other hand, the neural

network activation functions can be of different forms for each layer with different

sensitivities as well as different learning parameters. The eigenfunctions for

the Fourier transform are of periodic nature which serves as a major drawback of

this technique in several radar processing problems such as those characterized by

the presence of the blind range or blind speed zones. Moreover, the analysis of

non-periodic pulse sequences (i.e., non-uniform sampling) with Fourier method is a

formidable task. Therefore, exploration of new implementation techniques through

the employment of nonlinear transform methods, such as that provided by a neural

network, holds out particular attraction in handling the problems mentioned in this

section.


4.4. Neural Network Architectures of the MTI

In this section we shall investigate several different design structures for

the MTI and analyze their performance compared to the classical Pulse Canceler

method. As will be seen in the following discussions, the NN-MTI has far more

flexibilities than the pulse cancellation method which is implemented by a linear

transversal filter. We will move in a step by step manner through a sequence of

simulation studies towards the goal of obtaining better design procedures for MTI.

4.4.1. NN-MTI Doppler Shift Extraction From Pulse Series

The fundamental problem to study in the NN-MTI design is to see how well

a neural network extracts the doppler shift from a series of pulses in the absence of

clutter. In the following experiments we will study this property through training

of several different neural networks.

Experiment # 1

For this purpose (i.e., doppler processing by neural networks), we generated

a series of 5 pulses with a peak amplitude of 100 from a moving target with a radial

velocity range of [0 m/s, 75 m/s]. A 3-layer neural network with 5 input nodes, 10

hidden nodes, and 1 output node was trained with 300 training vectors which were

generated as follows:

1) A velocity step size of 5 m/s was used. As will be discussed in the following

experiments, the choice of the step size for the velocity increments in

generating the training data has a significant effect on the robustness of the

resulting NN-MTI scheme.

2) A doppler sensitivity of 6.7 Hz/m/s was used, which is the sensitivity for

the 1 GHz carrier frequency. This quantity was chosen based on the radial

velocity range to avoid blind speed zones in the training data, since the neural

network will be misled by an inconsistency in the training set.

3) Uniform pulse intervals were used.

4) The output was divided into three different classes of doppler shifts, i.e.,

(0 Hz, 120 Hz), (120 Hz, 300 Hz), and (300 Hz, 500 Hz).

5) Generalized Delta Rule was used for the weight adjustments.

6) The activation function was the two sided sigmoid function, bounded by

[-1,+1].
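The doppler sensitivity in item 2 and the three doppler classes in item 4 can be checked with a few lines of arithmetic. The sketch below assumes the usual two-way relation $f_d = 2 v f_c / c$ for a monostatic radar; the function names, the class labels, and the handling of shifts that fall exactly on a class boundary are our own assumptions, since the intervals in item 4 are open.

```python
C_LIGHT = 3.0e8  # speed of light in m/s (rounded)


def doppler_sensitivity(carrier_hz):
    """Doppler shift per unit radial velocity, 2 f_c / c, in Hz per (m/s)."""
    return 2.0 * carrier_hz / C_LIGHT


def doppler_class(fd_hz):
    """Map a doppler shift in Hz to one of the three classes of item 4.

    Assumption: a shift exactly on a boundary goes to the higher class.
    """
    if fd_hz < 120.0:
        return "slow"
    if fd_hz < 300.0:
        return "medium"
    return "fast"


if __name__ == "__main__":
    sens = doppler_sensitivity(1.0e9)  # about 6.67 Hz per m/s at 1 GHz
    print("sensitivity:", sens)
    for v in range(0, 80, 5):  # 0 to 75 m/s in 5 m/s steps
        fd = sens * v
        print(v, fd, doppler_class(fd))
```

With this sensitivity, the upper velocity of 75 m/s maps to about 500 Hz, which matches the upper edge of the third class.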

Note that this resembles a bank of three doppler filters with non-uniform

bandwidths! The Fast Fourier Transform (FFT) method cannot generate non-uniform

bandwidths for the doppler filter bank. The reason for this is the periodic

nature of the FFT. That is, for each range cell of the radar, one can generate N

uniform (and only uniform) bandwidth doppler filters with the FFT algorithm,

where N depends on the number of pulses as well as the duration of each pulse

[8,9]. Similarly, the Pulse Canceler methods only provide a wideband MTI filter

with a non-flat response. The non-flat response is due to the small number of

pulses. The Pulse Canceler method cannot separate the velocity ranges which is a

drawback in the case of multiple target situations.

In the following tables, P( i) denotes the peak amplitude of the ith pulse

which is in the range [-100,100]. Note that during the training phase, the pulse

amplitudes are normalized with respect to the maximum magnitude of the pulses

in the sequence. The quantity termed "desired solution" refers to the desired value

that has been used for training the neural network. For instance, the numbers

30.0, 20.0, and 10.0 in Table 4-1 and the other tables represent certain classes of

target velocities selected (e.g., 30.0 stands for a slow target and 10.0 stands for a

fast target). The last column denoted "NN solution" indicates the response of the

trained neural network in each scenario to a set of test samples that were different

from the samples used during the training phase. For brevity, only a few tests (out

of 100 conducted tests) are shown in these tables. The rate of correct classification

and the average velocity error that was incorporated for testing the trained neural

network are indicated at the top of each table. In order to generate the test data, we

selected 100 different samples of target velocities and we added a uniform random

number to each velocity. Furthermore, this average error (i.e., mean of the uniform

random numbers) was either equal to or greater than the velocity step size used for

training the neural networks. Then we simulated the doppler shifts corresponding

to these random target velocities which resulted in new pulse amplitudes other

than those which were used in training. In describing the results, we will also refer

to the test samples as "test vectors" which denote the group of pulses that were

generated as such for testing and evaluation.
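The normalization of the pulse amplitudes and the perturbation of the test velocities described above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the interval of the uniform random number is an assumption (we take $[0, 2m]$ so that its mean equals the stated average error $m$), since the text specifies only the mean.

```python
import random


def normalize_pulses(pulses):
    """Scale a pulse sequence by its largest magnitude so values lie in [-1, 1]."""
    peak = max(abs(p) for p in pulses)
    return [p / peak for p in pulses] if peak > 0 else list(pulses)


def perturb_velocities(velocities, mean_error, seed=0):
    """Add a uniform random number with the given mean to each velocity.

    Assumption: the noise is drawn from [0, 2 * mean_error], whose mean is
    mean_error; the actual interval is not stated in the text.
    """
    rng = random.Random(seed)
    return [v + rng.uniform(0.0, 2.0 * mean_error) for v in velocities]


if __name__ == "__main__":
    print(normalize_pulses([50.0, -100.0, 25.0]))  # [0.5, -1.0, 0.25]
    nominal = [5.0 * k for k in range(16)]  # 0 to 75 m/s in 5 m/s steps
    print(perturb_velocities(nominal, mean_error=5.0))
```

Each perturbed velocity would then be converted to a doppler shift and a new pulse sequence, giving test vectors that differ from the training vectors.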

Table 4-1 illustrates the response of the NN-MTI for the test data after the

training was completed in Experiment # 1. The percentage of correct classification

was 89%. Note that the neural network solution (i.e., output) in this case

is compared with a threshold which is defined as the average of the numbers representing

two adjacent classes. Hence, the number 29.9 in Table 4-1 is compared

with (30.0+20.0)/2 which is 25.0. Now since the number 29.9 is greater than 25.0,

we classify it as a slow target which means that the neural network response was

correct for the test vector in the corresponding row (i.e., for the specified values of

$P(1), P(2), \ldots, P(5)$).
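The thresholding rule just described, in which the network output is compared with the midpoints between adjacent class values, can be written as a short function. The function name and the treatment of an output that falls exactly on a threshold are assumptions made for this sketch.

```python
def decide(nn_output, class_values=(30.0, 20.0, 10.0)):
    """Return the class value whose threshold interval contains nn_output.

    Thresholds are the averages of adjacent class values, e.g. 25.0 between
    30.0 and 20.0. An output exactly on a threshold goes to the higher class
    value (an assumption; the text does not cover ties).
    """
    ordered = sorted(class_values)
    thresholds = [(a + b) / 2.0 for a, b in zip(ordered, ordered[1:])]
    for value, threshold in zip(ordered, thresholds):
        if nn_output < threshold:
            return value
    return ordered[-1]


if __name__ == "__main__":
    for output in (29.9, 23.6, 18.1, 13.8, 7.8):
        print(output, "->", decide(output))
```

For example, an output of 29.9 exceeds the threshold of 25.0 and is assigned to the class represented by 30.0, whereas 13.8 falls below 15.0 and is assigned to the class represented by 10.0.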


Table 4-2 shows the performance for a set of 10 test vectors which were

generated with an average error of 6 m/s in the radial velocities of the target. The

percentage of correct classification in this case reduced to 76%. This indicates

that teaching the NN-MTI by direct use of pulse amplitude distribution is not

sufficient with a step size of 5 m/s for velocity increments in the training vectors

if more noise immunity is desired. In other words, the trained neural network

has learned to generalize as long as the average noise in velocity is less than or

equal to the increments used in training. For example, the number 13.8 in the

first row of Table 4-2 which denotes the neural network (NN) solution in this case

is well below the threshold value (i.e., 25.0) that was mentioned above. Therefore,

it results in a false classification of the target velocity. On the other hand, the

numbers 29.6, 18.1, 19.1, 18.2, 19.1, 23.6, 7.8, and 9.2 represent correct classification

since they are close enough to the correct number which represents that class (i.e.,

30.0, 20.0, 10.0 referring to slow, medium, and fast). It may be noted that by using

the neural network, not only do we get an indication of target movement, but

we can also further classify the speed range of the target.

It may also be noted that, in the absence of clutter, one can easily train

the neural network with smaller step sizes in radial velocity and make use of the

associative memory property of neural networks and simply store the responses

corresponding to each pulse amplitude distribution. However, in the presence of

clutter, we need to parameterize the amplitude modulation of the pulses such that

the noisy pulse amplitude distribution can be trained to the network.

The effect of an increase in the number of pulses was studied by considering

a case with 10 pulses. The neural network architecture was the same except for

the number of inputs which was 10 in this case. The number of training vectors,

learning rates, and the activation function were maintained the same as before.

The radial velocity step size was also 5m/s. The training was stopped when the

error reached a steady state. As can be seen from the three samples in Table 4-3,

the NN-MTI can imitate a doppler filter bank (of 3 in this case) with variable

bandwidths. The percentage of correct classification increased to 92% giving a 3%

improvement compared to the case discussed earlier (as represented by Table 4-1).

Obviously, the use of more pulses accounts for this improvement.

Despite the increase in the number of pulses, the test vectors for the 6 m/s

velocity error reveal that with a slight change in the radial velocity error beyond

the training step size, the response of NN-MTI deviates considerably (i.e., 14%

decrease) from the desired classification rate as illustrated in Table 4-4. Therefore,

regardless of the number of pulses used for the training, the NN-MTI cannot correctly

classify the radial velocities except for the velocity values for which it has

been trained with some variation (i.e., 5 m/s). Note in comparison that the Pulse

Canceler method cannot classify the target radial velocity at all. That is, it simply

indicates whether the target is moving or not by looking at the pulse amplitude

distribution. On the other hand, with the Fast Fourier Transform method (FFT),

one can extract the radial velocities with an algorithmic implementation of the uniform

doppler filter bank. However, the net amplitude response of an FFT-based

filter is not flat. From the entries in Tables 4-1 through 4-4, it is evident that we

can further improve the performance of the NN-MTI to do as well as the FFT

algorithm in an on-line fashion with a variable bandwidth doppler filter bank such

that the amplitude response stays flat. Recall that the variable bandwidth filter

bank is particular to the neural network-based MTI and is not feasible with the

FFT method (at least not in an efficient way).


Table 4-5 shows how the NN-MTI with a direct presentation of pulses at

the input nodes (without any additional preprocessing) has learned to perform a

classification of the moving target. This time only two classes were considered,

i.e., the class represented with the numerical value of 10 is considered as slow and

the class represented by the numerical value of 0 is considered as fast. A similar

test experiment illustrated that the NN-MTI with direct presentation of the pulses

learns to classify the radial velocities within a slight variation from that of the

training vectors.

Experiment # 2

To demonstrate that the neural network can efficiently learn the nonlinear

functional relationship between the doppler shift and the pulse amplitude distribution,

we trained a neural network with a similar structure as discussed above and

arranged for the training data such that there were 10 examples for every 1 m/s

increment in velocity. Furthermore, we directly presented the 5 amplitude modulated

input pulses to the network with the one output representing the doppler

shift that has caused the corresponding pulse amplitude modulation. We performed

this experiment for a velocity range of [0 m/s, 70 m/s] with a total of 700 training

vectors (i.e., 10 examples for each increment of 1 m/s). The network learned the

correspondence between the input pulse sequence and output doppler shift exactly.

We conducted these experiments on the pulse sequences without any preprocessing

and without any noise or clutter samples in the pulse series. From these,

we can conclude that in the absence of noise and clutter, neural networks can be

used for an efficient on-line mapping of modulated pulses onto the doppler shift.

This is a task which ordinarily requires a large amount of storage locations as well

as much computational effort for implementing with the use of the FFT algorithm.


The performance evaluations conducted here indicate that the error can be reduced

by further training with smaller step sizes on the noise-free data.

Adapting the training data to the required resolution is a subtle point in

the design of multi-spectral sensor systems which require the fusion of data from

different sources with different resolutions. Table 4-6 illustrates the training data

(i.e., the sequence of 5 input pulses and the corresponding desired output which

is the doppler shift) that correspond to a step size of 5 m/s for training which

resulted in a low resolution of doppler shifts (i.e., comparing the difference in the

last two columns in Table 4-6). On the other hand, Table 4-7 shows how well

the neural network learned this nonlinear functional relationship for the velocity

increments (i.e., step size) of 1 m/s which is a rather small value in terms of radar

measurements in the microwave regions. The entries in Table 4-7 indicate that,

as long as the average error in velocity is less than 1 m/s, the neural network can

produce the correct doppler shifts within two decimal digits. It is interesting to

note that a similar network structure and similar training algorithm may be used

for millimeter wave radars as well as radars which operate at optical frequencies.

Having a unified electronic circuitry that is capable of providing efficient operation

in such a wide spectrum of wavelengths is a very attractive feature of the neural

network application to radar design.

One of the most important parameters in radar detection (this is true for

other detection schemes as well), which is based on the transmission of a series of

pulses, is the actual probability of detection for each individual pulse. In all of the

above experiments we assumed that the probability of detection (Pd) for each pulse

was equal to 1. That is, training vectors were generated based on the assumption

that Pd = 1. In practice, however, depending on the probability of detection, some

of the returning pulses may either be too heavily corrupted with noise or miss

the moving target and hit a stationary target in its neighborhood. The end result

is that some of the pulses could be very different from those actually available at

the time of training the NN-MTI.

Note that in this experiment we are disturbing the pulse amplitude pattern

in a totally different way such that only one or two pulses out of a total of five are

completely distorted, whereas in the previous experiments all pulses were slightly

affected by the addition of a uniform random noise to the target velocity. In Table

4-8 only some of the entries (denoted by an asterisk) corresponding to the pulses

in Table 4-7 have been corrupted with noise. As an example, the second pulse

in the pulse series of Table 4-7 that corresponds to 0.67 of doppler shift has been

changed from -90.4 to +100. This is a rather large deviation compared to the

maximum amplitude of each pulse which is 100. Note that in Table 4-8 only one

of the pulses is significantly different from the correct ones used for training. A

similar experiment with two corrupted pulses revealed that, if training is based on

noise-free pulses, satisfactory results can be achieved. Although training with noisy

pulses is the ultimate objective of this research, this experiment illustrates the level

of neural network tolerance to deviations of pulses from the correct pattern.

Another important observation in this experiment is that if we train a neural

network with a very fine resolution (i.e., velocity step sizes of less than 1 m/s), then

more noise in the measurement can be tolerated. The significance of this observation

can be further emphasized by noting that in generating the training data we

did not take into account any kind of noise or clutter. This example illustrates that

lowering the step size in training brings the reward of more immunity to deviations

from the true pulse sequence. Furthermore, this benefit of noise immunity can be

realized with only a small number of pulses (5 in these experiments).


Experiment # 3

We will now attempt to train the neural network with the inclusion of clutter

in the training data. A network with a similar structure to what was used in the

previous experiments was trained with 300 vectors. Once again only 5 pulses were

used. However, in this experiment, the pulses were all corrupted with a uniform

clutter distribution characterized by a mean value of 5 m/s, which is equal to the

step size of the target velocity. The main objective of performing this experiment

was to observe the behavior of the neural network to the presentation of corrupted

training data. Note that in all of the traditional methods for MTI processing,

particularly the FFT-based methods, the coincidence of the clutter velocity with

the target velocity resolution has been a major problem. From this experiment we

want to analyze how the neural network will respond to this situation, in which

pulses are directly used as inputs without any preprocessing. Furthermore, we

would like to study the effect of the training set data that has been corrupted

with clutter samples. The results summarized in Table 4-9 show that the neural

network does not have any problem with learning from the corrupted training data.

However, the performance was reduced to 82% (correct classification) as compared

with the situation represented in Table 4-1 (that yielded 89% correct classification).

Therefore, training with clean data results in a better performance.

We now decrease the doppler sensitivity from 6.7 Hz/m/s to 0.7 Hz/m/s in

order to extend the range of the velocities. Recall that for lower doppler sensitivity

we have a wider range of velocities before we run into the aliasing problem (i.e.,


blind speed zones) *. We generated 300 training vectors which were composed of

variable step sizes in velocity. That is, instead of using a constant step size as

before, we used 10 levels with 1 m/s increments, 5 levels with 8 m/s increments,

and 5 other levels with 15 m/s increments. A total of 20 examples were generated

for each velocity level. The network consisted of 5 input nodes which received the

5 input pulses, 6 hidden nodes, and one output node for representing the doppler.

The results of this training are illustrated in Fig. 4.6. The average error for

doppler shift was 43.5 Hz, which corresponds to 43.5/0.7 ≈ 62.1 m/s in the target

velocity. The vertical axis in Fig. 4.6 denotes the normalized doppler shift of

each group of pulses which is caused by the velocity that they represent. That is,

each point in this figure refers to a series of pulses that were processed in parallel.

The performance of the NN-MTI in providing a close approximation of the doppler

shift underscores the capability of neural networks in MTI applications. It must

be noted in comparison that the traditional pulse cancelers can only tell whether

the target is moving or not without providing any information about the target

velocity. Therefore, despite the apparent deviation from the desired performance,

this is still much better than the response of an ordinary pulse canceler.
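The conversion between doppler error and velocity error quoted above can be checked with a short sketch. It assumes the usual linear relation between doppler shift and radial velocity, and that the first blind speed occurs where the doppler shift equals the PRF. The 1 ms pulse interval is borrowed from Experiment 4 and is only an assumption for this experiment; the function names are illustrative.

```python
def doppler_to_velocity(doppler_hz, sensitivity_hz_per_mps):
    """Convert a doppler shift (Hz) to a radial velocity (m/s)."""
    return doppler_hz / sensitivity_hz_per_mps


def first_blind_speed(pulse_interval_s, sensitivity_hz_per_mps):
    """Velocity whose doppler shift equals the PRF (first aliasing point)."""
    prf_hz = 1.0 / pulse_interval_s
    return prf_hz / sensitivity_hz_per_mps


# Average doppler error reported in the text, expressed as a velocity error.
velocity_error = doppler_to_velocity(43.5, 0.7)

# Assumed pulse interval of 1 ms (PRF = 1 kHz); not stated for this experiment.
blind_high_sensitivity = first_blind_speed(1e-3, 6.7)
blind_low_sensitivity = first_blind_speed(1e-3, 0.7)

print(round(velocity_error, 1))
print(round(blind_high_sensitivity, 1), round(blind_low_sensitivity, 1))
```

Under these assumptions the lower doppler sensitivity pushes the first blind speed out by roughly a factor of ten, which is the wider unambiguous velocity range referred to in the text.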

The particular situation where the clutter velocity corresponds to the
doppler resolution of the MTI filter has attracted particular attention. Even the
FFT-based algorithms perform very poorly in this situation. One can see that
the parallel processing of the pulses provided by the neural network helps in
recognizing the pulse amplitude pattern corresponding to each velocity even when
clutter is present at the input during the training phase. In comparison
to training with a noise-free data set, the results indicate that training a neural
network with noise or clutter samples superimposed on the training vectors will be
more efficient only if some additional parameters are included in the input. We
will discuss this aspect in greater detail in the experiments to follow.

* Presence of blind speed zones in the training data keeps the neural network from learning.
One way to get around this problem (i.e., confusion of the network in the blind zones) is to
manually assign the correct decision to resolve the conflict in training data by slightly perturbing
the data in a deterministic fashion. We leave this for a later time.

Before we close this section, we conclude the following from the series of

experiments that were conducted thus far. First, it was demonstrated that there

is sufficient information for the neural network to extract the doppler shifts from

the pulse amplitude distribution. In other words without the need to employ the

FFT algorithm, a neural network-based procedure can be designed to offer an

alternative solution which is much faster and applies to more general forms of

signals received. (By more general forms of signals we mean that one can perform

more sophisticated coding and make use of more complex modulation techniques in

order to provide immunity for the pulse sequences against noise and clutter and yet

perform an on-line extraction of the doppler shift with simpler hardware design).

Secondly, by using arbitrarily smaller step sizes in the training data, one can gain

more robustness against the variation of pulse amplitudes due to noise or any other

source of unwanted pulse modulation (e.g., eclipsing of pulses at the receiver). The

other interesting observation was that it is possible to achieve a variable bandwidth

doppler filter bank (as opposed to the FFT algorithm which provides a uniform

bandwidth filter bank). The advantage of a variable bandwidth filter bank is that

different patterns of clutter spectrum can be removed. It remains to make use

of some methodology to cross-correlate the pulses such that any residual error

due to clutter is decorrelated from the pulse sequence before the pulses are used

for training. This can be done in a number of ways [53,69,70] such as the use

of a) modulation techniques, b) statistical parameters, and c) waveform coding.

Furthermore, it appears that some of these methods can themselves be implemented

by a neural network (e.g., Pulse Coding).

4.4.2. Implementation of Pulse Canceler With Neural Networks

As was discussed previously in this chapter, the digital Pulse Canceler is a
classical time-domain approach to the MTI design as compared to the FFT algorithm,
which is a frequency-domain approach. The operation of the Pulse Canceler
is based on the integration of binomially weighted pulses. The output of a digital
Pulse Canceler is a signal on the radar screen that indicates the moving target. As
more pulses are used in a Pulse Canceler, a more uniform (i.e., flat) MTI response
will be achieved. A uniform response indicates that targets moving at different
velocities are equally detected. The problems arising from the presence of noise and
clutter must be differentiated here. A high signal-to-clutter ratio (S/C) indicates
that less extraneous scattered data will appear on the radar screen while a high
signal-to-noise ratio (S/N) is manifested as a brighter spot at each instant that
the target is detected. With a Pulse Canceler, the clutter around the dc level (i.e.,
slowly varying clutter) will be canceled. The problem with the Pulse Canceler is
that it does not provide the doppler shift. Moreover, the Pulse Canceler is not
as efficient when some of the pulses are missing (i.e., Pd < 1). That is, it only
enhances the signal-to-clutter ratio (S/C) and not the signal-to-noise ratio. While
the clutter spread is more concentrated near the dc range, the noise spectrum may be
spread over the entire spectrum. In the last section we mentioned that a modulation
of some kind is required to preprocess the pulses before they are used as the
inputs to the neural network.
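The cancellation of dc clutter by binomial weighting can be illustrated with a minimal sketch. It assumes the standard textbook form of the canceler, in which the binomial coefficients carry alternating signs (for example 1, -2, 1 for three pulses). The clutter level, doppler shift, and pulse interval below are hypothetical values chosen only for illustration.

```python
from math import comb, pi, sin


def binomial_weights(n_pulses):
    """Alternating-sign binomial coefficients, e.g. [1, -2, 1] for 3 pulses."""
    order = n_pulses - 1
    return [(-1) ** k * comb(order, k) for k in range(n_pulses)]


def pulse_canceler(pulses):
    """Weighted sum of the pulses using the binomial weights."""
    weights = binomial_weights(len(pulses))
    return sum(w * p for w, p in zip(weights, pulses))


# Hypothetical inputs: constant (dc) clutter and a target with a 200 Hz
# doppler shift sampled every 1 ms.
clutter = [40.0] * 5
target = [100.0 * sin(2 * pi * 200.0 * n * 1e-3) for n in range(5)]
combined = [c + t for c, t in zip(clutter, target)]

print(binomial_weights(5))
print(pulse_canceler(clutter))
print(round(pulse_canceler(combined), 1))
```

The stationary clutter sums to zero while the doppler-modulated target leaves a residue. The output is a single number per pulse group, which is why this scheme indicates motion without reporting the doppler shift itself.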

Experiment # 4

To examine the neural network response to binomially weighted pulses, we

provided 500 training vectors that included 25 different velocity levels covering the

range [0,125] with 20 examples from each. The step size for the velocity increments

was 5 m/s and the average clutter velocity was set to 0. The time interval between

the pulses was 1 millisecond. The neural network architecture comprised 5 input

nodes, 10 hidden nodes with a nonlinear activation function which was selected as

a two-sided sigmoid function, and one output node which performed the weighted

sum of the pulses from the hidden nodes. A similar network with only 6 hidden

nodes learned with about the same degree of effort and resulted in almost the same

minimum error. One may think that a neural network is just learning to be a

summer in this case. Although this is true, there are more advantages than just

adding the pulses if some classification process takes place at the same time. We

did not include clutter samples in the training data for the reasons discussed before.

Inclusion of clutter data in training requires coding, modulation, or some statistical

treatment. The main objective of this experiment was to train the neural network

to perform the function of a Pulse Canceler. The primary desired feature of the neural

network here was to achieve robustness to noise as we discussed in the previous

experiments.
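The architecture described for this experiment (5 inputs, 10 hidden nodes, one summing output) can be sketched as a forward pass. This is only an illustration under stated assumptions: the two-sided sigmoid is taken to be the hyperbolic tangent, bias terms are added, the weights are random and untrained, and the input amplitudes are scaled to the range [-1, 1]. None of these details is confirmed by the text.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights plus one bias per output node (assumed layout)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(pulses, hidden_layer, output_weights):
    """Forward pass: tanh hidden nodes followed by a linear weighted sum."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(node[:-1], pulses)) + node[-1])
        for node in hidden_layer
    ]
    # Output node: weighted sum of the hidden activations plus a bias.
    return (sum(w * h for w, h in zip(output_weights[:-1], hidden))
            + output_weights[-1])


rng = random.Random(0)
hidden_layer = init_layer(5, 10, rng)        # 5 inputs -> 10 hidden nodes
output_weights = init_layer(10, 1, rng)[0]   # 10 hidden nodes -> 1 output

pulses = [0.309, 0.588, 0.809, 0.951, 1.0]   # one group of 5 scaled pulses
estimate = forward(pulses, hidden_layer, output_weights)
print(estimate)
```

Training (for instance by backpropagation) would adjust `hidden_layer` and `output_weights`; the sketch only shows how all five pulses reach every hidden node at once, which is the parallel processing the text relies on.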

The binomially weighted pulses will provide an improvement in the signal-to-clutter
ratio, while the neural network properties of associative memory and
parallel processing of the pulses will help tolerate the magnitude distortion
of pulses in the presence of noise (i.e., when Pd ≤ 1), which improves the
signal-to-noise ratio as well. It must be emphasized that in a Pulse Canceler which

is implemented by a transversal filter, the interrelation of pulses as a group is

not accounted for. On the other hand, the trained neural network can play an

important role in providing a graceful recovery of the doppler information which

is embedded in the amplitude distribution of the pulse train. That is, the group

of pulses that hold the doppler information constitute a pattern and hence when

a pulse is missing in the sequence or if its magnitude has been distorted by noise,

which is usually the case in practice, the neural network implementation of MTI will

outperform the classical pulse cancellation methods. It must also be emphasized

that with the neural network approach, processing the pulses in parallel further

helps relate each pulse to all other pulses through the connection weights of the

hidden nodes.

Another advantage of the neural network implementation of the Pulse Canceler

(NN-PC) is the achievement of fast on-line response. We reserve the term NN-PC

for a class of NN-MTI which is trained to function like an ordinary pulse canceler

while the term NN-MTI will be used with a more general meaning. To demonstrate

the capability of a trained neural network pulse canceler, we generated a series

of undistorted pulses which were amplitude modulated by doppler effect and used

them as inputs to NN-PC (which was trained to merely indicate the target motion)

as well as the conventional PC (i.e., the one with binomially weighted pulses as

depicted in Fig. 4.3). As can be seen in Fig. 4.7, the two schemes identically

responded to each group of pulses (5 pulses in each group) and provided the correct

indication that the target is moving. However, any variation in the target speed

cannot be classified by the conventional PC since it does not have a flat frequency

response (see Section 2.8.3). Therefore, the vertical changes in Fig. 4.7 do not

correspond to the actual target speed. The NN-PC, however, can be trained to

classify the target speed. To illustrate this feature of NN-PC, we conducted a

separate test on another NN-PC which was trained to classify a slow target and a

fast target by setting the output to a numerical value of 10 for the velocity range

of [20 m/s, 120 m/s] and zero otherwise. The performance is illustrated in Fig. 4.8,

where the response of NN-PC is compared to the true (i.e., desired) response.

Another experiment with the binomially weighted pulses in the conventional

PC revealed that one cannot extract doppler shifts from these pulses, whereas this

can be done by direct input of the unweighted pulses to the neural network as

was discussed previously. It is an important observation that a neural network

does not learn doppler shifts from the binomially weighted pulses while performing

more efficiently when the pulses are directly fed as received. We conclude that the

role of binomially weighted pulses, when used in the neural network training, is

merely to cancel any existing clutter and that doppler shifts cannot be extracted

from them in the way that they are extracted in ordinary pulse cancelers. As

we will show in another experiment, if the sum of the binomially weighted pulses

is used as an input to the neural network, a more efficient processing can result,

particularly when other parameters are also used as inputs in conjunction with

it. Note that the binomial pulses have been amplitude modulated two times, once

with the doppler variation and then weighted again by the binomial coefficients

to achieve robustness to clutter. This is why training a neural network to extract

doppler shifts from binomially weighted pulses is more difficult than training with

the unweighted pulses.

4.4.3. Analysis of NN-MTI Design With PRF Switching

Up to this point, our main objective was to study the mapping of a set of

coherent pulses that have been amplitude modulated with doppler shifts onto the

corresponding doppler shifts. Use of a single PRF (i.e., uniform time interval between

pulses) has the disadvantage that the range of velocities that can be detected

without ambiguity is limited. The ambiguity is due to the aliasing problem which

is a consequence of uniform sampling. The PRF switching has been a traditional

method for overcoming this ambiguity. However, there exist some problems with

non-uniform sampling. Some of these are: a) the underlying theory is not well

developed for a clear procedural design of the MTI filter, which means that the

analysis of the PRF switching with Fourier transform methods is very difficult, and

b) the frequency response of the multiple PRF pulse sequence is not flat, which

leaves a lot of room for further development. As will be seen in this section, the

flexibility offered by a neural network implementation can provide a new way of

forming a desired MTI frequency response when multiple PRFs are employed.

Experiment # 5

A neural network with 9 input nodes and 10 hidden nodes with nonlinear

activation functions similar to those used in earlier experiments was trained. The

training examples were generated based on two sets of pulse sequences. Each pulse

sequence was composed of 3 individual pulses. The only difference between the two

sets of pulse sequences was the PRF; specifically, the time interval between the first

set of pulses was 1 millisecond whereas it was 0.863 milliseconds for the second set.

Furthermore, the pulses in each set were multiplied by binomial weights. Recall

that the multiplication of the pulse amplitudes by the binomial weights adds to the

clutter visibility and is used in the traditional methods (e.g., the pulse canceler).

However, as it was illustrated in an earlier section, direct use of the binomially

weighted pulses as inputs to the neural network does not result in a satisfactory

training.
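The benefit of switching between the two pulse intervals quoted above can be made concrete with a small sketch. It assumes the idealized textbook condition that a target is blind to a staggered system only when its doppler shift is a multiple of both PRFs, and it uses the doppler sensitivity of 6.67 Hz/m/s quoted for this experiment. The helper names are illustrative, and the result is an idealized figure rather than a statement about the response between blind speeds.

```python
from fractions import Fraction
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def first_common_blind_doppler(interval_1, interval_2):
    """Lowest doppler shift (Hz) that is a multiple of both PRFs.

    Intervals are exact Fractions in seconds so the result is exact.
    """
    prf_1, prf_2 = 1 / interval_1, 1 / interval_2
    return Fraction(lcm(prf_1.numerator, prf_2.numerator),
                    gcd(prf_1.denominator, prf_2.denominator))


t1 = Fraction(1, 1000)            # 1 ms
t2 = Fraction(863, 1_000_000)     # 0.863 ms
sensitivity = Fraction(667, 100)  # Hz per m/s, as quoted for Experiment 5

single_prf_blind_speed = (1 / t1) / sensitivity
staggered_blind_speed = first_common_blind_doppler(t1, t2) / sensitivity

print(float(single_prf_blind_speed))
print(float(staggered_blind_speed))
```

Because 1000 and 863 share no common factor, the first common blind speed moves out by a factor of 1000 relative to the single 1 ms interval under this idealization.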

We have already discussed some reasons for the failure of neural networks

in learning directly from the binomially weighted pulses. Despite this, we included

binomial weighting in this experiment in order to further enhance the robustness of

NN-MTI to clutter. In this experiment we also provided some additional information

to further evaluate the efficacy of using binomial weighting. This additional

information consisted of three parameters, viz., 1) the mean of the pulses which

was calculated over the total of 6 pulses, 2) the mean of the pulses for the first

sequence, and 3) the mean of the pulses for the second sequence. We conducted

this experiment both with and without the binomial weightings. Doppler sensitivity

was set to 6.67 Hz/m/s and the velocity increment was 5 m/s. A total of

25 different velocity levels were generated with 5 m/s increments and 20 examples

from each level were included in the training set. This gave rise to a total of 500

training examples. The output was composed of three different classes of radial

velocity i.e., slow, medium, and fast moving targets.
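The nine network inputs described above (two sequences of three pulses plus three means) can be assembled as in the following sketch. It assumes three-pulse binomial weights of (1, -2, 1) and assumes that the means are computed over the pulses as they are presented to the network; the text does not state whether the means are taken before or after weighting. All names are hypothetical.

```python
def binomial_weighted(pulses):
    """Apply assumed three-pulse binomial weights (1, -2, 1)."""
    return [w * p for w, p in zip((1, -2, 1), pulses)]


def mean(values):
    return sum(values) / len(values)


def build_input(seq_1, seq_2, use_binomial=True):
    """Return the 9-element input: 6 pulses, overall mean, mean of each set."""
    if use_binomial:
        seq_1, seq_2 = binomial_weighted(seq_1), binomial_weighted(seq_2)
    all_pulses = list(seq_1) + list(seq_2)
    return all_pulses + [mean(all_pulses), mean(seq_1), mean(seq_2)]


# Toy pulse values, one sequence per PRF.
unweighted = build_input([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], use_binomial=False)
weighted = build_input([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], use_binomial=True)
print(unweighted)
print(weighted)
```

The three means are deterministic functions of the six pulses, so they add no new measurements; they only give the network an explicit summary of each sequence alongside the individual amplitudes.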

In these experiments, we did not include clutter in the training data. The

results of training with and without binomial weighting are illustrated in Figs.

4.9 and 4.10, which show no difference in the performance of the two methods.

However, what is remarkable in this experiment is that despite the fact that the

neural network could not learn from direct presentation of binomially weighted

pulses (as illustrated earlier), we have found a way to include them in training.

This is through the inclusion of the arithmetic mean of each set of pulses and the

overall mean of all the pulses which serve to create a deterministic relation between

the two sources of modulation hence preventing any possible conflict in the training

data. There are two important achievements here which need to be emphasized.

One is the flat frequency response that can be obtained by the neural network

scheme regardless of the multiple PRFs used in the pulse sequence. The other is a

possible increase in clutter visibility which will be discussed next.

In order to underscore the use of binomial weighting for some advantage, we

repeated this test with the inclusion of clutter in the test vectors. We added moving

clutter samples to the pulses such that each clutter sample imposed a random

doppler shift in addition to the true doppler shift of each pulse. The average

velocity of clutter samples was set at 30 m/s. A performance comparison of the

NN-PC trained with binomially weighted pulses (as described above) with the case

when the pulses were not weighted is illustrated in Fig. 4.11. This is where it does

make a difference to include a clutter visibility feature (i.e., additional weighting of

pulses with binomial coefficients). The advantage of combining the clutter visibility

feature of pulse cancelers with the noise immunity, flat frequency response, as well

as many other design flexibilities offered by a trained neural network is clear from

these experiments.

4.5. Conclusion

Biological systems (e.g., insects, birds, flies) have capabilities beyond those

of the conventional MTI processors. Insects, for example, can easily detect a moving

target as it approaches them. Furthermore, the size of an insect is not even

comparable to the size of the simplest MTI filter. It is hence of interest to the

radar community to explore how the functions of an MTI processor can be modeled

by Artificial Neural Networks (ANN). Our objective in this chapter was to

initiate a thorough investigation of radar pulse processing with neural networks.

Although techniques of radar signal processing, including MTI, have been vastly

improved by the availability of digital computers in recent years, these methods are

generally based on complex mathematical procedures which make the engineering

and design of radar receivers rather costly and vulnerable to electronic faults (e.g.,

loose connections, short circuits). Work reported in this chapter has provided some

evidence that doppler filter banks can easily be implemented with neural networks

even in situations where a limited number of pulses are available for processing. It

was shown that training a neural network for MTI filtering is more successful when

clean data is used.

We showed that, in contrast to the ordinary pulse cancelers which are based

on optimally weighting the input pulses, the neural network can do the weighting

within the process of its training and make use of binomial weighting as an

additional factor to further enhance its performance in clutter. Furthermore, it

is possible to shape the frequency response of the NN-MTI as desired without

needing the complex process of pole placement, which is traditionally required in

both digital and analog filter design procedures. A rather important feature that

was explored during the course of these experiments is that non-uniform sampling

(multiple PRFs) can be efficiently handled with a trained neural network. Also,

a variable-bandwidth doppler filter bank is much simpler to implement with neural

networks when compared to the case with linear transversal filters. Since the

fault tolerance of neural networks in the face of loss of connection weights has

already been established in the literature [5,83], we did not conduct any further

comparisons with the traditional MTI filters when some of their connections are

lost.

As a final note in this chapter, the Fast Fourier Transform (FFT) technique

has established itself as a valuable tool for digital processing of radar signals. Although

the FFT-based MTI filters were not thoroughly addressed in this chapter,

we outlined some of the outstanding features of the neural network-based methods

which are not easily available in FFT-based methods. Our experiments with neural

network processing of coherent radar pulses have revealed that neural networks

provide convenient mechanisms for alternative modulation and pulse transform

techniques with more attractive engineering design features. A good example is

the parallel processing of pulses in time, space, and frequency domain which is in

demand for future distributed sensor systems.

Fig 4.1 (a) Two-pulse canceler; (b) three-pulse canceler
[Block diagrams: input, delay line(s), output.]

Fig 4.2a Relative frequency response of the single-pulse canceler (solid
curve) and the two-pulse canceler (dashed curve).
[Plot: horizontal axis is frequency.]

Fig 4.2b Amplitude response for (1) three-pulse canceler, (2) five-pulse
canceler, (3) 15-pulse canceler
[Plot: horizontal axis is frequency.]

Fig 4.3 General form of a transversal filter for MTI processing
[Block diagram: input, delay elements, summer, output.]

Fig 4.4 Frequency response of pulse cancelers with two distinct
PRF (a) single-pulse; (b) five-pulse
[Plots: (a) horizontal axis is frequency, marked in multiples of 1/T1 and 1/T2;
(b) horizontal axis is target velocity relative to first blind velocity at fixed prf,
with curves for fixed prf and staggered prf.]

Fig 4.5 (a) Radar pulse train; (b) video pulse train for doppler frequency
fd > 1/τ; (c) video pulse train for doppler frequency fd < 1/τ.

Table 4-1 NN-MTI classification of doppler shift with 5 m/s average error in test vectors with 89% correct classification

  P(1)    P(2)    P(3)    P(4)    P(5)    Desired Solution    NN Solution
  30.9    58.8    80.9    95.1   100.0         30.0              29.9
  95.1    58.8   -58.8   -95.1     0.0         20.0              19.9
  30.9   -58.8    80.9   -95.1   100.0         10.0              14.1

Table 4-2 NN-MTI classification of doppler shift with 6 m/s average error in test vectors with 76% correct classification

  P(1)    P(2)    P(3)    P(4)    P(5)    Desired Solution    NN Solution
  24.9   -48.2    68.4   -84.4    95.1         30.0              13.8
 -48.2   -84.4   -99.8   -90.5   -58.8         30.0              29.6
  68.5   -99.8    77.1   -12.5   -58.8         20.0              18.1
 -84.4   -90.5   -12.6    77.1    95.1         20.0              19.9
  95.1   -58.8   -58.8    95.1     0.0         20.0              18.2
 -99.8   -12.5    98.2    24.8   -95.1         20.0              19.1
  98.2    36.9   -84.5   -68.5    58.8         20.0              23.6
 -90.5    77.1    24.9   -98.2    58.8         10.0               7.8
  77.0    98.2    48.2   -36.7   -95.0         10.0              30.0
 -58.8    95.1   -95.1    58.8     0.0         10.0               9.2

Table 4-3 NN-MTI classification for 10 independent pulses with 92% correct classification

  P(1)    P(2)    P(3)    P(4)    P(5)    P(6)    P(7)    P(8)    P(9)   P(10)   Desired     NN
  30.9    58.7    80.9    95.1   100.0    95.1    80.9    58.7    30.8     0.0     30.0     29.9
  95.1    58.8   -58.7   -95.1     0.0    95.1    58.7   -58.7   -95.1     0.0     20.0     19.9
  30.9   -58.7    80.9   -95.1   100.0   -95.1    80.9   -58.7    30.9     0.0     10.0     13.0

Table 4-4 NN-MTI classification for 10 pulses with 78% correct classification

  P(1)    P(2)    P(3)    P(4)    P(5)    P(6)    P(7)    P(8)    P(9)   P(10)   Desired     NN
  24.8   -48.1    68.4   -84.4    95.1   -99.8    98.2   -90.4    77.0   -58.7     30.2     15.0
 -48.1   -84.4   -99.8   -90.4   -58.7   -12.5    36.8    77.0    98.2    95.1     30.0    -0.15
  68.4   -99.8    77.2   -12.5   -58.7    98.2   -84.4    24.8    48.1   -95.1     20.0     18.3
 -84.4   -90.4   -12.5    77.0    95.1    24.9   -68.4   -98.2   -36.8    58.7     20.0     -1.9
  95.1   -58.7   -58.7    95.1     0.0   -95.1    58.7    58.7   -95.1     0.0     20.0     18.6
 -99.8   -12.5    98.2    24.8   -95.1   -36.8    90.4    48.1   -84.3   -58.7     20.0      9.5
  98.2    36.8   -84.4   -68.5    58.7    90.4   -25.0   -99.8   -12.4    95.1     20.0     13.5
 -90.4    77.0    24.9   -98.2    58.7    48.2   -99.8    37.0    68.5   -95.1     10.0      9.6
  77.0    98.2    48.1   -36.6   -95.0   -84.4   -12.6    68.2    99.8    59.1     10.0      8.9
 -58.7    95.1   -95.1    58.7     0.0   -58.7    95.1   -95.1    58.7     0.0     10.0      9.8

Table 4-5 Two step classification of slow & fast moving targets with 94% correct classification

  P(1)    P(2)    P(3)    P(4)    P(5)    Desired Solution    NN Solution
  30.9    58.7    80.9    95.1   100.0          0.0              0.001
  58.7    95.1    95.1    58.7     0.0          0.0              0.001
  80.9    95.1    30.9   -58.7  -100.0          0.0              0.001
  95.1    58.7   -58.7   -95.1     0.0          0.0              0.001
 100.0     0.0  -100.0     0.0   100.0          0.0              0.03
  95.1   -58.7   -58.7    95.1     0.0         10.0             10.0
  80.9   -95.1    30.9    58.7  -100.3         10.0              9.9
  58.7   -95.1    95.1   -58.7     0.0         10.0              9.9
  30.9   -58.7    80.9   -95.1   100.0         10.0             10.0

Table 4-6 Low resolution doppler shift extraction by NN-MTI with step size = 5 m/s

  P(1)    P(2)    P(3)    P(4)    P(5)    Desired Solution    NN Solution
  53.5   -90.5    99.2   -77.0    30.9         0.67              0.53
 -90.5   -77.0    24.9    98.2    58.8         1.34              1.56
  99.2    24.9   -92.9   -48.2    80.9         2.01              1.58
 -77.0    98.2   -48.1   -36.8    95.1         2.68              2.4
  30.9    58.8    80.9    95.1   100.0         3.35              3.1
  24.9   -48.2    68.4   -84.4    95.1         4.02              3.79
 -72.9   -99.8   -63.7    12.5    80.9         4.69              4.12
  98.2   -36.8   -84.4    68.4    58.8         5.36              5.16
 -92.9    68.4    42.6   -99.8    30.9         6.03              6.45
  58.8    95.1    95.1    58.8     0.0         6.7               6.37

Table 4-7 High resolution doppler shift extraction by NN-MTI with step size = 1 m/s

  P(1)    P(2)    P(3)    P(4)    P(5)    Desired Solution    NN Solution
  53.5   -90.4    99.2   -77.0    30.9         0.67              0.67
 -90.4   -77.0    24.9    98.2    58.8         1.34              1.34
  99.2    24.9   -92.9   -48.2    80.9         2.01              2.01
 -77.0    98.2   -48.2   -36.8    95.1         2.68              2.68
  30.9    58.7    80.9    95.1   100.0         3.35              3.35
  24.9   -48.2    68.4   -84.4    95.1         4.02              4.02
 -72.9   -99.8   -63.7    12.5    80.9         4.69              4.69
  98.2   -36.8   -84.4    68.4    58.7         5.36              5.36
 -92.9    68.4    42.6    -9.9    30.9         6.03              6.03
  58.8    95.1    95.1    58.8     0.0         6.70              6.70

Table 4-8 NN-MTI performance for a probability of detection less than one
(* indicates that the pulse is noisy)

  P(1)    P(2)    P(3)    P(4)    P(5)    Desired Solution    NN Solution
  53.5   100.0*   99.2   -77.0    30.9         0.67              0.88
 -90.4   -77.0     0.0*   98.2    58.8         1.34              1.57
   0.0*   24.9   -92.9   -48.2    80.9         2.01              1.84
 -77.0    98.2   -48.2   -36.8     0.0*        2.68              3.68
  30.9    58.7    80.9     0.0*  100.0         3.35              2.76
   0.0*  -48.2    68.4   -84.4    95.1         4.02              4.12
 -72.9     0.0*  -63.7    12.5    80.9         4.69              4.27
  98.2   -36.8     0.0*    0.0*   58.7         5.36              4.57
 -92.9    68.4    42.6    -9.9     0.0*        6.03              6.20
  58.8    95.1     0.0    58.7     0.0*        6.70              6.47

Table 4-9 NN-MTI classification in presence of clutter with 82% correct classification

  P(1)    P(2)    P(3)    P(4)    P(5)    Desired Solution    NN Solution
  30.9    58.8    80.9    95.1   100.0         30.0             30.0
  58.8    95.1    95.1    58.8     0.0         30.0             30.0
  80.9    95.1    30.9   -58.8  -100.0         30.0             29.9
  95.1    58.8   -58.8   -95.1     0.0         20.0             19.9
 100.0     0.0  -100.0     0.0   100.0         20.0             18.9
  95.1   -58.8   -58.8    95.1     0.0         20.0             18.8
  80.9   -95.1    30.9    58.8  -100.0         20.0             18.9
  58.8   -95.1    95.1   -58.8     0.0         20.0             18.8
  30.9   -58.8    80.9   -95.1   100.0         10.0             14.6
   0.0     0.0     0.0     0.0     0.0         10.0              9.9
 -30.9    58.8   -80.9    95.1  -100.0         10.0             12.7
 -58.8    95.1   -95.1    58.8     0.0         10.0              9.18
 -80.9    95.1   -30.9   -58.8   100.0         10.0              9.11
 -95.1    58.8    58.8   -95.1     0.0         10.0              9.14
-100.0     0.0   100.0     0.0  -100.0         10.0              9.08

Fig. 4.6 NN-MTI response for variable step sizes in training data and identical
velocity error for pulse groups in the test data
[Plot: Desired MTI Response and NN-MTI Response versus Coherent Pulse Group.]

Fig. 4.7 Comparison of the NN-PC with a conventional
Pulse Canceler for indicating the target motion only
[Plot: Pulse Canceler Response and NN-PC Response versus Coherent Pulse Group.]

Fig. 4.8 Performance of NN-PC in separation of slow and fast targets
[Plot: Desired Response and NN-PC Response versus Coherent Pulse Group.]

Fig. 4.9 NN-PC performance with PRF switching and binomially weighted pulses
(no clutter)
[Plot: Desired Response and NN-PC Response versus Coherent Pulse Group.]

Fig. 4.10 NN-PC performance with PRF switching and unweighted pulses
(no clutter)
[Plot: Desired Response and NN-PC Response versus Coherent Pulse Group.]

Fig. 4.11 Performance of NN-PC with and without use of the binomial weights
for target velocity classification in presence of heavy clutter
[Plot: Binomial Weighting and No Binomial Weighting versus Coherent Pulse Group.]

CHAPTER 5

TARGET TRACKING BY

NEURAL NETWORK MANEUVER MODELING

5.1. Introduction

Target tracking systems that operate in a track-while-scan mode have great

difficulties in maintaining the track when the target performs unpredictable maneuvers.

A maneuver is a sudden change in acceleration which can take place

in different directions depending on the capabilities of the target being tracked.

Tracking a single target would be a simple task if there were no maneuvers. Target

maneuvers add significant complexities to the signal processing required for

tracking. The complexity arises due to the lack of measurements of the target acceleration.

Radar measurements are limited to range, angle, and sometimes the

radial velocity. These measurements are further corrupted by noise and background

clutter.

Ordinarily a Kalman filter is used to estimate the true position of the target

in an optimal fashion as long as the measurement noise is Gaussian. Sudden

accelerations, however, cause a bias in the measurement sequence. Unless this bias

is compensated for, the filter will diverge and the true track will be lost. Tracking

a maneuvering target in a cluttered background constitutes a difficult problem

that has been addressed in the literature for many years. As outlined in Chapter

2, several different methods have been introduced to model target accelerations.

The classical methods are mainly based on one statistical parameter that is used

to detect the presence of the maneuver. Upon detection of the maneuver, an

artificial noise is generated to substitute for the true acceleration in obtaining

future estimates.

Use of a single parameter often requires that the error be propagated over

several samples in the past for correction of the previous estimates. This is due

to the fact that the artificial noise cannot be correctly generated unless enough

samples are received. The waiting time for more samples can however result in a

total loss of the track since the target can begin a new maneuver. If the target

begins a new maneuver before the first one is compensated for, the filter will never

converge. Therefore, most of the proposed algorithms in the current literature have

the disadvantage of losing the target in situations of short term accelerations, in

which the duration of acceleration is comparable to the time period between the

measurements. One method to resolve this problem appears to be the use of more

features in the estimation process so that fewer samples would be required. This is

a formidable task for the current algorithms because of the computational effort involved and

because maneuver handling is a real-time problem. The issue of the practical implementation

of tracking algorithms has also attracted some attention by itself in the literature

on target tracking. For example, Fitzgerald [37] and [101-103] discusses some of

the computational requirements needed by the current algorithms and addresses

the limitations of existing microprocessors for practical implementations. With the

advent of the neural network technology, some of these difficulties can be removed

in an efficient way. This makes the neural network approach more appealing to

target tracking problems.

The objective of the research reported in this chapter is to design and evaluate

a neural network-based maneuver modeling scheme that requires only a small

number of samples for the compensation of bias in the measurement sequence.

A multilayer feedforward network with backpropagation learning is used.

For simplicity in the illustration of basic ideas, only longitudinal acceleration will

be considered, although the approach can be readily extended to other types of

maneuvers. The neural network uses three input parameters in order to identify

the presence of a longitudinal acceleration. Upon detection of the acceleration, the

amount of noise required to compensate for the bias is generated by the network.

The correct estimates of the target position and velocity are then recalculated using

the output of the neural network only for one time step in the past. The proposed

design has the following primary advantages:

1) It has a quick response for short-term accelerations.

2) Detection and compensation for the maneuver are done in one step.

3) The neural network controller works in conjunction with a Kalman filter

giving rise to a hybrid tracking system. Therefore, the Kalman filter can be

kept simple with only position and velocity as the states.

5.2. Neural Network Implementation of Maneuver Modeling

The principal idea behind the use of a neural network in this application can

be described as follows. In tracking a target using a Kalman filter, the main source

of divergence is the bias which is introduced by the target maneuver, especially

when the tracking filter has reached a steady state and the filter gain has low

values. We use a neural network as an adaptive mechanism to help adjust the

filter gains in the presence of accelerations. This provides a hybrid approach to

compensate for the bias in the Kalman filter estimation of states. That is, we

retain the Kalman filter as the main filter for tracking while the neural network

is employed to detect the presence of an acceleration and make up for the bias.

The estimation of states is still performed by the Kalman filter and the neural

network helps adjust the target dynamical model only when a maneuver (i.e.,

sudden acceleration) is detected. Furthermore, detection of a maneuver is done by

the neural network through the use of a normalized innovation parameter that is

generated by the Kalman filter.

Most of the existing algorithms traditionally use the innovation sequence

for the estimation of this noise [38,104,105]. It turns out, however, that the innovation

sequence is only one of several parameters that can be used for obtaining

an indication of maneuver. There are other parameters each of which is capable

of indicating a different characteristic of the target maneuver and these have been

used separately by different algorithms, such as the Heading Assisted Filter [39]

which relies on the target heading estimate. In our approach, we use two other

parameters in addition to the innovation sequence to further improve the detection

and compensation of the bias in the Kalman filter.

Classifying the type of maneuver performed by the target using more than

one parameter is what makes the neural network quite appealing. Our approach is

to employ some of these parameters as inputs to a multilayer neural network that

is trained to generate the required compensating noise signal. For simplicity, we

confine our discussion to the longitudinal acceleration only. However, our study

shows that other types of maneuvers (e.g., circular, sinusoidal, etc.) can also be

modeled with this approach.

The target dynamical model, as will be described below, is adjusted through

adding the noise components $u_x$ and $u_y$ generated by the neural network. It is this

noise that different algorithms such as the ones proposed by Singer [40] and Bogler

[41,72] try to model. We have already discussed in Chapter 2 some of the major

problems with these methods. The goal here is to employ a neural network

to overcome these problems, which can potentially enhance the performance of

these methods. It is important to include the coupling of the acceleration components

in the model for target acceleration. These components are coupled because,

after the transformation of range and angular measurements from

polar to Cartesian coordinates, the measurement errors are no longer independent.

Bogler's method, for example, lacks the coupling effect of the acceleration

components. According to Bogler [72], other methods have been proposed in the

literature; however, they are computationally inefficient. With a neural

network-based procedure on the other hand, both components can be generated

through the same network and hence they can be used directly to update the target

dynamical model. That is, a single network is used for estimation of both

components and no further processing is needed to include the effect of coupling.

5.2.1. Problem Formulation

For a precise description of the problem and the parameters that will be used

in the neural network-based maneuver modeling, let us consider a two dimensional

tracking situation in which the state vector consisting of the positions and velocities

of the target in the two coordinates is given as

$x^T(k) = [\,x(k)\ \ \dot{x}(k)\ \ y(k)\ \ \dot{y}(k)\,]$ (5.1)

and the state equation is represented by the dynamical model

x(k + 1) = Fx(k) + Gu(k) + v(k). (5.2)

The second and the third terms in the above model refer to an acceleration

input and an additional correction factor respectively. The acceleration input

however is unknown to us and it has to be estimated with some uncertainty. The

matrix F is termed the transition matrix and G, the noise matrix. These matrices,

when multiplied by the vectors x( k) and u( k) in the above equation, will represent

the equations of the target motion in one sampling period T [43], i.e.,

$x(k+1) = x(k) + \dot{x}(k)T + \tfrac{1}{2}T^2 u_x(k)$

and

$y(k+1) = y(k) + \dot{y}(k)T + \tfrac{1}{2}T^2 u_y(k).$

The matrix G is called the "noise matrix" because it is multiplied by the vector

u( k) which is unknown and its components need to be estimated. Expressing the

above equations in the form (5.2), one obtains the F and G matrices given by

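$F = \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix}$ and $G = \begin{bmatrix} T^2/2 & 0 \\ T & 0 \\ 0 & T^2/2 \\ 0 & T \end{bmatrix}$. (5.3)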

The unknown input u( k - 1) for modeling the target maneuver is to be

estimated by the neural network. The term v( k) is a zero-mean white noise process

with covariance Q, i.e.,

$E\{v_k v_j^T\} = \begin{cases} Q, & k = j \\ 0, & k \neq j \end{cases}$ (5.4)

where $v_j$ and $v_k$ are two samples of the noise process v(k). The observation sequence

is given by

z(k) = Hx(k) +w(k) (5.5)

where w( k) represents the measurement noise, which is assumed zero-mean with

covariance R and is independent of the process noise v( k).

The filter designed on the basis of the non-maneuvering model (i.e., u( k) =

0) would cause the innovation sequence (see Eq. (2.13)) to build up in magnitude

(when $u \neq 0$). Using the innovation sequence with other indicators (which will be

discussed later), the corresponding input noise level can be estimated through the

neural network. Filter gains are then adjusted through covariance matrices of the

Kalman filter which incorporates the variance of error due to the neural network

estimate in the form

(5.6)

where $P(k|k)$ is the covariance matrix of the state estimates, which reflects the

Kalman filter errors in its estimates of states, and $\sigma_{nn}^2$ is a term which represents

the neural network estimation error. This term is computed off-line during

training when there is no maneuver (i.e., $u(k) = 0$). In the presence of a maneuver,

the neural network estimate $\hat{u}(k-1)_{nn} = [u_x\ \ u_y]^T$ would be used until the filter

reaches the steady state. Appropriate conditions for reaching steady-state rapidly

can be incorporated in the training examples. In other words, the precision of the

algorithm can be enhanced by making use of smaller step sizes of acceleration to

generate the training examples. Since the neural network is trained to estimate

the acceleration one step in the past, the prediction correction is done based on

the equation

$\hat{x}_{nc}(k|k-1) = F\hat{x}(k-1|k-1) + G\hat{u}_{nn}(k-1|k-1)$ (5.7)

where $\hat{x}_{nc}$ is the neural network correction for the prediction of the states in the

previous step. The filtered estimate is then

$\hat{x}_{nc}(k|k) = \hat{x}_{nc}(k|k-1) + K(k)\,[z(k) - H\hat{x}_{nc}(k|k-1)]$ (5.8)

where K( k) is the Kalman filter gain which is given by

K(k) = FP(klk)HT [HP(klk)HT +R] +Q. (5.9)

The covariance matrix of prediction will change from the non-maneuvering

case (i.e., zero process noise) to

$P(k+1|k) = F P(k|k) F^T + G Q G^T$ (5.10)

where

(5.11)

This approach to the compensation of bias induced by the maneuver does

not require a long waiting time for several measurements, as in the scheme proposed

by Bogler [72] which requires at least 5 to 6 measurements. Computation

of the propagation matrix M (Eq. (2.22) in Chapter 2) from the estimated time

of maneuver up to time k is not required since the update takes place at every

step during the period of acceleration. The calculation of the propagation matrix

represents the primary processing load in Bogler's method. It may be noted that

in this method estimation of u( k) is merely based on the residual information of

the nominal Kalman filter. With the present neural network method, however,

we make use of additional parameters to represent the maneuver which helps in

identifying the acceleration in a shorter period of time. Consequently, there is no

need for a propagation matrix as required by Bogler's method.
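
The correction step of Eqs. (5.7) and (5.8) can be sketched in a few lines of Python. The sketch is only illustrative: the sampling period, the state, the measurement, the network output `u_nn`, and in particular the fixed gain `K` are assumed values chosen for the example rather than quantities taken from our experiments, and the helper names are hypothetical.

```python
# Sketch of the correction step in Eqs. (5.7)-(5.8). All numbers are assumed.

def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def make_model(T):
    """Model with state [x, x_dot, y, y_dot] and acceleration input
    [u_x, u_y] held constant over one sampling period T."""
    F = [[1.0, T, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, T],
         [0.0, 0.0, 0.0, 1.0]]
    G = [[0.5 * T * T, 0.0],
         [T, 0.0],
         [0.0, 0.5 * T * T],
         [0.0, T]]
    H = [[1.0, 0.0, 0.0, 0.0],   # only the two positions are measured
         [0.0, 0.0, 1.0, 0.0]]
    return F, G, H

def corrected_step(x_prev, u_nn, z, K, T):
    """Prediction with the network's acceleration estimate, Eq. (5.7),
    followed by the measurement update, Eq. (5.8)."""
    F, G, H = make_model(T)
    x_pred = [a + b for a, b in zip(mat_vec(F, x_prev), mat_vec(G, u_nn))]
    innovation = [zi - hi for zi, hi in zip(z, mat_vec(H, x_pred))]
    x_filt = [p + c for p, c in zip(x_pred, mat_vec(K, innovation))]
    return x_pred, x_filt

# Hypothetical case: 1 s scan, 100 m/s along x, network estimate of 4 m/s^2.
T = 1.0
x_prev = [0.0, 100.0, 0.0, 0.0]
u_nn = [4.0, 0.0]
K = [[0.5, 0.0], [0.2, 0.0], [0.0, 0.5], [0.0, 0.2]]  # assumed gain values
z = [103.0, 0.0]
x_pred, x_filt = corrected_step(x_prev, u_nn, z, K, T)
```

In this example the prediction already contains the acceleration supplied by the network, so the residual passed to the gain is small even though the gain itself is low.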

5.3. The First Input Parameter

In effect, the neural network replaces a bank of N parallel filters which would

have been required otherwise. Instead of matching each filter to a separate quantity

of the residual information (i.e., innovation), we let the neural network learn

the nonlinear relationship that exists between the acceleration and the residual

information. As mentioned earlier, the idea of using a bank of parallel filters was

first introduced by Magill [41,94] (Fig. 5.1). He developed an expression for the

posterior probability that the nth Kalman filter is the correct one to use.

The first input parameter that we use is the innovation term which is multiplied

by the Kalman filter gain to smooth out the predicted estimate, that is

$\nu(k) = z(k) - H\hat{x}(k|k-1).$ (5.12)

This expression is representative of the actual newness of information about the

position measurements. The change in residual information is sensitive to changes

in velocity and therefore different initial velocities should be included in the training

set (which will be discussed in greater detail in a later section). In terms of the

duration of the acceleration, however, each estimate is updated after a sampling

period. In this method, the target maneuver is jointly detected and corrected in

a single continuous operation without much computational complexity. The other

methods in the literature that have the same feature are computationally involved

and have long delays due to the sequential nature of processing.

The changes in position innovation were normalized with respect to the

covariance of innovation S( k), which is

$S(k) = H P(k|k-1) H^T + R$ (5.13)

where R is the measurement covariance matrix. The components of innovation

were normalized separately and their summation was used as the input parameter

which is given by

(5.14)

where the terms $S_{11}(k)$ and $S_{22}(k)$ are the diagonal components of the covariance

matrix S(k). Note that we have combined the two components to keep the input

feature set minimal. We could use $\bar{\nu}_x$ and $\bar{\nu}_y$ (i.e., the normalized $\nu_x$ and $\nu_y$ components)

independently as input features together with some higher level relations

among them such as $\bar{\nu}_x/\bar{\nu}_y$ or $\bar{\nu}_x\bar{\nu}_y$. For the time being, we want to show that the

neural network produces satisfactory results if appropriate features of the maneuver

are presented as inputs. Features should be descriptive of the maneuver class and

be focused on the incremental changes rather than the exact instantaneous values.

Reducing the features to represent incremental changes will result in much less

training effort.
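
As an illustration of the normalization described above, the following sketch computes the diagonal of the innovation covariance of Eq. (5.13) and sums the separately normalized innovation components. It assumes that H selects the x and y positions from the state [x, x_dot, y, y_dot] and that each component is divided by the square root of the corresponding diagonal term of S(k). This particular normalization, the function names, and the numbers are assumptions made for the example.

```python
import math

def innovation_covariance_diag(P_pred, R):
    """Diagonal terms of S = H P H^T + R when H picks the x and y
    positions (state indices 0 and 2)."""
    return P_pred[0][0] + R[0][0], P_pred[2][2] + R[1][1]

def normalized_innovation(z, x_pred, P_pred, R):
    """Sum of the separately normalized innovation components.
    Dividing by the standard deviation is an assumed normalization."""
    s11, s22 = innovation_covariance_diag(P_pred, R)
    nu_x = z[0] - x_pred[0]          # x-position innovation
    nu_y = z[1] - x_pred[2]          # y-position innovation
    return nu_x / math.sqrt(s11) + nu_y / math.sqrt(s22)

# Hypothetical prediction covariance and measurement covariance.
P_pred = [[75.0, 0.0, 0.0, 0.0],
          [0.0, 10.0, 0.0, 0.0],
          [0.0, 0.0, 75.0, 0.0],
          [0.0, 0.0, 0.0, 10.0]]
R = [[25.0, 0.0], [0.0, 25.0]]
x_pred = [100.0, 50.0, 50.0, 0.0]
z = [130.0, 60.0]
feature = normalized_innovation(z, x_pred, P_pred, R)
```

Because the feature is expressed in units of the expected innovation spread, the same network input range can serve targets observed with different measurement accuracies.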

5.3.1. Statistical Properties of the Innovation Process

Let the vector $\hat{z}(k|k-1)$ denote the minimum mean-square estimate of the

observed data z(k) at time k, given all the past values of the observed data up to

time k - 1. This is actually the Kalman filter estimate. As mentioned before, the

innovation process associated with z( k) is defined as

$\nu(k|k-1) = z(k) - \hat{z}(k|k-1), \quad k = 1, 2, \ldots$ (5.15)

where the vector $\nu(k|k-1)$ represents the new information in the observed data

z(k). The innovation process has the following properties:

1) The innovation process $\nu(k|k-1)$ corresponding to the observed data z(k)

at time k is orthogonal to all the past observations z(1), z(2), ..., z(k-1),

i.e.,

$E[\nu(k)\, z^T(j)] = 0, \quad j = 1, 2, \ldots, (k-1).$

2) The innovation process consists of a sequence of vector random variables

that are orthogonal to each other, as shown by

$E[\nu(k)\, \nu^T(j)] = 0, \quad j = 1, 2, \ldots, (k-1).$

3) There is a one-to-one correspondence between the sequence of vector random

variables z(1), z(2), ..., z(k) representing the observed data and the

sequence of vector random variables $\nu(1), \nu(2), \ldots, \nu(k)$ representing the innovation

process, and therefore one sequence may be obtained from the

other by means of a linear transformation without loss of information. This

can be stated as

$\{z(1), z(2), \ldots, z(k)\} \Leftrightarrow \{\nu(1), \nu(2), \ldots, \nu(k)\}.$ (5.16)

5.3.2. Estimation of States using the Innovation Process

In the classical approaches, the state estimates may be expressed in the form

of a linear combination of the sequence of innovation process $\nu(1), \nu(2), \ldots, \nu(k)$.

In previous sections we argued that these procedures are generally slow and require

many samples of innovation sequence whereas when a maneuver takes place it has

to be detected and compensated within a minimum number of sampling intervals.

There is no procedure in the current literature that uses parameters other than $\nu(k)$

in the estimation process without adding more dimension to the state vector of the

Kalman filter. This is primarily due to the nonlinear relationship between these

parameters. If we regard these additional parameters as feature vectors

that identify the maneuver, then estimation with an augmented state vector

through the Kalman filter may seem appealing. However, the Kalman filter, as a linear

minimum mean-square estimator, will fail to converge when there is a bias in any of

these parameters since the orthogonality assumption fails. In addition to this, due

to the extra amount of computations, adding more states will cause additional lag

time in reaching the steady state. In the next section we shall discuss the process

of bias detection and show how the bias induced by a maneuver may be detected

with the neural network. However, before we discuss that, let us relate the neural

network method that we have used with a linear estimation technique using the

innovation process.

Consider the state u(k) and its estimate $\hat{u}(j|Z_k)$, where j is the time index

for the state and $Z_k$ represents the observation sequence up to time k. Then the

estimation using the innovation process is expressed as a linear combination of the

samples in the sequence $\nu(k)$ in the form

$\hat{u}(j|Z_k) = \sum_{i=1}^{k} B_j(i)\, \nu(i).$ (5.17)

The set $\{B_j(i)\}$ represents the matrix sequence that has to be determined. According

to the principle of orthogonality, the predicted state-error vector should

be orthogonal to the innovation process.

The neural network architecture can play an important role in the orthogonalization

of the sequence. Classical techniques use parallel filter banks to find the

filter that gives the minimum residual error. This approach is quantitatively limited

to a finite set of discrete values of $\nu(m)$. With the use of a neural network, however,

we extend the quantity $\nu(m)$ over its magnitude rather than the time index. That

is, we fix $m = k_0$ (i.e., the instant that the maneuver is detected) and quantize $\nu_q(k_0)$,

with q denoting the number of quantizations desired, which depends on the training

effort. An interesting property now is the interpolation capability afforded by the

neural network architectures, which extends the power of neural network implementation

beyond that possible from ordinary parallel architectures. By appropriate

generation of examples of the innovation sequence, the training process reduces to minimizing

the average cumulative error, which is $E\{[u(j) - \hat{u}(j|Z_k)]\, \nu_q(k_0)\}$ in the

linear estimation case.

5.4. Optimum Bias Detection

In the previous section we discussed the statistical properties of the innovation

sequence and it was mentioned that the innovation sequence (or residual

sequence) that is used in the Kalman filter should be a white noise process with zero

mean. In the presence of a maneuver or any other consistent interference, there

will be a nonzero mean in this sequence. If this mean is not removed from the

filter, it will propagate through the filter parameters and the covariance matrices

and result in a deviation from the true track. The purpose of the adaptive scheme

is to detect this bias as soon as it appears and to generate appropriate signals to

compensate for the bias. Both the maneuver and the unwanted clutter contribute

to the bias. The adaptive scheme should have the following properties

1) It should detect the bias as early as possible. This is called the maneuver

detection [99] process.

2) The performance of the maneuver detector should not be affected by clutter.

Clutter data can introduce an additional bias term which may result in a

false maneuver detection.

3) Correction for the bias should take place before the next target return is

received by the radar.

According to McAuly [42], the innovation sequence can be modeled as

$\nu(k) = \nu_0(k) + m(k - k_0)$ (5.18)

where $\nu_0(k)$ is a zero-mean, white noise process with variance $N_0$. The second

term in the above equation is due to the bias which has occurred at time $k \ge k_0$.

We assume that the maneuver corresponds to the introduction of a constant

acceleration in the interval of estimation (i.e., from the time bias is detected back

to the time the maneuver was first started). This bias will then manifest itself as a

quadratic function of $(k - k_0)$ in the measurement sequence since the position and

acceleration are related as such. Therefore, the bias term could be approximated

as

$m(k - k_0) = \mu\, [(k - k_0)T]^2, \quad k \ge k_0$ (5.19)

where T is the sampling time, $k_0 T$ denotes the unknown time at which the maneuver

was initiated, and $\mu$ is a parameter related to the magnitude of the acceleration

or the nonlinearity of the model. We can describe [42] the process of a maneuver

detection as follows:

Detection of a maneuver is equivalent to detecting the presence of a

deterministic signal of unknown amplitude and time of arrival in a

background of zero-mean white noise.

Assuming that the measurement noise is Gaussian, the generalized likelihood

ratio test will be to declare the presence of a maneuver if

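$L(k) = \left\{ \sum_{n=0}^{k} \nu^2(n) - \min_{\mu} \sum_{n=0}^{k} [\nu(n) - \mu S(n - k_0)]^2 \right\} \ge \lambda,$ (5.20)

where

$S(i) = (iT)^2\, u(i),$

and

$u(i) = \begin{cases} 0, & i < 0 \\ 1, & i \ge 0. \end{cases}$

Now, define the functions

$E(k, k_0) = \sum_{n=0}^{k} S^2(n - k_0)$

$\rho(k, k_0) = \sum_{n=0}^{k} \nu(n)\, S(n - k_0),$

and use them in equation (5.20) to give the likelihood ratio as

$L(k) = \max_{\mu} \left\{ 2\mu\, \rho(k, k_0) - \mu^2 E(k, k_0) \right\}.$

Since $k_0$ (the actual starting time of the maneuver) is unknown, many different

discrete values are assumed and the corresponding bias terms are calculated

as

$\hat{\mu}(k_0) = \frac{\rho(k, k_0)}{E(k, k_0)},$

which reduces the likelihood ratio to

$L(k) = \frac{\rho^2(k, k_0)}{E(k, k_0)}.$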

Given that we are currently at time kT, and that we are testing for a bias which

may have been initiated at any time (k - 1)T, (k - 2)T, (k - 3)T, ..., there is only

a finite number of the past measurements that need to be processed. This yields

$k_0 = k - j, \quad j = 1, 2, \ldots, M$

where M is the number of the past sampling times used for estimating the magnitude

of the maneuver. The likelihood ratio now becomes

$L(k) = \max_{j = 1, 2, \ldots, M} \frac{\rho^2(k, k - j)}{E(k, k - j)}$

where

$\rho(k, k - j) = \sum_{n=0}^{k} \nu(n)\, S(n - k + j),$

and

$E(k, k - j) = \sum_{n=0}^{k} S^2(n - k + j).$

However, $S(n) = 0$ for $n < 0$ and therefore

$E(j) = \sum_{n=0}^{j} S^2(n).$

Now, we define a bank of M filters where each filter has an impulse response

$h_j(i) = \frac{S(j - i)\, u(i)}{\sqrt{E(j)}}, \quad j = 1, 2, \ldots, M$

then

$\rho(k, k - j) = \sqrt{E(j)} \sum_{n=0}^{k} \nu(n)\, h_j(k - n).$

The time of initiation of the maneuver can be estimated as shown in [42] based on the

following decision:

$\hat{j} = \max_{j = 1, \ldots, M} \left[ \sum_{n=0}^{k} \nu(n)\, h_j(k - n) \right]^2 \ge \lambda.$ (5.22)
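
The matched-filter test above can be illustrated with a short sketch. It evaluates the ratio of the squared correlation to the signature energy for each candidate onset $k_0 = k - j$ and keeps the largest value. The residual sequence, the sampling period, the bias parameter, and the threshold are hypothetical values chosen so that the result can be checked by hand, and the function names are not taken from [42].

```python
def S(i, T):
    """Quadratic bias signature S(i) = (iT)^2 for i >= 0, zero otherwise."""
    return (i * T) ** 2 if i >= 0 else 0.0

def glr_statistic(nu, T, M):
    """Return (L, j_hat): the largest rho^2 / E over the candidate onsets
    k0 = k - j, j = 1..M, and the index j that attains it."""
    k = len(nu) - 1
    best_L, best_j = 0.0, None
    for j in range(1, M + 1):
        k0 = k - j
        rho = sum(nu[n] * S(n - k0, T) for n in range(k + 1))
        E = sum(S(n - k0, T) ** 2 for n in range(k + 1))
        L = rho * rho / E            # E > 0 because S(j) > 0 for j >= 1
        if L > best_L:
            best_L, best_j = L, j
    return best_L, best_j

# Noise-free residuals with a bias that starts at k0 = 6 (assumed values).
T, k0_true, mu = 1.0, 6, 0.5
nu = [mu * S(n - k0_true, T) for n in range(10)]
L, j_hat = glr_statistic(nu, T, M=5)
maneuver_declared = L >= 10.0        # assumed threshold
```

Here the current time is k = 9, so the true onset corresponds to j = 3, which is the filter that gives the largest output. Each additional candidate onset costs one more pass over the residuals, which is the processing load that the single-network scheme is intended to avoid.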

To summarize, the residual sequence $\nu(n)$ generated by the Kalman

filter is passed through a bank of filters, each matched to a time jT. The magnitude of the

quadratic bias function is then compared against a fixed threshold. If the threshold

is exceeded, a maneuver is declared. When the maneuver is declared to have taken

place at time jT in the past, the bias compensation process is started based on the

information provided by the maneuver detector (i.e., magnitude of the acceleration

as well as the time of its occurrence). A block diagram representation of the optimal

bias detection is shown in Fig. 5.2. The neural network bias detection scheme that

we have used follows the same path except for the following differences.

1) Instead of quantizing the time of occurrence of the maneuver to M different

values, we quantize the threshold $\lambda$ to a finite set of values $\{\lambda_q\}$.

2) We maximize the likelihood ratio L(k) with respect to $\lambda_q$. Therefore, the

problem reduces to the following:

Assuming that the maneuver started at time k - 1, what should the

magnitude of the input acceleration be so as to come up with the threshold

of $\lambda_q$?

Therefore, instead of fixing the threshold, we quantize it to N discrete values

$\{\lambda_q;\ q = 1, \ldots, N\}$. This will change the likelihood ratio to

$L_{nn}(k) = \max_{q} \left\{ \frac{\rho_q^2(k, k-1)}{E_q(k, k-1)},\ q = 1, \ldots, N \right\}$ (5.23)

where the subscript nn refers to the neural network likelihood ratio. The problem

is now reduced to a single nonlinear filter (i.e., neural network) which is matched

to the computed bias in one sampling time. As can be seen in Fig. 5.3a, the bank

of N parallel filters is replaced with a single neural network. Note that $\nu(k)$ refers

to the innovation vector with the two components $\nu_1(k)$ and $\nu_2(k)$ which are the

first and second input parameters (i.e., the position and velocity innovation). Fig.

5.3b illustrates the inputs and outputs of the neural network. There are several

advantages to this scheme which are summarized below.

1) Several parallel filters are replaced by a single nonlinear neural network

system. The on-line adaptation of this scheme should be emphasized. Since

the training is performed off-line, the response is calculated almost instantaneously.

2) It is important to avoid a false detection of maneuver particularly when

clutter is present. A clutter sample may falsely indicate the new position

of the target which may look like a sudden change in the position measurements.

Using a large number of quantized levels N for the threshold settings

reduces the chances of losing track in the cases when a maneuver is falsely

detected.

3) The ratio of the filter time constant $\tau$ and sampling time T (i.e., $\tau/T$) is

not critical for this design.

It may appear that as N increases, more adaptivity is gained. We found

that this is not the case. As the acceleration reduces in

magnitude, clutter becomes the major factor introducing bias in the residual

sequence, which may result in a false detection of maneuver. Therefore, the

resolution is chosen such that the maximum error equals the standard deviation

of the measurements (i.e., $\sigma_R$ of the radar sensor). Given the expected range of

accelerations (e.g., 0 - 20 m/s$^2$), we quantized this range into step sizes that correspond

to $\sigma_R$. For example, assuming a maximum target acceleration of 20 m/s$^2$

and a maximum scan time of 10 seconds, the required precision for the neural network

estimate should be within 1 m/s$^2$ in order to keep the position error below

the resolution of the radar. This error in the acceleration estimate corresponds to

100 meters in the position per scan period. The sampling time of 10 seconds is a

nominal value and the neural network response is based on the assumption that

the actual sampling rate (i.e., radar scan period) is less than 10 seconds.

5.5. The Second Input Parameter

In the previous sections it was noted that in order to capture the maneuver

as soon as it occurs additional input parameters are needed. One may note that in

most of the current algorithms, the two processes of detection and compensation are

performed in two distinct phases. In contrast, with a neural network-based scheme,

we do not need to separate the two processes. The way to accomplish this is to

find appropriate additional input parameters such that they are generated through

independent processes and contain useful information about the maneuver. We

suggest using independent parameters for the following reasons.

1) We want to keep the filter simple for a high speed of response. Therefore,

we use the neural network as an adjunct device to take load off the

Kalman filter. Without such a strategy, each individual feature parameter

has to be defined as a separate state in the overall state vector increasing

the complexity of the Kalman filter.

2) It is better to use different estimators for the parameters instead of including

all in one state vector of the Kalman filter. This will keep the errors in the

estimation process independent from each other particularly in the presence

of a bias.

3) As mentioned before, it will be difficult to compensate for the bias due to

the coupling effects of the errors in the components of the acceleration. We

suggested earlier that there should be at least three parameters that contribute

to the actual estimation of the acceleration input. These parameters

should at least contain the following information about the target maneuver:

a) The intensity of the acceleration, b) A sense of the direction of the

acceleration, and c) The clutter visibility.

As mentioned at the beginning of this chapter, we are using the neural network

scheme for tracking the longitudinal acceleration. However, we chose to take

a general approach for training in order to be able to extend the training process

for a circular maneuver as well. Therefore, we still need estimates that identify

the heading of the target in the presence of noise and clutter. The estimation of

the heading from noisy position measurements is in itself a problem of considerable

practical importance. The common approach to this problem is to measure

the x and y positions simultaneously. If one could assume a constant velocity, the

heading estimate could be evaluated as the ratio $\dot{y}/\dot{x}$, where $\dot{x}$ and $\dot{y}$ are derived

from a least square estimator. However, the problem with this approach is that it

assumes constant velocity, which is not the case in our problem where the speed

profile is totally unknown. This estimator is very sensitive to changes in speed and

in a noisy environment (particularly clutter) it gives highly erroneous results. This

is due to the fact that the heading is mainly due to some correlation between the

(x, y) measurements in the Cartesian coordinates. The use of $\dot{x}$ and $\dot{y}$ causes rapid

changes in the heading. Also, it should be noted that the use of $\dot{x}$ and $\dot{y}$ from the

Kalman filter output is not highly appropriate because they are already corrupted

by the bias. Therefore, a good estimate is one that uses a line fit through all the

measurements.

The importance of the heading stems from the fact that its first derivative

is the angular velocity, which can be used instead of the heading to describe the

motion of a turning aircraft. It is well known that tracking systems using heading

estimates exhibit a very stable performance in the event of long sampling intervals.

5.5.1. Formulation of the Heading Estimate

Methods for tracking of targets with constant heading and variable speed

in a fixed direction fall into two categories. The first category refers to those with

prior assumptions about the target speed profile whereas the other category consists

of those that do not make such assumptions. However, in both cases maximum

likelihood estimators are used with different sets of assumptions. For a precise

discussion of these estimators, let us briefly state the problem of interest.

Given the measurements $(x_m(t), y_m(t))$ at times $t = t_i$, $i = 1, \ldots, N$, we

are looking for the maximum likelihood estimate of the target heading H(t). The

constant-heading trajectory satisfies the conditions

$\dot{H}(t) = 0$

$H(0) = H_0.$

The measurements in Cartesian coordinates are

$x_m(t) = x(t) + n_x(t), \quad y_m(t) = y(t) + n_y(t)$

where the measurement noises $n_x$ and $n_y$ are Gaussian random variables with zero

mean and variances $\sigma_x^2$ and $\sigma_y^2$. Let us define

$M = \frac{y(t) - y_0}{x(t) - x_0} = \tan H_0$ (5.24)

where $(x_0, y_0)$ is an unknown coordinate at t = 0. M is called the heading, and it

is this quantity that we wish to estimate in order to use it as

our second input parameter to the neural network.

There are a few approaches in the literature for obtaining this estimate [39]

of which we discuss the two most related to our application. These two methods

are described as cases (a) and (b) as follows.

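Case (a): $\sigma_x^2$ and $\sigma_y^2$ are assumed unknown. In this case, the method of least

triangles, which is given in [39], provides the heading estimate as

$M_{LT} = \mu \sqrt{\frac{\sum_{i=1}^{N} (y_m(i) - \bar{y})^2}{\sum_{i=1}^{N} (x_m(i) - \bar{x})^2}}$ (5.25)

where $\bar{x}$ and $\bar{y}$ are the sample means of the measurements and

$\mu = \mathrm{sgn} \sum_{i=1}^{N} (y_m(i) - \bar{y})(x_m(i) - \bar{x}).$

Case (b): Both $\sigma_x^2$ and $\sigma_y^2$ are known. In this case the maximum likelihood

solution (MLE), given in [39], may be used.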
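
A minimal sketch of the least-triangles estimate of case (a) is given below. It assumes that the heading slope is the signed square root of the ratio of the y and x sample scatters about their means, with the sign taken from the cross term. The function name and the sample points are hypothetical.

```python
import math

def heading_least_triangles(xs, ys):
    """Slope estimate M_LT from position measurements (x_m(i), y_m(i))."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0.0:
        raise ValueError("all x measurements coincide; slope is undefined")
    sign = (sxy > 0) - (sxy < 0)     # sgn of the cross term
    return sign * math.sqrt(syy / sxx)

# Three noise-free points on a line of slope 2 (one point per scan, assumed).
M_up = heading_least_triangles([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
M_down = heading_least_triangles([0.0, 1.0, 2.0], [5.0, 3.0, 1.0])
heading_angle = math.atan(M_up)      # H_0 in radians, since M = tan H_0
```

The estimate uses no noise variances, which is the property that makes it attractive when the clutter statistics inside the validation gate are unknown.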

It should be noted that the Kalman filter computes $\dot{x}$ and $\dot{y}$, and a value for

the heading estimate can be determined by $M = \dot{y}/\dot{x}$. We have already discussed

some of the deficiencies of the $\dot{y}/\dot{x}$ estimate; specifically, we do not wish to use

the Kalman estimates because once the maneuver is initiated these estimates are

biased. Also, in order to eliminate the need for any a priori assumptions, we rule

out the MLE estimate, since it requires the knowledge of $\sigma_x^2$ and $\sigma_y^2$, which are

based on the assumption of the noise being Gaussian in the measurements. We do

this mainly because we are dealing with clutter data which may have unknown

statistics inside the validation gate. Furthermore, we limit N (i.e., the number of

the data points to be used in the estimation of the heading) to three scan periods

since we want to use the change in heading as an input parameter as opposed to the

heading itself. Note that in each scan period we may have several measurements.

This feature is more general in the sense that lateral accelerations can be included

in the training set as well. However, for the time being we limit the training to the

longitudinal acceleration only.

In the computation of the heading estimate $M_{LT}$ in equation (5.25), we

use the set of data given by

(5.26)

That is, we use the filtered estimates for the time step k - 2. This approach will

make use of the Bayesian association and puts more emphasis on the correction

from the time k -1 to time k.

Angular velocity measurements can be obtained from the difference between

the new measurement and the last valid heading estimate. Determination of an

acceptable threshold for the turn rate is a critical issue here. The usual practice

in the current literature is to set the threshold based on a rough estimate of the

velocity. This is due to the relation $\omega = a_n / v$, where $\omega$ is the turn rate, v is the

tangential velocity, and $a_n$ is the lateral acceleration. If a threshold of 0.0025

rad/sec is set for the turn rate on a straight line trajectory, many combinations of $(a_n, v)$

may satisfy this relation with a nontrivial lateral acceleration. A constant angular

velocity threshold corresponds to different lateral acceleration thresholds for an

aircraft flying at different velocities. That is, a target heading may be changing

slowly such that, even though the turn rate is small, the target may be deviating

from the straight line. In contrast, in our approach, examples are generated from

assuming different speed profiles. We provide the change in heading with two other

parameters, one of which has already been discussed in detail. In other words,

the position innovation together with the other parameter (i.e., Doppler change)

provides a measure of change in the velocity. We then let the neural network learn

the different quantization levels and associate each turn rate to the expected change

in velocity.
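
The dependence of a fixed turn-rate threshold on the target speed can be made concrete with a small calculation. The sketch assumes the standard kinematic relation between turn rate, lateral acceleration, and tangential velocity (lateral acceleration equals turn rate times speed). The three speeds are illustrative values only.

```python
def lateral_acceleration(omega, speed):
    """Lateral acceleration a_n = omega * v for a turn rate omega (rad/s)
    and a tangential speed v (m/s)."""
    return omega * speed

OMEGA_TH = 0.0025                    # rad/s, the threshold quoted in the text
speeds = [100.0, 200.0, 300.0]       # m/s, assumed for illustration
thresholds = {v: lateral_acceleration(OMEGA_TH, v) for v in speeds}
```

With these assumed speeds the same turn-rate threshold corresponds to lateral accelerations of roughly 0.25, 0.5 and 0.75 m/s$^2$, which is why a single fixed threshold cannot serve all speed profiles.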

This adaptive thresholding for the maneuver is helpful in extending the

training to the lateral acceleration as well. However, we have limited the training

to straight line trajectories only. The plant noise (i.e., the acceleration input) to

be estimated by the neural network has to be in conjunction with these maneuver

detection thresholds so that an aircraft executing slow maneuvers with angular

velocity under the threshold can be tracked with the straight line assumption. We

assume that the aircraft is subject to random zero-mean accelerations uncorrelated

from sample to sample and constant during each sampling interval. This is in contrast

to the assumptions made by Singer [40], since he assumes some correlation

among the samples. However, even though this is true in most practical cases, it

results in a slow adaptation since the amount of correlation is not known until several

samples are processed. Therefore, fast maneuvers cannot be corrected within

a sampling period.

5.5.2. State Equations and the Heading Estimate

In the previous section we discussed the importance of the heading estimate

in modeling the target maneuver. We shall now discuss the effect of including the

heading estimate more precisely in regard to the modeling of longitudinal acceleration.

Heading assisted maneuver tracking has been investigated in the literature

and has been shown to have some advantages [39]. It can tolerate longer sampling

times (i.e., fewer samples) as well as maneuvers of irregular shapes. These

properties confirm our belief that the heading estimate is an appropriate

parameter for our purpose and contains valuable information about the target

while it is performing a maneuver. In contrast with other heading assisted tracking

filters which employ the estimate obtained from the output of the Kalman filter

(which could be biased) we will use the heading estimate MLT obtained from the

method of least triangles as described earlier.

Another reason motivating the use of MLT comes from the following observation.

Note that one other possible method is to include the angular velocity $\omega_k$

in the overall state vector of the filter and use $\omega_k T$ for the change in heading. The

state equations will then be represented by

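$$\begin{aligned}
\dot{x} &= v_x + \text{noise} \\
\dot{v}_x &= \omega v_y + \text{noise} \\
\dot{y} &= v_y + \text{noise} \\
\dot{v}_y &= -\omega v_x + \text{noise} \\
\dot{\omega} &= 0 + \text{noise}
\end{aligned}$$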

and the measurement equation becomes

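$$\begin{bmatrix} x_m \\ y_m \\ \omega_m \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ v_x \\ y \\ v_y \\ \omega \end{bmatrix} + \text{noise}.$$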

These equations represent a nonlinear system and require the use of an extended

Kalman filter. The manipulation of 5 × 5 matrices needed to implement this

algorithm is unacceptable in real-time tracking. This once again confirms the

efficacy of using the MLT estimate.
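
The nonlinearity of the augmented model comes from the products of the turn rate with the velocity components. A minimal sketch of the noise-free derivative is given below, assuming the state ordering $[x, v_x, y, v_y, \omega]$ implied by the measurement matrix; the function name and the numerical values are hypothetical.

```python
import numpy as np


def turn_model_derivative(state: np.ndarray) -> np.ndarray:
    """Noise-free time derivative of the state [x, v_x, y, v_y, omega]."""
    _x, vx, _y, vy, omega = state
    # The omega * v terms are products of state variables, which is what
    # makes the model nonlinear and calls for an extended Kalman filter.
    return np.array([vx, omega * vy, vy, -omega * vx, 0.0])


# Hypothetical state: position (100, 100) m, velocity (200, 200) m/s, 0.05 rad/s.
state = np.array([100.0, 200.0, 100.0, 200.0, 0.05])
deriv = turn_model_derivative(state)

# In this model the speed is constant: v . dv/dt = 0.
speed_rate = state[1] * deriv[1] + state[3] * deriv[3]
print(deriv, speed_rate)
```

Linearizing these product terms at every scan is part of the extra computation that the MLT-based heading estimate avoids.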

We calculate the change in the target heading by

(5.27)

where MLT is given by equation (5.25). It must be particularly emphasized that the

angular velocity state is no longer needed in the state equations, thereby reducing

the burden on the Kalman filter. Thus, there are two main advantages in using this

parameter (i.e., change in heading estimate) other than longer sampling period, viz.

the state equations are simpler and errors in the input parameters to the neural

network will be independent with this approach.

5.6. The Third Input Parameter

We now investigate the role and significance of the third input parameter in

the training. In the last section it was pointed out that there are three fundamental

issues that need to be considered in obtaining an acceleration estimate. These are:

1) the intensity of acceleration,

2) direction of tangential velocity,

3) initial velocity at the time of acceleration.

Each of these factors can be significantly affected by the presence of clutter. The

intensity and direction have already been discussed in detail. These parameters

should meet the following objectives:

1) Maneuver has to be detected as soon as possible (i.e., within two to three

scan periods).

2) At least one of the parameters must reflect the needed information about

the changes in velocity. Since doppler shift can be measured within each

scan period, this can provide information about the target preparation for

a maneuver and hence can aid in an early detection of the maneuver. Sensitivity

of this parameter, however, is not without a cost, since more false

detections of maneuvers may occur. This explains why we may need all of

these parameters so that together they can reliably identify the maneuver.

For example, the velocity innovation is more sensitive to the starting and

ending times of the maneuver while the position innovation is more sensitive

to the intensity of the acceleration components. If we use the velocity

innovation by itself as an indication of the maneuver, it may cause a false

detection of the maneuver, since a slight change in the velocity, if not coupled

with information on the intensity level, might indicate falsely that a

maneuver is initiated.

3) Upon detection of the maneuver, we need to quantize the intensity of the

maneuver which is mainly the role of the first parameter.

5.6.1. Velocity Innovation Parameter

As described in earlier chapters, the echo from a moving target produces a

shift in the radar carrier frequency, which is the doppler effect. Doppler shift takes

place when the wave radiated from a point source is compressed in the direction of

motion or when it is spread out in the opposite direction and is directly related to

the target radial velocity through the relation

$$h(k) = f_0(k) - \hat{f}_0(k|k-1) \tag{5.28}$$

where h is the doppler shift in frequency and fo is the center frequency of the

transmitted wave. Since each radar pulse is a modulated wave of some kind (e.g.,

sinusoidal), the shift in frequency could be measured on a pulse to pulse basis.

Therefore, one can take an average of these changes for each pulse sequence per

scan which is the task of an MTI processor (whose design was discussed in Chapter

4).

As shown in Fig. 5.4, the radial velocity $\dot{R}$ is related to the tangential

velocity $V_T$ according to

$$\dot{R} = V_T \cos L \tag{5.29}$$

where $L$ is the angle between the line of sight and the target heading (i.e., direction

of tangential velocity). It should be noted that as this angle approaches 90°,

the doppler shift will vanish and no measure of target velocity will be available.

Furthermore, the target velocity is scaled by $V_T = \dot{R}/\cos L$ and hence unless the

value of angle $L$ is somehow provided with the measurement of $\dot{R}$, there will be

no useful information in providing the velocity innovation. Using the heading

estimate $M$ (obtained by MLT as discussed earlier) the expression for the velocity

will become $V_T = \dot{R}/\cos L$ in which $L = \theta + \phi$, where angles $\theta$ and $\phi$ are given by

Using these equations, $V_T$ can be expressed in the form

$$V_T = \frac{(x\dot{x} + y\dot{y})/\sqrt{x^2 + y^2}}{\cos\left[\tan^{-1}(y/x) + \tan^{-1}(M)\right]}$$

Thus $V_T$ can be expressed as $V_T = \Psi(x, \dot{x}, y, \dot{y}, \phi)$ where $\Psi(\cdot)$ is a nonlinear

function representing the relation of the tangential velocity to the states and the

heading estimate. The corresponding change in tangential velocity due to the

change in $x$, $y$, $\dot{x}$, $\dot{y}$ and heading is described by another nonlinear function of

$(\delta x, \delta\dot{x}, \delta y, \delta\dot{y}, \delta\phi)$, where $\delta\phi = \tan^{-1} M(k) - \tan^{-1} M(k-1)$ is the angular change

in the heading estimate. One can see that, without providing the heading change

estimate (i.e., $\delta\phi$), the correct scale for $V_T$ cannot be extracted by the network.
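
The scaling between range rate and tangential speed discussed above can be sketched as follows. This is only an illustration of the geometric relation; the function names and the cut-off `min_cos` (used to refuse the conversion near 90°, where the doppler shift vanishes) are assumptions introduced here.

```python
import math


def radial_velocity(v_t: float, angle_l_rad: float) -> float:
    """Range rate seen by the radar for tangential speed v_t and angle L."""
    return v_t * math.cos(angle_l_rad)


def tangential_velocity(range_rate: float, angle_l_rad: float,
                        min_cos: float = 1e-3) -> float:
    """Recover v_t from the range rate; undefined when L is close to 90 degrees."""
    cos_l = math.cos(angle_l_rad)
    if abs(cos_l) < min_cos:  # hypothetical guard, not from the source
        raise ValueError("angle L too close to 90 degrees: no doppler information")
    return range_rate / cos_l


for angle_deg in (0.0, 45.0, 60.0, 89.0):
    angle = math.radians(angle_deg)
    r_dot = radial_velocity(300.0, angle)
    print(f"L = {angle_deg:4.1f} deg: range rate = {r_dot:7.2f} m/s, "
          f"recovered v_t = {tangential_velocity(r_dot, angle):.1f} m/s")
```

The same range rate corresponds to very different tangential speeds depending on the angle, which is why the heading information has to accompany the doppler measurement.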

The variance of the doppler shift is related to the variance of the range rate

which is given by [43]

where

$$\sigma_{\dot{R}}^2 = \frac{\lambda \Delta f_s}{4\sqrt{\mathrm{SNR}}}$$

and $\Delta f_s$ is the doppler filter bandwidth. We shall assume that a doppler measurement

is available. It should be noted that we do not include it in the overall state

equations, but merely use the normalized measurement directly for generating a

neural network input. This ensures both that the input parameters have

independent sources and that the filter is kept simple. Thus, the third parameter

(neural network input) is independently measured and normalized with respect to

a worst-case clutter variation (e.g., a standard deviation of 30 m/s). We shall call this parameter

$\nu_2$, which is given by

(5.30)

where $\lambda$ is the wavelength of the transmitted wave.

5.6.2. Quantization of the Noise Process

For a precise description of the quantitative approach that we are taking in

the maneuver modeling, in this section we shall discuss how a numerical quantity

is computed to scale the covariance matrix of the input process noise. Consider

the target state equations

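$$x(k+1) = F x(k) + v(k)$$

$$z(k) = H x(k) + w(k), \qquad k = 0, 1, 2, \ldots$$

where $x = [x \;\; \dot{x} \;\; y \;\; \dot{y}]^T$,

$$z = \begin{bmatrix} x_m \\ y_m \end{bmatrix}, \qquad H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad F = \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix}$$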

and $v(k)$ and $w(k)$ are zero-mean white noise processes with covariance matrices

$Q$ and $R$. Hence

$$E[v(k)\, v^T(j)] = Q\,\delta(k,j)$$

and

$$v(k) = G u(k)$$

where $u(k)$ is the acceleration noise process which is what we are trying to estimate

with the neural network. That is, we estimate $\hat{u}(k-1)$ first and then recompute

$\hat{x}(k|k)$ based on the fact that usually a maneuver is detected at least two to three

samples later. This dictates that we go back in time to correct for the bias. The

quantitative approach in the literature, however, does it as follows [10]. With the

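noise matrix $G$ given by

$$G = \begin{bmatrix} \tfrac{1}{2}T^2 & 0 \\ T & 0 \\ 0 & \tfrac{1}{2}T^2 \\ 0 & T \end{bmatrix}$$

and

$$E[u(k)\,u(j)] = \sigma_u^2\,\delta_{kj},$$

we can write

where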

The scale factor q is the power spectrum of the process noise which is unknown.

It is important to realize that in contrast with our neural network approach, the

noise samples considered here are independent and identically distributed. The

covariance matrix scaled by q is given by

It is very difficult to select the scale factor $q$ off-line. As one can see, there

is no a priori information on $\sigma_u^2$. The quantity $q$ and its related parameters are

given by

$\tau_m \triangleq$ maneuver time constant

$\sigma_a^2 \triangleq$ variance of acceleration

It must be noted that $\tau_m$ and $a_{max}$ are unknown and are selected prior to

tracking. The situation for this approach becomes more complicated still when

the target performs a few short maneuvers with different time durations as well

as different acceleration levels. In such cases, Bogler's and Singer's methods will

result in a crude estimate of $\sigma_a$. Bogler used the Input Estimation (IE) method [72]

to estimate $\sigma_a$ (i.e., equation 2.19b). However, as we argued before, using several

samples (e.g., $N = 6$) for the estimation of $\sigma_a$ is appropriate only if $\tau_m \gg T$

(that is, if the duration of acceleration is much longer than one sampling period).

Typical values for T m are in the range of [5,200] seconds and the sampling period

for a track-while-scan radar is in the range of [5,10] seconds.

5.7. Generation of the Training Vectors

Based on an expected range of the target velocity we generated a series of

examples such that the longitudinal acceleration was included at different initial

velocities. The expected range of velocity was assumed to be 200m/s to 700m/s.

The expected range of acceleration was assumed to be zero to 20 m/s². Only

longitudinal acceleration was considered in the example set. It should be noted

that all three of the input parameters are incremental values which further prepare

the neural network for a sudden change in acceleration. The examples are based on

the maximum tolerance for the tracking error per scan. That is, the incremental

change in acceleration magnitude is 1 m/s². This corresponds to a 10 m/s error in

speed, which is typical of an error induced by clutter variations. To prepare for the

worst case, we assume a doppler uncertainty of 30 m/s in the velocity innovation

which results in a 15% error for the minimum velocity. The straight line trajectories

were generated for $0 \le \alpha \le 45°$ with an angular separation of one degree. It may

be recalled from the earlier discussions that the choice of $\sigma_{\delta\phi} = 1°$ is based on the

fact that in air traffic control systems the practical heading estimate transmitted

by the beacon system has a one degree uncertainty. Also, a turn rate of 3°/sec is

typical of a slow turn, and hence, as a rule of thumb we use 1/3 of this value for

the tolerance of the heading change estimate. The third parameter is computed

from equation (5.30).
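
The tolerances quoted in this section follow from simple arithmetic, sketched below. The scan period of 10 seconds is the value used in the simulations later in the chapter; the variable names are illustrative, and the heading list only enumerates the one-degree grid rather than the actual set of 800 training vectors.

```python
SCAN_PERIOD_S = 10.0             # sampling period used in the simulations
ACCEL_STEP_M_S2 = 1.0            # incremental change in acceleration magnitude
DOPPLER_UNCERTAINTY_M_S = 30.0   # worst-case doppler uncertainty
V_MIN_M_S, V_MAX_M_S = 200.0, 700.0

# One acceleration step sustained over one scan changes the speed by a * T.
speed_error_per_scan = ACCEL_STEP_M_S2 * SCAN_PERIOD_S

# Relative doppler error is largest at the minimum expected speed.
worst_relative_error = DOPPLER_UNCERTAINTY_M_S / V_MIN_M_S
best_relative_error = DOPPLER_UNCERTAINTY_M_S / V_MAX_M_S

# Straight-line headings from 0 to 45 degrees in one-degree steps.
headings_deg = list(range(0, 46))

print(speed_error_per_scan, worst_relative_error, best_relative_error)
print(len(headings_deg), "heading values")
```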

For a better doppler sensitivity, a shorter wavelength must be used. The

millimeter wave radar has a high doppler sensitivity, e.g., 233.3 Hz/m/s. That is,

for each 1 m/s change in the closing rate $\dot{R}$, the transmitted frequency will shift

233.3 Hz. This is a rather high sensitivity which can detect early changes in target

radial velocity. Therefore, one can see how important this factor is in the pattern

recognition of a maneuver.
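
The quoted sensitivity can be checked against the two-way doppler relation $f_d = 2\dot{R}/\lambda$. The sketch below assumes this standard monostatic-radar relation and uses the wavelength of 8.57 mm adopted for the simulations; the function name is illustrative.

```python
def doppler_sensitivity_hz_per_mps(wavelength_m: float) -> float:
    """Doppler shift in Hz per 1 m/s of range rate, from f_d = 2 * R_dot / lambda."""
    return 2.0 / wavelength_m


WAVELENGTH_M = 8.57e-3  # wavelength used in the simulations

sensitivity = doppler_sensitivity_hz_per_mps(WAVELENGTH_M)
print(f"sensitivity = {sensitivity:.1f} Hz per m/s")

# Doppler shift produced by the 30 m/s worst-case velocity uncertainty.
print(f"30 m/s corresponds to {30.0 * sensitivity / 1e3:.2f} kHz")
```

Halving the wavelength doubles the sensitivity, which is the sense in which a shorter wavelength gives earlier warning of a change in radial velocity.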

In the simulation results, we will demonstrate the performance of the proposed

neural network method in comparison with classical techniques that make

use of doppler returns and we will point out some interesting observations. For simulation

purposes, we used a wavelength which is commonly used in precision target

tracking systems, viz. $\lambda = 8.57 \times 10^{-3}$ m. The range of values for the position

innovation sequence can be approximated by the maximum size of the validation

gate [10] which corresponds to

where $n_z$ is the dimension of the measurement space (which is in this case 2) and

The parameter $\gamma$ corresponds to the 99% probability region, which is obtained from

the Chi-square distribution tables [14,91] and has the value 16.
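
For a two-dimensional measurement space, the Chi-square distribution with two degrees of freedom has the closed-form cumulative distribution $1 - e^{-\gamma/2}$, so the probability mass enclosed by a gate of size $\gamma$ can be evaluated directly instead of being read from tables. The sketch below assumes $n_z = 2$; the function name is illustrative, and the list of gate sizes reuses the values quoted later for the IE method.

```python
import math


def gate_probability_2d(gamma: float) -> float:
    """Probability mass inside a gate of size gamma for a 2-D measurement
    (Chi-square cumulative distribution with two degrees of freedom)."""
    return 1.0 - math.exp(-gamma / 2.0)


for gamma in (10.0, 16.0, 20.0, 50.0):
    print(f"gamma = {gamma:4.0f} -> gate probability = {gate_probability_2d(gamma):.6f}")
```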

Recall that the weighted sum of the innovation sequence has a Chi-square

distribution with the number of degrees of freedom equal to $n_z$. Also note that $\gamma$ is

a preselected value and is kept constant for most applications. The gate probability

$P_G$ is related to $\gamma$ and as shown in [10] is given by

$$P_G = 1 - e^{-\gamma/2} \tag{5.31}$$

where $P_G$ is the probability that the target is inside the gate. Then the probability

that the target is detected inside the gate is $P_D P_G$. The probability that

all other targets detected inside the gate are false targets (i.e., clutter returns) is

$1 - P_D P_G$. This choice for $\gamma$ (i.e., $\gamma = 16$) corresponds to a rather heavy clutter

environment. As the number of clutter returns increases in the validation gate,

the magnitude of the innovation increases and is further adjusted by the scale factor

$q_2(\lambda V_k, P_D)$ which is introduced by Bar-Shalom [10] and can be provided as a

look-up table. Therefore, the first input parameter lies somewhere in the range

$0.5 \le \nu_1(k) \le 2$. Note that the innovation sequence moves toward a smaller value

as the clutter increases; that is, less probability is assigned to each data point originating

from the target. As each data point falls further away from the predicted position, the

corresponding magnitude of the innovation increases but its Bayesian probability

of having originated from the target decreases. That is

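$$\beta_j(k) = e_j(k)\left[b(k) + \sum_{l=1}^{m(k)} e_l(k)\right]^{-1}, \qquad j = 1, \ldots, m \tag{5.32a}$$

where $\beta_j(k)$ is the probability that the measurement $z_j$ is from the target, and

$\beta_0(k)$ is the probability that all other measurements are false and is given by

$$\beta_0(k) = b(k)\left[b(k) + \sum_{j=1}^{m(k)} e_j(k)\right]^{-1} \tag{5.32b}$$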

where

and

(5.32c)

Recall that mk is the number of measurements that fall inside the gate (including

the false measurements from the clutter) and is given by

(5.33)
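
The normalization in equation (5.32b) can be sketched numerically. The sketch assumes that the per-measurement terms $e_j(k)$ and the clutter term $b(k)$ have already been computed from their definitions in (5.32c), and that the probability for measurement $j$ shares the same denominator, as in the standard probabilistic data association form; the function name and the sample numbers are hypothetical.

```python
import numpy as np


def association_probabilities(e, b):
    """Return (beta_0, beta_j) given the terms e_j(k) and the clutter term b(k).

    beta_0 is the probability that no validated measurement is from the target;
    beta_j is the probability that measurement j is from the target.
    """
    e = np.asarray(e, dtype=float)
    denom = b + e.sum()
    return b / denom, e / denom


# Hypothetical values: three validated measurements, the first one closest
# to the predicted position and therefore carrying the largest weight.
beta_0, beta = association_probabilities([2.0, 1.0, 1.0], b=1.0)
print(beta_0, beta, beta_0 + beta.sum())
```

Adding clutter returns to the gate enlarges the denominator, so the weight assigned to each individual measurement shrinks, which is the effect described in the text.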

We computed the average magnitude of the normalized innovation in the

absence of an acceleration (i.e., constant velocity) over 100 simulation runs with

50 samples per run as

$$e_\nu = \left(\frac{1}{100}\right)\left(\frac{1}{50}\right)\sum_{i=1}^{100}\sum_{k=1}^{50} e_\nu^{i}(k) \tag{5.34a}$$

where

(5.34b)

and

(5.34c)

such that $q_2$, as mentioned above, is the correction factor for the combined innovation.

We have to use the $q_2$ scale factor since we do not have the correct

innovation in PDAF (which was discussed in Section 2.9); rather, we have the

combined weighted innovation which results from the combination of all changes

due to each data point (both from the target and the clutter) inside the gate. This

is how the extraneous data (i.e., those from clutter returns) get normalized. When

the acceleration takes place, $e_\nu$ is allowed to rise as high as 16. If the maneuver

is not corrected by the time the innovation reaches this magnitude, the model fails

to compensate for the sudden acceleration input. We preset the value of $\gamma = 16$

for the neural network estimate but for the simulation of the Input Estimation

method we use a window of $N = 4$ with $\gamma_1 = 10$, $\gamma_2 = 16$, $\gamma_3 = 20$, and $\gamma_4 = 50$.

The validation gate for the Input Estimation (IE) method has to be kept large

enough to make sure that the target is still within the gate for at least $N$ sampling

periods after the occurrence of the maneuver. Corresponding to these values,

thresholds of 2.0, 2.5, and 2.7 were assigned. Note that in simulating the neural

network scheme we do not need to make these assumptions and there is only one

gate because $N = 1$. That is, once the maneuver is declared, the neural network

tries to compensate for the induced bias in just one sampling period. If the bias

still exists (as indicated by the parameters), it repeats the process. The ranges of

input values for all of the parameters are given below.

$0.2 \le M(k) \le 1.5$

$0.5 \le \nu_1(k) \le 2$

$0.0 \le \nu_2(k) \le 80$

and the output ranges are

$0.5 \le u_y \le 20.$

Since there are three input features, in each backward pass in the application

of the backpropagation algorithm, the sensitivity of the error in acceleration is

calculated with respect to each parameter. This new error is called the scaled local

error at each processing element in the output layer and is given by

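$$e_j^{(o)} = -\frac{\partial E}{\partial I_j^{(h)}} = -\frac{\partial E}{\partial \hat{u}_j}\,\frac{\partial \hat{u}_j}{\partial I_j^{(h)}}$$

$$= \left(u_j(k-1) - \hat{u}_j(k-1)\right) f'\!\left(I_j^{(h)}\right) \tag{5.35a}$$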

where the term $E$ represents the cumulative sum of the squared errors after each

sweep of the $N$ training examples and

$$f(z) = \frac{1}{1 + e^{-z}} \tag{5.35b}$$

represents the activation function for the nodes in the hidden layer. With this

function, the derivative $f'(z)$ used in equation (5.35a) becomes

$$f'(z) = f(z)\,(1 - f(z))$$

and

(5.35c)

where $I_j^{(h)}$ is the output of a node in the hidden layer, the superscripts $h$ and $h-1$

denote the hidden and input layers in that order, and $x_i^{(h-1)}$ is the input from node

$i$ of the input layer.
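
The derivative identity for the logistic activation can be verified numerically. The sketch below compares the closed form with a central finite difference; the helper `numerical_derivative` and its step size are illustrative choices.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z: float) -> float:
    """Closed-form derivative f'(z) = f(z) * (1 - f(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


def numerical_derivative(f, z: float, h: float = 1e-6) -> float:
    """Central finite-difference approximation of f'(z)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


for z in (-2.0, 0.0, 0.5, 3.0):
    print(z, sigmoid_derivative(z), numerical_derivative(sigmoid, z))
```

Because the derivative is expressed through the node output itself, the backward pass needs no additional function evaluations beyond those of the forward pass.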

The training algorithm can be summarized as follows.

1) Run Kalman Filter with the first set of initial conditions.

2) Generate the first example by accelerating the target with minimum step

size (e.g., 1 m/s²).

3) Calculate the three input parameters to the neural network (i.e., $\nu_1$, $\nu_2$, $\delta\phi$).

4) Forward pass to the neural network and calculate the error in acceleration.

5) Repeat the process until maximum error bound (e.g., 100m) is reached.

6) Backward pass the error to adjust the neural network weights.

7) Repeat steps 1-6 until the desired number of levels of acceleration, heading,

and initial velocities are generated.

8) Calculate the residual training error standard deviation $\sigma_{nn}$ and use it to

further adjust the Kalman filter covariance matrices.

Figs. 5.5 and 5.6 show how the trained neural network is used for adaptation to

target maneuver. The threshold c in Fig. 5.6 (also see Section 5.4) is set during the

training and depends on the required sensitivity of the maneuver detection scheme

(e.g., for $\sigma_R = 100$, $c = 1$ m/s²).

5.8. Neural Network Architecture and the Training Data

Two different neural networks were designed, one for the estimation of the

input acceleration and the other for information reduction on the innovation sequence,

referred to here as $NN_a$ and $NN_q$. While the training of $NN_a$ is more

challenging, training of $NN_q$ is a relatively simple task. The $NN_q$ simply stores a

scale factor $q_2$, which is a function of $\lambda V_k$ and $P_D$ and serves as a look-up table.

Use of the $NN_q$ provides an efficient way of storing the values of the $q_2$ factor which

is necessary for tracking in the presence of clutter and is a normalizing parameter

in the PDAF filter [10]. In the absence of clutter, the inputs to the $NN_a$ network

do not need to be scaled by $q_2$.

There are three input nodes in the maneuver modeling neural net architecture

(i.e., $NN_a$) and one hidden layer with 14 nodes. The activation function for

the nonlinear hidden nodes was selected as $f(z) = 1/(1 + e^{-z})$, while the output

nodes were chosen as linear. The starting learning rates for the hidden and output

layers were selected as 0.003 and 0.18, respectively. We employed the Generalized

Delta learning rule with momentum for adjusting the weights. For the total of 800

training vectors, the cumulative error was only 71.13. As mentioned before, after

the training is completed, we add this residual training error of the neural

network to the covariance matrix $Q$ of the Kalman filter as $\sigma_{nn}$ in equation (5.11).
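
A minimal sketch of a network with this shape is given below: three inputs, 14 logistic hidden nodes, a linear output, separate learning rates of 0.003 and 0.18, and a delta rule with momentum. Several details are assumptions made only for illustration and are not stated in the text: a single output node, a momentum coefficient of 0.5, small random initial weights, and inputs rescaled to the unit interval. The class name `ManeuverNet` is hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class ManeuverNet:
    """3-14-1 feedforward network trained with a delta rule plus momentum."""

    def __init__(self, n_in=3, n_hidden=14, n_out=1,
                 lr_hidden=0.003, lr_out=0.18, momentum=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights and zero biases (assumed initialization).
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.b2 = np.zeros(n_out)
        self.lr_hidden, self.lr_out, self.momentum = lr_hidden, lr_out, momentum
        self.params = [self.W1, self.b1, self.W2, self.b2]
        self.velocity = [np.zeros_like(p) for p in self.params]

    def forward(self, x):
        hidden = sigmoid(self.W1 @ x + self.b1)
        return hidden, self.W2 @ hidden + self.b2  # linear output node

    def train_step(self, x, target):
        hidden, output = self.forward(x)
        delta_out = target - output                       # linear node: f' = 1
        delta_hid = (self.W2.T @ delta_out) * hidden * (1.0 - hidden)
        steps = [self.lr_hidden * np.outer(delta_hid, x),
                 self.lr_hidden * delta_hid,
                 self.lr_out * np.outer(delta_out, hidden),
                 self.lr_out * delta_out]
        for i, (param, step) in enumerate(zip(self.params, steps)):
            self.velocity[i] = step + self.momentum * self.velocity[i]
            param += self.velocity[i]                     # in-place weight update
        return 0.5 * float(np.sum(delta_out ** 2))        # error before the update


net = ManeuverNet()
x = np.array([0.5, 0.5, 0.5])   # hypothetical inputs, rescaled to [0, 1]
target = np.array([1.0])        # hypothetical rescaled acceleration
print("squared error before update:", net.train_step(x, target))
print("output after one update:", net.forward(x)[1])
```

The rescaling of the inputs matters in this sketch: with unscaled inputs as large as 80 the hidden nodes would saturate, and the comparatively large output-layer learning rate could then overshoot.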

5.9. Performance Evaluation

In this section we demonstrate by illustrative examples that the performance

of the proposed neural network-based maneuver modeling scheme is superior to

that provided by the existing techniques, particularly in the event of a short-term

longitudinal acceleration. A short-term acceleration is one that has duration $\tau$

comparable to the sampling period $T$ (i.e., $\tau \approx T$). This is in contrast to the

conventional methods which usually require either $\tau \ll T$ or $\tau \gg T$ for giving a

good performance. When $\tau \ll T$, the acceleration is too short and a random noise

process modeling is usually adequate. Also, for the case when the maneuvering

period is much longer than the sampling period ($\tau \gg T$), a correlated noise process

such as the one generated in Singer's method [11,40] may be used to model the

maneuver and then to compensate for it. The more difficult case, which is where the

capabilities of the neural network-based maneuver modeling scheme are definitively

established, is when the acceleration is not short enough to be considered trivial

nor is it long enough to be correctly modeled by purely statistical methods.

Before describing the details of the various simulation experiments conducted,

we shall give an illustration of the effects of a sudden target acceleration on the

quantities used as inputs to the neural network.

1) It is important to note the combined effect of the input parameters $\nu_1$

and $\nu_2$ in the detection of a maneuver. As an example, an acceleration of 5 m/s²

was performed by a target at $t = 40$ seconds with a heading of 45° away from the

origin with an initial velocity of 300 m/s. Duration of this maneuver was 50 seconds

and the probability of false alarm was $P_{fa} = 0.00055$ with uniform clutter density.

Fig. 5.7 illustrates how the normalized velocity innovation changes according to

this acceleration. Note how $\bar{\nu}_2$ (i.e., the normalized velocity innovation) responds

one sampling instant later. The top part of the curve is not flat because of the

disturbance by clutter.

2) In Fig. 5.8, another example depicting a similar situation is shown

except that the acceleration was introduced at $t = 20$ seconds for only one sampling

instant. Note that the first peak is generated by the true acceleration while the

second peak is due to a clutter return. Recall that a clutter return in the validation

gate will cause an increase in the combined magnitude of the normalized position

innovation $\nu_1$. This means that we definitely need another indicator that will

neutralize the false second maneuver.

3) Using $\nu_1$ and $\nu_2$ together will reduce the effect of clutter because both

$\nu_1$ and $\nu_2$ are sensitive to the true maneuver. This example shows why the combined

use of $\nu_1$ and $\nu_2$ gives the neural network a sense for the maneuver intensity. A

harsh maneuver will result in a longer duration of $\nu_2$ and a larger peak for $\bar{\nu}_1$.

4) Fig. 5.9 illustrates the role of the maneuver indicators. The first maneuver

is performed at $t = 50$ seconds and lasts 10 seconds which is rather short (i.e.,

$\tau/T = 1$). Recall that we use a sampling period of 10 seconds which is typical for

track-while-scan radar systems. The second maneuver starts one sampling period

later, i.e., at $t = 60$ seconds, and lasts 30 seconds. Note that the $\nu_1$ parameter is

not responsive to the second maneuver until one to two scans later. Therefore, it

is the role of the second parameter (heading estimate) to provide a confirmation of

target direction and to declare that this sharp change in the position is not due to

a change of heading. With the present method, the maneuver will still be captured

one scan later and it will be compensated.

Experiment # 1

The first experiment involves a short duration acceleration which is small

in magnitude. For the purpose of comparison of the performance of the present

approach with that from an existing one, a sampling window of length $N = 2$ was

used for the Input Estimation method proposed by Bogler [72]. The target is

assumed to follow a straight path from the initial position of (100 m, 100 m) with

respect to the radar with an initial speed of 250 m/s and moves radially away from

the radar with a heading of 45°. The scan period was assumed to be $T = 10$

seconds and there was no clutter in this experiment. The standard deviation of the

measurement error was assumed to be dependent on the range with a maximum

of 100 meters (which is typical for radar measurements). Therefore, as the target

moves away from the radar, $\sigma_R$ increases to a maximum of 100 meters. For this case,

we considered five different values as the target moved away from the radar (e.g.,

5 m, 20 m, 40 m, 60 m, and 100 m). The azimuth standard deviation was assumed

to be $\sigma_\theta = 0.003$ radian. The size of the correlation gate was set to be $\gamma = 16$ for

the neural network. For the IE method, $\gamma_1 = 16$ and $\gamma_2 = 20$ were used for two

consecutive scans.

An acceleration of 5 m/s² was initiated at $t = 40$ seconds and lasted for 10

seconds which is one sampling period only. The simulation was averaged over 100

runs. A summary of the main input data for each simulation run is given in Table

5-1-1. Table 5-1-2 summarizes the results of the track statistics. The mean filtering

errors of target position and velocity for the neural network-based method (NN)

and for the IE method are shown in Figs. 5.10a-5.10c. The overall probability

of detection of the target along the path was assumed to be 100%. It may be noted

that the shorter track life* for the IE method is due to a small sampling window

of length N = 2 (i.e., two sets of measurement data). Also, the velocity error is

relatively very high for the IE method. This seems to be due to the short duration

of acceleration as discussed earlier. Note that we defined a track life to be complete

if the filter converged. Otherwise the track life was assumed incomplete as long as

the filter error was less than 250 meters. Other definitions may be used for both

methods.
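
The noise-free geometry of this experiment can be reproduced with a few lines, which is useful when checking a simulated track against the ground truth. The sketch assumes a purely longitudinal acceleration along the fixed 45° heading, constant speed outside the maneuver interval, and a target that keeps the higher speed once the acceleration ends; the function names are illustrative.

```python
import math


def along_track_distance(t, v0, accel, t_start, duration):
    """Distance travelled by time t with a constant acceleration applied
    over the interval [t_start, t_start + duration]."""
    t_acc = min(max(t - t_start, 0.0), duration)    # time spent accelerating so far
    t_after = max(t - t_start - duration, 0.0)      # time since the maneuver ended
    return v0 * t + 0.5 * accel * t_acc ** 2 + accel * duration * t_after


def target_position(t, x0=100.0, y0=100.0, heading_deg=45.0,
                    v0=250.0, accel=5.0, t_start=40.0, duration=10.0):
    """True (x, y) position in meters for the Experiment 1 scenario."""
    d = along_track_distance(t, v0, accel, t_start, duration)
    heading = math.radians(heading_deg)
    return x0 + d * math.cos(heading), y0 + d * math.sin(heading)


SCAN_PERIOD_S = 10.0
for k in range(8):
    t = k * SCAN_PERIOD_S
    x, y = target_position(t)
    print(f"t = {t:4.0f} s: x = {x:9.1f} m, y = {y:9.1f} m")
```

With these values the maneuver adds only 250 m of along-track displacement by the end of the accelerating scan, which illustrates how small the position signature of a one-scan acceleration is relative to a range uncertainty of up to 100 m.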

* Track life is defined as the number of consecutive sampling periods over which the target kinematic

parameters are estimated within a predefined accuracy [10].

Experiment # 2

In this experiment, the target starts a maneuver at t = 40 seconds with

an acceleration of 20 m/s². The target path is the same as that in the first run.

However, a clutter region is now present which extends between the 10th and 20th

scan. The $P_{fa}$ in this clutter region is assumed to be 0.6 in a gate of 5 km radius

around the predicted target position. The radar coverage is assumed to be 40 km,

and $P_{fa}$ in the clear region is set to 0.000001. The probability of detection is a

function of the signal-to-noise ratio. The measurement uncertainty for the range is

assumed to be dependent on the range value and range values of 5, 10, 15, 20, 25, 30

meters were considered. The use of lower values is facilitated by using a high resolution

radar (e.g., millimeter wave) with a high doppler sensitivity of 233 Hz/m/s.

Note that the IE method uses the position innovation only for the detection and

the correction of the maneuver and it makes no use of the doppler measurement.

In contrast, one of the inputs of the neural network-based model is a normalized

change in the doppler shift. It is well established in the literature [43,75,85] that

doppler information significantly enhances the tracking performance. However,

there are several problems with the way it is commonly used as an additional state

of the Kalman filter. These will be discussed in greater detail under Experiment

#6.

In the present experiment, the size of the correlation gate was kept the

same for the neural network (i.e., $\gamma = 16$ for all simulation runs) whereas for the

IE method we used a window of length $N = 4$ with $\gamma_1 = 10$, $\gamma_2 = 16$, $\gamma_3 = 20$,

and $\gamma_4 = 50$. The standard deviation of the radial velocity was assumed to be

3 m/s and the clutter data included the spread of the doppler clutter spectrum

of 30 m/s. A summary of the data used in this experiment is given in Table 5-2-1

and the performance is summarized in Table 5-2-2. The track statistics, as

summarized in Table 5-2-2, show that an average $P_d$ of 94.2% was achieved for

the target detection along the path and only 1.2% of the target data was rejected

for the neural network scheme. The clutter rejection was not as expected but

clutter data was reduced to 37% inside the correlation gate, which is still much

better than 65% in the gate for the other scheme. It must be emphasized that the

doppler shift, which is utilized in the third input parameter in the neural network

scheme, helps avoid a nonlinear filter which will result if the radial velocity is

used as an additional state variable. The percentage of the rejected target plots

was low for both filters but the mean track life for the neural network scheme was

again higher than that for the other scheme. A plot of the mean filtering error is

shown in Fig. 5.11 which clearly confirms the superior tracking performance of the

present scheme.

Experiment # 3

The trajectory is maintained to be the same as in Experiment 2. The

target initial position is again (x,y) = (100m, 100m) with an initial velocity of

200m/s. The first maneuver takes place at t = 60 seconds with an acceleration

input of 5m/s2 and lasts for one sampling period. The second maneuver takes

place at t = 90 seconds with an acceleration of 10 m/ s2 two sampling periods after

the first maneuver. A window of length N = 2 was used for the IE method. The

measurement uncertainty values are the same as those in Experiment 2. The results,

as summarized in Table 5-3, indicate that the mean track life is considerably less

for the IE method when compared to that of the neural network scheme. This

is due to the short interval between the two maneuvers and the fact that both

maneuvers have short durations. Fig. 5.12 illustrates the mean filtering errors for

both methods.

Experiment # 4

In this experiment the scenario is similar to that considered in the previous

experiment except that the duration of the second maneuver is longer and there is

only one sampling period of difference between the two maneuvers. As can be

seen from the data in Table 5-4, the mean track life has dropped slightly for the IE

method. This is because the second acceleration is longer in duration and hence

it is modeled better. The short interval between the two accelerations, however,

causes a degradation in the overall filter performance. The sharp peak at scan

25 is due to the bias that was not compensated for earlier at the time the first

acceleration took place. Therefore, it causes an increase in the mean error on top

of what is due to the first acceleration.

The IE method easily fails as the interval between the two accelerations is

reduced. In contrast, with the neural network scheme, both maneuvers are well

compensated since the first acceleration is compensated for even before the second

one is initiated, whereas with the IE method the second acceleration starts before

the first one is fully corrected. This situation arises due to the short interval

between the two maneuvers and that it takes longer for the IE method to do the

corrections. In the next experiment we will see how the tracking error of the IE

method increases without converging as the interval between the two accelerations

decreases to less than one sampling period. The mean filtering errors are depicted

in Fig. 5.13 which clearly demonstrates that the neural network scheme offers a

better performance.

Experiment # 5

This time we include a range measurement error of $\sigma_R = 100$ m. The acceleration

profile is shown in Fig. 5.14. The peak for the IE filter indicates that the

filter sees the two accelerations as just one incident, which occurs around t = 54

seconds. The IE filter detects the first maneuver right in the middle of the interval

during which the second maneuver is taking place. The second maneuver differs

from the first one in both magnitude and duration. All tracks

were lost in the 100 trials (i.e., filter errors were beyond 250 meters before the 15th

scan) for the IE filter. The neural network scheme, on the other hand, responded

more faithfully, as can be seen in Figs. 5.15a and 5.15b.
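
The track-loss criterion quoted above (a filter error beyond 250 meters) can be turned into a mean track life figure of the kind reported in Tables 5-1-2 through 5-4. The following is only a minimal sketch under stated assumptions: the function names, the array layout (one row per Monte Carlo trial, one column per scan) and the convention that track life is the number of scans completed before the first threshold crossing are all illustrative and are not taken from the original simulation.

```python
import numpy as np

def track_life(errors_m, loss_threshold_m=250.0):
    """Scans survived in each trial before the error first exceeds the threshold.

    errors_m: array of shape (n_trials, n_scans) holding position-error
    magnitudes in meters (assumed layout, not from the original text).
    """
    errors_m = np.asarray(errors_m, dtype=float)
    lost = errors_m > loss_threshold_m
    n_scans = errors_m.shape[1]
    # argmax gives the index of the first True in each row; a trial that
    # never crosses the threshold is credited with all n_scans scans.
    return np.where(lost.any(axis=1), lost.argmax(axis=1), n_scans)

def mean_track_life(errors_m, loss_threshold_m=250.0):
    """Average track life over all trials, in scans."""
    return float(track_life(errors_m, loss_threshold_m).mean())
```

With 100 trials of 20 scans each, a result of 16.0 from `mean_track_life` would correspond to an entry written as 16/20 in the performance tables.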

A close look at the acceleration profile and the neural network scheme reveals

that the neural network sees the two accelerations as one and they are both of a

short duration with a total time of 25 seconds. The neural network response time

is within approximately 40 seconds. The clutter effect is reduced considerably

due to the higher velocity and doppler property of the second input parameter.

This example illustrates that at higher velocities and for more sudden and short

accelerations, the performance of the neural network is superior to that of the

IE technique. A lower initial velocity of 200 m/s with otherwise similar

conditions was also tested; a larger and longer-lasting steady-state error was observed

for the neural network, together with some ringing. This shows that the

proposed neural network technique performs well in most conditions that

can be handled by the IE method.

As a summary of the overall performance, for short-duration maneuvers the

neural network scheme converges with a longer track life compared with the IE

method. The best performance from the neural network scheme is achieved under

the following conditions:

1) the clutter is uniform (which is what it has been trained for),

2) the acceleration is large in magnitude,

3) a sudden change in the acceleration profile occurs, and

4) the target initial velocity at the time of maneuver is higher than the maximum

clutter velocity.

Experiment # 6

In this experiment we perform a comparison of the neural network scheme

with a tracking algorithm that incorporates the radial velocity measurement. We

have already discussed the effect of the doppler information on the tracking accuracy

in the previous sections. In this example we give a more precise description

of the use of the doppler information which results in a nonlinear filter (i.e., the

Extended Kalman Filter). In general, the radial velocity information is used to

improve the tracking performance at the following stages:

1) Initialization;

2) Estimation of the track parameters;

3) Plot-to-track association in a dense environment.

The radial velocity information speeds up the initialization phase because

it requires only one plot to indicate the target speed instead of two or more plots

needed by the position measurements. A more accurate calculation of the track

parameters improves tracking in the sharp acceleration situations, which is the

primary objective in this chapter. Unfortunately, the radial velocity measurement

gives only limited information about the target velocity, and hence loses accuracy

as the target path deviates from the radial approach to the radar. With the neural

network model, however, we combine the target heading information so that the

relevance of the doppler information to the actual target speed is trained to the

neural network. Therefore, the performance of the neural network scheme does

not degrade severely as the target approaches the radar in a nonradial path. A

considerable improvement is achieved by the related tracking filter when $\dot{\rho}$ (i.e., the

radial velocity) is related to the target heading. Thus a neural network can play

a significant role in relating the actual target velocity (i.e., tangential velocity) to

the doppler information.
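
The geometric point made above can be illustrated numerically. The sketch below assumes a radar at the origin of a flat two-dimensional plane and uses the relation $\dot{\rho} = V \cos L$, where $L$ is taken here to be the angle between the velocity vector and the line of sight. The function names and the sign convention (positive range rate for a receding target) are assumptions made for illustration only.

```python
import numpy as np

def radial_velocity(position, velocity):
    """Range rate seen by a radar at the origin.

    This is the projection of the velocity vector on the line of sight;
    it is positive when the target is receding (assumed convention).
    """
    position = np.asarray(position, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    return float(position @ velocity / np.linalg.norm(position))

def speed_from_doppler(range_rate, aspect_angle_rad):
    """Recover the target speed from the range rate when the angle between
    the velocity vector and the line of sight is known."""
    return range_rate / np.cos(aspect_angle_rad)

# Target at (10 km, 10 km): the line of sight is at 45 degrees.
position = np.array([10e3, 10e3])
speed = 350.0
for heading_deg in (45.0, 20.0):
    heading = np.radians(heading_deg)
    velocity = speed * np.array([np.cos(heading), np.sin(heading)])
    print(heading_deg, radial_velocity(position, velocity))
```

For the 45 degree heading the full 350 m/s appears in the range rate, whereas for the 20 degree heading only the fraction cos(25°) of the speed is observed. Supplying the heading change alongside the doppler input is what allows the missing factor to be accounted for.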

Traditional filters that incorporate the doppler information lack this feature,

which implies that the doppler information is wasted in most trajectory patterns

that are different from the radial path. By including the doppler measurement, the

neural network not only eliminates the need for nonlinear Kalman filtering but

also makes more efficient use of this parameter. Once again, the reasons

that doppler information is usually not combined with the heading estimate in the

traditional filters are:

1) the computational constraints,

2) the longer delay to reach the steady state, and

3) the large coupling errors.

The dynamical equations for the Singer model with which the performance

of the neural network scheme will be compared are given in [40]. The performance

is evaluated in a clear region with the parameters

$\sigma_\rho = 150$ m, $\sigma_\theta = 0.003$ rad, $\sigma_{\dot{\rho}} = 22$ m/s.

The target initial position was (x, y) = (10 km, 10 km) with the initial velocity

of 350 m/s along a radial trajectory specified by $\alpha = 45^\circ$. Radar scan period

was assumed to be 5 seconds. The longitudinal acceleration started at t = 75

seconds with a magnitude of 20 m/s$^2$ which lasted 150 seconds. The expected

duration of the target acceleration was assumed to be 120 seconds for the Singer

model. The probability of target maximum acceleration for the Singer model was

set equal to 0.01 and the probability for uniform straight line motion was set equal

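to 0.9. A uniform false alarm probability of $P_{fa} = 0.000001$ was assumed for clutter

data. For the Singer model, the variance of acceleration was found according to

$$\sigma_a^2 = \frac{a_{\max}^2}{3}\,\left(1 + 4P_{\max} - P_0\right),$$

where $a_{\max}$ is the maximum acceleration, $P_{\max}$ is its probability, and $P_0$ is the

probability of uniform straight line motion. The size of the correlation gate (for the

Singer model) was set to 100.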
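
As a quick check of the numbers quoted above, the Singer acceleration variance can be evaluated directly. This is a sketch only: the value of $a_{\max}$ is not stated in the text, so the 20 m/s$^2$ used below is a placeholder assumption, and treating the 120 second expected maneuver duration as the Singer correlation time constant is likewise an assumption. The function names are illustrative.

```python
import math

def singer_accel_variance(a_max, p_max, p_zero):
    """Variance of the Singer acceleration process:
    sigma_a^2 = (a_max^2 / 3) * (1 + 4 * P_max - P_0)."""
    return (a_max ** 2 / 3.0) * (1.0 + 4.0 * p_max - p_zero)

def singer_correlation(scan_period_s, time_constant_s):
    """Correlation of the acceleration between two consecutive scans,
    exp(-T / tau), for an exponentially correlated process."""
    return math.exp(-scan_period_s / time_constant_s)

# P_max = 0.01 and P_0 = 0.9 are from the text; a_max is a placeholder.
variance = singer_accel_variance(a_max=20.0, p_max=0.01, p_zero=0.9)
rho = singer_correlation(scan_period_s=5.0, time_constant_s=120.0)
print(variance, rho)
```

With these probabilities the bracketed factor is 0.14, so the variance is a small fraction of $a_{\max}^2$, and with a 5 second scan period the scan-to-scan correlation is close to one. This is consistent with the remark later in the chapter that the Singer model suits long-duration accelerations.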

It may be noted that in the Singer model the processes of detection and esti-

mation of acceleration are done in two separate steps. For the maneuver detection

by the Singer model, we set a threshold of 2.6 for the innovation sequence. The

performance results are illustrated in Figs. 5.16-5.19 for the Singer filter with and

without doppler measurement and are compared to the performance resulting from

the proposed neural network scheme. Fig. 5.16 illustrates a significant improvement

in the tracking error when the doppler measurement is used. Note that this

improvement is achieved through a complex nonlinear processing of the doppler

measurement. Despite the computational complexity of the conventional nonlinear

Kalman filter using doppler measurement as an input feature, this parameter (i.e.,

the doppler shift) loses efficiency for a nonradial target path. Note that, as expected,

the proposed neural network scheme reflects the error a few samples faster

than the Singer model. This is because the neural network processes more

parameters for the detection of the maneuver.
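
The threshold test on the innovation sequence mentioned above can be sketched as follows. The text does not say exactly which statistic is compared with 2.6, so this example assumes that it is the innovation magnitude normalized by the innovation covariance; the function names are illustrative.

```python
import numpy as np

def normalized_innovation(innovation, innovation_cov):
    """Innovation magnitude in units of its own standard deviation,
    sqrt(nu^T S^-1 nu)."""
    nu = np.atleast_1d(np.asarray(innovation, dtype=float))
    S = np.atleast_2d(np.asarray(innovation_cov, dtype=float))
    return float(np.sqrt(nu @ np.linalg.solve(S, nu)))

def maneuver_detected(innovation, innovation_cov, threshold=2.6):
    """Declare a maneuver when the normalized innovation exceeds the threshold
    (2.6 is the value quoted in the text; the statistic is an assumption)."""
    return normalized_innovation(innovation, innovation_cov) > threshold

# Example with independent 100 m standard deviations on x and y.
S = np.diag([100.0 ** 2, 100.0 ** 2])
print(maneuver_detected([100.0, 100.0], S))  # about 1.41 sigma
print(maneuver_detected([300.0, 0.0], S))    # 3 sigma
```

A detector of this kind only flags the maneuver; the acceleration estimate is a separate step, which is the two-step structure noted above for the Singer model.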

The standard deviation of the error for the neural network, however, is slightly

higher than that for the Singer model. This is because the neural

network scheme is primarily designed to model short-term accelerations, whereas

the Singer model is particularly designed for accelerations with longer durations.

However, the improvement that is achieved by the neural network scheme is consistent

for nonradial trajectories as well. When the simulation was repeated for a

heading angle of $\alpha = 20^\circ$, the performance of the neural network

scheme degraded gracefully while the Singer model with the doppler measurement

resulted in larger errors. This is because $\dot{\rho}$ does not completely reflect the true change

in the target tangential velocity in the absence of the heading information in the

Singer model, whereas the target heading information is reflected in the third input

parameter supplied to the neural network. The results obtained are depicted in

Fig. 5.20.

This example illustrates that in general the neural network compensation

method can eliminate the need for a nonlinear extension of the Kalman filter.

Furthermore, use of additional parameters that can help in the maneuver pattern

recognition does not add to the computational time of the neural network, particularly

when the training is performed off-line. It should be mentioned that the

Singer model of acceleration is generally more appropriate for long-duration

accelerations due to the correlation between samples. Therefore, the performance of

the neural network model is almost comparable to this filter with a slight decrease

in the mean filtering error and a slight increase in the standard deviation of the

error. The standard deviation of the error as shown in Fig. 5.19 increases for the

later part of the track which indicates that the neural network model performs the

best for short-term accelerations. The Singer model, on the other hand, does not perform

satisfactorily for short-term accelerations [41]. Clearly, a model that does

not assume any correlation between the samples is more efficient for accelerations

of short duration. For the neural network model, we do not assume any correlation

among the samples. Such assumptions are only applicable when larger samples are

available. The neural network model for the maneuver detection and compensation

has one unique advantage over all classical models, which is its fast on-line response

and more efficient hardware implementation.

5.10. Conclusion

The primary objective of the research reported in this chapter was to develop

a neural network architecture that generates the required artificial noise signal in

order to compensate for the bias in the Kalman filter which is caused by a sudden

target acceleration. We introduced a hybrid approach to tracking a maneuvering

target in clutter by means of a neural network which was employed as an aid to

the Kalman filter. Furthermore, we showed that an efficient use of the inherent

parallelism of the neural networks can eliminate the need for several filters running

in parallel. We conclude that the following issues can be addressed by the neural

network-based model in a more efficient way if a hybrid approach is used:

a) Detector delay time for the maneuver pattern recognition,

b) quantization of the maneuver intensity,

c) coupling of the projected acceleration components,

d) maneuver classification and the choice of appropriate noise model, and

e) correlation coefficient of the samples in the maneuvering period as in the

Singer model.

Although the multilayer feedforward architecture may not be the best candidate

to replace the Kalman filter (due to its static processing nature), as we have shown

in this chapter, it can work as a highly efficient coprocessor with the Kalman filter.

Therefore, we propose the hybrid approach to replace the Kalman filter working

alone. We showed that a single neural network can replace several parallel filters

with an identical performance and perform better in cases of sharp discontinuities

in the acceleration profile. We chose the IE paradigm proposed by Bogler [72] to

generate the training vectors because we agree with Bogler and others [72,73] and

[78-82] that this method is the "optimal" approach for modeling sharp maneuvers.

Furthermore, we showed how the discontinuity in the target acceleration can be

modeled by the neural network. Although we used a one step backward estimation

of the acceleration input components (i.e., estimating $u_j(k-1)$), the approach

can be extended to more samples in the past. In other words, the CHP algorithm,

which is the basis for Bogler's model, can be more efficiently implemented with the

neural network approach.
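
The role of the estimated input in the hybrid scheme can be sketched with a one-dimensional constant-velocity Kalman filter. This is a simplified illustration and not the filter used in the experiments: the state is reduced to position and velocity along one axis, the neural network is replaced by a plain argument `u_prev` standing in for its output $u(k-1)$, and all names and numerical values are assumptions.

```python
import numpy as np

def kalman_step(x_est, P, z, T, r_var, q_var=0.0, u_prev=0.0):
    """One predict/update cycle of a 1-D constant-velocity Kalman filter.

    u_prev is the acceleration assumed to have acted over the last scan.
    In the hybrid scheme it would come from the neural network; here it
    is simply passed in (illustrative stand-in).
    """
    F = np.array([[1.0, T], [0.0, 1.0]])   # state transition
    G = np.array([0.5 * T ** 2, T])        # acceleration input vector
    H = np.array([[1.0, 0.0]])             # position-only measurement

    x_pred = F @ x_est + G * u_prev
    P_pred = F @ P @ F.T + q_var * np.outer(G, G)

    S = H @ P_pred @ H.T + r_var           # innovation covariance (1x1)
    K = P_pred @ H.T / S                   # Kalman gain (2x1)
    innovation = z - (H @ x_pred)[0]

    x_new = x_pred + K[:, 0] * innovation
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new, innovation

# Noise-free example: a target at 0 m moving at 100 m/s accelerates at
# 20 m/s^2 for one 10 s scan, so its true position becomes 2000 m.
x0, P0 = np.array([0.0, 100.0]), np.eye(2)
_, _, bias = kalman_step(x0, P0, z=2000.0, T=10.0, r_var=100.0 ** 2)
_, _, compensated = kalman_step(x0, P0, z=2000.0, T=10.0,
                                r_var=100.0 ** 2, u_prev=20.0)
print(bias, compensated)
```

Without the input term the innovation carries the full 1000 m bias caused by the acceleration; with the correct `u_prev` the bias vanishes in this noise-free case. Extending the estimate further into the past would mean supplying several such inputs rather than one.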

The use of a neural network in the present application provides the means

for a quantitative approach to maneuver modeling. It can serve as a fast compensator

for the bias in the innovation sequence which is caused by a sudden

acceleration. The neural network can also help to classify different types of maneuvers.

A problem with the neural network scheme that was designed here is that

it may not perform well when the target initial velocity is low. This drawback

is due to the fact that at lower velocities clutter data may be confused with the

target data. Since the traditional methods use a higher sampling period, clutter is

rejected more efficiently at lower velocities. Despite this disadvantage, the present

neural network-based approach offers significant performance benefits in modeling

the maneuvers for target tracking in clutter. In this chapter we have established

several advantages of this approach in tasks that the traditional methods do not

handle efficiently. In conclusion, the parallel distributed processing of neural

networks can remove many of the complexities of the current tracking algorithms

when used in conjunction with the Kalman filter.

Table 5-1-1 Summary of Data for Experiment # 1

Data for Radar Sensor
  Detection Probability                            100% at all ranges
  Probability of False Alarm                       0.000001
  Radar Scan Period                                10.00 seconds
  Standard deviation of range measurement          5, 20, 40, 60, 100 meters
  Standard deviation of azimuth measurement        0.003 radians

Clutter Data
  Probability of false data                        N/A
  Correlation coefficient                          N/A
  Size of clutter patch                            N/A

Tracking Filter Data
  Acceleration                                     5 m/s^2
  Duration of maneuver                             1 scan
  Size of normalized correlation gate (IE)         16, 20
  Size of normalized correlation gate (NN)         16 (fixed)

Table 5-1-2 Summary of Performance in Experiment # 1

      % Detection of target   % Detection out of   % Detection of clutter   Mean track life
      along the path          correlation gate     inside the gate          in # of scans
IE    100                     1.6                  N/A                      16/20
NN    100                     1.0                  N/A                      19/20

Table 5-2-1 Summary of Data for Experiment # 2

Data for Radar Sensor
  Detection Probability                            90% at 40 km
  Probability of False Alarm                       0.000001
  Radar Scan Period                                10.0 seconds
  Standard deviation of range measurement          5, 10, 15, 25, 30 m
  Standard deviation of azimuth measurement        0.003 radians

Clutter Data
  Probability of false data                        60%
  Correlation coefficient                          0.6
  Size of clutter patch                            5 km rectangular
  (centered around predicted position)

Tracking Filter Data
  Acceleration                                     20 m/s^2
  Duration of maneuver                             3 scans
  Size of normalized correlation gate (IE)         10, 16, 20, 50
  Size of normalized correlation gate (NN)         16 (fixed)

Table 5-2-2 Summary of Performance in Experiment # 2

      % Detection of target   % Detection out of   % Detection of clutter   Mean track life
      along the path          correlation gate     inside the gate          in # of scans
IE    95.5                    2.5                  65                       14/20
NN    94.2                    1.2                  37                       17/20

Table 5-3 Summary of Performance in Experiment # 3

      % Detection of target   % Detection out of   % Detection of clutter   Mean track life
      along the path          correlation gate     inside the gate          in # of scans
IE    82.4                    10.0                 65                       9/20
NN    93.2                    2.6                  39                       16/20

Table 5-4 Summary of Performance in Experiment # 4

      % Detection of target   % Detection out of   % Detection of clutter   Mean track life
      along the path          correlation gate     inside the gate          in # of scans
IE    85.0                    19.6                 63                       8/20
NN    90.2                    2.8                  41                       16/20

Fig. 5.1 Magill bank of N parallel filters. The magnitude of the innovation

sequence is discretized into N different values. This can be done more

efficiently with the neural network.

Fig. 5.2 Block diagram of the optimum bias detector. K(k) is the

Kalman filter gain, $\Phi$ is the transition matrix, H is the measurement transformation

matrix, y(k) is the measurement vector, and R(k) is the innovation sequence.

a) Neural network in the Kalman filter loop.

b) The inputs and outputs of the neural network. $\bar{\nu}_1$ is the combined position

innovation of both x and y co-ordinates, $\bar{\nu}_2$ is the normalized incremental doppler

shift, and $\delta h$ is the change in heading angle.

Fig. 5.3 The Neural Network Maneuver Detector can replace the bank

of parallel filters. It can perform the detection and identification of maneuver in

one step.

Fig. 5.4 $\dot{\rho} = V_T \cos L$ is the relation for the target radial velocity and

tangential velocity. We include the target heading information (i.e., $\delta h$) in the

input parameters in order to relate the incremental doppler shift $\bar{\nu}_2$ to the actual

change in target tangential velocity $V_T$.

Fig. 5.5 Adaptivity to target maneuver through neural network. Upon

detection of maneuver, the filter state $\hat{x}(k|k)$ is recalculated using the neural

network generated noise model $[u_x(k-1), u_y(k-1)]$.

Fig. 5.6 A flowchart of the adaptivity to target maneuver through

neural network. The output of the neural network is $\hat{u}_i(k-1)$. This is then used in

the new dynamical equations of the target for time k and $\hat{x}_{NN}(k|k)$ is calculated.

Fig. 5.7 The two maneuver indicators together can prevent divergence

of the validation gate. Note that the second input parameter (i.e., the velocity

innovation) stays constant for the actual duration of acceleration. Together they

reflect the intensity and duration of the acceleration.

Fig. 5.8 The increase in the magnitude of innovation is due to the

widening of the validation gate in order to capture the target after it is known to

have initiated a maneuver. A wide validation gate collects more clutter. The first

peak is due to the actual target maneuver whereas the second peak is due to

clutter data.

Fig. 5.9 Position and velocity maneuver indicators for a case with two

consecutive accelerations with $a_1 = 5$ m/s$^2$ and $a_2 = 20$ m/s$^2$. The duration of $a_1$

is 10 seconds whereas $a_2$ lasts 30 seconds.

Fig. 5.10a Neural Network and Input Estimation x co-ordinate position

errors. The acceleration is 5 m/s$^2$ which lasts for one sampling period T = 10

seconds. (Plot of error against sampling time; legend: IE, Neural Network.)

Fig. 5.10b Neural Network and Input Estimation y co-ordinate position

errors. (Plot of error against sampling time; legend: Neural Network, IE.)

Fig. 5.10c Neural Network and Input Estimation x co-ordinate velocity

errors. (Plot of error against sampling time; legend: Neural Network, IE.)

Fig. 5.11 Neural Network and Input Estimation x co-ordinate position

errors for an acceleration of a = 20 m/s$^2$ which lasts for $\tau$ = 30 seconds. Note the

ringing of the neural network error in the first cycle which is caused by the clutter

data. (Legend: IE, Neural Network.)

Fig. 5.12 Mean filtering errors for Neural Network and Input Estimation

methods for a variable acceleration profile. The first acceleration is $a_1 = 5$ m/s$^2$

which lasts for $\tau_1 = 10$ seconds. The second acceleration is $a_2 = 10$ m/s$^2$ with $\tau_2 = 10$

seconds. The sampling period is T = 10 seconds. (Plot of error against sampling

time; legend: IE, Neural Network.)

Fig. 5.13 Mean filtering errors for the Neural Network and Input Estimation

models when $a_1 = 5$ m/s$^2$ and $\tau_1 = 10$ seconds followed by a larger

acceleration of $a_2 = 20$ m/s$^2$ with duration $\tau_2 = 20$ seconds. (Plot of error against

sampling time; legend: IE, Neural Network.)

Fig. 5.14 The acceleration profile for simulation example 5. The first

acceleration takes place with $a_1 = 10$ m/s$^2$ and $\tau_1 = 5$ seconds. The second acceleration

starts 5 seconds later and lasts for 10 seconds. The time between the two

accelerations is only T/2. (Plot of acceleration against time in seconds.)

Fig. 5.15a The mean filtering error of the Input Estimation (IE) method

for the acceleration profile in Fig. 5.14. The IE method fails to compensate for

this profile since the sample size is too small for the noise estimate provided by

this method. (Plot of error against sampling time.)

Fig. 5.15b The Neural Network mean filtering error for the acceleration

profile in Fig. 5.14. Since the maneuvering period is always equal to one sampling

period for the Neural Network noise model, the Neural Network noise estimate is not

calculated statistically; rather, it is picked up through training. (Plot of error

against sampling time.)

Fig. 5.16 Mean filtering errors for the x co-ordinate position with and

without doppler processing for the Singer model. Doppler processing significantly

improves the tracking accuracy for radial trajectories. (Legend: with doppler,

without doppler.)

Fig. 5.17 Mean prediction errors for neural network and Singer models

in a radial trajectory. (Plot of error against time in seconds; legend: Singer

model, Neural Network model.)

Fig. 5.18 Mean filtering errors for Neural Network and Singer model in

a radial trajectory. (Plot of error against time in seconds; legend: Singer

model, Neural Network model.)

Fig. 5.19 The standard deviation of the filtering error for Neural Network

and Singer model. (Plot against time in seconds; legend: Singer model,

Neural Network model.)

Fig. 5.20 The mean filtering error for a non-radial trajectory. Traditional

methods lose doppler efficiency as the trajectory deviates from a radial path.

CHAPTER 6

SUMMARY, CONCLUSIONS & SUGGESTIONS

FOR FUTURE RESEARCH

6.1. Summary

The parallel processing capabilities of neural networks together with their

on-line mapping properties make neural networks a powerful means for various

radar signal processing applications. Estimation of the parameters which are

nonlinearly related to a set of received imprecise data has generally been a difficult

problem in the modeling and analysis of the stochastic processes which have been

used for radar signal analysis. A neural network provides a convenient tool for extracting

useful information from a set of noisy radar measurements. On the other

hand, classical methods such as Kalman filtering and other parameter estimation

techniques provide appropriate mechanisms for utilizing the correlation among the

random processes characterizing the radar data. A neural network-assisted filter

ing method which makes use of the available algorithms in radar applications can

hence remove many of the complexities in the analysis as well as the synthesis of existing

signal processing algorithms.

In comparison with currently available approaches to radar detection and

tracking, neural network methods have the ability to handle more information in

real time, which is of particular importance for on-line processing in most radar

applications. Furthermore, an exact mathematical modeling which is generally a

requirement for a programmed computing approach is not a prerequisite for a neural

network solution to these problems. The parallel processing method of neural

networks is significantly different from conventional parallel processing methods.

That is, while the mapping of complex algorithms to conventional parallel machines

is a difficult task and is often very inefficient, a neural network mapping, through

its parallel architecture, is fault tolerant and is more efficient. This feature allows

for a simultaneous processing of several statistically important parameters in radar

detection and tracking problems. In this dissertation we have focussed on designing

neural network-based methods for the detection and tracking of targets in clutter

environments. For this purpose, we have selected three of the major subsystems of

a complete tracking system.

In Chapter 1, we began with a definition of the problem of Multiple Target

Tracking (MTT) in clutter together with a brief discussion of neural networks and

their application to engineering problems. Some distinguishing features of neural

networks in comparison with those of current artificial intelligence systems were

also discussed. We also addressed the relation of neural networks to time series

analysis of radar signals. In Chapter 2 we briefly outlined some of the schemes

that are often discussed in the literature on target detection and tracking. We

also specifically described the mathematical preliminaries of the MTT subsystems.

Since there exists a vast and diverse number of methods in this area, we have limited

our discussion to the approaches which have received more significant attention

from researchers. Chapter 3 was devoted to the design of the Neural Network-based

Constant False Alarm Rate (NN-CFAR) processor and an evaluation of its superior

performance over the traditional auto-detection methods. A brief discussion

of optimal detection theory, which was the main source of the training examples,

was also presented in that chapter. The Neural Network implementation of a Moving

Target Indicator (NN-MTI) was presented in Chapter 4 where several neural

network structures were designed and analyzed for MTI applications. Finally, in

Chapter 5 the nonlinear mapping property of neural networks was used to implement

a hybrid maneuver detector and compensator. The performance of the neural

network-assisted Kalman filter was evaluated by comparing it with

the most powerful existing tracking algorithms for tracking a maneuvering target in

clutter. The principal contributions of the dissertation will be highlighted in the

next section.

6.2. Specific Contributions

This dissertation makes several specific contributions to current knowledge

in radar detection and tracking as well as to neural network applications to engineering

problems. Some of these contributions will be briefly highlighted in this

section.

A serious degradation in the detection probability of conventional Constant

False Alarm Rate (CFAR) processors used in the automatic detection of radar

targets results from a reduction in the number of available reference cells. Several

factors such as the radar system constraints (in terms of the resolution and sampling

time), the presence of interfering targets, and clutter patches in the vicinity of the

primary target, may contribute to the reduction in the number of reference cells.

In Chapter 3, we presented a novel neural network-based CFAR detection scheme

(referred to as NN-CFAR) that offers robust performance in the face of a loss in

the number of reference cells. This scheme employs a multilayer feedforward neural

network trained by a backpropagation approach using the optimal detector as the

teacher. The excellent pattern classification capabilities of trained neural networks

are exploited in this application to effectively counter the performance degradations

due to reduced reference window sizes. In particular, it was demonstrated that a

neural network implementation of the CFAR detection scheme provides an efficient

approach for accommodating more input parameters without increasing the design

complexity for countering the information loss due to reduced reference window

sizes.
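
The degradation that motivates the NN-CFAR design can be illustrated with the conventional cell-averaging CFAR itself. The sketch below is not the neural network scheme; it uses the standard closed-form results for a cell-averaging CFAR under assumed conditions (square-law detector, exponentially distributed noise power, Swerling I target fluctuation). The function names and the 20 dB signal-to-noise ratio are illustrative choices.

```python
def ca_cfar_scale(n_ref, pfa):
    """Threshold multiplier applied to the mean of n_ref reference cells
    so that the false alarm probability equals pfa."""
    return n_ref * (pfa ** (-1.0 / n_ref) - 1.0)

def ca_cfar_pd(n_ref, pfa, snr_linear):
    """Detection probability of a cell-averaging CFAR for a Swerling I
    target with the given (linear) signal-to-noise ratio."""
    alpha = ca_cfar_scale(n_ref, pfa)
    return (1.0 + alpha / (n_ref * (1.0 + snr_linear))) ** (-n_ref)

pfa = 1e-6                 # same false alarm probability as in the tables
snr = 10.0 ** (20.0 / 10)  # 20 dB, an arbitrary illustrative value
for n_ref in (32, 16, 8, 4):
    print(n_ref, round(ca_cfar_pd(n_ref, pfa, snr), 3))
```

Under these assumptions the detection probability falls from about 0.84 with 32 reference cells to about 0.35 with 4 cells, which is the kind of loss the neural network detector is intended to counter.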

The potential application of neural networks to the processing of radar pulses

for extraction of target radial velocity was demonstrated in Chapter 4. We designed

and analyzed several neural network architectures for Coherent Pulse Integration

(CPI) of noisy radar pulses. Some very important features of the neural network

design of a Moving Target Indicator (NN-MTI), such as the flexibility of nonuniform

sampling, shaping the MTI filter frequency response in the case of pulse

staggering, and enhancing the capability for varying the doppler filter bandwidths,

were discussed. Several training guidelines for radar pulse integration with neural

networks were outlined in this chapter. The principal feature of the neural network

in this aspect is the ability to correlate pulse amplitude distributions with the

modulation which is caused by the doppler effect due to target motion. Parallel

processing of the radar pulses gives the neural network more immunity to a misdetection

of the target in some instances where the probability of detection is less

than unity.
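
For reference, the conventional operation that the NN-MTI is meant to perform, coherent integration of a pulse train followed by doppler extraction, can be sketched with a plain FFT doppler filter bank. This is the classical approach and not the neural network design of Chapter 4; the pulse count, pulse repetition frequency, doppler shift and wavelength below are arbitrary illustrative values, and the function names are assumptions.

```python
import numpy as np

def doppler_filter_bank(pulses):
    """Coherently integrate a uniformly sampled pulse train with an FFT.
    Each output bin acts as one doppler filter."""
    return np.fft.fft(np.asarray(pulses, dtype=complex))

def doppler_to_radial_velocity(doppler_hz, wavelength_m):
    """Radial velocity corresponding to a two-way doppler shift."""
    return doppler_hz * wavelength_m / 2.0

n_pulses, prf_hz, doppler_hz = 16, 1000.0, 250.0
n = np.arange(n_pulses)
# Noise-free unit-amplitude returns whose phase rotates at the doppler rate.
pulses = np.exp(2j * np.pi * doppler_hz * n / prf_hz)

spectrum = doppler_filter_bank(pulses)
peak_bin = int(np.argmax(np.abs(spectrum)))
peak_doppler_hz = peak_bin * prf_hz / n_pulses
print(peak_bin, peak_doppler_hz,
      doppler_to_radial_velocity(peak_doppler_hz, wavelength_m=0.03))
```

All 16 pulses add in phase in the bin matched to the doppler shift, which gives the integration gain. The FFT bank, however, presumes uniform pulse spacing and fixed filter bandwidths, the two constraints that the nonuniform sampling and variable-bandwidth features listed above are meant to relax.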

A new approach to tracking a maneuvering target using a neural network-based

scheme was introduced in Chapter 5. The neural network models the target

maneuver and assists a Kalman filter in updating its gains in order to generate

correct estimates of the target position and the velocity. A performance evaluation

of the target tracking scheme is conducted under various interesting scenarios. The

parallel processing capabilities of trained neural nets are exploited in this application

for realistically handling more input features to correct for the bias induced by

the target maneuver. The synergistic functioning of a trained neural network with a

Kalman filter that provides estimates of the position and velocity of a maneuvering

target is the principal feature of the present approach. The feasibility of employing

neural nets for maneuver modeling and for updating the Kalman filter gains to

correct for the bias induced by target maneuvers is demonstrated through experiments

depicting several maneuver scenarios. The performance delivered by

the present scheme is shown to exceed that of existing approaches in these

scenarios, and additional performance evaluations in several other scenarios (tracking

in a cluttered environment, for instance) attest to the strength of this approach.

This work hence makes a useful contribution to the application of neural network

technology in the filtering and estimation areas. It should be emphasized that the

advantage of NN-CFAR, NN-MTI, and neural network maneuver modeling

over the conventional methods lies not only in computational efficiency but

also in simplicity of design and of hardware implementation.
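
As a rough illustration of how a maneuver model can work alongside a Kalman filter, the sketch below runs a one-dimensional constant-velocity filter in which an external model supplies an acceleration estimate at every cycle. This is only one simple way of wiring such a hybrid and is an assumption made here: the scheme of Chapter 5 may couple the network to the filter differently. The names `kalman_step`, `track` and `maneuver_model` are hypothetical, and the demonstration replaces the trained network with an oracle that returns the true acceleration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

State = Tuple[float, float]         # (position, velocity)
Cov = Tuple[float, float, float]    # symmetric 2x2 covariance as (p11, p12, p22)


def kalman_step(x: State, P: Cov, z: float, T: float, q: float, r: float,
                accel: float = 0.0) -> Tuple[State, Cov, float]:
    """One predict/update cycle; `accel` is an assumed external maneuver estimate."""
    pos, vel = x
    p11, p12, p22 = P
    # Predict: x = F x + G accel, with F = [[1, T], [0, 1]] and G = [T^2/2, T].
    pos_p = pos + T * vel + 0.5 * T * T * accel
    vel_p = vel + T * accel
    # Covariance prediction: P = F P F^T + q * G G^T.
    pp11 = p11 + 2.0 * T * p12 + T * T * p22 + q * T ** 4 / 4.0
    pp12 = p12 + T * p22 + q * T ** 3 / 2.0
    pp22 = p22 + q * T * T
    # Update with a position-only measurement, H = [1, 0].
    nu = z - pos_p                  # innovation
    s = pp11 + r                    # innovation variance
    k1, k2 = pp11 / s, pp12 / s     # Kalman gains
    x_new = (pos_p + k1 * nu, vel_p + k2 * nu)
    P_new = ((1.0 - k1) * pp11, (1.0 - k1) * pp12, pp22 - k2 * pp12)
    return x_new, P_new, nu


def track(measurements: Sequence[float], T: float, q: float, r: float,
          maneuver_model: Optional[Callable[[List[float]], float]] = None
          ) -> List[float]:
    """Return position estimates; the model sees the innovation history so far."""
    x: State = (0.0, 0.0)
    P: Cov = (10.0, 0.0, 10.0)
    innovations: List[float] = []
    estimates: List[float] = []
    for z in measurements:
        accel = maneuver_model(innovations) if maneuver_model else 0.0
        x, P, nu = kalman_step(x, P, z, T, q, r, accel)
        innovations.append(nu)
        estimates.append(x[0])
    return estimates


# Noise-free target accelerating at 2 m/s^2 (illustrative numbers only).
T, a_true = 1.0, 2.0
truth = [0.5 * a_true * (k * T) ** 2 for k in range(1, 21)]
plain = track(truth, T, q=0.01, r=1.0)
aided = track(truth, T, q=0.01, r=1.0, maneuver_model=lambda hist: a_true)
print(abs(plain[-1] - truth[-1]), abs(aided[-1] - truth[-1]))
```

The unaided filter lags the accelerating target, which is the maneuver-induced bias discussed above, while the aided filter removes it to the extent that the supplied acceleration is accurate.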

6.3. Directions for Further Research

We now outline a number of possible extensions to the studies presented in

this dissertation. Since the introduction of Kalman filter theory, there have been

many efforts by researchers to either augment or modify this filter to resolve two of

the major problems arising with target tracking. The first problem is the tracking of

multiple targets in clutter. The theory of probabilistic data association developed

by Bar-Shalom [3] has gained a solid reputation in the application of Kalman

filtering to tracking multiple targets in a cluttered environment. The second major

problem that has received much attention in target tracking research is the

modeling of target maneuvers in the presence of clutter. The primary approach to

maneuver modeling has been through the generation of a noise process produced

by an autoregressive model that uses past information. Based on this

approach, linear estimation theories have been used as the primary tools for the

modeling of target maneuvers. In light of classical linear regression analysis,

the neural network approach can be thought of as a nonlinear regression tool that

offers more flexibility in design.

In both of the problems which were mentioned above, on-line computation

is the primary limitation. We demonstrated how a multilayer feedforward neural

network can be used in the modeling of target maneuvers. A valuable extension

to this work is to make use of dynamical hidden layers so that a wider variety of

target maneuvers or clutter models can be accommodated. As an example, in every

model of target acceleration, a very critical parameter that has to be estimated

for an efficient use of the model is the maneuver time constant. Therefore, more

on-line processing is required since we need to integrate even more parameters to

correctly match the timing and the intensity of the acceleration.
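
The role of the maneuver time constant can be seen in the first-order autoregressive (Markov) acceleration model commonly associated with Singer [40]. The sketch below is a minimal illustration under that assumption; the symbols `tau` (time constant), `T` (sampling period) and `sigma_m` (acceleration standard deviation), as well as the function names, are chosen here and do not come from the dissertation.

```python
import math
from typing import Tuple


def singer_ar1(tau: float, T: float, sigma_m: float) -> Tuple[float, float]:
    """Return (rho, sigma_w) for the discrete model a[k+1] = rho * a[k] + w[k].

    rho = exp(-T / tau) is the one-step correlation, and sigma_w is the
    driving-noise standard deviation that keeps var(a) equal to sigma_m^2.
    """
    rho = math.exp(-T / tau)
    sigma_w = sigma_m * math.sqrt(1.0 - rho * rho)
    return rho, sigma_w


def predicted_accel(a_now: float, rho: float, steps: int) -> float:
    """Expected acceleration `steps` samples ahead, given the current value."""
    return a_now * rho ** steps


# A long time constant keeps the maneuver alive; a short one lets it die out.
rho_slow, _ = singer_ar1(tau=20.0, T=1.0, sigma_m=3.0)
rho_fast, _ = singer_ar1(tau=2.0, T=1.0, sigma_m=3.0)
print(predicted_accel(9.8, rho_slow, 5), predicted_accel(9.8, rho_fast, 5))
```

A mismatched time constant therefore mispredicts both how long and how strongly the acceleration persists, which is why it has to be estimated on line.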

The hybrid approach that was taken here can also be extended to the multiple

target tracking case such that a dynamical neural network assists the Kalman

filter in clustering the data at every cycle of data association. The Joint Probabilistic

Data Association Filter (JPDAF), which has been developed by Bar-Shalom [10],

has two major weaknesses. The first problem is due to the on-line computational

requirements, which limit the number of targets that can be considered, particularly

when some of the target tracks are closely crossing. The second problem

is the way the filter sees the targets. If we put a dynamical neural network in a

closed loop with the JPDA, we may be able to create more separability in order

to adaptively separate the feature maps of the target tracks and simplify the track

association problem in cases of crossing trajectories. While some researchers such

as Iltis [44] have introduced an approach to the first problem, which is simply a

way of implementing the JPDA with the Boltzmann machine, the second problem

has not been addressed in the literature as of the date of this dissertation.

Integration of the guidance and tracking subsystems through a neural network

is another interesting extension of this research. As an example, in a missile-target

interception problem, the true target acceleration has to be estimated and

then transferred to the guidance unit so that the next missile command can be calculated.

Depending on the geometry of the situation (e.g., head-to-head or tail-chase),

there are many nonlinearities involved in the estimation of the target and

missile accelerations. A trained neural network which classifies the type of engagement

may be considered in a closed loop with the Kalman filter. The maneuver

classification followed by a joint estimate of the missile-target accelerations can

bring about more reliable solutions to a variety of interesting scenarios. With

modifications, the same approaches can be applied to the design of robotic vision-guided

systems.

One possible extension of the NN-CFAR scheme is to consider the correlation

among the clutter samples. Radar detection in correlated clutter is still

a problem of considerable complexity. In the work reported in this dissertation,

we have assumed independent samples in each resolution cell both temporally and

spatially. A real target may occupy more than one resolution cell, which further

complicates the estimation of the clutter spectral parameters in the neighborhood

of the primary target. Note that this situation may arise for the primary target as

well as for the interfering targets. Once again, a neural network with dynamical

hidden layers may be used to readjust the size of the resolution cells or censor a

few cells occupied by the same target before clutter estimation is started.
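
The effect of censoring cells before clutter estimation can be sketched with a conventional cell-averaging CFAR test. The example below is a minimal illustration and not the NN-CFAR scheme: it assumes independent, exponentially distributed (square-law detected) reference cells, for which the scale factor N(Pfa^(-1/N) - 1) holds, and it reuses that factor after censoring, which is only approximate (the censored mean-level literature, e.g. [26], treats the scaling more carefully). The function name, parameters and power values are hypothetical.

```python
from typing import Sequence


def cfar_detect(power: Sequence[float], idx: int, n_ref: int, n_guard: int,
                pfa: float, n_censor: int = 0) -> bool:
    """Cell-averaging CFAR test on cell `idx`, optionally dropping the
    `n_censor` largest reference cells before estimating the clutter level."""
    lo = max(0, idx - n_guard - n_ref)
    hi = max(0, idx - n_guard)
    left = list(power[lo:hi])
    right = list(power[idx + n_guard + 1: idx + n_guard + 1 + n_ref])
    ref = sorted(left + right)
    kept = ref[:max(0, len(ref) - n_censor)]   # censor the largest samples
    if not kept:
        raise ValueError("no reference cells left after censoring")
    n = len(kept)
    # Scale factor for exponential clutter (approximate once cells are censored).
    alpha = n * (pfa ** (-1.0 / n) - 1.0)
    threshold = alpha * sum(kept) / n
    return power[idx] > threshold


# Flat unit-power clutter, a target in cell 10 and a strong interferer in cell 13.
cells = [1.0] * 20
cells[10] = 30.0
cells[13] = 200.0
print(cfar_detect(cells, 10, n_ref=4, n_guard=1, pfa=1e-3))              # False: target masked
print(cfar_detect(cells, 10, n_ref=4, n_guard=1, pfa=1e-3, n_censor=1))  # True: interferer censored
```

The same masking occurs when an extended target spills into its own reference window, which is the situation described above.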

NN-MTI processing can play an important role in future radar signal

processing methods. The parallel processing of the pulses allows more sophisticated

pulse coding and modulation techniques particularly for a coherent integration of

the pulses. We considered several multilayer neural network architectures that take

a series of pulses and respond with the target radial velocity. Although missing

pulses were taken into account, the spread in the clutter velocity was

not studied. As we include more pulses, we can make use of some coding techniques

such that the clutter is decoupled from the pulse sequence.
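
For comparison with a network that maps a pulse series to radial velocity, a conventional pulse-pair estimate can be written in a few lines. This is a standard linear baseline offered as a sketch, not one of the architectures studied here; the function names, the noise-free signal model and the numerical values are assumptions. The estimate is unambiguous only while the Doppler shift stays below half the pulse repetition frequency.

```python
import cmath
import math
from typing import List, Sequence


def simulate_pulses(velocity: float, pri: float, wavelength: float,
                    n_pulses: int) -> List[complex]:
    """Noise-free coherent returns from a target with constant radial velocity."""
    f_doppler = 2.0 * velocity / wavelength
    return [cmath.exp(2j * math.pi * f_doppler * k * pri) for k in range(n_pulses)]


def pulse_pair_velocity(pulses: Sequence[complex], pri: float,
                        wavelength: float) -> float:
    """Estimate radial velocity from the mean pulse-to-pulse phase advance."""
    acc = sum(b * a.conjugate() for a, b in zip(pulses, pulses[1:]))
    f_doppler = cmath.phase(acc) / (2.0 * math.pi * pri)
    return 0.5 * wavelength * f_doppler


# Assumed 3 cm wavelength, 10 kHz PRF and a 15 m/s target (1 kHz Doppler shift).
pulses = simulate_pulses(15.0, pri=1.0e-4, wavelength=0.03, n_pulses=16)
print(pulse_pair_velocity(pulses, pri=1.0e-4, wavelength=0.03))
```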

For more than three decades, FFT-based algorithms have been the major

tools for the frequency analysis of signals and systems. However, as discussed in

Chapter 4 of this dissertation, the FFT is limited to linear filter design and lacks the

flexibility required for multiple sensor applications. As an example,

future surveillance systems will be based on quite a number of different sensors

operating in various bands of the electromagnetic spectrum. Furthermore, these

sensors may need to be connected through complex networks with several distorting

factors such as nonlinearity of the communication channels and different false alarm

rates of the sensors. Therefore, new approaches to signal modeling are needed and

the parallel processing of the time domain and frequency domain representations of

pulses will be an interesting problem to investigate. In conclusion, neural networks

provide a novel approach to parallel computing which suits computationally

intensive problems such as target tracking in a cluttered environment.

REFERENCES

[1] S. S. Blackman, "Multi-Target Tracking With Radar Applications", Dedham, MA: Artech House, 1986.

[2] R. W. Sittler, "An Optimal Data Association Problem in Surveillance Theory", IEEE Trans. on Military Electronics, Vol. MIL-8, pp. 125-139, April 1964.

[3] Y. Bar-Shalom and E. Tse, "Tracking in a Cluttered Environment with Probabilistic Data Association", Proceedings of the 4th Symposium on Nonlinear Estimation, Sept. 1973.

[4] R. A. Singer and K. W. Behnke, "Real-Time Tracking Filter Evaluation and Selection For Tactical Applications", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-7, pp. 100-110, January 1971.

[5] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning Internal Representations by Error Propagation", Parallel Distributed Processing, D. Rumelhart and J. McClelland (Eds.), Vol. 1, MIT Press, Cambridge, MA, 1986.

[6] T. Kailath, "An Innovation Approach to Least-Square Estimation", IEEE Trans. on Automatic Control, Vol. AC-13, pp. 646-655, December 1968.

[7] P. Swerling, "Probability of Detection of Fluctuating Targets", IRE Trans. on Information Theory, Vol. IT-6, pp. 269-308, April 1960.

[8] M. Skolnik, "Introduction To Radar Systems", McGraw-Hill, Second Edition, Chapter 4, pp. 101-148, 1980.

[9] J. L. Eaves and E. K. Reedy, "Principles of Modern Radar", Van Nostrand Reinhold, 1987.

[10] Y. Bar-Shalom and T. E. Fortmann, Tracking and Data Association, Academic Press: San Diego, 1988.

[11] R. A. Singer and R. G. Sea, "New Results in Surveillance Systems Tracking and Data Correlation Performance in Dense Multitarget Environment", IEEE Trans. on Automatic Control, Vol. AC-18, pp. 571-581, December 1973.

[12] Krishna R. Pattipati, T. Kurien, R. T. Lee, and P. Luh, "On Mapping a Tracking Algorithm Onto Parallel Processors", IEEE Trans. on Aerospace and Electronic Systems, Vol. 26, No. 5, September 1990.

[13] Y. Bar-Shalom and K. Birmiwal, "Variable dimension filter for maneuvering target tracking", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-18, pp. 621-629, 1982.

[14] K. Birmiwal and Y. Bar-Shalom, "On Tracking a Maneuvering Target in Clutter", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-20, No. 5, September 1984.

[15] H. M. Finn, "Adaptive Detection in Clutter", Proc. Nat'l Electronics Conference, Vol. 22, pp. 562-567, 1966.

[16] H. M. Finn and R. S. Johnson, "Adaptive detection mode with threshold control as a function of sampled clutter level estimates", RCA Review, Vol. 29, pp. 414-464, 1968.

[17] R. Nitzberg, "Analysis of the arithmetic mean CFAR normalizer for fluctuating targets", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-14, pp. 44-47, 1978.

[18] G. B. Goldstein, "False Alarm Regulation in Log-normal and Weibull Clutter", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-9, pp. 84-92, Jan. 1973.

[19] A. Mahmoodi and M. K. Sundareshan, "An adaptive scheme for optimal target detection in variable clutter environment", Proc. 20th IEEE Conf. on Decision and Control, San Diego, CA, Dec. 1981.

[20] V. G. Hansen, "Constant false alarm rate processing in search radars", Proc. of IEEE 1973 International Radar Conf., London, pp. 325-332, 1973.

[21] V. G. Hansen and J. H. Sawyers, "Detectability loss due to greatest of selection in a cell averaging CFAR", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-16, pp. 115-118, 1980.

[22] G. V. Trunk, "Range resolution of targets using automatic detectors", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-14, pp. 750-755, 1978.

[23] M. Weiss, "Analysis of some modified cell-averaging CFAR processors in multiple target situations", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-18, pp. 102-113, 1982.

[24] H. Rohling, "New CFAR processor based on ordered statistic", Proc. of IEEE 1984 International Radar Conf., Paris, pp. 38-42, 1984.

[25] J. T. Rickard and G. M. Dillard, "Adaptive detection algorithm for multiple target situations", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-13, pp. 338-343, 1977.

[26] J. A. Ritcey, "Censored mean-level detector analysis", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-22, pp. 443-454, 1986.

[27] P. P. Gandhi and S. A. Kassam, "Analysis of CFAR processors in nonhomogeneous background", IEEE Trans. on Aerospace and Electronic Systems, Vol. 24, pp. 427-445, 1988.

[28] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning Internal Representations by Error Propagation", Parallel Distributed Processing, D. Rumelhart and J. McClelland (Eds.), Vol. 1, MIT Press, Cambridge, MA, 1986.

[29] R. P. Lippmann, "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, Vol. 4, pp. 4-22, April 1987.

[30] H. Rohling, "Radar CFAR thresholding in clutter and multiple target situations", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-19, pp. 608-621, 1983.

[31] K. Hornik, M. Stinchcombe and H. White, "Multilayer feedforward networks are universal approximators", Neural Networks, Vol. 2, pp. 359-366, 1989.

[32] G. Cybenko, "Continuous value neural networks with two hidden layers are sufficient", Math. Controls, Signals and Systems, Vol. 2, pp. 303-314, 1989.

[33] K. Funahashi, "On the approximate realization of continuous mappings by neural networks", Neural Networks, Vol. 2, pp. 183-192, 1989.

[34] S. I. Sudharsanan and M. K. Sundareshan, "Training of a recurrent neural network for nonlinear input-output mapping", Proc. 1991 Int. Joint Conf. on Neural Networks (IJCNN-91), Seattle, July 1991 (also to appear in International Journal of Neural Systems).

[35] Michal Tuszynski, "Adaptive MTI Filters For Uniform and Staggered Sampling", IEEE Trans. on Aerospace and Electronic Systems, Vol. 27, No. 5, September 1991.

[36] A. Farina and A. Protopapa, "New Results on Linear Prediction For Clutter Cancellation", IEEE Trans. on Aerospace and Electronic Systems, Vol. 24, No. 3, May 1988.

[37] R. J. Fitzgerald, "Development of Practical PDA Logic For Multitarget Tracking by Microprocessor", Proceedings of the American Control Conference, Seattle, Washington, pp. 889-898, 1986.

[38] A. Gelb, "Applied Optimal Estimation", Cambridge, MA: M.I.T. Press, 1974.

[39] S. R. Rogers, "Tracking Targets With Constant Heading and Variable Speed", IEEE Trans. on Aerospace and Electronic Systems, Vol. 26, No. 3, May 1990.

[40] R. A. Singer, "Estimating optimal tracking filter performance for manned maneuvering targets", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-6, No. 4, July 1970.

[41] P. L. Bogler, Radar Principles with Applications to Tracking Systems, Wiley: New York, 1990.

[42] R. J. McAulay and E. Denlinger, "A Decision-Directed Adaptive Tracker", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-9, No. 2, March 1973.

[43] A. Farina and F. A. Studer, "Radar Data Processing", Vol. 1 & 2, Letchworth: Research Studies Press, 1985.

[44] D. Sengupta and R. Iltis, "Neural solution to the multitarget tracking data association problem", IEEE Trans. on Aerospace and Electronic Systems, Vol. 25, pp. 96-108, 1989.

[45] P. Swerling, "Recent Developments in Target Models For Radar Detection", AGARD Avionics Technical Symposium in Advanced Radar Systems, Istanbul, Turkey, May 1970.

[46] F. Aluffi Pentini, A. Farina, and F. Zirilli, "Radar Detection of Targets Located in a Coherent K Distributed Clutter Background", IEE Proceedings F, Vol. 139, No. 3, June 1992.

[47] W. Stehwein and S. Haykin, "A Statistical Radar Clutter Classifier", IEEE Int. Radar Conference, Dallas, TX, March 1989.

[48] D. E. Schmieder and M. R. Weathersby, "Detection Performance in Clutter With Variable Resolution", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-19, No. 4, July 1983.

[49] Arie Berman and Amnon Hammer, "False Alarm Effects On Estimation in Multitarget Trackers", IEEE Trans. on Aerospace and Electronic Systems, Vol. 27, No. 4, July 1991.

[50] B. G. Boone and R. A. Steinberg, "Signal Processing For Missile Guidance: Prospects For The Future", Johns Hopkins Technical Digest, Vol. 9, No. 3, 1988.

[51] A. Farina and A. Russo, "Radar Detection of Correlated Targets in Clutter", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-22, No. 5, September 1986.

[52] G. A. Andrews, "Performance of a Cascaded MTI and Coherent Integration in a Clutter Environment", NRL Report 7533, March 1973.

[53] L. E. Brennan and I. S. Reed, "Optimum Processing of Unequally Spaced Radar Pulse Trains for Clutter Rejection", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-4, No. 3, May 1968.

[54] M. Kendall and A. Stuart, The Advanced Theory of Statistics, Vol. 2, Ch. 29, London: Griffin, 1969.

[55] F. William and M. Radant, "Airborne radar and the three PRFs", Microwave Journal, July 1983.

[56] E. Aronoff and N. Greenblatt, "Medium PRF radar design and performance", 20th Tri-Service Radar Symposium, 1974.

[57] D. C. Schleher, "Performance of MTI and coherent doppler processors", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-9, No. 2, March 1973.

[58] D. C. Schleher, "Performance comparison of MTI and coherent doppler processors", IEEE Int. Radar Conf., London, Nov. 1982.

[59] J. Marcum, "A statistical theory of target detection by pulsed radar", IRE Trans., Vol. IT-6, No. 2, April 1960.

[60] G. Dillard and J. Rickard, "Performance of an MTI followed by an incoherent integrator for nonfluctuating signals", IEEE Int. Radar Conf., Washington, DC, April 1980.

[61] F. Kretschmer, F. Lin, and B. Lewis, "A comparison of noncoherent and coherent MTI improvement factors", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-19, No. 3, May 1983.

[62] H. Ward and W. Shrader, "MTI performance degradation caused by limiting", IEEE EASCON, Washington, DC, Sept. 1968.

[63] G. Trunk, "MTI noise integration loss", NRL Rep. 8132, July 1977.

[64] G. Andrews, "Optimum radar doppler filtering techniques", NRL Rep. No. 7727, May 1974.

[65] G. Andrews, "Comparison of radar doppler filtering", NRL Rep. No. 7811, Oct. 1974.

[66] R. McAulay, "A theory of optimum moving target indicator (MTI) digital signal processing", Supplement 1, MIT Lincoln Laboratory Rep., Lexington, MA, Oct. 31, 1972.

[67] G. Andrews, "Performance of cascaded MTI with coherent integration filters in a clutter environment", IEEE Radar Conf., Washington, DC, April 1975.

[68] D. C. Schleher and Schulkind, "Optimization of nonrecursive MTI", IEE Int. Radar Conf., London, Oct. 1977.

[69] H. Thomas, N. Lutte, and M. Jelffs, "Design of filters with staggered PRF, a pole-zero approach", IEE Proc., Vol. 121, No. 12, Dec. 1974.

[70] R. Roy and O. Lowenschuss, "Design of MTI detection filters with nonuniform interpulse periods", IEEE Trans., Vol. CT, No. 4, Nov. 1970.

[71] P. Prinsen, "Elimination of blind velocities of MTI radar by modulating the interpulse period", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-9, No. 5, Sept. 1973.

[72] P. Bogler, "Tracking a maneuvering target using Input Estimation", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-23, No. 3, May 1987.

[73] Y. T. Chan, A. G. C. Hu, and J. B. Plant, "A Kalman Filter-based Tracking Scheme With Input Estimation", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-15, pp. 237-244, March 1979.

[74] R. L. Moose, "An Adaptive State Estimation Solution to the Maneuvering Target Problem", IEEE Trans. on Automatic Control, Vol. AC-20, pp. 359-362, June 1975.

[75] A. Lundulf and M. Minker, "Reliability of Velocity Measurement By MTD Radar", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-21, No. 4, July 1985.

[76] R. Duda and P. E. Hart, "Pattern Classification and Scene Analysis", John Wiley & Sons, New York, 1973.

[77] Y. Bar-Shalom, "Tracking Methods in a Multiobjective Environment", IEEE Trans. on Automatic Control, Vol. AC-23, pp. 618-626, August 1978.

[78] G. D. Bergland and C. F. Hunnicut, "Application of a Highly Parallel Processor to Radar Data Processing", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-8, pp. 162-162, March 1972.

[79] S. H. Bokhari, "On The Mapping Problem", IEEE Trans. on Computers, Vol. C-30, pp. 207-214, March 1981.

[80] H. Kasahara and S. Narita, "Practical Multiprocessor Scheduling Algorithms For Efficient Parallel Processing", IEEE Trans. on Computers, Vol. C-33, pp. 1023-1029, November 1984.

[81] R. Sethi, "Scheduling Graphs on Two Processors", SIAM Journal of Computing, pp. 73-82, 1975.

[82] M. R. Garey and D. S. Johnson, "Computers and Intractability: A Guide to the Theory of NP-Completeness", San Francisco: W. H. Freeman & Company, 1979.

[83] R. T. Lee, "Fault-Tolerant Algorithm Mapping Onto Parallel Computing Architectures", M.S. Thesis, Dept. of Electrical and Systems Engineering, University of Connecticut, Storrs, 1988.

[84] D. P. Atherton, "Tracking Multiple Targets Using Parallel Processing", IEE Proceedings, Vol. 137, No. 4, July 1990.

[85] A. Farina, A. Russo, F. A. Studer, "Advanced Models of Targets and Disturbances and Related Radar Signal Processors", IEEE International Radar Conference, 1985.

[86] M. J. Tsai, "Resolution of Closely Spaced Optical Targets Using MLE and MEM", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-18, No. 2, March 1982.

[87] J. A. Edward and M. M. Fitleson, "Notes on Maximum-Entropy Processing", IEEE Trans. on Information Theory, Vol. IT-19, pp. 232-234, March 1973.

[88] B. R. Frieden, "Restoring With Maximum Likelihood and Maximum Entropy", Journal of the Optical Society of America, Vol. 62, pp. 511-518, 1972.

[89] S. M. Kay, "Noise Compensation For Autoregressive Spectral Estimation", IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. ASSP-28, pp. 292-303, June 1980.

[90] G. S. Sandhu and A. V. Saylor, "A Real-Time Statistical Radar Target Model", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-21, No. 4, July 1985.

[91] A. Papoulis, "Probability, Random Variables and Stochastic Processes", New York: McGraw-Hill, 1965.

[92] J. H. Dunn and D. D. Howard, "Radar Target Amplitude, Angle and Scintillation From Analysis of the Echo Signal Propagating in Space", IEE Trans. on Microwave and Information Theory, Vol. MIT-16, pp. 715-728, September 1968.

[93] J. J. Hopfield and D. W. Tank, "Neural Computation of decision in optimization problems", Biological Cybernetics, 52, pp. 141-152, 1985.

[94] D. T. Magill, "Optimal adaptive estimation of sampled stochastic process", IEEE Trans. on Automatic Control, Vol. AC-10, pp. 434-439, 1965.

[95] G. A. Ackerson and K. S. Fu, "On state estimation in switching environments", IEEE Trans. on Automatic Control, Vol. AC-15, pp. 10-17, Feb. 1970.

[96] S. Salinger and Wangsness, "Target handling capacity of a phased array tracking radar", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-8, No. 1, pp. 43-50, Jan. 1972.

[97] C. Morefield, "Application of 0-1 integer programming to multitarget tracking problem", Proc. IEEE Conference on Decision and Control, pp. 428-433, Dec. 1975.

[98] C. Morefield, "Application of Bayesian Decision Theory to multitarget surveillance problems", NAECON, pp. 489-494, 1976.

[99] D. M. Klamer, "Non-parametric maneuver detection in Kalman filtering", IEEE Conf. on Decision and Control, New Orleans, pp. 544-548, Dec. 1977.

[100] H. L. Wiener, A. S. Distler and J. H. Kullback, "Operational and implementation problems of multitarget tracking problems", IEEE Conf. on Decision and Control, Fort Lauderdale, pp. 361-367, Dec. 1979.

[101] R. J. Fitzgerald, "Simple tracking filters: steady-state filtering and smoothing performance", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-16, No. 6, pp. 860-864, Nov. 1980.

[102] R. J. Fitzgerald, "Simple tracking filters: position and velocity measurements", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-18, No. 5, pp. 531-537, Nov. 1982.

[103] R. J. Fitzgerald, "Simple tracking filters: closed form solutions", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-17, No. 6, pp. 781-785, Nov. 1981.

[104] Y. T. Chan, J. B. Plant, J. R. T. Bottomley, "A Kalman tracker with a simple input estimator", IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-18, No. 2, pp. 235-241, March 1982.

[105] K. V. Ramachandra, "Position, velocity and acceleration estimates from noisy radar measurements", IEE Proc. Communication, Radar and Signal Processing, Vol., Part F, No. 2, pp. 167-168, April 1984.

[106] D. Lucas, K. Ekman and F. P. White, "The application of fuzzy pointers in multisensor/multitarget environment", IEEE Conf. on Decision and Control, San Diego, p. 1217, Jan. 1979.
