
ARTICLE IN PRESS

Computers in Biology and Medicine ( ) –
www.intlelsevierhealth.com/journals/cobm

Feature extraction from Doppler ultrasound signals for automated diagnostic systems

Elif Derya Übeyli (a), Inan Güler (b,∗)

(a) Department of Electrical and Electronics Engineering, Faculty of Engineering, TOBB Ekonomi ve Teknoloji Üniversitesi, Sögütözü, Ankara, Turkey

(b) Department of Electronics and Computer Education, Faculty of Technical Education, Gazi University, 06500 Teknikokullar, Ankara, Turkey

Received 1 December 2003; received in revised form 22 March 2004; accepted 4 June 2004

Abstract

This paper presents an assessment of feature extraction methods used in automated diagnosis of arterial diseases. Since classification is more accurate when the pattern is simplified through representation by important features, feature extraction and selection play an important role in classifying systems such as neural networks. Different feature extraction methods were used to obtain feature vectors from ophthalmic and internal carotid arterial Doppler signals. In addition, the problem of selecting relevant features among the features available for the purpose of classification of Doppler signals was dealt with. Multilayer perceptron neural networks (MLPNNs) with different inputs (feature vectors) were used for diagnosis of ophthalmic and internal carotid arterial diseases. The assessment of feature extraction methods was performed by considering the performances of the MLPNNs. The performances of the MLPNNs were evaluated by the convergence rates (number of training epochs) and the total classification accuracies. Finally, some conclusions were drawn concerning the efficiency of discrete wavelet transform as a feature extraction method used for the diagnosis of ophthalmic and internal carotid arterial diseases.
© 2004 Elsevier Ltd. All rights reserved.

Keywords: Feature extraction; Automated diagnosis; Doppler signal; Discrete wavelet transform; Ophthalmic artery; Internal carotid artery

∗ Corresponding author. Tel.: +90-312-212-3976; fax: +90-312-212-0059. E-mail address: [email protected] (I. Güler).

0010-4825/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.compbiomed.2004.06.006


[Figure 1: block diagram. Raw Doppler signal → Preprocessing → Feature Extraction, giving x = {x1, x2, ..., xn} → Feature Selection, giving x′ = {x1, x2, ..., xm}, where m < n → Classification → Output.]

Fig. 1. Functional modules in a typical automated diagnostic system used for arterial diseases.

1. Introduction

Medical diagnostic decision support systems have become an established component of medical technology. The main concept of the medical technology is an inductive engine that learns the decision characteristics of the diseases and can then be used to diagnose future patients with uncertain disease states. A number of quantitative models including linear discriminant analysis, logistic regression, k nearest neighbor, kernel density, recursive partitioning, and neural networks are being used in medical diagnostic support systems to assist human decision-makers in disease diagnosis. Neural networks have been used in a great number of medical diagnostic decision support system applications because of the belief that they have greater predictive power. Unfortunately, there is no theory available to guide an intelligent choice of model based on the complexity of the diagnostic task. In most situations, developers simply pick a single model that yields satisfactory results, or they benchmark a small subset of models with cross validation estimates on test sets [1–3].

Various methodologies of automated diagnosis have been adopted; however, the entire process can generally be subdivided into a number of disjoint processing modules: preprocessing, feature extraction/selection, and classification (Fig. 1). Signal/image acquisition, artifact removal, averaging, thresholding, signal/image enhancement and edge detection are the main operations in the course of preprocessing. The accuracy of signal/image acquisition is of great importance since it contributes significantly to the overall classification result. The markers are subsequently processed by the feature extraction module. The feature selection module is an optional stage in which the feature vector is reduced in size to include only what may be considered, from the classification viewpoint, the most relevant features required for discrimination. The classification module is the final stage in automated diagnosis. It examines the input feature vector and, based on its algorithmic nature, produces a suggestive hypothesis [3–6].


Feature extraction is the determination of a feature or a feature vector from a pattern vector. For pattern processing problems to be tractable, patterns must be converted to features, which are condensed representations of patterns, ideally containing only salient information. Feature extraction methods are subdivided into: (1) statistical characteristics and (2) syntactic descriptions. Feature selection provides a means for choosing the features which are best for classification, based on various criteria. The feature selection process is performed on a set of predetermined features. Features are selected based on either (1) best representation of a given class of signals, or (2) best distinction between classes. Therefore, feature selection plays an important role in classifying systems such as neural networks. For classification problems, the classifying system has usually been implemented with rules using if-then clauses, which state the conditions of certain attributes and the resulting rules. However, this has proven to be a difficult and time consuming method. From the viewpoint of managing large quantities of data, it would still be most useful if irrelevant or redundant attributes could be segregated from relevant and important ones, although the exact governing rules may not be known. In this case, the process of extracting useful information from a large data set can be greatly facilitated [4,6].

There are numerous methods to represent patterns as a grouping of features. The choice of methods appropriate for a given pattern analysis task is rarely obvious. At each level (feature extraction, feature selection, classification) many methods exist. Conventional methods of monitoring and diagnosing arterial diseases rely on detecting the presence of particular Doppler signal features by a human observer [6–11].

To date, the literature contains no work assessing the feature extraction methods used in automated diagnosis of arterial diseases, nor a detailed comparative documentation of them. In the present study, feature extraction from ophthalmic and internal carotid arterial Doppler signals for diagnosis of arterial diseases was examined. In addition, the problem of selecting relevant features among the features available for the purpose of classification of Doppler signals was dealt with. Multilayer perceptron neural networks (MLPNNs) with different inputs (feature vectors) were used for diagnosis of ophthalmic and internal carotid arterial diseases. The assessment of feature extraction methods was performed by considering the performances of the MLPNNs. The performances of the MLPNNs were evaluated by the convergence rates (number of training epochs) and the total classification accuracies. Finally, some conclusions were drawn concerning the impacts of feature extraction methods used for the diagnosis of ophthalmic and internal carotid arterial diseases.

2. Feature extraction from Doppler signals

The Doppler shift signal contains a wealth of information about blood flow occurring within the sample volume of the Doppler ultrasonography. The most complete way to display this information is to perform spectral analysis and present the results in the form of a sonogram, which shows the variation in the shape of the Doppler power spectrum as a function of time. Sonograms show the periodic heartbeats, and within each beat it is possible to visualize the systolic and diastolic flow as the heart contracts and then relaxes. In a sonogram, the horizontal axis ($t$) represents time, the vertical axis ($f$) frequency, and the gray level intensity at coordinates ($t$, $f$) denotes signal power at frequency $f$ and time instant $t$. The darker the gray level at coordinates ($t$, $f$), the higher the power of the frequency component $f$ measured at time instant $t$. By monitoring the sonogram, variation of the spectral properties of the Doppler signal and a number of quantities related to the blood flow can easily be tracked [6,10–15].



[Figure 2: Doppler waveform. Horizontal axis: Time (s); vertical axis: Doppler shift frequency (Hz). Annotations: S, D, M, T, ts, S/2, A, fmean, fmax.]

Fig. 2. Diagram illustrating the variables involved in the definitions of RI, PI, S/D ratio, SBI, constant flow ratio, and height width index. fmax is the maximum frequency at peak systole, fmean the mean frequency at peak systole, S the systolic peak, D the end diastolic height, M the mean height of the Doppler waveform, T the length of the Doppler waveform, A the area under the curve, and ts the duration of the systolic peak.

2.1. Feature extraction from Doppler power spectra and sonograms

The Doppler signal is conventionally interpreted by analyzing its spectral content. Diagnosis and disease monitoring are assessed by analysis of Doppler power spectra and sonograms. The Doppler power spectrum has a shape similar to the histogram of the blood velocities within the sample volume, and thus spectral analysis of the Doppler signal produces information concerning the velocity distribution in the artery [6,10–15]. A number of parameters related to the blood flow may be extracted from the sonograms and these are of high clinical value. A decision is then made as to whether such a feature vector is obtained from a normal or an abnormal artery. The resistivity index (RI) and the pulsatility index (PI), both derived from the sonograms, are widely used for the evaluation of Doppler sonograms. Both the RI and PI are reflections of the resistance to flow, downstream from the point of insonation. They are influenced by many factors including proximal stenosis, distal stenosis and peripheral resistance. RI and PI are defined as

$$\mathrm{RI} = (S - D)/S, \tag{1}$$

$$\mathrm{PI} = (S - D)/M, \tag{2}$$

where $S$ is the maximum systolic height, $D$ is the end diastolic height and $M$ is the mean height of the Doppler waveform (Fig. 2) [6,10,11].

The S/D ratio is one of the indices obtained from the Doppler waveforms and is defined as

$$S/D = 1/(1 - \mathrm{RI}), \tag{3}$$

where $S$ is the maximum systolic height and $D$ is the end diastolic height of the Doppler waveform (Fig. 2) [6].

One of the parameters derived from the Doppler waveforms is the spectral broadening index (SBI), which is used for quantification of stenosis severity in arteries. The SBI is defined as

$$\mathrm{SBI} = \frac{f_{\max} - f_{\mathrm{mean}}}{f_{\max}}, \tag{4}$$


where $f_{\max}$ is the maximum frequency at peak systole and $f_{\mathrm{mean}}$ is the mean frequency at peak systole (Fig. 2) [6,14].

The constant flow ratio (CFR) is defined as

$$\mathrm{CFR} = DT/A, \tag{5}$$

where $D$ is the end diastolic height of the Doppler waveform, $T$ is the length of the Doppler waveform and $A$ is the area under the curve (Fig. 2) [6].

The height width index (HWI) combines information about both the pulsatility of the Doppler waveform and the time relationships within the Doppler waveform. Doppler waveforms recorded distal to arterial stenosis have relatively small pulsatile components and relatively wide systolic peaks, and both of these serve to decrease the value of HWI, which is defined as

$$\mathrm{HWI} = \mathrm{PI}\,(T/t_s), \tag{6}$$

where $T$ is the length of the Doppler waveform and $t_s$ is the duration of the systolic peak (Fig. 2) [6].

The power spectral density (PSD) estimates and sonograms of the Doppler signals are obtained by spectral analysis methods. Therefore, a brief description of classical (fast Fourier transform), model-based (autoregressive, moving average, autoregressive moving average methods) and time-frequency analysis (short-time Fourier transform, wavelet transform) methods used for obtaining Doppler power spectra and sonograms is given.
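As an illustration only, the waveform indices of Eqs. (1)–(6) can be sketched in Python for a single cardiac cycle. The function name `doppler_indices`, its arguments, the use of the last sample as the end diastolic height and the rectangle-rule area are assumptions made for this sketch; they are not taken from the study.

```python
import numpy as np

def doppler_indices(envelope, dt, f_max, f_mean, t_s):
    """Waveform indices of Eqs. (1)-(6) for one cardiac cycle.

    envelope      : sampled Doppler waveform over one cycle (Hz)
    dt            : sampling interval of the waveform (s)
    f_max, f_mean : maximum and mean frequency at peak systole (Hz)
    t_s           : duration of the systolic peak (s)
    """
    envelope = np.asarray(envelope, dtype=float)
    S = envelope.max()        # maximum systolic height
    D = envelope[-1]          # end diastolic height (assumed: last sample)
    M = envelope.mean()       # mean height of the waveform
    T = dt * len(envelope)    # length of the waveform
    A = envelope.sum() * dt   # area under the curve (rectangle rule)
    RI = (S - D) / S
    PI = (S - D) / M
    return {
        "RI": RI,
        "PI": PI,
        "S/D": 1.0 / (1.0 - RI),
        "SBI": (f_max - f_mean) / f_max,
        "CFR": D * T / A,
        "HWI": PI * (T / t_s),
    }
```

With the rectangle-rule area used here, A equals M·T, so the CFR reduces to D/M; a different integration rule would change this slightly.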

2.1.1. Fast Fourier transform method

The fast Fourier transform (FFT)-based methods, such as the Welch method, are defined as classical methods. The Welch spectral estimator can be computed efficiently via the FFT and is one of the most frequently used PSD estimation methods. In the Welch method, signals are divided into overlapping segments, each data segment is windowed, periodograms are calculated and then the average of the periodograms is found. Let $\{x_l(n)\}$, $l = 1, \ldots, K$, be the signal intervals, each of length $M$. The Welch spectral estimator is defined as

$$P_l(f) = \frac{1}{M}\,\frac{1}{P}\left|\sum_{n=1}^{M} v(n)\,x_l(n)\exp(-j2\pi f n)\right|^2 \quad \text{and} \quad P_W(f) = \frac{1}{K}\sum_{l=1}^{K} P_l(f), \tag{7}$$

where $P_l(f)$ is the periodogram estimate of each signal interval, $v(n)$ is the data window, $P$ is the average power of $v(n)$ given as $P = (1/M)\sum_{n=1}^{M} |v(n)|^2$, $P_W(f)$ is the Welch PSD estimate, $M$ is the length of each signal interval and $K$ is the number of signal intervals [13,15–17].
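A minimal NumPy sketch of Eq. (7) is given below for illustration. The Hann window, the 50% overlap, the segment length and the function name `welch_psd` are assumptions of this sketch; the one-sided spectrum is returned without any additional scaling beyond the factors that appear in Eq. (7).

```python
import numpy as np

def welch_psd(x, M=256, overlap=0.5, fs=1.0):
    """Welch PSD estimate following Eq. (7).

    x       : real-valued signal
    M       : length of each signal interval
    overlap : fraction of overlap between consecutive intervals
    fs      : sampling frequency, used only to label the frequency axis
    """
    x = np.asarray(x, dtype=float)
    v = np.hanning(M)                      # data window v(n) (assumed Hann)
    P = np.mean(np.abs(v) ** 2)            # average power of the window
    step = max(1, int(M * (1.0 - overlap)))
    starts = range(0, len(x) - M + 1, step)
    # Periodogram P_l(f) of every windowed interval
    periodograms = [
        np.abs(np.fft.rfft(v * x[s:s + M])) ** 2 / (M * P) for s in starts
    ]
    freqs = np.fft.rfftfreq(M, d=1.0 / fs)
    # Average over the K intervals
    return freqs, np.mean(periodograms, axis=0)
```

For a sinusoid, the peak of the returned estimate falls in the FFT bin nearest to the sinusoid frequency, with a resolution of fs/M.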

2.1.2. Autoregressive method

The autoregressive (AR) method is the most frequently used model-based method because estimation of the AR parameters can be done easily by solving linear equations. The AR parameters can be estimated via different estimation methods such as the Burg method. The Burg method for estimating the AR parameters is computationally efficient and yields a stable AR model. The Burg AR method is based on the minimization of the forward and backward prediction errors and estimation of the reflection coefficient.



From the estimates of the $p$th-order Burg AR parameters, the PSD estimate is formed as

$$P_{\mathrm{BURG}}(f) = \frac{e_p}{\left|1 + \sum_{k=1}^{p} a_p(k)\,e^{-j2\pi f k}\right|^2}, \tag{8}$$

where $e_p = e_{f,p} + e_{b,p}$ is the total least squares error [13,15–17].
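The Burg recursion and the PSD of Eq. (8) can be sketched as follows. This is a textbook form of the algorithm written for illustration, not the code used in the study; the function names `burg_ar` and `burg_psd` are hypothetical, frequencies are normalized (cycles per sample), and the returned error is the unnormalized sum of the final forward and backward squared errors, so the absolute scale of the spectrum is arbitrary.

```python
import numpy as np

def burg_ar(x, p):
    """Burg estimate of AR parameters a(1..p) and total squared error e_p.

    Model convention: x(n) + sum_k a(k) x(n-k) = w(n), matching Eq. (8).
    """
    f = np.asarray(x, dtype=float).copy()   # forward prediction error
    b = f.copy()                            # backward prediction error
    a = np.array([1.0])
    for _ in range(p):
        ff, bb = f[1:], b[:-1]
        # Reflection coefficient minimizing forward + backward error
        k = -2.0 * np.dot(ff, bb) / (np.dot(ff, ff) + np.dot(bb, bb))
        f, b = ff + k * bb, bb + k * ff
        # Levinson-type update of the AR polynomial
        a_ext = np.concatenate([a, [0.0]])
        a = a_ext + k * a_ext[::-1]
    e_p = np.dot(f, f) + np.dot(b, b)       # e_{f,p} + e_{b,p}
    return a[1:], e_p

def burg_psd(x, p, freqs):
    """Evaluate Eq. (8) at normalized frequencies `freqs`."""
    a, e_p = burg_ar(x, p)
    k = np.arange(1, p + 1)
    A = 1.0 + np.exp(-2j * np.pi * np.outer(freqs, k)) @ a
    return e_p / np.abs(A) ** 2
```

Because each reflection coefficient has magnitude below one, the estimated AR model is stable, which is the property noted in the text.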

2.1.3. Moving average method

The moving average (MA) method is one of the model-based methods in which the signal is obtained by filtering white noise with an all-zero filter. Estimation of the MA spectrum can be done by the reparameterization of the PSD in terms of the autocorrelation function. The $q$th-order MA PSD estimate is [13,15–17]

$$P_{\mathrm{MA}}(f) = \sum_{k=-q}^{q} r(k)\,e^{-j2\pi f k}. \tag{9}$$
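Eq. (9) can be illustrated with a short sketch. It assumes a real-valued signal, so that r(−k) = r(k) and the sum reduces to a cosine series, and it uses the biased sample autocorrelation; the function names `autocorr` and `ma_psd` are hypothetical.

```python
import numpy as np

def autocorr(x, max_lag):
    """Biased sample autocorrelation r(0..max_lag) of a real signal."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])

def ma_psd(r, freqs):
    """Evaluate Eq. (9) from r(0..q) at normalized frequencies `freqs`.

    For a real signal r(-k) = r(k), so the two-sided sum equals
    r(0) + 2 * sum_{k=1..q} r(k) cos(2 pi f k).
    """
    r = np.asarray(r, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    k = np.arange(1, len(r))
    return r[0] + 2.0 * np.cos(2.0 * np.pi * np.outer(freqs, k)) @ r[1:]
```

In use, `ma_psd(autocorr(x, q), freqs)` gives the order-q estimate for a signal `x`.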

2.1.4. Autoregressive moving average method

The autoregressive moving average (ARMA) method is one of the model-based methods. The spectral factorization problem associated with a rational PSD has multiple solutions, the stable and minimum phase ARMA model being one of them. A reliable way to estimate the AR parameters is to construct a set of linear equations and to apply the method of least squares to it. Suppose that for an ARMA model of order $(p, q)$ the autocorrelation sequence can be accurately estimated up to lag $M$, where $M > p + q$. Then the following set of linear equations can be written:

$$
\begin{bmatrix}
r(q) & r(q-1) & \cdots & r(q-p+1) \\
r(q+1) & r(q) & \cdots & r(q-p+2) \\
\vdots & \vdots & & \vdots \\
r(M-1) & r(M-2) & \cdots & r(M-p)
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix}
= -
\begin{bmatrix} r(q+1) \\ r(q+2) \\ \vdots \\ r(M) \end{bmatrix},
\tag{10}
$$

or equivalently,

$$\mathbf{R}\mathbf{a} = -\mathbf{r}. \tag{11}$$

Since the dimension of $\mathbf{R}$ is $(M - q) \times p$ and $M - q > p$, the least squares criterion can be used to solve for the parameter vector $\mathbf{a}$. The result of this minimization is

$$\mathbf{a} = -(\mathbf{R}^{*}\mathbf{R})^{-1}(\mathbf{R}^{*}\mathbf{r}). \tag{12}$$

Finally, the estimated ARMA power spectrum is

$$P_{\mathrm{ARMA}}(f) = \frac{P_{\mathrm{MA}}(f)}{\left|1 + \sum_{k=1}^{p} a(k)\,e^{-j2\pi f k}\right|^2}, \tag{13}$$

where $P_{\mathrm{MA}}(f)$ is the estimate of the MA PSD given in Eq. (9) [13,15–17].
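The least squares step of Eqs. (10)–(12) can be sketched as below. The sketch covers only the AR parameter vector; the numerator of Eq. (13) is not computed. A real-valued signal is assumed, so negative lags are read as r(−k) = r(k), and the function name `arma_ar_part` is hypothetical.

```python
import numpy as np

def arma_ar_part(r, p, q):
    """Least squares AR parameters of an ARMA(p, q) model, Eqs. (10)-(12).

    r : autocorrelation estimates r(0..M) of a real signal, with M > p + q
    """
    r = np.asarray(r, dtype=float)
    M = len(r) - 1
    if M <= p + q:
        raise ValueError("need autocorrelation lags up to M > p + q")
    rows = np.arange(q + 1, M + 1)
    # Row for lag k holds r(k-1), r(k-2), ..., r(k-p); r(-k) = r(k)
    R = np.array([[r[abs(k - i)] for i in range(1, p + 1)] for k in rows])
    rhs = -r[q + 1:M + 1]
    # lstsq returns the minimizer of ||R a - rhs||, i.e. Eq. (12)
    a, *_ = np.linalg.lstsq(R, rhs, rcond=None)
    return a
```

`np.linalg.lstsq` is used instead of forming $(\mathbf{R}^{*}\mathbf{R})^{-1}$ explicitly; both give the same solution when $\mathbf{R}$ has full column rank, and the former is numerically better behaved.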


2.1.5. Short-time Fourier transform

Spectral analysis of the signal under study is performed using the short-time Fourier transform (STFT), in which the signal is divided into small sequential or overlapping data frames and the FFT is applied to each one. The output of successive STFTs can provide a time-frequency representation of the signal. To accomplish this, the signal is truncated into short data frames by multiplying it by a window so that the modified signal is zero outside the data frame. In order to analyze the whole signal, the window is translated in time and then reapplied to the signal.

In STFT analysis, the signal is multiplied by a window function $w(t)$ and the spectrum of this signal frame is calculated using the Fourier transform. Thus

$$\mathrm{STFT}(t, f) = \left|\int_{-\infty}^{+\infty} x(\tau)\,w(\tau - t)\,e^{-j2\pi f \tau}\,d\tau\right|^2, \tag{14}$$

where $x(t)$ represents the analyzed signal [14,18–20].
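A discrete-time counterpart of Eq. (14) can be sketched as follows: the signal is cut into overlapping frames, each frame is windowed, and the squared magnitude of its FFT is taken. The Hann window, the frame length, the hop size and the function name `stft_power` are assumptions of this sketch.

```python
import numpy as np

def stft_power(x, frame_len=128, hop=64, fs=1.0):
    """Squared-magnitude STFT (sonogram) of a real signal.

    Returns frame-center times, frequencies and a (frames x bins) array.
    """
    x = np.asarray(x, dtype=float)
    w = np.hanning(frame_len)                       # window w(t) (assumed Hann)
    starts = np.arange(0, len(x) - frame_len + 1, hop)
    frames = np.stack([x[s:s + frame_len] * w for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    times = (starts + frame_len / 2) / fs
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    return times, freqs, power
```

For a signal whose frequency changes over time, the row-wise maximum of `power` tracks that change, with time resolution set by `frame_len` and frequency resolution fs/`frame_len`; both are fixed over the whole time-frequency plane.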

2.1.6. Wavelet transform

The wavelet transform (WT) is designed to address the problem of nonstationary signals. It involves representing a time function in terms of simple, fixed building blocks, termed wavelets. These building blocks are actually a family of functions which are derived from a single generating function called the mother wavelet by translation and dilation operations. Dilation, also known as scaling, compresses or stretches the mother wavelet and translation shifts it along the time axis [21,22].

The WT can be categorized into continuous and discrete. The continuous wavelet transform (CWT) is defined by

$$\mathrm{CWT}(a, b) = \int_{-\infty}^{+\infty} x(t)\,\psi^{*}_{a,b}(t)\,dt, \tag{15}$$

where $x(t)$ represents the analyzed signal, $a$ and $b$ represent the scaling factor (dilatation/compression coefficient) and translation along the time axis (shifting coefficient), respectively, and the superscript asterisk denotes complex conjugation. $\psi_{a,b}(\cdot)$ is obtained by scaling the wavelet at time $b$ and scale $a$:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t - b}{a}\right), \tag{16}$$

where $\psi(t)$ represents the wavelet [14,19].

Continuous, in the context of the WT, implies that the scaling and translation parameters $a$ and $b$ change continuously. However, calculating wavelet coefficients for every possible scale can represent a considerable effort and result in a vast amount of data. Therefore the discrete wavelet transform (DWT) is often used. The WT can be thought of as an extension of the classic Fourier transform, except that, instead of working on a single scale (time or frequency), it works on a multi-scale basis. This multi-scale feature of the WT allows the decomposition of a signal into a number of scales, each scale representing a particular coarseness of the signal under study. The procedure of multiresolution decomposition of a signal $x[n]$ is schematically shown in Fig. 3. Each stage of this scheme consists of two digital filters and two downsamplers by 2. The first filter, $g[\cdot]$, is the discrete mother wavelet, high-pass in nature, and the second, $h[\cdot]$, is its mirror version, low-pass in nature. The downsampled outputs of the first high- and low-pass filters provide the detail, $D_1$, and the approximation, $A_1$, respectively. The first approximation, $A_1$, is further decomposed and this process is continued as shown in Fig. 3.

[Figure 3: three-stage filter bank. At each stage the input (x[n], then A1, then A2) passes through g[n] and h[n], each followed by downsampling by 2, giving the details D1, D2, D3 and the approximations A1, A2, A3.]

Fig. 3. Subband decomposition of discrete wavelet transform implementation; g[n] is the high-pass filter, h[n] is the low-pass filter.
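The decomposition scheme of Fig. 3 can be sketched for the simplest filter pair, the two-tap Haar (db1) filters, where filtering followed by downsampling by 2 reduces to sums and differences of neighbouring samples. The function name `haar_dwt`, the padding of odd-length inputs and the sign convention of the detail coefficients are assumptions of this sketch.

```python
import numpy as np

def haar_dwt(x, levels):
    """Multilevel Haar DWT; returns [D1, ..., D_levels, A_levels]."""
    a = np.asarray(x, dtype=float)
    coeffs = []
    for _ in range(levels):
        if len(a) % 2:                      # pad odd lengths (assumption)
            a = np.append(a, a[-1])
        even, odd = a[0::2], a[1::2]
        # High-pass filter g followed by downsampling by 2: detail D_i
        coeffs.append((even - odd) / np.sqrt(2.0))
        # Low-pass filter h followed by downsampling by 2: approximation A_i
        a = (even + odd) / np.sqrt(2.0)
    coeffs.append(a)
    return coeffs
```

With this orthonormal scaling the total energy of the coefficients equals the energy of the signal when no padding is needed, and each level halves the number of samples passed to the next stage.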

All wavelet transforms can be specified in terms of a low-pass filter $h$, which satisfies the standard quadrature mirror filter condition:

$$H(z)H(z^{-1}) + H(-z)H(-z^{-1}) = 1, \tag{17}$$

where $H(z)$ denotes the z-transform of the filter $h$. Its complementary high-pass filter can be defined as

$$G(z) = zH(-z^{-1}). \tag{18}$$
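As a worked check of Eqs. (17) and (18), consider the two-tap averaging filter $h = (1/2, 1/2)$. This filter is chosen here only because the algebra is short; it is an illustrative assumption, not a filter specified in the text.

$$
\begin{aligned}
H(z) &= \tfrac{1}{2}\left(1 + z^{-1}\right),\\
H(z)H(z^{-1}) &= \tfrac{1}{4}\left(1 + z^{-1}\right)\left(1 + z\right) = \tfrac{1}{4}\left(2 + z + z^{-1}\right),\\
H(-z)H(-z^{-1}) &= \tfrac{1}{4}\left(1 - z^{-1}\right)\left(1 - z\right) = \tfrac{1}{4}\left(2 - z - z^{-1}\right),\\
H(z)H(z^{-1}) + H(-z)H(-z^{-1}) &= 1,\\
G(z) &= zH(-z^{-1}) = \tfrac{1}{2}\,z\left(1 - z\right).
\end{aligned}
$$

Here $H(1) = 1$ and $H(-1) = 0$, so $h$ passes low frequencies, while $G(1) = 0$ and $|G(-1)| = 1$, so the complementary filter is high-pass.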

A sequence of filters with increasing length (indexed by $i$) can be obtained:

$$
\begin{aligned}
H_{i+1}(z) &= H\!\left(z^{2^{i}}\right)H_i(z),\\
G_{i+1}(z) &= G\!\left(z^{2^{i}}\right)H_i(z), \qquad i = 0, \ldots, I - 1,
\end{aligned}
\tag{19}
$$

with the initial condition $H_0(z) = 1$. It is expressed as a two-scale relation in the time domain

$$
\begin{aligned}
h_{i+1}(k) &= [h]_{\uparrow 2^{i}} * h_i(k),\\
g_{i+1}(k) &= [g]_{\uparrow 2^{i}} * h_i(k),
\end{aligned}
\tag{20}
$$

where the subscript $[\cdot]_{\uparrow m}$ indicates up-sampling by a factor of $m$ and $k$ is the equally sampled discrete time.

The normalized scale and wavelet basis functions $\varphi_{i,l}(k)$, $\psi_{i,l}(k)$ can be defined as

$$
\begin{aligned}
\varphi_{i,l}(k) &= 2^{i/2}\,h_i(k - 2^{i}l),\\
\psi_{i,l}(k) &= 2^{i/2}\,g_i(k - 2^{i}l),
\end{aligned}
\tag{21}
$$

where the factor $2^{i/2}$ is an inner product normalization, and $i$ and $l$ are the scale parameter and the translation parameter, respectively. The DWT decomposition can be described as

$$
\begin{aligned}
a_{(i)}(l) &= x(k) * \varphi_{i,l}(k),\\
d_{(i)}(l) &= x(k) * \psi_{i,l}(k),
\end{aligned}
\tag{22}
$$

where $a_{(i)}(l)$ and $d_{(i)}(l)$ are the approximation coefficients and the detail coefficients at resolution $i$, respectively [21–25].

The concept of being able to decompose a signal totally and then perfectly reconstruct the signal again is practical, but it is not particularly useful by itself. In order to make use of this tool it is necessary to manipulate the wavelet coefficients to identify characteristics of the signal that were not apparent from the original time domain signal [21].

2.2. Comparison of classical, model-based and time-frequency analysis methods

The FFT-based methods are based on a finite record of data and their frequency resolution is limited by the data record duration, independent of the characteristics of the data. These methods suffer from spectral leakage effects due to the windowing that is inherent in finite-length data records. Furthermore, the principal effect of windowing that occurs when processing with the FFT-based methods is to smear or smooth the estimated spectrum. The basic limitation of the FFT-based methods is the inherent assumption that the autocorrelation estimate is zero outside the window. From another viewpoint, the inherent assumption in the FFT-based methods is that the data are periodic. Neither one of these assumptions is realistic [13,15–17].

The model-based methods do not require such assumptions. The modeling approach eliminates the need for window functions and the assumption that the autocorrelation sequence is zero outside the window. The spectra of the model-based methods have better statistical stability for short segments of signal and better spectral resolution, and the resolution is less dependent on the length of the record. The model-based methods have better temporal resolution and produce continuous spectra. The disadvantages of the model-based methods compared to the FFT-based methods are: the FFT-based methods are more widely available and are the traditional engineering approach to spectrum analysis; the model-based spectra are slower to compute; the model-based methods are not reversible; the model-based methods are slightly more complicated to code; the model-based methods are more sensitive to round-off errors; and finally, the orders of the model-based methods depend on the characteristics of the signal, and the current objective methods for model order determination are not satisfactory [13,15–17]. Based on the results of the studies existing in the literature, performance characteristics of the AR and ARMA methods were found extremely valuable for spectral analysis of Doppler signals [12,13,15].

There is a distinct qualitative improvement in spectral analysis of nonstationary signals using the time-frequency analysis methods over the classical and model-based methods. The problem with the STFT is that both time and frequency resolutions of the transform are fixed over the entire time-frequency plane. The STFT involves the implicit assumption that the data are quasistationary for the duration of each analyzed segment. Taking the FFT of a short segment of the Doppler signal leads to a distortion of the spectral estimate and leakage of signal energy into spurious side lobes due to the sharp truncation of the signal. To reduce this distortion it is common practice to multiply the signal by a window function which reduces the amplitude of the analyzed signal toward the beginning and end of the data segment. Using longer data segments reduces the distortion and leakage of the spectral estimates but may violate the quasistationarity assumption. There is an obvious tradeoff when using the STFT between the distortion and poor spectral resolution introduced by short data windows and the spectral broadening that arises from nonstationary characteristics of the signal when using longer data windows. A more flexible approach would be to use a scalable window: a compressed window for analyzing high frequency detail and a dilated window for uncovering low frequency trends within the signal [14,19]. The WT addresses the problem of fixed resolution by using base functions that can be scaled. The wavelets act in a similar way to the windowed complex exponentials that are used in the STFT, except that with the WT the length of signal being analyzed is not fixed. It is known that wavelets are better suited to analyzing nonstationary signals, since they are well localized in time and frequency. The property of time and frequency localization is known as compact support and is one of the most attractive features of the WT. The WT of a signal is the decomposition of the signal over a set of functions obtained after dilatation and translation of an analyzing wavelet. The main advantage of the WT is that it has a varying window size, being broad at low frequencies and narrow at high frequencies, thus leading to an optimal time-frequency resolution in all frequency ranges. Furthermore, owing to the fact that windows are adapted to the transients of each scale, wavelets do not require stationarity. Since flow in arteries is pulsatile and the red blood cells have a random spatial distribution, the Doppler signal is time-varying and random. Therefore, the WT has become a powerful alternative to the STFT in analysis of the Doppler signals [14,21–25]. The method employed for feature extraction from Doppler signals using wavelets is introduced in the following section.

Table 1
Ranges of frequency bands in wavelet decomposition

Decomposed signal    Frequency range (Hz)
D1                   2500–5000
D2                   1250–2500
D3                   625–1250
D4                   312.5–625
D5                   156.25–312.5
D6                   78.13–156.25
D7                   39.07–78.13
A7                   0–39.07

2.3. Feature extraction using discrete wavelet transform

Spectral analysis of the ophthalmic and internal carotid arterial Doppler signals was performed using the DWT as described in Section 2.1.6. Selection of an appropriate wavelet and of the number of decomposition levels is very important in analysis of signals using the WT. The number of decomposition levels is chosen based on the dominant frequency components of the signal. The levels are chosen such that those parts of the signal that correlate well with the frequencies required for classification of the signal are retained in the wavelet coefficients. In the present study, since the Doppler signals do not have any useful frequency components below 40 Hz, the number of decomposition levels was chosen to be 7. Thus, the ophthalmic and internal carotid arterial Doppler signals were decomposed into the details D1–D7 and one final approximation, A7. The ranges of the various frequency bands are given in Table 1. Usually, tests are performed with different types of wavelets and the one which gives maximum efficiency is selected for the particular application. The smoothing feature of the Daubechies wavelet of order 1 (db1) made it more suitable to detect changes of arterial Doppler signals. Therefore, the wavelet coefficients were computed using the db1 in the present study. The discrete wavelet coefficients were computed using the MATLAB software package.
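The band edges in Table 1 follow from halving the analyzed band at every decomposition level. The sketch below reproduces them under the assumption of a 10 kHz sampling frequency, which is inferred from the 5000 Hz upper edge of D1 and is not stated in this section; the function name `dwt_bands` is hypothetical.

```python
def dwt_bands(fs, levels):
    """Nominal frequency ranges (Hz) of the DWT sub-bands.

    fs     : sampling frequency in Hz (assumed, see lead-in)
    levels : number of decomposition levels
    """
    bands = {}
    upper = fs / 2.0                  # Nyquist frequency
    for i in range(1, levels + 1):
        bands[f"D{i}"] = (upper / 2.0, upper)   # detail: upper half-band
        upper /= 2.0                            # approximation keeps the rest
    bands[f"A{levels}"] = (0.0, upper)
    return bands

bands = dwt_bands(fs=10000.0, levels=7)
```

With seven levels the final approximation covers the band below roughly 40 Hz, which matches the statement that the signals carry no useful components below that frequency.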


Feature selection is an important component of designing a neural network based on pattern classification, since even the best classifier will perform poorly if the features used as inputs are not selected well. The computed discrete wavelet coefficients provide a compact representation that shows the energy distribution of the signal in time and frequency. Therefore, the computed discrete wavelet coefficients of the ophthalmic and internal carotid arterial Doppler signals of each subject were used as the feature vectors representing the signals. In order to reduce the dimensionality of the extracted feature vectors, statistics over the set of the wavelet coefficients were used. The following statistical features were used to represent the time-frequency distribution of the Doppler signals:

1. Mean of the absolute values of the coefficients in each subband.
2. Maximum of the absolute values of the coefficients in each subband.
3. Average power of the wavelet coefficients in each subband.
4. Standard deviation of the coefficients in each subband.
5. Ratio of the absolute mean values of adjacent subbands.
6. Distribution distortion of the coefficients in each subband.

Features 1–3 represent the frequency distribution of the signal and features 4–6 the amount of change in the frequency distribution. These feature vectors, calculated for the frequency bands, were used for classification of the ophthalmic and internal carotid arterial Doppler signals.
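Features 1–5 of the list above can be sketched as follows for a list of sub-band coefficient arrays. The function name `subband_features`, the dictionary keys, and the definition of feature 5 as the ratio of a sub-band to the next one in the list are assumptions of this sketch; feature 6 is not computed here.

```python
import numpy as np

def subband_features(subbands):
    """Statistical features 1-5 for each sub-band.

    subbands : sequence of coefficient arrays, e.g. [D1, ..., D7, A7]
    """
    feats = []
    for c in subbands:
        c = np.asarray(c, dtype=float)
        feats.append({
            "mean_abs": np.mean(np.abs(c)),   # feature 1
            "max_abs": np.max(np.abs(c)),     # feature 2
            "avg_power": np.mean(c ** 2),     # feature 3
            "std": np.std(c),                 # feature 4
        })
    # Feature 5: ratio of absolute means of adjacent sub-bands
    # (assumed: current over next; the last sub-band has no neighbour)
    for cur, nxt in zip(feats, feats[1:]):
        cur["ratio_next"] = cur["mean_abs"] / nxt["mean_abs"]
    return feats
```

Concatenating these values over all sub-bands gives one fixed-length feature vector per signal, far shorter than the full set of wavelet coefficients.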

Feature 6 is related to the Laplacian distribution. It has been shown that, for a large class of signals, the transformed coefficients in each high frequency subband can be well described by a generalized Laplacian distribution [26], which is often simplified to the special case of the Laplacian distribution [27]. The Laplacian distribution $p(x)$ is defined as

$$p(x) = \frac{\alpha}{2}\,e^{-\alpha |x|}, \tag{23}$$

whose mean is zero and variance is $2/\alpha^{2}$. The summation of the differences between the real distribution of the wavelet coefficients and the expected Laplacian distribution is the distribution distortion of the wavelet coefficients, defined as

$$\sum_{x} |w(x) - p(x)|, \tag{24}$$

where $w(x)$ is the real probability density function of the wavelet coefficients in the subband and $p(x)$ is the expected Laplacian distribution.
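One possible way to evaluate Eqs. (23) and (24) numerically is sketched below. Several choices here are assumptions of the sketch rather than details given in the text: the real density w(x) is estimated with a histogram, the Laplacian parameter is obtained from the sample variance through the relation variance = 2/parameter², the sum runs over the histogram bin centers, and each term is weighted by the bin width so that the result does not depend on the number of bins. The function name `distribution_distortion` is hypothetical.

```python
import numpy as np

def distribution_distortion(coeffs, bins=64):
    """Distance between the empirical density and a zero-mean Laplacian."""
    c = np.asarray(coeffs, dtype=float)
    alpha = np.sqrt(2.0 / np.var(c))        # from variance = 2 / alpha^2
    # Histogram estimate of the real density w(x)
    w, edges = np.histogram(c, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Expected Laplacian density p(x) of Eq. (23) at the bin centers
    p = 0.5 * alpha * np.exp(-alpha * np.abs(centers))
    # Eq. (24), with each term weighted by the bin width (assumption)
    return np.sum(np.abs(w - p) * np.diff(edges))
```

Coefficients that are close to Laplacian give a value near zero, while coefficients with a clearly different shape, such as a uniform distribution, give a larger value.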

In some applications, in order to further reduce the dimensionality of the extracted feature vectors, only some of the statistical features given in this section can be used to represent the time-frequency distribution of the signal under study. However, in our applications all of the statistical features (6 statistical features) were used to represent the ophthalmic and internal carotid arterial Doppler signals.

3. Experimental results

The assessment of the feature extraction methods described in Section 2 was performed by considering the classification accuracies. Therefore, a brief description of the artificial neural network (ANN) architecture is given in this section.


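[Figure 4: network diagram with an input layer, hidden layers 1 to N and an output layer between the inputs and the outputs. Inset "Detail of Each Neuron": inputs weighted by Wj1, Wj2, ..., Wjn (W = weights) are summed (Σ) and passed through the transfer function f to give the output.]

Fig. 4. Multilayer perceptron neural network topology.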

3.1. Multilayer perceptron neural network

ANNs can be trained to recognize patterns, and the nonlinear models developed during training allow neural networks to generalize their conclusions and to apply them to patterns not previously encountered [28–30]. The MLPNN, which has features such as the ability to learn and generalize, smaller training set requirements, fast operation and ease of implementation, and is therefore the most commonly used neural network architecture, is shown in Fig. 4. As shown in Fig. 4, an MLPNN consists of (i) an input layer with neurons representing input variables to the problem, (ii) an output layer with neurons representing the dependent variables (what is being modeled), and (iii) one or more hidden layers containing neurons to help capture the nonlinearity in the data. The MLPNN is a nonparametric technique for performing a wide variety of detection and estimation tasks [28–30]. Therefore, MLPNNs with different inputs (feature vectors) were used for diagnosis of ophthalmic and internal carotid arterial diseases in the present study.

In the MLPNN, each neuron $j$ in the hidden layer sums its input signals $x_i$ after multiplying them by the strengths of the respective connection weights $w_{ji}$ and computes its output $y_j$ as a function of the sum:

$$y_j = f\!\left(\sum_i w_{ji}x_i\right), \tag{25}$$

where $f$ is the activation function that is necessary to transform the weighted sum of all signals impinging onto a neuron. The activation function ($f$) can be a simple threshold function, or a sigmoidal, hyperbolic tangent, or radial basis function. In the present study, in the hidden layer and the output layer, the activation function $f$ was the sigmoidal function:

$$f(\theta) = \frac{1}{1 + e^{-\theta}}. \tag{26}$$

The sum of squared differences between the desired and actual values of the output neurons, $E$, is defined as

$$E = \frac{1}{2}\sum_j \left(y_{dj} - y_j\right)^2, \tag{27}$$

where $y_{dj}$ is the desired value of output neuron $j$ and $y_j$ is the actual output of that neuron. Each weight $w_{ji}$ is adjusted to reduce $E$ as rapidly as possible. How $w_{ji}$ is adjusted depends on the training algorithm adopted [28–30].
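Eqs. (25)–(27) translate directly into a few lines of NumPy. In the sketch below the weights of one layer are stored as a matrix with one row per neuron; this layout and the function names are assumptions, and no bias term is included because none appears in Eq. (25).

```python
import numpy as np

def sigmoid(theta):
    """Sigmoidal activation function of Eq. (26)."""
    return 1.0 / (1.0 + np.exp(-theta))

def layer_output(W, x):
    """Eq. (25): y_j = f(sum_i w_ji x_i) for every neuron j of a layer.

    W : weight matrix, shape (neurons, inputs)
    x : input vector, shape (inputs,)
    """
    return sigmoid(np.asarray(W, dtype=float) @ np.asarray(x, dtype=float))

def squared_error(y_desired, y):
    """Eq. (27): E = 1/2 * sum_j (y_dj - y_j)^2."""
    diff = np.asarray(y_desired, dtype=float) - np.asarray(y, dtype=float)
    return 0.5 * np.sum(diff ** 2)
```

A network with several layers is obtained by feeding the output of one call to `layer_output` into the next.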

Training algorithms are an integral part of ANN model development. An appropriate topology may still fail to give a better model unless it is trained by a suitable training algorithm. A good training algorithm will shorten the training time while achieving a better accuracy. Therefore, the training process is an important characteristic of ANNs, whereby representative examples of the knowledge are iteratively presented to the network so that it can integrate this knowledge within its structure. There are a number of training algorithms used to train an MLPNN, and a frequently used one is the backpropagation (BP) training algorithm. The BP algorithm, which is based on searching an error surface using gradient descent for points with minimum error, is relatively easy to implement. However, BP has some problems for many applications. The algorithm is not guaranteed to find the global minimum of the error function, since gradient descent may get stuck in local minima, where it may remain indefinitely. In addition, long training sessions are often required in order to find an acceptable weight solution because of the well-known difficulties inherent in gradient descent optimization. Therefore, many variations to improve the convergence of BP have been proposed, such as delta-bar-delta (DBD), extended delta-bar-delta (EDBD), and quick propagation (QP) [31–34]. Optimization methods such as second-order methods (conjugate gradient, quasi-Newton, Levenberg-Marquardt) have also been used for ANN training in recent years. The Levenberg-Marquardt algorithm combines the best features of the Gauss-Newton technique and the steepest-descent algorithm, but avoids many of their limitations. In particular, it generally does not suffer from the problem of slow convergence [35,36]. A number of researchers have carried out comparative studies of MLPNN training algorithms [37–39]. The results of these studies have illustrated that the relative performance of the algorithms depends on the problem being solved. Therefore, in this study the MLPNNs were trained with the BP, DBD, EDBD, QP, and Levenberg-Marquardt algorithms.
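To make the blend of Gauss-Newton and steepest descent concrete, the sketch below applies a textbook Levenberg-Marquardt iteration to a single sigmoidal neuron with one weight and one bias. It is not the MATLAB implementation used in the study: the damping schedule (divide $\mu$ by 10 after an accepted step, multiply by 10 after a rejected one), the starting values, and the data are all assumptions made for illustration.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def residuals_and_jacobian(w, b, xs, targets):
    """Residuals e_k = y_k - t_k and Jacobian rows (de_k/dw, de_k/db)."""
    e, J = [], []
    for x, t in zip(xs, targets):
        y = sigmoid(w * x + b)
        slope = y * (1.0 - y)
        e.append(y - t)
        J.append((slope * x, slope))
    return e, J

def half_sse(e):
    """Half the sum of squared residuals, as in Eq. (27)."""
    return 0.5 * sum(r * r for r in e)

def lm_fit(xs, targets, w=0.0, b=0.0, mu=0.01, iterations=50):
    """Each iteration solves (J^T J + mu I) step = -J^T e for the step."""
    e, J = residuals_and_jacobian(w, b, xs, targets)
    for _ in range(iterations):
        a11 = sum(r[0] * r[0] for r in J) + mu
        a12 = sum(r[0] * r[1] for r in J)
        a22 = sum(r[1] * r[1] for r in J) + mu
        g1 = sum(r[0] * ek for r, ek in zip(J, e))
        g2 = sum(r[1] * ek for r, ek in zip(J, e))
        det = a11 * a22 - a12 * a12          # positive because mu > 0
        dw = -(a22 * g1 - a12 * g2) / det
        db = -(a11 * g2 - a12 * g1) / det
        e_new, J_new = residuals_and_jacobian(w + dw, b + db, xs, targets)
        if half_sse(e_new) < half_sse(e):
            # Accepted: small mu makes the next step closer to Gauss-Newton.
            w, b, e, J, mu = w + dw, b + db, e_new, J_new, mu / 10.0
        else:
            # Rejected: large mu makes the next step closer to steepest descent.
            mu *= 10.0
    return w, b, half_sse(e)

# Hypothetical data generated from a known neuron (w = 2, b = -1).
xs = [-2.0 + 0.5 * k for k in range(9)]
targets = [sigmoid(2.0 * x - 1.0) for x in xs]
initial_error = half_sse(residuals_and_jacobian(0.0, 0.0, xs, targets)[0])
w_fit, b_fit, final_error = lm_fit(xs, targets)
```

Because a step is kept only when it lowers the error, the error never increases from one iteration to the next, which is one reason the method tends to need fewer epochs than plain gradient descent.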

ANN architectures are derived by trial and error, and the complexity of the neural network is characterized by the number of hidden layers. There is no general rule for selecting the appropriate number of hidden layers. A neural network with a small number of neurons may not be sufficiently powerful to model a complex function. On the other hand, a neural network with too many neurons may overfit the training set and lose its ability to generalize, which is the main desired characteristic of a neural network. The most popular approach to finding the optimal number of hidden layers is trial and error. In the present study, after several trials it was seen that a network with two hidden layers achieved the task with high accuracy. The most suitable network configuration found had 10 neurons in each hidden layer. The MLPNNs were implemented using the MATLAB software package (MATLAB version 6.0 with the neural networks toolbox).



In order to classify the ophthalmic and internal carotid arterial Doppler signals, for each application four MLPNNs with different inputs (different features used as inputs) were trained, and the performances of these MLPNNs were evaluated according to their total classification accuracies and convergence rates (number of training epochs). The inputs of the MLPNNs were determined by spectral analysis of the ophthalmic and internal carotid arterial Doppler signals. The spectral analysis methods used in these two applications were chosen according to the conclusions concerning the performances of the methods drawn in Section 2.2.

3.2. Application of MLPNNs to ophthalmic arterial Doppler signals

The ophthalmic arterial Doppler signals were obtained from 214 subjects. The group consisted of 103 females and 111 males with ages ranging from 19 to 65 years and a mean age of 33.5 ± 0.5 years. Diasonics Synergy color Doppler ultrasonography was used during examinations and sonograms were taken into consideration. According to the examination results, 52 of the 214 subjects suffered from ophthalmic artery stenosis, 54 suffered from ocular Behcet disease, 45 suffered from uveitis disease, and the rest were healthy subjects (control group) who had no ocular or systemic disease. The group suffering from ophthalmic artery stenosis consisted of 25 females and 27 males with a mean age of 35.5 ± 0.5 years (range 23–65), the group suffering from ocular Behcet disease consisted of 25 females and 29 males with a mean age of 35.5 ± 0.5 years (range 21–63), the group suffering from uveitis disease consisted of 22 females and 23 males with a mean age of 34.5 ± 0.5 years (range 22–62), and the healthy subjects were 31 females and 32 males with a mean age of 30.0 ± 0.5 years (range 19–64).

Ophthalmic artery examinations were performed with a Doppler unit using a 10 MHz ultrasonic transducer. The measurement system consisted of five units: a 10 MHz ultrasonic transducer, an analog Doppler unit (Diasonics Synergy), a recorder (Sony), an analog/digital interface board (Sound Blaster Pro-16 bit), and a personal computer with a printer. The ultrasonic transducer was applied on a horizontal plane to the closed eyelids using sterile methylcellulose as a coupling gel. Care was taken not to apply pressure to the eye in order to avoid artifacts. The probe was most often placed at an angle of 60° from the midline, pointing towards the orbital apex. Good and consistent signals were obtained at 37–42 mm depth. The ophthalmic arterial Doppler signals were sampled at 5 kHz and framed by equal time intervals. The frame length was chosen as 256 samples.
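The framing step can be sketched as below. The sampling rate and frame length are those stated in the text; the use of consecutive, non-overlapping frames and the dropping of a trailing partial frame are assumptions, since the source does not say how frames were positioned or how remainders were handled. The signal itself is a placeholder.

```python
def frame_signal(samples, frame_length=256):
    """Split a sampled signal into consecutive, non-overlapping frames.

    Trailing samples that do not fill a whole frame are dropped
    (an assumption, not something stated in the source).
    """
    n_frames = len(samples) // frame_length
    return [samples[k * frame_length:(k + 1) * frame_length]
            for k in range(n_frames)]

SAMPLING_RATE_HZ = 5000   # stated in the text
FRAME_LENGTH = 256        # stated in the text

# Duration covered by one frame, in milliseconds.
frame_duration_ms = 1000.0 * FRAME_LENGTH / SAMPLING_RATE_HZ

# One second of a placeholder signal (all zeros, illustration only).
signal = [0.0] * SAMPLING_RATE_HZ
frames = frame_signal(signal, FRAME_LENGTH)
```

At 5 kHz, a 256-sample frame covers 51.2 ms, so one second of signal yields 19 complete frames under these assumptions.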

The adequate functioning of neural networks depends on the sizes of the training set and the test set. In this application, 80 of the 214 subjects were used for training and the rest for testing. A practical way to find a point of better generalization is to use a small percentage (around 20%) of the training set for cross validation. To obtain better network generalization, 16 training subjects were selected randomly to be used as a cross validation set. The training set consisted of 20 subjects suffering from ophthalmic artery stenosis, 20 subjects suffering from ocular Behcet disease, 20 subjects suffering from uveitis disease, and 20 healthy subjects. The testing set consisted of 32 subjects suffering from ophthalmic artery stenosis, 34 subjects suffering from ocular Behcet disease, 25 subjects suffering from uveitis disease, and 43 healthy subjects. The cross validation set consisted of 4 subjects suffering from ophthalmic artery stenosis, 4 subjects suffering from ocular Behcet disease, 4 subjects suffering from uveitis disease, and 4 healthy subjects.
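The partition described above can be reproduced in outline with the sketch below. The group sizes are those reported in the text; the subject identifiers, the random seed, and the choice to draw the cross validation subjects from among the training subjects of each class are assumptions made for illustration.

```python
import random

def split_subjects(subjects_by_class, n_train_per_class, n_val_per_class, seed=0):
    """Per-class random split into training, cross validation, and test sets.

    The cross validation subjects are taken from the training subjects of
    each class (an assumed reading of the text).
    """
    rng = random.Random(seed)
    train, validation, test = [], [], []
    for label, subjects in subjects_by_class.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)
        chosen = shuffled[:n_train_per_class]
        train += [(s, label) for s in chosen]
        validation += [(s, label) for s in chosen[:n_val_per_class]]
        test += [(s, label) for s in shuffled[n_train_per_class:]]
    return train, validation, test

# Group sizes from the text; the integer subject IDs are placeholders.
group_sizes = {"stenosis": 52, "behcet": 54, "uveitis": 45, "healthy": 63}
subjects_by_class = {label: list(range(size))
                     for label, size in group_sizes.items()}

train, validation, test = split_subjects(subjects_by_class, 20, 4)
test_counts = {label: sum(1 for _, l in test if l == label)
               for label in group_sizes}
```

Under these assumptions the split gives 80 training, 16 cross validation, and 134 test subjects, and the per-class test counts (32, 34, 25, 43) match those reported.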

The standard waveform indices explained in Section 2.1 are incapable of analyzing multi-segmental disease or classifying the severity of disease. In addition, the standard waveform indices are affected by a change in the heart rate and are unreliable when absent or reverse diastolic flow is present. Also, our previous study demonstrated that standard waveform indices are inadequate to evaluate ophthalmic arterial Doppler waveforms [10]. It is difficult to separate the four groups of subjects (healthy, ophthalmic artery stenosis, ocular Behcet disease, and uveitis disease) using the values of the standard waveform indices, since there is a considerable overlap in the standard waveform indices of the four groups. In this application, the values of the standard waveform indices were not used as MLPNN inputs because of the imprecise boundaries between the values of the standard waveform indices of the four groups.

The 129 points of the logarithm of the values of PSDs obtained by the FFT method were used as the inputs of the first MLPNN. Sample PSDs obtained by the FFT method of the ophthalmic arterial Doppler signals recorded from a 35-year-old healthy subject (subject no: 12), a 29-year-old subject suffering from ophthalmic artery stenosis (subject no: 17), a 37-year-old subject suffering from ocular Behcet disease (subject no: 35), and a 34-year-old subject suffering from uveitis disease (subject no: 41) are shown in Figs. 5(a)–(d), respectively. It can be noted that the PSD values of the ophthalmic arterial Doppler signals obtained from a healthy subject (Fig. 5(a)) and subjects suffering from ophthalmic arterial diseases (Figs. 5(b)–(d)) are different from each other.
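A real-valued frame of 256 samples has 256/2 + 1 = 129 non-redundant frequency bins, which is consistent with the 129 input points mentioned above. The sketch below computes a logarithmic periodogram for such a frame using a direct DFT in plain Python. The rectangular window, the PSD scaling, the small floor inside the logarithm, and the test tone are assumptions; the exact estimator settings of the study are those of Section 2.2 and are not reproduced here.

```python
import cmath
import math

def log_periodogram(frame, sampling_rate_hz):
    """Return (frequencies in Hz, PSD in dB/Hz) for bins 0..N/2 of a real frame."""
    n = len(frame)
    freqs, psd_db = [], []
    for k in range(n // 2 + 1):
        # Direct DFT coefficient for bin k (slow but dependency-free).
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        power = abs(coeff) ** 2 / (n * sampling_rate_hz)
        # A small floor avoids log10(0) for empty bins.
        psd_db.append(10.0 * math.log10(power + 1e-20))
        freqs.append(k * sampling_rate_hz / n)
    return freqs, psd_db

FS = 5000        # sampling rate stated in the text (Hz)
N = 256          # frame length stated in the text
TONE_BIN = 32    # hypothetical test tone placed exactly on bin 32

frame = [math.sin(2.0 * math.pi * TONE_BIN * t / N) for t in range(N)]
freqs, psd_db = log_periodogram(frame, FS)
peak_bin = psd_db.index(max(psd_db))
```

For this test tone the spectrum peaks at bin 32, that is at 625 Hz, and the last of the 129 bins sits at the Nyquist frequency of 2500 Hz.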

The 129 points of the logarithm of the values of PSDs obtained by the AR method were used as the inputs of the second MLPNN. Sample PSDs obtained by the AR method of the ophthalmic arterial Doppler signals recorded from a 35-year-old healthy subject (subject no: 12), a 29-year-old subject suffering from ophthalmic artery stenosis (subject no: 17), a 37-year-old subject suffering from ocular Behcet disease (subject no: 35), and a 34-year-old subject suffering from uveitis disease (subject no: 41) are shown in Figs. 6(a)–(d), respectively. It can be noted that the PSD values of the ophthalmic arterial Doppler signals obtained from a healthy subject (Fig. 6(a)) and subjects suffering from ophthalmic arterial diseases (Figs. 6(b)–(d)) are different from each other.

The 129 points of the logarithm of the values of PSDs obtained by the ARMA method were used as the inputs of the third MLPNN. Sample PSDs obtained by the ARMA method of the ophthalmic arterial Doppler signals recorded from a 35-year-old healthy subject (subject no: 12), a 29-year-old subject suffering from ophthalmic artery stenosis (subject no: 17), a 37-year-old subject suffering from ocular Behcet disease (subject no: 35), and a 34-year-old subject suffering from uveitis disease (subject no: 41) are shown in Figs. 7(a)–(d), respectively. It can be noted that the PSD values of the ophthalmic arterial Doppler signals obtained from a healthy subject (Fig. 7(a)) and subjects suffering from ophthalmic arterial diseases (Figs. 7(b)–(d)) are different from each other.

Fig. 5. PSDs obtained by the FFT method of the ophthalmic arterial Doppler signals recorded from: (a) 35-year-old healthy subject (subject no: 12), (b) 29-year-old subject suffering from ophthalmic artery stenosis (subject no: 17), (c) 37-year-old subject suffering from ocular Behcet disease (subject no: 35), (d) 34-year-old subject suffering from uveitis disease (subject no: 41). [Panels (a)–(d): power spectral density (dB/Hz) versus frequency (Hz).]

The computed discrete wavelet coefficients were used as the inputs of the fourth MLPNN. In order to extract features, the wavelet coefficients corresponding to the D1–D7 and A7 frequency bands of the ophthalmic arterial Doppler signals were computed. It was observed that the values of the coefficients are very close to zero in A7. The coefficients corresponding to the frequency band A7 were therefore discarded, thus reducing the number of feature vectors representing the signal. For each ophthalmic arterial Doppler signal frame (256 samples), the detail wavelet coefficients ($d_k$, $k = 1, 2, 3, 4, 5, 6, 7$) at the first, second, third, fourth, fifth, sixth and seventh levels (128 + 64 + 32 + 16 + 8 + 4 + 2 coefficients) were computed. Thus 254 detail wavelet coefficients were obtained for each ophthalmic arterial Doppler signal frame. In order to reduce the dimensionality of the extracted feature vectors, the statistics explained in Section 2.3 were computed over the set of the wavelet coefficients. The fourth MLPNN then had 41 inputs, equal to the number of input feature vectors. The detail wavelet coefficients corresponding to the D1 frequency band of the ophthalmic arterial Doppler signals obtained from a 35-year-old healthy subject (subject no: 12), a 29-year-old subject suffering from ophthalmic artery stenosis (subject no: 17), a 37-year-old subject suffering from ocular Behcet disease (subject no: 35), and a 34-year-old subject suffering from uveitis disease (subject no: 41) are given in Figs. 8(a)–(d), respectively. It can be noted that the detail wavelet coefficients of the ophthalmic arterial Doppler signals obtained from a healthy subject (Fig. 8(a)) and subjects suffering from ophthalmic arterial diseases (Figs. 8(b)–(d)) are different from each other. In order to investigate the effect of other wavelets on the classification accuracies of the ophthalmic arterial Doppler signals, tests were also carried out using other wavelets. Apart from db1, Symmlet of order 10 (sym10), Coiflet of order 4 (coif4), and Daubechies of order 8 (db8) were also tried. The effects of these wavelets were compared with the use of statistical tools such as receiver operating characteristic (ROC) curves. The performance of a test depends on the shape of its ROC curve. A good test is one for which sensitivity rises rapidly and 1 − specificity hardly increases at all until sensitivity becomes high. As seen from Fig. 9, the Daubechies wavelet offers better accuracy than the others, and db1 is marginally better than db8.
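The coefficient counts quoted above can be checked with a short sketch of a seven-level decomposition of a 256-sample frame using the Haar wavelet, which is the same wavelet as db1. The implementation below is a plain-Python illustration rather than the MATLAB routine used in the study; the sign convention of the detail coefficients and the test frame are assumptions.

```python
import math

def haar_dwt(frame, levels=7):
    """Orthonormal Haar DWT; returns ([d1, ..., d_levels], a_levels)."""
    approx = list(frame)
    details = []
    for _ in range(levels):
        pairs = [(approx[2 * i], approx[2 * i + 1])
                 for i in range(len(approx) // 2)]
        # Detail = scaled difference, approximation = scaled sum of each pair.
        details.append([(a - b) / math.sqrt(2.0) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2.0) for a, b in pairs]
    return details, approx

# Hypothetical 256-sample frame (illustration only).
frame = [math.sin(0.3 * t) for t in range(256)]
details, approx = haar_dwt(frame, levels=7)

sizes = [len(d) for d in details]      # coefficients per level d1..d7
n_detail_coefficients = sum(sizes)     # total number of detail coefficients
```

The seven detail levels contain 128, 64, 32, 16, 8, 4, and 2 coefficients, 254 in total, and two A7 approximation coefficients remain.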


Fig. 6. PSDs obtained by the AR method of the ophthalmic arterial Doppler signals recorded from: (a) 35-year-old healthy subject (subject no: 12), (b) 29-year-old subject suffering from ophthalmic artery stenosis (subject no: 17), (c) 37-year-old subject suffering from ocular Behcet disease (subject no: 35), (d) 34-year-old subject suffering from uveitis disease (subject no: 41). [Panels (a)–(d): power spectral density (dB/Hz) versus frequency (Hz).]

Fig. 7. PSDs obtained by the ARMA method of the ophthalmic arterial Doppler signals recorded from: (a) 35-year-old healthy subject (subject no: 12), (b) 29-year-old subject suffering from ophthalmic artery stenosis (subject no: 17), (c) 37-year-old subject suffering from ocular Behcet disease (subject no: 35), (d) 34-year-old subject suffering from uveitis disease (subject no: 41). [Panels (a)–(d): power spectral density (dB/Hz) versus frequency (Hz).]

The performances of the four MLPNNs trained with five different training algorithms were evaluated by the convergence rates (number of training epochs) and the total classification accuracies. Training holds the key to an accurate solution, so the criterion for stopping training must be well defined. When the network is trained too much, it memorizes the training patterns and does not generalize well. Cross validation is a highly recommended criterion for stopping the training of a network. Therefore, in this application training of the MLPNNs was stopped when the error on the cross validation set increased. The classification accuracy, which is defined as the percentage ratio of the number of subjects correctly classified to the total number of subjects considered for classification, depends on the features used as inputs of the MLPNNs. The number of training epochs and total classification accuracies of the four MLPNNs trained with five different algorithms used for classification of the ophthalmic arterial Doppler signals are presented in Table 2. As seen from Table 2, the first MLPNN (FFT PSD values used as inputs) trained with five different training algorithms has the lowest accuracies and the slowest convergence rates. The second MLPNN (AR PSD values used as inputs) and the third MLPNN (ARMA PSD values used as inputs) trained with five different training algorithms offer nearly the same accuracies and convergence rates. The fourth MLPNN (discrete wavelet coefficients used as inputs) trained with five different training algorithms offers the highest accuracies and the fastest convergence rates. According to Table 2, the total classification accuracies of the MLPNNs trained with the Levenberg-Marquardt algorithm are higher than those of the MLPNNs trained with the other training algorithms.
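The stopping rule and the accuracy measure described above can be written down in a few lines. The sketch below is an assumed, simplified reading: training stops at the first epoch after which the cross validation error rises, and accuracy is the percentage of correctly classified subjects. The error curve and the labels are hypothetical.

```python
def early_stopping_epoch(validation_errors):
    """Index of the last epoch before the validation error first increases."""
    for epoch in range(1, len(validation_errors)):
        if validation_errors[epoch] > validation_errors[epoch - 1]:
            return epoch - 1
    return len(validation_errors) - 1

def total_classification_accuracy(true_labels, predicted_labels):
    """Percentage of subjects whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)

# Hypothetical cross validation error per epoch.
errors = [0.90, 0.60, 0.45, 0.40, 0.42, 0.41]
stop_at = early_stopping_epoch(errors)

# Hypothetical true and predicted classes for four subjects.
true_labels = ["healthy", "stenosis", "behcet", "uveitis"]
predicted = ["healthy", "stenosis", "uveitis", "uveitis"]
accuracy = total_classification_accuracy(true_labels, predicted)

# Arithmetic check only: 127 correct out of 134 rounds to 94.78%.
example_percentage = round(100.0 * 127 / 134, 2)
```

Here training would stop at epoch index 3 and the accuracy is 75%. The last line is an inference from arithmetic rather than a figure stated in the source: with 134 test subjects, an accuracy of 94.78% would correspond to 127 correct classifications.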


Fig. 8. The detail wavelet coefficients corresponding to the D1 frequency band of the ophthalmic arterial Doppler signals recorded from: (a) 35-year-old healthy subject (subject no: 12), (b) 29-year-old subject suffering from ophthalmic artery stenosis (subject no: 17), (c) 37-year-old subject suffering from ocular Behcet disease (subject no: 35), (d) 34-year-old subject suffering from uveitis disease (subject no: 41). [Panels (a)–(d): detail wavelet coefficients versus number of detail wavelet coefficients.]

In order to investigate the effect of the number of hidden layers on the classification accuracies of the ophthalmic arterial Doppler signals, tests were carried out using the fourth MLPNN trained with the Levenberg-Marquardt algorithm. Apart from the MLPNN architecture with two hidden layers, architectures with one, four, and six hidden layers were also tried. The total classification accuracy obtained for each MLPNN architecture when the ophthalmic arterial Doppler signals were classified is shown in Fig. 10. It can be seen that the MLPNNs with one and two hidden layers offer better accuracy than the others, and that the MLPNN with two hidden layers is marginally better than the MLPNN with one hidden layer. Hence, the MLPNN with two hidden layers was chosen for this application.


Fig. 9. ROC curves showing the effect of the wavelets on classification accuracies of the ophthalmic arterial Doppler signals. [Axes: sensitivity versus 1 − specificity; curves: db1, db8, coif4, sym10.]

Table 2
The number of training epochs and total classification accuracies of the four MLPNNs trained with five different training algorithms used for classification of the ophthalmic arterial Doppler signals

| MLPNNs with different inputs | Training algorithm | Number of training epochs | Total classification accuracy (%) |
|---|---|---|---|
| First MLPNN | BP | 6000 | 85.07 |
| | DBD | 5000 | 86.57 |
| | EDBD | 4800 | 86.57 |
| | QP | 4200 | 88.06 |
| | Levenberg-Marquardt | 4000 | 89.55 |
| Second MLPNN | BP | 4200 | 85.82 |
| | DBD | 3800 | 87.31 |
| | EDBD | 3500 | 88.06 |
| | QP | 3000 | 89.55 |
| | Levenberg-Marquardt | 2500 | 91.04 |
| Third MLPNN | BP | 4000 | 86.57 |
| | DBD | 3800 | 88.06 |
| | EDBD | 3500 | 88.81 |
| | QP | 2700 | 89.55 |
| | Levenberg-Marquardt | 2300 | 91.79 |
| Fourth MLPNN | BP | 2000 | 90.30 |
| | DBD | 1800 | 92.54 |
| | EDBD | 1600 | 92.54 |
| | QP | 1200 | 93.28 |
| | Levenberg-Marquardt | 800 | 94.78 |


Fig. 10. Total classification accuracy obtained for each MLPNN architecture when the ophthalmic arterial Doppler signals were classified. [Vertical axis: classification accuracy (%), 86–96; categories: one, two, four, and six hidden layered MLPNN.]

3.3. Application of MLPNNs to internal carotid arterial Doppler signals

The internal carotid arterial Doppler signals were obtained from 160 subjects. The group consisted of 78 females and 82 males with ages ranging from 18 to 67 years and a mean age of 32.0 ± 0.5 years. Toshiba 140A color Doppler ultrasonography was used during examinations and sonograms were taken into consideration. According to the examination results, 59 of the 160 subjects suffered from internal carotid artery stenosis, 53 suffered from internal carotid artery occlusion, and the rest were healthy subjects (control group) who had no arterial disease. The group suffering from internal carotid artery stenosis consisted of 26 females and 33 males with a mean age of 33.0 ± 0.5 years (range 21–67), the group suffering from internal carotid artery occlusion consisted of 29 females and 24 males with a mean age of 32.5 ± 0.5 years (range 20–65), and the healthy subjects were 23 females and 25 males with a mean age of 31.5 ± 0.5 years (range 18–65).

Internal carotid artery examinations were performed with a Doppler unit using a 5 MHz ultrasonic transducer. The measurement system consisted of five units: a 5 MHz ultrasonic transducer, an analog Doppler unit (Toshiba 140A color Doppler ultrasonography), a recorder (Sony), an analog/digital interface board (Sound Blaster Pro-16 bit), and a personal computer with a printer. The ultrasonic transducer was applied on a horizontal plane to the neck using water-soluble gel as a coupling gel. Care was taken not to apply pressure to the neck in order to avoid artifacts. The probe was most often placed at an angle of 60° towards the internal carotid artery. The internal carotid arterial Doppler signals were sampled at 5 kHz and framed by equal time intervals. The frame length was chosen as 256 samples.

In this application, 60 of the 160 subjects were used for training and the rest for testing. To obtain better network generalization, 12 training subjects were selected randomly to be used as a cross validation set. The training set consisted of 20 subjects suffering from internal carotid artery stenosis, 20 subjects suffering from internal carotid artery occlusion, and 20 healthy subjects. The testing set consisted of 39 subjects suffering from internal carotid artery stenosis, 33 subjects suffering from internal carotid artery occlusion, and 28 healthy subjects. The cross validation set consisted of 4 subjects suffering from internal carotid artery stenosis, 4 subjects suffering from internal carotid artery occlusion, and 4 healthy subjects.

Fig. 11. PSDs obtained by the FFT method of the internal carotid arterial Doppler signals recorded from: (a) 33-year-old healthy subject (subject no: 10), (b) 35-year-old subject suffering from internal carotid artery stenosis (subject no: 23), (c) 36-year-old subject suffering from internal carotid artery occlusion (subject no: 28). [Panels (a)–(c): power spectral density (dB/Hz) versus frequency (Hz).]

Our previous study demonstrated that standard waveform indices are inadequate to evaluate internal carotid arterial Doppler waveforms [11]. As mentioned in the previous study [11], it is difficult to separate the three groups of subjects (healthy, internal carotid artery stenosis, and internal carotid artery occlusion) using the values of the standard waveform indices, since there is a considerable overlap in the standard waveform indices of the three groups. In this application, the values of the standard waveform indices were not used as MLPNN inputs because of the imprecise boundaries between the values of the standard waveform indices of the three groups.

The 129 points of the logarithm of the values of PSDs obtained by the FFT method were used as the inputs of the first MLPNN. Sample PSDs obtained by the FFT method of the internal carotid arterial Doppler signals recorded from a 33-year-old healthy subject (subject no: 10), a 35-year-old subject suffering from internal carotid artery stenosis (subject no: 23), and a 36-year-old subject suffering from internal carotid artery occlusion (subject no: 28) are shown in Figs. 11(a)–(c), respectively. It can be noted that the PSD values of the internal carotid arterial Doppler signals obtained from a healthy subject (Fig. 11(a)) and subjects suffering from internal carotid arterial diseases (Figs. 11(b) and (c)) are different from each other.

Fig. 12. PSDs obtained by the AR method of the internal carotid arterial Doppler signals recorded from: (a) 33-year-old healthy subject (subject no: 10), (b) 35-year-old subject suffering from internal carotid artery stenosis (subject no: 23), (c) 36-year-old subject suffering from internal carotid artery occlusion (subject no: 28). [Panels (a)–(c): power spectral density (dB/Hz) versus frequency (Hz).]

The 129 points of the logarithm of the values of PSDs obtained by the AR method were used as the inputs of the second MLPNN. Sample PSDs obtained by the AR method of the internal carotid arterial Doppler signals recorded from a 33-year-old healthy subject (subject no: 10), a 35-year-old subject suffering from internal carotid artery stenosis (subject no: 23), and a 36-year-old subject suffering from internal carotid artery occlusion (subject no: 28) are shown in Figs. 12(a)–(c), respectively. It can be noted that the PSD values of the internal carotid arterial Doppler signals obtained from a healthy subject (Fig. 12(a)) and subjects suffering from internal carotid arterial diseases (Figs. 12(b) and (c)) are different from each other.

Fig. 13. PSDs obtained by the ARMA method of the internal carotid arterial Doppler signals recorded from: (a) 33-year-old healthy subject (subject no: 10), (b) 35-year-old subject suffering from internal carotid artery stenosis (subject no: 23), (c) 36-year-old subject suffering from internal carotid artery occlusion (subject no: 28). [Panels (a)–(c): power spectral density (dB/Hz) versus frequency (Hz).]

The 129 points of the logarithm of the values of PSDs obtained by the ARMA method were used as the inputs of the third MLPNN. Sample PSDs obtained by the ARMA method of the internal carotid arterial Doppler signals recorded from a 33-year-old healthy subject (subject no: 10), a 35-year-old subject suffering from internal carotid artery stenosis (subject no: 23), and a 36-year-old subject suffering from internal carotid artery occlusion (subject no: 28) are shown in Figs. 13(a)–(c), respectively. It can be noted that the PSD values of the internal carotid arterial Doppler signals obtained from a healthy subject (Fig. 13(a)) and subjects suffering from internal carotid arterial diseases (Figs. 13(b) and (c)) are different from each other.


Fig. 14. The detail wavelet coefficients corresponding to the D1 frequency band of the internal carotid arterial Doppler signals recorded from: (a) 33-year-old healthy subject (subject no: 10), (b) 35-year-old subject suffering from internal carotid artery stenosis (subject no: 23), (c) 36-year-old subject suffering from internal carotid artery occlusion (subject no: 28). [Panels (a)–(c): detail wavelet coefficients versus number of detail wavelet coefficients.]

Fig. 15. ROC curves showing the effect of the wavelets on classification accuracies of the internal carotid arterial Doppler signals. [Axes: sensitivity versus 1 − specificity; curves: db1, db8, coif4, sym10.]

The computed discrete wavelet coefficients were used as the inputs of the fourth MLPNN. In order to extract features, the wavelet coefficients corresponding to the D1–D7 and A7 frequency bands of the internal carotid arterial Doppler signals were computed. It was observed that the values of the coefficients are very close to zero in A7. The coefficients corresponding to the frequency band A7 were therefore discarded, thus reducing the number of feature vectors representing the signal. For each internal carotid arterial Doppler signal frame (256 samples), the detail wavelet coefficients ($d_k$, $k = 1, 2, 3, 4, 5, 6, 7$) at the first, second, third, fourth, fifth, sixth and seventh levels (128 + 64 + 32 + 16 + 8 + 4 + 2 coefficients) were computed. Thus 254 detail wavelet coefficients were obtained for each internal carotid arterial Doppler signal frame. In order to reduce the dimensionality of the extracted feature vectors, the statistics explained in Section 2.3 were computed over the set of the wavelet coefficients. The fourth MLPNN then had 41 inputs, equal to the number of input feature vectors. The detail wavelet coefficients corresponding to the D1 frequency band of the internal carotid arterial Doppler signals obtained from a 33-year-old healthy subject (subject no: 10), a 35-year-old subject suffering from internal carotid artery stenosis (subject no: 23), and a 36-year-old subject suffering from internal carotid artery occlusion (subject no: 28) are given in Figs. 14(a)–(c), respectively. It can be noted that the detail wavelet coefficients of the internal carotid arterial Doppler signals obtained from a healthy subject (Fig. 14(a)) and subjects suffering from internal carotid arterial diseases (Figs. 14(b) and (c)) are different from each other. In order to investigate the effect of other wavelets on the classification accuracies of the internal carotid arterial Doppler signals, tests were also carried out using other wavelets. Apart from db1, sym10, coif4, and db8 were also tried. The effects of these wavelets were compared with the use of statistical tools such as ROC curves. As seen from Fig. 15, the Daubechies wavelet offers better accuracy than the others, and db1 is marginally better than db8.
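The quantities plotted on an ROC curve can be obtained as in the sketch below, which sweeps a decision threshold over classifier scores and records sensitivity against 1 − specificity. The reduction to a two-class problem (diseased versus healthy), the scores, and the labels are assumptions made for illustration; the source does not describe how its ROC curves were constructed.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per threshold.

    scores: higher means "more likely diseased".
    labels: 1 for diseased, 0 for healthy.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, l in zip(scores, labels)
                       if s >= threshold and l == 1)
        false_pos = sum(1 for s, l in zip(scores, labels)
                        if s >= threshold and l == 0)
        points.append((false_pos / negatives, true_pos / positives))
    return points

# Hypothetical classifier outputs for six subjects.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
```

A curve that climbs steeply along the sensitivity axis before moving right, as the first points do here, corresponds to the "good test" shape described in the text.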

The performances of the four MLPNNs trained with five different training algorithms were evaluated by the convergence rates (number of training epochs) and the total classification accuracies. In this application, training of the MLPNNs was stopped when the error on the cross validation set increased. The classification accuracy, which is defined as the percentage ratio of the number of subjects correctly classified to the total number of subjects considered for classification, depends on the features used as inputs of the MLPNNs. The number of training epochs and total classification accuracies of the four MLPNNs trained with five different algorithms used for classification of the internal carotid arterial Doppler signals are presented in Table 3. As seen from Table 3, the first MLPNN (FFT PSD values used as inputs) trained with five different training algorithms has the lowest accuracies and the slowest convergence rates. The second MLPNN (AR PSD values used as inputs) and the third MLPNN (ARMA PSD values used as inputs) trained with five different training algorithms offer nearly the same accuracies and convergence rates. The fourth MLPNN (discrete wavelet coefficients used as inputs) trained with five different training algorithms offers the highest accuracies and the fastest convergence rates. According to Table 3, the total classification accuracies of the MLPNNs trained with the Levenberg-Marquardt algorithm are higher than those of the MLPNNs trained with the other training algorithms.


Table 3
The number of training epochs and total classification accuracies of the four MLPNNs trained with five different training algorithms used for classification of the internal carotid arterial Doppler signals

MLPNNs with different inputs    Training algorithm     Number of training epochs    Total classification accuracy (%)

First MLPNN                     BP                     5200                         85.00
                                DBD                    4700                         86.00
                                EDBD                   4500                         87.00
                                QP                     4200                         88.00
                                Levenberg-Marquardt    3800                         90.00

Second MLPNN                    BP                     3500                         86.00
                                DBD                    3000                         88.00
                                EDBD                   2800                         89.00
                                QP                     2500                         89.00
                                Levenberg-Marquardt    2000                         92.00

Third MLPNN                     BP                     3200                         87.00
                                DBD                    2700                         89.00
                                EDBD                   2600                         90.00
                                QP                     2300                         91.00
                                Levenberg-Marquardt    1800                         93.00

Fourth MLPNN                    BP                     2800                         89.00
                                DBD                    2200                         91.00
                                EDBD                   2000                         91.00
                                QP                     1500                         93.00
                                Levenberg-Marquardt    600                          96.00
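The comparisons drawn from Table 3 can be checked mechanically. The sketch below encodes the table values as given above and reports, for each MLPNN, the most accurate training algorithm and its epoch count. The dictionary layout and the function name are assumptions made for illustration.

```python
# (epochs, accuracy %) per training algorithm, copied from Table 3.
TABLE_3 = {
    "First MLPNN (FFT PSD)": {
        "BP": (5200, 85.0), "DBD": (4700, 86.0), "EDBD": (4500, 87.0),
        "QP": (4200, 88.0), "Levenberg-Marquardt": (3800, 90.0),
    },
    "Second MLPNN (AR PSD)": {
        "BP": (3500, 86.0), "DBD": (3000, 88.0), "EDBD": (2800, 89.0),
        "QP": (2500, 89.0), "Levenberg-Marquardt": (2000, 92.0),
    },
    "Third MLPNN (ARMA PSD)": {
        "BP": (3200, 87.0), "DBD": (2700, 89.0), "EDBD": (2600, 90.0),
        "QP": (2300, 91.0), "Levenberg-Marquardt": (1800, 93.0),
    },
    "Fourth MLPNN (DWT coefficients)": {
        "BP": (2800, 89.0), "DBD": (2200, 91.0), "EDBD": (2000, 91.0),
        "QP": (1500, 93.0), "Levenberg-Marquardt": (600, 96.0),
    },
}


def most_accurate_algorithm(results):
    """Name of the training algorithm with the highest accuracy."""
    return max(results, key=lambda name: results[name][1])


best = {net: most_accurate_algorithm(res) for net, res in TABLE_3.items()}
for net, algorithm in best.items():
    epochs, accuracy = TABLE_3[net][algorithm]
    print(f"{net}: {algorithm}, {epochs} epochs, {accuracy:.2f}%")

# Network that converges in the fewest epochs under Levenberg-Marquardt.
fastest = min(TABLE_3, key=lambda net: TABLE_3[net]["Levenberg-Marquardt"][0])
```

For every input type the Levenberg-Marquardt algorithm gives the highest accuracy, and the network fed with discrete wavelet coefficients needs the fewest epochs, in line with the discussion of Table 3.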

In order to investigate the effect of the number of hidden layers on the classification accuracies of the internal carotid arterial Doppler signals, tests were carried out using the fourth MLPNN trained with the Levenberg-Marquardt algorithm. Apart from the two hidden layered MLPNN architecture, one, four, and six hidden layered MLPNN architectures were also tried. The total classification accuracy obtained for each MLPNN architecture when the internal carotid arterial Doppler signals were classified is shown in Fig. 16. It can be seen that the one and two hidden layered MLPNNs offer better accuracy than the others, and the two hidden layered MLPNN is marginally better than the one hidden layered MLPNN. Hence, the two hidden layered MLPNN was chosen for this application.

4. Conclusion

Neural networks have been used in a great number of automated diagnostic system applications because of the belief that they have great predictive power. Since classification is more accurate when the pattern is simplified through representation by important features, feature extraction and selection play an important role in classifying systems such as neural networks. Feature extraction is the determination of a feature or a feature vector from a pattern vector. Feature selection is an optional stage, whereby the feature vector is reduced in size to include only what may be considered, from the classification viewpoint, the most relevant features required for discrimination. In the present study, feature extraction from the ophthalmic and internal carotid arterial Doppler signals for diagnosis of arterial diseases was examined. In each application, four MLPNNs with different inputs (feature vectors) were used for diagnosis of ophthalmic and internal carotid arterial diseases. The assessment of feature extraction methods was performed by taking into consideration the performances of the MLPNNs. The performances of the MLPNNs were evaluated by the convergence rates (number of training epochs) and the total classification accuracies. In each application, the first MLPNN (FFT PSD values used as inputs) had the lowest accuracy and the slowest convergence. The second MLPNN (AR PSD values used as inputs) and the third MLPNN (ARMA PSD values used as inputs) offered nearly the same accuracies and convergence rates. The fourth MLPNN (discrete wavelet coefficients used as inputs) offered the highest accuracy and the fastest convergence. The conclusions drawn in the applications demonstrated that discrete wavelet coefficients are the features that best represent the Doppler signals, and that using the discrete wavelet coefficients gives the best distinction between classes.

Fig. 16. Total classification accuracy obtained for each MLPNN architecture when the internal carotid arterial Doppler signals were classified. (Bar chart; vertical axis: classification accuracy (%), ticks from 84 to 98; bars: one, two, four, and six hidden layered MLPNN.)

Acknowledgements

This study has been supported by the State Planning Organization of Turkey (Project no: 2003K120470-20, Project name: Biomedical signal acquisition, processing and imaging).


Inan Güler graduated from Erciyes University in 1981. He took his M.S. degree from Middle East Technical University in 1985, and his Ph.D. degree from Istanbul Technical University in 1990, all in Electronic Engineering. He is a professor at Gazi University, where he is Head of Department. His areas of interest include biomedical systems, biomedical signal processing, biomedical instrumentation, electronic circuit design, neural networks, and artificial intelligence. He has written more than 100 articles on biomedical engineering.

Elif Derya Übeylı graduated from Çukurova University in 1996 and took her M.S. degree in 1998, both in electronic engineering. She took her Ph.D. degree from Gazi University in electronics and computer technology. She is an instructor at the Department of Electrical and Electronics Engineering at TOBB Economics and Technology University. Her areas of interest are biomedical signal processing, neural networks, and artificial intelligence.
