Combinations of Two LMS Adaptive Filters

Tonu Trump
Tallinn University of Technology
Department of Radio and Telecommunication Engineering
Ehitajate tee 5, 19086 Tallinn, Estonia

[email protected]

Abstract: In this paper we consider the combination of two adaptive filters that are simultaneously applied to the same inputs. The combination provides an adaptive algorithm that is able to find a good trade–off between the initial convergence speed and the mean–square error in steady state. One of the filters has a large step size allowing fast convergence and the other one has a small step size for a small steady state error. The outputs of the filters are combined through a mixing parameter. There are several ways to compute the combination parameter; in our treatment it is computed from the output signals of the individual filters. The scheme is optimal in the sense that it results from minimizing the mean–square error of the combination. System identification and beamforming are examined as application areas of the technique.

Key–Words: Adaptive filtering, antenna arrays, system identification, combination of two adaptive filters.

1 Introduction

Designing a Least Mean Square (LMS) family adaptive algorithm involves solving the well–known trade–off between the initial convergence speed and the mean–square error in steady state according to the requirements of the application at hand. The trade–off is controlled by the step-size parameter of the algorithm. A large step size leads to fast initial convergence, but the algorithm also exhibits a large mean–square error in the steady state; on the contrary, a small step size slows down the convergence but results in a small steady state error [1, 2]. In several applications it is, however, desirable to have both, and hence it would be very useful to be able to design algorithms that can overcome the named trade–off.

Recently there has been interest in a combination scheme that is able to optimize the trade–off between convergence speed and steady state error [3]. The scheme consists of two adaptive filters that are simultaneously applied to the same inputs, as depicted in Figure 1. One of the filters has a large step size allowing fast convergence and the other one has a small step size for a small steady state error. The outputs of the filters are combined through a mixing parameter λ. The performance of this scheme has been studied for some parameter update schemes [4, 5]. The reference [4] uses a convex combination, i.e. λ is constrained to lie between 0 and 1. The parameter λ is in those papers found using an LMS type adaptive scheme and computing the sigmoidal function of the result. The reference [5] takes another approach, computing the mixing parameter using an affine combination. That paper uses the ratio of time averages of the instantaneous errors of the filters; the error function of the ratio is then computed to obtain λ.

In this paper we compute the mixing parameter λ from the output signals of the individual filters. This way of calculating the mixing parameter is optimal in the sense that it results from minimization of the mean–square error of the combined filter. The scheme was independently proposed in [6] and [7] and analysed in [8]. We investigate two applications of the combination: system identification and beamforming. We describe each of the applications in detail and present a corresponding analysis.

We will assume throughout the paper that the signals are complex–valued and that the combination scheme uses two LMS adaptive filters. Italic, bold face lower case and bold face upper case letters will be used for scalars, column vectors and matrices respectively. The superscript T denotes transposition and the superscript H Hermitian transposition of a matrix. The operator E[·] denotes mathematical expectation and Re{·} is the real part of a complex variable.

2 Combination of Two Adaptive Filters

Let us consider two adaptive filters, as shown in Figure 1, each of them updated using the LMS adaptation

Advances in Sensors, Signals, Visualization, Imaging and Simulation

ISBN: 978-1-61804-119-7 53


Figure 1: The combined adaptive filter.

rule

e_i(n) = d(n) − w_i^H(n−1) x(n),   (1)

w_i(n) = w_i(n−1) + µ_i e_i^*(n) x(n).   (2)

In the above w_i(n) is the N-vector of coefficients of the i-th adaptive filter, with i = 1, 2, and x(n) is the known N-dimensional input vector, common for both of the adaptive filters. The input process is assumed to be a zero mean wide sense stationary Gaussian process. µ_i is the step size of the i-th adaptive filter. We assume without loss of generality that µ_1 > µ_2. The case µ_1 = µ_2 is not interesting, as in this case the two filters remain equal and the combination reduces to a single filter.
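The update (1)–(2) translates directly into code. The following is a minimal illustrative sketch, not taken from the paper; the function name and the test setup are mine. NumPy's `vdot` conjugates its first argument, which matches the Hermitian inner product used here.

```python
import numpy as np

def lms_step(w, x, d, mu):
    """One complex LMS iteration: error equation (1), update equation (2)."""
    e = d - np.vdot(w, x)             # e(n) = d(n) - w^H(n-1) x(n)
    w_new = w + mu * np.conj(e) * x   # w(n) = w(n-1) + mu e*(n) x(n)
    return w_new, e
```

Running two such filters with step sizes µ_1 > µ_2 on the same (x(n), d(n)) stream produces the outputs y_1(n), y_2(n) that the combination below operates on.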

The desired signal in (1) can be expressed as

d(n) = w_o^H x(n) + ζ(n),   (3)

where the vector w_o is the optimal Wiener filter coefficient vector for the problem at hand and the process ζ(n) is the irreducible error, which is statistically independent of all the other signals.

The outputs of the two adaptive filters are combined according to

y(n) = λ(n) y_1(n) + [1 − λ(n)] y_2(n),   (4)

where y_i(n) = w_i^H(n−1) x(n) and the mixing parameter λ(n) can be any real number.

We define the a priori system error signal as the difference between the output signal of the optimal Wiener filter at time n, given by y_o(n) = w_o^H x(n) = d(n) − ζ(n), and the output signal of our adaptive scheme y(n):

e_a(n) = y_o(n) − λ(n) y_1(n) − (1 − λ(n)) y_2(n).   (5)

Let us now find λ(n) by minimizing the mean square of the a priori system error. The derivative of E[|e_a(n)|²] with respect to λ(n) reads

∂E[|e_a(n)|²] / ∂λ(n) = 2E[Re{(y_o(n) − y_2(n))(y_2(n) − y_1(n))^*} + λ(n)|y_2(n) − y_1(n)|²].   (6)

Setting the derivative to zero results in

λ(n) = E[Re{(d(n) − y_2(n))(y_1(n) − y_2(n))^*}] / E[|y_1(n) − y_2(n)|²],   (7)

where we have replaced the Wiener filter output signal y_o(n) by its observable noisy version d(n). Note, however, that because the input signal x(n) and the irreducible error ζ(n) are independent random processes, this replacement introduces no error into our calculations.
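As an illustrative sanity check of (7), not part of the paper, the expectations can be replaced by sample means over recorded filter outputs. The function name and the synthetic data model in the test are mine:

```python
import numpy as np

def mixing_parameter(d, y1, y2):
    """Sample-mean estimate of the optimal mixing parameter in (7)."""
    num = np.mean(np.real((d - y2) * np.conj(y1 - y2)))
    den = np.mean(np.abs(y1 - y2) ** 2)
    return num / den
```

If filter 1 already matches the Wiener output (y_1 = y_o) while filter 2 does not, (7) evaluates to λ = 1, i.e. the combination puts its full weight on the better filter.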

3 System Identification

In several areas it is essential to build a mathematical model of some phenomenon or system. In this class of applications the adaptive filter can be used to find the best fit of a linear model to an unknown plant. The plant and the adaptive filter are driven by the same known input signal, and the plant output provides the desired signal of the adaptive filter. The plant can be dynamic, in which case we have a time varying model. The system identification configuration is depicted in Figure 2. As before, x(n) is the input signal, v(n) is the measurement noise, y(n) is the adaptive filter output signal and e(n) is the error signal. The desired signal is d(n) = w_o^H x(n) + ζ(n), where w_o is the vector of Wiener filter coefficients and the irreducible error ζ(n) consists of the measurement noise v(n) together with the effects of the plant that cannot be explained with a length-N linear model. The result of a pure system identification problem is the vector of adaptive filter coefficients.

Here we use the combination of two adaptive filters described in the previous section to solve the system identification problem.

3.1 Excess Mean–Square Error

In this section we are interested in finding expressions that characterize the transient performance of the combined algorithm, i.e. we intend to derive formulae that predict the entire course of adaptation of the algorithm. Before we can proceed, however, we need to introduce some notation. First, let us denote the weight error



Figure 2: The system identification configuration.

vector of the i-th filter as w̃_i(n) = w_o − w_i(n). Then the equivalent weight error vector of the combined adaptive filter is w̃(n) = λw̃_1(n) + (1 − λ)w̃_2(n).

The a priori estimation error of an individual filter is defined as

e_{i,a}(n) = w̃_i^H(n−1) x(n).   (8)

It follows from (5) that we can express the a priori error of the combination as

e_a(n) = λ(n) e_{1,a}(n) + (1 − λ(n)) e_{2,a}(n)   (9)

and because λ(n) is, according to (7), a ratio of mathematical expectations and hence deterministic, we have for the excess mean–square error of the combination, EMSE(n) = E[|e_a(n)|²],

E[|e_a(n)|²] = λ²E[|e_{1,a}(n)|²] + 2λ(1 − λ)E[Re{e_{1,a}(n) e_{2,a}^*(n)}] + (1 − λ)²E[|e_{2,a}(n)|²].   (10)

As e_{i,a}(n) = w̃_i^H(n−1) x(n), the expression for the excess mean–square error becomes

E[|e_a(n)|²] = λ²E[w̃_1^H x x^H w̃_1] + 2λ(1 − λ)E[Re{w̃_1^H x x^H w̃_2}] + (1 − λ)²E[w̃_2^H x x^H w̃_2].   (11)

In what follows we often drop the explicit time index n, as we have done in (11), where it is not needed to avoid confusion.

Noting that y_i(n) = w_i^H(n−1) x(n), we can rewrite the expression for λ(n) in (7) as

λ(n) = (E[w̃_2^H x x^H w̃_2] − E[Re{w̃_2^H x x^H w̃_1}]) / d,   (12)

where d = E[w̃_1^H x x^H w̃_1] − 2E[Re{w̃_1^H x x^H w̃_2}] + E[w̃_2^H x x^H w̃_2]. We thus need to investigate the evolution of the individual terms of the type EMSE_{k,l} = E[w̃_k^H(n−1) x(n) x^H(n) w̃_l(n−1)] in order to reveal the time evolution of EMSE(n) and λ(n).

Let us now define the eigendecomposition of the input correlation matrix as

Q^H R_x Q = Ω,   (13)

where Q is a unitary matrix whose columns are the orthogonal eigenvectors of R_x and Ω is a diagonal matrix having the eigenvalues ω_i, associated with the corresponding eigenvectors, on its main diagonal. We also define the transformed weight error vector as

v_i(n) = Q^H w̃_i(n).   (14)

It can be shown [8] that

EMSE_{k,l} = E[w̃_k^H(n−1) x(n) x^H(n) w̃_l(n−1)]

can be computed as

EMSE_{k,l} = Σ_{i=0}^{N−1} ω_i E[v_{k,i}^*(n−1) v_{l,i}(n−1)],   (15)

where v_{k,i} is the i-th element of the vector v_k of the k-th filter. The EMSE of the combined filter is then

EMSE = Σ_{i=0}^{N−1} ω_i E[|λ(n) v_{k,i}(n−1) + (1 − λ(n)) v_{l,i}(n−1)|²].   (16)

The components of the type Υ_{k,l,m} = E[v_{k,m}^*(n−1) v_{l,m}(n−1)] are given by

Υ_{k,l,m} = (1 − µ_k ω_m)^n (1 − µ_l ω_m)^n [ |v_m(0)|² + J_min / (ω_m² − ω_m/µ_l − ω_m/µ_k) ] − J_min / (ω_m² − ω_m/µ_l − ω_m/µ_k),   (17)

where J_min = E[|e_o|²] is the minimum mean–square error produced by the corresponding Wiener filter. To compute λ(n) we use (12), substituting (15) for its individual components.
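Under my reading of the extracted layout of (17), with both fractions sharing the denominator ω_m² − ω_m/µ_l − ω_m/µ_k, the closed form can be transcribed and sanity-checked numerically; the function and symbol names below are mine:

```python
def upsilon(n, mu_k, mu_l, omega_m, v0_sq, j_min):
    """Cross term E[v_{k,m}*(n-1) v_{l,m}(n-1)] per (17) for one eigenmode."""
    c = j_min / (omega_m**2 - omega_m / mu_l - omega_m / mu_k)
    decay = (1 - mu_k * omega_m)**n * (1 - mu_l * omega_m)**n
    return decay * (v0_sq + c) - c
```

At n = 0 the decay factor equals 1 and Υ reduces to the initial value |v_m(0)|²; as n grows the decay vanishes and Υ settles at −c, a small positive steady state excess term proportional to J_min.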

4 Adaptive Sensor Array

In this section we describe how to use the combination of two adaptive filters in an adaptive beamformer.



The beamformer we employ here is often termed the Generalized Sidelobe Canceller [1].

Let φ denote the angle of incidence of a planar wave impinging on a linear sensor array, measured with respect to the normal of the array. The electrical angle θ is related to the incidence angle by

θ = (2πδ/λ) sin φ,   (18)

where λ is the wavelength of the incident wave and δ is the spacing between adjacent sensors of the linear array.

Suppose that the signal impinging on the array of M = N + 1 sensors is given by

u(n) = A(Θ)s(n) + v(n),   (19)

where s(n) is the vector of emitter signals, Θ is a collection of directions of arrival, A(Θ) is the array steering matrix with its columns a(θ) defined as responses toward the individual sources s(n), and v(n) is a vector of additive circularly symmetric Gaussian noise. The M-vectors

a(θ) = [1, e^{jθ}, …, e^{j(M−1)θ}]^T   (20)

are called the steering vectors of the respective sources. We assume that the source of interest is located at the electrical angle θ_0.
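Equations (18) and (20) translate directly into code; a small sketch with function names of my own choosing:

```python
import numpy as np

def electrical_angle(phi, delta, wavelength):
    """Electrical angle theta of (18) for incidence angle phi."""
    return 2.0 * np.pi * delta / wavelength * np.sin(phi)

def steering_vector(theta, M):
    """Steering vector a(theta) of (20) for an M-sensor uniform linear array."""
    return np.exp(1j * theta * np.arange(M))
```

For example, with half-wavelength spacing (δ = λ/2) an end-fire source (φ = π/2) gives θ = π and the alternating steering vector [1, −1, 1, −1, …].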

The block diagram of the Generalized Sidelobe Canceller is shown in Figure 3. The structure consists of two branches. The upper branch is the steering branch, which directs its beam toward the desired source. The lower branch is the blocking branch, which blocks the signals impinging on the array from the direction of the desired source and includes an adaptive algorithm that minimizes the mean–square error between the output signals of the branches.

The weights in the steering branch, w_s, are selected from the condition

w_s^H a(θ_0) = g,   (21)

i.e. we require the response in the direction of the source of interest θ_0 to equal a constant g. Common choices for g are g = M and g = 1. Here we have used g = M.

The signal at the output of the upper branch is given by

d(n) = w_s^H u(n).   (22)

In the lower branch we have a blocking matrix that blocks any signal coming from the direction θ_0. The columns of the M × (M − 1) blocking matrix

Figure 3: Block diagram of the generalized sidelobe canceller.

C_b are defined as the orthogonal complement of the steering vector a(θ_0) used in the upper branch:

a^H(θ_0) C_b = 0.   (23)

The vector valued signal x(n) at the output of the blocking matrix is formed as

x(n) = C_b^H u(n).   (24)

The output of the algorithm is

e(n) = d(n) − w_b^H(n) x(n).   (25)

The signals x(n) and d(n) can be used as the input and desired signals respectively in an adaptive algorithm to select the blocking weights w_b. In this paper we use the combination of two adaptive filters, which gives us fast initial convergence and low steady state misadjustment at the same time.
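The paper does not prescribe how C_b is constructed; one common choice, shown here as an assumption of mine, takes an orthonormal basis of the orthogonal complement of a(θ_0) from the SVD, which satisfies (23) by construction:

```python
import numpy as np

def blocking_matrix(a0):
    """M x (M-1) matrix C_b with a0^H C_b = 0 (eq. 23), via an SVD null space."""
    M = a0.shape[0]
    _, _, vh = np.linalg.svd(a0.conj().reshape(1, M))
    return vh[1:].conj().T   # orthonormal basis of the null space of the row a0^H

def gsc_signals(u, ws, Cb):
    """Steering-branch desired signal (22) and blocked input (24)."""
    d = np.vdot(ws, u)       # d(n) = ws^H u(n)
    x = Cb.conj().T @ u      # x(n) = Cb^H u(n)
    return d, x
```

Any component of u(n) arriving exactly from θ_0 lies along a(θ_0) and is therefore removed from x(n), so the adaptive branch cannot cancel the desired source.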

The EMSE of the adaptive algorithm can be analysed as in Section 3.1. In this application we are also interested in the signal to interference and noise ratio (SINR) at the array output. To evaluate this we first note that, according to (21), the power that the signal of interest generates at the array output is

P_s = w_s^H a(θ_0) σ_s0² a^H(θ_0) w_s = |g|² σ_s0²,   (26)

where σ_s0² is the variance of the useful signal arriving from the angle θ_0.

To find the interference and noise power we first define the reduced signal vector s and a reduced DOA collection Θ, where we have left out the signal of interest and the steering vector corresponding to the useful signal but kept all the interferers and the interference steering vectors. The corresponding array steering matrix is A(Θ). The correlation matrix of interference and noise in the signal x(n), which is the input signal of our adaptive scheme, is then given by

R_x = C_b^H A(Θ) E[s s^H] A^H(Θ) C_b + C_b^H C_b σ_v²,   (27)

where σ_v² is the noise variance; the first term of the sum is due to the interfering sources and the second term is due to the noise.

It follows from standard Wiener filtering theory that the minimum interference and noise power at the array output is given by

J_min = σ_int,v² − p^H R_x^{−1} p,   (28)

where the desired signal variance, excluding the signal from the source of interest, is

σ_int,v² = w_s^H A(Θ) E[s s^H] A^H(Θ) w_s + σ_v² w_s^H w_s   (29)

and the cross-correlation vector between the adaptive filter input signal and the desired signal, excluding the signal from the source of interest, is

p = C_b^H A(Θ) E[s s^H] A^H(Θ) w_s + σ_v² C_b^H w_s.   (30)

We can now find the eigendecomposition of R_x and use the resulting eigenvalues in (15) and (16) to find the excess mean–square error due to interference and noise only, EMSE_int,v. The interference and noise power at the array output is then the minimum interference and noise power plus this excess mean–square error,

P_v,int(n) = J_min + EMSE_int,v(n),   (31)

and the signal to interference and noise ratio is thus given by

SINR(n) = P_s / P_v,int(n).   (32)

5 Simulation Results

In order to obtain a practical algorithm, the expectation operators in both the numerator and the denominator of (7) have been replaced by exponential averaging of the type

P_av(n) = (1 − γ) P_av(n − 1) + γ p(n),   (33)

where p(n) is the quantity to be averaged, P_av(n) is the averaged quantity and γ is the smoothing parameter. The averaged quantities were then used in (7) to obtain λ. The curves shown in the figures that follow are averages over 100 independent trials. The noisy blue line represents the simulation result and the smooth red line the theoretical result. We often show the simulation results and the theoretical curves in the same figure; in several cases the curves overlap and are therefore indistinguishable.
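A minimal sketch of the resulting practical estimator, combining (7) with the smoother (33); the class name and interface are mine:

```python
import numpy as np

class LambdaEstimator:
    """Mixing parameter of (7) with expectations replaced by
    exponential averages of the form (33)."""
    def __init__(self, gamma):
        self.gamma = gamma
        self.num = 0.0   # smoothed numerator of (7)
        self.den = 0.0   # smoothed denominator of (7)

    def update(self, d, y1, y2):
        g = self.gamma
        self.num = (1 - g) * self.num + g * np.real((d - y2) * np.conj(y1 - y2))
        self.den = (1 - g) * self.den + g * abs(y1 - y2) ** 2
        return self.num / self.den if self.den > 0 else 0.0
```

One estimator instance is updated once per sample alongside the two LMS filters, and its output λ(n) is used in (4) to form the combined filter output.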

To illustrate the system identification application we have selected the sample echo path model number one from [9] as the unknown system to identify, and combined two adaptive filters of 64 taps each.

Figure 4: Time evolution of the EMSE with µ_1 = 0.005, µ_2 = 0.0005 and σ_v² = 10⁻³.

In the system identification example we use Gaussian white noise with unit variance as the input signal. The measurement noise is another white Gaussian noise with variance σ_v² = 10⁻³. The step sizes are µ_1 = 0.005 for the fast adapting filter and µ_2 = 0.0005 for the slowly adapting filter. Figure 4 depicts the evolution of the EMSE in time. One can see that the system converges fast in the beginning. The fast convergence is followed by a stabilization period between sample times 1000 and 7000, followed by another convergence to a lower EMSE level between sample times 8000 and 12000. The second convergence occurs when the mean–square error of the filter with the small step size surpasses the performance of the filter with the large step size. One can observe that there is good agreement between the theoretical and the simulated curves.

The combination parameter λ is shown in Figure 5. At the beginning, when the fast converging filter gives a smaller EMSE than the slowly converging one, λ is close to unity. When the slow filter catches up with the fast one, λ starts to decrease and attains a small negative value at the end of the simulation example.



Figure 5: Time evolution of λ with µ_1 = 0.005, µ_2 = 0.0005 and σ_v² = 10⁻³.

The theoretical and simulated curves fit well.

In the beamforming example we have used an 8-element linear array with half wavelength spacing. The noise power is 10⁻⁴ in this simulation example. The useful signal, which is 10 dB stronger than the noise, arrives from the broadside of the array. There are three strong interferers at −35, 10 and 15 degrees, with SNR_1 = 33 dB and SNR_2 = SNR_3 = 30 dB respectively. The step sizes of the adaptive combination are µ_1 = 0.05 and µ_2 = 0.006. The signal to interference and noise ratio evolution is shown in Figure 6. One can see a fast improvement of the SINR at the beginning of the simulation example, followed by a stabilization region. After a while a new region of SINR improvement occurs, and finally the SINR stabilizes at an improved level.

Figure 6: Time evolution of SINR.

6 Conclusions

In this paper we have investigated the combination of two adaptive filters, which is a new and interesting way to achieve fast initial convergence and a low steady state error of an adaptive filter at the same time, thus resolving the trade–off one faces in step size selection. We examined two applications of the technique: system identification and adaptive beamforming. In both applications the combination worked as expected, allowing the algorithm to converge fast to a certain level and then, after a while, providing a second convergence to a lower mean–square error value.

References:

[1] S. Haykin, Adaptive Filter Theory, Fourth Edition, Prentice Hall, New Jersey, 2002.

[2] A. Sayed, Adaptive Filters, John Wiley and Sons, New Jersey, 2008.

[3] M. Martinez-Ramon, J. Arenas-Garcia, A. Navia-Vazquez, A. R. Figueiras-Vidal, An Adaptive Combination of Adaptive Filters for Plant Identification, Proc. 14th International Conference on Digital Signal Processing, Santorini, Greece, 2002, pp. 1195–1198.

[4] J. Arenas-Garcia, A. R. Figueiras-Vidal, A. H. Sayed, Mean-Square Performance of a Convex Combination of Two Adaptive Filters, IEEE Transactions on Signal Processing 54, 2006, pp. 1078–1090.

[5] N. J. Bershad, J. C. Bermudez and J.-Y. Tourneret, An Affine Combination of Two LMS Adaptive Filters – Transient Mean-Square Analysis, IEEE Transactions on Signal Processing 56, 2008, pp. 1853–1864.

[6] T. Trump, An Output Signal Based Combination of Two NLMS Adaptive Algorithms, Proc. 16th International Conference on Digital Signal Processing, Santorini, Greece, 2009.

[7] L. A. Azpicueta-Ruiz, A. R. Figueiras-Vidal, J. Arenas-Garcia, A New Least Squares Adaptation Scheme for the Affine Combination of Two Adaptive Filters, Proc. IEEE International Workshop on Machine Learning for Signal Processing, Cancun, Mexico, 2008, pp. 327–332.

[8] T. Trump, Output Signal Based Combination of Two NLMS Adaptive Filters – Transient Analysis, Proceedings of the Estonian Academy of Sciences 60, 2011, pp. 258–268.

[9] ITU-T Recommendation G.168, Digital Network Echo Cancellers, 2009.
