Data-Normalized Filters

Transcript
  • 8/11/2019 Data Normalized Filters


    We extend the transient analysis of the earlier chapters to more general filter recursions, starting with the normalized LMS algorithm.

    25.1 NLMS FILTER

    We thus consider the ε-NLMS recursion

        w_i = w_{i-1} + \mu \frac{u_i^*}{\epsilon + \|u_i\|^2} e(i)    (25.1)

    for which the data normalization in (22.2) is given by

        g[u_i] = \epsilon + \|u_i\|^2    (25.2)
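    For concreteness, the recursion (25.1)-(25.2) can be sketched in a few lines of Python. This is a minimal real-valued sketch, not the book's implementation: the function name, default step-size, and regularization value are illustrative assumptions, and the regressors u_i are stored as rows of a real data matrix.

```python
import numpy as np

def eps_nlms(d, U, mu=0.5, eps=1e-3):
    """Run the epsilon-NLMS recursion (25.1) over a sequence of data.

    d : array of desired samples d(i)
    U : array of shape (N, M); row i holds the regressor u_i
    Returns the final weight estimate w and the error sequence e(i).
    """
    N, M = U.shape
    w = np.zeros(M)                      # initial weight vector w_{-1}
    e = np.empty(N)
    for i in range(N):
        u = U[i]
        e[i] = d[i] - u @ w              # output error e(i) = d(i) - u_i w_{i-1}
        # normalized update: step scaled by eps + ||u_i||^2, cf. (25.1)-(25.2)
        w = w + mu * e[i] * u / (eps + u @ u)
    return w, e
```

    Fed with data generated by a noisy linear model d(i) = u_i w^o + v(i), the estimate w approaches w^o for step-sizes in the stable range discussed below.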

    In this case, relations (22.26)-(22.27) and (22.29) become

        E\|\tilde{w}_i\|^2_\Sigma = E\|\tilde{w}_{i-1}\|^2_{\Sigma'} + \mu^2 \sigma_v^2 E\left( \frac{\|u_i\|^2_\Sigma}{(\epsilon + \|u_i\|^2)^2} \right)

        \Sigma' = \Sigma - \mu E\left( \frac{u_i^* u_i \Sigma + \Sigma u_i^* u_i}{\epsilon + \|u_i\|^2} \right) + \mu^2 E\left( \frac{u_i^* u_i \Sigma u_i^* u_i}{(\epsilon + \|u_i\|^2)^2} \right)    (25.3)

        E\tilde{w}_i = \left( I - \mu E\left( \frac{u_i^* u_i}{\epsilon + \|u_i\|^2} \right) \right) E\tilde{w}_{i-1}

    and we see that we need to evaluate the moments

        E\left( \frac{u_i^* u_i}{\epsilon + \|u_i\|^2} \right),   E\left( \frac{u_i^* u_i \Sigma u_i^* u_i}{(\epsilon + \|u_i\|^2)^2} \right),   E\left( \frac{\|u_i\|^2_\Sigma}{(\epsilon + \|u_i\|^2)^2} \right)

    Unfortunately, closed-form expressions for these moments are not available in general, even for Gaussian regressors. Still, we will be able to show that the filter is convergent in the mean and is also mean-square stable for step-sizes satisfying \mu < 2, regardless of the input distribution (Gaussian or otherwise); see App. 25.B. We therefore treat the general case directly. Since the arguments are similar to those in Chapter 24 for LMS, we shall be brief.

    Thus, introduce the M^2 x 1 vectors

        \sigma = vec(\Sigma),   r = vec(R_u)
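    The role of these vec(·) quantities is the standard identity vec(XΣY) = (Y^T ⊗ X)vec(Σ), which is what converts the matrix recursion for Σ' into a linear recursion in σ. The following is a minimal numerical check of that identity (the matrix sizes and random values are illustrative; note that NumPy must be told to stack columns, i.e., order='F', to match the column-stacking vec convention):

```python
import numpy as np

rng = np.random.default_rng(1)
M = 3
X, Sigma, Y = (rng.standard_normal((M, M)) for _ in range(3))

def vec(A):
    # column-stacking vec(.), matching the convention behind sigma = vec(Sigma)
    return A.reshape(-1, order="F")

# the identity vec(X Sigma Y) = (Y^T kron X) vec(Sigma) is what turns the
# matrix recursion for Sigma' into the linear vector relation sigma' = F sigma
lhs = vec(X @ Sigma @ Y)
rhs = np.kron(Y.T, X) @ vec(Sigma)
assert np.allclose(lhs, rhs)
```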


    Adaptive Filters, by Ali H. Sayed. Copyright © 2008 John Wiley & Sons, Inc.


    as well as the M^2 x M^2 matrices

        A = E\left( \frac{u_i^T \bar{u}_i}{\epsilon + \|u_i\|^2} \otimes I \right) + E\left( I \otimes \frac{u_i^* u_i}{\epsilon + \|u_i\|^2} \right)    (25.4)

        B = E\left( \frac{u_i^T \bar{u}_i \otimes u_i^* u_i}{(\epsilon + \|u_i\|^2)^2} \right)    (25.5)

    and the M x M matrix

        P = E\left( \frac{u_i^* u_i}{\epsilon + \|u_i\|^2} \right)    (25.6)

    The matrix A is positive-definite, while B is nonnegative-definite; see Prob. V.6. Applying the vec notation to both sides of the expression for \Sigma' in (25.3), we find that it reduces to

        \sigma' = F \sigma    (25.7)

    where F is M^2 x M^2 and given by

        F = I - \mu A + \mu^2 B    (25.8)

    Moreover, the recursion for E\tilde{w}_i can be written as

        E\tilde{w}_i = ( I - \mu P ) E\tilde{w}_{i-1}    (25.9)

    The same arguments that were used in Sec. 24.1 will show that the mean-square behavior of ε-NLMS is characterized by the following M^2-dimensional state-space model:

        W_i = F W_{i-1} + \mu^2 \sigma_v^2 Y    (25.10)

    where F is the M^2 x M^2 companion matrix

        F = [  0      1
               0      0      1
               .                 .
               0      0      0    ...    1
              -p_0   -p_1   -p_2  ...  -p_{M^2-1} ]    (25.11)

    with

        p(z) = det(zI - F) = z^{M^2} + \sum_{k=0}^{M^2-1} p_k z^k

    denoting the characteristic polynomial of F.
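    The companion-matrix construction can be sanity-checked numerically: given any square matrix standing in for F of (25.8), the companion matrix built from its characteristic polynomial has that same characteristic polynomial. The sketch below uses a small random matrix as an illustrative stand-in (n plays the role of M^2):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4                                 # stands in for M^2
F = rng.standard_normal((n, n)) / n   # illustrative stand-in for F in (25.8)

# coefficients of p(z) = det(zI - F) = z^n + p_{n-1} z^{n-1} + ... + p_0,
# returned by np.poly as [1, p_{n-1}, ..., p_0]
p = np.poly(F)

# companion matrix in the form shown above: ones on the superdiagonal,
# last row equal to [-p_0, -p_1, ..., -p_{n-1}]
C = np.zeros((n, n))
C[:-1, 1:] = np.eye(n - 1)
C[-1, :] = -p[1:][::-1]

# F and its companion matrix share the same characteristic polynomial
assert np.allclose(np.poly(C), p)
```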


    and the k-th entry of Y is given by

        y_k = E\left( \frac{\|u_i\|^2_{F^k \sigma}}{(\epsilon + \|u_i\|^2)^2} \right),   k = 0, 1, ..., M^2 - 1    (25.12)

    The definitions of {W_i, Y} are in terms of any \sigma of interest, e.g., most commonly, \sigma = q or \sigma = r. It is shown in App. 25.B that any \mu < 2 is a sufficient condition for the stability of (25.10).

    Theorem 25.1 (Stability of ε-NLMS) Consider the ε-NLMS recursion (25.1) and assume the data {d(i), u_i} satisfy model (22.1) and the independence assumption (22.23). The regressors need not be Gaussian. Then the filter is convergent in the mean and is also mean-square stable for any \mu < 2. Moreover, the transient behavior of the filter is characterized by the state-space recursion (25.10)-(25.12), and the mean-square deviation and the excess mean-square error are given by

        MSD = \mu^2 \sigma_v^2 s^T (I - F)^{-1} q   and   EMSE = \mu^2 \sigma_v^2 s^T (I - F)^{-1} r

    where s = vec( E( u_i^* u_i / (\epsilon + \|u_i\|^2)^2 ) ), r = vec(R_u) and q = vec(I).

    The expressions for the MSD and EMSE in the statement of the theorem are derived in a manner similar to (23.51) and (23.56). They can be rewritten as

        MSD = \mu^2 \sigma_v^2 Tr(S \Sigma_{msd})   and   EMSE = \mu^2 \sigma_v^2 Tr(S \Sigma_{emse})

    where

        S = E\left( \frac{u_i^* u_i}{(\epsilon + \|u_i\|^2)^2} \right)

    and the weighting matrices {\Sigma_{msd}, \Sigma_{emse}} correspond to the vectors \sigma_{msd} = (I - F)^{-1} q and \sigma_{emse} = (I - F)^{-1} r. That is,

        \Sigma_{msd} = vec^{-1}(\sigma_{msd})   and   \Sigma_{emse} = vec^{-1}(\sigma_{emse})
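    A Monte Carlo sketch of these expressions for ε-NLMS with white Gaussian regressors follows. All numeric values (filter length, step-size, ε, noise power, sample count) are illustrative assumptions; the moment matrices, which have no closed form, are estimated by sampling, and R_u = I is chosen so that q = r (hence MSD and EMSE coincide in this particular setup):

```python
import numpy as np

rng = np.random.default_rng(3)
M, mu, eps, sigma_v2 = 2, 0.2, 1e-3, 1e-2
n = 200_000

U = rng.standard_normal((n, M))                       # white Gaussian regressors
g = eps + np.einsum("ij,ij->i", U, U)                 # g[u_i] = eps + ||u_i||^2

# sample estimates of the moment matrices P, S, A, B
P = np.einsum("i,ij,ik->jk", 1 / g, U, U) / n         # E u_i^* u_i / g
S = np.einsum("i,ij,ik->jk", 1 / g**2, U, U) / n      # E u_i^* u_i / g^2
A = np.kron(P.T, np.eye(M)) + np.kron(np.eye(M), P)   # expectation distributes
B = np.einsum("i,ij,ik,il,im->jklm", 1 / g**2, U, U, U, U).reshape(M * M, M * M) / n

F = np.eye(M * M) - mu * A + mu**2 * B                # cf. (25.8)
q = np.eye(M).reshape(-1, order="F")                  # q = vec(I)
r = q.copy()                                          # R_u = I here, so r = q

Sigma_msd = np.linalg.solve(np.eye(M * M) - F, q).reshape(M, M, order="F")
Sigma_emse = np.linalg.solve(np.eye(M * M) - F, r).reshape(M, M, order="F")
msd = mu**2 * sigma_v2 * np.trace(S @ Sigma_msd)
emse = mu**2 * sigma_v2 * np.trace(S @ Sigma_emse)
```

    Since mu = 0.2 < 2, the estimated F should have spectral radius below one, consistent with Theorem 25.1.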

    Learning Curve

    Observe that since E|e_a(i)|^2 = E\|\tilde{w}_{i-1}\|^2_{R_u}, we again find that the time evolution of E|e_a(i)|^2 is described by the top entry of the state vector W_i in (25.10)-(25.12) with \sigma chosen as \sigma = r. The learning curve of the filter will be E|e(i)|^2 = \sigma_v^2 + E|e_a(i)|^2.

    Small Step-Size Approximations

    Several approximations for the EMSE and MSD expressions that appear in the above theorem are derived in Probs. V.18-V.20. The ultimate conclusion from these problems is that, for small enough \mu and \epsilon, we get

        EMSE = \frac{\mu \sigma_v^2}{2} E\left( \frac{1}{\|u_i\|^2} \right) Tr(R_u)   and   MSD = \frac{\mu \sigma_v^2}{2} E\left( \frac{1}{\|u_i\|^2} \right) M    (25.13)

    The expression for the EMSE is the same one we derived earlier in Lemma 17.1.
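    For white Gaussian regressors the factor E(1/\|u_i\|^2) has the closed form 1/(M - 2) for M > 2 (a standard inverse chi-square moment, since \|u_i\|^2 is then chi-square with M degrees of freedom), which gives a quick numerical check of the EMSE expression in (25.13). The parameter values below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
M, mu, sigma_v2 = 8, 0.05, 1e-2
N = 400_000

# Monte Carlo estimate of E(1/||u_i||^2) for white Gaussian regressors
U = rng.standard_normal((N, M))
inv_norm2 = 1.0 / np.einsum("ij,ij->i", U, U)
emse_mc = 0.5 * mu * sigma_v2 * inv_norm2.mean() * M     # Tr(R_u) = M here

# for u ~ N(0, I_M), ||u||^2 is chi-square with M degrees of freedom and
# E(1/||u||^2) = 1/(M - 2) for M > 2, giving a closed form to check against
emse_cf = 0.5 * mu * sigma_v2 * M / (M - 2)
assert abs(emse_mc - emse_cf) / emse_cf < 0.05
```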


    Gaussian Regressors

    If the regressors happen to be Gaussian, then it can be shown that the M^2-dimensional state-space model (25.10)-(25.12) reduces to an M-dimensional model; this assertion is proved in Probs. V.16 and V.17.

    25.2 DATA-NORMALIZED FILTERS

    The arguments that were employed in the last two sections for LMS and ε-NLMS are general enough and can be applied to adaptive filters with generic data nonlinearities of the form (22.2)-(22.3). To see this, consider again the variance and mean relations (22.26)-(22.27) and (22.29), which are reproduced below:

        E\|\tilde{w}_i\|^2_\Sigma = E\|\tilde{w}_{i-1}\|^2_{\Sigma'} + \mu^2 \sigma_v^2 E\left( \frac{\|u_i\|^2_\Sigma}{g^2[u_i]} \right)

        \Sigma' = \Sigma - \mu E\left( \frac{u_i^* u_i \Sigma + \Sigma u_i^* u_i}{g[u_i]} \right) + \mu^2 E\left( \frac{u_i^* u_i \Sigma u_i^* u_i}{g^2[u_i]} \right)    (25.14)

    If we now introduce the M^2 x M^2 matrices

        A = E\left( \frac{u_i^T \bar{u}_i}{g[u_i]} \otimes I \right) + E\left( I \otimes \frac{u_i^* u_i}{g[u_i]} \right)    (25.15)

        B = E\left( \frac{u_i^T \bar{u}_i \otimes u_i^* u_i}{g^2[u_i]} \right)    (25.16)

    and the M x M matrix

        P = E\left( \frac{u_i^* u_i}{g[u_i]} \right)    (25.17)

    then

        E\tilde{w}_i = ( I - \mu P ) E\tilde{w}_{i-1}

    and the expression for \Sigma' can be written in terms of the linear vector relation

        \sigma' = F \sigma    (25.18)

    where F is M^2 x M^2 and given by

        F = I - \mu A + \mu^2 B    (25.19)

    Let

    H = -:I2 ] 2 M 2 x 2 M 2 ) (25.20)Then the same arguments that were used in Chapter24 will lead to the statement ofThm. 25.2 listed further ahead. The expressions for theMSD and EMSE in the state-ment of the theore m are derived in a manner similar to(23.51) and (23.56). They can berewritten as

    I MSD = p2a:Tr(SC,,d) and EMSE = p2a:Tr(SCe,,e) I


    where

        S = E\left( \frac{u_i^* u_i}{g^2[u_i]} \right)

    and the weighting matrices {\Sigma_{msd}, \Sigma_{emse}} correspond to the vectors \sigma_{msd} = (I - F)^{-1} q and \sigma_{emse} = (I - F)^{-1} r. That is,

        \Sigma_{msd} = vec^{-1}(\sigma_{msd})   and   \Sigma_{emse} = vec^{-1}(\sigma_{emse})

    Theorem 25.2 (Stability of data-normalized filters) Consider data-normalized adaptive filters of the form (22.2)-(22.3), and assume the data {d(i), u_i} satisfy model (22.1) and the independence assumption (22.23). Then the filter is convergent in the mean and is mean-square stable for step-sizes satisfying

        0 < \mu < min\{ 2/\lambda_{max}(P),  1/\lambda_{max}(A^{-1}B),  1/max\{\lambda(H) \in \mathbb{R}^+\} \}

    where the matrices {A, B, P, H} are defined by (25.15)-(25.17) and (25.20) and B is assumed finite. Moreover, the transient behavior of the filter is characterized by the M^2-dimensional state-space recursion W_i = F W_{i-1} + \mu^2 \sigma_v^2 Y, where F is the M^2 x M^2 companion matrix

        F = [  0      1
               0      0      1
               .                 .
               0      0      0    ...    1
              -p_0   -p_1   -p_2  ...  -p_{M^2-1} ]

    with

        p(z) = det(zI - F) = z^{M^2} + \sum_{k=0}^{M^2-1} p_k z^k

    denoting the characteristic polynomial of F in (25.19). Also, the k-th entry of Y is given by

        y_k = E\left( \frac{\|u_i\|^2_{F^k \sigma}}{g^2[u_i]} \right),   k = 0, 1, ..., M^2 - 1

    for any \sigma of interest, e.g., \sigma = q or \sigma = r. In addition, the mean-square deviation and the excess mean-square error are given by

        MSD = \mu^2 \sigma_v^2 s^T (I - F)^{-1} q   and   EMSE = \mu^2 \sigma_v^2 s^T (I - F)^{-1} r,   s = vec\left( E\left( \frac{u_i^* u_i}{g^2[u_i]} \right) \right)


    where r = vec(R_u) and q = vec(I).


    Learning Curve

    As before, since E|e_a(i)|^2 = E\|\tilde{w}_{i-1}\|^2_{R_u}, we find that the time evolution of E|e_a(i)|^2 is described by the top entry of the state vector W_i with \sigma chosen as \sigma = r. The learning curve of the filter will be E|e(i)|^2 = \sigma_v^2 + E|e_a(i)|^2.


    Small Step-Size Approximations

    In Prob. V.39 it is shown that under a boundedness requirement on the matrix B of fourth moments, data-normalized adaptive filters can be guaranteed to be mean-square stable for sufficiently small step-sizes. That is, there always exists a small-enough step-size that lies within the stability range described in Thm. 25.2.

    Now observe that the performance results of Thm. 25.2 are in terms of the moment matrices {A, B, P}. These moments are generally hard to evaluate for arbitrary input distributions and data nonlinearities g[·]. However, some simplifications occur when the step-size is sufficiently small. This is because, in this case, we may ignore the quadratic term in \mu that appears in the expression for \Sigma' in (25.14), and thereby approximate the variance and mean relations by

        E\|\tilde{w}_i\|^2_\Sigma = E\|\tilde{w}_{i-1}\|^2_{\Sigma'} + \mu^2 \sigma_v^2 E\left( \frac{\|u_i\|^2_\Sigma}{g^2[u_i]} \right)    (25.21)

        E\tilde{w}_i = ( I - \mu P ) E\tilde{w}_{i-1}    (25.22)

    where P is as in (25.17). Using the weighting vector notation, we can write

        \sigma' = F \sigma,   F = I - \mu A    (25.23)

    where now

        A = ( P^T \otimes I ) + ( I \otimes P )

    The variance relation (25.21) would then lead to the following approximate expressions for the filter EMSE and MSD:

        EMSE = \mu^2 \sigma_v^2 Tr(S \Sigma_{emse})   and   MSD = \mu^2 \sigma_v^2 Tr(S \Sigma_{msd})

    where

        S = E\left( \frac{u_i^* u_i}{g^2[u_i]} \right)

    and the weighting matrices {\Sigma_{emse}, \Sigma_{msd}} correspond to the vectors \sigma_{emse} = A^{-1} vec(R_u)/\mu and \sigma_{msd} = A^{-1} vec(I)/\mu. That is, {\Sigma_{emse}, \Sigma_{msd}} are the unique solutions of the Lyapunov equations

        \mu P \Sigma_{msd} + \mu \Sigma_{msd} P = I   and   \mu P \Sigma_{emse} + \mu \Sigma_{emse} P = R_u

    It is easy to verify that \Sigma_{msd} = \mu^{-1} P^{-1}/2, so that the performance expressions can be rewritten as

        EMSE = \mu^2 \sigma_v^2 Tr(S \Sigma_{emse}),   MSD = \mu \sigma_v^2 Tr(S P^{-1})/2
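    The Lyapunov characterization above is easy to check numerically. The sketch below uses illustrative random stand-ins for P and R_u (not data from the text): it verifies that \Sigma_{msd} = \mu^{-1}P^{-1}/2 satisfies its Lyapunov equation, and solves the EMSE equation through its vec form.

```python
import numpy as np

rng = np.random.default_rng(5)
M, mu = 4, 0.1
# a random positive-definite stand-in for the moment matrix P
X = rng.standard_normal((M, M))
P = X @ X.T / M + 0.5 * np.eye(M)

# claimed closed form: Sigma_msd = P^{-1} / (2 mu) solves mu*P*S + mu*S*P = I
Sigma_msd = np.linalg.inv(P) / (2 * mu)
assert np.allclose(mu * P @ Sigma_msd + mu * Sigma_msd @ P, np.eye(M))

# the EMSE weighting solves mu*P*Sigma + mu*Sigma*P = R_u; in vec form this is
# mu * ((P^T kron I) + (I kron P)) vec(Sigma) = vec(R_u)
R_u = 2.0 * np.eye(M)                 # hypothetical input covariance
Acal = np.kron(P.T, np.eye(M)) + np.kron(np.eye(M), P)
sigma_emse = np.linalg.solve(mu * Acal, R_u.reshape(-1, order="F"))
Sigma_emse = sigma_emse.reshape(M, M, order="F")
assert np.allclose(mu * P @ Sigma_emse + mu * Sigma_emse @ P, R_u)
```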

    Remark 25.1 (Filters with error nonlinearities) There is more to say about the transient performance of adaptive filters, especially for filters with error nonlinearities in their update equations. This is a more challenging class of filters to study and their performance is examined in App. 9.C


    of Sayed (2003) by using the same energy-conservation arguments of this part. The derivation used in that appendix to study adaptive filters with error nonlinearities can also be used to provide an alternative simplified transient analysis for data-normalized filters. The derivation is based on a long-filter assumption in order to justify a Gaussian condition on the distribution of the a priori error signal. Among other results, it is shown in App. 9.C of Sayed (2003) that the transient behavior of data-normalized filters can be approximated by an M-dimensional linear time-invariant state-space model even for non-Gaussian regressors. Appendix 9.E of the same reference further examines the learning abilities of adaptive filters and shows, among other interesting results, that the learning behavior of LMS cannot be fully captured by relying solely on mean-square analysis.


    25.A APPENDIX: STABILITY BOUND

    Consider a matrix F of the form F = I - \mu A + \mu^2 B with A > 0, B \geq 0, and \mu > 0. Matrices of this form arise frequently in the study of the mean-square stability of adaptive filters (see, e.g., (25.19)). The purpose of this section is to find conditions on \mu in terms of {A, B} in order to guarantee that all eigenvalues of F are strictly inside the unit circle, i.e., so that -1 < \lambda(F) < 1.

    To begin with, in order to guarantee \lambda(F) < 1, the step-size \mu should be such that (cf. the Rayleigh-Ritz characterization of eigenvalues from Sec. B.1):

        max_{\|x\|=1} x^* ( I - \mu A + \mu^2 B ) x < 1

    or, equivalently, A - \mu B > 0. The arguments in parts (b) and (c) of Prob. V.3 then show that this condition holds if, and only if,

        \mu < 1/\lambda_{max}(A^{-1}B)    (25.24)

    Moreover, in order to guarantee \lambda(F) > -1, the step-size \mu should be such that

        min_{\|x\|=1} x^* ( I - \mu A + \mu^2 B ) x > -1

    or, equivalently, G(\mu) = 2I - \mu A + \mu^2 B > 0. When \mu = 0, the eigenvalues of G are all positive and equal to 2. As \mu increases, the eigenvalues of G vary continuously with \mu. Indeed, the eigenvalues of G(\mu) are the roots of det[\lambda I - G(\mu)] = 0. This is a polynomial equation in \lambda and its coefficients are functions of \mu. A fundamental result in function theory and matrix analysis states that the zeros of a polynomial depend continuously on its coefficients and, consequently, the eigenvalues of G(\mu) vary continuously with \mu. This means that G(\mu) will first become singular before becoming indefinite. For this reason, there is an upper bound on \mu, say, \mu_{max}, such that G(\mu) > 0 for all \mu < \mu_{max}. This bound on \mu is equal to the smallest value of \mu that makes G(\mu) singular, i.e., for which det[G(\mu)] = 0. Now note that the determinant of G(\mu) is equal to the determinant of the block matrix

        K(\mu) = [ 2I - \mu A   -\mu B
                   \mu I          I    ]

    since

        det [ X   Y
              W   Z ] = det(Z) det( X - Y Z^{-1} W )

    whenever Z is invertible. Moreover, since we can write

        K(\mu) = [ 2I   0
                   0    I ] - \mu [ A    B
                                   -I    0 ]

    we find that the condition det[K(\mu)] = 0 is equivalent to det( I - \mu H ) = 0, where



        H = [ A/2   -B/2
               I      0   ]

    In this way, the smallest positive \mu that results in det[K(\mu)] = 0 is equal to

        \mu = 1 / max\{ \lambda(H) \in \mathbb{R}^+ \}    (25.25)

    in terms of the largest positive real eigenvalue of H, when it exists.

    The results (25.24)-(25.25) can be grouped together to yield the condition

        \mu < min\{ 1/\lambda_{max}(A^{-1}B),  1/max\{ \lambda(H) \in \mathbb{R}^+ \} \}    (25.26)

    If H does not have any real positive eigenvalue, then the corresponding condition is removed and we only require \mu < 1/\lambda_{max}(A^{-1}B). The result (25.26) is valid for general A > 0 and B \geq 0. The above derivation does not exploit any particular structure in the matrices A and B defined by (25.15)-(25.16).
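    The bound (25.26) can be evaluated mechanically from {A, B}. The helper below is a sketch (the function name and test matrices are illustrative, not from the text): it computes both conditions and, for the simple choice A = 2I, B = I, reproduces the exact stability range 0 < \mu < 2 of F = (1 - \mu)^2 I.

```python
import numpy as np

def stability_bound(A, B):
    """Step-size bound (25.26) for F = I - mu*A + mu^2*B with A > 0, B >= 0."""
    n = A.shape[0]
    # condition (25.24): mu < 1/lambda_max(A^{-1}B) guarantees lambda(F) < 1
    lam = np.linalg.eigvals(np.linalg.solve(A, B)).real.max()
    b1 = np.inf if lam <= 0 else 1.0 / lam
    # condition (25.25): mu < 1/max{lambda(H) real and positive} guarantees
    # lambda(F) > -1, with H the 2n x 2n block matrix from the appendix
    H = np.block([[A / 2, -B / 2], [np.eye(n), np.zeros((n, n))]])
    ev = np.linalg.eigvals(H)
    pos = [l.real for l in ev if abs(l.imag) < 1e-9 and l.real > 1e-12]
    b2 = np.inf if not pos else 1.0 / max(pos)
    return min(b1, b2)

# sanity check: for A = 2I, B = I we get F = (1 - mu)^2 I, stable iff mu < 2
assert abs(stability_bound(2 * np.eye(2), np.eye(2)) - 2.0) < 1e-9
```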

    25.B APPENDIX: STABILITY OF ε-NLMS

    The purpose of this appendix is to show that for ε-NLMS, any \mu < 2 is sufficient to guarantee mean-square stability. Thus, refer again to the discussion in Sec. 25.1 and to the definitions of the matrices {A, B, P, H} in (25.4)-(25.6). We already know from the result in App. 25.A that stability in the mean and mean-square senses is guaranteed for step-sizes in the range

        \mu < min\{ 2/\lambda_{max}(P),  1/\lambda_{max}(A^{-1}B),  1/max\{ \lambda(H) \in \mathbb{R}^+ \} \}

    where the third condition is in terms of the largest positive real eigenvalue of the block matrix

        H = [ A/2   -B/2
               I      0   ]

    The first condition on \mu, namely, \mu < 2/\lambda_{max}(P), guarantees convergence in the mean. The second condition on \mu, namely, \mu < 1/\lambda_{max}(A^{-1}B), guarantees \lambda(F) < 1. The last condition guarantees \lambda(F) > -1. The point now is that these conditions on \mu are met by any \mu < 2 (i.e., F is stable for any \mu < 2). This is because there are some important relations between the matrices {A, B, P} in the ε-NLMS case. To see this, observe first that the term

        \frac{u_i^* u_i}{\epsilon + \|u_i\|^2}

    which appears in the expression (25.6) for P is generally a rank-one matrix (unless u_i = 0); it has M - 1 zero eigenvalues and one possibly nonzero eigenvalue that is equal to \|u_i\|^2 / (\epsilon + \|u_i\|^2).* This eigenvalue is less than unity, so that

        \lambda_{max}\left( \frac{u_i^* u_i}{\epsilon + \|u_i\|^2} \right) \leq 1    (25.28)

    Now recall the following Rayleigh-Ritz characterization of the maximum eigenvalue of any Hermitian matrix R (from Sec. B.1):

        \lambda_{max}(R) = max_{\|x\|=1} x^* R x    (25.29)

    *Every rank-one matrix of the form x x^*, where x is a column vector of size M, has M - 1 zero eigenvalues and one nonzero eigenvalue that is equal to \|x\|^2.
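    This rank-one eigenvalue fact is easy to confirm numerically; the snippet below uses an arbitrary random vector and an illustrative value of ε:

```python
import numpy as np

rng = np.random.default_rng(6)
M, eps = 5, 1e-2
u = rng.standard_normal(M)

# the matrix u u^T / (eps + ||u||^2) is rank one, with a single nonzero
# eigenvalue ||u||^2 / (eps + ||u||^2) that is strictly less than one
R1 = np.outer(u, u) / (eps + u @ u)
lams = np.sort(np.linalg.eigvalsh(R1))
assert np.allclose(lams[:-1], 0)                       # M-1 zero eigenvalues
assert np.isclose(lams[-1], (u @ u) / (eps + u @ u))   # the nonzero eigenvalue
assert lams[-1] < 1
```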


    we conclude from (25.28) that

        x^* \left( \frac{u_i^* u_i}{\epsilon + \|u_i\|^2} \right) x \leq 1   for any unit-norm x

    Applying the same characterization (25.29) to the matrix P in (25.6), and using the above inequality, we find that

        \lambda_{max}(P) = max_{\|x\|=1} x^* P x = max_{\|x\|=1} E\left( \frac{x^* u_i^* u_i x}{\epsilon + \|u_i\|^2} \right) < 1    (25.30)

    In other words, the maximum eigenvalue of P is bounded by one, so that the first condition, \mu < 2/\lambda_{max}(P), is met by any \mu < 2. Moreover, R_u > 0 implies P > 0. We also know from (25.30) that \lambda_{max}(P) < 1. Therefore, all the eigenvalues of P are positive and lie inside the open interval (0, 1). Moreover, over the interval 0 < \mu < 2, the following quadratic function of \mu,

        k(\mu) = 1 - 2\mu\lambda + \mu^2\lambda

    assumes values between 1 and 1 - \lambda for each of the eigenvalues \lambda of P. Therefore, it holds that

        k(\mu) \leq 1 - 2\mu\lambda_{min}(P) + \mu^2\lambda_{min}(P)

    from which we conclude that

        E\|\tilde{w}_i\|^2 \leq \alpha E\|\tilde{w}_{i-1}\|^2 + \mu^2 \sigma_v^2 E\left( \frac{\|u_i\|^2}{(\epsilon + \|u_i\|^2)^2} \right)


    where the scalar coefficient \alpha = 1 - 2\mu\lambda_{min}(P) + \mu^2\lambda_{min}(P) is positive and strictly less than one for 0 < \mu < 2. It then follows that E\|\tilde{w}_i\|^2 remains bounded for all i.
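    As a quick numerical confirmation of this last step, the scan below evaluates k(\mu) = 1 - 2\mu\lambda + \mu^2\lambda on a grid of eigenvalues \lambda \in (0,1) and step-sizes \mu \in (0,2) (the grid resolution is an arbitrary choice) and checks that it always stays strictly between 0 and 1:

```python
import numpy as np

# scan the contraction factor k(mu) = 1 - 2*mu*lam + mu^2*lam over a grid;
# for every eigenvalue lam in (0,1) and step-size 0 < mu < 2 it stays in (0,1),
# which is what keeps E||w_tilde_i||^2 a bounded sequence
mus = np.linspace(0.01, 1.99, 199)
lams = np.linspace(0.01, 0.99, 99)
K = 1 - 2 * np.outer(mus, lams) + np.outer(mus**2, lams)
assert K.max() < 1 and K.min() > 0
```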