Transcript of: Stochastic Signals and Systems, Identification, Lecture 9 (27 November 2018)
Source: users.itk.ppke.hu/~vago/A_09_Identify_AR_slide_18.pdf

Page 1:

Stochastic Signals and Systems

Identification, Lecture 9.

27 November 2018.

Page 2:

Autocovariance function

(y_n) is w.s.st. (wide-sense stationary) with E y_n = 0. A sequence of observations: y_1, ..., y_N.

The first objective: estimate the auto-covariance function

r(\tau) = E\, y_{n+\tau}\, y_n.

A natural candidate is the sample covariance:

\hat{r}(\tau) = \frac{1}{N - \tau} \sum_{n=1}^{N-\tau} y_{n+\tau}\, y_n, \qquad \tau \ge 0.

Certainly, no estimates will be available for τ ≥ N.
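A minimal numerical sketch of this estimator (Python/NumPy; the function name and the simulated AR(1) data are illustrative, not from the slides):

```python
import numpy as np

def sample_autocovariance(y, tau_max):
    # r_hat(tau) = (1/(N - tau)) * sum_{n=1}^{N-tau} y_{n+tau} y_n, tau = 0..tau_max.
    y = np.asarray(y, dtype=float)
    N = len(y)
    return np.array([y[tau:] @ y[:N - tau] / (N - tau) for tau in range(tau_max + 1)])

# Illustrative zero-mean data: an AR(1) process y_n = 0.8 y_{n-1} + e_n.
rng = np.random.default_rng(0)
e = rng.standard_normal(10_000)
y = np.zeros_like(e)
for n in range(1, len(e)):
    y[n] = 0.8 * y[n - 1] + e[n]

print(sample_autocovariance(y, 5))  # r_hat(0), ..., r_hat(5)
```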

Page 3:

\hat{r}(\tau) = \frac{1}{N - \tau} \sum_{n=1}^{N-\tau} y_{n+\tau}\, y_n, \qquad \tau \ge 0.

The values y_{n+\tau} y_n form a dependent sequence, so the standard LLN does not apply directly.

In order to get meaningful results we have to restrict ourselves to systems whose structure can be described exactly by a finite set of parameters.

We will consider three classes of processes: AR, MA and ARMA.

Page 4:

Identification of AR models

Page 5:

Least Squares estimate of an AR process

Let (y_n) be a w.s.st., stable AR(p) process defined by

y_n + a_1^* y_{n-1} + \cdots + a_p^* y_{n-p} = e_n,

where (e_n) is a w.s.st. orthogonal process.

Assume that A^*(z^{-1}) = 1 + \sum_{k=1}^{p} a_k^* z^{-k} is stable. Thus (e_n) is the innovation process of (y_n).

Our goal is to estimate (a_1^*, ..., a_p^*) using the observations y_1, ..., y_N.

Page 6:

y_n + a_1^* y_{n-1} + \cdots + a_p^* y_{n-p} = e_n.

Introduce the notations:

\varphi_n = (-y_{n-1}, \dots, -y_{n-p})^T, \qquad \theta^* = (a_1^*, \dots, a_p^*)^T.

The AR(p) equation can be rewritten as:

y_n = \varphi_n^T \theta^* + e_n,

a linear stochastic regression. Note that (φ_n) is not independent of (e_n)!

Goal: estimate θ^* using the observations y_1, ..., y_N.

Page 7:

y_n = \varphi_n^T \theta^* + e_n. \qquad \theta^* = ?

Let us fix a tentative value of θ ∈ R^p. Define the error process:

\varepsilon_n(\theta) = y_n - \varphi_n^T \theta, \qquad n \ge 0.

We may assume that y_0, y_{-1}, ..., y_{-p+1} are known.

The LSQ estimation method: minimize the cost function:

V_N(\theta) = \frac{1}{2} \sum_{n=1}^{N} \varepsilon_n^2(\theta) = \frac{1}{2} \sum_{n=1}^{N} \left( y_n - \varphi_n^T \theta \right)^2.

Page 8:

V_N(\theta) = \frac{1}{2} \sum_{n=1}^{N} \varepsilon_n^2(\theta) is quadratic and convex in θ, thus a minimum exists.

For any minimizing value θ_0 we have:

\partial_\theta V_N(\theta_0) = 0.

Differentiating V_N(θ) w.r.t. θ we get, for j = 1, ..., p:

\partial_{\theta_j} V_N(\theta) = \sum_{n=1}^{N} \varepsilon_{\theta_j,n}(\theta)\, \varepsilon_n(\theta), \qquad \text{where } \varepsilon_{\theta_j,n}(\theta) = \frac{\partial}{\partial \theta_j}\, \varepsilon_n(\theta).

Page 9:

\frac{\partial}{\partial \theta} V_N(\theta) = \sum_{n=1}^{N} \varepsilon_{\theta,n}(\theta)\, \varepsilon_n(\theta)

We have \varepsilon_n(\theta) = y_n - \varphi_n^T \theta, thus

\varepsilon_{\theta,n}(\theta) = -\varphi_n.

We get the equation

\partial_\theta V_N(\theta) = -\sum_{n=1}^{N} \varphi_n\, \varepsilon_n(\theta) = 0.

Page 10:

LSQ estimate: -\sum_{n=1}^{N} \varphi_n\, \varepsilon_n(\theta) = 0.

Remark. For θ = θ^* we have ε_n(θ^*) = e_n.

Using the definition φ_n = (-y_{n-1}, ..., -y_{n-p})^T, and the fact that e_n is orthogonal to y_{n-1}, ..., y_{n-p}, we get:

E\, \frac{\partial}{\partial \theta} V_N(\theta^*) = 0.

After substituting εn(θ) we get

-\sum_{n=1}^{N} \varphi_n \left( y_n - \varphi_n^T \theta \right) = 0.

This is a linear equation for θ, which certainly has a solution.

Page 11:

LSQ estimate

Proposition
Let θ_N be an LSQ estimator of θ^* based on N samples. Then θ_N satisfies the following normal equation:

\left[ \sum_{n=1}^{N} \varphi_n \varphi_n^T \right] \theta_N = \sum_{n=1}^{N} \varphi_n y_n.

The estimator θ_N is unique if S_N (or R_N) is non-singular:

S_N = \sum_{n=1}^{N} \varphi_n \varphi_n^T, \qquad R_N = \frac{1}{N} \sum_{n=1}^{N} \varphi_n \varphi_n^T.
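A direct NumPy sketch of solving this normal equation for an AR(p) fit. The function name is illustrative, and the unknown initial values y_0, ..., y_{-p+1} are set to zero here, which is an assumption, not part of the slides:

```python
import numpy as np

def ar_lsq(y, p):
    # LSQ estimate of theta* = (a_1*, ..., a_p*)^T from observations y_1, ..., y_N.
    # Solves [sum phi_n phi_n^T] theta_N = sum phi_n y_n with
    # phi_n = (-y_{n-1}, ..., -y_{n-p})^T.
    y = np.asarray(y, dtype=float)
    N = len(y)
    y_ext = np.concatenate([np.zeros(p), y])  # y_0 = ... = y_{-p+1} = 0 (assumption)
    Phi = np.stack([-y_ext[p - k : p - k + N] for k in range(1, p + 1)], axis=1)
    S_N = Phi.T @ Phi                         # S_N = sum phi_n phi_n^T
    return np.linalg.solve(S_N, Phi.T @ y)    # theta_N
```

For the AR(1) sample simulated in the earlier sketch, ar_lsq(y, 1) returns a value close to -0.8 (there y_n - 0.8 y_{n-1} = e_n, so a_1^* = -0.8).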

Page 12:

R_N = \frac{1}{N} \sum_{n=1}^{N} \varphi_n \varphi_n^T

The elements of R_N are the empirical auto-covariances of (y_n):

R_N(k, l) = \frac{1}{N} \sum_{n=1}^{N} y_{n-k}\, y_{n-l} = \hat{r}(l - k).

We assume that \lim_N R_N = R^* a.s.

R^* is the p-th order auto-covariance matrix.

Remark. It can be proved that R^* is non-singular (e.g., by a state-space representation of y).

Page 13:

\theta_N = \frac{1}{N}\, R_N^{-1} \sum_{n=1}^{N} \varphi_n y_n.

Proposition
Let (y_n) be a w.s.st., stable AR(p) process, and \lim_N R_N = R^*. Then the LSQ estimate θ_N converges to θ^* a.s.

The error of the estimator θ_N is denoted by \tilde\theta_N. Then:

\tilde\theta_N = \theta_N - \theta^* = \left( \sum_{n=1}^{N} \varphi_n \varphi_n^T \right)^{-1} \sum_{n=1}^{N} \varphi_n e_n,

since y_n = \varphi_n^T \theta^* + e_n.

Page 14:

An approximating error process is

\tilde\theta_N = \frac{1}{N}\, (R^*)^{-1} \sum_{n=1}^{N} \varphi_n e_n.

Proposition
The approximating error process \tilde\theta_N has the following covariance matrix:

E\, \tilde\theta_N \tilde\theta_N^T = \frac{1}{N}\, (R^*)^{-1} \sigma^2.

The (asymptotic) quality of the LSQ estimator is determined by the covariance matrix R^*.

R^* is the covariance matrix of the state vector of the state-space representation of (y_n). Thus it is the solution of a Lyapunov equation.
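A sketch of computing R^* this way (SciPy; the companion-form state vector x_n = (y_n, ..., y_{n-p+1})^T and the helper name are illustrative assumptions). Since φ_n = -x_{n-1}, we have E φ_n φ_n^T = E x_n x_n^T:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def ar_state_covariance(a, sigma2=1.0):
    # R* for y_n + a_1 y_{n-1} + ... + a_p y_{n-p} = e_n, with sigma2 = E e_n^2.
    # The state x_n = (y_n, ..., y_{n-p+1})^T obeys x_n = A x_{n-1} + B e_n with a
    # companion matrix A, so R* = E x_n x_n^T solves R* = A R* A^T + sigma2 * B B^T.
    a = np.asarray(a, dtype=float)
    p = len(a)
    A = np.zeros((p, p))
    A[0, :] = -a                    # first row: -a_1, ..., -a_p
    if p > 1:
        A[1:, :-1] = np.eye(p - 1)  # shift the remaining coordinates
    B = np.zeros((p, 1)); B[0, 0] = 1.0
    return solve_discrete_lyapunov(A, sigma2 * (B @ B.T))

# AR(1) check against the next slide: a* = -0.8 gives R* = 1/(1 - 0.64) = 2.777...
print(ar_state_covariance([-0.8]))
```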

Page 15:

Example. An AR(1) process

The process is given by

y_n + a^* y_{n-1} = e_n.

Then R^* = \frac{\sigma^2(e)}{1 - (a^*)^2}, and thus the asymptotic variance of the LSQ estimator of a^* equals

\lim_{N \to \infty} N\, E (a_N - a^*)^2 = 1 - (a^*)^2.

It follows that if a^* is close to ±1, then the asymptotic variance of the LSQ estimator is close to 0.
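A quick Monte Carlo check of this limit, under illustrative settings (a^* = 0.9, N = 2000, 500 runs; none of these values come from the slides):

```python
import numpy as np

# Monte Carlo check of lim N E(a_N - a*)^2 = 1 - (a*)^2 for an AR(1) process.
rng = np.random.default_rng(1)
a_star, N, runs = 0.9, 2000, 500
sq_errors = []
for _ in range(runs):
    e = rng.standard_normal(N)
    y = np.zeros(N)
    for n in range(1, N):
        y[n] = -a_star * y[n - 1] + e[n]     # y_n + a* y_{n-1} = e_n
    phi = -y[:-1]                            # phi_n = -y_{n-1}
    a_hat = (phi @ y[1:]) / (phi @ phi)      # scalar LSQ estimate
    sq_errors.append((a_hat - a_star) ** 2)
print(N * np.mean(sq_errors), 1 - a_star**2) # both should be close to 0.19
```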

Page 16:

The recursive LSQ method

Assume that θ_N is available:

\theta_N = \left[ \sum_{n=1}^{N} \varphi_n \varphi_n^T \right]^{-1} \sum_{n=1}^{N} \varphi_n y_n.

Suppose we get one more observation yN+1.

Do we need to recompute S_{N+1}^{-1} and θ_{N+1} from scratch, or is there a way to compute them using S_N^{-1} and θ_N?

Page 17:

A matrix-inversion formula

Proposition. (The Sherman-Morrison lemma.) Let A ∈ R^{m×m}, and let b, c ∈ R^m. Assume that A is non-singular and so is A + bc^T. Then

\left( A + bc^T \right)^{-1} = A^{-1} - \frac{1}{1 + c^T A^{-1} b}\, A^{-1} b c^T A^{-1}.

A direct corollary is a recursion for S_N^{-1}, where:

S_{N+1} = S_N + \varphi_{N+1} \varphi_{N+1}^T.

S_N is the coefficient matrix of the normal equation. If S_N is non-singular for some N, then:

S_{N+1}^{-1} = S_N^{-1} - \frac{1}{1 + \varphi_{N+1}^T S_N^{-1} \varphi_{N+1}}\, S_N^{-1} \varphi_{N+1} \varphi_{N+1}^T S_N^{-1}.
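A one-function sketch of this update (NumPy; specialized to the symmetric case b = c = φ, which is what S_N needs):

```python
import numpy as np

def sherman_morrison_update(S_inv, phi):
    # Given S^{-1} (symmetric), return (S + phi phi^T)^{-1} without re-inverting.
    v = S_inv @ phi                                  # S^{-1} phi
    return S_inv - np.outer(v, v) / (1.0 + phi @ v)
```

The update costs O(p^2) per new observation instead of the O(p^3) of a full inversion.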

Page 18:

Recursion for θ_N: the RLSQ method.

Proposition
Assume that R_N is non-singular, and let θ_N be the LSQ estimate of θ^*. Then θ_{N+1} and R_{N+1} can be computed via the recursion

\theta_{N+1} = \theta_N + \frac{1}{N+1}\, R_{N+1}^{-1} \varphi_{N+1} \left( y_{N+1} - \varphi_{N+1}^T \theta_N \right),

R_{N+1} = R_N + \frac{1}{N+1} \left( \varphi_{N+1} \varphi_{N+1}^T - R_N \right).

The term (y_{N+1} - \varphi_{N+1}^T \theta_N) is approximately (y_{N+1} - \varphi_{N+1}^T \theta^*), which is the innovation e_{N+1}.

The correction term satisfies E(y_{N+1} - \varphi_{N+1}^T \theta) = 0 when θ = θ^*.
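A sketch of one RLSQ step (NumPy; the function name is illustrative). It is written in terms of P_N = S_N^{-1}: since R_{N+1} = S_{N+1}/(N+1), the correction term (1/(N+1)) R_{N+1}^{-1} φ_{N+1} in the proposition equals S_{N+1}^{-1} φ_{N+1}, so the two forms are algebraically the same:

```python
import numpy as np

def rlsq_step(theta, P, phi, y_new):
    # One recursive LSQ step: (theta_N, S_N^{-1}) -> (theta_{N+1}, S_{N+1}^{-1}).
    v = P @ phi
    P_next = P - np.outer(v, v) / (1.0 + phi @ v)              # Sherman-Morrison
    theta_next = theta + P_next @ phi * (y_new - phi @ theta)  # correction term
    return theta_next, P_next
```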

Page 19:

Identification of MA models.

Page 20:

Identification of MA models.

Consider an MA process (yn) defined by

y_n = e_n + c_1^* e_{n-1} + \cdots + c_r^* e_{n-r}, \quad \text{i.e.} \quad y = C^*(q^{-1})\, e,

with

C^*(q^{-1}) = 1 + \sum_{j=1}^{r} c_j^* q^{-j}, \qquad \deg C^* = r,

where (e_n) is a w.s.st. orthogonal process. We will use the notation

\theta^* = (c_1^*, \dots, c_r^*)^T.

Assume that C^* is stable; then e is the innovation process of y.

Page 21:

y_n = e_n + c_1^* e_{n-1} + \cdots + c_r^* e_{n-r}

Goal: estimate θ^* using the observations y_1, ..., y_N.

The key idea: try to reconstruct e_1, ..., e_N by inverting the system.

Let us take a polynomial C(q^{-1}) = 1 + \sum_{j=1}^{r} c_j q^{-j}.

Define an estimated driving noise process ε = (ε_n) by

\varepsilon = C^{-1}(q^{-1})\, y, \quad \text{or, equivalently,} \quad C(q^{-1})\, \varepsilon = y.

If C(z^{-1}) ≠ 0 for |z| = 1, then the process ε is well-defined.

Page 22:

Example: Inverse of an MA(1) process.

An MA(1) process is given by

y_n = e_n + c_1^* e_{n-1}.

Then the estimated noise process is generated by the inverse system:

\varepsilon_n + c_1 \varepsilon_{n-1} = y_n, \qquad n \ge 1.

To generate ε_1 we would need to know ε_0, which is not available. A standard choice is ε_j = 0 for -r + 1 ≤ j ≤ 0.
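A sketch of this inversion for a general MA(r) model, with the zero initialization above (NumPy; the function name and the MA(1) sanity check are illustrative):

```python
import numpy as np

def ma_residuals(y, c):
    # eps_n = y_n - c_1 eps_{n-1} - ... - c_r eps_{n-r}, with eps_j = 0 for j <= 0.
    c = np.asarray(c, dtype=float)
    r = len(c)
    eps = np.zeros(r + len(y))          # eps[0:r] hold the zero initial values
    for n in range(len(y)):
        i = r + n                       # slot of eps_{n+1} (1-based slide indexing)
        eps[i] = y[n] - sum(c[j] * eps[i - 1 - j] for j in range(r))
    return eps[r:]

# Sanity check: invert an MA(1) sample generated with c* = 0.5 and e_0 = 0.
rng = np.random.default_rng(2)
e = rng.standard_normal(1000)
y = e + 0.5 * np.concatenate([[0.0], e[:-1]])
print(np.max(np.abs(ma_residuals(y, [0.5]) - e)))  # ~0: e is recovered exactly here
```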

Page 23:

C(q^{-1}) = 1 + \sum_{j=1}^{r} c_j q^{-j}

θ = (c_1, ..., c_r)^T. Define the estimated noise process (ε_n(θ)) by:

\varepsilon_n(\theta) + c_1 \varepsilon_{n-1}(\theta) + \cdots + c_r \varepsilon_{n-r}(\theta) = y_n, \qquad \varepsilon_{-j}(\theta) = 0.

Consider the cost function: V_N(\theta) = \frac{1}{2} \sum_{n=0}^{N-1} \varepsilon_n^2(\theta).

The Prediction Error (PE) estimator θ_N is the solution of:

\min_{\theta \in D} V_N(\theta), \qquad D = \{ \theta \in R^r : C(z^{-1}) = 1 + \sum_{j=1}^{r} c_j z^{-j} \text{ is stable} \}.
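For the MA(1) case the minimization is one-dimensional and can be sketched directly (SciPy's bounded scalar minimizer as a stand-in for the constrained minimization over D; reuses the hypothetical ma_residuals and the sample y from the previous sketch):

```python
from scipy.optimize import minimize_scalar

def pe_estimate_ma1(y):
    # PE estimate for MA(1): minimize V_N(c) = 0.5 * sum eps_n(c)^2 over |c| < 1,
    # i.e. over (an interior approximation of) the stability region D.
    def V(c):
        eps = ma_residuals(y, [c])
        return 0.5 * float(eps @ eps)
    return minimize_scalar(V, bounds=(-0.999, 0.999), method="bounded").x

print(pe_estimate_ma1(y))  # close to 0.5 for the sample generated above
```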

Page 24:

\varepsilon_n(\theta) + c_1 \varepsilon_{n-1}(\theta) + \cdots + c_r \varepsilon_{n-r}(\theta) = y_n

The cost function is V_N(\theta) = \frac{1}{2} \sum_{n=0}^{N-1} \varepsilon_n^2(\theta).

We have

\frac{\partial}{\partial \theta} V_N(\theta) = V_{\theta,N}(\theta) = \sum_{n=1}^{N} \varepsilon_{\theta,n}(\theta)\, \varepsilon_n(\theta).

To get ε_{θ,n}(θ), we can differentiate the defining equation w.r.t. θ_j = c_j:

\frac{\partial}{\partial \theta_j} \varepsilon_n(\theta) + c_1 \frac{\partial}{\partial \theta_j} \varepsilon_{n-1}(\theta) + \cdots + c_r \frac{\partial}{\partial \theta_j} \varepsilon_{n-r}(\theta) + \varepsilon_{n-j}(\theta) = 0.

Page 25:

The gradient process.

The j-th coordinate of the gradient process is generated by:

\varepsilon_{\theta_j,n}(\theta) + c_1 \varepsilon_{\theta_j,n-1}(\theta) + \cdots + c_r \varepsilon_{\theta_j,n-r}(\theta) + \varepsilon_{n-j}(\theta) = 0,

with initial values \varepsilon_{\theta_j,n}(\theta) = 0 for n \le 0.

Introduce the notation:

\phi_n(\theta) = (\varepsilon_{n-1}(\theta), \dots, \varepsilon_{n-r}(\theta))^T.

Conclusion. The gradient process ε_θ(θ) satisfies

C(q^{-1})\, \varepsilon_\theta(\theta) = -\phi(\theta).

Thus the equation defining the PE estimator is non-linear in θ.
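For instance, in the MA(1) case (r = 1, θ = c_1), the gradient recursion above specializes to

\varepsilon_{\theta,n}(\theta) + c_1 \varepsilon_{\theta,n-1}(\theta) = -\varepsilon_{n-1}(\theta), \qquad \varepsilon_{\theta,n}(\theta) = 0 \text{ for } n \le 0,

i.e. the gradient is obtained from the delayed residuals by the filter -1/C(q^{-1}).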

Page 26:

The asymptotic covariance matrix of θ_N

The error of the estimation is \tilde\theta_N = \theta_N - \theta^*. Introduce the notation

R^* = E\left( \varepsilon_{\theta,n}(\theta^*)\, \varepsilon_{\theta,n}^T(\theta^*) \right).

Proposition
Assume that C^*(z^{-1}) is stable. Then \tilde\theta_N has the following covariance matrix:

E\, \tilde\theta_N \tilde\theta_N^T = \frac{1}{N}\, (R^*)^{-1} \sigma^2(e).

Page 27:

The asymptotic covariance of the estimators

(A remarkable feature of the above results)

The asymptotic covariance matrix of the PE estimator of the parameters of the MA system

y = C^*(q^{-1})\, e,

and the asymptotic covariance matrix of the LSQ estimator of the parameters of the AR system

C^*(q^{-1})\, y = e

are the same.

Page 28:

Example: MA(1) process

y_n = e_n + c^* e_{n-1}, \qquad c^* \ne \pm 1.

We have seen that R^* = \frac{\sigma^2(e)}{1 - (c^*)^2}, and thus

\lim_{N \to \infty} N\, E (c_N - c^*)^2 = 1 - (c^*)^2.

It follows that if c^* is close to ±1, then σ^2(c_N) is close to 0.

Remark. A transfer function H(e^{-iω}, θ^*) and its inverse H^{-1}(e^{-iω}, θ^*) can be estimated equally accurately, at least asymptotically.

Exercise. Compute the gradient process for AR(1) and MA(1) processes.