Transcript of pages.stern.nyu.edu/~dbackus/Identification/Kalman/JFV_LectureNo…

Page 1:

State Space Models and Filtering

Jesús Fernández-Villaverde, University of Pennsylvania

Page 2:

State Space Form

• What is a state space representation?

• States versus observables.

• Why is it useful?

• Relation with filtering.

• Relation with optimal control.

• Linear versus nonlinear, Gaussian versus non-Gaussian.

Page 3:

State Space Representation

• Consider the following system:

— Transition equation

x_{t+1} = F x_t + G ω_{t+1},  ω_{t+1} ∼ N(0, Q)

— Measurement equation

z_t = H′ x_t + υ_t,  υ_t ∼ N(0, R)

— where x_t are the states and z_t are the observables.

• Assume we want to write the likelihood function of z^T = {z_t}_{t=1}^T.

Page 4:

The State Space Representation is Not Unique

• Take the previous state space representation.

• Let B be a non-singular square matrix conforming with F.

• Then, if x*_t = B x_t, F* = B F B⁻¹, G* = B G, and H* = (H′B⁻¹)′, we can write a new, equivalent representation:

— Transition equation

x*_{t+1} = F* x*_t + G* ω_{t+1},  ω_{t+1} ∼ N(0, Q)

— Measurement equation

z_t = H*′ x*_t + υ_t,  υ_t ∼ N(0, R)

Page 5:

Example I

• Assume the following AR(2) process:

z_t = ρ_1 z_{t-1} + ρ_2 z_{t-2} + υ_t,  υ_t ∼ N(0, σ²_υ)

• The model is apparently not Markovian.

• Can we write this model in different state space forms?

• Yes!

Page 6:

State Space Representation I

• Transition equation:

x_t = [ρ_1 1; ρ_2 0] x_{t-1} + [1; 0] υ_t

where x_t = [z_t  ρ_2 z_{t-1}]′

• Measurement equation:

z_t = [1 0] x_t

Page 7:

State Space Representation II

• Transition equation:

x_t = [ρ_1 ρ_2; 1 0] x_{t-1} + [1; 0] υ_t

where x_t = [z_t  z_{t-1}]′

• Measurement equation:

z_t = [1 0] x_t

• Try B = [1 0; 0 ρ_2] on the second system to get the first system.
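This equivalence is easy to verify numerically. A minimal sketch (the coefficient values and matrix names are mine; H is stored so that the measurement equation reads z_t = H′x_t):

```python
import numpy as np

rho1, rho2 = 0.5, 0.2  # illustrative AR(2) coefficients

# Representation II: state x_t = [z_t, z_{t-1}]'
F2 = np.array([[rho1, rho2], [1.0, 0.0]])
G2 = np.array([[1.0], [0.0]])
H2 = np.array([[1.0], [0.0]])  # z_t = H2' x_t = [1 0] x_t

# Change of basis x*_t = B x_t with B non-singular
B = np.array([[1.0, 0.0], [0.0, rho2]])
Binv = np.linalg.inv(B)

F1 = B @ F2 @ Binv    # new transition matrix
G1 = B @ G2           # new shock loading
H1 = Binv.T @ H2      # H* = (H' B^{-1})', so that H*' x*_t = H' x_t

# We recover Representation I: F1 = [[rho1, 1], [rho2, 0]], measurement [1 0]
print(F1, G1.ravel(), H1.ravel())
```

Both representations generate identical observables; only the labeling of the state differs.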

Page 8:

Example II

• Assume the following MA(1) process:

z_t = υ_t + θ υ_{t-1},  υ_t ∼ N(0, σ²_υ), and E υ_t υ_s = 0 for s ≠ t.

• Again, we have a more complicated structure than a simple Markovian process.

• However, it will again be straightforward to write a state space representation.

Page 9:

State Space Representation I

• Transition equation:

x_t = [0 1; 0 0] x_{t-1} + [1; θ] υ_t

where x_t = [z_t  θυ_t]′

• Measurement equation:

z_t = [1 0] x_t

Page 10:

State Space Representation II

• Transition equation:

x_t = υ_{t-1}

• where x_t = [υ_{t-1}]′

• Measurement equation: z_t = θ x_t + υ_t

• Again, both representations are equivalent!

Page 11:

Example III

• Assume the following random walk plus drift process:

z_t = z_{t-1} + β + υ_t,  υ_t ∼ N(0, σ²_υ)

• This is even more interesting.

• We have a unit root.

• We have a constant parameter (the drift).

Page 12:

State Space Representation

• Transition equation:

x_t = [1 1; 0 1] x_{t-1} + [1; 0] υ_t

where x_t = [z_t  β]′

• Measurement equation:

z_t = [1 0] x_t

Page 13:

Some Conditions on the State Space Representation

• We only consider stable systems.

• A system is stable if, for any initial state x_0, the vector of states x_t converges to some unique x*.

• A necessary and sufficient condition for the system to be stable is that |λ_i(F)| < 1 for all i, where λ_i(F) stands for the i-th eigenvalue of F.
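A numerical check of this condition is a one-liner. A minimal sketch (the function name is mine; F below is the AR(2) companion matrix from the earlier example with illustrative coefficients):

```python
import numpy as np

def is_stable(F):
    """True if all eigenvalues of F are strictly inside the unit circle."""
    return bool(np.max(np.abs(np.linalg.eigvals(F))) < 1.0)

F = np.array([[0.5, 0.2], [1.0, 0.0]])  # AR(2) companion form, rho1=0.5, rho2=0.2
print(is_stable(F))  # True: both eigenvalues lie inside the unit circle
```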

Page 14:

Introducing the Kalman Filter

• Developed by Kalman and Bucy.

• Wide application in science.

• Basic idea.

• Prediction, smoothing, and control.

• Why the name “filter”?

Page 15:

Some Definitions

• Let x_{t|t-1} = E(x_t | z^{t-1}) be the best linear predictor of x_t given the history of observables until t−1, i.e. z^{t-1}.

• Let z_{t|t-1} = E(z_t | z^{t-1}) = H′ x_{t|t-1} be the best linear predictor of z_t given the history of observables until t−1, i.e. z^{t-1}.

• Let x_{t|t} = E(x_t | z^t) be the best linear predictor of x_t given the history of observables until t, i.e. z^t.

Page 16:

What is the Kalman Filter trying to do?

• Let us assume we have x_{t|t-1} and z_{t|t-1}.

• We observe a new zt.

• We need to obtain xt|t.

• Note that x_{t+1|t} = F x_{t|t} and z_{t+1|t} = H′ x_{t+1|t}, so we can go back to the first step and wait for z_{t+1}.

• Therefore, the key question is how to obtain x_{t|t} from x_{t|t-1} and z_t.

Page 17:

A Minimization Approach to the Kalman Filter I

• Assume we use the following equation to get x_{t|t} from z_t and x_{t|t-1}:

x_{t|t} = x_{t|t-1} + K_t (z_t − z_{t|t-1}) = x_{t|t-1} + K_t (z_t − H′ x_{t|t-1})

• This formula will have some probabilistic justification (to follow).

• What is K_t?

Page 18:

A Minimization Approach to the Kalman Filter II

• K_t is called the Kalman filter gain and it measures how much we update x_{t|t-1} as a function of our error in predicting z_t.

• The question is how to find the optimal K_t.

• The Kalman filter is about how to build K_t such that we optimally update x_{t|t} from x_{t|t-1} and z_t.

• How do we find the optimal K_t?

Page 19:

Some Additional Definitions

• Let Σ_{t|t-1} ≡ E((x_t − x_{t|t-1})(x_t − x_{t|t-1})′ | z^{t-1}) be the prediction error variance-covariance matrix of x_t given the history of observables until t−1, i.e. z^{t-1}.

• Let Ω_{t|t-1} ≡ E((z_t − z_{t|t-1})(z_t − z_{t|t-1})′ | z^{t-1}) be the prediction error variance-covariance matrix of z_t given the history of observables until t−1, i.e. z^{t-1}.

• Let Σ_{t|t} ≡ E((x_t − x_{t|t})(x_t − x_{t|t})′ | z^t) be the prediction error variance-covariance matrix of x_t given the history of observables until t, i.e. z^t.

Page 20:

Finding the optimal Kt

• We want K_t such that Σ_{t|t} is minimized.

• It can be shown that, if that is the case:

K_t = Σ_{t|t-1} H (H′ Σ_{t|t-1} H + R)⁻¹

• with the optimal update of x_{t|t} given z_t and x_{t|t-1} being:

x_{t|t} = x_{t|t-1} + K_t (z_t − H′ x_{t|t-1})

• We will provide some intuition later.

Page 21:

Example I

Assume the following model in state space form:

• Transition equation: x_t = µ + υ_t,  υ_t ∼ N(0, σ²_υ)

• Measurement equation: z_t = x_t + ξ_t,  ξ_t ∼ N(0, σ²_ξ)

• Let σ²_ξ = q σ²_υ.

Page 22:

Example II

• Then, if Σ_{1|0} = σ²_υ (which means that x_1 was drawn from the ergodic distribution of x_t), we have:

K_1 = σ²_υ / (σ²_υ (1 + q)) = 1/(1 + q).

• Therefore, the bigger σ²_ξ relative to σ²_υ (the bigger q), the lower K_1 and the less we trust z_1.

Page 23:

The Kalman Filter Algorithm I

Given Σ_{t|t-1}, z_t, and x_{t|t-1}, we can now set out the Kalman filter algorithm.

Given Σ_{t|t-1}, we compute:

Ω_{t|t-1} ≡ E((z_t − z_{t|t-1})(z_t − z_{t|t-1})′ | z^{t-1})
  = E(H′(x_t − x_{t|t-1})(x_t − x_{t|t-1})′H + υ_t(x_t − x_{t|t-1})′H + H′(x_t − x_{t|t-1})υ_t′ + υ_tυ_t′ | z^{t-1})
  = H′ Σ_{t|t-1} H + R

Page 24:

The Kalman Filter Algorithm II

Given Σ_{t|t-1}, we compute:

E((z_t − z_{t|t-1})(x_t − x_{t|t-1})′ | z^{t-1})
  = E(H′(x_t − x_{t|t-1})(x_t − x_{t|t-1})′ + υ_t(x_t − x_{t|t-1})′ | z^{t-1})
  = H′ Σ_{t|t-1}

Given Σ_{t|t-1}, we compute:

K_t = Σ_{t|t-1} H (H′ Σ_{t|t-1} H + R)⁻¹

Given Σ_{t|t-1}, x_{t|t-1}, K_t, and z_t, we compute:

x_{t|t} = x_{t|t-1} + K_t (z_t − H′ x_{t|t-1})

Page 25:

The Kalman Filter Algorithm III

Given Σ_{t|t-1}, x_{t|t-1}, K_t, and z_t, we compute:

Σ_{t|t} ≡ E((x_t − x_{t|t})(x_t − x_{t|t})′ | z^t)
  = E((x_t − x_{t|t-1})(x_t − x_{t|t-1})′
    − (x_t − x_{t|t-1})(z_t − H′x_{t|t-1})′ K_t′
    − K_t (z_t − H′x_{t|t-1})(x_t − x_{t|t-1})′
    + K_t (z_t − H′x_{t|t-1})(z_t − H′x_{t|t-1})′ K_t′ | z^t)
  = Σ_{t|t-1} − K_t H′ Σ_{t|t-1}

where you have to notice that x_t − x_{t|t} = x_t − x_{t|t-1} − K_t (z_t − H′ x_{t|t-1}).

Page 26:

The Kalman Filter Algorithm IV

Given Σ_{t|t-1}, x_{t|t-1}, K_t, and z_t, we compute:

Σ_{t+1|t} = F Σ_{t|t} F′ + G Q G′

Given x_{t|t}, we compute:

1. x_{t+1|t} = F x_{t|t}

2. z_{t+1|t} = H′ x_{t+1|t}

Therefore, from x_{t|t-1}, Σ_{t|t-1}, and z_t we compute x_{t|t} and Σ_{t|t}.

Page 27:

The Kalman Filter Algorithm V

We also compute z_{t|t-1} and Ω_{t|t-1}.

Why?

To calculate the likelihood function of z^T = {z_t}_{t=1}^T (to follow).

Page 28:

The Kalman Filter Algorithm: A Review

We start with x_{t|t-1} and Σ_{t|t-1}.

Then, we observe z_t and:

• Ω_{t|t-1} = H′ Σ_{t|t-1} H + R

• z_{t|t-1} = H′ x_{t|t-1}

• K_t = Σ_{t|t-1} H (H′ Σ_{t|t-1} H + R)⁻¹

• Σ_{t|t} = Σ_{t|t-1} − K_t H′ Σ_{t|t-1}

Page 29:

• x_{t|t} = x_{t|t-1} + K_t (z_t − H′ x_{t|t-1})

• Σ_{t+1|t} = F Σ_{t|t} F′ + G Q G′

• x_{t+1|t} = F x_{t|t}

We finish with x_{t+1|t} and Σ_{t+1|t}.
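The review above translates line by line into code. A minimal sketch of one iteration (function and variable names are mine; H enters as H′ in the measurement equation z_t = H′x_t + υ_t, matching the slides' notation):

```python
import numpy as np

def kalman_step(x_pred, Sigma_pred, z, F, G, H, Q, R):
    """One Kalman iteration: from x_{t|t-1}, Sigma_{t|t-1}, and z_t
    to x_{t+1|t} and Sigma_{t+1|t}."""
    Omega = H.T @ Sigma_pred @ H + R                 # Omega_{t|t-1}
    z_pred = H.T @ x_pred                            # z_{t|t-1}
    K = Sigma_pred @ H @ np.linalg.inv(Omega)        # Kalman gain K_t
    x_filt = x_pred + K @ (z - z_pred)               # x_{t|t}
    Sigma_filt = Sigma_pred - K @ H.T @ Sigma_pred   # Sigma_{t|t}
    x_next = F @ x_filt                              # x_{t+1|t}
    Sigma_next = F @ Sigma_filt @ F.T + G @ Q @ G.T  # Sigma_{t+1|t}
    return x_next, Sigma_next, z_pred, Omega
```

One call consumes (x_{t|t-1}, Σ_{t|t-1}, z_t) and returns (x_{t+1|t}, Σ_{t+1|t}), together with the prediction z_{t|t-1} and Ω_{t|t-1} needed later for the likelihood.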

Page 30:

Some Intuition about the optimal Kt

• Remember: K_t = Σ_{t|t-1} H (H′ Σ_{t|t-1} H + R)⁻¹

• Notice that we can rewrite K_t in the following way:

K_t = Σ_{t|t-1} H Ω_{t|t-1}⁻¹

• If we made a big mistake forecasting x_{t|t-1} using past information (Σ_{t|t-1} large), we give a lot of weight to the new information (K_t large).

• If the new information is noise (R large), we give a lot of weight to the old prediction (K_t small).

Page 31:

A Probabilistic Approach to the Kalman Filter

• Assume:

Z|w = [X′|w  Y′|w]′ ∼ N([x*; y*], [Σ_xx Σ_xy; Σ_yx Σ_yy])

• Then:

X|y,w ∼ N(x* + Σ_xy Σ_yy⁻¹ (y − y*), Σ_xx − Σ_xy Σ_yy⁻¹ Σ_yx)

• Also x_{t|t-1} ≡ E(x_t | z^{t-1}) and:

Σ_{t|t-1} ≡ E((x_t − x_{t|t-1})(x_t − x_{t|t-1})′ | z^{t-1})

Page 32:

Some Derivations I

If z_t|z^{t-1} is the random variable z_t (observable) conditional on z^{t-1}, then:

• Let z_{t|t-1} ≡ E(z_t | z^{t-1}) = E(H′x_t + υ_t | z^{t-1}) = H′ x_{t|t-1}

• Let

Ω_{t|t-1} ≡ E((z_t − z_{t|t-1})(z_t − z_{t|t-1})′ | z^{t-1})
  = E(H′(x_t − x_{t|t-1})(x_t − x_{t|t-1})′H + υ_t(x_t − x_{t|t-1})′H + H′(x_t − x_{t|t-1})υ_t′ + υ_tυ_t′ | z^{t-1})
  = H′ Σ_{t|t-1} H + R

Page 33:

Some Derivations II

Finally, let

E((z_t − z_{t|t-1})(x_t − x_{t|t-1})′ | z^{t-1})
  = E(H′(x_t − x_{t|t-1})(x_t − x_{t|t-1})′ + υ_t(x_t − x_{t|t-1})′ | z^{t-1})
  = H′ Σ_{t|t-1}

Page 34:

The Kalman Filter First Iteration I

• Assume we know x_{1|0} and Σ_{1|0}; then

(x_1, z_1 | z^0) ∼ N([x_{1|0}; H′x_{1|0}], [Σ_{1|0}  Σ_{1|0}H; H′Σ_{1|0}  H′Σ_{1|0}H + R])

• Remember that:

X|y,w ∼ N(x* + Σ_xy Σ_yy⁻¹ (y − y*), Σ_xx − Σ_xy Σ_yy⁻¹ Σ_yx)

Page 35:

The Kalman Filter First Iteration II

Then, we can write:

x_1 | z_1, z^0 = x_1 | z^1 ∼ N(x_{1|1}, Σ_{1|1})

where

x_{1|1} = x_{1|0} + Σ_{1|0}H (H′Σ_{1|0}H + R)⁻¹ (z_1 − H′x_{1|0})

and

Σ_{1|1} = Σ_{1|0} − Σ_{1|0}H (H′Σ_{1|0}H + R)⁻¹ H′Σ_{1|0}

Page 36:

• Therefore, we have that:

— z_{1|0} = H′ x_{1|0}

— Ω_{1|0} = H′ Σ_{1|0} H + R

— x_{1|1} = x_{1|0} + Σ_{1|0}H (H′Σ_{1|0}H + R)⁻¹ (z_1 − H′x_{1|0})

— Σ_{1|1} = Σ_{1|0} − Σ_{1|0}H (H′Σ_{1|0}H + R)⁻¹ H′Σ_{1|0}

• Also, since x_2 = F x_1 + G ω_2 and z_2 = H′ x_2 + υ_2:

— x_{2|1} = F x_{1|1}

— Σ_{2|1} = F Σ_{1|1} F′ + G Q G′

Page 37:

The Kalman Filter t-th Iteration I

• Assume we know x_{t|t-1} and Σ_{t|t-1}; then

(x_t, z_t | z^{t-1}) ∼ N([x_{t|t-1}; H′x_{t|t-1}], [Σ_{t|t-1}  Σ_{t|t-1}H; H′Σ_{t|t-1}  H′Σ_{t|t-1}H + R])

• Remember that:

X|y,w ∼ N(x* + Σ_xy Σ_yy⁻¹ (y − y*), Σ_xx − Σ_xy Σ_yy⁻¹ Σ_yx)

Page 38:

The Kalman Filter t-th Iteration II

Then, we can write:

x_t | z_t, z^{t-1} = x_t | z^t ∼ N(x_{t|t}, Σ_{t|t})

where

x_{t|t} = x_{t|t-1} + Σ_{t|t-1}H (H′Σ_{t|t-1}H + R)⁻¹ (z_t − H′x_{t|t-1})

and

Σ_{t|t} = Σ_{t|t-1} − Σ_{t|t-1}H (H′Σ_{t|t-1}H + R)⁻¹ H′Σ_{t|t-1}

Page 39:

The Kalman Filter Algorithm

Given x_{t|t-1}, Σ_{t|t-1}, and observation z_t:

• Ω_{t|t-1} = H′ Σ_{t|t-1} H + R

• z_{t|t-1} = H′ x_{t|t-1}

• Σ_{t|t} = Σ_{t|t-1} − Σ_{t|t-1}H (H′Σ_{t|t-1}H + R)⁻¹ H′Σ_{t|t-1}

• x_{t|t} = x_{t|t-1} + Σ_{t|t-1}H (H′Σ_{t|t-1}H + R)⁻¹ (z_t − H′x_{t|t-1})

Page 40:

• Σ_{t+1|t} = F Σ_{t|t} F′ + G Q G′

• x_{t+1|t} = F x_{t|t}

Page 41:

Putting the Minimization and the Probabilistic Approaches Together

• From the minimization approach we know that:

x_{t|t} = x_{t|t-1} + K_t (z_t − H′ x_{t|t-1})

• From the probabilistic approach we know that:

x_{t|t} = x_{t|t-1} + Σ_{t|t-1}H (H′Σ_{t|t-1}H + R)⁻¹ (z_t − H′x_{t|t-1})

Page 42:

• But since:

K_t = Σ_{t|t-1} H (H′ Σ_{t|t-1} H + R)⁻¹

• we can also write, in the probabilistic approach:

x_{t|t} = x_{t|t-1} + Σ_{t|t-1}H (H′Σ_{t|t-1}H + R)⁻¹ (z_t − H′x_{t|t-1}) = x_{t|t-1} + K_t (z_t − H′ x_{t|t-1})

• Therefore, both approaches are equivalent.

Page 43:

Writing the Likelihood Function

We want to write the likelihood function of z^T = {z_t}_{t=1}^T:

log ℓ(z^T | F, G, H, Q, R) = Σ_{t=1}^T log ℓ(z_t | z^{t-1}, F, G, H, Q, R)
  = −Σ_{t=1}^T (N/2) log 2π − (1/2) Σ_{t=1}^T log |Ω_{t|t-1}| − (1/2) Σ_{t=1}^T v_t′ Ω_{t|t-1}⁻¹ v_t

where

v_t = z_t − z_{t|t-1} = z_t − H′ x_{t|t-1}

Ω_{t|t-1} = H′ Σ_{t|t-1} H + R
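Stacked over t, this prediction-error decomposition gives the whole likelihood in one pass of the filter. A sketch (function and variable names are mine):

```python
import numpy as np

def kalman_loglik(zs, F, G, H, Q, R, x0, Sigma0):
    """Gaussian log-likelihood of observations zs (T x N) via the
    prediction-error decomposition: v_t = z_t - H' x_{t|t-1} and
    Omega_{t|t-1} = H' Sigma_{t|t-1} H + R at each t."""
    x, Sigma = x0, Sigma0
    N = zs.shape[1]
    ll = 0.0
    for z in zs:
        Omega = H.T @ Sigma @ H + R
        v = z - H.T @ x                        # forecast error v_t
        ll += -0.5 * (N * np.log(2 * np.pi)
                      + np.log(np.linalg.det(Omega))
                      + v @ np.linalg.solve(Omega, v))
        K = Sigma @ H @ np.linalg.inv(Omega)   # Kalman gain
        x = F @ (x + K @ v)                    # x_{t+1|t}
        Sigma = F @ (Sigma - K @ H.T @ Sigma) @ F.T + G @ Q @ G.T
    return ll
```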

Page 44:

Initial conditions for the Kalman Filter

• An important step in the Kalman filter is to set the initial conditions.

• Initial conditions:

1. x_{1|0}

2. Σ_{1|0}

• Where do they come from?

Page 45:

Since we only consider stable systems, the standard approach is to set:

• x_{1|0} = x*

• Σ_{1|0} = Σ*

where x* and Σ* solve:

x* = F x*

Σ* = F Σ* F′ + G Q G′

How do we find Σ*?

vec(Σ*) = [I − F ⊗ F]⁻¹ vec(G Q G′)
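The vec formula translates directly into code. A sketch (the function name is mine; `order="F"` gives the column-stacking vec operator):

```python
import numpy as np

def stationary_cov(F, G, Q):
    """Solve Sigma* = F Sigma* F' + G Q G' via
    vec(Sigma*) = [I - F kron F]^{-1} vec(G Q G')."""
    n = F.shape[0]
    GQG = G @ Q @ G.T
    vec_Sigma = np.linalg.solve(np.eye(n * n) - np.kron(F, F),
                                GQG.flatten(order="F"))  # column-major vec
    return vec_Sigma.reshape((n, n), order="F")
```

For the scalar case F = 0.5, G = Q = 1 this returns 1/(1 − 0.25), the familiar AR(1) unconditional variance.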

Page 46:

Initial conditions for the Kalman Filter II

Under the following conditions:

1. The system is stable, i.e. all eigenvalues of F are strictly less than one in absolute value.

2. G Q G′ and R are p.s.d. symmetric.

3. Σ_{1|0} is p.s.d. symmetric.

Then Σ_{t+1|t} → Σ*.

Page 47:

Remarks

1. There are more general theorems than the one just described.

2. Those theorems also cover non-stable systems.

3. Since we are going to work with stable systems, the former theorem is enough.

4. The last theorem gives us a way to find Σ*, as Σ_{t+1|t} → Σ* for any Σ_{1|0} we start with.

Page 48:

The Kalman Filter and DSGE models

• Basic Real Business Cycle model:

max E_0 Σ_{t=0}^∞ β^t [ξ log c_t + (1 − ξ) log(1 − l_t)]

c_t + k_{t+1} = k_t^α (e^{z_t} l_t)^{1-α} + (1 − η) k_t

z_t = ρ z_{t-1} + ε_t,  ε_t ∼ N(0, σ)

• Parameters: γ = {α, β, ρ, ξ, η, σ}

Page 49:

Equilibrium Conditions

1/c_t = β E_t { (1/c_{t+1}) (1 + α e^{z_{t+1}} k_{t+1}^{α-1} l_{t+1}^{1-α} − η) }

(1 − ξ)/(1 − l_t) = (ξ/c_t) (1 − α) e^{z_t} k_t^α l_t^{-α}

c_t + k_{t+1} = e^{z_t} k_t^α l_t^{1-α} + (1 − η) k_t

z_t = ρ z_{t-1} + ε_t

Page 50:

A Special Case

• We set, unrealistically but rather usefully for our point, η = 1.

• In this case, the model has two important and useful features:

1. First, the income and the substitution effect from a productivity shock to labor supply exactly cancel each other. Consequently, l_t is constant and equal to:

l_t = l = (1 − α) ξ / ((1 − α) ξ + (1 − ξ)(1 − αβ))

2. Second, the policy function for capital is k_{t+1} = αβ e^{z_t} k_t^α l^{1-α}.

Page 51:

A Special Case II

• The definition of k_{t+1} implies that c_t = (1 − αβ) e^{z_t} k_t^α l^{1-α}.

• Let us check that the Euler equation holds:

1/c_t = β E_t { (1/c_{t+1}) α e^{z_{t+1}} k_{t+1}^{α-1} l^{1-α} }

1/((1 − αβ) e^{z_t} k_t^α l^{1-α}) = β E_t { α e^{z_{t+1}} k_{t+1}^{α-1} l^{1-α} / ((1 − αβ) e^{z_{t+1}} k_{t+1}^α l^{1-α}) }

1/((1 − αβ) e^{z_t} k_t^α l^{1-α}) = β E_t { α / ((1 − αβ) k_{t+1}) }

and, substituting k_{t+1} = αβ e^{z_t} k_t^α l^{1-α}:

αβ/(1 − αβ) = βα/(1 − αβ)

Page 52:

• Let us check that the intratemporal condition holds:

(1 − ξ)/(1 − l) = (ξ/((1 − αβ) e^{z_t} k_t^α l^{1-α})) (1 − α) e^{z_t} k_t^α l^{-α}

(1 − ξ)/(1 − l) = (ξ/(1 − αβ)) (1 − α)/l

(1 − αβ)(1 − ξ) l = ξ (1 − α)(1 − l)

((1 − αβ)(1 − ξ) + (1 − α) ξ) l = (1 − α) ξ

• Finally, the budget constraint holds because of the definition of ct.

Page 53:

Transition Equation

• Since this policy function is linear in logs, we have the transition equation for the model:

[1; log k_{t+1}; z_t] = [1 0 0; log(αβ l^{1-α}) α ρ; 0 0 ρ] [1; log k_t; z_{t-1}] + [0; 1; 1] ε_t.

• Note the constant.

• Alternative formulations.

Page 54:

Measurement Equation

• As observables, we assume log y_t and log i_t, subject to a linearly additive measurement error V_t = (v_{1,t}  v_{2,t})′.

• Let V_t ∼ N(0, Λ), where Λ is a diagonal matrix with σ²_1 and σ²_2 as diagonal elements.

• Why measurement error? Stochastic singularity.

• Then:

[log y_t; log i_t] = [−log(αβ l^{1-α}) 1 0; 0 1 0] [1; log k_{t+1}; z_t] + [v_{1,t}; v_{2,t}].

Page 55:

The Solution to the Model in State Space Form

x_t = [1; log k_t; z_{t-1}],   z_t = [log y_t; log i_t]

F = [1 0 0; log(αβ l^{1-α}) α ρ; 0 0 ρ],   G = [0; 1; 1],   Q = σ²

H′ = [−log(αβ l^{1-α}) 1 0; 0 1 0],   R = Λ
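Assembling these matrices is mechanical once the parameters are fixed. A sketch with illustrative parameter values (the values and variable names are mine; l is the constant labor supply from the η = 1 special case):

```python
import numpy as np

alpha, beta, rho, xi = 0.33, 0.99, 0.95, 0.3  # illustrative values
# constant labor supply under eta = 1 (formula from the special case)
l = (1 - alpha) * xi / ((1 - alpha) * xi + (1 - xi) * (1 - alpha * beta))
const = np.log(alpha * beta * l ** (1 - alpha))  # log(alpha*beta*l^(1-alpha))

F = np.array([[1.0,   0.0,   0.0],
              [const, alpha, rho],
              [0.0,   0.0,   rho]])
G = np.array([[0.0], [1.0], [1.0]])
Hp = np.array([[-const, 1.0, 0.0],   # H' rows load the states onto observables
               [0.0,    1.0, 0.0]])

# sanity check: the transition reproduces the policy function
# log k_{t+1} = const + alpha*log k_t + rho*z_{t-1} + eps_t
logk, zlag, eps = 1.0, 0.02, 0.01
xnext = F @ np.array([1.0, logk, zlag]) + (G @ np.array([eps])).ravel()
print(xnext[1])
```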

Page 56:

The Solution to the Model in State Space Form III

• Now, using z^T, F, G, H, Q, and R as defined in the last slide...

• ...we can use the Riccati equations to compute the likelihood function of the model:

log ℓ(z^T | F, G, H, Q, R)

• Cross-equation restrictions implied by the equilibrium solution.

• With the likelihood, we can do inference!

Page 57:

What do we Do if η ≠ 1?

We have two options:

• First, we could linearize or log-linearize the model and apply the Kalman filter.

• Second, we could compute the likelihood function of the model using a non-linear filter (particle filter).

• Advantages and disadvantages.

• Fernández-Villaverde, Rubio-Ramírez, and Santos (2005).

Page 58:

The Kalman Filter and linearized DSGE Models

• We linearize (or log-linearize) around the steady state.

• We assume that we have data on log output (log y_t), log hours (log l_t), and log consumption (log c_t), subject to a linearly additive measurement error V_t = (v_{1,t}  v_{2,t}  v_{3,t})′.

• We need to write the model in state space form. Remember that:

k̂_{t+1} = P k̂_t + Q z_t

and

l̂_t = R k̂_t + S z_t

Page 59:

Writing the Likelihood Function I

• The transition equation:

[1; k̂_{t+1}; z_{t+1}] = [1 0 0; 0 P Q; 0 0 ρ] [1; k̂_t; z_t] + [0; 0; 1] ε_t.

• The measurement equation requires some care.

Page 60:

Writing the Likelihood Function II

• Notice that ŷ_t = z_t + α k̂_t + (1 − α) l̂_t.

• Therefore, using l̂_t = R k̂_t + S z_t:

ŷ_t = z_t + α k̂_t + (1 − α)(R k̂_t + S z_t) = (α + (1 − α)R) k̂_t + (1 + (1 − α)S) z_t

• Also, since ĉ_t = −α_5 l̂_t + z_t + α k̂_t, and using again l̂_t = R k̂_t + S z_t:

ĉ_t = z_t + α k̂_t − α_5 (R k̂_t + S z_t) = (α − α_5 R) k̂_t + (1 − α_5 S) z_t

Page 61:

Writing the Likelihood Function III

Therefore the measurement equation is:

[log y_t; log l_t; log c_t] = [log y  α + (1 − α)R  1 + (1 − α)S; log l  R  S; log c  α − α_5 R  1 − α_5 S] [1; k̂_t; z_t] + [v_{1,t}; v_{2,t}; v_{3,t}].

Page 62:

The Likelihood Function of a General Dynamic Equilibrium Economy

• Transition equation: S_t = f(S_{t-1}, W_t; γ)

• Measurement equation: Y_t = g(S_t, V_t; γ)

• Interpretation.

Page 63:

Some Assumptions

1. We can partition W_t into two independent sequences {W_{1,t}} and {W_{2,t}}, s.t. W_t = (W_{1,t}, W_{2,t}) and dim(W_{2,t}) + dim(V_t) ≥ dim(Y_t).

2. We can always evaluate the conditional densities p(y_t | W_1^t, y^{t-1}, S_0; γ). Lubik and Schorfheide (2003).

3. The model assigns positive probability to the data.

Page 64:

Our Goal: Likelihood Function

• Evaluate the likelihood function of a sequence of realizations of the observable y^T at a particular parameter value γ:

p(y^T; γ)

• We factorize it as:

p(y^T; γ) = Π_{t=1}^T p(y_t | y^{t-1}; γ)
          = Π_{t=1}^T ∫∫ p(y_t | W_1^t, y^{t-1}, S_0; γ) p(W_1^t, S_0 | y^{t-1}; γ) dW_1^t dS_0

Page 65:

A Law of Large Numbers

If {{s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N}_{t=1}^T are N i.i.d. draws from {p(W_1^t, S_0 | y^{t-1}; γ)}_{t=1}^T, then:

p(y^T; γ) ≃ Π_{t=1}^T (1/N) Σ_{i=1}^N p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ)

Page 66:

...thus

The problem of evaluating the likelihood is equivalent to the problem of drawing from {p(W_1^t, S_0 | y^{t-1}; γ)}_{t=1}^T.

Page 67:

Introducing Particles

• {s_0^{t-1,i}, w_1^{t-1,i}}_{i=1}^N are N i.i.d. draws from p(W_1^{t-1}, S_0 | y^{t-1}; γ).

• Each s_0^{t-1,i}, w_1^{t-1,i} is a particle, and {s_0^{t-1,i}, w_1^{t-1,i}}_{i=1}^N a swarm of particles.

• {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N are N i.i.d. draws from p(W_1^t, S_0 | y^{t-1}; γ).

• Each s_0^{t|t-1,i}, w_1^{t|t-1,i} is a proposed particle, and {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N a swarm of proposed particles.

Page 68:

... and Weights

q_t^i = p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ) / Σ_{i=1}^N p(y_t | w_1^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}; γ)

Page 69:

A Proposition

Let {s̃_0^i, w̃_1^i}_{i=1}^N be a draw with replacement from {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N with probabilities q_t^i. Then {s̃_0^i, w̃_1^i}_{i=1}^N is a draw from p(W_1^t, S_0 | y^t; γ).

Page 70:

Importance of the Proposition

1. It shows how a draw {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N from p(W_1^t, S_0 | y^{t-1}; γ) can be used to draw {s_0^{t,i}, w_1^{t,i}}_{i=1}^N from p(W_1^t, S_0 | y^t; γ).

2. With a draw {s_0^{t,i}, w_1^{t,i}}_{i=1}^N from p(W_1^t, S_0 | y^t; γ), we can use p(W_{1,t+1}; γ) to get a draw {s_0^{t+1|t,i}, w_1^{t+1|t,i}}_{i=1}^N and iterate the procedure.

Page 71:

Sequential Monte Carlo I: Filtering

Step 0, Initialization: Set t ← 1 and initialize p(W_1^{t-1}, S_0 | y^{t-1}; γ) = p(S_0; γ).

Step 1, Prediction: Sample N values {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N from the density p(W_1^t, S_0 | y^{t-1}; γ) = p(W_{1,t}; γ) p(W_1^{t-1}, S_0 | y^{t-1}; γ).

Step 2, Weighting: Assign to each draw s_0^{t|t-1,i}, w_1^{t|t-1,i} the weight q_t^i.

Step 3, Sampling: Draw {s_0^{t,i}, w_1^{t,i}}_{i=1}^N with replacement from {s_0^{t|t-1,i}, w_1^{t|t-1,i}}_{i=1}^N with probabilities {q_t^i}_{i=1}^N. If t < T, set t ← t + 1 and go to step 1. Otherwise stop.

Page 72:

Sequential Monte Carlo II: Likelihood

Use

(½st|t−1,i0 , w

t|t−1,i1

¾Ni=1

)Tt=1

to compute:

p³yT ; γ

´'

TYt=1

1

N

NXi=1

p

µyt|wt|t−1,i1 , yt−1, st|t−1,i0 ; γ

72

A “Trivial” Application

How do we evaluate the likelihood function $p\left(y^T \mid \alpha, \beta, \sigma\right)$ of the nonlinear, nonnormal process:

$$s_t = \alpha + \beta \frac{s_{t-1}}{1 + s_{t-1}} + w_t$$

$$y_t = s_t + v_t$$

where $w_t \sim \mathcal{N}(0, \sigma)$ and $v_t \sim t(2)$, given some observables $y^T = \{y_t\}_{t=1}^{T}$ and $s_0$?

73

1. Let $s_0^{0,i} = s_0$ for all $i$.

2. Generate $N$ i.i.d. draws $\left\{s_0^{1|0,i}, w^{1|0,i}\right\}_{i=1}^{N}$ from $\mathcal{N}(0, \sigma)$.

3. Evaluate

$$p\left(y_1 \mid w^{1|0,i}, y^0, s_0^{1|0,i}\right) = p_{t(2)}\left(y_1 - \left(\alpha + \beta \frac{s_0^{1|0,i}}{1 + s_0^{1|0,i}} + w^{1|0,i}\right)\right).$$

4. Evaluate the relative weights

$$q_1^i = \frac{p_{t(2)}\left(y_1 - \left(\alpha + \beta \frac{s_0^{1|0,i}}{1 + s_0^{1|0,i}} + w^{1|0,i}\right)\right)}{\sum_{i=1}^{N} p_{t(2)}\left(y_1 - \left(\alpha + \beta \frac{s_0^{1|0,i}}{1 + s_0^{1|0,i}} + w^{1|0,i}\right)\right)}.$$

74

5. Resample with replacement $N$ values of $\left\{s_0^{1|0,i}, w^{1|0,i}\right\}_{i=1}^{N}$ with relative weights $q_1^i$. Call those sampled values $\left\{s_0^{1,i}, w^{1,i}\right\}_{i=1}^{N}$.

6. Go back to step 2 and iterate steps 2–5 until the end of the sample.

75

A Law of Large Numbers

A law of large numbers delivers:

$$p\left(y_1 \mid y^0, \alpha, \beta, \sigma\right) \simeq \frac{1}{N} \sum_{i=1}^{N} p\left(y_1 \mid w^{1|0,i}, y^0, s_0^{1|0,i}\right)$$

and consequently:

$$p\left(y^T \mid \alpha, \beta, \sigma\right) \simeq \prod_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N} p\left(y_t \mid w^{t|t-1,i}, y^{t-1}, s_0^{t|t-1,i}\right)$$

76
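Steps 1–6 can be collected into a likelihood evaluator for this model. A sketch under stated assumptions: the slides' $\mathcal{N}(0, \sigma)$ is read as a normal with standard deviation $\sigma$, the $t(2)$ density is coded directly, and `particle_loglik` is an illustrative name, not from the source.

```python
import numpy as np

def t2_pdf(x):
    # Student-t density with 2 degrees of freedom:
    # f(x) = (1 / (2*sqrt(2))) * (1 + x^2 / 2)^(-3/2)
    return (1.0 / (2.0 * np.sqrt(2.0))) * (1.0 + 0.5 * x**2) ** -1.5

def particle_loglik(y, alpha, beta, sigma, s0, N=10000, seed=0):
    """Log-likelihood of y^T for s_t = alpha + beta*s_{t-1}/(1+s_{t-1}) + w_t,
    y_t = s_t + v_t, with w_t ~ N(0, sigma) and v_t ~ t(2)."""
    rng = np.random.default_rng(seed)
    s = np.full(N, float(s0))                # step 1: s^i = s_0 for all i
    loglik = 0.0
    for y_t in y:
        w = rng.normal(0.0, sigma, N)        # step 2: draw the w_t^i
        s_pred = alpha + beta * s / (1.0 + s) + w
        dens = t2_pdf(y_t - s_pred)          # step 3: p_t(2)(y_t - s_t^i)
        loglik += np.log(dens.mean())        # law-of-large-numbers average
        q = dens / dens.sum()                # step 4: relative weights
        s = rng.choice(s_pred, size=N, p=q)  # step 5: resample
    return loglik
```

The running sum of log-averages is the log of the product approximation on this slide.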

Comparison with Alternative Schemes

• Deterministic algorithms: Extended Kalman Filter and its derivations (Jazwinski, 1973), Gaussian Sum approximations (Alspach and Sorenson, 1972), grid-based filters (Bucy and Senne, 1974), Jacobian of the transform (Miranda and Rui, 1997). Tanizaki (1996) surveys these methods.

• Simulation algorithms: Kitagawa (1987), Gordon, Salmond and Smith (1993), Mariano and Tanizaki (1995) and Geweke and Tanizaki (1999).

77


A “Real” Application: the Stochastic Neoclassical Growth Model

• Standard model.

• Isn’t the model nearly linear?

• Yes, but:

1. Better to begin with something easy.

2. We will learn something nevertheless.

78

The Model

• Representative agent with utility function $U = E_0 \sum_{t=0}^{\infty} \beta^t \frac{\left(c_t^{\theta} (1-l_t)^{1-\theta}\right)^{1-\tau}}{1-\tau}$.

• One good produced according to $y_t = e^{z_t} A k_t^{\alpha} l_t^{1-\alpha}$ with $\alpha \in (0,1)$.

• Productivity evolves as $z_t = \rho z_{t-1} + \epsilon_t$, $|\rho| < 1$ and $\epsilon_t \sim \mathcal{N}(0, \sigma_\epsilon)$.

• Law of motion for capital: $k_{t+1} = i_t + (1-\delta) k_t$.

• Resource constraint: $c_t + i_t = y_t$.

79

• Solve for $c(\cdot, \cdot)$ and $l(\cdot, \cdot)$ given initial conditions.

• Characterized by:

$$U_c(t) = \beta E_t \left[ U_c(t+1) \left( 1 + \alpha A e^{z_{t+1}} k_{t+1}^{\alpha-1} l(k_{t+1}, z_{t+1})^{1-\alpha} - \delta \right) \right]$$

$$\frac{1-\theta}{\theta} \frac{c(k_t, z_t)}{1 - l(k_t, z_t)} = (1-\alpha) e^{z_t} A k_t^{\alpha} l(k_t, z_t)^{-\alpha}$$

• A system of functional equations with no known analytical solution.

80

Solving the Model

• We need to use a numerical method to solve it.

• Different nonlinear approximations: value function iteration, perturbation, projection methods.

• We use a Finite Element Method. Why? Aruoba, Fernández-Villaverde and Rubio-Ramírez (2003):

1. Speed: sparse system.

2. Accuracy: flexible grid generation.

3. Scalable.

81

Building the Likelihood Function

• Time series:

1. Quarterly real output, hours worked and investment.

2. These are the main series implied by the model and keep the dimensionality low.

• Measurement error. Why?

• $\gamma = (\theta, \rho, \tau, \alpha, \delta, \beta, \sigma_\epsilon, \sigma_1, \sigma_2, \sigma_3)$

82

State Space Representation

Transition equations:

$$k_t = f_1(S_{t-1}, W_t; \gamma) = e^{\tanh^{-1}(\lambda_{t-1})} k_{t-1}^{\alpha}\, l\left(k_{t-1}, \tanh^{-1}(\lambda_{t-1}); \gamma\right)^{1-\alpha} \left(1 - \frac{\theta(1-\alpha)}{1-\theta} \frac{1 - l\left(k_{t-1}, \tanh^{-1}(\lambda_{t-1}); \gamma\right)}{l\left(k_{t-1}, \tanh^{-1}(\lambda_{t-1}); \gamma\right)}\right) + (1-\delta)\, k_{t-1}$$

$$\lambda_t = f_2(S_{t-1}, W_t; \gamma) = \tanh\left(\rho \tanh^{-1}(\lambda_{t-1}) + \epsilon_t\right)$$

Measurement equations:

$$gdp_t = g_1(S_t, V_t; \gamma) = e^{\tanh^{-1}(\lambda_t)} k_t^{\alpha}\, l\left(k_t, \tanh^{-1}(\lambda_t); \gamma\right)^{1-\alpha} + V_{1,t}$$

$$hours_t = g_2(S_t, V_t; \gamma) = l\left(k_t, \tanh^{-1}(\lambda_t); \gamma\right) + V_{2,t}$$

$$inv_t = g_3(S_t, V_t; \gamma) = e^{\tanh^{-1}(\lambda_t)} k_t^{\alpha}\, l\left(k_t, \tanh^{-1}(\lambda_t); \gamma\right)^{1-\alpha} \left(1 - \frac{\theta(1-\alpha)}{1-\theta} \frac{1 - l\left(k_t, \tanh^{-1}(\lambda_t); \gamma\right)}{l\left(k_t, \tanh^{-1}(\lambda_t); \gamma\right)}\right) + V_{3,t}$$

83

Likelihood Function

Since our measurement equation implies that

$$p(y_t \mid S_t; \gamma) = (2\pi)^{-\frac{3}{2}} |\Sigma|^{-\frac{1}{2}} e^{-\frac{\omega(S_t; \gamma)}{2}}$$

where $\omega(S_t; \gamma) = (y_t - x(S_t; \gamma))' \Sigma^{-1} (y_t - x(S_t; \gamma))$ $\forall t$, we have

$$p\left(y^T; \gamma\right) = (2\pi)^{-\frac{3T}{2}} |\Sigma|^{-\frac{T}{2}} \int \prod_{t=1}^{T} \left[ \int e^{-\frac{\omega(S_t; \gamma)}{2}} p\left(S_t \mid y^{t-1}, S_0; \gamma\right) dS_t \right] p(S_0; \gamma)\, dS_0 \simeq (2\pi)^{-\frac{3T}{2}} |\Sigma|^{-\frac{T}{2}} \prod_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N} e^{-\frac{\omega(s_t^i; \gamma)}{2}}$$

84
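The period-$t$ measurement density $p(y_t \mid S_t; \gamma)$ can be evaluated in logs for numerical stability. A minimal sketch, assuming three observables; `x_t` stands in for the model-implied mean $x(S_t; \gamma)$ from the measurement equations.

```python
import numpy as np

def log_p_y_given_s(y_t, x_t, Sigma):
    """log p(y_t | S_t; gamma) for three observables with Gaussian
    measurement error: -(3/2) log(2*pi) - (1/2) log|Sigma| - omega/2."""
    diff = y_t - x_t
    omega = diff @ np.linalg.solve(Sigma, diff)  # (y - x)' Sigma^{-1} (y - x)
    _, logdet = np.linalg.slogdet(Sigma)
    return -1.5 * np.log(2.0 * np.pi) - 0.5 * logdet - 0.5 * omega
```

Averaging $e^{-\omega(s_t^i; \gamma)/2}$ across particles, period by period, reproduces the product approximation above.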

Priors for the Parameters

Priors for the Parameters of the Model

Parameter   Distribution   Hyperparameters
θ           Uniform        [0, 1]
ρ           Uniform        [0, 1]
τ           Uniform        [0, 100]
α           Uniform        [0, 1]
δ           Uniform        [0, 0.05]
β           Uniform        [0.75, 1]
σ_ε         Uniform        [0, 0.1]
σ1          Uniform        [0, 0.1]
σ2          Uniform        [0, 0.1]
σ3          Uniform        [0, 0.1]

85


Likelihood-Based Inference I: a Bayesian Perspective

• Define priors over parameters: truncated uniforms.

• Use a Random-walk Metropolis-Hastings to draw from the posterior.

• Find the Marginal Likelihood.

86
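A random-walk Metropolis-Hastings pass might look as follows. This is a generic sketch, not the authors' code: `logpost` would return the particle-filter log-likelihood plus the log prior (minus infinity outside the truncated-uniform supports), and `scale` is a tuning choice.

```python
import numpy as np

def rw_metropolis(logpost, theta0, scale, draws, seed=0):
    """Random-walk Metropolis-Hastings: propose theta' = theta + scale*eps,
    accept with probability min(1, exp(logpost(theta') - logpost(theta)))."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = logpost(theta)
    chain = np.empty((draws, theta.size))
    for d in range(draws):
        proposal = theta + scale * rng.standard_normal(theta.size)
        lp_prop = logpost(proposal)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept / reject
            theta, lp = proposal, lp_prop
        chain[d] = theta
    return chain
```

Posterior means and standard deviations, as reported in the tables below, are then sample moments of the chain after a burn-in.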


Likelihood-Based Inference II: a Maximum Likelihood Perspective

• We only need to maximize the likelihood.

• Difficult to maximize with Newton-type schemes.

• Common problem in dynamic equilibrium economies.

• We use a simulated annealing scheme.

87
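Simulated annealing accepts occasional uphill moves, which is what lets the search escape the local optima that defeat Newton-type schemes. A generic sketch with hypothetical tuning constants, not the authors' implementation:

```python
import numpy as np

def simulated_annealing(objective, theta0, scale=0.1, steps=20000,
                        temp0=1.0, cooling=0.9995, seed=0):
    """Minimize `objective` (e.g. minus the log-likelihood): accept uphill
    moves with probability exp(-delta/T) so the search can leave local optima."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    f = objective(theta)
    best, f_best = theta.copy(), f
    temp = temp0
    for _ in range(steps):
        proposal = theta + scale * rng.standard_normal(theta.size)
        f_prop = objective(proposal)
        if f_prop < f or rng.uniform() < np.exp(-(f_prop - f) / temp):
            theta, f = proposal, f_prop
            if f < f_best:
                best, f_best = theta.copy(), f
        temp *= cooling  # geometric cooling schedule
    return best, f_best
```

As the temperature falls, the scheme behaves more and more like a greedy local search around the best point found.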

An Exercise with Artificial Data

• First simulate data with our model and use that data as sample.

• Pick “true” parameter values: the benchmark calibration values for the stochastic neoclassical growth model (Cooley and Prescott, 1995).

Calibrated Parameters
Parameter   θ       ρ      τ      α     δ
Value       0.357   0.95   2.0    0.4   0.02
Parameter   β       σ_ε    σ1          σ2       σ3
Value       0.99    0.007  1.58×10⁻⁴   0.0011   8.66×10⁻⁴

• Sensitivity: τ = 50 and σ_ε = 0.035.

88

[Figure 5.1: Likelihood Function, Benchmark Calibration. Likelihood cuts at ρ, τ, α, δ, σ, β, and θ; nonlinear versus linear filter, with pseudotrue values marked.]

[Figure 5.2: Posterior Distribution, Benchmark Calibration. Posterior histograms for ρ, τ, α, δ, σ, β, θ, σ1, σ2, and σ3.]

[Figure 5.3: Likelihood Function, Extreme Calibration. Likelihood cuts at ρ, τ, α, δ, σ, β, and θ; nonlinear versus linear filter, with pseudotrue values marked.]

[Figure 5.4: Posterior Distribution, Extreme Calibration. Posterior histograms for ρ, τ, α, δ, σ, β, θ, σ1, σ2, and σ3.]

[Figure 5.5: Convergence of Posteriors, Extreme Calibration. Parameter draws plotted against the number of Metropolis-Hastings iterations.]

[Figure 5.6: Posterior Distribution, Real Data. Posterior histograms for ρ, τ, α, δ, σ, β, θ, σ1, σ2, and σ3.]

[Figure 6.1: Likelihood Function. Transversal cuts at α, β, ρ, and σ_ε; exact likelihood versus particle-filter evaluations with 100, 1,000, and 10,000 particles.]

[Figure 6.2: C.D.F., Benchmark Calibration. Panels for 10,000 to 60,000 particles.]

[Figure 6.3: C.D.F., Extreme Calibration. Panels for 10,000 to 60,000 particles.]

[Figure 6.4: C.D.F., Real Data. Panels for 10,000 to 60,000 particles.]

Posterior Distributions, Benchmark Calibration

Parameter   Mean        s.d.
θ           0.357       6.72×10⁻⁵
ρ           0.950       3.40×10⁻⁴
τ           2.000       6.78×10⁻⁴
α           0.400       8.60×10⁻⁵
δ           0.020       1.34×10⁻⁵
β           0.989       1.54×10⁻⁵
σ_ε         0.007       9.29×10⁻⁶
σ1          1.58×10⁻⁴   5.75×10⁻⁸
σ2          1.12×10⁻³   6.44×10⁻⁷
σ3          8.64×10⁻⁴   6.49×10⁻⁷

89

Maximum Likelihood Estimates, Benchmark Calibration

Parameter   MLE         s.d.
θ           0.357       8.19×10⁻⁶
ρ           0.950       0.001
τ           2.000       0.020
α           0.400       2.02×10⁻⁶
δ           0.020       2.07×10⁻⁵
β           0.990       1.00×10⁻⁶
σ_ε         0.007       0.004
σ1          1.58×10⁻⁴   0.007
σ2          1.12×10⁻³   0.007
σ3          8.63×10⁻⁴   0.005

90

Posterior Distributions, Extreme Calibration

Parameter   Mean        s.d.
θ           0.357       7.19×10⁻⁴
ρ           0.950       1.88×10⁻⁴
τ           50.00       7.12×10⁻³
α           0.400       4.80×10⁻⁵
δ           0.020       3.52×10⁻⁶
β           0.989       8.69×10⁻⁶
σ_ε         0.035       4.47×10⁻⁶
σ1          1.58×10⁻⁴   1.87×10⁻⁸
σ2          1.12×10⁻³   2.14×10⁻⁷
σ3          8.65×10⁻⁴   2.33×10⁻⁷

91

Maximum Likelihood Estimates, Extreme Calibration

Parameter   MLE         s.d.
θ           0.357       2.42×10⁻⁶
ρ           0.950       6.12×10⁻³
τ           50.000      0.022
α           0.400       3.62×10⁻⁷
δ           0.019       7.43×10⁻⁶
β           0.990       1.00×10⁻⁵
σ_ε         0.035       0.015
σ1          1.58×10⁻⁴   0.017
σ2          1.12×10⁻³   0.014
σ3          8.66×10⁻⁴   0.023

92

Convergence on Number of Particles

Convergence, Real Data

N        Mean       s.d.
10,000   1014.558   0.3296
20,000   1014.600   0.2595
30,000   1014.653   0.1829
40,000   1014.666   0.1604
50,000   1014.688   0.1465
60,000   1014.664   0.1347

93

Posterior Distributions, Real Data

Parameter   Mean    s.d.
θ           0.323   7.976×10⁻⁴
ρ           0.969   0.008
τ           1.825   0.011
α           0.388   0.001
δ           0.006   3.557×10⁻⁵
β           0.997   9.221×10⁻⁵
σ_ε         0.023   2.702×10⁻⁴
σ1          0.039   5.346×10⁻⁴
σ2          0.018   4.723×10⁻⁴
σ3          0.034   6.300×10⁻⁴

94

Maximum Likelihood Estimates, Real Data

Parameter   MLE     s.d.
θ           0.390   0.044
ρ           0.987   0.708
τ           1.781   1.398
α           0.324   0.019
δ           0.006   0.160
β           0.997   8.67×10⁻³
σ_ε         0.023   0.224
σ1          0.038   0.060
σ2          0.016   0.061
σ3          0.035   0.076

95

Log Marginal Likelihood Difference: Nonlinear − Linear

p     Benchmark Calibration   Extreme Calibration   Real Data
0.1   73.631                  117.608               93.65
0.5   73.627                  117.592               93.55
0.9   73.603                  117.564               93.55

96

Nonlinear versus Linear Moments, Real Data

Series   Real Data      Nonlinear (SMC filter)   Linear (Kalman filter)
         Mean   s.d.    Mean   s.d.              Mean   s.d.
output   1.95   0.073   1.91   0.129             1.61   0.068
hours    0.36   0.014   0.36   0.023             0.34   0.004
inv      0.42   0.066   0.44   0.073             0.28   0.044

97

A “Future” Application: Good Luck or Good Policy?

• The U.S. economy has become less volatile over the last 20 years (Stock and Watson, 2002).

• Why?

1. Good luck: Sims (1999), Bernanke and Mihov (1998a and 1998b) and Stock and Watson (2002).

2. Good policy: Clarida, Gertler and Galí (2000), Cogley and Sargent (2001 and 2003), De Long (1997) and Romer and Romer (2002).

3. Long-run trend: Blanchard and Simon (2001).

98


How Has the Literature Addressed this Question?

• So far: mostly with reduced form models (usually VARs).

• But:

1. Results difficult to interpret.

2. How to run counterfactuals?

3. Welfare analysis.

99

Why Not a Dynamic Equilibrium Model?

• New generation of equilibrium models: Christiano, Eichenbaum and Evans (2003) and Smets and Wouters (2003).

• Linear and normal.

• But we can do it!!!

100

Environment

• Discrete time $t = 0, 1, \ldots$

• Stochastic process $s_t \in S$ with history $s^t = (s_0, \ldots, s_t)$ and probability $\mu\left(s^t\right)$.

101

The Final Good Producer

• Perfectly competitive final good producer that solves:

$$\max_{y_i(s^t)} \left( \int y_i\left(s^t\right)^{\theta} di \right)^{\frac{1}{\theta}} - \int p_i\left(s^t\right) y_i\left(s^t\right) di$$

• Demand function for each input of the form:

$$y_i\left(s^t\right) = \left( \frac{p_i\left(s^t\right)}{p\left(s^t\right)} \right)^{\frac{1}{\theta - 1}} y\left(s^t\right),$$

with price aggregator:

$$p\left(s^t\right) = \left( \int p_i\left(s^t\right)^{\frac{\theta}{\theta - 1}} di \right)^{\frac{\theta - 1}{\theta}}.$$

102

The Intermediate Good Producer

• Continuum of intermediate good producers, each one behaving as a monopolistic competitor.

• The producer of good $i$ has access to the technology:

$$y_i\left(s^t\right) = \max\left\{ e^{z\left(s^t\right)} k_i^{\alpha}\left(s^{t-1}\right) l_i^{1-\alpha}\left(s^t\right) - \phi,\ 0 \right\}.$$

• Productivity $z\left(s^t\right) = \rho z\left(s^{t-1}\right) + \varepsilon_z\left(s^t\right)$.

• Calvo pricing with indexing. Probability of changing prices (before observing current period shocks): $1 - \zeta$.

103

Consumer's Problem

$$\max E_0 \sum_{t=0}^{\infty} \beta^t \left[ \varepsilon_c\left(s^t\right) \frac{\left(c\left(s^t\right) - d\, c\left(s^{t-1}\right)\right)^{\sigma_c}}{\sigma_c} - \varepsilon_l\left(s^t\right) \frac{l\left(s^t\right)^{\sigma_l}}{\sigma_l} + \varepsilon_m\left(s^t\right) \frac{m\left(s^t\right)^{\sigma_m}}{\sigma_m} \right]$$

subject to:

$$p\left(s^t\right)\left(c\left(s^t\right) + x\left(s^t\right)\right) + M\left(s^t\right) + \int_{s_{t+1}} q\left(s_{t+1} \mid s^t\right) B\left(s^{t+1}\right) ds_{t+1} = p\left(s^t\right)\left(w\left(s^t\right) l\left(s^t\right) + r\left(s^t\right) k\left(s^{t-1}\right)\right) + M\left(s^{t-1}\right) + B\left(s^t\right) + \Pi\left(s^t\right) + T\left(s^t\right)$$

$$B\left(s^{t+1}\right) \geq \underline{B}$$

$$k\left(s^t\right) = (1-\delta)\, k\left(s^{t-1}\right) - \phi\!\left(\frac{x\left(s^t\right)}{k\left(s^{t-1}\right)}\right) + x\left(s^t\right).$$

104

Government Policy

• Monetary policy: a Taylor rule

$$i\left(s^t\right) = r_g \pi_g\left(s^t\right) + a\left(s^t\right)\left(\pi\left(s^t\right) - \pi_g\left(s^t\right)\right) + b\left(s^t\right)\left(y\left(s^t\right) - y_g\left(s^t\right)\right) + \varepsilon_i\left(s^t\right)$$

$$\pi_g\left(s^t\right) = \pi_g\left(s^{t-1}\right) + \varepsilon_\pi\left(s^t\right)$$

$$a\left(s^t\right) = a\left(s^{t-1}\right) + \varepsilon_a\left(s^t\right)$$

$$b\left(s^t\right) = b\left(s^{t-1}\right) + \varepsilon_b\left(s^t\right)$$

• Fiscal policy.

105

Stochastic Volatility I

• We can stack all the shocks in one vector:

$$\varepsilon\left(s^t\right) = \left(\varepsilon_z\left(s^t\right), \varepsilon_c\left(s^t\right), \varepsilon_l\left(s^t\right), \varepsilon_m\left(s^t\right), \varepsilon_i\left(s^t\right), \varepsilon_\pi\left(s^t\right), \varepsilon_a\left(s^t\right), \varepsilon_b\left(s^t\right)\right)'$$

• Stochastic volatility:

$$\varepsilon\left(s^t\right) = R\left(s^t\right)^{0.5} \vartheta\left(s^t\right).$$

• The matrix $R\left(s^t\right)$ can be decomposed as:

$$R\left(s^t\right) = G\left(s^t\right)^{-1} H\left(s^t\right) G\left(s^t\right).$$

106

Stochastic Volatility II

• $H\left(s^t\right)$ (instantaneous shock variances) is diagonal with nonzero elements $h_i\left(s^t\right)$ that evolve as:

$$\log h_i\left(s^t\right) = \log h_i\left(s^{t-1}\right) + \varsigma_i \eta_i\left(s^t\right).$$

• $G\left(s^t\right)$ (loading matrix) is lower triangular, with unit entries in the diagonal and entries $\gamma_{ij}\left(s^t\right)$ that evolve as:

$$\gamma_{ij}\left(s^t\right) = \gamma_{ij}\left(s^{t-1}\right) + \omega_{ij} \nu_{ij}\left(s^t\right).$$

107
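The two random walks can be simulated in a few lines. A sketch, with one assumption flagged: the shock vector is built as $\varepsilon = G^{-1} H^{1/2} \vartheta$, the usual triangular-factorization convention, rather than literally taking the 0.5 power of $R = G^{-1} H G$ from the previous slide.

```python
import numpy as np

def sv_step(log_h, gamma, varsigma, omega, rng):
    """Advance the volatility states one period and draw the shock vector."""
    n = log_h.size
    log_h = log_h + varsigma * rng.standard_normal(n)         # log h_i walk
    gamma = gamma + omega * rng.standard_normal(gamma.shape)  # gamma_ij walk
    G = np.eye(n)
    rows, cols = np.tril_indices(n, -1)
    G[rows, cols] = gamma[rows, cols]     # unit lower-triangular loadings
    theta = rng.standard_normal(n)
    # Assumed convention: eps = G^{-1} H^{1/2} theta.
    eps = np.linalg.solve(G, np.exp(0.5 * log_h) * theta)
    return log_h, gamma, eps
```

With eight structural shocks, `log_h` has 8 elements and the strict lower triangle of `gamma` has 28, matching the state count on the next slide.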

Where Are We Now?

• Solving the model: a problem with 45 state variables: physical capital, the aggregate price level, 7 shocks, the 8 elements of the matrix $H\left(s^t\right)$, and the 28 elements of the matrix $G\left(s^t\right)$.

• Perturbation.

• We are making good progress.

108