Bayesian Estimation of DSGE Models
Stéphane Adjemian
Université du Maine, GAINS & CEPREMAP
stephane.adjemian@univ-lemans.fr
http://www.dynare.org/stepan
June 28, 2011
Bayesian paradigm (motivations)
• Bayesian estimation of DSGE models with Dynare.
1. Data are not informative enough...
2. DSGE models are misspecified.
3. Model comparison.
• Prior elicitation.
• Efficiency issues.
Bayesian paradigm (basics)
• A model defines a parametrized joint probability distribution over a sample of variables:
$$p(\mathcal{Y}^\star_T \mid \theta) \qquad (1)$$
⇒ the likelihood.
• We assume that our prior information about the parameters can be summarized by a joint probability density function. Let the prior density be $p_0(\theta)$.
• The posterior distribution is given by Bayes' theorem:
$$p_1(\theta \mid \mathcal{Y}^\star_T) = \frac{p_0(\theta)\, p(\mathcal{Y}^\star_T \mid \theta)}{p(\mathcal{Y}^\star_T)} \qquad (2)$$
Bayesian paradigm (basics, contd)
• The denominator is defined by
$$p(\mathcal{Y}^\star_T) = \int_\Theta p_0(\theta)\, p(\mathcal{Y}^\star_T \mid \theta)\,\mathrm{d}\theta \qquad (3)$$
⇒ the marginal density of the sample.
⇒ A weighted mean of the sample conditional densities over all possible values of the parameters.
• The posterior density is proportional to the product of the prior density and the density of the sample:
$$p_1(\theta \mid \mathcal{Y}^\star_T) \propto p_0(\theta)\, p(\mathcal{Y}^\star_T \mid \theta)$$
⇒ That’s all we need for any inference about θ!
• The prior density deforms the shape of the likelihood!
A simple example (I)
• Data Generating Process:
$$y_t = \mu + \varepsilon_t$$
where $\varepsilon_t \sim_{iid} \mathcal{N}(0,1)$ is a Gaussian white noise.
• Let $\mathcal{Y}_T \equiv (y_1, \dots, y_T)$. The likelihood is given by:
$$p(\mathcal{Y}_T \mid \mu) = (2\pi)^{-\frac{T}{2}}\, e^{-\frac{1}{2}\sum_{t=1}^T (y_t - \mu)^2}$$
• And the ML estimator of $\mu$ is:
$$\widehat{\mu}_{ML,T} = \frac{1}{T}\sum_{t=1}^T y_t \equiv \overline{y}$$
A simple example (II)
• Note that the variance of this estimator is a simple function of the sample size:
$$\mathbb{V}[\widehat{\mu}_{ML,T}] = \frac{1}{T}$$
• Noting that:
$$\sum_{t=1}^T (y_t - \mu)^2 = \nu s^2 + T(\mu - \widehat{\mu})^2$$
with $\nu = T-1$ and $s^2 = (T-1)^{-1}\sum_{t=1}^T (y_t - \widehat{\mu})^2$.
• The likelihood can be equivalently written as:
$$p(\mathcal{Y}_T \mid \mu) = (2\pi)^{-\frac{T}{2}}\, e^{-\frac{1}{2}\left(\nu s^2 + T(\mu - \widehat{\mu})^2\right)}$$
The two statistics $s^2$ and $\widehat{\mu}$ summarize the sample information.
A simple example (II, bis)
$$\begin{aligned}
\sum_{t=1}^T (y_t - \mu)^2 &= \sum_{t=1}^T \left([y_t - \widehat{\mu}] - [\mu - \widehat{\mu}]\right)^2\\
&= \sum_{t=1}^T (y_t - \widehat{\mu})^2 + \sum_{t=1}^T (\mu - \widehat{\mu})^2 - 2\sum_{t=1}^T (y_t - \widehat{\mu})(\mu - \widehat{\mu})\\
&= \nu s^2 + T(\mu - \widehat{\mu})^2 - 2\left(\sum_{t=1}^T y_t - T\widehat{\mu}\right)(\mu - \widehat{\mu})\\
&= \nu s^2 + T(\mu - \widehat{\mu})^2
\end{aligned}$$
The last term cancels out by definition of the sample mean.
A simple example (III)
• Let our prior be a Gaussian distribution with expectation $\mu_0$ and variance $\sigma^2_\mu$.
• The posterior density is defined, up to a constant, by:
$$p(\mu \mid \mathcal{Y}_T) \propto (2\pi\sigma^2_\mu)^{-\frac{1}{2}}\, e^{-\frac{1}{2}\frac{(\mu - \mu_0)^2}{\sigma^2_\mu}} \times (2\pi)^{-\frac{T}{2}}\, e^{-\frac{1}{2}\left(\nu s^2 + T(\mu - \widehat{\mu})^2\right)}$$
where the missing constant (the denominator) is the marginal density (it does not depend on $\mu$).
• We also have:
$$p(\mu \mid \mathcal{Y}_T) \propto \exp\left\{-\frac{1}{2}\left(T(\mu - \widehat{\mu})^2 + \frac{1}{\sigma^2_\mu}(\mu - \mu_0)^2\right)\right\}$$
A simple example (IV)
$$\begin{aligned}
A(\mu) &= T(\mu - \widehat{\mu})^2 + \frac{1}{\sigma^2_\mu}(\mu - \mu_0)^2\\
&= T\left(\mu^2 + \widehat{\mu}^2 - 2\mu\widehat{\mu}\right) + \frac{1}{\sigma^2_\mu}\left(\mu^2 + \mu_0^2 - 2\mu\mu_0\right)\\
&= \left(T + \frac{1}{\sigma^2_\mu}\right)\mu^2 - 2\mu\left(T\widehat{\mu} + \frac{1}{\sigma^2_\mu}\mu_0\right) + \left(T\widehat{\mu}^2 + \frac{1}{\sigma^2_\mu}\mu_0^2\right)\\
&= \left(T + \frac{1}{\sigma^2_\mu}\right)\left[\mu^2 - 2\mu\,\frac{T\widehat{\mu} + \frac{1}{\sigma^2_\mu}\mu_0}{T + \frac{1}{\sigma^2_\mu}}\right] + \left(T\widehat{\mu}^2 + \frac{1}{\sigma^2_\mu}\mu_0^2\right)\\
&= \left(T + \frac{1}{\sigma^2_\mu}\right)\left[\mu - \frac{T\widehat{\mu} + \frac{1}{\sigma^2_\mu}\mu_0}{T + \frac{1}{\sigma^2_\mu}}\right]^2 + \left(T\widehat{\mu}^2 + \frac{1}{\sigma^2_\mu}\mu_0^2\right) - \frac{\left(T\widehat{\mu} + \frac{1}{\sigma^2_\mu}\mu_0\right)^2}{T + \frac{1}{\sigma^2_\mu}}
\end{aligned}$$
A simple example (V)
• Finally we have:
$$p(\mu \mid \mathcal{Y}_T) \propto \exp\left\{-\frac{1}{2}\left(T + \frac{1}{\sigma^2_\mu}\right)\left[\mu - \frac{T\widehat{\mu} + \frac{1}{\sigma^2_\mu}\mu_0}{T + \frac{1}{\sigma^2_\mu}}\right]^2\right\}$$
• Up to a constant, this is a Gaussian density with (posterior) expectation:
$$\mathbb{E}[\mu] = \frac{T\widehat{\mu} + \frac{1}{\sigma^2_\mu}\mu_0}{T + \frac{1}{\sigma^2_\mu}}$$
and (posterior) variance:
$$\mathbb{V}[\mu] = \frac{1}{T + \frac{1}{\sigma^2_\mu}}$$
A simple example (VI, The bridge)
• The posterior mean is a convex combination of the prior mean and the ML estimate (see the sketch below).
– If $\sigma^2_\mu \to \infty$ (no prior information), then $\mathbb{E}[\mu] \to \widehat{\mu}$ (ML).
– If $\sigma^2_\mu \to 0$ (calibration), then $\mathbb{E}[\mu] \to \mu_0$.
• If $\sigma^2_\mu < \infty$, then the variance of the ML estimator is greater than the posterior variance.
• Not so simple if the model is nonlinear in the estimated parameters...
– Asymptotic (Gaussian) approximation.
– Simulation-based approach (MCMC, Metropolis-Hastings, ...).
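To make the bridge concrete, here is a minimal NumPy sketch of the conjugate formulas above; the prior hyperparameters mu0 and sig2 are made up for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Simulate the DGP: y_t = mu + eps_t with eps_t ~ N(0, 1)
mu_true, T = 1.0, 100
y = mu_true + rng.standard_normal(T)

# Made-up prior hyperparameters: mu ~ N(mu0, sig2)
mu0, sig2 = 0.0, 0.5

mu_ml = y.mean()                                  # ML estimate (sample mean)
post_var = 1.0 / (T + 1.0 / sig2)                 # V[mu | Y_T]
post_mean = post_var * (T * mu_ml + mu0 / sig2)   # E[mu | Y_T]

print(f"ML estimate {mu_ml:.3f}, variance {1.0 / T:.4f}")
print(f"posterior mean {post_mean:.3f}, variance {post_var:.4f}")
# sig2 -> infinity reproduces the ML estimate; sig2 -> 0 reproduces mu0.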
Bayesian paradigm (II, Model Comparison)
• Comparison of marginal densities of the (same) data across models.
• $p(\mathcal{Y}^\star_T \mid \mathcal{I})$ measures the fit of model $\mathcal{I}$.
• Suppose we have a prior distribution over models $A$, $B$, ...: $p(A)$, $p(B)$, ...
• Again, using Bayes' theorem we can compute the posterior distribution over models:
$$p(\mathcal{I} \mid \mathcal{Y}^\star_T) = \frac{p(\mathcal{I})\, p(\mathcal{Y}^\star_T \mid \mathcal{I})}{\sum_{\mathcal{I}} p(\mathcal{I})\, p(\mathcal{Y}^\star_T \mid \mathcal{I})}$$
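In practice the marginal densities are computed in logs; here is a small sketch of the posterior model probabilities (the log marginal densities below are made-up numbers, and the model prior is uniform):

import numpy as np

# Made-up log marginal densities ln p(Y*|I) for two models, uniform prior
log_md = {"A": -1339.0, "B": -1341.5}
log_prior = {model: np.log(0.5) for model in log_md}

# Bayes theorem in logs; subtract the max to avoid numerical underflow
log_post = {m: log_prior[m] + log_md[m] for m in log_md}
shift = max(log_post.values())
weights = {m: np.exp(v - shift) for m, v in log_post.items()}
total = sum(weights.values())
posterior = {m: weights[m] / total for m in weights}
print(posterior)  # roughly {'A': 0.92, 'B': 0.08}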
Estimation of DSGE models (I, Reduced form)
• Compute the steady state of the model (a system of nonlinear recurrence equations).
• Compute a linear approximation of the model.
• Solve the linearized model:
$$\underbrace{y_t - \bar{y}(\theta)}_{n\times 1} = \underbrace{T(\theta)}_{n\times n}\,\underbrace{\left(y_{t-1} - \bar{y}(\theta)\right)}_{n\times 1} + \underbrace{R(\theta)}_{n\times q}\,\underbrace{\varepsilon_t}_{q\times 1} \qquad (4)$$
where $n$ is the number of endogenous variables and $q$ is the number of structural innovations.
– The reduced-form model is nonlinear w.r.t. the deep parameters.
– We do not observe all the endogenous variables.
Estimation of DSGE models (II, SSM)
• Let $y^\star_t$ be a subset of $y_t$ gathering $p$ observed variables.
• To bring the model to the data, we use a state-space representation:
$$y^\star_t = Z\left(\hat{y}_t + \bar{y}(\theta)\right) + \eta_t \qquad (5a)$$
$$\hat{y}_t = T(\theta)\hat{y}_{t-1} + R(\theta)\varepsilon_t \qquad (5b)$$
where $\hat{y}_t = y_t - \bar{y}(\theta)$.
• Equation (5b) is the reduced form of the DSGE model.
⇒ state equation
• Equation (5a) selects a subset of the endogenous variables; $Z$ is a $p \times n$ matrix filled with zeros and ones.
⇒ measurement equation
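As an illustration, here is a short simulation sketch of the state-space form (5); the matrices T(θ) and R(θ), the steady state, and the selection of observables are all hypothetical, and the measurement errors η are omitted for simplicity:

import numpy as np

rng = np.random.default_rng(1)

n, q, p = 3, 2, 2                      # endogenous, innovations, observables
T_mat = np.array([[0.9, 0.0, 0.1],     # hypothetical T(theta) (stable)
                  [0.2, 0.5, 0.0],
                  [0.0, 0.1, 0.8]])
R_mat = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.5, 0.5]])         # hypothetical R(theta)
ybar = np.array([1.0, 2.0, 0.5])       # hypothetical steady state
Z = np.zeros((p, n))
Z[0, 0] = Z[1, 1] = 1.0                # observe the first two variables

yhat = np.zeros(n)                     # deviations from the steady state
for t in range(5):
    yhat = T_mat @ yhat + R_mat @ rng.standard_normal(q)  # equation (5b)
    ystar = Z @ (yhat + ybar)                             # equation (5a)
    print(t, ystar)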
Estimation of DSGE models (III, Likelihood) – a –
• Let $\mathcal{Y}^\star_T = \{y^\star_1, y^\star_2, \dots, y^\star_T\}$ be the sample.
• Let $\psi$ be the vector of parameters to be estimated ($\theta$ and the covariance matrices of $\varepsilon$ and $\eta$).
• The likelihood, that is, the density of $\mathcal{Y}^\star_T$ conditional on the parameters, is given by:
$$\mathcal{L}(\psi; \mathcal{Y}^\star_T) = p(\mathcal{Y}^\star_T \mid \psi) = p(y^\star_0 \mid \psi)\prod_{t=1}^T p\left(y^\star_t \mid \mathcal{Y}^\star_{t-1}, \psi\right) \qquad (6)$$
• To evaluate the likelihood we need to specify the marginal density $p(y^\star_0 \mid \psi)$ (or $p(y_0 \mid \psi)$) and the conditional density $p\left(y^\star_t \mid \mathcal{Y}^\star_{t-1}, \psi\right)$.
Estimation of DSGE models (III, Likelihood) – b –
• The state-space model (5) describes the evolution of the distribution of the endogenous variables.
• The distribution of the initial condition ($y_0$) is set equal to the ergodic distribution of the stochastic difference equation (so that the distribution of $y_t$ is time invariant).
• Because we consider a linear(ized) reduced-form model and the disturbances are assumed to be Gaussian (say $\varepsilon \sim \mathcal{N}(0, \Sigma)$), the initial (ergodic) distribution is also Gaussian (see the sketch below):
$$y_0 \sim \mathcal{N}\left(\mathbb{E}_\infty[y_t], \mathbb{V}_\infty[y_t]\right)$$
• Unit roots ⇒ diffuse Kalman filter.
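Under stationarity, $\mathbb{E}_\infty[\hat{y}_t] = 0$ and $\mathbb{V}_\infty[\hat{y}_t]$ solves the discrete Lyapunov equation $V = T(\theta)VT(\theta)' + R(\theta)\Sigma R(\theta)'$. A sketch with SciPy, reusing the hypothetical matrices from the previous example:

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

T_mat = np.array([[0.9, 0.0, 0.1],     # hypothetical T(theta) (stable)
                  [0.2, 0.5, 0.0],
                  [0.0, 0.1, 0.8]])
R_mat = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.5, 0.5]])         # hypothetical R(theta)
Sigma = np.eye(2)                      # V[eps]

# solve_discrete_lyapunov returns V such that V = T V T' + R Sigma R'
V_inf = solve_discrete_lyapunov(T_mat, R_mat @ Sigma @ R_mat.T)
print(V_inf)                           # ergodic covariance of the states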
Estimation (III, Likelihood) – c –
• Evaluating the density of $y^\star_t \mid \mathcal{Y}^\star_{t-1}$ is not trivial, because $y^\star_t$ also depends on unobserved endogenous variables.
• The following identity can be used:
$$p\left(y^\star_t \mid \mathcal{Y}^\star_{t-1}, \psi\right) = \int_\Lambda p\left(y^\star_t \mid y_t, \psi\right) p\left(y_t \mid \mathcal{Y}^\star_{t-1}, \psi\right)\mathrm{d}y_t \qquad (7)$$
The density of $y^\star_t \mid \mathcal{Y}^\star_{t-1}$ is the mean of the density of $y^\star_t \mid y_t$ weighted by the density of $y_t \mid \mathcal{Y}^\star_{t-1}$.
• The first conditional density is given by the measurement equation (5a).
• A Kalman filter is used to evaluate the density of the latent variables ($y_t$) conditional on the sample up to time $t-1$ ($\mathcal{Y}^\star_{t-1}$) [⇒ predictive density].
Estimation (III, Likelihood) – d –
• The Kalman filter can be seen as a Bayesian recursive estimation routine:
$$p\left(y_t \mid \mathcal{Y}^\star_{t-1}, \psi\right) = \int_\Lambda p\left(y_t \mid y_{t-1}, \psi\right) p\left(y_{t-1} \mid \mathcal{Y}^\star_{t-1}, \psi\right)\mathrm{d}y_{t-1} \qquad (8a)$$
$$p\left(y_t \mid \mathcal{Y}^\star_t, \psi\right) = \frac{p\left(y^\star_t \mid y_t, \psi\right) p\left(y_t \mid \mathcal{Y}^\star_{t-1}, \psi\right)}{\int_\Lambda p\left(y^\star_t \mid y_t, \psi\right) p\left(y_t \mid \mathcal{Y}^\star_{t-1}, \psi\right)\mathrm{d}y_t} \qquad (8b)$$
• Equation (8a) says that the predictive density of the latent variables is the mean of the density of $y_t \mid y_{t-1}$, given by the state equation (5b), weighted by the density of $y_{t-1}$ conditional on $\mathcal{Y}^\star_{t-1}$ (given by (8b)).
• The update equation (8b) is an application of Bayes' theorem → how to update our knowledge about the latent variables when new information (data) becomes available.
Estimation (III, Likelihood) – e –
$$p\left(y_t \mid \mathcal{Y}^\star_t, \psi\right) = \frac{p\left(y^\star_t \mid y_t, \psi\right) p\left(y_t \mid \mathcal{Y}^\star_{t-1}, \psi\right)}{\int_\Lambda p\left(y^\star_t \mid y_t, \psi\right) p\left(y_t \mid \mathcal{Y}^\star_{t-1}, \psi\right)\mathrm{d}y_t}$$
• $p\left(y_t \mid \mathcal{Y}^\star_{t-1}, \psi\right)$ is the a priori density of the latent variables at time $t$.
• $p\left(y^\star_t \mid y_t, \psi\right)$ is the density of the observation at time $t$ knowing the state and the parameters (this density is obtained from the measurement equation (5a)) ⇒ the likelihood associated with $y^\star_t$.
• $\int_\Lambda p\left(y^\star_t \mid y_t, \psi\right) p\left(y_t \mid \mathcal{Y}^\star_{t-1}, \psi\right)\mathrm{d}y_t$ is the marginal density of the new information.
Estimation (III, Likelihood) – f –
• The linear-Gaussian Kalman filter recursion is given by (see the sketch below):
$$\begin{aligned}
v_t &= y^\star_t - Z\left(\hat{y}_t + \bar{y}(\theta)\right)\\
F_t &= Z P_t Z' + \mathbb{V}[\eta]\\
K_t &= T(\theta) P_t Z' F_t^{-1}\\
\hat{y}_{t+1} &= T(\theta)\hat{y}_t + K_t v_t\\
P_{t+1} &= T(\theta) P_t \left(T(\theta) - K_t Z\right)' + R(\theta)\Sigma R(\theta)'
\end{aligned}$$
for $t = 1, \dots, T$, with $\hat{y}_0$ and $P_0$ given.
• Finally, the log-likelihood is:
$$\ln\mathcal{L}\left(\psi \mid \mathcal{Y}^\star_T\right) = -\frac{Tk}{2}\ln(2\pi) - \frac{1}{2}\sum_{t=1}^T \ln|F_t| - \frac{1}{2}\sum_{t=1}^T v_t' F_t^{-1} v_t$$
where $k$ is the number of observed variables.
• References: Harvey, Hamilton.
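A compact NumPy sketch of this recursion (the inputs follow the state-space form (5), with H standing for V[η]; for clarity the code inverts F_t directly, whereas a production implementation would use a factorization):

import numpy as np

def kalman_loglik(ystar, Z, T_mat, R_mat, Sigma, H, ybar, yhat0, P0):
    """Prediction-error-decomposition log-likelihood of the model (5)."""
    k = Z.shape[0]                            # number of observed variables
    yhat, P, loglik = yhat0.copy(), P0.copy(), 0.0
    for y_t in ystar:                         # loop over the sample
        v = y_t - Z @ (yhat + ybar)           # prediction error
        F = Z @ P @ Z.T + H                   # prediction-error covariance
        iF = np.linalg.inv(F)
        K = T_mat @ P @ Z.T @ iF              # Kalman gain
        loglik -= 0.5 * (k * np.log(2 * np.pi)
                         + np.log(np.linalg.det(F))
                         + v @ iF @ v)
        yhat = T_mat @ yhat + K @ v           # state prediction
        P = T_mat @ P @ (T_mat - K @ Z).T + R_mat @ Sigma @ R_mat.T
    return loglik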
Simulations for exact posterior analysis
• Noting that:
$$\mathbb{E}[\varphi(\psi)] = \int_\Psi \varphi(\psi)\, p_1(\psi \mid \mathcal{Y}^\star_T)\,\mathrm{d}\psi$$
we can use the empirical mean of $\left(\varphi(\psi^{(1)}), \varphi(\psi^{(2)}), \dots, \varphi(\psi^{(n)})\right)$, where the $\psi^{(i)}$ are draws from the posterior distribution, to evaluate the expectation of $\varphi(\psi)$. The approximation error goes to zero as $n \to \infty$.
• We need to simulate draws from the posterior distribution ⇒ Metropolis-Hastings.
• We build a stochastic recurrence whose limiting distribution is the posterior distribution.
Simulations (Metropolis-Hastings) – a –
1. Choose a starting point $\Psi^0$ & run a loop over 2-3-4.
2. Draw a proposal $\Psi^\star$ from a jumping distribution:
$$J(\Psi^\star \mid \Psi^{t-1}) = \mathcal{N}\left(\Psi^{t-1}, c \times \Omega_m\right)$$
3. Compute the acceptance ratio:
$$r = \frac{p_1(\Psi^\star \mid \mathcal{Y}^\star_T)}{p_1(\Psi^{t-1} \mid \mathcal{Y}^\star_T)} = \frac{\mathcal{K}(\Psi^\star \mid \mathcal{Y}^\star_T)}{\mathcal{K}(\Psi^{t-1} \mid \mathcal{Y}^\star_T)}$$
4. Finally:
$$\Psi^t = \begin{cases}\Psi^\star & \text{with probability } \min(r, 1)\\ \Psi^{t-1} & \text{otherwise.}\end{cases}$$
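A generic random-walk Metropolis-Hastings sketch of steps 1-4; the log posterior kernel below is a made-up bivariate Gaussian standing in for ln K(Ψ|Y⋆T):

import numpy as np

rng = np.random.default_rng(0)

def log_kernel(psi):
    # Stand-in for ln K(psi|Y*): an unnormalized bivariate Gaussian
    return -0.5 * psi @ psi

def random_walk_mh(log_kernel, psi0, Omega, c=1.0, n_draws=10_000):
    dim = len(psi0)
    L = np.linalg.cholesky(c * Omega)      # scale of the jumping density
    draws = np.empty((n_draws, dim))
    psi, lk, accepted = np.asarray(psi0, float), log_kernel(psi0), 0
    for i in range(n_draws):
        proposal = psi + L @ rng.standard_normal(dim)  # N(psi, c*Omega)
        lk_prop = log_kernel(proposal)
        if np.log(rng.uniform()) < lk_prop - lk:       # accept w.p. min(r,1)
            psi, lk = proposal, lk_prop
            accepted += 1
        draws[i] = psi                     # keep the current state either way
    return draws, accepted / n_draws

draws, rate = random_walk_mh(log_kernel, np.zeros(2), np.eye(2), c=0.5)
print(rate, draws.mean(axis=0))            # acceptance rate, posterior means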
Simulations (Metropolis-Hastings) – b –
[Figure: the posterior kernel, with the initial draw $\theta_0$ and its kernel value $\mathcal{K}(\theta_0)$.]
Simulations (Metropolis-Hastings) – c –
[Figure: the posterior kernel; the proposal $\theta^\star$ is accepted, so $\theta_1 = \theta^\star$ and $\mathcal{K}(\theta_1) = \mathcal{K}(\theta^\star)$.]
Simulations (Metropolis-Hastings) – d –
[Figure: the posterior kernel; from $\theta_1$ a new proposal $\theta^\star$ is drawn and $\mathcal{K}(\theta^\star)$ is compared with $\mathcal{K}(\theta_1)$.]
Simulations (Metropolis-Hastings) – e –
• How should we choose the scale factor $c$ (the variance of the jumping distribution)?
• The acceptance rate should be strictly positive, but not too high.
• How many draws?
• Convergence has to be assessed...
• Parallel Markov chains → pooled moments have to be close to within-chain moments (see the sketch below).
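A sketch of the pooled-versus-within comparison for a single parameter (a simplified variant of the usual multivariate convergence diagnostics):

import numpy as np

def pooled_to_within(chains):
    """chains: (n_chains, n_draws) draws of one parameter.
    Returns the ratio of the pooled variance to the mean within-chain
    variance; values close to 1 suggest the chains agree."""
    within = chains.var(axis=1, ddof=1).mean()
    pooled = chains.reshape(-1).var(ddof=1)
    return pooled / within

rng = np.random.default_rng(2)
chains = rng.standard_normal((2, 5000))   # two well-mixed toy chains
print(pooled_to_within(chains))           # close to 1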
Dynare syntax (I)
var A B C;
varexo E;
parameters a b c d e f;
model(linear);
A=A(+1)-b/e*(B-C(+1)+A(+1)-A);
C=f*A+(1-d)*C(-1);
....
end;
Dynare syntax (II)
estimated_params;
stderr e, uniform_pdf,,,0,1;
....
a, normal_pdf, 1.5, 0.25;
b, gamma_pdf, 1, 2;
c, gamma_pdf, 2, 2, 1;
....
end;
varobs pie r y rw;
estimation(datafile=dataraba,first_obs=10,
...,mh_jscale=0.5);
Prior Elicitation
• The results may depend heavily on our choice of the prior density or on the parametrization of the model (though not asymptotically).
• How to choose the prior?
– Subjective choice (data-driven or theoretical); example: the Calvo parameter in the Phillips curve.
– Objective choice; example: the (optimized) Minnesota prior for VARs (Phillips, 1996).
• Robustness of the results must be evaluated:
– Try different parametrizations.
– Use more general prior densities.
– Uninformative priors.
Prior Elicitation (parametrization of the model) – a –
• Estimation of the Phillips curve:
$$\pi_t = \beta \mathbb{E}_t[\pi_{t+1}] + \frac{(1-\xi_p)(1-\beta\xi_p)}{\xi_p}\left((\sigma_c + \sigma_l)y_t + \tau_t\right)$$
• $\xi_p$ is the (Calvo) probability (for an intermediate firm) of being able to optimally choose its price at time $t$. With probability $1-\xi_p$ the price is indexed to past inflation and/or steady-state inflation.
• Let $\alpha_p \equiv \frac{1}{1-\xi_p}$ be the expected length of the period during which a firm will not optimally adjust its price.
• Let $\lambda = \frac{(1-\xi_p)(1-\beta\xi_p)}{\xi_p}$ be the slope of the Phillips curve.
• Suppose that $\beta$, $\sigma_c$ and $\sigma_l$ are known.
Prior Elicitation (parametrization of the model) – b –
• The prior may be defined on $\xi_p$, $\alpha_p$, or the slope $\lambda$.
• Say we choose a uniform prior for the Calvo probability:
$$\xi_p \sim \mathcal{U}_{[.51,\,.99]}$$
The prior mean is .75 (so that the implied value for $\alpha_p$ is 4 quarters). This prior is often thought of as a non-informative prior...
• An alternative would be to choose a uniform prior for $\alpha_p$:
$$\alpha_p \sim \mathcal{U}_{\left[\frac{1}{1-.51},\,\frac{1}{1-.99}\right]}$$
• These two priors are very different, as the Monte Carlo sketch below illustrates!
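Pushing uniform draws through the mapping makes the difference visible; a small Monte Carlo sketch:

import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Uniform prior on the Calvo probability: xi_p ~ U[.51, .99]
xi = rng.uniform(0.51, 0.99, n)
alpha_implied = 1.0 / (1.0 - xi)          # implied prior draws of alpha_p

# Uniform prior on the duration: alpha_p ~ U[1/(1-.51), 1/(1-.99)]
alpha = rng.uniform(1 / 0.49, 1 / 0.01, n)
xi_implied = 1.0 - 1.0 / alpha            # implied prior draws of xi_p

print(np.median(alpha_implied))           # about 4 quarters
print(np.median(xi_implied))              # about .98: mass piles up near 1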
Prior Elicitation (parametrization of the model) – c –
[Figure: implied prior densities.]
⇒ The prior on $\alpha_p$ is much more informative than the prior on $\xi_p$.
Prior Elicitation (parametrization of the model) – d –
[Figure: implied prior density of $\xi_p = 1 - \frac{1}{\alpha_p}$ if the prior density of $\alpha_p$ is uniform.]
Prior Elicitation (more general prior densities)
• The robustness of the results may be evaluated by considering a more general prior density.
• For instance, in our simple example we could assume a Student-t prior density for $\mu$ instead of a Gaussian density.
Prior Elicitation (flat prior)
• If a parameter, say $\mu$, can take values between $-\infty$ and $\infty$, the flat prior is a uniform density between $-\infty$ and $\infty$.
• If a parameter, say $\sigma$, can take values between $0$ and $\infty$, the flat prior is a uniform density between $-\infty$ and $\infty$ for $\log\sigma$ (derivation below):
$$p_0(\log\sigma) \propto 1 \Leftrightarrow p_0(\sigma) \propto \frac{1}{\sigma}$$
• Invariance.
• Why is this prior non-informative?... $\int p_0(\mu)\,\mathrm{d}\mu$ is not defined! ⇒ Improper prior.
• Practical implications for DSGE estimation.
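The $1/\sigma$ form follows from the change-of-variables formula applied to the flat prior on $\log\sigma$:
$$p_0(\sigma) = p_0(\log\sigma)\left|\frac{\mathrm{d}\log\sigma}{\mathrm{d}\sigma}\right| \propto \frac{1}{\sigma}$$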
Prior Elicitation (non informative prior)
• An alternative, proposed by Jeffreys, is to use the Fisher information matrix:
$$p_0(\psi) \propto |\mathcal{I}(\psi)|^{\frac{1}{2}}$$
with
$$\mathcal{I}(\psi) = \mathbb{E}\left[\left(\frac{\partial \ln p(\mathcal{Y}^\star_T \mid \psi)}{\partial\psi}\right)\left(\frac{\partial \ln p(\mathcal{Y}^\star_T \mid \psi)}{\partial\psi}\right)'\right]$$
• The idea is to mimic the information in the data...
• Automatic choice of the prior.
• Invariance to any continuous transformation of the parameters.
• Very different results (compared to the flat prior) ⇒ unit root controversy.
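To fix ideas, in the Gaussian-mean example above the Jeffreys prior can be computed in closed form, and it turns out to be flat:
$$\frac{\partial \ln p(\mathcal{Y}_T \mid \mu)}{\partial\mu} = \sum_{t=1}^T (y_t - \mu) \quad\Rightarrow\quad \mathcal{I}(\mu) = \mathbb{E}\left[\Bigl(\sum_{t=1}^T \varepsilon_t\Bigr)^2\right] = T \quad\Rightarrow\quad p_0(\mu) \propto \sqrt{T} \propto 1$$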
Effective prior mass
• Dynare excludes parameters such that the steady state does not exist, or such that the Blanchard-Kahn (BK) conditions are not satisfied.
• The effective prior mass can therefore be less than 1.
• Comparison of marginal densities of the data is not informative if the prior mass is not invariant across models.
• The estimation of the posterior mode is more difficult if the effective prior mass is less than 1.
Efficiency (I)
• If possible, use the option use_dll (needs a compiler, e.g. gcc).
• Do not let Dynare compute the steady state!
• Even if there is no closed-form solution for the steady state, the static model can be concentrated and reduced to a small nonlinear system of equations, which can be solved with a standard Newton algorithm in the steadystate file. This numerical part can be done in a mex routine called by the steadystate file.
• Alternative initialization of the Kalman filter (lik_init=4).
Efficiency (II)
                     Fixed point of the     Fixed point of the
                     state equation         Riccati equation
Execution time       7.28                   3.31
Likelihood           -1339.034              -1338.915
Marginal density     1296.336               1296.203
α                    0.3526                 0.3527
β                    0.9937                 0.9937
γ_a                  0.0039                 0.0039
γ_m                  1.0118                 1.0118
ρ                    0.6769                 0.6738
ψ                    0.6508                 0.6508
δ                    0.0087                 0.0087
σ_εa                 0.0146                 0.0146
σ_εm                 0.0043                 0.0043

Table 1: Estimation of fs2000. The maximum deviation between the two columns is less than 0.45 percent.