LLR_PengWang


Estimating Long Run Risk: A Panel Data Approach

Peng Wang

Hong Kong University of Science and Technology

December, 2010

Abstract

The long-run risk (LLR) model of Bansal and Yaron (2004) provides a building block for the resolution of the equity-premium puzzle, among many other puzzles. However, in their calibration exercise, the magnitude of this long-run risk of consumption is rather small and difficult to detect using only several macro data series. It is hard for traditional small-scale models to estimate this component due to the very small signal-noise ratio. For the LLR argument to work in practice, the representative consumer must be able to make inference about the long-run risk based on the information set available. The original paper of Bansal and Yaron (2004) assumes that the representative agent observes the long-run risk, and then the agent's consumption plan becomes a function of the underlying long-run risk. This paper assumes that the agent is endowed with a rich information set of dividend-price ratios, from which the consumer may learn the long-run risk. Then I propose a panel data approach such that the small risk component of consumption can be accurately estimated, which lends support for the presence of the long-run risk. Based on the factor representation for cross-sectional asset price-dividend ratios implied by the theoretical model, a two-stage factor-augmented maximum likelihood estimator will provide more robust estimation results than existing small-scale estimation methods. In the first stage, the principal components estimators offer precise estimates of the space spanned by the state variables, namely the long-run component of consumption and the stochastic volatility. In the second stage, maximum likelihood estimation is carried out to recover the true state variables and the structural parameters. The method used in this paper has far-reaching implications for a more general class of dynamic general equilibrium models, where the estimation of the state variables is of direct relevance.

Department of Economics, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. Tel: +852-23587630. Fax: +852-23582084. Email: [email protected]


1 Introduction

This paper concerns an application of large dimensional factor models to the analysis of dynamic stochastic economic models. Specifically, I consider the analysis of structural dynamic factor models where both the cross-sectional dimension and the time dimension tend to infinity, which is called the large-$(N, T)$ framework. The economic model I consider is the long-run risk (LLR) model of Bansal and Yaron (2004). The LLR model is of particular interest because it motivates the use of large dimensional factor models in estimating economically important time series that are statistically difficult to measure using aggregate data.

In the LLR model, all economic variables of interest are functions of the long-run risk of consumption growth.¹ This component is very small in magnitude but is crucial in resolving the equity premium puzzle. It is hard for traditional small-scale models to estimate this component due to the very small signal-noise ratio. However, modern factor models provide a way to consistently estimate this long-run component. The asymptotic properties of the estimators are built upon double asymptotics of large $(N, T)$, in which both the cross-section dimension and the time dimension tend to infinity. Such a methodology allows one to make inference about the long-run component itself, in addition to its parametric dynamic process. The estimators in this paper are structural in the sense that the factor structure is implied by the model. Specifically, the logarithm of each cross-sectional asset price-dividend ratio is approximately a linear function of the long-run component, which is the state variable of the economic model. If we treat the price-dividend ratios as measured with i.i.d. errors, principal components estimators of the long-run component offer consistent estimators for the state variables, which further help the estimation of deep parameters.

There is a large body of work that has studied the implications of large factor models for economic analysis. Bernanke et al. (2004) proposed a factor-augmented VAR model to overcome the limited-information problem in conventional small VAR models and were able to resolve the price puzzle. Boivin and Giannoni (2006) studied DSGE models in a data-rich environment, using the implied factors to augment a state-space representation of the DSGE model to overcome the measurement error problem. Other applications of factor models include Ng and Ludvigson (2007, 2009), who studied the risk-return tradeoff and bond risk premia; Ng and Moench (2010), who studied housing market dynamics; and Mumtaz and Surico (2009), who studied the transmission of international shocks. My work differs from all the above in that I consider a factor-augmented maximum likelihood approach for a general likelihood function, not necessarily for a VAR or linear state-space system. A two-step method will be shown to be parsimonious yet effective.

1 This is indeed a common feature of DSGE models where linearized or log-linearized solutions are available.

2 Modeling the consumption growth

Centered around Bansal and Yaron (2004)'s model setup are two key assumptions. First, consumption growth contains a small stationary component, which is highly persistent. They named this component the long-run risk of consumption growth. Second, the economy is characterized by a representative agent, who is endowed with an Epstein-Zin recursive preference. With such a preference, the agent cares about consumption growth risk far into the future, in contrast to the case with additive utility functions. Although the consumption growth process is statistically indistinguishable from an i.i.d. process, the role of the predictable long-run component is amplified by the Epstein-Zin preference, and the feedback is shown in asset prices.

Let $C_t$ be consumption at time $t$, and let $\Delta c_t = \log(C_t) - \log(C_{t-1})$ be a measure of consumption growth, which is modeled as follows:

$$\Delta c_{t+1} = \mu_c + x_t + \sigma_t \eta_{t+1}$$

$$x_{t+1} = \rho x_t + \varphi_e \sigma_t e_{t+1}$$

$$\sigma_{t+1}^2 - \sigma^2 = v(\sigma_t^2 - \sigma^2) + \sigma_w w_{t+1}$$

where $x_t$ is a small but persistent component capturing the long-run risk in consumption growth, and $\sigma_t^2$ is the conditional volatility of consumption growth.

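2.1 Representative agent's problem

The representative agent has the Epstein-Zin recursive preference, characterized by

$$V_t = \left[(1-\delta)\, C_t^{\frac{1-\gamma}{\theta}} + \delta \left(E_t\left[V_{t+1}^{1-\gamma}\right]\right)^{\frac{1}{\theta}}\right]^{\frac{\theta}{1-\gamma}},$$

where $\gamma$ is the coefficient of relative risk aversion, $\theta \equiv \frac{1-\gamma}{1-1/\psi}$, and $\psi$ is the intertemporal elasticity of substitution (IES). The time-$t$ budget constraint is given by

$$W_{t+1} = (W_t - C_t) R_{c,t+1},$$

where $R_{c,t+1} \equiv \frac{P_{t+1} + C_{t+1}}{P_t}$ is the gross return on the consumption claim.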


The agent's problem at time $t$ is to maximize $V_t$ subject to the time-$t$ budget constraint, which results in the following Euler equation:

$$E_t(M_{t+1} R_{j,t+1}) = 1, \quad \text{or} \quad E_t(\exp(m_{t+1} + r_{j,t+1})) = 1,$$

with $m_t \equiv \log(M_t)$ and $r_{j,t} \equiv \log(R_{j,t})$. The log version of the intertemporal marginal rate of substitution (IMRS) is given by

$$m_{t+1} = \theta \log\delta - \frac{\theta}{\psi}\Delta c_{t+1} + (\theta - 1)\, r_{c,t+1}.$$

2.2 Estimating the LLR: from the agent's perspective

The relevant state variables of this economy are $x_t$ and $\sigma_t^2$. For any dividend-paying asset $j$, the log of the price-dividend ratio is a linear function of the state variables,

$$z_{j,t} = A_{0,j} + A_{1,j} x_t + A_{2,j} \sigma_t^2,$$

where $A_{i,j}$ is a function of model parameters (Bansal and Yaron, 2004, and BKY, 2007).

In Bansal and Yaron (2004)'s economy, the long-run component $x_t$ and the stochastic volatility $\sigma_t^2$ are both in the agent's information set. The assumption that the agent observes not only consumption growth but also $x_t$ and $\sigma_t^2$ seems a rather strong one because the latter two series are statistically insignificant. The rational expectations assumption equates the agent's belief about the hidden process to the objective process. I will show that if the agent can fully utilize the information from a large cross section of observed asset prices, $x_t$ and $\sigma_t^2$ can be treated as known. To fix ideas, I start from the agent's perspective and use a simple setup where the parametrized dynamic process of consumption growth is known to the agent, but the historical time series of $x_t$ and $\sigma_t^2$ are unobserved. Thus the agent must draw inference and make predictions of $x_t$ and $\sigma_t^2$ based on observables.

I will compare two scenarios categorized by the agent's information set. The small information scenario assumes that the agent uses only the information from aggregate series, namely consumption growth and the asset price index, to estimate the hidden states. The large information scenario assumes that the agent incorporates information from the disaggregate asset prices when estimating the hidden states. I will show that the latter disaggregate approach is superior to the former aggregate method in the sense that the hidden states can be precisely estimated.


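The consumption process is calibrated as in Bansal and Yaron (2004):

$$\Delta c_{t+1} = \mu_c + x_t + \sigma_t \eta_{t+1}$$

$$x_{t+1} = \rho x_t + \varphi_e \sigma_t e_{t+1}$$

$$\sigma_{t+1}^2 - \sigma^2 = v(\sigma_t^2 - \sigma^2) + \sigma_w w_{t+1}$$

The error terms $\eta_t$, $e_t$, and $w_t$ are i.i.d. $N(0, 1)$. The following table summarizes the parameter values.

| Parameter | $\mu_c$ | $\rho$ | $\varphi_e$ | $\sigma$ | $v$ | $\sigma_w$ |
|---|---|---|---|---|---|---|
| Value | 0.0015 | 0.979 | 0.044 | 0.0078 | 0.987 | $0.23 \times 10^{-5}$ |

I simulate the time series $\{\Delta c_t, x_t, \sigma_t^2\}$ according to the calibrated model. Then I generate a panel of logarithms of dividend-price ratios according to

$$z_{j,t} = A_{0,j} + A_{1,j} x_t + A_{2,j} \sigma_t^2 + u_{j,t}, \quad j = 1, \ldots, 100; \; t = 1, \ldots, 100,$$

where $N = 100$, $T = 100$, the $A_{i,j}$ are i.i.d. draws from $U(0, 1)$, and $u_{j,t}$ is i.i.d. $N(0, 1)$, serving as the measurement error. The aggregate index of the log dividend-price ratio $z_{0,t}$ is formed as the sum of the cross-sectional average of $z_{j,t}$ and an i.i.d. $N(0, 2)$ measurement error $u_{0,t}$:

$$z_{0,t} = \frac{1}{N}\sum_{j=1}^{N} z_{j,t} + u_{0,t} = \bar{A}_0 + \bar{A}_1 x_t + \bar{A}_2 \sigma_t^2 + \bar{u}_t + u_{0,t} \xrightarrow{p} \bar{A}_0 + \bar{A}_1 x_t + \bar{A}_2 \sigma_t^2 + u_{0,t},$$

where bars denote cross-sectional averages and the last part comes from the fact that $\bar{u}_t \xrightarrow{p} 0$ as $N \to \infty$.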
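The simulation design described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code used for the reported results: the function name `simulate_llr`, the initial conditions ($x_0 = 0$, $\sigma_0^2 = \sigma^2$), the small floor inside the square root, and the reading of $N(0, 2)$ as a variance of 2 are all choices made here.

```python
import numpy as np

def simulate_llr(T=100, N=100, seed=0, mu_c=0.0015, rho=0.979, phi_e=0.044,
                 sigma=0.0078, v=0.987, sigma_w=0.23e-5):
    """Simulate consumption growth, the hidden states, and a panel of log ratios."""
    rng = np.random.default_rng(seed)
    x = np.zeros(T + 1)                    # assumption: x_0 = 0
    sig2 = np.full(T + 1, sigma**2)        # assumption: sigma_0^2 = sigma^2
    dc = np.zeros(T + 1)
    for t in range(T):
        # Floor is an assumption; it guards the square root against a negative draw.
        s = np.sqrt(max(sig2[t], 1e-12))
        dc[t + 1] = mu_c + x[t] + s * rng.standard_normal()
        x[t + 1] = rho * x[t] + phi_e * s * rng.standard_normal()
        sig2[t + 1] = sigma**2 + v * (sig2[t] - sigma**2) + sigma_w * rng.standard_normal()
    # Align so that dc[t] holds growth realized at t+1 and the states are dated t.
    x, sig2, dc = x[:T], sig2[:T], dc[1:]
    A = rng.uniform(0.0, 1.0, size=(N, 3))            # loadings A_{0,j}, A_{1,j}, A_{2,j}
    Y = np.column_stack([np.ones(T), x, sig2])        # T x 3 matrix [1, x_t, sigma_t^2]
    Z = Y @ A.T + rng.standard_normal((T, N))         # T x N panel with N(0, 1) errors
    # Aggregate index: cross-sectional average plus noise with variance 2 (assumed).
    z0 = Z.mean(axis=1) + np.sqrt(2.0) * rng.standard_normal(T)
    return dc, x, sig2, Z, z0

dc, x, sig2, Z, z0 = simulate_llr()
```

The returned panel `Z` is stored as $T \times N$, which is convenient for the principal components step discussed later.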

2.2.1 Estimating the hidden state variables

Given the available information, an agent will look at the state-space representation of the model and optimally update his prediction about the hidden state variables using some filtering technique. A distinctive feature of the above model is the low signal-noise ratio. The conditional variances of $\Delta c_t$ and $z_{j,t}$ are much larger than those of $x_t$ and $\sigma_t^2$. It is thus very difficult to effectively detect the process of hidden states using only two observed aggregate series. In contrast, in spite of the high noise, the disaggregate approach can effectively detect the hidden state. To focus on the main idea, I first use a quasi-maximum likelihood method, treating $\sigma_t$ as a constant although a time-varying $\sigma_t$ is used to generate the data. In this way, the agent is able to form a linear state-space system and use the Kalman filter to update his belief about the hidden long-run component $x_t$.² The following graph compares the estimates from the two information scenarios with the true long-run component.

[Figure: LLR component of consumption growth. Plotted series: x (true), x1 (small info), x2 (large info). Horizontal axis: t from 0 to 100; vertical axis from -2.5 to 2.5, in units of 10^-3.]

The following graph shows the relative magnitude of the hidden states with respect to consumption growth.

[Figure: Consumption growth. Plotted series: $\Delta c_t$, $\mu_c + x_t$, $\sigma_t$. Horizontal axis: t from 0 to 100; vertical axis from -0.02 to 0.025.]

2 In general, the state-space system is of a nonlinear form due to the presence of $\sigma_t$. Instead of using a computationally intensive nonlinear filtering approach, I will consider a two-step factor-augmented maximum likelihood approach later.


Given the small magnitude of the long-run component, the conditional variance of observed consumption growth is largely due to the error term. It comes as no surprise that the small information approach yields unsatisfactory results. However, the large information approach is able to accumulate the signal from the disaggregate data and provides consistent estimation of the small long-run component. Although the agent is assumed to have the false belief that $\sigma_t$ is a constant, the disaggregate approach is able to precisely estimate the entire time series of $x_t$. This is due to the fact that $\sigma_t^2$ is a slow-moving series, the variance of which is negligible compared to that of $x_t$ and $\Delta c_t$.
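The filtering exercise under the two information sets can be illustrated with a scalar Kalman filter. The sketch below is not the paper's implementation. It assumes that $\sigma_t$ is held constant at $\sigma$ (as in the quasi-likelihood setup above), that all observables are demeaned with known loadings, and that $N(0, 2)$ means a variance of 2. The function name `kalman_filter_scalar` and the loadings `A1` are hypothetical. Because the state is scalar and the measurement errors are independent, the update is written in information form, where precisions of independent signals add up.

```python
import numpy as np

def kalman_filter_scalar(y, h, r, rho, q):
    """Filter a scalar AR(1) state x_t from y_t = h * x_t + noise.

    y : (T, m) observations; h : (m,) loadings; r : (m,) noise variances;
    state equation x_{t+1} = rho * x_t + N(0, q).
    Returns filtered means and variances, each of shape (T,).
    """
    T = y.shape[0]
    x_pred, p_pred = 0.0, q / (1.0 - rho**2)   # start from the stationary distribution
    means, variances = np.empty(T), np.empty(T)
    for t in range(T):
        # Information-form update: precisions of independent signals add up.
        precision = 1.0 / p_pred + np.sum(h**2 / r)
        p_post = 1.0 / precision
        x_post = p_post * (x_pred / p_pred + np.sum(h * y[t] / r))
        means[t], variances[t] = x_post, p_post
        # Prediction step for the next period.
        x_pred, p_pred = rho * x_post, rho**2 * p_post + q
    return means, variances

# Calibrated values from the text; sigma_t is held constant at sigma (assumption).
rho, phi_e, sigma = 0.979, 0.044, 0.0078
q = (phi_e * sigma) ** 2
rng = np.random.default_rng(0)
T, N = 100, 100
A1 = rng.uniform(0.0, 1.0, N)                  # hypothetical loadings on x_t

x = np.zeros(T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + np.sqrt(q) * rng.standard_normal()
dc = x + sigma * rng.standard_normal(T)               # demeaned growth realized at t+1
Z = np.outer(x, A1) + rng.standard_normal((T, N))     # demeaned panel
z0 = Z.mean(axis=1) + np.sqrt(2.0) * rng.standard_normal(T)

# Small information: consumption growth plus the aggregate index.
x1, p1 = kalman_filter_scalar(np.column_stack([dc, z0]),
                              np.array([1.0, A1.mean()]),
                              np.array([sigma**2, 1.0 / N + 2.0]), rho, q)
# Large information: consumption growth plus the full cross section.
x2, p2 = kalman_filter_scalar(np.column_stack([dc, Z]),
                              np.concatenate([[1.0], A1]),
                              np.concatenate([[sigma**2], np.ones(N)]), rho, q)
```

The filtered variances `p1` and `p2` do not depend on the data, so they give a direct measure of how much each information set tightens the agent's belief; by construction the large information variance cannot exceed the small information one.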

2.2.2 Monte Carlo experiments

In this section, I conduct some Monte Carlo experiments to demonstrate how the size of information affects the estimation of hidden state variables. For the same data generating process, I will consider different combinations of $N$ and $T$. I will also consider different DGPs for the coefficients $A_{i,j}$ in the log dividend-price ratio equation. The following table summarizes the Monte Carlo average of mean squared errors of $\hat{x}$ for each model design for $M = 200$ repetitions (Monte Carlo standard errors are in parentheses):

$$\mathrm{MSE}(\hat{x}) = \frac{1}{MT}\sum_{m=1}^{M}\sum_{t=1}^{T}\left[\hat{x}^{(m)}_t - x^{(m)}_t\right]^2.$$

All values are reported in units of $10^{-5}$.

| T | N | $A_{i,j} \sim N(0,1)$: Small Info. | $A_{i,j} \sim N(0,1)$: Large Info. | $A_{i,j} \sim U[0,1]$: Small Info. | $A_{i,j} \sim U[0,1]$: Large Info. |
|---|---|---|---|---|---|
| 50 | 5 | 0.3249 (0.4118) | 0.0891 (0.0516) | 0.3765 (0.4073) | 0.1140 (0.0694) |
| 50 | 10 | 0.3071 (0.3849) | 0.0657 (0.0353) | 0.4372 (0.5409) | 0.0990 (0.0629) |
| 50 | 50 | 0.1993 (0.1834) | 0.0316 (0.0124) | 0.4106 (0.4776) | 0.0561 (0.0267) |
| 50 | 100 | 0.1805 (0.1726) | 0.0212 (0.0065) | 0.4495 (0.5318) | 0.0397 (0.0160) |
| 50 | 200 | 0.1810 (0.1614) | 0.0141 (0.0036) | 0.4221 (0.4943) | 0.0264 (0.0085) |
| 100 | 5 | 0.3813 (0.5106) | 0.0917 (0.0411) | 0.4857 (0.5282) | 0.1135 (0.0532) |
| 100 | 10 | 0.3202 (0.3108) | 0.0688 (0.0306) | 0.5127 (0.6447) | 0.1020 (0.0558) |
| 100 | 50 | 0.2466 (0.2204) | 0.0325 (0.0091) | 0.5343 (0.5896) | 0.0547 (0.0159) |
| 100 | 100 | 0.2196 (0.1550) | 0.0217 (0.0058) | 0.4900 (0.6127) | 0.0396 (0.0114) |
| 100 | 200 | 0.2140 (0.1504) | 0.0137 (0.0029) | 0.4347 (0.4599) | 0.0276 (0.0068) |
| 200 | 5 | 0.4605 (0.5367) | 0.0917 (0.0357) | 0.4979 (0.5018) | 0.1171 (0.0448) |
| 200 | 10 | 0.3183 (0.3523) | 0.0697 (0.0228) | 0.5520 (0.6837) | 0.1051 (0.0394) |
| 200 | 50 | 0.2681 (0.1714) | 0.0317 (0.0075) | 0.5392 (0.5827) | 0.0543 (0.0131) |
| 200 | 100 | 0.2308 (0.1143) | 0.0212 (0.0039) | 0.5279 (0.5549) | 0.0398 (0.0090) |
| 200 | 200 | 0.2486 (0.1506) | 0.0140 (0.0022) | 0.4775 (0.4726) | 0.0265 (0.0045) |


To summarize, it comes as no surprise that the large information approach uniformly outperforms the small information method. I will focus on the role of the size of information in the estimation precision of the large information approach. For fixed $N$, increasing the time dimension $T$ is not necessarily accompanied by a decline in the mean squared errors. On the other hand, for a given $T$, a larger cross-sectional dimension always helps the estimation of the hidden long-run component. Such properties hold for both DGPs of the loading coefficients $A_{i,j}$.

3 Estimation: from the econometrician's perspective

The previous exercise focuses on state estimation from the representative agent's perspective, where all the model parameters are treated as known. Such an estimation practice can be viewed as the learning process of the agent, who updates his belief about the hidden state variables using the available information set.

The following part looks at the state estimation problem from an econometrician's perspective, where all the model parameters have to be estimated. The parameter estimates are then used to form the estimates for the hidden states.

3.1 GMM/SMM estimation

Exploiting moment conditions implied by the Euler equation, Bansal, Kiku, and Yaron (BKY, 2007) propose a simulated method of moments (SMM) procedure based on the GMM principle. The Euler equation for an asset with return $r_{j,t}$ is given by

$$E_t\{\exp[m_{t+1} + r_{j,t+1}]\} = 1,$$

where the logarithm of the IMRS is given by

$$m_{t+1} = \theta \log\delta - \frac{\theta}{\psi}\Delta c_{t+1} + (\theta - 1)\, r_{c,t+1}.$$

Crucial to the construction of moment conditions is the construction of estimates for the consumption return process $r_{c,t+1}$.

Using a log-linear approximation for $\log\left(\frac{P_{t+1}}{C_{t+1}} + 1\right)$, BKY approximate $r_{c,t+1} = \log\left(\frac{P_{t+1}}{C_{t+1}} + 1\right) + \Delta c_{t+1} - z_t$ by

$$r_{c,t+1} \approx k_0 + k_1 z_{t+1} + \Delta c_{t+1} - z_t,$$

where $z_t \equiv \log(P_t / C_t)$ is the log price-dividend ratio for the consumption claim, $k_1 = \frac{\exp(\bar{z})}{1 + \exp(\bar{z})}$, and $k_0 = \log(1 + \exp(\bar{z})) - k_1 \bar{z}$, with $\bar{z} = E(z_t)$.

The log of the price-dividend ratio is a linear function of the state variables,

$$z_t = A_{0,c} + A_{1,c} x_t + A_{2,c} \sigma_t^2,$$

where $A_{i,c}$ is a function of model parameters. Let $A = [A_{0,c}, A_{1,c}, A_{2,c}] = A(\bar{z})$ and $Y_t = [1, x_t, \sigma_t^2]'$; then $\bar{z}$ is solved from the following fixed-point problem:

$$\bar{z} = A(\bar{z})\, \bar{Y}, \quad \text{where } \bar{Y} \equiv E(Y_t).$$
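To make the log-linearization constants and the fixed-point problem concrete, the following sketch computes $k_0$ and $k_1$ for a given $\bar{z}$ and solves $\bar{z} = A(\bar{z})\bar{Y}$ by simple iteration. The mapping `toy_A` is purely hypothetical and stands in for the model-implied coefficients, whose closed forms are not reproduced in this section; the function names and the starting value are likewise illustrative.

```python
import numpy as np

def loglinear_constants(z_bar):
    """Return (k0, k1) of the first-order expansion of log(1 + exp(z)) around z_bar."""
    k1 = np.exp(z_bar) / (1.0 + np.exp(z_bar))
    k0 = np.log(1.0 + np.exp(z_bar)) - k1 * z_bar
    return k0, k1

def solve_z_bar(A_of_z, Y_bar, z_init=3.0, tol=1e-10, max_iter=1000):
    """Iterate z <- A(z) @ Y_bar until successive values differ by less than tol."""
    z = z_init
    for _ in range(max_iter):
        z_new = float(A_of_z(z) @ Y_bar)
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    raise RuntimeError("fixed-point iteration did not converge")

def toy_A(z):
    # Hypothetical coefficient vector [A0, A1, A2]; only A0 depends on z here.
    return np.array([2.0 + 0.1 * loglinear_constants(z)[1], 5.0, -100.0])

Y_bar = np.array([1.0, 0.0, 0.0078**2])   # E[1, x_t, sigma_t^2] under the calibration
z_bar = solve_z_bar(toy_A, Y_bar)
k0, k1 = loglinear_constants(z_bar)
```

Plain iteration converges only when the mapping is a contraction near the solution, which holds for the toy mapping above; for the actual model coefficients a root finder may be a safer choice.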

To close the construction of $r_{c,t+1}$, one needs to recover the time series of the state variables $x_t$ and $\sigma_t^2$ and thus the time series of $z_t$. BKY (2007) applied a two-step method. In the first step, $x_t$ is identified by regressing consumption growth on the risk-free rate and the market price-dividend ratio,

$$\Delta c_{t+1} = b_x' Y_t + \sigma_t \eta_{t+1}, \quad \text{with } Y_t = [1, z_{m,t}, r_{f,t}]'.$$

In the second step, the conditional volatility $\sigma_t^2$ is obtained by regressing the squared consumption residual on the same set of observables,

$$\sigma_t^2 \equiv E_t\left[(\Delta c_{t+1} - b_x' Y_t)^2\right] = b_\sigma' Y_t.$$

Once the construction of the consumption return process is done, we are able to formulate the pricing kernel $m_{t+1}$ for the GMM estimation. Due to the lack of a closed-form solution, a computationally intensive simulation method, the simulated method of moments, is implemented in BKY (2007) to estimate the model.

The method proposed in this paper avoids the estimation of the consumption return process, which requires numerically solving a fixed-point problem. Instead of looking at model restrictions on asset returns, I emphasize the importance of the linear approximate solution for log dividend-price ratios.

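3.2 Estimate state variables using cross-section P/D data

Let $z_{j,t} = A_{0,j} + A_{1,j} x_t + A_{2,j} \sigma_t^2 + e_{j,t}$, where $e_{j,t}$ is the unobserved idiosyncratic error (or the measurement error) of asset $j$'s log price-dividend ratio. Then the model implies the following factor representation:

$$Z_t = \bar{Z} + \Lambda F_t + E_t, \quad \text{or} \quad Z = \bar{Z}\, 1' + \Lambda F' + E,$$

with $Z$ ($N \times T$) being a (possibly unbalanced) panel of asset price-dividend ratios, $\bar{Z}$ being the time-invariant individual fixed effect, $\Lambda$ ($N \times 2$) being the factor loadings, and $F_t = [x_t, \sigma_t^2]'$ being the unobserved factors, stacked in the $T \times 2$ matrix $F$. We can also include the risk-free rate in $Z_t$.

The two leading eigenvectors $\hat{F}$ of the matrix $\frac{(Z - \bar{Z} 1')'(Z - \bar{Z} 1')}{NT}$ offer a consistent estimator for the column space of the true $F$.³ Then we may denote the estimator for the state vector as

$$\begin{bmatrix} \hat{x}_t \\ \hat{\sigma}_t^2 \end{bmatrix} = R\, \hat{F}_t, \quad \text{or} \quad \begin{cases} \hat{x}_t = R_{11} \hat{F}_{1t} + R_{12} \hat{F}_{2t} \\ \hat{\sigma}_t^2 = R_{21} \hat{F}_{1t} + R_{22} \hat{F}_{2t} \end{cases}$$

for some rotation matrix $R$. We still need an estimator for the rotation matrix. Here we exploit the dynamic structure of consumption growth. Rewrite the demeaned consumption growth process as

$$\Delta c_{t+1} = R_{11} F_{1t} + R_{12} F_{2t} + \sigma_t \eta_{t+1}.$$

The above equation is a member of a larger system, namely

$$\begin{bmatrix} \Delta c_{t+1} \\ Z_t \end{bmatrix} = \begin{bmatrix} R_{11}, R_{12} \\ \Lambda \end{bmatrix} \begin{bmatrix} F_{1t} \\ F_{2t} \end{bmatrix} + \begin{bmatrix} \sigma_t \eta_{t+1} \\ E_t \end{bmatrix}.$$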

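To estimate the predictive component of $\Delta c_{t+1}$, one may apply the principal components estimator to the above system of equations to obtain a consistent estimator $\hat{x}_t = \hat{R}_{11} \hat{F}_{1t} + \hat{R}_{12} \hat{F}_{2t}$,⁴ where $\hat{x}_t$ is the estimated long-run component of consumption growth. Alternatively, one may estimate $F_t$ by using only $Z_t$ and regress $\Delta c_{t+1}$ on $\hat{F}_{1t}$ and $\hat{F}_{2t}$ to obtain $\hat{x}_t = \hat{R}_{11} \hat{F}_{1t} + \hat{R}_{12} \hat{F}_{2t}$. The alternative strategy works because

$$E\left(\sigma_t \eta_{t+1} \mid \hat{F}\right) = E\left[E\left(\sigma_t \eta_{t+1} \mid \hat{F}, Z, \sigma_t\right) \mid \hat{F}\right] = E\left[E\left(\sigma_t \eta_{t+1} \mid Z, \sigma_t\right) \mid \hat{F}\right] = 0$$

and $\frac{1}{T}\sum_{t=1}^{T} \left\| \hat{F}_t - H F_t \right\|^2 = o_p(1)$. The large sample properties of $\hat{x}_t$ from both approaches are the same.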
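The alternative strategy (principal components from the panel alone, followed by a regression of consumption growth on the estimated factors) can be sketched as follows. The function names are illustrative, the panel is stored as $T \times N$ (the transpose of the $N \times T$ convention in the text), and the factors are normalized so that $\hat{F}'\hat{F}/T = I$, which is one common convention rather than something stated in the text.

```python
import numpy as np

def pca_factors(Z, r=2):
    """Principal components estimate of the factor space from a T x N panel."""
    Zc = Z - Z.mean(axis=0, keepdims=True)      # remove individual fixed effects
    T, N = Zc.shape
    S = Zc @ Zc.T / (N * T)                     # T x T matrix
    _, eigvec = np.linalg.eigh(S)               # eigenvalues come in ascending order
    # Keep the r leading eigenvectors and scale them so that F'F / T = I.
    return eigvec[:, ::-1][:, :r] * np.sqrt(T)

def estimate_long_run_component(dc_next, F_hat):
    """Regress demeaned consumption growth (dated t+1) on the time-t factor estimates."""
    dc = dc_next - dc_next.mean()
    R1, *_ = np.linalg.lstsq(F_hat, dc, rcond=None)   # estimates of (R11, R12)
    return F_hat @ R1, R1

# Small noise-free check with two hypothetical factors.
rng = np.random.default_rng(0)
T, N = 60, 30
F_true = rng.standard_normal((T, 2))
Lam = rng.standard_normal((N, 2))
Z = 1.5 + F_true @ Lam.T                        # fixed effect plus common component
F_hat = pca_factors(Z, r=2)
x_hat, R1 = estimate_long_run_component(F_true[:, 0], F_hat)
```

In this noise-free check the first factor lies exactly in the estimated factor space, so `x_hat` reproduces the demeaned first factor up to numerical error; with measurement error the match is only approximate and improves as $N$ and $T$ grow.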

To estimate the conditional volatility $\sigma_t^2$, BKY (2007) exploited the conditional expectation representation of $\sigma_t^2$: $\sigma_t^2 \equiv E_t[(\Delta c_{t+1} - x_t)^2]$. The projection of $(\Delta c_{t+1} - \hat{x}_t)^2$ onto the time-$t$ information set, e.g. $\{\hat{x}_s\}_{s=1}^{t}$, provides a possible way to recover the process $\sigma_t^2$. My estimation strategy for $\sigma_t^2$ differs from BKY in two ways. First, I will explore the conditional likelihood function parametrized by $R_{21}$ and $R_{22}$. Second, I will augment the likelihood function by the principal components estimators $\hat{F}$. Such a strategy is able to obtain a consistent estimator for $\sigma_t^2$ given the consistency of the ML estimators $\hat{R}_{21}$ and $\hat{R}_{22}$.

3 Bai and Ng (2002) showed that when $(N, T) \to \infty$, consistency of the factors can be established in the sense that $\frac{1}{T}\sum_{t=1}^{T} \left\| \hat{F}_t - H F_t \right\|^2 = o_p(1)$ for some deterministic matrix $H$. Bai (2003) further showed that $\sqrt{N}\left(\hat{F}_t - H F_t\right) \xrightarrow{d} N(0, V_t)$ under very general conditions on the model.

4 Although in the factor model $x_{it} = \lambda_i' f_t + e_{it}$ the factors $f_t$ and factor loadings $\lambda_i$ cannot be separately identified without further restrictions, the common component $c_{it} \equiv \lambda_i' f_t$ is consistently estimated by $\hat{c}_{it} \equiv \hat{\lambda}_i' \hat{f}_t$; see Bai and Ng (2002) and Bai (2003).

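Exploit the assumption that, conditional on $x_t$ and $\sigma_t^2$, $\Delta c_{t+1}$ is independently normally distributed. We may formulate the conditional likelihood

$$f(\Delta c \mid x, \sigma^2) = \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi}\,\sigma_t} \exp\left[-0.5\left(\frac{\Delta c_{t+1} - x_t}{\sigma_t}\right)^2\right],$$

or

$$\log f(\Delta c \mid x, \sigma^2) = -0.5\sum_{t=1}^{T}\left(\frac{\Delta c_{t+1} - x_t}{\sigma_t}\right)^2 - 0.5\sum_{t=1}^{T}\log(\sigma_t^2) - 0.5\,T\log(2\pi).$$

Replacing $x_t$ and $\sigma_t^2$ by their principal component estimators (parametrized by the rotation matrix $R$), one obtains an approximate likelihood function,

$$L(R_{21}, R_{22}) = -0.5\sum_{t=1}^{T}\left(\frac{\Delta c_{t+1} - \hat{x}_t}{\hat{\sigma}_t}\right)^2 - 0.5\sum_{t=1}^{T}\log(\hat{\sigma}_t^2) - 0.5\,T\log(2\pi)$$

$$= -0.5\sum_{t=1}^{T}\frac{(\Delta c_{t+1} - \hat{x}_t)^2}{R_{21}\hat{F}_{1t} + R_{22}\hat{F}_{2t}} - 0.5\sum_{t=1}^{T}\log\left(R_{21}\hat{F}_{1t} + R_{22}\hat{F}_{2t}\right) - 0.5\,T\log(2\pi).$$

Maximize the above objective function to obtain consistent estimators for $R_{21}$ and $R_{22}$. Then we can construct the time series of the state variables:

$$\begin{bmatrix} \hat{x}_t \\ \hat{\sigma}_t^2 \end{bmatrix} = \hat{R}\, \hat{F}_t.$$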
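The approximate likelihood above is straightforward to code. The sketch below evaluates $L(R_{21}, R_{22})$ and maximizes it over a coarse grid; the grid search is only a stand-in for a proper numerical optimizer, and the function names are illustrative. It assumes the factor estimates are scaled so that $R_{21}\hat{F}_{1t} + R_{22}\hat{F}_{2t}$ can be positive for every $t$; parameter values implying a non-positive variance are rejected by returning $-\infty$.

```python
import numpy as np

def approx_loglik(R2, dc_next, x_hat, F_hat):
    """Approximate log-likelihood L(R21, R22) given x_hat and the factor estimates."""
    sig2 = F_hat @ np.asarray(R2, dtype=float)      # implied sigma_t^2 for each t
    if np.any(sig2 <= 0.0):
        return -np.inf                              # variance must be positive
    T = dc_next.shape[0]
    resid2 = (dc_next - x_hat) ** 2
    return (-0.5 * np.sum(resid2 / sig2)
            - 0.5 * np.sum(np.log(sig2))
            - 0.5 * T * np.log(2.0 * np.pi))

def grid_search(dc_next, x_hat, F_hat, grid21, grid22):
    """Crude maximizer over a user-supplied grid (placeholder for a real optimizer)."""
    best, best_val = None, -np.inf
    for a in grid21:
        for b in grid22:
            val = approx_loglik((a, b), dc_next, x_hat, F_hat)
            if val > best_val:
                best, best_val = (a, b), val
    return best, best_val

# Toy check: constant squared residual of 4 and a factor equal to one in every period.
T = 10
dc_toy = 2.0 * np.ones(T)
x_toy = np.zeros(T)
F_toy = np.column_stack([np.ones(T), np.zeros(T)])
best, best_val = grid_search(dc_toy, x_toy, F_toy, [1.0, 2.0, 4.0, 8.0], [0.0])
```

In the toy check the implied variance is constant and equal to $R_{21}$, so the likelihood is maximized where $R_{21}$ matches the squared residual, which is the value 4 on this grid.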

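A natural extension of the above method is to add the dynamics of $\sigma_t^2$ to the above likelihood. Instead of studying the likelihood for $\Delta c_{t+1}$ conditional on $x_t$ and $\sigma_t^2$, we may formulate the joint likelihood of $\Delta c_{t+1}$, $x_t$, and $\sigma_t^2$:

$$f(\Delta c, \sigma^2, x) = f(\Delta c \mid \sigma^2, x)\, f(x \mid \sigma^2)\, f(\sigma^2) = \prod_{t=1}^{T} f(\Delta c_{t+1} \mid \sigma_t^2, x_t)\, f(x_t \mid x_{t-1}, \sigma_{t-1}^2)\, f(\sigma_t^2 \mid \sigma_{t-1}^2)$$

$$= \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi}\,\sigma_t} \exp\left[-0.5\left(\frac{\Delta c_{t+1} - x_t}{\sigma_t}\right)^2\right] \cdot \frac{1}{\sqrt{2\pi}\,\varphi_e \sigma_{t-1}} \exp\left[-0.5\left(\frac{x_t - \rho x_{t-1}}{\varphi_e \sigma_{t-1}}\right)^2\right] \cdot \frac{1}{\sqrt{2\pi}\,\sigma_w} \exp\left[-0.5\left(\frac{\sigma_t^2 - v\sigma_{t-1}^2 - (1-v)\sigma^2}{\sigma_w}\right)^2\right]$$

$$\approx \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi}\,\hat{\sigma}_t} \exp\left[-0.5\left(\frac{\Delta c_{t+1} - \hat{x}_t}{\hat{\sigma}_t}\right)^2\right] \cdot \frac{1}{\sqrt{2\pi}\,\varphi_e \hat{\sigma}_{t-1}} \exp\left[-0.5\left(\frac{\hat{x}_t - \rho \hat{x}_{t-1}}{\varphi_e \hat{\sigma}_{t-1}}\right)^2\right] \cdot \frac{1}{\sqrt{2\pi}\,\sigma_w} \exp\left[-0.5\left(\frac{\hat{\sigma}_t^2 - v\hat{\sigma}_{t-1}^2 - (1-v)\sigma^2}{\sigma_w}\right)^2\right].$$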

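In this way, we are able to jointly estimate $\{R_{21}, R_{22}, \sigma_w, v, \sigma, \varphi_e, \rho\}$.

An alternative, computationally simple approach is to exploit the orthogonality condition implied by the model. First notice that

$$E\left[\sigma_t^2 \eta_{t+1}^2 \mid \sigma_t^2, x_t\right] = \sigma_t^2\, E\left[\eta_{t+1}^2 \mid \sigma_t^2, x_t\right] = \sigma_t^2.$$

Moreover, let $F$ be the $T \times 2$ factor matrix which represents the space spanned by $x$ and $\sigma^2$; then

$$\sigma_t^2 = E\left[\sigma_t^2 \eta_{t+1}^2 \mid \sigma_t^2, x_t\right] = E\left[\sigma_t^2 \eta_{t+1}^2 \mid F_t\right].$$

This implies that the projection of $\sigma_t^2 \eta_{t+1}^2$ onto $F_t$ yields $\sigma_t^2$. In practice, one may regress the estimated squared innovation $\hat{\sigma}_t^2 \hat{\eta}_{t+1}^2 = (\Delta c_{t+1} - \hat{x}_t)^2$ on $\hat{F}_t$ and use the fitted value as an estimator of $\sigma_t^2$.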
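The regression approach amounts to one least squares fit of the squared residual on the estimated factors. A minimal sketch follows; the function name is illustrative, and including a constant in the regression is an assumption made here because estimated factors are typically demeaned, while the text does not say whether an intercept is used.

```python
import numpy as np

def volatility_by_projection(dc_next, x_hat, F_hat, add_constant=True):
    """Fitted values from regressing (dc_{t+1} - x_hat_t)^2 on the factor estimates."""
    y = (dc_next - x_hat) ** 2                         # estimate of sigma_t^2 * eta_{t+1}^2
    if add_constant:                                   # assumption: allow an intercept
        X = np.column_stack([np.ones(len(y)), F_hat])
    else:
        X = F_hat
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta                                    # estimate of sigma_t^2

# Toy check in which the squared residual is an exact linear function of the factors.
rng = np.random.default_rng(0)
T = 50
F_demo = rng.uniform(0.0, 1.0, size=(T, 2))            # hypothetical factor estimates
sig2_demo = 1.0 + 0.5 * F_demo[:, 0] + 0.2 * F_demo[:, 1]
x_demo = np.zeros(T)
dc_demo = x_demo + np.sqrt(sig2_demo)                  # innovation fixed at one for the check
sig2_fit = volatility_by_projection(dc_demo, x_demo, F_demo)
```

With real data the squared residual is a noisy signal of $\sigma_t^2$, so the fitted values need not be positive in every period; that case would have to be handled separately.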

3.2.1 Simulation studies

I will simulate the model using the same parameters as in the previous sections. From an econometrician's perspective, consumption growth and the asset dividend-price ratios are the observables. Both the state variables $\{x_t, \sigma_t^2\}$ and the model parameters are unknown and will be jointly estimated using the optimality conditions implied by the model.

I will focus on the two-step approach: first obtain the principal component estimators for the space spanned by the hidden states, and then use the likelihood method to estimate the model parameters.


The following graph shows a typical estimation result from simulated data.

[Figure: LLR component of consumption growth. Plotted series: x (true), x1 (small info), x2 (large info), x3 (pca). Horizontal axis: t from 0 to 100; vertical axis from -2 to 5, in units of 10^-3.]

The series x1 and x2 are estimated from the agent's perspective, using small and large information respectively. The series x3 is the econometrician's two-step estimate, treating all model parameters as unknown. As demonstrated by this graph, both x2 and x3 are very close to the true series x, while the small information approach fails to obtain reasonable estimates. The results of the two-step method are very close to those of the large information method, in spite of the fact that the former estimates the model parameters and the states together. The simple regression approach is used to recover the time series of the conditional variance. As shown in the following graph, the estimates are reasonably close to the true series. A more detailed Monte Carlo study will be conducted to examine the properties of the estimator.

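[Figure: Conditional variance. Plotted series: $\sigma^2$ (true) and $\hat{\sigma}^2$ (estimate). Horizontal axis: t from 0 to 100; vertical axis from 5 to 11, in units of 10^-5.]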

To further compare the econometrician's two-step method and the agent's large information method, I conduct a Monte Carlo experiment for different model sizes. For the same data generating process, I consider different combinations of $N$ and $T$. I will also consider different DGPs for the coefficients $A_{i,j}$ in the log dividend-price ratio equation. Again, the following table summarizes the Monte Carlo average of mean squared errors of $\hat{x}$ for each model design for $M = 200$ repetitions (Monte Carlo standard errors are in parentheses):

$$\mathrm{MSE}(\hat{x}) = \frac{1}{MT}\sum_{m=1}^{M}\sum_{t=1}^{T}\left[\hat{x}^{(m)}_t - x^{(m)}_t\right]^2.$$

All values are reported in units of $10^{-5}$.

| T | N | $A_{i,j} \sim N(0,1)$: Agent | $A_{i,j} \sim N(0,1)$: Econometrician | $A_{i,j} \sim U[0,1]$: Agent | $A_{i,j} \sim U[0,1]$: Econometrician |
|---|---|---|---|---|---|
| 50 | 5 | 0.0930 (0.0532) | 0.3726 (0.2553) | 0.1119 (0.0799) | 0.3708 (0.2689) |
| 50 | 10 | 0.0750 (0.0427) | 0.3925 (0.2912) | 0.0959 (0.0595) | 0.3875 (0.2826) |
| 50 | 50 | 0.0328 (0.0123) | 0.3188 (0.2617) | 0.0533 (0.0214) | 0.3853 (0.2787) |
| 50 | 100 | 0.0210 (0.0064) | 0.2871 (0.2737) | 0.0384 (0.0172) | 0.3509 (0.2505) |
| 50 | 200 | 0.0145 (0.0041) | 0.3080 (0.2278) | 0.0269 (0.0101) | 0.3519 (0.2393) |
| 100 | 5 | 0.0908 (0.0396) | 0.2855 (0.1479) | 0.1264 (0.0787) | 0.3301 (0.2269) |
| 100 | 10 | 0.0706 (0.0286) | 0.2480 (0.1222) | 0.1008 (0.0509) | 0.3239 (0.1910) |
| 100 | 50 | 0.0321 (0.0096) | 0.2021 (0.1199) | 0.0577 (0.0196) | 0.2788 (0.1437) |
| 100 | 100 | 0.0212 (0.0056) | 0.1792 (0.1414) | 0.0389 (0.0111) | 0.2454 (0.1237) |
| 100 | 200 | 0.0138 (0.0029) | 0.1706 (0.1578) | 0.0277 (0.0071) | 0.1951 (0.1105) |
| 200 | 5 | 0.0968 (0.0387) | 0.2719 (0.1368) | 0.1202 (0.0450) | 0.2828 (0.1519) |
| 200 | 10 | 0.0731 (0.0257) | 0.2459 (0.1065) | 0.1061 (0.0492) | 0.2991 (0.1574) |
| 200 | 50 | 0.0312 (0.0068) | 0.1449 (0.0629) | 0.0557 (0.0146) | 0.2347 (0.0878) |
| 200 | 100 | 0.0217 (0.0038) | 0.1174 (0.0707) | 0.0383 (0.0071) | 0.1912 (0.0678) |
| 200 | 200 | 0.0139 (0.0021) | 0.0897 (0.0708) | 0.0266 (0.0046) | 0.1428 (0.0626) |

For $T$ as small as 50, the two-step method does not necessarily improve as $N$ becomes larger. This is consistent with the large sample theory of the second step, which requires large $T$ to obtain a consistent estimator of the regression coefficient. Even if we observe the factor $F_t$, the convergence rate of the OLS estimator $\hat{R}$ for the rotation matrix is $\sqrt{T}$. When $T$ is as large as 100 or 200, combined with large $N$, the two-step method is able to provide reasonably good estimation results. In general, a larger $N$ is accompanied by improved estimation precision. Such a pattern exists in both DGPs of $A_{i,j}$.

4 Issues related to the panel data

Data for the cross-sectional dividend-price ratios can be obtained from Fama and French's website. I will use the 100 portfolio data for estimating the long-run component. There are several possible issues that might complicate the empirical studies. First, individual fixed effects might be present in the panel data model. In the conventional large-$N$, small-$T$ framework, a differencing method is usually used to remove the individual fixed effects, and then the model parameters are estimated based on the transformed model. In a large-$(N, T)$ framework, the individual fixed effects can be consistently estimated (see Hahn and Kuersteiner, 2002, for an example) without resorting to the differencing method.

Second, there are some missing observations in the 100 portfolio data constructed by Fama and French. Although principal component methods do not directly apply to such a data set, the expectation-maximization (EM) algorithm of Stock and Watson (1998) can be applied to handle panel data with missing observations. Consider the panel data $X_{it}$, $i = 1, \ldots, N$; $t = 1, \ldots, T$. Let $D_{it}$ be a binary variable which takes value 0 if observation $(i, t)$ is missing and 1 otherwise. The EM algorithm iterates between an expectation step and a maximization step until some convergence criteria are met.

To initialize the EM algorithm, I replace the missing observations of a portfolio $k$ by i.i.d. random variables $U_{kt}$ which are uniformly distributed with mean equal to the sample mean of that portfolio. Such a replacement is done for all $N$ portfolios to form a balanced panel $X^{(0)}$ of dimension $T \times N$:

$$X^{(0)}_{it} = \begin{cases} X_{it}, & \text{if } D_{it} = 1 \\ U_{it}, & \text{if } D_{it} = 0. \end{cases}$$

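Then the maximization step can be carried out using $X^{(0)}$. In a typical $k$-th maximization step, we start with a balanced panel $X^{(k)}$. Principal components estimators $\lambda^{(k)}_i$ and $f^{(k)}_t$ are obtained from $X^{(k)}$ by solving the restricted least squares problem

$$\left\{\lambda^{(k)}_i, f^{(k)}_t\right\} = \arg\min_{\{\lambda_i, f_t\}} \sum_{i=1}^{N}\sum_{t=1}^{T}\left[X^{(k)}_{it} - \lambda_i' f_t\right]^2$$

$$\text{s.t.} \quad \frac{1}{T}\sum_{t=1}^{T} f_t f_t' = I_r, \quad \sum_{i=1}^{N} \lambda_i \lambda_i' \text{ is diagonal.}$$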

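Let $\Lambda^{(k)}$ be the $N \times r$ loading matrix and $F^{(k)}$ be the $T \times r$ factor matrix. Then $F^{(k)}$ consists of the leading $r$ eigenvectors of the $T \times T$ matrix $\frac{1}{NT} X^{(k)} X^{(k)\prime}$, and $\Lambda^{(k)} = \frac{1}{T} X^{(k)\prime} F^{(k)}$.

In the $k$-th expectation step, we form a balanced panel $X^{(k+1)}$:

$$X^{(k+1)}_{it} = \begin{cases} X_{it}, & \text{if } D_{it} = 1 \\ \lambda^{(k)\prime}_i f^{(k)}_t, & \text{if } D_{it} = 0. \end{cases}$$

We keep iterating until some convergence criteria are met, say $\max_{\{i,t:\, D_{it}=0\}} \left| X^{(k+1)}_{it} - X^{(k)}_{it} \right|$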
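The EM iteration described in this section can be sketched in NumPy as follows. This is an illustrative implementation under stated assumptions, not the code behind any reported result: the function name `em_pca`, the width of the uniform initial draws (one unit around each portfolio mean), the tolerance, and the rule of stopping when the largest change in the imputed entries falls below that tolerance are all choices made here. The eigenvectors are scaled by $\sqrt{T}$ so that $F'F/T = I_r$.

```python
import numpy as np

def em_pca(X, D, r, tol=1e-6, max_iter=500, seed=0):
    """EM-style principal components for a T x N panel with missing entries.

    X : (T, N) data, arbitrary values (e.g. NaN) where D == 0
    D : (T, N) indicator, 1 if observed and 0 if missing
    """
    rng = np.random.default_rng(seed)
    T, N = X.shape
    obs = D.astype(bool)
    # Initialization: uniform draws centred at each portfolio's observed sample mean.
    col_mean = np.where(obs, X, 0.0).sum(axis=0) / np.maximum(obs.sum(axis=0), 1)
    U = col_mean + rng.uniform(-0.5, 0.5, size=(T, N))   # assumed width of one unit
    Xk = np.where(obs, X, U)
    for _ in range(max_iter):
        # Maximization step: principal components of the current balanced panel.
        _, vec = np.linalg.eigh(Xk @ Xk.T / (N * T))     # ascending eigenvalues
        F = vec[:, ::-1][:, :r] * np.sqrt(T)             # F'F / T = I_r
        Lam = Xk.T @ F / T                               # N x r loadings
        # Expectation step: replace missing entries by the fitted common component.
        X_new = np.where(obs, X, F @ Lam.T)
        change = np.max(np.abs(X_new - Xk)[~obs]) if (~obs).any() else 0.0
        Xk = X_new
        if change < tol:
            break
    return F, Lam, Xk

# Toy check: an exact two-factor panel with about 5% of the entries removed.
rng = np.random.default_rng(1)
T, N, r = 40, 30, 2
X_full = rng.standard_normal((T, r)) @ rng.standard_normal((N, r)).T
D = (rng.uniform(size=(T, N)) > 0.05).astype(int)
X_obs = np.where(D == 1, X_full, np.nan)
F, Lam, X_filled = em_pca(X_obs, D, r)
```

Observed entries are never altered; only the missing cells are updated from one iteration to the next.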
