1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH...

49
1 Today’s Agenda 1. ARCH/GARCH 2. GMM (Intro by Example) 3. GMM (Formal Treatment) 4. A Realistic Example: Chan et al. (1992)

Transcript of 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH...

Page 1: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

1 Today’s Agenda

1. ARCH/GARCH2. GMM (Intro by Example)3. GMM (Formal Treatment)4. A Realistic Example: Chan et al. (1992)

Page 2: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

2 Autoregressive Conditional Heteroskedasticity(ARCH)

• Suppose we have the returns process {rt}Tt=1 .• First, we model the conditional mean, for exampleas

rt = µ+ φrt−1 + ut• We know that the unconditional first moments areE (rt) =

µ1−φ and V ar(rt) =

σ2u1−φ2

• We also know that the conditional first momentEt−1 (rt) = µ + φrt is time varying, even though theunconditional moment is not!

• Q: Can we also have the same situation for thesecond conditional moment, i.e. to have a time-varying conditional second moment, although theunconditional second moment is constant overtime?

• A: Yes.• Note: The unconditional second moment of ut is σ2u.

Page 3: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Motivation: Do we need to model the conditionalsecond moment of returns and interest rates?

• A: Let’s see what the data shows.

Page 4: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Intuitively, we want to model u2t to follow an ARprocess just as we had rt follow an AR process.

• We can write such a process asu2t = ζ + αu2t−1 + wt

where wt is a white noise process with E¡w2t¢=

σ2w.

• Note that Et−1¡u2t¢= ζ + αu2t−1, so that the condi-

tional moment is time-varying althoughE¡u2t¢= σ2u.

• The above process is called an autoregressiveconditional heteroskedasticity model of order 1, orARCH(1).

• We can generalize to an ARCH(p) model as:

u2t = ζ + α1u2t−1 + α2u

2t−2 + ... + αpu

2t−p + wt

Page 5: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Note that a variance cannot be negative. We needto place certain restrictions on

u2t = ζ + αu2t−1 + wtin order to insure that u2t is always positive.

• We need:– wt to be bound from below by −ζ, where ζ > 0.

– α ≥ 0.– For covariance stationarity of u2t , we also need

α < 1 as in the other AR model.• With all those conditions, we can see that

V ar (ut) = E¡u2t¢= σ2u =

ζ

1− α

Page 6: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• This is the ARCH(1) model.• We have written it in its autoregressive form, similarto the AR processes.

• It is more convenient, but less intuitive, to presentthe ARCH(1) model as:

ut =phtvt

where vt is iid with mean 0, and E¡v2t¢= 1.

• Suppose thatht = ζ + au2t−1

then combining the above equations, we obtain:u2t = htv

2t

• Now, since vt is iid thenEt−1

¡u2t¢= Et−1

¡h2t v

2t

¢= Et−1

¡h2t¢Et−1

¡v2t¢

= ζ + au2t−1as before.

Page 7: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Reconciling the two definitions.• From one side, we have u2t = htv2t• From another side, we have u2t = ht +wt. Therefore

htv2t = ht + wt

or

wt = ht¡v2t − 1

¢• From here, we can see that Et−1

¡w2t¢is time

varying, whereas E¡w2t¢= σ2w

Page 8: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• The ARCH process gives us conditional het-eroskedasticity, but it turns out that Et−1

¡u2t¢has

turned out to be a very persistent process (in thedata).

• We can capture such a process with an ARCH(p)process u2t = ζ + α1u

2t−1 + α2u

2t−2 + ...+ αpu

2t−p +wt

where p is very large.• This solution is inefficient. There are too manyparameters to estimate!

• Q: What to do?• A: GARCH.• The GARCH, Generalized ARCH allows us tocapture the persistence of conditional volatility in aparsimonious way.

Page 9: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• GARCH: Recall that we could write:ut =

phtvt

where ht = ζ + au2t−1 for an ARCH process.• Suppose, we specify ht as

ht = ζ + δht−1 + au2t−1• The direct link between ht and ht−1 is exactly whatis needed to capture the dependence between σ2tand σ2t−1.

• A process with ht = ζ + δht−1 + au2t−1 is called aGARCH(1,1).

• The unconditional second moment of ut is:E¡u2t¢= σ2u =

ζ

1− (α + δ)

• To insure stationarity, we need to restriction:δ + α < 1.

Page 10: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Of course, we can generalize to a GARCH(p,q) as:u2t = ζ + δ1ht−1 + δ2ht−2 + ... + δpht−p + α1u

2t−1 + α2u

2t−2 + ... +

• Similarly, the parameters of the conditional and theunconditional second moment are related by:

E¡u2t¢=

ζ

1− (α1 + ... + aq + δ1 + ... + δp)

• Also, αi ≥ 0 and δj ≥ 0

Page 11: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• For most practical purposes a GARCH(1,1) isGREAT.– There is a trade-off. You introduce more parame-ters to capture the accurate dynamics, but thereare more parameters to estimate.

– Those parameters have restrictions. The estima-tion is tricky.

– Bottom line, for 95% of the applications,GARCH(1,1) does a great job.

• GARCH is successful, because it can capturethe persistence in Et−1

¡u2t¢, which is the most

significant feature that needs to be captured.

Page 12: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Another useful model to estimate is the IGARCHmodel, or integrated GARCH

• The IGARCH(1,1) is a GARCH(1,1) whereδ + α = 1

• If this condition is satisfied, it can be shown that theconditional variance of ut is infinite.

• The processes ut and u2t are not covariancestationary.

• However, the process ut is stationary (i.e itsconditional density does not depend on t ).

• The IGARCH is important because it captures theimportant case of a strong dependence that leadsto non-stationarity.

• Q: Do variances have to be stationary?

Page 13: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

The GARCH literature has gone crazy chasingafter the perfect conditional heteroskedasticity model.

Some of the models we have are:• ARCH in Means• Exponential GARCH• Nonlinear GARCH• Asymmetric GARCH• Fractionally Integrated GARCH (FIGARCH)• ABS.ASYMM.FIGARCH ????

Page 14: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• The interest in forecasting conditional volatilityusing past volatility has been greater in the appliedfield more so than in academia.

• The GARCH models are unsatisfactory, from aneconomic perspective, because– Explaining vol with past vol tells us nothing aboutthe underlying economic factors that cause thevolatility to move.

– If a structural break occurs, a recursive model willfail miserably...a structural model might not.

– In general, the GARCH models are difficult togeneralize to a multivariate setting.

Page 15: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Q: Why not use (Schwert, French, and Stambaugh(1989))

σ2t =

Ã1

n

nXi=1

r2i,t

!∗ 22

as a “non-parametric” method, where ri,t is thei− th daily return in month t.

• ARCH and GARCHmethods are called “parametric”methods.

• Q: Which one is better?• A: It depends on the application at hand. Forforecasting, it is difficult to beat a GARCH(1,1).

Page 16: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

Future Research:• The conditional volatility literature is huge.• There are very few papers on conditional correla-tion.

• But, understanding how and why correlations varywith time is crucial in finance.– Think of a portfolio choice problem (correlationsmatter a lot)

– Think of a CAPM model. The correlation betweenthe market return and the return of an asset isassumed constant (and, hence the risk premiumis constant), but it need not be.

• There are no good models that try to capture thecorrelation as function of macroeconomic factors(except SVV (2001))–good for a research paper.

Page 17: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• This is great, but how do we estimate a GARCHmodel.

• OLS clearly does not work.• We need a method that would allow us to donon-linear estimation

• We also need a general method that we can apply toany nonlinear problem, with minimal assumptions.

• We do not want to have assumptions on thedistribution of the residuals.

• Therefore, we need GMM (Generalized Method ofMoments)

Page 18: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

3 Simple Introduction to GMM

• Recall that any variable xt has a distribution Fx(x).If x has moments E(xj), j = 1, ...then thosemoments can be used to retrieve Fx(x).

• Caution: Some variables do not have moments(Cauchy distribution case).

• Suppose we have random variables xt, yt, zt.– A population moment of those variables is

E [g (xt, yt, zt)]

– A sample moment of those variables is1

T

TXt=1

g (xt, yt, zt)

– By the ergodicity theorem (or the LLN in crosssection, we know that)

1

T

TXt=1

g (xt, yt, zt)→p E [g (xt, yt, zt)]

under some mild conditions on the function g(.).

Page 19: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• In other words, the distance between the sam-ple and the population moment goes to zero inprobability as T →∞ :(

1

T

TXt=1

g (xt, yt, zt)− E [g (xt, yt, zt)])→p 0

• Can we use this “insight” to estimate parameters.Suppose that the function g depends not only onthe data but also on the unknown parameters, θ.

• We want to choose the parameter θ in order tominimize the distance between the data and thepopulation moment.

• In a simpler example, let’s concentrate on aunivariate case. Then g(x|θ) = µ, the populationmean. In other words, θ = µ.

• The problem becomes (trivially):(1

T

TXt=1

xt − µ)→p 0

Page 20: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Here is a more interesting example: OLS as GMM• The model is:

yt = xtβ + εt

• The FOC in the OLS case could be written as:E (xtεt) = 0

• This is a moment condition that also depends onparameters. To see that, write

E (xt (yt − xtβ)) = 0

E (xtyt) = βE¡x2t¢

β =E (xtyt)

E (x2t )

• Therefore, approximating the population means bytheir sample analogues, we get

1

T

TXt=1

(xt (yt − xtβ)) = 0

1

T

TXt=1

xtyt = β1

T

TXt=1

x2t

β =1T

PTt=1 xtyt

1T

PTt=1 x

2t

Page 21: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• But we can also write another moment condition:E¡x2tεt

¢= 0

• Then, as aboveE¡x2t (yt − xtβ)

¢= 0

β =E¡x2t yt

¢E (x3t )

• Therefore, using sample moments to approximatepopulation moments, we get

β2 =1T

PTt=1 x

2t yt

1T

PTt=1 x

3t

• We can also useE (g(xt)εt) = 0

for some function g(.). Note: You should also beable to show that E (xtεt) = 0 implies E (g(xt)εt) =0. Then, for a known function g(.)

βg =1T

PTt=1 g(xt)yt

1T

PTt=1 g(xt)xt

Page 22: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Oupss! Problem. We have one parameter, β, butthree possible estimators

β =1T

PTt=1 xtyt

1T

PTt=1 x

2t

→p β

β2 =1T

PTt=1 x

2t yt

1T

PTt=1 x

3t

→p β

βg =1T

PTt=1 g(xt)yt

1T

PTt=1 g(xt)xt

→p β

• Which one do we choose?• Result: Under some very restrictive assumptions(i.e. exogeneity of xt, homoskedasticity, uncorre-lated εt, etc), the OLS is the best linear unbiasedestimator (BLUE).

• In other words, in has the smallest variance amongall linear unbiased estimators.

• However, who knows if those assumptions aresatisfied. In all likelihood, they are not.

• Q: Can we stack all the moments in a vector as

E(g (x|β))3x1

= E

xtεx2tεg(xt)εt

= 0and choose the value of β that satisfies the threesample moments?

Page 23: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• A: Off course, not! Three equations, potentiallynonlinear, with only one unknown....Who knowshow many solutions there are, if any.

• But, we can construct a quadratic function, as:E

µg (x|β)01x3

W3x3g (x|β)3x1

¶= 0

for some symmetric positive definite matrixW.• Now, we have the information into the threeequations, weighted by the elements of the matrixW.

• Problem: What matrixW to choose?• A: Any symmetric positive matrix will give usconsistent estimates (i.e. βW →p β), but we areconcerned with efficiency, or smallest possiblestandard errors around βW.

Page 24: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• ENDOGENEITY: Instrumental Variables (IV) andGMM.

• By construction, we had E (ε|x) = 0 implied thatE (εx) = 0. In other words, the residuals and theexplanatory variables are uncorrelated.

• However, in structural models, it is often thecase that we want to run regressions when thisrequirement is not satisfied. For example:

FirmV aluet = α + βDebtt + εt

• But it is not reasonable to assume that Debt is anexogenous variable. For example, new (relativelylow Firm Value) firms do not have access to debt.Indeed, we might try to run the opposite regression:

Debtt = δ + ζFirmV aluet + υt

• So, hereE (Debttεt) = E ((δ + ζFirmV aluet + υt) εt)

6= 0

• Q: If E (Debttεt) 6= 0, can we still have β →p β?

• Breaking the E (Debttεt) = 0 condition is thecardinal sin in empirical work!!!

• Q: What to do?• Note: We can argue that most equations in financesuffer from this endogeneity problem.

Page 25: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Well, we can look for a variable, Zt which is:– Correlated with Debtt (it proxies for Debtt )– Is uncorrelated with εt.

• If such a variable exists, then we can write themoment condition:

E (Ztεt) = 0

• Or, if we look at the sample moments, then1

T

TXt=1

Ztεt =1

T

TXt=1

Zt (yt − βxt)

1

T

TXt=1

Ztyt = β1

T

TXt=1

Ztxt

βIV =1T

PTt=1Ztyt

1T

PTt=1Ztxt

=1T

PTt=1Zt (βxt + εt)1T

PTt=1Ztxt

= β +1T

PTt=1Ztεt

1T

PTt=1Ztxt

• Then we can show that βIV →p β, whereas βolsdoes not.

• This estimator was motivated from GMM.

Page 26: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• All this is great, but how do we choose the instru-ment Zt

• This is usually the big question.• Usually, Zt = Debtt−k, because

E (Ztεt) = E (Debtt−kεt)= E ((δ + ζFirmV aluet−k + υt−k) εt)

Page 27: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

ARCH/GARCH Estimation using GMM:• Recall the model:

yt = xtβ + utu2t = ht + wt = ζ + au2t−1 + wt

• The moment conditions are:E (utxt) = E ((yt − xtβ)xt) = 0 (1)

E

µu2t −

ζ

1− α

¶= 0

E (wtzt) = E¡¡u2t − ζ + au2t−1

¢zt¢= 0

• Note: 3 moments and three parameters (β, ζ, α).• Replace those conditions by their sample ana-logues:

1

T

TXt=1

((yt − xtβ)xt)³u2t − ζ

1−α´¡¡

u2t − ζ + au2t−1¢zt¢ = 0

• Solve for (β, ζ, α).

Page 28: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• GMM is a very powerful way of looking at anestimation problem.

• All we need is a moment condition that holds.• The problem does not have to be linear.• No distributional assumptions are needed.• We can use GMM to estimate– The non-linearized version of the ConsumptionCAPM.

– Nonlinear process, such as ARCH, GARCH,etc.– Interesting interest rate models (Chan et. al(1992)).

Page 29: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

4 GMM–Formal Treatment, Tests, Use andMissuse

• We start the estimation from an “orthogonality”condition:

E (h (wt; θ0)) = 0

where– h (wt; θ) is a r dimensional vector of momentconditions, which depends on the data on someunknown parameters to be estimated.

– The parameters are collected in vector θ ofdimension a, where a ≤ r. The true value of θ isdenoted by θ0.

– Note that h (., .) is a random variable.• The “Method of Moments” principle states that wecan estimate parameters by working with samplemoments instead of population moments (Why?).

• Therefore, instead of working withE (h (wt; θ0)) = 0

which we cannot evaluate (Why?), we work withits sample analogue:

g (wt; θ) =1

T

TXt=1

h (wt; θ)

Page 30: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Example: OLS yt = βxt + εtE (xtεt) = 0

E¡x2tε¢= 0

E¡x3tεt

¢= 0

• Here, we have three moment conditions (r = 3),and one parameter to estimate (a = 1).

• You can think of h (wt; θ) = xt (yt − βxt)x2t (yt − βxt)x3t (yt − βxt)

,wt = (yt, xt) and θ = β.

• We will work with the sample analogues

g (wt; θ) =1

T

TXt=1

xt (yt − βxt)x2t (yt − βxt)x3t (yt − βxt)

• Note, that from the e... theorem, we have

g (wt; θ)→p E (h (wt; θ))

Page 31: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Since there might be more moment conditionsthan parameters to estimate, we will work with thequadratic

Q = g (wt; θ)1xr

0WTrxrg (wt; θ)

rx1

whereWT is a positive definite matrix that dependson the data.

• The above quadratic can be minimized with respectto θ using analytic or numerical methods (dependingon the complexity of h).

• It would be “logical” to put more weight on momentswhose variance is smaller. Therefore, we want thematrixWT to be inversely related to V ar (h (wt; θ)) ,orWT = V ar (h (wt; θ))

−1 .

• Before we pose the problem, we note that theweighing matrix V ar (h (wt; θ))−1 does not take intoaccount the dependence in the data. Therefore, wewill work with

Γj = E (h (wt; θ)h (wt−j; θ))

S =∞Xj=0

Γj

• The matrix S takes into account the dependence inthe data.

Page 32: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• It turns out that we can prove (CLT with seriallydependent data)√

T (g (wt; θ0)) ∼a N(0, S)• Note that if Γj = 0, j ≥ 1 (serially independent data),then S = V ar (h (wt; θ)) = E (h (wt; θ)h (wt; θ)) .

• Finally we will letWT = S−1T

• Therefore, the problem is:Q = g (wt; θ)

1xr

0S−1Trxr

g (wt; θ)rx1

• The FOC is:½∂g

∂θ0|θ=θ¾

axr

0S−1Trxr

g³wt; θ

´rx1

= 0ax1

• So, what are the properties of θ?

Page 33: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Denote

D0T

rxa=

∂g1(wt;θ)

∂θ0

∂gr(wt;θ)∂θ0

• We will show that√

T³θ − θ0

´∼a N

³0,¡DS−1D0

¢−1´

Page 34: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• The “proof”’ follows a few very simple steps– Use the Mean-Value theorem, to write

g³wt; θ

´= g (wt; θ0) +D

0T

³θ − θ0

´– Pre-multiply both sides by

n∂g∂θ0 |θ=θ

o0S−1T to get½

∂g

∂θ0|θ=θ¾0S−1T g

³wt; θ

´=

½∂g

∂θ0|θ=θ¾0S−1T g (wt; θ0) +

½∂g

∂θ0|θ=θ¾0S−1T D

0T

³θ − θ0

´– But, we know by definition that½

∂g

∂θ0|θ=θ¾0S−1T g

³wt; θ

´= 0

Page 35: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

– Or½∂g

∂θ0|θ=θ¾0S−1T g (wt; θ0) = −

½∂g

∂θ0|θ=θ¾0S−1T D

0T

³θ − θ0

´– Rearranging, we get³

θ − θ0´

ax1

= D−1Taxr

g (wt; θ0)rx1

– Then, √T³θ − θ0

´= D−1T

√Tg (wt; θ0)

– But, recall that√T (g (wt; θ0)) ∼a N(0, S)

– Hence,√T³θ − θ0

´∼ aN(0, D−1T SD

−10T )

∼ aN

µ0,³DTS

−1D0T

´−1¶

Page 36: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Final Result: The GMM estimates are asymp-totically normally distributed with a variance-covariance matrix equal to

¡DTS

−1D0T

¢−1/T

• This is a huge result. All we needed was a set ofmoment conditions, nothing else!

• We also need the data wt to be stationary.• Many “standard” problems can be written as GMM.• The real power of GMM is that one framework canhandle a lot of interesting problems.

Page 37: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

It should be immediately obvious that the numberof orthogonality conditions and the• In practice, the “type” of conditioning informationwill have a great impact on the estimates θ. Thinkinstrumental variables.

• The question is: Which moments to choose?• This is quite discomforting. If slight variations in ourproblem yields widely different estimates of θ, whatcan we conclude?

Page 38: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• Also: Estimating the matrix S makes a hugedifference. Recall that

Γj = E (h (wt; θ)h (wt−j; θ))

S =∞Xj=0

Γj

• Using sample analogues to obtain Γj and S isnot the right way. Newey and West (1987) haveproposed a “corrected” way, which is:

S = Γ0 +

qXv=1

µ1− v

q + 1

¶³Γj + Γ0j

´• Even with this Newey-West method does not yieldgood results when the dimension of the system islarge.

• Moreover, the truncation point, q, introducesanother source of error.

Page 39: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• People have shown that small perturbations in Sresults in big differences in the estimates θ. In otherwords, suppose we use a matrix

S + P

where P has small values on its diagonals(perturbing the variances only). This results inwidely different estimates. So, small differences inestimating S matter a lot.

• The mechanics of why this is so reside in takinginverses...

• Since we only need the optimal weighing matrixS for efficiency (smallest variance), is it possibleto find a matrix that, although not yielding efficientestimates, yields robust estimates?

• In practice: The best (most robust) results areobtained with I, the identity matrix.

• Empirical rule of thumb: Try the identity matrix first.Then, try the optimal weighing matrix, S. If theresults are widely different, stick with I.

Page 40: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

In his 1982 seminal paper, L.P. Hansen arguedthat the multiplicity of the moments, or the over-identifications, are an advantage, rather than adisadvantage.• Even though we cannot have g

³θ, wt

´= 0, it must

be the case that at, and close to, the true value θ0,g() will be close to zero.

• Note that, since√T (g (wt; θ0)) ∼a N(0, S)

then, it must be the case thatTg (wt; θ0)

0 S−1g (wt; θ0) ∼a????• It might be conjectured that the same would holdtrue for θ, or that

Tg³wt; θ

´0S−1g

³wt; θ

´∼a χ2 (r)

• However, this intuition is false, because it isnot necessarily the same linear combination ofg³wt; θ

´that would be close to zero. Instead, it can

be shown thatTg³wt; θ

´0S−1g

³wt; θ

´∼a χ2 (r − a)

• Note: This statistic is trivial to estimate. Plug θ intog (.), etc.

Page 41: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• The testJ = Tg

³wt; θ

´0S−1g

³wt; θ

´called the rank test, the test for over-identifyingrestrictions, Hansen’s J test, etc. has been usedextensively in finance.

• Indeed, people have relied exclusively on this testto judge the fit of their models.

• Problems:– As discussed above, the over-identifying re-strictions are subject to the “which moments”critique.

– The J test also depends crucially on S, whichcannot be estimated accurately.

• Not surprisingly, the J test rejects a lot of models.But, people are now aware of its deficiencies.

Page 42: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• The GMM framework is rich enough that we canthink of many other ways of testing the hypothesesof interest. As a starting point, we can break theorthogonality restrictions into those that identify andthose that over-identify the parameters

E

µh1 (wt; θ0)h2 (wt; θ0)

¶=

·00

¸• Some people have suggested to see how the esti-mates would change as we add more restrictions toh2 (say, starting from no over-identifying restrictions,and adding progressively).

• This set-up has also yielded insights into thestability properties of the moments and (or versus)the estimates.

Page 43: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

Realistic Example: Chan, Karolyi, Longstaff, andSanders (1992, JF):• Motivation: Estimate a “very” general process forthe short rate.

• Q: Why?• A: Because, many models in finance have to specifya short rate. Usually they specify the short rateprocess that is most convenient, i.e. the short rateprocess that allows them to derive nice, simple,closed-form results.

• But one has to realize that the results from thosemodels are conditional upon having the rightshort-rate process.

Page 44: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

The proposed process for the short rate is:dr = (α + βr) dt + σrγdZ

• The short rate is written in continuous time to relateit to most finance models.

• The estimation is done in discrete time.• The discretization is done by taking that dt is onemonth.

• No other adjustments are made.

Page 45: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

Aside:• The above problems assume a particular form ofthe variance process. They are called “stochasticvolatility” models (when γ 6= 0).

• Note the difference in modelling. Recall theARCH/GARCH literature.

• In ARCH/GARCH, the focus is on the secondconditional moment.

• In the “stochastic volatility” models, the entireprocess is modelled and the variance depends onthe level of the process.

• In practice and in implementation, there is littledifference.

Page 46: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

Implementation:• Monthly Tbill interest rate, June 1964-December1989.

• The short rate is:∆rt+1 = α + βrt + εt+1or rt+1 = α + (1 + β) rt + εt+1E¡ε2t+1

¢= σ2r2γt

• Note that while the continuous time processesassumed that dZt is normally distributed, this ispurely for tractability.

• In econometrics, we don’t need to make surean assumption. The shock εt+1 can have anydistribution.

• The parameters to be estimated are: α, β, γ, andσ2.

Page 47: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

GMM• Specify the moment conditions (We need at least 4moment conditions. Why?)

•E (h (wt; θ)) = 0

E

εt+1εt+1rt

ε2t+1 − σ2r2γt³ε2t+1 − σ2r2γt

´rt

= 0

where wt = (rt−1, rt) , θ =¡α,β, γ,σ2

¢.

• The sample moment is:

g (wt; θ) =1

T

TXt=1

εt+1εt+1rt

ε2t+1 − σ2r2γt³ε2t+1 − σ2r2γt

´rt

• We are done! For the rest, follow the algorithm.• The same algorithm applies for all problems thatcan be written as a moment condition.

Page 48: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

We could also proceed in the following way:• Regress rt on rt−1 using OLS. Get estimates α, β,and εt.

• Estimate σ2 and γ using non-linear least squares:

minσ2,γ

TXt=1

³ε2t+1 − σ2r2γt

´2• Would we obtain the same answer?• What method is preferable in theory?• What method is preferable in practice?

Page 49: 1. ARCH/GARCH - Rady School of Management · •Another useful model to estimate is the IGARCH model, or integrated GARCH • The IGARCH(1,1) is a GARCH(1,1) where δ+α=1 • If

• A good project: Redo the Chan et al. (1992) paperwith data up to 2001 and with monthly and dailydata. Compare the results.