Part 14: Generalized Regression14-1/46
Econometrics IProfessor William Greene
Stern School of Business
Department of Economics
Part 14: Generalized Regression14-2/46
Econometrics I
Part 14 – Generalized Regression
Part 14: Generalized Regression14-3/46
Part 14: Generalized Regression14-4/46
Generalized Regression Model
Setting: The classical linear model assumes that E[] = Var[] = 2I. That is, observations are uncorrelated and all are drawn from a distribution with the same variance. The generalized regression (GR) model allows the variances to differ across observations and allows correlation across observations.
Part 14: Generalized Regression14-5/46
Implications The assumption that Var[] = 2I is used to derive the
result Var[b] = 2(XX)-1. If it is not true, then the use of s2(XX)-1 to estimate Var[b] is inappropriate.
The assumption was used to derive most of our test statistics, so they must be revised as well.
Least squares gives each observation a weight of 1/n. But, if the variances are not equal, then some observations are more informative than others.
Least squares is based on simple sums, so the information that one observation might provide about another is never used.
Part 14: Generalized Regression14-6/46
GR Model The generalized regression model:
y = X + , E[|X] = 0, Var[|X] = 2. Regressors are well behaved. We consider some
examples Trace = n. (This is a normalization with no content.)
Leading Cases Simple heteroscedasticity Autocorrelation Panel data and heterogeneity more generally.
Part 14: Generalized Regression14-7/46
Least Squares Still unbiased. (Proof did not rely on ) For consistency, we need the true variance of b,
Var[b|X] = E[(b-β)(b-β)’|X] = (X’X)-1 E[X’εε’X] (X’X)-1 = 2 (X’X)-1 XX (X’X)-1 .
Divide all 4 terms by n. If the middle one converges to a finite matrix of constants, we have the result, so we need to examine
(1/n)XX = (1/n)ij ij xi xj.
This will be another assumption of the model. Asymptotic normality? Easy for heteroscedasticity case, very difficult
for autocorrelation case.
Part 14: Generalized Regression14-8/46
Robust Covariance Matrix Robust estimation: Generality How to estimate Var[b|X] = 2 (X’X)-1 XX (X’X)-1 for
the LS b? The distinction between estimating
2 an n by n matrix
and estimating the KxK matrix
2 XX = 2 ijij xi xj NOTE…… VVVIRs for modern applied econometrics.
The White estimator Newey-West.
Part 14: Generalized Regression14-9/46
The White Estimatorn1 2 1
i i ii 1
n 2i2 i 1
2i
i i2
12
2 2
Est.Var[ ] ( ) e ( )
eUse ˆ
nne ˆ ˆ = , =diag( ) note tr( )=nˆ ˆˆ
ˆˆEst.Var[ ]n n n n
ˆDoes ˆ
n n
b X'X x x' X'X
Ω Ω
X'X X'ΩX X'Xb
X'ΩX X'ΩX
0?
Part 14: Generalized Regression14-10/46
Groupwise Heteroscedasticity
Regression of log of per capita gasoline use on log of per capita income, gasoline price and number of cars per capita for 18 OECD countries for 19 years. The standard deviation varies by country. The “solution” is “weighted least squares.”
Countries are ordered by the standard deviation of their 19 residuals.
Part 14: Generalized Regression14-11/46
White Estimator+--------+--------------+----------------+--------+--------+----------+|Variable| Coefficient | Standard Error |t-ratio |P[|T|>t]| Mean of X|+--------+--------------+----------------+--------+--------+----------+ Constant| 2.39132562 .11693429 20.450 .0000 LINCOMEP| .88996166 .03580581 24.855 .0000 -6.13942544 LRPMG | -.89179791 .03031474 -29.418 .0000 -.52310321 LCARPCAP| -.76337275 .01860830 -41.023 .0000 -9.04180473| White heteroscedasticity robust covariance matrix |+----------------------------------------------------+ Constant| 2.39132562 .11794828 20.274 .0000 LINCOMEP| .88996166 .04429158 20.093 .0000 -6.13942544 LRPMG | -.89179791 .03890922 -22.920 .0000 -.52310321 LCARPCAP| -.76337275 .02152888 -35.458 .0000 -9.04180473
Part 14: Generalized Regression14-12/46
Autocorrelated ResidualslogG=β1 + β2logPg + β3logY + β4logPnc + β5logPuc + ε
Part 14: Generalized Regression14-13/46
Newey-West Estimator
n 20 i i ii 1
L n
1 l t t l t t l t l tl 1 t l 1
l
Heteroscedasticity Component - Diagonal Elements
1e
nAutocorrelation Component - Off Diagonal Elements
1we e ( )
nl
w 1 = "Bartlett weight"L 1
1Est.Var[ ]=
n
S x x'
S x x x x
Xb
1 1
0 1[ ]n n
'X X'XS S
Part 14: Generalized Regression14-14/46
Newey-West Estimate--------+-------------------------------------------------------------Variable| Coefficient Standard Error t-ratio P[|T|>t] Mean of X--------+-------------------------------------------------------------Constant| -21.2111*** .75322 -28.160 .0000 LP| -.02121 .04377 -.485 .6303 3.72930 LY| 1.09587*** .07771 14.102 .0000 9.67215 LPNC| -.37361** .15707 -2.379 .0215 4.38037 LPUC| .02003 .10330 .194 .8471 4.10545--------+---------------------------------------------------------------------+-------------------------------------------------------------Variable| Coefficient Standard Error t-ratio P[|T|>t] Mean of XRobust VC Newey-West, Periods = 10--------+-------------------------------------------------------------Constant| -21.2111*** 1.33095 -15.937 .0000 LP| -.02121 .06119 -.347 .7305 3.72930 LY| 1.09587*** .14234 7.699 .0000 9.67215 LPNC| -.37361** .16615 -2.249 .0293 4.38037 LPUC| .02003 .14176 .141 .8882 4.10545--------+-------------------------------------------------------------
Part 14: Generalized Regression14-15/46
Generalized Least SquaresA transformation of the model: P = -1/2. P’P = -1
Py = PX + P or y* = X* + *.Why? E[**’|X*]= PE[’|X*]P’ = PE[’|X]P’ = σ2PP’ = σ2 -1/2 -1/2 = σ2 0
= σ2I
Part 14: Generalized Regression14-16/46
Generalized Least SquaresAitken theorem. The Generalized Least Squares
estimator, GLS.
Py = PX + P or
y* = X* + *. E[**’|X*]= σ2I
Use ordinary least squares in the transformed model. Satisfies the Gauss – Markov theorem.
b* = (X*’X*)-1X*’y*
Part 14: Generalized Regression14-17/46
Generalized Least Squares
Efficient estimation of and, by implication, the inefficiency of least squares b.
= (X*’X*)-1X*’y* = (X’P’PX)-1 X’P’Py = (X’Ω-1X)-1 X’Ω-1y
≠ b. is efficient, so by construction, b is not.
β̂
β̂ β̂
Part 14: Generalized Regression14-18/46
Asymptotics for GLSAsymptotic distribution of GLS. (NOTE. We apply
the full set of results of the classical model to the transformed model.
Unbiasedness
Consistency - “well behaved data”
Asymptotic distribution
Test statistics
Part 14: Generalized Regression14-19/46
Unbiasedness
1
1
1
ˆ ( )
( )
ˆE[ ] ( ) E[ | ]
= if E[ | ]
-1 -1
-1 -1
-1 -1
β X'Ω X X'Ω y
β X'Ω X X'Ω ε
β| X =β X'Ω X X'Ω ε X
β ε X 0
Part 14: Generalized Regression14-20/46
Consistency
12
n
i ii 1ii
Use Mean Square
ˆVar[ | ]= ?n n
Requires to be "well behaved"n
Either converge to a constant matrix or diverge.
Heteroscedasticity case:
1 1n n
Autocorrelatio
-1
-1
-1
X'Ω Xβ X 0
X'Ω X
X'Ω Xx x'
n n 2i ji 1 j 1
ij
n case:
1 1. n terms. Convergence is unclear.
n n
-1X'Ω X
x x '
Part 14: Generalized Regression14-21/46
Asymptotic Normality
111
11
1
' 1ˆn( ) n 'n n
Converge to normal with a stable variance O(1)?
' a constant matrix?
n
1' a mean to which we can apply the
n central limit theorem?
Het
X Ω Xβ β X Ω ε
X Ω X
X Ω ε
n1 2i i i ii 1
i i i i
i i
eroscedasticity case?
1 1' . Var , is just data.
n n
Apply Lindeberg-Feller. (Or assuming / is a draw from a common
distribution with mean and fixed va
x xX Ω ε =
x
riance - some recent treatments.)
Autocorrelation case?
Part 14: Generalized Regression14-22/46
Asymptotic Normality (Cont.)
n n1i j i ii 1 j 1
1
For the autocorrelation case
1 1'
n n
Does the double sum converge? Uncertain. Requires elements
of to become small as the distance between i and j increases.
(Has to resem
ijX Ω ε = Ω x x
Ω
ble the heteroscedasticity case.)
Part 14: Generalized Regression14-23/46
Test Statistics (Assuming Known Ω) With known Ω, apply all familiar results to the
transformed model: With normality, t and F statistics apply to least
squares based on Py and PX With asymptotic normality, use Wald statistics
and the chi-squared distribution, still based on the transformed model.
Part 14: Generalized Regression14-24/46
Generalized (Weighted) Least SquaresHeteroscedasticity Case
1
22 2
n
1
-1/2 2
n
1n n1
i ii 1 i 1i i
i i
i2
0 ... 0
0 ... 0Var[ ]
0 0 ... 0
0 0 ...
1/ 0 ... 0
0 1/ ... 0
0 0 ... 0
0 0 ... 1/
1 1ˆ ( ) ( ) y
ˆy
ˆ
-1 -1i iβ X'Ω X X'Ω y x x x
xβ2
n
i 1
n K
Part 14: Generalized Regression14-25/46
Autocorrelation t = t-1 + ut (‘First order autocorrelation.’ How does this come
about?) Assume -1 < < 1. Why?ut = ‘nonautocorrelated white noise’t = t-1 + ut (the autoregressive form) = (t-2 + ut-1) + ut
= ... (continue to substitute)= ut + ut-1 + 2ut-2 + 3ut-3 + ...= (the moving average form)
Part 14: Generalized Regression14-26/46
Autocorrelation
2t t t 1 t 1
it ii=0
22i 2 u
u 2i=0
t t 1 t t 1 t
2t t 1 t t 1 t
Var[ ] Var[u u u ...]
= Var u
=1
An easier way: Since Var[ ] Var[ ] and u
Var[ ] Var[ ] Var[u ] 2 Cov[ ,u ]
=
2 2t u
2u
2
Var[ ]
1
Part 14: Generalized Regression14-27/46
Autocovariances
t t 1 t 1 t t 1
t 1 t 1 t t 1
t-1 t
2u
2
t t 2 t 1 t t 2
Continuing...
Cov[ , ] = Cov[ u , ]
= Cov[ , ] Cov[u , ]
= Var[ ] Var[ ]
=(1 )
Cov[ , ] = Cov[ u , ]
t 1 t 2 t t 2
t t 1
2 2u2
= Cov[ , ] Cov[u , ]
= Cov[ , ]
= and so on.(1 )
Part 14: Generalized Regression14-28/46
Autocorrelation Matrix
2 1
22
2 2 32
1 2 3
1
1
11
1
(Note, trace = n as required.)
Ω
Ω
T
T
u T
T T T
Part 14: Generalized Regression14-29/46
Generalized Least Squares
2
1/ 2
21
2 11/ 2
3 2
T T 1
1 0 0 ... 0
1 0 ... 0
0 1 ... 0
... ... ... ... ...
0 0 0 0
1 y
y y
y y
...
y
Ω
Ω y=
Part 14: Generalized Regression14-30/46
The Autoregressive Transformation
t t t t 1 t
t 1 t 1
t t 1 t t 1
t t 1 t
y u
y
y y ( ) + ( )
y y ( ) + u
(Where did the first observation go?)
t
t-1
t t-1
t t-1
x 'β
x 'β
x x 'β
x x 'β
Part 14: Generalized Regression14-31/46
Unknown The problem (of course), is unknown. For now, we
will consider two methods of estimation: Two step, or feasible estimation. Estimate first, then do
GLS. Emphasize - same logic as White and Newey-West. We don’t need to estimate . We need to find a matrix that behaves the same as (1/n)X-1X.
Properties of the feasible GLS estimator Maximum likelihood estimation of , 2, and all at the
same time. Joint estimation of all parameters. Fairly rare. Some
generalities… We will examine two applications: Harvey’s model of
heteroscedasticity and Beach-MacKinnon on the first order autocorrelation model
Part 14: Generalized Regression14-32/46
Specification must be specified first. A full unrestricted contains n(n+1)/2 - 1 parameters.
(Why minus 1? Remember, tr() = n, so one element is determined.)
is generally specified in terms of a few parameters. Thus, = () for some small parameter vector . It becomes a question of estimating .
Examples:
Part 14: Generalized Regression14-33/46
Harvey’s Model of Heteroscedasticity
Var[i | X] = 2 exp(zi)
Cov[i,j | X] = 0
e.g.: zi = firm size
e.g.: zi = a set of dummy variables (e.g., countries) (The groupwise heteroscedasticity model.)
[2 ] = diagonal [exp( + zi)],
= log(2)
Part 14: Generalized Regression14-34/46
AR(1) Model of Autocorrelation
2 1
22
2 2 32
1 2 3
1
1
11
1
Ω
T
T
u T
T T T
Part 14: Generalized Regression14-35/46
Estimated AR(1) Model----------------------------------------------------------------------AR(1) Model: e(t) = rho * e(t-1) + u(t)Initial value of rho = .87566Maximum iterations = 1Method = Prais - WinstenIter= 1, SS= .022, Log-L= 127.593Final value of Rho = .959411Std. Deviation: e(t) = .076512Std. Deviation: u(t) = .021577Autocorrelation: u(t) = .253173N[0,1] used for significance levels--------+-------------------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] Mean of X--------+-------------------------------------------------------------------------Constant| -20.3373*** .69623 -29.211 .0000 FGLS LP| -.11379*** .03296 -3.453 .0006 3.72930 LY| .87040*** .08827 9.860 .0000 9.67215 LPNC| .05426 .12392 .438 .6615 4.38037 LPUC| -.04028 .06193 -.650 .5154 4.10545 RHO| .95941*** .03949 24.295 .0000--------+-------------------------------------------------------------------------Constant| -21.2111*** .75322 -28.160 .0000 OLS LP| -.02121 .04377 -.485 .6303 3.72930 LY| 1.09587*** .07771 14.102 .0000 9.67215 LPNC| -.37361** .15707 -2.379 .0215 4.38037 LPUC| .02003 .10330 .194 .8471 4.10545
Part 14: Generalized Regression14-36/46
Two Step EstimationThe general result for estimation when is
estimated.
GLS uses [X-1X]X -1 y which converges in probability to .
We seek a vector which converges to the same thing that this does. Call it “Feasible GLS” or FGLS, based on [X X]X y
The object is to find a set of parameters such that
[X X]X y - [X -1 X]X -1 y 0
ˆ -1Ω ˆ -1Ω
ˆ -1Ω ˆ -1Ω
Part 14: Generalized Regression14-37/46
Feasible GLS
For FGLS estimation, we do not seek an estimator of
such that
ˆ
ˆThis makes no sense, since is nxn and does not "converge" to
anything. We seek a matrix such that
Ω
Ω- Ω 0
Ω
Ω
ˆ (1/n) (1/n)
For the asymptotic properties, we will require that
ˆ (1/ n) (1/n)
Note in this case, these are two random vectors, which we require
to converge
-1 -1
-1 -1
X'Ω X - X'Ω X 0
X'Ω - X'Ω 0
to the same random vector.
Part 14: Generalized Regression14-38/46
Two Step FGLS
VVIR (Theorem 9.6) To achieve full efficiency, we do not need an efficient estimate of the parameters in , only a consistent one.
Part 14: Generalized Regression14-39/46
Harvey’s Model
Examine Harvey’s model once again. Methods of estimation:
Two step FGLS: Use the least squares residuals to estimate , then use
Full maximum likelihood estimation. Estimate all parameters simultaneously.
A handy result due to Oberhofer and Kmenta - the “zig-zag” approach. Iterate back and forth between and .
11 1ˆ̂ ˆ ˆ
X X X y
Part 14: Generalized Regression14-40/46
Harvey’s Model for Groupwise Heteroscedasticity
Groupwise sample, yig, xig,…
N groups, each with Ng observations.
Var[εig] = σg2
Let dig = 1 if observation i,g is in group g, 0 else.
= group dummy variable.Var[εig] = σg
2 exp(θ2d2 + … θGdG)Var1 = σg
2 , Var2 = σg2 exp(θ2) and so on.
Part 14: Generalized Regression14-41/46
Estimating Variance Components OLS is still consistent: Est.Var1 = e1’e1/N1 estimates σg
2
Est.Var2 = e2’e2/N2 estimates σg2 exp(θ2)
Estimator of θ2 is ln[(e2’e2/N2)/(e1’e1/N1)] (1) Now use FGLS – weighted least squares Recompute residuals using WLS slopes (2) Recompute variance estimators Iterate to a solution… between (1) and (2)
Part 14: Generalized Regression14-42/46
Baltagi and Griffin’s Gasoline DataWorld Gasoline Demand Data, 18 OECD Countries, 19 yearsVariables in the file are
COUNTRY = name of country YEAR = year, 1960-1978LGASPCAR = log of consumption per carLINCOMEP = log of per capita incomeLRPMG = log of real price of gasoline LCARPCAP = log of per capita number of cars
See Baltagi (2001, p. 24) for analysis of these data. The article on which the analysis is based is Baltagi, B. and Griffin, J., "Gasolne Demand in the OECD: An Application of Pooling and Testing Procedures," European Economic Review, 22, 1983, pp. 117-137. The data were downloaded from the website for Baltagi's text.
Part 14: Generalized Regression14-43/46
Analysis of Variance
Part 14: Generalized Regression14-44/46
Least Squares First Step----------------------------------------------------------------------Multiplicative Heteroskedastic Regression Model...Ordinary least squares regression ............LHS=LGASPCAR Mean = 4.29624 Standard deviation = .54891 Number of observs. = 342Model size Parameters = 4 Degrees of freedom = 338Residuals Sum of squares = 14.90436Wald statistic [17 d.f.] = 699.43 (.0000) (Large)B/P LM statistic [17 d.f.] = 111.55 (.0000) (Large)Cov matrix for b is sigma^2*inv(X'X)(X'WX)inv(X'X)--------+-------------------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] Mean of X--------+-------------------------------------------------------------Constant| 2.39133*** .20010 11.951 .0000LINCOMEP| .88996*** .07358 12.094 .0000 -6.13943 LRPMG| -.89180*** .06119 -14.574 .0000 -.52310LCARPCAP| -.76337*** .03030 -25.190 .0000 -9.04180--------+-------------------------------------------------------------
Part 14: Generalized Regression14-45/46
Variance Estimates = log[e(i)’e(i)/T]
Sigma| .48196*** .12281 3.924 .0001 D1| -2.60677*** .72073 -3.617 .0003 .05556 D2| -1.52919** .72073 -2.122 .0339 .05556 D3| .47152 .72073 .654 .5130 .05556 D4| -3.15102*** .72073 -4.372 .0000 .05556 D5| -3.26236*** .72073 -4.526 .0000 .05556 D6| -.09099 .72073 -.126 .8995 .05556 D7| -1.88962*** .72073 -2.622 .0087 .05556 D8| .60559 .72073 .840 .4008 .05556 D9| -1.56624** .72073 -2.173 .0298 .05556 D10| -1.53284** .72073 -2.127 .0334 .05556 D11| -2.62835*** .72073 -3.647 .0003 .05556 D12| -2.23638*** .72073 -3.103 .0019 .05556 D13| -.77641 .72073 -1.077 .2814 .05556 D14| -1.27341* .72073 -1.767 .0773 .05556 D15| -.57948 .72073 -.804 .4214 .05556 D16| -1.81723** .72073 -2.521 .0117 .05556 D17| -2.93529*** .72073 -4.073 .0000 .05556
Part 14: Generalized Regression14-46/46
OLS vs. Iterative FGLS
Looks like a substantial gain in reduced standard errors--------+-------------------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] Mean of X--------+------------------------------------------------------------- |Ordinary Least Squares |Cov matrix for b is sigma^2*inv(X'X)(X'WX)inv(X'X)Constant| 2.39133*** .20010 11.951 .0000LINCOMEP| .88996*** .07358 12.094 .0000 -6.13943 LRPMG| -.89180*** .06119 -14.574 .0000 -.52310LCARPCAP| -.76337*** .03030 -25.190 .0000 -9.04180--------+------------------------------------------------------------- |Regression (mean) functionConstant| 1.56909*** .06744 23.267 .0000LINCOMEP| .60853*** .02097 29.019 .0000 -6.13943 LRPMG| -.61698*** .01902 -32.441 .0000 -.52310LCARPCAP| -.66938*** .01116 -59.994 .0000 -9.04180
Top Related