Heteroskedasticity ECON 4550 Econometrics Memorial University of Newfoundland Adapted from Vera...

ECON 4550 Econometrics Memorial University of Newfoundland

Adapted from Vera Tabakova’s notes

8.1 The Nature of Heteroskedasticity

8.2 Using the Least Squares Estimator

8.3 The Generalized Least Squares Estimator

8.4 Detecting Heteroskedasticity

1 2( )E y x

1 2( )i i i i ie y E y y x

1 2i i iy x e

Figure 8.1 Heteroskedastic Errors

2( ) 0 var( ) cov( , ) 0i i i jE e e e e

var( ) var( ) ( )i i iy e h x

ˆ 83.42 10.21i iy x

ˆ 83.42 10.21i i ie y x

Food expenditure example:

Figure 8.2 Least Squares Estimated Expenditure Function and Observed Data Points

The existence of heteroskedasticity implies:

The least squares estimator is still a linear and unbiased estimator, but

it is no longer best. There is another estimator with a smaller

variance.

The standard errors usually computed for the least squares estimator

are incorrect. Confidence intervals and hypothesis tests that use these

standard errors may be misleading.

21 2 var( )i i i iy x e e

var( )( )

21 2 var( )i i i i iy x e e

2 2 12 2

( )var( )

x xb w

2 2 12 2

ˆ( )ˆvar( )

x x eb w e

ˆ 83.42 10.21

(27.46) (1.81) (White se)

(43.41) (2.09) (incorrect se)

i iy x

White: se( ) 10.21 2.024 1.81 [6.55, 13.87]

Incorrect: se( ) 10.21 2.024 2.09 [5.97, 14.45]

We can use a robust estimator: regress food_exp income, robust

In SHAZAM the HETCOV option on the OLS command reports the White's heteroskedasticity-consistent standard errors

OLS FOOD INCOME / HETCOV

Confidence interval estimate using the White standard errors CONFID INCOME / TCRIT=2.024

But the White standard errors reported by SHAZAM have numeric differences compared to our textbook results.

There is a correction we could apply, but we will not worry about it for now

2( ) 0 var( ) cov( , ) 0

i i i i j

E e e e e

2 2var i i ie x

1i i i

i i i i

x x x x

1 i i i

i i i i i

i i i i

y x ey x x x e

x x x x

1 1 2 2i i i iy x x e

2 21 1var( ) var var( )i

i i ii ii

ee e x

To obtain the best linear unbiased estimator for a model with

heteroskedasticity of the type specified in equation (8.11):

1. Calculate the transformed variables given in (8.13).

2. Use least squares to estimate the transformed model given in (8.14).

The generalized least squares estimator is as a weighted least

squares estimator. Minimizing the sum of squared transformed errors

that is given by:

When is small, the data contain more information about the

regression function and the observations are weighted heavily.

When is large, the data contain less information and the

observations are weighted lightly.

22 1/2 2

( )N N N

ii i i

i i ii

ee x e

ˆ 78.68 10.45

(se) (23.79) (1.39)i iy x

2 2ˆ ˆse( ) 10.451 2.024 1.386 [7.65,13.26]ct

Food example again, where was the problem coming from?

regress food_exp income [aweight = 1/income]

ˆ 78.68 10.45

(se) (23.79) (1.39)i iy x

2 2ˆ ˆse( ) 10.451 2.024 1.386 [7.65,13.26]ct

Food example again, where was the problem coming from?

Specify a weight variable (SHAZAM works with the inverse) GENR W=1/INCOME OLS FOOD INCOME / WEIGHT=W 95% confidence interval, p. 205 CONFID INCOME / TCRIT=2.024

Note that

The residual statistics reported in a WLS regression (SIGMA**2, STANDARD ERROR OF THE ESTIMATE-SIGMA, and SUM OF SQUARED ERRORS-SSE) are all based on the transformed (weighted) residuals

You should remember when making model comparisons, that the high-variance observations are systematically underweighted by this procedure

This may be a good thing, if you want to avoid having these observations dominate the model selection comparisons.

But if you want model selection to be based on how well the alternative models fit the original (untransformed) data, you must base the model selection tests on the untransformed residuals.

2 2var( )i i ie x

2 2ln( ) ln( ) ln( )i ix

exp ln( ) ln( )

exp( )

21 2 2exp( )i i s iSz z

21 2ln( )i iz

1 2( )i i i i iy E y e x e

2 21 2ˆln( ) ln( )i i i i ie v z v

2ˆln( ) .9378 2.329i iz

21 1ˆ ˆˆ exp( )i iz

1i i i

i i i i

1 1var var( ) 1i

i ii i i

1ˆ ˆ ˆ

i ii i i

y xy x x

1 1 2 2i i i iy x x e

1 2 2i i k iK iy x x e

21 2 2var( ) exp( )i i i s iSe z z

The steps for obtaining a feasible generalized least squares estimator

for are:

1. Estimate (8.25) by least squares and compute the squares of

the least squares residuals .

2. Estimate by applying least squares to the equation

1 2, , , K

1 2, , , S

21 2 2ˆln i i S iS ie z z v

3. Compute variance estimates .

4. Compute the transformed observations defined by (8.23),

including if .

5. Apply least squares to (8.24), or to an extended version of

(8.24) if .

21 2 2ˆ ˆ ˆˆ exp( )i i S iSz z

3, ,i iKx x 2K

ˆ 76.05 10.63

(se) (9.71) (.97)iy x

For our food expenditure example:

gen z = log(income)regress food_exp incomepredict ehat, residualgen lnehat2 = log(ehat*ehat)regress lnehat2 z

* --------------------------------------------* Feasible GLS* --------------------------------------------predict sig2, xbgen wt = exp(sig2)regress food_exp income [aweight = 1/wt]

The HET command can be used for Maximum Likelihood Estimation of the model given in Equations (8.25) and (8.26), p. 207.

This method is an alternative estimation method to the GLS method discussed in the text (so the results will also be different):

HET FOOD INCOME (INCOME) / MODEL=MULT

9.914 1.234 .133 1.524

(se) (1.08) (.070) (.015) (.431)

WAGE EDUC EXPER METRO

1 2 3 1,2, ,Mi M Mi Mi Mi MWAGE EDUC EXPER e i N

1 2 3 1,2, ,Ri R Ri Ri Ri RWAGE EDUC EXPER e i N

1 9.914 1.524 8.39Mb

Using our wage data (cps2.dta):

2 2var( ) var( )Mi M Ri Re e

2 2ˆ ˆ31.824 15.243M R

9.052 1.282 .1346

6.166 .956 .1260

1Mi Mi Mi MiM

M M M M M

WAGE EDUC EXPER e

1,2, , Mi N

1Ri Ri Ri RiR

R R R R R

WAGE EDUC EXPER e

1,2, , Ri N

Feasible generalized least squares:

1. Obtain estimated and by applying least squares separately to

the metropolitan and rural observations.

3. Apply least squares to the transformed model

ˆ M ˆ R

ˆ when 1ˆ

ˆ when 0

1ˆ ˆ ˆ ˆ ˆ ˆ

i i i i iR

i i i i i i

WAGE EDUC EXPER METRO e

9.398 1.196 .132 1.539

(se) (1.02) (.069) (.015) (.346)

WAGE EDUC EXPER METRO

_cons -9.398362 1.019673 -9.22 0.000 -11.39931 -7.397408 metro 1.538803 .3462856 4.44 0.000 .8592702 2.218336 exper .1322088 .0145485 9.09 0.000 .1036595 .160758 educ 1.195721 .068508 17.45 0.000 1.061284 1.330157 wage Coef. Std. Err. t P>|t| [95% Conf. Interval]

Total 36081.2155 999 36.1173328 Root MSE = 5.1371 Adj R-squared = 0.2693 Residual 26284.1488 996 26.3897076 R-squared = 0.2715 Model 9797.0667 3 3265.6889 Prob > F = 0.0000 F( 3, 996) = 123.75 Source SS df MS Number of obs = 1000

(sum of wgt is 3.7986e+01). regress wage educ exper metro [aweight = 1/wt]

* --------------------------------------------* Rural subsample regression* --------------------------------------------regress wage educ exper if metro == 0 scalar rmse_r = e(rmse)scalar df_r = e(df_r)* --------------------------------------------* Urban subsample regression* --------------------------------------------regress wage educ exper if metro == 1 scalar rmse_m = e(rmse)scalar df_m = e(df_r)* --------------------------------------------* Groupwise heteroskedastic regression using FGLS* --------------------------------------------gen rural = 1 - metrogen wt=(rmse_r^2*rural) + (rmse_m^2*metro)regress wage educ exper metro [aweight = 1/wt]

STATA Commands:

Remark: To implement the generalized least squares estimators

described in this Section for three alternative heteroskedastic

specifications, an assumption about the form of the

heteroskedasticity is required. Using least squares with White

standard errors avoids the need to make an assumption about the

form of heteroskedasticity, but does not realize the potential

efficiency gains from generalized least squares.

8.4.1 Residual Plots

Estimate the model using least squares and plot the least squares

residuals.

With more than one explanatory variable, plot the least squares

residuals against each explanatory variable, or against , to see if

those residuals vary in a systematic way relative to the specified

variable.

8.4.2 The Goldfeld-Quandt Test

( , )2 2

ˆ M M R R

M MN K N K

2 2 2 20 0: against :M R M RH H

ˆ 31.8242.09

ˆ 15.243M

ˆ 31.8242.09

ˆ 15.243M

* --------------------------------------------* Goldfeld Quandt test* --------------------------------------------

scalar GQ = rmse_m^2/rmse_r^2scalar crit = invFtail(df_m,df_r,.05)scalar pvalue = Ftail(df_m,df_r,GQ)scalar list GQ pvalue crit

21ˆ 3574.8

22ˆ 12,921.9

ˆ 12,921.93.61

ˆ 3574.8F

For the food expenditure data

You should now be able to obtain this test statistic

And check whether it exceeds the critical value

Sort the data by income:

SORT INCOME FOOD / DESC

OLS FOOD INCOME

On the DIAGNOS command the CHOWONE= option reports the Goldfeld-Quandt test for heteroskedasticity (bottom of page 212) with a p-value for a one-sided test. The HET option reports the tests for heteroskedasticity reported on page 215.

DIAGNOS / CHOWONE=20 HET

SORT INCOME FOOD / DESC

OLS FOOD INCOME

On the DIAGNOS command the CHOWONE= option reports the Goldfeld-Quandt test for heteroskedasticity (bottom of page 212) with a p-value for a one-sided test. The HET option reports the tests for heteroskedasticity reported on page 215.

DIAGNOS / CHOWONE=20 HET

Of course, this option also computes the Chow test statistic for structural change (that is, tests the null of parameter stability in the two subsamples).

SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS N1 N2 SSE1 SSE2 CHOW PVALUE G-Q DF1 DF2 PVALUE 20 20 0.23259E+06 64346. 0.45855 0.636 3.615 18 18 0.005 CHOW TEST - F DISTRIBUTION WITH DF1= 2 AND DF2= 36

SHAZAM considers that the alternative hypothesis is smaller error variance in the second subset relative to the first subset.

Some authors present the alternative as larger variance in the second subset.

Goldfeld and Quandt recommend ordering the observations by the values of one of the explanatory variables.

This can be done with the SORT command in SHAZAM. The DESC option on the SORT command should be used if it is assumed that the variance is positively related to the value of the sort variable.

8.4.3 Testing the Variance Function

1 2 2( )i i i i K iK iy E y e x x e

2 21 2 2var( ) ( ) ( )i i i i S iSy E e h z z

1 2 2 1 2 2( ) exp( )i S iS i S iSh z z z z

21 2( ) exp ln( ) ln( )i ih z x

1 2 2 1 2 2( )i S iS i S iSh z z z z

1 2 2 1( ) ( )i S iSh z z h

: not all the in are zero

2 21 2 2var( ) ( )i i i i S iSy E e z z

2 21 2 2( )i i i i S iS ie E e v z z v

21 2 2i i S iS ie z z v

2 2 2( 1)SN R

8.4.3a The White Test

1 2 2 3 3( )i i iE y x x

2 22 2 3 3 4 2 5 3z x z x z x z x

8.4.3b Testing the Food Expenditure Example

4,610,749,441 3,759,556,169SST SSE

2 1 .1846SSE

2 2 40 .1846 7.38N R

2 2 40 .18888 7.555 -value .023N R p

whitetst

estat imtest, white

Further testing in SHAZAM

DIAGNOS / HET Will yield a battery of

heteroskedasticity tests using different specifications

SHAZAM for food exampleDIAGNOS\HET REQUIRED MEMORY IS PAR= 7 CURRENT PAR= 22480 DEPENDENT VARIABLE = FOOD 40 OBSERVATIONS REGRESSION COEFFICIENTS 10.2096426868 83.4160065402 HETEROSKEDASTICITY TESTS CHI-SQUARE D.F. P-VALUE TEST STATISTIC E**2 ON YHAT: 7.384 1 0.00658 E**2 ON YHAT**2: 7.549 1 0.00600 E**2 ON LOG(YHAT**2): 6.516 1 0.01069 E**2 ON LAG(E**2) ARCH TEST: 0.089 1 0.76544 LOG(E**2) ON X (HARVEY) TEST: 10.654 1 0.00110 ABS(E) ON X (GLEJSER) TEST: 11.466 1 0.00071 E**2 ON X TEST: KOENKER(R2): 7.384 1 0.00658 B-P-G (SSR) : 7.344 1 0.00673 E**2 ON X X**2 (WHITE) TEST: KOENKER(R2): 7.555 2 0.02288 B-P-G (SSR) : 7.514 2 0.02336

Same in SLR

51Principles of Econometrics, 3rd Edition

(8A.1)

1 2i i iy x e

2( ) 0 var( ) cov( , ) 0 ( )i i i i jE e e e e i j

2 2 i ib w e

E b E E w e

(8A.2)

var var

var cov ,

i i i j i ji j

w e w w e e

(8A.3)

2 2var( )i

(8B.2)

(8B.1)21 2 2i i S iS ie z z v

( ) / ( 1)

SST SSE SF

SSE N S

ˆ ˆ ˆ and N N

i ii i

SST e e SSE v

(8B.4)

(8B.3)2 2( 1)( 1)

/ ( ) S

SST SSES F

SSE N S

2var( ) var( )i i

SSEe v

(8B.5)2

2var( )i

SST SSE

(8B.6)

(8B.7)

24ˆ2 e

SST SSE

22 2 4

1var 2 var( ) 2 var( ) 2i

i i ee e

2 2 2 2

1ˆ ˆvar( ) ( )

SSTe e e

(8B.8)

SST SSE

Heteroskedasticity ECON 4550 Econometrics Memorial University of Newfoundland Adapted from Vera...

Documents

Transcript of Heteroskedasticity ECON 4550 Econometrics Memorial University of Newfoundland Adapted from Vera...

A CONDITIONAL-HETEROSKEDASTICITY-ROBUST CONFIDENCE ...

Dell 4550 Manual

Heteroskedasticity in nonmarket valuation

Spatial Heteroskedasticity and Autocorrelation Consistent ...yisun/Bandwidth_in_SHAC.pdf · 1 Introduction This paper studies spatial heteroskedasticity and autocorrelation consistent

Mixed Normal Conditional Heteroskedasticity

Chapter 5 Autoregressive Conditional Heteroskedasticity Modelsfaculty.utrgv.edu/diego.escobari/teaching/Econ8375/Handouts/chapt… · Chapter 5 Autoregressive Conditional Heteroskedasticity

IDENTIFICATION THROUGH HETEROSKEDASTICITY: …

Chapter 06 - Heteroskedasticity

Overheads ECON232 Heteroskedasticity Handout

Chapter 11 Heteroskedasticity 11.1 The Nature of ...econ446/wiley/Chapter11.pdf · Chapter 11 Heteroskedasticity 11.1 The Nature of Heteroskedasticity • In Chapter 3 we introduced

RobustInferenceinModelsIdentiﬁedvia Heteroskedasticity

Heteroskedasticity-robust estimation of means

Structural Vector Autoregressions with Heteroskedasticity ...

Structural Change Tests under Heteroskedasticity: Joint ...

Recurrent Conditional Heteroskedasticity - arXiv

Review of Probability Concepts ECON 4550 Econometrics Memorial University of Newfoundland Adapted from Vera Tabakova’s notes SECOND.

Chapter 8: Heteroskedasticity - Miami University

The Simple Linear Regression Model: Specification and Estimation ECON 4550 Econometrics Memorial University of Newfoundland Adapted from Vera Tabakova’s.

Heteroskedasticity and Errors in variables

Conditional Heteroskedasticity in Long Memory Model ...