Prediction in the Panel Data Model with Spatial Correlation

Badi H. Baltagi and Dong Li

Texas A&M University

Department of Economics

College Station, TX 77843-4228

(409) 845-7380

This version: May 1999

First version: February 1999

Keywords: Prediction, spatial correlation, panel data, cigarette demand

Abstract

This paper considers the problem of prediction in a panel data regression model with spatial autocorrelation. In particular, we consider a simple demand equation for cigarettes based on a panel of 46 states over the period 1963-1992. The spatial autocorrelation due to neighboring states and the individual heterogeneity across states are taken explicitly into account. We derive the best linear unbiased predictor for the random error component model with spatial correlation and compare the performance of several predictors of the states' demand for cigarettes for one year and five years ahead. The estimators whose predictions are compared include OLS, fixed effects ignoring spatial correlation, fixed effects with spatial correlation, random effects GLS ignoring spatial correlation, and random effects accounting for spatial correlation. Based on RMSE forecast performance, it is important to take into account spatial correlation and heterogeneity across the states.

1 Introduction

The econometrics of spatial models has focused mainly on estimation and hypothesis testing; see Anselin (1988), Anselin, Bera, Florax and Yoon (1996) and Anselin and Bera (1998), to mention a few. This paper focuses on prediction in spatial models based on panel data. In particular, we consider a simple demand equation for cigarettes based on a panel of 46 states over the period 1963-1992. The spatial autocorrelation due to neighboring states and the individual heterogeneity across states are taken explicitly into account. To see how spatial autocorrelation may arise in the demand for cigarettes, note that cigarette prices vary among states primarily because of variation in state taxes on cigarettes. For example, in 1988, state excise taxes ranged from 2 cents per pack in a producing state like North Carolina to 38 cents per pack in Minnesota. In 1997, these state taxes varied from a low of 2.5 cents per pack in Virginia to $1.00 per pack in Alaska and Hawaii. Since cigarettes can be stored and are easy to transport, these varying taxes result in casual smuggling across neighboring states. For example, while New Hampshire had a tax of 12 cents per pack on cigarettes in 1988, neighboring Massachusetts and Maine had taxes of 26 and 28 cents per pack, respectively. Border-effect purchases not explained in the demand equation can cause spatial autocorrelation among the disturbances.¹

¹ Alternatively, one can model this using spatially lagged regressors like the population density of neighboring states and the prices and incomes of neighboring states. In fact, Baltagi and Levin (1986) used the minimum price in neighboring states to capture border-effect purchases.

This paper models the demand for cigarettes as follows:

\[ y_{it} = x_{it}'\beta + \varepsilon_{it}, \qquad i = 1, \ldots, N; \quad t = 1, \ldots, T \tag{1} \]

where \(y_{it}\) denotes the real per capita sales of cigarettes by persons of smoking age (14 years and older), measured in packs per head. The explanatory variables include the average retail price of a pack of cigarettes, measured in real terms, and the real per capita disposable income of each state. All variables are expressed in logarithms, so the estimated coefficients represent elasticities. There are \(N = 46\) states and \(T = 30\) years. We use only the first 25 years for estimation and reserve the last 5 years for out-of-sample forecasts. For data sources, see Baltagi and Levin (1986); here, we extend their data by 12 years, from 1981 to 1992. The disturbance term follows an error component model with spatially autocorrelated residuals; see Anselin (1988, p. 152). The disturbance vector for time \(t\) is given by

\[ \varepsilon_t = \mu + \phi_t \tag{2} \]

where \(\varepsilon_t = (\varepsilon_{1t}, \ldots, \varepsilon_{Nt})'\), \(\mu = (\mu_1, \ldots, \mu_N)'\) denotes the vector of state effects, and \(\phi_t = (\phi_{1t}, \ldots, \phi_{Nt})'\) are the remainder disturbances, which are independent of \(\mu\). The \(\phi_t\)'s follow the spatial error dependence model

\[ \phi_t = \lambda W \phi_t + \nu_t \tag{3} \]

where \(W\) is the \(N \times N\) matrix of known spatial weights and \(\lambda\) is the spatial autoregressive coefficient. \(\nu_t = (\nu_{1t}, \ldots, \nu_{Nt})'\) is iid\((0, \sigma_\nu^2)\) and is independent of \(\phi_t\) and \(\mu\). The spatial weight matrix \(W\) is constructed as follows: element \((i, j)\) takes the value 1 if states \(i\) and \(j\) are neighbors and zero otherwise. The rows of this matrix are then normalized so that they sum to one. The \(\mu_i\)'s are the unobserved state-specific effects, which can be fixed or random; see Hsiao (1986) or Baltagi (1995). State-specific effects include, but are not limited to, the following: (i) Indian reservations sell tax-exempt cigarettes. States with Indian reservations, like Montana, New Mexico and Arizona, are among the biggest losers of tax revenues from these tax-exempt sales. The Advisory Commission on Intergovernmental Relations (ACIR 1985) estimated a loss of $309 million from tax exemption or tax evasion in 1983. (ii) States with tax-exempt military bases, like Florida, Texas, Washington and Georgia, also lose revenues from these tax-exempt sales. (iii) Utah, a state with a high percentage of Mormons (a religion which forbids smoking), had per capita sales of cigarettes in 1988 of 55 packs, a little less than half the national average of 113 packs. (iv) Nevada, a state with a large tourist trade, has per capita sales of cigarettes above the national average. Not accounting for these state-specific effects may lead to biased estimates.
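The construction of W and the reduced form of (3) can be sketched numerically. The snippet below is only an illustration under stated assumptions: it uses a hypothetical four-unit map in which units 0-1, 1-2 and 2-3 share a border, and an illustrative value of λ = 0.4. Neither comes from the 46-state data, and the helper name `row_normalized_weights` is ours.

```python
import numpy as np

def row_normalized_weights(neighbor_pairs, n):
    """Build an n x n contiguity matrix from (i, j) neighbor pairs, then row-normalize."""
    W = np.zeros((n, n))
    for i, j in neighbor_pairs:
        W[i, j] = 1.0
        W[j, i] = 1.0  # contiguity is symmetric before normalization
    # Assumes every unit has at least one neighbor, so no row sum is zero.
    return W / W.sum(axis=1, keepdims=True)

# Hypothetical map (not the cigarette data): a chain of four units.
W = row_normalized_weights([(0, 1), (1, 2), (2, 3)], n=4)

lam = 0.4                       # illustrative spatial autoregressive coefficient
B = np.eye(4) - lam * W         # B = I_N - lambda * W
rng = np.random.default_rng(0)
nu = rng.normal(size=4)         # one draw of nu_t with unit variance
phi = np.linalg.solve(B, nu)    # phi_t = B^{-1} nu_t, the reduced form of (3)

print(W)
print(phi)
```

After row normalization W is generally no longer symmetric: unit 0 has a single neighbor and gives it weight 1, while unit 1 splits its weight equally between its two neighbors.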

2 Estimation

Table 1 reports the estimates of a simple, albeit naive, demand model for cigarettes using pooled OLS.² These estimates ignore both the heterogeneity across states and the spatial autocorrelation. The price elasticity estimate is -0.62, while the income elasticity estimate is 0.11, and both are statistically significant.

² For a dynamic demand model of cigarettes, see Baltagi and Levin (1986), and for a rational addiction model, see Becker, Grossman and Murphy (1994).

Next, we take into account the spatial autocorrelation and estimate the model using the MLE described in Anselin (1988), but still ignoring the heterogeneity across states. This is reported as Pooled Spatial in Table 1. It yields price (-0.88) and income (0.29) elasticities that are larger in absolute value than the OLS estimates, which ignore the spatial correlation. Both elasticities are significant. The estimate of \(\lambda\) is 0.41.³ In addition, we conducted a grid search over \(\lambda\) to ensure a global maximum. The likelihood ratio test for \(\lambda = 0\) yields a value of 120.8, which is asymptotically distributed as \(\chi_1^2\) under the null hypothesis. The null is rejected, justifying concern over spatial autocorrelation.

³ This was obtained using the OPTMUM procedure of GAUSS version 3.2.37.

Table 2 allows for different (heterogeneous) parameter estimates for each year. The first set of estimates gives the cross-sectional demand equation estimated by OLS for each year. The price elasticity estimates range from -0.66 in 1963 to -1.44 in 1967, while the income elasticity estimates range from 0.16 in 1980 to a high of 0.83 in 1968. Pesaran and Smith (1995) suggested averaging these heterogeneous estimates to obtain a pooled estimator. This yields a price elasticity estimate of -1.19 and an income elasticity estimate of 0.48, both of which are significant. These are reported as Average Heterogeneous OLS in Table 1. These individual cross-section regressions and their average do not take the spatial autocorrelation into account. Under the normality assumption, we re-estimate these cross-sectional demand equations using the maximum likelihood estimator (MLE) described in Anselin (1988), which accounts for spatial autocorrelation in the disturbances. These heterogeneous spatial estimates are reported in Table 2 along with the corresponding estimate of \(\lambda\). We also report for each year the LM test for \(\lambda = 0\), given by equation (59) of Anselin and Bera (1998). Most of the spatial coefficient estimates are insignificant at the 5% level; the exceptions are five of the 25 years used for estimation, namely 1976, 1981, 1983, 1984 and 1987. The heterogeneous MLE estimates accounting for spatial autocorrelation do not differ much from the heterogeneous OLS estimates ignoring it. The price elasticity estimates range from -0.63 in 1963 to -1.49 in 1981, while the income elasticity estimates range from 0.19 in 1980 to 0.83 in 1968. The average heterogeneous spatial MLE yields a price elasticity estimate of -1.24 and an income elasticity estimate of 0.51, with a spatial autocorrelation parameter estimate of 0.17, all of which are significant. These are reported in Table 1 as Average Spatial MLE. Note that these estimates are slightly larger in absolute value than the average heterogeneous OLS estimates ignoring spatial autocorrelation.
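Two of the numbers quoted above can be checked directly from Table 2. The sketch below averages the yearly OLS price elasticities, in the spirit of the Pesaran and Smith (1995) averaging, and converts each LM statistic into a p-value. It assumes the LM statistic is referred to a \(\chi_1^2\) distribution, whose upper tail equals erfc(sqrt(x/2)); the dictionary and function names are ours, and the figures are transcribed from Table 2, so small rounding differences from the printed p-values are to be expected.

```python
import math

# year: (heterogeneous OLS price elasticity, LM statistic), transcribed from Table 2
table2 = {
    1963: (-0.663, 0.278), 1964: (-1.215, 0.044), 1965: (-1.204, 0.000),
    1966: (-1.429, 0.218), 1967: (-1.438, 0.331), 1968: (-1.411, 0.040),
    1969: (-1.155, 0.080), 1970: (-0.998, 0.209), 1971: (-0.882, 0.200),
    1972: (-1.003, 1.191), 1973: (-1.022, 1.966), 1974: (-1.048, 1.820),
    1975: (-1.142, 1.576), 1976: (-1.245, 4.056), 1977: (-1.278, 3.769),
    1978: (-1.308, 3.092), 1979: (-1.253, 0.803), 1980: (-1.267, 0.341),
    1981: (-1.275, 4.083), 1982: (-1.263, 1.258), 1983: (-1.433, 3.963),
    1984: (-1.263, 5.510), 1985: (-1.235, 2.115), 1986: (-1.328, 3.220),
    1987: (-1.064, 4.922),
}

def chi2_1_pvalue(stat):
    """Upper-tail probability of a chi-square variate with one degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Simple average of the 25 yearly price elasticities.
mean_group_price = sum(price for price, _ in table2.values()) / len(table2)

# Years in which lambda = 0 is rejected at the 5% level.
significant_years = [
    year for year, (_, lm) in sorted(table2.items()) if chi2_1_pvalue(lm) < 0.05
]

print(round(mean_group_price, 3))
print(significant_years)
```

The same helper applied to the likelihood ratio statistics reported in the text gives p-values far below any conventional significance level.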

Next, we account for heterogeneity across states by using the fixed effects (FE) estimator. This model assumes that the \(\mu_i\)'s are fixed parameters to be estimated. The F-statistic for testing the significance of the state dummies, see equation (2.12) of Baltagi (1995), yields a value of 88.9, which is statistically significant. Note that if these state effects are ignored, the OLS estimates and their standard errors in Table 1 would be biased and inconsistent; see Moulton (1986).⁴ Ignoring the spatial effects, the FE estimator can be obtained by running the regression with state dummy variables or by performing the within transformation and then running OLS; see Hsiao (1986). Denote these estimates by \(\hat\beta_{FE}\). These are reported in Table 1 as FE. Compared to the OLS estimates, the price elasticity estimate drops in absolute value to -0.47 and the income elasticity estimate becomes negative (-0.26); both are significant. The negative income elasticity is not implausible, since income can be a proxy for education levels and smoking is known to decrease with higher education levels.

⁴ Note that prices vary across states mainly due to tax changes across states. To the extent that endogeneity in prices is due to their correlation with the state effects, the fixed effects estimator is a viable estimator that controls for this endogeneity by wiping out the state effects.

This FE estimator still does not take into account the spatial autocorrelation. This paper estimates the fixed effects model with spatial autocorrelation using MLE.⁵ In addition, we checked the global maximum using a grid search over \(\lambda\). In fact, Figure 1 shows that the likelihood function is well behaved for values of \(\lambda\) around the global maximum. The estimates are reported in Table 1 as FE-Spatial. Compared with the FE estimator, these results yield a price elasticity estimate of -0.78, which is larger in absolute value, and an income elasticity estimate of -0.13, which is smaller in absolute value. Both estimates are statistically significant. The \(\lambda\) estimate is 0.61. The likelihood ratio test for \(\lambda = 0\) yields a \(\chi_1^2\) test statistic of 251.4. This is statistically significant and rejects the null of \(\lambda = 0\) in the FE model.

⁵ This was obtained using the OPTMUM procedure of GAUSS version 3.2.37.

For the random effects model, the \(\mu_i\)'s are iid\((0, \sigma_\mu^2)\) and are independent of the \(\phi_{it}\)'s; see Anselin (1988). For this model, we need to derive the variance-covariance matrix. Let \(B = I_N - \lambda W\); then the disturbances in equation (3) can be written as \(\phi_t = (I_N - \lambda W)^{-1}\nu_t = B^{-1}\nu_t\). Substituting \(\phi_t\) in (2) and stacking the \(T\) periods, we get

\[ \varepsilon = (\iota_T \otimes I_N)\mu + (I_T \otimes B^{-1})\nu \tag{4} \]

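where \(\iota_T\) is a vector of ones of dimension \(T\) and \(I_N\) is an identity matrix of dimension \(N\). The variance-covariance matrix is

\[ \Omega = E(\varepsilon\varepsilon') = \sigma_\mu^2(\iota_T\iota_T' \otimes I_N) + \sigma_\nu^2(I_T \otimes (B'B)^{-1}) \tag{5} \]

Let \(\Psi = \frac{1}{\sigma_\nu^2}\Omega = \frac{\sigma_\mu^2}{\sigma_\nu^2}(\iota_T\iota_T' \otimes I_N) + (I_T \otimes (B'B)^{-1})\) and \(\theta = \sigma_\mu^2/\sigma_\nu^2\); then

\[ \Psi = \bar{J}_T \otimes (T\theta I_N) + I_T \otimes (B'B)^{-1} = \bar{J}_T \otimes V + E_T \otimes (B'B)^{-1} \tag{6} \]

where \(\bar{J}_T = \iota_T\iota_T'/T\), \(V = T\theta I_N + (B'B)^{-1}\) and \(E_T = I_T - \bar{J}_T\). It is easy to verify that

\[ \Psi^{-1} = \bar{J}_T \otimes V^{-1} + E_T \otimes (B'B) \tag{7} \]

see Anselin (1988, p. 154). Also, see Wansbeek and Kapteyn (1983) for a similar trick for the classical error component model without spatial autocorrelation. GLS on (1) using this \(\Psi^{-1}\) yields \(\hat\beta_{GLS}\). Note that the computation is simplified, since the \(NT \times NT\) matrix \(\Psi^{-1}\) is obtained by inverting two lower-order matrices, \(V\) and \(B\), both of dimension \(N \times N\).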
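The inverse in (7) is easy to confirm numerically. The following sketch builds Ψ from its definition and checks that the Kronecker form of (7) is indeed its inverse. The dimensions, the small row-normalized W, and the values of λ and θ are all illustrative assumptions rather than estimates from the paper.

```python
import numpy as np

N, T = 4, 3
lam, theta = 0.4, 0.8   # illustrative values, not estimates from the paper

# Hypothetical row-normalized weights for a chain of four units.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])

B = np.eye(N) - lam * W
BtB = B.T @ B
BtB_inv = np.linalg.inv(BtB)

J_bar = np.ones((T, T)) / T            # averaging matrix iota iota' / T
E = np.eye(T) - J_bar                  # deviations-from-means matrix
V = T * theta * np.eye(N) + BtB_inv    # the N x N matrix V in (6)

# Psi = Omega / sigma_nu^2, built directly from (5).
Psi = theta * np.kron(np.ones((T, T)), np.eye(N)) + np.kron(np.eye(T), BtB_inv)

# Candidate inverse from (7): only the N x N matrix V is inverted here.
Psi_inv = np.kron(J_bar, np.linalg.inv(V)) + np.kron(E, BtB)

max_err = np.abs(Psi @ Psi_inv - np.eye(N * T)).max()
print(max_err)
```

The check works because \(\bar{J}_T\) and \(E_T\) are idempotent, orthogonal to each other, and sum to the identity, so the product of (6) and (7) collapses to \(I_{NT}\).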

If \(\lambda = 0\), so that there is no spatial autocorrelation, then \(B = I_N\) and \(\Omega\) from (5) becomes the usual error component variance-covariance matrix

\[ \Omega_{RE} = E(\varepsilon\varepsilon') = \sigma_\mu^2(\iota_T\iota_T' \otimes I_N) + \sigma_\nu^2(I_T \otimes I_N) \tag{8} \]

In this case \(V = (T\theta + 1)I_N = \frac{T\sigma_\mu^2 + \sigma_\nu^2}{\sigma_\nu^2} I_N\) and

\[ \Psi_{RE}^{-1} = \frac{\sigma_\nu^2}{\sigma_1^2}(\bar{J}_T \otimes I_N) + E_T \otimes I_N \tag{9} \]

where \(\sigma_1^2 = T\sigma_\mu^2 + \sigma_\nu^2\). Applying GLS using this \(\Omega_{RE}\) yields the random effects (RE) estimator, which we denote by \(\hat\beta_{RE}\). The one-sided Breusch and Pagan (1980) test for \(\sigma_\mu^2 = 0\) yields a N(0, 1) test statistic of 81.1, which is statistically significant. Feasible GLS is based on Amemiya's (1971) method of estimating the variance components. This is an analysis of variance method that uses FE residuals in place of the true disturbances; see Baltagi (1995). The results are reported as RE in Table 1. The price elasticity estimate is -0.47 and the income elasticity estimate is -0.25, and both are significant. These RE estimates are close to those of the FE estimator. However, a Hausman (1978) test for misspecification, based on the difference between the FE and RE estimators of \(\beta\), yields a \(\chi_2^2\) test statistic of 26.8, which is statistically significant. The null hypothesis is rejected, which indicates that the RE estimator is not consistent.

If \(\lambda \neq 0\), MLE under normality of the disturbances for this error component model with spatial autocorrelation is derived in Anselin (1988). Here we apply this MLE using the OPTMUM procedure of GAUSS version 3.2.37. In addition, we checked the global maximum by running a grid search over \(\lambda\) and \(\rho = \sigma_\mu^2/(\sigma_\mu^2 + \sigma_\nu^2)\). The latter is a positive fraction, allowing a grid search over values of \(\rho\) between zero and one. Figure 2 shows that the likelihood function is well behaved for values of \(\lambda\) and \(\rho\) around the global maximum. The results are reported in Table 1 as RE-Spatial. Compared with the RE estimator, these results yield a price elasticity estimate of -0.80, which is larger in absolute value, and an income elasticity estimate of -0.07, which is smaller in absolute value. The price elasticity is statistically significant while the income elasticity is not. The \(\lambda\) estimate is 0.65, which is close to that of the FE-Spatial model. The likelihood ratio test for \(\lambda = 0\) yields a \(\chi_1^2\) test statistic of 249.4. This is statistically significant and rejects \(\lambda = 0\) in the RE model.

We now turn to comparing these various estimators using forecasts up to five years ahead. These are out-of-sample predictions for 1988, 1989, ..., 1992.

3 Prediction

Goldberger (1962) showed that, for a given \(\Omega\), the best linear unbiased predictor (BLUP) for the \(i\)th state at a future period \(T + S\) is given by

\[ \hat y_{i,T+S} = x_{i,T+S}'\hat\beta_{GLS} + \omega'\Omega^{-1}\hat\varepsilon_{GLS} \tag{10} \]

where \(\omega = E(\varepsilon_{i,T+S}\,\varepsilon)\) is the covariance between the future disturbance \(\varepsilon_{i,T+S}\) and the sample disturbances \(\varepsilon\). \(\hat\beta_{GLS}\) is the GLS estimator of \(\beta\) from (1) based on \(\Omega\), and \(\hat\varepsilon_{GLS}\) denotes the corresponding GLS residual vector.

For the error component model without spatial autocorrelation (\(\lambda = 0\)), Wansbeek and Kapteyn (1978) and Taub (1979) derived this BLUP and showed that it reduces to

\[ \hat y_{i,T+S} = x_{i,T+S}'\hat\beta_{GLS} + \frac{\sigma_\mu^2}{\sigma_1^2}(\iota_T' \otimes l_i')\hat\varepsilon_{GLS} \tag{11} \]

where in this case \(\omega = E(\varepsilon_{i,T+S}\,\varepsilon) = E[(\mu_i + \nu_{i,T+S})\varepsilon] = \sigma_\mu^2(\iota_T \otimes l_i)\) and \(l_i\) is the \(i\)th column of \(I_N\). Substituting \(\Psi_{RE}^{-1}\) defined in (9) into (10), we immediately get (11). The typical element of the last term of (11) is \(\frac{T\sigma_\mu^2}{\sigma_1^2}\bar\varepsilon_{i.,GLS}\), where \(\bar\varepsilon_{i.,GLS} = \sum_{t=1}^{T}\hat\varepsilon_{ti,GLS}/T\). Therefore, the BLUP of \(y_{i,T+S}\) for the RE model modifies the usual GLS forecast by adding a fraction of the mean of the GLS residuals corresponding to the \(i\)th state. To make this forecast operational, \(\hat\beta_{GLS}\) is replaced by its feasible GLS estimate \(\hat\beta_{RE}\) reported in Table 1, and the variance components are replaced by their feasible estimates. The corresponding predictor is labelled the RE predictor in Table 3.

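This paper derives the BLUP correction term when both error components and spatial autocorrelation are present. In this case \(\omega = E(\varepsilon_{i,T+S}\,\varepsilon) = E[(\mu_i + \phi_{i,T+S})\varepsilon] = \sigma_\mu^2(\iota_T \otimes l_i)\), since the \(\phi\)'s are not correlated over time. Using \(\Omega^{-1} = \frac{1}{\sigma_\nu^2}\Psi^{-1}\) with \(\Psi^{-1}\) as defined in (7), we get

\[ \omega'\Omega^{-1} = \frac{\sigma_\mu^2}{\sigma_\nu^2}(\iota_T' \otimes l_i')\left[(\bar{J}_T \otimes V^{-1}) + (E_T \otimes (B'B))\right] = \theta(\iota_T' \otimes l_i'V^{-1}) \tag{12} \]

since \(\iota_T'E_T = 0\) and \(\iota_T'\bar{J}_T = \iota_T'\). Therefore

\[ \omega'\Omega^{-1}\hat\varepsilon_{GLS} = \theta(\iota_T' \otimes l_i'V^{-1})\hat\varepsilon_{GLS} = \theta\, l_i'V^{-1}\sum_{t=1}^{T}\hat\varepsilon_{t,GLS} = T\theta\sum_{j=1}^{N}\delta_j\,\bar\varepsilon_{j.,GLS} \tag{13} \]

where \(\delta_j\) is the \(j\)th element of the \(i\)th row of \(V^{-1}\) and \(\bar\varepsilon_{j.,GLS} = \sum_{t=1}^{T}\hat\varepsilon_{tj,GLS}/T\). In other words, the BLUP adds to \(x_{i,T+S}'\hat\beta_{GLS}\) a weighted average of the GLS residuals for the \(N\) regions averaged over time. The weights depend upon the spatial matrix \(W\) and the spatial autocorrelation coefficient \(\lambda\). To make this predictor operational, we replace \(\hat\beta_{GLS}\), \(\theta\) and \(\lambda\) by their estimates from the RE-spatial MLE reported in Table 1. The corresponding predictor is labelled RE-spatial in Table 3.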
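The correction term in (13) can be compared with the brute-force expression \(\omega'\Omega^{-1}\hat\varepsilon_{GLS}\) from (10). The sketch below does this on simulated numbers: random draws stand in for the GLS residuals, and W, λ and the variance components are illustrative assumptions, not the paper's estimates. The function name `blup_correction` is ours. Residuals are stacked period by period with states varying fastest, matching the ordering implied by \(\iota_T \otimes I_N\) in (4).

```python
import numpy as np

N, T = 4, 6
sigma2_mu, sigma2_nu = 0.5, 1.0          # illustrative variance components
theta = sigma2_mu / sigma2_nu

# Hypothetical row-normalized weights for a chain of four units.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])

rng = np.random.default_rng(1)
resid = rng.normal(size=(T, N))          # stand-in for GLS residuals; row t is period t
resid_bar = resid.mean(axis=0)           # time average of the residuals for each unit

def blup_correction(lam):
    """Correction term of (13) for all N units at once: T * theta * V^{-1} * resid_bar."""
    B = np.eye(N) - lam * W
    V = T * theta * np.eye(N) + np.linalg.inv(B.T @ B)
    return T * theta * np.linalg.solve(V, resid_bar)

lam = 0.4
correction = blup_correction(lam)

# Brute force: omega' Omega^{-1} eps using the full NT x NT matrix from (5).
B = np.eye(N) - lam * W
Omega = (sigma2_mu * np.kron(np.ones((T, T)), np.eye(N))
         + sigma2_nu * np.kron(np.eye(T), np.linalg.inv(B.T @ B)))
eps = resid.reshape(-1)                  # stacked by period, units varying fastest
direct = np.array([
    (sigma2_mu * np.kron(np.ones(T), np.eye(N)[:, i])) @ np.linalg.solve(Omega, eps)
    for i in range(N)
])

# With lambda = 0 the term should collapse to (T sigma_mu^2 / sigma_1^2) * resid_bar.
re_term = T * sigma2_mu / (T * sigma2_mu + sigma2_nu) * resid_bar

print(np.allclose(correction, direct))
print(np.allclose(blup_correction(0.0), re_term))
```

Only an N × N system is solved in `blup_correction`, whereas the brute-force route solves an NT × NT system for each state, which is the computational point of working with (7).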

When there is no spatial autocorrelation, i.e., \(\lambda = 0\), the BLUP correction term given in (13) reduces to the Wansbeek and Kapteyn (1978) and Taub (1979) predictor term given in (11). Also, when there are no random state effects, so that \(\sigma_\mu^2 = 0\), then \(\theta = 0\) and the BLUP correction term in (13) drops out completely from equation (10). In this case, \(\Omega\) in (5) reduces to \(\sigma_\nu^2(I_T \otimes (B'B)^{-1})\), and GLS on this model, based on the MLE of \(\lambda\), yields the pooled spatial estimator reported in Table 1. The corresponding predictor is labelled the pooled spatial predictor in Table 3.

If the fixed effects model without spatial autocorrelation is the true model, then the BLUP is given by

\[ \hat y_{i,T+S} = x_{i,T+S}'\hat\beta_{FE} + \hat\mu_i \tag{14} \]

see Baillie and Baltagi (1998), with \(\mu_i\) estimated as \(\hat\mu_i = \bar y_{i.} - \bar x_{i.}'\hat\beta_{FE}\), where \(\bar y_{i.} = \sum_{t=1}^{T} y_{it}/T\) and \(\bar x_{i.}\) is similarly defined. Note that in this case \(\lambda = 0\), so that \(\phi_{it}\) in (3) reduces to \(\nu_{it}\), and the latter are not serially correlated over time. Therefore, \(\omega = E(\nu_{i,T+S}\,\nu) = 0\), and the last term of (10) for the FE model is zero. However, the \(\hat\mu_i\) appear in the predictions as shown in (14). The corresponding predictor is labelled the FE predictor in Table 3.
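The FE predictor in (14) can be illustrated on simulated data. The sketch below applies the within transformation, recovers the state effects from the state means, and forms the forecast. The data-generating values (`beta_true`, `mu_true`, the noise scale) and the future regressors are arbitrary assumptions chosen only to make the example run; they are unrelated to the cigarette data.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, K = 5, 8, 2
beta_true = np.array([-0.5, 0.3])        # arbitrary slopes for the simulation
mu_true = rng.normal(size=N)             # arbitrary state effects

x = rng.normal(size=(N, T, K))
y = x @ beta_true + mu_true[:, None] + 0.01 * rng.normal(size=(N, T))

# Within transformation: subtract each state's time mean from y and x.
x_bar = x.mean(axis=1)                   # shape (N, K)
y_bar = y.mean(axis=1)                   # shape (N,)
x_within = (x - x_bar[:, None, :]).reshape(N * T, K)
y_within = (y - y_bar[:, None]).reshape(N * T)

# OLS on the demeaned data gives the FE slope estimates.
beta_fe, *_ = np.linalg.lstsq(x_within, y_within, rcond=None)

# State effects recovered from the means: mu_hat_i = y_bar_i - x_bar_i' beta_fe.
mu_hat = y_bar - x_bar @ beta_fe

# Forecast as in (14) for hypothetical future regressors.
x_future = rng.normal(size=(N, K))
y_pred = x_future @ beta_fe + mu_hat

print(beta_fe)
print(y_pred)
```

For the FE-spatial predictor the same forecast formula applies, but the slopes and state effects would come from the spatial MLE rather than from this least squares step.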

If the fixed effects model with spatial autocorrelation is the true model, then the problem is to predict

\[ y_{i,T+S} = x_{i,T+S}'\beta + \mu_i + \phi_{i,T+S} \tag{15} \]

with \(\phi_{T+S} = \lambda W\phi_{T+S} + \nu_{T+S}\) obtained from (3). Unlike the previous case, \(\lambda \neq 0\) and the \(\mu_i\)'s and \(\beta\) have to be estimated by MLE, i.e., using the FE-spatial estimates. The disturbance vector from (3) can be written as \(\phi = (I_T \otimes B^{-1})\nu\), so that \(\omega = E(\phi_{i,T+S}\,\phi) = 0\), since the \(\nu\)'s are not serially correlated over time. So the BLUP for this model looks like that for the FE model without spatial correlation given in (14), except that the \(\mu_i\)'s and \(\beta\) are estimated assuming \(\lambda \neq 0\). The corresponding predictor is labelled the FE-spatial predictor in Table 3.

Table 3 gives the RMSE for the one-year, two-year, ..., and five-year-ahead forecasts, along with the RMSE for all five years. These are out-of-sample forecasts for 1988 to 1992. Each year's RMSE is obtained from 46 state-by-state predictions. We compare the forecasts for all five years. The pooled OLS predictor in Table 3 is computed as \(\hat y_{i,T+S} = x_{i,T+S}'\hat\beta_{OLS}\). Pooled OLS, which ignores spatial autocorrelation and heterogeneity across the states, gives the highest RMSE of 0.2093. Accounting for spatial autocorrelation using the pooled spatial estimator lowers this RMSE to 0.1922. This predictor replaces the OLS estimator of \(\beta\) by the pooled spatial MLE reported in Table 1. Substituting the average heterogeneous OLS estimator (which ignores spatial autocorrelation but allows for parameter heterogeneity across time) lowers this RMSE to 0.1892. This forecast performance is slightly improved by accounting for spatial autocorrelation: substituting the average heterogeneous spatial MLE yields an RMSE of 0.1860. A substantial improvement in forecast performance occurs when one takes into account the state heterogeneity. The simple FE estimator without spatial autocorrelation yields an RMSE of 0.1501, followed closely by the RE estimator without spatial autocorrelation with an RMSE of 0.1509. These predictors were described in (14) and (11), respectively. A further reduction in the forecast RMSE is obtained by taking into account both heterogeneity and spatial autocorrelation. The best forecast performance for all five years is obtained by the FE estimator with spatial autocorrelation, which yields an RMSE of 0.1278, followed closely by the RE estimator with spatial autocorrelation, with an RMSE of 0.1279. The FE-spatial predictor is obtained as in (14) but with the FE-spatial estimates from Table 1 replacing the FE estimates. The RE-spatial predictor is obtained from (10), with the BLUP correction term given in (13), by substituting the RE-spatial estimates from Table 1.
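The relation between the yearly and five-year RMSE figures in Table 3 can be reproduced with a few lines. The sketch assumes that the five-year column pools the squared errors of all 5 × 46 forecasts; with the same 46 states in every year, that pooled RMSE is the square root of the mean of the squared yearly RMSEs. The dictionary names are ours, and the yearly values are the four-decimal figures from Table 3, so agreement is only expected up to rounding.

```python
import math

# Yearly RMSEs for 1988-1992, transcribed from Table 3.
yearly_rmse = {
    "Pooled OLS":                [0.1947, 0.2022, 0.2239, 0.2226, 0.2016],
    "Pooled Spatial":            [0.1862, 0.1888, 0.2072, 0.2002, 0.1769],
    "Average Heterogeneous OLS": [0.1927, 0.1896, 0.2029, 0.1913, 0.1674],
    "Average Spatial MLE":       [0.1901, 0.1862, 0.1990, 0.1867, 0.1666],
    "FE":                        [0.1152, 0.1241, 0.1595, 0.1739, 0.1680],
    "FE-Spatial":                [0.1027, 0.1051, 0.1360, 0.1404, 0.1478],
    "RE":                        [0.1158, 0.1249, 0.1604, 0.1749, 0.1687],
    "RE-Spatial":                [0.1042, 0.1070, 0.1371, 0.1407, 0.1444],
}

# Five-year RMSEs as printed in the last column of Table 3.
reported_5yr = {
    "Pooled OLS": 0.2093, "Pooled Spatial": 0.1922,
    "Average Heterogeneous OLS": 0.1892, "Average Spatial MLE": 0.1860,
    "FE": 0.1501, "FE-Spatial": 0.1278, "RE": 0.1509, "RE-Spatial": 0.1279,
}

def pooled_rmse(per_year):
    """Root of the mean squared yearly RMSE (equal number of forecasts per year)."""
    return math.sqrt(sum(r * r for r in per_year) / len(per_year))

pooled = {name: pooled_rmse(values) for name, values in yearly_rmse.items()}
best = min(pooled, key=pooled.get)

for name, value in pooled.items():
    print(f"{name:27s} {value:.4f} (reported {reported_5yr[name]:.4f})")
print("lowest five-year RMSE:", best)
```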

For the simple cigarette demand model chosen to illustrate our forecasts, taking into account the heterogeneity across states and the spatial autocorrelation yields the best out-of-sample forecast performance as measured by RMSE. The FE-spatial estimator gives the lowest RMSE for the first four years and is only surpassed by the RE-spatial estimator in the fifth year. Overall, both the RE-spatial and FE-spatial estimators perform well in predicting cigarette demand.

One limitation of our study is that we used a simple static model of cigarette demand when a dynamic or a rational addiction model of cigarette demand may be more appropriate. However, the latter models introduce additional econometric complications for our forecasting illustrations, and these are beyond the scope of this paper. Despite this limitation, this paper lays out a simple methodology for forecasting with panel data models that are spatially autocorrelated. These methods will hopefully prove useful to researchers forecasting with these models.

REFERENCES

Advisory Commission on Intergovernmental Relations (1985), Cigarette Tax Evasion: A Second Look, ACIR, Washington, D.C.

Amemiya, T. (1971), The estimation of the variances in a variance components model, International Economic Review, 12, 1-13.

Anselin, L. (1988), Spatial Econometrics: Methods and Models, Dordrecht: Kluwer.

Anselin, L. and A. Bera (1998), Spatial dependence in linear regression models with an introduction to spatial econometrics, in Handbook of Applied Economic Statistics, A. Ullah and D. Giles, eds., Marcel Dekker, New York, pp. 237-289.

Anselin, L., A. Bera, R. Florax and M.J. Yoon (1996), Simple diagnostic tests for spatial dependence, Regional Science and Urban Economics, 26, 77-104.

Baillie, R. and B. Baltagi (1998), Prediction from the regression model with one-way error components, in Analysis of Panel Data and Limited Dependent Variable Models, H. Pesaran, K. Lahiri, C. Hsiao and L-F. Lee, eds., Cambridge University Press, Cambridge.

Baltagi, B.H. (1995), Econometric Analysis of Panel Data, Chichester: Wiley.

Baltagi, B.H. and D. Levin (1986), Estimating dynamic demand for cigarettes using panel data: The effects of bootlegging, taxation and advertising reconsidered, Review of Economics and Statistics, 68, 148-155.

Becker, G.S., M. Grossman and K.M. Murphy (1994), An empirical analysis of cigarette addiction, American Economic Review, 84, 396-418.

Breusch, T.S. and A. Pagan (1980), The Lagrange multiplier test and its applications to model specification in econometrics, Review of Economic Studies, 47, 239-253.

Goldberger, A.S. (1962), Best linear unbiased prediction in the generalized linear regression model, Journal of the American Statistical Association, 57, 369-375.

Hausman, J.A. (1978), Specification tests in econometrics, Econometrica, 46, 1251-1271.

Hsiao, C. (1986), Analysis of Panel Data, Cambridge: Cambridge University Press.

Moulton, B.R. (1986), Random group effects and the precision of regression estimates, Journal of Econometrics, 32, 385-397.

Nerlove, M. (1971), Further evidence on the estimation of dynamic economic relations from a time-series of cross-sections, Econometrica, 39, 359-382.

Pesaran, M.H. and R. Smith (1995), Estimating long-run relationships from dynamic heterogeneous panels, Journal of Econometrics, 68, 79-113.

Taub, A.J. (1979), Prediction in the context of the variance components model, Journal of Econometrics, 10, 103-107.

Wansbeek, T. and A. Kapteyn (1978), The separation of individual variation and systematic change in the analysis of panel data, Annales de l'INSEE, 30-31, 659-680.

Wansbeek, T. and A. Kapteyn (1983), A note on spectral decomposition and maximum likelihood estimation of ANOVA models with balanced data, Statistics and Probability Letters, 1, 213-215.

Table 1: Pooled Estimates of Cigarette Demand*

| Estimator | Price | Income |
|---|---|---|
| Pooled OLS | −0.618 (−13.7) | 0.114 (4.00) |
| Pooled Spatial | −0.882 (−16.4) | 0.285 (8.29) |
| Average Heterogeneous OLS | −1.193 (−37.6) | 0.476 (26.4) |
| Average Spatial MLE | −1.235 (−39.1) | 0.505 (27.2) |
| FE | −0.474 (−17.7) | −0.259 (−12.6) |
| FE-Spatial | −0.775 (−20.7) | −0.131 (−3.45) |
| RE | −0.474 (−17.8) | −0.251 (−12.3) |
| RE-Spatial | −0.803 (−20.8) | −0.070 (−1.77) |

* The F-statistic for \(H_0: \mu = 0\) yields a value of 88.95, which is statistically significant. The one-sided Breusch-Pagan test for \(H_0: \sigma_\mu^2 = 0\) yields a N(0, 1) test statistic of 81.1, which is statistically significant. Hausman's test based on FE and RE yields a \(\chi_2^2\) of 26.8, which is statistically significant.

Table 2: Heterogeneous Estimates of Cigarette Demand

| Year | Heterogeneous OLS: Price | Heterogeneous OLS: Income | Heterogeneous Spatial: Price | Heterogeneous Spatial: Income | Heterogeneous Spatial: λ | LM* |
|---|---|---|---|---|---|---|
| 1963 | −0.663 (−1.925) | 0.718 (5.621) | −0.625 (−1.841) | 0.730 (5.600) | 0.097 (0.517) | 0.278 (0.597) |
| 1964 | −1.215 (−3.368) | 0.619 (4.629) | −1.210 (−3.450) | 0.622 (4.712) | 0.039 (0.206) | 0.044 (0.834) |
| 1965 | −1.204 (−3.465) | 0.634 (4.525) | −1.203 (−3.563) | 0.635 (4.575) | 0.003 (0.021) | 0.000 (0.986) |
| 1966 | −1.429 (−4.438) | 0.736 (4.710) | −1.435 (−4.526) | 0.740 (4.743) | 0.070 (0.411) | 0.218 (0.641) |
| 1967 | −1.438 (−4.494) | 0.791 (5.426) | −1.455 (−4.571) | 0.797 (5.452) | 0.081 (0.489) | 0.331 (0.565) |
| 1968 | −1.411 (−4.478) | 0.831 (5.861) | −1.417 (−4.526) | 0.833 (5.969) | 0.030 (0.175) | 0.040 (0.842) |
| 1969 | −1.155 (−4.609) | 0.787 (5.502) | −1.164 (−4.669) | 0.790 (5.583) | 0.044 (0.251) | 0.080 (0.777) |
| 1970 | −0.998 (−4.078) | 0.779 (4.929) | −1.010 (−4.135) | 0.786 (4.960) | 0.067 (0.395) | 0.209 (0.648) |
| 1971 | −0.882 (−3.129) | 0.661 (3.669) | −0.882 (−3.195) | 0.667 (3.710) | 0.062 (0.377) | 0.200 (0.655) |
| 1972 | −1.003 (−3.955) | 0.573 (2.872) | −1.028 (−4.078) | 0.600 (2.905) | 0.148 (0.923) | 1.191 (0.275) |
| 1973 | −1.022 (−3.980) | 0.394 (1.964) | −1.072 (−4.093) | 0.442 (2.097) | 0.195 (1.213) | 1.966 (0.161) |
| 1974 | −1.048 (−4.353) | 0.432 (2.179) | −1.102 (−4.440) | 0.463 (2.261) | 0.189 (1.169) | 1.820 (0.177) |
| 1975 | −1.142 (−4.681) | 0.400 (2.096) | −1.207 (−4.763) | 0.435 (2.198) | 0.179 (1.091) | 1.576 (0.209) |
| 1976 | −1.245 (−4.666) | 0.443 (2.189) | −1.450 (−4.921) | 0.510 (2.402) | 0.298 (1.859) | 4.056 (0.044) |
| 1977 | −1.278 (−4.638) | 0.381 (1.913) | −1.448 (−4.899) | 0.456 (2.176) | 0.291 (1.782) | 3.769 (0.052) |
| 1978 | −1.308 (−4.482) | 0.298 (1.528) | −1.482 (−4.758) | 0.419 (1.963) | 0.287 (1.671) | 3.092 (0.078) |
| 1979 | −1.253 (−4.217) | 0.270 (1.484) | −1.314 (−4.296) | 0.319 (1.657) | 0.140 (0.802) | 0.803 (0.370) |
| 1980 | −1.267 (−3.903) | 0.164 (0.920) | −1.289 (−4.017) | 0.191 (1.037) | 0.089 (0.516) | 0.341 (0.560) |
| 1981 | −1.275 (−4.733) | 0.300 (1.890) | −1.493 (−5.262) | 0.432 (2.512) | 0.336 (2.000) | 4.083 (0.043) |
| 1982 | −1.263 (−4.212) | 0.316 (1.867) | −1.280 (−4.375) | 0.344 (2.016) | 0.160 (0.973) | 1.258 (0.262) |
| 1983 | −1.433 (−5.086) | 0.295 (1.971) | −1.480 (−5.593) | 0.340 (2.239) | 0.281 (1.777) | 3.963 (0.047) |
| 1984 | −1.263 (−4.407) | 0.327 (2.205) | −1.253 (−4.670) | 0.316 (2.180) | 0.301 (2.046) | 5.510 (0.019) |
| 1985 | −1.235 (−4.681) | 0.260 (1.955) | −1.231 (−4.757) | 0.256 (1.920) | 0.222 (1.336) | 2.115 (0.146) |
| 1986 | −1.328 (−4.338) | 0.289 (2.047) | −1.317 (−4.509) | 0.300 (2.098) | 0.254 (1.600) | 3.220 (0.073) |
| 1987 | −1.064 (−3.584) | 0.209 (1.556) | −1.040 (−3.698) | 0.208 (1.519) | 0.329 (2.099) | 4.922 (0.026) |

* This gives the LM statistic for \(H_0: \lambda = 0\), with the corresponding p-value in parentheses.

Table 3: RMSE Performance of Out-of-Sample Forecasts

| Estimator | 1988 | 1989 | 1990 | 1991 | 1992 | 5 Years |
|---|---|---|---|---|---|---|
| Pooled OLS | 0.1947 | 0.2022 | 0.2239 | 0.2226 | 0.2016 | 0.2093 |
| Pooled Spatial | 0.1862 | 0.1888 | 0.2072 | 0.2002 | 0.1769 | 0.1922 |
| Average Heterogeneous OLS | 0.1927 | 0.1896 | 0.2029 | 0.1913 | 0.1674 | 0.1892 |
| Average Spatial MLE | 0.1901 | 0.1862 | 0.1990 | 0.1867 | 0.1666 | 0.1860 |
| FE | 0.1152 | 0.1241 | 0.1595 | 0.1739 | 0.1680 | 0.1501 |
| FE-Spatial | 0.1027 | 0.1051 | 0.1360 | 0.1404 | 0.1478 | 0.1278 |
| RE | 0.1158 | 0.1249 | 0.1604 | 0.1749 | 0.1687 | 0.1509 |
| RE-Spatial | 0.1042 | 0.1070 | 0.1371 | 0.1407 | 0.1444 | 0.1279 |