Post on 30-Dec-2015
Sugar Cane Production in Puerto Rico, 1958/59-1973/74: A Comparison of Four Model Specifications for Describing Small Heterogeneous Space-Time Datasets
by Daniel A. Griffith
Ashbel Smith Professor of Geospatial Information Sciences
ABSTRACT
Researchers increasingly are accounting for heterogeneity in their empirical analyses. When data form a short time series (too short to utilize an ARIMA model), a random effect term can be employed to account for serial correlation. When data also are georeferenced, forming a space-time dataset, a random effect term can be included that is spatially structured in order to account for spatial autocorrelation, too. But space-time heterogeneity can be accounted for in various ways, including specifications involving recently developed spatial filtering methodology. This paper summarizes comparisons of four model specifications (simple pooled space-time; sequential, comparative statics; temporally varying coefficients with a spatially unstructured random effect; and temporally varying coefficients with a spatially structured random effect), illustrating implementations with annual sugar cane production data for the 73 municipalities of Puerto Rico during 1958/59-1973/74. Covariates whose importance is assessed include elevation and distance from the primate city.
Panel data versus space-time data
Panel data are a form of longitudinal data, and can be a cross-section (i.e., the spatial dimension) of individuals (e.g., farms) that are surveyed periodically over a given time horizon.
With repeated observations of the same individuals, panel data permit a researcher to study the dynamics of change with short time series.
A main advantage of panel data: controlling for unobserved heterogeneity (the fundamental complication of non-experimental data collection)
BUT longitudinal data need not involve the same individuals: if a sample is not the same, observed changes also may result from sampling error
Spatial filtering
A given random variable can be decomposed into a spatial component and an aspatial component: impulse-response function approach (based upon the autoregressive model), Getis approach (based on the K function), eigenfunction spatial filtering approach.
The spatial component relates to spatial autocorrelation
[Figure: High Peak district biomass index (ratio of remotely sensed data spectral bands B3 and B4). Panels: spatially autocorrelated; geographically random]
Defining spatial autocorrelation
Auto: self
Correlation: degree of relative correspondence
Positive: similar values cluster together on a map
Negative: dissimilar values cluster together on a map
Spatial auto-correlation
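$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})/n}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2/n}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2/n}} $$

$$ MC = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}(y_i - \bar{y})(y_j - \bar{y}) \Big/ \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2/n}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2/n}} $$

from r to MC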
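The step from r to MC can be checked numerically. The following is a minimal sketch in plain Python, assuming a binary connectivity matrix `c` for four areal units arranged along a line; the function name `moran_coefficient` and the toy data are illustrative and not part of the original analysis.

```python
def moran_coefficient(y, c):
    """Moran Coefficient of values y for connectivity matrix c (list of lists)."""
    n = len(y)
    y_bar = sum(y) / n
    dev = [value - y_bar for value in y]
    # Cross-products of deviations for every pair of connected units.
    cross = sum(c[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    total_links = sum(sum(row) for row in c)
    sum_sq = sum(d * d for d in dev)
    return (n / total_links) * cross / sum_sq


# Hypothetical map: four units on a line, neighbors share a border.
c = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

mc_trend = moran_coefficient([1, 2, 3, 4], c)          # similar values adjacent
mc_alternating = moran_coefficient([1, -1, 1, -1], c)  # dissimilar values adjacent
print(mc_trend, mc_alternating)
```

The smooth trend gives a positive MC and the alternating pattern a negative one, matching the verbal definitions of positive and negative spatial autocorrelation.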
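Constructing eigenfunctions for filtering spatial autocorrelation out of georeferenced variables:
$$ MC = \frac{n}{\mathbf{1}^T C \mathbf{1}} \times \frac{Y^T (I - \mathbf{1}\mathbf{1}^T/n)\, C\, (I - \mathbf{1}\mathbf{1}^T/n)\, Y}{Y^T (I - \mathbf{1}\mathbf{1}^T/n)\, Y} $$
the eigenfunctions come from
$$ (I - \mathbf{1}\mathbf{1}^T/n)\, C\, (I - \mathbf{1}\mathbf{1}^T/n) $$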
Eigenvectors for spatial filter construction
The first eigenvector, say E1, is the set of real numbers that has the largest MC achievable by any set for the spatial arrangement defined by the geographic connectivity matrix C. The second eigenvector, E2, is the set of values that has the largest achievable MC by any set that is uncorrelated with E1. The third eigenvector is the set with the largest achievable MC among those uncorrelated with both E1 and E2, and so on. This sequential construction of eigenvectors continues through En, the set of values that has the largest negative MC achievable by any set that is uncorrelated with the preceding (n-1) eigenvectors.
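The eigenvector construction can be sketched with NumPy. This is an illustrative example only, assuming a hypothetical 3-by-3 grid of areal units with rook adjacency; the helper names (`rook_adjacency`, `spatial_filter_eigenvectors`, `moran_coefficient`) are made up for this sketch.

```python
import numpy as np


def rook_adjacency(rows, cols):
    """Binary connectivity matrix C for a rows-by-cols grid (shared edges only)."""
    n = rows * cols
    c = np.zeros((n, n))
    for r in range(rows):
        for k in range(cols):
            i = r * cols + k
            if k + 1 < cols:
                c[i, i + 1] = c[i + 1, i] = 1.0
            if r + 1 < rows:
                c[i, i + cols] = c[i + cols, i] = 1.0
    return c


def moran_coefficient(y, c):
    """MC computed directly from its definition."""
    dev = y - y.mean()
    return (len(y) / c.sum()) * (dev @ c @ dev) / (dev @ dev)


def spatial_filter_eigenvectors(c):
    """Eigenvectors of (I - 11'/n) C (I - 11'/n), ordered from largest to smallest MC."""
    n = c.shape[0]
    m = np.eye(n) - np.ones((n, n)) / n       # centering projector
    eigenvalues, eigenvectors = np.linalg.eigh(m @ c @ m)
    order = np.argsort(eigenvalues)[::-1]     # descending eigenvalues
    mc_values = (n / c.sum()) * eigenvalues[order]
    return eigenvectors[:, order], mc_values


c = rook_adjacency(3, 3)
vectors, mc_values = spatial_filter_eigenvectors(c)
print(mc_values.round(3))
print(moran_coefficient(vectors[:, 0], c))    # equals mc_values[0]
```

Each eigenvalue, rescaled by n/(1'C1), is the MC of its eigenvector, so the ordering of the eigenvalues gives the ordering from E1 to En.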
Random effects model
$$ Y = f(X\beta,\ \xi + \varepsilon) $$
$\xi$ is a random observation effect (differences among individual observational units)
$\varepsilon$ is a time-varying residual error (links to change over time)
The composite error term is the sum of the two, $\xi + \varepsilon$.
Random effects model: normally distributed intercept term
• $\xi \sim N(0, \sigma_{\xi}^{2})$ and uncorrelated with covariates
• supports inference beyond the nonrandom sample analyzed
• simplest is where intercept is allowed to vary across areal units (repeated observations are individual time series)
• The random effect variable is integrated out (with numerical methods) of the likelihood function
• accounts for missing variables & within unit correlation (commonality across time periods)
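How a normally distributed random intercept is integrated out of a likelihood can be sketched as follows. This is a minimal illustration, assuming a binomial logit model and Gauss-Hermite quadrature as the numerical method; the slides do not state which numerical method was used, and the names `marginal_likelihood`, `eta`, and `sigma` are hypothetical.

```python
import math

import numpy as np


def binomial_pmf(y, n, p):
    """Probability of y successes in n trials with success probability p."""
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)


def marginal_likelihood(y, n, eta, sigma, num_nodes=20):
    """Likelihood of one areal unit's time series with the random intercept
    xi ~ N(0, sigma^2) integrated out by Gauss-Hermite quadrature.

    y, n : successes and trials for each time period
    eta  : fixed-effects linear predictor (X beta) for each time period
    """
    nodes, weights = np.polynomial.hermite.hermgauss(num_nodes)
    total = 0.0
    for node, weight in zip(nodes, weights):
        xi = math.sqrt(2.0) * sigma * node      # change of variable for N(0, sigma^2)
        likelihood = 1.0
        for y_t, n_t, eta_t in zip(y, n, eta):
            p = 1.0 / (1.0 + math.exp(-(eta_t + xi)))
            likelihood *= binomial_pmf(y_t, n_t, p)
        total += weight * likelihood            # same xi shared across all periods
    return total / math.sqrt(math.pi)


# Hypothetical unit observed in three periods.
value = marginal_likelihood(y=[4, 5, 3], n=[10, 10, 10], eta=[-0.2, 0.0, -0.5], sigma=1.3)
print(value)
```

Because the same draw of the intercept enters every period for a unit, the integral induces the within-unit correlation across time periods that the last bullet refers to.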
Sugar cane production in Puerto Rico
• Began in the 1530s
• Experienced a sharp decline during 1580-1650
• Introduction of slave labor resulted in considerable expansion during 1765-1823
• By 1828, sugar exports were sizeable
• Spanish monarchy discouraged expansion throughout much of the 1800s
• United States took possession of the island in 1899, fully developing the long-demanded railroad on the island and channeling considerable investment into sugar cane production, achieving maximum expansion in the 1920s
• Production peaked around 1950
Island-wide time series
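[Equation: fitted island-wide model for LN(tons_t/1000). Recoverable terms: constants 7.23 and 0.00981; coefficient 1.68 on the indicator variable I_t; coefficients 0.84940 and 0.15060 on the lag-1 and lag-2 terms involving LN(tons/1000), 7.23, and 1.68 I; coefficient 0.81524 on a term in I_t and I_{t-1}. Operators and signs are not recoverable.]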
US intervention
1924 sugar cane railroad
Finally started by the Spanish Crown, but aggressively completed by US investors
Covariates of sugar cane production
[Maps: elevation; distance from San Juan; covariate spatial filters]
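Model specifications

I-A: initial
$$ LN\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \sum_{t=1959}^{1974} \beta_{0,t} I_{i,t} + \sum_{t=1959}^{1974} \beta_{elev,t} I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t} I_{i,t}\, d_{SJ,i} $$

I-B: with linear time trend
$$ LN\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \beta_0 + \beta_t (T - 1958) + \sum_{t=1959}^{1974} \beta_{elev,t} I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t} I_{i,t}\, d_{SJ,i} $$

II: with random effect
$$ LN\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \beta_0 + \beta_t (T - 1958) + \sum_{t=1959}^{1974} \beta_{elev,t} I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t} I_{i,t}\, d_{SJ,i} + \varepsilon_i $$

III: with spatial filter
$$ LN\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \beta_0 + \beta_t (T - 1958) + \sum_{t=1959}^{1974} \beta_{elev,t} I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t} I_{i,t}\, d_{SJ,i} + \sum_{j=1}^{18} \sum_{t=1959}^{1974} \beta_{E_j,t} I_{i,t}\, e_{i,j} $$

IV: with spatially structured random effect
$$ LN\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \beta_0 + \beta_t (T - 1958) + \sum_{t=1959}^{1974} \beta_{elev,t} I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t} I_{i,t}\, d_{SJ,i} + \sum_{j=1}^{18} \sum_{t=1959}^{1974} \beta_{E_j,t} I_{i,t}\, e_{i,j} + \varepsilon_i $$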
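The year-specific coefficients in specification I-A amount to interacting each covariate with a set of year indicator variables. The following is a minimal sketch of how such a design matrix could be assembled, assuming three years and two areal units; the unit names, covariate values, and function names are hypothetical.

```python
import math


def logit_percent(p):
    """Response transformation LN(p / (100 - p)) for a percentage p."""
    return math.log(p / (100.0 - p))


def design_row(year, years, elev, dist):
    """One row of the I-A design matrix for a unit observed in a given year.

    Columns: year indicators, indicators x elevation, indicators x distance.
    """
    indicators = [1.0 if year == t else 0.0 for t in years]
    return (
        indicators
        + [ind * elev for ind in indicators]
        + [ind * dist for ind in indicators]
    )


years = [1959, 1960, 1961]                  # the study uses 1959 through 1974
units = {                                   # hypothetical (elevation, distance) pairs
    "unit_a": (120.0, 15.0),
    "unit_b": (640.0, 80.0),
}

rows = [design_row(t, years, *units[u]) for u in units for t in years]
for row in rows:
    print(row)
```

With 16 years the same construction yields 48 columns, so each year has its own intercept, elevation coefficient, and distance coefficient.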
Sugar cane production: 1958/59-1973/74
[Maps: 1958/59; 1963/64; 1968/69; 1973/74. Scale: dark red = high; dark green = low]
Covariates in each year: time-based intercept, mean elevation, distance from San Juan

Year      Deviance   Pseudo-R2   MC for %   Residual MC
1958/59   1565       0.503       0.31968    0.04912
1959/60   1503       0.527       0.33317    0.05521
1960/61   1561       0.540       0.35751    0.06663
1961/62   1543       0.559       0.38669    0.08844
1962/63   1490       0.576       0.41571    0.10887
1963/64   1544       0.579       0.42272    0.10598
1964/65   1467       0.599       0.46101    0.12206
1965/66   1523       0.586       0.48383    0.16313
1966/67   1610       0.571       0.49018    0.18957
1967/68   1601       0.545       0.47420    0.17009
1968/69   1259       0.620       0.53851    0.17194
1969/70   1273       0.574       0.47448    0.13531
1970/71   1149       0.518       0.43049    0.18262
1971/72   1164       0.548       0.43207    0.12463
1972/73   1146       0.477       0.42875    0.19466
1973/74   899        0.566       0.39513    0.04261
Mixed binomial regression: time varying covariate coefficients, spatially unstructured and structured random effects

        Spatially unstructured              Spatially structured
Year    Deviance  Pseudo-R2  Residual MC    Deviance  Pseudo-R2  Residual MC   Selected vectors
58/59   473       0.881      0.33271        378       0.957      -0.02771      E3, E4, E6, E7, E8, E13, E18
59/60   403       0.906      0.34707        321       0.975      -0.07181      E3, E4, E6, E7, E8, E13, E18
60/61   368       0.938      0.31433        326       0.982      -0.03271      E1, E3, E4, E6, E7, E8, E13, E18
61/62   303       0.961      0.33815        279       0.988       0.03076      E3, E4, E6, E7, E11
62/63   261       0.983      0.19217        252       0.992       0.17739      E4
63/64   281       0.986      0.14692        271       0.993       0.09181      E1, E4
64/65   263       0.984      0.17054        254       0.989       0.07083      E3, E4
65/66   266       0.986      0.22023        254       0.988       0.04146      E3
66/67   302       0.977      0.33270        273       0.985       0.09299      E3, E6, E8
67/68   329       0.964      0.28672        290       0.976       0.03851      E1, E3, E4, E6, E8
68/69   320       0.966      0.30690        218       0.981      -0.08747      E1, E3, E4, E5, E6, E8, E12, E13, E14, E16
69/70   310       0.956      0.19651        250       0.976      -0.03816      E1, E2, E3, E4, E6, E8, E11, E16, E18
70/71   339       0.914      0.34359        181       0.979      -0.04857      E1, E3, E4, E6, E7, E8, E11, E15, E18
71/72   384       0.893      0.14420        207       0.965      -0.12290      E1, E2, E3, E4, E5, E6, E8, E9, E10, E11, E12, E16, E17, E18
72/73   427       0.806      0.24568        158       0.964      -0.13529      E1, E2, E3, E4, E6, E8, E9, E10, E11, E12, E13, E16, E17, E18
73/74   347       0.906      0.07071        167       0.945      -0.07292      E1, E2, E3, E4, E6, E8, E9, E10, E11, E12, E18
Spatial filters for space-time spatially structured random effects
[Maps of the spatial filters, by year]
1958/59: MC = 0.77, GR = 0.30
1963/64: MC = 0.93, GR = 0.18
1968/69: MC = 0.86, GR = 0.18
1973/74: MC = 0.94, GR = 0.22
(normally distributed) random intercept: areal unit specific across all years

Feature                        Spatially unstructured              Added to spatial structure
Sample mean                    -0.00864                            -0.00665
Sample variance                1.63044                             1.63797
Moran Coefficient (MC)         0.08672                             0.08778
Geary Ratio (GR)               1.10196                             1.09907
P(Shapiro-Wilk)                < 0.0001 (4 lower tail outliers)    < 0.0001 (4 lower tail outliers)
Correlations with covariates   (-0.17873, 0.32086)                 (-0.17833, 0.32095)
Time series plots: intercept & covariate binomial regression coefficients
[Plots: intercept; mean elevation; distance. Legend: ● simple pooled model; ■ comparative static model; ♦ model with a spatially unstructured random effect; ▲ mixed model with spatially structured random effect]
Time series plots: covariate binomial regression coefficient standard errors
[Plots: mean elevation; distance. Legend: ● simple pooled model; ■ comparative static model; ♦ model with a spatially unstructured random effect; ▲ mixed model with spatially structured random effect]
Residual serial correlation
The random effects estimator approximates the degree of serial correlation (or its importance in the model), and hence allows the computation of corrected estimates.
The 73 residual Durbin-Watson statistics have a range of (0.140, 2.513), with a mean of 0.836 and a standard deviation of 0.546.
Determining significance here is complicated because of small T, inclusion of a random effects term, and the variable number of spatial filter (SF) eigenvectors
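For reference, the Durbin-Watson statistic for one municipality's residual series is the sum of squared successive differences divided by the sum of squared residuals. The sketch below uses made-up residual series, not the study's residuals; values near 0 indicate positive serial correlation, values near 2 indicate none, and values near 4 indicate negative serial correlation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a single residual time series."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residual series for two areal units.
persistent = [1.0, 0.8, 0.9, 0.7]      # residuals keep the same sign and size
alternating = [1.0, -1.0, 1.0, -1.0]   # residuals flip sign every period

dw_persistent = durbin_watson(persistent)
dw_alternating = durbin_watson(alternating)
print(dw_persistent, dw_alternating)
```

Applying such a function to each of the 73 municipal residual series would give one statistic per areal unit, which is what the range, mean, and standard deviation reported above summarize.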
Graphical portrayal of DWs
[Map of GLM residual Durbin-Watson statistics (heuristic using 4 dfs lost). Legend classes: 0 – 0.74; 0.74 – 1.93; 1.93 – 2.08; 2.07 – 3.26; 3.26 – 4. Labels: positive serial correlation; undecided]
Summary of results
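STAR-binomial specification
$$ LN\left(\frac{sc_{i,t}}{area}\right) = \mu_t + \beta_{dist}\, dist_i + \beta_{elev}\, elev_i + \rho_T \frac{sc_{i,t-1}}{area} + \rho_s \sum_{j=1}^{73} w_{ij} \frac{sc_{j,t-1}}{area} + \varepsilon_{i,t} $$
Term labels: time; space; space-time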
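Pseudo- & quasi-likelihood estimation
pseudo-$R^2$ = 0.885
Estimates (magnitudes only; signs are not recoverable):
$\hat{\rho}_s$: 3.38
$\hat{\rho}_T$: 7.75
$\hat{\beta}_{elev}$: 0.19
$\hat{\beta}_{dist}$: 0.05
$\hat{\mu}_T$: terms 1.40 and 0.03T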
Extra binomial variation remains
Deviance statistics by year

Year      Covariates only   Spatially unstructured random effect   Spatially structured random effect
1958/59   1565              473                                    378
1959/60   1503              403                                    321
1960/61   1561              368                                    326
1961/62   1543              303                                    279
1962/63   1490              261                                    252
1963/64   1544              281                                    271
1964/65   1467              263                                    254
1965/66   1523              266                                    254
1966/67   1610              302                                    273
1967/68   1601              329                                    290
1968/69   1259              320                                    218
1969/70   1273              310                                    250
1970/71   1149              339                                    181
1971/72   1164              384                                    207
1972/73   1146              427                                    158
1973/74   899               347                                    167
[Plot legend: ● pineapple production; ■ milk production; ♦ sugar cane production; ▲ tobacco production]
Implications
1. spatial autocorrelation appears to be a source of part of the overdispersion
2. random effects (e.g., missing covariates) appear to be a source of part of the overdispersion
3. land use competition may be a source of part of the overdispersion
4. spatial filters for mean elevation and distance have six eigenvectors in common; of these, one is shared with most of the annual comparative static spatial filters, and two with most of the spatially structured random effect term spatial filters
5. the components of spatial autocorrelation in sugar cane production vary over time
6. a spatially unstructured random effect term that seeks to account for serial correlation in multiple short time series can better highlight latent spatial autocorrelation
7. a spatial filter can effectively structure a random effect term
8. failure to include a spatially structured random effect term can result in biased parameter estimates (largely because of the nonlinear nature of the model specification)
9. spatial and temporal autocorrelation interact in a complex way
THE END