Smoothing and Weighted Average Techniques · 2017. 12. 23. · weighted average of orthogonal local...

Smoothing and Weighted Average Techniques

Shreyan Ganguly, Jeffrey Gory, Meng Li,Matt Pratola, and Angela Dean

October 28, 2014

SSES Reading Group Smoothing and Weighted Averages October 28, 2014 1 / 30

General Overview

Non-Stationarity

Second order stationarity implies that the covariance structurebetween two distinct points in the spatial domain is only afunction of the distance between them

Under the assumption of non-stationarity, this is not the case

One method of estimating the covariance structure of anon-stationary spatial process is with smoothing or weightedaverage techniques


General Overview

Notable Papers

Fuentes (2001) represented a non-stationary spatial process as aweighted average of orthogonal local stationary processes

Fuentes and Smith (2001) extended this idea to a continuousconvolution of local stationary processes

Kim, Mallick, and Holmes (2005) proposed a similar approachwith piecewise Gaussian processes that allows for sharptransitions in the covariance structure

Nott and Dunsmuir (2002) used a weighted average of stationaryprocesses to describe the conditional behaviour of a processgiven the values of the process at a set of monitoring sites


Fuentes (2001)

High Frequency Kriging Procedure

Suppose Z (·) is a non-stationary process observed on a givenspatial region D

The region D is divided into k disjoint subregions Si ,i = 1, 2, . . . , k such that

D =k⋃

i=1

Si

Here k can be considered to be a variable valid for estimation


Fuentes (2001)

Representation of Z

Z is represented as a weighted average of orthogonal localstationary processes Zi

Z (s) =k∑

i=1

Zi(s)wi(s)

Each wi(·) is a weight function (e.g . the inverse of the distancebetween s and Si)Each Zi is a stationary process with spatial covariancerepresenting the spatial structure of Z in the subregion Si

The non-stationary covariance of Z is defined as

cov(Z (s),Z (t)) =k∑

i=1

wi(s)wi(t)cov(Zi(s),Zi(t))

where cov(Zi(s),Zi(t)) = Kθi(s − t), Kθi

(·) is MaternSSES Reading Group Smoothing and Weighted Averages October 28, 2014 5 / 30

Fuentes (2001)

Algorithm to Find k

Start with small number of subregions (i .e. a small value of k)

Increase k by dividing each subregion Si in half

Repeat this until we have less than 36 observations in thesubregions

or

until the AIC suggests no significant improvement in theestimation of θi for i = 1, . . . , k by increasing the number ofsubregions k


Fuentes (2001)

More Details on Finding k

If data are in a regular grid, the Whittle (1954) approximation tothe local stationary functions is used to calculate the globallikelihood to obtain the AIC value

If the data are not in a regular grid, then the method proposedby Kitanidis (1993) and by Mardia and Marshall (1984) is usedto efficiently maximize the global likelihood to obtain the AIC

The minimum number of observations in a subregion is restrictedto 36 as it is not recommended to calculate the periodogramwhen the number of sample points available to compute thesample covariance is less than 36 (see Haas 1990, Journel andHuijbregts 1978, SAS/STAT Technical Report 1996)


Fuentes (2001)

Estimation of Covariance Parameters

Recall that the non-stationary covariance of Z is represented as


i=1

wi(s)wi(t)cov(Zi(s),Zi(t))

Assuming cov(Zi(s),Zi(t)) = Kθi(s − t) is a Matern covariance

structure we have


i=1

wi(s)wi(t)Kθi(s − t)

We need to estimate θi for each Si


Fuentes (2001)

Estimation of Covariance Parameters (cont.)

Using a spectral approach assume the spectral function

fi(ω) = φi(|ω|2)(−νi−d/2)

for Zi for each subgrid Si so that θi = (φi , νi)T

Let Gi be the generalized covariance corresponding to fi

φi and νi are obtained using a weighted non-linear least squarestechnique in the spectral domain

With these estimates we get the estimated generalizedcovariance

G (Z (s),Z (t)) =k∑

i=1

wi(s)wi(t)Gi(Zi(s),Zi(t))


Fuentes and Smith (2001)

Continuous Case

Spatially Weighted Average of Local Stationary Processes:

Z (s) =k∑

i=1

Zi(s)wi(s)

where wi is the weight function and Z ′i s are orthogonalstationary processes with Zi(s) ⊥ Zj(t), for any i 6= j

Continuous Convolution of Local Stationary Processes:

Z (s) =

∫D

K (s − u)Zθ(u)(s)du

where K is kernel function and Zθ(u) u ∈ D, is a family ofindependent stationary Gaussian processes indexed by θ, which isassumed to vary smoothly across space



Kernel Smoothing Method

Stationary Gaussian processes can be represented in the form

Z (s) =

∫D

K (s − u)X (u)du

where K (·) is some kernel function and X (·) is a Gaussian whitenoise process

This can be extended to non-stationary processes



Kernel Smoothing Models

Model 1: Process Convolution Model

Z (s) =

∫D

Ks(u)X (u)du (1)

where Ks depends on the location s

Model 2: Continuous Convolution of Local Stationary Processes

Z (s) =

∫D

K (s − u)Zθ(u)(s)du (2)

where Zθ(u)(s), s ∈ D is a family of independent stationaryGaussian processes indexed by θ(u)



Kernel Smoothing Models (cont.)

Generally, Model (1) and (2) are not equivalent

Since Zθ(u) is a family of independent stationary Gaussianprocess, it can be written in the form,

Zθ(u)(s) =

∫R2

Ku(x − s)Xu(x)dx

where Ku is a kernel for each location u, and Xu is anindependent white noise process for each location u

Only in the case that the process Xu is common across all u(instead of being an independent white noise process for eachlocation u) can Model (1) and (2) be expressed interchangeably

However, Zθ(u)(s), s ∈ D are correlated through kernelsmoothing



Covariance Structure of Non-Stationary Process Z

Let Zθ(u) be a stationary Gaussian process with covariancematrix Cθ(u), that is,

cov(Zθ(u)(x1),Zθ(u)(x2)) = Cθ(u)(x1 − x2)

The covariance between locations x1 and x2 of the non-stationaryprocess Z is a convolution of the local covariances Cθ(u)(x1− x2),

C (x1, x2; θ) =

∫D

K (x1 − u)K (x2 − u)Cθ(u)(x1 − x2)du

Assume that θ(u) is a continuous function of u

In implementation, the integration can be approximated byRiemann sum or Monto Carlo integration.



Hierarchical Bayesian Model

Z (s) =

∫D

K (s − u)Zθ(u)(s)du

Stage 1:Conditional on model parameters {θ(u), u ∈ D}, Z is Gaussian

Stage 2:θ(u) = a + ri + cj + εθ(u), where the process εθ(u) is astationary spatial process with mean zero and its covariancestructure can be modeled by some well-known covariancefunction with unknown parametersAnother parameter that needs to be estimated is the bandwidthh that defines the kernel function K


Kim, Mallick, and Holmes (2005)

Piecewise Gaussian Processes

Much like Fuentes (2001), partition region into severalsubregions and fit a stationary process within each subregion

Adapted to deal with sharp transitions in covariance structure,so processes assumed independent across subregions

The resulting non-stationary model is not smooth

One possible application is modeling soil permeability

Voronoi tessellation (Green and Sibson 1978) used to determinenumber of subregions and their centers

Fit model within a Bayesian framework


Kim, Mallick, and Holmes (2005)

Voronoi Tessellation


Nott and Dunsmuir (2002)

General Idea

Measurements of a spatial process are available at a set ofmonitoring sites

An empirical covariance matrix can be estimated from thesemeasurements

Want to extend this covariance matrix to a covariance functionover the entire region of interest

Do this with a weighted average of local stationary processes



Illustration

Image from Haas 1990



Notation

Z = {Z (s), s ∈ Rd} is the spatial process of interest

s1, . . . , sn are the sites at which Z is observed

Γ is the covariance matrix for (Z (s1), . . . ,Z (sn))T

Wi(s), i = 1, . . . , I are a collection of independent, zero-mean,stationary processes which describe the behavior of Z locallyabout a collection z1, . . . , zI of I (unmonitored) locations

Ri(h), h ∈ Rd is the covariance function corresponding to Wi(s)

Ci = [Ri(sj − sk)]nj ,k=1 is the covariance matrix of

Wi = (Wi(s1), . . . ,Wi(sn))T

ci(s) is the vector (Ri(s − s1), . . . ,Ri(s − sn))T ofcross-covariances between Wi(s) and Wi



Representation of Wi(s)

We can represent Wi(s) as

Wi(s) = ci(s)TC−1i Wi + δi(s)

where δi(s) is a zero-mean, non-stationary random field withcovariance function

Rδi (s, t) = Ri(t − s)− ci(s)TC−1i ci(t)

Note that ci(s)TC−1i w is the simple kriging predictor of Wi(s)given Wi = w

δi(s) describes the residual variation, so δi(s) = 0 whens ∈ {s1, . . . , sn}



Representation of Non-Stationary Process

Define a non-stationary process W ∗(s) by

W ∗(s) =∑i

νi(s)µi(s) +∑i

νi(s)1/2δi(s)

Each νi(s) is a weight function

Each δi(s) is a residual process

Each µi(s) is a mean process given by

µi(s) = ci(s)TC−1i W ∗

where W ∗ is a zero-mean nx1 random vector with covariancematrix Γ uncorrelated with each δi(s)



Covariance of Non-Stationary Process

With this representation, W ∗(s) has covariance function

R∗(s, t) =∑i ,j

νi(s)νj(t)ci(s)TC−1i ΓC−1j cj(t)

+∑i

νi(s)1/2νi(t)1/2Rδi (s, t)

We use a covariance function of this form to estimate thecovariance of Z (s)



Covariance of Non-Stationary Process (cont.)

Changing notation we can write

Cov(W ∗(s),W ∗(t)) = Σ0(s, t) +I∑

i=1

wi(s)wi(t)Cθi (s − t)

Σ0(s, t) is a function of the empirical covariance matrix Γ andthe local stationary models Wi(s)

The Wi(s) are computed so that Cov(W ∗(sk),W ∗(sl)) = Γkl

The 2nd term is similar to the model proposed by Fuentes (2001)

Weighted average of residual covariance functions

Condition on observed values at the monitoring sites



Choice of Weight Functions

The weight functions νi(s) should have the following properties

νi(s) ≥ 0 ∀i = 1, . . . , I and ∀s ∈ Rd∑i νi(s) = 1

argmax(νi(s)) = zi ∀i = 1, . . . , I

νi(s) decays smoothly to zero as ||s − zi || −→ ∞One choice of νi(s) is

νi(s) = fη(s−zi )∑j fη(s−zj )

where fη(·) is a kernel function given by

fη(t) = exp(− ||t||2

η)

and η is a smoothing parameter



Estimation of Γ

Can estimate covariance matrix Γ by averaging over time

Γnxn =1

M

M∑i=1

(zi − z)(zi − z)T

Could alternatively use the Loader & Switzer (1992) empiricalBayes shrinkage estimator

ΓLS = λΓ + (1− λ)C



Estimation of the Ri(h)

For each i = 1, . . . , I fit a variogram within a window centeredon the point zi

z1, . . . , zI could be on a grid covering the region of interest

Might instead be based on knowledge of non-stationarity

Recommend at least 10 sites in each window

Related to moving window kriging (Haas 1990,1995)

One can then estimate ci(s), cj(t), Ci , Cj , and Rδi (s, t)

and ultimately compute R∗(s, t)



Advantages of this Method

Produces a valid non-negative definite non-stationary covariancefunction

”Computationally attractive,” especially in comparison todeformation techniques

Can be extended to multivariate processes


References

Fuentes, M. (2001). “A high frequency kriging approach fornon-stationary environmental processes,” Environmetrics, 12(5),469-483.

Fuentes, M. and Smith, R.L. (2001). “A new class ofnonstationary spatial models,” Technical report, North CarolinaState University, Institute of Statistics Mimeo Series #2534,Raleigh, NC.

Haas, T.C. (1990). “Lognormal and Moving Window Methodsof Estimating Acid Deposition,” Journal of the AmericanStatistical Association, 85, 950-963.


References (cont.)

Kim, H.M., Mallick, B.K., and Holmes, C.C. (2005). “Analyzingnonstationary spatial data using piecewise Gaussian processes,”Journal of the American Statistical Association, 100, 653-658.

Nott, D.J. and Dunsmuir, W.T.M. (2002). “Estimation ofnonstationary spatial covariance structure,” Biometrika, 89,819-829.

Sampson, P.D. (2010). ”Constructions for Nonstationary SpatialProcesses,” in Handbook of Spatial Statistics. Boca Raton, FL:CRC Press, 119-130.


Smoothing and Weighted Average Techniques · 2017. 12. 23. · weighted average of orthogonal local...

Documents

Transcript of Smoothing and Weighted Average Techniques · 2017. 12. 23. · weighted average of orthogonal local...