Statistics, data, and deterministic models NRCSE.

  • date post

    21-Dec-2015


Page 1: Statistics, data, and deterministic models NRCSE.

Statistics, data, and deterministic models

NRCSE

Page 2: Statistics, data, and deterministic models NRCSE.

Some issues in model assessment

Spatiotemporal misalignment

Grid boxes vs observations

Types of error

Measurement error and bias

Model error

Approximation error

Manipulate data or model output?

Two case studies:

SARMAP – kriging

MODELS-3 – Bayesian melding

Other uses of Bayesian hierarchical models

Page 3: Statistics, data, and deterministic models NRCSE.

Assessing the SARMAP model

60 days of hourly observations at 32 sites in Sacramento region

Hourly model runs for three “episodes”

Page 4: Statistics, data, and deterministic models NRCSE.

Task

Estimate from the data the ozone level at the points marked x in a grid square. Use the sum over these points to estimate the integral over the grid square.

Issues:

Transformation

Diurnal cycle

Temporal dependence

Spatial dependence

Space-time interaction

Page 5: Statistics, data, and deterministic models NRCSE.

Transformation

Heterogeneous variability: mean and variance are positively related

Square root transformation

All modeling is now on the square root scale, where the data are approximately normal
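
A square root transformation is the usual variance-stabilizing choice when the variance grows roughly in proportion to the mean. The Python sketch below illustrates this with simulated Poisson counts. The Poisson data are an assumption made purely for illustration (they are not the SARMAP ozone observations), and the function name `variance_by_mean` is hypothetical.

```python
import numpy as np

def variance_by_mean(means, n=200_000, seed=0):
    """Simulate Poisson samples (variance equal to the mean) and return the
    sample variance on the raw and square root scales for each mean."""
    rng = np.random.default_rng(seed)
    raw_var, sqrt_var = [], []
    for m in means:
        x = rng.poisson(m, size=n)         # illustrative data only
        raw_var.append(x.var())            # variance on the original scale
        sqrt_var.append(np.sqrt(x).var())  # variance after the transformation
    return np.array(raw_var), np.array(sqrt_var)

raw_var, sqrt_var = variance_by_mean([20, 40, 80])
print(raw_var)   # grows with the mean
print(sqrt_var)  # roughly constant across the three means
```

On the raw scale the variance tracks the mean, while on the square root scale it stays nearly flat, which is what makes a common-variance normal model more plausible after transformation.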

Page 6: Statistics, data, and deterministic models NRCSE.

Diurnal cycle

Page 7: Statistics, data, and deterministic models NRCSE.

Temporal dependence

Page 8: Statistics, data, and deterministic models NRCSE.

Spatial dependence

Page 9: Statistics, data, and deterministic models NRCSE.

Estimating a grid square average

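Estimate

$$\frac{1}{|A|}\int_A V_t^2(s)\,ds$$

using

$$\frac{1}{M}\sum_{j=1}^{M} E\left(V_t^2(s_j) \mid \text{data from } 1,\dots,t\right)$$

(not averages of squares of kriging estimates on the square root scale), where

$$V_t(s) = \sqrt{Z_t(s)}$$

$$V_t(s) = \mu_t(s) + W_t(s), \qquad W_t(s) = \alpha_1(s)W_{t-1}(s) + \alpha_2(s)W_{t-2}(s) + Y_t(s)$$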
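
The warning about not averaging squared kriging estimates follows from the identity E(V² | data) = (E(V | data))² + Var(V | data): squaring the estimate alone drops the variance term. The Python sketch below shows the difference, assuming the conditional means and variances on the square root scale are already available at M points in the grid square (for example from kriging). The function names and the numbers are hypothetical.

```python
import numpy as np

def grid_square_average(cond_mean, cond_var):
    """Approximate the grid square average of V^2 by the mean over points of
    E[V^2 | data] = E[V | data]^2 + Var[V | data]."""
    cond_mean = np.asarray(cond_mean, dtype=float)
    cond_var = np.asarray(cond_var, dtype=float)
    return float(np.mean(cond_mean**2 + cond_var))

def naive_average(cond_mean):
    """Average of squared estimates; this omits the variance term."""
    return float(np.mean(np.asarray(cond_mean, dtype=float) ** 2))

# Made-up conditional means and variances at M = 4 points in one grid square.
m = [7.0, 7.5, 8.0, 6.5]
v = [0.4, 0.5, 0.3, 0.6]
print(grid_square_average(m, v))  # includes the variance contribution
print(naive_average(m))           # smaller by the mean conditional variance
```

The gap between the two numbers is exactly the average conditional variance, so the naive estimate is biased low wherever the prediction is uncertain.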

Page 10: Statistics, data, and deterministic models NRCSE.

Looking at an episode

Page 11: Statistics, data, and deterministic models NRCSE.

Afternoon comparison

Page 12: Statistics, data, and deterministic models NRCSE.

Nighttime comparison

Page 13: Statistics, data, and deterministic models NRCSE.

A Bayesian approach

The SARMAP study is spatially data rich

If the data are spatially sparse, how do we estimate grid squares?

P = Z + M + A +

O = ()Z + B + E

P = process model output

O = observations

Z = truth

Calculate (Z |P,O) for prediction

Calculate (O |P, = 1, M = S = A = 0) for model assessment

Page 14: Statistics, data, and deterministic models NRCSE.

CASTNet and Models-3

CASTNet is a dry deposition network

Models-3 is a sophisticated air quality model

Average fluxes on a 36 × 36 km² grid

Weekly data and hourly output

Page 15: Statistics, data, and deterministic models NRCSE.

Estimated model bias

The multiplicative bias is taken to be spatially constant (= 0.5). The additive bias E(M+A+) is spatially distributed.

Page 16: Statistics, data, and deterministic models NRCSE.

Assessing model fit

Predict the CASTNet observation O_i from the posterior mean of the prediction, using the Models-3 output P_i and the remaining observations O_{-i}.

Average length of 90% credible intervals is 7 ppb

Average length using only Models-3 is 3.5 ppb

Page 17: Statistics, data, and deterministic models NRCSE.

Crossvalidation

Page 18: Statistics, data, and deterministic models NRCSE.

The Bayesian hierarchical approach

Three levels of modelling:

Data model:

f(data | process, parameters)

Process model:

f(process | parameters)

Parameter model:

f(parameters)

Use Bayes’ theorem to compute posterior

f(process, parameters | data)
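
Written out, the three levels combine through Bayes' theorem as follows. This is the standard factorization, stated in the slide's own generic f(·) notation:

$$f(\text{process}, \text{parameters} \mid \text{data}) = \frac{f(\text{data} \mid \text{process}, \text{parameters})\, f(\text{process} \mid \text{parameters})\, f(\text{parameters})}{f(\text{data})}$$

$$\propto f(\text{data} \mid \text{process}, \text{parameters})\, f(\text{process} \mid \text{parameters})\, f(\text{parameters})$$

The denominator f(data) does not depend on the process or the parameters, so the posterior is determined up to a constant by the product of the data, process, and parameter models.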

Page 19: Statistics, data, and deterministic models NRCSE.

Some applications

Data assimilation

Satellite tracking

Precipitation measurement

Combination of data on different scales

Image analysis

Agricultural field trials

Page 20: Statistics, data, and deterministic models NRCSE.

Application to Models-3

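$$f(Z(s_0) \mid P, O) \propto \int f(Z(s_0) \mid P, O, \theta)\, f(\theta \mid P, O)\, d\theta \approx \frac{1}{M}\sum_{i=1}^{M} f(Z(s_0) \mid P, O, \theta^{(i)})$$

where $\theta^{(i)}$ are samples from the posterior distribution of $\theta$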

                    ME    IL    NC    IN    FL    MI
CASTNet            0.15  3.29  0.90  3.14  0.57  1.02
Models-3           0.33  3.33  5.32  9.59  0.52  1.04
Adjusted Models-3  0.12  2.88  1.09  3.12  0.44  1.01
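
The Monte Carlo step on this slide, averaging a conditional density over posterior parameter draws, can be sketched in Python as below. Everything concrete here is an assumption for illustration: the conditional density is taken to be normal, the "posterior draws" are simulated from a standard normal rather than produced by a fit to CASTNet and Models-3, and the names `normal_pdf` and `predictive_density` are hypothetical. The toy setup is chosen because its exact answer is known in closed form.

```python
import numpy as np

def normal_pdf(z, mean, var):
    """Density of N(mean, var) evaluated at z."""
    return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def predictive_density(z, theta_samples, cond_var=1.0):
    """Monte Carlo estimate (1/M) * sum_i f(z | theta_i), with an assumed
    normal conditional density N(theta_i, cond_var)."""
    theta_samples = np.asarray(theta_samples, dtype=float)
    return float(normal_pdf(z, theta_samples, cond_var).mean())

rng = np.random.default_rng(1)
theta = rng.normal(0.0, 1.0, size=20_000)  # stand-in for posterior draws
estimate = predictive_density(0.5, theta)
exact = float(normal_pdf(0.5, 0.0, 2.0))   # closed form for this toy case
print(estimate, exact)
```

In a real application the draws would come from an MCMC run, and the conditional density would be whatever the fitted spatial model implies at the prediction location.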

Page 21: Statistics, data, and deterministic models NRCSE.

Predictions

Page 22: Statistics, data, and deterministic models NRCSE.

NCAR-GSP (IMAGe)

Theme for 2005: Data Assimilation in the Geosciences

Page 23: Statistics, data, and deterministic models NRCSE.

ENSO project

El Niño/Southern Oscillation is driven by surface temperature in tropical Pacific

Data: 2° × 2° monthly SST anomalies at 2261 locations; zonal 10 m wind

Previous work indicates EOFs of SST may develop in a Markovian fashion

Forecast 7 months ahead uses data from Jan 70 through latest available.

Cressie-Wikle-Berliner (http://www.stat.ohio-state.edu/~sses/collab_enso.php)

Page 24: Statistics, data, and deterministic models NRCSE.

Model

Standardize by subtracting climatology (monthly average 1971-2000)

Data model:

$$Z_t = \Phi a_t + \nu_t$$

($\Phi$: EOFs)

Process model:

$$a_{t+\tau} = \mu_t + H_t a_t + \eta_{t+\tau}$$

Parameter model:

$$H_t = H(I_t, J_t)$$

($I_t$: regimes, $J_t$: winds)

The current state is a mixture over three regimes (determined by SOI), with mixing probabilities that depend on the wind statistic
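
As a rough illustration of how the data and process models fit together, the Python sketch below computes a point forecast. It propagates the EOF coefficients one lead time ahead with the noise term set to its mean (assumed zero), then maps them back to the grid. This is a simplification and not the full Bayesian forecast of the project: the regime mixture and the wind dependence of the propagator are not modelled, and all matrices, sizes, and the function name `forecast_field` are made up.

```python
import numpy as np

def forecast_field(a_t, mu_t, H_t, Phi):
    """Point forecast of the anomaly field at lead tau:
    a_{t+tau} = mu_t + H_t a_t (noise at its assumed mean of zero),
    then Z = Phi a to return to grid space."""
    a_next = mu_t + H_t @ a_t
    return Phi @ a_next

rng = np.random.default_rng(0)
n_pixels, n_eofs = 6, 2                    # toy sizes, not the 2261 locations
Phi = rng.normal(size=(n_pixels, n_eofs))  # stand-in for the EOF matrix
H_t = np.array([[0.9, 0.1],
                [0.0, 0.8]])               # hypothetical propagator
mu_t = np.array([0.1, -0.2])
a_t = np.array([1.0, 0.5])

z_hat = forecast_field(a_t, mu_t, H_t, Phi)
print(z_hat.shape)
```

Working with a few EOF coefficients rather than every grid location keeps the propagator small, which is the practical appeal of the reduced-dimension process model.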

Page 25: Statistics, data, and deterministic models NRCSE.

Latest ENSO forecast

Page 26: Statistics, data, and deterministic models NRCSE.

Latest forecast with data

[Figure panels: forecast; data]

Page 27: Statistics, data, and deterministic models NRCSE.

Relative performance

Performance measure for anomalies:

ave{(forecast - data)²} over all pixels in the Niño 3.4 region

Relative Performance of Forecast A relative to Forecast B is

RP(A,B)=log(Perf B / Perf A)

RP(A,B)>0 indicates A better than B

Persistence: Predict using the data from 7 months earlier

Climatology: Predict using 0
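
The performance measure and the relative performance defined above can be computed directly. The Python sketch below uses made-up anomaly values for four pixels, and the function names `perf` and `relative_performance` are hypothetical. Climatology is represented by an all-zero forecast, as on the slide, and the persistence values stand in for the anomalies observed 7 months earlier.

```python
import numpy as np

def perf(forecast, data):
    """Mean squared difference between forecast and data over all pixels."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(data, dtype=float)
    return float(np.mean(diff ** 2))

def relative_performance(forecast_a, forecast_b, data):
    """RP(A, B) = log(Perf B / Perf A); positive when A beats B.
    Assumes neither forecast matches the data exactly (no zero division)."""
    return float(np.log(perf(forecast_b, data) / perf(forecast_a, data)))

# Made-up anomalies over four pixels.
data = np.array([0.5, -0.2, 1.0, 0.3])
model = np.array([0.4, -0.1, 0.8, 0.2])
climatology = np.zeros_like(data)              # predict using 0
persistence = np.array([0.9, 0.4, 0.1, -0.3])  # stand-in for data 7 months ago

rp_clim = relative_performance(model, climatology, data)
rp_pers = relative_performance(model, persistence, data)
print(rp_clim, rp_pers)
```

Because the measure is a log ratio, swapping the two forecasts flips its sign, and a value of zero means the two forecasts have equal mean squared error.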

Page 28: Statistics, data, and deterministic models NRCSE.

Comparison to climatology and persistence