A Bayesian hierarchical modeling approach to reconstructing past climates David Hirst Norwegian...

Post on 19-Dec-2015


A Bayesian hierarchical modeling approach to reconstructing past climates

David Hirst

Norwegian Computing Center

Temperature data

• Many locations

• Direct measure of temperature

• Annual or better resolution

• small (known?) error

• Not too many missing values

• Short series

Proxy data

• Long series

• Few ("strange") locations

• Relationship with temperature unclear, may change over time

• Often coarse resolution

• Large (unknown) error

• Lots of missing values

• Pre-processing critical

Current reconstruction methods:

1) Choose proxies

2) Create matrix X of pre-processed proxy by time

3) Create matrix Y of instrumental temperatures.

4) Relate X to Y (by PCA of one or both, then regression of X on Y or Y on X)

5) Use X to predict Y back in time

Difficulties with existing methods:

• Missing data

• Spatial association between proxies and instruments lost

• PCA of proxy data dangerous

• Uncertainty in temperature data ignored

• Difficult to include proxies at different resolutions

Consequences:

• Underestimation of past climate variability

• Wrong uncertainty

An alternative approach

• Regard both instruments and proxies as observations of an underlying temperature process.

• Model all observations including appropriate error terms

In general:

• Model temperature as an underlying space-time field

• Model data (proxies and thermometers) as observations of this field

• Use appropriate functional relationship between proxies and temperature

• Use appropriate error terms

Specifically:

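True temperature T(t) an AR(1) process:

$$T_t \sim N(\rho_T T_{t-1}, \sigma_T^2)$$

Observations O = linear function of T plus AR(1) error E + measurement error

$$O_{i,t} = \alpha_i + \beta_i T_t + E_{i,t} + \varepsilon_{i,t}$$

(i indexes the thermometer or proxy series, t the time point)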

For low resolution proxy replace T by mean over appropriate period
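The replacement of T by a period mean can be sketched as follows. This is only an illustration: the names `block_mean`, `low_res_proxy_mean`, `alpha`, `beta` and `period` are hypothetical, and it assumes the proxy averages over consecutive, non-overlapping blocks of equal length, which the slides do not specify.

```python
import numpy as np

def block_mean(temperature, period):
    """Average an annual series over consecutive blocks of `period` years."""
    temperature = np.asarray(temperature, dtype=float)
    n_blocks = temperature.size // period      # an incomplete final block is dropped
    trimmed = temperature[: n_blocks * period]
    return trimmed.reshape(n_blocks, period).mean(axis=1)

def low_res_proxy_mean(temperature, alpha, beta, period):
    """Expected proxy value: a linear function of the block-mean temperature.

    alpha and beta are an assumed intercept and slope for one proxy;
    the AR(1) error and measurement error are left out here.
    """
    return alpha + beta * block_mean(temperature, period)
```

For example, a decadal proxy would use `period=10`, so each proxy value is tied to the mean of ten annual temperatures rather than to a single year.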

A simulation study

• 50 years of thermometer data

• 250 years of proxies

• True temperature AR1, coefficient=0.95, sd =1

• 10 thermometers, small AR1 error (coef=0.7, sd=0.1)

• 5 proxies, (coef=0.7, sd=1)
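A setup like the one listed above could be simulated roughly as below. Several details are assumptions rather than facts from the slides: the quoted sd values are read as innovation standard deviations, every series has intercept 0 and slope 1, no separate measurement error is added, and index 0 is taken to be the present. The function names are hypothetical.

```python
import numpy as np

def ar1(n, coef, sd, rng):
    """Simulate n steps of a zero-mean AR(1) series with innovation sd `sd`."""
    x = np.empty(n)
    # Draw the first value from the stationary distribution (no burn-in needed).
    x[0] = rng.normal(0.0, sd / np.sqrt(1.0 - coef**2))
    for t in range(1, n):
        x[t] = coef * x[t - 1] + rng.normal(0.0, sd)
    return x

def simulate_study(rng, n_years=250, n_instr_years=50,
                   n_therm=10, n_prox=5, prox_sd=1.0):
    """Return (true temperature, thermometer matrix, proxy matrix)."""
    temp = ar1(n_years, 0.95, 1.0, rng)            # true temperature
    # Thermometers see only the most recent n_instr_years, with small AR(1) error.
    therm = np.stack([temp[:n_instr_years] + ar1(n_instr_years, 0.7, 0.1, rng)
                      for _ in range(n_therm)])
    # Proxies cover the whole period, with larger AR(1) error.
    prox = np.stack([temp + ar1(n_years, 0.7, prox_sd, rng)
                     for _ in range(n_prox)])
    return temp, therm, prox
```

Calling `simulate_study(np.random.default_rng(1))` gives one replicate; the later scenarios in the slides correspond to changing `n_prox` and `prox_sd`.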

For comparison, regression estimator

• Find first pc of proxies

• Regress thermometer mean on pc

• predict ”temperature” (actually thermometer mean) using regression
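The three steps of the comparison estimator might look like this in code. It is a minimal sketch, not the implementation used for the slides: it assumes complete proxy data, a proxy matrix of shape (number of proxies, number of years) with the instrumental period in the first columns, and ordinary least squares for the regression. The function name is hypothetical.

```python
import numpy as np

def regression_reconstruction(prox, therm):
    """Predict the thermometer mean from the first principal component of the proxies.

    prox  : array (n_prox, n_years), no missing values
    therm : array (n_therm, n_instr_years), aligned with the first columns of prox
    """
    prox = np.asarray(prox, dtype=float)
    therm = np.asarray(therm, dtype=float)
    n_instr = therm.shape[1]

    # Step 1: first principal component of the (centred) proxies.
    centred = prox - prox.mean(axis=1, keepdims=True)
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    pc1 = s[0] * vt[0]                      # one score per year

    # Step 2: regress the thermometer mean on the component (calibration period).
    therm_mean = therm.mean(axis=0)
    slope, intercept = np.polyfit(pc1[:n_instr], therm_mean, 1)

    # Step 3: predict the thermometer mean for every year.
    return intercept + slope * pc1
```

The sign of a principal component is arbitrary, but the fitted slope absorbs it, so the prediction is unaffected.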

[Figure: proxy values and the true temperature against time before present (0 to 250 years). no.therm = 10, no.prox = 5, prox error sd = 1]

[Figure: Point estimates. Bayesian and regression estimates and the true temperature against time before present. no.therm = 10, no.prox = 5, prox error sd = 1]

[Figure: Bayesian. Temperature against time before present. rmse = 0.76, coverage = 0.82, int.width = 2.03]

[Figure: Regression. Temperature against time before present. rmse = 1.26, coverage = 0.38, int.width = 1.36]
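The slides report rmse, coverage and int.width for each method without defining them. One plausible reading, sketched below, is the root mean squared error of the point estimate, the fraction of years in which the interval contains the true temperature, and the mean interval width. The nominal level of the intervals is not stated in the source, and the function name is hypothetical.

```python
import numpy as np

def summarise(truth, estimate, lower, upper):
    """Return (rmse, coverage, mean interval width) for one reconstruction."""
    truth, estimate, lower, upper = (
        np.asarray(a, dtype=float) for a in (truth, estimate, lower, upper)
    )
    rmse = np.sqrt(np.mean((estimate - truth) ** 2))
    coverage = np.mean((lower <= truth) & (truth <= upper))
    int_width = np.mean(upper - lower)
    return rmse, coverage, int_width
```

Under this reading, a method with narrow intervals and low coverage is overconfident, which is the pattern the regression figures show.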

Add uncertainty to proxies

• Only 2 proxies

• error sd = 2

[Figure: proxy values and the true temperature against time before present (0 to 250 years). no.therm = 10, no.prox = 2, prox error sd = 2]

[Figure: Point estimates. Bayesian and regression estimates and the true temperature against time before present. no.therm = 10, no.prox = 2, prox error sd = 2]

[Figure: Bayesian. Temperature against time before present. rmse = 1.53, coverage = 0.82, int.width = 4.09]

[Figure: Regression. Temperature against time before present. rmse = 2.44, coverage = 0.42, int.width = 3.34]

The effect of missing data

• 5 proxies, error sd = 1

• 50% proxy data missing at random
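To see why this scenario is hard for a method that needs a complete proxy matrix, the sketch below deletes values independently at random and counts the years in which every proxy is still observed. The stand-in proxy values are plain random numbers and the function name is hypothetical; how the regression estimator in the slides dealt with the gaps is not stated.

```python
import numpy as np

def mask_at_random(prox, frac_missing, rng):
    """Return a copy of prox with each value independently set to NaN."""
    prox = np.array(prox, dtype=float)              # copy, input left untouched
    missing = rng.random(prox.shape) < frac_missing
    prox[missing] = np.nan
    return prox

rng = np.random.default_rng(0)
prox = rng.normal(size=(5, 250))                    # stand-in for 5 proxies, 250 years
masked = mask_at_random(prox, 0.5, rng)

# Years in which all 5 proxies are observed.
complete_years = ~np.isnan(masked).any(axis=0)
print(complete_years.sum(), "of", masked.shape[1], "years are complete")
```

With 5 proxies and 50% missing, a given year is complete with probability 0.5^5, about 3%, so a complete-case PCA would keep only a handful of years. A model that treats each observation separately simply has fewer observations in some years.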

[Figure: proxy values and the true temperature against time before present (0 to 250 years). no.therm = 10, no.prox = 5, prox error sd = 1, 50% missing]

[Figure: Point estimates, 50% missing. Bayesian and regression estimates and the true temperature against time before present. no.therm = 10, no.prox = 5, prox error sd = 1]

[Figure: Point estimates (complete data). Bayesian and regression estimates and the true temperature against time before present. no.therm = 10, no.prox = 5, prox error sd = 1]

[Figure: Bayesian, 50% missing. Temperature against time before present. rmse = 0.81, coverage = 0.9, int.width = 2.6]

[Figure: Bayesian. Temperature against time before present. rmse = 0.76, coverage = 0.82, int.width = 2.03]

[Figure: Regression, 50% missing. Temperature against time before present. rmse = 1.61, coverage = 0.56, int.width = 2.67]

[Figure: Regression. Temperature against time before present. rmse = 1.26, coverage = 0.38, int.width = 1.36]

Add a trend

• Only 150 years for proxies

• cosine trend, cycle 50 years, amplitude 4 (first 50 years) 8 (next 50) and 12 (last 50)
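A trend of this kind could be generated as in the sketch below. The slides leave several things open, so these are assumptions: "amplitude" is taken as the peak value of the cosine, the three 50-year segments are counted from index 0, and the cosine starts at its maximum. The function name is hypothetical.

```python
import numpy as np

def cosine_trend(n_years=150, cycle=50, amplitudes=(4.0, 8.0, 12.0)):
    """Cosine with a fixed cycle length and a stepwise-changing amplitude."""
    t = np.arange(n_years)
    amps = np.asarray(amplitudes, dtype=float)
    # Amplitude of the segment each year falls in (last value reused if needed).
    segment = np.minimum(t // cycle, amps.size - 1)
    return amps[segment] * np.cos(2.0 * np.pi * t / cycle)

trend = cosine_trend()
```

Adding such a trend to an AR(1) series gives a temperature that is no longer a stationary AR(1) process, which is the sense in which the fitted model is misspecified in this scenario.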

• AR1 model for temperature no longer correct

[Figure: proxy values and the true temperature against time before present (0 to 150 years). no.therm = 10, no.prox = 5, prox error sd = 1]

[Figure: Point estimates. Bayesian and regression estimates and the true temperature against time before present. no.therm = 10, no.prox = 5, prox error sd = 1]

[Figure: Bayesian. Temperature against time before present. rmse = 0.79, coverage = 0.81, int.width = 1.98]

[Figure: Regression. Temperature against time before present. rmse = 1.06, coverage = 0.54, int.width = 1.74]

Add lots of ”bad” proxies

• 2 proxies linearly related to temperature

• 20 proxies unrelated to temperature

[Figure: Point estimates. Bayesian and regression estimates and the true temperature against time before present (0 to 250 years). no.therm = 10, no.good prox = 2, no.bad prox = 20, prox error sd = 2]

Some data from China

• Two proxies used in Moberg et al. 2005

• 10 closest instrumental data sets

[Figure: Chinese Proxies. Temperature against year (1000 to 2000); legend: Beijing, China]

[Figure: Instrumental Beijing China. Temperature against year (1850 to 2000)]

[Figure: temperature against time before present (0 to 1000 years)]

Modelling conclusions

• A flexible model which can take account of many sources of uncertainty

• Theoretically easy to include spatial correlations

• Can include proxies at different resolutions

• Missing data not a problem

• Avoids underestimation of variability if model correct

• Functional form of temperature and error series very important

Other conclusions

• Impossible to work with proxies without help from appropriate scientists (preferably those who collected the data)

• Pre-processing crucial

• Selection of proxies important

• Some assumptions impossible to verify