Applications of Bayesian sensitivity and uncertainty analysis to the statistical analysis of computer simulators for carbon dynamics
Marc Kennedy
Clive Anderson, Stefano Conti, Tony O’Hagan
Probability & Statistics, University of Sheffield

Outline
Uncertainties in computer simulators
Bayesian inference about simulator outputs
– Creating an emulator for the simulator
– Deriving uncertainty and sensitivity measures
Example application
Some recent extensions

Uncertainties in computer simulators
Consider a complex deterministic code with a vector of inputs and single output
$$y = f(\mathbf{x})$$
Use of the code is subject to:
– Input uncertainty
– Code uncertainty

Input uncertainty
The inputs to the simulator are unknown for a given real world scenario
Therefore the true value of the output is uncertain
A Monte Carlo approach is often used to take this uncertainty into account
– Sample from the probability distribution of X
– Run the simulator for each point in the sample to give a sample from the distribution of Y
– Very inefficient…not practical for complex codes
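
The Monte Carlo steps above can be sketched as follows. This is only an illustration: `toy_simulator` is a hypothetical cheap stand-in for an expensive code, and the two input distributions are assumptions made for the example, not taken from the talk.

```python
import random
import statistics

def toy_simulator(x1, x2):
    """Hypothetical cheap stand-in for an expensive simulator y = f(x)."""
    return x1 ** 2 + 0.5 * x2

def monte_carlo_output_sample(n_samples, seed=0):
    """Propagate input uncertainty by sampling X and running the simulator."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Assumed input distributions: X1 ~ N(1, 0.2^2), X2 ~ U(0, 2).
        x1 = rng.gauss(1.0, 0.2)
        x2 = rng.uniform(0.0, 2.0)
        outputs.append(toy_simulator(x1, x2))  # one simulator run per sample point
    return outputs

ys = monte_carlo_output_sample(10_000)
print("mean of Y:", statistics.mean(ys))
print("std of Y: ", statistics.stdev(ys))
```

Each sample point costs one simulator run, so 10,000 points are harmless for a toy function but out of reach for a code that takes hours per run.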

Code uncertainty
The code output at a given input point is unknown until we run it at that point
– In practice codes can take hours or days to run, so we have a limited number of runs
We have some prior beliefs about the output
– Smooth function of the inputs

Bayesian inference about simulator outputs
Bayesian solution involves building an emulator
Highly efficient
– Makes maximum use of all available information
– A single set of simulator runs is required to train the emulator. All sensitivity and uncertainty information is derived directly from this
– The inputs for these runs can be chosen to give good information about the simulator output
A natural way to treat the different uncertainties within a coherent framework
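
The slides do not say how the training inputs were chosen. One common space-filling choice is a Latin hypercube, sketched below as an assumption rather than as the authors' design. The function `latin_hypercube` and the input ranges are illustrative only.

```python
import random

def latin_hypercube(n_runs, bounds, seed=0):
    """Return n_runs design points; each input range is split into n_runs
    equal strata and every stratum is used exactly once per input."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_runs))
        rng.shuffle(strata)  # random pairing of strata across inputs
        width = (high - low) / n_runs
        columns.append([low + (s + rng.random()) * width for s in strata])
    # Transpose so that each row is one simulator input vector.
    return [list(point) for point in zip(*columns)]

# Illustrative ranges for two inputs (not the ranges used in the study).
design = latin_hypercube(8, [(100.0, 350.0), (4.0, 10.0)])
for point in design:
    print(point)
```

With 8 runs, each input is sampled once in each of its 8 equal-width intervals, so no part of either range is left unexplored.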

Inference about functions using Gaussian processes
We model $f(\cdot)$ as an unknown function having a Gaussian process prior distribution
$$[f(\cdot) \mid \boldsymbol{\beta}, \sigma^2] \sim N\big(\mathbf{h}(\cdot)^T \boldsymbol{\beta},\; \sigma^2 c(\cdot,\cdot)\big)$$
$\mathbf{h}(\cdot)$ is a vector of regression functions and $\boldsymbol{\beta}$ are unknown coefficients
– $\mathbf{h}(\cdot)^T \boldsymbol{\beta}$: prior expectation of the model output as a function of the inputs
$c(\cdot,\cdot)$ is a correlation function, which defines our beliefs about smoothness of the output, and $\sigma^2$ is the GP variance
– $\sigma^2 c(\cdot,\cdot)$: prior beliefs about covariance between model outputs

Choice of correlation function
We use the product of univariate Gaussian functions:
$$c(\mathbf{x}, \mathbf{x}') = \prod_{k=1}^{p} \exp\{-b_k (x_k - x_k')^2\}$$
where $b_k$ is a measure of the roughness of the function in the $k$th input
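
A minimal sketch of this correlation function is given below. The function name and the example roughness values are chosen for illustration.

```python
import math

def gaussian_correlation(x, x_prime, b):
    """c(x, x') = prod_k exp(-b_k * (x_k - x'_k)**2)."""
    if not (len(x) == len(x_prime) == len(b)):
        raise ValueError("x, x_prime and b must have the same length")
    value = 1.0
    for xk, xk_prime, bk in zip(x, x_prime, b):
        value *= math.exp(-bk * (xk - xk_prime) ** 2)
    return value

print(gaussian_correlation([0.0, 0.0], [0.0, 0.0], [0.5, 0.01]))   # identical points
print(gaussian_correlation([0.0, 0.0], [1.0, 1.0], [0.5, 0.01]))   # rougher in input 1
print(gaussian_correlation([0.0, 0.0], [1.0, 1.0], [0.01, 0.01]))  # smooth in both inputs
```

Smaller roughness values keep the outputs at distant input points strongly correlated, which corresponds to a smoother function.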

[Figures: four plots labelled roughness = 0.5, roughness = 0.2, roughness = 0.1 and roughness = 0.01]

Conditioning on code runs
Conditional on the observed set of training runs,
$$y_i = f(\mathbf{x}_i), \quad i = 1, 2, \ldots, n,$$
$f(\cdot)$ is still a Gaussian process, with simple analytical forms for the posterior mean and covariance functions
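
The conditioning step can be sketched for a single input as follows. To stay short, this example assumes a constant prior mean `beta` and treats `beta`, `sigma2` and the roughness `b` as known, which is a simplification. With correlation matrix $A$ over the training inputs and vector $t(x)$ of correlations between $x$ and the training inputs, the posterior mean is $\beta + t(x)^T A^{-1}(y - \beta 1)$ and the posterior variance is $\sigma^2 (1 - t(x)^T A^{-1} t(x))$. The function `toy_code` is a hypothetical stand-in for the simulator.

```python
import math

def corr(x, x_prime, b):
    """One-input Gaussian correlation c(x, x') = exp(-b (x - x')^2)."""
    return math.exp(-b * (x - x_prime) ** 2)

def solve(a, rhs):
    """Solve a @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z

def emulate(x_new, x_train, y_train, beta, sigma2, b):
    """Posterior mean and variance of f(x_new) given the training runs."""
    a = [[corr(xi, xj, b) for xj in x_train] for xi in x_train]
    t = [corr(x_new, xi, b) for xi in x_train]
    resid = [yi - beta for yi in y_train]
    mean = beta + sum(ti * wi for ti, wi in zip(t, solve(a, resid)))
    var = sigma2 * (1.0 - sum(ti * vi for ti, vi in zip(t, solve(a, t))))
    return mean, max(var, 0.0)  # clip tiny negative rounding errors

def toy_code(x):
    """Hypothetical stand-in for the simulator."""
    return math.sin(x)

x_train = [0.0, 1.0, 2.0, 3.0, 4.0]  # 5 code runs
y_train = [toy_code(x) for x in x_train]
for x_new in (1.0, 1.5, 6.0):
    print(x_new, emulate(x_new, x_train, y_train, beta=0.0, sigma2=1.0, b=1.0))
```

At a training input the emulator reproduces the code output with essentially zero variance, while far from all training inputs the variance returns towards the prior value `sigma2`.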

[Figures: 2 code runs (three plots, the third comparing large b and small b), 3 code runs, 5 code runs]

More about the emulator
The emulator mean is an estimate of the model output and can be used as a surrogate
The emulator is much more…
– It is a probability distribution for the whole function
– This allows us to derive inferences for many output related quantities, particularly integrals

Inference for integrals
For particular forms of input distribution (Gaussian or uniform), analytical forms have been derived for integration-based sensitivity measures
– Main effects of individual inputs
– Joint effects of pairs of inputs
– Sensitivity indices
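
To show what a sensitivity index measures, the sketch below estimates the usual variance-based index $\mathrm{Var}(E[Y \mid X_i]) / \mathrm{Var}(Y)$ for each input. It does not use the analytical emulator-based forms mentioned above. Instead it swaps in brute-force Monte Carlo on a hypothetical cheap model (`toy_model`) with independent U(0, 1) inputs.

```python
import random
import statistics

def toy_model(x):
    """Hypothetical cheap model with two inputs on [0, 1]."""
    return 3.0 * x[0] + x[1]

def main_effect_index(model, which, n_inputs, n_outer=200, n_inner=200, seed=0):
    """Estimate Var(E[Y | X_which]) / Var(Y) for independent U(0, 1) inputs."""
    rng = random.Random(seed)
    conditional_means = []
    all_outputs = []
    for _ in range(n_outer):
        fixed = rng.random()              # value of the input of interest
        ys = []
        for _ in range(n_inner):
            x = [rng.random() for _ in range(n_inputs)]
            x[which] = fixed              # average over the other inputs only
            ys.append(model(x))
        conditional_means.append(statistics.mean(ys))
        all_outputs.extend(ys)
    return statistics.pvariance(conditional_means) / statistics.pvariance(all_outputs)

s1 = main_effect_index(toy_model, 0, 2)
s2 = main_effect_index(toy_model, 1, 2)
print("index for input 1:", s1)
print("index for input 2:", s2)
```

For this toy model the exact indices are 0.9 and 0.1. The estimate needs 40,000 model runs per input, which is why an emulator-based calculation matters for expensive codes.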

Example Application

Sheffield Dynamic Global Vegetation Model (SDGVM)
Developed within the Centre for Terrestrial Carbon Dynamics
Our job with SDGVM is to:
– Apply sensitivity analysis for model testing
– Identify the greatest sources of uncertainty
– Correctly reflect the uncertainty in predictions

Net Ecosystem Production (NEP, the carbon flux)
[Figure: carbon flux diagram showing photosynthesis, with plant respiration and soil respiration each marked as a loss]
– Terrestrial carbon source if NEP is negative
– Terrestrial carbon sink if NEP is positive

Some Input Parameters
Leaf life span
Leaf area
Budburst temperature
Senescence temperature
Wood density
Maximum carbon storage
Xylem conductivity
Soil clay %
Soil sand %
Soil depth
Soil bulk density

Main Effect: Leaf life span
[Figure: mean NEP (axis 0 to 30) against leaf life-span (axis 100 to 350)]
If leaves die young, NEP is predicted to be higher, on average. Why?

Main Effect: Leaf life span (updated)
[Figure: mean NEP (axis 0 to 30) against leaf life-span (axis 100 to 350)]
If leaves die young, SDGVM allowed a second growing season, resulting in increased carbon uptake. This problem was fixed by the modellers

Main Effect: Senescence Temperature
[Figure: mean NEP (axis 0 to 30) against senescence (axis 4 to 10)]
Small values mean the leaves stay until the temperature is very low
Large values mean the leaves drop earlier, so reduce the growing season

When soil bulk density was added to the active parameter set, the Gaussian process model did not fit the training data properly

Error discovered in the soil module
[Figure: two panels, Before… and After…, each showing NEP (axis −20 to 80) against bulk density (axis 0 to 1500000)]

Our GP model depends on the output being a smooth function of the inputs. The problem was again fixed by the modellers

SDGVM: new sensitivity analysis
Extended sensitivity analysis to 14 input parameters (using a more stable version)
Assumed uniform probability distributions for each of the parameters
The aim here is to identify the greatest potential sources of uncertainty

[Figure: four panels showing NEP (g/m²/y, axis 150 to 190) against max. age (years, axis 160 to 200), water potential (MPa, axis 1.8 to 2.6), leaf life span (days, axis 160 to 200) and minimum growth rate (m, axis 0.0035 to 0.0045)]

Percentage of total output variance
Leaf life span 69.1%
Minimum growth rate 14.2%
Water potential 3.4%
Maximum age 1.0%
By investing effort to learn more about this parameter (leaf life span), output uncertainty could be significantly reduced

Extensions to the theory

Multiple outputs
So far we have created independent emulators for each output
– Ignores information about the correlation between outputs
We are experimenting with simple models linking the outputs together
This is an important first step in treating dynamic emulators and in aggregating code outputs

Dynamic emulators
Physical systems typically evolve over time
Their behaviour is modelled via dynamic codes
$$y_t = f(y_{t-1}, \mathbf{x}, \mathbf{z}_t)$$
– where $\mathbf{x}$ are tuning constants and $\mathbf{z}_t$ are context-specific drivers
– Recursive emulation of $y_t$ over the appropriate time span shows promising results
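
The recursion can be sketched as below. Here `single_step` is a hypothetical one-step model with a single tuning constant and a made-up driver series. In a dynamic emulator, the one-step function would presumably be replaced by an emulator of the single-step code, which this sketch does not attempt.

```python
def single_step(y_prev, x, z_t):
    """Hypothetical one-step model y_t = f(y_{t-1}, x, z_t).
    Here: a carbon pool that decays at rate x and receives input z_t."""
    return (1.0 - x) * y_prev + z_t

def run_recursively(step, y0, x, drivers):
    """Apply the one-step function over the whole time span."""
    trajectory = [y0]
    for z_t in drivers:
        trajectory.append(step(trajectory[-1], x, z_t))
    return trajectory

drivers = [1.0, 1.2, 0.8, 1.1, 0.9]  # made-up driver series z_1..z_5
path = run_recursively(single_step, y0=10.0, x=0.1, drivers=drivers)
print(path)
```

Because each step feeds on the previous output, any error in the one-step approximation is carried forward through the time span.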

[Figure: CENTURY output and dynamic emulator]

Aggregating outputs
Motivated by the UK carbon budget problem
– The total UK carbon absorbed by vegetation is a sum of individual pixels/sites
– Each site has a different set of input parameters (e.g. vegetation/soil properties), but some of these are correlated
This is a multiple output code
– Each site represents a different output
Bayesian uncertainty analysis is being extended, to make inference about the sum
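
A small numerical sketch shows why correlation between sites matters for the sum. The covariance matrix is made up for illustration and is not taken from the study. The variance of a sum is the sum of all entries of the covariance matrix, so treating sites as independent drops the off-diagonal terms.

```python
def variance_of_sum(cov):
    """Var(sum_i Y_i) = sum_i sum_j Cov(Y_i, Y_j)."""
    return sum(sum(row) for row in cov)

def variance_ignoring_correlation(cov):
    """Sum of the diagonal only, as if the site outputs were independent."""
    return sum(cov[i][i] for i in range(len(cov)))

# Made-up covariance matrix for three sites with positively correlated outputs.
cov = [
    [4.0, 1.5, 0.5],
    [1.5, 9.0, 2.0],
    [0.5, 2.0, 1.0],
]
print(variance_of_sum(cov))                # 22.0
print(variance_ignoring_correlation(cov))  # 14.0
```

With positively correlated sites, ignoring the correlation understates the uncertainty in the total.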