

Stochastic PDE model identification from sparse noisy data

Fei Lu (1), Nils Weitzel (2), and Adam Monahan (3)

(1) Department of Mathematics, Johns Hopkins University
(2) Institut für Geowissenschaften und Meteorologie, Universität Bonn
(3) School of Earth and Ocean Sciences, University of Victoria

FL acknowledges support from the National Science Foundation grant DMS-1821211 and the Johns Hopkins University. Email enquiries to [email protected].

References
1. Lu, F., Weitzel, N., & Monahan, A. (2019). Joint state-parameter estimation of a nonlinear stochastic energy balance model from sparse noisy data. http://arxiv.org/abs/1904.05310
2. Fanning, A. F., & Weaver, A. J. (1996). An atmospheric energy-moisture balance model: Climatology, interpentadal climate change, and coupling to an ocean general circulation model. Journal of Geophysical Research: Atmospheres, 101(D10), 15111-15128.
3. Lindgren, F., Rue, H., & Lindström, J. (2011). An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(4), 423-498.
4. Andrieu, C., Doucet, A., & Holenstein, R. (2010). Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(3), 269-342.
5. Lindsten, F., Jordan, M. I., & Schön, T. B. (2014). Particle Gibbs with ancestor sampling. The Journal of Machine Learning Research, 15(1), 2145-2184.


[Figure: a typical solution of the SEBM on the unit sphere at time t = 65; the non-dimensional temperature lies between 1.00 and 1.08.]

Introduction

Ill-posedness of the inverse problem and sparse, noisy data are two major challenges in the modeling of high-dimensional spatiotemporal processes. We introduce a Bayesian inference method with a strongly regularized posterior to overcome these challenges, enabling joint state-parameter estimation with uncertainty quantification. We demonstrate the method on a physically motivated nonlinear stochastic partial differential equation (SPDE) arising from paleoclimate reconstruction.

Summary

We introduced a strongly regularized posterior that overcomes the ill-posedness in parameter estimation for an SPDE model from sparse noisy data, and investigated joint state-parameter estimation of a stochastic energy balance model arising in paleoclimate field reconstructions. In the numerical experiments:
• Accurate state estimation is achieved from sparse and noisy data.
• Large uncertainty remains in the parameter estimates, due to the ill-posedness of the inverse problem.

Future work
• Re-parametrization or nonparametric inference to avoid the ill-posedness
• Modify the model and apply the method to cases with field data

The non-linear SEBM

We consider a stochastic energy balance model (SEBM) derived from the process-based deterministic model of Fanning and Weaver (1996). It contains a diffusive transport term, an external forcing term, a linear term that models atmosphere-ocean fluxes, a fourth-order term that corresponds to outgoing long-wave radiation, and a stochastic forcing:

∂t u(t, x) − ν Δu(t, x) = θ0 + θ1 u + θ2 u^4 + f(t, x).

The stochastic forcing f(t, x) is smooth in space and white in time; it is represented by the Gaussian Markov random field approximation of a Matérn field [Lindgren et al. 2011].

The data are noisy observations of the spatiotemporal process at sparse nodes. We assume that the noises, representing measurement errors, are independent and identically distributed Gaussian.

Numerical integration of the SEBM

The equation is discretized in space by finite elements and integrated in time by a semi-backward (semi-implicit) Euler scheme for stability, as sketched below. Simplified model settings:
• A tiny mesh with only 12 nodes and 20 elements on the unit sphere. A typical solution is shown in the snapshot figure above; the temperature (non-dimensionalized) oscillates around one.
• Data: at each time, only 6 out of the 12 nodes are observed (sparse), with additive independent Gaussian noise.
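To make the time stepping concrete, here is a minimal sketch, not the authors' code, of one semi-implicit (semi-backward) Euler step for the spatially discretized SEBM. The mass matrix M, stiffness matrix A, GMRF precision Q, diffusion coefficient, time step, and parameter values are illustrative stand-ins; in the actual model, M and A come from the finite-element assembly on the spherical mesh and Q from the Matérn approximation of [Lindgren et al. 2011].

```python
import numpy as np

# Sketch: one semi-implicit Euler step for the FEM-discretized SEBM
#   M du/dt + nu*A u = M (th0 + th1*u + th2*u**4) + M f(t),
# with the diffusion treated implicitly and the nonlinear reaction and
# stochastic forcing treated explicitly. M, A, Q are toy stand-ins for
# the FEM mass/stiffness matrices and the GMRF precision of the forcing.
rng = np.random.default_rng(0)
n = 12                                    # number of mesh nodes (toy value)

A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A[0, -1] = A[-1, 0] = -1.0                # ring-graph Laplacian as "stiffness"
M = np.eye(n)                             # lumped mass matrix (placeholder)
Q = np.eye(n) + A                         # GMRF precision of f (placeholder)

nu, dt = 0.5, 1e-2                        # diffusion coefficient, time step
theta = np.array([30.0, -24.0, -5.4])     # (th0, th1, th2) inside the bounds

L = np.linalg.cholesky(Q)                 # Q = L L^T, to sample f ~ N(0, Q^{-1})
lhs = M + dt * nu * A                     # implicit (stable) part of the step

def sebm_step(u):
    """Advance the discretized SEBM by one time step of size dt."""
    reaction = theta[0] + theta[1] * u + theta[2] * u**4
    f = np.linalg.solve(L.T, rng.standard_normal(n))     # spatial GMRF sample
    rhs = M @ (u + dt * reaction + np.sqrt(dt) * f)
    return np.linalg.solve(lhs, rhs)

u = np.ones(n)                            # temperature oscillates around 1
for _ in range(100):
    u = sebm_step(u)
print(np.round(u, 3))
```

Treating the stiff diffusion term implicitly while keeping the reaction and forcing explicit is what makes the scheme stable at moderate time steps.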

Bayesian inference with strongly regularized posterior

Joint state and parameter estimation: we estimate the states and the parameters jointly, with

Posterior = Prior × Normalized Likelihood:

p(θ, u | y) = p(θ) [ p(u | θ) p(y | u, θ) / p(y) ]^(1/N)

The strongly regularized posterior normalizes the likelihood so that the prior can sufficiently constrain the parameters to the physical range. We also enforce a prior for the states according to the climatological distribution. The posterior quantifies the uncertainty in both state and parameter estimation.

The posterior is extremely high-dimensional. For example, for a tiny mesh with 12 nodes and 100 time steps of observations, the discrete state variable together with the parameters has dimension 12 × 100 + 3 = 1203. This number grows to millions as the mesh is refined and the number of time steps increases. A major challenge is to generate samples from such a high-dimensional distribution.
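As a point of reference, here is a minimal sketch, under illustrative assumptions, of the log-density that such a strongly regularized posterior targets; log_p_state and log_p_obs are placeholders standing in for the SEBM transition density and the Gaussian observation density, and the evidence term log p(y) is dropped as a constant.

```python
import numpy as np

def log_regularized_posterior(theta, u, y, prior_mean, prior_std,
                              log_p_state, log_p_obs, N):
    """Log of p(theta) * [p(u|theta) p(y|u,theta) / p(y)]^(1/N), up to a constant.

    The likelihood is raised to the power 1/N so that the physically
    motivated Gaussian prior on theta remains influential.
    """
    log_prior = -0.5 * np.sum(((theta - prior_mean) / prior_std) ** 2)
    log_lik = log_p_state(u, theta) + log_p_obs(y, u, theta)
    return log_prior + log_lik / N
```

Here N is the normalization exponent (for instance, the number of observation times); without it, the likelihood of a long trajectory would overwhelm the prior and push the weakly identified parameters out of their physical range.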

Particle Markov chain Monte Carlo

Particle MCMC is a family of sampling methods that combines the advantages of sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC) [Andrieu, Doucet & Holenstein 2010]:
• The chain moves from one SMC ensemble to another.
• At each chain step, a path is sampled from the SMC ensemble.
• The target distribution is the invariant measure of the chain (improving on SMC alone).

Transition through conditional SMC:
• The previous chain state is used as a reference path in the next SMC sweep. The reference path is retained over the resampling steps and interacts with the other paths by contributing its weight.

Particle Gibbs with ancestor sampling (PGAS) [Lindsten, Jordan & Schön 2014]:
• Sample the ancestor of the reference path to increase the mixing of the chain.
• Gibbs sampling for the parameters using the reference path.

A compact sketch of the conditional-SMC kernel with ancestor sampling is given below.
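The following is a compact, generic sketch of the conditional SMC kernel with ancestor sampling for an abstract state-space model, with a user-supplied transition sampler sample_f, transition density logpdf_f, and observation density logpdf_g; it illustrates the algorithm of [Lindsten, Jordan & Schön 2014] but is not the authors' implementation. An outer particle Gibbs loop would alternate this kernel with a Gibbs draw of θ conditional on the returned reference path.

```python
import numpy as np

def csmc_ancestor_sampling(y, x_ref, sample_f, logpdf_f, logpdf_g,
                           n_particles, rng):
    """One conditional-SMC sweep with ancestor sampling (PGAS kernel).

    y:      observations, length T
    x_ref:  reference path from the previous iteration, shape (T, d)
    sample_f(x_prev, n, rng): n transition samples (initial draws if x_prev is None)
    logpdf_f(x_next, x_prev): transition log-density, vectorized over particles
    logpdf_g(y_t, x):         observation log-density, vectorized over particles
    """
    T, d = x_ref.shape
    X = np.zeros((T, n_particles, d))
    anc = np.zeros((T, n_particles), dtype=int)

    X[0] = sample_f(None, n_particles, rng)
    X[0, -1] = x_ref[0]                      # slot -1 always holds the reference
    logw = logpdf_g(y[0], X[0])

    for t in range(1, T):
        w = np.exp(logw - logw.max())
        w /= w.sum()
        anc[t, :-1] = rng.choice(n_particles, size=n_particles - 1, p=w)
        # ancestor sampling for the reference particle (improves mixing)
        la = np.log(w) + logpdf_f(x_ref[t], X[t - 1])
        pa = np.exp(la - la.max())
        pa /= pa.sum()
        anc[t, -1] = rng.choice(n_particles, p=pa)

        X[t] = sample_f(X[t - 1, anc[t]], n_particles, rng)
        X[t, -1] = x_ref[t]                  # retain the reference path
        logw = logpdf_g(y[t], X[t])

    # draw the new reference path by tracing ancestors from a final particle
    w = np.exp(logw - logw.max())
    w /= w.sum()
    k = rng.choice(n_particles, p=w)
    path = np.zeros((T, d))
    for t in range(T - 1, -1, -1):
        path[t] = X[t, k]
        k = anc[t, k]
    return path
```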

Numerical experiments

Settings: We observe 100 time steps, with 6 nodes observed at each time. The noise-to-signal ratio is 1%. PGAS uses 5 particles for the SMC and a Markov chain of length 10000.

Parameter settings (prior): We derive upper and lower bounds on the parameters from physical principles and use them to set a prior probability density: a Gaussian distribution with mean = (upper bound + lower bound)/2 and std = (upper bound − lower bound)/2.

Bound    θ0       θ1       θ2
Lower    27.64    -25.46   -6.00
Upper    32.57    -22.70   -4.80
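For concreteness, the prior implied by the table reads as follows in a short sketch (the numbers are those in the table; the function name is ours):

```python
import numpy as np

# Gaussian prior on (theta0, theta1, theta2) built from the physical bounds:
# mean = (upper + lower) / 2, std = (upper - lower) / 2.
lower = np.array([27.64, -25.46, -6.00])
upper = np.array([32.57, -22.70, -4.80])

prior_mean = 0.5 * (upper + lower)       # [30.105, -24.08, -5.40]
prior_std = 0.5 * (upper - lower)        # [ 2.465,   1.38,  0.60]

def log_prior(theta):
    """Independent Gaussian prior log-density (up to an additive constant)."""
    return -0.5 * np.sum(((theta - prior_mean) / prior_std) ** 2)
```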

Inverse problem: ill-posed. Due to the strong correlation between the terms in the parametric form, the inverse problem is ill-posed. Numerically, this means that the Fisher information matrix is ill-conditioned. As a result, large estimation errors occur for both perfect and noisy observations, leading to estimates outside the physical ranges (see the figure below) and to failure of model identification.

[Figure: standard deviations of the maximum likelihood estimation errors for the three parameters versus data size (log10 N), computed from both true and noisy trajectories.]
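A quick way to see this ill-conditioning, a sketch of ours rather than the computation behind the figure, is to evaluate the condition number of the Gram (Fisher-type) matrix of the regressors 1, u, u^4 along a trajectory that stays near u ≈ 1: there, u^4 is nearly an affine function of u, so the columns are close to collinear.

```python
import numpy as np

rng = np.random.default_rng(1)
u = 1.0 + 0.03 * rng.standard_normal(10_000)   # synthetic trajectory near u = 1

# Regressors multiplying theta0, theta1, theta2 in the parametric form
Phi = np.column_stack([np.ones_like(u), u, u**4])
G = Phi.T @ Phi                                # Fisher information up to 1/sigma^2

print(f"condition number of G: {np.linalg.cond(G):.1e}")   # very large
```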

State estimation: The state trajectory is well estimated by the sample ensemble. The ensemble trajectories spread around the true trajectory, and the ensemble mean (within one standard deviation) is close to the true path for most of the time; marginal distributions of the ensemble at an observed node and at an unobserved node are shown in the figure at the bottom. The ensemble mean filters out the noise in the observations, reducing the relative error by 30%. Over 100 independent experiments, the mean (standard deviation) of the relative error in trajectory estimation is 1.14% (0.41%) over all observed and unobserved nodes.

Parameter estimation: The posterior of the parameters is represented by the samples, shown in the scatter plot and marginal density plots below. Large uncertainty is present in the posterior, due to the ill-posedness and the need for regularization. The true values of the first two parameters lie in the high-probability region, but the true value of the last parameter has low posterior probability. The table below shows the means and standard deviations of the errors of the maximum a posteriori (MAP) estimates in 100 independent simulations; the relatively large spread of the MAP estimates, caused by the ill-posedness, makes it important to quantify the uncertainty with the posterior.

Errors of the MAP estimates (100 independent simulations):

        θ0      θ1      θ2
Mean    -0.32   0.02    0.03
Std      0.61   0.42    0.21
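For completeness, here is a minimal sketch, with hypothetical array names, of how the reported trajectory and MAP error metrics can be computed from the posterior samples:

```python
import numpy as np

def relative_trajectory_error(u_samples, u_true):
    """Relative L2 error of the posterior-mean trajectory (all nodes, all times).

    u_samples: (S, T, n) posterior samples of the state;  u_true: (T, n).
    """
    u_mean = u_samples.mean(axis=0)
    return np.linalg.norm(u_mean - u_true) / np.linalg.norm(u_true)

def map_error(theta_samples, log_post_values, theta_true):
    """Error of the sample with the largest (regularized) posterior density."""
    theta_map = theta_samples[np.argmax(log_post_values)]
    return theta_map - theta_true
```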

[Figure: posterior samples of the parameters shown as pairwise scatter plots, with marginal posterior densities for each parameter compared against the prior, the posterior mean, and the true value.]

[Figure: marginal distributions of the state ensemble at times 20, 60, and 100, at an observed node and at an unobserved node, compared with the true state and the ensemble mean.]