[0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in...

56
Scaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive Modelling - 23. April 2015 research supported by the European Research Council under FP7 Grant AdG247277 C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 1 / 18

Transcript of [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in...

Page 1: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Scaling Limits in Computational Bayesian Inversion

Claudia Schillings, Christoph Schwab

Warwick Centre for Predictive Modelling - 23. April 2015

research supported by the European Research Council under FP7 Grant AdG247277

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 1 / 18

Page 2: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Outline

1 Bayesian Inversion of Parametric Operator Equations

2 Sparsity of the Forward Solution

3 Sparsity of the Posterior

4 Sparse Quadrature

5 Numerical Results

6 Small Observation Noise Covariance

7 Summary

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 2 / 18

Page 3: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Inverse Problem

Physical ModelG(u)→ δ

u parameter vector / parameter function

G the forward map modelling the physical process

δ result / observations

G forward response operator

Forward Problem

Find the output δ for givenparameters u

→ well-posed

Inverse Problem

Find the parameters u from(noisy) observations δ

→ ill-posed

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 3 / 18

Page 4: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Inverse Problem

Physical ModelG(u)→ δ

u parameter vector / parameter function

G the forward map modelling the physical process

δ result / observations

G forward response operator

Forward Problem

Find the output δ for givenparameters u

→ well-posed

Inverse Problem

Find the parameters u from(noisy) observations δ

→ ill-posed

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 3 / 18

Page 5: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Inverse Problem

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η

Deterministic optimization problem

minu

12‖δ − G(u)‖2 + R(u)

‖δ − G(u)‖ potential / data misfit

R regularization term

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 3 / 18

Page 6: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Inverse Problem

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η

Deterministic optimization problem

minu

12‖δ − G(u)‖2 + R(u)

Large-scale, deterministic optimization problem

No quantification of the uncertainty in the unknown u

Proper choice of the regularization term R

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 3 / 18

Page 7: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Inverse Problem

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η

Bayesian inverse problem

u, η, δ random variables / fields

Goal of computation: moments of system quantities under the posterior w.r. tonoisy data δ

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 3 / 18

Page 8: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Inverse Problem

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η

Bayesian inverse problem

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 3 / 18

Page 9: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Inverse Problem

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η

Bayesian inverse problem

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 3 / 18

Page 10: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Inverse Problem

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η

Bayesian inverse problem

Quantification of uncertainty in u and system quantities

Well-posedness of the inverse problem

Incorporation of prior knowledge on the uncertain data u

Need of efficient approximations of the posterior

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 3 / 18

Page 11: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Inverse Problem

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η

Bayesian inverse problem

MCMC methods

MAP estimators

Ad hoc algorithms (EnKF, SMC, GP,...)

Sparse deterministic approximations of posterior expectations

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 3 / 18

Page 12: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problems (Stuart 2010)

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η,

Goal: Efficient estimation of system quantities from noisy observations

Infinite-dimensional parameter space

Fast convergence by exploiting sparsityof the underlying forward problem

Suitable for application to a broad classof forward problems

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 4 / 18

Page 13: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problems (Stuart 2010)

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η,

Goal: Efficient estimation of system quantities from noisy observations

Infinite-dimensional parameter space

Fast convergence by exploiting sparsityof the underlying forward problem

Suitable for application to a broad classof forward problems

UQ in Nano Optics

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 4 / 18

Page 14: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problems (Stuart 2010)

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η,

Goal: Efficient estimation of system quantities from noisy observations

Infinite-dimensional parameter space

Fast convergence by exploiting sparsityof the underlying forward problem

Suitable for application to a broad classof forward problems

UQ in Biochemical Networks

Source: Chen et al.

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 4 / 18

Page 15: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problems (Stuart 2010)

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η,

Goal: Efficient estimation of system quantities from noisy observations

Infinite-dimensional parameter space

Fast convergence by exploiting sparsityof the underlying forward problem

Suitable for application to a broad classof forward problems

UQ in Groundwater Flow

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

00.10.20.30.40.50.60.70.80.91 x

y

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 4 / 18

Page 16: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problems (Stuart 2010)

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η,

X separable Banach space

G : X 7→ X the forward map

Abstract Operator Equation

Given u ∈ X, find q ∈ X : A(u; q) = F(u) in Y ′

with A(u; ·) ∈ L(X ,Y ′), F : X 7→ Y ′, X , Y reflexive Banach spaces,a(v,w) := Y 〈w,Av〉Y′ ∀v ∈ X ,w ∈ Y corresponding bilinear form

O : X 7→ RK bounded, linear observation operator

G : X 7→ RK uncertainty-to-observation map, G = O G

η ∈ RK the observational noise (η ∼ N (0,Γ))

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 4 / 18

Page 17: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problems (Stuart 2010)

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η,

X separable Banach space

G : X 7→ X the forward map

Model Parametric Elliptic Problem

Given u ∈ X, find q ∈ X : −∇ ·(u∇q

)= f in D, q = 0 in ∂D

with weak formulation∫

D u(x)∇q(x) · ∇w(x)dx = X 〈w, f 〉X ′ for all w ∈ X ,X = Y = H1

0(D), D ⊂ Rd bounded Lipschitz domain with Lipschitz boundary ∂D

O : X 7→ RK bounded, linear observation operator

G : X 7→ RK uncertainty-to-observation map, G = O G

η ∈ RK the observational noise (η ∼ N (0,Γ))

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 4 / 18

Page 18: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problems (Stuart 2010)

Find the unknown data u ∈ X from noisy observations

δ = G(u) + η,

X separable Banach space

G : X 7→ X the forward map

O : X 7→ RK bounded, linear observation operator

G : X 7→ RK uncertainty-to-observation map, G = O G

η ∈ RK the observational noise (η ∼ N (0,Γ))

Least squares potential Φ : X × RK → R

Φ(u; δ) :=12

((δ − G(u))>Γ−1(δ − G(u))

)Reformulation of the forward problem with unknown stochastic inputdata as an infinite dimensional, parametric deterministic problem

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 4 / 18

Page 19: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problems (Stuart 2010)

Parametric representation of the unknown u

u = u(y) := 〈u〉+∑j∈J

yjψj ∈ X

y = (yj)j∈J iid sequence of real-valued random variables yj ∼ U [−1, 1]

〈u〉, ψj ∈ X

J finite or countably infinite index set

Prior measure on the uncertain input data

µ0(dy) :=⊗j∈J

12λ1(dyj) .

(U,B) =(

[−1, 1]J,⊗

j∈J B1[−1, 1]

)measurable space

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 4 / 18

Page 20: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problem

Theorem (Schwab and Stuart 2011)

Assume that G(u)∣∣∣u=〈u〉+

∑j∈J yjψj

is bounded and continuous.

Then µδ(dy), the distribution of y ∈ U given δ, is absolutely continuouswith respect to µ0(dy), and

dµδ

dµ0(y) =

1Z

Θ(y)

with the parametric Bayesian posterior Θ given by

Θ(y) = exp(−Φ(u; δ)

)∣∣∣u=〈u〉+

∑j∈J yjψj

,

and the normalization constant

Z =

∫U

Θ(y)µ0(dy) .

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 5 / 18

Page 21: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problem

Expectation of a Quantity of Interest φ : X → S

Eµδ[φ(u)] = Z−1

∫U

exp(−Φ(u; δ)

)φ(u)

∣∣∣u=〈u〉+

∑j∈J yjψj

µ0(dy) =: Z′/Z

with Z =∫

U exp(− 12

((δ − G(u))>Γ−1(δ − G(u))

))µ0(dy).

Reformulation of the forward problem with unknown stochastic input dataas an infinite dimensional, parametric deterministic problem

Parametric, deterministic representation of the derivative of the posteriormeasure with respect to the prior µ0

Approximation of Z′ and Z to compute the expectation of QoI under theposterior given data δ

Efficient algorithm to approximate the conditional expectations giventhe data with dimension-independent rates of convergence

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 5 / 18

Page 22: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Bayesian Inverse Problem

Expectation of a Quantity of Interest φ : X → S

Eµδ[φ(u)] = Z−1

∫U

exp(−Φ(u; δ)

)φ(u)

∣∣∣u=〈u〉+

∑j∈J yjψj

µ0(dy) =: Z′/Z

with Z =∫

U exp(− 12

((δ − G(u))>Γ−1(δ − G(u))

))µ0(dy).

Exploiting sparsity in the parametric operator equation

Parameters belonging to a specified sparsity class

Analytic regularity of the parametric, deterministic Bayesian posterior

Parametric, deterministic Bayesian posterior belongs to the samesparsity class

→ Sparsity of generalized pce + dimension-independent convergencerates for Smolyak integration algorithms

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 5 / 18

Page 23: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Model Parametric Elliptic Problem

∫D

u(x)∇q(x) · ∇w(x)dx = X 〈w, f 〉X ′ for all w ∈ X

with D ⊂ Rd bounded Lipschitz domain with Lipschitz boundary ∂D, X = H10(D), f ∈ L2(D).

Structural Assumptionsu(x, y) = 〈u〉(x) +

∑j∈J

yjψj(x), x ∈ X

with 〈u〉 ∈ L∞(D) and (ψj)j∈J ⊂ L∞(D), y = (y1, y2, . . . ) ∈ U = [−1, 1]J

Assumption 1 (UEA)0 < umin ≤ u(x, y) ≤ umax <∞ ∀x ∈ D, y ∈ U

with 0 < umin ≤ umax <∞.

Assumption 2 ∑j∈J‖ψj‖L∞(D) ≤

κ

1 + κ〈u〉min

with 〈u〉min = minx∈D 〈u〉(x) > 0 and κ > 0

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 6 / 18

Page 24: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparse Polynomial Chaos Approximations

Sparsity in the unknown coefficient function u,

i.e. if∑∞

j=1 ‖ψj‖pL∞(D) <∞ for 0 < p < 1,

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 7 / 18

Page 25: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparse Polynomial Chaos Approximations

Sparsity in the unknown coefficient function u,

i.e. if∑∞

j=1 ‖ψj‖pL∞(D) <∞ for 0 < p < 1,

2018

1614

12

j

108

64

20

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

||ψ

j||

(||ψj||)

j summable, p=1

(||ψj||)

j summable, p=1/2

(||ψj||)

j summable, p=1/3

(||ψj||)

j summable, p=1/4

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 7 / 18

Page 26: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparse Polynomial Chaos Approximations

Sparsity in the unknown coefficient function u,

i.e. if∑∞

j=1 ‖ψj‖pL∞(D) <∞ for 0 < p < 1,

implies p-sparsity in the solution’s Taylor expansion,

∀y ∈ U : q(y) =∑ν∈F τνyν , τν := 1

ν!∂νy q(y) |y=0 , F = ν ∈ NN

0 : |ν| <∞:

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 7 / 18

Page 27: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparse Polynomial Chaos Approximations

Sparsity in the unknown coefficient function u,

i.e. if∑∞

j=1 ‖ψj‖pL∞(D) <∞ for 0 < p < 1,

implies p-sparsity in the solution’s Taylor expansion,

∀y ∈ U : q(y) =∑ν∈F τνyν , τν := 1

ν!∂νy q(y) |y=0 , F = ν ∈ NN

0 : |ν| <∞:

There exists a p-summable, monotone envelope t = tνν∈Fi.e. tν := supµ≥ν ‖τν‖X with C(p, q) := ‖t‖`p(F) <∞ .

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 7 / 18

Page 28: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparse Polynomial Chaos Approximations

Sparsity in the unknown coefficient function u,

i.e. if∑∞

j=1 ‖ψj‖pL∞(D) <∞ for 0 < p < 1,

implies p-sparsity in the solution’s Taylor expansion,

∀y ∈ U : q(y) =∑ν∈F τνyν , τν := 1

ν!∂νy q(y) |y=0 , F = ν ∈ NN

0 : |ν| <∞:

There exists a p-summable, monotone envelope t = tνν∈Fi.e. tν := supµ≥ν ‖τν‖X with C(p, q) := ‖t‖`p(F) <∞ .

2018

1614

12

j

108

64

20

0.6

0.5

0.4

0.3

0.2

0.1

0

0.7

0.8

0.9

1

||τ

j||, t j

taylor coeff. (||τj||)

j, 1D

monotone envelope (tj)j

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 7 / 18

Page 29: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparse Polynomial Chaos Approximations

Sparsity in the unknown coefficient function u,

i.e. if∑∞

j=1 ‖ψj‖pL∞(D) <∞ for 0 < p < 1,

implies p-sparsity in the solution’s Taylor expansion,

∀y ∈ U : q(y) =∑ν∈F τνyν , τν := 1

ν!∂νy q(y) |y=0 , F = ν ∈ NN

0 : |ν| <∞:

There exists a p-summable, monotone envelope t = tνν∈Fi.e. tν := supµ≥ν ‖τν‖X with C(p, q) := ‖t‖`p(F) <∞ .

and monotone ΛTN ⊂ F corresponding to the N largest terms of t with

supy∈U

∥∥∥q(y)−∑ν∈ΛT

N

τνyν∥∥∥X≤ C(p, t)N−(1/p−1) .

A. Cohen, R. DeVore and Ch. Schwab. Analytic regularity and polynomial approximation of

parametric and stochastic elliptic PDEs, Analysis and Applications, 2010.

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 7 / 18

Page 30: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparse Polynomial Chaos ApproximationsIdea of ProofHolomorphic Extension

z 7→ q(z) , z ∈ Uρ := ⊗j≥1zj ≤ ρj

holomorph with bound ‖q(z)‖X ≤ Cδ for any positive sequence ρ = ρjj≥1 such that∑j≥1 ρj|ψj(x)| ≤ 〈u〉(x)− δ, x ∈ D for some δ > 0.

Estimate on the Taylor Coefficients

‖τν‖X ≤ Cδ infρ−ν :∑j≥1

ρj|ψj(x)| ≤ 〈u〉(x)− δ, x ∈ D

by recursive application of Cauchy’s formula

Construction of a particular ρ

‖ψj‖L∞j≥1 ∈ `p(N)⇒ tνν∈F ∈ `p(F)

Stechkin

supy∈U

∥∥∥q(y)−∑ν∈ΛT

N

τνyν∥∥∥X≤∑ν /∈ΛT

N

‖tν‖X ≤ CN−1p +1

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 7 / 18

Page 31: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparsity of the Posterior

Theorem (ClS and Schwab 2013)Assume that the unknown coefficient function u is p-sparse for0 < p < 1. Then, the posterior density’s Taylor expansion is p-sparsewith the same p.

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 8 / 18

Page 32: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparsity of the Posterior

Theorem (ClS and Schwab 2013)Assume that the unknown coefficient function u is p-sparse for0 < p < 1. Then, the posterior density’s Taylor expansion is p-sparsewith the same p.

N-term Approximation Results

supy∈U

∥∥∥Θ(y)−∑ν∈ΛP

N

ΘPνPν(y)

∥∥∥X≤ N−s‖θP‖`p

m(F), s :=1p− 1 .

Adaptive Smolyak quadrature algorithm with convergence ratesdepending only on the summability of the parametric operator

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 8 / 18

Page 33: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparsity of the Posterior

Theorem (ClS and Schwab 2013)Assume that the unknown coefficient function u is p-sparse for0 < p < 1. Then, the posterior density’s Taylor expansion is p-sparsewith the same p.

Examples

Parametric initial value ODEs (Hansen & Schwab; Vietnam J. Math. 2013)

Affine-parametric, linear operator equations (ClS & Schwab; 2013)

Semilinear elliptic PDEs (Hansen & Schwab; Math. Nachr. 2013)

Elliptic multiscale problems (Hoang & Schwab; Analysis and Applications 2012)

Nonaffine holomorphic-parametric, nonlinear problems (Cohen, Chkifa &Schwab; 2013)

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 8 / 18

Page 34: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparsity of the Posterior

Theorem (ClS and Schwab 2013)Assume that the unknown coefficient function u is p-sparse for0 < p < 1. Then, the posterior density’s Taylor expansion is p-sparsewith the same p.

Goal of computation

Expectation of a Quantity of Interest φ : X → S

Eµδ[φ(u)] = Z−1

∫U

exp(−Φ(u; δ)

)φ(u)

∣∣∣u=〈u〉+

∑j∈J yjψj

µ0(dy) =: Z′/Z

with Z =∫

U exp(− 12

((δ − G(u))>Γ−1(δ − G(u))

))µ0(dy).

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 8 / 18

Page 35: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparse Quadrature Operator

For any finite monotone set Λ ⊂ F , the quadrature operator is definedby

QΛ =∑ν∈Λ

∆ν =∑ν∈Λ

⊗j≥1

∆νj

with associated collocation grid

ZΛ = ∪ν∈ΛZν .

(Qk)k≥0 sequence of univariate quadrature formulas∆j = Qj − Qj−1, j ≥ 0 univariate quadrature difference operatorQν =

⊗j≥1 Qνj , ∆ν =

⊗j≥1 ∆νj tensorized multivariate operators

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 9 / 18

Page 36: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Sparse Quadrature Operator

For any finite monotone set Λ ⊂ F , the quadrature operator is definedby

QΛ =∑ν∈Λ

∆ν =∑ν∈Λ

⊗j≥1

∆νj

with associated collocation grid

ZΛ = ∪ν∈ΛZν .

−1 0 1−1

0

1

−1 0 1−1

0

1

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 9 / 18

Page 37: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Convergence Rates for Adaptive Smolyak Integration

TheoremAssume that the unknown coefficient function u is p-sparse for 0 < p < 1.

Then there exists a sequence (ΛN)N≥1 of monotone index sets ΛN ⊂ Fsuch that #ΛN ≤ N and

|Z −QΛN [Θ]| ≤ C1N−1p +1

,

with Z =∫

U Θ(y)µ0(dy), Θ(y) = exp(−Φ(u; δ)

)∣∣∣u=〈u〉+

∑j∈J yjψj

and,

‖Z′ −QΛN [Ψ]‖X ≤ C2N−1p +1

,

with Z′ =∫

U Ψ(y)µ0(dy), Ψ(y) = Θ(y)φ(u)∣∣∣

u=〈u〉+∑

j∈J yjψj

C1,C2 > 0 independent of N.

Remark: SAME index sets ΛN for BOTH, Z′ and Z.

ClS and Schwab Sparsity in Bayesian Inversion of Parametric Operator Equations, 2013.

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 10 / 18

Page 38: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Uncertainty Quantification in Nano Optics

Goal: Quantification of the influence of defects in fabrication processon the optical response of nano structures

Propagation of plane wave and its interaction withscatterer described by Helmholtz equation (2D).

Stochastic shape of the scatterer

0 < ρmin ≤ ρ(ω, φ) ≤ ρmax, ω ∈ Ω, φ ∈ [0, 2π)

Collaboration with Ralf Hiptmair, Laura Scarabosio

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 11 / 18

Page 39: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Uncertainty Quantification in Nano Optics

Forward Problem

−∆u− k21u = 0 in D1(ω)

−∆u− k22u = 0 in D2(ω)

u|+ − u|− = 0 on Γ(ω)

∂u∂n

∣∣∣∣+

− µ ∂u∂n

∣∣∣∣−

= 0 on Γ(ω)

lim|x|→∞

√|x|(

∂ |x| − ik1

)(u(ω)− ui)(x) = 0, ui(x) = eik1·x

Parametrization of the shape

r(ω, ϕ) = r0(ϕ) +

64∑k=1

1kζ

y2k(ω) cos(kϕ) +1kζ

y2k+1(ω) sin(kϕ), ϕ ∈ [0, 2π)

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 11 / 18

Page 40: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Numerical Results of Sparse Interpolation

100

101

102

103

10−6

10−5

10−4

10−3

10−2

10−1

100

101

# Λ

Interpolation, dimension 64

Estimated error CC, nonadaptive strategy, zeta=2

Estimated error CC, nonadaptive strategy, zeta=3

Estimated error CC, nonadaptive strategy, zeta=4

Figure: Estimated error curves of the interpolation error w.r. to the cardinality of theindex set ΛN based on the sequences CC with nonadaptive refinement, uniformdistribution of the parameters, dimension of the parameter space 64.

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 12 / 18

Page 41: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Concentration Effect of the Posterior for 0 < Γ 1

Model parametric elliptic problem

−div(u∇q) = f in D := [0, 1] , q = 0 in ∂D ,

with f (x) = 100 · x and

u(x, y) = 0.15 + y1ψ1 + y2ψ2 ,

with J = 2, J = 1, 2, ψ1(x) = 0.1 sin(πx), ψ2(x) = 0.025 cos(2πx) and withyj ∼ U [−1, 1], j ∈ J .

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 13 / 18

Page 42: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Concentration Effect of the Posterior for 0 < Γ 1

For given (noisy) data δ,δ = G(u) + η ,

we are interested in the behavior of the posterior

Θ(y) = exp(−ΦΓobs(u; δ)

)∣∣∣u=∑2

j=1 yjψj,

withΦΓobs(u; δ) =

12

((δ − G(u))>Γ−1(δ − G(u))

).

and variance of the noise Γ = λ · id

QoI φ is the solution of the forward problem at x = 0.5.

Observation operator O consists of 2 system responses at x = 0.25 and x = 0.75.

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 13 / 18

Page 43: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Concentration Effect of the Posterior for 0 < Γ 1

y1

-1 -0.5 0 0.5 1

y2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1Contour plot of the posterior, 2 meas. at 0.25 and 0.75, 6=0.25

2

reference parameterposterior

Figure: Contour plot of the posterior with observational noise λ = 0.252 .

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 13 / 18

Page 44: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Concentration Effect of the Posterior for 0 < Γ 1

y1

-1 -0.5 0 0.5 1

y2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1Contour plot of the posterior, 2 meas. at 0.25 and 0.75, 6=0.5

2

reference parameterposterior

Figure: Contour plot of the posterior with observational noise λ = 0.52 .

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 13 / 18

Page 45: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Concentration Effect of the Posterior for 0 < Γ 1

y1

-1 -0.5 0 0.5 1

y2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1Contour plot of the posterior, 2 meas. at 0.25 and 0.75, 6=0.05

2

reference parameterposterior

Figure: Contour plot of the posterior with observational noise λ = 0.052.

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 13 / 18

Page 46: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Asymptotic Analysis

Theorem (ClS and Schwab 2014)Assume that the parameter space is finite (possibly after dimension -truncation) and G(·), δ are such that the assumptions of Laplace’smethod hold; in particular, the minimum y0 of

S(y) =12

((δ − G(u))>(δ − G(u))

)is nondegenerate.Then, as Γ ↓ 0, the Bayesian estimate admits an asymptotic expansion

Eµδ[φ] =

Z′ΓZΓ∼ a0 + a1λ+ a2λ

2 + ....

where a0 = φ(y0).

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 14 / 18

Page 47: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Curvature Rescaling Regularization

Removing the degeneracy in the integrand function

The maximizer y0 of the posterior measure Θ(y) is computed byminimizing the potential

12((δ − G(u))>(δ − G(u))

)∣∣∣u=

∑2j=1 yjψj

.

using a trust-region Quasi-Newton approach with SR1 updates.

Diagonalize the approximated Hessian HSR1 = QMQ> and regularize theintegrand by the curvature rescaling transformation

y0 + Γ1/2QM−1/2z , z ∈ RJ .

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 15 / 18

Page 48: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Numerical ExperimentsCurvature rescaling

y1

-1 -0.5 0 0.5 1

y2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1Contour plot of the posterior, 2 meas. at 0.25 and 0.75, 6=0.25

2

reference parameterposterioroptimal point

y1

-1 -0.5 0 0.5 1y

2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1Contour plot of the transf. posterior, 2 meas. at 0.25 and 0.75, 6=0.25

2

transformed posterior

Figure: Contour plot of the posterior with observational noise Γobs = 0.252 · Id (left)and contour plot of the transformed posterior (right).

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 16 / 18

Page 49: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Numerical ExperimentsCurvature rescaling

y1

-1 -0.5 0 0.5 1

y2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1Contour plot of the posterior, 2 meas. at 0.25 and 0.75, 6=0.25

2

reference parameterposterioroptimal point

y1

-1 -0.5 0 0.5 1y

2

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1Truncated domain, 2 meas. at 0.25 and 0.75, 6=0.25

2

transformed posteriortruncated domain

Figure: Contour plot of the posterior with observational noise Γobs = 0.252 · Id (left)and and truncated domain of integration of the rescaled Smolyak approach in theoriginal coordinate system (right).

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 16 / 18

Page 50: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Numerical ExperimentsCurvature rescaling

100

101

102

103

10−14

10−12

10−10

10−8

10−6

10−4

10−2

100

102

Estim

ate

d (

ab

so

lute

) e

rro

r (n

orm

aliz

atio

n c

on

sta

nt

Z)

orig. integrand

trans. integrand

100

101

102

103

10−14

10−12

10−10

10−8

10−6

10−4

10−2

100

102

Estim

ate

d (

ab

so

lute

) e

rro

r (Z

’)

orig. integrand

trans. integrand

Figure: Comparison of the estimated (absolute) error curves using the Smolyakapproach for the original integrand (gray) and the transformed integrand (black) for thecomputation of ZΓobs (left) and Z′Γobs

(right) with observational noise Γobs = 0.252 · Id.

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 16 / 18

Page 51: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Numerical ExperimentsCurvature rescaling

Extension to lognormal case

ln(u(x, y)) = 0.15 + y1ψ1 + y2ψ2 ,

with J = 2, J = 1, 2, ψ1(x) = 0.1 sin(πx), ψ2(x) = 0.025 cos(2πx) and withyj ∼ N (0, 1), j ∈ J.

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 16 / 18

Page 52: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Numerical ExperimentsCurvature rescaling

Extension to lognormal case

y1

-3 -2 -1 0 1 2 3

y2

-3

-2

-1

0

1

2

3Contour plot of the posterior, 2 meas. at 0.25 and 0.75, , 6=0.01

2

reference parameterposterioroptimal point

y1

-3 -2 -1 0 1 2 3y

2-3

-2

-1

0

1

2

3Contour plot of the posterior, 2 meas. at 0.25 and 0.75, 6=1.0001

reference parameterposterioroptimal point

Figure: Contour plot of the posterior density with observational noise Γobs = 0.012 · Id(left), and contour plot of the transformed posterior (right).

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 16 / 18

Page 53: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Numerical ExperimentsCurvature rescaling

Extension to lognormal case

y1

-3 -2 -1 0 1 2 3

y2

-3

-2

-1

0

1

2

3Contour plot of the posterior, 2 meas. at 0.25 and 0.75, , 6=0.01

2

reference parameterposterioroptimal point

y1

-6 -4 -2 0 2 4 6

y2

-6

-4

-2

0

2

4

6Contour plot of the transformed posterior, 2 meas. at 0.25 and 0.75, 6=0.0001

transformed posteriorquadrature points

Figure: Contour plot of the posterior density with observational noise Γobs = 0.012 · Id(left), and contour plot of the transformed posterior (right).

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 16 / 18

Page 54: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Conclusions and Outlook

New class of sparse, adaptive quadrature methods for Bayesianinverse problems for a broad class of operator equations

Dimension-independent convergence rates depending only on thesummability of the parametric operator

Numerical confirmation of the predicted convergence rates

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 17 / 18

Page 55: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

Conclusions and Outlook

New class of sparse, adaptive quadrature methods for Bayesianinverse problems for a broad class of operator equations

Dimension-independent convergence rates depending only on thesummability of the parametric operator

Numerical confirmation of the predicted convergence rates

Efficient treatment of small observation noise variance Γ

Development of preconditioning techniques to overcome theconvergence problems in the case Γ ↓ 0

Combination of optimization and sampling techniques

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 17 / 18

Page 56: [0.4cm] Scaling Limits in Computational Bayesian Inversion · PDF fileScaling Limits in Computational Bayesian Inversion Claudia Schillings, Christoph Schwab Warwick Centre for Predictive

ReferencesChkifa A 2013 On the Lebesgue constant of Leja sequences for the unit disk Journal of Approximation Theory 166176-200

Chkifa A, Cohen A and Schwab C 2013 High-dimensional adaptive sparse polynomial interpolation and applications toparametric PDEs Foundations of Computational Mathematics 1-33

Cohen A, DeVore R and Schwab C 2011 Analytic regularity and polynomial approximation of parametric and stochasticelliptic PDEs Analysis and Applications 9 1-37

Gerstner T and Griebel M 2003 Dimension-adaptive tensor-product quadrature Computing 71 65-87

Hansen M and Schwab C 2013 Sparse adaptive approximation of high dimensional parametric initial value problemsVietnam J. Math. 1 1-35

Hansen M, Schillings C and Schwab C 2012 Sparse approximation algorithms for high dimensional parametric initial valueproblems Proceedings 5th Conf. High Performance Computing, Hanoi

Schillings C and Schwab C 2013 Sparse, adaptive Smolyak algorithms for Bayesian inverse problems Inverse Problems29 065011

Schillings C and Schwab C 2014 Sparsity in Bayesian inversion of parametric operator equations Inverse Problems 30065007

Schillings C 2013 A Note on Sparse, Adaptive Smolyak Quadratures for Bayesian Inverse Problems Oberwolfach Report10/1 239-293

Schillings C and Schwab C 2014 Scaling Limits in Computational Bayesian Inversion SAM Report 2014-26

Schwab C and Stuart A M 2012 Sparse deterministic approximation of Bayesian inverse problems Inverse Problems 28045003

C. Schillings (UoW) Bayesian Inversion WCPM - 23. Apr. 2015 18 / 18