MCMC and likelihood-free methods - Institut UTINAM · 2012. 11. 23. · MCMC and likelihood-free...


MCMC and likelihood-free methods

Christian P. Robert

Université Paris-Dauphine, IUF, & CREST

Université de Besançon, November 22, 2012

Computational issues in Bayesian cosmology

The Metropolis-Hastings Algorithm

The Gibbs Sampler

Approximate Bayesian computation

Statistical problems in cosmology

- Potentially high-dimensional parameter space [not considered here]

- Immensely slow computation of likelihoods, e.g. WMAP, CMB, because of numerically costly spectral transforms [the data is a Fortran program]

- Nonlinear dependence and degeneracies between parameters introduced by physical constraints or theoretical assumptions

Cosmological data

Posterior distribution of cosmological parameters for recent observational data of CMB anisotropies (differences in temperature between directions) [WMAP], SNIa, and cosmic shear.

Combination of three likelihoods, some of which are available as public (Fortran) code, and of a uniform prior on a hypercube.

Cosmology parameters

Parameters for the cosmology likelihood (C = CMB, S = SNIa, L = lensing)

Symbol   Description                      Minimum   Maximum   Experiment
Ω_b      Baryon density                   0.01      0.1       C L
Ω_m      Total matter density             0.01      1.2       C S L
w        Dark-energy eq. of state         -3.0      0.5       C S L
n_s      Primordial spectral index        0.7       1.4       C L
Δ²_R     Normalization (large scales)                         C
σ_8      Normalization (small scales)                         C L
h        Hubble constant                                      C L
τ        Optical depth                                        C
M        Absolute SNIa magnitude                              S
α        Colour response                                      S
β        Stretch response                                     S
a        Galaxy z-distribution fit                            L
b        Galaxy z-distribution fit                            L
c        Galaxy z-distribution fit                            L

For WMAP5, σ_8 is a deduced quantity that depends on the other parameters


Adaptation of importance function

[Benabed et al., MNRAS, 2010]

Estimates

Parameter   PMC                              MCMC
Ω_b         $0.0432^{+0.0027}_{-0.0024}$     $0.0432^{+0.0026}_{-0.0023}$
Ω_m         $0.254^{+0.018}_{-0.017}$        $0.253^{+0.018}_{-0.016}$
τ           $0.088^{+0.018}_{-0.016}$        $0.088^{+0.019}_{-0.015}$
w           $-1.011 \pm 0.060$               $-1.010^{+0.059}_{-0.060}$
n_s         $0.963^{+0.015}_{-0.014}$        $0.963^{+0.015}_{-0.014}$
10⁹ Δ²_R    $2.413^{+0.098}_{-0.093}$        $2.414^{+0.098}_{-0.092}$
h           $0.720^{+0.022}_{-0.021}$        $0.720^{+0.023}_{-0.021}$
a           $0.648^{+0.040}_{-0.041}$        $0.649^{+0.043}_{-0.042}$
b           $9.3^{+1.4}_{-0.9}$              $9.3^{+1.7}_{-0.9}$
c           $0.639^{+0.084}_{-0.070}$        $0.639^{+0.082}_{-0.070}$
−M          $19.331 \pm 0.030$               $19.332^{+0.029}_{-0.031}$
α           $1.61^{+0.15}_{-0.14}$           $1.62^{+0.16}_{-0.14}$
−β          $-1.82^{+0.17}_{-0.16}$          $-1.82 \pm 0.16$
σ_8         $0.795^{+0.028}_{-0.030}$        $0.795^{+0.030}_{-0.027}$

Means and 68% credible intervals using lensing, SNIa and CMB

Evidence/Marginal likelihood/Integrated Likelihood ...

Central quantity of interest in (Bayesian) model choice

$$E = \int \pi(x)\,dx = \int \frac{\pi(x)}{q(x)}\, q(x)\,dx,$$

expressed as an expectation under any density q with large enough support.

Importance sampling provides a sample $x_1, \ldots, x_N \sim q$ and an approximation of the above integral,

$$E \approx \frac{1}{N} \sum_{n=1}^{N} w_n,$$

where the $w_n = \pi(x_n)/q(x_n)$ are the (unnormalised) importance weights.
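As a rough illustration of the importance sampling estimate of the evidence, the sketch below averages the unnormalised weights for a toy target. The Gaussian kernel used as π, the N(0, 2²) importance function q, and all variable names are assumptions chosen for this example (they are not the cosmological likelihood of the talk); the toy target is convenient because its integral is known to be √(2π).

```python
import numpy as np

rng = np.random.default_rng(0)

def unnormalised_target(x):
    # Assumed toy target pi(x) = exp(-x^2 / 2), whose integral is sqrt(2 * pi)
    return np.exp(-0.5 * x ** 2)

def proposal_pdf(x, scale):
    # Density of the assumed importance function q = N(0, scale^2)
    return np.exp(-0.5 * (x / scale) ** 2) / (scale * np.sqrt(2.0 * np.pi))

N = 100_000
scale = 2.0                                  # wider than the target, so pi / q stays bounded
x = rng.normal(0.0, scale, size=N)           # x_1, ..., x_N ~ q
w = unnormalised_target(x) / proposal_pdf(x, scale)   # unnormalised weights w_n

evidence_hat = w.mean()                      # average of the weights estimates E
ess = w.sum() ** 2 / (w ** 2).sum()          # effective sample size diagnostic
print(evidence_hat, np.sqrt(2.0 * np.pi), ess)
```

The effective sample size printed at the end is a common diagnostic of how well q covers π; it is added here for convenience and is not part of the slide.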


Back to cosmology questions

Standard cosmology successful in explaining recent observations, such as CMB, SNIa, galaxy clustering, cosmic shear, galaxy cluster counts, and Lyα forest clustering.

Flat ΛCDM model with only six free parameters (Ω_m, Ω_b, h, n_s, τ, σ_8)

Extensions to ΛCDM may be based on independent evidence (massive neutrinos from oscillation experiments), predicted by compelling hypotheses (primordial gravitational waves from inflation) or reflect ignorance about fundamental physics (dynamical dark energy).

Testing for dark energy, curvature, and inflationary models


Extended models

Focus on the dark energy equation-of-state parameter, modeled as

$$\begin{array}{ll}
w = -1 & \Lambda\text{CDM} \\
w = w_0 & w\text{CDM} \\
w = w_0 + w_1 (1 - a) & w(z)\text{CDM}
\end{array}$$

In addition, the curvature parameter Ω_K for each of the above is either Ω_K = 0 (‘flat’) or Ω_K ≠ 0 (‘curved’).

The choice of models represents the simplest models beyond a “cosmological constant” model able to explain the observed, recent accelerated expansion of the Universe.
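The three equation-of-state parametrisations can be written as small functions. This is only a sketch: the function names are hypothetical, and it assumes that a denotes the cosmological scale factor, related to redshift by a = 1/(1 + z), which the slide does not spell out.

```python
def scale_factor(z):
    # Assumed convention: a = 1 today (z = 0), a = 1 / (1 + z) at redshift z
    return 1.0 / (1.0 + z)

def w_lcdm(a):
    # LambdaCDM: cosmological constant, w = -1 at all times
    return -1.0

def w_constant(a, w0):
    # wCDM: constant equation of state w = w0
    return w0

def w_linear(a, w0, w1):
    # w(z)CDM: linear in the scale factor, w = w0 + w1 * (1 - a)
    return w0 + w1 * (1.0 - a)

# Example with hypothetical parameter values w0 = -1, w1 = 0.5 at redshift z = 1
print(w_linear(scale_factor(1.0), w0=-1.0, w1=0.5))
```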

Cosmology priors

Prior ranges for dark energy and curvature models. In the case of w(a) models, the prior on w_1 depends on w_0

Parameter   Description                  Min.        Max.
Ω_m         Total matter density         0.15        0.45
Ω_b         Baryon density               0.01        0.08
h           Hubble parameter             0.5         0.9
Ω_K         Curvature                    −1          1
w_0         Constant dark-energy par.    −1          −1/3
w_1         Linear dark-energy par.      −1 − w_0    (−1/3 − w_0)/(1 − a_acc)

Results

In most cases, the evidence is in favour of the standard model, especially when more datasets/experiments are combined.

The largest evidence is ln B_12 = 1.8, for the w(z)CDM model and CMB alone. This is a case where a large part of the prior range is still allowed by the data, and a region of comparable size is excluded. Hence there is weak evidence that both w_0 and w_1 are required, but this is excluded when adding the SNIa and BAO datasets.

Results on the curvature are compatible with current findings: non-flat Universe(s) are strongly disfavoured for the three dark-energy cases.
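To give a sense of scale for the largest log-Bayes factor quoted above, the arithmetic below converts ln B_12 = 1.8 into odds. The equal prior model probabilities used for the posterior probability are an assumption made here purely for illustration; the slides do not state prior model weights.

```latex
B_{12} = e^{1.8} \approx 6.05,
\qquad
P(M_1 \mid \text{data}) = \frac{B_{12}}{1 + B_{12}}
  \approx \frac{6.05}{7.05} \approx 0.86
\quad \text{when } P(M_1) = P(M_2) = \tfrac{1}{2}.
```

Odds of about 6 to 1 leave a posterior probability of roughly 0.14 for the competing model, which is in line with the slide calling this evidence weak.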


Evidence

Posterior outcome

Posterior on dark-energy parameters w_0 and w_1 as 68% and 95% credible regions for WMAP (solid blue lines), WMAP+SNIa (dashed green) and WMAP+SNIa+BAO (dotted red curves). Allowed prior range as red straight lines.


The Metropolis-Hastings Algorithm


Monte Carlo basics

General purpose

A major computational issue in Bayesian statistics:

Given a density π known up to a normalizing constant, $\pi \propto \tilde\pi$, and an integrable function h, compute

$$\Pi(h) = \int h(x)\,\pi(x)\,\mu(dx) = \frac{\int h(x)\,\tilde\pi(x)\,\mu(dx)}{\int \tilde\pi(x)\,\mu(dx)}$$

when $\int h(x)\,\tilde\pi(x)\,\mu(dx)$ is intractable.

Monte Carlo 101

Generate an iid sample $x_1, \ldots, x_N$ from π and estimate Π(h) by

$$\hat\Pi^{MC}_N(h) = N^{-1} \sum_{i=1}^{N} h(x_i).$$

LLN: $\hat\Pi^{MC}_N(h) \xrightarrow{\text{a.s.}} \Pi(h)$

If $\Pi(h^2) = \int h^2(x)\,\pi(x)\,\mu(dx) < \infty$,

CLT: $\sqrt{N}\left(\hat\Pi^{MC}_N(h) - \Pi(h)\right) \xrightarrow{\mathcal{L}} \mathcal{N}\left(0, \Pi\{[h - \Pi(h)]^2\}\right).$

Caveat leading to MCMC

Often impossible or inefficient to simulate directly from Π
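When direct simulation is possible, the estimator and its CLT-based error bar take only a few lines. The sketch below assumes a standard normal target and the test function h(x) = x², both chosen here for illustration because Π(h) = 1 is then known exactly.

```python
import numpy as np

rng = np.random.default_rng(1)

def h(x):
    # Assumed test function; Pi(h) = 1 when pi is the N(0, 1) density
    return x ** 2

N = 50_000
x = rng.normal(size=N)                       # iid sample from the assumed target N(0, 1)
hx = h(x)

estimate = hx.mean()                         # Monte Carlo estimator of Pi(h)
std_error = hx.std(ddof=1) / np.sqrt(N)      # CLT: sqrt(estimated variance / N)
ci_low = estimate - 1.96 * std_error         # approximate 95% confidence interval
ci_high = estimate + 1.96 * std_error
print(f"{estimate:.4f} in [{ci_low:.4f}, {ci_high:.4f}]")
```

The interval relies on the finite-variance condition Π(h²) < ∞ stated in the slide, which holds for this choice of h.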


Monte Carlo Methods based on Markov Chains

Running Monte Carlo via Markov Chains (MCMC)

It is not necessary to use a sample from the distribution f to approximate the integral

$$I = \int h(x)\, f(x)\,dx,$$

[notation warning: π turned to f!]

We can obtain $X_1, \ldots, X_n \sim f$ (approx) without directly simulating from f, using an ergodic Markov chain with stationary distribution f

[Photo: Andreï Markov]

Running Monte Carlo via Markov Chains (2)

Idea

For an arbitrary starting value $x^{(0)}$, an ergodic chain $(X^{(t)})$ is generated using a transition kernel with stationary distribution f

- an irreducible Markov chain with stationary distribution f is ergodic with limiting distribution f under weak conditions

- hence convergence in distribution of $(X^{(t)})$ to a random variable from f

- for $T_0$ “large enough”, $X^{(T_0)}$ is (approximately) distributed from f

- the Markov sequence is a dependent sample $X^{(T_0)}, X^{(T_0+1)}, \ldots$ generated from f

- Birkhoff’s ergodic theorem extends the LLN, sufficient for most approximation purposes
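The last point can be written out explicitly. As a sketch, for an ergodic chain with stationary density f and a function h integrable with respect to f, the ergodic average plays the role of the iid Monte Carlo estimator; the run length T is the only notation added here.

```latex
\frac{1}{T} \sum_{t=T_0+1}^{T_0+T} h\big(X^{(t)}\big)
\;\xrightarrow[T \to \infty]{\text{a.s.}}\;
\int h(x)\, f(x)\, dx .
```

The limit does not depend on the burn-in index T_0, which only affects the finite-sample bias of the average.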

Problem: How can one build a Markov chain with a given stationary distribution?

The Metropolis–Hastings algorithm

Arguments: The algorithm uses the objective (target) density f and a conditional density q(y|x), called the instrumental (or proposal) distribution.

[Photo: Nicholas Metropolis]

The MH algorithm

Algorithm (Metropolis–Hastings)

Given $x^{(t)}$,

1. Generate $Y_t \sim q(y|x^{(t)})$.

2. Take

$$X^{(t+1)} = \begin{cases} Y_t & \text{with prob. } \rho(x^{(t)}, Y_t), \\ x^{(t)} & \text{with prob. } 1 - \rho(x^{(t)}, Y_t), \end{cases}$$

where

$$\rho(x, y) = \min\left\{ \frac{f(y)}{f(x)}\, \frac{q(x|y)}{q(y|x)},\; 1 \right\}.$$
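The two steps of the algorithm translate almost line by line into code. The sketch below is a minimal illustration, not the author's implementation: the function name `metropolis_hastings`, the unnormalised Gamma(3, 1) target, and the multiplicative log-normal proposal are all assumptions made here. The proposal is deliberately asymmetric so that the ratio q(x|y)/q(y|x) does not cancel.

```python
import numpy as np

def metropolis_hastings(log_f, sample_q, log_q, x0, n_iter, rng):
    """Generic MH sampler; log_q(a, b) is log q(a | b) up to an additive constant."""
    chain = np.empty(n_iter)
    x, accepted = x0, 0
    for t in range(n_iter):
        y = sample_q(x, rng)                       # step 1: Y_t ~ q(. | x^(t))
        # log of f(y) q(x|y) / (f(x) q(y|x))
        log_ratio = log_f(y) - log_f(x) + log_q(x, y) - log_q(y, x)
        rho = np.exp(min(0.0, log_ratio))          # rho(x, y) = min{ratio, 1}
        if rng.uniform() < rho:                    # step 2: accept with prob. rho
            x, accepted = y, accepted + 1
        chain[t] = x                               # on rejection the old value is repeated
    return chain, accepted / n_iter

# Assumed target: unnormalised Gamma(3, 1) density f(x) = x^2 exp(-x) on x > 0
log_f = lambda x: 2.0 * np.log(x) - x

# Assumed asymmetric proposal: multiplicative log-normal step, y = x * exp(s * z)
s = 0.8
sample_q = lambda x, rng: x * np.exp(s * rng.normal())
log_q = lambda y, x: -np.log(y) - 0.5 * ((np.log(y) - np.log(x)) / s) ** 2

rng = np.random.default_rng(2)
chain, acc_rate = metropolis_hastings(log_f, sample_q, log_q, 1.0, 50_000, rng)
print(chain[5_000:].mean(), acc_rate)              # the Gamma(3, 1) mean is 3
```

Working with log densities avoids overflow in the ratio, and only the unnormalised f is ever evaluated.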

Features

- Independent of normalizing constants for both f and q(·|x) (i.e., those constants independent of x)

- Never moves to values with f(y) = 0

- The chain $(x^{(t)})_t$ may take the same value several times in a row, even though f is a density wrt Lebesgue measure

- The sequence $(y_t)_t$ is usually not a Markov chain

Convergence properties

1. The M-H Markov chain is reversible, with invariant/stationary density f, since it satisfies the detailed balance condition

$$f(y)\, K(y, x) = f(x)\, K(x, y)$$

2. As f is a probability measure, the chain is positive recurrent

3. If

$$\Pr\left[ \frac{f(Y_t)\, q(X^{(t)}|Y_t)}{f(X^{(t)})\, q(Y_t|X^{(t)})} \ge 1 \right] < 1, \qquad (1)$$

that is, the event $X^{(t+1)} = X^{(t)}$ is possible, then the chain is aperiodic
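The detailed balance claim in point 1 can be checked directly. As a sketch, for x ≠ y the MH kernel has density K(x, y) = q(y|x) ρ(x, y), while the rejection part only contributes mass at y = x, where the identity is trivial. Then

```latex
f(x)\, K(x, y)
  = f(x)\, q(y|x)\, \min\left\{ \frac{f(y)\, q(x|y)}{f(x)\, q(y|x)},\; 1 \right\}
  = \min\big\{ f(y)\, q(x|y),\; f(x)\, q(y|x) \big\}
  = f(y)\, K(y, x).
```

The middle expression is symmetric in x and y, which gives the last equality.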


Random-walk Metropolis-Hastings algorithms

Random walk Metropolis–Hastings

Use of a local perturbation as proposal

$$Y_t = X^{(t)} + \varepsilon_t,$$

where $\varepsilon_t \sim g$, independent of $X^{(t)}$.

The instrumental density is of the form g(y − x) and the Markov chain is a random walk if we take g to be symmetric, g(x) = g(−x)

Random walk Metropolis–Hastings

Algorithm (Random walk Metropolis)

Given $x^{(t)}$,

1. Generate $Y_t \sim g(y - x^{(t)})$

2. Take

$$X^{(t+1)} = \begin{cases} Y_t & \text{with prob. } \min\left\{ 1, \dfrac{f(Y_t)}{f(x^{(t)})} \right\}, \\ x^{(t)} & \text{otherwise.} \end{cases}$$
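A minimal sketch of this algorithm follows, assuming Gaussian increments for g and an unnormalised standard normal target. Both choices, and the function name `random_walk_metropolis`, are made here for illustration only.

```python
import numpy as np

def random_walk_metropolis(log_f, x0, scale, n_iter, rng):
    """Random walk Metropolis with Gaussian increments eps_t ~ N(0, scale^2)."""
    chain = np.empty(n_iter)
    x, log_fx, accepted = x0, log_f(x0), 0
    for t in range(n_iter):
        y = x + scale * rng.normal()               # Y_t = x^(t) + eps_t
        log_fy = log_f(y)
        # accept with probability min{1, f(Y_t) / f(x^(t))}; g cancels by symmetry
        if rng.uniform() < np.exp(min(0.0, log_fy - log_fx)):
            x, log_fx, accepted = y, log_fy, accepted + 1
        chain[t] = x
    return chain, accepted / n_iter

log_f = lambda x: -0.5 * x ** 2                    # assumed target: unnormalised N(0, 1)

rng = np.random.default_rng(3)
chain, acc_rate = random_walk_metropolis(log_f, x0=0.0, scale=2.4, n_iter=50_000, rng=rng)
print(chain.mean(), chain.var(), acc_rate)
```

Because g is symmetric, the proposal densities drop out of the acceptance ratio and only f needs to be evaluated.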

Extensions

Langevin Algorithms

Proposal based on the Langevin diffusion $L_t$, which is defined by the stochastic differential equation

$$dL_t = dB_t + \frac{1}{2} \nabla \log f(L_t)\, dt,$$

where $B_t$ is the standard Brownian motion

Theorem

The Langevin diffusion is the only non-explosive diffusion which is reversible with respect to f.

Discretization

Instead, consider the sequence

$$x^{(t+1)} = x^{(t)} + \frac{\sigma^2}{2} \nabla \log f(x^{(t)}) + \sigma \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}_p(0, I_p),$$

where $\sigma^2$ corresponds to the discretization step

Unfortunately, the discretized chain may be transient, for instance when

$$\lim_{x \to \pm\infty} \left| \sigma^2\, \nabla \log f(x)\, |x|^{-1} \right| > 1$$


MH correction

Accept the new value $Y_t$ with probability

$$\frac{f(Y_t)}{f(x^{(t)})} \cdot \frac{\exp\left\{ -\left\| x^{(t)} - Y_t - \frac{\sigma^2}{2} \nabla \log f(Y_t) \right\|^2 \big/ 2\sigma^2 \right\}}{\exp\left\{ -\left\| Y_t - x^{(t)} - \frac{\sigma^2}{2} \nabla \log f(x^{(t)}) \right\|^2 \big/ 2\sigma^2 \right\}} \wedge 1,$$

that is, $\rho(x^{(t)}, Y_t)$ for the Gaussian proposal $q(y|x) = \mathcal{N}_p\left(x + \frac{\sigma^2}{2} \nabla \log f(x),\, \sigma^2 I_p\right)$.

Choice of the scaling factor σ

Should lead to an acceptance rate of 0.574 to achieve optimal convergence rates (when the components of x are uncorrelated)

[Roberts & Rosenthal, 1998; Girolami & Calderhead, 2011]
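The corrected Langevin proposal can be sketched as follows. The acceptance ratio is the usual MH ratio f(Y_t) q(x|Y_t) / (f(x) q(Y_t|x)) with q the Gaussian density of the discretized step. The function name `mala`, the five-dimensional standard normal target, and the value σ = 0.9 are assumptions for this example and have not been tuned to the 0.574 rate.

```python
import numpy as np

def mala(log_f, grad_log_f, x0, sigma, n_iter, rng):
    """Metropolis-adjusted Langevin sketch with step parameter sigma."""
    def log_q(y, x):
        # log density (up to a constant) of N(x + sigma^2/2 grad log f(x), sigma^2 I) at y
        mean = x + 0.5 * sigma ** 2 * grad_log_f(x)
        return -np.sum((y - mean) ** 2) / (2.0 * sigma ** 2)

    x = np.asarray(x0, dtype=float)
    chain = np.empty((n_iter, x.size))
    accepted = 0
    for t in range(n_iter):
        # discretized Langevin step used as the proposal
        y = x + 0.5 * sigma ** 2 * grad_log_f(x) + sigma * rng.normal(size=x.size)
        log_ratio = log_f(y) - log_f(x) + log_q(x, y) - log_q(y, x)
        if rng.uniform() < np.exp(min(0.0, log_ratio)):
            x, accepted = y, accepted + 1
        chain[t] = x
    return chain, accepted / n_iter

# Assumed target: standard normal in p = 5 dimensions
log_f = lambda x: -0.5 * np.sum(x ** 2)
grad_log_f = lambda x: -x

rng = np.random.default_rng(4)
chain, acc_rate = mala(log_f, grad_log_f, np.zeros(5), sigma=0.9, n_iter=20_000, rng=rng)
print(chain.mean(axis=0), chain.var(axis=0), acc_rate)
```

In practice σ would be adjusted, for instance during a pilot run, until the observed acceptance rate is near the target value.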

Optimizing the Acceptance Rate

Problem of choosing the transition kernel q from a practical point of view

Most common solutions:

(a) a fully automated algorithm like ARMS; [Gilks & Wild, 1992]

(b) an instrumental density g which approximates f, such that f/g is bounded for uniform ergodicity to apply;

(c) a random walk

In both cases (b) and (c), the choice of g is critical.

Case of the random walk

Different approach to acceptance rates

A high acceptance rate does not indicate that the algorithm is moving correctly, since it may indicate that the random walk is moving too slowly on the surface of f.

If $x^{(t)}$ and $y_t$ are close, i.e. $f(x^{(t)}) \simeq f(y_t)$, then $y_t$ is accepted with probability

$$\min\left( \frac{f(y_t)}{f(x^{(t)})},\; 1 \right) \simeq 1 .$$

For multimodal densities with well separated modes, the negative effect of limited moves on the surface of f clearly shows.


Case of the random walk (2)

If the average acceptance rate is low, the successive values of $f(y_t)$ tend to be small compared with $f(x^{(t)})$, which means that the random walk moves quickly on the surface of f since it often reaches the “borders” of the support of f.

Rule of thumb

In small dimensions, aim at an average acceptance rate of 50%. In large dimensions, at an average acceptance rate of 25%.

[Gelman, Gilks and Roberts, 1995]

Warning: rule to be taken with a pinch of salt!
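To see how the acceptance rate responds to the proposal scale, one can run short random walk chains over a grid of scales and pick the scale whose rate is closest to the desired value. The sketch below assumes a one-dimensional unnormalised N(0, 1) target and an arbitrary grid of scales; the helper name `rw_acceptance_rate` is hypothetical.

```python
import numpy as np

def rw_acceptance_rate(log_f, scale, n_iter, rng, x0=0.0):
    """Run a Gaussian random walk Metropolis chain and return its acceptance rate."""
    x, log_fx, accepted = x0, log_f(x0), 0
    for _ in range(n_iter):
        y = x + scale * rng.normal()
        log_fy = log_f(y)
        if rng.uniform() < np.exp(min(0.0, log_fy - log_fx)):
            x, log_fx, accepted = y, log_fy, accepted + 1
    return accepted / n_iter

log_f = lambda x: -0.5 * x ** 2            # assumed target: unnormalised N(0, 1)
rng = np.random.default_rng(5)

scales = [0.1, 1.0, 2.4, 10.0, 50.0]       # illustrative proposal scales
rates = {s: rw_acceptance_rate(log_f, s, 20_000, rng) for s in scales}
for s, r in rates.items():
    print(f"scale = {s:5.1f}   acceptance rate = {r:.3f}")
```

Very small scales give rates near 1 (slow local moves) and very large scales give rates near 0 (most proposals land in the tails), which matches the two regimes described for the random walk.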


Role of scale

Example (Noisy AR(1))

Hidden Markov chain from a regular AR(1) model,

$$x_{t+1} = \varphi x_t + \epsilon_{t+1}, \qquad \epsilon_t \sim \mathcal{N}(0, \tau^2),$$

and observables

$$y_t | x_t \sim \mathcal{N}(x_t^2, \sigma^2)$$

The distribution of $x_t$ given $x_{t-1}$, $x_{t+1}$ and $y_t$ has density proportional to

$$\exp\left\{ -\frac{1}{2\tau^2} \left[ (x_t - \varphi x_{t-1})^2 + (x_{t+1} - \varphi x_t)^2 + \frac{\tau^2}{\sigma^2} (y_t - x_t^2)^2 \right] \right\}.$$


Example (Noisy AR(1) continued)

For a Gaussian random walk with scale ω small enough, the random walk never jumps to the other mode. But if the scale ω is sufficiently large, the Markov chain explores both modes and gives a satisfactory approximation of the target distribution.

Markov chain based on a random walk with scale ω = 0.1.

Markov chain based on a random walk with scale ω = 0.5.
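The effect of the scale can be reproduced qualitatively on the conditional density of the noisy AR(1) example. The slides do not give the parameter values behind the figures, so every number below (φ, τ, σ, the conditioning values, and the two scales 0.1 and 2.0) is a hypothetical choice made so that the target has two well separated modes; the scales therefore need not match those of the figures.

```python
import numpy as np

# Hypothetical values, not taken from the slides
phi, tau, sigma = 0.9, 1.0, 0.5
x_prev, x_next, y_obs = 0.0, 0.0, 4.0

def log_f(x):
    """Log of the unnormalised conditional density of x_t given x_{t-1}, x_{t+1}, y_t."""
    return -(
        (x - phi * x_prev) ** 2
        + (x_next - phi * x) ** 2
        + (tau ** 2 / sigma ** 2) * (y_obs - x ** 2) ** 2
    ) / (2.0 * tau ** 2)

def random_walk_metropolis(log_f, x0, omega, n_iter, rng):
    chain = np.empty(n_iter)
    x, log_fx = x0, log_f(x0)
    for t in range(n_iter):
        y = x + omega * rng.normal()               # Gaussian random walk with scale omega
        log_fy = log_f(y)
        if rng.uniform() < np.exp(min(0.0, log_fy - log_fx)):
            x, log_fx = y, log_fy
        chain[t] = x
    return chain

rng = np.random.default_rng(6)
small = random_walk_metropolis(log_f, x0=2.0, omega=0.1, n_iter=20_000, rng=rng)
large = random_walk_metropolis(log_f, x0=2.0, omega=2.0, n_iter=20_000, rng=rng)

# Share of iterations spent in the positive mode; the assumed target is symmetric
print("omega = 0.1:", np.mean(small > 0))
print("omega = 2.0:", np.mean(large > 0))
```

With these values the small-scale chain stays in the mode where it started, while the large-scale chain divides its time between the two modes.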


The Gibbs Sampler


General Principles

A very specific simulation algorithm based on the target distribution f:

1. Uses the conditional densities $f_1, \ldots, f_p$ from f

2. Start with the random variable $X = (X_1, \ldots, X_p)$

3. Simulate from the conditional densities,

$$X_i | x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_p \sim f_i(x_i | x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_p)$$

for $i = 1, 2, \ldots, p$.


Gibbs code

Algorithm (Gibbs sampler)

Given $x^{(t)} = (x_1^{(t)}, \ldots, x_p^{(t)})$, generate

1. $X_1^{(t+1)} \sim f_1(x_1 \mid x_2^{(t)}, \ldots, x_p^{(t)})$;

2. $X_2^{(t+1)} \sim f_2(x_2 \mid x_1^{(t+1)}, x_3^{(t)}, \ldots, x_p^{(t)})$;

. . .

p. $X_p^{(t+1)} \sim f_p(x_p \mid x_1^{(t+1)}, \ldots, x_{p-1}^{(t+1)})$

$X^{(t+1)} \to X \sim f$
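As an illustration of the systematic scan above, here is a minimal Python sketch for a case where both full conditionals are known in closed form. The target (a standard bivariate normal with correlation `rho`), the function name `gibbs_bivariate_normal`, and the burn-in length are assumptions chosen for the example, not part of the slides.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_iter, x_init=(0.0, 0.0), seed=0):
    """Systematic-scan Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    x1, x2 = x_init
    sd = math.sqrt(1.0 - rho * rho)  # conditional standard deviation
    chain = []
    for _ in range(n_iter):
        # Step 1: X1 | x2 ~ N(rho * x2, 1 - rho^2), using the current x2
        x1 = rng.gauss(rho * x2, sd)
        # Step 2: X2 | x1 ~ N(rho * x1, 1 - rho^2), using the freshly updated x1
        x2 = rng.gauss(rho * x1, sd)
        chain.append((x1, x2))
    return chain

chain = gibbs_bivariate_normal(rho=0.8, n_iter=20000)
kept = chain[1000:]  # discard a short burn-in (length chosen arbitrarily)
mean_x1 = sum(x for x, _ in kept) / len(kept)
corr_est = sum(x * y for x, y in kept) / len(kept)  # E[X1 X2] = rho for this target
```

Each update conditions on the most recent value of the other component, which is the point of steps 1 to p in the algorithm.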

Properties

The full conditional densities $f_1, \ldots, f_p$ are the only densities used for simulation. Thus, even in a high-dimensional problem, all of the simulations may be univariate.

Toy example: iid $N(\mu, \sigma^2)$ variates

When $Y_1, \ldots, Y_n \overset{\text{iid}}{\sim} N(y \mid \mu, \sigma^2)$ with both $\mu$ and $\sigma$ unknown, the posterior in $(\mu, \sigma^2)$ is conjugate outside a standard family

But...

$$\mu \mid Y_{1:n}, \sigma^2 \sim N\Big(\mu \,\Big|\, \frac{1}{n}\sum_{i=1}^n Y_i,\ \frac{\sigma^2}{n}\Big)$$

$$\sigma^2 \mid Y_{1:n}, \mu \sim IG\Big(\sigma^2 \,\Big|\, \frac{n}{2} - 1,\ \frac{1}{2}\sum_{i=1}^n (Y_i - \mu)^2\Big)$$

assuming constant (improper) priors on both $\mu$ and $\sigma^2$

Hence we may use the Gibbs sampler for simulating from the posterior of $(\mu, \sigma^2)$
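The two full conditionals can be recovered from the joint posterior in a few lines. The derivation below is a sketch under the flat prior stated on the slide; the notation $\bar Y$ for the sample mean is introduced here for convenience.

```latex
\begin{align*}
\pi(\mu, \sigma^2 \mid Y_{1:n})
  &\propto (\sigma^2)^{-n/2}
     \exp\Big\{-\frac{1}{2\sigma^2}\sum_{i=1}^n (Y_i - \mu)^2\Big\}, \\
\sum_{i=1}^n (Y_i - \mu)^2
  &= \sum_{i=1}^n (Y_i - \bar Y)^2 + n(\mu - \bar Y)^2
  \quad\Longrightarrow\quad
  \mu \mid Y_{1:n}, \sigma^2 \sim N\big(\bar Y, \sigma^2/n\big), \\
(\sigma^2)^{-n/2}
  &= (\sigma^2)^{-(n/2 - 1) - 1}
  \quad\Longrightarrow\quad
  \sigma^2 \mid Y_{1:n}, \mu \sim
  IG\Big(\frac{n}{2} - 1,\ \frac{1}{2}\sum_{i=1}^n (Y_i - \mu)^2\Big).
\end{align*}
```

The inverse-gamma shape $n/2 - 1$ is positive only when $n > 2$, so the flat prior requires at least three observations.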


Toy example: R code

Gibbs Sampler for Gaussian posterior

# Y is the numeric vector of observations
n = length(Y);
S = sum(Y);
mu = S/n;                                # start from the sample mean
for (i in 1:500) {
  S2 = sum((Y-mu)^2);
  sigma2 = 1/rgamma(1,n/2-1,S2/2);       # sigma2 | Y, mu ~ IG(n/2-1, S2/2)
  mu = S/n + sqrt(sigma2/n)*rnorm(1);    # mu | Y, sigma2 ~ N(S/n, sigma2/n)
}

Example of results with n = 10 observations from the N(0, 1) distribution

[Figure sequence: number of iterations 1, 2, 3, 4, 5, 10, 25, 50, 100, 500]

Limitations of the Gibbs sampler

Formally, a special case of a sequence of 1-D M-H kernels, all with acceptance rate uniformly equal to 1.

The Gibbs sampler

1. limits the choice of instrumental distributions

2. requires some knowledge of $f$

3. is, by construction, multidimensional

4. does not apply to problems where the number of parameters varies, as the resulting chain is not irreducible.
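The claim that each Gibbs step is a Metropolis-Hastings step with acceptance probability 1 can be checked directly. In the sketch below, the notation $x_{-i}$ (all components except $x_i$) and $f_{-i}$ (the marginal density of $x_{-i}$ under $f$) is introduced for the purpose of the check; the proposal for component $i$ is the full conditional $f_i(x_i' \mid x_{-i})$.

```latex
\begin{align*}
\frac{f(x_i', x_{-i})\, f_i(x_i \mid x_{-i})}
     {f(x_i, x_{-i})\, f_i(x_i' \mid x_{-i})}
 = \frac{f_i(x_i' \mid x_{-i})\, f_{-i}(x_{-i})\, f_i(x_i \mid x_{-i})}
        {f_i(x_i \mid x_{-i})\, f_{-i}(x_{-i})\, f_i(x_i' \mid x_{-i})}
 = 1 .
\end{align*}
```

The Metropolis-Hastings ratio is identically 1, so every proposed value is accepted.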

A wee problem

[Figure: two panels in the $(\mu_1, \mu_2)$ plane, titled "Gibbs started at random" and "Gibbs stuck at the wrong mode"]

Slice sampler as generic Gibbs

If $f(\theta)$ can be written as a product

$$\prod_{i=1}^k f_i(\theta),$$

it can be completed as

$$\prod_{i=1}^k \mathbb{I}_{0 \le \omega_i \le f_i(\theta)},$$

leading to the following Gibbs algorithm:

Slice sampler (code)

Algorithm (Slice sampler)

Simulate

1. $\omega_1^{(t+1)} \sim \mathcal{U}_{[0, f_1(\theta^{(t)})]}$;

. . .

k. $\omega_k^{(t+1)} \sim \mathcal{U}_{[0, f_k(\theta^{(t)})]}$;

k+1. $\theta^{(t+1)} \sim \mathcal{U}_{A^{(t+1)}}$, with

$$A^{(t+1)} = \{y;\ f_i(y) \ge \omega_i^{(t+1)},\ i = 1, \ldots, k\}.$$
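To see why these uniform steps target $f$, one can integrate the auxiliary variables out of the completed density. The short check below only uses the product form of $f$ assumed on the slides.

```latex
\begin{align*}
\int \prod_{i=1}^k \mathbb{I}_{0 \le \omega_i \le f_i(\theta)}
     \, d\omega_1 \cdots d\omega_k
 = \prod_{i=1}^k \int_0^{f_i(\theta)} d\omega_i
 = \prod_{i=1}^k f_i(\theta)
 = f(\theta) .
\end{align*}
```

The completed density is uniform on the set $\{(\theta, \omega) : 0 \le \omega_i \le f_i(\theta),\ i = 1, \ldots, k\}$, so each $\omega_i$ given $\theta$ is uniform on $[0, f_i(\theta)]$ and $\theta$ given $\omega$ is uniform on the set $A$, which are exactly the steps of the algorithm.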

Example of results with a truncated N(−3, 1) distribution

[Figure sequence: axes x (0.0 to 1.0) and y (0.000 to 0.010); number of iterations 2, 3, 4, 5, 10, 50, 100]
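A slice sampler for this kind of target takes only a few lines. The Python sketch below assumes the N(−3, 1) density is truncated to [0, 1] (read off the x-axis of the figure, not stated in the text) and uses a single factor, k = 1. The function name and defaults are illustrative.

```python
import math
import random

def slice_truncated_normal(n_iter, mean=-3.0, lo=0.0, hi=1.0, x_init=0.5, seed=0):
    """Slice sampler (k = 1) for exp(-(x - mean)^2 / 2) restricted to [lo, hi], with mean < lo."""
    rng = random.Random(seed)

    def f(x):
        return math.exp(-0.5 * (x - mean) ** 2)

    x = x_init
    chain = []
    for _ in range(n_iter):
        # Step 1: omega | x ~ U[0, f(x)]; 1 - random() lies in (0, 1], so omega > 0
        omega = (1.0 - rng.random()) * f(x)
        # Step 2: x | omega ~ U on {y in [lo, hi] : f(y) >= omega}.
        # f decreases on [lo, hi] because mean < lo, so the slice is [lo, upper].
        upper = min(hi, mean + math.sqrt(-2.0 * math.log(omega)))
        x = rng.uniform(lo, upper)
        chain.append(x)
    return chain

chain = slice_truncated_normal(5000)
sample_mean = sum(chain) / len(chain)
```

Because the density is monotone on the truncation interval, the slice is an interval with a closed-form upper end, so no numerical search for the set A is needed here; that shortcut does not carry over to general targets.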

Approximate Bayesian computation


ABC basics

Regular Bayesian computation issues

Recap: when faced with a non-standard posterior distribution

$$\pi(\theta \mid y) \propto \pi(\theta) L(\theta \mid y)$$

the standard solution is to use simulation (Monte Carlo) to produce a sample

$$\theta_1, \ldots, \theta_T$$

from $\pi(\theta \mid y)$ (or approximately by Markov chain Monte Carlo methods)

[Robert & Casella, 2004]

Intractable likelihoods

Cases when the likelihood function $f(y \mid \theta)$ is unavailable (in analytic and numerical senses) and when the completion step

$$f(y \mid \theta) = \int_{\mathcal{Z}} f(y, z \mid \theta)\, dz$$

is impossible or too costly because of the dimension of $z$

MCMC cannot be implemented!

Illustration

Phylogenetic tree: in population genetics, reconstitution of a common ancestor from a sample of genes via a phylogenetic tree that is close to impossible to integrate out [100 processor days with 4 parameters]

[Cornuet et al., 2009, Bioinformatics]

Illustration

Demo-genetic inference

Genetic model of evolution from a common ancestor (MRCA) characterized by a set of parameters that cover historical, demographic, and genetic factors

Dataset of polymorphism (DNA sample) observed at the present time

Several scenarios are possible, and the scenario is chosen by ABC. Scenario 1a is strongly supported over the others, which argues for a common origin of the Pygmy populations of West Africa [Verdu et al., 2009]

Pygmy population demo-genetics

Pygmy populations: do they have a common origin? When and how did they split from non-Pygmy populations? Were there more recent interactions between Pygmy and non-Pygmy populations?

The ABC method

Bayesian setting: target is $\pi(\theta) f(x \mid \theta)$

When the likelihood $f(x \mid \theta)$ is not in closed form, likelihood-free rejection technique:

ABC algorithm

For an observation $y \sim f(y \mid \theta)$, under the prior $\pi(\theta)$, keep jointly simulating

$$\theta' \sim \pi(\theta), \quad z \sim f(z \mid \theta'),$$

until the auxiliary variable $z$ is equal to the observed value, $z = y$.

[Tavaré et al., 1997]

Why does it work?!

The proof is trivial:

$$f(\theta_i) \propto \sum_{z \in \mathcal{D}} \pi(\theta_i) f(z \mid \theta_i) \mathbb{I}_y(z) = \pi(\theta_i) f(y \mid \theta_i) \propto \pi(\theta_i \mid y),$$

and since both sides are densities in $\theta_i$, $f(\theta_i) = \pi(\theta_i \mid y)$.

[Accept–Reject 101]
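The exactness of the rejection step is easy to observe on discrete data, where the event z = y has positive probability. The Python sketch below assumes a toy model that is not in the slides: y ~ Binomial(n, θ) with a U(0, 1) prior, for which the posterior is Beta(y + 1, n − y + 1). The function name is illustrative.

```python
import random

def abc_exact_binomial(y_obs, n_trials, n_samples, seed=0):
    """Likelihood-free rejection with exact matching z == y for a binomial model."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_samples:
        theta = rng.random()  # theta' ~ pi = U(0, 1)
        # z ~ Binomial(n_trials, theta'), simulated as a sum of Bernoulli draws
        z = sum(rng.random() < theta for _ in range(n_trials))
        if z == y_obs:  # keep theta' only when the simulated data equal the observation
            accepted.append(theta)
    return accepted

sample = abc_exact_binomial(y_obs=7, n_trials=10, n_samples=2000)
abc_mean = sum(sample) / len(sample)
exact_mean = (7 + 1) / (10 + 2)  # mean of the Beta(8, 4) posterior
```

The likelihood is never evaluated, only simulated from, yet the accepted values are draws from the posterior, so `abc_mean` should agree with `exact_mean` up to Monte Carlo error.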

A as approximative

When $y$ is a continuous random variable, equality $z = y$ is replaced with a tolerance condition,

$$\rho(y, z) \le \epsilon$$

where $\rho$ is a distance

Output distributed from

$$\pi(\theta)\, P_\theta\{\rho(y, z) < \epsilon\} \propto \pi(\theta \mid \rho(y, z) < \epsilon)$$

ABC algorithm

Algorithm 1 Likelihood-free rejection sampler

for i = 1 to N do
    repeat
        generate $\theta'$ from the prior distribution $\pi(\cdot)$
        generate $z$ from the likelihood $f(\cdot \mid \theta')$
    until $\rho\{\eta(z), \eta(y)\} \le \epsilon$
    set $\theta_i = \theta'$
end for

where $\eta(y)$ defines a (not necessarily sufficient) statistic
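Algorithm 1 translates almost line by line into code. The Python sketch below assumes a toy setting that is not in the slides: observations from N(θ, 1), a N(0, 3²) prior on θ, the sample mean as summary statistic η, and the absolute difference as distance ρ. The data values and the function name are made up for illustration.

```python
import random

def abc_rejection(y_obs, n_samples, eps, prior_sd=3.0, seed=0):
    """Likelihood-free rejection sampler with summary eta = sample mean, rho = |.|."""
    rng = random.Random(seed)
    n = len(y_obs)
    eta_obs = sum(y_obs) / n  # eta(y)
    thetas = []
    for _ in range(n_samples):
        while True:
            theta = rng.gauss(0.0, prior_sd)  # theta' ~ pi(.)
            z = [rng.gauss(theta, 1.0) for _ in range(n)]  # z ~ f(. | theta')
            if abs(sum(z) / n - eta_obs) <= eps:  # rho{eta(z), eta(y)} <= eps
                break
        thetas.append(theta)
    return thetas

y_obs = [1.2, 0.8, 1.9, 1.4, 0.7, 1.1, 1.6, 0.9, 1.3, 1.0]  # hypothetical data
post = abc_rejection(y_obs, n_samples=1000, eps=0.1)
abc_mean = sum(post) / len(post)
```

In this Gaussian toy model the sample mean is sufficient, so the only approximation comes from the tolerance; shrinking `eps` tightens the approximation at the cost of more rejected simulations.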

Output

The likelihood-free algorithm samples from the marginal in $z$ of:

$$\pi_\epsilon(\theta, z \mid y) = \frac{\pi(\theta) f(z \mid \theta)\, \mathbb{I}_{A_{\epsilon,y}}(z)}{\int_{A_{\epsilon,y} \times \Theta} \pi(\theta) f(z \mid \theta)\, dz\, d\theta},$$

where $A_{\epsilon,y} = \{z \in \mathcal{D} \mid \rho(\eta(z), \eta(y)) < \epsilon\}$.

The idea behind ABC is that the summary statistics coupled with a small tolerance should provide a good approximation of the posterior distribution:

$$\pi_\epsilon(\theta \mid y) = \int \pi_\epsilon(\theta, z \mid y)\, dz \approx \pi(\theta \mid \eta(y)).$$

Pima Indian benchmark

Figure: Comparison between density estimates of the marginals on $\beta_1$ (left), $\beta_2$ (center) and $\beta_3$ (right) from ABC rejection samples (red) and MCMC samples (black).

ABC advances

Simulating from the prior is often poor in efficiency

Either modify the proposal distribution on $\theta$ to increase the density of $x$'s within the vicinity of $y$...

[Marjoram et al., 2003; Bortot et al., 2007; Sisson et al., 2007]

...or view the problem as a conditional density estimation and develop techniques to allow for larger $\epsilon$

[Beaumont et al., 2002]

...or even include $\epsilon$ in the inferential framework [ABC$_\mu$]

[Ratmann et al., 2009]

ABC-MCMC

Markov chain $(\theta^{(t)})$ created via the transition function

$$\theta^{(t+1)} = \begin{cases} \theta' \sim K_\omega(\theta' \mid \theta^{(t)}) & \text{if } x \sim f(x \mid \theta') \text{ is such that } x = y \\ & \text{and } u \sim \mathcal{U}(0, 1) \le \dfrac{\pi(\theta') K_\omega(\theta^{(t)} \mid \theta')}{\pi(\theta^{(t)}) K_\omega(\theta' \mid \theta^{(t)})}, \\ \theta^{(t)} & \text{otherwise,} \end{cases}$$

has the posterior $\pi(\theta \mid y)$ as stationary distribution

[Marjoram et al., 2003]

ABC-MCMC (2)

Algorithm 2 Likelihood-free MCMC sampler

Use Algorithm 1 to get $(\theta^{(0)}, z^{(0)})$
for t = 1 to N do
    Generate $\theta'$ from $K_\omega(\cdot \mid \theta^{(t-1)})$,
    Generate $z'$ from the likelihood $f(\cdot \mid \theta')$,
    Generate $u$ from $\mathcal{U}_{[0,1]}$,
    if $u \le \dfrac{\pi(\theta') K_\omega(\theta^{(t-1)} \mid \theta')}{\pi(\theta^{(t-1)}) K_\omega(\theta' \mid \theta^{(t-1)})}\, \mathbb{I}_{A_{\epsilon,y}}(z')$ then
        set $(\theta^{(t)}, z^{(t)}) = (\theta', z')$
    else
        $(\theta^{(t)}, z^{(t)}) = (\theta^{(t-1)}, z^{(t-1)})$,
    end if
end for
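A compact Python sketch of Algorithm 2 follows, under assumptions that are not in the slides: observations from N(θ, 1), a N(0, 3²) prior, the sample mean as summary statistic, and a Gaussian random-walk kernel. Since that kernel is symmetric, the K ratio cancels and only the prior ratio remains. For brevity the chain starts at η(y) instead of a draw from Algorithm 1, a shortcut flagged in the code.

```python
import math
import random

def abc_mcmc(y_obs, n_iter, eps, step=0.5, prior_sd=3.0, seed=0):
    """Likelihood-free MCMC with a symmetric random-walk kernel on theta."""
    rng = random.Random(seed)
    n = len(y_obs)
    eta_obs = sum(y_obs) / n

    def log_prior(th):
        return -0.5 * (th / prior_sd) ** 2  # N(0, prior_sd^2) up to a constant

    # Shortcut (assumption): start at eta(y) rather than running Algorithm 1
    theta = eta_obs
    chain = []
    for _ in range(n_iter):
        prop = rng.gauss(theta, step)  # theta' ~ K(. | theta), symmetric
        z = [rng.gauss(prop, 1.0) for _ in range(n)]  # z' ~ f(. | theta')
        in_A = abs(sum(z) / n - eta_obs) <= eps  # indicator of A_{eps, y}
        # 1 - random() lies in (0, 1], so the logarithm is always defined
        log_u = math.log(1.0 - rng.random())
        if in_A and log_u <= log_prior(prop) - log_prior(theta):
            theta = prop
        chain.append(theta)  # on rejection the previous value is repeated
    return chain

y_obs = [1.2, 0.8, 1.9, 1.4, 0.7, 1.1, 1.6, 0.9, 1.3, 1.0]  # hypothetical data
chain = abc_mcmc(y_obs, n_iter=20000, eps=0.1)
post_mean = sum(chain[2000:]) / len(chain[2000:])
```

Proposals are made near the current value instead of from the prior, which is the efficiency gain the algorithm is after; the price is a correlated chain.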

Sequential Monte Carlo

SMC is a simulation technique to approximate a sequence of related probability distributions $\pi_n$ with $\pi_0$ "easy" and $\pi_T$ as target.

Iterated IS as PMC: particles moved from time $n-1$ to time $n$ via kernel $K_n$ and use of a sequence of extended targets $\tilde\pi_n$

$$\tilde\pi_n(z_{0:n}) = \pi_n(z_n) \prod_{j=0}^{n-1} L_j(z_{j+1}, z_j)$$

where the $L_j$'s are backward Markov kernels [check that $\pi_n(z_n)$ is a marginal]

[Del Moral, Doucet & Jasra, Series B, 2006]
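The marginal check suggested on the slide takes one line once the product is read as running over $j = 0, \ldots, n-1$, so that every $z_{j+1}$ is defined. The sketch below assumes each backward kernel $L_j(z_{j+1}, \cdot)$ is a probability density in its second argument.

```latex
\begin{align*}
\int \tilde\pi_n(z_{0:n}) \, dz_0 \cdots dz_{n-1}
 &= \pi_n(z_n) \int L_{n-1}(z_n, z_{n-1}) \cdots
    \int L_1(z_2, z_1) \int L_0(z_1, z_0) \, dz_0 \, dz_1 \cdots dz_{n-1} \\
 &= \pi_n(z_n),
\end{align*}
```

since the innermost integral equals 1 whatever the value of $z_1$, then the next one equals 1 whatever the value of $z_2$, and so on up to $z_{n-1}$.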

Sequential Monte Carlo (2)

Algorithm 3 SMC sampler [Del Moral, Doucet & Jasra, Series B, 2006]

sample $z_i^{(0)} \sim \gamma_0(x)$ $(i = 1, \ldots, N)$
compute weights $w_i^{(0)} = \pi_0(z_i^{(0)}) / \gamma_0(z_i^{(0)})$
for t = 1 to T do
    if $\mathrm{ESS}(w^{(t-1)}) < N_T$ then
        resample $N$ particles $z^{(t-1)}$ and set weights to 1
    end if
    generate $z_i^{(t)} \sim K_t(z_i^{(t-1)}, \cdot)$ and set weights to
    $$w_i^{(t)} = w_i^{(t-1)}\, \frac{\pi_t(z_i^{(t)})\, L_{t-1}(z_i^{(t)}, z_i^{(t-1)})}{\pi_{t-1}(z_i^{(t-1)})\, K_t(z_i^{(t-1)}, z_i^{(t)})}$$
end for
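The resampling test in Algorithm 3 relies on an effective sample size. The slides do not define ESS, so the Python sketch below assumes the usual definition $(\sum_i w_i)^2 / \sum_i w_i^2$ and multinomial resampling; the function names and the small example are illustrative.

```python
import random

def ess(weights):
    """Effective sample size (sum w)^2 / sum w^2, between 1 and len(weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

def resample_if_needed(particles, weights, threshold, rng):
    """Multinomial resampling when the ESS drops below the threshold; weights reset to 1."""
    if ess(weights) < threshold:
        particles = rng.choices(particles, weights=weights, k=len(particles))
        weights = [1.0] * len(particles)
    return particles, weights

rng = random.Random(0)
particles = [0.1, 0.5, 0.9, 1.3]
weights = [1.0, 0.0, 0.0, 3.0]  # ESS = 16 / 10 = 1.6
new_particles, new_weights = resample_if_needed(particles, weights, threshold=2.0, rng=rng)
```

Equal weights give an ESS equal to the number of particles, while a single dominant weight drives it toward 1, which is what triggers the resampling step.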

ABC-SMC

[Del Moral, Doucet & Jasra, 2009]

True derivation of an SMC-ABC algorithm

Use of a kernel $K_n$ associated with target $\pi_{\epsilon_n}$ and derivation of the backward kernel

$$L_{n-1}(z, z') = \frac{\pi_{\epsilon_n}(z')\, K_n(z', z)}{\pi_n(z)}$$

Update of the weights

$$w_{in} \propto w_{i(n-1)}\, \frac{\sum_{m=1}^M \mathbb{I}_{A_{\epsilon_n}}(x_{in}^m)}{\sum_{m=1}^M \mathbb{I}_{A_{\epsilon_{n-1}}}(x_{i(n-1)}^m)}$$

when $x_{in}^m \sim K(x_{i(n-1)}, \cdot)$