Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in...
Overview
1. Probabilistic Reasoning/Graphical models
2. Importance Sampling
3. Markov Chain Monte Carlo: Gibbs Sampling
4. Sampling in presence of Determinism
5. Rao-Blackwellisation
6. AND/OR importance sampling
Overview
1. Probabilistic Reasoning/Graphical models
2. Importance Sampling
3. Markov Chain Monte Carlo: Gibbs Sampling
4. Sampling in presence of Determinism
5. Cutset-based Variance Reduction
6. AND/OR importance sampling
Probabilistic Reasoning / Graphical models
• Graphical models:
– Bayesian network, constraint networks, mixed network
• Queries
• Exact algorithm
– using inference,
– search and hybrids
• Graph parameters:
– tree-width, cycle-cutset, w-cutset
Queries
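• Probability of evidence (or partition function): $P(e) = \sum_{X \setminus \mathrm{var}(e)} \prod_{i=1}^{n} P(x_i \mid pa_i)\big|_e$, $\quad Z = \sum_{X} \prod_{i} \psi_i(C_i)$
• Posterior marginal (beliefs): $P(x_i \mid e) = \dfrac{P(x_i, e)}{P(e)} = \dfrac{\sum_{X \setminus \mathrm{var}(e) \setminus X_i} \prod_{j=1}^{n} P(x_j \mid pa_j)\big|_e}{\sum_{X \setminus \mathrm{var}(e)} \prod_{j=1}^{n} P(x_j \mid pa_j)\big|_e}$
• Most Probable Explanation: $x^* = \arg\max_{x} P(x, e)$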
Approximation
• Exact inference, search and hybrid schemes are too expensive when the graph is dense (high treewidth), so we approximate:
• Bounding inference:
– mini-bucket and mini-clustering
– belief propagation
• Bounding search:
– sampling
• Goal: an anytime scheme
Overview
1. Probabilistic Reasoning/Graphical models
2. Importance Sampling
3. Markov Chain Monte Carlo: Gibbs Sampling
4. Sampling in presence of Determinism
5. Rao-Blackwellisation
6. AND/OR importance sampling
Outline
• Definitions and Background on Statistics
• Theory of importance sampling
• Likelihood weighting
• State-of-the-art importance sampling techniques
A sample
• Given a set of variables X = {X1,...,Xn}, a sample, denoted by $S^t$, is an instantiation of all variables: $S^t = (x_1^t, x_2^t, \dots, x_n^t)$
How to draw a sample? Univariate distribution
• Example: Given random variable X having domain {0, 1} and a distribution P(X) = (0.3, 0.7).
• Task: Generate samples of X from P.
• How?
– draw random number r ∈ [0, 1]
– If (r < 0.3) then set X=0
– Else set X=1
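The recipe above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the helper name `sample_univariate` and the fixed seed are assumptions made for the example, and the function generalizes the two-value case to any finite list of probabilities.

```python
import random

def sample_univariate(probs, rng):
    """Return index i with probability probs[i], using one uniform draw."""
    r = rng.random()              # uniform in [0, 1)
    cumulative = 0.0
    for value, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return value
    return len(probs) - 1         # guard against floating-point round-off

rng = random.Random(0)            # fixed seed (arbitrary) for reproducibility
samples = [sample_univariate([0.3, 0.7], rng) for _ in range(10_000)]
freq_one = sum(samples) / len(samples)   # empirical P(X = 1)
print(freq_one)                   # should be close to 0.7
```

With P(X) = (0.3, 0.7) the first threshold is 0.3, so the loop reproduces the "if r < 0.3 then X=0 else X=1" rule.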
How to draw a sample? Multi-variate distribution
• Let X={X1,..,Xn} be a set of variables
• Express the distribution in product form: $P(X) = P(X_1) \times P(X_2 \mid X_1) \times \dots \times P(X_n \mid X_1, \dots, X_{n-1})$
• Sample variables one by one from left to right, along the ordering dictated by the product form.
• Bayesian network literature: Logic sampling
Sampling for Prob. Inference: Outline
• Logic Sampling
• Importance Sampling
– Likelihood Sampling
– Choosing a Proposal Distribution
• Markov Chain Monte Carlo (MCMC)
– Metropolis-Hastings
– Gibbs sampling
• Variance Reduction
Logic Sampling: No Evidence (Henrion 1988)
Input: Bayesian network
X = {X1,…,XN}, N = #nodes, T = #samples
Output: T samples
Process nodes in topological order: first process the ancestors of a node, then the node itself:
1. For t = 1 to T
2. For i = 1 to N
3. Xi ← sample $x_i^t$ from $P(x_i \mid pa_i)$
Logic sampling (example)
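Network: X1 → X2, X1 → X3, (X2, X3) → X4, with CPTs $P(X_1)$, $P(X_2 \mid X_1)$, $P(X_3 \mid X_1)$, $P(X_4 \mid X_2, X_3)$
$P(X_1, X_2, X_3, X_4) = P(X_1) \times P(X_2 \mid X_1) \times P(X_3 \mid X_1) \times P(X_4 \mid X_2, X_3)$
No Evidence
// generate sample k
1. Sample $x_1$ from $P(x_1)$
2. Sample $x_2$ from $P(x_2 \mid X_1 = x_1)$
3. Sample $x_3$ from $P(x_3 \mid X_1 = x_1)$
4. Sample $x_4$ from $P(x_4 \mid X_2 = x_2, X_3 = x_3)$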
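A possible Python sketch of logic sampling on this four-node network follows. The slides give no numbers for the CPTs, so every probability below is a made-up placeholder, all variables are assumed binary, and the names `bernoulli` and `logic_sample` are illustrative only.

```python
import random

# Hypothetical CPTs for binary variables; each entry is P(child = 1 | parents).
P_X1 = 0.6
P_X2 = {0: 0.2, 1: 0.7}                                        # P(X2=1 | X1)
P_X3 = {0: 0.5, 1: 0.1}                                        # P(X3=1 | X1)
P_X4 = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.9}    # P(X4=1 | X2, X3)

def bernoulli(p, rng):
    """Return 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0

def logic_sample(rng):
    """Sample every variable in topological order: parents before children."""
    x1 = bernoulli(P_X1, rng)
    x2 = bernoulli(P_X2[x1], rng)
    x3 = bernoulli(P_X3[x1], rng)
    x4 = bernoulli(P_X4[(x2, x3)], rng)
    return (x1, x2, x3, x4)

rng = random.Random(1)
T = 20_000
samples = [logic_sample(rng) for _ in range(T)]
est_x2 = sum(s[1] for s in samples) / T                        # estimate of P(X2 = 1)
exact_x2 = (1 - P_X1) * P_X2[0] + P_X1 * P_X2[1]               # exact marginal
print(est_x2, exact_x2)
```

Because each node is drawn from its own CPT row, the fraction of samples with X2 = 1 should approach the exact marginal as T grows.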
Logic Sampling w/ Evidence
Input: Bayesian network
X= {X1,…,XN}, N- #nodes
E – evidence, T - # samples
Output: T samples consistent with E
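1. For t=1 to T
2. For i=1 to N
3. Xi ← sample $x_i^t$ from $P(x_i \mid pa_i)$
4. If Xi in E and $x_i^t$ differs from the observed value of Xi, reject the sample:
5. Discard the partial sample and restart sample t from X1 (go to Step 2).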
Logic Sampling (example)
Network: X1 → X2, X1 → X3, (X2, X3) → X4, with CPTs $P(x_1)$, $P(x_2 \mid x_1)$, $P(x_3 \mid x_1)$, $P(x_4 \mid x_2, x_3)$
Evidence: $X_3 = 0$
// generate sample k
1. Sample $x_1$ from $P(x_1)$
2. Sample $x_2$ from $P(x_2 \mid x_1)$
3. Sample $x_3$ from $P(x_3 \mid x_1)$
4. If $x_3 \neq 0$, reject sample and start from 1, otherwise
5. Sample $x_4$ from $P(x_4 \mid x_2, x_3)$
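The rejection step for evidence X3 = 0 can be sketched as follows. As before, the CPT values are hypothetical placeholders (the slides give none), variables are assumed binary, and the function name `sample_given_x3_zero` is an assumption for this example.

```python
import random

# Hypothetical CPTs for binary variables; each entry is P(child = 1 | parents).
P_X1 = 0.6
P_X2 = {0: 0.2, 1: 0.7}                                        # P(X2=1 | X1)
P_X3 = {0: 0.5, 1: 0.1}                                        # P(X3=1 | X1)
P_X4 = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.9}    # P(X4=1 | X2, X3)

def bernoulli(p, rng):
    return 1 if rng.random() < p else 0

def sample_given_x3_zero(rng):
    """Return (sample, rejections) for one accepted sample consistent with X3 = 0."""
    rejections = 0
    while True:
        x1 = bernoulli(P_X1, rng)
        x2 = bernoulli(P_X2[x1], rng)
        x3 = bernoulli(P_X3[x1], rng)
        if x3 != 0:               # inconsistent with the evidence: start over
            rejections += 1
            continue
        x4 = bernoulli(P_X4[(x2, x3)], rng)
        return (x1, x2, x3, x4), rejections

rng = random.Random(2)
T = 20_000
results = [sample_given_x3_zero(rng) for _ in range(T)]
est = sum(s[0] for s, _ in results) / T        # estimate of P(X1 = 1 | X3 = 0)
num = P_X1 * (1 - P_X3[1])
exact = num / (num + (1 - P_X1) * (1 - P_X3[0]))
total_rejected = sum(r for _, r in results)
print(est, exact, total_rejected)
```

The rejection count makes the cost visible: the less likely the evidence, the more partial samples are thrown away before one is accepted.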
Expected value and Variance
Expected value: Given a probability distribution P(X) and a function g(X) defined over a set of variables X = {X1, X2, … Xn}, the expected value of g w.r.t. P is
$E_P[g(x)] = \sum_{x} g(x) P(x)$
Variance: The variance of g w.r.t. P is:
$Var_P[g(x)] = \sum_{x} \left(g(x) - E_P[g(x)]\right)^2 P(x)$
Monte Carlo Estimate
• Estimator:
– An estimator is a function of the samples.
– It produces an estimate of the unknown parameter of the sampling distribution.
• Given i.i.d. samples $S^1, S^2, \dots, S^T$ drawn from P, the Monte Carlo estimate of $E_P[g(x)]$ is given by: $\hat{g} = \frac{1}{T} \sum_{t=1}^{T} g(S^t)$
Example: Monte Carlo estimate
• Given:
– A distribution P(X) = (0.3, 0.7).
– g(X) = 40 if X equals 0, and g(X) = 50 if X equals 1.
• Exact value: $E_P[g(x)] = 40 \times 0.3 + 50 \times 0.7 = 47$.
• Generate k = 10 samples from P: 0,1,1,1,0,1,1,0,1,0
• Estimate: $\hat{g} = \dfrac{40 \times \#\text{samples}(X=0) + 50 \times \#\text{samples}(X=1)}{\#\text{samples}} = \dfrac{40 \times 4 + 50 \times 6}{10} = 46$
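The arithmetic of this example can be checked with a short Python sketch. The first estimate uses the ten samples listed on the slide; the second draws fresh samples with an arbitrary seed to show the estimate approaching the exact value 47. Variable names are illustrative.

```python
import random

def g(x):
    """g(X) = 40 if X equals 0, 50 if X equals 1."""
    return 40 if x == 0 else 50

# The ten samples listed in the example (four zeros, six ones).
slide_samples = [0, 1, 1, 1, 0, 1, 1, 0, 1, 0]
g_hat_slide = sum(g(x) for x in slide_samples) / len(slide_samples)
print(g_hat_slide)                # 46.0

# A larger run: draw X from P(X) = (0.3, 0.7) and average g.
rng = random.Random(0)            # arbitrary seed
T = 100_000
draws = [0 if rng.random() < 0.3 else 1 for _ in range(T)]
g_hat = sum(g(x) for x in draws) / T
print(g_hat)                      # should be close to 47
```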
Outline
• Definitions and Background on Statistics
• Theory of importance sampling
• Likelihood weighting
• State-of-the-art importance sampling techniques
Importance sampling: Main idea
• Express the query as the expected value of a random variable w.r.t. a distribution Q.
• Generate random samples from Q.
• Estimate the expected value from the generated samples using a Monte Carlo estimator (average).
Importance sampling for P(e)
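Let $Z = X \setminus E$.
Let Q(Z) be a (proposal) distribution, satisfying $P(z, e) > 0 \Rightarrow Q(z) > 0$.
Then, we can rewrite P(e) as:
$P(e) = \sum_{z} P(z, e) = \sum_{z} P(z, e) \frac{Q(z)}{Q(z)} = E_Q\left[\frac{P(z, e)}{Q(z)}\right] = E_Q[w(z)]$
Monte Carlo estimate:
$\hat{P}(e) = \frac{1}{T} \sum_{t=1}^{T} w(z^t)$, where $z^t \sim Q(Z)$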
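A minimal sketch of this estimator, assuming a single hidden variable Z with three values. The table `P_ZE` (standing in for P(z, e)) and the proposal `Q` are invented numbers chosen so that P(e) = 0.20 is known exactly; neither comes from the slides.

```python
import random

# Hypothetical unnormalized target P(z, e) for z in {0, 1, 2}; it sums to P(e) = 0.20.
P_ZE = [0.02, 0.10, 0.08]
# Hypothetical proposal Q(z); positive wherever P(z, e) is positive.
Q = [0.2, 0.5, 0.3]

def estimate_p_e(T, rng):
    """Average the importance weights w(z) = P(z, e) / Q(z) over T draws from Q."""
    zs = rng.choices(range(len(Q)), weights=Q, k=T)
    weights = [P_ZE[z] / Q[z] for z in zs]
    return sum(weights) / T

rng = random.Random(3)            # arbitrary seed
p_hat = estimate_p_e(50_000, rng)
exact = sum(P_ZE)
print(p_hat, exact)
```

In a real network P(z, e) would be a product of CPT entries rather than a lookup table; the averaging of weights is unchanged.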
Properties of IS estimate of P(e)
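• Convergence: by the law of large numbers, $\hat{P}(e) = \frac{1}{T} \sum_{i=1}^{T} w(z^i) \xrightarrow{a.s.} P(e)$ for $T \to \infty$
• Unbiased: $E_Q[\hat{P}(e)] = P(e)$
• Variance: $Var_Q[\hat{P}(e)] = Var_Q\left[\frac{1}{T} \sum_{i=1}^{T} w(z^i)\right] = \frac{Var_Q[w(z)]}{T}$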
Properties of IS estimate of P(e)
• Mean Squared Error of the estimator:
$MSE_Q[\hat{P}(e)] = E_Q\left[\left(\hat{P}(e) - P(e)\right)^2\right] = \left(P(e) - E_Q[\hat{P}(e)]\right)^2 + Var_Q[\hat{P}(e)] = Var_Q[\hat{P}(e)] = \frac{Var_Q[w(z)]}{T}$
The squared bias term $\left(P(e) - E_Q[\hat{P}(e)]\right)^2$ is zero because the estimator is unbiased: its expected value equals P(e).
Estimating P(Xi|e)
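Let $\delta_{x_i}(z)$ be a Dirac delta function, which is 1 if z contains $x_i$ and 0 otherwise.
$P(x_i \mid e) = \dfrac{P(x_i, e)}{P(e)} = \dfrac{\sum_{z} \delta_{x_i}(z) P(z, e)}{\sum_{z} P(z, e)} = \dfrac{E_Q\left[\delta_{x_i}(z) P(z, e) / Q(z)\right]}{E_Q\left[P(z, e) / Q(z)\right]}$
Idea: Estimate numerator and denominator by IS.
Ratio estimate: $\hat{P}(x_i \mid e) = \dfrac{\hat{P}(x_i, e)}{\hat{P}(e)} = \dfrac{\sum_{k=1}^{T} \delta_{x_i}(z^k) w(z^k, e)}{\sum_{k=1}^{T} w(z^k, e)}$
Estimate is biased: $E[\hat{P}(x_i \mid e)] \neq P(x_i \mid e)$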
Properties of the IS estimator for P(Xi|e)
• Convergence: by the weak law of large numbers, $\hat{P}(x_i \mid e) \to P(x_i \mid e)$ as $T \to \infty$
• Asymptotically unbiased: $\lim_{T \to \infty} E_Q[\hat{P}(x_i \mid e)] = P(x_i \mid e)$
• Variance
– Harder to analyze
– Liu suggests a measure called “Effective sample size”
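The slide does not say which formula for effective sample size is meant. One commonly used estimate is $(\sum_t w_t)^2 / \sum_t w_t^2$, and the sketch below assumes that definition; the function name is illustrative.

```python
def effective_sample_size(weights):
    """Common ESS estimate: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total) / total_sq if total_sq > 0 else 0.0

# Equal weights: every sample counts fully.
ess_uniform = effective_sample_size([1.0] * 100)
# One dominant weight: the run behaves like a single sample.
ess_skewed = effective_sample_size([1.0] + [0.0] * 99)
print(ess_uniform, ess_skewed)
```

A value far below the number of samples drawn suggests that a few large weights dominate the ratio estimate.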
Generating samples from Q
• No restrictions on “how to”
• Typically, express Q in product form:
– Q(Z) = Q(Z1) × Q(Z2|Z1) × … × Q(Zn|Z1,..,Zn-1)
• Sample along the order Z1,..,Zn
• Example:
– Z1 ← Q(Z1) = (0.2, 0.8)
– Z2 ← Q(Z2|Z1) = (0.1, 0.9, 0.2, 0.8)
– Z3 ← Q(Z3|Z1,Z2) = Q(Z3) = (0.5, 0.5)
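A short sketch of sampling along the order Z1, Z2, Z3 with the numbers from this example. How the four entries of Q(Z2|Z1) map to rows is not stated on the slide; the code assumes the first pair is the row for Z1 = 0 and the second pair the row for Z1 = 1. Function names are illustrative.

```python
import random

Q_Z1 = [0.2, 0.8]
# Assumed layout of (0.1, 0.9, 0.2, 0.8): one row per value of Z1.
Q_Z2_GIVEN_Z1 = {0: [0.1, 0.9], 1: [0.2, 0.8]}
Q_Z3 = [0.5, 0.5]                 # Z3 is independent of Z1 and Z2 under Q

def draw(probs, rng):
    """Draw an index according to the given probability list."""
    return rng.choices(range(len(probs)), weights=probs)[0]

def sample_q(rng):
    """Sample (z1, z2, z3) in order and return it with its probability Q(z)."""
    z1 = draw(Q_Z1, rng)
    z2 = draw(Q_Z2_GIVEN_Z1[z1], rng)
    z3 = draw(Q_Z3, rng)
    q = Q_Z1[z1] * Q_Z2_GIVEN_Z1[z1][z2] * Q_Z3[z3]
    return (z1, z2, z3), q

rng = random.Random(5)            # arbitrary seed
draws = [sample_q(rng) for _ in range(10_000)]
freq_z1 = sum(z[0] for z, _ in draws) / len(draws)    # should be near 0.8
total_mass = sum(Q_Z1[a] * Q_Z2_GIVEN_Z1[a][b] * Q_Z3[c]
                 for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(freq_z1, total_mass)
```

Returning Q(z) alongside the sample is convenient because the importance weight P(z, e)/Q(z) needs it.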
Outline
• Definitions and Background on Statistics
• Theory of importance sampling
• Likelihood weighting
• State-of-the-art importance sampling techniques
Likelihood Weighting (Fung and Chang, 1990; Shachter and Peot, 1990)
Works well for likely evidence!
“Clamping” evidence + logic sampling + weighting samples by evidence likelihood
Is an instance of importance sampling!
Likelihood Weighting: Sampling
Sample in topological order over X!
Clamp evidence; sample $x_i \leftarrow P(X_i \mid pa_i)$. $P(X_i \mid pa_i)$ is a look-up in the CPT!
Likelihood Weighting: Proposal Distribution
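$Q(X \setminus E) = \prod_{X_i \in X \setminus E} P(X_i \mid pa_i, e)$
Example:
Given a Bayesian network: $P(X_1, X_2, X_3) = P(X_1) \times P(X_2 \mid X_1) \times P(X_3 \mid X_1, X_2)$ and Evidence $X_2 = x_2$.
$Q(X_1, X_3) = P(X_1) \times P(X_3 \mid X_1, X_2 = x_2)$
Weights:
Given a sample $x = (x_1, \dots, x_n)$:
$w = \dfrac{P(x, e)}{Q(x)} = \dfrac{\prod_{X_i \in X \setminus E} P(x_i \mid pa_i, e) \times \prod_{E_j \in E} P(e_j \mid pa_j)}{\prod_{X_i \in X \setminus E} P(x_i \mid pa_i, e)} = \prod_{E_j \in E} P(e_j \mid pa_j)$
Notice: Q is another Bayesian network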
Likelihood Weighting: Estimates
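Estimate P(e): $\hat{P}(e) = \frac{1}{T} \sum_{t=1}^{T} w^{(t)}$
Estimate Posterior Marginals: $\hat{P}(x_i \mid e) = \dfrac{\hat{P}(x_i, e)}{\hat{P}(e)} = \dfrac{\sum_{t=1}^{T} w^{(t)} g_{x_i}(x^{(t)})}{\sum_{t=1}^{T} w^{(t)}}$, where $g_{x_i}(x^{(t)}) = 1$ if $x_i = x_i^{(t)}$ and equals zero otherwise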
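The two estimates can be sketched on a three-variable network of the form P(X1) P(X2|X1) P(X3|X1,X2) with evidence on X2. The CPT numbers, the choice X2 = 1 as evidence, and all names below are assumptions made for illustration; variables are taken to be binary.

```python
import random

# Hypothetical CPTs for binary variables; each entry is P(child = 1 | parents).
P_X1 = 0.3
P_X2 = {0: 0.2, 1: 0.9}                                        # P(X2=1 | X1)
P_X3 = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.8}    # P(X3=1 | X1, X2)
EVIDENCE_X2 = 1

def bernoulli(p, rng):
    return 1 if rng.random() < p else 0

def weighted_sample(rng):
    """Clamp X2, sample the other nodes from their CPTs, weight by P(e | parents)."""
    x1 = bernoulli(P_X1, rng)
    x2 = EVIDENCE_X2                                   # evidence is clamped, not sampled
    x3 = bernoulli(P_X3[(x1, x2)], rng)
    w = P_X2[x1] if x2 == 1 else 1.0 - P_X2[x1]        # weight = P(X2 = x2 | x1)
    return (x1, x2, x3), w

rng = random.Random(4)            # arbitrary seed
T = 50_000
draws = [weighted_sample(rng) for _ in range(T)]
total_w = sum(w for _, w in draws)
p_e_hat = total_w / T                                              # estimate of P(e)
p_x1_hat = sum(w for (x1, _, _), w in draws if x1 == 1) / total_w  # estimate of P(X1=1 | e)

exact_p_e = (1 - P_X1) * P_X2[0] + P_X1 * P_X2[1]
exact_p_x1 = P_X1 * P_X2[1] / exact_p_e
print(p_e_hat, exact_p_e, p_x1_hat, exact_p_x1)
```

No sample is ever rejected here; unlikely evidence shows up instead as small weights.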
Likelihood Weighting
• Converges to exact posterior marginals
• Generates Samples Fast
• Sampling distribution is close to prior (especially if E ⊆ Leaf Nodes)
• Increasing sampling variance
⇒ Convergence may be slow
⇒ Many samples with $P(x^{(t)}) = 0$ rejected
Outline
• Definitions and Background on Statistics
• Theory of importance sampling
• Likelihood weighting
• Error estimation
• State-of-the-art importance sampling techniques
Outline
• Definitions and Background on Statistics
• Theory of importance sampling
• Likelihood weighting
• State-of-the-art importance sampling techniques
Proposal selection
• One should try to select a proposal that is as close as possible to the posterior distribution.
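$Var_Q[\hat{P}(e)] = \dfrac{Var_Q[w(z)]}{T} = \dfrac{1}{T} \sum_{z \in Z} \left(\dfrac{P(z, e)}{Q(z)} - P(e)\right)^2 Q(z)$
To have a zero-variance estimator: $\dfrac{P(z, e)}{Q(z)} - P(e) = 0 \;\Rightarrow\; Q(z) = \dfrac{P(z, e)}{P(e)} = P(z \mid e)$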
Perfect sampling using Bucket Elimination
• Algorithm:
– Run Bucket elimination on the problem along an ordering o=(XN,..,X1).
– Sample along the reverse ordering: (X1,..,XN)
– At each variable Xi, recover the probability P(Xi|x1,...,xi-1) by referring to the bucket.
Bucket Elimination
Query: $P(a \mid e=0) \propto P(a, e=0)$. Elimination Order: d, e, b, c
$P(a, e=0) = \sum_{c, b, e=0, d} P(a) P(b \mid a) P(c \mid a) P(d \mid a, b) P(e \mid b, c)$
$= P(a) \sum_{c} P(c \mid a) \sum_{b} P(b \mid a) \sum_{e=0} P(e \mid b, c) \sum_{d} P(d \mid a, b)$
Buckets (original functions and the messages they generate):
– D: $P(d \mid a, b)$; message $f_D(a, b) = \sum_{d} P(d \mid a, b)$
– E: $P(e \mid b, c)$; message $f_E(b, c) = P(e=0 \mid b, c)$
– B: $P(b \mid a)$; message $f_B(a, c) = \sum_{b} P(b \mid a) f_D(a, b) f_E(b, c)$
– C: $P(c \mid a)$; message $f_C(a) = \sum_{c} P(c \mid a) f_B(a, c)$
– A: $P(a)$; result $P(a, e=0) = P(a) f_C(a)$
Bucket Tree: clusters (D,A,B) and (E,B,C) send $f_D(a, b)$ and $f_E(b, c)$ to (B,A,C), which sends $f_B(a, c)$ to (C,A), which sends $f_C(a)$ to (A)
Time and space exp(w*)
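The bucket computation for the query P(a, e=0) with elimination order d, e, b, c can be traced in plain Python. All CPT numbers are hypothetical and variables are assumed binary; the brute-force sum at the end is only there to confirm that the messages give the same answer.

```python
from itertools import product

# Hypothetical CPTs over binary variables; each entry is P(child = 1 | parents).
P_A1 = 0.4
P_B1 = {0: 0.3, 1: 0.8}                                        # P(B=1 | A)
P_C1 = {0: 0.6, 1: 0.2}                                        # P(C=1 | A)
P_D1 = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.7, (1, 1): 0.9}    # P(D=1 | A, B)
P_E1 = {(0, 0): 0.2, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.95}   # P(E=1 | B, C)

VALS = (0, 1)

def pr(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one

# Bucket D: sum out d.
f_D = {(a, b): sum(pr(P_D1[(a, b)], d) for d in VALS) for a in VALS for b in VALS}
# Bucket E: instantiate the evidence e = 0.
f_E = {(b, c): pr(P_E1[(b, c)], 0) for b in VALS for c in VALS}
# Bucket B: combine P(b|a) with both incoming messages and sum out b.
f_B = {(a, c): sum(pr(P_B1[a], b) * f_D[(a, b)] * f_E[(b, c)] for b in VALS)
       for a in VALS for c in VALS}
# Bucket C: sum out c.
f_C = {a: sum(pr(P_C1[a], c) * f_B[(a, c)] for c in VALS) for a in VALS}
# Bucket A: P(a, e=0) = P(a) * f_C(a).
p_a_e0 = {a: pr(P_A1, a) * f_C[a] for a in VALS}

# Brute-force check: sum the full joint over b, c, d with e fixed to 0.
brute = {a: 0.0 for a in VALS}
for a, b, c, d in product(VALS, repeat=4):
    brute[a] += (pr(P_A1, a) * pr(P_B1[a], b) * pr(P_C1[a], c)
                 * pr(P_D1[(a, b)], d) * pr(P_E1[(b, c)], 0))

p_e0 = sum(p_a_e0.values())
posterior = {a: p_a_e0[a] / p_e0 for a in VALS}                # P(a | e=0)
print(p_a_e0, posterior)
```

The largest table built here ranges over two variables, whereas the brute-force loop touches every joint assignment.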
Bucket elimination (BE): Algorithm elim-bel (Dechter 1996)
Elimination operator: $\sum_{b}$
bucket B: P(B|A) P(D|B,A) P(e|B,C)
bucket C: P(C|A) $h^B(A, D, C, e)$
bucket D: $h^C(A, D, e)$
bucket E: $h^D(A, e)$
bucket A: P(a) $h^E(a)$
Output: P(e)
Sampling from the output of BE (Dechter 2002)
bucket B: P(B|A) P(D|B,A) P(e|B,C)
bucket C: P(C|A) $h^B(A, D, C, e)$
bucket D: $h^C(A, D, e)$
bucket E: $h^D(A, e)$
bucket A: P(A) $h^E(A)$
Sample in the order A, D, C, B:
– bucket A: $Q(A) \propto P(A) \, h^E(A)$. Sample: $A = a \leftarrow Q(A)$
– Evidence bucket (E): ignore
– bucket D: Set A = a in the bucket. Sample: $D = d \leftarrow Q(D \mid a, e) \propto h^C(a, D, e)$
– bucket C: Set A = a, D = d in the bucket. Sample: $C = c \leftarrow Q(C \mid a, e, d) \propto P(C \mid a) \, h^B(a, d, C, e)$
– bucket B: Set A = a, D = d, C = c in the bucket. Sample: $B = b \leftarrow Q(B \mid a, e, d, c) \propto P(B \mid a) \, P(d \mid B, a) \, P(e \mid B, c)$
Mini-buckets: “local inference”
• Computation in a bucket is time and space exponential in the number of variables involved
• Therefore, partition functions in a bucket into “mini-buckets” on smaller number of variables
• Can control the size of each “mini-bucket”, yielding polynomial complexity.
Mini-Bucket Elimination
Space and Time constraints: maximum scope size of each new function generated should be bounded by 2. BE generates a function having scope size 3, so it cannot be used.
bucket B: split into mini-buckets {P(e|B,C)} and {P(B|A), P(D|B,A)}; applying $\sum_B$ to each gives $h^B(C, e)$ and $h^B(A, D)$
bucket C: P(C|A) $h^B(C, e)$, giving $h^C(A, e)$
bucket D: $h^B(A, D)$, giving $h^D(A)$
bucket E: $h^C(A, e)$, giving $h^E(A)$
bucket A: P(A) $h^E(A)$ $h^D(A)$, giving an approximation of P(e)
Sampling from the output of MBE
bucket B: {P(e|B,C)} {P(B|A), P(D|B,A)}
bucket C: P(C|A) $h^B(C, e)$
bucket D: $h^B(A, D)$
bucket E: $h^C(A, e)$
bucket A: $h^E(A)$ $h^D(A)$
Sampling is the same as in BE-sampling, except that now we construct Q from a randomly selected “mini-bucket”.
IJGP-Sampling (Gogate and Dechter, 2005)
• Iterative Join Graph Propagation (IJGP)
– A Generalized Belief Propagation scheme (Yedidia et al., 2002)
• IJGP yields better approximations of P(X|E) than MBE
– (Dechter, Kask and Mateescu, 2002)
• Output of IJGP is same as mini-bucket “clusters”
• Currently the best performing IS scheme!
Current Research question
• Given a Bayesian network with evidence or a Markov network representing function P, generate another Bayesian network representing a function Q (from a family of distributions, restricted by structure) such that Q is closest to P.
• Current approaches
– Mini-buckets
– IJGP
– Both
• These have been evaluated experimentally, but still need to be justified theoretically.
Algorithm: Approximate Sampling
1) Run IJGP or MBE
2) At each branch point compute the edge probabilities by consulting output of IJGP or MBE
• Rejection Problem:
– Some assignments generated are non solutions
Adaptive Importance Sampling
Initial Proposal: $Q^1(Z) = Q(Z_1) \times Q(Z_2 \mid pa(Z_2)) \times \dots \times Q(Z_n \mid pa(Z_n))$
$\hat{P}(E = e) = 0$
For i = 1 to k do
  Generate samples $z^1, \dots, z^N$ from $Q^i$
  $\hat{P}(E = e) = \hat{P}(E = e) + \frac{1}{N} \sum_{j=1}^{N} w^i(z^j)$
  Update $Q^{i+1} = Q^i + \eta(i) (Q' - Q^i)$
End
Return $\hat{P}(E = e) / k$
Adaptive Importance Sampling
• General case
• Given k proposal distributions
• Take N samples out of each distribution
• Approximate P(e)
$\hat{P}(e) = \frac{1}{k} \sum_{j=1}^{k} \left(\text{Avg weight of the } j\text{th proposal}\right)$
Estimating Q'(z)
$Q'(Z) = Q'(Z_1) \times Q'(Z_2 \mid pa(Z_2)) \times \dots \times Q'(Z_n \mid pa(Z_n))$
where each $Q'(Z_i \mid Z_1, \dots, Z_{i-1})$ is estimated by importance sampling
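A toy sketch of the adaptive loop, under several assumptions not taken from the slides: a single hidden variable with three values, an invented table `P_ZE` for P(z, e), a uniform initial proposal, and a constant learning rate `eta` in place of a schedule that changes per round. Q' is estimated from the normalized weighted counts of the current round.

```python
import random

# Hypothetical unnormalized target P(z, e) for z in {0, 1, 2}; P(e) = 0.20.
P_ZE = [0.02, 0.10, 0.08]

def adaptive_is(num_rounds, n_per_round, rng, eta=0.5):
    """Return (estimate of P(e), final proposal) after num_rounds of adaptation."""
    q = [1.0 / 3.0] * 3                       # initial proposal (assumed uniform)
    total = 0.0
    for _ in range(num_rounds):
        zs = rng.choices(range(3), weights=q, k=n_per_round)
        ws = [P_ZE[z] / q[z] for z in zs]
        total += sum(ws) / n_per_round        # average weight of this round's proposal
        # Estimate Q' by importance sampling: normalized weighted counts per value.
        mass = [0.0, 0.0, 0.0]
        for z, w in zip(zs, ws):
            mass[z] += w
        norm = sum(mass)
        q_prime = [m / norm for m in mass]
        # Move the proposal a fraction eta of the way toward Q'.
        q = [qi + eta * (qp - qi) for qi, qp in zip(q, q_prime)]
    return total / num_rounds, q

rng = random.Random(6)                        # arbitrary seed
p_hat, q_final = adaptive_is(num_rounds=20, n_per_round=2_000, rng=rng)
print(p_hat, q_final)
```

As the rounds proceed the proposal drifts toward P(z|e) = (0.1, 0.5, 0.4), the zero-variance choice for this target, so later rounds contribute less noise to the average.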
Overview
1. Probabilistic Reasoning/Graphical models
2. Importance Sampling
3. Markov Chain Monte Carlo: Gibbs Sampling
4. Sampling in presence of Determinism
5. Rao-Blackwellisation
6. AND/OR importance sampling
Markov Chain
• A Markov chain is a discrete random process with the property that the next state depends only on the current state (Markov Property): $P(x^t \mid x^1, x^2, \dots, x^{t-1}) = P(x^t \mid x^{t-1})$
x1 → x2 → x3 → x4
• If $P(X^t \mid x^{t-1})$ does not depend on t (time homogeneous) and the state space is finite, then it is often expressed as a transition function (aka transition matrix), with $\sum_{x} P(X = x) = 1$
Example: Drunkard’s Walk
• a random walk on the number line where, at each step, the position may change by +1 or −1 with equal probability
• $D(X) = \{0, 1, 2, \dots\}$
• transition matrix P(X): from state n, $P(n-1) = 0.5$ and $P(n+1) = 0.5$
Example: Weather Model
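• $D(X) = \{rainy, sunny\}$
• transition matrix P(X) (rows: current state, columns: next state):
– rainy: P(rainy) = 0.9, P(sunny) = 0.1
– sunny: P(rainy) = 0.5, P(sunny) = 0.5
• Example sequence: rain → rain → rain → rain → sun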
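A small simulation of this chain, reading the transition matrix as: from rainy, stay rainy with probability 0.9; from sunny, either state with probability 0.5. The function name, start state and seed are assumptions for the example.

```python
import random

# Transition probabilities as read from the example: TRANSITION[current][next].
TRANSITION = {
    "rainy": {"rainy": 0.9, "sunny": 0.1},
    "sunny": {"rainy": 0.5, "sunny": 0.5},
}

def simulate(start, steps, rng):
    """Return the visited states; each move depends only on the current state."""
    state = start
    path = [state]
    for _ in range(steps):
        row = TRANSITION[state]
        state = rng.choices(list(row), weights=list(row.values()))[0]
        path.append(state)
    return path

rng = random.Random(0)            # arbitrary seed
path = simulate("sunny", 50_000, rng)
frac_rainy = path.count("rainy") / len(path)
print(path[:6], frac_rainy)
```

Over a long run the fraction of rainy steps settles near 5/6 regardless of the start state.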
Multi-Variable System
• $X = \{X_1, X_2, X_3\}$, $D(X_i)$ discrete, finite
• state is an assignment of values to all the variables: $x^t = \{x_1^t, x_2^t, \dots, x_n^t\}$
• one transition maps $(x_1^t, x_2^t, x_3^t)$ to $(x_1^{t+1}, x_2^{t+1}, x_3^{t+1})$
Bayesian Network System
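• Bayesian Network is a representation of the joint probability distribution over 2 or more variables
• $X = \{X_1, X_2, X_3\}$ (network over X1, X2, X3), with state $x^t = \{x_1^t, x_2^t, x_3^t\}$
• one transition maps $(x_1^t, x_2^t, x_3^t)$ to $(x_1^{t+1}, x_2^{t+1}, x_3^{t+1})$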
Stationary Distribution: Existence
• If the Markov chain is time-homogeneous, then the vector π(X) is a stationary distribution (aka invariant or equilibrium distribution, aka “fixed point”), if its entries sum up to 1 and satisfy: $\pi(x_j) = \sum_{x_i \in D(X)} \pi(x_i) P(x_j \mid x_i)$
• Finite state space Markov chain has a unique stationary distribution if and only if:
– The chain is irreducible
– All of its states are positive recurrent
![Page 60: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/60.jpg)
Irreducible
• A state x is irreducible if under the transition rule one has nonzero probability of moving from x to any other state and then coming back in a finite number of steps
• If one state is irreducible, then all the states must be irreducible
(Liu, Ch. 12, p. 249, Def. 12.1.1)
![Page 61: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/61.jpg)
Recurrent
• A state x is recurrent if the chain returns to x with probability 1
• Let M(x) be the expected number of steps to return to state x
• State x is positive recurrent if M(x) is finite
The recurrent states in a finite state chain are positive recurrent.
![Page 62: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/62.jpg)
Stationary Distribution Convergence
• Consider infinite Markov chain: $P^{(n)} = P(x^n \mid x^0) = P^0 P^n$
• Initial state is not important in the limit
“The most useful feature of a “good” Markov chain is its fast forgetfulness of its past…”
(Liu, Ch. 12.1)
• If the chain is both irreducible and aperiodic, then:
$$\pi = \lim_{n \to \infty} P^{(n)}$$
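The claim that the initial state is forgotten in the limit can be checked numerically. The sketch below uses an illustrative two-state transition matrix (an assumption, not taken from a specific slide) and the hypothetical helpers `step` and `distribution_after`. It propagates two different initial distributions for 50 steps.

```python
def step(dist, P):
    """One step of the chain: returns the row vector dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def distribution_after(dist0, P, n_steps):
    """Distribution over states after n_steps, starting from dist0."""
    dist = list(dist0)
    for _ in range(n_steps):
        dist = step(dist, P)
    return dist

# Illustrative irreducible, aperiodic two-state chain (rows sum to 1).
P = [[0.9, 0.1],
     [0.5, 0.5]]

from_first = distribution_after([1.0, 0.0], P, 50)
from_second = distribution_after([0.0, 1.0], P, 50)
print(from_first, from_second)
```

Both runs end at (numerically) the same distribution. The gap between them shrinks geometrically at a rate set by the second eigenvalue of the matrix, which is 0.4 for this example.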
![Page 63: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/63.jpg)
Aperiodic
• Define d(i) = g.c.d.{n > 0 | it is possible to go from i to i in n steps}. Here, g.c.d. means the greatest common divisor of the integers in the set. If d(i) = 1 for all i, then the chain is aperiodic
• Positive recurrent, aperiodic states are ergodic
![Page 64: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/64.jpg)
Markov Chain Monte Carlo
• How do we estimate P(X), e.g., P(X|e) ?
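• Generate samples that form a Markov chain with stationary distribution $\pi = P(X \mid e)$
• Estimate $\pi$ from samples (observed states):
visited states $x^0, \ldots, x^n$ can be viewed as “samples” from distribution $\pi$
$$\hat{\pi}(x) = \frac{1}{T} \sum_{t=1}^{T} \delta(x, x^t)$$
$$\pi(x) = \lim_{T \to \infty} \hat{\pi}(x)$$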
![Page 65: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/65.jpg)
MCMC Summary
• Convergence is guaranteed in the limit
• Samples are dependent, not i.i.d.
• Convergence (mixing rate) may be slow
• The stronger the correlation between states, the slower the convergence!
• Initial state is not important, but typically we throw away the first K samples (“burn-in”)
![Page 66: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/66.jpg)
Gibbs Sampling (Geman&Geman,1984)
• Gibbs sampler is an algorithm to generate a sequence of samples from the joint probability distribution of two or more random variables
• Sample a new value for one variable at a time from that variable’s conditional distribution:
$$P(X_i) = P(X_i \mid x_1^t, \ldots, x_{i-1}^t, x_{i+1}^t, \ldots, x_n^t) = P(X_i \mid x^t \setminus x_i)$$
• Samples form a Markov chain with stationary distribution P(X|e)
![Page 67: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/67.jpg)
Gibbs Sampling: Illustration
The process of Gibbs sampling can be understood as a random walk in the space of all instantiations of X=x (remember drunkard’s walk):
In one step we can reach instantiations that differ from current one by value assignment to at most one variable (assume randomized choice of variables Xi).
![Page 68: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/68.jpg)
Ordered Gibbs Sampler
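Generate sample $x^{t+1}$ from $x^t$ (process all variables in some order):
$$\begin{aligned}
X_1 = x_1^{t+1} &\leftarrow P(X_1 \mid x_2^t, x_3^t, \ldots, x_N^t, e) \\
X_2 = x_2^{t+1} &\leftarrow P(X_2 \mid x_1^{t+1}, x_3^t, \ldots, x_N^t, e) \\
&\;\;\vdots \\
X_N = x_N^{t+1} &\leftarrow P(X_N \mid x_1^{t+1}, x_2^{t+1}, \ldots, x_{N-1}^{t+1}, e)
\end{aligned}$$
In short, for i = 1 to N:
$$X_i = x_i^{t+1} \leftarrow \text{sampled from } P(X_i \mid x^t \setminus x_i, e)$$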
![Page 69: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/69.jpg)
Transition Probabilities in BN
Markov blanket:
$$markov_i = pa_i \cup ch_i \cup \Big(\bigcup_{X_j \in ch_i} pa_j\Big)$$
$$P(X_i \mid x^t \setminus x_i) = P(X_i \mid markov_i^t):$$
$$P(x_i \mid x^t \setminus x_i) \propto P(x_i \mid pa_i) \prod_{X_j \in ch_i} P(x_j \mid pa_j)$$
Given its Markov blanket (parents, children, and their parents), $X_i$ is independent of all other nodes.
Computation is linear in the size of the Markov blanket!
![Page 70: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/70.jpg)
Ordered Gibbs Sampling Algorithm (Pearl, 1988)
Input: X, E=e
Output: T samples {$x^t$}
Fix evidence E=e, initialize $x^0$ at random
1. For t = 1 to T (compute samples)
2.   For i = 1 to N (loop through variables)
3.     $x_i^{t+1} \leftarrow P(X_i \mid markov_i^t)$
4.   End For
5. End For
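The ordered Gibbs loop can be sketched on a toy network. Everything below is an assumption made for illustration: a binary chain X1 → X2 → X3 with made-up CPT entries, evidence X3 = 1, and hypothetical helper names. Each conditional is built from the variable's own CPT times the CPTs of its children, and the estimate is checked against exact enumeration.

```python
import random

# Hypothetical binary chain X1 -> X2 -> X3 with evidence X3 = 1.
P_X1 = 0.3                     # P(X1 = 1)
P_X2 = {0: 0.2, 1: 0.8}        # P(X2 = 1 | X1)
P_X3 = {0: 0.1, 1: 0.7}        # P(X3 = 1 | X2)

def bern(p, v):
    """Probability that a Bernoulli(p) variable takes value v."""
    return p if v == 1 else 1.0 - p

def joint(x1, x2, x3):
    return bern(P_X1, x1) * bern(P_X2[x1], x2) * bern(P_X3[x2], x3)

def cond_x1(x2):
    """P(X1 = 1 | x2): prior of X1 times the CPT of its child X2."""
    w1 = bern(P_X1, 1) * bern(P_X2[1], x2)
    w0 = bern(P_X1, 0) * bern(P_X2[0], x2)
    return w1 / (w0 + w1)

def cond_x2(x1, x3):
    """P(X2 = 1 | x1, x3): own CPT times the CPT of its child X3."""
    w1 = bern(P_X2[x1], 1) * bern(P_X3[1], x3)
    w0 = bern(P_X2[x1], 0) * bern(P_X3[0], x3)
    return w1 / (w0 + w1)

def gibbs(T, burn_in, rng, x3=1):
    """Ordered Gibbs: resample X1, then X2, with X3 clamped to the evidence."""
    x1, x2 = rng.randint(0, 1), rng.randint(0, 1)
    samples = []
    for t in range(T + burn_in):
        x1 = 1 if rng.random() < cond_x1(x2) else 0
        x2 = 1 if rng.random() < cond_x2(x1, x3) else 0
        if t >= burn_in:
            samples.append((x1, x2))
    return samples

def exact_x1_given_x3(x3=1):
    num = sum(joint(1, x2, x3) for x2 in (0, 1))
    den = sum(joint(x1, x2, x3) for x1 in (0, 1) for x2 in (0, 1))
    return num / den

samples = gibbs(T=50_000, burn_in=1_000, rng=random.Random(1))
estimate = sum(x1 for x1, _ in samples) / len(samples)
print(estimate, exact_x1_given_x3())
```

The burn-in length and sample count here are arbitrary choices. Discarding the early samples only matters when the random starting state is far from the typical set.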
![Page 71: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/71.jpg)
Gibbs Sampling Example - BN
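$X = \{X_1, X_2, \ldots, X_9\}$, $E = \{X_9\}$
[Diagram: Bayesian network over $X_1, \ldots, X_9$]
Initial assignment:
$X_1 = x_1^0$, $X_2 = x_2^0$, $X_3 = x_3^0$, $X_4 = x_4^0$, $X_5 = x_5^0$, $X_6 = x_6^0$, $X_7 = x_7^0$, $X_8 = x_8^0$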
![Page 72: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/72.jpg)
Gibbs Sampling Example - BN
$X = \{X_1, X_2, \ldots, X_9\}$, $E = \{X_9\}$
[Diagram: Bayesian network over $X_1, \ldots, X_9$]
$$x_1^1 \leftarrow P(X_1 \mid x_2^0, \ldots, x_8^0, x_9)$$
$$x_2^1 \leftarrow P(X_2 \mid x_1^1, \ldots, x_8^0, x_9)$$
![Page 73: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/73.jpg)
Answering Queries P(xi|e) = ?
• Method 1: count # of samples where Xi = xi (histogram estimator):
$$\hat{P}(X_i = x_i) = \frac{1}{T} \sum_{t=1}^{T} \delta(x_i, x^t)$$
where $\delta$ is the Dirac delta function
• Method 2: average probability (mixture estimator):
$$\hat{P}(X_i = x_i) = \frac{1}{T} \sum_{t=1}^{T} P(X_i = x_i \mid markov_i^t)$$
• Mixture estimator converges faster (consider estimates for the unobserved values of Xi; prove via Rao-Blackwell theorem)
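The two estimators can be compared side by side on a small example. The sketch below assumes a made-up joint distribution over two binary variables A and B and runs a two-variable Gibbs sampler. The histogram estimator averages the indicator of A = 1, and the mixture estimator averages the conditional probability that was used to draw A.

```python
import random

# Hypothetical joint distribution over two binary variables (A, B).
JOINT = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def p_a1_given_b(b):
    return JOINT[(1, b)] / (JOINT[(0, b)] + JOINT[(1, b)])

def p_b1_given_a(a):
    return JOINT[(a, 1)] / (JOINT[(a, 0)] + JOINT[(a, 1)])

def run(T, rng):
    """Return (histogram estimate, mixture estimate) of P(A = 1)."""
    a, b = 0, 0
    hist_sum, mix_sum = 0.0, 0.0
    for _ in range(T):
        pa = p_a1_given_b(b)      # conditional used to sample A at this step
        a = 1 if rng.random() < pa else 0
        b = 1 if rng.random() < p_b1_given_a(a) else 0
        hist_sum += a             # indicator that the sample has A = 1
        mix_sum += pa             # P(A = 1 | rest of the state)
    return hist_sum / T, mix_sum / T

hist, mix = run(20_000, random.Random(2))
exact = JOINT[(1, 0)] + JOINT[(1, 1)]
print(hist, mix, exact)
```

Both estimates target the same marginal. The mixture estimate replaces a 0/1 indicator with its conditional expectation, which is where the variance reduction comes from.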
![Page 74: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/74.jpg)
Rao-Blackwell Theorem
Rao-Blackwell Theorem: Let random variable set X be composed of two groups of variables, R and L. Then, for the joint distribution $\pi(R, L)$ and function g, the following result applies:
$$Var[E\{g(R) \mid L\}] \le Var[g(R)]$$
for a function of interest g, e.g., the mean or covariance (Casella & Robert, 1996; Liu et al., 1995).
• The theorem makes a weak promise, but works well in practice!
• The improvement depends on the choice of R and L
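The inequality can be verified exactly on a small table. The sketch below assumes a made-up joint distribution over R and L and takes g(R) = R. It computes Var[g(R)] and Var[E{g(R) | L}] by direct summation, with no sampling involved.

```python
# Hypothetical joint distribution P(R, L); R in {0, 1, 2}, L in {0, 1}.
JOINT = {
    (0, 0): 0.10, (1, 0): 0.25, (2, 0): 0.05,
    (0, 1): 0.30, (1, 1): 0.10, (2, 1): 0.20,
}

def g(r):
    return float(r)  # function of interest; here simply the value of R

def variance_of_g():
    """Var[g(R)] under the joint distribution."""
    mean = sum(p * g(r) for (r, _), p in JOINT.items())
    return sum(p * (g(r) - mean) ** 2 for (r, _), p in JOINT.items())

def variance_of_conditional_mean():
    """Var[E{g(R) | L}]: variance of the conditional mean across values of L."""
    p_l = {}
    for (_, l), p in JOINT.items():
        p_l[l] = p_l.get(l, 0.0) + p
    cond_mean = {
        l: sum(p * g(r) for (r, l2), p in JOINT.items() if l2 == l) / p_l[l]
        for l in p_l
    }
    mean = sum(p_l[l] * cond_mean[l] for l in p_l)
    return sum(p_l[l] * (cond_mean[l] - mean) ** 2 for l in p_l)

print(variance_of_g(), variance_of_conditional_mean())
```

In this table the conditional means for the two values of L are close to each other, so the conditioned variance is far smaller than Var[g(R)]. How large the gap is depends on the choice of R and L.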
![Page 75: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/75.jpg)
Importance vs. Gibbs
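Gibbs:
$$x^t \leftarrow \hat{P}(X \mid e), \qquad \hat{P}(X \mid e) \xrightarrow{T \to \infty} P(X \mid e)$$
$$\hat{g}(X) = \frac{1}{T} \sum_{t=1}^{T} g(x^t)$$
Importance:
$$X^t \leftarrow Q(X \mid e)$$
$$\hat{g} = \frac{1}{T} \sum_{t=1}^{T} \frac{g(x^t)\, P(x^t)}{Q(x^t)}, \qquad w_t = \frac{P(x^t)}{Q(x^t)}$$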
![Page 76: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/76.jpg)
Gibbs Sampling: Convergence
• Sample from $\hat{P}(X \mid e) \to P(X \mid e)$
• Converges iff chain is irreducible and ergodic
• Intuition: must be able to explore all states:
– if $X_i$ and $X_j$ are strongly correlated, $X_i = 0 \leftrightarrow X_j = 0$, then we cannot explore states with $X_i = 1$ and $X_j = 1$
• All conditions are satisfied when all probabilities are positive
• Convergence rate can be characterized by the second eigenvalue of the transition matrix
![Page 77: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/77.jpg)
Gibbs: Speeding Convergence
Reduce dependence between samples (autocorrelation)
• Skip samples
• Randomize Variable Sampling Order
• Employ blocking (grouping)
• Multiple chains
Reduce variance (covered in the next section)
![Page 78: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/78.jpg)
Blocking Gibbs Sampler
• Sample several variables together, as a block
• Example: Given three variables X, Y, Z, with domains of size 2, group Y and Z together to form a variable W = {Y, Z} with domain size 4. Then, given sample $(x^t, y^t, z^t)$, compute the next sample:
$$x^{t+1} \leftarrow P(X \mid y^t, z^t) = P(X \mid w^t)$$
$$(y^{t+1}, z^{t+1}) = w^{t+1} \leftarrow P(Y, Z \mid x^{t+1})$$
+ Can improve convergence greatly when two variables are strongly correlated!
- Domain of the block variable grows exponentially with the number of variables in a block!
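A minimal sketch of the blocking step for three binary variables, assuming a made-up joint table in which Y and Z are strongly correlated. The block W = (Y, Z) is drawn in one shot from its four joint values. The helper names are hypothetical.

```python
import random
from itertools import product

# Hypothetical joint P(X, Y, Z) over binary variables; Y and Z are strongly
# correlated, which is the case where blocking is expected to help.
WEIGHTS = {(x, y, z): (8.0 if y == z else 1.0) * (2.0 if x == y else 1.0)
           for x, y, z in product((0, 1), repeat=3)}
TOTAL = sum(WEIGHTS.values())
JOINT = {k: v / TOTAL for k, v in WEIGHTS.items()}

def sample_x(y, z, rng):
    """Draw x from P(X | y, z)."""
    p1 = JOINT[(1, y, z)] / (JOINT[(0, y, z)] + JOINT[(1, y, z)])
    return 1 if rng.random() < p1 else 0

def sample_block(x, rng):
    """Draw w = (y, z) jointly from P(Y, Z | x); the block has 4 values."""
    values = list(product((0, 1), repeat=2))
    probs = [JOINT[(x, y, z)] for y, z in values]  # unnormalized is fine
    return rng.choices(values, weights=probs, k=1)[0]

def blocking_gibbs(T, rng):
    x, (y, z) = 0, (0, 0)
    samples = []
    for _ in range(T):
        x = sample_x(y, z, rng)
        y, z = sample_block(x, rng)
        samples.append((x, y, z))
    return samples

samples = blocking_gibbs(20_000, random.Random(3))
est_z1 = sum(z for _, _, z in samples) / len(samples)
exact_z1 = sum(p for (x, y, z), p in JOINT.items() if z == 1)
print(est_z1, exact_z1)
```

The block's table has 2 × 2 entries here. With k binary variables in a block it would have 2^k entries, which is the exponential cost noted above.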
![Page 79: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/79.jpg)
Gibbs: Multiple Chains
• Generate M chains of size K
• Each chain produces independent estimate $P_m$:
$$P_m(x_i \mid e) = \frac{1}{K} \sum_{t=1}^{K} P(x_i \mid x^t \setminus x_i)$$
• Treat $P_m$ as independent random variables.
• Estimate $P(x_i \mid e)$ as the average of $P_m(x_i \mid e)$:
$$\hat{P} = \frac{1}{M} \sum_{m=1}^{M} P_m$$
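The multiple-chain scheme can be sketched as follows, assuming a made-up joint over two binary variables and hypothetical function names. Each chain returns its own mixture estimate, and the final answer is their average. Because the chains are independent, the spread across chains also gives a rough variance for the averaged estimate.

```python
import random

# Hypothetical joint distribution over two binary variables (A, B).
JOINT = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def cond(var, other_value):
    """P(var = 1 | other variable = other_value); var is 'a' or 'b'."""
    if var == "a":
        one, zero = JOINT[(1, other_value)], JOINT[(0, other_value)]
    else:
        one, zero = JOINT[(other_value, 1)], JOINT[(other_value, 0)]
    return one / (one + zero)

def chain_estimate(K, rng):
    """P_m(A = 1): mixture estimate from one chain of K samples."""
    a, b = rng.randint(0, 1), rng.randint(0, 1)
    total = 0.0
    for _ in range(K):
        p = cond("a", b)
        a = 1 if rng.random() < p else 0
        b = 1 if rng.random() < cond("b", a) else 0
        total += p
    return total / K

def multi_chain_estimate(M, K, seed):
    """Average of M per-chain estimates and a variance estimate for that average."""
    estimates = [chain_estimate(K, random.Random(seed + m)) for m in range(M)]
    mean = sum(estimates) / M
    # Sample variance across chains, divided by M, estimates Var of the mean.
    var = sum((e - mean) ** 2 for e in estimates) / (M - 1)
    return mean, var / M

mean, var_of_mean = multi_chain_estimate(M=20, K=2_000, seed=10)
print(mean, var_of_mean)
```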
![Page 80: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/80.jpg)
Gibbs Sampling Summary
• Markov Chain Monte Carlo method (Gelfand and Smith, 1990; Smith and Roberts, 1993; Tierney, 1994)
• Samples are dependent, form Markov Chain
• Sample from $\hat{P}(X \mid e)$, which converges to $P(X \mid e)$
• Guaranteed to converge when all P > 0
• Methods to improve convergence:
– Blocking
– Rao-Blackwellisation
![Page 81: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/81.jpg)
Overview
1. Probabilistic Reasoning/Graphical models
2. Importance Sampling
3. Markov Chain Monte Carlo: Gibbs Sampling
4. Sampling in presence of Determinism
5. Rao-Blackwellisation
6. AND/OR importance sampling
![Page 82: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/82.jpg)
Sampling: Performance
• Gibbs sampling
– Reduce dependence between samples
• Importance sampling
– Reduce variance
• Achieve both by sampling a subset of variables and integrating out the rest (reduce dimensionality), aka Rao-Blackwellisation
• Exploit graph structure to manage the extra cost
![Page 83: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/83.jpg)
Smaller Subset State-Space
• Smaller state-space is easier to cover
$X = \{X_1, X_2, X_3, X_4\}$, $D(X) = 64$
$X = \{X_1, X_2\}$, $D(X) = 16$
![Page 84: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/84.jpg)
Smoother Distribution
[Figure: two surface plots, P(X1,X2,X3,X4) and P(X1,X2); probability axis from 0 to 0.2, with shading bands 0-0.1, 0.1-0.2, 0.2-0.26]
![Page 85: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/85.jpg)
Speeding Up Convergence
• Mean Squared Error of the estimator:
$$MSE_Q[\hat{P}] = BIAS^2 + Var_Q[\hat{P}]$$
• In case of unbiased estimator, BIAS = 0:
$$MSE_Q[\hat{P}] = Var_Q[\hat{P}] = E_Q[\hat{P}^2] - \left(E_Q[\hat{P}]\right)^2$$
• Reduce variance $\Rightarrow$ speed up convergence!
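The bias-variance decomposition of the mean squared error can be checked on simulated estimates. The sketch below assumes a hypothetical target probability of 0.3 and an unbiased frequency estimator. It repeats the estimate many times and confirms that the empirical MSE equals squared bias plus variance.

```python
import random

def mse_decomposition(estimates, true_value):
    """Return (mse, bias_squared, variance) for a list of estimates."""
    n = len(estimates)
    mean = sum(estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    bias_sq = (mean - true_value) ** 2
    variance = sum((e - mean) ** 2 for e in estimates) / n
    return mse, bias_sq, variance

def coin_estimate(p, n_samples, rng):
    """Fraction of successes in n_samples Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(n_samples)) / n_samples

rng = random.Random(4)
p_true = 0.3  # hypothetical target probability
estimates = [coin_estimate(p_true, 100, rng) for _ in range(2_000)]
mse, bias_sq, variance = mse_decomposition(estimates, p_true)
print(mse, bias_sq + variance)
```

For this unbiased estimator the squared-bias term is close to zero, so nearly all of the error is variance.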
![Page 86: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/86.jpg)
Rao-Blackwellisation
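$X = R \cup L$
$$\hat{g}(x) = \frac{1}{T}\{h(x^1) + \cdots + h(x^T)\}$$
$$\tilde{g}(x) = \frac{1}{T}\{E[h(x) \mid l^1] + \cdots + E[h(x) \mid l^T]\}$$
$$Var\{g(x)\} = Var\{E[g(x) \mid l]\} + E\{var[g(x) \mid l]\}$$
$$Var\{g(x)\} \ge Var\{E[g(x) \mid l]\}$$
$$Var\{\hat{g}(x)\} = \frac{Var\{h(x)\}}{T} \ge \frac{Var\{E[h(x) \mid l]\}}{T} = Var\{\tilde{g}(x)\}$$
Liu, Ch. 2.3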
![Page 87: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/87.jpg)
Rao-Blackwellisation
• $X = R \cup L$
• Importance Sampling:
$$Var_Q\left\{\frac{P(R, L)}{Q(R, L)}\right\} \ge Var_Q\left\{\frac{P(R)}{Q(R)}\right\}$$
(Liu, Ch. 2.5.5)
• Gibbs Sampling:
– autocovariances are lower (less correlation between samples)
– if $X_i$ and $X_j$ are strongly correlated, $X_i = 0 \leftrightarrow X_j = 0$, only include one of them in the sampling set
“Carry out analytical computation as much as possible” - Liu
![Page 88: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/88.jpg)
Blocking Gibbs Sampler vs. Collapsed
[Diagram: network over X, Y, Z]
• Standard Gibbs: $P(x \mid y, z),\ P(y \mid x, z),\ P(z \mid x, y)$ (1)
• Blocking: $P(x \mid y, z),\ P(y, z \mid x)$ (2)
• Collapsed: $P(x \mid y),\ P(y \mid x)$ (3)
Faster Convergence
![Page 89: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/89.jpg)
Collapsed Gibbs Sampling
Generating Samples
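Generate sample $c^{t+1}$ from $c^t$:
$$\begin{aligned}
C_1 = c_1^{t+1} &\leftarrow P(c_1 \mid c_2^t, c_3^t, \ldots, c_K^t, e) \\
C_2 = c_2^{t+1} &\leftarrow P(c_2 \mid c_1^{t+1}, c_3^t, \ldots, c_K^t, e) \\
&\;\;\vdots \\
C_K = c_K^{t+1} &\leftarrow P(c_K \mid c_1^{t+1}, c_2^{t+1}, \ldots, c_{K-1}^{t+1}, e)
\end{aligned}$$
In short, for i = 1 to K:
$$C_i = c_i^{t+1} \leftarrow \text{sampled from } P(c_i \mid c^t \setminus c_i, e)$$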
![Page 90: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/90.jpg)
Collapsed Gibbs Sampler
Input: $C \subseteq X$, E=e
Output: T samples {$c^t$}
Fix evidence E=e, initialize $c^0$ at random
1. For t = 1 to T (compute samples)
2.   For i = 1 to K (loop through the variables in C)
3.     $c_i^{t+1} \leftarrow P(C_i \mid c^t \setminus c_i)$
4.   End For
5. End For
![Page 91: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/91.jpg)
Calculation Time
• Computing P(ci| ct\ci,e) is more expensive (requires inference)
• Trading #samples for smaller variance:
– generate more samples with higher covariance
– generate fewer samples with lower covariance
• Must control the time spent computing sampling probabilities in order to be time-effective!
![Page 92: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/92.jpg)
Exploiting Graph Properties
Recall… computation time is exponential in the adjusted induced width of a graph
• A w-cutset is a subset of variables such that, when they are observed, the induced width of the graph is w
• When the sampled variables form a w-cutset, inference is exp(w) (e.g., using Bucket Tree Elimination)
• A cycle-cutset is a special case of a w-cutset
Sampling a w-cutset $\Rightarrow$ w-cutset sampling!
![Page 93: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/93.jpg)
What If C=Cycle-Cutset ?
$c^0 = \{x_2^0, x_5^0\}$, $E = \{X_9\}$
[Diagram: Bayesian network over $X_1, \ldots, X_9$, shown next to the same network without the cutset nodes $X_2$ and $X_5$]
P(x2,x5,x9) – can compute using Bucket Elimination
P(x2,x5,x9) – computation complexity is O(N)
![Page 94: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/94.jpg)
Computing Transition Probabilities
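[Diagram: Bayesian network over $X_1, \ldots, X_9$]
Compute joint probabilities:
BE: $P(x_2 = 0, x_3, x_9)$
BE: $P(x_2 = 1, x_3, x_9)$
Normalize:
$$\alpha = \frac{1}{P(x_2 = 0, x_3, x_9) + P(x_2 = 1, x_3, x_9)}$$
$$P(x_2 = 0 \mid x_3) = \alpha\, P(x_2 = 0, x_3, x_9)$$
$$P(x_2 = 1 \mid x_3) = \alpha\, P(x_2 = 1, x_3, x_9)$$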
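The normalization step is small enough to show directly. In the sketch below the two joint values stand in for the outputs of two inference calls, one per value of x2. The numbers and the function name `normalize` are made up for illustration.

```python
def normalize(joint_values):
    """Turn joint probabilities P(c_i = v, rest, e) into P(c_i = v | rest, e)."""
    total = sum(joint_values.values())
    if total <= 0.0:
        raise ValueError("all joint probabilities are zero")
    return {value: p / total for value, p in joint_values.items()}

# Hypothetical outputs of two inference calls, one per value of x2.
joint_x2 = {0: 0.012, 1: 0.036}
cond_x2 = normalize(joint_x2)
print(cond_x2)
```

One inference call is needed per domain value of the variable being sampled, so the cost of a single transition scales with the domain size times the cost of one call.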
![Page 95: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/95.jpg)
Cutset Sampling-Answering Queries
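• Query: $\forall c_i \in C$, $P(c_i \mid e) = ?$ Same as Gibbs:
$$\hat{P}(c_i \mid e) = \frac{1}{T} \sum_{t=1}^{T} P(c_i \mid c^t \setminus c_i, e)$$
computed while generating sample t using bucket tree elimination
• Query: $\forall x_i \in X \setminus C$, $P(x_i \mid e) = ?$
$$\hat{P}(x_i \mid e) = \frac{1}{T} \sum_{t=1}^{T} P(x_i \mid c^t, e)$$
computed after generating sample t using bucket tree elimination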
![Page 96: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/96.jpg)
Cutset Sampling vs. Cutset Conditioning
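• Cutset Conditioning
$$P(x_i \mid e) = \sum_{c \in D(C)} P(x_i \mid c, e)\, P(c \mid e)$$
• Cutset Sampling
$$\hat{P}(x_i \mid e) = \frac{1}{T} \sum_{t=1}^{T} P(x_i \mid c^t, e) = \sum_{c \in D(C)} P(x_i \mid c, e)\, \frac{count(c)}{T} = \sum_{c \in D(C)} P(x_i \mid c, e)\, \hat{P}(c \mid e)$$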
![Page 97: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/97.jpg)
Cutset Sampling Example
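[Diagram: Bayesian network over $X_1, \ldots, X_9$]
Estimating P(x2|e) for sampling node X2:
Sample 1: $x_2^1 \leftarrow P(x_2 \mid x_5^0, x_9)$
Sample 2: $x_2^2 \leftarrow P(x_2 \mid x_5^1, x_9)$
Sample 3: $x_2^3 \leftarrow P(x_2 \mid x_5^2, x_9)$
$$\hat{P}(x_2 \mid x_9) = \frac{1}{3}\left[P(x_2 \mid x_5^0, x_9) + P(x_2 \mid x_5^1, x_9) + P(x_2 \mid x_5^2, x_9)\right]$$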
![Page 98: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/98.jpg)
Cutset Sampling Example
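[Diagram: Bayesian network over $X_1, \ldots, X_9$]
Estimating P(x3|e) for non-sampled node X3:
$c^1 = \{x_2^1, x_5^1\} \Rightarrow P(x_3 \mid x_2^1, x_5^1, x_9)$
$c^2 = \{x_2^2, x_5^2\} \Rightarrow P(x_3 \mid x_2^2, x_5^2, x_9)$
$c^3 = \{x_2^3, x_5^3\} \Rightarrow P(x_3 \mid x_2^3, x_5^3, x_9)$
$$\hat{P}(x_3 \mid x_9) = \frac{1}{3}\left[P(x_3 \mid x_2^1, x_5^1, x_9) + P(x_3 \mid x_2^2, x_5^2, x_9) + P(x_3 \mid x_2^3, x_5^3, x_9)\right]$$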
![Page 99: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/99.jpg)
CPCS54 Test Results
MSE vs. #samples (left) and time (right)
Ergodic, |X|=54, D(Xi)=2, |C|=15, |E|=3
Exact Time = 30 sec using Cutset Conditioning
[Figure: two plots titled "CPCS54, n=54, |C|=15, |E|=3" comparing Cutset and Gibbs; left x-axis: # samples (0 to 5000); right x-axis: Time (sec) (0 to 25)]
![Page 100: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/100.jpg)
CPCS179 Test Results
MSE vs. #samples (left) and time (right)
Non-Ergodic (1 deterministic CPT entry), |X| = 179, |C| = 8, 2 <= D(Xi) <= 4, |E| = 35
Exact Time = 122 sec using Cutset Conditioning
[Figure: two plots titled "CPCS179, n=179, |C|=8, |E|=35" comparing Cutset and Gibbs; left x-axis: # samples (100 to 4000); right x-axis: Time (sec) (0 to 80)]
![Page 101: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/101.jpg)
CPCS360b Test Results
MSE vs. #samples (left) and time (right)
Ergodic, |X| = 360, D(Xi)=2, |C| = 21, |E| = 36
Exact Time > 60 min using Cutset Conditioning
Exact Values obtained via Bucket Elimination
[Figure: two plots titled "CPCS360b, n=360, |C|=21, |E|=36" comparing Cutset and Gibbs; left x-axis: # samples (0 to 1000); right x-axis: Time (sec) (1 to 60)]
![Page 102: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/102.jpg)
Random Networks
MSE vs. #samples (left) and time (right)
|X| = 100, D(Xi) = 2, |C| = 13, |E| = 15-20
Exact Time = 30 sec using Cutset Conditioning
[Figure: two plots titled "RANDOM, n=100, |C|=13, |E|=15-20" comparing Cutset and Gibbs; left x-axis: # samples (0 to 1200); right x-axis: Time (sec) (0 to 11)]
![Page 103: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/103.jpg)
Coding Networks: Cutset Transforms Non-Ergodic Chain to Ergodic
MSE vs. time (right)
Non-Ergodic, |X| = 100, D(Xi)=2, |C| = 13-16, |E| = 50
Sample Ergodic Subspace U={U1, U2,…Uk}
Exact Time = 50 sec using Cutset Conditioning
[Diagram: coding network with node layers x1..x4, u1..u4, p1..p4, y1..y4]
[Figure: plot titled "Coding Networks, n=100, |C|=12-14" comparing IBP, Gibbs, and Cutset; x-axis: Time (sec) (0 to 60); y-axis on a log scale (0.001 to 0.1)]
![Page 104: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/104.jpg)
Non-Ergodic Hailfinder
MSE vs. #samples (left) and time (right)
Non-Ergodic, |X| = 56, |C| = 5, 2 <= D(Xi) <= 11, |E| = 0
Exact Time = 2 sec using Loop-Cutset Conditioning
[Figure: two plots titled "HailFinder, n=56, |C|=5, |E|=1" comparing Cutset and Gibbs on a log-scale y-axis; x-axes: # samples (0 to 1500) and Time (sec) (1 to 10)]
![Page 105: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/105.jpg)
CPCS360b - MSE
[Figure: plot titled "cpcs360b, N=360, |E|=[20-34], w*=20, MSE"; x-axis: Time (sec) (0 to 1600); series: Gibbs, IBP, |C|=26,fw=3, |C|=48,fw=2]
MSE vs. Time
Ergodic, |X| = 360, |C| = 26, D(Xi)=2
Exact Time = 50 min using BTE
![Page 106: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/106.jpg)
Cutset Importance Sampling
• Apply Importance Sampling over cutset C
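$$\hat{P}(e) = \frac{1}{T} \sum_{t=1}^{T} \frac{P(c^t, e)}{Q(c^t)} = \frac{1}{T} \sum_{t=1}^{T} w^t$$
$$\hat{P}(c_i \mid e) = \alpha\, \frac{1}{T} \sum_{t=1}^{T} \delta(c_i, c^t)\, w^t$$
$$\hat{P}(x_i \mid e) = \alpha\, \frac{1}{T} \sum_{t=1}^{T} P(x_i \mid c^t, e)\, w^t$$
where $P(c^t, e)$ is computed using Bucket Elimination and $\alpha$ is a normalization constant
(Gogate & Dechter, 2005) and (Bidyuk & Dechter, 2006)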
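A minimal sketch of importance sampling over a cutset, assuming a single binary cutset variable C. The tables `P_C_E` and `P_X1_GIVEN_C_E` are made-up stand-ins for quantities that exact inference (for example bucket elimination) would return, and the proposal Q is uniform. It estimates P(e) and two posterior marginals from weighted samples.

```python
import random

# Hypothetical quantities that exact inference would supply for a single
# binary cutset variable C and evidence e.
P_C_E = {0: 0.06, 1: 0.14}            # P(C = c, e)
P_X1_GIVEN_C_E = {0: 0.9, 1: 0.2}     # P(X = 1 | C = c, e)
Q = {0: 0.5, 1: 0.5}                  # proposal distribution Q(C)

def cutset_importance_sampling(T, rng):
    """Return estimates of P(e), P(C = 1 | e), and P(X = 1 | e)."""
    sum_w = sum_w_c1 = sum_w_x1 = 0.0
    for _ in range(T):
        c = 1 if rng.random() < Q[1] else 0
        w = P_C_E[c] / Q[c]                 # importance weight w^t
        sum_w += w
        sum_w_c1 += w * (c == 1)            # indicator of C = 1, weighted
        sum_w_x1 += w * P_X1_GIVEN_C_E[c]   # P(X = 1 | c^t, e), weighted
    p_e = sum_w / T
    # Dividing by the total weight plays the role of a normalizing constant.
    return p_e, sum_w_c1 / sum_w, sum_w_x1 / sum_w

p_e, p_c1, p_x1 = cutset_importance_sampling(20_000, random.Random(5))
print(p_e, p_c1, p_x1)
```

With these made-up tables the exact answers are P(e) = 0.2, P(C = 1 | e) = 0.7, and P(X = 1 | e) = 0.41, so the printed estimates can be compared against them.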
![Page 107: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/107.jpg)
Likelihood Cutset Weighting (LCS)
• Z = Topological Order{C, E}
• Generating sample t+1:
For $Z_i \in Z$ do:
  If $Z_i \in E$
    $z_i^{t+1} = z_i$, $z_i \in e$
  Else
    $z_i^{t+1} \leftarrow P(Z_i \mid z_1^{t+1}, \ldots, z_{i-1}^{t+1})$
  End If
End For
• computed while generating sample t using bucket tree elimination
• can be memoized for some number of instances K (based on memory available)
$KL[P(C \mid e), Q(C)] \le KL[P(X \mid e), Q(X)]$
![Page 108: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/108.jpg)
Pathfinder 1
![Page 109: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/109.jpg)
Pathfinder 2
![Page 110: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/110.jpg)
Link
![Page 111: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/111.jpg)
Summary
Importance Sampling:
• i.i.d. samples
• Unbiased estimator
• Generates samples fast
• Samples from Q
• Reject samples with zero-weight
• Improves on cutset
Gibbs Sampling:
• Dependent samples
• Biased estimator
• Generates samples slower
• Samples from $\hat{P}(X \mid e)$
• Does not converge in presence of constraints
• Improves on cutset
![Page 112: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/112.jpg)
CPCS360b
[Figure: plot titled "cpcs360b, N=360, |LC|=26, w*=21, |E|=15"; x-axis: Time (sec) (0 to 14); y-axis: MSE on a log scale (1.E-05 to 1.E-02); series: LW, AIS-BN, Gibbs, LCS, IBP]
LW – likelihood weighting
LCS – likelihood weighting on a cutset
![Page 113: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/113.jpg)
CPCS422b
[Figure: plot titled "cpcs422b, N=422, |LC|=47, w*=22, |E|=28"; x-axis: Time (sec) (0 to 60); y-axis: MSE on a log scale (1.0E-05 to 1.0E-02); series: LW, AIS-BN, Gibbs, LCS, IBP]
LW – likelihood weighting
LCS – likelihood weighting on a cutset
![Page 114: Sampling Techniques for Probabilistic and …Markov Chain Monte Carlo: Gibbs Sampling 4. Sampling in presence of Determinism 5. Rao-Blackwellisation 6. AND/OR importance sampling Overview](https://reader035.fdocuments.net/reader035/viewer/2022062602/5ec5c5626d942b5f2d16a59e/html5/thumbnails/114.jpg)
Coding Networks
[Figure: plot titled "coding, N=200, P=3, |LC|=26, w*=21"; x-axis: Time (sec) (0 to 10); y-axis: MSE on a log scale (1.0E-05 to 1.0E-01); series: LW, AIS-BN, Gibbs, LCS, IBP]
LW – likelihood weighting
LCS – likelihood weighting on a cutset