Recursive Cavity Modeling for Estimation of Gaussian MRFs
Transcript of slides: ssg.mit.edu/~jasonj/johnson-rcm-ssg02.pdf
Recursive Cavity Modeling for
Estimation of Gaussian MRFs∗
Stochastic Systems Group
Jason K. Johnson
October 9, 2002
∗/mit/jasonj/Public/SSG-OCT9-02
Overview
• Background
– Graphical Models (MRFs)
– Exponential Families
– Gaussian MRFs
– Information Geometry and Projections
• Model-Thinning Projections
– Model selection by a greedy edge-removal procedure.
– Parameters optimized by Iterative Scaling.
• Recursive Cavity Modeling
– Nested Dissection
– Cavity Modeling
– Blanket Modeling
– Examples
Graphical Models∗
Undirected graph G = (V, E) with vertices V and edges E (unordered pairs of vertices).
Random variables x = (xi, i ∈ V) are said to be Markov w.r.t. G when
p(xA, xB|xS) = p(xA|xS)p(xB|xS)
for all A, B, S ⊂ V where S separates A from B.
Hammersley-Clifford, 71.† x is Markov w.r.t. G if and only if p(x) factors according to G as
p(x) = (1/Z(ψ)) ∏c∈C ψc(xc)
with positive potential functions ψ, where C is the set of cliques of G and Z(ψ) is the normalization constant.
The Markov structure of the random process x allows for compact specification of p(x) as a graphical model.
∗Lauritzen, 96; Jordan, 99. †Grimmett, 73.
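The factorization and the Markov property can be checked concretely on a tiny discrete model. The following Python sketch, with made-up potential tables, verifies that a chain factorization p ∝ ψ1,2 ψ2,3 yields x1 ⟂ x3 | x2:

```python
import itertools
import numpy as np

# Chain graph 1 - 2 - 3 with binary states and positive pairwise potentials.
# Potential values are hypothetical, chosen only for illustration.
psi12 = np.array([[2.0, 1.0], [1.0, 3.0]])
psi23 = np.array([[1.0, 2.0], [4.0, 1.0]])

# Unnormalized factorization p(x) ∝ ψ12(x1,x2) ψ23(x2,x3)
p = np.zeros((2, 2, 2))
for x1, x2, x3 in itertools.product(range(2), repeat=3):
    p[x1, x2, x3] = psi12[x1, x2] * psi23[x2, x3]
p /= p.sum()  # divide by the partition function Z(ψ)

# Markov property: x1 ⟂ x3 | x2, since vertex 2 separates 1 from 3
for x2 in range(2):
    joint = p[:, x2, :] / p[:, x2, :].sum()    # p(x1, x3 | x2)
    marg1 = joint.sum(axis=1, keepdims=True)   # p(x1 | x2)
    marg3 = joint.sum(axis=0, keepdims=True)   # p(x3 | x2)
    assert np.allclose(joint, marg1 * marg3)
```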
Example MRF
(Figure: 4-cycle graph with vertices x1, x2, x3, x4 and edges {1,2}, {2,3}, {3,4}, {4,1}.)
Graph Factorization
p(x) ∝ ψ1(x1)ψ2(x2)ψ3(x3)ψ4(x4) ψ1,2(x1, x2)ψ2,3(x2, x3)ψ3,4(x3, x4)ψ4,1(x4, x1)
Conditional Independence
p(x1,3|x2,4) = p(x1|x2,4)p(x3|x2,4)
p(x2,4|x1,3) = p(x2|x1,3)p(x4|x1,3)
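For a Gaussian MRF these conditional independences can be read off the zero pattern of the precision matrix: the conditional distribution of a subset A given the rest has precision J_AA. A numerical sketch for the 4-cycle, with an illustrative J:

```python
import numpy as np

# Precision matrix J for the 4-cycle x1—x2—x3—x4—x1: entries (1,3) and (2,4)
# are zero because those pairs are not joined by an edge. Values are
# illustrative; J must be symmetric positive definite.
J = np.array([[3.0, 1.0, 0.0, 0.5],
              [1.0, 3.0, 0.8, 0.0],
              [0.0, 0.8, 3.0, 1.2],
              [0.5, 0.0, 1.2, 3.0]])
assert np.all(np.linalg.eigvalsh(J) > 0)

# x_A | x_B has precision J_AA. With A = {1,3}, J_AA is diagonal,
# so x1 and x3 are conditionally independent given (x2, x4).
A = [0, 2]
cond_cov = np.linalg.inv(J[np.ix_(A, A)])
assert np.isclose(cond_cov[0, 1], 0.0)
```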
Exponential Families∗
Specified by a base measure q(x) > 0 and a set of sufficient statistics t(x), both defined over some specified state space X. We take X = Rⁿ so that the model is specified by a pdf of the form
f(x; θ) = q(x) exp{θ · t(x)− ϕ(θ)}
where the cumulant function ϕ(θ) is the normalization constant
ϕ(θ) = log ∫ q(x) exp{θ · t(x)} dx
We only consider admissible parameters Θ s.t. the pdf is normalizable, ϕ(θ) < ∞. The family is regular if Θ has non-empty interior. The statistics are minimal if the t(x) are linearly independent. Then a dual parameterization is provided by the moment coordinates η = Eθ{t(x)} over the set of achievable moments η(Θ).
∗Chentsov, 66; Barndorff-Nielsen, 78.
Gaussian Markov Random Fields
Consider Gaussian process x ∼ N (µ,Σ) with
mean vector µ = E{x} and covariance matrix
Σ = E{xx′} − µµ′.
Information Filter Form. Say that x ∼ N⁻¹(h, J) if
h = Σ⁻¹µ
J = Σ⁻¹
s.t. the density function is parameterized as
p(x) = exp{−(1/2) x′Jx + h′x − ϕ(h, J)}
where
ϕ(h, J) = (1/2){h′J⁻¹h − log |J| + n log 2π}.
This is an exponential family model with
θ = (h,−J/2)
t(x) = (x, xx′)
η = (µ,Σ + µµ′)
ϕ(θ) = ϕ(h, J)
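As a numerical sanity check (with illustrative values of h and J), the information-form density with this ϕ(h, J) matches the usual moment-form Gaussian density:

```python
import numpy as np

rng = np.random.default_rng(0)

# A small information-form model x ~ N^{-1}(h, J); values are illustrative.
J = np.array([[2.0, 0.5], [0.5, 1.5]])
h = np.array([1.0, -0.5])
n = len(h)

mu = np.linalg.solve(J, h)   # µ = J⁻¹h
Sigma = np.linalg.inv(J)     # Σ = J⁻¹

# Cumulant function φ(h, J) = ½(h'J⁻¹h − log|J| + n log 2π)
phi = 0.5 * (h @ mu - np.linalg.slogdet(J)[1] + n * np.log(2 * np.pi))

def p_info(x):
    """Information-form density exp{−½x'Jx + h'x − φ(h, J)}."""
    return np.exp(-0.5 * x @ J @ x + h @ x - phi)

def p_moment(x):
    """Standard N(µ, Σ) density for comparison."""
    d = x - mu
    quad = d @ np.linalg.solve(Sigma, d)
    norm = np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

for _ in range(5):
    x = rng.normal(size=n)
    assert np.isclose(p_info(x), p_moment(x))
```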
Example GMRF
(Figure: chain graph x1 — x2 — x3 with node potentials ψ1, ψ2, ψ3 and edge potentials ψ1,2, ψ2,3.)
p(x) ∝ ψ1(x1)ψ2(x2)ψ3(x3)ψ1,2(x1, x2)ψ2,3(x2, x3)
ψ1(x1) = exp{−(1/2) x′1J1,1x1 + h′1x1}
ψ2(x2) = exp{−(1/2) x′2J2,2x2 + h′2x2}
ψ3(x3) = exp{−(1/2) x′3J3,3x3 + h′3x3}
ψ1,2(x1, x2) = exp{−x′1J1,2x2}
ψ2,3(x2, x3) = exp{−x′2J2,3x3}
h = [h1, h2, h3]′,   J =
[ J1,1   J1,2   0    ]
[ J′1,2  J2,2   J2,3 ]
[ 0      J′2,3  J3,3 ]
Information Geometry∗
Based upon the Kullback-Leibler divergence†, a measure of contrast between probability distributions:
D(p‖q) = Ep{log p(x)/q(x)}
Bregman distance in θ based upon ϕ(θ):
D(θ∗‖θ) = ϕ(θ) − ϕ(θ∗) − ∇ϕ(θ∗) · (θ − θ∗)
Legendre transform ϕ∗(η) of ϕ(θ):
ϕ∗(η) = θ(η) · η − ϕ(θ(η))
“Slope transform”
η(θ) = ∂ϕ(θ)/∂θ,   θ(η) = ∂ϕ∗(η)/∂η
Convex bifunction in (η(p), θ(q)),
D(η‖θ) = ϕ∗(η) + ϕ(θ)− η · θ
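The bifunction form can be checked numerically against the closed-form Gaussian KL divergence. The following Python sketch (with illustrative means and covariances) computes D(η(p)‖θ(q)) through ϕ, ϕ∗ and the inner product η · θ:

```python
import numpy as np

def phi(h, J):
    """Cumulant function φ(h, J) = ½(h'J⁻¹h − log|J| + n log 2π)."""
    n = len(h)
    return 0.5 * (h @ np.linalg.solve(J, h)
                  - np.linalg.slogdet(J)[1] + n * np.log(2 * np.pi))

def coords(mu, Sigma):
    """Return θ = (h, −J/2), η = (µ, Σ + µµ'), and (h, J) for a Gaussian."""
    J = np.linalg.inv(Sigma)
    h = J @ mu
    return (h, -J / 2), (mu, Sigma + np.outer(mu, mu)), h, J

def inner(eta, theta):
    # θ · t(x) = h'x − ½x'Jx with t(x) = (x, xx'), so η·θ pairs both parts.
    return theta[0] @ eta[0] + np.sum(theta[1] * eta[1])

# Two illustrative Gaussians p and q.
mu_p, S_p = np.array([1.0, 0.0]), np.array([[1.0, 0.3], [0.3, 2.0]])
mu_q, S_q = np.array([0.0, 1.0]), np.array([[2.0, -0.2], [-0.2, 1.0]])
th_p, eta_p, h_p, J_p = coords(mu_p, S_p)
th_q, eta_q, h_q, J_q = coords(mu_q, S_q)

# φ*(η) = θ(η)·η − φ(θ(η)); then D(η_p‖θ_q) = φ*(η_p) + φ(θ_q) − η_p·θ_q
phi_star = inner(eta_p, th_p) - phi(h_p, J_p)
D_bifunction = phi_star + phi(h_q, J_q) - inner(eta_p, th_q)

# Closed-form Gaussian KL divergence for comparison
n = 2
d = mu_q - mu_p
D_closed = 0.5 * (np.trace(np.linalg.solve(S_q, S_p))
                  + d @ np.linalg.solve(S_q, d) - n
                  + np.linalg.slogdet(S_q)[1] - np.linalg.slogdet(S_p)[1])
assert np.isclose(D_bifunction, D_closed)
```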
∗Chentsov, 72; Csiszar, 75; Efron, 78; Amari, 01.†Kullback and Leibler, 51.
Bregman distance∗
(Figure: the Bregman distance D(θ0‖θ) as the gap between ϕ(θ) and its tangent at θ0.)
∗Bregman, 67.
Triangle Relation
(Figure: the triangle relation among θ0, θ1, θ2 on the graph of ϕ(θ).)
D(θ0‖θ2) = D(θ0‖θ1)+D(θ1‖θ2)+(η1−η0)·(θ2−θ1)
Information Projections
Let F be a regular exponential family with minimal statistics t(x), exponential coordinates Θ, and moment coordinates η(Θ).
M-projection. Let p ∈ F and let H ⊂ F be an e-flat submanifold. There exists a unique q∗ ∈ H satisfying the following equivalent conditions:
(i) D(p‖q∗) = infq∈HD(p‖q)
(ii) ∀q ∈ H : (η(p)−η(q∗)) ·(θ(q)−θ(q∗)) = 0
(iii) ∀q ∈ H : D(p‖q) = D(p‖q∗) +D(q∗‖q)
We call q∗ = arg minq∈H D(p‖q) the m-projection of p to H.
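For Gaussians the Pythagorean condition (iii) is easy to verify numerically. A sketch with H taken as the diagonal-precision (fully disconnected) family, where the m-projection simply matches the mean and the diagonal of the covariance; all numerical values are illustrative:

```python
import numpy as np

def gauss_kl(mu_p, S_p, mu_q, S_q):
    """Closed-form D(p‖q) between two Gaussians."""
    n = len(mu_p)
    d = mu_q - mu_p
    return 0.5 * (np.trace(np.linalg.solve(S_q, S_p))
                  + d @ np.linalg.solve(S_q, d) - n
                  + np.linalg.slogdet(S_q)[1] - np.linalg.slogdet(S_p)[1])

# p: a correlated Gaussian.
mu_p = np.array([0.5, -1.0])
S_p = np.array([[2.0, 0.8], [0.8, 1.0]])

# H: Gaussians with diagonal J (off-diagonal exponential coordinates zero).
# The m-projection q* matches the moments the family can represent:
# the mean and the diagonal of the covariance.
mu_star, S_star = mu_p.copy(), np.diag(np.diag(S_p))

# Any other member q of H.
mu_q, S_q = np.array([1.0, 0.0]), np.diag([3.0, 0.5])

# Pythagorean identity (iii): D(p‖q) = D(p‖q*) + D(q*‖q)
lhs = gauss_kl(mu_p, S_p, mu_q, S_q)
rhs = gauss_kl(mu_p, S_p, mu_star, S_star) + gauss_kl(mu_star, S_star, mu_q, S_q)
assert np.isclose(lhs, rhs)
```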
M-projection
(Figure: m-projection of p to H shown in the Θ and η(Θ) coordinates, with D(p‖q) = D(p‖q∗) + D(q∗‖q).)
∂D(p‖q)/∂θ(q) = η(q) − η(p)
Dual E-projection
E-projection. Let q ∈ F and let H′ ⊂ F be an m-flat submanifold. There exists a unique p∗ ∈ H′ satisfying the following equivalent conditions:
(i) D(p∗‖q) = infp∈H′D(p‖q)
(ii) ∀p ∈ H′ : (η(p)−η(p∗))·(θ(q)−θ(p∗)) = 0
(iii) ∀p ∈ H′ : D(p‖q) = D(p‖p∗) +D(p∗‖q)
We call p∗ = arg minp∈H′ D(p‖q) the e-projection of q to H′.
Duality. Let H and H′ be I-orthogonal submanifolds such that there exists r in their intersection with
∀p ∈ H′, q ∈ H : (η(p) − η(r)) · (θ(q) − θ(r)) = 0
Then r is both the m-projection of p ∈ H′ to H and the e-projection of q ∈ H to H′.
E-projection
(Figure: e-projection of q to H′ shown in the Θ and η(Θ) coordinates, with D(p‖q) = D(p‖p∗) + D(p∗‖q).)
∂D(p‖q)/∂η(p) = θ(p) − θ(q)
Model Thinning
Let t(x) = (tH(x), t′H(x)), θ = (θH, θ′H) and
η = (ηH, η′H).
Objective. M-project p ∈ F to the lower-order exponential family
H = {q ∈ F | θ′H(q) = 0}
Dual Problem. E-project q ∈ H to the m-flat submanifold
H′(p) = {r ∈ F | ηH(r) = ηH(p)}
The latter e-projection problem may be solved by iterative scaling techniques, which adjust the parameters θH(q) until ηH(q) = ηH(p) (moment matching).
For a GMRF x ∼ N⁻¹(h, J), this imposes sparsity on J. Moment matching then gives the classical covariance selection problem (Dempster, 72).
Iterative Scaling
Alternating e-projections onto a set of m-flat submanifolds converges to the e-projection onto their intersection (Csiszar, 75). This is a special case of the method of alternating Bregman projections (Bregman, 67).
Iterative Proportional Fitting.∗ The m-flat submanifolds impose marginal moment constraints specifying the marginal distribution p∗(xC):
ψ(xC) ← ψ(xC) × p∗(xC)/p(xC)
Covariance Selection.† Updates the exponential parameters (hC, JC) to impose the moment constraints (µ∗C, Σ∗C):
JC ← JC + (J∗C − ĴC)
hC ← hC + (h∗C − ĥC)
where (h∗C, J∗C) = ((Σ∗C)⁻¹µ∗C, (Σ∗C)⁻¹) and (ĥC, ĴC) = (Σ⁻¹C µC, Σ⁻¹C) are the target and current marginal information models.
∗Ireland and Kullback, 68. †Speed and Kiiveri, 86.
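A minimal Python sketch of this covariance-selection iteration on a 3-node chain, with illustrative target moments: the clique updates drive the clique marginals to the targets while the pruned precision entry stays zero.

```python
import numpy as np

# Target moments (µ*, Σ*) of some full Gaussian (illustrative values),
# to be matched on the cliques {x1,x2} and {x2,x3} of the chain x1—x2—x3.
mu_t = np.array([1.0, 0.0, -1.0])
S_t = np.array([[2.0, 0.6, 0.3],
                [0.6, 1.5, 0.5],
                [0.3, 0.5, 1.0]])
cliques = [[0, 1], [1, 2]]

# Start from an arbitrary member of the thinned family (tridiagonal J).
J = np.eye(3)
h = np.zeros(3)

for _ in range(50):
    for C in cliques:
        ix = np.ix_(C, C)
        # current marginal information model (ĥ_C, Ĵ_C)
        Sigma = np.linalg.inv(J)
        mu = Sigma @ h
        J_hat = np.linalg.inv(Sigma[ix])
        h_hat = J_hat @ mu[C]
        # target marginal information model (h*_C, J*_C)
        J_star = np.linalg.inv(S_t[ix])
        h_star = J_star @ mu_t[C]
        # covariance-selection update on the clique
        J[ix] += J_star - J_hat
        h[C] += h_star - h_hat

Sigma = np.linalg.inv(J)
mu = Sigma @ h
# Clique marginals now match the target; the pruned entry J[1,3] stays zero.
for C in cliques:
    assert np.allclose(Sigma[np.ix_(C, C)], S_t[np.ix_(C, C)], atol=1e-8)
    assert np.allclose(mu[C], mu_t[C], atol=1e-8)
assert np.isclose(J[0, 2], 0.0)
```

Because the chain is decomposable and the cliques are visited in junction-tree order, the iteration here actually converges after the first sweep; on graphs with cycles the same updates converge only asymptotically.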
Greedy Edge-Removal
Prunes edges from the graphical model by forcing selected off-diagonal entries of J to zero (m-projections implemented by iterative scaling techniques).
Selects weak interactions to prune according to the conditional mutual information
I(xi; xj|x\ij) = −(1/2) log(1 − (det Ji,j)²/(det Ji,i det Jj,j))
which gives a tractable lower-bound estimate of the KL divergence under m-projection.
Selects a batch K ⊂ E of the weakest edges to prune, satisfying
∑(i,j)∈K Ii;j < δ/|K|
Continues thinning until no weak interactions remain relative to δ. Related to the Akaike information criterion (Akaike, 74).
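For scalar-valued vertices the determinants reduce to single matrix entries, so each edge score can be read directly off J. A sketch with an illustrative model:

```python
import numpy as np

def edge_cmi(J, i, j):
    """Conditional mutual information I(x_i; x_j | x_rest) for a scalar GMRF,
    computed directly from precision-matrix entries."""
    rho2 = J[i, j] ** 2 / (J[i, i] * J[j, j])   # squared partial correlation
    return -0.5 * np.log(1.0 - rho2)

# Illustrative 4-node model; edges are the nonzero off-diagonal entries.
J = np.array([[3.0, 1.0, 0.0, 0.1],
              [1.0, 3.0, 0.8, 0.0],
              [0.0, 0.8, 3.0, 1.2],
              [0.1, 0.0, 1.2, 3.0]])
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

# Cross-check one edge against the generic bivariate-Gaussian formula
# applied to the conditional covariance inv(J_{ij,ij}).
i, j = 0, 1
C = np.linalg.inv(J[np.ix_([i, j], [i, j])])
rho2 = C[0, 1] ** 2 / (C[0, 0] * C[1, 1])
assert np.isclose(edge_cmi(J, i, j), -0.5 * np.log(1 - rho2))

# Rank edges from weakest to strongest interaction.
ranked = sorted(edges, key=lambda e: edge_cmi(J, *e))
assert ranked[0] == (0, 3)   # the weakest edge (precision entry 0.1)
```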
Nested Dissection
(1) Vertical cut.
(2) Horizontal cut.
(3) Vertical cut.
(4) Horizontal cut.
Variable Elimination
Integrate over a subset Λ ⊂ V of the random variables:
p(x\Λ) = ∫ p(x) dxΛ
Local parameter update in (h, J) representa-
tion:
h∂Λ ← h∂Λ − J∂Λ,Λ J⁻¹Λ,Λ hΛ
J∂Λ ← J∂Λ − J∂Λ,Λ J⁻¹Λ,Λ JΛ,∂Λ
Eliminates vertices in the graphical model but adds “fill” edges between their neighbors. Only the local parameters and structure on the “boundary” ∂Λ of the subfield are updated.
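The update is a Schur complement. A quick numerical check (illustrative h and J) that it reproduces the moment-form marginal:

```python
import numpy as np

# Information-form model over vertices {0,1,2,3}; eliminate Λ = {2,3}.
J = np.array([[3.0, 1.0, 0.5, 0.0],
              [1.0, 3.0, 0.0, 0.7],
              [0.5, 0.0, 2.0, 0.6],
              [0.0, 0.7, 0.6, 2.0]])
h = np.array([1.0, -1.0, 0.5, 0.0])

keep, elim = [0, 1], [2, 3]
Jkk = J[np.ix_(keep, keep)]
Jke = J[np.ix_(keep, elim)]
Jee = J[np.ix_(elim, elim)]

# Local update on the boundary ∂Λ (here all retained vertices border Λ):
#   h_∂Λ ← h_∂Λ − J_∂Λ,Λ J_Λ,Λ⁻¹ h_Λ,   J_∂Λ ← J_∂Λ − J_∂Λ,Λ J_Λ,Λ⁻¹ J_Λ,∂Λ
h_marg = h[keep] - Jke @ np.linalg.solve(Jee, h[elim])
J_marg = Jkk - Jke @ np.linalg.solve(Jee, Jke.T)

# Cross-check against marginalizing in moment form.
Sigma = np.linalg.inv(J)
mu = Sigma @ h
assert np.allclose(np.linalg.inv(J_marg), Sigma[np.ix_(keep, keep)])
assert np.allclose(np.linalg.solve(J_marg, h_marg), mu[keep])
```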
Cavity Models (Initialization)
(1) Partial model of subfield (zero boundary).
(2) Elimination gives model of surface.
(3) Model thinning gives “cavity model”.
“Upwards” Cavity Modeling
(1) Initialization.
(2) Merge. (3) Eliminate.
(4) Thin.
“Downwards” Blanket Modeling
(1) Initialization.
(2) Merge. (3) Eliminate.
(4) Thin.
Conclusion
RCM appears to provide a powerful and flexible framework for tractable yet near-optimal computation in MRFs.
Much work remains to better characterize performance and explore promising extensions:
• Develop information geometry of RCM.
• Consider more general families of graphical
models.
• Employ alternative modeling techniques.
• Applications
– Model Identification
– Image Processing
– Data Compression and Coding
– Monte-Carlo Simulation
References

Akaike, 74. A new look at the statistical model identification. IEEE Trans. Auto. Control, AC-19:716-723.
Amari, 01. Information geometry of hierarchy of probability distributions. IEEE Trans. Inf. Theory, 47(5):1701-1711.
Barndorff-Nielsen, 78. Information and Exponential Families. John Wiley.
Bregman, 67. The relaxation method of finding the common point of convex sets. USSR Comp. Math. and Physics, 7:200-217.
Chentsov, 66. A systematic theory of exponential families. Theory of Prob. and Appl., 11.
Chentsov, 72. Statistical decision rules and optimal inference. AMS Trans. Math. Mono., v.53 (reprint 82).
Csiszar, 75. I-divergence geometry of probability distributions and minimization problems. Annals of Prob., 3(1):146-158.
Dempster, 72. Covariance selection. Biometrics, 28(1):157-175.
Efron, 78. The geometry of exponential families. Annals of Stat., 6(2):362-376.
Grimmett, 73. A theorem about random fields. Bull. of London Math. Soc., 5:81-84.
Ireland and Kullback, 68. Contingency tables with given marginals. Biometrika, 55:179-188.
Jordan (editor), 99. Learning in Graphical Models. MIT Press.
Kullback and Leibler, 51. On information and sufficiency. Annals of Math. Stat., 22(1):79-86.
Lauritzen, 96. Graphical Models. Oxford University Press.
Speed and Kiiveri, 86. Gaussian Markov distributions over finite graphs. Annals of Stat., 14(1):138-150.