Andrew Rosenberg- Lecture 9: Undirected Graphical Models Machine Learning
8/3/2019 Andrew Rosenberg- Lecture 9: Undirected Graphical Models Machine Learning
Lecture 9: Undirected Graphical Models
Machine Learning
Andrew Rosenberg
March 5, 2010
Today
Graphical Models
Probabilities in Undirected Graphs
Undirected Graphs
What if we allow undirected graphs? What do they correspond to?
It's not cause/effect or trigger/response; rather, general dependence.
Example: Image pixels, where each pixel is a Bernoulli variable. Can have a probability over all pixels $p(x_{11}, \ldots, x_{1M}, \ldots, x_{M1}, \ldots, x_{MM})$
Bright pixels have bright neighbors.
No parents, just probabilities.
Grid models are called Markov Random Fields.
Undirected Graphs
$x \perp y \mid \{w, z\}$
$w \perp z \mid \{x, y\}$
$x \perp y \mid \{w\}$
cannot represent $x \perp y \mid \{w, z\}$
[Figure: two graphs, each over the nodes w, x, y, z]
Undirected separation is easy.
To check $x_a \perp x_b \mid x_c$, check graph reachability of $x_a$ and $x_b$ without going through nodes in $x_c$. If $x_b$ cannot be reached from $x_a$, the independence holds.
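The reachability check can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the function name `is_separated`, the adjacency-dictionary representation, and the four-node cycle `diamond` are all assumptions made for the example.

```python
from collections import deque

def is_separated(adj, a, b, given):
    """Return True if every path from a to b passes through a node in `given`."""
    blocked = set(given)
    if a in blocked or b in blocked:
        raise ValueError("query nodes must not be in the conditioning set")
    seen = {a}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return False  # reached b without crossing the conditioning set
        for nbr in adj[node]:
            if nbr not in seen and nbr not in blocked:
                seen.add(nbr)
                queue.append(nbr)
    return True

# Hypothetical undirected 4-cycle: x - w - y - z - x
diamond = {
    "x": {"w", "z"},
    "w": {"x", "y"},
    "y": {"w", "z"},
    "z": {"x", "y"},
}

print(is_separated(diamond, "x", "y", {"w", "z"}))  # both routes blocked
print(is_separated(diamond, "x", "y", {"w"}))       # route through z stays open
```

Breadth-first search is used here only for simplicity; any graph traversal that skips the conditioning nodes would do.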
Undirected Graphs
$x \perp y \mid z$ OR $x \perp z$
$x \perp y$
$x \perp y \mid z$
[Figure: three-node graphs over x, y, z]
Probabilities in Undirected Graphs
Graph cliques define clusters of dependent variables
[Figure: undirected graph over the nodes a, b, c, d, e, f]
Clique: a set of nodes such that there is an edge between every pair of members of the set.
We define probability as a product of functions defined over cliques.
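The clique definition translates directly into a pairwise edge test. The sketch below is illustrative only: the edge set of `graph` is a hypothetical example over the node names a to f (the slide's actual edges are not recoverable here), and `is_clique` and `is_maximal_clique` are names chosen for this example.

```python
from itertools import combinations

def is_clique(adj, nodes):
    """True if every pair of distinct nodes in `nodes` is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))

def is_maximal_clique(adj, nodes):
    """True if `nodes` is a clique and no outside node is adjacent to all of it."""
    nodes = set(nodes)
    if not is_clique(adj, nodes):
        return False
    return not any(nodes <= adj[v] for v in adj if v not in nodes)

# Hypothetical edges: a-b, a-d, b-d, b-c, b-e, c-e, e-f
graph = {
    "a": {"b", "d"},
    "b": {"a", "c", "d", "e"},
    "c": {"b", "e"},
    "d": {"a", "b"},
    "e": {"b", "c", "f"},
    "f": {"e"},
}

print(is_clique(graph, {"a", "b", "d"}))          # a triangle
print(is_maximal_clique(graph, {"b", "c"}))       # can be extended by e
print(is_maximal_clique(graph, {"b", "c", "e"}))  # cannot be extended
```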
Representing Probabilities
Potential Functions over cliques
$$p(x) = p(x_0, \ldots, x_{n-1}) = \frac{1}{Z} \prod_{c \in C} \psi_c(x_c)$$
Normalizing term guarantees that p(x) sums to 1
$$Z = \sum_{x} \prod_{c \in C} \psi_c(x_c)$$
Potential functions are positive functions over groups of connected variables.
Use only maximal cliques, e.g. $\psi(x_1, x_2, x_3)\,\psi(x_2, x_3) \rightarrow \psi(x_1, x_2, x_3)$
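A small numeric sketch may help make the product of potentials and the normalizing term concrete. Everything specific here is an assumption for illustration: a three-node binary chain x0 - x1 - x2 whose maximal cliques are its two edges, and a made-up potential `agree` that favors equal neighbors.

```python
from itertools import product

def agree(u, v):
    """Hypothetical positive potential: neighbors that match score higher."""
    return 2.0 if u == v else 1.0

# Each entry pairs a clique (tuple of variable indices) with its potential.
cliques = [((0, 1), agree), ((1, 2), agree)]

def unnormalized(x):
    """Product of clique potentials for one full assignment x."""
    value = 1.0
    for idx, psi in cliques:
        value *= psi(*(x[i] for i in idx))
    return value

states = list(product([0, 1], repeat=3))

# Z sums the unnormalized product over every joint assignment.
Z = sum(unnormalized(x) for x in states)

# Dividing by Z turns the products into a proper distribution.
p = {x: unnormalized(x) / Z for x in states}

print(Z)
print(sum(p.values()))
```

Note that computing Z this way enumerates all 2^n assignments, which is only practical for toy models.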
Logical Inference
[Figure: logic network over the nodes a, b, c, d, e with NOT, XOR, and AND gates]
In Logic Networks, nodes are binary, and edges represent gates
Gates: AND, OR, XOR, NAND, NOR, NOT, etc.
Inference: given observed variables, predict others.
Problems: Uncertainty, conflicts and inconsistency
Rather than saying a variable is True or False, let's say it is .8 True and .2 False.
Probabilistic Inference
Inference
Probabilistic Inference
[Figure: network over the nodes a, b, c, d, e with NOT, XOR, and AND gates]
Replace the logic network with a Bayesian Network
Probabilistic Inference: given observed variables, predict marginals over others.
Not        b=t   b=f
a=t        0     1
a=f        1     0
Soft Not   b=t   b=f
a=t        .1    .9
a=f        .9    .1
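As a rough illustration of what the Soft Not table buys us, the sketch below reads each row as p(b | a) and combines it with a belief over a. The prior of .8 True for a is borrowed from the earlier ".8 True and .2 False" remark; the function names `marginal_b` and `posterior_a` are assumptions for this example.

```python
# Soft Not table, read as p(b | a): outer key is a, inner key is b.
soft_not = {
    True: {True: 0.1, False: 0.9},
    False: {True: 0.9, False: 0.1},
}

def marginal_b(p_a_true):
    """p(b), obtained by summing p(a) * p(b | a) over both values of a."""
    p_a = {True: p_a_true, False: 1.0 - p_a_true}
    return {b: sum(p_a[a] * soft_not[a][b] for a in (True, False))
            for b in (True, False)}

def posterior_a(p_a_true, b_obs):
    """p(a | b = b_obs), via Bayes' rule on the two-node network a -> b."""
    p_a = {True: p_a_true, False: 1.0 - p_a_true}
    joint = {a: p_a[a] * soft_not[a][b_obs] for a in (True, False)}
    total = sum(joint.values())
    return {a: joint[a] / total for a in joint}

print(marginal_b(0.8))         # belief over b before observing anything
print(posterior_a(0.8, True))  # belief over a after observing b = True
```

With the hard Not table (entries 0 and 1) the posterior would collapse to certainty, and an observation that contradicts the gate would have probability zero; the soft table keeps such conflicts representable.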
Inference
General Problem
Given a graph and probabilities, for any subset of variables, find
$$p(x_e \mid x_o) = \frac{p(x_e, x_o)}{p(x_o)}$$
Compute both marginals and divide.
But this can be exponential... (based on the number of parents each node has, or the size of the cliques)
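$$p(x_j, x_k) = \sum_{x_0} \sum_{x_1} \cdots \sum_{x_{M-1}} \prod_{i=0}^{M-1} p(x_i \mid \pi_i)$$
$$p(x_j, x_k) = \sum_{x_0} \sum_{x_1} \cdots \sum_{x_{M-1}} \frac{1}{Z} \prod_{c \in C} \psi_c(x_c)$$
(Here $\pi_i$ denotes the parents of $x_i$, and the sums run over every variable other than $x_j$ and $x_k$.)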
We have efficient learning and storage in Graphical Models; now we need efficient inference.
Inefficient Marginals
Brute Force.
Given CPTs and a graph structure we can compute arbitrary marginals by brute force, but it's inefficient.
For Example
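$$p(x) = p(x_0)\,p(x_1 \mid x_0)\,p(x_2 \mid x_0)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_1, x_4)$$
$$p(x_0, x_2) = p(x_0)\,p(x_2 \mid x_0)$$
$$p(x_0, x_5) = \sum_{x_1, x_2, x_3, x_4} p(x_0)\,p(x_1 \mid x_0)\,p(x_2 \mid x_0)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_1, x_4)$$
$$p(x_0 \mid x_5) = \frac{\sum_{x_1, x_2, x_3, x_4} p(x)}{\sum_{x_0, x_1, x_2, x_3, x_4} p(x)}$$
$$p(x_0 \mid x_5 = \text{TRUE}) = \frac{\sum_{x_1, x_2, x_3, x_4} p(x_{U \setminus 5} \mid x_5 = \text{TRUE})}{\sum_{x_0, x_1, x_2, x_3, x_4} p(x_{U \setminus 5} \mid x_5 = \text{TRUE})}$$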
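The brute-force recipe can be sketched for a six-node binary network like the one in the example. Two things are assumptions here, not taken from the slides: the last factor is read as p(x5 | x1, x4), since a node cannot be its own parent, and the CPT values produced by `p_one` are invented purely so the code has numbers to sum.

```python
from itertools import product

# Parent sets of the six binary nodes (assumed reading of the factorization).
PARENTS = {0: (), 1: (0,), 2: (0,), 3: (1,), 4: (2,), 5: (1, 4)}

def p_one(parent_values):
    """Hypothetical CPT: probability that a node equals 1 given its parents."""
    if not parent_values:
        return 0.3
    return 0.2 + 0.6 * sum(parent_values) / len(parent_values)

def joint(x):
    """p(x) as the product of one conditional per node."""
    prob = 1.0
    for i, pars in PARENTS.items():
        p1 = p_one(tuple(x[j] for j in pars))
        prob *= p1 if x[i] == 1 else 1.0 - p1
    return prob

def marginal(fixed):
    """Sum the joint over every assignment consistent with `fixed` (index -> value)."""
    return sum(
        joint(x)
        for x in product((0, 1), repeat=len(PARENTS))
        if all(x[i] == v for i, v in fixed.items())
    )

def conditional(target, evidence):
    """p(target | evidence) as a ratio of two brute-force marginals."""
    return marginal({**target, **evidence}) / marginal(evidence)

print(marginal({0: 1, 2: 1}))       # equals p(x0 = 1) * p(x2 = 1 | x0 = 1)
print(conditional({0: 1}, {5: 1}))  # p(x0 = 1 | x5 = 1)
```

Each call to `marginal` walks all 2^6 = 64 assignments; the count doubles with every added binary variable, which is the inefficiency the slide refers to.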
Efficient Computation of Marginals
[Figure: graph over the nodes a, b, c, d, e]
Pass messages (small tables) around the graph.
The messages will be small functions that propagate potentials around an undirected graphical model.
The inference technique is the Junction Tree Algorithm
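To give a feel for messages as small tables, here is a sketch of message passing on the simplest possible case, a chain with pairwise potentials. This is not the Junction Tree Algorithm itself, only the chain special case of the same idea; the function names and the potential tables are assumptions made up for the example.

```python
from itertools import product

def chain_marginal(tables, k, n_states=2):
    """Marginal of node k in a chain where tables[i][u][v] scores (x_i = u, x_{i+1} = v)."""
    n = len(tables) + 1
    states = range(n_states)
    fwd = [1.0] * n_states
    for i in range(k):  # messages flowing left to right, ending at node k
        fwd = [sum(fwd[u] * tables[i][u][v] for u in states) for v in states]
    bwd = [1.0] * n_states
    for i in range(n - 2, k - 1, -1):  # messages flowing right to left, ending at node k
        bwd = [sum(tables[i][u][v] * bwd[v] for v in states) for u in states]
    unnorm = [fwd[s] * bwd[s] for s in states]
    z = sum(unnorm)
    return [w / z for w in unnorm]

def brute_marginal(tables, k, n_states=2):
    """Same marginal by summing over every joint assignment, for comparison."""
    n = len(tables) + 1
    unnorm = [0.0] * n_states
    for x in product(range(n_states), repeat=n):
        weight = 1.0
        for i, table in enumerate(tables):
            weight *= table[x[i]][x[i + 1]]
        unnorm[x[k]] += weight
    z = sum(unnorm)
    return [w / z for w in unnorm]

# Hypothetical potentials for a three-node chain x0 - x1 - x2.
tables = [[[3.0, 1.0], [1.0, 2.0]], [[2.0, 1.0], [1.0, 2.0]]]

print(chain_marginal(tables, 1))
print(brute_marginal(tables, 1))
```

Each message is a table with one entry per state of a single node, so the cost grows linearly with chain length instead of exponentially.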
Junction Tree Algorithm
Efficient Message Passing on Undirected Graphs.
For Directed Graphs, first convert to an Undirected Graph (Moralization).
Junction Tree Algorithm
Moralization
Introduce Evidence
Triangulate
Construct Junction Tree
Propagate Probabilities
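Of the steps listed, moralization is simple enough to sketch directly: connect ("marry") every pair of parents that share a child, then drop edge directions. The function name `moralize`, the parent-dictionary representation, and the three-node example are assumptions for illustration; the remaining steps are not shown.

```python
from itertools import combinations

def moralize(parents):
    """Build the moral graph from a mapping of node -> iterable of its parents."""
    adj = {v: set() for v in parents}
    for child, pars in parents.items():
        for p in pars:
            adj.setdefault(p, set())  # tolerate parents that have no entry of their own
            adj[child].add(p)         # keep the original edge, now undirected
            adj[p].add(child)
        for p, q in combinations(pars, 2):
            adj[p].add(q)             # marry co-parents of the same child
            adj[q].add(p)
    return adj

# Hypothetical v-structure: a -> c <- b
v_structure = {"a": (), "b": (), "c": ("a", "b")}
moral = moralize(v_structure)

print(sorted(moral["a"]))  # a gains an edge to b because they share the child c
```

Marrying the parents ensures that each conditional p(child | parents) lives inside a single clique of the undirected graph, so it can serve as (part of) a clique potential.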
Bye
Next
Junction Tree Algorithm