Andrew Rosenberg- Lecture 9: Undirected Graphical Models Machine Learning


Lecture 9: Undirected Graphical Models

Machine Learning

Andrew Rosenberg

March 5, 2010


Today

Graphical Models

Probabilities in Undirected Graphs


Undirected Graphs

What if we allow undirected graphs? What do they correspond to?

It's not cause/effect or trigger/response, but rather general dependence.

Example: image pixels, where each pixel is a Bernoulli variable. We can have a probability over all pixels, $p(x_{11}, \ldots, x_{1M}, \ldots, x_{M1}, \ldots, x_{MM})$.

Bright pixels have bright neighbors.

No parents, just probabilities.

Grid models are called Markov Random Fields.



Undirected Graphs

$x \perp y \mid \{w, z\}$

$w \perp z \mid \{x, y\}$

$x \perp y \mid \{w\}$

cannot represent $x \perp y \mid \{w, z\}$

[Figure: two four-node graphs over w, x, y, z]

Undirected separation is easy.

To check $x_a \perp x_b \mid x_c$, check graph reachability between $x_a$ and $x_b$ without going through nodes in $x_c$.
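The reachability test can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `separated` and the diamond-shaped edge set (w-x, w-y, x-z, y-z) are assumptions chosen to match the four-node example, and the function assumes neither endpoint is itself in the conditioning set.

```python
from collections import deque

def separated(adj, a, b, given):
    """Return True if every path from a to b passes through a node in `given`.

    Assumes a and b are not themselves members of `given`.
    """
    blocked = set(given)
    seen = {a}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return False  # found a path that avoids the conditioning set
        for nbr in adj[node]:
            if nbr not in seen and nbr not in blocked:
                seen.add(nbr)
                queue.append(nbr)
    return True

# Hypothetical diamond graph: w-x, w-y, x-z, y-z (edges are an assumption)
adj = {
    "w": {"x", "y"},
    "x": {"w", "z"},
    "y": {"w", "z"},
    "z": {"x", "y"},
}

print(separated(adj, "x", "y", {"w", "z"}))  # x and y are cut off from each other
print(separated(adj, "x", "y", {"w"}))       # the path x-z-y is still open
```

A breadth-first search is used here only because it is short; any graph traversal that skips the conditioning nodes would do.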


Undirected Graphs

$x \perp y \mid z$ OR $x \perp z$

$x \perp y$

$x \perp y \mid z$

[Figure: three-node graphs over x, y, z]


Probabilities in Undirected Graphs


Graph cliques define clusters of dependent variables

[Figure: undirected graph over nodes a, b, c, d, e, f]

Clique: a set of nodes such that there is an edge between every pair of members of the set.

We define probability as a product of functions defined over cliques.
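To make the clique definition concrete, here is a small Python sketch that tests whether a node set is a clique and enumerates maximal cliques by brute force. The edge list over nodes a through f is hypothetical (the figure's actual edges are not recoverable), and the helper names `is_clique` and `maximal_cliques` are illustrative only. The enumeration is exponential in the number of nodes, so it is suitable only for tiny graphs.

```python
from itertools import combinations

def is_clique(adj, nodes):
    """True if every pair of nodes in `nodes` is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))

def maximal_cliques(adj):
    """Brute-force list of cliques that are not contained in a larger clique."""
    nodes = sorted(adj)
    cliques = [
        set(c)
        for r in range(1, len(nodes) + 1)
        for c in combinations(nodes, r)
        if is_clique(adj, c)
    ]
    return [c for c in cliques if not any(c < other for other in cliques)]

# Hypothetical edges; the lecture's figure may differ.
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")]
adj = {n: set() for n in "abcdef"}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

for clique in maximal_cliques(adj):
    print(sorted(clique))
```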


Representing Probabilities

Potential Functions over cliques

$$p(x) = p(x_0, \ldots, x_{n-1}) = \frac{1}{Z} \prod_{c \in C} \psi_c(x_c)$$

Normalizing Term guarantees that p(x) sums to 1

$$Z = \sum_{x} \prod_{c \in C} \psi_c(x_c)$$

Potential Functions are positive functions over groups of connected variables.

Use only maximal cliques, e.g. $\psi(x_1, x_2, x_3)\,\psi(x_2, x_3) \rightarrow \psi(x_1, x_2, x_3)$
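As a rough illustration of the product-of-potentials definition, the following Python sketch builds a three-variable binary chain, computes the normalizing term by summing over all joint states, and evaluates the resulting distribution. The chain structure, the potential `psi_agree` (which favors neighbors that take the same value), and all other names are assumptions made for the example.

```python
from itertools import product

def psi_agree(u, v):
    """Hypothetical pairwise potential: neighbors prefer equal values."""
    return 2.0 if u == v else 1.0

# Chain x0 - x1 - x2; its maximal cliques are the two edges.
cliques = {(0, 1): psi_agree, (1, 2): psi_agree}

def unnormalized(x):
    """Product of clique potentials for one joint assignment x."""
    score = 1.0
    for idx, psi in cliques.items():
        score *= psi(*(x[i] for i in idx))
    return score

states = list(product([0, 1], repeat=3))
Z = sum(unnormalized(x) for x in states)  # normalizing term

def p(x):
    return unnormalized(x) / Z

print(Z)
print(p((0, 0, 0)), p((0, 1, 0)))
```

Note that the sum defining Z already visits every joint state, which is the exponential cost that later motivates message passing.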


Logical Inference

[Figure: logic network over nodes a, b, c, d, e with NOT, XOR, and AND gates]

In Logic Networks, nodes are binary, and edges represent gates

Gates: AND, OR, XOR, NAND, NOR, NOT, etc.

Inference: given observed variables, predict others.

Problems: uncertainty, conflicts, and inconsistency.

Rather than saying a variable is True or False, let's say it is .8 True and .2 False.

Probabilistic Inference


Inference


[Figure: network over nodes a, b, c, d, e with NOT, XOR, and AND gates]

Replace the logic network with a Bayesian Network

Probabilistic Inference: given observed variables, predict marginals over others.

Not     b=t   b=f
a=t     0     1
a=f     1     0


Soft Not   b=t   b=f
a=t        .1    .9
a=f        .9    .1
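A short Python sketch shows how a table like this turns a belief about a into a marginal over b. It assumes a is .8 True and .2 False and treats each table row as $p(b \mid a)$; the function name `marginal_b` and the dictionary layout are illustrative choices, not part of the lecture.

```python
# Conditional tables keyed by (a, b); each row over b sums to 1.
hard_not = {("t", "t"): 0.0, ("t", "f"): 1.0, ("f", "t"): 1.0, ("f", "f"): 0.0}
soft_not = {("t", "t"): 0.1, ("t", "f"): 0.9, ("f", "t"): 0.9, ("f", "f"): 0.1}

def marginal_b(p_a_true, cpt):
    """p(b) = sum over a of p(a) * p(b | a)."""
    p_a = {"t": p_a_true, "f": 1.0 - p_a_true}
    return {b: sum(p_a[a] * cpt[(a, b)] for a in "tf") for b in "tf"}

# Assumed belief about a: .8 True, .2 False
print(marginal_b(0.8, hard_not))
print(marginal_b(0.8, soft_not))
```

With the hard gate the belief simply flips; with the soft gate it is pulled toward .5, reflecting the gate's own uncertainty.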


Inference

General Problem

Given a graph and probabilities, for any subset of variables, find

$$p(x_e \mid x_o) = \frac{p(x_e, x_o)}{p(x_o)}$$

Compute both marginals and divide.

But this can be exponential (based on the number of parents each node has, or the size of the cliques).

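$$p(x_j, x_k) = \sum_{x_0} \sum_{x_1} \cdots \sum_{x_{M-1}} \prod_{i=0}^{M-1} p(x_i \mid \pi_i)$$

$$p(x_j, x_k) = \frac{1}{Z} \sum_{x_0} \sum_{x_1} \cdots \sum_{x_{M-1}} \prod_{c \in C} \psi_c(x_c)$$

Here the sums run over every variable except $x_j$ and $x_k$, and $\pi_i$ denotes the parents of $x_i$.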

We have efficient learning and storage in graphical models; now we turn to inference.


Inefficient Marginals

Brute Force.

Given CPTs and a graph structure, we can compute arbitrary marginals by brute force, but it's inefficient.

For Example

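$$p(x) = p(x_0)\,p(x_1 \mid x_0)\,p(x_2 \mid x_0)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_1, x_4)$$

$$p(x_0, x_2) = p(x_0)\,p(x_2 \mid x_0)$$

$$p(x_0, x_5) = \sum_{x_1, x_2, x_3, x_4} p(x_0)\,p(x_1 \mid x_0)\,p(x_2 \mid x_0)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_1, x_4)$$

$$p(x_0 \mid x_5) = \frac{\sum_{x_1, x_2, x_3, x_4} p(x)}{\sum_{x_0, x_1, x_2, x_3, x_4} p(x)}$$

$$p(x_0 \mid x_5 = \text{TRUE}) = \frac{\sum_{x_1, x_2, x_3, x_4} p(x_{U \setminus 5} \mid x_5 = \text{TRUE})}{\sum_{x_0, x_1, x_2, x_3, x_4} p(x_{U \setminus 5} \mid x_5 = \text{TRUE})}$$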
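The brute-force computation can be sketched in Python for a six-variable binary network. Everything numeric here is made up: the CPT entries are random numbers drawn with a fixed seed, and the parent set used for $x_5$, namely $(x_1, x_4)$, is an assumption of this sketch. The helper names `joint`, `marginal`, and `conditional` are illustrative.

```python
import random
from itertools import product

random.seed(0)

# Assumed structure: parents of each node (x5's parents are an assumption).
parents = {0: (), 1: (0,), 2: (0,), 3: (1,), 4: (2,), 5: (1, 4)}

# cpt[i][parent_values] = p(x_i = 1 | parents); entries are random placeholders.
cpt = {
    i: {pv: random.random() for pv in product([0, 1], repeat=len(ps))}
    for i, ps in parents.items()
}

def joint(x):
    """p(x) as the product of each node's conditional given its parents."""
    prob = 1.0
    for i, ps in parents.items():
        p1 = cpt[i][tuple(x[j] for j in ps)]
        prob *= p1 if x[i] == 1 else 1.0 - p1
    return prob

def marginal(fixed):
    """Sum the joint over all variables not pinned in `fixed` (index -> value)."""
    return sum(
        joint(x)
        for x in product([0, 1], repeat=6)
        if all(x[i] == v for i, v in fixed.items())
    )

def conditional(query, evidence):
    """Compute both marginals and divide."""
    return marginal({**query, **evidence}) / marginal(evidence)

print(conditional({0: 1}, {5: 1}))  # p(x0 = 1 | x5 = TRUE)
```

Each call to `marginal` visits all 64 joint states; with M binary variables the loop grows as 2^M, which is the inefficiency the slide refers to.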


Efficient Computation of Marginals

[Figure: graph over nodes a, b, c, d, e]

Pass messages (small tables) around the graph.

The messages will be small functions that propagate potentials around an undirected graphical model.

The inference technique is the Junction Tree Algorithm.
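To give a feel for what "passing small tables" means, here is a Python sketch of message passing on a simple chain. This is not the Junction Tree Algorithm itself, only the special case of a chain of five binary nodes; the chain structure, the unary potential `phi` on the first node, and the pairwise potential `psi` are all assumptions made for the example. The result is checked against brute-force enumeration.

```python
from itertools import product

states = [0, 1]
n = 5  # hypothetical chain a - b - c - d - e

phi = {0: 1.0, 1: 3.0}  # assumed unary potential on the first node

def psi(u, v):
    """Assumed pairwise potential: neighbors prefer equal values."""
    return 2.0 if u == v else 1.0

def marginal_last_by_messages():
    """Pass a two-entry table along the chain, summing out one node per step."""
    msg = dict(phi)
    for _ in range(n - 1):
        msg = {v: sum(msg[u] * psi(u, v) for u in states) for v in states}
    Z = sum(msg.values())
    return {v: msg[v] / Z for v in states}

def marginal_last_brute_force():
    """Same marginal by enumerating all 2**n joint states."""
    totals = {s: 0.0 for s in states}
    for x in product(states, repeat=n):
        weight = phi[x[0]]
        for i in range(n - 1):
            weight *= psi(x[i], x[i + 1])
        totals[x[-1]] += weight
    Z = sum(totals.values())
    return {s: t / Z for s, t in totals.items()}

print(marginal_last_by_messages())
print(marginal_last_brute_force())
```

The message version does a fixed amount of work per edge, while the brute-force version doubles with every added node.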


Junction Tree Algorithm




Efficient message passing on undirected graphs.

For directed graphs, first convert to an undirected graph (moralization).


Moralization

Introduce Evidence

Triangulate

Construct Junction Tree

Propagate Probabilities
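The first step, moralization, can be sketched briefly in Python: connect ("marry") every pair of parents that share a child, then drop edge directions. The function name `moralize` and the three-node example are assumptions for illustration; the remaining steps (evidence, triangulation, junction tree construction, propagation) are not shown.

```python
from itertools import combinations

def moralize(parents):
    """Return an undirected adjacency map for a DAG given as node -> parents."""
    adj = {v: set() for v in parents}
    for child, ps in parents.items():
        # Keep each parent-child edge, without direction.
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        # Marry every pair of parents of the same child.
        for u, v in combinations(ps, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj

# Hypothetical v-structure: x -> z <- y
dag = {"x": (), "y": (), "z": ("x", "y")}
moral = moralize(dag)
print({node: sorted(nbrs) for node, nbrs in moral.items()})
```

After moralization, each node and its parents form a clique, so every conditional probability table fits inside a single clique potential.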


Bye

Next

