Recall the chain rule: P(v1 and v2 and ~v3 and v4 and ~v5) =...


Page 1:

Recall the chain rule:

P(v1 and v2 and ~v3 and v4 and ~v5)

= P(v1)P(v2|v1)P(~v3|v1,v2)P(v4|v1,v2,~v3)P(~v5|v1,v2,~v3,v4)

(The partial products build up the joint term by term: P(v1,v2), P(v1,v2,~v3), P(v1,v2,~v3,v4), P(v1,v2,~v3,v4,~v5).)

This is one factorization ordering. An alternative ordering:

P(v1 and v2 and ~v3 and v4 and ~v5)

= P(v4)P(v2|v4)P(~v3|v4,v2)P(v1|v4,v2,~v3)P(~v5|v4,v2,~v3,v1)

Assume each Vi is a binary-valued variable (T or F).
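Both orderings yield the same joint probability. A quick numerical sanity check; the three-variable joint below is made up for illustration (any joint over binary variables would do):

```python
# Sanity check of the chain rule on a random joint over three binary variables.
import itertools
import random

random.seed(0)

# A made-up joint distribution over (v1, v2, v3).
weights = {bits: random.random() for bits in itertools.product([True, False], repeat=3)}
total = sum(weights.values())
joint = {bits: w / total for bits, w in weights.items()}

def p(**fixed):
    """Marginal probability that the named variables take the given values."""
    names = ["v1", "v2", "v3"]
    return sum(pr for bits, pr in joint.items()
               if all(bits[names.index(k)] == v for k, v in fixed.items()))

# Chain rule: P(v1, v2, ~v3) = P(v1) P(v2|v1) P(~v3|v1,v2)
lhs = p(v1=True, v2=True, v3=False)
rhs = (p(v1=True)
       * (p(v1=True, v2=True) / p(v1=True))
       * (p(v1=True, v2=True, v3=False) / p(v1=True, v2=True)))
assert abs(lhs - rhs) < 1e-9

# An alternative ordering (v3, v2, v1) gives the same value:
rhs2 = (p(v3=False)
        * (p(v3=False, v2=True) / p(v3=False))
        * (p(v3=False, v2=True, v1=True) / p(v3=False, v2=True)))
assert abs(lhs - rhs2) < 1e-9
```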

Page 2:

Consider an ordering:

P(v1 and v2 and ~v3 and v4 and ~v5)

= P(v1)P(v2|v1)P(~v3|v1,v2)P(v4|v1,v2,~v3)P(~v5|v1,v2,~v3,v4)

Assume the following conditional independencies for this factorization ordering:

P(v2|v1) = P(v2), and P(v2|~v1) = P(v2), P(~v2|v1) = P(~v2), P(~v2|~v1) = P(~v2) (that is, v2 is independent of v1)

P(~v3|v1,v2) = P(~v3|v1), and P(~v3|v1,~v2) = P(~v3|v1), P(~v3|~v1,v2) = P(~v3|~v1), P(~v3|~v1,~v2) = P(~v3|~v1), P(v3|v1,v2) = P(v3|v1), P(v3|v1,~v2) = P(v3|v1), P(v3|~v1,v2) = P(v3|~v1), P(v3|~v1,~v2) = P(v3|~v1) (that is, v3 is independent of v2 conditioned on v1)

P(v4|v1,v2,~v3) = P(v4|v2,~v3), and likewise for the other value combinations

P(~v5|v1,v2,~v3,v4) = P(~v5|~v3), and likewise for the other value combinations

Page 3:

Then P(v1 and v2 and ~v3 and v4 and ~v5)

= P(v1)P(v2|v1)P(~v3|v1,v2)P(v4|v1,v2,~v3)P(~v5|v1,v2,~v3,v4)

= P(v1)P(v2)P(~v3|v1)P(v4|v2,~v3)P(~v5|~v3)

How many probabilities need be stored?

P(v1), P(~v1): 2 probabilities (actually only one, since P(~v1) = 1 - P(v1))

P(v2), P(~v2), since P(v2|v1) = P(v2) etc.: 2 probabilities (or 1) instead of 4 (or 2)

P(~v3|v1) etc., since P(~v3|v1,v2) = P(~v3|v1) etc. (note P(~v3|v1) = 1 - P(v3|v1)): 4 probabilities (or 2) instead of 8 (or 4)

P(v4|v2,~v3) etc., since P(v4|v1,v2,~v3) = P(v4|v2,~v3) etc.: 8 probabilities (or 4) instead of 16 (or 8)

P(~v5|~v3) etc., since P(~v5|v1,v2,~v3,v4) = P(~v5|~v3) etc.: 4 probabilities (or 2) instead of 32 (or 16)
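These counts follow mechanically from a simple rule: a binary node with k binary parents needs 2^k rows of 2 probabilities each, halved if we exploit P(~x|...) = 1 - P(x|...). A counting sketch (the rule itself, not the slide, supplies the formula):

```python
# Tallying stored probabilities for binary variables.

def table_size(n_parents, exploit_complement=True):
    """Stored probabilities for a binary node with n_parents binary parents."""
    rows = 2 ** n_parents                       # one row per parent-value combination
    return rows * (1 if exploit_complement else 2)

# Under the assumed independencies: v1 and v2 have no parents,
# v3 has parent v1, v4 has parents v2 and v3, v5 has parent v3.
bn = [table_size(k) for k in (0, 0, 1, 2, 1)]
# Full chain rule: the i-th variable conditions on all its predecessors.
full = [table_size(k) for k in (0, 1, 2, 3, 4)]

print(sum(bn), "instead of", sum(full))         # 10 instead of 31
```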

Page 4:

A Bayesian Network is a graphical representation of a joint probability distribution with (conditional) independence relationships made explicit.

For a particular factorization ordering, construct a Bayesian network as follows (p. 440 of text):

v1 is the first variable in the ordering, so v1 is a "root". Node v1 stores P(v1), P(~v1); e.g., P(v1) = 0.75, P(~v1) = 0.25 = 1 - P(v1).

v2 is the second variable in the ordering. If v2 is independent of a subset of its predecessors (possibly the empty set) in the ordering conditioned on a disjoint subset of predecessors (including possibly all its predecessors), then the latter subset is its parents; else v2 is a "root".

Since P(v2|v1) = P(v2), ..., v2 is also a root.

[Diagram: node v1 with table P(v1); node v2 with table P(v2); no edge between them]

Page 5:

v3 is the third variable in the ordering. Since P(v3|v1,v2) = P(v3|v1), ...:

[Diagram: edge v1 → v3 added; v2 remains a separate root. Node v3 stores P(v3|v1) and P(v3|~v1), with P(~v3|v1) = 1 - P(v3|v1) and P(~v3|~v1) = 1 - P(v3|~v1)]

Since P(v4|v1,v2,v3) = P(v4|v2,v3), ...:

[Diagram: edges v2 → v4 and v3 → v4 added. Node v4 stores P(v4|v2,v3), P(v4|v2,~v3), P(v4|~v2,v3), P(v4|~v2,~v3), with P(~v4|v2,v3) = 1 - P(v4|v2,v3), etc.]

Page 6:

Since P(v5|v1,v2,v3,v4) = P(v5|v3), ...:

[Diagram: edge v3 → v5 added, completing the network v1 → v3, v2 → v4, v3 → v4, v3 → v5. Node v5 stores P(v5|v3) and P(v5|~v3)]

Components of a Bayesian Network: a topology (graph) that qualitatively displays the conditional independencies, and probability tables at each node.

Semantics of the graphical component: each variable v is independent of all of its non-descendants conditioned on its parents.

Page 7:

Where does knowledge of conditional independence come from?

a) From data. Consider congressional voting records. Suppose that we have data on House votes (and political party). Suppose variables are ordered Party, Immigration, StarWars, ....

Party: P(Republican) = 0.52 (226/435 Republicans, 209/435 Democrats)

To determine the relationship between Party and Immigration, we count:

Actual Counts (Immigration):

               Yes    No
  Republican    17   209
  Democrat     160    49

Predicted Counts (if Immigration and Party independent):

               Yes    No
  Republican    92   134
  Democrat      85   124

e.g., predicted Republican/Yes count = P(Rep)*P(Yes)*435 = 0.52 * (17+160)/435 * 435

Very different distributions: conclude dependent.
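The predicted counts can be computed directly from the marginals; a minimal sketch using the counts from the slide (a chi-square statistic would be the standard way to quantify the discrepancy, but eyeballing suffices here):

```python
# Expected counts under independence of Party and Immigration vote.
counts = {("Rep", "Yes"): 17, ("Rep", "No"): 209,
          ("Dem", "Yes"): 160, ("Dem", "No"): 49}
total = sum(counts.values())                       # 435 House members

party_tot = {p: sum(c for (pp, _), c in counts.items() if pp == p)
             for p in ("Rep", "Dem")}
vote_tot = {v: sum(c for (_, vv), c in counts.items() if vv == v)
            for v in ("Yes", "No")}

# Predicted count if independent: N * P(party) * P(vote)
expected = {(p, v): total * (party_tot[p] / total) * (vote_tot[v] / total)
            for p in ("Rep", "Dem") for v in ("Yes", "No")}

for cell in sorted(expected):
    print(cell, "actual:", counts[cell], "expected:", round(expected[cell]))
# ("Rep","Yes"): actual 17 vs expected ~92, so clearly dependent.
```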

Page 8:

Party: P(Republican) = 0.52 (226/435 Republicans, 209/435 Democrats)

Actual Counts (Immigration):

               Yes    No
  Republican    17   209
  Democrat     160    49

Immigration: P(Yes|Rep) = 0.075 (= 17/226), P(Yes|Dem) = 0.765

Consider StarWars:

Is StarWars independent of Party and Immigration? (i.e., is P(StarWars|Party, Immigration) approximately equal to P(StarWars) for all combinations of variable values?) If yes, then stop and make StarWars a "root"; else continue.

Is StarWars independent of Immigration conditioned on Party? If yes, then stop and make StarWars a child of Party; else continue.

Is StarWars independent of Party conditioned on Immigration? If yes, then stop and make StarWars a child of Immigration; else continue.

Otherwise, make StarWars a child of both Party and Immigration.

Page 9:

Party: P(Republican) = 0.52 (226/435 Republicans, 209/435 Democrats)

Actual Counts:

               Immigration     StarWars
               Yes    No       Yes    No
  Republican    17   209       219     7
  Democrat     160    49        24   185

Immigration: P(Yes|Rep) = 0.075 (= 17/226), P(Yes|Dem) = 0.765

Consider StarWars. Is StarWars independent of Party and Immigration?

Actual Counts:

                   Republican          Democrat
                   Imm=Yes  Imm=No     Imm=Yes  Imm=No
  StarWars Yes        14     205          8       16
  StarWars No          3       4        152       33

Predicted Counts:

                   Republican          Democrat
                   Imm=Yes  Imm=No     Imm=Yes  Imm=No
  StarWars Yes       9.5     117         89       27
  StarWars No        7.5      92         71       22

e.g., predicted count for (Rep, Imm=Yes, SW=Yes) = P(Rep & Imm=Y) * P(SW=Y) * 435

Different: not independent.

Page 10:

Further tests might indicate:

[Diagram: Party → Immigration, Party → StarWars]

i.e., Immigration and StarWars are independent conditioned on Party.

Page 11:

Where does knowledge of conditional independence come from?

b) "First principles"

For example, suppose that the grounds keeper sets sprinkler timers to a fixed schedule that depends on the season (Summer, Winter, Spring, Fall), and suppose that the probability that it rains or not is dependent on season. We might write:

[Diagram: Season → Rains, Season → Sprinkler]

This model might differ from one in which a homeowner manually turns on a sprinkler.

[Diagram: a second network over Season, Rains, Sprinkler for the manual-sprinkler model]

Page 12:

Consider the following:

[Diagram: a Bayesian network with nodes H, B, E, L, F, D and edges H → E, H → L, B → L, E → F, L → D]

Tables: P(h); P(b); P(e|h), P(e|~h); P(l|h,b), P(l|~h,b), P(l|h,~b), P(l|~h,~b); P(f|e), P(f|~e); P(d|l), P(d|~l)

H: Hardware problems (h) or not (~h)
B: Bugs in code (b) or not (~b)
E: Editor running (e) or not (~e)
L: Lisp interpreter running (l) or not (~l)
F: Cursor flashing (f) or not (~f)
D: Prompt displayed (d) or not (~d)

Page 13:

Consider the following (same network and tables as above):

P(h,~b,e,~l,f,d) ?

Page 14:

Consider the following (same network and tables as above):

P(h,~b,e,~l,f,d)

= P(h)P(~b|h)P(e|h,~b)P(~l|h,~b,e)P(f|h,~b,e,~l)P(d|h,~b,e,~l,f)   (always true, by the chain rule)

= P(h)P(~b)P(e|h)P(~l|h,~b)P(f|e)P(d|~l)   (true given the BN above)

where P(~b) = 1 - P(b) and P(~l|h,~b) = 1 - P(l|h,~b).

Note that the order H,B,E,L,F,D is consistent with the BN, but so are others such as B,H,L,D,E,F (a node comes after all its ancestors).
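The factored joint is easy to evaluate once table values are chosen. A sketch with made-up numbers (the slides give the network structure but not the table values, so every probability below is an assumption):

```python
# Factored joint for the hardware/bugs network; all table values are made up.
import itertools

P_h, P_b = 0.1, 0.4                              # P(h), P(b): assumed values
P_e = {True: 0.9, False: 0.1}                    # P(e | H)
P_l = {(True, True): 0.2, (True, False): 0.95,   # P(l | H, B)
       (False, True): 0.1, (False, False): 0.05}
P_f = {True: 0.95, False: 0.01}                  # P(f | E)
P_d = {True: 0.9, False: 0.05}                   # P(d | L)

def pv(p, v):
    """P(X = v) given P(X = True) = p."""
    return p if v else 1 - p

def joint(h, b, e, l, f, d):
    """P(H=h, B=b, E=e, L=l, F=f, D=d) from the BN factorization."""
    return (pv(P_h, h) * pv(P_b, b) * pv(P_e[h], e)
            * pv(P_l[(h, b)], l) * pv(P_f[e], f) * pv(P_d[l], d))

# P(h, ~b, e, ~l, f, d) = P(h)P(~b)P(e|h)P(~l|h,~b)P(f|e)P(d|~l)
val = joint(True, False, True, False, True, True)
by_hand = (P_h * (1 - P_b) * P_e[True]
           * (1 - P_l[(True, False)]) * P_f[True] * P_d[False])
assert abs(val - by_hand) < 1e-12

# The 64 joint entries sum to 1, so this really is a distribution.
total = sum(joint(*bits) for bits in itertools.product([True, False], repeat=6))
assert abs(total - 1) < 1e-9
```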

Page 15:

Consider the following (same network and tables as above):

P(~l,~b) ?

Page 16:

Consider the following (same network and tables as above):

P(~l,~b) = P(~l,~b,h) + P(~l,~b,~h)

= P(~l|~b,h)P(~b)P(h) + P(~l|~b,~h)P(~b)P(~h)

= (1 - P(l|~b,h))(1 - P(b))P(h) + (1 - P(l|~b,~h))(1 - P(b))(1 - P(h))

Page 17:

Consider the following (same network and tables as above):

P(f, d) ?

Page 18:

Consider the following (same network and tables as above):

P(f, d) = P(f, d, e, h, l, b) + ... + P(f, d, ~e, ~h, ~l, ~b) = ...

(16 terms, one per assignment to the unobserved variables E, H, L, B)
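Inference by enumeration sums the factored joint over the unobserved variables; a sketch with the same made-up table values as before (none of the numbers come from the slides):

```python
# P(f, d) by enumeration over the hidden variables E, H, L, B.
import itertools

P_h, P_b = 0.1, 0.4                              # assumed table values
P_e = {True: 0.9, False: 0.1}                    # P(e | H)
P_l = {(True, True): 0.2, (True, False): 0.95,   # P(l | H, B)
       (False, True): 0.1, (False, False): 0.05}
P_f = {True: 0.95, False: 0.01}                  # P(f | E)
P_d = {True: 0.9, False: 0.05}                   # P(d | L)

def pv(p, v):
    return p if v else 1 - p

def joint(h, b, e, l, f, d):
    return (pv(P_h, h) * pv(P_b, b) * pv(P_e[h], e)
            * pv(P_l[(h, b)], l) * pv(P_f[e], f) * pv(P_d[l], d))

# P(f, d) = P(f,d,e,h,l,b) + ... + P(f,d,~e,~h,~l,~b): 16 terms.
p_fd = sum(joint(h, b, e, l, True, True)
           for h, b, e, l in itertools.product([True, False], repeat=4))
print(round(p_fd, 6))
```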

Page 19:

Consider the following (same network and tables as above):

P(~l|h,~b) ?

Page 20:

Consider the following (same network and tables as above):

P(~l|h,~b) = 1 - P(l|h,~b) (essentially a table lookup)

Causal inference: reasoning from causes to effects.

Page 21:

Consider the following (same network and tables as above):

P(~l|h) ?

Page 22:

Consider the following (same network and tables as above):

P(~l|h) = P(~l,h)/P(h)

= [P(~l,h,b) + P(~l,h,~b)] / P(h)

= [P(~l|h,b)P(h,b) + P(~l|h,~b)P(h,~b)] / P(h)

= [P(~l|h,b)P(h)P(b) + P(~l|h,~b)P(h)P(~b)] / P(h)   (H and B independent)

= P(~l|h,b)P(b) + P(~l|h,~b)P(~b)

= (1 - P(l|h,b))P(b) + (1 - P(l|h,~b))(1 - P(b))
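The closed form derived above can be checked against a direct computation of P(~l, h)/P(h); a sketch with made-up table values (only B needs summing over, since the remaining variables sum out to 1):

```python
# P(~l | h) two ways: the derived closed form, and P(~l, h) / P(h).
P_h, P_b = 0.1, 0.4                              # assumed table values
P_l = {(True, True): 0.2, (True, False): 0.95,   # P(l | H, B)
       (False, True): 0.1, (False, False): 0.05}

# Closed form: P(~l|h) = (1 - P(l|h,b))P(b) + (1 - P(l|h,~b))(1 - P(b))
closed = (1 - P_l[(True, True)]) * P_b + (1 - P_l[(True, False)]) * (1 - P_b)

# Direct: P(~l, h) = sum over B of P(~l|h,B) P(h) P(B), then divide by P(h).
p_nl_h = sum((P_b if b else 1 - P_b) * (1 - P_l[(True, b)])
             for b in (True, False)) * P_h
assert abs(closed - p_nl_h / P_h) < 1e-12
print(round(closed, 4))
```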

Page 23:

Consider the following (same network and tables as above):

P(~l) ?

Page 24:

Consider the following (same network and tables as above):

P(~l) = P(~l,h,b) + P(~l,h,~b) + P(~l,~h,b) + P(~l,~h,~b)

= P(~l|h,b)P(h)P(b) + P(~l|h,~b)P(h)P(~b) + P(~l|~h,b)P(~h)P(b) + P(~l|~h,~b)P(~h)P(~b)

= (1 - P(l|h,b))P(h)P(b) + (1 - P(l|h,~b))P(h)(1 - P(b)) + ...

Page 25:

Consider the following (same network and tables as above):

P(d|h) ?

Page 26:

Consider the following (same network and tables as above):

P(d|h) = P(d,h)/P(h)

= [P(d,l,b,h) + P(d,l,~b,h) + P(d,~l,b,h) + P(d,~l,~b,h)] / P(h)

= P(d|l)P(l|h,b)P(b) + P(d|l)P(l|h,~b)P(~b) + P(d|~l)P(~l|h,b)P(b) + P(d|~l)P(~l|h,~b)P(~b)

= ...
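The expansion above is a sum over the values of B and L (everything else sums out); a sketch with made-up table values:

```python
# P(d | h): predictive inference through L, following the slide's expansion.
P_b = 0.4                                        # assumed table values
P_l = {True: 0.2, False: 0.95}                   # P(l | h, b), P(l | h, ~b)
P_d = {True: 0.9, False: 0.05}                   # P(d | l), P(d | ~l)

def pv(p, v):
    return p if v else 1 - p

# P(d|h) = sum over B and L of P(d|L) P(L | h, B) P(B)
p_d_given_h = sum(P_d[l_val] * pv(P_l[b], l_val) * pv(P_b, b)
                  for b in (True, False) for l_val in (True, False))
print(round(p_d_given_h, 4))
```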

Page 27:

Consider the following (same network and tables as above):

P(h|l) ?

Page 28:

Consider the following (same network and tables as above):

P(h|l) = P(h,l)/P(l)

= [P(l,b,h) + P(l,~b,h)] / [P(l,b,h) + P(l,~b,h) + P(l,b,~h) + P(l,~b,~h)]

= ... (simplify)
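Evaluating the ratio above gives diagnostic inference: from the observed effect L back to the cause H. A sketch with made-up table values:

```python
# P(h | l) = P(l, h) / P(l), summing over B in numerator and denominator.
P_h, P_b = 0.1, 0.4                              # assumed table values
P_l = {(True, True): 0.2, (True, False): 0.95,   # P(l | H, B)
       (False, True): 0.1, (False, False): 0.05}

def pv(p, v):
    return p if v else 1 - p

def p_l_and_h(h):
    """P(l, H=h) = sum over B of P(l|h,B) P(H=h) P(B)."""
    return sum(P_l[(h, b)] * pv(P_h, h) * pv(P_b, b) for b in (True, False))

p_h_given_l = p_l_and_h(True) / (p_l_and_h(True) + p_l_and_h(False))
print(round(p_h_given_l, 4))
```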

Page 29:

Consider the following (same network and tables as above):

P(f | l) = ?

P(f | l, h) = ?

P(f | b) = ?

P(f | b, l) = ?

Page 30:

Consider the following belief network:

[Diagram: a belief network over the nodes A, B, C, D, E, F, G]

Each variable is a binary-valued variable (e.g., A has values a and ~a, B has values b and ~b). Probability tables are stored at each node. For each of the following, involve the minimal number of variables necessary.

Express P(a, d, ~b, e, c) only in terms of the probabilities found in the probability tables of the network.

Page 31:

[Diagram: the same belief network over A, B, C, D, E, F, G]

Express P(f | b) only in terms of probabilities found in the probability tables of the network.

Express P(a | ~e) only in terms of probabilities found in the probability tables of the network.