Bayesian Networks and Decision-Theoretic Reasoning for Artificial Intelligence
Jack Breese, Microsoft Research
Daphne Koller, Stanford University
© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved.
Overview
■ Decision-theoretic techniques
  ◆ Explicit management of uncertainty and tradeoffs
  ◆ Probability theory
  ◆ Maximization of expected utility
■ Applications to AI problems
  ◆ Diagnosis
  ◆ Expert systems
  ◆ Planning
  ◆ Learning
Science- AAAI-97
■ Model Minimization in Markov Decision Processes
■ Effective Bayesian Inference for Stochastic Programs
■ Learning Bayesian Networks from Incomplete Data
■ Summarizing CSP Hardness With Continuous Probability Distributions
■ Speeding Safely: Multi-criteria Optimization in Probabilistic Planning
■ Structured Solution Methods for Non-Markovian Decision Processes
Applications
Microsoft’s cost-cutting helps users
04/21/97
A Microsoft Corp. strategy to cut its support costs by letting users solve their own problems using electronic means is paying off for users. In March, the company began rolling out a series of Troubleshooting Wizards on its World Wide Web site.
Troubleshooting Wizards save time and money for users who don’t have Windows NT specialists on hand at all times, said Paul Soares, vice president and general manager of Alden Buick Pontiac, a General Motors Corp. car dealership in Fairhaven, Mass.
Teenage Bayes
Microsoft Researchers Exchange Brainpower with Eighth-grader
Teenager Designs Award-Winning Science Project
.. For her science project, which she called "Dr. Sigmund Microchip," Tovar wanted to create a computer program to diagnose the probability of certain personality types. With only answers from a few questions, the program was able to accurately diagnose the correct personality type 90 percent of the time.
Course Contents
» Concepts in Probability
  ◆ Probability
  ◆ Random variables
  ◆ Basic properties (Bayes rule)
■ Bayesian Networks
■ Inference
■ Decision making
■ Learning networks from data
■ Reasoning over time
■ Applications
Probabilities
■ Probability distribution P(X|ξ)
  ◆ X is a random variable
    ■ Discrete
    ■ Continuous
  ◆ ξ is background state of information
Discrete Random Variables
■ Finite set of possible outcomes
$X \in \{x_1, x_2, x_3, \ldots, x_n\}$
$P(x_i) \ge 0$
$\sum_{i=1}^{n} P(x_i) = 1$
X binary: $P(x) + P(\bar{x}) = 1$
[Figure: bar chart of an example distribution over outcomes X1 to X4; vertical axis from 0 to 0.4]
Continuous Random Variable
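■ Probability distribution (density function) over continuous values
$X \in [0, 10]$, $P(x) \ge 0$
$\int_{0}^{10} P(x)\,dx = 1$
$P(5 \le x \le 7) = \int_{5}^{7} P(x)\,dx$
[Figure: density curve P(x) plotted against x, with x = 5 and x = 7 marked]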
More Probabilities
■ Joint
  ◆ Probability that both X=x and Y=y
  $P(x, y) \equiv P(X = x \wedge Y = y)$
■ Conditional
  ◆ Probability that X=x given we know that Y=y
  $P(x \mid y) \equiv P(X = x \mid Y = y)$
Rules of Probability
■ Product Rule
  $P(X, Y) = P(X \mid Y)\,P(Y) = P(Y \mid X)\,P(X)$
■ Marginalization
  $P(Y) = \sum_{i=1}^{n} P(Y, x_i)$
  X binary: $P(Y) = P(Y, x) + P(Y, \bar{x})$
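Bayes Rule
$P(H, E) = P(H \mid E)\,P(E) = P(E \mid H)\,P(H)$
$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$
$P(h \mid e) = \frac{P(e \mid h)\,P(h)}{P(e, h) + P(e, \bar{h})} = \frac{P(e \mid h)\,P(h)}{P(e \mid h)\,P(h) + P(e \mid \bar{h})\,P(\bar{h})}$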
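As a worked illustration of the binary form of Bayes rule, the sketch below computes P(h|e) from a prior and two likelihoods. The function name `posterior` and all of the numbers are hypothetical, chosen only to show the arithmetic.

```python
def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """Return P(h | e) for a binary hypothesis h and observed evidence e."""
    # Numerator: P(e | h) P(h) = P(e, h)
    joint_h = p_e_given_h * p_h
    # Other term of the denominator: P(e | not h) P(not h) = P(e, not h)
    joint_not_h = p_e_given_not_h * (1.0 - p_h)
    # Denominator: P(e) = P(e, h) + P(e, not h)
    return joint_h / (joint_h + joint_not_h)

# Hypothetical numbers: prior P(h) = 0.01, P(e|h) = 0.9, P(e|not h) = 0.05
p = posterior(0.01, 0.9, 0.05)
print(round(p, 3))
```

Even with a strong likelihood ratio, the posterior stays modest here because the assumed prior is small.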
Course Contents
■ Concepts in Probability
» Bayesian Networks
  ◆ Basics
  ◆ Additional structure
  ◆ Knowledge acquisition
■ Inference
■ Decision making
■ Learning networks from data
■ Reasoning over time
■ Applications
Bayesian networks
■ Basics
  ◆ Structured representation
  ◆ Conditional independence
  ◆ Naïve Bayes model
  ◆ Independence facts
Bayesian Networks
[Figure: Smoking → Cancer]
$S \in \{no, light, heavy\}$
$C \in \{none, benign, malignant\}$
P(S=no)      0.80
P(S=light)   0.15
P(S=heavy)   0.05
Smoking=      no     light   heavy
P(C=none)     0.96   0.88    0.60
P(C=benign)   0.03   0.08    0.25
P(C=malig)    0.01   0.04    0.15
Product Rule
■ P(C,S) = P(C|S) P(S)
S⇓ C⇒    none    benign   malignant
no       0.768   0.024    0.008
light    0.132   0.012    0.006
heavy    0.035   0.010    0.005
Marginalization
S⇓ C⇒    none    benign   malig   total
no       0.768   0.024    0.008   0.80
light    0.132   0.012    0.006   0.15
heavy    0.035   0.010    0.005   0.05
total    0.935   0.046    0.019
The column totals give P(Cancer); the row totals give P(Smoke).
Bayes Rule Revisited
$P(S \mid C) = \frac{P(C \mid S)\,P(S)}{P(C)} = \frac{P(C, S)}{P(C)}$
S⇓ C⇒    none         benign       malig
no       0.768/.935   0.024/.046   0.008/.019
light    0.132/.935   0.012/.046   0.006/.019
heavy    0.035/.935   0.010/.046   0.005/.019
Cancer=      none    benign   malignant
P(S=no)      0.821   0.522    0.421
P(S=light)   0.141   0.261    0.316
P(S=heavy)   0.037   0.217    0.263
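The marginalization and Bayes rule steps can be reproduced in a few lines. This is a minimal sketch that starts from the joint table P(S, C) as printed on the Marginalization slide; the variable names are illustrative, not part of the original course material.

```python
# Joint distribution P(S, C), copied from the Marginalization slide.
S_VALUES = ["no", "light", "heavy"]
C_VALUES = ["none", "benign", "malig"]
joint = {
    "no":    {"none": 0.768, "benign": 0.024, "malig": 0.008},
    "light": {"none": 0.132, "benign": 0.012, "malig": 0.006},
    "heavy": {"none": 0.035, "benign": 0.010, "malig": 0.005},
}

# Marginalization: sum out the other variable.
p_s = {s: sum(joint[s].values()) for s in S_VALUES}                 # row totals
p_c = {c: sum(joint[s][c] for s in S_VALUES) for c in C_VALUES}     # column totals

# Bayes rule: P(S | C) = P(C, S) / P(C).
p_s_given_c = {
    c: {s: joint[s][c] / p_c[c] for s in S_VALUES} for c in C_VALUES
}

for c in C_VALUES:
    print(c, {s: round(p_s_given_c[c][s], 3) for s in S_VALUES})
```

Each printed row is one column of the posterior table P(S | C), rounded to three decimals.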
A Bayesian Network
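[Figure: network over Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium and Lung Tumor. Edges: Age → Exposure to Toxics; Age → Smoking; Gender → Smoking; Exposure to Toxics → Cancer; Smoking → Cancer; Cancer → Serum Calcium; Cancer → Lung Tumor]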
Independence
Age and Gender are independent.
[Figure: nodes Age and Gender, with no edge between them]
P(A|G) = P(A)    A ⊥ G
P(G|A) = P(G)    G ⊥ A
P(A,G) = P(G|A) P(A) = P(G) P(A)
P(A,G) = P(A|G) P(G) = P(A) P(G)
Conditional Independence
[Figure: Age → Smoking, Gender → Smoking, Smoking → Cancer]
Cancer is independent of Age and Gender given Smoking.
P(C|A,G,S) = P(C|S)    C ⊥ A,G | S
More Conditional Independence: Naïve Bayes
[Figure: Cancer → Serum Calcium, Cancer → Lung Tumor]
Serum Calcium and Lung Tumor are dependent
Serum Calcium is independent of Lung Tumor, given Cancer
P(L|SC,C) = P(L|C)
Naïve Bayes in general
[Figure: H → E1, H → E2, H → E3, …, H → En]
2n + 1 parameters:
$P(h)$
$P(e_i \mid h),\ P(e_i \mid \bar{h}), \quad i = 1, \ldots, n$
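To make the 2n + 1 parameterization concrete, here is a minimal sketch of posterior computation in a naïve Bayes model with binary H and binary findings. The function name and the example parameters are hypothetical.

```python
def naive_bayes_posterior(p_h, likelihoods, observations):
    """P(h | e_1..e_n) under the naive Bayes assumption.

    p_h          -- prior P(h)
    likelihoods  -- list of (P(e_i | h), P(e_i | not h)) pairs, one per finding
    observations -- list of booleans, True if finding e_i was observed true
    """
    score_h, score_not_h = p_h, 1.0 - p_h
    for (p_given_h, p_given_not_h), observed in zip(likelihoods, observations):
        # Findings are conditionally independent given H, so the factors multiply.
        score_h *= p_given_h if observed else 1.0 - p_given_h
        score_not_h *= p_given_not_h if observed else 1.0 - p_given_not_h
    # Normalize over the two hypotheses h and not h.
    return score_h / (score_h + score_not_h)

# Hypothetical model with n = 2 findings, hence 2n + 1 = 5 parameters.
prior = 0.1
params = [(0.8, 0.2), (0.7, 0.1)]
both_present = naive_bayes_posterior(prior, params, [True, True])
both_absent = naive_bayes_posterior(prior, params, [False, False])
print(both_present, both_absent)
```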
More Conditional Independence: Explaining Away
[Figure: Exposure to Toxics → Cancer ← Smoking]
Exposure to Toxics and Smoking are independent
E ⊥ S
Exposure to Toxics is dependent on Smoking, given Cancer
P(E = heavy | C = malignant) > P(E = heavy | C = malignant, S = heavy)
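The inequality above can be checked numerically. The sketch below uses a binary simplification of the three variables (exposure, smoking, cancer each true or false); every probability in it is made up for illustration and none comes from the slides.

```python
from itertools import product

# Hypothetical binary simplification: E = exposure, S = smoking, C = cancer.
P_E = {True: 0.1, False: 0.9}
P_S = {True: 0.2, False: 0.8}
# P(C = True | E, S), keyed by (e, s); numbers are assumptions.
P_C_GIVEN = {(False, False): 0.01, (True, False): 0.3,
             (False, True): 0.3, (True, True): 0.5}

def joint(e, s, c):
    """P(E=e, S=s, C=c) = P(e) P(s) P(c | e, s)."""
    p_c = P_C_GIVEN[(e, s)]
    return P_E[e] * P_S[s] * (p_c if c else 1.0 - p_c)

def prob_exposure(**evidence):
    """P(E = True | evidence), where evidence may fix 's' and/or 'c'."""
    num = den = 0.0
    for e, s, c in product([True, False], repeat=3):
        world = {"s": s, "c": c}
        if any(world[k] != v for k, v in evidence.items()):
            continue  # skip worlds inconsistent with the evidence
        p = joint(e, s, c)
        den += p
        if e:
            num += p
    return num / den

print(prob_exposure())                # prior P(e)
print(prob_exposure(s=True))          # same as the prior: E and S are independent
print(prob_exposure(c=True))          # raised by observing cancer
print(prob_exposure(c=True, s=True))  # lower again: smoking explains the cancer
```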
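Put it all together
[Figure: the cancer network over Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium and Lung Tumor]
$P(A, G, E, S, C, L, SC) = P(A) \cdot P(G) \cdot P(E \mid A) \cdot P(S \mid A, G) \cdot P(C \mid E, S) \cdot P(SC \mid C) \cdot P(L \mid C)$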
General Product (Chain) Rule for Bayesian Networks
$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathbf{Pa}_i)$
$\mathbf{Pa}_i = \mathrm{parents}(X_i)$
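The chain rule translates directly into code: the probability of a full assignment is the product of one table lookup per node. The sketch below applies it to the two-node Smoking → Cancer network using the tables from the earlier slide; the dictionary layout and the function name are assumptions made for this example.

```python
# Each node maps to (parents, table). The table is keyed by the tuple of
# parent values and gives a distribution over the node's own values.
network = {
    "S": ((), {(): {"no": 0.80, "light": 0.15, "heavy": 0.05}}),
    "C": (("S",), {
        ("no",):    {"none": 0.96, "benign": 0.03, "malig": 0.01},
        ("light",): {"none": 0.88, "benign": 0.08, "malig": 0.04},
        ("heavy",): {"none": 0.60, "benign": 0.25, "malig": 0.15},
    }),
}

def joint_probability(assignment, network):
    """Chain rule: multiply P(x_i | pa_i) over all nodes."""
    p = 1.0
    for var, (parents, table) in network.items():
        parent_values = tuple(assignment[q] for q in parents)
        p *= table[parent_values][assignment[var]]
    return p

print(joint_probability({"S": "light", "C": "benign"}, network))
```

Larger networks only add entries to `network`; the function itself does not change.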
Conditional Independence
[Figure: the cancer network, with Cancer's parents (Exposure to Toxics, Smoking), descendants (Serum Calcium, Lung Tumor) and non-descendants (Age, Gender) marked]
Cancer is independent of Age and Gender given Exposure to Toxics and Smoking.
A variable (node) is conditionally independent of its non-descendants given its parents.
Another non-descendant
[Figure: the cancer network with an added Genetic Damage node, marked as a non-descendant of Cancer; Cancer's parents and descendants are also marked]
Cancer is independent of Genetic Damage given Exposure to Toxics and Smoking.
Independence and Graph Separation
■ Given a set of observations, is one set of variables dependent on another set?
■ Observing effects can induce dependencies.
■ d-separation (Pearl 1988) allows us to check conditional independence graphically.
Bayesian networks
■ Additional structure
  ◆ Nodes as functions
  ◆ Causal independence
  ◆ Context specific dependencies
  ◆ Continuous variables
  ◆ Hierarchy and model construction
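Nodes as functions
■ A BN node is a conditional distribution function
  ◆ its parent values are the inputs
  ◆ its output is a distribution over its values
[Figure: node X with parents A and B. Its table has one column per joint assignment to (A, B), each a distribution over X ∈ {lo, med, hi}: (0.1, 0.3, 0.6), (0.4, 0.2, 0.4), (0.5, 0.3, 0.2), (0.7, 0.1, 0.2). For one input assignment to (A, B), the output is lo: 0.7, med: 0.1, hi: 0.2]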
[Figure: node X with parents A and B, shown as a box that takes an assignment to (A, B) as input and outputs lo: 0.7, med: 0.1, hi: 0.2]
Any type of function from Val(A,B) to distributions over Val(X)
Causal Independence
■ Burglary causes Alarm iff motion sensor clear
■ Earthquake causes Alarm iff wire loose
■ Enabling factors are independent of each other
[Figure: Burglary → Alarm ← Earthquake]
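Fine-grained model
[Figure: Burglary → Motion sensed, Earthquake → Wire Move; Motion sensed and Wire Move feed Alarm, a deterministic or]
Motion sensed (M) given Burglary (B):
        b       ¬b
m       1-rB    0
¬m      rB      1
Wire Move (W) given Earthquake (E):
        e       ¬e
w       1-rE    0
¬w      rE      1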
Noisy-Or model
Alarm false only if all mechanisms independently inhibited
[Figure: Burglary → Alarm ← Earthquake]
$P(a) = 1 - \prod_{\text{parent } X \text{ active}} r_X$
# of parameters is linear in the # of parents
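The noisy-or formula is short enough to state as a function. In this sketch, `inhibit` holds one inhibition probability r_X per parent, which is why the parameter count grows linearly with the number of parents; the names and the numeric values are hypothetical.

```python
from math import prod

def noisy_or(inhibit, active):
    """P(alarm = true) when each active parent X is independently inhibited with probability r_X.

    inhibit -- dict mapping parent name -> r_X
    active  -- iterable of parent names that are currently true
    """
    # The alarm stays false only if every active mechanism is inhibited.
    return 1.0 - prod(inhibit[x] for x in active)

# Hypothetical inhibition probabilities r_B and r_E.
r = {"Burglary": 0.1, "Earthquake": 0.7}
print(noisy_or(r, []))                          # no active cause
print(noisy_or(r, ["Burglary"]))
print(noisy_or(r, ["Burglary", "Earthquake"]))
```

This version has no leak term, so the alarm never fires without an active cause; a leak could be modeled as one extra always-active parent.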
CPCS Network
Context-specific Dependencies
■ Alarm can go off only if it is Set
■ A burglar and the cat can both set off the alarm
■ If a burglar comes in, the cat hides and does not set off the alarm
[Figure: Alarm with parents Alarm-Set, Burglary and Cat]
Asymmetric dependencies
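[Figure: Alarm (A) with parents Alarm-Set (S), Burglary (B) and Cat (C)]
■ Alarm independent of
  ◆ Burglary, Cat given ¬s
  ◆ Cat given s and b
Node function represented as a tree
[Figure: decision tree that tests S, then B, then C, with leaf distributions (a: 0, ¬a: 1), (a: 0.9, ¬a: 0.1), (a: 0.01, ¬a: 0.99) and (a: 0.6, ¬a: 0.4)]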
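A tree-structured node function can be written as nested conditionals, which makes the context-specific independencies visible in the control flow. The sketch below is one reading of the tree: the leaf values are those on the slide, but the assignment of the 0.6 and 0.01 leaves to "cat present" and "cat absent" is an assumption, since the branch labels did not survive extraction.

```python
from itertools import product

def p_alarm(s, b, c):
    """P(Alarm = true | Alarm-Set = s, Burglary = b, Cat = c) as a decision tree."""
    if not s:
        return 0.0   # alarm not set: Burglary and Cat are irrelevant
    if b:
        return 0.9   # set and burglar present: Cat is irrelevant
    # Set, no burglar: only here does the cat matter (leaf order assumed).
    return 0.6 if c else 0.01

# A full table needs one entry per parent assignment; the tree needs fewer leaves.
table = {(s, b, c): p_alarm(s, b, c) for s, b, c in product([True, False], repeat=3)}
print(len(table), "table rows,", len(set(table.values())), "distinct leaves")
```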
Asymmetric Assessment
[Figure: network over Printer Output, Location, Local Transport, Net Transport, Print Data, Local OK and Net OK]
Continuous variables
[Figure: Indoor Temperature node with parents Outdoor Temperature (97°) and A/C Setting (hi); its output is a density P(x) over Indoor Temperature x]
Function from Val(A,B) to density functions over Val(X)
Gaussian (normal) distributions
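$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
N(µ, σ)
[Figure: Gaussian density curves illustrating a different mean and a different variance]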
Gaussian networks
[Figure: X → Y]
$X \sim N(\mu, \sigma_X^2)$
$Y \sim N(ax + b, \sigma_Y^2)$
Each variable is a linear function of its parents, with Gaussian noise
Joint probability density functions:
[Figure: plots of joint densities over X and Y]
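For the two-node network X → Y, the joint over (X, Y) is itself Gaussian, and its mean and covariance follow from the two conditionals by the standard linear-Gaussian identities. The sketch below computes them; the function name and the example parameter values are assumptions.

```python
def linear_gaussian_joint(mu_x, var_x, a, b, var_y):
    """Mean and covariance of (X, Y) for X ~ N(mu_x, var_x) and Y | x ~ N(a*x + b, var_y)."""
    # E[Y] = a E[X] + b
    mean = (mu_x, a * mu_x + b)
    # Cov(X, Y) = a Var(X);  Var(Y) = a^2 Var(X) + noise variance
    cov_xy = a * var_x
    cov = ((var_x, cov_xy),
           (cov_xy, a * a * var_x + var_y))
    return mean, cov

# Hypothetical parameters.
mean, cov = linear_gaussian_joint(mu_x=1.0, var_x=4.0, a=0.5, b=2.0, var_y=1.0)
print(mean)
print(cov)
```

Setting a = 0 makes the off-diagonal entries vanish, which corresponds to X and Y being independent.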
Composing functions
■ Recall: a BN node is a function
■ We can compose functions to get more complex functions.
■ The result: A hierarchically structured BN.
■ Since functions can be called more than once, we can reuse a BN model fragment in multiple contexts.
[Figure: hierarchically structured model of a car. Car: Owner, Maintenance, Age, Original-value, Mileage, Brakes, Tires, Engine, Fuel-efficiency, Braking-power. Owner: Age, Income. Brakes: Power. Tires: RF-Tire, LF-Tire, Pressure, Traction. Engine: Power]
Bayesian Networks
■ Knowledge acquisition
  ◆ Variables
  ◆ Structure
  ◆ Numbers
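What is a variable?
■ Collectively exhaustive, mutually exclusive values
  $x_1 \vee x_2 \vee x_3 \vee x_4$
  $\neg(x_i \wedge x_j), \quad i \ne j$
  [Figure: a variable with the values "Error Occurred" and "No Error"]
■ Values versus Probabilities
  [Figure: nodes labeled "Risk of Smoking" and "Smoking"]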
Clarity Test: Knowable in Principle
■ Weather {Sunny, Cloudy, Rain, Snow}
■ Gasoline: Cents per gallon
■ Temperature { ≥ 100F , < 100F}
■ User needs help on Excel Charting {Yes, No}
■ User’s personality {dominant, submissive}
Structuring
[Figure: network over Age, Gender, Exposure to Toxic, Smoking, Genetic Damage, Cancer and Lung Tumor]
Extending the conversation.
Network structure corresponding to “causality” is usually good.
Do the numbers really matter?
■ Zeros and Ones
■ Order of Magnitude: $10^{-9}$ vs $10^{-6}$
■ Sensitivity Analysis
■ Second decimal usually does not matter
■ Relative Probabilities
Bayesian Networks and Structure
■ Causal independence: from $2^n$ to n+1 parameters
■ Asymmetric assessment: similar savings in practice.
■ Typical savings (#params):
  ◆ 145 to 55 for a small hardware network;
  ◆ 133,931,430 to 8254 for CPCS !!
Course Contents
■ Concepts in Probability
■ Bayesian Networks
» Inference
■ Decision making
■ Learning networks from data
■ Reasoning over time
■ Applications
Inference
■ Patterns of reasoning
■ Basic inference
■ Exact inference
■ Exploiting structure
■ Approximate inference
Predictive Inference
How likely are elderly males to get malignant cancers?
P(C=malignant | Age>60, Gender=male)
[Figure: the cancer network over Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium and Lung Tumor]
![Page 54: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/54.jpg)
Combined
How likely is an elderly male patient with high Serum Calcium to have malignant cancer?
P(C = malignant | Age > 60, Gender = male, Serum Calcium = high)
[Figure: cancer network with nodes Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor]
![Page 55: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/55.jpg)
Explaining away
■ If we see a lung tumor, the probabilities of heavy smoking and of exposure to toxics both go up.
■ If we then observe heavy smoking, the probability of exposure to toxics goes back down.
[Figure: cancer network with nodes Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor]
![Page 56: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/56.jpg)
Inference in Belief Networks
■ Find P(Q = q | E = e)
◆ Q: the query variable
◆ E: the set of evidence variables
$$P(q \mid e) = \frac{P(q, e)}{P(e)}$$
where X1, …, Xn are the network variables other than Q and E:
$$P(q, e) = \sum_{x_1, \ldots, x_n} P(q, e, x_1, \ldots, x_n)$$
![Page 57: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/57.jpg)
Basic Inference
[Figure: network A → B]
P(b) = ?
![Page 58: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/58.jpg)
Product Rule
■ P(C, S) = P(C | S) P(S)
Joint distribution P(C, S), with rows for Smoking (S) and columns for Cancer (C):
S \ C    none    benign   malignant
no       0.768   0.024    0.008
light    0.132   0.012    0.006
heavy    0.035   0.010    0.005
[Figure: network S → C]
![Page 59: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/59.jpg)
Marginalization
S \ C    none    benign   malig    total
no       0.768   0.024    0.008    0.80
light    0.132   0.012    0.006    0.15
heavy    0.035   0.010    0.005    0.05
total    0.935   0.046    0.019
The row totals give P(Smoke); the column totals give P(Cancer).
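The marginalization in the table can be reproduced mechanically. The sketch below is not part of the original slides; it uses the joint table's numbers, and the variable names are illustrative. It also divides the joint by P(S) to recover P(C | S), which is the product rule read in reverse.

```python
# Joint distribution P(S, C) copied from the slide's table.
# Rows: smoking level S; columns: cancer state C.
joint = {
    "no":    {"none": 0.768, "benign": 0.024, "malignant": 0.008},
    "light": {"none": 0.132, "benign": 0.012, "malignant": 0.006},
    "heavy": {"none": 0.035, "benign": 0.010, "malignant": 0.005},
}

# Marginalize: sum each row to get P(S), sum each column to get P(C).
p_smoke = {s: sum(row.values()) for s, row in joint.items()}
p_cancer = {}
for row in joint.values():
    for c, p in row.items():
        p_cancer[c] = p_cancer.get(c, 0.0) + p

# Product rule in reverse: P(C | S) = P(C, S) / P(S).
p_cancer_given_smoke = {
    s: {c: p / p_smoke[s] for c, p in row.items()} for s, row in joint.items()
}

print({s: round(p, 3) for s, p in p_smoke.items()})
print({c: round(p, 3) for c, p in p_cancer.items()})
print(round(p_cancer_given_smoke["heavy"]["malignant"], 3))
```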
![Page 60: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/60.jpg)
Basic Inference
[Figure: chain A → B → C]
$$P(b) = \sum_a P(a, b) = \sum_a P(b \mid a)\, P(a)$$
$$P(c) = \sum_b P(c \mid b)\, P(b)$$
$$P(c) = \sum_{b,a} P(a, b, c) = \sum_{b,a} P(c \mid b)\, P(b \mid a)\, P(a) = \sum_b P(c \mid b) \underbrace{\sum_a P(b \mid a)\, P(a)}_{P(b)}$$
![Page 61: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/61.jpg)
Inference in trees
[Figure: node X with parents Y1 and Y2]
$$P(x) = \sum_{y_1, y_2} P(x \mid y_1, y_2)\, P(y_1, y_2)$$
Because Y1 and Y2 are independent:
$$P(x) = \sum_{y_1, y_2} P(x \mid y_1, y_2)\, P(y_1)\, P(y_2)$$
![Page 62: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/62.jpg)
Polytrees
■ A network is singly connected (a polytree) if it contains no undirected loops.
Theorem: Inference in a singly connected network can be done in linear time*.
Main idea: in variable elimination, we need only maintain distributions over single nodes.
* in network size, including table sizes.
![Page 63: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/63.jpg)
The problem with loops
[Figure: Cloudy is a parent of both Rain and Sprinkler; Rain and Sprinkler are parents of Grass-wet, which is a deterministic or]
P(c) = 0.5
P(r | c) = 0.99, P(r | ¬c) = 0.01
P(s | c) = 0.01, P(s | ¬c) = 0.99
The grass is dry only if there is no rain and no sprinklers:
P(¬g) = P(¬r, ¬s) ≈ 0
![Page 64: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/64.jpg)
The problem with loops contd.
$$P(\neg g) = \underbrace{P(\neg g \mid r, s)}_{0} P(r, s) + \underbrace{P(\neg g \mid \neg r, s)}_{0} P(\neg r, s) + \underbrace{P(\neg g \mid r, \neg s)}_{0} P(r, \neg s) + \underbrace{P(\neg g \mid \neg r, \neg s)}_{1} P(\neg r, \neg s)$$
$$= P(\neg r, \neg s) \approx 0$$
Problem: treating Rain and Sprinkler as independent instead gives
$$P(\neg r)\, P(\neg s) \approx 0.5 \cdot 0.5 = 0.25$$
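The gap between the two answers can be checked numerically. The sketch below is an illustration rather than part of the original slides: it uses the CPT values from the loop example (P(c) = 0.5 and the 0.99/0.01 entries for Rain and Sprinkler), reads the lost overbars as negations, and uses made-up variable names.

```python
# CPT values from the slide: P(c) = 0.5, P(r|c) = 0.99, P(r|~c) = 0.01,
# P(s|c) = 0.01, P(s|~c) = 0.99. Grass is dry only if no rain and no sprinkler.
p_c = 0.5
p_r_given = {True: 0.99, False: 0.01}   # P(rain | cloudy = key)
p_s_given = {True: 0.01, False: 0.99}   # P(sprinkler | cloudy = key)

# Exact: sum over Cloudy, the common parent that couples Rain and Sprinkler.
p_dry_exact = 0.0
for cloudy in (True, False):
    p_cloudy = p_c if cloudy else 1 - p_c
    p_dry_exact += p_cloudy * (1 - p_r_given[cloudy]) * (1 - p_s_given[cloudy])

# Naive: treat Rain and Sprinkler as independent and multiply their marginals.
p_r = sum((p_c if c else 1 - p_c) * p_r_given[c] for c in (True, False))
p_s = sum((p_c if c else 1 - p_c) * p_s_given[c] for c in (True, False))
p_dry_naive = (1 - p_r) * (1 - p_s)

print(round(p_dry_exact, 4), round(p_dry_naive, 4))
```

The exact value is 0.0099, while the independence shortcut gives 0.25, which is why single-node distributions are not enough once the network has a loop.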
![Page 65: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/65.jpg)
Variable elimination
[Figure: chain A → B → C]
$$P(c) = \sum_b P(c \mid b) \underbrace{\sum_a P(b \mid a)\, P(a)}_{P(b)}$$
P(A) × P(B | A) → P(B, A) → Σ_A → P(B)
P(B) × P(C | B) → P(C, B) → Σ_B → P(C)
![Page 66: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/66.jpg)
Inference as variable elimination
■ A factor over X is a function from val(X) to numbers in [0,1]:
◆ A CPT is a factor
◆ A joint distribution is also a factor
■ BN inference:
◆ factors are multiplied to give new ones
◆ variables in factors are summed out
■ A variable can be summed out as soon as all factors mentioning it have been multiplied.
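The two factor operations can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the factor representation (a tuple of variable names plus a table keyed by value tuples), the restriction to boolean variables, and the CPT numbers for the chain A → B → C are all assumptions made for the example.

```python
from itertools import product

# A factor is a pair (variables, table): `variables` is a tuple of names and
# `table` maps each tuple of values (here booleans) to a number.

def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
    out_tab = {}
    for values in product((True, False), repeat=len(out_vars)):
        row = dict(zip(out_vars, values))
        out_tab[values] = (f_tab[tuple(row[v] for v in f_vars)]
                           * g_tab[tuple(row[v] for v in g_vars)])
    return out_vars, out_tab

def sum_out(f, var):
    """Sum a variable out of a factor."""
    f_vars, f_tab = f
    idx = f_vars.index(var)
    out_vars = f_vars[:idx] + f_vars[idx + 1:]
    out_tab = {}
    for values, p in f_tab.items():
        key = values[:idx] + values[idx + 1:]
        out_tab[key] = out_tab.get(key, 0.0) + p
    return out_vars, out_tab

# Hypothetical CPTs for the chain A -> B -> C (values chosen for illustration).
p_a = (("A",), {(True,): 0.3, (False,): 0.7})
p_b_given_a = (("B", "A"), {(True, True): 0.9, (False, True): 0.1,
                            (True, False): 0.2, (False, False): 0.8})
p_c_given_b = (("C", "B"), {(True, True): 0.6, (False, True): 0.4,
                            (True, False): 0.1, (False, False): 0.9})

# Eliminate A, then B: each variable is summed out once every factor
# mentioning it has been multiplied in.
p_b = sum_out(multiply(p_a, p_b_given_a), "A")
p_c = sum_out(multiply(p_b, p_c_given_b), "B")
print(p_b, p_c)
```

Because A is summed out before C's CPT is touched, no intermediate factor ever mentions more than two variables.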
![Page 67: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/67.jpg)
Variable Elimination with loops
[Figure: cancer network with nodes Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor]
P(A) × P(G) × P(S | A,G) → P(A,G,S) → Σ_G → P(A,S)
P(A,S) × P(E | A) → P(A,E,S) → Σ_A → P(E,S)
P(E,S) × P(C | E,S) → P(E,S,C) → Σ_{E,S} → P(C)
P(C) × P(L | C) → P(C,L) → Σ_C → P(L)
Complexity is exponential in the number of variables in the largest factor.
![Page 68: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/68.jpg)
Join trees*
A join tree is a partially precompiled factorization.
[Figure: cancer network (Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor) and its join tree with cliques {A, G, S}, {A, E, S}, {E, S, C}, {C, L}, {C, S-C}; clique {A, G, S} is annotated with the factors P(A), P(G), P(S | A,G) and their product, summed to P(A,S)]
* aka junction trees, Lauritzen-Spiegelhalter, Hugin alg., …
![Page 69: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/69.jpg)
Exploiting Structure
Idea: explicitly decompose nodes
Noisy or:
[Figure: noisy-or decomposition with nodes Burglary, Earthquake, Motion sensed, Wire Move, and Alarm (deterministic or)]
![Page 70: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/70.jpg)
Noisy-or decomposition
[Figure: node E (labeled Earthquake) with inputs A, B, T, W (labeled Alarm, Burglary, Truck, Wind) and their noisy copies A', B', T', W'. In the first version a single or node combines A', B', T', W' into E; in the second a cascade of smaller or nodes does.]
Smaller families ⇒ smaller factors ⇒ faster inference
![Page 71: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/71.jpg)
Inference with continuous variables
■ Gaussian networks: polynomial time inference regardless of network structure
■ Conditional Gaussians:
◆ discrete variables cannot depend on continuous
[Figure: Smoke Concentration and Smoke Alarm; Fire and Wind Speed as parents of Smoke Concentration]
$$S \sim N(a_F + b_F w,\ \sigma_F^2)$$
■ These techniques do not work for general hybrid networks.
![Page 72: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/72.jpg)
Computational complexity
■ Theorem: Inference in a multi-connected Bayesian network is NP-hard.
Boolean 3CNF formula φ = (u ∨ v ∨ w) ∧ (u ∨ w ∨ y)
[Figure: root nodes U, V, W, Y, each with prior probability 1/2, feed one or node per clause; an and node combines the clause nodes]
Probability(and node is true) = 1/2^n · # satisfying assignments of φ
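The counting identity behind the reduction can be checked by brute force. The sketch below is illustrative only: it takes φ exactly as printed, with every literal positive (an assumption, since any negation marks are not recoverable from the text), and the helper names are made up.

```python
from itertools import product

# Clauses of phi as printed on the slide; each literal is (variable, is_positive).
# Assumption: all literals are taken as positive, because any negation marks
# are not recoverable from the text.
clauses = [
    [("u", True), ("v", True), ("w", True)],
    [("u", True), ("w", True), ("y", True)],
]
variables = ["u", "v", "w", "y"]

def satisfied(assignment):
    """True if every clause has at least one literal that holds."""
    return all(any(assignment[v] == pos for v, pos in clause) for clause in clauses)

# With independent priors of 1/2, every assignment has probability 1/2^n, so
# P(and node = true) = (# satisfying assignments) / 2^n.
n = len(variables)
num_sat = sum(
    satisfied(dict(zip(variables, values)))
    for values in product((True, False), repeat=n)
)
p_true = num_sat / 2 ** n
print(num_sat, p_true)
```

Since the probability is positive exactly when φ has a satisfying assignment, an algorithm that computes it also decides satisfiability.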
![Page 73: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/73.jpg)
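Stochastic simulation
[Figure: alarm network with Burglary and Earthquake as parents of Alarm, Alarm as parent of Call, and Earthquake as parent of Newscast; evidence Call = c]
P(b) = 0.03, P(e) = 0.001
P(a | b, e) = 0.98, P(a | b, ¬e) = 0.7, P(a | ¬b, e) = 0.4, P(a | ¬b, ¬e) = 0.01
P(c | a) = 0.8, P(c | ¬a) = 0.05
P(n | e) = 0.3, P(n | ¬e) = 0.001
Samples: rows of values for B, E, A, C, N; the example sample is annotated with the probabilities 0.03, 0.001, 0.3, 0.4, 0.8 used to draw it.
$$P(b \mid c) \approx \frac{\#\text{ of live samples with } B = b}{\text{total } \#\text{ of live samples}}$$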
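A minimal sketch of this sampling estimate in Python, added for illustration. It draws joint samples in topological order, keeps only the "live" ones that agree with the evidence Call = c, and reports the fraction with B = b. The function names, sample count, and seed are assumptions, as is the pairing of the four P(a | B, E) values with parent configurations.

```python
import random

# CPTs read off the slide; the pairing of P(a | B, E) values with parent
# configurations is an assumption recovered from the flattened table.
P_B = 0.03
P_E = 0.001
P_A = {(True, True): 0.98, (True, False): 0.7,
       (False, True): 0.4, (False, False): 0.01}   # P(a | B, E)
P_C = {True: 0.8, False: 0.05}                      # P(c | A)
P_N = {True: 0.3, False: 0.001}                     # P(n | E)

def sample_network(rng):
    """Draw one joint sample in topological order: B, E, A, C, N."""
    b = rng.random() < P_B
    e = rng.random() < P_E
    a = rng.random() < P_A[(b, e)]
    c = rng.random() < P_C[a]
    n = rng.random() < P_N[e]
    return b, e, a, c, n

def estimate_p_b_given_c(num_samples, seed=0):
    """Estimate P(b | c) as the fraction of live samples (C true) with B true."""
    rng = random.Random(seed)
    live = with_b = 0
    for _ in range(num_samples):
        b, _e, _a, c, _n = sample_network(rng)
        if c:                     # keep only samples consistent with the evidence
            live += 1
            with_b += b
    return with_b / live

print(estimate_p_b_given_c(100_000))
```

Under these assumed CPT values the exact answer works out to roughly 0.235. Only about 7% of the samples are live, so most of the sampling effort is discarded.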
![Page 74: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/74.jpg)
Likelihood weighting
[Figure: alarm network (Burglary, Earthquake, Alarm, Call, Newscast) with evidence Call = c]
P(c | a) = 0.8, P(c | ¬a) = 0.05
Samples: rows of values for B, E, A, C, N, each with a weight (the figure shows weights 0.8 and 0.95).
$$P(b \mid c) \approx \frac{\text{weight of samples with } B = b}{\text{total weight of samples}}$$
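A minimal sketch of likelihood weighting for the same query, added for illustration. Instead of sampling Call and rejecting mismatches, the evidence is clamped and each sample is weighted by the probability of the evidence given its sampled parent. The CPT values come from the alarm network on the stochastic simulation slide; the pairing of P(a | B, E) values with parent configurations, the function names, and the seed are assumptions. Newscast is left out because it does not affect P(b | c).

```python
import random

# CPTs from the alarm network; the pairing of P(a | B, E) values with parent
# configurations is an assumption.
P_B = 0.03
P_E = 0.001
P_A = {(True, True): 0.98, (True, False): 0.7,
       (False, True): 0.4, (False, False): 0.01}   # P(a | B, E)
P_C = {True: 0.8, False: 0.05}                      # P(c | A)

def weighted_sample(rng):
    """Sample B, E, A; weight the sample by P(C = true | a)."""
    b = rng.random() < P_B
    e = rng.random() < P_E
    a = rng.random() < P_A[(b, e)]
    weight = P_C[a]          # evidence C = true is clamped, not sampled
    return b, weight

def estimate_p_b_given_c(num_samples, seed=0):
    """Weighted fraction of samples with B true."""
    rng = random.Random(seed)
    total = with_b = 0.0
    for _ in range(num_samples):
        b, w = weighted_sample(rng)
        total += w
        if b:
            with_b += w
    return with_b / total

print(estimate_p_b_given_c(100_000))
```

Every sample contributes to the estimate, which is the main advantage over discarding samples that disagree with the evidence.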
![Page 75: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/75.jpg)
Other approaches
■ Search based techniques
◆ search for high-probability instantiations
◆ use instantiations to approximate probabilities
■ Structural approximation
◆ simplify network
    ■ eliminate edges, nodes
    ■ abstract node values
    ■ simplify CPTs
◆ do inference in simplified network
![Page 76: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/76.jpg)
CPCS Network
![Page 77: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/77.jpg)
Course Contents
■ Concepts in Probability
■ Bayesian Networks
■ Inference
» Decision making
■ Learning networks from data
■ Reasoning over time
■ Applications
![Page 78: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/78.jpg)
Decision making
■ Decisions, Preferences, and Utility functions
■ Influence diagrams
■ Value of information
![Page 79: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/79.jpg)
Decision making
■ Decision - an irrevocable allocation of domain resources
■ Decisions should be made so as to maximize expected utility.
■ View decision making in terms of
◆ Beliefs/Uncertainties
◆ Alternatives/Decisions
◆ Objectives/Utilities
![Page 80: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/80.jpg)
A Decision Problem
Should I have my party inside or outside?
in, dry: Regret
in, wet: Relieved
out, dry: Perfect!
out, wet: Disaster
![Page 81: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/81.jpg)
Value Function
■ A numerical score over all possible states of the world.
Location?   Weather?   Value
in          dry        $50
in          wet        $60
out         dry        $100
out         wet        $0
![Page 82: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/82.jpg)
Preference for Lotteries
[Figure: two lotteries compared. Lottery 1 pays $40,000 with probability 0.2 and $0 with probability 0.8. Lottery 2 pays $30,000 with probability 0.25 and $0 with probability 0.75.]
![Page 83: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/83.jpg)
Desired Properties for Preferences over Lotteries
[Figure: Lottery 1 pays $100 with probability p and $0 with probability 1-p. Lottery 2 pays $100 with probability q and $0 with probability 1-q.]
If you prefer $100 to $0 and p < q, then you (always) prefer Lottery 2 to Lottery 1.
![Page 84: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/84.jpg)
Expected Utility
Properties of preference ⇒ existence of a function U that satisfies:
the lottery [x1 with probability p1, x2 with p2, …, xn with pn] is preferred to the lottery [y1 with probability q1, y2 with q2, …, yn with qn]
iff
$$\sum_i q_i\, U(y_i) < \sum_i p_i\, U(x_i)$$
![Page 85: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/85.jpg)
Some properties of U
[Figure: two lotteries compared. Lottery 1 pays $40,000 with probability 0.8 and $0 with probability 0.2. Lottery 2 pays $30,000 with probability 1.]
⇒ U ≠ monetary payoff
![Page 86: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/86.jpg)
Attitudes towards risk
Lottery l: $1,000 with probability 0.5, $0 with probability 0.5
[Figure: utility U plotted against $ reward (axis marks at 0, 400, 500, 1000), showing U(l), U($500), the certain equivalent (marked at 400), and the insurance/risk premium]
U concave: risk averse
U convex: risk seeking
U linear: risk neutral
![Page 87: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/87.jpg)
Are people rational?
First choice: $40,000 with probability 0.2 ($0 with probability 0.8) versus $30,000 with probability 0.25 ($0 with probability 0.75).
0.2 • U($40k) > 0.25 • U($30k)
0.8 • U($40k) > U($30k)
Second choice: $40,000 with probability 0.8 ($0 with probability 0.2) versus $30,000 with probability 1.
0.8 • U($40k) < U($30k)
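The two inequalities clash for any utility function, not only when U($0) = 0. A short derivation that keeps the U($0) terms explicit (writing U_40, U_30, U_0 for U($40k), U($30k), U($0), a shorthand introduced here):

$$
\begin{aligned}
0.2\,U_{40} + 0.8\,U_{0} &> 0.25\,U_{30} + 0.75\,U_{0} && \text{(first choice)} \\
0.8\,U_{40} + 3.2\,U_{0} &> U_{30} + 3\,U_{0} && \text{(multiply by 4)} \\
0.8\,U_{40} + 0.2\,U_{0} &> U_{30} && \text{(subtract } 3\,U_{0}\text{)}
\end{aligned}
$$

The last line says the 0.8 lottery on $40,000 is preferred to a certain $30,000, which is the opposite of the second choice. No single U can represent both preferences.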
![Page 88: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/88.jpg)
Maximizing Expected Utility
Choose the action that maximizes expected utility.
P(dry) = 0.7, P(wet) = 0.3
U($50) = .632, U($60) = .699, U($100) = .865, U($0) = 0
EU(in) = 0.7 ⋅ .632 + 0.3 ⋅ .699 = .652
EU(out) = 0.7 ⋅ .865 + 0.3 ⋅ 0 = .605
Choose in.
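The same calculation as a short Python sketch, added for illustration. The probabilities and utilities are the ones on the slide; the dictionary layout and function name are assumptions.

```python
# Probabilities and utilities as given on the slide.
p_weather = {"dry": 0.7, "wet": 0.3}
utility = {
    ("in", "dry"): 0.632,   # U($50)
    ("in", "wet"): 0.699,   # U($60)
    ("out", "dry"): 0.865,  # U($100)
    ("out", "wet"): 0.0,    # U($0)
}

def expected_utility(action):
    """Probability-weighted utility of an action over the weather outcomes."""
    return sum(p * utility[(action, w)] for w, p in p_weather.items())

eu = {action: expected_utility(action) for action in ("in", "out")}
best_action = max(eu, key=eu.get)
print(eu, best_action)
```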
![Page 89: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/89.jpg)
Multi-attribute utilities (or: Money isn’t everything)
■ Many aspects of an outcome combine to determine our preferences.
◆ vacation planning: cost, flying time, beach quality, food quality, …
◆ medical decision making: risk of death (micromort), quality of life (QALY), cost of treatment, …
■ For rational decision making, we must combine all relevant factors into a single utility function.
![Page 90: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/90.jpg)
Influence Diagrams
[Figure: influence diagram with nodes Burglary, Earthquake, Alarm, Call, Newscast, Go Home? (decision), Miss Meeting, Goods Recovered, Big Sale, Utility]
![Page 91: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/91.jpg)
Decision Making with Influence Diagrams
Call?              Go Home?
Neighbor Phoned    Yes
No Phone Call      No
Expected Utility of this policy is 100
[Figure: influence diagram with nodes Burglary, Earthquake, Alarm, Call, Newscast, Go Home? (decision), Miss Meeting, Goods Recovered, Big Sale, Utility]
![Page 92: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/92.jpg)
Value-of-Information
■ What is it worth to get another piece of information?
■ What is the increase in (maximized) expected utility if I make a decision with an additional piece of information?
■ Additional information (if free) cannot make you worse off.
■ There is no value-of-information if you will not change your decision.
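As a small worked illustration, the sketch below computes the value of information for the party problem, reusing the probabilities and utilities from the Maximizing Expected Utility slide. The scenario of learning the weather perfectly before choosing is a hypothetical added here, and the variable names are illustrative.

```python
# Numbers from the party problem; the perfect-forecast scenario is hypothetical.
p_weather = {"dry": 0.7, "wet": 0.3}
utility = {
    ("in", "dry"): 0.632, ("in", "wet"): 0.699,
    ("out", "dry"): 0.865, ("out", "wet"): 0.0,
}
actions = ("in", "out")

# Without information: commit to one action before the weather is known.
eu_without = max(
    sum(p * utility[(a, w)] for w, p in p_weather.items()) for a in actions
)

# With perfect information: pick the best action separately for each outcome.
eu_with = sum(
    p * max(utility[(a, w)] for a in actions) for w, p in p_weather.items()
)

value_of_information = eu_with - eu_without
print(round(eu_without, 4), round(eu_with, 4), round(value_of_information, 4))
```

The difference cannot be negative, because an expectation of per-outcome maxima is at least the maximum of the expectations. It is positive here because the forecast changes the decision (out when dry, in when wet).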
![Page 93: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/93.jpg)
Value-of-Information in an Influence Diagram
How much better can we do when this arc is here?
[Figure: influence diagram with one arc highlighted; nodes Burglary, Earthquake, Alarm, Call, Newscast, Go Home? (decision), Miss Meeting, Goods Recovered, Big Sale, Utility]
![Page 94: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/94.jpg)
Value-of-Information is the increase in Expected Utility
Phonecall?   Newscast?   Go Home?
Yes          Quake       No
Yes          No Quake    Yes
No           Quake       No
No           No Quake    No
Expected Utility of this policy is 112.5
[Figure: influence diagram with nodes Burglary, Earthquake, Alarm, Call, Newscast, Go Home? (decision), Miss Meeting, Goods Recovered, Big Sale, Utility]
![Page 95: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/95.jpg)
Course Contents
■ Concepts in Probability
■ Bayesian Networks
■ Inference
■ Decision making
» Learning networks from data
■ Reasoning over time
■ Applications
![Page 96: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/96.jpg)
Learning networks from data
■ The learning task
■ Parameter learning
◆ Fully observable
◆ Partially observable
■ Structure learning
■ Hidden variables
![Page 97: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/97.jpg)
The learning task
Input: training data (data cases over B, E, A, C, N)
Output: BN modeling data
[Figure: table of data cases alongside the alarm network (Burglary, Earthquake, Alarm, Call, Newscast)]
■ Input: fully or partially observable data cases?
■ Output: parameters or also structure?
![Page 98: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/98.jpg)
Parameter learning: one variable
■ Unfamiliar coin:
◆ Let θ = bias of coin (long-run fraction of heads)
■ If θ known (given), then
◆ P(X = heads | θ) = θ
■ Different coin tosses independent given θ
⇒ $P(X_1, \ldots, X_n \mid \theta) = \theta^h (1-\theta)^t$ (h heads, t tails)
![Page 99: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/99.jpg)
Maximum likelihood
■ Input: a set of previous coin tosses
◆ X1, …, Xn = {H, T, H, H, H, T, T, H, . . ., H} (h heads, t tails)
■ Goal: estimate θ
■ The likelihood $P(X_1, \ldots, X_n \mid \theta) = \theta^h (1-\theta)^t$
■ The maximum likelihood solution is:
$$\theta^* = \frac{h}{h + t}$$
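The estimate θ* = h / (h + t) follows from maximizing the log-likelihood. A short derivation, assuming 0 < θ < 1 and h + t > 0:

$$
\begin{aligned}
\log P(X_1, \ldots, X_n \mid \theta) &= h \log \theta + t \log (1 - \theta) \\
\frac{d}{d\theta}\bigl[\, h \log \theta + t \log (1 - \theta) \,\bigr] &= \frac{h}{\theta} - \frac{t}{1 - \theta} = 0 \\
h\,(1 - \theta) = t\,\theta \quad &\Longrightarrow \quad \theta^* = \frac{h}{h + t}
\end{aligned}
$$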
![Page 100: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/100.jpg)
Bayesian approach
Uncertainty about θ ⇒ distribution over its values
[Figure: density P(θ) plotted against θ]
$$P(X = \text{heads}) = \int_{-\infty}^{\infty} P(X = \text{heads} \mid \theta)\, P(\theta)\, d\theta = \int_{-\infty}^{\infty} \theta\, P(\theta)\, d\theta$$
![Page 101: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/101.jpg)
Conditioning on data
D: h heads, t tails
$$P(\theta \mid D) \propto P(\theta)\, P(D \mid \theta) = P(\theta)\, \theta^h (1-\theta)^t$$
[Figure: prior P(θ) and posterior P(θ | D) after observing 1 head and 1 tail]
![Page 102: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/102.jpg)
Good parameter distribution:
$$\mathrm{Beta}(\alpha_h, \alpha_t) \propto \theta^{\alpha_h - 1} (1-\theta)^{\alpha_t - 1}$$
* Dirichlet distribution generalizes Beta to non-binary variables.
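What makes the Beta family convenient is that multiplying it by the coin likelihood θ^h (1-θ)^t gives another Beta, with h and t added to the two parameters. The sketch below illustrates that update in Python; the function names and the prior and data values are assumptions chosen for the example.

```python
def beta_update(alpha_h, alpha_t, h, t):
    """Posterior Beta parameters after observing h heads and t tails."""
    return alpha_h + h, alpha_t + t

def prob_heads(alpha_h, alpha_t):
    """P(X = heads) = E[theta] under Beta(alpha_h, alpha_t)."""
    return alpha_h / (alpha_h + alpha_t)

# Hypothetical prior and data, chosen only for illustration.
prior = (1.0, 1.0)                  # Beta(1, 1) is uniform over theta
posterior = beta_update(*prior, h=3, t=1)
print(posterior, prob_heads(*posterior))
```

With a uniform prior and 3 heads in 4 tosses, the predicted probability of heads is 4/6, slightly closer to 1/2 than the maximum likelihood estimate of 3/4.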
![Page 103: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/103.jpg)
General parameter learning
■ A multi-variable BN is composed of several independent parameters (“coins”).
[Figure: network A → B]
Three parameters: θ_A, θ_{B|a}, θ_{B|¬a}
■ Can use the same techniques as in the one-variable case to learn each one separately.
Max likelihood estimate of θ_{B|a} would be:
$$\theta^*_{B \mid a} = \frac{\#\text{data cases with } b, a}{\#\text{data cases with } a}$$
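For fully observed data, each of the three parameters is just a ratio of counts. A minimal sketch, assuming a hypothetical list of data cases (a, b) for the network A → B and illustrative parameter names; it also assumes both values of A occur in the data, so no count in a denominator is zero.

```python
# Hypothetical fully observed data cases (a, b) for the network A -> B.
data = [(True, True), (True, False), (True, True), (False, False),
        (False, True), (True, True), (False, False), (False, False)]

def mle(data):
    """Maximum likelihood estimates of theta_A, theta_B|a, theta_B|~a by counting."""
    n_a = sum(1 for a, _ in data if a)
    n_not_a = len(data) - n_a
    n_b_and_a = sum(1 for a, b in data if a and b)
    n_b_and_not_a = sum(1 for a, b in data if b and not a)
    return {
        "theta_A": n_a / len(data),
        "theta_B|a": n_b_and_a / n_a,
        "theta_B|~a": n_b_and_not_a / n_not_a,
    }

params = mle(data)
print(params)
```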
![Page 104: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/104.jpg)
Partially observable data
[Figure: data cases over B, E, A, C, N with missing entries marked "?", alongside the alarm network (Burglary, Earthquake, Alarm, Call, Newscast)]
■ Fill in missing data with “expected” value
◆ expected = distribution over possible values
◆ use “best guess” BN to estimate distribution
![Page 105: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/105.jpg)
Intuition
■ In fully observable case:
$$\theta^*_{n \mid e} = \frac{\#\text{data cases with } n, e}{\#\text{data cases with } e} = \frac{\sum_j I(n, e \mid d_j)}{\sum_j I(e \mid d_j)}$$
where I(e | d_j) = 1 if E = e in data case d_j, and 0 otherwise.
■ In partially observable case I is unknown.
Best estimate for I is:
$$\hat{I}(n, e \mid d_j) = P_{\theta^*}(n, e \mid d_j)$$
Problem: θ* unknown.
![Page 106: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/106.jpg)
106© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved.
Expectation Maximization (EM)
Repeat:
■ Expectation (E) step
◆ Use current parameters θ to estimate filled in data: $\hat{I}(n, e \mid d_j) = P_{\theta}(n, e \mid d_j)$
■ Maximization (M) step
◆ Use filled in data to do max likelihood estimation: $\tilde{\theta}_{n|e} = \frac{\sum_j \hat{I}(n, e \mid d_j)}{\sum_j \hat{I}(e \mid d_j)}$
■ Set: $\theta := \tilde{\theta}$
until convergence.
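The E and M steps can be sketched for the smallest interesting case. This is a minimal illustration, not the tutorial's code: it assumes a two-node network E → N with binary variables, and the function name `em_two_node`, the starting parameters, and the fixed iteration count are all hypothetical choices made here.

```python
from itertools import product

def em_two_node(data, iters=50, p_e=0.5, p_n=(0.4, 0.6)):
    """EM for a two-node network E -> N over binary variables (illustrative).

    data: list of (e, n) pairs; each entry is 0, 1, or None (missing).
    Returns (P(E=1), [P(N=1 | E=0), P(N=1 | E=1)]).
    """
    p_n = list(p_n)
    for _ in range(iters):
        cnt_e = [0.0, 0.0]                  # expected number of cases with E=e
        cnt_en = [[0.0, 0.0], [0.0, 0.0]]   # cnt_en[e][n]: expected cases with E=e, N=n
        # E step: spread each data case over the completions consistent with it,
        # weighted by their probability under the current parameters.
        for e_obs, n_obs in data:
            weights = {}
            for e, n in product((0, 1), repeat=2):
                if e_obs is not None and e != e_obs:
                    continue
                if n_obs is not None and n != n_obs:
                    continue
                pe = p_e if e == 1 else 1.0 - p_e
                pn = p_n[e] if n == 1 else 1.0 - p_n[e]
                weights[(e, n)] = pe * pn
            z = sum(weights.values())       # assumed > 0 (no degenerate parameters)
            for (e, n), w in weights.items():
                cnt_e[e] += w / z
                cnt_en[e][n] += w / z
        # M step: maximum likelihood estimates from the expected counts.
        p_e = cnt_e[1] / (cnt_e[0] + cnt_e[1])
        p_n = [cnt_en[e][1] / cnt_e[e] if cnt_e[e] > 0 else p_n[e] for e in (0, 1)]
    return p_e, p_n

# One case has N missing; it carries no information about P(N | E=1).
p_e_hat, p_n_hat = em_two_node([(1, 1), (1, 0), (1, None), (0, 0)])
```

With fully observed data the expected counts are ordinary counts, so a single iteration already gives the fully observable estimate from the Intuition slide.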
![Page 107: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/107.jpg)
Structure learning
Goal: find “good” BN structure (relative to data)
Solution: do heuristic search over space of network structures.
![Page 108: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/108.jpg)
Search space
Space = network structures
Operators = add/reverse/delete edges
![Page 109: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/109.jpg)
Heuristic search
Use scoring function to do heuristic search (any algorithm).
Greedy hill-climbing with randomness works pretty well.
[Figure: plot with axis label "score"]
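A minimal sketch of greedy hill-climbing with the add/reverse/delete operators follows. The helper names (`is_acyclic`, `neighbors`, `hill_climb`) are hypothetical, the scoring function is passed in by the caller, and the randomness mentioned on the slide (for example random restarts) is left out for brevity.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """True if the directed graph has no cycle (repeatedly strip parentless nodes)."""
    remaining = set(nodes)
    while remaining:
        roots = [v for v in remaining
                 if not any(c == v and p in remaining for p, c in edges)]
        if not roots:
            return False
        remaining -= set(roots)
    return True

def neighbors(nodes, edges):
    """All structures one operator away: delete, reverse, or add a single edge."""
    for a, b in permutations(nodes, 2):
        if (a, b) in edges:
            yield edges - {(a, b)}                    # delete a -> b
            yield (edges - {(a, b)}) | {(b, a)}       # reverse a -> b
        elif (b, a) not in edges:
            yield edges | {(a, b)}                    # add a -> b

def hill_climb(nodes, score, start=frozenset()):
    """Move to the best-scoring acyclic neighbor until no neighbor improves."""
    current, best = start, score(start)
    while True:
        candidates = [e for e in neighbors(nodes, current) if is_acyclic(nodes, e)]
        if not candidates:
            return current, best
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score <= best:
            return current, best
        current, best = top, top_score

# Toy score for illustration only: closeness to a known target structure.
target = frozenset({("A", "B"), ("B", "C")})
found, found_score = hill_climb(["A", "B", "C"], lambda e: -len(e ^ target))
```

In practice the `score` argument would be one of the data-based scoring functions; the toy score here only exercises the search.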
![Page 110: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/110.jpg)
Scoring
■ Fill in parameters using previous techniques & score completed networks.
■ One possibility for score: likelihood function: Score(B) = P(data | B)
Example: X, Y independent coin tosses; typical data = (27 h-h, 22 h-t, 25 t-h, 26 t-t)
Maximum likelihood network structure:
[Figure: X and Y joined by an arc]
Max. likelihood network typically fully connected
This is not surprising: maximum likelihood always overfits…
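The coin-toss example can be checked numerically. The sketch below compares the maximized log-likelihood of the structure with no arc against the structure with an arc X → Y, using the counts from the slide; the function names are hypothetical, and working in logs is a convenience assumed here.

```python
from math import log

# Counts from the slide: (x, y) outcomes of two independent coins.
counts = {("h", "h"): 27, ("h", "t"): 22, ("t", "h"): 25, ("t", "t"): 26}

def loglik_independent(counts):
    """Max log-likelihood with no arc: P(x, y) = P(x) P(y)."""
    n = sum(counts.values())
    px = {v: sum(c for (x, _), c in counts.items() if x == v) / n for v in "ht"}
    py = {v: sum(c for (_, y), c in counts.items() if y == v) / n for v in "ht"}
    return sum(c * log(px[x] * py[y]) for (x, y), c in counts.items())

def loglik_connected(counts):
    """Max log-likelihood with an arc X -> Y: P(x) P(y | x) = count(x, y) / N."""
    n = sum(counts.values())
    return sum(c * log(c / n) for c in counts.values() if c > 0)

gain = loglik_connected(counts) - loglik_independent(counts)
```

The gain is positive but small: the connected structure fits the sampling noise in the 100 tosses, which is the overfitting the slide refers to.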
![Page 111: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/111.jpg)
Better scoring functions
■ MDL formulation: balance fit to data and model complexity (# of parameters)
Score(B) = log P(data | B) - model complexity
■ Full Bayesian formulation
◆ prior on network structures & parameters
◆ more parameters ⇒ higher dimensional space
◆ get balance effect as a byproduct*
* with Dirichlet parameter prior, MDL is an approximation to full Bayesian score.
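To see the balance effect on the earlier coin-toss data (27 h-h, 22 h-t, 25 t-h, 26 t-t), here is a small sketch of an MDL-style score. The slide does not give the penalty term; the form (log N / 2) × (number of parameters), applied to the log-likelihood, is a common choice assumed here, and the name `mdl` is hypothetical.

```python
from math import log

counts = {("h", "h"): 27, ("h", "t"): 22, ("t", "h"): 25, ("t", "t"): 26}
N = sum(counts.values())
px_h = (counts[("h", "h")] + counts[("h", "t")]) / N   # ML estimate of P(X = h)
py_h = (counts[("h", "h")] + counts[("t", "h")]) / N   # ML estimate of P(Y = h)

def p(value, p_heads):
    return p_heads if value == "h" else 1.0 - p_heads

# Maximized log-likelihoods of the two candidate structures.
ll_indep = sum(c * log(p(x, px_h) * p(y, py_h)) for (x, y), c in counts.items())
ll_conn = sum(c * log(c / N) for c in counts.values())

def mdl(loglik, n_params, n_cases):
    """Assumed penalty: half of log(n_cases) per free parameter."""
    return loglik - 0.5 * n_params * log(n_cases)

score_indep = mdl(ll_indep, 2, N)   # P(X), P(Y): 2 free parameters
score_conn = mdl(ll_conn, 3, N)     # P(X), P(Y | X=h), P(Y | X=t): 3 free parameters
```

Under this penalty the structure without the arc scores higher, because the extra parameter costs more than the small likelihood gain it buys.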
![Page 112: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/112.jpg)
Hidden variables
■ There may be interesting variables that we never get to observe:
◆ topic of a document in information retrieval;
◆ user’s current task in online help system.
■ Our learning algorithm should
◆ hypothesize the existence of such variables;
◆ learn an appropriate state space for them.
![Page 113: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/113.jpg)
[Figure: randomly scattered data, axes E1, E2, E3]
![Page 114: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/114.jpg)
[Figure: actual data, axes E1, E2, E3]
![Page 115: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/115.jpg)
Bayesian clustering (Autoclass)
■ (hypothetical) class variable never observed
■ if we know that there are k classes, just run EM
■ learned classes = clusters
■ Bayesian analysis allows us to choose k, trade off fit to data with model complexity
[Figure: naïve Bayes model: Class with children E1, E2, …, En]
![Page 116: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/116.jpg)
[Figure: clustered distributions, axes E1, E2, E3]
![Page 117: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/117.jpg)
Detecting hidden variables
■ Unexpected correlations ⇒ hidden variables.
[Figure: three models over Cholesterolemia, Test1, Test2, Test3: the hypothesized model, the data model, and the “correct” model, which adds Hypothyroid]
![Page 118: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/118.jpg)
Course Contents
■ Concepts in Probability
■ Bayesian Networks
■ Inference
■ Decision making
■ Learning networks from data
» Reasoning over time
■ Applications
![Page 119: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/119.jpg)
Reasoning over time
■ Dynamic Bayesian networks
■ Hidden Markov models
■ Decision-theoretic planning
◆ Markov decision problems
◆ Structured representation of actions
◆ The qualification problem & the frame problem
◆ Causality (and the frame problem revisited)
![Page 120: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/120.jpg)
Dynamic environments
[Figure: State(t) → State(t+1) → State(t+2)]
■ Markov property:
◆ past independent of future given current state;
◆ a conditional independence assumption;
◆ implied by fact that there are no arcs t → t+2.
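Written out for the three-slice chain, the Markov property and the factorization it licenses are as follows. The symbol $S_t$ is shorthand introduced here for State(t); it does not appear on the slide.

```latex
P(S_{t+2} \mid S_{t}, S_{t+1}) = P(S_{t+2} \mid S_{t+1})
\qquad\Longrightarrow\qquad
P(S_{t}, S_{t+1}, S_{t+2}) = P(S_{t})\, P(S_{t+1} \mid S_{t})\, P(S_{t+2} \mid S_{t+1})
```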
![Page 121: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/121.jpg)
Dynamic Bayesian networks
■ State described via random variables.
■ Each variable depends only on a few others.
[Figure: DBN over Velocity, Position, Weather, Drunk at time slices t, t+1, t+2, ...]
![Page 122: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/122.jpg)
Hidden Markov model
[Figure: State(t) → State(t+1) (state transition model); State(t) → Obs(t) and State(t+1) → Obs(t+1) (observation model)]
■ An HMM is a simple model for a partially observable stochastic domain.
![Page 123: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/123.jpg)
Hidden Markov models (HMMs)
Partially observable stochastic environment:
■ Mobile robots:
◆ states = location
◆ observations = sensor input
■ Speech recognition:
◆ states = phonemes
◆ observations = acoustic signal
■ Biological sequencing:
◆ states = protein structure
◆ observations = amino acids
[Figure: example labeled with probabilities 0.8, 0.15, 0.05]
![Page 124: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/124.jpg)
HMMs and DBNs
■ HMMs are just very simple DBNs.
■ Standard inference & learning algorithms for HMMs are instances of DBN algorithms
◆ Forward-backward = polytree
◆ Viterbi = most probable explanation.
◆ Baum-Welch = EM
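Two of the algorithms named above can be sketched for a discrete HMM: the forward pass (filtering) and Viterbi (most probable state sequence). The function names and the two-state parameters are hypothetical illustration values, not taken from the tutorial; `prior` is taken to be the distribution of the first state, and the first observation is emitted by that state.

```python
def forward(prior, trans, obs_model, observations):
    """Filtered distribution P(State(t) | Obs(1..t)) after the last observation."""
    n = len(prior)
    belief = list(prior)
    for t, o in enumerate(observations):
        if t > 0:  # predict: push the belief through the transition model
            belief = [sum(belief[i] * trans[i][j] for i in range(n)) for j in range(n)]
        belief = [belief[j] * obs_model[j][o] for j in range(n)]  # weight by evidence
        z = sum(belief)
        belief = [b / z for b in belief]
    return belief

def viterbi(prior, trans, obs_model, observations):
    """Most probable state sequence given the observations."""
    n = len(prior)
    delta = [prior[j] * obs_model[j][observations[0]] for j in range(n)]
    back = []
    for o in observations[1:]:
        ptr, new = [], []
        for j in range(n):
            i_best = max(range(n), key=lambda i: delta[i] * trans[i][j])
            ptr.append(i_best)
            new.append(delta[i_best] * trans[i_best][j] * obs_model[j][o])
        back.append(ptr)
        delta = new
    state = max(range(n), key=lambda j: delta[j])
    path = [state]
    for ptr in reversed(back):   # follow back-pointers from the last slice
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-state model: sticky states, noisy binary sensor.
prior = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]        # trans[i][j] = P(State(t+1)=j | State(t)=i)
obs_model = [[0.8, 0.2], [0.2, 0.8]]    # obs_model[j][o] = P(Obs=o | State=j)
path = viterbi(prior, trans, obs_model, [0, 0, 1, 1])
belief = forward(prior, trans, obs_model, [0, 0, 1, 1])
```

Both functions multiply raw probabilities, which is fine for short sequences; longer ones would normally be handled in log space or with per-step rescaling.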
![Page 125: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/125.jpg)
Acting under uncertainty
Markov Decision Problem (MDP)
[Figure: State(t) → State(t+1) → State(t+2), with Action(t) and Action(t+1) (action model) and Reward(t), Reward(t+1); the agent observes the state]
■ Overall utility = sum of momentary rewards.
■ Allows rich preference model, e.g. rewards corresponding to “get to goal asap”: +100 for goal states, -1 for other states.
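The “get to goal asap” reward model can be tried on a toy problem with value iteration. Everything about the environment below is an assumption made for illustration: a line of four states, the rightmost one an absorbing goal worth +100, moves that succeed with probability 0.8 and otherwise leave the agent in place, and a reward of -1 for every step taken from a non-goal state.

```python
def value_iteration(n_states=4, p_success=0.8, goal_reward=100.0,
                    step_reward=-1.0, tol=1e-9):
    """Undiscounted value iteration on a line of states with an absorbing goal."""
    goal = n_states - 1
    actions = {"left": -1, "right": +1}

    def target(s, a):
        # intended next state, clamped to the ends of the line
        return min(max(s + actions[a], 0), goal)

    V = [0.0] * n_states
    V[goal] = goal_reward                 # collected once, on arrival
    while True:
        delta = 0.0
        for s in range(goal):
            best = max(step_reward
                       + p_success * V[target(s, a)]
                       + (1.0 - p_success) * V[s]
                       for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = [max(actions, key=lambda a: V[target(s, a)]) for s in range(goal)]
    return V, policy

V, policy = value_iteration()
```

Each state's value is the goal reward minus the expected number of steps needed to reach it (1.25 per cell at 0.8 success probability), so the optimal policy heads for the goal as directly as possible.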
![Page 126: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/126.jpg)
Partially observable MDPs
[Figure: State(t) → State(t+1) → State(t+2), with Action(t), Action(t+1), Reward(t), Reward(t+1), and Obs(t), Obs(t+1); Obs depends on state; the agent observes Obs, not state]
■ The optimal action at time t depends on the entire history of previous observations.
■ Instead, a distribution over State(t) suffices.
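The distribution over State(t), often called the belief state, can be updated recursively after each action and observation. The notation below is introduced here and is not the slide's: $b_t$ is the belief at time $t$, $a_t$ the action taken, and $o_{t+1}$ the observation received.

```latex
b_{t+1}(s') \;=\; \frac{P(o_{t+1} \mid s') \sum_{s} P(s' \mid s, a_t)\, b_t(s)}
                       {\sum_{s''} P(o_{t+1} \mid s'') \sum_{s} P(s'' \mid s, a_t)\, b_t(s)}
```

The sum is the prediction through the action model, the factor $P(o_{t+1} \mid s')$ is the observation model, and the denominator normalizes.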
![Page 127: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/127.jpg)
Structured representation
[Figure: two-slice networks for the actions Move and Turn, each relating Position(t), Holding(t), Direction(t) (preconditions) to Position(t+1), Holding(t+1), Direction(t+1) (effects)]
Probabilistic action model
• allows for exceptions & qualifications;
• persistence arcs: a solution to the frame problem.
![Page 128: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/128.jpg)
Causality
■ Modeling the effects of interventions
■ Observing vs. “setting” a variable
■ A form of persistence modeling
![Page 129: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/129.jpg)
Causal Theory
[Figure: Temperature → Distributor Cap → Car Starts]
Cold temperatures can cause the distributor cap to become cracked.
If the distributor cap is cracked, then the car is less likely to start.
![Page 130: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/130.jpg)
Setting vs. Observing
[Figure: Temperature, Distributor Cap, Car Starts]
The car does not start. Will it start if we replace the distributor?
![Page 131: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/131.jpg)
Predicting the effects of interventions
[Figure: Temperature, Distributor Cap, Car Starts]
The car does not start. Will it start if we replace the distributor?
What is the probability that the car will start if I replace the distributor cap?
![Page 132: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/132.jpg)
Mechanism Nodes
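[Figure: Distributor and Mstart are both parents of Start]

| Mstart | Distributor | Starts? |
|---|---|---|
| Always Starts | Cracked | Yes |
| Always Starts | Normal | Yes |
| Never Starts | Cracked | No |
| Never Starts | Normal | No |
| Normal | Cracked | No |
| Normal | Normal | Yes |
| Inverse | Cracked | Yes |
| Inverse | Normal | No |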
![Page 133: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/133.jpg)
Persistence
[Figure: pre-action and post-action copies of Temperature, Dist, Mstart, Start; Dist is observed Abnormal pre-action and set to Normal post-action; a persistence arc links the pre-action and post-action models]
Assumption: The mechanism relating Dist to Start is unchanged by replacing the Distributor.
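The repair question can be worked through with the mechanism-node table. The sketch below encodes that table, conditions Mstart on the pre-action observations (distributor cracked, car does not start), and then predicts Start after the distributor is set to Normal with Mstart persisting. The prior over Mstart is a hypothetical illustration value, and the function name is made up here.

```python
# Deterministic response of Start for each (Mstart, Distributor) pair,
# copied from the mechanism-node table.
STARTS = {
    ("Always Starts", "Cracked"): True,  ("Always Starts", "Normal"): True,
    ("Never Starts", "Cracked"): False,  ("Never Starts", "Normal"): False,
    ("Normal", "Cracked"): False,        ("Normal", "Normal"): True,
    ("Inverse", "Cracked"): True,        ("Inverse", "Normal"): False,
}

def p_start_after_repair(prior_m, observed_dist="Cracked", observed_start=False):
    """P(car starts once Dist is set to Normal | pre-action observations)."""
    # Pre-action: keep only mechanism states consistent with what was observed.
    post = {m: p for m, p in prior_m.items()
            if STARTS[(m, observed_dist)] == observed_start}
    z = sum(post.values())
    # Post-action: Dist is set to Normal; Mstart persists unchanged.
    return sum(p for m, p in post.items() if STARTS[(m, "Normal")]) / z

# Hypothetical prior over the mechanism node (not from the slides).
prior_m = {"Always Starts": 0.05, "Never Starts": 0.10, "Normal": 0.80, "Inverse": 0.05}
p_repair = p_start_after_repair(prior_m)
```

Only the Never Starts and Normal mechanisms are consistent with a cracked distributor and a car that will not start, so the answer is the share of Normal among those two.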
![Page 134: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/134.jpg)
Course Contents
■ Concepts in Probability
■ Bayesian Networks
■ Inference
■ Decision making
■ Learning networks from data
■ Reasoning over time
» Applications
![Page 135: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/135.jpg)
Applications
■ Medical expert systems
◆ Pathfinder
◆ Parenting MSN
■ Fault diagnosis
◆ Ricoh FIXIT
◆ Decision-theoretic troubleshooting
■ Vista
■ Collaborative filtering
![Page 136: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/136.jpg)
Why use Bayesian Networks?
■ Explicit management of uncertainty/tradeoffs
■ Modularity implies maintainability
■ Better, flexible, and robust recommendation strategies
![Page 137: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/137.jpg)
Pathfinder
■ Pathfinder is one of the first BN systems.
■ It performs diagnosis of lymph-node diseases.
■ It deals with over 60 diseases and 100 findings.
■ Commercialized by Intellipath and Chapman Hall publishing and applied to about 20 tissue types.
![Page 138: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/138.jpg)
Studies of Pathfinder Diagnostic Performance
■ Naïve Bayes performed considerably better than certainty factors and Dempster-Shafer Belief Functions.
■ Incorrect zero probabilities caused 10% of cases to be misdiagnosed.
■ Full Bayesian network model with feature dependencies did best.
![Page 139: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/139.jpg)
Commercial system: Integration
■ Expert System with advanced diagnostic capabilities
◆ uses key features to form the differential diagnosis
◆ recommends additional features to narrow the differential diagnosis
◆ recommends features needed to confirm the diagnosis
◆ explains correct and incorrect decisions
■ Video atlases and text organized by organ system
■ “Carousel Mode” to build customized lectures
■ Anatomic Pathology Information System
![Page 140: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/140.jpg)
On Parenting: Selecting problem
■ Diagnostic indexing for HomeHealth site on Microsoft Network
■ Enter symptoms for pediatric complaints
■ Recommends multimedia content
![Page 141: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/141.jpg)
On Parenting: MSN
Original Multiple Fault Model
![Page 142: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/142.jpg)
Single Fault approximation
![Page 143: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/143.jpg)
On Parenting: Selecting problem
![Page 144: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/144.jpg)
Performing diagnosis/indexing
![Page 145: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/145.jpg)
RICOH Fixit
■ Diagnostics and information retrieval
![Page 146: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/146.jpg)
FIXIT: Ricoh copy machine
![Page 147: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/147.jpg)
Online Troubleshooters
![Page 148: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/148.jpg)
Define Problem
![Page 149: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/149.jpg)
Gather Information
![Page 150: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/150.jpg)
Get Recommendations
![Page 151: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/151.jpg)
Vista Project: NASA Mission Control
Decision-theoretic methods for display for high-stakes aerospace decisions
![Page 152: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/152.jpg)
Costs & Benefits of Viewing Information
[Figure: decision quality (vertical axis) against quantity of relevant information (horizontal axis)]
![Page 153: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/153.jpg)
Status Quo at Mission Control
![Page 154: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/154.jpg)
Time-Critical Decision Making
• Consideration of time delay in temporal process
[Figure: influence diagram with State of System H at to and at t’, evidence E1, E2, …, En at to and at t’, Action A,t, Duration of Process, and Utility]
![Page 155: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/155.jpg)
Simplification: Highlighting Decisions
■ Variable threshold to control amount of highlighted information
[Figure: two display panels, each listing Oxygen, Fuel Pres, Chamb Pres, He Pres, Delta v; four columns of readings: (15.6, 10.5, 5.4, 17.7, 33.3), (14.2, 11.8, 4.8, 14.7, 63.3), (10.6, 12.5, 0.0, 15.7, 63.3), (10.2, 12.8, 0.0, 15.8, 32.3)]
![Page 156: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/156.jpg)
![Page 157: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/157.jpg)
![Page 158: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/158.jpg)
What is Collaborative Filtering?
■ A way to find cool websites, news stories, music artists, etc.
■ Uses data on the preferences of many users, not descriptions of the content.
■ Firefly, Net Perceptions (GroupLens), and others offer this technology.
![Page 159: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/159.jpg)
Bayesian Clustering for Collaborative Filtering
■ Probabilistic summary of the data
■ Reduces the number of parameters to represent a set of preferences
■ Provides insight into usage patterns.
■ Inference: P(Like title i | Like title j, Like title k)
![Page 160: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/160.jpg)
Applying Bayesian clustering
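| | class1 | class2 | ... |
|---|---|---|---|
| title1 | p(like)=0.2 | p(like)=0.8 | |
| title2 | p(like)=0.7 | p(like)=0.1 | |
| title3 | p(like)=0.99 | p(like)=0.01 | |
| ... | | | |

[Figure: user classes node with children title 1, title 2, ..., title n]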
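The inference P(Like title i | Like title j, ...) can be sketched with the per-class probabilities from the table. The class prior is not given on the slide; a uniform prior over two classes is assumed here, and the function name `p_like_given_likes` is hypothetical.

```python
# P(like title | class) for (class1, class2), taken from the table.
P_LIKE = {
    "title1": (0.2, 0.8),
    "title2": (0.7, 0.1),
    "title3": (0.99, 0.01),
}
PRIOR = (0.5, 0.5)  # assumption: uniform prior over the two user classes

def p_like_given_likes(target, liked, p_like=P_LIKE, prior=PRIOR):
    """P(user likes `target` | user likes every title in `liked`)."""
    # Posterior over the hidden user class given the liked titles.
    post = list(prior)
    for title in liked:
        post = [w * p_like[title][c] for c, w in enumerate(post)]
    z = sum(post)
    post = [w / z for w in post]
    # Titles are independent given the class (naive Bayes), so average over classes.
    return sum(post[c] * p_like[target][c] for c in range(len(post)))

p_t3 = p_like_given_likes("title3", ["title2"])
```

Liking title2 shifts the class posterior to 0.875 for class1, so the predicted probability of liking title3 is 0.875 × 0.99 + 0.125 × 0.01 = 0.8675 under the assumed prior.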
![Page 161: Bayesian Networks and Decision-Theoretic …© 1997 Jack Breese, Microsoft Corporation and Daphne Koller, Stanford University. All rights reserved. 2 Overview Decision-theoretic techniques](https://reader035.fdocuments.net/reader035/viewer/2022063005/5f9c9e40e0ff976931752818/html5/thumbnails/161.jpg)
MSNBC Story clusters
Readers of commerce and technology stories (36%):
■ E-mail delivery isn’t exactly guaranteed
■ Should you buy a DVD player?
■ Price low, demand high for Nintendo
Sports Readers (19%):
■ Umps refusing to work is the right thing
■ Cowboys are reborn in win over Eagles
■ Did Orioles spend money wisely?
Readers of top promoted stories (29%):
■ 757 Crashes At Sea
■ Israel, Palestinians Agree To Direct Talks
■ Fuhrman Pleads Innocent To Perjury
Readers of “Softer” News (12%):
■ The truth about what things cost
■ Fuhrman Pleads Innocent To Perjury
■ Real Astrology
Top 5 shows by user class

Class 1
• Power rangers
• Animaniacs
• X-men
• Tazmania
• Spider man

Class 2
• Young and restless
• Bold and the beautiful
• As the world turns
• Price is right
• CBS eve news

Class 3
• Tonight show
• Conan O’Brien
• NBC nightly news
• Later with Kinnear
• Seinfeld

Class 4
• 60 minutes
• NBC nightly news
• CBS eve news
• Murder she wrote
• Matlock

Class 5
• Seinfeld
• Friends
• Mad about you
• ER
• Frasier
Richer model
[Figure: Bayesian network with nodes Age, Gender, Likes soaps, User class, Watches Seinfeld, Watches NYPD Blue, Watches Power Rangers]
What’s old?
Decision theory & probability theory provide:
■ principled models of belief and preference;
■ techniques for:
◆ integrating evidence (conditioning);
◆ optimal decision making (max. expected utility);
◆ targeted information gathering (value of info.);
◆ parameter estimation from data.
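Three of these techniques (conditioning, maximum expected utility, and value of information) can be illustrated together in a small sketch. Everything specific here is hypothetical and chosen only for illustration: the fault/repair scenario, the probabilities, the utilities, and the function names.

```python
# HYPOTHETICAL numbers for a toy diagnosis decision (not from the slides).
PRIOR_FAULT = 0.3
P_POSITIVE_GIVEN = {True: 0.9, False: 0.2}  # P(test positive | fault present?)
UTILITY = {
    ("repair", True): -20, ("repair", False): -20,
    ("ignore", True): -100, ("ignore", False): 0,
}
ACTIONS = ("repair", "ignore")


def posterior_fault(prior, positive):
    """Conditioning: P(fault | test result) by Bayes' rule."""
    like_fault = P_POSITIVE_GIVEN[True] if positive else 1 - P_POSITIVE_GIVEN[True]
    like_ok = P_POSITIVE_GIVEN[False] if positive else 1 - P_POSITIVE_GIVEN[False]
    joint_fault = prior * like_fault
    joint_ok = (1 - prior) * like_ok
    return joint_fault / (joint_fault + joint_ok)


def best_action(p_fault):
    """Maximum expected utility: return (action, expected utility)."""
    def expected_utility(action):
        return (p_fault * UTILITY[(action, True)]
                + (1 - p_fault) * UTILITY[(action, False)])
    action = max(ACTIONS, key=expected_utility)
    return action, expected_utility(action)


def value_of_information(prior):
    """Expected gain in utility from seeing the test result before acting."""
    _, meu_now = best_action(prior)
    p_pos = prior * P_POSITIVE_GIVEN[True] + (1 - prior) * P_POSITIVE_GIVEN[False]
    meu_with_test = 0.0
    for positive, p_result in ((True, p_pos), (False, 1 - p_pos)):
        _, meu = best_action(posterior_fault(prior, positive))
        meu_with_test += p_result * meu
    return meu_with_test - meu_now


if __name__ == "__main__":
    print(best_action(PRIOR_FAULT))
    print(value_of_information(PRIOR_FAULT))
```

In this toy setting the test is worth running only if it costs less than the computed value of information, since a negative result changes the best action from repairing to ignoring.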
What’s new?
Bayesian networks exploit domain structure to allow compact representations of complex models.

[Figure: Structured Representation, connected to Knowledge Acquisition, Inference, and Learning]
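To make the compactness claim concrete, the sketch below counts free parameters for binary variables, comparing a full joint table with a network in which each variable conditions only on its parents. The example structure (one hidden user class with ten title variables as children) and the function names are assumptions chosen for illustration.

```python
def full_joint_parameters(num_variables):
    """Free parameters in an unrestricted joint over binary variables."""
    return 2 ** num_variables - 1


def network_parameters(parents):
    """Free parameters in a Bayesian network over binary variables.

    `parents` maps each variable to the list of its parents. Each variable
    needs one free parameter per joint configuration of its parents.
    """
    return sum(2 ** len(parent_list) for parent_list in parents.values())


# HYPOTHETICAL structure: a binary user class with ten binary title children.
structure = {"user_class": []}
for i in range(1, 11):
    structure[f"title{i}"] = ["user_class"]

if __name__ == "__main__":
    print(network_parameters(structure))          # structured model
    print(full_joint_parameters(len(structure)))  # unrestricted joint
```

For this structure the network needs 21 parameters, against 2047 for the unrestricted joint over the same eleven variables. The gap widens exponentially as variables are added.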
Some Important AI Contributions
■ Key technology for diagnosis.
■ Better, more coherent expert systems.
■ New approach to planning & action modeling:
◆ planning using Markov decision problems;
◆ new framework for reinforcement learning;
◆ probabilistic solution to frame & qualification problems.
■ New techniques for learning models from data.
What’s in our future?
■ Better models for:
◆ preferences & utilities;
◆ not-so-precise numerical probabilities.
■ Inferring causality from data.
■ More expressive representation languages:
◆ structured domains with multiple objects;
◆ levels of abstraction;
◆ reasoning about time;
◆ hybrid (continuous/discrete) models.

[Figure: Structured Representation]