The Wumpus World!
金汶功, ACM Class of 2012
Hunt the wumpus!
Description
• Performance measure
• Environment
• Actuators
• Sensors: Stench, Breeze, Glitter, Bump & Scream
An Example
Reasoning via logic
Semantics
• Semantics: Relationship between logic and the real world
• Model: a possible world in which every relevant sentence has a truth value
• Entailment: KB ⊨ α iff α is true in every model in which KB is true
Models
• KB: the sentences the agent knows to be true
• α1: "There is no pit in [1,2]"
• α2: "There is no pit in [2,2]"
Knowledge base
[Diagram: the Agent Tells the knowledge base (Axioms + Current States) what its Sensors perceive, Asks queries, and gets an Answer via model checking, which drives the Actuators.]
Efficient Model Checking
• DPLL
• Early termination
• Pure symbol heuristic
• Unit clause heuristic
• Component analysis
• …
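The refinements above can be sketched in Python. This is a minimal illustration, not the full algorithm: the clause representation, the name `dpll_satisfiable`, and the test formulas are all assumptions, and component analysis is omitted.

```python
# Minimal DPLL sketch. A clause is a set of literals; a literal is a
# (symbol, sign) pair, e.g. ("A", False) for "not A".

def dpll_satisfiable(clauses, symbols, model=None):
    """Return a (possibly partial) satisfying model as a dict, or None."""
    model = dict(model or {})

    def clause_value(clause):
        # True if some literal is satisfied, False if all are falsified,
        # None if the clause is still undecided under the partial model.
        unknown = False
        for sym, sign in clause:
            if sym in model:
                if model[sym] == sign:
                    return True
            else:
                unknown = True
        return None if unknown else False

    # Early termination: stop as soon as every clause is decided.
    values = [clause_value(c) for c in clauses]
    if all(v is True for v in values):
        return model
    if any(v is False for v in values):
        return None

    unassigned = [s for s in symbols if s not in model]

    # Pure symbol heuristic: a symbol occurring with only one sign
    # in the not-yet-satisfied clauses can be assigned that sign.
    for s in unassigned:
        signs = {sign for c in clauses if clause_value(c) is not True
                 for sym, sign in c if sym == s}
        if len(signs) == 1:
            return dpll_satisfiable(clauses, symbols, {**model, s: signs.pop()})

    # Unit clause heuristic: a clause with exactly one undecided literal
    # forces that literal's value.
    for c in clauses:
        if clause_value(c) is None:
            free = [(sym, sign) for sym, sign in c if sym not in model]
            if len(free) == 1:
                sym, sign = free[0]
                return dpll_satisfiable(clauses, symbols, {**model, sym: sign})

    # Otherwise branch on the first unassigned symbol.
    s = unassigned[0]
    return (dpll_satisfiable(clauses, symbols, {**model, s: True})
            or dpll_satisfiable(clauses, symbols, {**model, s: False}))

# Example: (A or B) and (not A or B) and (not B or C) is satisfiable.
clauses = [{("A", True), ("B", True)},
           {("A", False), ("B", True)},
           {("B", False), ("C", True)}]
model = dpll_satisfiable(clauses, ["A", "B", "C"])
```

On this example the pure symbol heuristic fires first (C occurs only positively), then B becomes pure in the remaining clauses, and the search finishes without any branching.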
Drawbacks
• Model checking is intractable in the worst case (propositional satisfiability is NP-complete)
• The knowledge base may entail nothing about the query; logic alone cannot weigh uncertain possibilities
Probabilistic Reasoning
Full joint probability distribution
• Product rule: P(X, Y) = P(X|Y) P(Y)
• X: {1, 2, 3, 4} → {0.1, 0.2, 0.3, 0.4}
• Y: {a, b} → {0.4, 0.6}
• P(X = 2, Y = a) = P(X = 2 | Y = a) P(Y = a)
• The full joint distribution gives the probability of every combination of values
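A few lines of Python make the product rule concrete with the slide's numbers. For this sketch we assume X and Y are independent, so P(X|Y) = P(X); the slide does not state this, it is an assumption to keep the example small.

```python
# The slide's marginal distributions.
P_X = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}
P_Y = {"a": 0.4, "b": 0.6}

# Product rule: P(X, Y) = P(X|Y) P(Y). Under the independence
# assumption P(X|Y) = P(X), so the joint is just the product.
joint = {(x, y): P_X[x] * P_Y[y] for x in P_X for y in P_Y}

print(joint[(2, "a")])  # P(X = 2, Y = a) = 0.2 * 0.4
```

The joint table has one entry per combination of values (8 here), and its entries sum to 1.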
Normalization
• P(X | e) = α P(X, e)
• α = 1/P(e) is a normalization constant: it makes the entries of P(X | e) sum to 1
The Wumpus World
• Aim: calculate the probability that each of the three fringe squares contains a pit.
Full joint distribution
• P(P1,1, ..., P4,4, B1,1, B1,2, B2,1) = P(B1,1, B1,2, B2,1 | P1,1, ..., P4,4) P(P1,1, ..., P4,4)
• P(P1,1, ..., P4,4) = ∏i,j P(Pi,j)
• Every square contains a pit with probability 0.2, independently of the others
How likely is it that [1,3] has a pit?
• Given observations: no pit in [1,1], [1,2], [2,1]; breeze felt in [1,2] and [2,1]
• Summing the full joint over the 12 unknown squares gives 2^12 = 4096 terms
Using independence
Simplification
• Conditioning the breezes on the two frontier squares only, there are now just 2^2 = 4 terms, cheers!
Finally
• [2,2] contains a pit with 86% probability!
• Key ingredients: data structures and independence
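The pit probabilities can be reproduced by brute-force enumeration of the frontier. A sketch, assuming the standard AIMA setup these slides follow (independent pits with prior 0.2, breezes observed in [1,2] and [2,1], no pits in the visited squares); the function and variable names are invented for illustration.

```python
from itertools import product

PIT_PRIOR = 0.2  # each square contains a pit with probability 0.2

def posterior(query):
    """P(query square has a pit | breezes in [1,2] and [2,1],
    no pit in the visited squares [1,1], [1,2], [2,1])."""
    num = {True: 0.0, False: 0.0}
    for p13, p22, p31 in product([True, False], repeat=3):
        # Breeze in [1,2] iff a pit in [1,3] or [2,2]; breeze in [2,1]
        # iff a pit in [2,2] or [3,1] (the visited squares are pit-free).
        if not ((p13 or p22) and (p22 or p31)):
            continue  # inconsistent with the observed breezes
        pits = {"P13": p13, "P22": p22, "P31": p31}
        prob = 1.0
        for v in pits.values():
            prob *= PIT_PRIOR if v else 1 - PIT_PRIOR
        num[pits[query]] += prob
    # Normalization: alpha = 1 / (num[True] + num[False])
    return num[True] / (num[True] + num[False])

print(round(posterior("P13"), 2))  # ≈ 0.31
print(round(posterior("P22"), 2))  # ≈ 0.86
```

Only the frontier assignments consistent with the breezes contribute, which is exactly the 4-term simplification on the previous slide.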
Bayesian Network
Simple Example
Structure: Burglary → Alarm (Bark) ← Earthquake; Alarm (Bark) → John Calls, Mary Calls

P(B) = .001        P(E) = .002

B      E      | P(A)
true   true   | .95
true   false  | .94
false  true   | .29
false  false  | .001

Bark   | P(J)
true   | .90
false  | .05

Bark   | P(M)
true   | .70
false  | .01
Specification
• Each node corresponds to a random variable
• The graph is acyclic: a DAG
• Each node has a conditional probability distribution
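As a sketch of inference by enumeration on the network above (the CPT numbers come from the slide's tables; the query P(Burglary | JohnCalls, MaryCalls) and all identifier names are my own choices):

```python
from itertools import product

# CPTs from the slide's tables.
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(bark | B, E)
P_J = {True: 0.90, False: 0.05}                     # P(JohnCalls | bark)
P_M = {True: 0.70, False: 0.01}                     # P(MaryCalls | bark)

def joint(b, e, a, j, m):
    """Full joint as a product of the network's CPTs."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

# P(Burglary | JohnCalls = true, MaryCalls = true), summing out E and A.
num = {b: sum(joint(b, e, a, True, True)
              for e, a in product([True, False], repeat=2))
       for b in [True, False]}
p_burglary = num[True] / (num[True] + num[False])
print(round(p_burglary, 3))  # ≈ 0.284
```

The network needs only 1 + 1 + 4 + 2 + 2 = 10 numbers instead of the 2^5 - 1 = 31 a full joint table would require.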
Conditional Independence
Exact Inference
Network: the pit variables P1,3, P2,2, P3,1 and the evidence "known" are parents of the breeze observation b.

P(P1,3) = 0.2    P(P2,2) = 0.2    P(P3,1) = 0.2

P1,3   P2,2   P3,1   | P(b)
true   true   true   | 1
true   true   false  | 1
true   false  true   | 1
true   false  false  | 0
false  true   true   | 1
false  true   false  | 1
false  false  true   | 0
false  false  false  | 0
Approximate Inference
• Markov Chain Monte Carlo
• Gibbs sampling
• Idea: the long-run fraction of time spent in each state is exactly proportional to its posterior probability.

P(x'_i | MarkovBlanket(X_i)) = α P(x'_i | Parents(X_i)) × ∏_{Y_j ∈ Children(X_i)} P(y_j | Parents(Y_j))
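The Markov-blanket formula can be turned into a small Gibbs sampler. A sketch only, assuming the burglary network from the earlier slide with evidence JohnCalls = MaryCalls = true; the long-run fraction of samples with Burglary = true estimates its posterior.

```python
import random

# CPTs from the burglary-network slide.
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def cond(var, state):
    """P(var = true | its Markov blanket), per the formula above:
    prior given parents times likelihood of each child."""
    b, e, a = state["B"], state["E"], state["A"]
    if var == "B":      # no parents; only child is A
        w = {v: P_B[v] * (P_A[(v, e)] if a else 1 - P_A[(v, e)])
             for v in (True, False)}
    elif var == "E":    # no parents; only child is A
        w = {v: P_E[v] * (P_A[(b, v)] if a else 1 - P_A[(b, v)])
             for v in (True, False)}
    else:               # "A": parents B, E; children J, M (both observed true)
        w = {v: (P_A[(b, e)] if v else 1 - P_A[(b, e)]) * P_J[v] * P_M[v]
             for v in (True, False)}
    return w[True] / (w[True] + w[False])  # the alpha normalization

def gibbs(n, seed=0):
    """Estimate P(Burglary | j, m) from n sweeps of Gibbs sampling."""
    rng = random.Random(seed)
    state = {"B": False, "E": False, "A": True}  # arbitrary initial state
    count = 0
    for _ in range(n):
        for var in ("B", "E", "A"):  # resample each non-evidence variable
            state[var] = rng.random() < cond(var, state)
        count += state["B"]
    return count / n

print(gibbs(50_000))  # long-run estimate of P(B | j, m)
```

With enough sweeps the estimate approaches the exact posterior (≈ 0.284 by enumeration), illustrating the long-run-fraction idea above.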
Reference
• http://zh.wikipedia.org/wiki/Hunt_the_Wumpus
• http://zh.wikipedia.org/wiki/%E8%B4%9D%E5%8F%B6%E6%96%AF%E7%BD%91%E7%BB%9C (Bayesian network)
• Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd edition, 2010