The Wumpus World!
金汶功, ACM Class of 2012
Hunt the wumpus!
Description
• Performance measure
• Environment
• Actuators
• Sensors: Stench, Breeze, Glitter, Bump & Scream
An Example
Reasoning via logic
Semantics
• Semantics: Relationship between logic and the real world
• Model: a possible world in which every relevant sentence has a truth value
• Entailment: KB ⊨ α iff α is true in every model in which KB is true
Models
• KB: the sentences the agent knows to be true
• α1: "There is no pit in [1,2]"
• α2: "There is no pit in [2,2]"
Knowledge base
[Diagram: the Agent Tells the knowledge base (Axioms + Current States) what its Sensors perceive, Asks queries, and gets an Answer via model checking, which drives the Actuators.]
Efficient Model Checking
• DPLL
• Early termination
• Pure symbol heuristic
• Unit clause heuristic
• Component analysis
• …
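The refinements above can be sketched in Python. This is a minimal illustration, not the full algorithm: the clause representation, the name `dpll_satisfiable`, and the test formulas are all assumptions, and component analysis is omitted.

```python
# Minimal DPLL sketch. A clause is a set of literals; a literal is a
# (symbol, sign) pair, e.g. ("A", False) for "not A".

def dpll_satisfiable(clauses, symbols, model=None):
    """Return a (possibly partial) satisfying model as a dict, or None."""
    model = dict(model or {})

    def clause_value(clause):
        # True if some literal is satisfied, False if all are falsified,
        # None if the clause is still undecided under the partial model.
        unknown = False
        for sym, sign in clause:
            if sym in model:
                if model[sym] == sign:
                    return True
            else:
                unknown = True
        return None if unknown else False

    # Early termination: stop as soon as every clause is decided.
    values = [clause_value(c) for c in clauses]
    if all(v is True for v in values):
        return model
    if any(v is False for v in values):
        return None

    unassigned = [s for s in symbols if s not in model]

    # Pure symbol heuristic: a symbol occurring with only one sign
    # in the not-yet-satisfied clauses can be assigned that sign.
    for s in unassigned:
        signs = {sign for c in clauses if clause_value(c) is not True
                 for sym, sign in c if sym == s}
        if len(signs) == 1:
            return dpll_satisfiable(clauses, symbols, {**model, s: signs.pop()})

    # Unit clause heuristic: a clause with exactly one undecided literal
    # forces that literal's value.
    for c in clauses:
        if clause_value(c) is None:
            free = [(sym, sign) for sym, sign in c if sym not in model]
            if len(free) == 1:
                sym, sign = free[0]
                return dpll_satisfiable(clauses, symbols, {**model, sym: sign})

    # Otherwise branch on the first unassigned symbol.
    s = unassigned[0]
    return (dpll_satisfiable(clauses, symbols, {**model, s: True})
            or dpll_satisfiable(clauses, symbols, {**model, s: False}))

# Example: (A or B) and (not A or B) and (not B or C) is satisfiable.
clauses = [{("A", True), ("B", True)},
           {("A", False), ("B", True)},
           {("B", False), ("C", True)}]
model = dpll_satisfiable(clauses, ["A", "B", "C"])
```

On this example the pure symbol heuristic fires first (C occurs only positively), then B becomes pure in the remaining clauses, and the search finishes without any branching.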
Drawbacks
• Model checking is intractable in the worst case (propositional satisfiability is NP-complete)
• The knowledge base may entail nothing about the query; logic alone cannot weigh uncertain possibilities
Probabilistic Reasoning
Full joint probability distribution
• Product rule: P(X, Y) = P(X|Y) P(Y)
• X: {1, 2, 3, 4} → {0.1, 0.2, 0.3, 0.4}
• Y: {a, b} → {0.4, 0.6}
• P(X = 2, Y = a) = P(X = 2 | Y = a) P(Y = a)
• The full joint distribution gives the probability of every combination of values
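A few lines of Python make the product rule concrete with the slide's numbers. For this sketch we assume X and Y are independent, so P(X|Y) = P(X); the slide does not state this, it is an assumption to keep the example small.

```python
# The slide's marginal distributions.
P_X = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}
P_Y = {"a": 0.4, "b": 0.6}

# Product rule: P(X, Y) = P(X|Y) P(Y). Under the independence
# assumption P(X|Y) = P(X), so the joint is just the product.
joint = {(x, y): P_X[x] * P_Y[y] for x in P_X for y in P_Y}

print(joint[(2, "a")])  # P(X = 2, Y = a) = 0.2 * 0.4
```

The joint table has one entry per combination of values (8 here), and its entries sum to 1.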
Normalization
• P(X | e) = α P(X, e)
• α = 1/P(e) is a normalization constant: it makes the entries of P(X | e) sum to 1
The Wumpus World
• Aim: calculate the probability that each of the three fringe squares contains a pit.
Full joint distribution
• P(P1,1, ..., P4,4, B1,1, B1,2, B2,1) = P(B1,1, B1,2, B2,1 | P1,1, ..., P4,4) P(P1,1, ..., P4,4)
• P(P1,1, ..., P4,4) = ∏i,j P(Pi,j)
• Every square contains a pit with probability 0.2, independently of the others
How likely is it that [1,3] has a pit?
• Given observations: no pit in [1,1], [1,2], [2,1]; breeze felt in [1,2] and [2,1]
• Summing the full joint over the 12 unknown squares gives 2^12 = 4096 terms
Using independence
Simplification
• Conditioning the breezes on the two frontier squares only, there are now just 2^2 = 4 terms, cheers!
Finally
• [2,2] contains a pit with 86% probability!
• Key ingredients: data structures and independence
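The pit probabilities can be reproduced by brute-force enumeration of the frontier. A sketch, assuming the standard AIMA setup these slides follow (independent pits with prior 0.2, breezes observed in [1,2] and [2,1], no pits in the visited squares); the function and variable names are invented for illustration.

```python
from itertools import product

PIT_PRIOR = 0.2  # each square contains a pit with probability 0.2

def posterior(query):
    """P(query square has a pit | breezes in [1,2] and [2,1],
    no pit in the visited squares [1,1], [1,2], [2,1])."""
    num = {True: 0.0, False: 0.0}
    for p13, p22, p31 in product([True, False], repeat=3):
        # Breeze in [1,2] iff a pit in [1,3] or [2,2]; breeze in [2,1]
        # iff a pit in [2,2] or [3,1] (the visited squares are pit-free).
        if not ((p13 or p22) and (p22 or p31)):
            continue  # inconsistent with the observed breezes
        pits = {"P13": p13, "P22": p22, "P31": p31}
        prob = 1.0
        for v in pits.values():
            prob *= PIT_PRIOR if v else 1 - PIT_PRIOR
        num[pits[query]] += prob
    # Normalization: alpha = 1 / (num[True] + num[False])
    return num[True] / (num[True] + num[False])

print(round(posterior("P13"), 2))  # ≈ 0.31
print(round(posterior("P22"), 2))  # ≈ 0.86
```

Only the frontier assignments consistent with the breezes contribute, which is exactly the 4-term simplification on the previous slide.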
Bayesian Network
Simple Example
Structure: Burglary → Alarm (Bark) ← Earthquake; Alarm (Bark) → John Calls, Mary Calls

P(B) = .001        P(E) = .002

B      E      | P(A)
true   true   | .95
true   false  | .94
false  true   | .29
false  false  | .001

Bark   | P(J)
true   | .90
false  | .05

Bark   | P(M)
true   | .70
false  | .01
Specification
• Each node corresponds to a random variable
• The graph is acyclic: a DAG
• Each node has a conditional probability distribution
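As a sketch of inference by enumeration on the network above (the CPT numbers come from the slide's tables; the query P(Burglary | JohnCalls, MaryCalls) and all identifier names are my own choices):

```python
from itertools import product

# CPTs from the slide's tables.
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(bark | B, E)
P_J = {True: 0.90, False: 0.05}                     # P(JohnCalls | bark)
P_M = {True: 0.70, False: 0.01}                     # P(MaryCalls | bark)

def joint(b, e, a, j, m):
    """Full joint as a product of the network's CPTs."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

# P(Burglary | JohnCalls = true, MaryCalls = true), summing out E and A.
num = {b: sum(joint(b, e, a, True, True)
              for e, a in product([True, False], repeat=2))
       for b in [True, False]}
p_burglary = num[True] / (num[True] + num[False])
print(round(p_burglary, 3))  # ≈ 0.284
```

The network needs only 1 + 1 + 4 + 2 + 2 = 10 numbers instead of the 2^5 - 1 = 31 a full joint table would require.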
Conditional Independence
Exact Inference
Network: the pit variables P1,3, P2,2, P3,1 and the evidence "known" are parents of the breeze observation b.

P(P1,3) = 0.2    P(P2,2) = 0.2    P(P3,1) = 0.2

P1,3   P2,2   P3,1   | P(b)
true   true   true   | 1
true   true   false  | 1
true   false  true   | 1
true   false  false  | 0
false  true   true   | 1
false  true   false  | 1
false  false  true   | 0
false  false  false  | 0
Approximate Inference
• Markov Chain Monte Carlo
• Gibbs sampling
• Idea: the long-run fraction of time spent in each state is exactly proportional to its posterior probability.

P(x'_i | MarkovBlanket(X_i)) = α P(x'_i | Parents(X_i)) × ∏_{Y_j ∈ Children(X_i)} P(y_j | Parents(Y_j))
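The Markov-blanket formula can be turned into a small Gibbs sampler. A sketch only, assuming the burglary network from the earlier slide with evidence JohnCalls = MaryCalls = true; the long-run fraction of samples with Burglary = true estimates its posterior.

```python
import random

# CPTs from the burglary-network slide.
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def cond(var, state):
    """P(var = true | its Markov blanket), per the formula above:
    prior given parents times likelihood of each child."""
    b, e, a = state["B"], state["E"], state["A"]
    if var == "B":      # no parents; only child is A
        w = {v: P_B[v] * (P_A[(v, e)] if a else 1 - P_A[(v, e)])
             for v in (True, False)}
    elif var == "E":    # no parents; only child is A
        w = {v: P_E[v] * (P_A[(b, v)] if a else 1 - P_A[(b, v)])
             for v in (True, False)}
    else:               # "A": parents B, E; children J, M (both observed true)
        w = {v: (P_A[(b, e)] if v else 1 - P_A[(b, e)]) * P_J[v] * P_M[v]
             for v in (True, False)}
    return w[True] / (w[True] + w[False])  # the alpha normalization

def gibbs(n, seed=0):
    """Estimate P(Burglary | j, m) from n sweeps of Gibbs sampling."""
    rng = random.Random(seed)
    state = {"B": False, "E": False, "A": True}  # arbitrary initial state
    count = 0
    for _ in range(n):
        for var in ("B", "E", "A"):  # resample each non-evidence variable
            state[var] = rng.random() < cond(var, state)
        count += state["B"]
    return count / n

print(gibbs(50_000))  # long-run estimate of P(B | j, m)
```

With enough sweeps the estimate approaches the exact posterior (≈ 0.284 by enumeration), illustrating the long-run-fraction idea above.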
Reference
• http://zh.wikipedia.org/wiki/Hunt_the_Wumpus
• http://zh.wikipedia.org/wiki/%E8%B4%9D%E5%8F%B6%E6%96%AF%E7%BD%91%E7%BB%9C (Bayesian network)
• Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd edition, 2010