Fifth International Conference on Autonomous Agents and Multi-agent Systems (AAMAS-06)
Exact Solutions of Interactive POMDPs Using Behavioral Equivalence
Speaker: Prashant Doshi
University of Georgia
Authors: B. Rathnasabapathy, Prashant Doshi, and Piotr Gmytrasiewicz
2 Overview
• I-POMDP
– Framework for sequential decision making for an agent in a multi-agent setting
– Takes the perspective of an individual in an interaction
• Problem
– Cardinality of the interactive state space → infinite
– Other agents' models (incl. beliefs) are part of an agent's state space (interactive epistemology)
• An algorithm for solving I-POMDPs exactly
– Aggregate behaviorally equivalent models of other agents
3 Background – Properties of POMDPs and I-POMDPs
• Finitely nested I-POMDP, $\text{I-POMDP}_{i,l}$
– Beliefs are nested up to a finite strategic level $l$
– Level 0 models are POMDPs
• Value function of POMDP and finitely nested I-POMDP is piecewise linear and convex (PWLC)
• Agents’ behaviors in POMDP and finitely nested I-POMDP can be represented using policy trees
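The PWLC property above means the value at a belief is the maximum over a finite set of linear functions (alpha vectors). A minimal Python sketch follows; the alpha vectors and the two-state layout are hypothetical numbers chosen for illustration, not values from the talk.

```python
from typing import Sequence


def pwlc_value(belief: Sequence[float],
               alpha_vectors: Sequence[Sequence[float]]) -> float:
    """Value of a belief under a PWLC value function:
    the maximum over alpha vectors of the dot product alpha . belief."""
    return max(
        sum(a * b for a, b in zip(alpha, belief))
        for alpha in alpha_vectors
    )


# Hypothetical alpha vectors over two physical states (illustrative only).
ALPHAS = [(10.0, -100.0), (-1.0, -1.0), (-100.0, 10.0)]

v_left = pwlc_value((1.0, 0.0), ALPHAS)   # belief concentrated on state 0
v_right = pwlc_value((0.0, 1.0), ALPHAS)  # belief concentrated on state 1
v_mid = pwlc_value((0.5, 0.5), ALPHAS)    # uniform belief
```

Convexity shows up in the fact that the value at the midpoint belief is no larger than the average of the values at the two endpoint beliefs.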
4 Interactive POMDPs
• Definition
• Interactive state space
– $S$: set of physical states
– $\Theta_{j,l-1}$: set of intentional models
– $SM_j$: set of subintentional models
– Intentional models contain the other agent’s beliefs
5 Example: Single-Agent Tiger Problem
[Figure: tiger problem with unknown tiger location (?); rewards shown: +10, -100, -1]
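To make the tiger example concrete, here is a small sketch of how the agent's belief about the tiger's location could be updated after listening. The slide gives only the rewards; the 0.85 hearing accuracy is an assumption borrowed from the commonly used formulation of the problem, and the function name is hypothetical.

```python
def update_tiger_belief(p_left: float, heard_left: bool,
                        accuracy: float = 0.85) -> float:
    """Bayes update of P(tiger behind the left door) after one listen action.

    `accuracy` is the assumed probability of hearing the growl from the
    correct side (an assumption, not taken from the slides).
    """
    like_left = accuracy if heard_left else 1.0 - accuracy
    like_right = 1.0 - accuracy if heard_left else accuracy
    numerator = like_left * p_left
    return numerator / (numerator + like_right * (1.0 - p_left))


b0 = 0.5                                        # uniform prior
b1 = update_tiger_belief(b0, heard_left=True)   # one growl from the left
b2 = update_tiger_belief(b1, heard_left=True)   # a second growl from the left
b_back = update_tiger_belief(b1, heard_left=False)  # conflicting growl
```

Two consistent growls push the belief well toward one door, while a conflicting growl returns it to the uniform prior.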
6 Behaviorally Equivalent Models
[Figure: Equivalence Classes of Beliefs; policies labeled P1, P2, P3]
7 Equivalence Classes of Interactive States
• Definition
– Combination of a physical state and an equivalence class of models
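The grouping idea can be sketched in a few lines of Python: models that map to the same policy land in the same equivalence class. Here a model is reduced to a single belief value, and `toy_policy` with its 0.1 / 0.9 thresholds is a hypothetical stand-in for an actual solver.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, TypeVar

M = TypeVar("M")


def behavioral_equivalence_classes(
    models: Iterable[M],
    policy_of: Callable[[M], Hashable],
) -> Dict[Hashable, List[M]]:
    """Group models so that all members of a class share the same policy."""
    classes: Dict[Hashable, List[M]] = defaultdict(list)
    for model in models:
        classes[policy_of(model)].append(model)
    return dict(classes)


def toy_policy(p_left: float) -> str:
    """Hypothetical one-step policy for a tiger-style belief."""
    if p_left > 0.9:
        return "open-right"
    if p_left < 0.1:
        return "open-left"
    return "listen"


# 21 candidate beliefs of the other agent: 0.0, 0.05, ..., 1.0
beliefs = [i / 20 for i in range(21)]
classes = behavioral_equivalence_classes(beliefs, toy_policy)
```

However many beliefs are sampled, the number of classes is bounded by the number of distinct policies, which is what makes the aggregated space finite.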
8 Lossless Aggregation
• In a finitely nested I-POMDP, a probability distribution over the equivalence classes of interactive states (ECIS) provides a sufficient statistic for the past history of i’s observations
• Transformation of the interactive state space into behavioral equivalence classes is value-preserving
• Optimal policy of the transformed finitely nested I-POMDP remains unchanged
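A one-step numerical illustration of why the aggregation loses nothing (a sketch, not the proof from the paper): if agent i's payoff depends on j's model only through j's behavior, then the expectation over individual models equals the expectation over equivalence classes. The belief, the payoff table, and `toy_policy` below are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict


def aggregate_belief(belief: Dict[float, float],
                     policy_of: Callable[[float], str]) -> Dict[str, float]:
    """Sum the probability mass of all models that share a policy."""
    mass: Dict[str, float] = defaultdict(float)
    for model, prob in belief.items():
        mass[policy_of(model)] += prob
    return dict(mass)


def toy_policy(p_left: float) -> str:
    """Hypothetical one-step policy of agent j (thresholds are assumptions)."""
    if p_left > 0.9:
        return "open-right"
    if p_left < 0.1:
        return "open-left"
    return "listen"


# Hypothetical payoff to i for a fixed own action, indexed by j's action.
PAYOFF_I = {"listen": -1.0, "open-left": -2.0, "open-right": -2.0}

# i's belief over five models of j (each model is j's belief that the tiger is left).
belief_i = {0.05: 0.1, 0.3: 0.2, 0.5: 0.4, 0.7: 0.2, 0.95: 0.1}

# Expected payoff computed model by model ...
value_models = sum(p * PAYOFF_I[toy_policy(m)] for m, p in belief_i.items())

# ... and computed over the aggregated equivalence classes.
belief_classes = aggregate_belief(belief_i, toy_policy)
value_classes = sum(p * PAYOFF_I[c] for c, p in belief_classes.items())
```

Both computations give the same expected payoff up to floating-point rounding, since the classes only merge models whose behavior is identical.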
9 Solving I-POMDPs Exactly
Procedure Solve-IPOMDP(AGENTi, Belief Nesting L) : Returns Policy
  If L = 0 Then
    Return Policy := Solve-POMDP(AGENTi)
  Else
    For all AGENTj ≠ AGENTi
      Policyj := Solve-IPOMDP(AGENTj, L-1)
      Mj := Behavioral-Equivalence-Models(Policyj)
    End
    ECISi := S × (×j Mj)
    Policy := Modified-GIP(ECISi, Ai, Ti, Ωi, Oi, Ri)
    Return Policy
  End
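The recursive structure of the procedure above can be sketched in Python as follows. This is only a skeleton under stated assumptions: the three solver arguments stand in for Solve-POMDP, Behavioral-Equivalence-Models, and Modified-GIP, and the stub implementations (including the policy-tree labels they return) are hypothetical placeholders rather than the actual algorithms.

```python
from itertools import product
from typing import Callable, List, Sequence


def solve_ipomdp(agent: str, level: int, agents: Sequence[str],
                 states: Sequence[str],
                 solve_pomdp: Callable, equivalence_models: Callable,
                 modified_gip: Callable) -> List[str]:
    """Recursive skeleton: solve the other agents one level down, aggregate
    their models into classes, then solve over the reduced state space."""
    if level == 0:
        return solve_pomdp(agent)
    model_classes = []
    for other in agents:
        if other == agent:
            continue
        policy_j = solve_ipomdp(other, level - 1, agents, states,
                                solve_pomdp, equivalence_models, modified_gip)
        model_classes.append(equivalence_models(policy_j))
    # ECIS_i = S x (product over j of M_j)
    ecis = list(product(states, *model_classes))
    return modified_gip(agent, ecis)


# --- Hypothetical stubs, only to make the skeleton executable ---
ecis_sizes: List[int] = []


def stub_solve_pomdp(agent: str) -> List[str]:
    return ["P1", "P2", "P3"]            # pretend the POMDP has 3 policy trees


def stub_equivalence_models(policy: List[str]) -> List[str]:
    return list(policy)                  # one equivalence class per policy tree


def stub_modified_gip(agent: str, ecis: list) -> List[str]:
    ecis_sizes.append(len(ecis))         # record the size of the state space
    return [f"{agent}-tree-0", f"{agent}-tree-1"]


policy_i = solve_ipomdp("i", 1, ["i", "j"], ["tiger-left", "tiger-right"],
                        stub_solve_pomdp, stub_equivalence_models,
                        stub_modified_gip)
```

With two physical states and three classes for j, the level-1 solve runs over six equivalence classes of interactive states instead of an infinite space.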
10 Multi-Agent Persistent-Tiger Problem
[Figure: multi-agent tiger problem; rewards shown: +10, -100]
Observations: {Growl Left, Growl Right} × {Creak Right, Creak Left, Silence}
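Reading the two observation lists as a Cartesian product (an interpretation of the slide layout), the joint observation set can be enumerated as in this short sketch. The variable names are illustrative.

```python
from itertools import product

# Growls carry information about the tiger; creaks about the other agent's action.
GROWLS = ("Growl Left", "Growl Right")
CREAKS = ("Creak Right", "Creak Left", "Silence")

# Each joint observation pairs one growl with one creak (or silence).
JOINT_OBSERVATIONS = list(product(GROWLS, CREAKS))
```

This yields six joint observations, each combining evidence about the physical state with evidence about the other agent's behavior.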
11 Beliefs on ECIS
Agent j’s policy
Agent i’s policy in the presence of another agent j
Policy becomes more diverse as i’s ability to observe j’s actions improves
14 Conclusions
• A method that enables exact solution of finitely nested interactive POMDPs
• Aggregate agent models into behavioral equivalence classes
– Discretization is lossless
• Interesting behaviors emerge in the multi-agent Tiger problem
Thank You and Please Stop by my Poster
Questions