Open-Loop Planning as Satisfiability Henry Kautz AT&T Labs.
-
Upload
rosaline-edwards -
Category
Documents
-
view
220 -
download
1
Transcript of Open-Loop Planning as Satisfiability Henry Kautz AT&T Labs.
2
A Core Computation ProblemA Core Computation Problem
Focus: Classical state-space planning, extended with parallel actions
Core problem - similar computational issues arise in many other models
• Reactive plans
• Planning with uncertainty and utilities
• Continuous processes
• Metric time
3
State-space PlanningState-space Planning
Find a sequence of operators that transform an initial state to a goal state
State = complete truth assignment to a set of variables (fluents)
Goal = partial truth assignment (set of states)
Operator = a partial function State State
• specified by three sets of variables:precondition, add list, delete
list
4
ParallelismParallelism
Operators may be applied in parallel when all orderings are well defined and equivalent
(Op1 || Op2)(s) = Op2(Op1(s)) = Op1(Op2(s))
A special form of non-linear plans
• Only allows parallel actions, not parallel action sequences
• Easy to serialize
5
Abdundance of Negative Complexity Results
Abdundance of Negative Complexity Results
I. Domain-independent planning: PSPACE-complete
(Chapman 1987; Bylander 1991; Backstrom 1993)
II. Domain-dependent planning: NP-complete(Chenoweth 1991; Gupta and Nau 1992)
III. Approximate planning: NP-complete(Selman 1994)
6
PracticePractice
Until recently, domain-independent planning systems could only generate very short plans
– 5 steps
Practical systems minimize or eliminate search• Search control rules
(Sacerdoti 1975, Slaney 1996, Bacchus 1996)
• Pre-compiling entire state-space (Agre & Chapman 1997, Williams & Nayak 1997)
Scaling remains problematic when state space is large or not well understood!
How far can can one push domain-independent planning?– AIPS Planning Competition - 50 step plans
7
Planning as InferencePlanning as Inference
• Planning as first-order theorem proving (Green 1969)
computationally infeasible
• STRIPS (Fikes & Nilsson 1971)
very hard
• Partial-order planning (modal truth criteria) (Tate 1977, Chapman 1985, McAllester 1991, Smith & Peot 1993)
can be more efficient, but still hard (Minton, Bresina, & Drummond 1994)
• SATPLAN: planning as propositional reasoning
8
ApproachApproach
SAT encodings are designed so that plans correspond to satisfying assignments
Use recent efficient satisfiability procedures (systematic and stochastic) to solve
Evaluation performance on benchmark instances
9
OutlineOutline
1. Modeling and Solving Planning Problems as SAT
2. Improved Encodings using Graph Analysis
3. Improved Encodings using Compiled Control Knowledge
4. Beyond SAT: Planning with Resources and Optimization
10
Part 1: Modeling and Solving Planning Problems as SAT
Part 1: Modeling and Solving Planning Problems as SAT
11
SAT EncodingsSAT Encodings
Propositional CNF: no variables or quantifiers
Sets of clauses specified by axiom schemas
• fully instantiated before problem-solving
Discrete time, modeled by integers
• state predicates: indexed by time at which they hold
• action predicates: indexed by time at which action begins– each action takes 1 time step
– many actions may occur at the same step
12
Encoding ConventionsEncoding Conventions
• Actions imply preconditions and effects
fly(x,y,i) at(x,i) & route(x,y) & at(y,i+1)
• Conflicting actions cannot occur at same time (A deletes a precondition of B)
fly(x,y,i) & yz fly(x,z,i)
• If something changes, an action must have caused it (Explanatory Frame Axioms)
at(x,i) & at(x,i+1) y . route(x,y) & fly(x,y,i)
• Initial and final states hold
at(NY,0) & ... & at(LA,9) & ...
13
Modeling TricksModeling Tricks
Can often dramatically reduce size of problem by modeling techniques
move(x,y,z,i) requires n4 vars
pickup(x,y,i), putdown(x,z,i) requires 2n3 vars
State-based encodings: eliminate all action variables (“compile away”)
at(x,i) at(x,i+1) y . route(x,y) & at(y,i+1)
at(x,i) & xy at(y,i)
14
Solution to a Planning ProblemSolution to a Planning Problem
A solution is specified by any model (satisfying truth assignment) of the conjunction of the axioms describing the initial state, goal state, and operators
Easy to convert back to a STRIPS-style plan
15
SATPLANSATPLAN
axiomschemas instantiated
propositionalclauses
satisfyingmodelplan
mapping
length
problemdescription
SATengine(s)
instantiate
interpret
16
SAT AlgorithmsSAT Algorithms
Systematic Search• DP (Davis Putnam Logemann Loveland)
backtrack search + unit propagation
• satz (Chu Min Li) - variable selection by forward checking: max unit props
• relsat (Bayardo) - dependency directed backtracking: add new clauses at dead-ends
Local Search• Inspired by Mins-Conflict algorithm
(Adorf, Johnson, Minton, Philips, & Laird)
• GSAT (Selman), Walksat (Selman, Kautz & Cohen)greedy local search + noise to escape minima
17
Planning Benchmark Test SetPlanning Benchmark Test Set
Extension of Graphplan test set
blocks world - up to 18 blocks, 1019 states
logistics - complex, highly-parallel transportation domain.
Logistics.d:
• 2,165 possible actions per time slot
• 1016 legal configurations (22000 states)
• optimal solution contains 74 distinct actions over 14 time slots
Problems of this size never previously handled by state-space planning systems
18
Scaling Up Logistics PlanningScaling Up Logistics Planning
0.01
0.1
1
10
100
1000
10000
rocket.a
rocket.b
log.b
log.a
log.c
log.d
log
so
luti
on
tim
e
Graphplan
DP
DP/Satz
Walksat
19
Unpredictability of Systematic Search
Unpredictability of Systematic Search
0.01
0.1
1
10
100
1000
10000
rocket.a
rocket.b
log.b
log.a
log.c
log.d
log
so
luti
on
tim
e
DP/Satz
Walksat
20
Randomized RestartsRandomized Restarts
Solution: randomize the systematic solver
• Add noise to the heuristic branching (variable choice) function
• Cutoff and restart search after a fixed number of backtracks
In practice: rapid restarts with low cutoff can dramatically improve performance
(Gomes 1996, Gomes, Kautz, and Selman 1997, 1998)
21
Increased PredictabilityIncreased Predictability
0.01
0.1
1
10
100
1000
10000
rocket.a
rocket.b
log.b
log.a
log.c
log.d
log
so
luti
on
tim
e
Satz
Satz/Rand
22
What SATPLAN ShowsWhat SATPLAN Shows
General propositional theorem provers can compete with state of the art specialized planning systems
• New, highly tuned variations of DP surprising powerful
– result of sharing ideas and code in large SAT/CSP research community
– specialized engines can catch up, but by then new general techniques
• Radically new stochastic approaches to SAT can provide very low exponential scaling
– 2+ orders magnitude speedup on hard benchmark problems
23
Why SATPLAN WorksWhy SATPLAN Works
More flexible than forward or backward chaining
• Systematic: most unit propagation at most highly constrained states
• Stochastic: iterative repair
Space for time tradeoff
• Less overhead since does not have to instantiate variable during search
Randomized algorithms less likely to get trapped along bad paths
24
Part 2: Improved Encodings by Graph Analysis: The BLACKBOX Planner
Part 2: Improved Encodings by Graph Analysis: The BLACKBOX Planner
25
GraphplanGraphplan
Planning as graph search (Blum & Furst 1995)
Set new paradigm for planning
Like SATPLAN...
• Two phases: instantiation of propositional structure, followed by search
Unlike SATPLAN...
• Interleaves instantiation and pruning of plan graph
• Employs specialized search engine
Graphplan - better instantiation
SATPLAN - better search
26
Graph PruningGraph Pruning
Graphplan instantiates in a forward direction, pruning unreachable nodes • conflicting actions are mutex
• if all actions that add two facts are mutex, the facts are mutex
• if the preconditions for an action are mutex, the action is unreachable!
In logical terms: limited application of negative binary propagation
• given: P V Q, P V R V S V ...
• infer: Q V R V S V ...
27
The Plan GraphThe Plan Graph
Facts FactsActions
... ...
Facts FactsActions
... ...
preconditions add effects
mutually exclusive
delete effects
28
Bridging ParadigmsBridging Paradigms
Both SATPLAN and Graphplan are disjunctive planners (Kambhampati 1996)
Can the best features of each be combined?
IJCAI Challenge in Bridging Plan Synthesis Paradigms (Kambhampati 1997)
Our response: blackbox
29
Translation of Plan GraphTranslation of Plan Graph
Fact Act1 Act2
Act1 Pre1 Pre2
¬Act1 ¬Act2
Act1
Act2
Fact
Pre1
Pre2
30
Improved EncodingsImproved Encodings
Translations of Logistics.a:
STRIPS Axiom Schemas SAT(Medic system, Weld et. al 1997)
• 3,510 variables, 16,168 clauses
• 24 hours to solve
STRIPS Plan Graph SAT
• 2,709 variables, 27,522 clauses
• 5 seconds to solve!
31
Limited InferenceLimited Inference
SATPLAN used only a single general theorem-prover
What role can limited (polytime) reasoning algorithms play?
Two kinds of limited deduction• Planning specific (mutex computation)
• General polytime simplification – apply to all CNF formulas, may or may not be
designed with planning in mind
32
General Limited InferenceGeneral Limited Inference
Generated wff can be further simplified by consistency propagation techniques
Compact (Crawford & Auton 1996)
• unit propagation: is Wff inconsistant by resolution against unit clauses?
O(n)
• failed literal rule: is Wff + { P } inconsistant by unit propagation?
O(n2)
• binary failed literal rule: is Wff + { P V Q } inconsistant by unit propagation?
O(n3)
Complements domain specific limited inference
Discovers hidden local structure!
33
General Limited InferenceGeneral Limited Inference
Percent vars set byProblem Varsunitprop
failedlit
binaryfailed
bw.a 2452 10% 100% 100%bw.b 6358 5% 43% 99%bw.c 19158 2% 33% 99%log.a 2709 2% 36% 45%log.b 3287 2% 24% 30%log.c 4197 2% 23% 27%log.d 6151 1% 25% 33%
34
BlackboxBlackbox
STRIPSPlan Graph
Mutex computation
CNF
GeneralStochastic / Systematic SAT engines
Solution
SimplifierTranslator
CNF
35
Staged InferenceStaged Inference
Domain specific model
Polytime domain specific inference
General language encoding
Full general inference(NP complete)
Solution
Polytime general inference
Abstract problem specification
Encoding scheme
Combinatorial CORE
36
IntuitionIntuition
Many real-world problems not tractable, but are nearly so
• polytime inference takes advance of special kinds of structure
• structure may be visible at the level of a domain specific representation, or only after the problem is encoded
• small number of practical methods for combinatorial core– can be highly optimized
37
Blackbox ResultsBlackbox Results
0.01
0.1
1
10
100
1000
10000
rocket.a rocket.b log.a log.b log.c log.d
Graphplan
BB-walksat
BB-rand-sys
Handcoded-walksat
38
Part 3: Improved Encodings: Compiling Control KnowledgePart 3: Improved Encodings: Compiling Control Knowledge
39
Kinds of Control KnowledgeKinds of Control Knowledge
About domain itself• a truck is only in one location
About good plans• do not remove a package from its destination location
About how to search• plan air routes before land routes
40
Expressing KnowledgeExpressing Knowledge
Such information is traditionally incorporated in the planning algorithm itself
– or in a special programming language
Instead: use additional declarative axioms– (Bacchus 1995; Kautz 1998; Chen, Kautz, & Selman 1999)
• Problem instance: operator axioms + initial and goal axioms + control axioms
• Control knowledge constraints on search and solution spaces
• Independent of any search engine strategy
41
Axiomatic Control KnowledgeAxiomatic Control Knowledge
State Invariant: A truck is at only one location
at(truck,loc1,i) & loc1 loc2 at(truck,loc2,i)
Optimality: Do not return a package to a location
at(pkg,loc,i) & at(pkg,loc,i+1) & i<j at(pkg,loc,j)
Simplifying Assumption: Once a truck is loaded, it should immediately move
in(pkg,truck,i) & in(pkg,truck,i+1) &at(truck,loc,i+1)
at(truck,loc,i+2)
42
Adding Control Kx to SATPLANAdding Control Kx to SATPLAN
ProblemSpecification
Axioms
Control Knowledge
Axioms
Instantiated Clauses
SAT Simplifier
SAT Engine
SAT “Core”
As control knowledge increases, Core shrinks!
43
Tradeoffs of Control KnowledgeTradeoffs of Control Knowledge
If the planning domain is inherently intractable, how can any amount of control knowledge make planning tractable?
• by reducing solution quality
• optimal planning - NP-Hard
• non-optimal - (maybe) Polynomial
Issue: speed / quality tradeoff
Case study: Control Knowledge in TLPLAN and BlackBox
• TLPLAN (Bacchus 1996): simple forward-chaining search with strong control rules
44
IntuitionIntuition
Greater raw search power of SAT approach should give better quality solutions than planners entirely dependent on control knowlege
46
I. Rules involves only static information
II. Rules depends on the current state
III. Rules depends on the current state and
requires dynamic user-defined predicates
Temporal Logic for ControlTemporal Logic for Control
( at(obj1, loc1) =>
at(obj1, loc1) )
47
a
Category I Control RulesCategory I Control Rules
a
Do NOT unload an object from an airplane unless the object is at its goal destination
GoalInitial
a
SFO ORLNYC
48
Pruning the Planning GraphCategory I Rules
Pruning the Planning GraphCategory I Rules
Facts FactsActions
... ...
Facts FactsActions
... ...
49
Effect of Graph PruningEffect of Graph Pruning
0
2000
4000
6000
8000
10000
log-a log-b log-c log-d
nu
mb
er o
f n
od
es
Original Pruned
50
Category II Control RulesCategory II Control Rules
a
ORL NYC
Do NOT move an airplane if there is an object in the airplane that needs to be unloaded at that location.
SFO
51
Control by Adding ConstraintsControl by Adding Constraints
Control Rules
))ORLORL )next(at(p,)at(p,(in(pkg,p)
Planning Formula Constraints Clauses
)( 1 iii yyx
52
Blackbox with Control Knowledge(Logistics domain)
Blackbox with Control Knowledge(Logistics domain)
1
10
100
1000
10000
log-a log-b log-c log-d log-e
tim
e (s
ec)
blackbox blackbox(I) blackbox(I&II)
53
Blackbox with Control Knowledge(Tire-World domain)
0
20
40
60
80
100
120
tire-a tire-b
tim
e (s
ec)
blackbox blackbox(I) blackbox(I&II)
54
Comparison between Blackbox and TLPlan(Parallel Plan Length)
0
5
10
15
20
25
30
35
log-c log-d log-e log-1 log-2
Par
alle
l P
lan
Len
gth
TLPlan Blackbox
55
Comparison between Blackbox and TLPlan(Running Time)
Comparison between Blackbox and TLPlan(Running Time)
0
20
40
60
80
log-a log-b log-c log-d log-e
Tim
e (
se
c)
TLPlan Blackbox(I&II)
56
ComparisonComparison
TLPlan (without Control): Intractable.TLPlan (with Control): fastest, but limited parallelism
Blackbox (without Control): slower, high parallelismBlackbox (with Control): faster, high parallelism
57
SummarySummary
Easy to encode domain-specific knowledge in the planning as satisfiablity frame
• Key to order-of-magnitude scaling
• Propositional logic, temporal logic, ...
• Can be applied before/after SAT encoding
Can control time / quality tradeoff• Power of underlying SAT engines gives option of
finding higher quality solutions
Heuristics are independent from the SAT engine• Can use same axioms for radically different
problem solvers
58
How to Generate Control KxHow to Generate Control Kx
Introspection• Try to capture “obvious” inferences that are hard to
deduce
EBL (Minton, Kambhampati)
• Generalize trace of previous problem solving
Static analysis (Smith, Etzioni, Knoblock, Peot)
• Analyze operators
Inductive Logic Programming (Huang, Selman, Kautz)
• Find rules that hold for a set of previous high-quality solution plans
59
Part 4: Beyond SAT: Planning with Resources and
Optimization
Part 4: Beyond SAT: Planning with Resources and
Optimization
60
ConclusionsConclusions
• Propositional approaches to Open-Loop planning using general SAT engines are highly competitive with specialized planning algorithms
• Synergy with Plan Graph approaches
• Can effectively employ purely declarative control knowledge
• Promising direction: generalization to domains with numeric information
• Biggest limitation: domains where number of objects is too large to instantiate
– “lifted SAT” - limited quantification?