Open-Loop Planning as Satisfiability Henry Kautz AT&T Labs.


2

A Core Computation Problem

Focus: Classical state-space planning, extended with parallel actions

Core problem - similar computational issues arise in many other models

• Reactive plans

• Planning with uncertainty and utilities

• Continuous processes

• Metric time

3

State-space Planning

Find a sequence of operators that transform an initial state to a goal state

State = complete truth assignment to a set of variables (fluents)

Goal = partial truth assignment (set of states)

Operator = a partial function State → State

• specified by three sets of variables: precondition, add list, delete list
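A minimal sketch of these definitions, assuming a state is represented as the set of fluents that are true. The names Operator, apply, and satisfies, and the one-plane example, are illustrative and not from the talk.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

State = FrozenSet[str]  # the fluents that are true; every other fluent is false


@dataclass(frozen=True)
class Operator:
    name: str
    pre: FrozenSet[str]     # precondition: fluents that must be true
    add: FrozenSet[str]     # add list: fluents made true
    delete: FrozenSet[str]  # delete list: fluents made false


def apply(op: Operator, state: State) -> Optional[State]:
    """Return the successor state, or None where the partial function is undefined."""
    if not op.pre <= state:
        return None
    return (state - op.delete) | op.add


def satisfies(state: State, goal: FrozenSet[str]) -> bool:
    """A goal is a partial assignment; only positive goal fluents are modeled here."""
    return goal <= state


# Hypothetical one-plane example.
fly_ny_la = Operator("fly(NY,LA)", frozenset({"at(NY)"}),
                     frozenset({"at(LA)"}), frozenset({"at(NY)"}))
s0 = frozenset({"at(NY)"})
s1 = apply(fly_ny_la, s0)
```

Applying fly_ny_la a second time returns None, since its precondition no longer holds in s1.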

4

Parallelism

Operators may be applied in parallel when all orderings are well defined and equivalent

(Op1 || Op2)(s) = Op2(Op1(s)) = Op1(Op2(s))

A special form of non-linear plans

• Only allows parallel actions, not parallel action sequences

• Easy to serialize
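The parallelism condition can be checked directly on a given state. This is a sketch under the assumption that operators are precondition / add / delete triples over sets of true fluents; Op, seq, parallel_ok, and the truck example are hypothetical names.

```python
from typing import FrozenSet, NamedTuple, Optional


class Op(NamedTuple):
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def apply(op: Op, state: Optional[FrozenSet[str]]) -> Optional[FrozenSet[str]]:
    """Apply op, or return None if the input is undefined or the precondition fails."""
    if state is None or not op.pre <= state:
        return None
    return (state - op.delete) | op.add


def seq(first: Op, second: Op, state: FrozenSet[str]) -> Optional[FrozenSet[str]]:
    return apply(second, apply(first, state))


def parallel_ok(a: Op, b: Op, state: FrozenSet[str]) -> bool:
    """True when both orderings are defined on `state` and give the same result."""
    ab = seq(a, b, state)
    ba = seq(b, a, state)
    return ab is not None and ab == ba


load_p1 = Op(frozenset({"at(p1,NY)", "at(truck,NY)"}),
             frozenset({"in(p1,truck)"}), frozenset({"at(p1,NY)"}))
load_p2 = Op(frozenset({"at(p2,NY)", "at(truck,NY)"}),
             frozenset({"in(p2,truck)"}), frozenset({"at(p2,NY)"}))
drive = Op(frozenset({"at(truck,NY)"}),
           frozenset({"at(truck,LA)"}), frozenset({"at(truck,NY)"}))
start = frozenset({"at(p1,NY)", "at(p2,NY)", "at(truck,NY)"})
```

The two loads commute, so they may share a time step. Loading and driving do not, because driving deletes a precondition of loading.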

5

Abundance of Negative Complexity Results

I. Domain-independent planning: PSPACE-complete

(Chapman 1987; Bylander 1991; Backstrom 1993)

II. Domain-dependent planning: NP-complete (Chenoweth 1991; Gupta and Nau 1992)

III. Approximate planning: NP-complete (Selman 1994)

6

Practice

Until recently, domain-independent planning systems could only generate very short plans

– 5 steps

Practical systems minimize or eliminate search

• Search control rules (Sacerdoti 1975, Slaney 1996, Bacchus 1996)

• Pre-compiling entire state-space (Agre & Chapman 1997, Williams & Nayak 1997)

Scaling remains problematic when state space is large or not well understood!

How far can one push domain-independent planning?

– AIPS Planning Competition - 50 step plans

7

Planning as Inference

• Planning as first-order theorem proving (Green 1969)

computationally infeasible

• STRIPS (Fikes & Nilsson 1971)

very hard

• Partial-order planning (modal truth criteria) (Tate 1977, Chapman 1985, McAllester 1991, Smith & Peot 1993)

can be more efficient, but still hard (Minton, Bresina, & Drummond 1994)

• SATPLAN: planning as propositional reasoning

8

Approach

SAT encodings are designed so that plans correspond to satisfying assignments

Use recent efficient satisfiability procedures (systematic and stochastic) to solve

Evaluate performance on benchmark instances

9

Outline

1. Modeling and Solving Planning Problems as SAT

2. Improved Encodings using Graph Analysis

3. Improved Encodings using Compiled Control Knowledge

4. Beyond SAT: Planning with Resources and Optimization

10

Part 1: Modeling and Solving Planning Problems as SAT

11

SAT Encodings

Propositional CNF: no variables or quantifiers

Sets of clauses specified by axiom schemas

• fully instantiated before problem-solving

Discrete time, modeled by integers

• state predicates: indexed by time at which they hold

• action predicates: indexed by time at which action begins

– each action takes 1 time step

– many actions may occur at the same step

12

Encoding Conventions

• Actions imply preconditions and effects

fly(x,y,i) → at(x,i) & route(x,y) & at(y,i+1)

• Conflicting actions cannot occur at same time (A deletes a precondition of B)

fly(x,y,i) & y≠z → ¬fly(x,z,i)

• If something changes, an action must have caused it (Explanatory Frame Axioms)

at(x,i) & ¬at(x,i+1) → ∃y . route(x,y) & fly(x,y,i)

• Initial and final states hold

at(NY,0) & ... & at(LA,9) & ...
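A rough sketch of how these conventions could be instantiated into clauses for a single plane. It assumes route(x,y) is static, so fly variables are created only for existing routes, and it writes a negated literal with a leading "-". The function name encode, the city names, and the horizon are illustrative.

```python
from itertools import combinations


def encode(cities, routes, horizon, start, goal):
    """Return CNF clauses (lists of string literals) for a one-plane flying problem."""
    def fly(x, y, i):
        return f"fly({x},{y},{i})"

    def at(x, i):
        return f"at({x},{i})"

    clauses = []
    for i in range(horizon):
        # Actions imply preconditions and effects.
        for x, y in routes:
            clauses.append(["-" + fly(x, y, i), at(x, i)])
            clauses.append(["-" + fly(x, y, i), at(y, i + 1)])
        for x in cities:
            dests = [y for (a, y) in routes if a == x]
            # Conflicting actions cannot occur at the same time.
            for y, z in combinations(dests, 2):
                clauses.append(["-" + fly(x, y, i), "-" + fly(x, z, i)])
            # Explanatory frame axiom: leaving x requires some flight out of x.
            clauses.append(["-" + at(x, i), at(x, i + 1)]
                           + [fly(x, y, i) for y in dests])
    # Initial and final states hold.
    clauses.append([at(start, 0)])
    clauses.append([at(goal, horizon)])
    return clauses


cnf = encode(["NY", "CHI", "LA"],
             [("NY", "CHI"), ("NY", "LA"), ("CHI", "LA")],
             horizon=2, start="NY", goal="LA")
```

Only the frame axiom direction shown on the slide is generated; a fuller encoding would also explain how at(x,i+1) becomes true.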

13

Modeling Tricks

Can often dramatically reduce size of problem by modeling techniques

move(x,y,z,i) requires n^4 vars

pickup(x,y,i), putdown(x,z,i) requires 2n^3 vars

State-based encodings: eliminate all action variables (“compile away”)

at(x,i) & ¬at(x,i+1) → ∃y . route(x,y) & at(y,i+1)

at(x,i) & x≠y → ¬at(y,i)
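The variable counts for move versus pickup / putdown can be checked with a few lines of arithmetic. The sketch assumes n objects and, to match the n^4 and 2n^3 figures, a number of time steps equal to n; the function names are illustrative.

```python
def move_vars(n_objects: int, n_steps: int) -> int:
    """move(x,y,z,i): three object arguments and one time index."""
    return n_objects ** 3 * n_steps


def split_vars(n_objects: int, n_steps: int) -> int:
    """pickup(x,y,i) and putdown(x,z,i): two predicates, two object arguments each."""
    return 2 * n_objects ** 2 * n_steps


n = 18  # assumption: 18 blocks and 18 time steps
single = move_vars(n, n)   # n^4
split = split_vars(n, n)   # 2n^3
```

With n = 18 the split encoding needs 11,664 action variables instead of 104,976.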

14

Solution to a Planning Problem

A solution is specified by any model (satisfying truth assignment) of the conjunction of the axioms describing the initial state, goal state, and operators

Easy to convert back to a STRIPS-style plan
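One way to read a plan off a model, sketched under the assumption that atoms are strings of the form name(args,time) and that the model is given as the set of true atoms. decode_plan and the sample model are hypothetical.

```python
import re
from collections import defaultdict

# name(arg1,...,argk,time): the last argument is the time index.
ATOM = re.compile(r"^(\w+)\((.*),(\d+)\)$")


def decode_plan(true_atoms, action_names):
    """Group the true action atoms by time step; state atoms are ignored."""
    steps = defaultdict(list)
    for atom in true_atoms:
        match = ATOM.match(atom)
        if match and match.group(1) in action_names:
            name, args, time = match.groups()
            steps[int(time)].append(f"{name}({args})")
    # Steps at which no action occurs are skipped.
    return [sorted(steps[t]) for t in sorted(steps)]


model = {"at(NY,0)", "fly(NY,CHI,0)", "at(CHI,1)", "fly(CHI,LA,1)", "at(LA,2)"}
plan = decode_plan(model, {"fly"})
```

Each inner list holds the actions of one time step, which may be executed in parallel or in any serial order.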

15

SATPLAN

[Diagram: a problem description and a plan length are instantiated through axiom schemas into instantiated propositional clauses; SAT engine(s) produce a satisfying model; the mapping is used to interpret the model as a plan]

16

SAT Algorithms

Systematic Search

• DP (Davis Putnam Logemann Loveland)

backtrack search + unit propagation

• satz (Chu Min Li) - variable selection by forward checking: max unit props

• relsat (Bayardo) - dependency directed backtracking: add new clauses at dead-ends

Local Search

• Inspired by Min-Conflicts algorithm (Adorf, Johnston, Minton, Philips, & Laird)

• GSAT (Selman), Walksat (Selman, Kautz & Cohen)

greedy local search + noise to escape minima
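For the systematic side, backtrack search plus unit propagation can be sketched in a few lines. This is a plain textbook version, not satz or relsat: it picks the lowest-numbered free variable and has no learning. Clauses are lists of non-zero integers, with a negative integer standing for a negated variable.

```python
def unit_propagate(clauses, assignment):
    """Extend the assignment with literals forced by unit clauses; None on conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                value = assignment.get(abs(lit))
                if value is None:
                    unassigned.append(lit)
                elif value == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return None  # every literal is false
            if len(unassigned) == 1:
                assignment[abs(unassigned[0])] = unassigned[0] > 0
                changed = True
    return assignment


def dpll(clauses, assignment=None):
    """Return a satisfying assignment {variable: bool}, or None if there is none."""
    assignment = unit_propagate(clauses, assignment or {})
    if assignment is None:
        return None
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    free = [v for v in variables if v not in assignment]
    if not free:
        return assignment
    for value in (True, False):  # branch, then backtrack
        result = dpll(clauses, {**assignment, free[0]: value})
        if result is not None:
            return result
    return None
```

The solvers named above differ mainly in how the branching variable is chosen and in what is recorded at dead ends.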

17

Planning Benchmark Test Set

Extension of Graphplan test set

blocks world - up to 18 blocks, 10^19 states

logistics - complex, highly-parallel transportation domain.

Logistics.d:

• 2,165 possible actions per time slot

• 10^16 legal configurations (2^2000 states)

• optimal solution contains 74 distinct actions over 14 time slots

Problems of this size never previously handled by state-space planning systems

18

Scaling Up Logistics Planning

[Figure: log solution time (axis from 0.01 to 10000) for Graphplan, DP, DP/Satz, and Walksat on rocket.a, rocket.b, log.b, log.a, log.c, log.d]

19

Unpredictability of Systematic Search

[Figure: log solution time (axis from 0.01 to 10000) for DP/Satz and Walksat on rocket.a, rocket.b, log.b, log.a, log.c, log.d]

20

Randomized Restarts

Solution: randomize the systematic solver

• Add noise to the heuristic branching (variable choice) function

• Cutoff and restart search after a fixed number of backtracks

In practice: rapid restarts with low cutoff can dramatically improve performance

(Gomes 1996, Gomes, Kautz, and Selman 1997, 1998)
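The two ingredients, a noisy branching choice and a backtrack cutoff, can be illustrated on a small DPLL-style solver. This is a sketch and not Satz/Rand: the branching "heuristic" here is just a random pick, and Cutoff, search, solve_with_restarts, and the default parameter values are assumptions made for the example.

```python
import random


class Cutoff(Exception):
    """Raised when a run exceeds its backtrack budget."""


def simplify(clauses, lit):
    """Set lit true: drop satisfied clauses, shorten the rest; None on empty clause."""
    out = []
    for clause in clauses:
        if lit in clause:
            continue
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None
        out.append(reduced)
    return out


def search(clauses, model, rng, budget):
    while True:  # unit propagation
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        clauses = simplify(clauses, units[0])
        if clauses is None:
            return None
        model = model + [units[0]]
    if not clauses:
        return model
    var = abs(rng.choice(rng.choice(clauses)))  # randomized variable choice
    for lit in rng.sample([var, -var], 2):      # randomized value order
        reduced = simplify(clauses, lit)
        if reduced is not None:
            result = search(reduced, model + [lit], rng, budget)
            if result is not None:
                return result
        budget[0] -= 1                          # count one backtrack
        if budget[0] < 0:
            raise Cutoff
    return None


def solve_with_restarts(clauses, cutoff=10, max_restarts=100, seed=0):
    """Return a list of true literals, or None if a run proves unsatisfiability."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        try:
            return search(clauses, [], rng, [cutoff])
        except Cutoff:
            continue  # restart from scratch with fresh random choices
    raise RuntimeError("no answer within the restart limit")
```

A low cutoff gives up quickly on unlucky early choices; completeness on unsatisfiable instances is only retained if the cutoff is large enough for some run to finish.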

21

Increased Predictability

[Figure: log solution time (axis from 0.01 to 10000) for Satz and Satz/Rand on rocket.a, rocket.b, log.b, log.a, log.c, log.d]

22

What SATPLAN Shows

General propositional theorem provers can compete with state of the art specialized planning systems

• New, highly tuned variations of DP are surprisingly powerful

– result of sharing ideas and code in large SAT/CSP research community

– specialized engines can catch up, but by then there are new general techniques

• Radically new stochastic approaches to SAT can provide very low exponential scaling

– 2+ orders of magnitude speedup on hard benchmark problems

23

Why SATPLAN Works

More flexible than forward or backward chaining

• Systematic: most unit propagation at most highly constrained states

• Stochastic: iterative repair

Space for time tradeoff

• Less overhead, since variables do not have to be instantiated during search

Randomized algorithms less likely to get trapped along bad paths

24

Part 2: Improved Encodings by Graph Analysis: The BLACKBOX Planner

25

Graphplan

Planning as graph search (Blum & Furst 1995)

Set new paradigm for planning

Like SATPLAN...

• Two phases: instantiation of propositional structure, followed by search

Unlike SATPLAN...

• Interleaves instantiation and pruning of plan graph

• Employs specialized search engine

Graphplan - better instantiation

SATPLAN - better search

26

Graph Pruning

Graphplan instantiates in a forward direction, pruning unreachable nodes

• conflicting actions are mutex

• if all actions that add two facts are mutex, the facts are mutex

• if the preconditions for an action are mutex, the action is unreachable!

In logical terms: limited application of negative binary propagation

• given: ¬P V ¬Q, P V R V S V ...

• infer: ¬Q V R V S V ...
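The inference step is a single resolution against a binary negative (mutex) clause. A small sketch, with clauses as frozensets of string literals and a leading "-" for negation; negate and negative_binary_propagation are hypothetical names.

```python
def negate(lit: str) -> str:
    return lit[1:] if lit.startswith("-") else "-" + lit


def negative_binary_propagation(mutex, clause):
    """Resolve the binary clause (¬P V ¬Q) against a clause containing P or Q."""
    resolvents = []
    for lit in mutex:
        if negate(lit) in clause:
            rest_of_mutex = mutex - {lit}
            resolvents.append(frozenset((clause - {negate(lit)}) | rest_of_mutex))
    return resolvents


given_mutex = frozenset({"-P", "-Q"})
given_clause = frozenset({"P", "R", "S"})
inferred = negative_binary_propagation(given_mutex, given_clause)
```

Here inferred holds the single clause ¬Q V R V S, matching the pattern above.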

27

The Plan Graph

[Figure: plan graph with alternating layers of facts and actions; edges are labeled as preconditions, add effects, delete effects, and mutually exclusive]

28

Bridging Paradigms

Both SATPLAN and Graphplan are disjunctive planners (Kambhampati 1996)

Can the best features of each be combined?

IJCAI Challenge in Bridging Plan Synthesis Paradigms (Kambhampati 1997)

Our response: blackbox

29

Translation of Plan Graph

Fact → Act1 V Act2

Act1 → Pre1 & Pre2

¬Act1 V ¬Act2

[Figure: plan graph fragment with nodes Pre1, Pre2, Act1, Act2, and Fact]
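A sketch of this translation for one layer of a plan graph. The graph is assumed to be given as three plain Python structures (which actions add each fact, each action's preconditions, and the mutex pairs); translate and its argument names are illustrative and are not the Blackbox data structures.

```python
def translate(adders, preconds, mutexes):
    """Turn plan-graph edges into CNF clauses ("-" marks a negated literal)."""
    clauses = []
    for fact, actions in adders.items():
        # A fact implies the disjunction of the actions that add it.
        clauses.append(["-" + fact] + list(actions))
    for action, pres in preconds.items():
        # An action implies each of its preconditions: one clause per precondition.
        for pre in pres:
            clauses.append(["-" + action, pre])
    for a, b in mutexes:
        # Mutually exclusive actions cannot both occur.
        clauses.append(["-" + a, "-" + b])
    return clauses


cnf = translate(adders={"Fact": ["Act1", "Act2"]},
                preconds={"Act1": ["Pre1", "Pre2"]},
                mutexes=[("Act1", "Act2")])
```

In a full translation the node names would also carry the layer index, so that the same fact at different times maps to different variables.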

30

Improved Encodings

Translations of Logistics.a:

STRIPS → Axiom Schemas → SAT (Medic system, Weld et al. 1997)

• 3,510 variables, 16,168 clauses

• 24 hours to solve

STRIPS → Plan Graph → SAT

• 2,709 variables, 27,522 clauses

• 5 seconds to solve!

31

Limited Inference

SATPLAN used only a single general theorem-prover

What role can limited (polytime) reasoning algorithms play?

Two kinds of limited deduction

• Planning specific (mutex computation)

• General polytime simplification

– apply to all CNF formulas, may or may not be designed with planning in mind

32

General Limited Inference

Generated wff can be further simplified by consistency propagation techniques

Compact (Crawford & Auton 1996)

• unit propagation: is Wff inconsistent by resolution against unit clauses?

O(n)

• failed literal rule: is Wff + { P } inconsistent by unit propagation?

O(n^2)

• binary failed literal rule: is Wff + { P V Q } inconsistent by unit propagation?

O(n^3)

Complements domain specific limited inference

Discovers hidden local structure!
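The failed literal rule can be sketched on top of a simple unit propagator. This is an illustration of the idea only, not the Compact implementation; clauses are lists of non-zero integers and both function names are hypothetical.

```python
def unit_propagate(clauses, assigned):
    """Close a set of true literals under unit propagation; None on conflict."""
    assigned = set(assigned)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assigned for lit in clause):
                continue  # clause already satisfied
            open_lits = [lit for lit in clause if -lit not in assigned]
            if not open_lits:
                return None  # every literal is false
            if len(open_lits) == 1:
                assigned.add(open_lits[0])
                changed = True
    return assigned


def failed_literals(clauses):
    """Return literals ¬P such that Wff + {P} is refuted by unit propagation."""
    base = unit_propagate(clauses, set())
    if base is None:
        return None  # the formula itself is refuted
    forced = set()
    for var in sorted({abs(lit) for clause in clauses for lit in clause}):
        for lit in (var, -var):
            if lit in base or -lit in base:
                continue
            if unit_propagate(clauses, base | {lit}) is None:
                forced.add(-lit)  # assuming lit fails, so its negation holds
    return forced


wff = [[-1, 2], [-1, -2], [1, 3]]
forced = failed_literals(wff)
```

On this small formula plain unit propagation sets nothing, while the failed literal test fixes variable 1 to false and variable 3 to true.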

33

General Limited Inference

Percent vars set by:

Problem   Vars    unit prop   failed lit   binary failed
bw.a      2452    10%         100%         100%
bw.b      6358    5%          43%          99%
bw.c      19158   2%          33%          99%
log.a     2709    2%          36%          45%
log.b     3287    2%          24%          30%
log.c     4197    2%          23%          27%
log.d     6151    1%          25%          33%

34

Blackbox

[Diagram: STRIPS → Mutex computation → Plan Graph → Translator → CNF → Simplifier → CNF → General Stochastic / Systematic SAT engines → Solution]

35

Staged Inference

[Diagram: inference stages. Labels: abstract problem specification; domain specific model; polytime domain specific inference; encoding scheme; general language encoding; polytime general inference; combinatorial core; full general inference (NP complete); solution]

36

Intuition

Many real-world problems are not tractable, but are nearly so

• polytime inference takes advantage of special kinds of structure

• structure may be visible at the level of a domain specific representation, or only after the problem is encoded

• small number of practical methods for combinatorial core

– can be highly optimized

37

Blackbox Results

[Figure: solution time (log axis from 0.01 to 10000) for Graphplan, BB-walksat, BB-rand-sys, and Handcoded-walksat on rocket.a, rocket.b, log.a, log.b, log.c, log.d]

38

Part 3: Improved Encodings: Compiling Control Knowledge

39

Kinds of Control Knowledge

About domain itself

• a truck is only in one location

About good plans

• do not remove a package from its destination location

About how to search

• plan air routes before land routes

40

Expressing Knowledge

Such information is traditionally incorporated in the planning algorithm itself

– or in a special programming language

Instead: use additional declarative axioms

– (Bacchus 1995; Kautz 1998; Chen, Kautz, & Selman 1999)

• Problem instance: operator axioms + initial and goal axioms + control axioms

• Control knowledge = constraints on search and solution spaces

• Independent of any search engine strategy

41

Axiomatic Control Knowledge

State Invariant: A truck is at only one location

at(truck,loc1,i) & loc1 ≠ loc2 → ¬at(truck,loc2,i)

Optimality: Do not return a package to a location

at(pkg,loc,i) & ¬at(pkg,loc,i+1) & i<j → ¬at(pkg,loc,j)

Simplifying Assumption: Once a truck is loaded, it should immediately move

¬in(pkg,truck,i) & in(pkg,truck,i+1) & at(truck,loc,i+1) → ¬at(truck,loc,i+2)
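Control axioms of this kind are instantiated into clauses just like operator axioms. A sketch for the state invariant and the load-then-move assumption, with negation written as a leading "-"; the function names and the tiny domain are illustrative.

```python
from itertools import combinations


def one_location(trucks, locs, horizon):
    """State invariant: a truck is never at two locations at the same time."""
    clauses = []
    for i in range(horizon + 1):
        for truck in trucks:
            for loc1, loc2 in combinations(locs, 2):
                clauses.append([f"-at({truck},{loc1},{i})",
                                f"-at({truck},{loc2},{i})"])
    return clauses


def move_after_load(pkgs, trucks, locs, horizon):
    """Simplifying assumption: a truck that was just loaded leaves at once.

    Clause form of: not in(i) & in(i+1) & at(i+1) implies not at(i+2).
    """
    clauses = []
    for i in range(horizon - 1):  # i + 2 must not exceed the horizon
        for pkg in pkgs:
            for truck in trucks:
                for loc in locs:
                    clauses.append([f"in({pkg},{truck},{i})",
                                    f"-in({pkg},{truck},{i + 1})",
                                    f"-at({truck},{loc},{i + 1})",
                                    f"-at({truck},{loc},{i + 2})"])
    return clauses


invariant = one_location(["truck"], ["a", "b", "c"], horizon=2)
assumption = move_after_load(["pkg"], ["truck"], ["a"], horizon=2)
```

These clauses are simply appended to the problem's CNF, so the same axioms can be handed to any SAT engine.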

42

Adding Control Kx to SATPLAN

[Diagram: problem specification axioms and control knowledge axioms → instantiated clauses → SAT simplifier → SAT “core” → SAT engine]

As control knowledge increases, Core shrinks!

43

Tradeoffs of Control Knowledge

If the planning domain is inherently intractable, how can any amount of control knowledge make planning tractable?

• by reducing solution quality

• optimal planning - NP-Hard

• non-optimal - (maybe) Polynomial

Issue: speed / quality tradeoff

Case study: Control Knowledge in TLPLAN and BlackBox

• TLPLAN (Bacchus 1996): simple forward-chaining search with strong control rules

44

Intuition

Greater raw search power of SAT approach should give better quality solutions than planners entirely dependent on control knowledge

45

TLPlan

Temporal Logic Control Formula

46

Temporal Logic for Control

( at(obj1, loc1) => at(obj1, loc1) )

I. Rules involve only static information

II. Rules depend on the current state

III. Rules depend on the current state and require dynamic user-defined predicates

47

Category I Control Rules

Do NOT unload an object from an airplane unless the object is at its goal destination

[Figure: initial and goal configurations over airports SFO, ORL, and NYC]

48

Pruning the Planning Graph: Category I Rules

[Figure: plan graph with alternating layers of facts and actions]

49

Effect of Graph Pruning

[Figure: number of nodes (axis from 0 to 10000) in the original and pruned plan graphs for log-a, log-b, log-c, log-d]

50

Category II Control Rules

Do NOT move an airplane if there is an object in the airplane that needs to be unloaded at that location.

[Figure: example with airports SFO, ORL, and NYC]

51

Control by Adding Constraints

Control Rules

( in(pkg,p) & at(p,ORL) ) → next( at(p,ORL) )

[Diagram labels: Planning Formula; Constraints; Clauses]

( x_i & y_i ) → y_{i+1}

52

Blackbox with Control Knowledge (Logistics domain)

[Figure: time (sec), log axis from 1 to 10000, for blackbox, blackbox(I), and blackbox(I&II) on log-a, log-b, log-c, log-d, log-e]

53

Blackbox with Control Knowledge (Tire-World domain)

[Figure: time (sec), axis from 0 to 120, for blackbox, blackbox(I), and blackbox(I&II) on tire-a and tire-b]

54

Comparison between Blackbox and TLPlan (Parallel Plan Length)

[Figure: parallel plan length (axis from 0 to 35) for TLPlan and Blackbox on log-c, log-d, log-e, log-1, log-2]

55

Comparison between Blackbox and TLPlan (Running Time)

[Figure: time (sec), axis from 0 to 80, for TLPlan and Blackbox(I&II) on log-a, log-b, log-c, log-d, log-e]

56

Comparison

TLPlan (without Control): intractable

TLPlan (with Control): fastest, but limited parallelism

Blackbox (without Control): slower, high parallelism

Blackbox (with Control): faster, high parallelism

57

Summary

Easy to encode domain-specific knowledge in the planning as satisfiability framework

• Key to order-of-magnitude scaling

• Propositional logic, temporal logic, ...

• Can be applied before/after SAT encoding

Can control time / quality tradeoff

• Power of underlying SAT engines gives option of finding higher quality solutions

Heuristics are independent from the SAT engine

• Can use same axioms for radically different problem solvers

58

How to Generate Control Kx

Introspection

• Try to capture “obvious” inferences that are hard to deduce

EBL (Minton, Kambhampati)

• Generalize trace of previous problem solving

Static analysis (Smith, Etzioni, Knoblock, Peot)

• Analyze operators

Inductive Logic Programming (Huang, Selman, Kautz)

• Find rules that hold for a set of previous high-quality solution plans

59

Part 4: Beyond SAT: Planning with Resources and Optimization

60

Conclusions

• Propositional approaches to Open-Loop planning using general SAT engines are highly competitive with specialized planning algorithms

• Synergy with Plan Graph approaches

• Can effectively employ purely declarative control knowledge

• Promising direction: generalization to domains with numeric information

• Biggest limitation: domains where number of objects is too large to instantiate

– “lifted SAT” - limited quantification?