Lecture 4 - Planning

Lecture 4 Planning

Lecture Plan

1. Planning:

I. PDDL

II. Partial Order Planning

III. Planning Graphs and the Graph Plan Algorithm

PDDL: Planning Domain Definition Language

States

Actions

Goal


States

States are specified by a set of 'atomic fluents.'

• Fluents are propositions that can change their truth value

• Atomic means these propositions are simple: They do not include conjunctions, disjunctions, conditionals or negation.

• Database semantics is assumed, allowing negation by omission.

• Unique names assumption

• Closed-world assumption

• Domain closure

Actions

Goal


States

Actions

Actions are specified by a set of preconditions and effects, both of which are conjunctions of literals.

• Literals are simple atomic propositions or negations of atomic propositions.

• All preconditions are current, all effects immediate.

Actions can be performed at a state is all positive literals are, and all negative literals are not, in the set of fluents at a state.

The effect of an action is to add all positive literals, and remove all negative literals, in the action's effects from the current state.

• Where an action has non-trivial effects, this causes us to transition to a new state.

Goal

How actions function as transitions:

State 1

Action State

2

State 1: Set, F, of Atomic Fluents

State 2: Set, G = 𝐹 ∪ 𝐸+ ∖ 𝐴(𝐸−), of Atomic Fluents

Preconditions: Set, P = 𝑃+ ∪ 𝑃−, of literals, where:

• 𝑃+⊆F • 𝐴(𝑃−) ∩ F=∅.

Effects: Set, E = 𝐸+ ∪ 𝐸−, of literals.


States

Actions

Goal

The planning goal is specified by a conjunction of literals OR a schema which can be satisfied by such a conjunction of literals.

We have achieved a goal state when all positive literals are, and all negative literals are not, included in the current state specification.

A PDDL Problem

1. An initial state

2. A set of actions, or action schema:

Action schema use variables:

Action: Eat Cake (x,y)

Precondition: Cake(x), -Eaten(x), Person(y), Hungry(y)

Effect: Eaten(x), -Hungry(y)

3. A goal specification.

PDDL Problems as Search

1. We must discover the transitions we can make:

At each state, we must search through our actions to discover which actions may be performed there.

2. It is easy to search backwards and discover states that can transition to our current state given a possible action.

To find predecessor state to current state given a particular action: Remove all effects (add negations) and add all preconditions (remove negations) from current state.

PDDL Example: Initial State

Initial state: Person (mike)

PDDL Example: Actions

Action: Go Shopping (x)

Preconditions: Person(x)

Effects: HasMix(x)


Action: Make Cake(x)

Preconditions: Person(x)

HasMix (x)

Effects: HasCake(x)


Action: Wait(x)

Preconditions: Person(x),

-Hungry(x)

Effects: Hungry(x)


Action: Eat Cake A (x)


Hungry(x),

HasCake(x)

Effects: -HasCake(x),

-Hungry(x),

EatenCake(x)


Action: Eat Cake B (x)


-Hungry(x),

HasCake(x)


-Person(x)

PDDL Example: Goal

Preconditions: Person(mike),

-Hungry(mike),

HasCake(mike),

EatenCake(mike)

We could have a goal specification that included variables. Eg Let anyone who is a person, not just mike, eat a cake while having a cake and not being hungry:

Person(x),

-Hungry(x),

HasCake(x),

EatenCake(x)


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Pre.: Person(x)

-Hungry(x),

HasCake(x)

Effects: -HasCake(x)

-Person(x)


Pre.: Person(x)

Hungry(x),

HasCake(x)


EatenCake(x),

-Hungry(x) Action: Wait (x)

Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x)


Pre.: Person(x)

Effects: HasMix(x)

Goal

Pre.: Person(mike),

-Hungry(mike),

HasCake(mike),

EatenCake(mike)

Initial State

Person(mike)

Initial State

Person(mike)

Satisfied


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Pre.: Person(x)

-Hungry(x),

HasCake(x)


-Person(x)


Pre.: Person(x)

Hungry(x),

HasCake(x)


EatenCake(x),


Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x)


Pre.: Person(x)

Effects: HasMix(x)

Goal

Pre.: Person(mike),

-Hungry(mike),

HasCake(mike),

EatenCake(mike)

Initial State

Person(mike)

Satisfied by omission


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Pre.: Person(x)

-Hungry(x),

HasCake(x)


-Person(x)


Pre.: Person(x)

Hungry(x),

HasCake(x)


EatenCake(x),


Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x)


Pre.: Person(x)

Effects: HasMix(x)

Goal

Pre.: Person(mike),

-Hungry(mike),

HasCake(mike),

EatenCake(mike)

Initial State

Person(mike)

Unsatisfied


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Pre.: Person(x)

-Hungry(x),

HasCake(x)


-Person(x)


Pre.: Person(x)

Hungry(x),

HasCake(x)


EatenCake(x),


Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x)


Pre.: Person(x)

Effects: HasMix(x)

Goal

Pre.: Person(mike),

-Hungry(mike),

HasCake(mike),

EatenCake(mike)


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Pre.: Person(x)

-Hungry(x),

HasCake(x)


-Person(x)


Pre.: Person(x)

Hungry(x),

HasCake(x)


EatenCake(x),


Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x)


Pre.: Person(x)

Effects: HasMix(x)

Goal

Pre.: Person(mike),

-Hungry(mike),

HasCake(mike),

EatenCake(mike)

Initial State

Person(mike)

Satisfied

Initial State

Person(mike)

Satisfied (the second by omission).


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Pre.: Person(x)

-Hungry(x),

HasCake(x)


-Person(x)


Pre.: Person(x)

Hungry(x),

HasCake(x)


EatenCake(x),


Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x)


Pre.: Person(x)

Effects: HasMix(x)

Goal

Pre.: Person(mike),

-Hungry(mike),

HasCake(mike),

EatenCake(mike)

Initial State

Person(mike)

Go Shopping (mike)

Wait (mike)

S1

Person(mike), HasMix(mike)

S2

Person(mike), Hungry(mike)


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Pre.: Person(x)

-Hungry(x),

HasCake(x)


-Person(x)


Pre.: Person(x)

Hungry(x),

HasCake(x)


EatenCake(x),


Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x)


Pre.: Person(x)

Effects: HasMix(x)

Goal

Pre.: Person(mike),

-Hungry(mike),

HasCake(mike),

EatenCake(mike)

Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

Go Shopping (mike)


Pre.: Person(x)

Effects: HasMix(x)

Person(mike), HasMix(mike) At the next step, from S1 we could go shopping

again. But it wouldn’t change anything.



Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

Go Shopping (mike)


Pre.: Person(x)

Effects: HasMix(x)



The preconditions of the action are met...


Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

Go Shopping (mike)


Pre.: Person(x)

Effects: HasMix(x)




But the effects are already true at this state.


Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)

Person(mike), HasCake(mike)

Go Shopping (mike)

Make Cake (mike)

Or from S1 we could make a cake which would change things.



Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Go Shopping (mike)

Make Cake (mike)





Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Go Shopping (mike)

Make Cake (mike)




And the effects cause one fluent to be made true...


Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

-HasMix(x)


Go Shopping (mike)

Make Cake (mike)




And the effects cause one fluent to be made true... And one to be made false.

Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3


Go Shopping (mike)

Make Cake (mike)

S4

Action: Wait (x)

Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x) Person(mike), HasMix(mike)

Wait (mike)

Or we could wait.



Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3


Go Shopping (mike)

Make Cake (mike)

S4 Go Shopping (mike)

Wait (mike)


Pre.: Person(x)

Effects: HasMix(x)


From S2 we can only go shopping. And that will take us to S4 by another route.


Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3

Go Shopping (mike)

Make Cake (mike)

S4

Go Shopping (mike)

Wait (mike)



Pre.: Person(x)

Effects: HasMix(x)

Person(mike), HasMix(mike), HasCake(mike)

From S3 we can go shopping again.


Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3

Go Shopping (mike)

Make Cake (mike)

S4

Go Shopping (mike)

Wait (mike)


S6


Pre.: Person(x)

-Hungry(x)

HasCake(x)

Effects: -Person(x),

-HasCake(x)

Eat Cake B (mike)


Or we can eat the cake with action ’Eat Cake B’ (by omission Mike is not hungry). This leads to a state with no atomic fluents. Everything is negated. Including ’Person(mike)’: Mike is dead in this state!


Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3

Go Shopping (mike)

Make Cake (mike)

S4

Go Shopping (mike)

Wait (mike)


S6

Action: Wait (x)

Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x)

Eat Cake B (mike)

S7 Person(mike), HasCake(mike), Hungry(mike) Wait

(mike)


Or we can wait (and Mike lives...)


Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3

Go Shopping (mike)

Make Cake (mike)

S4

Go Shopping (mike)

Wait (mike)


S6

Eat Cake B (mike)

S7 Person(mike), HasCake(mike), Hungry(mike)

Wait (mike)

Go Shopping (mike)


Pre.: Person(x)

Effects: HasMix(x)


From S4 we can go shopping again, but that doesn’t change anything.

Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3

Go Shopping (mike)

Make Cake (mike)

S4

Go Shopping (mike)

Wait (mike)


S6

Eat Cake B (mike)


Wait (mike)

Go Shopping (mike)


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

Make Cake (mike)


Or we can make a cake, which takes us to S7

Initial State

Go Shopping (mike)

Wait (mike)

S1

S2

S3

Go Shopping (mike)

Make Cake (mike)

S4

Go Shopping (mike)

Wait (mike)


S6

Eat Cake B (mike)


Wait (mike)

Go Shopping (mike)


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

Make Cake (mike)


Lets compress the diagram...

Initial State

S1

S2

S3

S4

S5

S6

S7

Person(mike), HasCake(mike), Hungry(mike)


Lets compress the diagram...

Initial State

S1

S2

S3

S4

S5

S6

S7


Go Shopping (mike)


Pre.: Person(x)

Effects: HasMix(x)


From S5, we can go shopping, changing nothing...

Initial State

S1

S2

S3

S4

S5

S6

S7


Go Shopping (mike)


Pre.: Person(x)

HasMix(x)

Effects: HasCake(x)

Make Cake (mike)


From S5, we can make a cake... Leading back to S3, since we cannot express that we have multiple cakes.

Initial State

S1

S2

S3

S4

S5

S6

S7


Go Shopping (mike)

S8 HasMix(mike) Make Cake (mike)

Eat Cake B (mike)


Pre.: Person(x)

-Hungry(x)

HasCake(x)

Effects: -Person(x),

-HasCake(x)


Or we can eat the cake, leading to another way to kill Mike.

Initial State

S1

S2

S3

S4

S5

S6

S7


Go Shopping (mike)

S8 Make Cake (mike)

Eat Cake B (mike)

S9 Person(mike), HasMix(mike), HasCake(mike), Hungry(mike)

Wait (mike)

Action: Wait (x)

Pre.: Person(x)

-Hungry(x)

Effects: Hungry(x)

HasMix(mike)


Or we can wait.

Initial State

S1

S2

S3

S4

S5

S6

S7


Go Shopping (mike)

S8 Make Cake (mike)

Eat Cake B (mike)

S9 Wait (mike)

HasMix(mike)


Person(mike), HasMix(mike), HasCake(mike), Hungry(mike)

We can’t do anything from S6, since all actions require Person(mike).

Initial State

S1

S2

S3

S4

S5

S6

S7


Go Shopping (mike)

S8 Make Cake (mike)

Eat Cake B (mike)

S9 Wait (mike)

Go Shopping (mike)


Pre.: Person(x)

Effects: HasMix(x)

HasMix(mike)


Person(mike), HasMix(mike), HasCake(mike), Hungry(mike) From S7, we can go shopping, leading to S9.

Initial State

S1

S2

S3

S4

S5

S6

S7


Go Shopping (mike)

S8 Make Cake (mike)

Eat Cake B (mike)

S9 Wait (mike)

Go Shopping (mike)


Pre.: Person(x)

Hungry(x)

HasCake(x)

Effects: Hungry(x)

-HasCake(x)

S10 Person(mike), EatenCake(mike)

Eat Cake A (mike)

HasMix(mike)



Or we can eat the cake. This time we use ’Eat Cake A’, since Mike is hungry. So this time he lives! (Yay)

Initial State

S1

S2

S3

S4

S5

S6

S7


Go Shopping (mike)

S8 Make Cake (mike)

Eat Cake B (mike)

S9 Wait (mike)

Go Shopping (mike)


Pre.: Person(x)

Hungry(x)

HasCake(x)

Effects: Hungry(x)

-HasCake(x)

S10 Person(mike), EatenCake(mike)

Eat Cake A (mike)

HasMix(mike)



Lets clear out the diagram (and stop showing actions)...

S5

S6

S7


S8

S9

S10

Person(mike), EatenCake(mike)

HasMix(mike) Person(mike), HasMix(mike), HasCake(mike)


Lets clear out the diagram (and stop showing actions)...

S5

S6

S7


S8

S9

S10




S8 is another terminal node, since we killed off Mike.

S5

S6

S7


S8

S9

S10


Go Shopping (mike)



At S9, we can go shopping but it does nothing.

S5

S6

S7


S8

S9

S10


Go Shopping (mike)

Make Cake (mike)



Or we can make another cake. But that takes us back to S7.

S5

S6

S7


S8

S9

S10


Go Shopping (mike)

S11 Eat Cake A (mike)

Person(mike), HasMix(mike), EatenCake(mike)

Make Cake (mike)



Or we can eat the cake using ’Eat Cake A’.

S5

S6

S7


S8

S9

S10


Go Shopping (mike)

Make Cake (mike)


Go Shopping (mike)




At S10, we can go shopping, which also leads to S11.

S5

S6

S7


S8

S9


S10


Go Shopping (mike)

Make Cake (mike)


Go Shopping (mike)

S12

Wait (mike) Person(mike),

HasMix(mike), EatenCake(mike), Hungry(mike)



Or we can wait

S5

S6

S7


S8

S9


S10


Go Shopping (mike)

Make Cake (mike)


Go Shopping (mike)

S12





GOAL!

Goal conditions:

Person(mike)

HasCake(mike),

EatenCake(mike)

-Hungry(mike)

Finally, from S11 we can make a cake and reach a state that satisfies the goal conditions!

Make Cake (mike)

S5

S6

S7


S8

S9


S10


Go Shopping (mike)

Make Cake (mike)


Go Shopping (mike)

S12





GOAL!

Goal conditions:

Person(mike)

HasCake(mike),

EatenCake(mike)

-Hungry(mike)

Finally, from S11 we can make a cake and reach a state that satisfies the goal conditions!

We can do a few more things from S12 that will take us back to other states, but lets ignore them.

Make Cake (mike)

S11 G

Make

S9 Eat A

S10 Shop

S7

S5

Shop

Wait

Shop

Eat A

S3 Shop

S4

Shop

Make

Wait

Shop

S1

Make

Wait

S2 Shop

Shop

I

Shop

Wait

Backtrack to find mutliple paths/plans.

And we can backtrack though the graph (not the search tree, which we didn’t show) to find possible paths.

Points to note

Naive searches are not going to scale.

Limitations regarding expression

Really for deterministic domains.

Can have difficulties with disjunctions (numerics a case in point).

Specifications of actions VERY important.

Can overcome expressive limitations

Can avoid pointless searching.

Loops in the search space

Orphan nodes in the search space

Potentially multiple goal nodes.

Potentially multiple paths from the initial state to a goal node.

Often due to the fact that we can perform the same steps in multiple orders.

Solving PDDL problems

Planning as Partial Ordering

A partial ordering is just a directed, acyclic graph.

Idea:

- Make use of the ability to search backwards.

- Deal with sets of actions that can be performed in multiple orders.

Planning as Partial Ordering: Basic Idea THIS IS NOT A ROBUST ALGORITHM: IT REQUIRES CONSTRAINTS ON THE PROBLEM SPECIFICATION TO AVOID ERRORS. IT IS HERE TO GIVE YOU AN IDEA OF PARTIAL ORDERED PLANNING.

1. Enter initial state in a graph, and place the literals true at the initial state in a ’Base’ list.

2. Enter the goal state in the graph, specifying the required literals. This is a list of ’open’ literals. Literals are ’closed’ by placing an arrow to them from an action node or the initial state node.

3. While there are open literals remaining that can be closed:

I. If there are open literals that are in the base list or any effect lists, close them by drawing an arrow from the initial state/action to the literals.

II. If there are open literals which are the effect of actions:

1. Select an action whose effects are consistent with the other literals in the specific list.

2. Placing it as a node on the graph.

3. Place the effects of the action in a effect list.

4. Close the open literal by drawing an arrow from the relevant item in the effect list to the literal.

5. Then place the preconditions for the action is a new list of open literals.

4. If the above terminates with open literals remaining, backtrack and try to use different actions. (You would want to choose routes cleverly in real problems.)

Goal: Person(mike), -Hungry(mike), EatenCake(mike), HasCake(mike)

Base list: Person(mike) -Hungry(mike) -HasCake(mike) -HasMix(mike) -EatenCake(mike)

We set up: • The base list from the initial state • The initial open literal list from

the goal state .



We can immediately close two literals in the goal list, since they are present in the base list.



Now we want to close ’EatenCake(mike)’. Only ’Eat Cake A’ can do this, so we add this action...



Person(mike) HasCake(mike) Hungry(mike)

EatenCake(mike) -Hungry(mike)

Now we want to close ’EatenCake(mike)’. Only ’Eat Cake A’ can do this, so we add this action...





From the effects list of the ’Eat Cake A’ action, we can close ’EatenCake(mike)’.





And we close the ’Person(mike)’ literal on the new open literal list, since it is present in the base list.




Person(mike) HasMix(mike)

HasCake(mike)

Now we want to close ’HasCake(mike)’. Only ’Make Cake’ can do this, so we add this action and use it to remove the literal.







HasCake(mike)

And we close what we can from the new open literal list.






HasCake(mike)


HasCake(mike)

We want to close ’HasCake(mike)’ again, but in a different list. So we add ’Make Cake’ again...






HasCake(mike)


HasCake(mike)

And we close the open literal






HasCake(mike)


HasCake(mike)

And we close what we can from the new open literal list.






HasCake(mike)


HasCake(mike)

Lets close both open ’HasMix’ literals with two seperate ’GoShopping’ actions.






HasCake(mike)


HasCake(mike)

Person(mike)

HasMix(mike)


Lets close both open ’HasMix’ literals with two seperate ’GoShopping’ actions.






HasCake(mike)


HasCake(mike)

Person(mike)

HasMix(mike)


We can close the open literals on these new lists directly from the base list. So these paths are completed.






HasCake(mike)


HasCake(mike)

Person(mike)

HasMix(mike)


This leaves us wanting to close ’Hungry(mike)’, so we add the ’Wait’ action.






HasCake(mike)


HasCake(mike)

Person(mike)

HasMix(mike)


Person(mike) -Hungry(mike)

Hungry(mike)

This leaves us wanting to close ’Hungry(mike)’, so we add the ’Wait’ action.






HasCake(mike)


HasCake(mike)

Person(mike)

HasMix(mike)


Person(mike) -Hungry(mike)

Hungry(mike)

Now we can close the remaining open literals from the base list and we are done!

Goal

Initial state

Using action names, here is our partial order plan...

Go Shopping (mike)

Go Shopping (mike)

Make Cake (mike)

Wait (mike)

Make Cake (mike)

Eat Cake A (mike)

Goal

Initial state

We can perform actions in any order that does not violate the partial order given here.

Go Shopping (mike)

Go Shopping (mike)

Make Cake (mike)

Wait (mike)

Make Cake (mike)

Eat Cake A (mike)

Note: Avoid extra work

Although we didn’t see it in our example, we can close open literals from previously placed actions.

Goal: Attached(painting) Attached(poster)

Base List HasHammer(mike) Unattached(poster) Unattached(painting)



HasTape(mike) Unattached(poster)

Attached(poster) -HasTape(mike)



Add ’TapePoster(mike,poster)’ action.




HasNail(mike) HasHammer(mike) Unattached(painting)

Attached(painting) -HasNail(mike)




Add ’NailPainting(mike,painting)’ action.




Add ’GetSupplies(mike)’ action.


Attached(painting) -HasNail(mike) HasNail(mike)

HasTape(mike)







Use it to close open literals in both ’TapePoster(mike,poster) and ’NailPainting(mike,painting)’ actions.


Attached(painting) -HasNail(mike) HasNail(mike)

HasTape(mike)




Improvements to the algorithm

THESE IMPROVEMENTS SHOW HOW THE BASIC ALGORITHM CAN BE ADDED TO IN ORDER TO MAKE IT MORE ROBUST IN CERTAIN CIRCUMSTANCES. THEY ARE NOT A COMPREHENSIVE LIST OF DEVELOPING A POP PLANNER THAT CAN DEAL WITH ANY PDDL PROBLEM. IF YOU WANT TO IMPLEMENT A POP PLANNER YOU STILL NEED TO THINK CAREFULLY ABOUT PROBLEM SPECIFICATION. YOU SHOULD RESEARCH THE LITERATURE TO SEE HOW OTHERS HAVE DEALT WITH THESE ISSUES.

Dealing with unchanging fluents

Dealing with expended fluents

Dealing with unchanging fluents

To get the partial order we produced, we ignored that arrows were going from the initial state to all states. This was because Person(mike) was being satisfied at all actions and the goal by the base list.

It would not have mattered to have extra links from the initial state to all states. But it would be cluttered.

You can work out when links can be dropped by examining the actions in the partial order for preconditions that are not effected by the action


Problem:

The graph on the right will terminate using our basic algorithm.

But while we can do either action from the origin, we cannot do both.

HasNail(mike)


HasNail(mike)

Attached(poster) -HasNail(mike)


Base List HasNail(mike)

Dealing with expended fluents Problem:

The graph on the right will terminate using our basic algorithm.

But while we can do either action from the origin, we cannot do both.

HasNail(mike)


HasNail(mike)





Solution:

Track the inconsistencies between the preconditions and effects of an action to close open fluents.

Close base literals/effects if they are effected by an action that they are used to supply the preconditions for.

HasNail(mike)


HasNail(mike)




Planning Graphs

Planning Graphs

Directed graphs, organized into alternating state and action levels:

s0,a0,s1,a1...sn.

Each action at ai is connected to its precondition at si and effects at si+1.

Trivial ’no-op’ actions are given for all literals. No-op action for literal P:

Preconditions: P

Effects: P

Tracks (some) mutual exclusion relations.

Mutual Exclusion

Level ai contains all actions that can occur at level si and records mutual exclusion links between incompatible actions. Two actions are recorded as incompatible if:

Inconsistent effects – one action’s effect negates an effect of the other.

Interference – one action’s effect is the negation of a precondition of the other.

Competing needs – a precondition for one action is marked as mutually exclusive to a precondition for the other.

Level si+1 contains all literals could result from any subset of compatible actions at level ai and records mutual exclusion links between incompatible literals. Two literals are recorded as incompatible if:

Contradictory – One is the negation of the other.

Inconsistent support – If each possible pair of actions that could produce the literals are mutually exclusive.

Note: Not all mutual exclusions are tracked. Literals can appear before they are actually possible!

Planning Graph

Person(mike) -Hungry(mike) -HasCake(mike) -HasMix(mike) -EatenCake(mike)

S0

The first state layer has all the literals true in the initial state.

Planning Graph


S0

Each action layer has all the actions that could be perform given the first state

layer.

Actions point to their preconditions.

Trivial actions are represented by ’□’.

□

Go Shopping Wait □

□

□

□

S0

Planning Graph


S0 Each state layer now has all literals that are the result of any action in the previous action later. Each state points to the actions it is an

effect of.

Trivial actions ensure that the state layers are monotonically

increasing (we never loose states).

□


□

□

□

A0

Person(mike)

Hungry(mike) -Hungry(mike)

-HasCake(mike) HasMix(mike) -HasMix(mike)

-EatenCake(mike)

S1

Planning Graph


S0

□


□

□

□

A0

Person(mike)



-EatenCake(mike)

S1

And we enter mutex (mutual exclusion) relations.

Two actions are recorded as incompatible if: • one action’s effect

negates an effect of the other.

• one action’s effect is the negation of a precondition of the other.

• a precondition for one action is marked as exclusive to a precondition for the other.

Planning Graph


S0

□


□

□

□

A0

Person(mike)



-EatenCake(mike)

S1 And we enter mutex (mutual exclusion) relations.

Two states are recorded as incompatible if: • One is the negation

of the other. • If each possible pair

of actions that could produce the literals are mutually exclusive.

Planning Graph

Person(mike)



-EatenCake(mike)

S1

□

Go Shopping □ □

Wait □ □ □

Make Cake □

A1

And we proceed. Note we can create the graph iteratively... And at each stage the number of actions/literals increases...

Person(mike)

Hungry(mike) -Hungry(mike) HasCake(mike) -HasCake(mike) HasMix(mike) -HasMix(mike)

-EatenCake(mike)

S2

Planning Graph

Person(mike)



-EatenCake(mike)

S1

□

Go Shopping □ □

Wait □ □ □

Make Cake □

A1

Person(mike)


-EatenCake(mike)

S2 Two actions are recorded as incompatible if: • one action’s effect

negates an effect of the other.

• one action’s effect is the negation of a precondition of the other.

• a precondition for one action is marked as exclusive to a precondition for the other.

Planning Graph

Person(mike)



-EatenCake(mike)

S1

□

Go Shopping □ □

Wait □ □ □

Make Cake □

A1

Person(mike)


-EatenCake(mike)

S2


of the other. • If each possible pair

of actions that could produce the literals are mutually exclusive.

Planning Graph

Person(mike)



-EatenCake(mike)

S1

□

Go Shopping □ □

Wait □ □ □

Make Cake □

A1

Person(mike)


-EatenCake(mike)

S2


of the other. • If each possible

pair of actions that could produce the literals are mutually exclusive.

Planning Graph

Person(mike)



-EatenCake(mike)

S1

□

Go Shopping □ □

Wait □ □ □

Make Cake □

A1

Person(mike)


-EatenCake(mike)

S2

We continue in this iterative fashion until two consecutive levels are identical. At this point we say the graph has ‘leveled off’ and terminate.

What use are Planning Graphs?

1. If any goal literal fails to appear in the final state level, then the problem is unsolvable.

2. The number of levels between a state and the presence of all goal literals can be used to give an admissable heuristic (for, e.g. A*).

1. The max-level heuristic given the number of levels until all goal literals are in the graph.

2. The set-level heuristic given the number of levels until all goal literals are in the graph AND there are no mutexs between any pair of goal literals.

This is unsurprising: We have effectively relaxed constraint on the problem, which is a general way of obtaining heuristics.

3. We can obtain a plan directly of the Planning Graph using the Graph Planning Algorithm.

Graph Planning Algorithm

Set initial state as current state and then repeat until termination:

1. While at least one goal literals is absent at current state, Sc, OR there exists at least one mutex between goal literals:

Create the next action and state levels, Ac and Sc+1 and then set current state to Sc+1.

2. Otherwise, run the GP-Termination algorithm to see if the algorithm terminated. If it does not, create the next action and state levels, Ac and Sc+1 and set the current state to Sc+1.

GP-Termination Algorithm

1. Set the last level of the planning graph as the initial state for a new search, and initialize a list of literals to be satisfied from the goal state of the planning problem.

2. The actions available in a state at level Si are any non-mutually exclusive sub-set of the actions whose preconditions are themselves non-mutually exclusive.

3. When transitioning from one state to another, the new state has as its goal list the set of preconditions for actions involved in the transition.

4. If a state is found with an empty goal list, a solution path has been found and we termination with success.

5. If no such state exists, no solution is currently possible and we record which goals were unsatisfiable from the initial level. (This is used in terminating the algorithm with a failure - see below - but we can also use this to terminate fruitless searches at later iterations.)

6. If Sc = Sc-1 AND the goals that were unsatisfiable at Sc are the same as those that were unsatisfiable at Sc-1 then we terminate with failure.

Planning Graph

Person(mike)


-EatenCake(mike)

S2

□ Go Shopping

Wait Make Cake Eat Cake A Eat Cake B

□ □ □ □ □ □ □

A2

Person(mike) -Person(mike)


EatenCake(mike) -EatenCake(mike)

S3

Planning Graph

Person(mike)


-EatenCake(mike)

S2

□

Go Shopping Wait

Make Cake Eat Cake A Eat Cake B

□ □ □ □ □ □

□

A2




S3

We’ll do one more level, but stop showing arrows for trivial actions. We use line alignment instead.

Planning Graph

Person(mike)


-EatenCake(mike)

S2

□

Go Shopping Wait


□ □ □ □ □ □

□

A2




S3

‘Eat Cake B’ is mutex with all other non-trivial actions, since it kills Mike.

Planning Graph

Person(mike)


-EatenCake(mike)

S2

□

Go Shopping Wait


□ □ □ □ □ □

□

A2




S3 There are probably more mutexs at S3. But I’ve had enough! All goal literals (indeed all literals period) are in S3. So the heuristic value for the origin, using the max-level heuristic, is 3. The literals of the goal state have mutexes between them. So we would not yet be ready to backtrack in the Graph Planning algorithm. Nor would we be able to calculate the set-level heuristic.

Lecture 4 - Planning

Documents

Transcript of Lecture 4 - Planning