
1001 Ways to Skin a Planning Graph for Heuristic Fun and Profit

Subbarao Kambhampati

Arizona State University

http://rakaposhi.eas.asu.edu

(With tons of help from Daniel Bryce, Minh Binh Do, Xuan Long Nguyen, Romeo Sanchez Nigenda, Biplav Srivastava, Terry Zimmerman)

Funding from NSF & NASA

WMD-in-the-toilet: “After the flush, you may find that there were no bombs to begin with”

Planning Graph and Projection

• Envelope of Progression Tree (Relaxed Progression)
  – Proposition lists: Union of states at kth level
  – Mutex: Subsets of literals that cannot be part of any legal state
• Lowerbound reachability information

[Figure: progression tree and planning graph; labels include states p, pqrs and actions A1, A2, A3, A4]

[Blum&Furst, 1995] [ECP, 1997]

Planning Graphs can be used as the basis for heuristics!

And PG Heuristics for all..

– Classical (regression) planning: AltAlt (AAAI 2000; AIJ 2002); AltAltp (JAIR 2003)
  • Serial vs. Parallel graphs; Level and Adjusted heuristics; Partial expansion
– Graphplan style search: GP-HSP (AIPS 2000); PEGG (IJCAI 2003; AAAI 1999)
  • Variable/Value ordering heuristics based on distances
– Partial order planning: RePOP (IJCAI 2001)
  • Mutexes used to detect Indirect Conflicts
– Metric Temporal Planning: Sapa (ECP 2001; AIPS 2002; JAIR 2003)
  • Propagation of cost functions; Phased relaxation
– Conformant Planning: CAltAlt (ICAPS Uncertainty Wkshp, 2003)
  • Multiple graphs; Labelled graphs

I. PG Heuristics for State-space (Regression) planners

[AAAI 2000; AIPS 2000; AIJ 2002; JAIR 2003]

Problem: Given a set of subgoals (regressed state) estimate how far they are from the initial state

[Figure: AltAlt architecture. Components: Problem Specification (Initial and Goal State); Action Templates; Graphplan Graph Extension Phase (based on STAN); Planning Graph; Extraction of Heuristics; Heuristic; Actions in the Last Level; Regression Planner (based on HSP-R)]

Planning Graphs: Optimistic Projection of Achievability

[Figure: Grid Problem with its initial state and goal state, and the first levels of its planning graph]

Prop list, Level 0: At(0,0), Key(0,1)
Action list, Level 0: Move(0,0,0,1), Move(0,0,1,0)
Prop list, Level 1: At(0,0), Key(0,1), At(0,1), At(1,0)
Action list, Level 1: Move(0,1,1,1), Move(1,0,1,1), Pick_key(0,1), …
Prop list, Level 2: At(0,0), Key(0,1), At(0,1), At(1,0), At(1,1), Have_key, ~Key(0,1), …
Mutexes are marked with x in the figure.

Cost of a Set of Literals

• Serial PG: PG where any pair of non-noop actions are marked mutex
• lev(p): index of the first level at which p comes into the planning graph
• lev(S): index of the first level where all props in S appear non-mutexed.
  – If there is no such level, then
    • If the graph is grown to level off, then ∞
    • Else k+1 (k is the current length of the graph)

Heuristics: Sum, Set-Level, Partition-k, Adjusted Sum, Combo, Set-Level with memos

Sum: $h(S) = \sum_{p \in S} lev(\{p\})$
Set-Level: $h(S) = lev(S)$ (Admissible)
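The level-based costs above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the planning graph is handed in as precomputed proposition levels and per-level mutex pairs, and the names `lev`, `h_sum`, `h_set_level`, `PROP_LEVELS` and `MUTEX_LEVELS` are made up for this sketch, not taken from AltAlt. The "k+1" fallback is read here as one past the index of the last level built.

```python
from itertools import combinations

INF = float("inf")

def lev(props, prop_levels, mutex_levels, leveled_off=False):
    """Index of the first level where all of `props` appear pairwise non-mutex."""
    props = frozenset(props)
    for k, (level, mutexes) in enumerate(zip(prop_levels, mutex_levels)):
        present = props <= level
        clash = any(frozenset(pair) in mutexes for pair in combinations(props, 2))
        if present and not clash:
            return k
    # No such level: infinity if the graph has leveled off,
    # otherwise one past the last level index built so far (assumed reading of k+1).
    return INF if leveled_off else len(prop_levels)

def h_sum(S, prop_levels, mutex_levels, leveled_off=False):
    """Sum heuristic: add up the levels of the individual propositions."""
    return sum(lev({p}, prop_levels, mutex_levels, leveled_off) for p in S)

def h_set_level(S, prop_levels, mutex_levels, leveled_off=False):
    """Set-level heuristic: level of the whole set, taking mutexes into account."""
    return lev(S, prop_levels, mutex_levels, leveled_off)

# Hypothetical three-level graph (not the grid example from the slides).
PROP_LEVELS = [
    frozenset({"p"}),
    frozenset({"p", "q", "r"}),
    frozenset({"p", "q", "r", "s"}),
]
MUTEX_LEVELS = [
    set(),
    {frozenset({"q", "r"})},  # q and r are mutex at level 1 only
    set(),
]
GOALS = {"q", "r", "s"}

print(h_sum(GOALS, PROP_LEVELS, MUTEX_LEVELS))        # 1 + 1 + 2 = 4
print(h_set_level(GOALS, PROP_LEVELS, MUTEX_LEVELS))  # 2
```

On this toy graph the sum heuristic counts each subgoal separately, while set-level waits for the first level where the subgoals co-occur without a mutex.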


PROBLEM      Level       Sum         AdjSum2M
Gripper-25   -           69/0.98     67/1.57
Gripper-30   -           81/1.63     77/2.83
Tower-7      127/1.28    127/0.95    127/1.37
Tower-9      511/47.91   511/16.04   511/48.45
8-Puzzle1    31/6.25     39/0.35     31/0.69
8-Puzzle2    30/0.74     34/0.47     30/0.74
Mystery-6    -           -           16/62.5
Mystery-9    8/0.53      8/0.66      8/0.49
Mprime-3     4/1.87      4/1.88      4/1.67
Mprime-4     8/1.83      8/2.34      10/1.49
Aips-grid1   14/1.07     14/1.12     14/0.88
Aips-grid2   -           -           34/95.98

Adjusting the Sum Heuristic

• Start with Sum heuristic and adjust it to take subgoal interactions into account
  – Negative interactions in terms of “degree of interaction”
  – Positive interactions in terms of co-achievement links
• Ignore negative interactions when accounting for positive interactions (and vice versa)


[AAAI 2000]

$H_{AdjSum2M}(S) = length(RelaxedPlan(S)) + \max_{p,q \in S} \delta(p,q)$

where $\delta(p,q) = lev(\{p,q\}) - \max\{lev(p), lev(q)\}$ /* Degree of negative interaction */
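A small sketch of how the adjusted-sum value could be assembled once levels and a relaxed plan length are available. Everything here is an assumption for illustration: the level table `LEV` is hand-written rather than read off a real planning graph, the relaxed plan length is passed in as a number, and the function names are hypothetical. It also assumes all levels involved are finite.

```python
from itertools import combinations

# Hand-written level table for a hypothetical problem: lev of singletons and of the pair.
LEV = {
    frozenset({"q"}): 1,
    frozenset({"r"}): 1,
    frozenset({"q", "r"}): 3,
}

def lev(props):
    """Look up the level of a set of propositions (assumed precomputed and finite)."""
    return LEV[frozenset(props)]

def delta(p, q):
    """Degree of negative interaction between p and q."""
    return lev({p, q}) - max(lev({p}), lev({q}))

def h_adjsum2m(S, relaxed_plan_length):
    """Relaxed plan length plus the largest pairwise interaction degree in S."""
    worst = max((delta(p, q) for p, q in combinations(sorted(S), 2)), default=0)
    return relaxed_plan_length + worst

print(delta("q", "r"))               # 3 - max(1, 1) = 2
print(h_adjsum2m({"q", "r"}, 2))     # 2 + 2 = 4
```

The relaxed plan accounts for positive interactions (shared actions are counted once), and the delta term adds a penalty for the most strongly interfering pair of subgoals.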

Optimizations in Heuristic Computation

• Taming Space/Time costs

• Bi-level Planning Graph representation

• Partial expansion of the PG (stop before level-off)

– It is FINE to cut corners when using PG for heuristics (instead of search)!!

• Branching factor can still be quite high

– Use actions appearing in the PG
  • Select actions in lev(S) vs Levels-off

Heuristic extracted from partial graph vs. leveled graph

[Figure: comparison over Problems 1 to 161; series: Levels-off, Lev(S)]

[Figure: Example: Levels off. Goals C, D are present; labels: Trade-off, Discarded]

AltAlt Performance

[Figure: Logistics Domain (AIPS-00), per problem; series: STAN3.0, HSP2.0, AltAlt1.0]

[Figure: Schedule Domain (AIPS-00), problems 1 to 161; series: STAN3.0, HSP2.0, AltAlt1.0]

Problem sets from IPC 2000

[Figure: ZenoTravel AIPS-02, problems 1 to 15; series: AltAlt, AltAlt-PostProc, AltAlt-p]

Even Parallel Plans aren’t safe..

AltAltp [JAIR 2003]

[Figure: AltAltp architecture. Components: Problem Spec (Init, Goal state); Action Templates; Graphplan Plan Extension Phase (based on STAN); Parallel Planning Graph; Extraction of Heuristics; Heuristics; Actions in the Last Level; Node Expansion (Fattening); Node Ordering and Selection; Plan Compression Algorithm (PushUp); Solution Plan]

[Figure: Logistics AIPS-00, problems 1 to 61; series: Altalt-p, Blackbox, LPG 2nd]

• Serial graph over-estimates
  – Use “parallel” rather than serial PG as the basis for heuristics
• Projection over sets of actions too costly
  – Select the branch with the best action and fatten it
• Use “push-up” to make the partial plans more parallel

II. PG heuristics for Graphplan..

PG Heuristics for Graphplan(!)

• Goal/Action Ordering Heuristics for Backward Search
• Propositions are ordered for consideration in decreasing value of their levels.
• Actions supporting a proposition are ordered for consideration in increasing values of their costs
  – Cost of an action = 1 + Cost of its set of preconditions
• Use of level heuristics improves the performance significantly.
  – The heuristics are surprisingly insensitive to the length of the planning graph
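The two ordering rules above can be sketched as plain sorting functions. This is a hedged illustration, not GP-HSP code: the level table `LEVELS`, the supporter map `SUPPORTERS`, and the choice of the maximum precondition level as the "cost of a set of preconditions" are all assumptions made for the example (a sum or a mutex-aware set level would fit the same interface).

```python
# Hypothetical proposition levels read off some planning graph.
LEVELS = {"p": 0, "q": 2, "r": 1}

def order_goals(goals, levels):
    """Consider propositions in decreasing order of level (hardest first)."""
    return sorted(goals, key=lambda p: (-levels[p], p))

def set_cost(preconds, levels):
    """Assumed cost of a precondition set: the largest level among its members."""
    return max((levels[p] for p in preconds), default=0)

def action_cost(preconds, levels):
    """Cost of an action = 1 + cost of its set of preconditions."""
    return 1 + set_cost(preconds, levels)

def order_actions(supporters, levels):
    """Consider supporting actions in increasing order of cost."""
    return sorted(supporters, key=lambda a: (action_cost(supporters[a], levels), a))

# Hypothetical supporters of one goal: action name -> preconditions.
SUPPORTERS = {"A1": {"q"}, "A2": {"p"}, "A3": {"p", "r"}}

print(order_goals({"p", "q", "r"}, LEVELS))   # ['q', 'r', 'p']
print(order_actions(SUPPORTERS, LEVELS))      # ['A2', 'A3', 'A1']
```

Ties are broken alphabetically here only to make the output deterministic.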


[AIPS 2000]

MOP
Problem      +3 levels         +5 levels         +10 levels
             L        T        L        T        L        T
bw-large-a   12/12    .007     12/12    .008     12/12    .01
bw-large-b   18/18    0.21     18/18    0.21     18/18    0.25
bw-large-c   28/28    4.13     28/28    4.18     28/28    7.4
huge-fct     18/18    0.01     18/18    0.02     18/18    0.02
bw-prob04    -        >30      -        >30      -        >30
rocket-a     8/29     .006     8/29     .007     8/29     .009
rocket-b     9/32     0.01     9/32     0.01     9/32     0.01

…And then state-space heuristics for Graphplan (PEGG)

1: Capture a state space view of Graphplan’s search in a search trace

[Figure: search trace over the Planning Graph (proposition levels 0 to 5), showing action assignments and regressed ‘states’; No solution? extend graph…]

PEGG now competitive with a heuristic state space planner

All entries: cpu sec (steps/acts)

Problem            Graphplan (GP-e)   PEGG-so       PEGG               AltAlt (Lisp version) heuristics:
                                                                       adjusum2        combo
bw-large-b         13.4 (18/18)       12.2          3.1 (18/18)        87.1 (/18)      20.5 (/28)
bw-large-c         s                  1104          66.9 (28/28)       738 (/28)       114.9 (/38)
bw-large-d         s                  pe            340 (38/38)        2350 (/36)      *
rocket-ext-a       3.5 (7/36)         2.8 (7/34)    1.1 (7/34)         43.6 (/40)      1.26 (/34)
att-log-a          31.8 (11/79)       2.6 (11/72)   2.2 (11/62)        36.7 (/56)      2.27 (/64)
Gripper-15         s                  47.5          16.7 (36/45)       14.1 (/45)      16.98 (/45)
Gripper-20         s                  s             110.8 (40/59)      38.2 (/59)      20.92 (/59)
Tower-9            s (511/511)        118           23.6 (511/511)     121 (/511)      *
8puzzle-1          95.2 (31/31)       31.1          9.2 (31/31)        143.7 (/31)     119.5 (/39)
8puzzle-2          87.5 (30/30)       31.3          7.0 (30/30)        348.3 (/30)     50.5 (/48)
AIPS 1998
grid-y-1           16.7 (14/14)       16.8          16.8 (14/14)       739.4 (/14)     640.5 (/14)
mprime-1           4.8 (4/6)          3.6 (4/6)     2.1 (4/6)          722.6 (/4)      79.6 (/4)
AIPS 2002
driverlog-2-3-6b   27.5 (7/20)        1.9           1.9 (7/20)         232

[IJCAI 2003]

In the beginning it was all POP.

Then it was cruelly UnPOPped

The good times return with Re(vived)POP

III. PG Heuristics for PO Planners

POP Algorithm

1. Plan Selection: Select a plan P from the search queue
2. Flaw Selection: Choose a flaw f (open cond or unsafe link)
3. Flaw resolution:
   – If f is an open condition, choose an action S that achieves f
   – If f is an unsafe link, choose promotion or demotion
   – Update P
   – Return NULL if no resolution exists
4. If there is no flaw left, return P

[Figure: 1. Initial plan; 2. Plan refinement (flaw selection and resolution)]

Choice points
• Flaw selection (open condition? unsafe link? Non-backtrack choice)
• Flaw resolution/Plan Selection (how to select (rank) partial plan?)

PG Heuristics for Partial Order Planning

• Distance heuristics to estimate cost of partially ordered plans (and to select flaws)
  – If we ignore negative interactions, then the set of open conditions can be seen as a regression state
• Mutexes used to detect indirect conflicts in partial plans
  – A step threatens a link if there is a mutex between the link condition and the step’s effect or precondition
  – Post disjunctive precedences and use propagation to simplify

$S_k \prec S_i \;\lor\; S_j \prec S_k$ if $mutex(p,q)$ or $mutex(p,r)$

(for a link from $S_i$ to $S_j$ with condition p, where q is an effect and r a precondition of step $S_k$)
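The mutex-based threat test can be sketched as follows. This is an illustrative sketch, not RePOP's implementation: steps are plain dictionaries with `pre` and `eff` sets, a causal link is a `(producer, condition, consumer)` triple, and the mutex relation is a set of proposition pairs assumed to come from a planning graph. All names and the example data are hypothetical.

```python
def threatens(step, link, mutex):
    """True if some effect or precondition of `step` is mutex with the link condition."""
    _producer, p, _consumer = link
    return any(frozenset({p, x}) in mutex for x in step["eff"] | step["pre"])

def disjunctive_precedence(step_name, link):
    """Orderings that resolve the threat: step before producer, or consumer before step."""
    producer, _p, consumer = link
    return ((step_name, producer), (consumer, step_name))

# Hypothetical data: the link S_i --at_A--> S_j and a step S_k that needs at_B.
MUTEX = {frozenset({"at_A", "at_B"})}
LINK = ("S_i", "at_A", "S_j")
S_K = {"pre": {"at_B"}, "eff": {"have_key"}}
S_M = {"pre": {"have_key"}, "eff": {"door_open"}}

print(threatens(S_K, LINK, MUTEX))            # True: precondition at_B is mutex with at_A
print(threatens(S_M, LINK, MUTEX))            # False
print(disjunctive_precedence("S_k", LINK))    # (('S_k', 'S_i'), ('S_j', 'S_k'))
```

A classical threat check would only look for an effect that deletes the link condition; the mutex test also catches steps that merely cannot coexist with it.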

RePOP’s Performance

Problem       UCPOP    RePOP         Graphplan   AltAlt
Gripper-8     -        1.01          66.82       .43
Gripper-10    -        2.72          47min       1.15
Gripper-20    -        81.86         -           15.42
Rocket-a      -        8.36          75.12       1.02
Rocket-b      -        8.17          77.48       1.29
Logistics-a   -        3.16          306.12      1.59
Logistics-b   -        2.31          262.64      1.18
Logistics-c   -        22.54         -           4.52
Logistics-d   -        91.53         -           20.62
Bw-large-a    45.78    (5.23) -      14.67       4.12
Bw-large-b    -        (18.86) -     122.56      14.14
Bw-large-c    -        (137.84) -    -           116.34

Written in Lisp, runs on Linux, 500MHz, 250MB

• RePOP implemented on top of UCPOP
  – Dramatically better than any other partial order planner
  – Competitive with Graphplan and AltAlt
  – VHPOP carried the torch at IPC 2002

[IJCAI, 2001]

You see, pop, it is possible to Re-use all the old POP work!

IV. PG Heuristics for Metric Temporal Planning

[Figure: Sapa architecture. Components: Planning Problem; Generate start state; Queue of Time-Stamped states; Select state with lowest f-value; Satisfies Goals?; Partialize the p.c. plan; Return o.c and p.c plans; Expand state by applying actions; Heuristic estimation (Build RTPG; Propagate Cost functions; Extract relaxed plan; Adjust for Mutexes; Resources). Note: f can have both Cost & Makespan components]

[ECP 2001; AIPS 2002; ICAPS 2003; JAIR 2003]

Multi-Objective Nature of MTP

• Plan quality in Metric Temporal domains is inherently Multi-dimensional
  – Temporal quality (e.g. makespan, slack)
  – Plan cost (e.g. cumulative action cost, resource consumption)
• Necessitates multi-objective search
  – Modeling objective functions
  – Tracking different quality metrics and heuristic estimation
• Challenge: Inter-dependencies between different quality metrics
  – Typically cost will go down with higher makespan…

SAPA’s approach

• Use a temporal version of the Planning Graph (Smith & Weld) structure to track the time-sensitive cost function:
  – Estimation of the earliest time (makespan) to achieve all goals.
  – Estimation of the lowest cost to achieve goals
  – Estimation of the cost to achieve goals given the specific makespan value.
• Use this information to calculate the heuristic value for the objective function involving both time and cost

Challenge: How to propagate cost over planning graphs?

[Figure: temporal planning graph for getting to Los Angeles via Phoenix, with actions Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), Airplane(P,LA) placed at t = 0, t = 0.5, t = 1, t = 1.5, t = 10]

Search through time-stamped states

$S = (P, M, \Pi, Q, t)$

• P: Set <pi,ti> of predicates pi and the time of their last achievement ti < t.
• M: Set of functions representing resource values.
• Π: Set of protected persistent conditions (could be binary or resource conds).
• Q: Event queue (contains resource as well as binary fluent events).
• t: Time stamp of S.

[Figure: durative action Flying, annotated with (in-city ?airplane ?city1), (fuel ?airplane) > 0, (in-city ?airplane ?city1), (in-city ?airplane ?city2), consume (fuel ?airplane)]

• Goal Satisfaction: $S = (P, M, \Pi, Q, t) \models G$ if $\forall \langle p_i, t_i \rangle \in G$ either:
  – $\exists \langle p_i, t_j \rangle \in P$, $t_j < t_i$ and no event in Q deletes $p_i$.
  – $\exists e \in Q$ that adds $p_i$ at time $t_e < t_i$.
• Action Application: Action A is applicable in S if:
  – All instantaneous preconditions of A are satisfied by P and M.
  – A’s effects do not interfere with Π and Q.
  – No event in Q interferes with persistent preconditions of A.
  – A does not lead to concurrent resource change
• When A is applied to S:
  – P is updated according to A’s instantaneous effects.
  – Persistent preconditions of A are put in Π
  – Delayed effects of A are put in Q.

Search: Pick a state S from the queue. If S satisfies the goals, end. Else non-deterministically do one of
  -- Advance the clock (by executing the earliest event in $Q_S$)
  -- Apply one of the applicable actions to S

Propagating Cost Functions

• Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
• Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
• Car(Tempe,LA): Cost: $100; Time: 10 hour
• Airplane(Phx,LA): Cost: $200; Time: 1.0 hour

[Figure: cost functions Cost(At(Phx)) and Cost(At(LA)) plotted against time (0, 0.5, 1.5, 2, 10), with actions Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), Airplane(P,LA); label: Cost(At(Phx)) = Cost(Flight(Phx,LA))]

Issues in Cost Propagation

Costing a set of literals
• $Cost(f,t) = \min \{ Cost(A,t) : f \in Effect(A) \}$
• $Cost(A,t) = Aggregate(Cost(f,t) : f \in Pre(A))$
  – Aggregate can be Sum or Max
  – Set-level idea would entail tracking costs of subsets of literals
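The propagation rules can be tried out on the Tempe/Phoenix/Los Angeles travel actions listed earlier in the deck. The sketch below is an assumption-laden illustration rather than Sapa's algorithm: it processes cost improvements in time order with a priority queue, represents each cost function as a list of (time, cost) breakpoints, adds the action's own cost on top of the aggregated precondition cost when passing cost to its effects, and stops at a fix-point (empty queue). The function and variable names are hypothetical.

```python
import heapq

def propagate_costs(init_facts, actions, aggregate=sum):
    """Return fact -> list of (time, cost) breakpoints; cost only improves over time."""
    cost_fn = {}
    queue = [(0.0, f, 0.0) for f in init_facts]
    heapq.heapify(queue)

    def current(fact):
        return cost_fn[fact][-1][1] if fact in cost_fn else float("inf")

    while queue:
        t, fact, cost = heapq.heappop(queue)
        if cost >= current(fact):
            continue  # not an improvement, nothing to propagate
        cost_fn.setdefault(fact, []).append((t, cost))
        for _name, pre, eff, duration, act_cost in actions:
            if fact in pre and all(p in cost_fn for p in pre):
                # Aggregate precondition costs, then add the action's own cost (assumption).
                c = aggregate(current(p) for p in pre) + act_cost
                for e in eff:
                    heapq.heappush(queue, (t + duration, e, c))
    return cost_fn

# (name, preconditions, effects, duration in hours, cost in dollars), from the slide.
ACTIONS = [
    ("Shuttle(Tempe,Phx)",    ("At(Tempe)",), ("At(Phx)",), 1.0, 20.0),
    ("Helicopter(Tempe,Phx)", ("At(Tempe)",), ("At(Phx)",), 0.5, 100.0),
    ("Car(Tempe,LA)",         ("At(Tempe)",), ("At(LA)",), 10.0, 100.0),
    ("Airplane(Phx,LA)",      ("At(Phx)",),   ("At(LA)",),  1.0, 200.0),
]

COST_FN = propagate_costs(["At(Tempe)"], ACTIONS)
print(COST_FN["At(Phx)"])  # [(0.5, 100.0), (1.0, 20.0)]
print(COST_FN["At(LA)"])   # [(1.5, 300.0), (2.0, 220.0), (10.0, 100.0)]
```

Passing `aggregate=max` instead of the default `sum` switches to the other aggregation option; with single-precondition actions as here, both give the same result.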

Termination Criteria

• Deadline Termination: Terminate at time point t if:
  – $\forall$ goal G: $Deadline(G) \le t$, or
  – $\exists$ goal G: $(Deadline(G) < t) \wedge (Cost(G,t) = \infty)$
• Fix-point Termination: Terminate at time point t where we can not improve the cost of any proposition.
• K-lookahead approximation: At t where $Cost(g,t) < \infty$, repeat the process of applying (set) of actions that can improve the cost functions k times.

Heuristics based on cost functions

• If we want to minimize makespan:
  – $h = t_0$
• If we want to minimize cost
  – $h = CostAggregate(G, t_\infty)$
• If we want to minimize a function f(time,cost) of cost and makespan
  – $h = \min f(t, Cost(G,t))$ s.t. $t_0 \le t \le t_\infty$
• E.g. f(time,cost) = 100·makespan + Cost, then h = 100×2 + 220 at $t_0 \le t = 2 \le t_\infty$

[Figure: Cost(At(LA)) against time, marking $t_0 = 1.5$ (time of earliest achievement), t = 2, and $t_\infty = 10$ (time of lowest cost)]
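The three heuristic choices can be checked numerically on the Cost(At(LA)) step function from the travel example. The breakpoints below are worked out by hand from the action costs and durations given in the deck (helicopter plus airplane at 1.5 hours, shuttle plus airplane at 2 hours, car at 10 hours); the function names are made up for this sketch, and evaluating f only at breakpoints assumes f does not decrease as time grows.

```python
# (time, cost) breakpoints of Cost(At(LA), t): from each time on, the goal costs this much.
COST_AT_LA = [(1.5, 300.0), (2.0, 220.0), (10.0, 100.0)]

def h_makespan(breakpoints):
    """Minimize makespan: earliest time t0 at which the goal is achievable."""
    return breakpoints[0][0]

def h_cost(breakpoints):
    """Minimize cost: the cost reached at t_infinity, where it stops improving."""
    return breakpoints[-1][1]

def h_combined(breakpoints, f):
    """Minimize f(t, Cost(G, t)) over t0 <= t <= t_infinity (checked at breakpoints)."""
    return min(f(t, c) for t, c in breakpoints)

def objective(t, c):
    # The example objective from the slide: 100 * makespan + cost.
    return 100 * t + c

print(h_makespan(COST_AT_LA))             # 1.5
print(h_cost(COST_AT_LA))                 # 100.0
print(h_combined(COST_AT_LA, objective))  # 420.0, i.e. 100 x 2 + 220 at t = 2
```

The candidates are 450 at t = 1.5, 420 at t = 2 and 1100 at t = 10, so the combined objective picks the shuttle-plus-airplane option.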

Direct

• Extract a relaxed plan using h as the bias
  – If the objective function is f(time,cost), then action A (to be added to RP) is selected such that
    $f(t(RP+A), C(RP+A)) + f(t(G_{new}), C(G_{new}))$ is minimal,
    where $G_{new} = (G \cup Precond(A)) \setminus Effects(A)$

Using Relaxed Plan

Phased Relaxation

The relaxed plan can be adjusted to take into account constraints that were originally ignored

• Adjusting for Mutexes: Adjust the make-span estimate of the relaxed plan by marking actions that are mutex (and thus cannot be executed concurrently)
• Adjusting for Resource Interactions: Estimate the number of additional resource-producing actions needed to make up for any resource short-fall in the relaxed plan

$C = C + \sum_R \lceil (Con(R) - (Init(R) + Pro(R))) / \Delta_R \rceil \cdot C(A_R)$
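The resource adjustment can be sketched as a small function. This is an illustration under assumptions, not Sapa's code: each resource is described by a hypothetical dictionary holding how much the relaxed plan consumes and produces, the initial level, the amount one producing action adds, and that action's cost. Rounding up to whole actions and skipping resources with no shortfall are choices made here for the sketch.

```python
import math

def adjust_for_resources(relaxed_plan_cost, resources):
    """Add the cost of the extra producing actions needed to cover each shortfall."""
    extra = 0.0
    for r in resources:
        shortfall = r["consumed"] - (r["initial"] + r["produced"])
        if shortfall > 0:
            needed = math.ceil(shortfall / r["delta"])  # whole producing actions
            extra += needed * r["producer_cost"]
    return relaxed_plan_cost + extra

# Hypothetical numbers: the relaxed plan burns 50 units of fuel, 20 are available,
# one refuel action adds 10 units and costs 5.
RESOURCES = [
    {"name": "fuel", "consumed": 50, "initial": 20, "produced": 0,
     "delta": 10, "producer_cost": 5.0},
]

print(adjust_for_resources(100.0, RESOURCES))  # 100 + 3 * 5 = 115.0
```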

Handling Cost/Makespan Tradeoffs

Results over 20 randomly generated temporal logistics problems that involve moving 4 packages between different locations in 3 cities:

$O = f(time,cost) = \alpha \cdot Makespan + (1 - \alpha) \cdot TotalCost$

[Figure: Cost variation and Makespan variation; axis ticks run from 0.1 to 1]

SAPA at IPC-2002

[Figures: Rover (time setting); Satellite (complex setting)]

[JAIR 2003]

V. PG Heuristics for Conformant Planning

[Figure: CAltAlt architecture. Components: IPC PDDL Parser; A* Search Engine (HSP-r); Heuristics; Planning Graph(s); Clausal States; Labels (CUDD); Model Checker (NuSMV). Legend: Off-The-Shelf, Custom. Edge labels: Input for, Validates, Extracted]

Conformant Planning as Regression

Actions:
• A1: M P => K
• A2: M Q => K
• A3: M R => L
• A4: K => G
• A5: L => G

Initially: (P V Q V R) & (~P V ~Q) & (~P V ~R) & (~Q V ~R) & M

Goal State: G

Regressed clausal states, starting from the goal:
• G
• (G V K) -- G or K must be true before A4 for G to be true after A4
• (G V K V L)
• (G V K V L V P) & M
• (G V K V L V P V Q) & M
• (G V K V L V P V Q V R) & M

Each Clause is Satisfied by a Clause in the Initial Clausal State -- Done! (5 actions)

Using a Single, Unioned Graph

• Union literals from all initial states into a conjunctive initial graph level
• Heuristic Estimate = 2
• Easy to implement
• Not effective
• Lose world specific support information
• Incorrect mutexes

Using Multiple Graphs

• Accurate Mutexes
• Moderate Implementation Difficulty
• Memory Intensive
• Heuristic Computation Can be costly

Unioning these graphs a priori would give much savings …

Using a Single, Labeled Graph

• Action Labels: Conjunction of Labels of Supporting Literals
• Literal Labels: Disjunction of Labels of Supporting Actions
• Heuristic Value = 5
• Memory Efficient
• Cheap Heuristics
• Scalable
• Extensible
• Tricky to Implement
• Benefits from BDD’s and a model checker

Label Key: ~Q & ~R; ~P & ~R; ~P & ~Q; (~P & ~R) V (~Q & ~R); (~P & ~R) V (~Q & ~R) V (~P & ~Q)

Label of a literal signifies the set of worlds in which it is supported -- Full support means all init worlds
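Label propagation can be sketched on the five-action example from the regression slide. This is a simplified illustration, not CAltAlt's implementation: labels are explicit sets of initial-world indices instead of BDDs, mutexes are ignored, and each action is flattened into a hypothetical `(name, conditions, effect)` triple that lumps the executability precondition M together with the effect condition. The function returns the first level at which the goal is supported in every initial world, which is a level-style measure and not the relaxed-plan value quoted on the slide.

```python
def label_levels(init_worlds, actions, goal, max_levels=10):
    """Propagate world labels level by level; return (first full-support level, labels)."""
    all_worlds = frozenset(range(len(init_worlds)))
    labels = {}
    for w, world in enumerate(init_worlds):
        for lit in world:
            labels[lit] = labels.get(lit, frozenset()) | {w}
    for level in range(max_levels + 1):
        if labels.get(goal, frozenset()) == all_worlds:
            return level, labels
        new = dict(labels)  # noops carry every label forward
        for _name, conds, effect in actions:
            if all(c in labels for c in conds):
                act_label = all_worlds
                for c in conds:
                    act_label &= labels[c]  # action label: conjunction over conditions
                if act_label:
                    # literal label: disjunction over supporting actions
                    new[effect] = new.get(effect, frozenset()) | act_label
        if new == labels:
            return None, labels  # fix-point without full support
        labels = new
    return None, labels

# Three initial worlds: exactly one of P, Q, R holds, and M holds in all of them.
INIT_WORLDS = [{"P", "M"}, {"Q", "M"}, {"R", "M"}]
ACTIONS = [
    ("A1", ("M", "P"), "K"),
    ("A2", ("M", "Q"), "K"),
    ("A3", ("M", "R"), "L"),
    ("A4", ("K",), "G"),
    ("A5", ("L",), "G"),
]

LEVEL, LABELS = label_levels(INIT_WORLDS, ACTIONS, "G")
print(LEVEL)                 # 2
print(sorted(LABELS["K"]))   # [0, 1]: K is supported only in the P-world and the Q-world
```

K and L each cover only some worlds at level 1; G reaches full support at level 2 because A4 and A5 together cover all three.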

CAltAlt Performance

• Label-graph based heuristics make CAltAlt competitive with the current best approaches

[Figure: Rovers Domain, problems 1 to 4; series: Single Sum, Multi Level, Multi RP, Union, Label Level, Label RP, CGP, HSCP, GPT, KACMBP]

[Figure: Logistics, problems 1 to 4; series: Label RP, CGP, HSCP, GPT, KACMBP]

The Damage until now..

– Graphplan style search: GP-HSP (AIPS 2000); PEGG (IJCAI 2003; AAAI 1999)

• Multiple graphs; Labelled graphs

Still to come: PG Heuristics for—

• Probabilistic Conformant Planning
• Conditional Planning
• Lifted Planning
• Trans-Atlantic camaraderie
• Post-war reconstruction
• Middle-east peace…

Meanwhile outside Tempe…

• Hoffmann’s FF uses relaxed plans from PG
• Geffner & Haslum derive DP-versions of PG-heuristics
• Gerevini & Serina’s LPG uses PG heuristics to cost the various repairs
• Smith back-propagates (convolves) probability distributions over PG to decide the contingencies worth focusing on
• Trinquart proposes a PG-clone that directly computes reachability in plan-space…
• …

Why do we love PG Heuristics?

• They work!
• They are “forgiving”
  – You don't like doing mutex? okay
  – You don't like growing the graph all the way? okay.
• Allow propagation of many types of information
  – Level, subgoal interaction, time, cost, world support, …
• Support phased relaxation
  – E.g. Ignore mutexes and resources and bring them back later…
• Graph structure supports other synergistic uses
  – e.g. action selection
• Versatility…

Versatility of PG Heuristics

• PG Variations
  – Serial
  – Parallel
  – Temporal
  – Labelled
• Propagation Methods
  – Level
  – Mutex
  – Cost
  – Label
• Planning Problems
  – Classical
  – Resource/Temporal
  – Conformant
• Planners
  – Regression
  – Progression
  – Partial Order
  – Graphplan-style
