Post on 15-Jan-2016
1001 Ways to Skin a Planning Graph for Heuristic Fun and Profit
Subbarao Kambhampati
Arizona State University
http://rakaposhi.eas.asu.edu
(With tons of help from Daniel Bryce, Minh Binh Do, Xuan Long Nguyen, Romeo Sanchez Nigenda, Biplav Srivastava, Terry Zimmerman)
Funding from NSF & NASA
WMD-in-the-toilet
“After the flush, you may find that there were no bombs to begin with”
Planning Graph and Projection
• Envelope of Progression Tree (Relaxed Progression)
  – Proposition lists: Union of states at kth level
  – Mutex: Subsets of literals that cannot be part of any legal state
• Lowerbound reachability information
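[Figure: progression tree from state p under actions A1 to A4 (states such as pq, pr, ps, pqr, pqs, pst), and the planning graph that envelopes it, with proposition lists p, pqrs, pqrst.]
[Blum&Furst, 1995] [ECP, 1997]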
Planning Graphs can be used as the basis for heuristics!
And PG Heuristics for all..
– Classical (regression) planning
  – AltAlt (AAAI 2000; AIJ 2002); AltAltp (JAIR 2003)
  • Serial vs. Parallel graphs; Level and Adjusted heuristics; Partial expansion
– Graphplan style search
  – GP-HSP (AIPS 2000); PEGG (IJCAI 2003; AAAI 1999)
  • Variable/Value ordering heuristics based on distances
– Partial order planning
  – RePOP (IJCAI 2001)
  • Mutexes used to detect Indirect Conflicts
– Metric Temporal Planning
  – Sapa (ECP 2001; AIPS 2002; JAIR 2003)
  • Propagation of cost functions; Phased relaxation
– Conformant Planning
  – CAltAlt (ICAPS Uncertainty Wkshp, 2003)
  • Multiple graphs; Labelled graphs
Caveat: “All Tempe, All the time”
I. PG Heuristics for State-space (Regression) planners
[AAAI 2000; AIPS 2000; AIJ 2002; JAIR 2003]
Problem: Given a set of subgoals (regressed state), estimate how far they are from the initial state
[Figure: AltAlt architecture. The problem specification (initial and goal state) and the action templates feed a Graphplan graph extension phase (based on STAN); the resulting planning graph supplies the extracted heuristic and the actions in the last level to a regression planner (based on HSP-R).]
Planning Graphs: Optimistic Projection of Achievability
[Figure: planning graph for the Grid Problem. Proposition lists at levels 0 to 2 (At(0,0), Key(0,1), At(0,1), At(1,0), At(1,1), Have_key, ~Key(0,1)), action lists at levels 0 and 1 (Move(0,0,0,1), Move(0,0,1,0), Move(0,1,1,1), Move(1,0,1,1), Pick_key(0,1), noops), mutexes marked with x, and the initial and goal states drawn on a grid with coordinates 0 to 2.]
Cost of a Set of Literals
• lev(p): index of the first level at which p comes into the planning graph
• lev(S): index of the first level where all props in S appear non-mutexed.
  – If there is no such level, then
    • If the graph is grown to level off, then $\infty$
    • Else k+1 (k is the current length of the graph)
• Serial PG: PG where any pair of non-noop actions is marked mutex
• Sum: $h(S) = \sum_{p \in S} lev(\{p\})$
• Set-Level: $h(S) = lev(S)$ (Admissible)
• Other variants: Partition-k, Adjusted Sum, Combo, Set-Level with memos
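The two basic heuristics above can be sketched in a few lines. This is a minimal illustration, assuming the planning graph is already available as a list of proposition sets and a list of mutex-pair sets, one per level; the function names and the hand-written grid levels are hypothetical and only the mutex pairs relevant to the example are listed.

```python
from itertools import combinations
from math import inf

def lev_set(S, prop_levels, mutex_levels, leveled_off=False):
    """First level where all props in S appear with no pair marked mutex."""
    S = set(S)
    for k, (props, mutexes) in enumerate(zip(prop_levels, mutex_levels)):
        if S <= props and not any(frozenset(pair) in mutexes
                                  for pair in combinations(S, 2)):
            return k
    # No such level: infinity if the graph has leveled off, otherwise one
    # past the index of the last level built so far.
    return inf if leveled_off else len(prop_levels)

def h_sum(S, prop_levels, mutex_levels, leveled_off=False):
    """Sum heuristic: add up the levels of the individual propositions."""
    return sum(lev_set({p}, prop_levels, mutex_levels, leveled_off) for p in S)

def h_set_level(S, prop_levels, mutex_levels, leveled_off=False):
    """Set-level heuristic: level of the whole set, mutexes included."""
    return lev_set(S, prop_levels, mutex_levels, leveled_off)

# Hand-written levels loosely modeled on the grid example (an assumption,
# not data taken from the slides).
level0 = {"At(0,0)", "Key(0,1)"}
level1 = level0 | {"At(0,1)", "At(1,0)"}
level2 = level1 | {"At(1,1)", "Have_key", "~Key(0,1)"}
prop_levels = [level0, level1, level2, level2]
mutex_levels = [
    set(),
    {frozenset({"At(0,1)", "At(1,0)"})},
    {frozenset({"At(0,1)", "At(1,0)"}), frozenset({"At(1,1)", "Have_key"})},
    {frozenset({"At(0,1)", "At(1,0)"})},
]

goal = {"At(1,1)", "Have_key"}
print(h_sum(goal, prop_levels, mutex_levels))        # 4
print(h_set_level(goal, prop_levels, mutex_levels))  # 3
```

In this toy graph the sum heuristic counts each subgoal separately (2 + 2), while set-level waits one more level for the pair to become non-mutex.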
PROBLEM       Level        Sum          AdjSum2M
Gripper-25    -            69/0.98      67/1.57
Gripper-30    -            81/1.63      77/2.83
Tower-7       127/1.28     127/0.95     127/1.37
Tower-9       511/47.91    511/16.04    511/48.45
8-Puzzle1     31/6.25      39/0.35      31/0.69
8-Puzzle2     30/0.74      34/0.47      30/0.74
Mystery-6     -            -            16/62.5
Mystery-9     8/0.53       8/0.66       8/0.49
Mprime-3      4/1.87       4/1.88       4/1.67
Mprime-4      8/1.83       8/2.34       10/1.49
Aips-grid1    14/1.07      14/1.12      14/0.88
Aips-grid2    -            -            34/95.98
(Entries: plan length/time.)
Adjusting the Sum Heuristic
• Start with Sum heuristic and adjust it to take subgoal interactions into account
  – Negative interactions in terms of “degree of interaction”
  – Positive interactions in terms of co-achievement links
• Ignore negative interactions when accounting for positive interactions (and vice versa)
[AAAI 2000]
$H_{AdjSum2M}(S) = length(RelaxedPlan(S)) + \max_{p,q \in S} \delta(p,q)$
where $\delta(p,q) = lev(\{p,q\}) - \max\{lev(p), lev(q)\}$ /* Degree of -ve Interaction */
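As a small worked instance of the formula above, the sketch below computes the degree of interaction and the adjusted estimate. It assumes the relaxed plan length and the level values are already known; the `levels` table and the relaxed plan length of 3 are illustrative numbers, not results from the slides.

```python
from itertools import combinations

def delta(p, q, lev):
    """Degree of negative interaction between p and q."""
    return lev[frozenset({p, q})] - max(lev[frozenset({p})], lev[frozenset({q})])

def h_adjsum2m(S, relaxed_plan_length, lev):
    """Relaxed plan length plus the largest pairwise degree of interaction."""
    pairs = combinations(sorted(S), 2)
    interaction = max((delta(p, q, lev) for p, q in pairs), default=0)
    return relaxed_plan_length + interaction

# Hypothetical level values: each subgoal alone appears at level 2,
# but the pair is first non-mutex at level 3.
levels = {
    frozenset({"At(1,1)"}): 2,
    frozenset({"Have_key"}): 2,
    frozenset({"At(1,1)", "Have_key"}): 3,
}

print(delta("At(1,1)", "Have_key", levels))                # 1
print(h_adjsum2m({"At(1,1)", "Have_key"}, 3, levels))      # 4
```

The relaxed plan accounts for positive interactions (shared actions are counted once), and the max term adds back a penalty for the worst negative interaction.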
Optimizations in Heuristic Computation
• Taming Space/Time costs
• Bi-level Planning Graph representation
• Partial expansion of the PG (stop before level-off)
– It is FINE to cut corners when using PG for heuristics (instead of search)!!
• Branching factor can still be quite high
– Use actions appearing in the PG
  • Select actions in lev(S) vs Levels-off
[Figure: heuristic extracted from partial graph vs. leveled graph. Time (seconds, log scale) per problem, comparing Levels-off and Lev(S).]
[Figure: Example: Levels off. Proposition levels grow from {A} to {A, B, C, D, E}, with mutexes marked by x; goals C, D are present before the graph levels off. Annotations: Trade-off, Discarded.]
AltAlt Performance
[Figure: Logistics Domain (AIPS-00). Time (seconds, log scale) per problem for STAN3.0, HSP2.0, HSP-r, AltAlt1.0.]
[Figure: Schedule Domain (AIPS-00). Time (seconds, log scale) per problem for STAN3.0, HSP2.0, AltAlt1.0.]
Problem sets from IPC 2000
Even Parallel Plans aren’t safe..
AltAltp [JAIR 2003]
[Figure: AltAltp architecture. Action templates and the problem spec (init, goal state) feed a Graphplan plan extension phase (based on STAN) that builds a parallel planning graph; the extracted heuristics and the actions in the last level drive node expansion (fattening) and node ordering and selection; a plan compression algorithm (PushUp) produces the solution plan.]
[Figure: ZenoTravel AIPS-02. Steps per problem for AltAlt, AltAlt-PostProc, AltAlt-p.]
[Figure: Logistics AIPS-00. Steps per problem for Altalt-p, STAN, TP4, Blackbox, LPG 2nd.]
• Serial graph over-estimates
  – Use “parallel” rather than serial PG as the basis for heuristics
• Projection over sets of actions too costly
  – Select the branch with the best action and fatten it
• Use “push-up” to make the partial plans more parallel
II. PG heuristics for Graphplan..
PG Heuristics for Graphplan(!)
• Goal/Action Ordering Heuristics for Backward Search
• Propositions are ordered for consideration in decreasing value of their levels.
• Actions supporting a proposition are ordered for consideration in increasing values of their costs
  – Cost of an action = 1 + Cost of its set of preconditions
• Use of level heuristics improves the performance significantly.
  – The heuristics are surprisingly insensitive to the length of the planning graph
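The ordering rules above amount to two sorts. The sketch below is one possible reading, assuming the cost of a precondition set is the sum of the proposition levels (the slide does not fix which set heuristic is used); the `Action` type, the level table and the action names are hypothetical.

```python
from typing import FrozenSet, NamedTuple

class Action(NamedTuple):
    name: str
    preconds: FrozenSet[str]

def set_cost(props, lev):
    """Cost of a set of propositions; sum of levels is an assumption here."""
    return sum(lev[p] for p in props)

def action_cost(action, lev):
    """Cost of an action = 1 + cost of its set of preconditions."""
    return 1 + set_cost(action.preconds, lev)

def order_goals(goals, lev):
    """Propositions in decreasing order of level (hardest first)."""
    return sorted(goals, key=lambda p: lev[p], reverse=True)

def order_supporters(supporters, lev):
    """Supporting actions in increasing order of cost (cheapest first)."""
    return sorted(supporters, key=lambda a: action_cost(a, lev))

# Illustrative level values, not taken from the slides.
lev = {"Key(0,1)": 0, "At(0,1)": 1, "At(1,1)": 2, "Have_key": 2, "At(2,1)": 3}
goals = ["Key(0,1)", "At(1,1)", "Have_key"]
supporters = [
    Action("Move(2,1,1,1)", frozenset({"At(2,1)"})),
    Action("Move(0,1,1,1)", frozenset({"At(0,1)"})),
]

print(order_goals(goals, lev))
# ['At(1,1)', 'Have_key', 'Key(0,1)']
print([a.name for a in order_supporters(supporters, lev)])
# ['Move(0,1,1,1)', 'Move(2,1,1,1)']
```

Trying the highest-level proposition first fails early when a goal set is unachievable, while trying the cheapest supporter first steers toward short plans.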
[AIPS 2000]
Problem       MOP
              +3 levels        +5 levels        +10 levels
              L       T        L       T        L       T
bw-large-a    12/12   .007     12/12   .008     12/12   .01
bw-large-b    18/18   0.21     18/18   0.21     18/18   0.25
bw-large-c    28/28   4.13     28/28   4.18     28/28   7.4
huge-fct      18/18   0.01     18/18   0.02     18/18   0.02
bw-prob04     -       >30      -       >30      -       >30
rocket-a      8/29    .006     8/29    .007     8/29    .009
rocket-b      9/32    0.01     9/32    0.01     9/32    0.01
…And then state-space heuristics for Graphplan
(PEGG)
[Figure: planning graph (proposition levels) between the Init State (A, C, E, F, K) and the Goal (X, Y, Z), with the subgoal sets visited by Graphplan's backward search.]
1: Capture a state space view of Graphplan’s search in a search trace
[Figure: search trace over the planning graph. Regressed ‘states’ are linked by action assignments (a2, a3, a4) from the Goal (X, Y, Z) back toward the Init State (A, C, E, F, K). No solution? Extend the graph by a level and revisit the trace.]
PEGG now competitive with a heuristic state space planner
Graphplan columns (GP-e, PEGG-so, PEGG): cpu sec (steps/acts). Alt Alt (Lisp version) columns: heuristics adjusum2 and combo.

Problem            GP-e            PEGG-so       PEGG              adjusum2       combo
bw-large-b         13.4 (18/18)    12.2          3.1 (18/18)       87.1 (/18)     20.5 (/28)
bw-large-c         s               1104          66.9 (28/28)      738 (/28)      114.9 (/38)
bw-large-d         s               pe            340 (38/38)       2350 (/36)     *
rocket-ext-a       3.5 (7/36)      2.8 (7/34)    1.1 (7/34)        43.6 (/40)     1.26 (/34)
att-log-a          31.8 (11/79)    2.6 (11/72)   2.2 (11/62)       36.7 (/56)     2.27 (/64)
Gripper-15         s               47.5          16.7 (36/45)      14.1 (/45)     16.98 (/45)
Gripper-20         s               s             110.8 (40/59)     38.2 (/59)     20.92 (/59)
Tower-9            s (511/511)     118           23.6 (511/511)    121 (/511)     *
8puzzle-1          95.2 (31/31)    31.1          9.2 (31/31)       143.7 (/31)    119.5 (/39)
8puzzle-2          87.5 (30/30)    31.3          7.0 (30/30)       348.3 (/30)    50.5 (/48)
AIPS 1998
grid-y-1           16.7 (14/14)    16.8          16.8 (14/14)      739.4 (/14)    640.5 (/14)
mprime-1           4.8 (4/6)       3.6 (4/6)     2.1 (4/6)         722.6 (/4)     79.6 (/4)
AIPS 2002
driverlog-2-3-6b   27.5 (7/20)     1.9           1.9 (7/20)        232
[IJCAI 2003]
In the beginning it was all POP.
Then it was cruelly UnPOPped
The good times return with Re(vived)POP
III. PG Heuristics for PO Planners
POP Algorithm
1. Plan Selection: Select a plan P from the search queue
2. Flaw Selection: Choose a flaw f (open condition or unsafe link)
3. Flaw resolution:
   – If f is an open condition, choose an action S that achieves f
   – If f is an unsafe link, choose promotion or demotion
   – Update P
   – Return NULL if no resolution exists
4. If there is no flaw left, return P
[Figure: 1. Initial plan: steps S0 and Sinf with goals g1, g2. 2. Plan refinement (flaw selection and resolution): a partial plan with steps S0, S1, S2, S3, Sinf, conditions p, ~p, q1, g1, g2, and open conditions oc1, oc2.]
Choice points
• Flaw selection (open condition? unsafe link? Non-backtrack choice)
• Flaw resolution/Plan Selection (how to select (rank) partial plan?)
PG Heuristics for Partial Order Planning
• Distance heuristics to estimate cost of partially ordered plans (and to select flaws)
  – If we ignore negative interactions, then the set of open conditions can be seen as a regression state
• Mutexes used to detect indirect conflicts in partial plans
  – A step threatens a link if there is a mutex between the link condition and the step's effect or precondition
  – Post disjunctive precedences and use propagation to simplify
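[Figure: partial plan with steps S0 to S5 and Sinf; a causal link from Si to Sj on condition p, and a step Sk with conditions q and r.]
$S_k \prec S_i \;\lor\; S_j \prec S_k$ if $mutex(p,q)$ or $mutex(p,r)$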
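The mutex-based threat test above can be sketched as follows. This is a minimal illustration, assuming mutexes are given as a set of proposition pairs read off a planning graph; the `Step` and `Link` types and the example propositions are hypothetical, and no claim is made that RePOP is implemented this way.

```python
from typing import FrozenSet, NamedTuple

class Step(NamedTuple):
    name: str
    preconds: FrozenSet[str]
    effects: FrozenSet[str]

class Link(NamedTuple):
    producer: str   # S_i
    condition: str  # p
    consumer: str   # S_j

def threatens(step, link, mutexes):
    """True if a precondition or effect of step is mutex with the link condition."""
    if step.name in (link.producer, link.consumer):
        return False
    return any(frozenset({link.condition, x}) in mutexes
               for x in step.preconds | step.effects)

def disjunctive_precedence(step, link):
    """The two orderings to post: S_k before S_i, or S_j before S_k."""
    return ((step.name, link.producer), (link.consumer, step.name))

# Illustrative data: p is mutex with q, a precondition of Sk.
mutexes = {frozenset({"p", "q"})}
link = Link("Si", "p", "Sj")
sk = Step("Sk", frozenset({"q"}), frozenset({"r"}))
sm = Step("Sm", frozenset({"u"}), frozenset({"v"}))

print(threatens(sk, link, mutexes))          # True
print(threatens(sm, link, mutexes))          # False
print(disjunctive_precedence(sk, link))      # (('Sk', 'Si'), ('Sj', 'Sk'))
```

Each pair (a, b) returned by `disjunctive_precedence` reads "a before b"; a constraint propagator would then prune whichever disjunct contradicts the existing orderings.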
RePOP’s Performance
Problem       UCPOP    RePOP         Graphplan   AltAlt
Gripper-8     -        1.01          66.82       .43
Gripper-10    -        2.72          47min       1.15
Gripper-20    -        81.86         -           15.42
Rocket-a      -        8.36          75.12       1.02
Rocket-b      -        8.17          77.48       1.29
Logistics-a   -        3.16          306.12      1.59
Logistics-b   -        2.31          262.64      1.18
Logistics-c   -        22.54         -           4.52
Logistics-d   -        91.53         -           20.62
Bw-large-a    45.78    (5.23) -      14.67       4.12
Bw-large-b    -        (18.86) -     122.56      14.14
Bw-large-c    -        (137.84) -    -           116.34

• RePOP implemented on top of UCPOP
  – Dramatically better than any other partial order planner
  – Competitive with Graphplan and AltAlt
  – VHPOP carried the torch at IPC 2002
[IJCAI, 2001]
You see, pop, it is possible to Re-use all the old POP work!
Written in Lisp, runs on Linux, 500MHz, 250MB
IV. PG Heuristics for Metric Temporal Planning
[Figure: Sapa architecture. From the planning problem, generate the start state and put it in a queue of time-stamped states. Select the state with the lowest f-value (f can have both cost and makespan components). If it satisfies the goals, partialize the p.c. plan and return the o.c. and p.c. plans; if not, expand the state by applying actions. Heuristic estimation for new states: build the RTPG, propagate cost functions, extract a relaxed plan, adjust for mutexes and resources.]
[ECP 2001; AIPS 2002; ICAPS 2003; JAIR 2003]
Multi-Objective Nature of MTP
• Plan quality in Metric Temporal domains is inherently multi-dimensional
  – Temporal quality (e.g. makespan, slack)
  – Plan cost (e.g. cumulative action cost, resource consumption)
• Necessitates multi-objective search
  – Modeling objective functions
  – Tracking different quality metrics and heuristic estimation
• Challenge: Inter-dependencies between different quality metrics
  – Typically cost will go down with higher makespan…
[Figure: map with Tempe, Phoenix, L.A.]
SAPA’s approach
• Use a temporal version of the Planning Graph (Smith & Weld) structure to track the time-sensitive cost function:
  – Estimation of the earliest time (makespan) to achieve all goals.
  – Estimation of the lowest cost to achieve goals
  – Estimation of the cost to achieve goals given the specific makespan value.
• Use this information to calculate the heuristic value for the objective function involving both time and cost
Challenge: How to propagate cost over planning graphs?
[Figure: temporal planning graph for travel from Tempe to Los Angeles via Phoenix, with time points t = 0, 0.5, 1, 1.5, 10 and actions Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), Airplane(P,LA).]
Search through time-stamped states
$S = (P, M, \Pi, Q, t)$
• P: Set <pi,ti> of predicates pi and the time of their last achievement ti < t.
• M: Set of functions representing resource values.
• $\Pi$: Set of protected persistent conditions (could be binary or resource conds).
• Q: Event queue (contains resource as well as binary fluent events).
• t: Time stamp of S.
[Figure: example action Flying, with conditions (in-city ?airplane ?city1) and (fuel ?airplane) > 0, effects on (in-city ?airplane ?city1) and (in-city ?airplane ?city2), and consume (fuel ?airplane).]
• Goal Satisfaction: $S = (P, M, \Pi, Q, t) \models G$ if $\forall \langle p_i, t_i \rangle \in G$ either:
  – $\exists \langle p_i, t_j \rangle \in P$, $t_j < t_i$ and no event in Q deletes $p_i$.
  – $\exists e \in Q$ that adds $p_i$ at time $t_e < t_i$.
• Action Application: Action A is applicable in S if:
  – All instantaneous preconditions of A are satisfied by P and M.
  – A’s effects do not interfere with $\Pi$ and Q.
  – No event in Q interferes with persistent preconditions of A.
  – A does not lead to concurrent resource change
• When A is applied to S:
  – P is updated according to A’s instantaneous effects.
  – Persistent preconditions of A are put in $\Pi$
  – Delayed effects of A are put in Q.
Search: Pick a state S from the queue. If S satisfies the goals, end. Else non-deterministically do one of
  -- Advance the clock (by executing the earliest event in $Q_S$)
  -- Apply one of the applicable actions to S
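The goal-satisfaction test above translates almost directly into code. The sketch below is a simplified reading that looks only at P and Q; the `Event` type, the dictionary encoding of P and G, and the example facts are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, NamedTuple

class Event(NamedTuple):
    time: float
    adds: FrozenSet[str]
    deletes: FrozenSet[str]

def satisfies(P: Dict[str, float], Q: List[Event], goals: Dict[str, float]) -> bool:
    """P maps predicates to their last achievement time; goals maps each
    goal predicate to its deadline."""
    for p, deadline in goals.items():
        # Case 1: already achieved before the deadline and never deleted by Q.
        achieved = (p in P and P[p] < deadline
                    and not any(p in e.deletes for e in Q))
        # Case 2: some queued event adds p before the deadline.
        pending = any(p in e.adds and e.time < deadline for e in Q)
        if not (achieved or pending):
            return False
    return True

# Illustrative state: the plane is in the air and lands at time 2.0.
P = {"have-fuel": 0.0}
Q = [Event(2.0, frozenset({"in-city-LA"}), frozenset())]

print(satisfies(P, Q, {"in-city-LA": 3.0}))   # True
print(satisfies(P, Q, {"in-city-LA": 1.5}))   # False
print(satisfies(P, Q, {"have-fuel": 5.0}))    # True
```

Because delayed effects sit in the event queue, a state can satisfy the goals before the clock has advanced to the point where those effects actually happen.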
Propagating Cost Functions
[Figure: cost propagation over the temporal planning graph for Tempe, Phoenix, L.A. Drive-car(Tempe,LA), Hel(T,P) and Shuttle(T,P) start at t = 0; Airplane(P,LA) starts at t = 0.5 and at t = 1. The cost function of At(LA) over time steps from $300 at t = 1.5 to $220 at t = 2 and $100 at t = 10. Labels: Cost(At(LA)); Cost(At(Phx)) = Cost(Flight(Phx,LA)).]
• Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
• Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
• Car(Tempe,LA): Cost: $100; Time: 10 hour
• Airplane(Phx,LA): Cost: $200; Time: 1.0 hour
Issues in Cost Propagation
Costing a set of literals
• $Cost(f,t) = \min \{Cost(A,t) : f \in Effect(A)\}$
• $Cost(A,t) = Aggregate(Cost(f,t) : f \in Pre(A))$
  – Aggregate can be Sum or Max
  – Set-level idea would entail tracking costs of subsets of literals
Termination Criteria
• Deadline Termination: Terminate at time point t if:
  – $\forall$ goal G: $Deadline(G) \le t$, or
  – $\exists$ goal G: $(Deadline(G) < t) \wedge (Cost(G,t) = \infty)$
• Fix-point Termination: Terminate at time point t where we can not improve the cost of any proposition.
• K-lookahead approximation: At t where $Cost(g,t) < \infty$, repeat the process of applying (set) of actions that can improve the cost functions k times.
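To make the propagation concrete, here is a small event-driven sketch run on the Tempe/Phoenix/L.A. travel example with fix-point termination. It is an illustration under stated assumptions: each action has a single effect, the execution cost of an action is added to the aggregated cost of its preconditions, and the `Action` type and `propagate_costs` function are hypothetical names rather than Sapa's code.

```python
import heapq
from math import inf
from typing import FrozenSet, NamedTuple

class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    effect: str
    duration: float
    cost: float

def propagate_costs(initial_facts, actions, aggregate=sum):
    """Return, per fact, the breakpoints (time, cost) of its cost function."""
    best = {}      # fact -> lowest cost found so far
    history = {}   # fact -> [(time, cost), ...] in time order
    events = [(0.0, 0.0, f) for f in initial_facts]
    heapq.heapify(events)
    while events:                       # fix-point: stop when nothing improves
        t, c, fact = heapq.heappop(events)
        if c >= best.get(fact, inf):
            continue                    # not an improvement, ignore
        best[fact] = c
        history.setdefault(fact, []).append((t, c))
        for a in actions:
            if fact in a.pre and all(p in best for p in a.pre):
                total = a.cost + aggregate(best[p] for p in a.pre)
                heapq.heappush(events, (t + a.duration, total, a.effect))
    return history

actions = [
    Action("Shuttle(Tempe,Phx)", frozenset({"At(Tempe)"}), "At(Phx)", 1.0, 20),
    Action("Helicopter(Tempe,Phx)", frozenset({"At(Tempe)"}), "At(Phx)", 0.5, 100),
    Action("Car(Tempe,LA)", frozenset({"At(Tempe)"}), "At(LA)", 10.0, 100),
    Action("Airplane(Phx,LA)", frozenset({"At(Phx)"}), "At(LA)", 1.0, 200),
]

history = propagate_costs(["At(Tempe)"], actions)
print(history["At(LA)"])    # [(1.5, 300.0), (2.0, 220.0), (10.0, 100.0)]
print(history["At(Phx)"])   # [(0.5, 100.0), (1.0, 20.0)]
```

The breakpoints for At(LA) match the step function on the slide: $300 at t = 1.5, $220 at t = 2, $100 at t = 10. Passing `aggregate=max` gives the Max variant.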
Heuristics based on cost functions
• If we want to minimize makespan:
  – $h = t_0$
• If we want to minimize cost
  – $h = CostAggregate(G, t_\infty)$
• If we want to minimize a function f(time,cost) of cost and makespan
  – $h = \min f(t, Cost(G,t))$ s.t. $t_0 \le t \le t_\infty$
• E.g. f(time,cost) = 100.makespan + Cost then $h = 100 \times 2 + 220$ at $t_0 \le t = 2 \le t_\infty$
[Figure: cost function Cost(At(LA)) over time: $300 at $t_0 = 1.5$ (time of earliest achievement), $220 at t = 2, $100 at $t_\infty = 10$ (time of lowest cost).]
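The three heuristics can be read straight off the breakpoints of the cost function. The sketch below assumes the cost function is a decreasing step function given as (time, cost) pairs and that f does not decrease with time, so checking the breakpoints is enough; the function names are hypothetical.

```python
def h_makespan(cost_fn):
    """Earliest time t0 at which the goal is achievable."""
    return cost_fn[0][0]

def h_cost(cost_fn):
    """Lowest achievable cost, reached at t_infinity."""
    return cost_fn[-1][1]

def h_combined(cost_fn, f):
    """Minimum of f(t, Cost(G, t)) over t0 <= t <= t_infinity."""
    return min(f(t, c) for t, c in cost_fn)

# Breakpoints of Cost(At(LA)) from the travel example.
cost_at_la = [(1.5, 300), (2.0, 220), (10.0, 100)]

def objective(makespan, cost):
    return 100 * makespan + cost

print(h_makespan(cost_at_la))              # 1.5
print(h_cost(cost_at_la))                  # 100
print(h_combined(cost_at_la, objective))   # 420.0
```

The combined value is 100 x 2 + 220 = 420, attained at t = 2; the alternatives give 450 at t = 1.5 and 1100 at t = 10.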
Direct
Using Relaxed Plan
• Extract a relaxed plan using h as the bias
  – If the objective function is f(time,cost), then action A (to be added to RP) is selected such that $f(t(RP+A), C(RP+A)) + f(t(G_{new}), C(G_{new}))$ is minimal, where $G_{new} = (G \cup Precond(A)) \setminus Effects(A)$
Phased Relaxation
The relaxed plan can be adjusted to take into account constraints that were originally ignored
• Adjusting for Mutexes: Adjust the make-span estimate of the relaxed plan by marking actions that are mutex (and thus cannot be executed concurrently)
• Adjusting for Resource Interactions: Estimate the number of additional resource-producing actions needed to make up for any resource short-fall in the relaxed plan
  $C = C + \sum_R \lceil (Con(R) - (Init(R) + Pro(R))) / \Delta_R \rceil \times C(A_R)$
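A small numeric sketch of the resource adjustment follows. It assumes that $\Delta_R$ is the amount of R produced by one application of the producing action $A_R$, and that the correction applies only when there is an actual shortfall; the `Resource` type and the fuel numbers are made up for illustration.

```python
from math import ceil
from typing import NamedTuple

class Resource(NamedTuple):
    consumed: float       # Con(R): amount consumed by the relaxed plan
    initial: float        # Init(R): amount available initially
    produced: float       # Pro(R): amount produced by the relaxed plan
    delta: float          # assumed: amount one producing action A_R adds
    producer_cost: float  # C(A_R): cost of one producing action

def adjust_for_resources(plan_cost, resources):
    """Add the cost of the extra producing actions needed to cover shortfalls."""
    for r in resources:
        shortfall = r.consumed - (r.initial + r.produced)
        if shortfall > 0:
            plan_cost += ceil(shortfall / r.delta) * r.producer_cost
    return plan_cost

# Hypothetical fuel resource: 30 units short, each refuel adds 20 and costs 5.
fuel = Resource(consumed=50, initial=20, produced=0, delta=20, producer_cost=5)

print(adjust_for_resources(100, [fuel]))   # 110
```

Two refuel actions are needed to cover a shortfall of 30 in steps of 20, so the relaxed plan cost rises from 100 to 110.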
Handling Cost/Makespan Tradeoffs
Results over 20 randomly generated temporal logistics problems involving moving 4 packages between different locations in 3 cities:
$O = f(time,cost) = \alpha \cdot Makespan + (1 - \alpha) \cdot TotalCost$
[Figure: cost variation and makespan variation. Total cost and makespan plotted against Alpha (0.1 to 1).]
SAPA at IPC-2002
[Figures: Rover (time setting); Satellite (complex setting).]
[JAIR 2003]
V. PG Heuristics for Conformant Planning
[Figure: CAltAlt architecture. Components: IPC PDDL Parser, A* Search Engine (HSP-r), Heuristics, Planning Graph(s) (IPP), Clausal States, Labels (CUDD), Model Checker (NuSMV), each marked as off-the-shelf or custom. Edge labels: Input for, Searches, Guided By, Extracted From, Condense, Validates.]
Conformant Planning as Regression
Actions:
A1: M & P => K
A2: M & Q => K
A3: M & R => L
A4: K => G
A5: L => G
Initially: (P V Q V R) & (~P V ~Q) & (~P V ~R) & (~Q V ~R) & M
Goal State: G
Regression:
G
  A4: (G V K)
  A5: (G V K V L)
  A1: (G V K V L V P) & M
  A2: (G V K V L V P V Q) & M
  A3: (G V K V L V P V Q V R) & M
G or K must be true before A4 for G to be true after A4
Each Clause is Satisfied by a Clause in the Initial Clausal State -- Done! (5 actions)
Using a Single, Unioned Graph
• Union literals from all initial states into a conjunctive initial graph level
• Easy to implement
• Not effective
• Lose world specific support information
• Incorrect mutexes
[Figure: single planning graph grown from the unioned initial level {P, Q, R, M} through A1, A2, A3 (adding K, L) and A4, A5 (adding G); the support traced back from G uses A4 and A1. Heuristic Estimate = 2]
Using Multiple Graphs
• Accurate Mutexes
• Moderate Implementation Difficulty
• Memory Intensive
• Heuristic Computation Can be costly
Unioning these graphs a priori would give much savings …
[Figure: one planning graph per initial world (P M; Q M; R M), reaching G through A1 then A4, A2 then A4, and A3 then A5 respectively.]
Using a Single, Labeled Graph
[Figure: single labeled planning graph over P, Q, R, M, K, L, G with actions A1 to A5 and initial worlds P M, Q M, R M; annotated “ATMS”. Heuristic Value = 5]
• Action Labels: Conjunction of Labels of Supporting Literals
• Literal Labels: Disjunction of Labels of Supporting Actions
• Label of a literal signifies the set of worlds in which it is supported
  – Full support means all init worlds
Label Key: ~Q & ~R; ~P & ~R; ~P & ~Q; (~P & ~R) V (~Q & ~R); (~P & ~R) V (~Q & ~R) V (~P & ~Q); M; True
• Memory Efficient
• Cheap Heuristics
• Scalable
• Extensible
• Tricky to Implement
Benefits from BDD’s and a model checker
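Label propagation can be illustrated on the running example with plain sets of world indices standing in for the BDD labels. This sketch makes several simplifying assumptions: mutexes are ignored, each action has one effect, conditions are treated as preconditions, and `label_graph` is a hypothetical name; it reports only the first level at which the goal is supported in every initial world, not the relaxed-plan value shown on the slide.

```python
def label_graph(worlds, actions, goal, max_levels=10):
    """Grow labels level by level; return (level, labels) once the goal is
    supported in all initial worlds, or (None, labels) if that never happens."""
    all_worlds = frozenset(range(len(worlds)))
    labels = {}
    for i, world in enumerate(worlds):
        for lit in world:
            labels[lit] = labels.get(lit, frozenset()) | {i}
    for level in range(max_levels + 1):
        if labels.get(goal, frozenset()) == all_worlds:
            return level, labels
        new = dict(labels)  # noops carry every existing label forward
        for pre, eff in actions.values():
            # Action label: conjunction (intersection) of its literals' labels.
            a_label = all_worlds
            for p in pre:
                a_label &= labels.get(p, frozenset())
            # Literal label: disjunction (union) over its supporting actions.
            new[eff] = new.get(eff, frozenset()) | a_label
        if new == labels:
            return None, labels  # leveled off without full support
        labels = new
    return None, labels

worlds = [{"P", "M"}, {"Q", "M"}, {"R", "M"}]
actions = {
    "A1": (("M", "P"), "K"),
    "A2": (("M", "Q"), "K"),
    "A3": (("M", "R"), "L"),
    "A4": (("K",), "G"),
    "A5": (("L",), "G"),
}

level, labels = label_graph(worlds, actions, "G")
print(level)                 # 2
print(sorted(labels["K"]))   # [0, 1]
print(sorted(labels["L"]))   # [2]
```

K is supported only in the worlds where P or Q holds and L only where R holds, so G needs both A4 and A5 to be supported in all three worlds, which is exactly the world-specific information the unioned graph loses.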
CAltAlt Performance
• Label-graph based heuristics make CAltAlt competitive with the current best approaches
[Figure: Rovers Domain. Time (ms, log scale) for problems 1 to 4: Single Sum, Multi Level, Multi RP, Union, Label Level, Label RP, CGP, HSCP, GPT, KACMBP.]
[Figure: Logistics. Plan length for problems 1 to 4: Label RP, CGP, HSCP, GPT, KACMBP.]
The Damage until now..
– Classical (regression) planning
  – AltAlt (AAAI 2000; AIJ 2002); AltAltp (JAIR 2003)
  • Serial vs. Parallel graphs; Level and Adjusted heuristics; Partial expansion
– Graphplan style search
  – GP-HSP (AIPS 2000); PEGG (IJCAI 2003; AAAI 1999)
  • Variable/Value ordering heuristics based on distances
– Partial order planning
  – RePOP (IJCAI 2001)
  • Mutexes used to detect Indirect Conflicts
– Metric Temporal Planning
  – Sapa (ECP 2001; AIPS 2002; JAIR 2003)
  • Propagation of cost functions; Phased relaxation
– Conformant Planning
  – CAltAlt (ICAPS Uncertainty Wkshp, 2003)
  • Multiple graphs; Labelled graphs
Still to come: PG Heuristics for—
• Probabilistic Conformant Planning
• Conditional Planning
• Lifted Planning
• Trans-Atlantic camaraderie
• Post-war reconstruction
• Middle-east peace…
Meanwhile outside Tempe…
• Hoffmann’s FF uses relaxed plans from PG
• Geffner & Haslum derive DP-versions of PG-heuristics
• Gerevini & Serina’s LPG uses PG heuristics to cost the various repairs
• Smith back-propagates (convolves) probability distributions over PG to decide the contingencies worth focusing on
• Trinquart proposes a PG-clone that directly computes reachability in plan-space…
• …
Why do we love PG Heuristics?
• They work!
• They are “forgiving”
  – You don't like doing mutex? okay
  – You don't like growing the graph all the way? okay.
• Allow propagation of many types of information
  – Level, subgoal interaction, time, cost, world support
• Support phased relaxation
  – E.g. Ignore mutexes and resources and bring them back later…
• Graph structure supports other synergistic uses
  – e.g. action selection
• Versatility…
Versatility of PG Heuristics
• PG Variations
  – Serial
  – Parallel
  – Temporal
  – Labelled
• Propagation Methods
  – Level
  – Mutex
  – Cost
  – Label
• Planning Problems
  – Classical
  – Resource/Temporal
  – Conformant
• Planners
  – Regression
  – Progression
  – Partial Order
  – Graphplan-style