Metric/Temporal Planning. Metric Temporal Planning MTP Adds time and resources to planning Special...

Metric/Temporal Planning

Metric Temporal Planning (MTP) adds time and resources to planning.

Special cases: temporal planning (TP) and resource planning (RP).

Issues with Time

Changes brought by the introduction of time into planning can be grouped into two categories:
  Changes brought by having a metric (clock) time, i.e., there is a clock with respect to which we can specify events.
  Changes brought by giving durations to actions, i.e., actions are not instantaneous.

Without metric time, a plan has just a beginning and an ending point. Metric time allows us to talk about all time points (and intervals) during the execution of the plan.

Changes brought by metric time include:
  Exogenous events
    Special case: timed initial literals (in the initial state we can state that some fluent becomes true at a specific time in the future during execution)
  Deadline goals
    We can state that different goals need to be made true by different deadline times (instead of all goals being true at the end)
  Durative goals
    We can state that a certain fluent must have a specific value over an entire interval

Issues with Time (contd.)

Durations of actions may be static or “dynamic”.
  A dynamic duration depends on the context, e.g., the time to fill your gas tank depends on how empty the tank is to begin with.
  Advanced issues: uncertain durations…

With instantaneous actions, an action has just “before” and “after”: preconditions must hold “before” and effects will hold “after”. Durative actions have “before” and “after” as well as “during”. With metric time (i.e., an external clock), we can refer to all these points. We can now ask:
  When are preconditions needed? Are they needed at a single point or over a duration?
  When are effects given? Are they point effects or “durative” effects (which are guaranteed over a certain duration)?

Note that because actions have durations, they can have multiple effects on a single fluent at different times.
  E.g., an action can make fluent P true at start, false after 10 sec, true again after another 10 sec, etc.

A default assumption is that all preconditions are needed at the beginning and must hold during the action’s entire duration, and that all effects will be available at the end of the action.
  E.g., consider a “grading homeworks” action: when are the homeworks needed? When are the grades available? What does your teacher tell you?

Issues with Time (contd.)

Durative actions bring more pointed meaning to “concurrency”.
  Concurrency is not just a luxury (to reduce makespan), but is often a necessity (e.g., burn a match, and cross the dark corridor while the match is burning).

Suppose I tell you that a plan P contains actions A1…A10, with durations d1…d10. What is the makespan (execution duration) of P?
  Makespan(P) >= max(d1…d10)
  If Makespan(P) = Sum(d1…d10) and no actions overlap, then it is a strictly serial plan
  If Makespan(P) > Sum(d1…d10), then there is idle time in the plan
  If Makespan(P) < Sum(d1…d10), then there is concurrency

Actions don’t need to start right after the preceding action.
  Think of the bank teller gossiping with his colleague in between servicing each customer.

Planned idle/slack time may not always be a bad thing; it can sometimes improve the robustness of the plan.
  Think of three travel plans involving connections in Minneapolis: plan 1 schedules 5 min for connection time; plan 2 schedules 1 hour; plan 3 schedules 2 days. Which one is better (all else being equal)?
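The makespan comparisons on this slide can be checked mechanically. The following is a minimal sketch, assuming a plan is given as a list of (start time, duration) pairs; the function names and the sample schedules are illustrative and not part of the original slides.

```python
from typing import List, Tuple

Schedule = List[Tuple[float, float]]  # (start_time, duration) per action

def makespan(schedule: Schedule) -> float:
    """Clock time from the earliest start to the latest end."""
    start = min(s for s, _ in schedule)
    end = max(s + d for s, d in schedule)
    return end - start

def compare_to_serial(schedule: Schedule) -> str:
    """Compare the makespan with the sum of the durations."""
    total = sum(d for _, d in schedule)
    span = makespan(schedule)
    if span < total:
        return "concurrency"
    if span > total:
        return "idle time"
    return "equal to the sum of durations"

# Hypothetical schedules used only for illustration.
serial = [(0, 3), (3, 2), (5, 4)]      # back-to-back actions
match_and_cellar = [(0, 15), (0, 10)]  # two overlapping actions
with_slack = [(0, 3), (5, 2)]          # a gap between the two actions

for plan in (serial, match_and_cellar, with_slack):
    print(makespan(plan), compare_to_serial(plan))
```

Note that a makespan equal to the sum of durations does not by itself prove the plan is serial: overlap in one place can be cancelled by idle time elsewhere, which is why the sketch reports that case neutrally.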

Issues with Resources (continuous quantities)

Resources: actions may consume/produce (continuous-quantity) “resources”.
  The main consequence is that we have numeric state variables, instead of just boolean (or multi-valued) ones (multi-valued does not mean numeric: a variable can take red, blue, green as values).
  Actions can update a numeric state variable (whereas they just assign a non-numeric one):
    Resource qty after action := Some-function-of(Resource qty before action, action parameters)
  Updates can be linear OR non-linear.
  When combined with durative actions, updates can be discrete (i.e., happen all at once at the end of the action) OR continuous (i.e., happen at some given rate during the action).

Planning issues: how to efficiently reason with continuous quantities during planning.
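To make the discrete vs. continuous distinction concrete, here is a minimal sketch, assuming a linear consumption model. The function names, the fuel numbers and the choice of a linear rate are assumptions made for illustration only.

```python
def discrete_level(level_at_start: float, amount: float,
                   duration: float, elapsed: float) -> float:
    """Discrete update: the whole amount is consumed at the end of the action."""
    return level_at_start - amount if elapsed >= duration else level_at_start

def continuous_level(level_at_start: float, amount: float,
                     duration: float, elapsed: float) -> float:
    """Continuous update: the amount is consumed at a constant rate."""
    rate = amount / duration
    return level_at_start - rate * min(elapsed, duration)

# Hypothetical action: consumes 20 units of fuel over 10 time units.
print(discrete_level(100, 20, 10, 5))    # level halfway through, discrete model
print(continuous_level(100, 20, 10, 5))  # level halfway through, continuous model
```

Both models agree at the end of the action; they differ only in what a concurrent action would observe while the first action is still executing.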

PDDL 2.1 Standard: Summary

Durations
  Static and dynamic durations allowed
  Also allows duration inequalities
Preconditions
  Can be “at start”, “at end”, or “over all” (throughout the duration)
  Doesn’t model preconditions being needed for arbitrary durations in the middle
Effects
  Can be “at start” or “at end”
  This makes effects “discrete”
Numeric quantities
  Can be present in the preconditions or effects
  Presence in the effects can be “discrete” (“at start”/“at end”) or continuous
  Continuous change is specified by giving a “rate” at which the quantity changes
  Non-linear rates are harder

PDDL 2.1 (Level 2): Pure Durative Actions

(:durative-action burn_match
  :parameters ()
  :duration (= ?duration 15)
  :condition (and (at start have_match)
                  (at start have_strikepad))
  :effect (and (at start have_light)
               (at end (not have_light))))

[Figure: BURN_MATCH (dur: 15). Conditions: have_match, have_strikepad. Effects: have_light, then ~have_light.]

(:durative-action cross_cellar
  :parameters ()
  :duration (= ?duration 10)
  :condition (and (at start have_light)
                  (over all have_light)
                  (at start at_steps))
  :effect (and (at start (not at_steps))
               (at start crossing)
               (at end at_fuse_box)))

[Figure: CROSS_CELLAR (dur: 10). Conditions: have_light, at_steps. Effects: ~at_steps, crossing, then at_fuse_box.]

PDDL 2.1 Level 3: Durative actions and numeric quantities (but discrete effects)

The entire energy to be consumed is “encumbered” at the very beginning (even though it gets consumed slowly over the full duration).

PDDL 2.1 Level 4: Durative actions and numeric quantities (with continuous effects)

Issues in modeling continuous change by discrete vs. continuous effects

Consider the action of boiling a pan of water. The quantity “temperature of water” changes continuously over the duration of the action.

We can ignore continuous effects by specifying that the temperature is 100 °C at the end.
  Easy to handle; but the temperature can only be accessed at the end of the action, which reduces concurrency (what if we also put a blow torch to the pan to “hasten” the process?)

Or we can specify that the temperature of the water rises at a linear rate until it reaches 100 °C.
  Harder to handle; but allows more concurrency (the total rate of increase is the sum of all the individual rates of increase)
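The additive-rate argument can be illustrated with a few lines of arithmetic. This is a minimal sketch, assuming constant linear heating rates; the starting temperature and the two rates (stove and blow torch) are hypothetical numbers, not values from the slides.

```python
from typing import List

def time_to_reach(target: float, start: float, rates: List[float]) -> float:
    """Time for a quantity to go from start to target when all the
    concurrently active actions contribute their rates additively."""
    total_rate = sum(rates)
    if total_rate <= 0:
        raise ValueError("the quantity never reaches the target")
    return (target - start) / total_rate

# Hypothetical rates in degrees per minute.
stove_only = time_to_reach(100.0, 20.0, [8.0])
stove_and_torch = time_to_reach(100.0, 20.0, [8.0, 12.0])
print(stove_only, stove_and_torch)
```

With a discrete "100 °C at the end" effect there is no rate to add to, so the second heating action could not shorten the first; with rates, the two actions compose naturally.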

Compiling durative actions into instantaneous ones

A durative action A that has only at-start, at-end and over-all conditions can be modeled in terms of two coupled instantaneous actions As and Ae.
  As gets all the at-start conditions and effects.
  Ae gets all the at-end conditions and effects.
  An “invariant” (think of it as an Interval Preservation Constraint) holds from As to Ae for all the “over all” preconditions.

[Figure: action A with conditions ps, po, pe and effects +es, +ee, split into As (ps, +es) and Ae (pe, +ee), with po holding between them.]
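The split into As and Ae can be sketched as a small data transformation. This is a minimal illustration, assuming literals are plain strings with a "~" prefix for negation (as in the slide figures); the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

@dataclass(frozen=True)
class DurativeAction:
    name: str
    duration: float
    cond_start: FrozenSet[str] = frozenset()
    cond_overall: FrozenSet[str] = frozenset()
    cond_end: FrozenSet[str] = frozenset()
    eff_start: FrozenSet[str] = frozenset()
    eff_end: FrozenSet[str] = frozenset()

@dataclass(frozen=True)
class Instant:
    name: str
    conditions: FrozenSet[str]
    effects: FrozenSet[str]

def compile_action(a: DurativeAction) -> Tuple[Instant, Instant, FrozenSet[str], float]:
    """Return (As, Ae, invariant, gap): the invariant must hold from As to Ae,
    and Ae must occur exactly `gap` time units after As."""
    a_s = Instant(a.name + "_start", a.cond_start, a.eff_start)
    a_e = Instant(a.name + "_end", a.cond_end, a.eff_end)
    return a_s, a_e, a.cond_overall, a.duration

cross_cellar = DurativeAction(
    name="cross_cellar",
    duration=10,
    cond_start=frozenset({"have_light", "at_steps"}),
    cond_overall=frozenset({"have_light"}),
    eff_start=frozenset({"~at_steps", "crossing"}),
    eff_end=frozenset({"at_fuse_box"}),
)
a_s, a_e, invariant, gap = compile_action(cross_cellar)
print(a_s, a_e, sorted(invariant), gap)
```

The two instantaneous actions are only meaningful as a pair: a planner using this encoding still has to enforce the invariant and the fixed time gap between them.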

Plan representation

[Figure: Gantt chart of a plan with actions A1, A2, A3; labels Drive(cityA,cityB), QAt(truck,B).]

An executable plan must provide
  the actions that need to be executed, and
  the start times for each of the actions,
  or a set of simple temporal constraints on the set of actions (simple temporal constraints are a generalization of partial orders).
    E.g., A1 --[4,5]--> A2 means 4 <= ST(A2) - ST(A1) <= 5.

Plan views: PERT and Gantt charts
  The Gantt chart is what is shown on the right.
  The PERT chart shows the causal links.
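A set of simple temporal constraints can be checked against concrete start times in a few lines. This is a minimal sketch, assuming each constraint is a tuple (a, b, lo, hi) meaning lo <= ST(b) - ST(a) <= hi; the tuple layout and the sample start times are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Constraint = Tuple[str, str, float, float]  # (a, b, lo, hi)

def satisfies(start_times: Dict[str, float], constraints: List[Constraint]) -> bool:
    """True if every constraint lo <= ST(b) - ST(a) <= hi holds."""
    return all(lo <= start_times[b] - start_times[a] <= hi
               for a, b, lo, hi in constraints)

# The example from the slide: A1 --[4,5]--> A2.
constraints = [("A1", "A2", 4, 5)]
print(satisfies({"A1": 0, "A2": 4.5}, constraints))
print(satisfies({"A1": 0, "A2": 6}, constraints))
```

A plain partial order "A1 before A2" is the special case with lo = 0 (or the duration of A1) and hi unbounded, which is the sense in which these constraints generalize partial orders.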

Problem Representation

Achievement goals are specified as a list of pairs <pi, ti>, where pi needs to hold by time ti.
  ti is the deadline by which pi must hold. It can be a metric time (e.g., make clear(b) true by 2pm).
  If ti is omitted, we assume that pi is a non-deadline goal (it must be true by the time the plan is done).

“Persist goals” are specified as a condition and an interval over which it must hold.
  A persist goal may be supported by different actions for the different parts of the interval (“goal reduction” a la ZENO).
  E.g., striking multiple matches to have light over a duration.

Plan Quality Measures

Makespan: clock time for the execution of the plan (more concurrency, lower makespan)
Slack: the difference between the deadline for a goal and the time by which the plan achieves it
  Tardiness is negative slack
  Optimize max/min/average slack/tardiness measures
Cost: sum of the costs of all the actions
  Can be split into multiple dimensions, one corresponding to each resource
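The slack and tardiness measures can be computed directly from deadlines and achievement times. This is a minimal sketch; clipping tardiness at zero follows common scheduling usage and is an assumption here, and the goal names and times are hypothetical.

```python
from typing import Dict

def slack(deadline: float, achieved_at: float) -> float:
    """Positive when the goal is achieved before its deadline."""
    return deadline - achieved_at

def tardiness(deadline: float, achieved_at: float) -> float:
    """Negative slack, clipped at zero (assumed convention)."""
    return max(0.0, achieved_at - deadline)

# Hypothetical deadlines and achievement times for two goals.
deadlines: Dict[str, float] = {"g1": 10, "g2": 20}
achieved: Dict[str, float] = {"g1": 7, "g2": 23}

slacks = {g: slack(deadlines[g], achieved[g]) for g in deadlines}
tardies = {g: tardiness(deadlines[g], achieved[g]) for g in deadlines}
print(slacks, tardies)
print(min(slacks.values()), max(tardies.values()),
      sum(slacks.values()) / len(slacks))
```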


Concurrency

Two actions are concurrent if their execution intervals overlap in time.
A plan is concurrent if it has concurrently executing actions.
  If the makespan of a plan is less than the sum of the durations of the actions in the plan, then the plan has concurrency.
A problem requires concurrency if every solution plan for the problem is concurrent.
  Note that a problem may have sequential solutions and still call for concurrent solutions for optimality reasons; such a problem does not require concurrency.
A domain requires concurrency if any of its problems requires concurrency.

One distinguishing feature of temporal planning domains is that they may have problems that require concurrency.
  Interesting factoid: several of the planners that won the temporal planning competition could not actually solve problems requiring concurrency.
  Another interesting factoid: most of the benchmark domains actually didn’t have problems that required concurrency.

[Cushing et al., IJCAI-07; ICAPS-07]

Looking at STRIPS Actions from the PDDL 2.1 Vantage Point

How best to view non-durative actions?
  Instantaneous
    Makes it hard to provide physical semantics (no change is instantaneous)
  Epsilon duration, with only over-all preconditions and at-end postconditions
    We can show that domains with this type of action can never have problems that require concurrency

TGP-style durative actions

A PDDL 2.1 action is a TGP-style durative action if
  all preconditions are “over all” preconditions, and
  all effects are “at end” effects.

It can be shown that domains in which all actions are TGP-style will not require concurrency.
  Concurrency may still be needed for makespan optimization.

Temporal Gap

A PDDL 2.1 style action is said to have a temporal gap if there is no single time point in the action where all the preconditions and effects of the action must hold.
  Epsilon-duration STRIPS actions have no temporal gap.
  TGP-style actions have no temporal gap.
    All the preconditions and effects must hold together at the end point of the action.

If none of the actions in a domain has a temporal gap, then that domain cannot have problems with required concurrency.
  “Duration” is then like a cost measure.
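The temporal-gap test can be sketched as a set intersection over the time annotations of an action. This is a simplified illustration, assuming each condition or effect is tagged "start", "end" or "overall", and treating "overall" as covering both end points (the reading used on this slide for TGP-style actions; the PDDL 2.1 semantics of "over all" is stricter). All names are hypothetical.

```python
from typing import Dict, Iterable, Set

# End points of the action at which an annotation requires its literal to hold
# (simplifying assumption: "overall" includes both end points).
POINTS: Dict[str, Set[str]] = {
    "start": {"start"},
    "end": {"end"},
    "overall": {"start", "end"},
}

def has_temporal_gap(annotations: Iterable[str]) -> bool:
    """True if no single end point satisfies every annotation."""
    common = {"start", "end"}
    for when in annotations:
        common &= POINTS[when]
    return not common

# burn_match: two at-start conditions, one at-start and one at-end effect.
burn_match = ["start", "start", "start", "end"]
# A TGP-style action: over-all preconditions, at-end effects.
tgp_style = ["overall", "overall", "end"]
print(has_temporal_gap(burn_match), has_temporal_gap(tgp_style))
```

Checking only the two end points is enough under this simplification, because an interior point can satisfy only "overall" annotations, and those are satisfied at the end points as well.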

Add…

The issue of time: dense vs. integer
Rintanen’s complexity issue: R.C. (required concurrency) with the same action..
Non-RC plans can be compiled 1-1
A huge modeling jump

Ended here..

Some Brand Names

Planners that can handle similar types of temporal and resource constraints: TLPlan, HSTS, IxTeT, Zeno, Sapa, LPG
  TLPlan and Sapa are progression-based planners
  HSTS, IxTeT and Zeno are partial-order-based planners
  TLPlan and HSTS are domain-customized planners; the rest are domain-independent

Planners that can handle a subset of constraints:
  Only temporal: TGP, TPG, LPGP
  Only resources: LPSAT, GRT-R, Kautz-Walser, Metric-FF
  Subset of temporal and resource constraints: TP4, Resource-IPP

LPGP and LPSAT are “loosely coupled” systems.
  LPSAT connects SAT and LP solvers; LPGP connects Graphplan and an LP solver
  The issue is how “tight” the loose connection is

TGP, TPG and LPGP are Graphplan-based
LPSAT is based on SAT encodings being sent to LP solvers
Kautz-Walser is based solely on LP encodings

State of the Art (as of IPC 2002; revised for IPC 2004)

At IPC 2002, the PDDL 2.1 standard had three levels
  Level 1: STRIPS/ADL
  Level 2: + Durative actions
    FF, LPG, Sapa, SGPlan (extends LPG)
  Level 3: + Numeric quantities with discrete change
    Sapa, LPG, SGPlan (extends LPG)
  Level 4: + Continuous change
    None at IPC
    Some planners can handle it “in theory” but none are scalable

Approaches for MTP

In theory, pretty much every one of the approaches we saw for classical planning can be (and has been) extended to MTP (with varying degrees of scalability).

There are some interesting tradeoffs:
  PO planners are the easiest to extend to support the concurrency needed for durative actions
    They have a harder time handling resources (because resource consumption depends on exactly what actions occurred before this time point)
  Progression planners are the easiest to extend to support resource-consuming actions
    But they have a harder time handling concurrency (need to consider “advancing the clock” as a separate option in addition to applying one of the actions)

Our Road Map

We will focus on conjunctive planning approaches, with special attention to Sapa:
  Action models: using the PDDL 2.1 standard
  How to model the search: progression; regression; PO planning
  How to extract good heuristics

Done

Action Representation

[Figure: the Flying action, with labels (in-city ?airplane ?city1), (fuel ?airplane) > 0, (in-city ?airplane ?city1), (in-city ?airplane ?city2), consume (fuel ?airplane).]

Durative, with E_A = S_A + D_A (end time = start time + duration).
Instantaneous effects e occur at time t_e = S_A + d, 0 <= d <= D_A.
Preconditions need to be true at the starting point, and protected during a period of time d, 0 <= d <= D_A.
An action can consume or produce a continuous amount of some resource.

Action conflicts:
  Consuming the same resource
  One action’s effect conflicting with another’s precondition or effect

Digression: Concurrent vs. Parallel plans

The main difference with temporal planning is that we need to produce concurrent plans.
  In the context of classical planning, concurrent plans are akin to parallel plans (a la Graphplan).
  This analogy is not complete, of course. For every solvable problem in classical planning, there is guaranteed to be a sequential plan. This guarantee does not hold for temporal planning (which means we have to search in the space of concurrent plans).

Progression planners that we have seen until now produce sequential plans (FF does not produce parallel plans!).
  FF is still complete because in classical planning there is always a sequential plan for every solvable problem.

So, we can start by asking what we need to do to make progression produce parallel plans.

Digression: How to produce parallel plans with progression?

The naïve idea is to project over subsets of non-interfering actions (rather than single actions).
  Problem: exponential branching factor

A better idea: consider “fattening” as well as “lengthening” the current partial plan as two options.
  We start by representing the state of a partial plan prefix as [S, {A1…Ak}], where S is the current state and {A1…Ak} are the mutually non-interfering actions that we have already committed to applying at S.
    Notice that this is just a generalization of the normal progression state, in which the action set {A1…Ak} is a singleton.
  Given a state [S, {A1…Ak}] to expand, we have two (backtrackable) choices:
    Fatten: consider applying another action B in state S [one branch for each possible action B].
      For this to be feasible, B should be applicable in S and B should not interfere with A1…Ak. The resulting state will be [S, {A1…Ak, B}].
    Lengthen: consider applying an action C in the state S’ which is obtained by applying actions {A1…Ak} in S [one branch for each applicable action].
      For this to be feasible, C should be applicable in S’. The resulting state is [S’, {C}].

Notice that
  Fattening is only done at the current state (once lengthening is done, the current state changes, so any new fattening will be done at the new state).
  Normal progression always selects “Lengthen”. The addition needed to support parallel plans is the “Fatten” branch.
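The fatten/lengthen branching can be sketched as a successor function over nodes [S, {A1…Ak}]. This is a minimal illustration, assuming STRIPS-style actions with precondition, add and delete sets, and assuming two actions interfere when one deletes a precondition or add effect of the other. The class, the function names and the small truck domain are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

@dataclass(frozen=True)
class Action:
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]

Node = Tuple[FrozenSet[str], FrozenSet[Action]]  # [S, {A1...Ak}]

def interfere(a: Action, b: Action) -> bool:
    """Assumed interference test: one deletes what the other needs or adds."""
    return bool(a.delete & (b.pre | b.add)) or bool(b.delete & (a.pre | a.add))

def apply_all(state: FrozenSet[str], actions: FrozenSet[Action]) -> FrozenSet[str]:
    # Order is irrelevant because the committed actions do not interfere.
    for a in actions:
        state = (state - a.delete) | a.add
    return state

def successors(node: Node, actions: List[Action]) -> List[Tuple[str, Node]]:
    state, committed = node
    result: List[Tuple[str, Node]] = []
    # Fatten: add one more non-interfering action B at the same state S.
    for b in actions:
        if (b not in committed and b.pre <= state
                and not any(interfere(b, a) for a in committed)):
            result.append(("fatten", (state, committed | {b})))
    # Lengthen: move to S' and start a new action set {C} there.
    next_state = apply_all(state, committed)
    for c in actions:
        if c.pre <= next_state:
            result.append(("lengthen", (next_state, frozenset({c}))))
    return result

def fs(*items: str) -> FrozenSet[str]:
    return frozenset(items)

load_a = Action("load_a", fs("a_at_x", "truck_at_x"), fs("a_in_truck"), fs("a_at_x"))
load_b = Action("load_b", fs("b_at_x", "truck_at_x"), fs("b_in_truck"), fs("b_at_x"))
drive = Action("drive", fs("truck_at_x"), fs("truck_at_y"), fs("truck_at_x"))

start = (fs("a_at_x", "b_at_x", "truck_at_x"), frozenset({load_a}))
for kind, (_, acts) in successors(start, [load_a, load_b, drive]):
    print(kind, sorted(a.name for a in acts))
```

In this example drive cannot be fattened in next to load_a, because it deletes truck_at_x, which load_a needs; it only appears as a lengthen branch.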

Digression: Generating concurrent plans is similar to generating parallel plans…

To generate concurrent plans using progression, we start with the idea of generating parallel plans with progression.
  For parallel plans, the “state of the partial plan” is represented by [S, {A1…Ak}].
  For temporal concurrent plans, we need to generalize this to account for the fact that
    1. Each action may have a different duration.
    2. Actions may have effects that are realized at different time points in the future.
       This means that some actions that we have committed to applying at previous states may wind up posting their effects now.

The solution is to start thinking in terms of a “current time stamp”, together with information about the set of durative actions that we have committed to apply whose effects have not yet been realized.
  We can either add additional non-interfering actions at the current time stamp OR advance the time stamp (to the nearest future time where new effects of already committed actions can be realized).

State-Space Search: Search is through time-stamped states

Search states should have information about
  what conditions hold at the current time slice (P, M below)
  what actions we have already committed to put into the plan (Π, Q below)

S = (P, M, Π, Q, t)
  P: set <pi, ti> of predicates pi and the time of their last achievement ti < t.
  M: set of functions representing resource values.
  Π: set of protected persistent conditions (could be binary or resource conditions).
  Q: event queue (contains resource as well as binary fluent events).
  t: time stamp of S.

In the initial state, P and M are non-empty; Q is non-empty if we have exogenous events.

(:durative-action burn_match
  :parameters ()
  :duration (= ?duration 15)
  :condition (and (at start have_match)
                  (at start have_strikepad))
  :effect (and (at start have_light)
               (at end (not have_light))))

(:durative-action cross_cellar
  :parameters ()
  :duration (= ?duration 10)
  :condition (and (at start have_light)
                  (over all have_light)
                  (at start at_steps))
  :effect (and (at start (not at_steps))
               (at start crossing)
               (at end at_fuse_box)))

Let the current state S be
  P: {have_light@0; at_steps@0}
  Q: {~have_light@15}
  t: 0
(presumably after doing the light-match action, i.e., burn_match). Applying cross_cellar to this state gives S’:
  P: {have_light@0; crossing@0}
  Π: {have_light, <0,10>}
  Q: {at_fuse_box@10; ~have_light@15}
  t: 0

[Figure: timeline with Light-match (ending at 15) and Cross-cellar (ending at 10), with the current time stamp marked.]
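The state update in this example can be reproduced with a small sketch. It assumes literals are strings with a "~" prefix for deletion, omits the resource component M, and implements only part of the applicability test (at-start conditions, and no queued event deleting an over-all condition before the action ends). The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class State:
    P: Dict[str, float]                 # fluent -> time of last achievement
    Pi: List[Tuple[str, float, float]]  # (fluent, from, to) protected conditions
    Q: List[Tuple[float, str]]          # (time, literal) pending events
    t: float                            # time stamp

@dataclass
class Durative:
    name: str
    duration: float
    cond_start: List[str]
    cond_overall: List[str]
    eff_start: List[str]
    eff_end: List[str]

def applicable(s: State, a: Durative) -> bool:
    """Partial test: at-start conditions hold, and no queued event deletes
    an over-all condition before the action ends."""
    if any(c not in s.P for c in a.cond_start):
        return False
    end = s.t + a.duration
    return not any(lit.startswith("~") and lit[1:] in a.cond_overall and time < end
                   for time, lit in s.Q)

def apply(s: State, a: Durative) -> State:
    P = dict(s.P)
    for lit in a.eff_start:             # instantaneous (at-start) effects
        if lit.startswith("~"):
            P.pop(lit[1:], None)
        else:
            P[lit] = s.t
    end = s.t + a.duration
    Pi = s.Pi + [(c, s.t, end) for c in a.cond_overall]   # protect over-all conditions
    Q = sorted(s.Q + [(end, lit) for lit in a.eff_end])   # queue delayed effects
    return State(P, Pi, Q, s.t)

cross_cellar = Durative("cross_cellar", 10,
                        cond_start=["have_light", "at_steps"],
                        cond_overall=["have_light"],
                        eff_start=["~at_steps", "crossing"],
                        eff_end=["at_fuse_box"])

s = State(P={"have_light": 0, "at_steps": 0}, Pi=[], Q=[(15, "~have_light")], t=0)
if applicable(s, cross_cellar):
    print(apply(s, cross_cellar))
```

Applying the action does not move the time stamp; only the separate clock-advance step does that.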

“Advancing” the clock as a device for concurrency control

To support concurrency, we need to consider advancing the clock. How far should we advance it?
  One shortcut is to advance the clock to the time of the earliest event in the event queue, since this is the least advance needed to make changes to P and M of S.
    At this point, all the events happening at that time point are transferred from Q to P and M (to signify that they have happened).
  This strategy will find “a” plan for every problem, but will have the effect of enforcing concurrency by making the concurrent actions “align on the left end”.
    In the candle/cellar example, we will find plans where the cross-cellar action starts right when the light-match action starts.
    If we need slack in the start times, we will have to post-process the plan.
    If we want plans with arbitrary slack on start times to appear in the search space, we will have to consider advancing the clock by arbitrary amounts (even if it changes nothing in the state other than the clock time itself).

[Figure: timeline with Light-match (ending at 15 with ~have-light) and Cross-cellar (ending at 10).]

In the cellar plan above, the clock, if advanced, will be advanced to 15, where an event (~have-light) will occur. This means cross-cellar can be started either at 0 or at 15 (and the latter makes no sense).
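The "advance to the earliest event" shortcut can be sketched as follows. This is a minimal illustration, assuming events are (time, literal) pairs with a "~" prefix for deletions; it handles only the binary fluents in P and ignores M and the protected conditions. The function name is hypothetical.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, str]  # (time, literal); "~f" deletes fluent f

def advance_clock(P: Dict[str, float], Q: List[Event],
                  t: float) -> Tuple[Dict[str, float], List[Event], float]:
    """Move the time stamp to the earliest event in Q and transfer every
    event scheduled at that time from Q into P."""
    if not Q:
        return P, Q, t
    next_t = min(time for time, _ in Q)
    P = dict(P)
    remaining: List[Event] = []
    for time, lit in Q:
        if time == next_t:
            if lit.startswith("~"):
                P.pop(lit[1:], None)
            else:
                P[lit] = time
        else:
            remaining.append((time, lit))
    return P, remaining, next_t

# State after applying cross_cellar at time 0 in the cellar example.
P0 = {"have_light": 0, "crossing": 0}
Q0 = [(10, "at_fuse_box"), (15, "~have_light")]
P1, Q1, t1 = advance_clock(P0, Q0, 0)
P2, Q2, t2 = advance_clock(P1, Q1, t1)
print(t1, P1, Q1)
print(t2, P2, Q2)
```

Because the clock only ever jumps to event times, a new action can only start at one of those times, which is the "align on the left end" behavior described above.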

Search Algorithm (cont.)

Goal satisfaction: S = (P, M, Π, Q, t) satisfies G if for every <pi, ti> in G, either
  there is a <pi, tj> in P with tj < ti and no event in Q deletes pi, or
  there is an event e in Q that adds pi at time te < ti.

Action application: action A is applicable in S if
  all instantaneous preconditions of A are satisfied by P and M,
  A’s effects do not interfere with Π and Q,
  no event in Q interferes with the persistent preconditions of A, and
  A does not lead to concurrent resource change.

When A is applied to S:
  P is updated according to A’s instantaneous effects.
  Persistent preconditions of A are put in Π.
  Delayed effects of A are put in Q.

[Figure: the Flying action applied to a state S = (P, M, Π, Q, t).]

Search: pick a state S from the queue. If S satisfies the goals, end. Else non-deterministically do one of
  advance the clock (by executing the earliest event in the Q of S), or
  apply one of the applicable actions to S.

[TLplan; Sapa; 2001]
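The goal-satisfaction test can be sketched directly from its two cases. This is a minimal illustration, assuming goals map each fluent to its deadline, P maps fluents to their last achievement time, and Q holds (time, literal) events with a "~" prefix for deletions; resource goals and M are left out, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, str]

def satisfies_goals(P: Dict[str, float], Q: List[Event],
                    goals: Dict[str, float]) -> bool:
    for fluent, deadline in goals.items():
        # Case 1: already achieved before the deadline, and nothing queued deletes it.
        in_P = (fluent in P and P[fluent] < deadline
                and not any(lit == "~" + fluent for _, lit in Q))
        # Case 2: a queued event will add it before the deadline.
        in_Q = any(lit == fluent and time < deadline for time, lit in Q)
        if not (in_P or in_Q):
            return False
    return True

# State after applying cross_cellar at time 0 in the cellar example.
P = {"have_light": 0, "crossing": 0}
Q = [(10, "at_fuse_box"), (15, "~have_light")]
print(satisfies_goals(P, Q, {"at_fuse_box": 12}))
print(satisfies_goals(P, Q, {"at_fuse_box": 10}))
print(satisfies_goals(P, Q, {"have_light": 5}))
```

The second case means a state can be recognized as a goal state before the clock reaches the achieving event, since the committed actions already guarantee it.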
