Strong-Cyclic Planning When Fairness is Not a Valid Assumption

Alberto Camacho, Sheila A. McIlraith

Department of Computer Science, University of Toronto, Canada

{acamacho,sheila}@cs.toronto.edu

KnowProS, July 10, 2016

Take Home Message

Motivation

Soundness of standard strong-cyclic solutions to Fully Observable Non-Deterministic (FOND) planning problems is guaranteed only when the fairness assumption holds.

Approach

We introduce L-fairness, a concept that generalizes the classical fairness assumption.

Contribution

We define the FOND+ class of planning problems, in which soundness of solutions is predicated on the L-fairness assumption.

We identify a class of FOND+ solutions that are also solutions to 1-primary normative fault-tolerant planning problems.

We present different algorithms to solve FOND+ problems.

Camacho and McIlraith: Strong-Cyclic Planning When Fairness is Not a Valid Assumption 2 / 21

Non-Deterministic Planning

Non-Deterministic Planning Domain D = 〈F, S, A, T〉

F: finite set of propositions

S: finite set of states, S ⊆ 2^F

A: set of actions a = 〈Pre_a, Eff_a〉

  Preconditions Pre_a

  Non-deterministic effects Eff_a = 〈Eff_a^1, ..., Eff_a^n〉

T : S × A → 2^S is the transition function

  If s′ ∈ T(s, a), then s′ = Prog(s, a, Eff_a^i) for some Eff_a^i ∈ Eff_a

We write a state transition as (s, a, s′).
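A minimal Python sketch of these definitions may help fix the notation. It assumes an encoding that is not in the slides: states are frozensets of propositions, and each effect is an (add, delete) pair of proposition sets. The names `Action`, `prog`, and `transitions` are illustrative only, and `prog` drops the action argument of Prog(s, a, Eff_a^i) because the effect already carries all the information it needs here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple

State = FrozenSet[str]
# Assumed encoding of one effect: (propositions added, propositions deleted)
Effect = Tuple[FrozenSet[str], FrozenSet[str]]


@dataclass(frozen=True)
class Action:
    name: str
    pre: FrozenSet[str]          # Pre_a
    effects: Tuple[Effect, ...]  # Eff_a = <Eff_a^1, ..., Eff_a^n>


def prog(s: State, eff: Effect) -> State:
    """Progress state s through one effect: delete first, then add."""
    add, delete = eff
    return frozenset((s - delete) | add)


def transitions(s: State, a: Action) -> Set[State]:
    """T(s, a): one successor per non-deterministic effect of a."""
    if not a.pre <= s:
        return set()  # a is not applicable in s
    return {prog(s, eff) for eff in a.effects}


# pick-up-block(A, B), encoded from the Blocksworld action in this talk
pick_up = Action(
    name="pick-up-block(A,B)",
    pre=frozenset({"handempty", "on-block(A,B)"}),
    effects=(
        (frozenset({"holding(A)"}), frozenset({"handempty"})),
        (frozenset({"on-table(A)"}), frozenset({"on-block(A,B)"})),
    ),
)
s0 = frozenset({"handempty", "on-block(A,B)", "on-table(B)"})
successors = transitions(s0, pick_up)  # two states, one per effect
```

Each element of `successors` together with `s0` and `pick_up` forms one state transition (s, a, s′).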

In our paper, we address two classes of non-deterministic planning problems:

Fully Observable Non-Deterministic (FOND) Planning

Fault-Tolerant Planning

FOND Planning

FOND Planning Problem P = 〈D, s0, SG 〉

D = 〈F , S ,A,T 〉 is a non-deterministic planning domain

s0 ∈ S initial state

SG ⊆ S goal states

Solutions are policies, or mappings from states into actions.

weak solutions

strong solutions

strong-cyclic solutions

Solutions to a FOND Problem (cf. [Cimatti et al., 2003])

Weak Solutions

Weak solutions are plans that may achieve the goal, but offer no guarantee of doing so.

Strong Solutions

Strong solutions guarantee goal achievement in all executions.

Strong-Cyclic Solutions

Strong-cyclic solutions guarantee goal achievement, provided that all executions are fair.

An execution σ is unfair when a state-action tuple (s, a) appears infinitely often in σ, but the transition (s, a, s′) occurs only a finite number of times for some outcome s′ ∈ T(s, a). Executions that are not unfair are said to be fair.

cf. [Cimatti et al., 2003]
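The three solution classes can be told apart on the graph induced by a policy. The sketch below is one possible check, not code from the paper: it assumes a finite state space and a hypothetical dictionary `succ` that maps each non-goal state to the set of successors of the action the policy picks there.

```python
from typing import Dict, Hashable, Set

State = Hashable


def reachable(s0: State, succ: Dict[State, Set[State]], goals: Set[State]) -> Set[State]:
    """States visited by some execution of the policy from s0."""
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        if s in goals:
            continue  # executions stop at goal states
        for t in succ.get(s, set()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def classify(s0: State, succ: Dict[State, Set[State]], goals: Set[State]) -> str:
    """Return 'strong', 'strong-cyclic', 'weak', or 'none'."""
    states = reachable(s0, succ, goals)
    some = states & set(goals)   # states with at least one execution to a goal
    every = states & set(goals)  # states whose executions all reach a goal
    changed = True
    while changed:
        changed = False
        for s in states:
            outs = succ.get(s, set())
            if s not in some and any(t in some for t in outs):
                some.add(s)
                changed = True
            if s not in every and outs and all(t in every for t in outs):
                every.add(s)
                changed = True
    if s0 in every:
        return "strong"
    if states <= some:
        return "strong-cyclic"  # the goal stays reachable from every reachable state
    if s0 in some:
        return "weak"
    return "none"


retry = {"s0": {"s0", "g"}}  # the action may leave the state unchanged
kind = classify("s0", retry, {"g"})
```

Here `kind` is "strong-cyclic": the goal remains reachable from every reachable state, but an execution that repeats the s0 loop forever never reaches it, and only the fairness assumption rules that execution out.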

Fault-Tolerant Planning

Fault-Tolerant Planning Problem P = 〈D, s0, SG , F , κ〉

D = 〈F , S ,A,T 〉 is a non-deterministic planning domain

s0 ∈ S initial state

SG ⊆ S goal states

F is an exception model

κ is an integer parameter

F : ⋃_{a∈A} Eff_a → ℕ is an exception model:

F(e) > 0 when the effect is faulty

F(e) = 0 when the effect is normative

If |{e | F(e) = 0, e ∈ Eff_a}| = 1 for all a ∈ A, then the problem is 1-primary

cf. [Jensen et al., 2004, Domshlak, 2013]

Solutions to Fault-Tolerant Planning Problems

κ-admissible Executions

A state-effect execution (s0, e0, ..., s_i, e_i, ...) is κ-admissible when Σ_i F(e_i) ≤ κ.

Solutions are κ-Plans

A policy is a κ-plan when all κ-admissible executions are finite and reach the goal.
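As a small illustration of the exception model F, the 1-primary condition, and κ-admissibility, the following sketch treats F as a dictionary from effect names to fault costs. The action and effect names ("grasp", "slip", "place") are hypothetical, and only finite executions are handled.

```python
from typing import Dict, List, Sequence


def is_one_primary(effects_of: Dict[str, List[str]], fault: Dict[str, int]) -> bool:
    """True when every action has exactly one normative effect, i.e. F(e) = 0."""
    return all(
        sum(1 for e in effects if fault[e] == 0) == 1
        for effects in effects_of.values()
    )


def is_kappa_admissible(effect_trace: Sequence[str], fault: Dict[str, int], kappa: int) -> bool:
    """True when the sum of F(e_i) along the execution is at most kappa."""
    return sum(fault[e] for e in effect_trace) <= kappa


effects_of = {"pick-up": ["grasp", "slip"], "put-down": ["place"]}
fault = {"grasp": 0, "slip": 1, "place": 0}  # assumed exception model F

trace = ["slip", "slip", "grasp", "place"]   # effects e_0, e_1, ... of one execution
primary = is_one_primary(effects_of, fault)
within_one = is_kappa_admissible(trace, fault, kappa=1)
within_two = is_kappa_admissible(trace, fault, kappa=2)
```

The trace accumulates two faults, so it is 2-admissible but not 1-admissible; a 1-plan makes no promise about it.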

Motivation

Blocksworld domain:

Initial state: {on-block(A,B), on-table(B), handempty}

Actions:

pick-up-block(?b, ?from):
  Pre = {handempty, on-block(?b, ?from)}
  Eff_1 = {holding(?b) ∧ ¬handempty}
  Eff_2 = {on-table(?b) ∧ ¬on-block(?b, ?from)}

put-block-on-table(?b):
  Pre = {holding(?b)}
  Eff = {on-table(?b) ∧ ¬holding(?b)}

put-on-block(?b1, ?b2):
  Pre = {handempty ∧ clear(?b2)}
  Eff = {on-block(?b1, ?b2) ∧ ¬handempty}

Goal condition: {on-table(A)}

[Figure: two Blocksworld plans drawn with blocks A and B. In the first, goal achievement is predicated on fairness. In the second, goal achievement is not predicated on fairness.]

Desired Solutions

Guarantees vs. no guarantees of occurrence:

solutions should not rely on effects whose occurrence is not guaranteed

Normative vs. faulty behaviour:

solutions need to achieve the goal when the system manifests its normative behaviour

Outline

1 Background in Non-Deterministic Planning

2 The Model: FOND+

3 Algorithms to solve FOND+

4 Experimental Results

5 Conclusions

L-fair Executions

For a labeling function L : S × A × S → {F, U}, we say that an execution in state s0 is L-unfair when there exists a state-action tuple (s, a) such that

(s, a) appears infinitely often, and

there exists a transition (s, a, s′) such that L(s, a, s′) = F and (s, a, s′) occurs a finite number of times.

Executions that are not L-unfair are said to be L-fair.

Note that fairness, as defined by [Cimatti et al., 2003], is a particular case of L-fairness that occurs when L assigns F to all transitions.
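The definition can be tested on infinite executions that eventually repeat a finite cycle forever. The sketch below rests on that assumption: a tuple (s, a) appears infinitely often exactly when it lies on the cycle, and a transition occurs infinitely often exactly when it lies on the cycle. The function name and the dictionary encodings of T and L are illustrative.

```python
from typing import Dict, Hashable, List, Set, Tuple

State = Hashable
Step = Tuple[State, str, State]  # a transition (s, a, s')


def is_l_unfair(
    cycle: List[Step],
    T: Dict[Tuple[State, str], Set[State]],
    L: Dict[Step, str],
) -> bool:
    """L-unfairness of an execution that ends by repeating `cycle` forever."""
    on_cycle = set(cycle)
    for (s, a, _) in cycle:
        for s_next in T[(s, a)]:
            # An F-labeled outcome of a repeated (s, a) that is off the cycle
            # occurs only finitely often.
            if L[(s, a, s_next)] == "F" and (s, a, s_next) not in on_cycle:
                return True
    return False


T = {("s", "try"): {"s", "g"}}
loop = [("s", "try", "s")]  # apply "try" in s and stay in s, forever

all_fair = {("s", "try", "s"): "F", ("s", "try", "g"): "F"}
goal_unfair = {("s", "try", "s"): "F", ("s", "try", "g"): "U"}

classic = is_l_unfair(loop, T, all_fair)
relaxed = is_l_unfair(loop, T, goal_unfair)
```

With every transition labeled F the loop is unfair, so classical fairness excludes it. Once the outcome leading to g is labeled U the same loop is L-fair, and a policy that relies on that outcome no longer guarantees the goal.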

Planning With Unfair Non-Determinism

FOND+ Planning Problem P = 〈D, s0, SG , L〉

D = 〈F , S ,A,T 〉 is a non-deterministic planning domain

s0 ∈ S is the initial state

SG ⊆ S is a set of goal states

L : S ×A× S → {F, U} is a labeling function

Solutions

Solutions to a FOND+ problem P = 〈D, s0, SG, L〉 are policies that guarantee goal achievement, predicated on the assumption that all executions of D in s0 are L-fair.
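One way to test whether a given policy meets this definition on a finite state space is sketched below. It is a reading of the definition and not an algorithm from the slides: the policy fails if it can reach a non-goal dead end, or if there is a "trap", a set of non-goal states in which an execution can stay forever while every F-labeled outcome also stays inside. The encodings `succ` (successors of the policy's action in each state) and `label` (the label of each such transition) are assumptions.

```python
from typing import Dict, Hashable, Set, Tuple

State = Hashable


def is_fond_plus_solution(
    s0: State,
    succ: Dict[State, Set[State]],
    label: Dict[Tuple[State, State], str],
    goals: Set[State],
) -> bool:
    # States reachable from s0 under the policy (executions stop at goals).
    states, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        if s in goals:
            continue
        if not succ.get(s):
            return False  # reachable non-goal dead end
        for t in succ[s]:
            if t not in states:
                states.add(t)
                stack.append(t)

    # Shrink the candidate trap until it is stable (a greatest fixpoint).
    trap = {s for s in states if s not in goals}
    changed = True
    while changed:
        changed = False
        for s in list(trap):
            stays = any(t in trap for t in succ[s])
            fair_exit = any(label[(s, t)] == "F" and t not in trap for t in succ[s])
            if fair_exit or not stays:
                trap.discard(s)
                changed = True
    return not trap  # no trap: every L-fair execution reaches a goal


succ = {"s": {"s", "g"}}
goals = {"g"}
fair_goal = is_fond_plus_solution("s", succ, {("s", "s"): "U", ("s", "g"): "F"}, goals)
unfair_goal = is_fond_plus_solution("s", succ, {("s", "s"): "F", ("s", "g"): "U"}, goals)
```

In the example, the policy is a solution when the outcome that reaches g is labeled F, and it is not a solution when that outcome is labeled U.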

Classes of FOND+ Solutions

Strictly Fair

A solution π to a FOND+ problem is strictly fair when all transitions t produced by L-fair plan executions have L(t) = F.

Strictly Unfair

A solution π to a FOND+ problem is strictly unfair when all transitions t produced by L-fair plan executions have L(t) = U.

Mixed

A solution π to a FOND+ problem is mixed when it is neither strictly fair nor strictly unfair.

FOND+ and Fault-Tolerant Planning

Normative Solutions

A FOND+ solution π is normative when, in each state s reachable by π:

there exists a plan execution in s that reaches the goal and in which all transitions t have L(t) = F, and

exactly one outcome of s by π(s) produces a transition t with L(t) = F.

Normative Solutions are Fault-Tolerant

Normative solutions to a FOND+ problem P = 〈D, s0, SG, L〉 are also 1-primary normative solutions to fault-tolerant planning problems P′ = 〈D, s0, SG, F, κ〉 s.t. F(e) = 0 (resp. F(e) > 0) when e produces a transition (s, a, s′) such that L(s, a, s′) = F (resp. L(s, a, s′) = U).

Normative FOND+ solutions are robust to the occurrence of any possible number of faults during execution, as opposed to standard fault-tolerant solutions.
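The two conditions that make a solution normative can be checked directly on the policy graph. The sketch below uses assumed encodings (`succ` for the successors of π(s) in s, `label` for the label of each transition) and checks only those two conditions; whether the policy is a FOND+ solution at all is a separate question.

```python
from typing import Dict, Hashable, Set, Tuple

State = Hashable


def is_normative(
    s0: State,
    succ: Dict[State, Set[State]],
    label: Dict[Tuple[State, State], str],
    goals: Set[State],
) -> bool:
    norm: Dict[State, State] = {}  # the unique F-labeled successor of each state
    seen, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        if s in goals:
            continue
        fair = [t for t in succ.get(s, set()) if label[(s, t)] == "F"]
        if len(fair) != 1:
            return False  # second condition: exactly one F-labeled outcome
        norm[s] = fair[0]
        for t in succ[s]:
            if t not in seen:
                seen.add(t)
                stack.append(t)

    # First condition: following only F-labeled outcomes reaches a goal.
    for s in norm:
        visited, cur = set(), s
        while cur not in goals:
            if cur in visited:
                return False  # the all-F execution cycles without reaching a goal
            visited.add(cur)
            cur = norm[cur]
    return True


succ = {"s0": {"s1", "s0"}, "s1": {"g", "s0"}}
label = {("s0", "s1"): "F", ("s0", "s0"): "U", ("s1", "g"): "F", ("s1", "s0"): "U"}
normative = is_normative("s0", succ, label, {"g"})
```

In the example the F-labeled outcomes lead s0 to s1 to g, while the U-labeled outcomes (faults, in the fault-tolerant reading) send the execution back to s0 any number of times.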

Algorithm to Find Strictly Fair Solutions

For a FOND+ problem P, the algorithm consists of two steps:

1 P is relaxed into a FOND problem P′ = 〈D′, s0, SG〉. D′ is like D, but the actions applicable in a given state s are restricted to those actions a that only yield transitions (s, a, s′) labeled with L(s, a, s′) = F.

2 A sound and complete strong-cyclic FOND planner (e.g., PRP [Muise et al., 2012]) is used to search for a strong-cyclic solution to P′, which is returned as a strictly fair solution to P.

Theorem

The algorithm is sound and complete.
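The relaxation in step 1 amounts to filtering the applicable actions of each state. A minimal sketch over an explicit state space follows; the encodings `applicable`, `T`, and `L` are hypothetical, and the planner call of step 2 is not shown.

```python
from typing import Dict, Hashable, List, Set, Tuple

State = Hashable


def restrict_actions(
    applicable: Dict[State, List[str]],
    T: Dict[Tuple[State, str], Set[State]],
    L: Dict[Tuple[State, str, State], str],
    label: str = "F",
) -> Dict[State, List[str]]:
    """Keep, in each state, the actions whose transitions all carry `label`."""
    return {
        s: [a for a in actions if all(L[(s, a, t)] == label for t in T[(s, a)])]
        for s, actions in applicable.items()
    }


applicable = {"s": ["safe", "risky"]}
T = {("s", "safe"): {"g"}, ("s", "risky"): {"g", "s"}}
L = {
    ("s", "safe", "g"): "F",
    ("s", "risky", "g"): "F",
    ("s", "risky", "s"): "U",
}
relaxed = restrict_actions(applicable, T, L, label="F")
```

The action "risky" is dropped because one of its transitions is labeled U. Calling the same function with `label="U"` would give the restriction to actions whose transitions are all labeled U.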

Algorithm to Find Strictly Unfair Solutions

For a FOND+ problem P, the algorithm consists of two steps:

1 P is relaxed into a FOND problem P′ = 〈D′, s0, SG〉. D′ is like D, but the actions applicable in a given state s are restricted to those actions a that only yield transitions (s, a, s′) labeled with L(s, a, s′) = U.

2 A sound and complete strong FOND planner (e.g., [Jaramillo et al., 2014]) is used to search for a strong solution to P′, which is returned as a strictly unfair solution to P.

Theorem

The algorithm is sound and complete.

Algorithm to Find Normative Solutions

Three basic steps (also in PRP):

Step 1: Search for a plan in the all-outcomes determinization of the problem (i.e., ignore non-determinism).

Step 2: Select a state that results from non-determinism, and search for a plan to the Goal or to a previously resolved state.

Step 3: Repeat Step 2 until convergence.

[Figure: states Init, S1, S2, and Goal linked by actions a1, a2, a3; outcomes not yet explored are marked "?", and a state S3 appears once Step 2 is applied.]

The difference from PRP lies in how the open list of states is ordered.

In PRP: First-In, Last-Out

In our algorithm: exploration of states produced by normative effects has preference.

Theorem

Algorithm is sound and complete.
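The change to the open list can be pictured as a priority queue in which states produced by normative effects are popped first. The class below is only an illustration of that ordering and is not taken from PRP or from the authors' code; in particular, breaking ties by insertion order is an assumption, since the slides do not say how states of equal priority are ordered.

```python
import heapq
from itertools import count
from typing import Hashable, List, Tuple


class OpenList:
    """Open list that prefers states produced by normative effects."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Hashable]] = []
        self._tie = count()  # insertion counter, used only to break ties

    def push(self, state: Hashable, normative: bool) -> None:
        priority = 0 if normative else 1  # lower value is popped first
        heapq.heappush(self._heap, (priority, next(self._tie), state))

    def pop(self) -> Hashable:
        return heapq.heappop(self._heap)[2]

    def __bool__(self) -> bool:
        return bool(self._heap)


open_list = OpenList()
open_list.push("after-fault-1", normative=False)
open_list.push("after-normative", normative=True)
open_list.push("after-fault-2", normative=False)

order = []
while open_list:
    order.append(open_list.pop())
```

The state produced by the normative effect is explored first even though it was inserted second.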

Objectives of the Experiments

Two Main Objectives:

Test the efficiency of one of our algorithms

Evaluate characteristics (planner run time and policy size) of normative solutions

Procedure:

Compute normative solutions to FOND+ problems

Compute strong-cyclic solutions to FOND problems, using the PRP planner [Muise et al., 2012]

Blocksworld Problems

Blocksworld problems from [Muise et al., 2012], with actions:

pick-up-block(?b, ?from):
  Pre = {handempty, on-block(?b, ?from)}
  Eff_1 = {holding(?b) ∧ ¬handempty}
  Eff_2 = {on-table(?b) ∧ ¬on-block(?b, ?from)}

put-block-on-table(?b):
  Pre = {holding(?b)}
  Eff = {on-table(?b) ∧ ¬holding(?b)}

put-on-block(?b1, ?b2):
  Pre = {handempty ∧ clear(?b2)}
  Eff_1 = {on-block(?b1, ?b2) ∧ ¬handempty}
  Eff_2 = {on-table(?b1) ∧ ¬handempty}

In FOND+ problems we consider:

Eff 1 is a normative effect

Eff 2 is a faulty effect

Results

             Strong-Cyclic         Normative
problem      run-time   size       run-time   size
p2           0          3          0          3
p3           0.002      5          0.016      5
p4           0.020      11         0.048      11
p5           0.070      27         0.178      27
p6           0.110      39         0.296      39
p7           0.114      32         0.270      32
p8           0.150      26         0.356      26
p9           0.278      46         0.664      46
p10          0.336      49         0.782      49
p11          0.522      120        1.936      97
p12          0.626      97         1.840      119.5
p13          0.682      57         1.810      57
p14          3.794      1117       37.10      1123
p15          1.500      278        7.814      278
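The table can be summarised with a few lines of arithmetic. The sketch below copies the reported numbers as they appear in the table and computes two derived quantities that are not stated in the slides: how many problems yield policies of equal size, and the run-time ratio of the normative search to the strong-cyclic search (skipping p2, whose reported run-time is 0).

```python
# problem: (strong-cyclic run-time, strong-cyclic size, normative run-time, normative size)
rows = {
    "p2": (0, 3, 0, 3),
    "p3": (0.002, 5, 0.016, 5),
    "p4": (0.020, 11, 0.048, 11),
    "p5": (0.070, 27, 0.178, 27),
    "p6": (0.110, 39, 0.296, 39),
    "p7": (0.114, 32, 0.270, 32),
    "p8": (0.150, 26, 0.356, 26),
    "p9": (0.278, 46, 0.664, 46),
    "p10": (0.336, 49, 0.782, 49),
    "p11": (0.522, 120, 1.936, 97),
    "p12": (0.626, 97, 1.840, 119.5),
    "p13": (0.682, 57, 1.810, 57),
    "p14": (3.794, 1117, 37.10, 1123),
    "p15": (1.500, 278, 7.814, 278),
}

# Problems where both planners return policies of the same size
same_size = [p for p, (_, sc_size, _, n_size) in rows.items() if sc_size == n_size]

# Run-time ratio normative / strong-cyclic, where the denominator is non-zero
ratio = {p: n_time / sc_time for p, (sc_time, _, n_time, _) in rows.items() if sc_time > 0}
largest = max(ratio, key=ratio.get)

print(f"{len(same_size)} of {len(rows)} problems have equal policy size")
print(f"largest run-time ratio: {largest} ({ratio[largest]:.1f}x)")
```

On these numbers the policy sizes coincide for most problems, while the normative search takes longer on every problem with a non-zero reported run-time.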

Summary and Future Work

Strong-cyclic planning does not guarantee goal achievement in problems where the fairness assumption is not valid

We introduced L-fairness and the FOND+ model

We identified a connection between FOND+ and 1-primary normative fault-tolerant planning

We introduced algorithms to search for FOND+ solutions

Future Work:

Further investigate and formalise connections between FOND+ and fault-tolerant planning

More extensive experiments

Questions?

code, benchmarks, and slides available soon:

http://www.cs.toronto.edu/~acamacho

References

Cimatti, A., Pistore, M., Roveri, M., and Traverso, P. (2003).

Weak, strong, and strong cyclic planning via symbolic model checking. Artificial Intelligence, 147:35–84.

Domshlak, C. (2013).

Fault tolerant planning: Complexity and compilation.

Hertle, A., Dornhege, C., Keller, T., Mattmüller, R., Ortlieb, M., and Nebel, B. (2014).

An Experimental Comparison of Classical, FOND and Probabilistic Planning. In Proc. of 37th International Conference on Artificial Intelligence (KI 2014), Prague.

Jaramillo, A. C., Fu, J., Ng, V., Bastani, F. B., and Yen, I.-L. (2014).

Fast strong planning for FOND problems with multi-root directed acyclic graphs. International Journal on Artificial Intelligence Tools, 23(06):1460028.

Jensen, R. M., Veloso, M. M., and Bryant, R. E. (2004).

Fault tolerant planning: Toward probabilistic uncertainty models in symbolic non-deterministic planning. Pages 335–344.

Little, I. and Thiebaux, S. (2007).

Probabilistic planning vs. replanning. ICAPS Workshop on IPC: Past, Present and Future.

Muise, C., McIlraith, S. A., and Beck, J. C. (2012).

Improved Non-deterministic Planning by Exploiting State Relevance. In ICAPS, pages 172–180.
