Task and Motion Policy Synthesis as Liveness Games
Yue Wang
Department of Computer Science, Rice University
May 9, 2016
Joint work with Neil T. Dantam, Swarat Chaudhuri, and Lydia E. Kavraki
Yue Wang, Neil T. Dantam, Swarat Chaudhuri, Lydia E. Kavraki (Rice University): Task and Motion Policy Synthesis as Liveness Games
Motivation
• Industrial Robots (picture from robots.co)
▪ Highly structured environment
▪ Pre-computed plan
• Personal Robots (picture from robohow.eu)
▪ Unstructured environment
▪ ?
Example — Kitchen Scenario
• Task
▪ avoid collisions
▪ eventually pick up an object
• Assumptions
▪ perfect sensing of current state
▪ deterministic actions
• Pre-computed plan not working
• Need a policy
Problem: Given (1) Task Specification, (2) Geometric description of Robot and Env, and (3) Discrete abstraction of Robot and Env actions, automatically synthesize a policy that accomplishes the task.
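The difference between a pre-computed plan and a policy can be illustrated with a minimal sketch. Everything here is a hypothetical toy, not the kitchen model from the talk: three cells in a row (start, a doorway shared with one chef, and the object), a fixed list of chef positions, and the assumption that the chef never steps into a doorway the robot already occupies. A plan is a fixed action sequence; a policy picks the action from the sensed state.

```python
from typing import Callable, List

# A controller sees (time step, robot cell, chef currently in doorway?)
# and returns "advance" or "wait". All names here are illustrative.
Controller = Callable[[int, int, bool], str]


def run(controller: Controller, chef_in_doorway: List[bool]) -> str:
    """Replay one chef trace and report 'collision', 'goal', or 'running'."""
    robot = 0  # cells: 0 = start, 1 = doorway shared with the chef, 2 = object
    for t, chef_here in enumerate(chef_in_doorway):
        action = controller(t, robot, chef_here)
        if action == "advance":
            # Assumption: the only collision is entering an occupied doorway.
            if robot == 0 and chef_here:
                return "collision"
            robot += 1
        if robot == 2:
            return "goal"
    return "running"


PLAN = ["advance", "advance"]  # computed offline, ignores the chef


def plan_controller(t: int, robot: int, chef_here: bool) -> str:
    return PLAN[t] if t < len(PLAN) else "wait"


def policy_controller(t: int, robot: int, chef_here: bool) -> str:
    # State feedback: wait while the doorway ahead is occupied.
    return "wait" if (robot == 0 and chef_here) else "advance"


trace = [True, False, False]  # chef blocks the doorway at the first step
plan_result = run(plan_controller, trace)
policy_result = run(policy_controller, trace)
```

On this trace the fixed plan walks into the occupied doorway, while the policy waits one step and then reaches the object. The policy above is hand-written; the talk's goal is to synthesize such a policy automatically.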
Challenges
• Uncontrollable agents
▪ What is the proper model? → Games between Robot and Env
• Policy over large state space
▪ How to efficiently synthesize the policy? → Policy Synthesis Algorithm
▪ Integration of task and motion planning [e.g., Bhatia et al. ’11; Kaelbling and Lozano-Perez ’11; Srivastava et al. ’14; He et al. ’15]
- Information from continuous geometry
Related Work
• Static, deterministic domain
▪ Task and Motion Planning (TMP) [e.g., Bhatia et al. ’11; Kaelbling and Lozano-Perez ’11; Srivastava et al. ’14; He et al. ’15]
• Uncertain domain: stochastic
▪ MDP [e.g., Lahijanian et al. ’10, ’12; Ding et al. ’11; Wolff et al. ’12; Luna et al. ’14]
▪ POMDP [e.g., Grady et al. ’13, ’15; Kurniawati et al. ’08; Somani et al. ’13]
▪ Planning in belief space [e.g., Kaelbling and Lozano-Perez ’13; Levin et al. ’13; Wong et al. ’13; Hadfield-Menell et al. ’15]
• Uncertain domain: adversarial (worst case)
▪ Reactive Synthesis [e.g., Kress-Gazit et al. ’09; DeCastro and Kress-Gazit ’15; Wongpiromsarn et al. ’10; Ulusoy et al. ’13; Alur, Moarref, and Topcu ’15]
▪ Our Problem
Differential Dynamics
Mobile Manipulation (High DOF)
Ideas we extend
• Program Synthesis
▪ Syntax guided synthesis (SyGuS) [Alur et al. ’13]
▪ Counterexample guided inductive synthesis (CEGIS) [Solar-Lezama et al. ’06]
▪ Satisfiability Modulo Theories (SMT) [de Moura and Bjørner ’08]
- efficiently handle quantitative constraints
• Games
▪ de Alfaro and Henzinger ’00; Alur, Henzinger, and Kupferman ’02
▪ Solving infinite games [Beyene et al. ’14]
▪ Liveness Games: eventually reach a certain state
Liveness Game structure
• Game state space
▪ Robot states × Env states
• Game transitions
▪ valid moves for Robot and Env
• Winning condition
▪ Defined using a set dst of goal states
- Winning play should eventually visit a state s ∈ dst.
Policy: select a proper action for the robot for every state
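As a point of reference for the structure above, here is a minimal sketch of a turn-based liveness (reachability) game on a small explicit graph, solved with the classical attractor fixed point. This is a swapped-in textbook technique, not the synthesis algorithm of the talk, and the node names and the tiny game are hypothetical. Each node stands for one (Robot state, Env state) pair together with whose turn it is.

```python
from typing import Dict, List, Set, Tuple


def solve_liveness_game(
    robot_moves: Dict[str, List[str]],
    env_moves: Dict[str, List[str]],
    dst: Set[str],
) -> Tuple[Set[str], Dict[str, str]]:
    """Return the robot's winning region and a policy that reaches dst."""
    winning = set(dst)
    policy: Dict[str, str] = {}
    changed = True
    while changed:
        changed = False
        # A robot node wins if SOME move enters the winning region.
        for node, succs in robot_moves.items():
            if node in winning:
                continue
            for nxt in succs:
                if nxt in winning:
                    winning.add(node)
                    policy[node] = nxt  # nxt was won earlier, so progress is made
                    changed = True
                    break
        # An env node wins only if ALL env moves stay in the winning region.
        for node, succs in env_moves.items():
            if node not in winning and succs and all(n in winning for n in succs):
                winning.add(node)
                changed = True
    return winning, policy


# Hypothetical game: going "left" lets the environment force a trap.
ROBOT_MOVES = {
    "start": ["left", "right"],
    "near": ["goal"],
    "far": ["near_env"],
    "trap": ["trap_env"],
}
ENV_MOVES = {
    "left": ["near", "trap"],
    "right": ["near", "far"],
    "near_env": ["near"],
    "trap_env": ["trap"],
}

winning, policy = solve_liveness_game(ROBOT_MOVES, ENV_MOVES, {"goal"})
```

Here `policy["start"]` is `"right"`, because every environment reply to that move still leads to the goal, while `"left"` and `"trap"` fall outside the winning region. Explicit enumeration like this does not scale to the large state spaces named under Challenges, which is the motivation for a more efficient synthesis algorithm.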
Policy Synthesis as Games
Input → Game Structure
• Geometric description of Robot and Env → Placement Graph [Nedunuri et al. ’14] → Game state space
• Discrete abstraction of Robot and Env actions → Constraints on system transitions → Game transitions
• Liveness Task Specification → Liveness winning condition
• Construct a policy → Find a winning strategy
Policy Synthesis Algorithm
• Iteratively generate a candidate policy and verify its correctness
▪ Counterexample guided [Solar-Lezama et al. ’06]
• Apply heuristic to generalize failures
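The generate-and-verify loop can be sketched on a toy game as follows. This is only an illustration of the counterexample-guided pattern under stated assumptions: candidates come from brute-force enumeration (standing in for a solver-based generator), the game encoding `trans[state][action] = set of states the environment may choose` is hypothetical, and each counterexample rules out every policy that repeats the same choices along the losing play.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Trans = Dict[str, Dict[str, Set[str]]]
Play = List[Tuple[str, str]]


def verify(policy: Dict[str, str], trans: Trans, init: str, dst: Set[str]) -> Optional[Play]:
    """Return None if every play from init reaches dst, else one losing play."""
    def explore(state: str, path: Play, on_path: FrozenSet[str]) -> Optional[Play]:
        if state in dst:
            return None
        if state in on_path or state not in trans:
            return path  # the environment can loop forever, or the robot is stuck
        step = path + [(state, policy[state])]
        for nxt in sorted(trans[state][policy[state]]):
            bad = explore(nxt, step, on_path | {state})
            if bad is not None:
                return bad
        return None

    return explore(init, [], frozenset())


def synthesize(trans: Trans, init: str, dst: Set[str]) -> Tuple[Optional[Dict[str, str]], int]:
    """Counterexample-guided search; returns (policy or None, verifier calls)."""
    states = sorted(trans)
    conflicts: List[Dict[str, str]] = []  # choices that jointly admit a losing play
    iterations = 0
    for choice in product(*(sorted(trans[s]) for s in states)):
        candidate = dict(zip(states, choice))
        if any(all(candidate[s] == a for s, a in c.items()) for c in conflicts):
            continue  # excluded by an earlier counterexample, skip verification
        iterations += 1
        play = verify(candidate, trans, init, dst)
        if play is None:
            return candidate, iterations
        conflicts.append(dict(play))
    return None, iterations


# Hypothetical game: in the alley the environment may push the robot into a trap.
TRANS: Trans = {
    "start": {"left": {"alley"}, "right": {"hall"}},
    "alley": {"go": {"trap", "goal"}},
    "hall": {"wait": {"hall"}, "go": {"goal"}},
}
policy, iterations = synthesize(TRANS, "start", {"goal"})
```

The first candidate sends the robot left and is refuted by the play through the alley; the recorded conflict then excludes every other policy that goes left, so the second verified candidate already wins. How much each counterexample excludes is exactly where a generalization heuristic pays off.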
Geometric-Based Generalization
• Generalize the counterexample to a set of similar examples:
▪ Exploit geometric structure
▪ Reduce the number of iterations needed, improving efficiency
Counterexample → Generalization → Counterexample set
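One way to picture such a generalization is the sketch below. The transcript does not spell out the heuristic used in the talk, so this is an assumed stand-in: in an obstacle-free grid region, a failing (robot cell, chef cell) pair is generalized to every pair with the same relative offset, on the assumption that the failure depends only on relative position. The grid and function name are hypothetical.

```python
from itertools import product
from typing import List, Set, Tuple

Cell = Tuple[int, int]
GameState = Tuple[Cell, Cell]  # (robot cell, chef cell)


def generalize(counterexample: GameState, cells: List[Cell]) -> Set[GameState]:
    """All states in the region with the same robot-to-chef offset."""
    (rx, ry), (cx, cy) = counterexample
    dx, dy = cx - rx, cy - ry
    region = set(cells)
    similar: Set[GameState] = set()
    for (x, y) in region:
        chef = (x + dx, y + dy)
        if chef in region:  # keep only placements that stay inside the region
            similar.add(((x, y), chef))
    return similar


# Hypothetical 3x3 obstacle-free region.
GRID: List[Cell] = list(product(range(3), range(3)))
counterexample_set = generalize(((0, 0), (1, 0)), GRID)
```

A single failing state with the chef one cell to the robot's right becomes six excluded states on this grid, so one verifier call does the work of six. The assumption of translation invariance would not hold near obstacles or region boundaries with different geometry.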
Experiments
Kitchen Scenario
• Kitchen environment
▪ 2 chefs moving within the blue region
▪ increasing the size of the blue region (FoodPrep Region)
• Task requirements:
▪ avoid collisions
▪ eventually pick up an object
• Comparison with the GR(1) synthesizer [Piterman, Pnueli, and Sa’ar ’06]
▪ back-end solver of LTLMoP [Finucane, Jing, and Kress-Gazit ’10]
Results
• In the tested benchmark, our method scales better than the GR(1) synthesizer on large problems
• Generalization gives order-of-magnitude speedup
Performance with quantitative constraints — energy limits
• Still scales well
• About 1× slower, i.e., roughly twice the running time
Conclusion
• Game model for policy synthesis in adversarial domains
• Algorithm for solving liveness games
▪ utilize geometric information (generalization)
▪ efficiently handle quantitative constraints, e.g., energy limits
• Future extensions:
▪ other sources of uncertainty, such as sensor noise
▪ investigate additional generalization heuristics for broader domains