GSMDPs for Multi-Robot Sequential Decision Making

By: Messias, Spaan, Lima

Presented by: Mike Plasker, DMES – Ocean Engineering

Introduction
• Robotic planning under uncertainty
• MDP solutions
• Limited real-world application

Assumptions for Multi-Robot Teams
• Communication (inexpensive, free, or costly)
• Synchronous and steady-state transitions
• Discretization of environment

A Different Approach
• States and actions discrete (like MDP)
• Continuous measure of time
• State transitions regarded as random ‘events’

Advantages
• Non-Markovian effects of discretization minimized
• Fully reactive to changes
• Communication only required for ‘events’

GSMDPs
• Generic temporal probability distributions over events
• Can model concurrent (persistently enabled) events
• Solvable by discrete-time MDP algorithms by obtaining an equivalent (semi-)Markovian model
• Avoids negative effects of synchronous alternatives
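
To make the idea of concurrent events over continuous time concrete, here is a minimal sketch of a "race" between enabled events: each event draws a trigger time from its own distribution and the earliest one fires. The function name `next_event`, the event names, and the particular distributions are illustrative assumptions, not taken from the paper.

```python
import random

def next_event(enabled):
    """Pick the event that triggers first.

    `enabled` maps an event name to a zero-argument sampler returning
    the time (in seconds) until that event would trigger.
    """
    if not enabled:
        raise ValueError("no enabled events")
    samples = {name: sample() for name, sample in enabled.items()}
    winner = min(samples, key=samples.get)  # earliest trigger time wins
    return winner, samples[winner]

# Hypothetical events with different (not necessarily exponential) timing.
rng = random.Random(0)
enabled = {
    "ball_received": lambda: rng.weibullvariate(3.0, 2.0),  # scale, shape
    "battery_low": lambda: rng.expovariate(1.0 / 60.0),     # mean 60 s
}
print(next_event(enabled))
```

The samplers can be any distribution, which is the point of the "generic temporal probability distributions" bullet: nothing here requires exponential (memoryless) timing.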

Why GSMDPs for Robotics
Cooperative robotics requires:
• Operation in inherently continuous environments
• Handling uncertainty in actions (and observations)
• Joint decision making for optimization
• Reactivity

Definitions
Multiagent GSMDP: tuple <d, S, X, A, T, F, R, C, h>
• d = number of agents
• S = state space (contains state factors)
• X = state factors
• A = set of joint actions
• T = transition function
• F = time model
• R = instantaneous reward function
• C = cumulative reward rate
• h = planning horizon (over continuous time)
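
As a rough illustration of how the tuple could be held in code, the sketch below stores each component in a dataclass. The field names follow the slide; the concrete type signatures (for example, `F` returning a duration sampler, and `R` and `C` taking a state and a joint action) are assumptions made for this example, not definitions from the paper.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

State = Tuple[str, ...]        # one value per state factor
JointAction = Tuple[str, ...]  # one individual action per agent

@dataclass(frozen=True)
class MultiagentGSMDP:
    d: int                                             # number of agents
    X: Dict[str, Sequence[str]]                        # state factors and their values
    A: Sequence[JointAction]                           # joint actions
    T: Callable[[State, JointAction, State], float]    # transition probability
    F: Callable[[State, JointAction, State], Callable[[], float]]  # time model (assumed: returns a sampler)
    R: Callable[[State, JointAction], float]           # instantaneous reward
    C: Callable[[State, JointAction], float]           # cumulative reward rate
    h: float                                           # planning horizon, in seconds

    @property
    def S(self) -> List[State]:
        """State space as the Cartesian product of the state factors."""
        return list(product(*self.X.values()))

# Hypothetical two-agent model with two binary state factors.
model = MultiagentGSMDP(
    d=2,
    X={"ball": ["left", "right"], "battery": ["ok", "low"]},
    A=[("kick", "wait"), ("wait", "kick")],
    T=lambda s, a, s2: 1.0 / 4,
    F=lambda s, a, s2: (lambda: 1.0),
    R=lambda s, a: 0.0,
    C=lambda s, a: 0.0,
    h=60.0,
)
print(len(model.S))
```

Deriving `S` from `X` mirrors the slide's note that the state space contains the state factors.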

Definitions
• Event in a GSMDP: an abstraction over state transitions that share the same properties
• Persistently enabled events: events that are enabled from step ‘t’ to step ‘t+1’, but not triggered at step ‘t’

Common Approach
• Synchronous action
• Pre-defined time step
  • Performance
  • Reaction time
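
One way to see why a pre-defined time step affects reaction time: if the team only decides at multiples of a step T, an event that happens between epochs waits until the next one. The sketch below computes that wait. The helper names are hypothetical, and the 4 s step is borrowed from the Results slide purely as an example value.

```python
import math

def reaction_delay(event_time: float, step: float) -> float:
    """Time from an event until the next synchronous decision epoch."""
    next_epoch = math.ceil(event_time / step) * step
    return next_epoch - event_time

def mean_delay(step: float, n: int = 1000) -> float:
    """Average delay for events spread evenly across one time step."""
    # Use the midpoints of n equal sub-intervals of (0, step).
    times = [(i + 0.5) * step / n for i in range(n)]
    return sum(reaction_delay(t, step) for t in times) / n

print(reaction_delay(5.0, 4.0))  # event at t = 5 s is handled at t = 8 s
print(mean_delay(4.0))
```

Under this uniform-arrival assumption the average wait is half the time step, so shrinking T improves reaction time at the cost of more frequent decisions. An event-driven model has no such wait by construction.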

GSMDPs
• Persistently enabled events modeled by allowing their temporal distributions to depend on the time they were enabled
• Explicit modeling of non-Markovian effects from discretization
• Communication efficiency
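
The dependence on enabling time can be illustrated with two textbook distributions. For a uniformly distributed duration, the expected remaining time shrinks the longer the event has been enabled; for an exponential duration it does not (the memoryless property). The function names and numbers below are illustrative assumptions, not values from the paper.

```python
def mean_remaining_uniform(upper: float, elapsed: float) -> float:
    """Duration ~ Uniform(0, upper); expected remaining time given the
    event has been enabled for `elapsed` seconds without triggering."""
    if not 0.0 <= elapsed < upper:
        raise ValueError("elapsed must lie in [0, upper)")
    # Conditioned on surviving `elapsed`, the duration is Uniform(elapsed, upper).
    return (upper - elapsed) / 2.0

def mean_remaining_exponential(rate: float, elapsed: float) -> float:
    """Duration ~ Exponential(rate); memoryless, so `elapsed` is irrelevant."""
    return 1.0 / rate

print(mean_remaining_uniform(10.0, 0.0), mean_remaining_uniform(10.0, 6.0))
print(mean_remaining_exponential(0.2, 0.0), mean_remaining_exponential(0.2, 6.0))
```

A model restricted to exponential timing cannot express the first case, which is why tracking when a persistently enabled event was enabled matters.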

Modeling Events
• Group state transitions into events to minimize the number of temporal distributions and transition functions (e.g. ‘battery low’)
• Transition function found by estimating the relative frequency of each transition in the event
• Time model found by timing the transition data
• Approximated as a phase-type distribution
• Replaces events with acyclic Markov chains
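
The two estimation steps above can be sketched as follows. The transition function is a plain relative-frequency count. For the time model, this example moment-matches an Erlang distribution, which is one simple phase-type distribution (an acyclic chain of k exponential phases); the slide does not say which fitting method the authors used, so treat this choice, the function names, and the sample data as assumptions.

```python
from collections import Counter
from statistics import mean, pvariance

def estimate_transition_function(observed):
    """Relative frequency of each (state, next_state) pair logged for one event."""
    counts = Counter(observed)
    totals = Counter(state for state, _ in observed)
    return {(s, s2): c / totals[s] for (s, s2), c in counts.items()}

def fit_erlang(durations):
    """Moment-match Erlang(k, rate): mean = k / rate, variance = k / rate**2."""
    m = mean(durations)
    v = pvariance(durations)
    if v <= 0:
        raise ValueError("need durations with non-zero variance")
    k = max(1, round(m * m / v))  # number of exponential phases
    return k, k / m               # (phases, rate of each phase)

# Hypothetical log of one event: where it led, and how long it took (seconds).
log = [("has_ball", "goal"), ("has_ball", "lost"),
       ("has_ball", "goal"), ("has_ball", "goal")]
print(estimate_transition_function(log))
print(fit_erlang([1.0, 3.0]))
```

Each fitted phase becomes an ordinary Markovian state, which is how an event with a general timing distribution turns into a small acyclic Markov chain.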

Events (cont.)
• Not always possible
• Decompose events with minimum duration into deterministically timed transitions
• Can then better approximate using phase-type distribution
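
A small sketch of why the decomposition helps: durations with a hard minimum have low relative variance, and a moment-matched Erlang chain then needs many phases. Splitting off the minimum as a deterministic delay leaves a residual that needs far fewer. Using the sample minimum as the deterministic part, and Erlang moment matching as the phase-type fit, are assumptions of this example rather than the authors' stated procedure.

```python
from statistics import mean, pvariance

def erlang_phases(durations):
    """Phases a moment-matched Erlang needs: k ~ mean**2 / variance."""
    m, v = mean(durations), pvariance(durations)
    if v <= 0:
        raise ValueError("need durations with non-zero variance")
    return max(1, round(m * m / v))

def split_minimum_duration(durations):
    """Separate a deterministic minimum delay from the random residual."""
    t_min = min(durations)
    return t_min, [t - t_min for t in durations]

durations = [10.0, 12.0]                    # hypothetical timings, seconds
t_min, residual = split_minimum_duration(durations)
print(erlang_phases(durations))             # phases without decomposition
print(t_min, erlang_phases(residual))       # deterministic part, then phases
```

For this toy data the phase count drops from 121 to 1 once the 10 s minimum is modeled as a deterministically timed transition.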

Solving a GSMDP
• Can be viewed as an equivalent discrete-time MDP
• Almost all solution algorithms for MDPs work
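
Once an equivalent discrete-time MDP is available, a standard solver applies. The sketch below is plain value iteration on a tiny hand-written MDP; it does not perform the GSMDP-to-MDP conversion, and the states, actions, rewards, and discount factor are made up for illustration.

```python
def q_value(s, a, V, P, R, gamma):
    """One-step lookahead: immediate reward plus discounted expected value."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())

def value_iteration(states, actions, P, R, gamma=0.95, tol=1e-8):
    """P[s][a] maps next states to probabilities; R[s][a] is the reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, V, P, R, gamma) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: q_value(s, a, V, P, R, gamma))
              for s in states}
    return V, policy

# Hypothetical two-state problem: "go" moves from a to the absorbing state b.
states, actions = ["a", "b"], ["go", "stay"]
P = {"a": {"go": {"b": 1.0}, "stay": {"a": 1.0}},
     "b": {"go": {"b": 1.0}, "stay": {"b": 1.0}}}
R = {"a": {"go": 1.0, "stay": 0.0},
     "b": {"go": 0.0, "stay": 0.0}}
V, policy = value_iteration(states, actions, P, R)
print(V, policy)
```

Any other discrete-time MDP algorithm (policy iteration, for instance) could be substituted here without changing the modeling steps that precede it.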

Experiment
• Robotic soccer
• Score a goal (reward 150)
• Passing around obstacle (reward 60)

Results
[Figure panels: MDP, T = 4 s; GSMDP]

Results
• No idle time
• Reduced communication
• Improved scoring efficiency
• System failures (zero goals) independent of model

Example Video

Future Work
• Extend to partially observable domains
• Apply bilateral phase distributions to increase the class of non-Markovian events that can be modeled

Questions?

MESSIAS, J.; SPAAN, M.; LIMA, P. GSMDPs for Multi-Robot Sequential Decision-Making. AAAI Conference on Artificial Intelligence, North America, Jun. 2013. Available at: <http://www.aaai.org/ocs/index.php/AAAI/AAAI13/paper/view/6432/6843>. Date accessed: 06 Apr. 2014.