Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

21
Robot Swarms as Ensembles of Cooperating Components Matthias Hölzl With contributions from Martin Wirsing, Annabelle Klarl AWASS Lucca, June 24, 2013 www.ascens-ist.eu

description

Matthias Holzl's Case Study from AWASS 2013

Transcript of Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Page 1: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Robot Swarms as Ensembles ofCooperating Components

Matthias HölzlWith contributions from Martin Wirsing, Annabelle Klarl

AWASSLucca, June 24, 2013

www.ascens-ist.eu

Page 2: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

The Task

Robots cleaning an exhibition area

Matthias Hölzl 2

Page 3: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

marXbot

Miniature mobile robot developed by EPFLRough-terrain mobilityRobots can dock to other robotsMany sensors

Proximity sensorsGyroscope3D accelerometerRFID readerCameras. . .

ARM-based Linux systemGripper for picking up items

Matthias Hölzl 3

Page 4: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Swarm Robotics

Matthias Hölzl 4

Page 5: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Problems

Noise, sensor resolutionExtracting information from sensor dataUnforeseen situationsUncertainty about the environmentPerforming complex actions when intermediateresults are uncertain. . .

Matthias Hölzl 5

Page 6: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Action Logics

Logics that can represent change over timeProbabilistic behavior can be modeled (but is cumbersome)

Matthias Hölzl 6

Page 7: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Markov Decision Processes

pos = (x,y)

pos = (x,y+1) pos = (x+1,y+1)

pos = (x + 1,y)e / ...

s, n / ...

s / 0.9 / -0.1e,w / 0.025 / -0.1

w / ...s,n / ...

e / ...s, n / ...

w/ ...s,n / ...

n / 0.9 / -0.1e, w / 0.025 / -0.1

s,n,w,e / 0.05 / -0.1

Matthias Hölzl 7

Page 8: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Markov Decision Processes

watchTV

goToClub

DecideActivity

In Club

Watching TV

Dancing Alone

Drinking

Dancing With

Partner

dance p = 0.5

dance p = 0.5

drinkBeer

Oh, oh

Oh, no

flirt

p = 0.05

p = 0.95

Matthias Hölzl 8

Page 9: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

MDPs: Strategies

watchTV

goToClub

DecideActivity

In Club

Watching TV

Dancing Alone

Drinking

Dancing With

Partner

dance p = 0.5

dance p = 0.5

drinkBeer

Oh, oh

Oh, no

flirt

p = 0.05

p = 0.95

State TV CDB CDFDA watchTV goToClub goToClubIC drinkBeer drinkBeer danceDWP flirt flirt flirtUtility 0.1 −0.05(∗) −1.975(∗)

(∗) 0.05+ (−0.1)(∗∗) 0.05+ (0.5 × 0.2) + 0.5 × (0.25+ (0.05 × 5) + (0.95 × −5))

Matthias Hölzl 9

Page 10: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Reinforcement Learning

General idea:Figure out the expectedvalue of each actionin each statePick the action withthe highest expectedvalue (most of thetime)Update the expectationsaccording to the actualrewards

Matthias Hölzl 10

Page 11: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

How well does this work?

Rather well for small problemsBut: state explosion

Matthias Hölzl 11

Page 12: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Solutions

DecompositionHierarchyPartial programs

Matthias Hölzl 12

Page 13: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

POEM

Action languageFirst-order reasoningHierarchical reinforcement learning

Learns completions for partial programsConcurrency

Reflection / meta-object protocols· · ·

Matthias Hölzl 13

Page 14: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Iliad: A POEM Implementation

Common Lisp-based programming languageFull first-order reasoning

Operations on logical theories: U(C)NA, domain closure, . . .Resolution, hyperresolution, DPLL, etc.Conditional answer extractionProcedural attachment, constraint solving

Hierarchical reinforcement learningBased on Concurrent ALispPartial programsThreadwise and temporal state abstractionHierarchically Optimal Yet Local (HOLY) Q-learningHierarchically Optimal Recursively Decomposed (HORD) Q-learning

Matthias Hölzl 14

Page 15: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Planned Contents

Introduction to CALisp/PoemSimple TD-learning: banditsFlat reinforcement learning: navigationHierarchical reinforcement learning:collecting items individuallyThreadwise decomposition for hierarchicalreinforcement learning: learning collaboratively

Matthias Hölzl 15

Page 16: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

n-armed Bandits

S

search / 1.0 / N(0.1, 1.0)

coll-known / 1.0 / N(0.3, 3.0)

Choice between n actionsReward depends probabilisticly onthe action choiceNo long-term consequencesSimplest form of TD-learning

Matthias Hölzl 16

Page 17: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Flat Learning

XXXXXXXXXXXXXXXXXXXXXXXX Target: (0 0)XXTT XX XXXX XXXXXXXXXX XXXX Choices: (N E S W)XXXXXX XX XX XX Q-values:XX XX XXXXXXXX XX #((N (Q -1.8))XX XXXXXX XX XX (E (Q -1.8))XX XX XXXXXXXX (S (Q -2.25))XXXXXXXXXX XX XX (W (Q 2.76)))XX XX XXXX XX Recommended choice is WXX XX XX XX XX XXXX XX RR XXXXXXXXXXXXXXXXXXXXXXXXXX

Matthias Hölzl 17

Page 18: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Flat Learning

(defun simple-robot ()(call (nav (target-loc (robot-env)))))

(defun nav (loc)(until (equal (robot-loc) loc)

(with-choice navigate-choice (dir ’(N E S W))(action navigate-move dir))))

Matthias Hölzl 18

Page 19: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Hierarchical Learning

(defun waste-removal ()(loop

(choose choose-waste-removal-action(action sleep ’SLEEP)(call (pickup-waste))(call (drop-waste)))))

(defun pickup-waste ()(call (nav (waste-source)))(action pickup-waste ’PICKUP))

Matthias Hölzl 19

Page 20: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Multi-Robot Learning

XXXXXXXXXXXXXXXXXXXXXXXXXXTT XX RR XXXX XX XXXXXXXX XX XXXX XX RW RR XXXX XXXX RR WW XXXX XXXX RR XXXX XXXX XXXXXXXXXXXXXXXXXXXXXXXXXX

Matthias Hölzl 20

Page 21: Robot Swarms as Ensembles of Cooperating Components - Matthias Holzl

Thank you!

Any Questions?

[email protected]

Matthias Hölzl 21