An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real...

79
An Introduction to i ib d b i Distributed Robotic Systems Alcherio Martinoli School of Architecture Civil and Environmental Engineering School of Architecture, Civil and Environmental Engineering Distributed Intelligent Systems and Algorithms Laboratory École Polytechnique Fédérale de Lausanne CH-1015 Lausanne, Switzerland CH 1015 Lausanne, Switzerland http://disal.epfl.ch/ alcherio.martinoli@epfl.ch EPFL, Topics in Autonomous Robotics, April 23, 2012

Transcript of An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real...

Page 1: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

An Introduction to i ib d b iDistributed Robotic Systems

Alcherio Martinoli

School of Architecture Civil and Environmental EngineeringSchool of Architecture, Civil and Environmental EngineeringDistributed Intelligent Systems and Algorithms Laboratory

École Polytechnique Fédérale de LausanneCH-1015 Lausanne, SwitzerlandCH 1015 Lausanne, Switzerland

http://disal.epfl.ch/

[email protected]@ p

EPFL, Topics in Autonomous Robotics, April 23, 2012

Page 2: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Outline – Part I (9:15 am - noon)

• Distributed Robotics: Background• Machine-Learning Methods forMachine Learning Methods for

Distributed Control– Particle Swarm Optimization (PSO)– Noise-resistant algorithms– Collective-specific issues for

distributed control design anddistributed control design and optimization

– Particle Swarm Optimization applied to multi-robot systems

• Conclusion Part I

Page 3: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Outline – Part II (2:15 - 5 p m )Outline Part II (2:15 - 5 p.m.)

• Model-Based Methods– Bottom-up and top down approaches– Multi-level approach

• Model Calibration• Model Calibration• Examples

– Linear exampleLinear example– Nonlinear example

• Conclusion Part II

Page 4: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Background:Background:Multi-Robot Systems Multi Robot Systems

Page 5: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Typical Tasks for Multirobot Teams

• Mapping and exploration• Target trackingg g• Carrying objects• Pla ing team games• Playing team games• Construction

Caloud et al; 1990

CMPack; 2002

• Inspection• Sensing and monitoringg g

From N. Kalra, PhD CMU, 2006

Page 6: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Why Multiple Robots?

• Faster execution• More robustMore robust• Simpler robots

M l i• More solutions• Task requires it

From N. Kalra, PhD CMU, 2006

Page 7: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Why Not Multiple Robots?

• Need comms• More complexityMore complexity• Harder to test

N h bl• N x the trouble• Expensive

From N. Kalra, PhD CMU, 2006

Page 8: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Goal of a Multirobot Team

Coordinate to distribute tasks and resourcesCoordinate to distribute tasks and resourcesin a way that leads to efficient and robust mission completionmission completion.

Page 9: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Challenges

• Dynamic events• Changing task demandsg g• Resource failures• Ad ersaries• Adversaries• Limited sensing, planning, and actuation• Limited communication• Diverse team

Page 10: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

M hi L i Machine-Learning: Rationale and TerminologyRationale and Terminology

Page 11: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Why Machine-Learning?y g• Complementary to a model-based/engineering

h h l l l d ilapproaches: when low-level details matter (optimization) and/or good models do not exist (design)!(design)!

• When the design/optimization space is too big (infinite)/too computationally expensive (e g NP-(infinite)/too computationally expensive (e.g. NPhard) to be systematically searched

• Automatic design and optimization techniquesAutomatic design and optimization techniques• Role of the engineer refocused to performance

specification, problem encoding, and p , p g,customization of algorithmic parameters (and perhaps operators)

Page 12: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Why Machine-Learning?y g

• There are design and optimization techniques• There are design and optimization techniques robust to noise, nonlinearities, discontinuitiesP ibl h l SW• Possible search spaces: parameters, rules, SW or HW structures/architectures

• Individual real-time adaptation to new environmental conditions; i.e. increased individual flexibility when environmental conditions are not known/cannot predicted a priori

Page 13: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

ML Techniques: Classification Axis 1ML Techniques: Classification Axis 1

– Supervised learning: off-line, a teacher is available

– Unsupervised learning: off-line, teacher not available

– Reinforcement-based (or evaluative) learning: on-line, no pre-established training and evaluation data sets

Page 14: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

ML T h i Cl ifi i A i 2ML Techniques: Classification Axis 2

– In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

l ti th d l d d t l h d hsolutions are then downloaded onto real hardware when certain criteria are met

– Hybrid: most of the time in simulation (e.g. 90%), last period (e.g. 10%) of the learning process on real hardware

– Hardware-in-the-loop: from the beginning on real hardware (no simulation). Depending on the algorithm more or less

idrapid

Page 15: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

ML T h i Cl ifi i A i 3ML Techniques: Classification Axis 3ML algorithms require often fairly important g q y pcomputational resources (in particular for population-based algorithms), therefore a further classification is:

– On-board: machine-learning algorithm run on the system to be learned (no external unit)be learned (no external unit)

– Off-board: the machine-learning algorithm runs off-board and the system to be learned just serves as embodied implementation of a candidate solution

Page 16: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

ML T h i Cl ifi ti A i 4ML Techniques: Classification Axis 4− Population-based (“multi-agent”): a population of

did l i i i i d b h l i h fcandidate solutions is maintained by the algorithm, for instance Genetic Algorithms (individuals), Particle Swarm Optimization (particles), Particle Filter (particle), Ant

l i i i ( l i ) i fColony Optimization (tour solution set, etc.); it can refer also to the fact that multiple agents are needed to build the solution set, for instance in virtual ants as core multi-agent principle in Ant Colony Optimization

− Hill-climbing (“single-agent”): the machine-learning algorithm work on a single candidate solution and try to improve on it; for instance, (stochastic) gradient-based methods, reinforcement learning algorithms

Page 17: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Particle Swarm OptimizationParticle Swarm Optimization

Page 18: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Flocking and Swarming InspirationFlocking and Swarming Inspiration

Principles:Principles:

• Imitate

• Evaluate

• Comparep

Page 19: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

PSO: TerminologyPSO: Terminology• Population: set of candidate solutions tested in one time step, consists of m

particles (e.g., m = 20)

• Particle: represents a candidate solution; it is characterized by a velocity vector v and a position vector x in the hyperspace of dimension Dvector v and a position vector x in the hyperspace of dimension D

• Evaluation span: evaluation period of each candidate solution during a single time step (algorithm iteration); as in GA the evaluation span might take more or less time depending on the experimental scenariotake more or less time depending on the experimental scenario

• Life span: number of iterations a candidate solution survives

• Fitness function: measurement of efficacy of a given candidate solutionFitness function: measurement of efficacy of a given candidate solution during the evaluation span

• Population manager: update velocities and position for each particle di t th i PSO laccording to the main PSO loop

Page 20: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Evolutionary Loop: Several GenerationsGenerations

Start

Initialize particles

Perform main PSO loopPerform main PSO loop

E d it i t?Ex. of end criteria:

End criterion met?N

Y• # of time steps

• best solution performance

End

p

•…

Page 21: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Initialization: Positions and Velocities

Page 22: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

The Main PSO Loop – ParametersThe Main PSO Loop Parameters and Variables

• Functions– rand ()= uniformly distributed random number in [0,1]

• Parameters– w: velocity inertia (positive scalar)– cp: personal best coefficient/weight (positive scalar)– cn: neighborhood best coefficient/weight (positive scalar)

• Variables– xij(t): position of particle i in the j-th dimension at time step t (j = [1,D])– vij(t): velocity particle i in the j-th dimension at time step t

iti f ti l i i th j th di i ith i l fit t)(*– : position of particle i in the j-th dimension with maximal fitness up to iteration t

– : position of particle i’ in the j-th dimension having achieved the maximal fitness up to iteration t in the neighborhood of particle i

)(* txij

)(* tx ji′p g p

Page 23: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

The Main PSO Loop (Eberhart, Kennedy, and Shi, 1995, 1998)

At each time step tfor each particle i

for each component j

At each time step t

update

for each component j

)()1( ijij twvtv +=+pthe

velocity )()()()(

)()(**

ijjinijijp

ijij

xxrandcxxrandc −+− ′

( ) ( )1)1( ++=+ tvtxtx ijijijthen move ( ) ( ))( ijijijthen move

Page 24: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

The main PSO Loop- Vector Visualization

*ix My position for optimal

Here I am!

fitness up to date

xi *xThe position with

optimal fitness of my neighbors up to date

iix ′

g p

vi

Page 25: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Neighborhood Types• Size:

– Neighborhood index considers also the particle itself in the countingthe counting

– Local: only k neighbors considered over m particles in the population (1 < k < m); k=1 means no information from other particles used in velocity updatep y p

– Global: m neighbors• Topology:

G hi l– Geographical– Social– Indexed– Random– …

Page 26: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Neighborhood Examples: Geographical vs. Social

geographical social

Page 27: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Neighborhood Example: Indexed and Circular (lbest)

11

8 2Particle 1’s 3-

neighbourhood

7 3

Virtual circle

7

Virtual circle

5

64

Page 28: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

A Simple PSO Animation

© M. Clerc

Page 29: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

GA vs. PSO in Absence of Noise

Page 30: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

GA vs. PSO – OverviewGA vs. PSO Overview• According to most recent research, PSO g

outperforms GA on most (but not all!) continuous optimization problemsp p

• GA still much more widely used in general research community and robust to continuousresearch community and robust to continuous and discrete optimization problems

• Because of random aspects very difficult to• Because of random aspects, very difficult to analyze either metaheuristic or make guarantees about performanceguarantees about performance

Page 31: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Benchmark Functions• Goal: minimization of a given f(x)

S d d b h k f i i h hi ( 30) d• Standard benchmark functions with thirty terms (n = 30) and a fixed number of iterations

• All xi constrained to [-5.12, 5.12]All xi constrained to [ 5.12, 5.12]

Page 32: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Performance ith Small Pop lationsPerformance with Small Populations

• GA: Roulette Wheel for selection, mutation applies numerical adjustment to gene

• PSO: lbest ring topology with neighborhood of size 3• PSO: lbest ring topology with neighborhood of size 3• Algorithm parameters used (but not thoroughly optimized):

Page 33: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Performance with Small PopulationsBold: best results; 30 runs; no noise on the performance function

Performance with Small Populations

Function (no noise)

GA (mean ± std dev)

PSO(mean ± std dev)( ) ( ) ( )

Sphere 0.02 ± 0.01 0.00 ± 0.00

Generalized Rosenbrock

34.6 ± 18.9 7.38 ± 3.27

Rastrigin 157 ±21.8 48.3 ± 14.4

Griewank 0 01 ± 0 01 0 01 ± 0 03Griewank 0.01 ± 0.01 0.01 ± 0.03

Page 34: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Expensive Optimization and Expensive Optimization and Noise ResistanceNoise Resistance

Page 35: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Expensive Optimization ProblemsExpensive Optimization ProblemsTwo fundamental reasons making robot control design and optimization expensive in terms of time:

1. Time for evaluation of candidate solutions (e.g., t f d ) >> ti f li ti f

p p

tens of seconds) >> time for application of metaheuristic operators (e.g., milliseconds)

2 Noisy performance evaluation disrupts the2. Noisy performance evaluation disrupts the adaptation process and require multiple evaluations for actual performanceo actua pe o a ce

– Multiple evaluations at the same point in the search space yield different results

– Noise causes decreased convergence speed and residual error

Page 36: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Reducing Evaluation TimeReducing Evaluation TimeGeneral recipe: exploit more abstracted, calibrated representations (models and sim lation tools)representations (models and simulation tools)

See also part II

Page 37: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Dealing with Noisy Evaluations• Better information about candidate solution can be obtained by

combining multiple noisy evaluationsW ld l i ll h did l i f• We could evaluate systematically each candidate solution for afixed number of times → not efficient from a computational perspectiveW t t d di t t ti l ti t l t• We want to dedicate more computational time to evaluate promising solutions and eliminate as quickly as possible the “lucky” ones → each candidate solution might have been evaluated a different number of times when comparedevaluated a different number of times when compared

• In GA good and robust candidate solutions survive over generations; in PSO they survive in the individual memory

• Use dedicated functions for aggregating multiple evaluations:• Use dedicated functions for aggregating multiple evaluations: e.g., minimum and average or generalized aggregation functions (e.g., quasi-linear weighted means), perhaps combined with a statistical test for comparing resulting co b ed w t a stat st ca test o co pa g esu t gaggregated performances

Page 38: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

GA PSO

Page 39: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Testing Noise-Resistant GA andTesting Noise Resistant GA and PSO on Benchmarks

• Benchmark 1 : generalized Rosenbrock function– 30 real parameters– Minimize objective function– Expensive only because of noise

• Benchmark 2: obstacle avoidance on a robot22 l– 22 real parameters

– Maximize objective functionExpensive because of noise and evaluation time– Expensive because of noise and evaluation time

Page 40: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Benchmark 1: Gaussian Additive Noise on Generalized Rosenbrockon Generalized Rosenbrock

Fair test: samenumber of evaluations candidate solutions for allsolutions for all algorithms (i.e. n generations/ i i f d diterations of standard versions compared with n/2 of the noise-resistant ones))

Page 41: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Benchmark 2:Obstacle Avoidance on a Mobile Robot

• Similar to [Floreano and Mondada 1996]– Discrete-time, single-layer, artificial

recurrent neural network controller– Shaping of neural weights and biases

(22 l t )

V = average wheel speed, Δv = difference between wheel speeds, i = value of

(22 real parameters)– fitness function: rewards speed,

straight movement, avoiding obstacles

p ,most active proximity sensor

obstacles• Different from [Floreano and

Mondada 1996]E i t b d d– Environment: bounded open-space of 2x2 m instead of a maze

A t l lt [P h J EPFL PhD Th i NActual results: [Pugh J., EPFL PhD Thesis No. 4256, 2008]

Similar results: [Pugh et al, IEEE SIS 2005]

Page 42: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Baseline Experiment: Extended-Time Adaptation

• Compare the basic algorithms ith their corresponding noisewith their corresponding noise-

resistant version• Population size 100, 100

iterations, evaluation span 300 d ( f iseconds (150 s for noise-

resistant algorithms)→ 34.7 days

• Fair test: same total evaluation time for all the algorithms

• Realistic simulation (Webots)• Best evolved solutions averaged

over 30 evolutionary runsover 30 evolutionary runs• Best candidate solution in the

final pool selected based on 5 runs of 30 s each; performance

d 40 f 30 htested over 40 runs of 30s each• Similar performance for all

algorithms[Pugh J., EPFL PhD Thesis No. 4256, 2008]

Page 43: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Where Can Noise-Resistant Algorithms Make the Difference?

• Limited adaptation time • Sudden change in noise distribution• Large amount of noise

Notes:Notes:• all examples from shaping obstacle avoidance behavior• best evolved solution averaged over multiple runs

f i l f l i i f ll h• fair tests: same total amount of evaluation time for all the different algorithms (standard and noise-resistant)

Page 44: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Limited-Time AdaptationLimited Time Adaptation• Total adaptation time

= 8 3 hours (1/100 ofEspecially good with

= 8.3 hours (1/100 of previous learning time)

small populations

time)• Trade-offs: population

size, number of iterations, evaluation span, number of re-evaluationsevaluations

• Realistic simulation (Webots)Varying population size vs (Webots)Varying population size vs.

number of iterations[Pugh J., EPFL PhD Thesis No. 4256, 2008]

Page 45: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Sudden Change of Noise DistributionsSudden Change of Noise Distributions• Move from realistic simulation (Webots) to real

robots after 90% learning (even faster evolution)• Noise-resistance helps speeding up learning after

transition

[Pugh J., EPFL PhD Thesis No. 4256, 2008]

Page 46: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Increased Noise Level – Set-UpIncreased Noise Level Set Up

• Scenario 1: One robot1x1 m arena, PSO, 50 iterations, scenario 3

• Scenario 1: One robot learning obstacle avoidance

• Scenario 2: One robotScenario 2: One robot learning obstacle avoidance, one robot running pre-evolved obstacle avoidance

• Scenario 3: Two robots co-learning obstacle avoidance

Idea: more robots more noise (as perceived from an individual robot); no “standard” com bet een the robots b t in scenario 3 information sharing“standard” com between the robots but in scenario 3 information sharing through the population manager!

[Pugh et al, IEEE SIS 2005]

Page 47: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Increased Noise Level – Sample Resultsp

[Pugh et al, IEEE SIS 2005]

Page 48: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

F Si l t From Single to Multi Unit Systems:Multi-Unit Systems:Co-Adaptation in a Co-Adaptation in a

Shared WorldShared World

Page 49: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Adaptation in Multi-Robot ScenariosAdaptation in Multi Robot Scenarios

• Collective: fitnessCollective: fitness become noisy due to partial perception, independent parallelindependent parallel actions

Page 50: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Credit Assignment ProblemCredit Assignment ProblemWith limited communication, no communication at all, or partial perception:

Page 51: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Co-Adaptation in a pCollaborative Framework

Page 52: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Co-Shaping Collaborative Behaviorp g

Three orthogonal axes to consider (extremities and b l d l i ibl )balanced solutions are possible):

1. Performance evaluation: individual vs. group fitness or reinforcement

l i h i2. Solution sharing: private vs. public policies

3 T di i3. Team diversity: homogeneous vs. heterogeneous learning

Page 53: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Multi-Agent Algorithms for g gMulti-Robot Systems

Policy Performance Sharing Diversity

i-pr-he individual private heterogeneous

i-pr-ho individual private homogeneous

i-pu-he individual public heterogeneouspi-pu-ho individual public homogeneous

g-pr-he group private heterogeneousg pr he g p p g

g-pr-ho group private homogeneous

g-pu-he group public heterogeneousg pu he group public heterogeneous

g-pu-ho group public homogeneous

Page 54: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

MA Algorithms for MR Systems

Do not make sense (inconsistent)

Interesting (consistent)

Possible but not scalable

Policy Performance Sharing Diversity

i pr he individual private heterogeneous

g ( )

i-pr-he individual private heterogeneous

i-pr-ho individual private homogeneous

i pu he individual public heterogeneousi-pu-he individual public heterogeneous

i-pu-ho individual public homogeneous

h i t h tg-pr-he group private heterogeneous

g-pr-ho group private homogeneous

g-pu-he group public heterogeneous

g-pu-ho group public homogeneous

Page 55: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

MA Algorithms for MR Systems

Example of collaborative

g y

Example of collaborative co-learning with binary encoding of 100 candidate solutionscandidate solutions and 2 robots

Page 56: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Co-Shaping Competitive BehaviorCo Shaping Competitive Behavior

fi f ≠ fi ffitness f1 ≠ fitness f2

Page 57: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Co-Shaping Obstacle Co Shaping Obstacle Avoidance

• Effects of robotic group sizeEffects of robotic group size• Effects of communication constraints

Page 58: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

MA Algorithms for MR Systemsg y

Page 59: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Distributed Robotic Adaptation• For an individual-public-heterogeneous policy:

Standard solution: off board adaptation using a centralized– Standard solution: off-board adaptation using a centralizedpopulation manager

– New solution [Pugh, SIJ 2009]: on-board on-board adaptationNew solution [Pugh, SIJ 2009]: on board on board adaptation using a distributed population manager

• Fully scalable solution, no single-point of failurey , g p• Why PSO:

appears interesting since this metaheuristic works well with– appears interesting since this metaheuristic works well with small pools of candidate solutions: candidate pool size ≈ robot team size

– limited particle neighborhood sizes → scalable, on-board operation

Page 60: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Varying the Robotic Group SizeVarying the Robotic Group Size

Page 61: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Varying the Robotic Group SizeVarying the Robotic Group Size

Page 62: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Varying the Robotic Group SizeVarying the Robotic Group Size

Page 63: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Var ing the Robotic Gro p Si eVarying the Robotic Group Size

Page 64: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Varying the Robotic Group Size –E l i T ti E i tEvolving vs. Testing Environment

• PSO only; numberPSO only; number of particles = 10

• Gradually increase ynumber of robots on team

• Up to 10x faster learning with little performance loss

• Arena 3x3 m

[Pugh and Martinoli, Swarm Intelligence J., 2009]

Page 65: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Communication-Based Neighborhoods

• Default neighborhood ring topology 2• Default neighborhood - ring topology, 2 neighbors for each robot

• Problem for real robots: neighbor could be• Problem for real robots: neighbor could be very far away

• Possible solutions – use two closest robots in• Possible solutions – use two closest robots in the arena (capacity limitation), use all robots within some radius r (range limitation); w t so e ad us ( a ge tat o );reality is affected often by both

• How will this affect the performance?ow w s ec e pe o ce?

Page 66: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Communication-Based Neighborhoods

Page 67: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Communication-Based N i hb h dNeighborhoods

Ring Topology - Standard

Page 68: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Communication-Based NeighborhoodsNeighborhoods

2-closest – Model A

Page 69: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Communication-Based N i hb h dNeighborhoods

Within range r – Model B

Page 70: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Communication-Based NeighborhoodsCommunication Based Neighborhoods• Realistic simulation results

(Webots)(Webots)• 10 Khepera III robots (one

particle per robot)• Standard neighborhood is 2• Standard neighborhood is 2

fixed particles• New particle

neighborhoods defined byneighborhoods defined by communication limitations:– Model A: particles from

closest two robots– Model B: particles from all

robots within range r = 120 cm

• No drawback to• No drawback to communication-based neighborhoods[Pugh and Martinoli, Swarm Intelligence J., 2009]

Page 71: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

V i th C i ti RVarying the Communication RangeArena: 3x3 m, 10 robots total,

Page 72: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Varying the Communication Range• Realistic simulation results (Webots)• Low communication ranges → learning too slowly

High comm nication ranges fitness platea s too earl• High communication ranges → fitness plateaus too early• Limited communication range gives best performance →

scalable solution

[Pugh and Martinoli, Swarm Intelligence J., 2009]

Page 73: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Distributed Adaptation with Real RobotsEnabling Technology

• On-board relative positioning system• Principle:

- Enabling Technology

• Principle: – belt of IR emitters (LED) and receivers

(photodiode)IR LED used as antennas; modulated light– IR LED used as antennas; modulated light (carrier 10.7 MHz)

– RF chip behind, measured RSSI– Measure range & bearing of the next robot– Measure range & bearing of the next robot

(relative positioning) can be coupled with RF channel (e.g. 802.11) for heading assessment

– Can also be used for 20 kbit/s com channel

• [Pugh et al., IEEE Trans. on Mechatronics, 2009]2009]

Page 74: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Distributed Adaptation with Real Robots- Sample Results

• All operations on-board using relative positioning system• Effective learning of obstacle avoidance behavior• Effective learning of obstacle avoidance behavior• Similar results for communication-based neighborhoods• Arena 3 x 3 m

[Pugh and Martinoli, Swarm Intelligence J., 2009]

Page 75: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Distributed Adaptation withR l R b tReal Robots

Before adaptation (5x speed-up)

Page 76: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Distributed Adaptation withR l R b tReal Robots

After adaptation (5x speed-up)

Page 77: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

ConclusionConclusion

Page 78: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Take Home Messages• Evaluative machine-learning techniques are key for

robotic adaptation• Computationally efficient, noise-resistant algorithms can

be obtained with a simple aggregation criterion in the i l i h i lmain algorithmic loop

• Several successful strategies have been designed for dealing with collective-specific problems; theydealing with collective specific problems; they differentiate along three axes: public/private; homogeneous/heterogeneous, individual/group

• Multi robot platforms can be exploited for testing in• Multi-robot platforms can be exploited for testing in parallel multiple candidate solutions.

• PSO appears to be well suited for fully distributed on-pp yboard operation and strategies based local proximity-based neighborhood have been successfully tested

Page 79: An Introduction to Diib d biistributed Robotic Systems · – In simulation: reproduces the real scenario in simulation and applies there machine-learning techniques; the learned

Additional Literature – Part IBooks• Kennedy J. and Eberhart R. C. with Y. Shi, “Swarm Intelligence”. Morgan

Kaufmann Publisher, 2001. ,• Clerc M., “Particle Swarm Optimization”. ISTE Ltd., London, UK, 2006.• Engelbrecht A. P., “Fundamentals of Computational Swarm Intelligence”. John

Wiley & Sons, 2006.

Papers• Pugh J. and Martinoli A., “Distributed Scalable Multi-Robot Learning using

Particle Swarm Optimization”. Swarm Intelligence Journal, 3(3): 203-222,Particle Swarm Optimization . Swarm Intelligence Journal, 3(3): 203 222, 2009.

• Pugh J., Raemy X., Favre C., Falconi R., and Martinoli A., “A Fast On-Board Relative Positioning Module for Multi-Robot Systems”. Special issue on Mechatronics in Multi Robot Systems Chow M Y Chiaverini S Kitts CMechatronics in Multi-Robot Systems, Chow M.-Y., Chiaverini S., Kitts C., editors, IEEE Trans. on Mechatronics, 14(2): 151-162, 2009.

• Murciano A. and Millán J. del R., "Specialization in Multi-Agent Systems Through Learning". Biological Cybernetics, 76: 375-382, 1997.

• Mataric, M. J. “Learning in behavior-based multi-robot systems: Policies, models, and other agents”. Special Issue on Multi-disciplinary studies of multi-agent learning, Ron Sun, editor, Cognitive Systems Research, 2(1):81-93, 2001.