UNIVERSIDAD POLITECNICA DE MADRID

FACULTAD DE INFORMATICA

Response Threshold Models, Stochastic Learning

Automata and Ant Colony Optimization-based

Decentralized Self-Coordination Algorithms for

Heterogeneous Multi-Tasks Distribution in

Multi-Robot Systems

Ph.D Thesis

Alma Yadira Quinonez Carrillo

M.Sc. in Artificial Intelligence

Madrid, 2012

DEPARTAMENTO DE INTELIGENCIA ARTIFICIAL

Thesis Advisors

Javier de Lope Asiaín

PhD. in Informatics

Darío Maravall Gómez-Allende

PhD. Telecommunications Engineer

Madrid, 2012

Committee appointed by His Magnificence and Excellency the Rector of the Universidad Politécnica de Madrid on the ____ of ________ 2012.

President: ____________________

Member: ____________________

Member: ____________________

Member: ____________________

Secretary: ____________________

Substitute: ____________________

Substitute: ____________________

The thesis was read and defended on the ____ of ________ 2012 at the Facultad de Informática.

MEMBER MEMBER MEMBER

PRESIDENT SECRETARY

I would like to dedicate this thesis to my Mother and my Brothers.

Acknowledgements

After such a great experience, I obviously have many people to thank...

First, I want to thank all my family members. Thanks for being there and

supporting me in every decision. Thank you for believing in me and giving me

the strength to face even the most difficult things. It is thanks to all of you that I was able to achieve this objective.

I would also like to take this opportunity to thank my supervisors, Javier de Lope and Darío Maravall, because they have helped me enormously to further my understanding and expand my horizons in the field of robotics. Above all, I am very grateful to them for their unfailing interest, guidance and wisdom during the development of this project.

I am sincerely grateful to the Consejo Nacional de Ciencia y Tecnología, the Universidad Autónoma de Sinaloa and the Universidad Politécnica de Madrid for the financial support that made this PhD thesis possible.

A heartfelt thanks also to the members of the Lab for making my stay more

comfortable, but in particular, to Antonio Fernandez and Juan Bekios for their

comments and suggestions.

Last but not least, I would like to express my gratitude to all the friends I met here in Madrid, who not only encouraged me during my research career but also gave me many great moments. Thanks to Marinela, Ivan, Lindsay, Miguel, Jez, Boris, Gonzalo, Juan, Tony, Ernesto, Raul, Ghislain, David and Monse for sharing so many lunches with me and for talking about more than just work. I have enjoyed these last few years enormously! Thank you all for your support, friendship and conviviality.

Yadira Quinonez

Abstract

In recent decades, there has been increasing interest in systems composed of several autonomous mobile robots and, as a result, a substantial amount of development in the field of Artificial Intelligence, especially in Robotics. Several studies in the literature focus on the creation of intelligent machines and devices capable of imitating the functions and movements of living beings. Multi-Robot Systems (MRS) can often deal with tasks that are difficult, if not impossible, for a single robot to accomplish. In the context of MRS, one of the main challenges is the need to control, coordinate and synchronize the operation of multiple robots to perform a specific task. This requires the development of new strategies and methods that allow us to obtain the desired system behavior in a formal and concise way.

This PhD thesis studies the coordination of multi-robot systems and, in particular, addresses the problem of distributing heterogeneous multi-tasks. The main interest in these systems is to understand how, from simple rules inspired by the division of labor in social insects, a group of robots can perform tasks in an organized and coordinated way. We are mainly interested in truly distributed or decentralized solutions in which the robots themselves, autonomously and individually, select a particular task so that all tasks are optimally distributed.

In general, to distribute multi-tasks among a team of robots, the robots have to synchronize their actions and exchange information. Under this approach we can speak of multi-tasks selection rather than multi-tasks assignment: the agents or robots select their tasks instead of being assigned a task by a central controller. The key element in these algorithms is the estimation of the stimuli and the adaptive update of the thresholds. Each robot performs this estimation locally, based on the load, that is, the number of tasks pending. In addition, the results of each approach are evaluated and compared after introducing noise in the number of pending loads, in order to simulate the robot's error in estimating the real number of pending tasks.

The main contribution of this thesis lies in its approach based on self-organization and the division of labor in social insects. An experimental scenario for the coordination problem among multiple robots, the robustness of the approaches and the generation of dynamic tasks are presented and discussed. The particular issues studied are:

• Threshold models: experiments are conducted to test the response threshold model and analyze the system performance index for the problem of distributing heterogeneous multi-tasks in multi-robot systems. Additive noise is introduced in the number of pending loads, and dynamic tasks are generated over time.

• Learning automata methods: experiments are described that test the learning automata-based probabilistic algorithms. The approach is tested to evaluate the system performance index with additive noise and with dynamic tasks generation, for the same problem of distributing heterogeneous multi-tasks in multi-robot systems.

• Ant colony optimization: the experiments test the ant colony optimization-based deterministic algorithms for distributing heterogeneous multi-tasks in multi-robot systems. The system performance index is evaluated by introducing additive noise and dynamic tasks generation over time.

Resumen

In recent decades there has been growing interest in systems composed of several autonomous mobile robots and, as a result, a substantial amount of development has emerged in the field of artificial intelligence, especially in robotics. Several studies in the literature focus on the creation of intelligent machines and devices capable of imitating the functions and movements of living beings. Multi-robot systems (MRS) can often deal with tasks that are difficult, if not impossible, for a single robot to carry out. In the context of MRS, one of the main challenges is the need to control, coordinate and synchronize the operation of multiple robots to perform a specific task. This requires the development of new strategies and methods that make it possible to obtain the desired system behavior in a formal and concise way.

This thesis studies the coordination of multi-robot systems and, in particular, addresses the problem of distributing multiple heterogeneous tasks. The main interest in this type of system is to understand how, from simple rules inspired by the division of labor in social insects, a group of robots can perform tasks in an organized and coordinated way. We are mainly interested in truly distributed or decentralized solutions in which the robots themselves, autonomously and individually, select a particular task in such a way that all tasks are optimally distributed.

In general, to distribute multiple tasks among a team of robots, the robots have to synchronize their actions and exchange information. Under this approach one can speak of the selection of multiple tasks rather than the assignment of multiple tasks, that is, of how the agents or robots select the tasks instead of being assigned to a task by a central controller. The fundamental element in these algorithms is the estimation of the stimuli and the adaptive update of the thresholds. Each robot performs this estimation locally, depending on the load or the number of tasks pending execution. In addition, the results of each approach are evaluated and compared after introducing noise in the number of pending loads, to simulate the robot's error in estimating the real number of pending tasks.

The main contribution of this thesis lies in an approach based on self-organization and the division of labor in social insects. An experimental scenario for the coordination problem among multiple robots, the robustness of the approaches and the generation of dynamic tasks are presented and discussed. The specific topics studied are the following:

• Threshold models: experiments are presented that test the response threshold model in order to analyze the system performance index for the problem of distributing multiple heterogeneous tasks in multi-robot systems. Additive noise is introduced in the number of pending loads, and dynamic tasks are generated over time.

• Learning automata methods: experiments are described that test the learning automata-based probabilistic algorithms. The approach is tested to evaluate the system performance index with additive noise and with dynamic tasks generation, for the same problem of distributing multiple heterogeneous tasks in multi-robot systems.

• Ant colony optimization: the experiments test the ant colony optimization-based deterministic algorithms for distributing multiple heterogeneous tasks in multi-robot systems. The system performance index is evaluated by introducing additive noise and generating dynamic tasks over time.

Contents

Acknowledgements
Abstract
Resumen
Contents
List of Figures
List of Tables

I Goals and Background

1 Introduction
1.1 Motivation
1.2 Thesis Objectives
1.2.1 General Objective
1.2.2 Specific Objectives
1.3 Main Contributions and Publications
1.3.1 Main Contributions
1.3.2 Publications
1.4 Thesis Structure

2 State of the Art
2.1 Multi-Robot Systems
2.1.1 Coordination in Multi-Robot Systems
2.1.2 Architectures for Multi-Robot Systems
2.1.2.1 Centralized Architectures
2.1.2.2 Hierarchical Architectures
2.1.2.3 Decentralized Architectures
2.1.2.4 Hybrid Architectures
2.1.3 Main Problems among a Group of Robots
2.1.4 Coordination Schemes: Cooperative and Competitive
2.2 Fields of Application
2.2.1 Cooperative Manipulation
2.2.2 Unstructured Environments
2.2.3 Formation Control
2.2.4 Biologically-Inspired
2.3 Previous and Related Work
2.3.1 Formal Methods in Relation to Coordination
2.3.1.1 Multi-Agent Systems
2.3.1.2 Swarm Robots
2.3.1.3 Multi-Robot Systems

II Setting the Problem

3 Problem Description
3.1 Problem Statement
3.2 Formal Description of the Problem
3.3 Application Scenario
3.4 Description of the Proposed Solution

III Foundations

4 Theoretical Fundamentals
4.1 Introduction
4.2 Threshold Models
4.2.1 An Overview of Response Threshold Model
4.2.2 Model
4.3 Learning Automata Methods
4.3.1 A Brief Introduction
4.3.2 Definition of Stochastic Processes
4.3.3 Basic Definition of Learning Automata
4.3.4 Stochastic Reinforcement Algorithms based on Reward and Penalty
4.4 Ant Colony Optimization
4.4.1 A Brief Introduction
4.4.2 Biological Inspiration
4.4.3 The Ant System Approach

IV Experimentation and Conclusions

5 Experimental Results
5.1 Preliminaries of the Experimentation
5.1.1 Evaluation of the Performance Index
5.1.1.1 Additive Noise Generation
5.1.1.2 Dynamic Tasks Generation
5.2 Experiments with Threshold Models
5.2.1 Goals
5.2.2 Evaluation of the Approach with Additive Noise
5.2.3 Evaluation of the Approach with Dynamic Tasks
5.2.4 Results and Discussion
5.3 Experiments with Learning Automata-based Probabilistic Algorithms
5.3.1 Goals
5.3.3 Evaluation of the Approach with Dynamic Tasks
5.4 Experiments with Ant Colony Optimization-based Deterministic Algorithms
5.4.1 Goals
5.4.3 Evaluation of the Approach with Dynamic Tasks

6 Conclusions and Further Work
6.1 Conclusions
6.2 Future Research Work

Bibliography

List of Figures

2.1 Taxonomy: coordination dimensions in multi-robot systems
2.2 Multi-robot system
2.3 Box-pushing mission [59; 107; 160; 166] and group of mobile robots designed to work cooperatively lifting columns (http://birg.epfl.ch/page28710.html)
2.4 Exploration in unstructured environments. (a) The Mars exploration rovers, Spirit and Opportunity, with a manipulator arm in front; (b) a conceptual drawing for the robotic rescue of the Hubble space telescope; (c) the Pathfinder rover, Sojourner; and (d) Rocky 4
2.5 Formation control. (a) Flying in Formation Takes Aircraft Farther, Dylan Ashe (http://www.popsci.com/); (b) Vicon cameras overlooking a group of Khepera III robots (3 cameras shown, 8 cameras in total) [98]
2.6 Bio-inspired robotics
3.1 Experimental scenario
3.2 Procedure for the selection of multi-tasks
4.1 Threshold function
4.2 Semi-logarithmic plot with different thresholds (θ = 1, 5, 20, 50) and with n = 2
4.3 Interaction of learning automaton with random environment
4.4 Experimental setting presented in [12] that shows the shortest-path-finding capability of ant colonies
5.1 Learning curves with the evolution of the system performance index for self-election of tasks using Response Threshold Models with noise = 0.10
5.2 Learning curves with the evolution of the system performance index for self-election of tasks using Response Threshold Models with noise = 0.25
5.3 Dynamic tasks generation: learning curves with the evolution of the system performance index for self-election of tasks using Response Threshold Models
5.4 Learning curves with the evolution of the system performance index for self-election of tasks using Learning Automata-based probabilistic algorithms with noise = 0.10
5.5 Learning curves with the evolution of the system performance index for self-election of tasks using Learning Automata-based probabilistic algorithms with noise = 0.25
5.6 Dynamic tasks generation: learning curves with the evolution of the system performance index for self-election of tasks using Learning Automata-based probabilistic algorithms
5.7 Learning curves with the evolution of the system performance index for self-election of tasks using Ant Colony Optimization-based deterministic algorithms with noise = 0.10
5.8 Learning curves with the evolution of the system performance index for self-election of tasks using Ant Colony Optimization-based deterministic algorithms with noise = 0.25
5.9 Dynamic tasks generation: learning curves with the evolution of the system performance index using Ant Colony Optimization-based deterministic algorithms
5.10 The index k represents the number of tasks expected to be generated during a time interval for different values of λ, and P(X = k) is the probability that the variable X, with a given probability distribution, is equal to k
5.11 Number of tasks performed by each robot

List of Tables

2.1 Multi-robot taxonomies
5.1 Experiments performed without dynamic tasks and their respective variants
5.2 Experiments performed with dynamic tasks and their respective variants

Part I

Goals and Background

Chapter 1

Introduction

A rational and fruitful discussion is

impossible unless the participants share a

common framework of basic assumptions

or, at least, unless they have agreed on

such a framework for the purpose of the

discussion.

Karl R. Popper

SUMMARY: This chapter details the aspects related to the research area. Section 1.1 gives the reasons why this research work is important. Section 1.2 defines the general and specific objectives. Section 1.3 presents the main contributions of the thesis and the results obtained, which have been presented at several international conferences and published in peer-reviewed international scientific journals. Finally, Section 1.4 briefly describes the organization of the thesis.

1.1 Motivation

Systems formed by multiple mobile robots, also known as Multi-Robot Systems (MRS), are employed for different reasons. One of the main motivations is that MRS can increase the effectiveness of a system in terms of time and quality while providing greater flexibility in task execution. Generally speaking, the term multi-robot system covers different types of robotic systems, for example several industrial manipulators, mobile robots with manipulators on board, or teams of autonomous vehicles. In this thesis, the term refers to a team of cooperating mobile robots that carries out the distribution of heterogeneous multi-tasks.

The problem of coordination in MRS has been discussed in the literature in many forms; each of the proposed methods is applied to groups of robots that work closely together to accomplish a task composed of multiple sub-tasks. As is typical for many complex systems, mathematical models are needed to analyze the trade-offs and the accuracy of the system. The main advantages of these systems are that robots can perform multiple tasks with much greater precision than humans and, above all, that they can be extremely efficient: they perform calculations quickly, minimize risk and complete a task in less time. Probably one of the most promising directions for research in this area is the coordination of multiple robots.

In recent decades, a large amount of research on autonomous mobile robots has addressed the coordination between them [3; 18; 24; 75]. These investigations have been directed toward finding efficient and robust methods for controlling groups of mobile robots. This growth has also given rise to new problems that require the execution of larger and more complex tasks. A very useful solution is to use multiple cooperative robots to accomplish a given task, since the cost is generally lower for several robots than it would be for one single robot. In addition, a group of robots can perform more tasks, and perform them faster, than a single independent robot could.

For example, a group of unmanned aerial vehicles (UAVs) can be deployed to perform dangerous tasks, to improve the chance of success and to study the consequences of a natural disaster. In applications such as reconnaissance missions, mine detection, surveillance and victim rescue, groups of robots can augment and even replace humans, avoiding possible injury to those who protect us. During these missions, communication must be maintained within the team of robots so that the task at hand is carried out successfully.

MRS can often deal with tasks that are difficult, if not impossible, for a single robot to accomplish. In the context of MRS, one of the challenges is the need to control, coordinate and synchronize the operation of multiple robots to perform a specific task. This requires the development of new strategies and methods that obtain the desired system behavior by means of simple rules inspired by the division of labor in social insects, so that a group of robots can perform tasks in an organized and coordinated way.

1.2 Thesis Objectives

This PhD thesis focuses on the self-coordination problem of MRS and, in particular, addresses the distribution of heterogeneous multi-tasks in a robust and efficient manner. We adopt a specifically distributed or decentralized approach, as we are particularly interested in truly autonomous and decentralized techniques in which the robots themselves are responsible for choosing a particular task in an autonomous and individual way. In this regard, we have experimented with three techniques: firstly, response threshold models inspired by the division of labor in social insects; secondly, a reinforcement learning algorithm based on learning automata theory; and finally, ant colony optimization-based deterministic algorithms.

There are different strategies to address the task assignment problem, but this thesis presents self-organizing, biologically inspired approaches that address multi-tasks selection instead of multi-tasks assignment. This thesis will attempt, first, to answer the following questions:

• Is it possible for agents or robots to select their tasks instead of being assigned them?

• Is it possible to obtain an optimal distribution of the tasks when noise is introduced into the approaches?
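To make the first question concrete, the following Python fragment sketches what self-selection might look like. It is only an illustration under stated assumptions, not the algorithm developed in this thesis: it uses the classical response threshold function T(s) = s^n / (s^n + θ^n), and the function names, the normalization of the stimulus by the total pending load, and the random visiting order of the tasks are all hypothetical choices made for the example.

```python
import random


def response_probability(stimulus: float, threshold: float, n: int = 2) -> float:
    """Classical response threshold function T(s) = s^n / (s^n + theta^n)."""
    if stimulus <= 0.0:
        return 0.0  # no stimulus, so the robot never responds
    s_n = stimulus ** n
    return s_n / (s_n + threshold ** n)


def select_task(pending, thresholds, rng, n=2):
    """Let one robot choose a task type by itself (hypothetical helper).

    pending[j]    -- pending load of task type j, as estimated by the robot
    thresholds[j] -- the robot's response threshold for task type j
    Returns the index of the selected task type, or None if the robot stays idle.
    """
    total = sum(pending)
    if total == 0:
        return None  # nothing left to do
    # Assumption: the stimulus is the share of the pending load of each task type.
    order = list(range(len(pending)))
    rng.shuffle(order)  # visit task types in random order to avoid a fixed bias
    for j in order:
        stimulus = pending[j] / total
        if rng.random() < response_probability(stimulus, thresholds[j], n):
            return j  # the robot commits to this task type on its own
    return None
```

For instance, `select_task([8, 2, 0], [0.2, 0.5, 0.5], random.Random(0))` lets a robot with a low threshold for the first task type choose among three task types. No central controller appears anywhere in the sketch: the decision depends only on the robot's own thresholds and its local estimate of the pending load.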

1.2.1 General Objective

The main goal of this PhD thesis is:

“Study, analyze and propose a set of techniques or methods for the problem of coordinating multi-robot systems, specifically in the distribution of heterogeneous multi-tasks, and experiment with different approaches based chiefly on biologically inspired self-organization and emergence.”

1.2.2 Specific Objectives

The main goal is decomposed into the following specific objectives for this research:

• Investigate decentralized approaches inspired by the division of labor in social insects and apply them to the problem of distributing heterogeneous multi-tasks in MRS.

• Define the experimental scenario.

• Define the number of robots and the number of tasks in the system.

• Design the auto-assignment algorithm for multi-tasks with response thresh-

old models.

• Design the auto-assignment algorithm for multi-tasks by the reinforcement

learning algorithm based on learning automata theory.

• Design the auto-assignment algorithm for multi-tasks using ant colony optimization-based deterministic algorithms.

• Analyze the robustness of the approaches by introducing noise to the methods.

• Generate dynamic tasks over time.
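The last two objectives can be illustrated with a short Python sketch. It is a hedged example rather than the procedure used in the experiments: the interpretation of the noise level as a relative amplitude (so that 0.10 perturbs the load by up to 10%), the uniform noise distribution, the Poisson arrival model for new tasks, and all function names are assumptions introduced here for illustration.

```python
import math
import random


def noisy_load(true_load: float, noise: float, rng: random.Random) -> float:
    """Perturb the true pending load with additive noise.

    Assumption: `noise` is a relative amplitude, so the added term is drawn
    uniformly from [-noise * true_load, +noise * true_load].
    """
    perturbed = true_load + rng.uniform(-noise, noise) * true_load
    return max(0.0, perturbed)  # a pending load cannot be negative


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with rate lam (assumed arrival model)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def sample_new_tasks(lam: float, rng: random.Random) -> int:
    """Draw the number of tasks generated in one time interval (Knuth's method)."""
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count
```

A simulation loop could call `sample_new_tasks` once per time interval to add tasks to the pending load, and pass the result through `noisy_load` before each robot makes its decision, so that every robot acts on an imperfect estimate of the real load.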

1.3 Main Contributions and Publications

1.3.1 Main Contributions

The thesis presents several contributions to the self-coordination problem of multi-robot systems in the distribution of heterogeneous multi-tasks, using different biologically inspired approaches. The results obtained are based on papers that have been presented and published in several international conferences and journals.

The main contributions of the thesis are:

• A bio-inspired solution, based on response threshold models, to the self-coordination problem in multi-robot systems, achieved through the distribution of heterogeneous and specialized multi-tasks.

• A solution based on a learning automata-based probabilistic algorithm that addresses the general problem of coordinating multiple robots, specifically self-coordination in the selection of heterogeneous multi-tasks in multi-robot systems.

• A solution that applies two different approaches, ant colony optimization-based deterministic algorithms and learning automata-based probabilistic algorithms, to the general problem of coordinating multiple robots, specifically the decentralized distribution of multi-tasks in heterogeneous robot teams.

• A solution that applies two different approaches, response threshold models and stochastic learning automata, to the problem of self-coordination in the distribution of heterogeneous multi-tasks in multi-robot systems.

• An experimental scenario, common to all approaches, proposed in order to analyze the coordination problem among multiple robots. The robustness of each method has been studied by introducing noise that perturbs the number of pending loads. The performance index with generation of tasks over time has also been analyzed.

1.3.2 Publications

The results that shaped the contents of this thesis have been published in several international conferences and journals. They are available in the IEEE library, the ACM library, the ISI Web of Knowledge, and in Lecture Notes in Computer Science and Lecture Notes in Artificial Intelligence by Springer-Verlag. The publications are listed below.

Journal Publications:

• De Lope, J., Maravall, D. and Quinonez, Y. (2012). Response threshold models and stochastic learning automata for self-coordination of heterogeneous multi-tasks distribution in multi-robot systems. Robotics and Autonomous Systems - Impact Factor: 1.313 [31].

International Conference Publications:

• Quinonez, Y., De Lope, J. and Maravall, D. (2009). Communication and coordination of robots teams in dynamic environments. Twelfth International Conference on Computer Aided Systems Theory, EUROCAST 2009, pp. 150–151 [128].

• Quinonez, Y., Baca, J., De Lope, J., Ferre, M. and Aracil, R. (2010). Self-alignment approach based on cooperative behaviors for the docking process of modular mobile robots. IEEE International Conference on Electronics, Robotics and Automotive Mechanics, CERMA 2010, pp. 445–450 [130].

• Quinonez, Y., Maravall, D. and De Lope, J. (2012). Application of self-organizing techniques for the distribution of heterogeneous multi-tasks in multi-robot systems. IEEE International Conference on Electronics, Robotics and Automotive Mechanics, CERMA 2012, pp. 66–71 [133].

Book Chapter Publications:

• Quinonez, Y., De Lope, J. and Maravall, D. (2009). Cooperative and competitive behaviors in a multi-robot system for surveillance tasks. Computer Aided Systems Theory, EUROCAST 2009. Revised Selected Papers, LNCS 5717. R. Moreno-Diaz, F. Pichler, A. Quesada (Eds.) Springer-Verlag, Berlin Heidelberg, pp. 437–444 [129].

• Quinonez, Y., De Lope, J. and Maravall, D. (2011). Bio-inspired decentralized self-coordination algorithms for multi-heterogeneous specialized tasks distribution in multi-robot systems. Foundations on Natural and Artificial Computation, LNCS 6686. J.M. Ferrandez et al. (Eds.) Springer-Verlag, Berlin Heidelberg, pp. 30–39 [131].

• Quinonez, Y., De Lope, J. and Maravall, D. (2011). Stochastic learning automata for self-coordination in heterogeneous multi-tasks selection in multi-robot systems. International Conference on Advances in Artificial Intelligence, MICAI 2011, Part I, LNAI 7094, pp. 443–453 [132].

• De Lope, J., Maravall, D. and Quinonez, Y. (2012). Decentralized multi-tasks distribution in heterogeneous robot teams by means of ant colony optimization and learning automata. International Conference on Hybrid Artificial Intelligence Systems, HAIS 2012, Part I, LNCS 7208, pp. 103–114 [32].

1.4 Thesis Structure

The remainder of this document is organized into chapters, whose contents are briefly described as follows.

• Chapter 2. State of the Art

This chapter explains the main features of systems formed by multiple robots and gives an overview of previous work related to this research, in order to cover the necessary background and contextualize the associated domains. It presents an overview of the main issues in multi-robot systems: control architectures, coordination schemes and the main problems that arise in these systems. In addition, it describes some applications of robotics that involve multiple robots in different fields, such as cooperative manipulation, unstructured environments, formation control and biologically-inspired robotics. Finally, it briefly describes the main previous works related to multi-robot systems and the formal methods used.

• Chapter 3. Problem Description

This chapter defines the problem statement of the thesis, presents a formal description of the problem and describes the experimental scenario. Finally, it details the proposed solution to the previously defined problem.

• Chapter 4. Theoretical Fundamentals

This chapter reviews some mathematical concepts used throughout the thesis. Its main objective is to describe mathematical or probabilistic models based on distributed or decentralized approaches inspired by the division of labor in social insects. After a brief introduction to mathematical models, it first gives an overview of the response threshold model and, specifically, of its mathematical formulation. Secondly, it introduces learning automata methods: basic definitions from the theory of stochastic processes, a basic definition of learning automata and stochastic reinforcement algorithms based on reward and penalty. Finally, it gives a brief introduction to ant colony optimization, its biological inspiration and a description of the ant system algorithm.

• Chapter 5. Experimental Results

This chapter presents the experimental results obtained by applying the different decentralized approaches inspired by the division of labor in social insects, namely the response threshold model, the ant colony optimization-based deterministic algorithms and the learning automata-based probabilistic algorithms. We analyze the experimental results by evaluating the performance index when additive noise is introduced into the number of pending loads and when dynamic tasks are generated over time.

• Chapter 6. Conclusions and Further Work

This chapter presents the conclusions of the thesis and details the future research lines derived from this work.


Chapter 2

State of the Art

Science, despite its incredible advances, is

not and will never be able to explain

everything. It will continue to conquer new

areas that today are beyond our

understanding. But the frontiers of

knowledge, however high these may be

raised, will always have an infinite world

of mystery.

Gregorio Marañón

SUMMARY: This chapter explains the main features of systems formed by multiple robots and gives an overview of previous work related to this research, in order to cover the necessary background and contextualize the associated domains. Section 2.1 provides an overview of the main issues in multi-robot systems: control architectures, coordination schemes and the main problems that arise in these systems. Section 2.2 presents some applications of robotics that involve multiple robots in different fields, such as cooperative manipulation, unstructured environments, formation control and biologically-inspired robotics. Finally, Section 2.3 presents and reviews the main previous works related to multi-robot systems and the formal methods used.


2.1 Multi-Robot Systems

MRS is one of the applied areas of Artificial Intelligence that has grown remarkably from its inception until today [50; 55]. It has made very significant progress in various fields of application [124], becoming a fundamental tool for production, work and dangerous jobs on earth and beyond.

In recent years, MRS have been increasingly used in highly dynamic or contradictory environments to deal with complex tasks [83]. They are quickly becoming a vast research area that includes many different topics and ideas, as shown in various works [4; 35; 49; 65; 80; 122]. An MRS consists of a set of robots that, in the same environment, interact with each other to achieve a common goal [53], thus trying to improve effectiveness, performance and robustness. These systems provide greater flexibility in performing tasks and possible fault tolerance. Getting several robots to coordinate with each other to perform a specific mission is not a trivial task: they must be designed to operate in dynamic environments, in which, besides the classical problems of autonomous robotics (e.g. uncertainty and unforeseen changes, which are always present), we must also take into account new difficulties arising from the influence of the team of robots on the environment and the task goal.

The main advantages of these systems over a single robot are higher flexibility, efficiency and reliability: by accomplishing coordinated tasks that are not possible for single robots, they achieve a more robust behavior, can perform complex tasks much faster and can execute tasks beyond the limits of single robots. In fact, a multi-robot system can be robust to malfunctions such as unreliable communication and robot failures. Arai et al. [4] and Parker [122] have identified the following primary research topics within MRS:

• biological inspirations;

• communication;

• architectures, task allocation, and control;

• localization, mapping, and exploration;



• object transport and manipulation;

• motion coordination;

• reconfigurable robots;

• learning

Over these years, the scientific community has made progress in cooperative robotics with respect to mechanisms for coordination and communication [85]. Dudek et al. [49] present a taxonomy for multi-agent robotic systems, proposing a classification based on the size of the team, the communication parameters (communication range, bandwidth and topology), the reconfigurability of the team, the processing capacity of each member and the team composition (homogeneous vs. heterogeneous robots).

A taxonomy for the classification of coordination approaches in MRS has been proposed in [53; 80]. The classification is based on different levels of coordination (unaware, aware but not coordinated, weakly coordinated and strongly coordinated systems) and is characterized by two groups of dimensions: the coordination dimensions (cooperation, knowledge, coordination and organization) and the system dimensions (communication, team composition, system architecture and team size). The term dimension refers to specific features that are grouped together in the taxonomy. Fig. 2.1 shows a hierarchical structure for the coordination dimensions of the taxonomy. The levels of the structure are a cooperation level, a knowledge level, a coordination level and an organization level.

The first level of the taxonomy is concerned with the ability of the system to

cooperate in order to accomplish a specific task. The second level is concerned

with how much knowledge each robot in the system has about the presence of

other robots. The third level is concerned with the mechanism that is used in

order to achieve cooperation in the system. The fourth level is concerned with the

way the decision system is realized within the MRS. Finally, the work in [66] presents a taxonomy based on coordination mechanisms and on multi-robot task allocation.



[Figure 2.1 diagram. Levels, from top to bottom: Cooperation (Cooperative); Knowledge (Aware, Unaware); Coordination (Strongly Coordinated, Weakly Coordinated, Not Coordinated); Organization (Strongly Centralized, Weakly Centralized, Distributed).]

Figure 2.1: Taxonomy: coordination dimensions in multi-robot systems

Some researchers have proposed taxonomies or classification systems that make it possible to organize and control a multi-robot system. Table 2.1 summarizes the most significant features of some multi-robot taxonomies presented in the literature.

Taxonomy                Domain               Description
Yuta et al. [169]       Multi-robot          Defined from the objectives and mechanisms of decision.
Fulbright et al. [60]   Multi-agent          Establishes three classifications according to the coupling of agents.
Cao [19]                Cooperative robots   Based on problems and solutions of the cooperation.
Balch [8]               Multi-robot          Useful in systems that employ reinforcement learning (tasks and rewards).
Stone et al. [145]      Multi-agent          Studies the homogeneity of the agents and their level of communication.
Todt [154]              Multi-robot          Based on coordination between robots.

Table 2.1: Multi-robot taxonomies

2.1.1 Coordination in Multi-Robot Systems

Coordination is the act of organizing a group of mobile robots and is of fundamental importance for any MRS. That is, coordination in MRS implies that a group of robots work together to accomplish specified actions simultaneously, resulting in the completion of an overall system goal at the global level. Cooperation refers to the simultaneous action of two or more agents that work together and produce the same effect. In the context of multi-robot systems, cooperation is defined as the constructive and synergistic interaction of robots in a system to exchange information in an intelligent manner and thus execute tasks more quickly and efficiently. The works in [80; 87] present an explicit definition of cooperation and coordination in an MRS as follows:

COORDINATION: Cooperation in which the actions performed by each robot

take into account the actions executed by the other robots in such a way that the

whole ends up being a coherent and high performance operation.

COOPERATION: Situation in which several robots operate together to per-

form some global task that either cannot be achieved by a single robot, or whose

execution can be improved by using more than one robot, thus obtaining higher

performances.

Coordination is an essential characteristic of a group of robots and an important research issue [84], because such groups require new control and coordination techniques that enable the robots to interact with each other and with the environment in order to solve problems together. The coordination between robots can vary, but there are usually four kinds of architectures for coordinating multiple robots: centralized, distributed, hierarchical and hybrid architectures.

2.1.2 Architectures for Multi-robot Systems

Robot architectures are designed to facilitate the concurrent execution of task-achieving behaviors. At a very low level, robots must be able to react quickly to dynamic changes in the environment and perform reactive routines in order to accomplish tasks such as obstacle avoidance. At a higher level, robots must be able to coordinate with each other, performing asynchronous tasks such as cooperative search or highly synchronized tasks such as cooperative transportation. Several kinds of control architectures for MRS have been presented in the literature; the main distinction, however, is between centralized, hierarchical, decentralized and hybrid architectures [124].

2.1.2.1 Centralized Architectures

Centralized multi-robot systems were developed as a method to coordinate communication between the robots and the system. Centralization allows the main processing and computational requirements to be removed from the individual robots and handled by an external computer [149]. In centralized systems, a central unit collects and manages information about the environment and optimizes the coordination among the robots to ensure that the mission is properly achieved; moreover, it can easily manage faults in some of the robots. In these approaches the central unit plays a key role because it handles the whole system: it has to coordinate the information received from the sensors, manage global information about the environment, make all decisions and communicate with all the robots of the team. It must therefore be powerful enough to satisfy all technological requirements.

2.1.2.2 Hierarchical Architectures

Hierarchical architectures are realistic for some applications. In this control ap-

proach, each robot oversees the actions of a relatively small group of other robots,

each of which in turn oversees yet another group of robots, and so forth, down

to the lowest robot, which simply executes its part of the task. This architecture



scales much better than centralized approaches, and is reminiscent of military

command and control. A point of weakness for the hierarchical control architec-

ture is recovering from failures of robots high in the control tree [124].

2.1.2.3 Decentralized Architectures

In decentralized control architectures, the act of coordination is significantly more complex [170]. Decentralized multi-robot systems stem from the inability to adapt a fully centralized system to specific environments. Developing a fully centralized system is often difficult because of the number of robots or the capabilities of the central processor [99], and therefore decentralized systems are needed. These systems scale well to large multi-robot teams and are applicable to unknown outdoor environments [25]. Decentralized systems can easily be made tolerant to faults; however, one major drawback is the complexity of the communication network that needs to be developed between the robots [90], since the resources are distributed among all the robots and each robot works independently. Each robot uses its own sensors to extract local information about the environment and the relative position of the closest robots in order to make its own decisions. As a result, it is more difficult to coordinate the robots and optimize the execution of the mission, and a great deal of cooperation must be developed so that the system can work together.

2.1.2.4 Hybrid Architectures

Hybrid control architectures combine local control with higher-level control ap-

proaches to achieve both robustness and the ability to influence the entire team’s

actions through global goals, plans, or control. Many multi-robot control ap-

proaches make use of hybrid architectures [124].

Several works in the literature report experiments on the coordination of multi-robot systems using these schemes [24; 75; 88; 105]. There are several examples of architectures designed specifically for multi-robot systems, employing different control strategies. Below we briefly describe three prominent architectures that have been proposed in the literature:

1. The ALLIANCE architecture, developed by Parker [121], is a fault-tolerant, reliable and adaptive control architecture for the cooperative control of teams of heterogeneous mobile robots performing missions composed of loosely coupled subtasks that may have ordering dependencies. ALLIANCE is a fully distributed, behavior-based architecture that incorporates the use of mathematically modeled motivations. The architecture is implemented on each robot in the cooperative team and delineates several behavior sets, each of which corresponds to some high-level task-achieving function. The primary mechanism enabling a robot to select a high-level function to activate is the motivational behavior.

2. The Layered Architecture for the coordination of mobile robots, developed by Simmons et al. [144], enables multiple robots to explicitly coordinate actions at multiple levels of abstraction. It has three layers that enable robots to interact directly at the behavioral level, the executive level and the planning level. This architecture ensures that the robots use coordinated behaviors, coordinated task execution and coordinated planning at all levels. Each robot has these three layers: within an individual robot the layers can exchange information, while between robots the corresponding layers (e.g. the executive layers) talk to each other.

3. The CAMPOUT architecture, designed by Huntsberger et al. [78], is able to adapt autonomously to the uncertainties of a dynamic environment. “CAMPOUT is a distributed control architecture based on a multi-agent behavior-based methodology, wherein higher-level functionality is composed by coordination of more basic behaviors under the downward task decomposition of a multi-agent planner. Basically CAMPOUT provides the infrastructure, tools and guidelines that consolidate a number of diverse techniques to allow the efficient use and integration of these components for meaningful interaction and operation”. CAMPOUT comprises five architectural mechanisms: behavior representation, behavior composition, behavior coordination, group coordination and communication behaviors.

The above architectures are but a few of the complex architectures that have been developed specifically for multi-robot systems; other architectures have been proposed in [23; 56; 151; 159; 171].

2.1.3 Main Problems among a Group of Robots

Communication plays an important role in multi-robot systems and can increase their capacity and effectiveness. However, it is also one of the main problems in a group of autonomous robots because of its complexity and dynamism, as it depends on environmental conditions as well as on the interaction among the robots themselves. The amount of information exchanged at a time varies from one problem to another, and the required degree of coordination consequently increases with system complexity [85].

Communication in a multi-robot system is the ability of the members of the system to transmit and receive information among themselves. In a system of multiple robots there can be two types of communication [163]. The first is intentional or direct communication, in which dedicated devices are used to ensure effective communication. In this type, messages have a defined receiver, which always gets the information; that is, information is transmitted and received via some sort of protocol or language as a medium. The second type is non-intentional or indirect communication, in which information is transmitted through changes in the environment or through the visible state of the agents, also known as stigmergy. In this type of communication there is no specific receiver for messages: agents leave marks and trails that convey information to other agents, which recognize these changes in the environment.

Several investigations have been directed at the problem of communication and information flow between multiple robots. Different works focusing on this problem take limited communication into account [30; 58; 110], and recently there has been increased interest in the self-emergence of a common lexicon in robot teams [96; 101; 104].



2.1.4 Coordination Schemes: Cooperative and Competitive

Competitive and cooperative methods provide a means of coordinating behavioral responses for conflict resolution, with cooperative methods offering an alternative to competitive ones. Coordination can be viewed as a competition among behaviors, and this type of competitive strategy can be performed in a variety of ways. Generally, a coordination function (serving as an arbiter) selects a single behavioral response. The function can take the form of either a prioritization network (in which a strict behavioral dominance hierarchy exists) or an action-selection method (in which, on the basis of sensor information, only the most active behavior is selected).
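As a minimal sketch of the two forms of coordination function described above, the following Python fragment contrasts a prioritization network with an action-selection arbiter. The `Behavior` class, the behavior names and the activation values are illustrative assumptions and are not taken from any of the cited architectures.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Behavior:
    name: str          # hypothetical behavior label
    priority: int      # fixed rank in the dominance hierarchy (higher wins)
    activation: float  # current activation level derived from sensor data
    active: bool       # whether the behavior is triggered at this instant


def prioritization_arbiter(behaviors: Sequence[Behavior]) -> Optional[Behavior]:
    """Strict dominance: the triggered behavior with the highest fixed priority wins."""
    triggered = [b for b in behaviors if b.active]
    return max(triggered, key=lambda b: b.priority) if triggered else None


def action_selection_arbiter(behaviors: Sequence[Behavior]) -> Optional[Behavior]:
    """Action selection: the triggered behavior with the highest activation wins."""
    triggered = [b for b in behaviors if b.active]
    return max(triggered, key=lambda b: b.activation) if triggered else None


# Illustrative values only (assumed, not measured).
behaviors = [
    Behavior("avoid_obstacle", priority=3, activation=0.4, active=True),
    Behavior("follow_wall", priority=2, activation=0.9, active=True),
    Behavior("wander", priority=1, activation=0.1, active=True),
]
chosen_by_priority = prioritization_arbiter(behaviors)
chosen_by_activation = action_selection_arbiter(behaviors)
```

With these assumed values the two arbiters disagree: the prioritization network selects the obstacle-avoidance behavior because of its fixed rank, whereas the action-selection method selects wall following because it is currently the most active behavior.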

As noted above, an MRS has several advantages over a single robot; however, many problems need to be considered in a dynamic environment, for example multiple moving objects, various obstacles and other team members. All this makes coordination between robots more difficult to achieve. Currently, one of the main interests of the international community is the design of communication and coordination strategies for MRS that allow robots to modify their behavior to cope with environmental changes or with actions performed by other robots, in order to obtain cooperative behavior that allows them to achieve a common goal.

In previous works [128; 129] we presented a control architecture to achieve cooperative and competitive behaviors in an MRS in an unknown environment. We established a surveillance scenario with two teams of robots: the red robots must patrol and detect the blue robots in an office-like environment (see Fig. 2.2). The objective of the red robots is to work in a coordinated way in order to catch the blue robots (cooperative behavior), while the goal of the blue robots is to avoid being caught by any of the red robots (competitive behavior).

Figure 2.2: Multi-robot system

In another work [130], we proposed two alignment strategies for the self-reconfiguration of modular mobile robots by means of cooperative behaviors. The strategies are based on a modular robot system [5; 168] using mobile reconfigurations and were simulated to accomplish the task. The cooperative behaviors allow robots to modify their behavior to cope with environmental changes or with actions performed by other robots, and thus to achieve a common goal. According to the experimental results obtained in both works, the coordination of multi-robot systems in dynamic environments requires a well-structured control architecture, and achieving collaborative behavior between the members of a system requires a combination of behaviors associated with each robot. The results demonstrate that implementing the cooperative behaviors on both robots is the fastest way to achieve self-alignment for the docking process.

2.2 Fields of Application

Currently, many fields of application require the use of a group of robots able to exhibit more versatile behavior and to adapt flexibly to a great variety of situations. For this reason, research on MRS has increased and the field is being widely studied. Traditionally, robotics applications [124] focused mainly on the industrial sector (e.g. welding, assembly, processing, workpiece handling, cutting materials by robot), where the main objective was large-scale automation of services to increase productivity, flexibility and quality and, above all, to improve safety by reducing the risk to people in dangerous tasks.

In the past two decades, the application fields of robotics have been extended to other sectors [17]. Some examples are: robots for construction [6; 71] (e.g. buildings, tunnels, roads, bridges, walls); domestic service robots [29; 137] (e.g. vacuum cleaning, lawn mowing, window cleaning, and cleaning of pool bottoms, tanks, tubes and pipes); defense, rescue and safety robots [94; 112; 138] (e.g. rescuing victims, deactivating mines and explosives, fire fighting, surveillance and security systems); assistive robots [70; 97] (e.g. wheelchairs for disabled people, operational rehabilitation robots, wearable rehabilitation robots and other welfare functions); and robots in medicine [118; 134] (e.g. diagnostic methods, surgical and interventional robotics, robot-assisted recovery and rehabilitation, behavioral therapy, personalized care for special-needs populations).

At the present, applications of multi-robot systems span a broad spectrum

of areas, including human-unreachable environments, such as space, underwater,

and rescue; challenging domains, such as construction and teams of unmanned

aerial vehicles; and adversarial domains, such as robot soccer. Various specific

tasks are addressed, e.g., foraging and coverage of a given area, multi-target

observation, object pushing and transportation, exploration and flocking [158].

Several areas of research are currently being explored in the field of MRS, focusing mainly on issues of coordination, cooperation, communication, localization, resource conflicts and architectures, among others. These applications require more than one robot to complete a specific task, and the robots must be controlled simultaneously to ensure synchronization between them.

MRS have numerous applications and can involve different fields of robotics (for example industrial, military and service robotics, or the research and study of biological systems), and they can greatly affect different types of missions, for example exploration, box pushing, military operations, navigation in unstructured environments, traffic control, entertainment and simulations of biological systems (see Fig. ??). Some industrial applications, for example, involve moving large objects that a single robot can hardly be powerful enough to push alone, and to which it cannot apply forces in all generalized directions. In such cases, a multi-robot solution is useful for sharing the required power among multiple robots.

2.2.1 Cooperative Manipulation

Some tasks require transporting objects (see Fig. 2.3). Getting a team of robots to cooperate in carrying a large object through an environment containing static and dynamic obstacles is not an easy task. Different MRS approaches to this type of mission, generally called the box-pushing mission, have been presented in the literature. For example, [107] presents experimental results of box pushing using two legged robots; [59; 160; 166] present different methods for transporting objects with multiple mobile robots; and [152] presents an approach to carrying a deformable object by means of two mobile robots with on-board manipulators. Some works have addressed the aerial transport of objects using cables [57; 109], and [77] proposes a solution to the box-pushing problem with multiple autonomous robotic fish in an underwater environment. Finally, others have taken inspiration from ant societies [9; 89].

Figure 2.3: Box-Pushing Mission [59; 107; 160; 166] and group of mobile robots designed to work cooperatively lifting columns (http://birg.epfl.ch/page28710.html)

2.2.2 Unstructured Environments

The exploration of unknown environments (see Fig. 2.4) with a team of mobile robots is another kind of application that has been extensively studied in the literature in many forms. To achieve this mission cooperatively, all the robots must be coordinated to explore different parts of the environment, with the goal of covering the whole environment in less time than a single robot. Several authors have proposed multi-robot exploration strategies based on market principles, in which robots place bids on subtasks of the exploration effort and no central agent is required. In [142], a distributed bidding algorithm is proposed for multiple robots in exploration tasks, which also addresses the problem caused by the limited communication range. The work in [18] presents an approach to exploring an unstructured environment that has been implemented on real robots for different environments. Other market-based approaches for the coordination of multiple robots were proposed in [141; 172].
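To make the market-based idea concrete, the following Python fragment gives a minimal sketch of a greedy auction over exploration targets. It is not the algorithm of any of the cited works: the bid function (straight-line travel distance), the robot names and the coordinates are all assumptions, and for brevity the bids are compared in a single function, whereas in a decentralized system each robot would compute its own bids and exchange them with its neighbors.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def bid(robot_pos: Point, target: Point) -> float:
    """A robot's bid for a target: here simply the travel distance (an assumption)."""
    return math.dist(robot_pos, target)


def auction(robots: Dict[str, Point], targets: List[Point]) -> Dict[str, Point]:
    """Greedy auction: repeatedly award the cheapest remaining (robot, target) bid."""
    assignment: Dict[str, Point] = {}
    free_robots = dict(robots)
    open_targets = list(targets)
    while free_robots and open_targets:
        # Lowest bid wins; ties are broken by robot name, then by target.
        _, name, target = min(
            (bid(pos, t), name, t)
            for name, pos in free_robots.items()
            for t in open_targets
        )
        assignment[name] = target
        del free_robots[name]
        open_targets.remove(target)
    return assignment


# Hypothetical robots and exploration targets.
robots = {"r1": (0.0, 0.0), "r2": (10.0, 0.0)}
targets = [(9.0, 1.0), (1.0, 1.0)]
assignment = auction(robots, targets)
```

In this example each robot is awarded the target closest to it, so the two robots explore different parts of the environment instead of converging on the same one.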

Figure 2.4: Exploration in unstructured environments. (a) The Mars exploration rovers, Spirit and Opportunity, with a manipulator arm in front, (b) a conceptual drawing for robotic rescue of Hubble space telescope, (c) The Pathfinder rover, Sojourner and (d) Rocky 4.

2.2.3 Formation Control

Research on formation control involves a collection of decision-making agents with limited processing capabilities, locally sensed information and limited inter-agent communication, all seeking to achieve a collective objective (see Fig. 2.5). In recent years, there has been growing interest in distributed control because of its many advantages, such as energy saving, scalability and robustness [92; 95]. Formation control is one of the most studied problems in MRS, and many researchers have worked on consensus-based formation control [20; 36; 54; 127; 135; 164]. In the leader-follower approach, each robot is assigned a leader with respect to which it must maintain certain constraints [27; 52; 67; 147; 148; 162].
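The leader-follower idea can be illustrated with a minimal Python sketch in which the constraint is a fixed desired offset from the leader and the follower applies a simple proportional update. The offset, the gain and the static leader are assumptions made for illustration; the cited works use richer kinematic models and constraints.

```python
from typing import Tuple

Vec = Tuple[float, float]


def follower_step(follower: Vec, leader: Vec, offset: Vec, gain: float = 0.5) -> Vec:
    """Move the follower a fraction `gain` of the way toward leader + offset."""
    goal = (leader[0] + offset[0], leader[1] + offset[1])
    return (
        follower[0] + gain * (goal[0] - follower[0]),
        follower[1] + gain * (goal[1] - follower[1]),
    )


leader = (5.0, 5.0)      # assumed static leader position
offset = (-1.0, 0.0)     # assumed constraint: stay one unit behind the leader along x
position = (0.0, 0.0)
for _ in range(20):      # with a static leader the error is halved at every step
    position = follower_step(position, leader, offset)
```

After a few iterations the follower settles at the leader position plus the offset; with a moving leader the same update would track the formation slot with a lag that depends on the gain.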

Figure 2.5: Formation Control. (a) Flying in Formation Takes Aircraft Farther, Dylan Ashe (http://www.popsci.com/). (b) Image of Vicon cameras overlooking a group of Khepera III robots; 3 cameras shown, 8 cameras total [98]

2.2.4 Biologically-Inspired

The range of applications of multi-robot systems has grown in recent years. Several investigations have focused on biological inspiration, since biological systems provide fascinating examples of functional collective behavior [119; 136] in settings characterized by rapid changes, high uncertainty, indefinite richness and limited availability of information. These examples have been studied and the findings applied to the design of multi-robot systems. The first works inspired by the behavior of social animals (e.g., ants, bees, birds and fish) in relation to the study of group behavior were presented in [91; 106; 120]. Most bio-inspired robots are designed for specific tasks and for different environments (see Fig. 2.6), in order to cope with uncertain situations and react quickly to unforeseen changes in the environment. Pfeifer et al. [125] have presented a study on self-organization, embodiment and biologically inspired robotics.


Figure 2.6: Bio-inspired robotics

2.3 Previous and Related Work

Several researchers have addressed the problem of coordination in MRS, and there are currently several studies that focus mainly on coordinating a set of robots using different techniques in order to solve a specific problem. In the following subsections, we review some research trends related to the coordination of multi-agent systems, swarm robots and multi-robot systems. In particular, we focus on previous and related work on coordination in MRS, reviewing some of the approaches to coordination that employ formal methods.

2.3.1 Formal Methods in Relation to Coordination

In the last decade, there has been increasing interest in systems comprised of several autonomous mobile robots and, as a result, a substantial amount of development in this field; several researchers have studied the use of formal methods for the coordination and control of MRS. These works focus mainly on coordinating a set of robots using different techniques in order to solve a specific problem. With regard to the optimal task assignment problem, a brief review of some research trends related to the coordination of multi-agent systems, swarm robots and multi-robot systems is presented here. The discussion focuses on the recent literature on coordination with multiple robots.

2.3.1.1 Multi-Agent Systems

Research in multi-agent systems on self-organization and emergence focuses on naturally inspired approaches [62; 82] and socially-based approaches [72], and several mechanisms leading to self-organization have been studied and tested [10].

Price and Tino suggest a number of strategies to address task allocation problems in multi-agent systems, based on the principle of self-organization of social insects through the mathematical model developed by Bonabeau. They compare decentralized algorithms (FIFO and Greedy) to measure and evaluate how effectively each strategy processes the mail while minimizing the number of changes [126]. The problem considered for these adaptive algorithms is a variation of the mail retrieval problem proposed by Bonabeau.

Shang and Wang [140] have studied a similar problem of congestion of public resources in multi-agent systems: the famous “El Farol” bar problem, in which a population of N agents has to self-coordinate its attendance at a place with limited capacity C, much lower than N. Their strategy provides a simple mechanism for a large collection of decentralized decision makers to solve a complex congestion problem.

Agassounon and Martinoli [2] have proposed a system for collecting objects that is similar to a response threshold model but completely deterministic: when the stimulus exceeds a given threshold, the execution of the task begins immediately. In this case, the time needed to find an object is used as the stimulus to decide whether a robot should run the task or rest.

2.3.1.2 Swarm Robots

In general, researchers in swarm robotics are inspired by the decentralized self-

organizing biological systems and collective behavior of social insects in particular


[68]. Swarm robotics is a novel approach to robotics which tries to circumvent

problems with classical, monolithic robots like inflexibility and individual com-

plexity by applying the principles of swarm intelligence to the field of robotics

[44]. Typically these systems are composed of robots that, at the individual level,

have relatively limited capacity to solve the task and limited knowledge about

their environment. The general paradigm is often referred to as swarm intelli-

gence [16; 47; 61].

Baglietto et al. have presented a coordination approach for swarm robots that covers both navigation and task allocation and is based on RFID (Radio Frequency Identification). RFID devices are distributed a priori in the environment to build a navigation chart; each RFID device contains navigation instructions that allow the robots to follow routes from one place to another. Robots cannot communicate with each other directly, but may do so indirectly by writing to and reading from the RFID devices. To perform distributed task allocation, the algorithm defines an auction in which a central server receives the work to be undertaken by a team of robots, analyzes it and decides the number of robots required; the robots are then informed about the new tasks. The allocation is the result of negotiations that each robot conducts on its own. The robots likewise use the RFID devices to communicate, leaving registration messages for one another, for example records of assignments and of robots leaving zones. The system has been implemented in Player/Stage and the navigation algorithm has been tested in MATLAB [7].

Yang et al. [167] have proposed a foraging mission for swarm robots, using response threshold mechanisms with a nondeterministic selection of the task to be performed. The experiments have been implemented in TeamBots.

2.3.1.3 Multi-Robot Systems

One of the most popular approaches based on auction market mechanisms for the coordination of multi-robot systems was introduced by Dias and Stentz [33] in 2000. In auction-based multi-robot systems, they model the robots as self-interested agents operating in a virtual economy. Tasks are assigned to the robots through auction market mechanisms; each task that a robot completes generates income, in the form of virtual money, for providing a service to the team. However, when executing a task the robot consumes resources such as fuel or network bandwidth and therefore incurs expenses to pay for the resources used to complete the task. In 2004 [34], Dias developed a coordination mechanism called Traderbots, which is designed to inherit the effectiveness and flexibility of a market economy. This approach improved the cost estimates in order to increase the efficiency of the team; in 2006 [86], the mechanism was applied to robot teams searching for treasure in an unknown environment.
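The virtual-economy idea described above can be illustrated with a minimal sketch. It is not the algorithm of Dias and Stentz: the `Bid` class, the `run_auction` function and the rule "award the task to the most profitable bidder" are assumptions made only for illustration, with profit taken as revenue minus estimated resource cost.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    """Hypothetical bid: one robot's offer for a single task."""
    robot: str
    revenue: float  # virtual money earned if the task is completed
    cost: float     # estimated resources consumed (e.g. fuel, bandwidth)

    @property
    def profit(self) -> float:
        return self.revenue - self.cost


def run_auction(bids):
    """Return the most profitable bid, or None if no bid has positive profit."""
    profitable = [b for b in bids if b.profit > 0]
    if not profitable:
        return None
    return max(profitable, key=lambda b: b.profit)


bids = [Bid("r1", 10.0, 7.0), Bid("r2", 10.0, 4.5), Bid("r3", 10.0, 12.0)]
winner = run_auction(bids)
print(winner.robot, winner.profit)
```

In this toy setting robot r3 never wins, because its estimated cost exceeds the income the task would generate.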

Shiroma and Campos have proposed a framework for the coordination and distribution of tasks among a set of heterogeneous mobile robots called CoMutaR (Coalition formation based on Multi-tasking robots), which allows the robots to perform multiple tasks at the same time. It is based on the Contract Net Protocol to form concurrent coalitions through actions, and uses a single-round auction process. They considered two specific experiments: (1) two robots cooperating to push a box and (2) a set of three tasks performed by two robots [143].

Gerkey and Mataric have proposed an auction method for multi-robot coor-

dination in their MURDOCH system [64]. A variant of the Contract Net Pro-

tocol, MURDOCH produces a distributed approximation to a global optimum

of resource usage. The work basically shows the effectiveness of distributed ne-

gotiation mechanisms such as MURDOCH for coordinating physical multi-robot

systems. In most of the previous work, the communication between robots is as-

sumed to be perfect, which makes their algorithms unable to handle unexpected,

occasional communication link breakdowns.

Song et al. have proposed a Distributed Bidirectional Auction algorithm for the coordination of multi-robot systems. A task is divided into n sub-tasks, and a robot can run only one sub-task. The allocation of sub-tasks is decided by both the auctioneer and the bidders: the auctioneer chooses the pre-winners by ordering the offered prices, while each bidder chooses, among all the sub-tasks it has pre-won, the one with the lowest price. After the first round, the sub-tasks that were not chosen by any bidder enter a second auction round that depends on the initial auction price; this process is repeated until all sub-tasks have been completed [146].

In [93], Lim et al. have presented an architecture based on the auction market for the cooperation of a team of robots. On this platform, each team of robots is controlled by its own MRS Client program and communicates through a ZigBee Wireless Personal Area Network (WPAN). Each WPAN is assigned a different identity (ID) so that the security of the communicated data is preserved. A client program that acts as a buyer is used to deliver the users' tasks to the market. A task coordination server program is then used to match the buyers' demand with the supply from sellers. These programs are based on a client/server architecture and are connected through a Local Area Network (LAN) using the Transmission Control Protocol (TCP) and the Internet Protocol (IP).

Part II

Setting the Problem

Chapter 3

Problem Description

Most of the fundamental ideas of science

are essentially simple, and may, as a rule,

be expressed in a language comprehensible

to everyone.

Albert Einstein

SUMMARY: This chapter defines the problem addressed in this thesis. Section 3.1 introduces the research idea, describing the issues related to coordinating a team of robots, mainly the assignment of tasks among them. Section 3.2 presents a formal description of the problem. Section 3.3 shows the experimental scenario established to carry out the experiments with different decentralized approaches. Finally, Section 3.4 describes the proposed solution to the previously defined problem.

3.1 Problem Statement

Research topics in MRS have been studied by many researchers in the scientific community because of the complexity of these systems. The problem of coordinating a team of robots involves a series of challenges that go beyond the manipulation, modeling and navigation of an individual robot: the team must accomplish a large task that can be divided into smaller parallel subtasks, with a group of robots working on each individual subtask. For example, Zlot et al. have demonstrated the ability to handle task decomposition and loosely coordinated tasks using market-based techniques [173; 174].

Task assignment implies determining the order in which sub-tasks should be completed, which groups must handle each sub-task, and which robots should belong to which groups. Once the task assignment is completed, the robots should join their new groups. In addition, groups should be able to communicate with other groups to ensure that the overall task is completed.

In MRS, optimal task/job allocation or assignment is an active research problem, for which several central or global allocation methods have been proposed [79]. Probabilistic approaches have been used to solve major challenges in mobile robotics, yielding new and innovative solutions to important problems such as navigation, localization, tracking and robot control. Such approaches could also be applied to the problem of coordinating multiple robots in the self-selection of heterogeneous specialized tasks.

3.2 Formal description of the problem

The optimal multi-task selection problem in multi-robot systems can be formally

defined as follows:

• “Let $L = \{l_1(t), l_2(t), \ldots, l_J(t)\}$ be the different specialized tasks. Each $l_j \in L$ has a number of $j$ jobs or pending loads where $J = \{j_1, j_2, \ldots, j_K\}$. Let $R = \{r_1, r_2, \ldots, r_N\}$ be the set of $N$ heterogeneous mobile robots. We made several assumptions concerning the problem description mentioned above; we have supposed that all members $R = \{r_1, r_2, \ldots, r_N\}$ are able to participate in any jobs or pending loads $l_j$”.

The goal is to perform the multi-tasks selection so as to obtain an optimal distribution of a robot team formed by N heterogeneous robots, with K different robot roles or jobs, among the K different types of heterogeneous specialized tasks. Equivalently, the robots themselves, autonomously and individually, select a particular task such that all the existing tasks $L = \{l_1(t), l_2(t), \ldots, l_K(t)\}$ are optimally executed in the shortest time.

3.3 Application Scenario

We have established the following experimental scenario (Fig. 3.1) in order to analyze a particular strategy or solution for the coordination of multi-robot systems as regards the optimal distribution of the existing tasks. Given a set of N heterogeneous mobile robots in a region, the aim is to achieve an optimal distribution over the different types of tasks. The set of N robots will form sub-teams for each type of task $l_j$. The sub-teams are dynamic over time, i.e. the same robots will not always be part of the same sub-team; the members of each sub-team can vary depending on the situation.

Most of the proposed solutions in the technical literature are of a centralized

nature, in the sense that an external controller is in charge of distributing the tasks

among the robots by means of conventional optimization methods and based on

global information about the system state [65]. However, we are mainly interested in truly decentralized solutions in which the robots themselves, autonomously and in an individual and local manner, select a particular task so that all the tasks are optimally distributed and executed. In this regard, we have experimented with different techniques: first, response threshold models inspired by the division of labor in social insects; second, a reinforcement learning algorithm based on learning automata theory; and finally, ant colony optimization-based deterministic algorithms.

3.4 Description of the Proposed Solution

Research in multi-robot systems has increased considerably, to the point that systems with hundreds of robots have been proposed [75; 88]. To accomplish a given task, the robots must share information; thus, increasing the size of the team requires an increase in resources (for example, time, sensing effort and communication bandwidth). In this sense, all communication features, such as network topology, communication bandwidth, message coordination strategies and the information traffic between the robots, represent open issues for mobile robot applications.

Figure 3.1: Experimental scenario. [Diagram: the robots are distributed among four task regions L1 to L4 (task allocation), perform the tasks, and change to another task when a task is completed.]

This work is mainly based on the study of the coordination of multi-robot systems, in particular the problem of distributing heterogeneous multi-tasks. To address this problem, and in accordance with the general and specific objectives of this research, we propose experimenting with different techniques based chiefly on biologically inspired self-organization and emergence. Under this approach we can speak of multi-tasks selection instead of multi-tasks allocation, meaning that the agents or robots select their tasks instead of being assigned a task by a central controller.

The key element in these algorithms is the estimation of the stimuli and the adaptive update of the thresholds. This means that each robot performs this estimate locally, depending on the load or the number of pending tasks to be performed. In addition, it is of particular interest to evaluate each approach by comparing the results obtained when noise is introduced into the number of pending loads, in order to simulate the robot’s error in estimating the real number of pending tasks.

Fig. 3.2 shows the flow chart of the approaches used to carry out the multi-tasks selection among a group of robots.

Figure 3.2: Procedure for the selection of multi-tasks. [Flow chart, in order: Start; Define the number of robots and the types of tasks; Generate the total number of loads for each type of task; Get the probabilities of each robot for each task; Select the task; Perform the task; decision "Did the robot complete the task?" (Yes/No); decision "Are there any more tasks?" (Yes/No); End.]
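The procedure of Fig. 3.2 can be sketched as a simple simulation loop. This is only a sketch under stated assumptions: the function name `run_selection`, the rule that each robot completes one load per step, and the choice of selection probabilities proportional to the pending load are illustrative and are not taken from the thesis, which obtains the probabilities from the specific approach being tested.

```python
import random


def run_selection(num_robots, loads, rng):
    """Run the select/perform loop until no load is pending; return the step count."""
    loads = dict(loads)  # pending loads per task type (copied, not mutated)
    steps = 0
    while any(v > 0 for v in loads.values()):      # "Are there any more tasks?"
        for _ in range(num_robots):
            pending = {t: v for t, v in loads.items() if v > 0}
            if not pending:
                break
            # "Get the probabilities of each robot for each task":
            # ASSUMPTION: proportional to the number of pending loads.
            tasks = list(pending)
            weights = [pending[t] for t in tasks]
            task = rng.choices(tasks, weights=weights, k=1)[0]  # "Select the task"
            loads[task] -= 1                                     # "Perform the task"
        steps += 1
    return steps


steps = run_selection(3, {"L1": 4, "L2": 3, "L3": 2, "L4": 1}, random.Random(0))
print(steps)
```

Because every robot completes exactly one load per step in this simplified model, the number of steps depends only on the total load and the team size, not on the random choices.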

Part III

Foundations

Chapter 4

Theoretical Fundamentals

When you can measure what you are

speaking about, and express it in numbers,

you know something about it; but when you

cannot measure it, when you cannot

express it in numbers, your knowledge is of

a meagre and unsatisfactory kind.

Lord Kelvin

SUMMARY: This chapter describes the mathematical or probabilistic models that have been used to address the problem stated in this thesis, based on distributed or decentralized approaches inspired by the division of labor in social insects. Section 4.1 gives a brief introduction to mathematical models. Section 4.2 gives an overview of the response threshold model and, specifically, a description of its mathematical formulation. Section 4.3 gives a brief introduction to learning automata methods, basic definitions from the theory of stochastic processes, a basic definition of learning automata and stochastic reinforcement algorithms based on reward and penalty. Finally, Section 4.4 gives a brief introduction to ant colony optimization, its biological inspiration and a description of the ant system algorithm.

4.1 Introduction

The theory of self-organization was originally introduced in the context of physics and chemistry in order to describe the emergence of macroscopic patterns out of processes and interactions defined at the microscopic level [14]. This theory can explain the behavioral aspects of social insects; in particular, it shows how the complexity of the collective behavior of these insects may arise from the interaction among individuals who exhibit simple behavior. Bonabeau and other researchers argue that the theory of self-organization not only has implications for the study of social insects, but has also been a great tool for transferring knowledge to the area of distributed artificial systems.

The influence of biology on research in collective robotics grows every day, because collective behaviors provide evidence that systems composed of simple agents can perform complex tasks in the real world. It is known that the cognitive abilities of these insects are very limited and that complex behaviors emerge from interactions among a multitude of insects obeying simple rules; these behaviors rest mainly on the existence of social units and of individuals that interact to produce a global, emergent collective behavior.

In insect societies, many factors contribute to an individual's decision to perform a task, including genotypic, environmental, temporal, morphological, physiological and social factors. However, certain signals can be dominant in stimulating the performance of a particular task. Within a group of insects, an individual performs a task if it observes sufficient signals indicating demand for the task to be performed. These signals might be environmental or take the form of messages from fellow members of the society. Such signals can be categorised according to the task to be performed, hence the name task-associated stimulus, hereafter simply stimulus.

In many problems that involve modeling the behavior of some system, we lack

sufficiently detailed information to determine how the system behaves, or the

behavior of the system is so complicated that an exact description of it becomes

irrelevant or impossible. In that case, probabilistic and deterministic models are

often useful.

Probabilistic algorithms are those algorithms that model a problem or search a problem space using a probabilistic model of candidate solutions. Many metaheuristics and computational intelligence algorithms may be considered probabilistic; what distinguishes probabilistic algorithms is the explicit (rather than implicit) use of the tools of probability in problem solving.

In deterministic models, good decisions bring about good outcomes. Given a particular input, a deterministic algorithm will always produce the same correct output, and the underlying machine will always pass through the same sequence of states; the outcome is therefore deterministic. One simple model for deterministic algorithms is the mathematical function, since a function always produces the same output for a given input. The difference is that algorithms describe precisely how the output is obtained from the input, whereas abstract functions may be defined implicitly.

In this sense, a learning automaton is a model for making adaptation decisions using only stochastic information from the environment, without relying on detailed models or estimates of the parameters. It learns to choose the optimal actions of a

specific and finite set of actions called its action set, based only on noisy feedback

from its environment. At each time instant, the automaton randomly chooses an

action of its action set based on its current action probability distribution. Using

the feedback from the environment, the automaton updates the action probability

distribution and uses the updated distribution to select the next action.

Therefore, in both the learning automata-based probabilistic algorithms and the ant colony optimization-based deterministic algorithms, the robots decide which task to select based on the response signals emitted by the environment: a reward signal whenever the state of affairs is correct, that is, there is no pending task to be executed and all the robots are busy, and a penalty signal whenever there are idle robots or pending tasks. In the response threshold model, by contrast, each robot's decision is based on its own estimation of the current state of affairs. In the multi-agent technical literature, decisions of this kind, in which the agents use a model of the state of the world, are known as inductive learning, following the terminology introduced by Brian Arthur in the El Farol Bar problem [140].
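The reward and penalty conditions stated above can be written down directly. The function name `environment_signal` and the data layout (a dictionary of pending loads per task and a list of busy flags per robot) are hypothetical choices made for this sketch only.

```python
def environment_signal(pending_loads, robot_busy):
    """Return 'reward' if no load is pending and every robot is busy, else 'penalty'."""
    no_pending = all(count == 0 for count in pending_loads.values())
    all_busy = all(robot_busy)
    return "reward" if no_pending and all_busy else "penalty"


# No pending loads and all three robots busy: the state of affairs is correct.
print(environment_signal({"l1": 0, "l2": 0}, [True, True, True]))
# One idle robot is enough to trigger the penalty signal.
print(environment_signal({"l1": 0, "l2": 0}, [True, False, True]))
```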

4.2 Threshold Models

4.2.1 An Overview of Response Threshold Model

Threshold models are based on an understanding of the decentralized mechanisms that underlie the organization of natural swarms such as ants, bees, birds and fish. Social insects provide one of the best-known examples of biological self-organized behavior. By means of local and limited communication, they are able to accomplish impressive behavioral feats, such as maintaining the health of the colony and caring for their young.

A social insect colony operates without any central control: no one is in charge, and no colony member directs the behavior of another. Working in this decentralized way, the colony exhibits flexibility and robustness, two features that are desirable in an artificial system [89]. Social insect colonies are formed by highly cooperative groups that are expert at manipulating and exploiting their environment and at defending resources and brood, and that allow for task specialization among group members.

The response threshold model assumes that individuals have an inherent threshold to respond to stimuli associated with specific tasks and that, in a group, the individuals with the lowest threshold for a task will perform this task more often. Division of labor emerges from the differences between individuals in their

thresholds. Different versions of the response threshold model have looked at

the effect of threshold reinforcement [108; 156], colony size [63; 81], number of

tasks [69] and genetic diversity [51]. These studies assume that task stimuli are

well-mixed in the environment; the cues used by individuals to choose tasks are

therefore global.

Insect societies are characterized by the division of labor, communication between individuals and the ability to solve complex problems [13]. These characteristics have long been a source of inspiration and the subject of numerous studies, acquiring great relevance for many researchers in both robotics and biology. On the one hand, biologists try to test their theories of social insects on robots; on the other hand, researchers in robotics seek solutions to problems that cannot be solved by a single robot.


Seeley et al. [139] have considered the following experiment to study collective behavior in a colony of insects, focusing on the work performed by bees to get honey. Two food sources are presented to the colony at 8:00 A.M. at the same distance from the hive: source A is characterized by a sugar concentration of 1.0 mol/l and source B by a concentration of 2.5 mol/l. Between 8:00 A.M. and noon, source A is visited 12 times and source B 91 times. At noon, the sources are modified: source A is now characterized by a sugar concentration of 2.5 mol/l and source B by 0.75 mol/l. Between noon and 4:00 P.M., source A is visited 121 times and source B only 10 times. They have shown that a bee has a relatively high probability of going to a good food source and of abandoning a poor one.

4.2.2 Model

These observations suggest that simple behavioral rules allow the bees to select the best-quality source. On this basis, Eric Bonabeau^1 et al. have proposed a simple mathematical model of response thresholds for the regulation of division of labor in insect societies [15]. In this model it is assumed that each task is associated with a stimulus or set of stimuli, so that individuals can detect the intensity of each of the different stimuli (see Fig. 4.1) and can therefore assess the demand for a particular task when they are in contact with the associated stimulus.

Figure 4.1: Threshold function

1http://www.icosystem.com/about-us/management-team/bonabeau/


Let s be the intensity of a stimulus associated with a particular task; s can be

a number of encounters, a chemical concentration, or any quantitative cue sensed

by individuals. A response threshold θ, expressed in units of stimulus intensity, is

an internal variable that determines the tendency of an individual to respond to the stimulus s and perform the associated task. More precisely, θ is such that the probability of response is low for s < θ and high for s > θ. A mathematical model that satisfies this requirement is given by:

$$T_{\theta_{ij}}(s_j) = \frac{s_j^{n}}{s_j^{n} + \theta_{ij}^{n}}, \qquad n > 1 \qquad (4.1)$$

where $T_{\theta_{ij}}(s_j)$ is the probability that the robot $r_i$ responds by executing the task $l_j$, and $n > 1$ determines the steepness of the threshold. Following the recommendations of works by other authors [2; 15], the value of $n$ equals 2 in all experiments.

Fig. 4.2 shows the values of equation 4.1 for different values of the threshold θ. It can be seen more clearly that, for s well below θ, the probability of engaging in task performance is close to 0, and for s well above θ, this probability is close to 1. Thus, the probability that an individual will perform a task depends on s.

Figure 4.2: Semi-logarithmic plot with different thresholds (θ = 1, 5, 20, 50) and with n = 2.
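As a small numerical illustration of equation 4.1, the sketch below evaluates the response probability for the thresholds used in Fig. 4.2 with n = 2. The function name `response_probability` and the sample stimulus values are chosen here only for illustration. Note that the probability equals exactly 1/2 when the stimulus equals the threshold, which follows directly from the formula.

```python
def response_probability(s, theta, n=2):
    """Equation 4.1: probability of responding to stimulus s with threshold theta."""
    if s < 0 or theta <= 0:
        raise ValueError("expected s >= 0 and theta > 0")
    return s**n / (s**n + theta**n)


# Tabulate the function for the thresholds of Fig. 4.2 and a few stimulus values.
for theta in (1, 5, 20, 50):
    row = [round(response_probability(s, theta), 3) for s in (1, 10, 100)]
    print(theta, row)
```

A low threshold makes the robot respond even to weak stimuli, whereas a high threshold keeps the probability small until the stimulus grows large.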


The underlying idea is very simple: when a stimulus exceeds an individual's response threshold, that individual is likely to respond to the stimulus and engage in the associated task. The intensity of a stimulus decreases as individuals perform the task; therefore, individuals with high thresholds are unlikely to perform the task while other individuals, with lower thresholds, keep the stimulus intensity below those thresholds. However, when individuals with low thresholds do not perform the task, individuals with high thresholds may engage in it because the stimulus intensity comes to exceed their thresholds. Algorithm 1 describes the implementation done for this approach.

Algorithm 1: Algorithm of response threshold for the robot r_i
    Input: L = list of unselected tasks
    for all l_j ∈ L do
        if s_j > θ_ij then
            return l_j (begin running the task l_j)
        end if
    end for
    return null
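Algorithm 1 can be transcribed almost line by line. The sketch below is one possible rendering, not the thesis implementation: the name `select_task` and the use of dictionaries keyed by task name for the stimuli and for the robot's thresholds are assumptions.

```python
def select_task(unselected_tasks, stimuli, thresholds):
    """Return the first unselected task whose stimulus exceeds this robot's threshold."""
    for task in unselected_tasks:
        if stimuli[task] > thresholds[task]:
            return task  # the robot begins running this task
    return None          # no stimulus is strong enough; the robot stays idle


stimuli = {"l1": 3.0, "l2": 8.0}
thresholds = {"l1": 5.0, "l2": 5.0}   # thresholds of a single robot r_i
print(select_task(["l1", "l2"], stimuli, thresholds))
```

This is the deterministic rule of Algorithm 1; a probabilistic variant would instead draw the decision from equation 4.1.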

The tasks can be constant or can vary over time. The stimuli associated with each task can vary considerably from one task to another, depending on the nature of the tasks, the task demand and the number of robots that are executing the task. Each task is associated with a demand expressed in the form of a stimulus; when a robot performs a task, it tends to reduce the intensity of the associated stimulus and, as a result, modifies the intensity of the stimuli for the tasks that are not running.

Each robot $r_i$ has a set of response thresholds $\theta_i = \{\theta_{i,1}, \theta_{i,2}, \ldots, \theta_{i,J}\}$. Each threshold $\theta_{i,j}$ corresponds to a task type $l_j \in \{l_1, l_2, \ldots, l_J\}$ that the robot is capable of performing. The initial values of the thresholds are randomized to ensure that the robots' roles are not predetermined; when a robot engages in performing a task $l_j$, the threshold associated with that task is decremented by a small amount, as follows:

$$\theta_{i,j}^{new} = \theta_{i,j}^{old} - \sigma \qquad (4.2)$$

Conversely, the thresholds of the other tasks $l_k$, $k \neq j$, which are not running, are incremented by a small amount, as follows:

$$\theta_{i,k}^{new} = \theta_{i,k}^{old} + \sigma, \qquad k \neq j \qquad (4.3)$$

where σ > 0 is an increment or decrement factor that allows the thresholds to vary over time, depending on the performance of tasks. Algorithm 2 describes how the thresholds vary when a robot engages in performing a task.

Algorithm 2: Algorithm of response threshold for the robot r_i
    if just engaged in l_j then
        θ_new(i,j) ← θ_old(i,j) − σ
        if θ_new(i,j) < θ_min then
            θ_new(i,j) ← θ_min
        end if
        for k = 1 → J do
            if k ≠ j then
                θ_new(i,k) ← θ_old(i,k) + σ
                if θ_new(i,k) > θ_max then
                    θ_new(i,k) ← θ_max
                end if
            end if
        end for
    end if
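The threshold update of Algorithm 2 can be sketched as follows. The loop over the other tasks is interpreted here as running over task indices, in line with the text describing equation 4.3; the function name `update_thresholds` and the dictionary representation of one robot's thresholds are assumptions of this sketch.

```python
def update_thresholds(thresholds, engaged_task, sigma, theta_min, theta_max):
    """Lower the threshold of the engaged task and raise all others, with clamping."""
    updated = {}
    for task, theta in thresholds.items():
        if task == engaged_task:
            # Equation 4.2, bounded below by theta_min.
            updated[task] = max(theta_min, theta - sigma)
        else:
            # Equation 4.3, bounded above by theta_max.
            updated[task] = min(theta_max, theta + sigma)
    return updated


thresholds = {"l1": 1.0, "l2": 10.0, "l3": 49.5}
print(update_thresholds(thresholds, "l1", sigma=2.0, theta_min=1.0, theta_max=50.0))
```

Repeated engagement in the same task therefore drives its threshold toward the lower bound, which makes the robot increasingly likely to select that task again.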

4.3 Learning Automata Methods

4.3.1 A Brief Introduction

Automata models of learning systems introduced in the 1960’s were popularized as learning automata in a survey paper in 1974 [114]. Learning automata [116] have been studied extensively and have attracted considerable interest in recent years. The first research on learning automata models was carried out in mathematical psychology and describes the use of stochastic automata with updating of action probabilities, which results in a reduction in the number of states in comparison with deterministic automata. Learning automata can be applied to a broad range of modeling and control problems, such as the control of manufacturing plants, pattern recognition and path planning for manipulators, among others. An important point to note is that decisions must be made with very little knowledge of the environment, so as to guarantee robust behavior without complete knowledge of the system. In a purely mathematical context, the goal of a learning system is the optimization of a function not known explicitly [114].

Learning is defined as any permanent change in behavior as a result of past experience, and an automaton is a machine or control mechanism designed to automatically follow a predetermined sequence of operations or respond to encoded instructions [117]. The definition of a learning automaton is given in [157] as follows: “The stochastic automaton attempts a solution of the problem without

any information on the optimal action (initially, equal probabilities are attached

to all the actions). One action is selected at random, the response from the en-

vironment is observed, action probabilities are updated based on that response,

and the procedure is repeated. A stochastic automaton acting as described to

improve its performance is called a learning automaton”.

Stochastic learning automata operating in stationary as well as non-stationary random environments have been studied extensively [116; 153]. Various algorithms have been proposed in the literature (e.g., the $L_{R-I}$ algorithm, the $L_{R-P}$ algorithm, the pursuit algorithm, etc.) for the automaton to update its action probability vector [116]. The objective of stochastic learning automata is to determine how the choice of the action at any stage should be guided by past actions and responses; when a specific action is performed, the environment provides a random response which is either favorable or unfavorable [102].
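To make the update of the action probability vector concrete, the following sketch implements the standard linear reward-inaction ($L_{R-I}$) scheme: after a favorable response the probability of the chosen action moves toward 1 by a fraction a, the other probabilities shrink by the same factor, and an unfavorable response leaves the vector unchanged. The function names, the learning rate `a = 0.1` and the toy environment with fixed reward probabilities are assumptions of this sketch, not the configuration used in the thesis.

```python
import random


def lri_update(probs, action, rewarded, a=0.1):
    """Linear reward-inaction update of an action probability vector."""
    if not rewarded:
        return list(probs)  # inaction: an unfavorable response changes nothing
    return [p + a * (1.0 - p) if i == action else (1.0 - a) * p
            for i, p in enumerate(probs)]


def run_automaton(reward_probs, steps, rng, a=0.1):
    """Interact with a stationary random environment (hypothetical reward_probs)."""
    m = len(reward_probs)
    probs = [1.0 / m] * m  # initially, equal probabilities for all actions
    for _ in range(steps):
        action = rng.choices(range(m), weights=probs, k=1)[0]
        rewarded = rng.random() < reward_probs[action]
        probs = lri_update(probs, action, rewarded, a)
    return probs


print(run_automaton([0.2, 0.9], steps=2000, rng=random.Random(0)))
```

The update keeps the vector normalized, since the mass added to the chosen action equals the mass removed from the others.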

4.3.2 Definition of Stochastic Processes

Stochasticity, or uncertainty, appears in all systems, but until now it has not been possible to solve optimization problems for large systems while taking it into account explicitly. Uncertainty may be due to a lack of reliable data, to measurement errors, or to parameters that represent information about the future. Deterministic optimization assumes that the parameters of the problem are known with certainty, even if only through their average values. Stochastic optimization relaxes this assumption: the parameter values are not known, only their distributions, and it is usually assumed that these are discrete with a finite number of possible states.

Loosely speaking, a stochastic process is simply a collection of random variables indexed by time $t$, which takes values in a set $T$. If $T$ is discrete, $T = \{0, 1, 2, \ldots\}$, the process is a countable collection of random variables indexed by the non-negative integers (usually $T = \mathbb{N}$), and we speak of a stochastic process in discrete time. If $T$ is continuous, $T = [0, \infty)$ or $T = [0, a]$ with $0 < a < \infty$, the process is an uncountable collection of random variables indexed by non-negative real numbers, and we speak of a stochastic process in continuous time. It may be denoted by $X = \{X(t, \omega);\ t \in T,\ \omega \in \Omega\}$. A more precise definition may be given as follows.

Definition. A stochastic process is a family of indexed random variables $X = \{X(t, \omega);\ t \in T,\ \omega \in \Omega\}$ defined on a probability space $(\Omega, F, P)$ and taking values in a measurable space $(S, A)$. $\Omega$ is the sample space, $F$ is a sigma algebra defined on the sample space, and $P$ is a probability measure on $(\Omega, F)$. $T$ is an arbitrary index set.

A stochastic process can be viewed in several ways:

• For each choice of $t \in T$, $X(t, \omega)$ is a random variable.

• For each choice of $\omega \in \Omega$, $X(t, \omega)$ is a function of $t$.

• For each choice of $\omega$ and $t$, $X(t, \omega)$ is a number.

• In general, it is an ensemble (family) of functions $X(t, \omega)$, where $t$ and $\omega$ can take different possible values.

A stochastic process $X = \{X_t : t \in T\}$ can be considered as a mapping that depends on two arguments:

\[ X : T \times \Omega \to S, \qquad (t, \omega) \mapsto X(t, \omega) = X_t(\omega) \]

For fixed $t$, $X(t, \cdot) = X_t(\cdot)$ is a random variable defined on $(\Omega, F, P)$ and taking values in $(S, A)$.

Let $\{X_t : t \in T\}$ be a stochastic process and $\{t_1, \ldots, t_n\}$ a finite subset of $T$. The multivariate distribution of the random vector $(X_{t_1}, \ldots, X_{t_n})$ is called a finite-dimensional distribution of the process.

Given a probability space $(\Omega, F, P)$, a discrete-time stochastic process is any sequence of random variables $\{X_n\}_{n \in \mathbb{N}}$ all defined on that same space. Considering real-valued random variables, i.e. $X_n : (\Omega, F, P) \to (\mathbb{R}, B_{\mathbb{R}})$ with $B_{\mathbb{R}}$ the Borel sigma algebra, each $\omega \in \Omega$ yields a sequence of real numbers $\{X_n(\omega)\}_{n \in \mathbb{N}}$, which is called the trajectory of the process associated with $\omega$.
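The trajectory concept can be illustrated with a minimal Python sketch. The symmetric random walk, the function name `trajectory`, and the use of a seed to stand in for the outcome $\omega$ are assumptions made for this example only; they do not come from the thesis.

```python
import random

def trajectory(seed, n_steps):
    """Return one trajectory X_0, ..., X_n of a symmetric random walk.

    The seed plays the role of the outcome omega (an assumption of this
    sketch): fixing it fixes the whole sequence of real numbers.
    """
    rng = random.Random(seed)
    x = 0
    path = [x]
    for _ in range(n_steps):
        x += rng.choice((-1, 1))  # each step is a random variable
        path.append(x)
    return path

# Fixing omega (the seed) gives a function of the time index n.
path_a = trajectory(seed=1, n_steps=10)
path_b = trajectory(seed=2, n_steps=10)

# Fixing the time index and varying omega gives samples of the
# random variable X_5.
samples_x5 = [trajectory(seed=s, n_steps=5)[5] for s in range(100)]
```

Calling `trajectory` twice with the same seed returns the same sequence, which mirrors the fact that a trajectory is fully determined once $\omega$ is fixed.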

4.3.3 Basic Definition of Learning Automata

A learning automaton is a sextuple $\langle x, Q, u, \vec{P}(t), G, R \rangle$, where $x$ is the finite set of inputs, $Q = \{q_1, q_2, \ldots, q_m\}$ is a finite set of internal states, $u$ is the set of outputs, $\vec{P}(t) = \{p_1(t), p_2(t), \ldots, p_m(t)\}$ is the state probability vector at time instant $t$, $G : Q \to u$ is the output function (normally considered deterministic and one-to-one), and $R$ is an algorithm called the reinforcement scheme, which generates $\vec{P}(t+1)$ from $\vec{P}(t)$ and the particular input at a discrete instant $t$.

The automaton operates in a random environment and chooses its current state according to the input received from the environment. The new state probability distribution $\vec{P}(t+1)$ reflects the information obtained from the environment. The random environment has a set of inputs $u$, and its set of outputs is frequently binary, $\{0, 1\}$, with ‘0’ corresponding to the reward response and ‘1’ to the penalty response. If the input to the environment is $u_i$, the environment produces a penalty response with probability $c_i$.
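The interaction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the two-state automaton, and the penalty probabilities `c` are hypothetical and are not taken from the thesis.

```python
import random

def choose_state(p, rng=random):
    """Sample a state index according to the probability vector P(t)."""
    return rng.choices(range(len(p)), weights=p, k=1)[0]

def environment_response(action, c, rng=random):
    """Return 1 (penalty) with probability c[action], otherwise 0 (reward)."""
    return 1 if rng.random() < c[action] else 0

c = [0.9, 0.1]   # hypothetical penalty probabilities c_1, c_2
p = [0.5, 0.5]   # state probability vector, kept fixed in this sketch
rng = random.Random(3)

counts = [0, 0]
penalties = [0, 0]
for _ in range(5000):
    i = choose_state(p, rng)
    counts[i] += 1
    penalties[i] += environment_response(i, c, rng)

# Empirical penalty rate of each action, which approaches c_i.
penalty_rate = [penalties[i] / counts[i] for i in range(2)]
```

No learning takes place here; the sketch only shows the convention that ‘1’ is a penalty and ‘0’ a reward, and that each action has its own penalty probability.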

Fig. 4.3 shows the feedback configuration of a learning automaton operating in a random environment. At each instant $t$ the environment evaluates the action of the automaton with either a penalty ‘1’ or a reward ‘0’. The performance of the automaton’s behavior is measured by the average penalty

\[ I(t) = \frac{1}{m} \sum_{i=1}^{m} p_i(t)\, c_i \tag{4.4} \]


which must be minimized. In order to minimize the expected penalty (4.4), the reinforcement scheme modifies the state probability vector $\vec{P}$. The basic idea is to increase $p_i$ if state $q_i$ generates a reward and to decrease $p_i$ when the same state has produced a penalty. A great number of reinforcement schemes for minimizing the expected penalty have been studied and compared. One of the most serious difficulties in learning automata is the trade-off between learning speed and accuracy: if the speed of convergence of a particular reinforcement scheme is increased, this is almost invariably accompanied by a higher probability of converging to an undesired state [113; 115].
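As a small worked example of eq. (4.4), the following Python sketch evaluates the average penalty for two probability vectors. The function name `average_penalty` and the numerical values are assumptions chosen for illustration.

```python
def average_penalty(p, c):
    """Average penalty I(t) of eq. (4.4).

    p -- state probabilities p_1(t), ..., p_m(t)
    c -- penalty probabilities c_1, ..., c_m
    """
    if len(p) != len(c):
        raise ValueError("p and c must have the same length")
    m = len(p)
    return sum(pi * ci for pi, ci in zip(p, c)) / m

c = [0.8, 0.2, 0.5]              # hypothetical penalty probabilities
uniform = [1 / 3, 1 / 3, 1 / 3]  # no preference among the states
focused = [0.05, 0.90, 0.05]     # mass concentrated on the lowest c_i

i_uniform = average_penalty(uniform, c)
i_focused = average_penalty(focused, c)
```

Moving probability mass towards the state with the smallest $c_i$ lowers $I(t)$, which is what a reinforcement scheme tries to achieve.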

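Figure 4.3: Interaction of learning automaton with random environment. [Diagram: the learning automaton, with states q1, q2, ..., qm and probability vector p(t), sends outputs u1, u2, ..., um to the random environment, which has penalty probabilities c1, c2, ..., cm and returns a response in {0, 1}.]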

4.3.4 Stochastic Reinforcement Algorithms based on Reward and Penalty

A widely used stochastic reinforcement algorithm in the technical literature is $L_{R-I}$, which stands for Linear Reward-Inaction.

Suppose that the action chosen by the automaton at instant $t$ is $\phi_i$. For $L_{R-I}$, the action probabilities are updated as follows [102]:

\[ p_i(t+1) = p_i(t) + \lambda \beta(t) \left[ 1 - p_i(t) \right] \tag{4.5} \]

\[ p_j(t+1) = p_j(t) - \lambda \beta(t)\, p_j(t) \qquad \forall j \neq i,\ 1 \leq j \leq N \tag{4.6} \]

where $0 < \lambda < 1$ is the learning rate and $\beta(t)$ is the environment’s response: $\beta = 1$ (favorable response, or reward) or $\beta = 0$ (unfavorable response, or penalty, in which case the algorithm does not change the probabilities, hence “inaction”).

Suppose that there are $K$ different specialized tasks. We denote by $p_{ij}(t)$ the probability that robot $r_i$ selects task $l_j$ at instant $t$. These probabilities satisfy:

\[ 0 \leq p_{ij}(t) \leq 1; \qquad \sum_{j=1}^{K} p_{ij}(t) = 1; \qquad i = 1, 2, \ldots, N \text{ robots};\ j = 1, 2, \ldots, K \text{ tasks} \tag{4.7} \]

Initially, when the robots have no previous experience, these probabilities are initialized at the “indifference” position as follows:

\[ p_{ij}(0) = \frac{1}{K} \qquad \text{for } i = 1, 2, \ldots, N \text{ robots and } j = 1, 2, \ldots, K \text{ tasks} \tag{4.8} \]

The learning process then starts, in which each robot updates its selection probabilities according to the following conventional updating rule:

\[ p_{ij}(t+1) = p_{ij}(t) + \lambda \beta(t) \left[ 1 - p_{ij}(t) \right] \tag{4.9} \]

where $0 < \lambda < 1$ is the learning rate, fixed at 0.2, and $\beta(t)$ is the usual reward signal generated by the environment of the learning automaton, with the following interpretation. $\beta(t) = 1$ (reward) if and only if, for the corresponding task $l_j$ at instant $t$, $\#R_j(t) \leq \#L_j(t)$, i.e. the number of robots performing task $l_j$ does not exceed the number of tasks $l_j$ to be executed. $\beta(t) = 0$ (penalty) if and only if $\#R_j(t) > \#L_j(t)$, i.e. the number of robots performing task $l_j$ is greater than the number of tasks $l_j$; the automaton also receives a penalty signal whenever there are no pending tasks to be executed. In short, at each instant $t$ the environment evaluates the action of the automaton: a response of 1 means that the action is “favorable” and a response of 0 means that it is “unfavorable”, as follows:

\[ \beta_{L_j}(t) = \begin{cases} 1 \ (\text{reward}) & \text{if } \#R_j / \#L_j \leq 1 \\ 0 \ (\text{penalty}) & \text{if } \#R_j / \#L_j > 1 \end{cases} \tag{4.10} \]
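The update rule and the reward signal can be sketched together in Python. This is a minimal illustration, not the thesis implementation: the function names are hypothetical, and the sketch applies both eq. (4.5) to the chosen task and eq. (4.6) to the others so that the probabilities of one robot keep summing to one.

```python
def reward_signal(n_robots_on_task, n_pending):
    """Environment response of eq. (4.10): 1 = reward, 0 = penalty."""
    if n_pending <= 0:
        return 0  # no pending tasks: the automaton is penalized
    return 1 if n_robots_on_task / n_pending <= 1 else 0

def lri_update(p, chosen, beta, lam=0.2):
    """Linear Reward-Inaction update of eqs. (4.5)-(4.6).

    p      -- selection probabilities of one robot over the K tasks
    chosen -- index of the task selected at instant t
    beta   -- 1 (reward) or 0 (penalty, probabilities unchanged)
    lam    -- learning rate, 0 < lam < 1
    """
    new_p = []
    for j, pj in enumerate(p):
        if j == chosen:
            new_p.append(pj + lam * beta * (1.0 - pj))
        else:
            new_p.append(pj - lam * beta * pj)
    return new_p

K = 4
p = [1.0 / K] * K  # "indifference" position, eq. (4.8)

# 3 robots on a task with 10 pending loads: reward.
p_after_reward = lri_update(p, chosen=2, beta=reward_signal(3, 10))
# 12 robots on a task with 10 pending loads: penalty, so inaction.
p_after_penalty = lri_update(p, chosen=2, beta=reward_signal(12, 10))
```

With $\lambda = 0.2$, a reward moves the chosen task from 0.25 to 0.4 and scales the other three down to 0.2 each, while a penalty leaves the vector untouched.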

4.4 Ant Colony Optimization

4.4.1 A Brief Introduction

For many years, colonies of social insects have been studied in depth [119; 136], as they provide fascinating examples of functional collective behavior and of decentralized problem solving, as seen in the way these insects find food, build or expand their nests, divide labor, and so on. Another important feature is that they solve problems very flexibly, adapting robustly to environmental changes. A great deal of research has therefore been devoted to figuring out how social insects achieve these feats. This research has allowed computer scientists to design a variety of “ant algorithms”, all of which attempt to capture some of the remarkable qualities of social insects, such as self-organization, flexibility, and robustness.

Ant Colony Optimization (ACO) is a meta-heuristic approach that was introduced in the early 1990s by Marco Dorigo [37; 38]. Since its introduction, a growing number of researchers have been involved in developing it further. The general idea of the ACO approach is to solve combinatorial optimization problems based on the behavior of real ants; more specifically, the source of inspiration is the way ants find shortest paths between food sources and their nest [12] (see Fig. 4.4). ACO algorithms are stochastic search procedures based on a colony of artificial ants (computational agents) that work cooperatively and communicate through artificial pheromone trails [43], by means of a parameterized probabilistic model [45] that its authors call “the pheromone model”.

ACO algorithms use a population of artificial ants to construct feasible solutions to a discrete optimization problem. The solutions are evaluated according to a fitness function and, following a pre-defined rule, the ants record their solution information in a global memory known as the pheromone mapping, where each component of the pheromone mapping corresponds to an individual connection of the problem being optimized. Ant algorithms are based on two essential principles [42]: (1) self-organization, in which global behavior arises from a myriad of low-level interactions, and (2) stigmergy, in which the individuals interact with one another indirectly, using the environment as an intermediary. That is, one individual changes its surroundings (e.g., by laying a pheromone trail), and other individuals then react to those changes at a later time.

Figure 4.4: Experimental setting presented in [12] that shows the shortest path finding capability of ant colonies

ACO algorithms imitate the foraging behavior of natural ants and have been applied successfully to hard combinatorial optimization problems such as the travelling salesman problem [26; 39; 48; 150; 161], the quadratic assignment problem [100], and the job shop scheduling problem [39]. Researchers have since applied them to many other discrete optimization problems [1; 21; 22; 41; 165] and to further combinatorial optimization problems [11; 76].

4.4.2 Biological Inspiration

“Greater understanding of biology in modern times has enabled significant breakthroughs in improving healthcare, quality of life, and eliminating many diseases and congenital illnesses. Simultaneously there is a move towards imitating nature and copying many of the wonders uncovered in biology, resulting in biologically inspired systems” [73]. Biological inspiration can play many different roles. One of the most important is that biology can serve as an existence proof that some desirable behavior is possible. That is, a biological system may operate according to principles that are applicable to non-biological computing problems. By studying the biological system, one may be able to derive or understand the relevant principles and use them to help solve a non-biological problem.

Ants are able to find the shortest path between their nest and a food source by following the trail of a chemical substance called “pheromone” [28]. If no pheromone trails are available, ants move randomly, but in the presence of pheromones they tend to follow the trail. That is, the ants choose the path to follow by a simple rule: the stronger the pheromone trail, the higher the desirability, so the probability of choosing a path increases with the amount of pheromone deposited on it. Because pheromone accumulates faster on shorter paths, this behavior allows ants to identify the shortest paths between their nest and the food source. What is even more remarkable is that these emergent properties seem to exist without the need for centralized control [74].

ACO algorithms are based on the following ideas:

• Each path followed by an ant is associated with a candidate solution for a given problem.

• When an ant follows a path, the amount of pheromone deposited on that path is proportional to the quality of the corresponding candidate solution for the target problem.

• When an ant has to choose between two or more paths, the path(s) with a larger amount of pheromone have a greater probability of being chosen by the ant.

As a result, the ants eventually converge to a short path, hopefully the optimum or a near-optimum solution for the target problem, as explained before for the case of natural ants. In essence, the design of an ACO algorithm involves the specification of [38]:

• An appropriate representation of the problem, which allows the ants to

incrementally construct/modify solutions through the use of a probabilistic

transition rule, based on the amount of pheromone in the trail and on a

local, problem-dependent heuristic.

• A method to enforce the construction of valid solutions, that is, solutions that are legal in the real-world situation corresponding to the problem definition.

• A problem-dependent heuristic function (η) that measures the quality of

items that can be added to the current partial solution.

• A rule for pheromone updating, which specifies how to modify the pheromone

trail (τ).

• A probabilistic transition rule based on the value of the heuristic function

(η) and on the contents of the pheromone trail (τ) that is used to iteratively

construct a solution.

Artificial ants have several characteristics similar to real ants, namely:

• Artificial ants have a probabilistic preference for paths with a larger amount

of pheromone.

• Shorter paths tend to have larger rates of growth in their amount of pheromone.

• The ants use an indirect communication system based on the amount of

pheromone deposited on each path.



4.4.3 The Ant System Approach

Ant System was first introduced and applied to the TSP by Dorigo et al. [39; 40; 46]. Initially, each ant is placed on a randomly chosen city. During the construction of a feasible solution, ants select the next city to be visited through a probabilistic decision rule. When ant $k$ is in city $i$ and has constructed a partial solution, the probability of moving to a neighboring city $j$ is given by:

\[ p_{ij}^{k}(t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha} [\eta_{ij}]^{\beta}}{\sum_{u \in J_k(i)} [\tau_{iu}(t)]^{\alpha} [\eta_{iu}]^{\beta}} & \text{if } j \in J_k(i) \\ 0 & \text{otherwise} \end{cases} \tag{4.11} \]

where $\tau_{ij}$ is the intensity of the trail on edge $(i, j)$, $\eta_{ij}$ is the heuristic visibility of edge $(i, j)$, with $\eta_{ij} = 1/d_{ij}$, and $J_k(i)$ is the set of cities that remain to be visited when ant $k$ is at city $i$. $\alpha$ and $\beta$ are two adjustable positive parameters that control the relative weights of the pheromone trail and of the heuristic visibility.
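The decision rule of eq. (4.11) can be sketched as follows. The function name, the default values of `alpha` and `beta`, and the three-city instance are assumptions made for illustration; only the formula itself comes from the text.

```python
def transition_probabilities(i, unvisited, tau, dist, alpha=1.0, beta=2.0):
    """Probability of moving from city i to each city j, as in eq. (4.11).

    tau[i][j]  -- pheromone intensity on edge (i, j)
    dist[i][j] -- distance d_ij, so the visibility is eta_ij = 1 / d_ij
    unvisited  -- the set J_k(i) of cities that ant k may still visit
    """
    if not unvisited:
        raise ValueError("J_k(i) must contain at least one city")
    weights = {
        j: (tau[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta)
        for j in unvisited
    }
    total = sum(weights.values())
    n = len(tau)
    # Cities outside J_k(i) get probability 0.
    return [weights[j] / total if j in weights else 0.0 for j in range(n)]

# Hypothetical 3-city instance with equal pheromone on every edge.
dist = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
tau = [[1.0] * 3 for _ in range(3)]
probs = transition_probabilities(0, unvisited={1, 2}, tau=tau, dist=dist)
```

With equal pheromone, only the visibility matters: city 1 is at distance 1 and city 2 at distance 2, so with `beta=2.0` the weights are 1 and 0.25, giving probabilities 0.8 and 0.2.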

After each ant completes its tour, the pheromone amount on each path is adjusted according to

\[ \tau_{ij}(t+1) = (1 - \rho)\, \tau_{ij}(t) + \Delta\tau_{ij}(t) \tag{4.12} \]

In this equation,

\[ \Delta\tau_{ij}(t) = \sum_{k=1}^{m} \Delta\tau_{ij}^{k}(t) \tag{4.13} \]

\[ \Delta\tau_{ij}^{k}(t) = \begin{cases} \dfrac{Q}{L_k} & \text{if } (i, j) \in \text{tour done by ant } k \\ 0 & \text{otherwise} \end{cases} \tag{4.14} \]

Here $(1 - \rho)$ is the pheromone decay factor ($0 < \rho < 1$), which represents the evaporation of the trail; $L_k$ is the length of the tour performed by ant $k$; $m$ is the number of ants; and $Q$ is a constant.
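A minimal Python sketch of eqs. (4.12)-(4.14) is given below. The function name, the default values `rho=0.5` and `q=1.0`, and the treatment of edges as undirected (a symmetric problem) are assumptions of this example.

```python
def update_pheromone(tau, tours, lengths, rho=0.5, q=1.0):
    """Apply eqs. (4.12)-(4.14) once all ants have completed their tours.

    tau     -- square matrix of pheromone intensities tau_ij(t)
    tours   -- one closed tour per ant, as a list of city indices
    lengths -- tour length L_k of each ant
    """
    n = len(tau)
    delta = [[0.0] * n for _ in range(n)]
    for tour, length in zip(tours, lengths):
        # Pair each city with its successor, including the return edge.
        for i, j in zip(tour, tour[1:] + tour[:1]):
            delta[i][j] += q / length  # eq. (4.14), summed as in (4.13)
            delta[j][i] += q / length  # symmetric problem assumed
    # Evaporation plus deposit, eq. (4.12).
    return [
        [(1.0 - rho) * tau[i][j] + delta[i][j] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical instance: 4 cities, one ant, tour length 4.
tau = [[1.0] * 4 for _ in range(4)]
new_tau = update_pheromone(tau, tours=[[0, 1, 2, 3]], lengths=[4.0])
```

Edges on the tour, such as (0, 1), end up at 0.5 + 0.25 = 0.75, while an unused edge such as (0, 2) only evaporates to 0.5.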

In our case, a generic robot $r_i$ selects tasks in a deterministic way, based on “forces” $f_{ij}(t)$. After being initialized at the “indifference” position, these forces are updated as follows:

\[ f_{ij}(t+1) = \rho\, f_{ij}(t) + (1 - \rho)\, \beta(t); \qquad 0 \leq \rho \leq 1 \tag{4.15} \]

where $\rho$ is the usual learning rate of ant colony optimization-like algorithms and $\beta(t)$ is the reward/penalty signal at instant $t$, with exactly the same interpretation as for the learning automata-based probabilistic algorithms.
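The force update of eq. (4.15) and a deterministic selection can be sketched as follows. Several details are assumptions of this example rather than statements from the thesis: the initial value $1/K$ for the “indifference” position, the value `rho=0.8`, updating every task's force with its own signal, and breaking ties in favor of the lowest index.

```python
def update_force(f, beta, rho=0.8):
    """One step of eq. (4.15): f(t+1) = rho * f(t) + (1 - rho) * beta."""
    return rho * f + (1.0 - rho) * beta

def select_task(forces):
    """Deterministic choice: index of the task with the largest force."""
    return max(range(len(forces)), key=lambda j: forces[j])

K = 4
forces = [1.0 / K] * K  # assumed "indifference" start

# Hypothetical reward (1) / penalty (0) signals for the four tasks.
signals = [0, 1, 0, 1]
forces = [update_force(f, b) for f, b in zip(forces, signals)]
chosen = select_task(forces)
```

Since eq. (4.15) is a convex combination of the previous force and a signal in {0, 1}, forces that start in [0, 1] remain in [0, 1].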


Part IV

Experimentation and Conclusions


Chapter 5

Experimental Results

Always doubt yourself, until the data leaves

no doubt.

Louis Pasteur

SUMMARY: This chapter presents the experimental results obtained by applying the different decentralized approaches inspired by division of labor in social insects. Section 5.1 details the preliminaries of the experimentation: the evaluation of the performance index, the introduction of additive noise in the number of pending loads, and the dynamic generation of tasks over time. Section 5.2 presents the experiments with threshold models: the goals and the evaluation of the approach with additive noise and dynamic tasks. Section 5.3 shows the learning curves with the evolution of the system using learning automata-based probabilistic algorithms, including the experiments with additive noise and dynamic tasks. Section 5.4 describes the experiments with ant colony optimization-based deterministic algorithms, presenting the goals and the evaluation of the approach with additive noise and dynamic tasks.


5.1 Preliminaries of the Experimentation

We have conducted several experiments to evaluate the system performance index when response threshold models, learning automata-based probabilistic algorithms, and ant colony optimization-based deterministic algorithms are applied to the optimal distribution of tasks among the $N$ robots, so that all tasks are executed using the minimum number of robots. The ideal objective is that the performance index, or learning curve, corresponding to the load $l_j(t)$ of each task tends asymptotically to zero for all curves, in the minimum time and using the minimum possible number of robots for task execution.

In the simulations we have considered several variants: the size of the multi-robot system, different loads $l_j(t)$ for each type of task, two different ways of selecting tasks, the generation of additive noise to simulate the robots’ estimation error, and the dynamic generation of tasks $l_j(t)$ over time. Using the values obtained with eq. 4.1, eq. 4.9 and eq. 4.15, we have employed two different mechanisms for the selection of tasks:

1. Maximum principle (MP): at each instant $t$, choose the task that has the highest value of $T_{\theta_{ij}}(s_j)$, $p_{ij}(t)$ or $f_{ij}(t)$, depending on the approach.

2. Strictly random method (SRM): use $T_{\theta_{ij}}(s_j)$, $p_{ij}(t)$ or $f_{ij}(t)$ as probabilities in the strict sense of the word. A random number with uniform distribution on $(0, 1)$ is generated, and the task corresponding to the value obtained is selected by the inversion method for discrete probability distributions.
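The two selection mechanisms can be sketched in Python as follows. The function names and the example values are hypothetical; the sketch also normalizes the values by their sum before inversion, which is an assumption for the case where the per-task values do not already sum to one.

```python
import random

def maximum_principle(probs):
    """MP: pick the task with the highest value."""
    return max(range(len(probs)), key=lambda j: probs[j])

def strictly_random(probs, rng=random):
    """SRM: inverse-transform sampling from a discrete distribution."""
    u = rng.random() * sum(probs)  # uniform draw, scaled to the total
    cumulative = 0.0
    for j, pj in enumerate(probs):
        cumulative += pj
        if u < cumulative:
            return j
    return len(probs) - 1          # guard against rounding error

probs = [0.1, 0.6, 0.2, 0.1]       # hypothetical values for one robot
rng = random.Random(0)
draws = [strictly_random(probs, rng) for _ in range(10000)]
freq = [draws.count(j) / len(draws) for j in range(len(probs))]
```

MP always returns task 1 for these values, whereas SRM returns task 1 in roughly 60% of the draws and still explores the other tasks.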

5.1.1 Evaluation of the Performance Index

The performance index is an indicator that evaluates the efficiency of each method concerning the optimal distribution of the existing tasks so that all of them are executed by means of the minimum number of robots. In other words, the performance index or learning curve for each task is the corresponding load $L_j(t)$ versus time, the ideal objective being that all these curves tend asymptotically to zero in the minimum time and also with the additional constraint of using the minimum possible number of robots for task execution.


For all experiments, the graphs show the performance index for 4 types of tasks with different loads. Each task is represented by a different color (for example, task 1 is red, task 2 is blue, task 3 is green and task 4 is purple). A continuous line shows the performance index without noise and a dotted line shows it with noise.

5.1.1.1 Additive Noise Generation

To evaluate the evolution of the performance index we have introduced additive noise, perturbing the number of pending loads to simulate the robot’s error in estimating the real number of pending tasks. The noise is modeled using a normal distribution (“white noise”) as follows:

\[ Noise = R + R \cdot S = R (1 + S) \tag{5.1} \]

where $Noise$ is the perturbed number of pending loads $l_i(t)$, $R$ is the unperturbed value, so that the perturbation is proportional to $R$, and $S$ is a Gaussian random variable with mean ‘0’ and standard deviation ‘0.005’, $N(0, 0.005)$.
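Eq. (5.1) can be sketched in Python as follows. The function name `perturb_load` is hypothetical, and reading 0.005 as the standard deviation passed to `random.gauss` is an assumption of this example.

```python
import random

def perturb_load(r, sigma=0.005, rng=random):
    """Return R * (1 + S) with S drawn from N(0, sigma), as in eq. (5.1).

    r     -- unperturbed number of pending loads
    sigma -- standard deviation of the Gaussian term S
    """
    s = rng.gauss(0.0, sigma)
    return r * (1.0 + s)

rng = random.Random(42)
true_load = 300
estimates = [perturb_load(true_load, rng=rng) for _ in range(1000)]
mean_estimate = sum(estimates) / len(estimates)
```

Because the perturbation is multiplicative, its magnitude shrinks as the pending load decreases and vanishes when no load is left.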

Tables 5.1 and 5.2 show a scheme of the experiments performed with their respective variants.

Not dynamic tasks

Approaches                 Without Noise (MP, SRM)    With Noise (MP, SRM)
Threshold Models           Fig. 5.1 and Fig. 5.2      Fig. 5.1 and Fig. 5.2
Learning Automata          Fig. 5.4 and Fig. 5.5      Fig. 5.4 and Fig. 5.5
Ant Colony Optimization    Fig. 5.7 and Fig. 5.8      Fig. 5.7 and Fig. 5.8

Table 5.1: Experiments performed without dynamic tasks and their respective variants

5.1.1.2 Dynamic Tasks Generation

In the previous experiments, the number of loads for each type of task is fixed at the beginning of the simulation and does not change until the end of the execution. To evaluate the performance of the algorithm we have also generated dynamic tasks, that is, new tasks that appear in the environment over time. This idea is borrowed from classical queue simulation models, so we have used the Poisson distribution to determine the probability of generating a given number of tasks over time:

\[ f(k; \lambda) = \frac{e^{-\lambda} \lambda^{k}}{k!} \tag{5.2} \]

Specifically, we have a different distribution over $k = 1$ to $100$ for each $\lambda$, where $\lambda$ is a positive real number that represents the number of tasks expected to be generated during a time interval. So that the expected number of generated tasks decreases over time and the system is therefore stable, we have parameterized $\lambda$ as follows:

\[ \lambda(t) = \sigma - \alpha \cdot t \tag{5.3} \]

where $\sigma$ is the initial value (for example, 10 or 20) and $\alpha$ is a “task reduction” factor that we have initially set to 1. Finally, $t$ is the execution time at each instant.
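The task generation scheme of eqs. (5.2) and (5.3) can be sketched in Python as follows. The function names are hypothetical, and clamping $\lambda(t)$ at zero once $\alpha t$ exceeds $\sigma$ is an assumption of this sketch, since a Poisson rate cannot be negative.

```python
import math
import random

def poisson_pmf(k, lam):
    """Probability of generating exactly k tasks, eq. (5.2)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def rate(t, sigma=10.0, alpha=1.0):
    """Expected number of new tasks at time t, eq. (5.3), clamped at 0."""
    return max(0.0, sigma - alpha * t)

def sample_poisson(lam, rng=random):
    """Draw a Poisson variate by inversion of the cumulative distribution."""
    u = rng.random()
    k = 0
    p = math.exp(-lam)
    cumulative = p
    while u > cumulative and k < 1000:  # hard cap guards against rounding
        k += 1
        p *= lam / k
        cumulative += p
    return k

rng = random.Random(7)
# Number of new tasks generated at each of the first 15 time steps.
new_tasks = [sample_poisson(rate(t), rng) for t in range(15)]
```

With $\sigma = 10$ and $\alpha = 1$, the rate reaches zero at $t = 10$, so no new tasks are generated from that instant on and the pending load can eventually be cleared.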

Dynamic tasks

Approaches                 Without Noise (MP, SRM)    With Noise (MP, SRM)
Threshold Models           Fig. 5.3                   Fig. 5.3
Learning Automata          Fig. 5.6                   Fig. 5.6
Ant Colony Optimization    Fig. 5.9                   Fig. 5.9

Table 5.2: Experiments performed with dynamic tasks and their respective variants

5.2 Experiments with Threshold Models

5.2.1 Goals

In this subsection we present the experiments conducted to test the response threshold model proposed by Bonabeau et al. and described in subsection 4.2.2, applied to the problem of heterogeneous multi-task distribution in multi-robot systems. We have introduced additive noise in the number of pending loads and we have generated dynamic tasks over time. The objective of the experiments is to analyze the performance index of the system. In the following sections we describe the experiments performed and the preliminary results obtained.

5.2.2 Evaluation of the Approach with Additive Noise

Fig. 5.1 and Fig. 5.2 show the evolution of the system performance index obtained for self-selection of heterogeneous specialized tasks through response threshold models, using both mechanisms (the maximum principle and the strictly random method), with a team of 20 to 30 heterogeneous robots and 4 types of heterogeneous specialized tasks with different loads. Each experiment has been run 10 times and the results shown are the mean over all runs.

Fig. 5.1 shows the performance index of the response threshold models for the two task selection mechanisms mentioned above with noise = 0.10, and Fig. 5.2 presents the results obtained with noise = 0.25. It can be noted that in all cases the additive noise does not degrade the performance of the approach; on the contrary, in most cases better results are obtained with noise.

Figure 5.1: Learning curves with the evolution of the system performance index for self-selection of tasks using Response Threshold Models with noise = 0.10. [Two panels, "The strictly random method" and "Maximum principle"; horizontal axis: Time; vertical axis: Tasks; curves J0, J1, J2, J3, without noise and with noise (0.10).]


Figure 5.2: Learning curves with the evolution of the system performance index for self-selection of tasks using Response Threshold Models with noise = 0.25. [Two panels, the first titled "Maximum principle"; horizontal axis: Time; vertical axis: Tasks.]

5.2.3 Evaluation of the Approach with Dynamic Tasks

Fig. 5.3 shows the evolution of the system performance index with dynamic task generation over time using the Poisson distribution. Experiments have been performed 10 times and the results shown are the mean over all runs; additive noise is also generated in the loads, with both the maximum principle and the strictly random method. The results show the dynamic generation of tasks, with the number of generated tasks decreasing over time. All learning curves tend to zero and the performance is not affected by the introduction of noise. Better results are obtained with the maximum principle than with the strictly random method.

5.2.4 Results and Discussion

We have presented and evaluated a method for multi-task distribution among a team of robots. The experimental results show that the proposed method is effective and can be efficiently applied to solve this self-coordination problem in multi-robot systems.

Figure 5.3: Dynamic tasks generation: learning curves with the evolution of the system performance index for self-selection of tasks using Response Threshold Models. [Two panels, "Maximum principle" and "The strictly random method"; horizontal axis: Time; vertical axis: Tasks.]

5.3 Experiments with Learning Automata-based Probabilistic Algorithms

5.3.1 Goals

This section presents the experiments conducted to test the learning automata-based probabilistic algorithms described in subsections 4.3.3 and 4.3.4. The approach was tested to evaluate the performance index of the system with additive noise and dynamic task generation, for the same problem of heterogeneous multi-task distribution in multi-robot systems.

5.3.2 Evaluation of the Approach with Additive Noise


In the same way, Fig. 5.4 and Fig. 5.5 present the evolution of the learning curves obtained for self-selection of heterogeneous specialized tasks through learning automata-based probabilistic algorithms, using both mechanisms: the maximum principle and the strictly random method. As before, the experiments involve 20 to 30 heterogeneous robots and 4 types of heterogeneous specialized tasks with different loads. Each experiment has been run 10 times and the results shown are the mean over all runs.

Fig. 5.4 shows the performance index using learning automata-based probabilistic algorithms for both mechanisms with noise = 0.10, and Fig. 5.5 shows the results with noise = 0.25. It can be observed that the learning curves corresponding to the load $l_j(t)$ of each task tend asymptotically to zero. However, when additive noise is introduced in this approach, it can be clearly seen that in some cases more time is required for the execution of tasks.

Figure 5.4: Learning curves with the evolution of the system performance index for self-selection of tasks using Learning Automata-based probabilistic algorithms with noise = 0.10. [Two panels; horizontal axis: Time; vertical axis: Tasks.]

According to the previous results, the system performance with the learning automata approach is more affected by the introduction of noise than the performance obtained with the response threshold models approach.

5.3.3 Evaluation of the Approach with Dynamic Tasks

Fig. 5.6 shows the evolution of the system performance index with dynamic task generation over time using the Poisson distribution. Experiments have been performed 10 times and the results shown are the mean over all runs; additive noise has also been generated in the loads, with both the maximum principle and the strictly random method. The results show the dynamic generation of tasks, with the number of generated tasks decreasing over time. All learning curves tend to zero with both mechanisms and the noise does not affect the performance of the approach; however, better results are obtained with the strictly random method than with the maximum principle.

Figure 5.5: Learning curves with the evolution of the system performance index for self-selection of tasks using Learning Automata-based probabilistic algorithms with noise = 0.25. [Two panels; horizontal axis: Time; vertical axis: Tasks.]

Figure 5.6: Dynamic tasks generation: learning curves with the evolution of the system performance index for self-selection of tasks using Learning Automata-based probabilistic algorithms. [Two panels, the first titled "Maximum principle"; horizontal axis: Time; vertical axis: Tasks.]

5.3.4 Results and Discussion


We have presented the learning automata-based probabilistic algorithm, applied to the self-coordination problem of multi-robot systems. In particular, it addresses the distribution of heterogeneous multi-tasks to be executed by a team of heterogeneous mobile robots. We have evaluated the robustness of the approach by introducing noise that disturbs the real number of pending tasks and by generating dynamic tasks over time using the Poisson distribution. The results confirm that the robots are capable of selecting the existing tasks autonomously and individually, without the intervention of any global, central task scheduler.

5.4 Experiments with Ant Colony Optimization-based Deterministic Algorithms

5.4.1 Goals

The goal of the experiments presented in this subsection is to test the ability of the ant colony optimization-based deterministic algorithms, described in subsection 4.4, to achieve a distribution of heterogeneous multi-tasks in multi-robot systems. The performance index of the system is evaluated in these experiments through the introduction of additive noise and the dynamic generation of tasks over time.

5.4.2 Evaluation of the Approach with Additive Noise


In this case, Fig. 5.7 and Fig. 5.8 show the evolution of the system performance index obtained through ant colony optimization when additive noise is introduced in the number of pending loads (noise = 0.10 and noise = 0.25). Each experiment has been run 10 times and the results shown are the mean over all runs. To carry out the self-selection of heterogeneous tasks we have used both mechanisms, the maximum principle and the strictly random method, with a team of 20 to 30 heterogeneous robots and 4 types of heterogeneous specialized tasks with different loads.

Figure 5.7: Learning curves with the evolution of the system performance index for self-selection of tasks using Ant Colony Optimization-based deterministic algorithms with noise = 0.10. [Two panels, the first titled "Maximum principle"; horizontal axis: Time; vertical axis: Tasks.]

Figure 5.8: Learning curves with the evolution of the system performance index for self-selection of tasks using Ant Colony Optimization-based deterministic algorithms with noise = 0.25. [Two panels, the first titled "Maximum principle"; horizontal axis: Time; vertical axis: Tasks.]

According to the results shown in Fig. 5.7 and Fig. 5.8, in all cases the best results are obtained with the maximum principle rather than with the strictly random method. All learning curves tend to zero; however, when additive noise is introduced in the number of pending tasks, the performance index of the system is affected, and it can be clearly seen that in most cases more time is required for the execution of tasks.

5.4.3 Evaluation of the Approach with Dynamic Tasks

Finally, we present the evolution of the system performance index when tasks are generated dynamically over time using the Poisson distribution and ant colony optimization is applied (see Fig. 5.9). As before, each experiment has been run 10 times and the results shown are the mean of all runs; additive noise is also generated in the loads, with both the maximum principle and the strictly random method. The results show the dynamic generation of tasks over time, with the number of generated tasks decreasing over time. All learning curves tend to zero with both mechanisms, and the introduction of additive noise does not degrade the performance; in some cases the results are even better with noise.
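The dynamic generation of tasks can be sketched as follows. The sketch draws the number of new tasks at each time step from a Poisson distribution using Knuth's multiplication method, and it assumes a linearly decaying rate to mimic the decreasing number of generated tasks. The rate schedule, the seed and the function names are hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def generate_tasks(rates, seed=0):
    """Return the number of new tasks created at each time step."""
    rng = random.Random(seed)
    return [sample_poisson(lam, rng) for lam in rates]

# Hypothetical rate schedule that decays linearly over 200 time steps.
rates = [5.0 * (1.0 - t / 200.0) for t in range(200)]
new_tasks = generate_tasks(rates)
```

The resulting list can be added to the pending loads step by step, so that the robots face new work while they are still executing earlier tasks.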

[Figure 5.9 plots: two panels showing the number of pending tasks ("Tasks", 0 to 1000 and 0 to 900 respectively) against time ("Time", up to 200).]

Figure 5.9: Dynamic tasks generation: learning curves with the evolution of the system performance index using Ant Colony Optimization-based deterministic algorithms

Fig. 5.10 shows the probability mass function and the cumulative distribution

function obtained in experiments with dynamic tasks generation using the Poisson

distribution.



Figure 5.10: The index k represents the number of tasks expected to be generated during a time interval for different values of λ, and P(X = k) describes the probability that a value of variable X with a given probability distribution is equal to k
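For reference, the Poisson probability mass function is P(X = k) = λ^k e^(−λ) / k!, and the cumulative distribution function is the sum of these probabilities up to k. The following minimal sketch evaluates both; the values of λ in the example are illustrative and are not necessarily those used in the experiments.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k), obtained by summing the probability mass function."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

# Example rates only; each row lists P(X = 0), ..., P(X = 5).
pmf_table = {lam: [poisson_pmf(k, lam) for k in range(6)] for lam in (1, 4, 10)}
```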

Fig. 5.11 shows a summary of the number of tasks l_j performed by each robot R_i using both mechanisms, the maximum principle and the strictly random method, for the approaches proposed in this thesis. It can be clearly seen that each robot specializes in a particular task and, after completing its current task, moves on to perform another task.


We have evaluated the efficiency of the approach concerning the optimal distribution of the existing tasks, so that all of them are executed by the minimum number of robots. In the experiments conducted, we have evaluated the performance index of the system when additive noise is introduced and when tasks are generated dynamically over time. According to the results obtained, the approach can be efficiently applied to solve this self-coordination problem in multi-robot systems.



[Figure 5.11 plots: three pairs of panels showing the number of tasks ("Number of tasks", up to 90) performed by each robot ("Robots", 0 to 20); legend: Tasks J0 J2 J3 J4; the first panel of pairs (a) and (b) is titled "Maximum principle".]

(a) Using the response threshold approach in Fig. 5.1

(b) Using learning automata-based probabilistic algorithms in Fig. 5.4

(c) Using ant colony optimization-based deterministic algorithms in Fig. 5.7

Figure 5.11: Number of tasks performed by each robot


Chapter 6

Conclusions and Further Work

There are two modes of acquiring

knowledge, namely, by reasoning and

experience. Reasoning draws a conclusion

and makes us grant the conclusion, but

does not make the conclusion certain, nor

does it remove doubt so that the mind may

rest on the intuition of truth unless the

mind discovers it by the path of experience.

Roger Bacon

SUMMARY: This chapter summarizes the results of the thesis and con-

cludes by suggesting possible future extensions to the presented work.


6.1 Conclusions

The research described in this thesis has concerned the coordination of multi-robot systems, focusing on the self-coordination problem of distributing heterogeneous multi-tasks using different approaches: response threshold models, a reinforcement learning algorithm based on learning automata theory and, finally, ant colony optimization-based deterministic algorithms. We have focused our interest on truly decentralized solutions, in the sense that the robots have to select the existing tasks autonomously and individually, so that all the tasks are optimally executed without the intervention of any global and central tasks scheduler. After a brief overview of the experimental results obtained, we present in detail the conclusions of this research work as follows:

• We have proposed and presented a bio-inspired solution based on response

threshold models to solve the problem corresponding to the multi-tasks

distribution. More specifically, it addresses the self-election of heteroge-

neous and specialized tasks by autonomous robots, as opposed to the usual

multi-tasks allocation problem in multi-robot systems in which an exter-

nal controller distributes the existing tasks among the individual robots.

According to the results obtained, we have shown that the bio-inspired

threshold model can be efficiently applied to solve this self-coordination

problem in multi-robot systems [131].

• We have proposed and presented a solution based on a learning automata-based probabilistic algorithm, applied to the self-coordination problem of multi-robot systems, taking into account the distribution of heterogeneous multi-tasks in a team of mobile robots. The performance indexes or learning curves obtained for each task, corresponding to the load L_i(t) versus time, confirm that the robots are capable of selecting the existing tasks in an autonomous and individual manner without the intervention of any global and central tasks scheduler. We have shown that the algorithm can be efficiently applied to solve this self-coordination problem in multi-robot systems, obtaining truly decentralized solutions [132].


• We have compared two different approaches and we have proposed a solution to the self-coordination problem of multi-robot systems in the distribution of heterogeneous multi-tasks by applying Ant Colony Optimization-based deterministic algorithms as well as Learning Automata-based probabilistic algorithms. We have evaluated the efficiency of each method concerning the optimal distribution of the existing tasks, so that all of them are executed by the minimum number of robots. According to the results obtained, we can speak of multi-tasks selection instead of multi-tasks allocation; that is, the agents or robots select the tasks themselves instead of being assigned a task by a central controller. We have shown that both approaches can be efficiently applied to solve this self-coordination problem in multi-robot systems, obtaining truly decentralized solutions [32].

• Apart from the analysis mentioned above of the performance indexes achieved by each approach, we have also analyzed the robustness of each method with regard to the estimation error or noise, as it is an important and critical parameter concerning the practical viability of these methods in real multi-robot scenarios. We have perturbed the pending load to simulate the robot's error in estimating the real number of pending tasks, and we have also studied the performance index with dynamic generation of loads over time. To carry out the selection of tasks in the approaches we used two mechanisms: the maximum principle and the strictly random method. In most experiments, the best results are obtained with the strictly random method instead of the maximum principle. According to the results obtained, the noise generated does not affect the performance of the approaches, since the best results are obtained by generating noise in the pending loads [32].

• Finally, we have experimented with response threshold models and learning automata-based probabilistic algorithms applied to the general problem of coordinating multiple robots. We have conducted several experiments to evaluate the evolution of the performance index considering some variants, such as the multi-robot system size, different loads for each type of task, two different ways to carry out the tasks selection, the additive noise generated to simulate the robot's error, and the dynamic generation of tasks over time. According to the results obtained, the noise generated does not affect the performance of the response threshold models approach, since the best results are obtained by generating noise in the pending loads. However, when applying learning automata-based probabilistic algorithms, in some cases more time is required for the execution of tasks. We have also shown that both approaches can be efficiently applied to solve this self-coordination problem in multi-robot systems, obtaining truly decentralized solutions.
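As a reminder of the mechanisms summarized in the conclusions above, the following minimal sketch shows the standard fixed response threshold function from the literature [15] and a linear reward-inaction update from learning automata theory [116]. Whether the experiments use exactly these forms is not stated in this chapter, so the formulas, the parameter names (steepness, rate) and the example values should be read as assumptions.

```python
def response_probability(stimulus, threshold, steepness=2):
    """Fixed response threshold: s^n / (s^n + theta^n)."""
    s_n = stimulus ** steepness
    return s_n / (s_n + threshold ** steepness)

def reward_inaction_update(probabilities, chosen, rewarded, rate=0.1):
    """Linear reward-inaction update of an action probability vector."""
    if not rewarded:
        return list(probabilities)  # inaction: leave probabilities unchanged
    return [
        p + rate * (1.0 - p) if j == chosen else (1.0 - rate) * p
        for j, p in enumerate(probabilities)
    ]

# Hypothetical values: a stimulus well above the threshold, and 4 task types.
engage = response_probability(stimulus=30.0, threshold=10.0)
updated = reward_inaction_update([0.25, 0.25, 0.25, 0.25], chosen=2, rewarded=True)
```

In both cases each robot only needs its own estimate of the pending load and its own internal state, which is consistent with the absence of a central scheduler.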

6.2 Future Research Work

This PhD thesis describes in detail a study of the coordination problem in multi-robot systems and, in particular, addresses the distribution of heterogeneous multi-tasks among multiple robots. The solutions presented in this work were useful to complete the goals proposed at the beginning of this thesis. However, the development and the results obtained by the proposed methods suggest revisions and improvements that can be extended in many ways and lead to new research lines. Next, we summarize possible future research lines arising from this PhD thesis:

• We acknowledge the need for more flexible inter-robot and inter-group coordination, because environments may not always be fully known and communication will not be perfect. A major contributor to the complexity of multi-robot problems is task assignment. Therefore, an interesting topic of research would be to study and test other sophisticated techniques for optimizing the distribution of multi-tasks.

• With respect to the mathematical part, it would be interesting to generate tasks following a periodic pattern (hours, days, months, etc.) through the manipulation of sinusoidal functions. In addition, it would be interesting to define tasks with priorities, that is, tasks with penalty costs due to inactivity in certain tasks or non-compliance with some important tasks.


• It would be interesting to study and implement these results in some robotic

simulators (e.g. Player&Stage, Pyrobotics, Webots, RoboCup) and specify

which multi-robot simulator will be the most appropriate to carry out the

implementation in real robots.
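The periodic task generation suggested in the list above could, for instance, modulate the generation rate with a sinusoidal function. The following sketch is purely illustrative: the base rate, amplitude, period and function name are hypothetical choices and do not describe an implemented part of this thesis.

```python
import math

def periodic_rate(t, base_rate, amplitude, period):
    """Task generation rate that oscillates around base_rate.

    amplitude is a fraction in [0, 1], so the rate never becomes negative.
    """
    return base_rate * (1.0 + amplitude * math.sin(2.0 * math.pi * t / period))

# Hypothetical daily cycle sampled once per hour over two days.
hourly_rates = [
    periodic_rate(t, base_rate=4.0, amplitude=0.5, period=24) for t in range(48)
]
```

Such a rate could then be used as the parameter λ of the Poisson distribution at each time step.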


Bibliography

Of the various instruments invented

by man, the most amazing is the

book; all others are extensions of his

body. Only the book is an extension

of the imagination and memory.

Jorge Luis Borges

[1] Alaya, I., Solnon, C. and Ghedira, K. (2007). Ant colony optimization

for multi-objective optimization problems. In Proceedings of the 19th IEEE

International Conference on Tools with Artificial Intelligence, pp. 450–457.

56

[2] Agassounon, W. and Martinoli, A. (2002). Efficiency and robustness of

threshold-based distributed allocation algorithms in multi-agent systems.

In 1st International Joint Conference on Autonomous Agents and

Multi-Agents Systems, pp. 1090–1097. 27, 46

[3] Arcak, M. (2007). Passivity as a design tool for group coordination. In IEEE

Transactions on Automatic Control, 52(8):1380–1390. 3

[4] Arai, T., Pagello, E. and Parker, L.E. (2002). Guest editorial advances in

multirobot systems. In IEEE Transactions on Robotics and Automation,

volume 18, pages 655–661. 12

[5] Baca, J.A. (2011). A heterogeneous modular robotic system towards the

execution of cooperative tasks. Ph.D. thesis, Universidad Politécnica de Madrid.

20


[6] Baeksuk, C., Kyungmo, J., Youngsu, C., Daehie, H., Myo-Taeg, L., Shin-

suk, P., Yongkwun, L., Sung-Uk, L., Min, C.K. and Kang, H.K. (2009).

Robotic automation system for steel beam assembly in building construc-

tion. In IEEE 4th International Conference on Autonomous Robots and

Agents, pages 655–661. 21

[7] Baglietto, M., Cannata, C., Capezio, F., Grosso, A. and Sgorbissa, A.

(2009). A multi-robot coordination system based on RFID technology. In

IEEE International Conference on Advanced Robotics, pages 1–6. 28

[8] Balch, T. (1998). Taxonomies of multirobot task and reward. Technical Re-

port, Carnegie Mellon University. 15

[9] Berman, S., Lindsey, Q., Sakar, M., Kumar, V. and Pratt, S. (2010). Study

of group food retrieval by ants as a model for multi-robot collective trans-

port strategies. Robotics: Science and Systems, The MIT Press. 23

[10] Bernon, C., Chevrier, V., Hilaire, V. and Marrow, P. (2005). Applications

of self-organising multi-agent systems: an initial framework for comparison.

Informatica, 30:73–82. 27

[11] Blum, C. and Dorigo, M. (2004). The hyper-cube framework for ant colony

optimization. IEEE Transactions on Systems, Man, and Cybernetics -Part

B, 34(2):1161–1172. 56

[12] Blum, C. (2005). Ant colony optimization: introduction and recent trends.

Physics of Life Reviews, 2(4):353–373. xviii, 54, 55

[13] Bonabeau, E., Theraulaz, G. and Deneubourg, J. (1996). Quantitative

study of the fixed threshold model for the regulation of division of labour

in insects societies. Proceedings Biological Science, pages 1565–1569. 44

[14] Bonabeau, E., Theraulaz, G., Deneubourg, J.L., Aron, S. and Camazine, S.

(1997). Self-organization in social insects. Trends in Ecology & Evolution,

12(5):188–193. 42


[15] Bonabeau, E., Theraulaz, G. and Deneubourg, J. (1998). Fixed response

thresholds and the regulation of division of labor in insect societies. Bulletin

of Mathematical Biology, pages 753–807. 45, 46

[16] Bonabeau, E., Dorigo, M. and Theraulaz, G. (1999). Swarm intelligence:

from natural to artificial systems. New York: Oxford Univ. Press. 28

[17] Braunl, T. (2008). Embedded robotics: mobile robot design and applica-

tions with embedded systems. Springer-Verlag Berlin Heidelberg. 21

[18] Burgard, W., Moors, M., Stachniss, C. and Schneider, F. (2005). Coordi-

nated multi-robot exploration. IEEE Transactions on Robotics, 21(3):376–

386. 3, 24

[19] Cao, Y., Fukunaga, A.S. and Kahng, A.B. (1997). Cooperative mobile

robotics: antecedents and directions. Autonomous Robots, 4:1–23. 15

[20] Cao, Y., Ren, W. and Li, Y. (2009). Distributed discrete-time coordinated

tracking with a time-varying reference state and limited communication.

Automatica, 45(5):1299–1305. 25

[21] Chaharsooghi, S.K. and Meimand Kermani, A.H. (2008). An intelligent

multi-colony multi-objective ant colony optimization (ACO) for the 0-1

knapsack problem. In IEEE Congress on Evolutionary Computation, pages

1195–1202. 56

[22] Chaharsooghi, S.K. and Meimand Kermani, A.H. (2008). An effective

ant colony optimization algorithm (ACO) for multi-objective resource

allocation problem (MORAP). Applied Mathematics and Computation,

200(1):167–177. 56

[23] Chaimowicz, L., Sugar, T., Kumar, V. and Campos, M. (2001). An

architecture for tightly coupled multi-robot cooperation. In IEEE International

Conference on Robotics and Automation, volume 4, pages 2292–2297. 19

[24] Chaimowicz, L., Grocholsky, B., Keller, J.F., Kumar, V. and Taylor, C.J.

(2004). Experiments in multirobot air-ground coordination. In IEEE Inter-


national Conference on Robotics and Automation, volume 4, pages 4053–

4058. 3, 17

[25] Chunyang, L., Yingwei, M. and Chang’an, L. (2009). Cooperative multi-

robot map-building under unknown environment. In Proceedings of the 2009

International Conference on Artificial Intelligence and Computational In-

telligence, volume 3, pages 392–396. 17

[26] Colorni, A., Dorigo, M. and Maniezzo, V. (1991). Distributed optimiza-

tion by ant colonies. In Proceedings of ECAL91 - European Conference on

Artificial Life, pages 134–142. 55

[27] Dai, Y. and Lee, S.G. (2011). Leader-follower formation control based on

hybrid formation control framework and waypoint in cone method. In IEEE

International Conference on Robot, Vision and Signal Processing, pages

233–236. 25

[28] Detrain, C., Deneubourg, J.L. and Pasteels, J. (1999). Decision-making in

foraging by social insects. In C. Detrain, J.L. Deneubourg, and J. Pasteels,

editors, Information Processing in Social Insects. 56

[29] De Almeida, A.T. and Fong, J. (2011). Domestic service robots. IEEE

Robotics and Automation Magazine, 18(3):18–20. 21

[30] De Hoog, J., Cameron, S. and Visser, A. (2010). Dynamic team hierarchies

in communication-limited multi-robot exploration. In IEEE International

Workshop on Safety Security and Rescue Robotics, pages 1–7. 19

[31] De Lope, J., Maravall, D. and Quinonez, Y. (2012). Response threshold

models and stochastic learning automata for self-coordination of hetero-

geneous multi-tasks distribution in multi-robot systems. Robotics and Au-

tonomous Systems, DOI information: 10.1016/j.robot.2012.07.008. 7

[32] De Lope, J., Maravall, D. and Quinonez, Y. (2012). Decentralized multi-

tasks distribution in heterogeneous robot teams by means of ant colony

optimization and learning automata. In Hybrid Artificial Intelligent

Systems, volume 7208, pages 103–114. 8, 79


[33] Dias, B. and Stentz, A. (2000). A free market architecture for distributed

control of a multirobot system. In 6th International Conference on Intelli-

gent Autonomous Systems, pages 115–122. 28

[34] Dias, B. (2004). Traderbots: A new paradigm for robust and efficient multi-

robot coordination in dynamics environments. Ph.D. dissertation, Robotics

Institute, Carnegie Mellon University, Pittsburgh. 29

[35] Dias, M.B., Zlot, R., Kalra, N. and Stentz, A. (2006). Market-based

multi-robot coordination: a survey and analysis. Proceedings of the IEEE,

94(7):1257–1270. 12

[36] Dimarogonas, D.V. and Johansson, K.H. (2010). Stability analysis for

multi-agent systems using the incidence matrix: Quantized communication

and formation control. Automatica, 46(4):695–700. 25

[37] Dorigo, M., Maniezzo, V. and Colorni, A. (1991). The ant system: an

autocatalytic optimizing process. Technical Report TR91-016, Politecnico

di Milano. 54

[38] Dorigo, M. (1992). Optimization, learning and natural algorithms. Ph.D.

thesis, Dipartimento di Elettronica, Politecnico di Milano, Milan. 54, 57

[39] Dorigo, M., Maniezzo, V. and Colorni, A. (1996). The ant system: optimiza-

tion by a colony of cooperating agents. In IEEE Transactions on Systems,

Man, and Cybernetics-Part B, 26(1):29–41. 55, 58

[40] Dorigo, M. and Gambardella, L.M. (1997). Ant Colony System: A co-

operative learning approach to the traveling salesman problem. In IEEE

Transactions on Evolutionary Computation, 1(1):53–66. 58

[41] Dorigo, M., Di, C. and Gambardella, L.M. (1999). Ant algorithms for dis-

crete optimization. Artificial Life, 5(2):137–172. 56

[42] Dorigo, M., Bonabeau, E. and Theraulaz, G. (2000). Ant algorithms and

stigmergy. Future Generation Computer Systems, 16(9):851–871. 55


[43] Dorigo, M. and Stutzle, T. (2004). Ant colony optimization. MIT Press,

Cambridge, MA. 54

[44] Dorigo, M. (2005). Swarm-bot: An experiment in swarm robotics. In Proc.

of the 2005 IEEE Swarm Intelligence Symp, pages 192–200. 28

[45] Dorigo, M. and Blum, C. (2005). Ant colony optimization theory: a survey.

Theoretical Computer Science, 344(2-3):243–278. 54

[46] Dorigo, M., Birattari, M. and Stutzle, T. (2006). Ant colony optimization:

artificial ants as a computational intelligence technique. IEEE Computa-

tional Intelligence Magazine, 1(4):28–39. 58

[47] Dorigo, M. and Birattari, M. (2007). Swarm intelligence. Scholarpedia,

2(9):1462. 28

[48] Duan, H. and Xiufen, Y. (2007). Hybrid ant colony optimization using

memetic algorithm for traveling salesman problem. In IEEE International

Symposium on Approximate Dynamic Programming and Reinforcement

Learning, pages 92–95. 55

[49] Dudek, G., Jenkin, M.R.M., Milios, E. and Wilkes, D. (1996). A taxonomy

for multi-agent robotics. Autonomous Robots, 3(4):375–397. 12, 13

[50] Duro, R.J., Grana, M. and de Lope, J. (2010). On the potential contribu-

tions of hybrid intelligent approaches to Multicomponent Robotic System

development. Information Sciences, 180(14):2635–2648. 12

[51] Eckholm, B., Anderson, K., Weiss, M. and DeGrandi-Hoffman, G. (2011).

Intracolonial genetic diversity in honeybee (Apis mellifera) colonies in-

creases pollen foraging efficiency. Behavioral Ecology and Sociobiology,

65(5):1037–1044. 44

[52] Emrani, S., Dirafzoon, A. and Talebi, H.A. (2011). Leader-follower forma-

tion control of autonomous underwater vehicles with limited communica-

tions. In IEEE International Conference on Control Applications, pages

921–926. 25


[53] Farinelli, R., Iocchi, L. and Nardi, D. (2004). Multirobot systems: A clas-

sification focused on coordination. IEEE Transactions on Systems, Man,

and Cybernetics, Part B, 34(5):2015–2028. 12, 13

[54] Feng, S. and Zhang, H. (2011). Formation control for wheeled mobile robots

based on consensus protocol. In IEEE International Conference on Infor-

mation and Automation, pages 696–700. 25

[55] Ferrandez, J.M., de la Paz, F. and De Lope, J. (2010). Intelligent robotics

and neuroscience. Robotics and Autonomous Systems, 58(12):1221–1222. 12

[56] Fierro, F., Das, A., Spletzer, J., Esposito, J., Kumar, V., Ostrowski, J.P.,

Pappas, G., Taylor, K.J., Hur, Y., Alur, R., Lee, I., Grudic, G. and Southall,

B. (2002). A framework and architecture for multi-robot coordination. The

International Journal of Robotics Research, 21(10-11):977–995. 19

[57] Fink, J., Michael, N., Kim, S. and Kumar, V. (2009). Planning and con-

trol for cooperative manipulation and transportation with aerial robots.

International Symposium on Robotics Research, pages 324–334. 23

[58] Fox, D., Ko, J., Konolige, K., Limketkai, B., Schulz, D. and Stewart, B.

(2006). Distributed multi-robot exploration and mapping. Proceedings of

the IEEE, 95(7):1325–1339. 19

[59] Fujii, M., Inamura, W., Murakami, H., Tanaka, K. and Kosuge, K. (2007).

Cooperative control of multiple mobile robots transporting a single object

with loose handling. In IEEE International Conference on Robotics and

Biomimetics, pages 816–822. xviii, 23

[60] Fulbright, R. and Stephens, L.M. (1994). Classification of multiagent sys-

tems, USC Technical Report ECE 06-94-02. 15

[61] Garnier, S., Gautrais, J. and Theraulaz, G. (2007). The biological principles

of swarm intelligence. Swarm Intelligence, 1(1):3–31. 28

[62] Gabbai, J.M.E., Yin, H., Wright, W.A. and Allinson, N.M. (2005). Self-

organization, emergence and multi-agent systems. In IEEE International

Conference on Neural Networks and Brain, pages 13–15. 27


[63] Gautrais, J., Theraulaz, G., Deneubourg, J.L. and Anderson, C. (2002).

Emergent polyethism as a consequence of increased colony size in insect

societies. Journal of Theoretical Biology, 215(3):363–373. 44

[64] Gerkey, B.P. and Mataric, M.J. (2002). Sold!: auction methods for mul-

tirobot coordination. IEEE Transactions on Robotics and Automation,

18(5):758–768. 29

[65] Gerkey, B.P. and Mataric, M.J. (2003). Multi-robot task allocation: ana-

lyzing the complexity and optimality of key architectures. In IEEE Interna-

tional Conference on Robotics and Automation, volume 3, pages 3862–3868.

12, 35

[66] Gerkey, B. and Mataric, M.J. (2004). A formal analysis and taxonomy of

task allocation in multi-robot systems. International Journal of Robotics

Research, 23(9):939–954. 13

[67] Ghommam, J., Mehrjerdi, H. and Saad, M. (2011). Leader-follower forma-

tion control of nonholonomic robots with fuzzy logic based approach for

obstacle avoidance. In IEEE/RSJ International Conference on Intelligent

Robots and Systems, pages 2340–2345. 25

[68] Gordon, D.M. (2007). Control without hierarchy. Nature, 446(7132):143. 28

[69] Gove, R., Hayworth, M., Chhetri, M. and Rueppell, O. (2009). Division of

labour and social insect colony performance in relation to task and mating

number under two alternative response threshold models. Insectes Sociaux,

56(3):319–331. 44

[70] Guglielmelli, E., Johnson, M.J. and Shibata, T. (2009). Guest editorial

special issue on rehabilitation robotics. In IEEE Transactions on Robotics,

volume 25, pages 447–480. 22

[71] Hanjong, J., ChiSu, S., Kyunghun, K., Kyunghwan, K. and Jaejun, K.

(2007). A study on the advantages on high-rise building construction which

the application of construction robots take. In IEEE Control, Automation

and Systems, pages 1933–1936. 21


[72] Hassas, S., Di Marzo-Serugendo, G., Karageorgos, A. and Castelfranchi,

C. (2006). Self-Organising mechanisms from social and business/economics

approaches. Informatica, 30(1):63–71. 27

[73] Hinchey, M.G. and Sterritt, R. (2007). 99% (Biological) inspiration.... In

Proceedings of the Fourth IEEE International Workshop on Engineering of

Autonomic and Autonomous Systems, pages 187–195. 56

[74] Hirsh, A.E. and Gordon, D.M. (2001). Distributed problem solving in social

insects. Annals of Mathematics and Artificial Intelligence, 31(1-4):199–221.

56

[75] Howard, A., Parker, L.E. and Sukhatme, G.S. (2006). Experiments with a

large heterogeneous mobile robot team: exploration, mapping, deployment

and detection. The International Journal of Robotics Research, 25(5-6):431–

447. 3, 17, 35

[76] Hu, X., Zhang, J. and Li, Y. (2008). Orthogonal methods based ant colony

search for solving continuous optimization problems. Journal of Computer

Science and Technology, 23(1):2–18. 56

[77] Hu, Y., Wang, L., Liang, J. and Wang, T. (2011). Cooperative box-pushing

with multiple autonomous robotic fish in underwater environment. In

IET Control Theory and Applications, volume 5, pages 2015–2022. 23

[78] Huntsberger, T.L., Pirjanian, P., Trebi-Ollennu, A., Nayar, H.D., Aghazar-

ian, H., Ganino, A.J., Garrett, M., Joshi, S.S. and Schenker, P.S. (2003).

CAMPOUT: a control architecture for tightly coupled coordination of mul-

tirobot systems for planetary surface exploration. IEEE Transactions on

Systems, Man and Cybernetics, Part A: Systems and Humans, 33(5):550–

559. 18

[79] Huntsberger, T.L., Trebi-Ollennu, A., Aghazarian, H., Schenker, P.S., Pir-

janian, P. and Nayar, H.D. (2004). Distributed control of multi-robot sys-

tems engaged in tightly coupled tasks. Autonomous Robots, 17(1):79–92.

34


[80] Iocchi, L., Nardi, D. and Salerno, M. (2001). Reactivity and deliberation:

a survey on multi-robot systems. In Balancing Reactivity and Social Delib-

eration in Multi-Agent Systems, pages 9–34. 12, 13, 14

[81] Jeanson, R., Fewell, J.H., Gorelick, R. and Bertram, S. (2007). Emergence

of increased division of labor as a function of group size. Behavioral Ecology

and Sociobiology, 62(2):289–298. 44

[82] Jelasity, M., Babaoglu, O. and Laddaga, R. (2006). Guest editors’ introduc-

tion: self-management through self-organization. IEEE Intelligent Systems,

21(2):8–9. 27

[83] Jones, C., Shell, D., Mataric, M.J. and Gerkey, B. (2004). Principled

approaches to the design of multi-robot systems. In Proc. of the Workshop

on Networked Robotics, IEEE/RSJ International Conference on Intelligent

Robots and Systems, pages 71–80. 12

[84] Jones, C. and Mataric, M.J. (2004). The use of internal state in multi-

robot coordination. In Proceedings of the Hawaii International Conference

on Computer Sciences, pages 27–32. 15

[85] Jones, C. and Mataric, M.J. (2005). Behavior-based coordination in multi-

robot systems. Autonomous Mobile Robots: Sensing, Control, Decision-

Making, and Applications. 13, 19

[86] Jones, E., Browning, B., Dias, B., Argall, B., Veloso, M. and Stentz, A.

(2006). Dynamically formed heterogeneous robot teams performing tightly-

coordinated tasks. In IEEE International Conference on Robotics and Au-

tomation, pages 570–575. 29

[87] Khamis, A.M., Kamel, M.S. and Salichs, M.A. (2006). Cooperation: con-

cepts and general typology. In IEEE International Conference on Systems,

Man and Cybernetics, volume 2, pages 1499–1505. 14

[88] Konolige, K., Fox, D., Ortiz, C., Agno, A., Eriksen, M., Limketkai, B., Ko,

J., Morisset, B., Schulz, D., Stewart, B. and Vicent, R. (2006). Centibots:

very large scale distributed robotic teams. In Experimental Robotics IX:


The 9th International Symposium, Springer Tracts in Advanced Robotics,

volume 9, pages 131–140. 17, 35

[89] Kube, R.C. and Bonabeau, E. (2000). Cooperative transport by ants and

robots. Robotics and Autonomous Systems, 30:85–101. 23, 44

[90] Lacroix, P., Polotski, V. and Cohen, P. (1999). Decentralized control of

cooperative multi-robot systems. Integrated Computer-Aided Engineering,

6(4):259–274. 17

[91] Langer, D., Rosenblatt, J.K. and Hebert, M. (1994). A Behavior-based

system for off-road navigation. In IEEE Transactions on Robotics and Au-

tomation, volume 10, pages 776–782. 25

[92] Lawton, J.R.T., Beard, R.W. and Young, B.J. (2003). A decentralized ap-

proach to formation maneuvers. IEEE Transactions on Robotics and Au-

tomation, 19(6):933–941. 24

[93] Lim, C., Mamat, R. and Braunl, T. (2009). Market-based approach for

multi-team robot cooperation. In IEEE International Conference on Au-

tonomous Robots and Agents, pages 62–67. 30

[94] Linder, T., Tretyakov, V., Blumenthal, S., Molitor, P., Holz, D., Murphy,

R., Tadokoro, S. and Surmann, H. (2010). Rescue robots at the collapse

of the municipal archive of cologne city: a field report. In International

Workshop on Safety Security and Rescue Robotics, pages 1–6. 22

[95] Liu, S., Chen, C., Xie, L. and Chang, Y.H. (2010). Formation control of

multi-robot systems. In International Conference on Control Automation

Robotics and Vision, pages 1057–1062. 25

[96] Loula, A., Gudwin, R., El-Hani, C.N. and Queiroz, J. (2010). Emergence of

self-organized symbol-based communication in artificial creatures. Cognitive

Systems Research, 11(2):131–147. 19

[97] Low, K.H. (2011). Robot-assisted gait rehabilitation: from exoskeletons to

gait systems. In Defense Science Research Conference and Expo (DSR),

pages 1–10. 22


[98] Macdonald, E.A. (2011). Multi-robot assignment and formation control.

M.S. thesis, Georgia Institute of Technology. xviii, 25

[99] Madhavan, R., Fregene, K. and Parker, L.E. (2002). Distributed heteroge-

neous outdoor multi-robot localization. In IEEE International Conference

on Robotics and Automation, pages 374–381. 17

[100] Maniezzo, V., Dorigo, M. and Colorni, A. (1994). The ant system applied

to the quadratic assignment problem. Technical Report IRIDIA/94-28,

Université Libre de Bruxelles, Belgium. 55

[101] Maravall, D., De Lope, J. and Domínguez, R. (2010). Self-emergence of

lexicon consensus in a population of autonomous agents by means of evo-

lutionary strategies. In Proceedings of the 5th International Conference on

Hybrid Artificial Intelligence Systems - Volume Part II, pages 77–84. 19

[102] Maravall, D. and De Lope, J. (2011). Fusion of learning automata theory

and granular inference systems: ANLAGIS. Applications to Pattern Recog-

nition and Machine Learning, pages 1237–1242. 49, 52

[103] Maravall, D., De Lope, J. and Domínguez, R. (2011). Coordination of

communication in robot teams by reinforcement learning. In Proceedings of the

4th International Conference on Interplay between Natural and Artificial

Computation - Volume Part I, pages 156–164.

[104] Maravall, D., De Lope, J. and Domínguez, R. (2012). Self-emergence of a

common lexicon by evolution in teams of autonomous agents. Neurocom-

puting, 75(1):106–114. 19

[105] Marshall, J.A., Fung, T., Broucke, M.E., Deleuterio, G. and Francis, B.

(2006). Experiments in multirobot coordination. Robotics and Autonomous

Systems, 54(3):265–275. 17

[106] Mataric, M.J. (1993). Designing emergent behaviors: from local interactions

to collective intelligence. In International Conference on From Animals to

Animats: Simulation of Adaptive Behavior, volume 2, pages 432–441. 25


[107] Gerkey, B. and Mataric, M.J. (1995). Cooperative multi-robot box-

pushing. In IEEE International Conference on Robotics and Automation,

pages 3862–3868. xviii, 23

[108] Merkle, D. and Middendorf, M. (2004). Dynamic polyethism and competi-

tion for tasks in threshold reinforcement models of social insects. Adaptive

Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems,

12(3-4):251–262. 44

[109] Michael, N., Fink, J. and Kumar, V. (2011). Cooperative manipulation and

transportation with aerial robots. Autonomous Robots, 30(1):73–86. 23

[110] Mosteo, A.R., Montano, L. and Lagoudakis, M.G. (2008). Multi-robot rout-

ing under limited communication range. In IEEE International Conference

on Robotics and Automation, pages 1531–1536. 19

[111] Murphy, R.R. (2000). An introduction to AI robotics (intelligent robotics

and autonomous agents), The MIT Press.

[112] Nagatani, K., Okada, Y., Tokunaga, N., Yoshida, K., Kiribayashi, S., Ohno,

K., Takeuchi, E., Tadokoro, S., Akiyama, H., Noda, I., Yoshida, T. and Koy-

anagi, E. (2009). Multi-robot exploration for search and rescue missions: a

report of map building in RoboCupRescue 2009. In International Workshop

on Safety Security and Rescue Robotics, pages 1–6. 22

[113] Narendra, K. and Viswanathan, R. (1972). A two-level system of stochastic

automata for periodic random environments. IEEE Transactions on Sys-

tems, Man, and Cybernetics, pages 285–289. 52

[114] Narendra, K.S. and Thathachar, M.A.L. (1974). Learning automata: a

survey. IEEE Transactions on Systems, Man, and Cybernetics, pages 323–

334. 48, 49

[115] Narendra, K., Wright, E. and Mason, L. (1977). Applications of learning

automata to telephone traffic routing and control. IEEE Transactions on

Systems, Man, and Cybernetics, pages 785–792. 52

[116] Narendra, K.S. and Thathachar, M.A.L. (1989). Learning automata: an

introduction. Englewood Cliffs, NJ: Prentice-Hall, Inc. 48, 49

[117] Obaidat, M., Papadimitriou, G. and Pomportsis, A. (2002). Guest editorial

learning automata: theory, paradigms, and applications. IEEE Transac-

tions on Systems, Man, and Cybernetics, pages 706–709. 49

[118] Okamura, A.M., Mataric, M.J. and Christensen, H.I. (2010). Medical and

health-care robotics. IEEE Robotics and Automation Magazine, 17(3):26–

37. 22

[119] Oster, G. and Wilson, E. (1978). Caste and ecology in the social insects.

Monographs in Population Biology, Princeton Univ. Press. 25, 54

[120] Parker, L.E. (1993). Designing control laws for cooperative agent teams.

In IEEE International Conference on Robotics and Automation, volume 3,

pages 582–587. 25

[121] Parker, L.E. (1998). ALLIANCE: An Architecture for Fault Tolerant

Multi-Robot Cooperation. IEEE Transactions on Robotics and Automa-

tion, 14(2):220–240. 17

[122] Parker, L.E. (2003). Current research in multi-robot systems. Journal of

Artificial Life And Robotics, 7(1-2):1–5. 12

[123] Parker, L.E. and Tang, F. (2006). Building Multirobot Coalitions Through

Automated Task Solution Synthesis. Proceedings of the IEEE, 94(7):1289–

1305.

[124] Parker, L.E. (2008). Multiple Mobile Robot Systems. In: Siciliano, B. and

Khatib, O. (eds.) Springer Handbook of Robotics. 12, 16, 17, 21

[125] Pfeifer, R., Lungarella, M. and Iida, F. (2007). Self-organization, embodi-

ment, and biologically inspired robotics. American Association for the Ad-

vancement of Science, volume 318, pages 1088–1093. 25

[126] Price, R. and Tino, P. (2004). Evaluation of adaptive nature inspired task

allocation against alternate decentralised multiagent strategies. PPSN VIII,

LNCS 3242, pages 982–990. 27

[127] Qu, Z., Wang, J. and Hull, R.A. (2008). Cooperative control of dynamical

systems with application to autonomous vehicles. In IEEE Transactions on

Automatic Control, volume 53, pages 894–911. 25

[128] Quinonez, Y., De Lope, J. and Maravall, D. (2009). Communication and

coordination of robots teams in dynamic environments. In Twelfth Interna-

tional Conference on Computer Aided Systems Theory - EUROCAST 2009,

pages 150–151. 7, 20

[129] Quinonez, Y., De Lope, J. and Maravall, D. (2009). Cooperative and com-

petitive behaviors in a multi-robot system for surveillance tasks. In Com-

puter Aided Systems Theory - EUROCAST 2009, volume 5717, pages 437–

444. 8, 20

[130] Quinonez, Y., Baca, J., De Lope, J., Ferre, M. and Aracil, R. (2010). Self-

Alignment approach based on cooperative behaviors for the docking process

of modular mobile robots. In Electronics, Robotics and Automotive Mechan-

ics Conference (CERMA), pages 445–450. 7, 20

[131] Quinonez, Y., De Lope, J. and Maravall, D. (2011). Bio-inspired decentral-

ized self-coordination algorithms for multi-heterogeneous specialized tasks

distribution in multi-robot systems. In Proceedings of the 4th International

Conference on Interplay between Natural and Artificial Computation - Vol-

ume Part I, pages 30–39. 8, 78

[132] Quinonez, Y., De Lope, J. and Maravall, D. (2011). Stochastic learning

automata for self-coordination in heterogeneous multi-tasks selection in

multi-robot systems. In Advances in Artificial Intelligence, volume 7094,

pages 443–453. 8, 78

[133] Quinonez, Y., Maravall, D. and De Lope, J. (2012). Application of self-

organizing techniques for the distribution of heterogeneous multi-tasks in

multi-robot systems. In Electronics, Robotics and Automotive Mechanics

Conference (CERMA), pages 66–71. 7

[134] Reed, K.B., Majewicz, A., Kallem, V., Alterovitz, R., Goldberg, K., Cowan,

N.J. and Okamura, A.M. (2011). Robot-assisted needle steering. IEEE

Robotics and Automation Magazine, 18(4):35–46. 22

[135] Ren, W. (2010). Consensus tracking under directed interaction topologies:

algorithms and experiments. In IEEE Transactions on Control Systems

Technology, volume 18, pages 230–237. 25

[136] Robinson, G. (1992). Regulation of division of labor in insect societies.

Annual Review of Entomology, 37(1):637–665. 25, 54

[137] Sahin, H. and Guvenc, L. (2007). Household robotics: autonomous de-

vices for vacuuming and lawn mowing. IEEE Control Systems Magazine,

27(2):20–90. 21

[138] Santana, P., Barata, J., Cruz, H., Mestre, A., Lisboa, J. and Flores, L.

(2005). A multi-robot system for landmine detection. In IEEE Conference

on Emerging Technologies and Factory Automation, volume 1, pages 721–

728. 22

[139] Seeley, T., Camazine, S. and Sneyd, J. (1991). Collective decision-making in

honey bees: how colonies choose among nectar sources. Behavioral Ecology

and Sociobiology, pages 277–290. 45

[140] Shang, L. and Wang, X.F. (2004). Decentralized PI control for a congestion

game. In IEEE International Conference on Control, Automation, Robotics

and Vision, pages 316–319. 27, 43

[141] Sheng, W., Yang, Q., Ci, S. and Xi, N. (2004). Multi-robot area exploration

with limited-range communications. In IEEE/RSJ International Confer-

ence on Intelligent Robots and Systems, volume 3, pages 1414–1419. 24

[142] Sheng, W., Yang, Q., Tan, J. and Xi, N. (2006). Distributed multi-

robot coordination in area exploration. Robotics and Autonomous Systems,

54(12):945–955. 24

[143] Shiroma, P. and Campos, M. (2009). CoMutaR: A framework for multi-

robot coordination and task allocation. In IEEE/RSJ International Con-

ference on Intelligent Robots and Systems, pages 4817–4824. 29

[144] Simmons, R., Smith, T., Dias, M.B., Goldberg, D., Hershberger, D., Stentz,

A. and Zlot, R. (2002). A Layered architecture for coordination of mobile

robots. In Multi-Robot Systems: From Swarms to Intelligent Automata,

Proceedings from the 2002 NRL Workshop on Multi-Robot Systems, Kluwer

Academic Publishers. 18

[145] Stone, P. and Veloso, M. (2000). Multiagent Systems: A Survey from a

Machine Learning Perspective. Autonomous Robots, 8(3):345–383. 15

[146] Song, T., Yan, X., Liang, A., Chen, K. and Guan, H. (2009). A distributed

bidirectional auction algorithm for multirobot coordination. In IEEE In-

ternational Conference on Research Challenges in Computer Science, pages

145–148. 30

[147] Soorki, M.N., Talebi, H.A. and Nikravesh, S.K.Y. (2011). A robust dynamic

leader-follower formation control with active obstacle avoidance. In IEEE

International Conference on Systems, Man, and Cybernetics, pages 1932–

1937. 25

[148] Soorki, M.N., Talebi, H.A. and Nikravesh, S.K.Y. (2011). Robust leader-

following formation control of multiple mobile robots using Lyapunov re-

design. In 37th Annual Conference on IEEE Industrial Electronics Society,

pages 277–282. 25

[149] Spletzer, J., Das, A.K., Fierro, R., Taylor, C.J., Kumar, V. and Ostrowski,

J.P. (2001). Cooperative localization and control for multi-robot manipu-

lation. In IEEE/RSJ International Conference on Intelligent Robots and

Systems, volume 2, pages 631–636. 16

[150] Stutzle, T. and Hoos, H. (1997). MAX-MIN ant system and local search

for the travelling salesman problem. In IEEE International Conference on

Evolutionary Computation, pages 309–314. 55

[151] Tambe, M., Pynadath, D.V., Chauvat, N., Das, A. and Kaminka, G.A.

(2000). Adaptive agent integration architectures for heterogeneous team

members. In Proceedings of the International Conference on Multiagent Sys-

tems, pages 301–308. 19

[152] Tanner, H.G., Loizou, S.G. and Kyriakopoulos, K.J. (2002). Nonholonomic

navigation and control of cooperating mobile manipulators. In IEEE Trans-

actions on Robotics and Automation, volume 19, pages 53–64. 23

[153] Thathachar, M.A.L. (2002). Varieties of learning automata: an overview.

IEEE Transactions on Systems, Man, and Cybernetics, 32(6):711–722. 49

[154] Todt, E., Rausch, G. and Suarez, R. (2000). Analysis and classification of

multiple robot coordination methods. In IEEE International Conference on

Robotics and Automation, volume 4, pages 3158–3163. 15

[155] The player and stage project: http://playerstage.sourceforge.net.

[156] Theraulaz, G., Bonabeau, E. and Deneubourg, J.L. (1998). Response

threshold reinforcement and division of labour in insect societies. Proceed-

ings of the Royal Society B: Biological Sciences, 265:327–332. 44

[157] Unsal, C. (1997). Stochastic Learning Automata. Chapter 3 of the dissertation

"Intelligent navigation of autonomous vehicles in an automated highway

system: learning methods and interacting vehicles approach". 49

[158] Veloso, M.M. and Nardi, D. (2006). Special issue on multirobot systems.

Proceedings of the IEEE, 94(7):1253–1256. 22

[159] Volpe, R., Nesnas, I., Estlin, T., Mutz, D., Petras, R. and Das, H. (2001).

The CLARAty architecture for robotic autonomy. In IEEE Proceedings on

Aerospace Conference, volume 1, pages 121–132. 19

[160] Wang, Z., Nakano, E. and Takahashi, T. (2003). Solving function distribu-

tion and behavior design problem for cooperative object handling by multi-

ple mobile robots. IEEE Transactions on Systems, Man, and Cybernetics,

Part A, 33(5):537–549. xviii, 23

[161] Wei, L. and Yuren, Z. (2010). An effective hybrid ant colony algorithm for

solving the traveling salesman problem. In Proceedings of the International

Conference on Intelligent Computation Technology and Automation, volume

1, pages 497–500. 55

[162] Weihua, Z. and Go, T.H. (2010). Robust cooperative Leader-follower forma-

tion flight control. In 11th International Conference on Control Automation

Robotics and Vision, pages 275–280. 25

[163] Xiao-Lin, L., Jing-Ping, J. and Kui, X. (2004). Towards multirobot commu-

nication. In IEEE International Conference on Robotics and Biomimetics,

pages 307–312. 19

[164] Xiao, F., Wang, L., Chen, J. and Gao, Y. (2009). Finite-time formation

control for multi-agent systems. Automatica, 45(11):2605–2611. 25

[165] Yagmahan, B. and Yenisey, M.M. (2008). Ant colony optimization for

multi-objective flow shop scheduling problem. Computers and Industrial

Engineering, 54(3):411–420. 56

[166] Yamashita, A., Arai, T., Ota, J. and Asama, H. (2003). Motion planning of

multiple mobile robots for cooperative manipulation and transportation. In

IEEE Transactions on Robotics and Automation, volume 19, pages 223–237.

xviii, 23

[167] Yang, Y., Zhou, C. and Tian, Y. (2009). Swarm robots task allocation

based on response threshold model. In IEEE International Conference on

Autonomous Robots and Agents, pages 171–176. 28

[168] Yerpes, A., Baca, J., Escalera, J.A., Ferre, M. and Aracil, R. (2008).

Modular robot based on 3 rotational DoF modules. In Proceedings of the

IEEE/RSJ International Conference on Intelligent Robots and Systems,

pages 889–894. 20

[169] Yuta, S. and Premvuti, S. (1992). Coordinating autonomous and centralized

decision making to achieve cooperative behaviors between multiple mobile

robots. In Proceedings of the 1992 IEEE/RSJ International Conference on

Intelligent Robots and Systems, volume 3, pages 1566–1574. 15

[170] Zhang, W. and Hu, J. (2008). Optimal multi-agent coordination under tree

formation constraints. In IEEE Transactions on Automatic Control, volume

53, pages 692–705. 17

[171] Zhu, A. and Yang, S.X. (2006). A SOM-based multi-agent architecture for

multirobot systems. Int. J. Robot. Autom., volume 21, pages 91–99. 19

[172] Zlot, R., Stentz, A., Dias, B. and Thayer, S. (2002). Multi-robot explo-

ration controlled by a market economy. In IEEE International Conference

on Robotics and Automation, volume 3, pages 3016–3023. 24

[173] Zlot, R. and Stentz, A. (2006). Market-based multirobot coordination for

complex tasks. The International Journal of Robotics Research, 25(1):73–

101. 34

[174] Zlot, R. and Stentz, A. (2006). Market-based multirobot coordination using

task abstraction. In Field and Service Robotics, volume 24, pages 167–177.

34
