Artiﬁcial Swarm Intelligence and Cooperative Robotic Systems*

Artificial Swarm Intelligence and Cooperative Robotic Systems

Iaroslav OmelianenkoNewGround Advanced Research

Kiev UkraineEmail yaricnewgroundcomua

AbstractmdashIn this paper we look at how Artificial SwarmIntelligence can evolve using evolutionary algorithms that tryto minimize the sensory surprise of the system We will showhow to apply the free-energy principle borrowed from statisti-cal physics to quantitatively describe the optimization method(sensory surprise minimization) which can be used to supportlifelong learning We provide our ideas about how to combinethis optimization method with evolutionary algorithms in orderto boost the development of specialized Artificial Neural Net-works which define the proprioceptive configuration of particularrobotic units that are part of a swarm We consider howoptimization of the free-energy can promote the homeostasis ofthe swarm system ie ensures that the system remains within itssensory boundaries throughout its active lifetime We will showhow complex distributed cognitive systems can be build in theform of hierarchical modular system which consists of specializedmicro-intelligent agents connected through information channelsWe will also consider the co-evolution of various robotic swarmunits which can result in development of proprioception and acomprehensive awareness of the properties of the environmentAnd finally we will give a brief outline of how this system canbe implemented in practice and of our progress in this area

Keywordsmdashartificial intelligence neuro-evolution swarm intel-ligence robotic swarm cooperative robotics free-energy principleactive inference evolutionary computation novelty search

I INTRODUCTION

With current advances in machine learning and robotics itrsquossafe to assume that in near future we will be witnessing deeppenetration of cooperative robotic systems into everyday lifeof millions of people as well as into many industrial processesClose cooperation with fragile human beings suggests thatrobotic systems should evolve to achieve deep understanding ofthe environment in order to safely use their powerful actuatorsSuch level of environmental awareness can be achieved byconsidering robotic system as not a single unit but rather asa distributed swarm of robotic units connected by peer-to-peermesh network with powerful central command and control(CNC) unit Such structure allows to join effective decisionmaking (by powerful CNC unit) with high awareness of theenvironmental dynamics achieved through spatially distributedsensorium

In this work we consider how to effectively build suchdistributed intelligent system which is able to adapt to thespecific environmental conditions due to inherent capabilitiesfor a lifelong learning We will show that there are plausibleparallels with biological systems can be made and applybiologically inspired paradigms to the design of such system

This paper is organized as follows In Section II we give anoverview of the structure of the swarm system at the physicallevel It is followed by Section III describing the hierarchy of

the swarm system at the logical level as well as describingits main components which make itrsquos cooperative intelligencepossible In Section IV we discuss implementation of swarmlearning model and optimization methods to be used for initialtraining of artificial swarm intelligence Section V providedescription of simple swarm architecture and how it can beimplemented Finally in Sections VI and VII we discuss ourfindings and describe our current and future works in the area

II SWARM STRUCTURE - PHYSICAL LEVEL

In this paper we consider swarm as a tightly integratedrobotic environment in which many semi-autonomous typesof robotic organisms cooperatively interact to achieve specificgoals The robotic units in the environment organized in well-defined hierarchy where particular types of units provide spe-cific sensory inputs for the swarm while others are capable ofaffecting the environment In addition the swarm will includepowerful units (command centers) to perform energy- andcomputationally-intensive tasks for strategic decision-makingand general knowledge accumulation

Our idea is to have specific types of units to evolve inthe process of swarm maturation from some kind of basicunits Each basic unit has unique combination of sensors andactuators allowing it to perceive and affect the environmentin a certain way Later in the evolutionary process somesensors and actuators of basic units will be inactivated tocreate an unique configuration for the new derived unit typeThe evolutionary process will be bounded by environmentalfactors and will be optimized to ensure cooperative behaviorof various types of robotic units to achieve the desired resultsSuch cooperative integration allows a sensorium of one typeof units to provide guidance for another type eg aerial unitswill provide reconnaissance data about environment that is notdirectly perceived by ground-based units

The continuous communication with the command center isan important part of coordinating collective behavior betweenunits of the swarm However in case of an emergency it isassumed that all units can operate autonomously to completethe current task even when the communication link is down

The autonomous behavior of each unit is assisted byspecialized tactical decision making software which consist ofmodularhierarchical neural networks trained to operate withinsensorimotor framework of that specific robotic unit

III SWARM HIERARCHY - LOGICAL LEVEL

We regard the robotic swarm as a single sensorimotorlogical unit which has an unified spatial awareness of theenvironment distributed over all its constituent parts But

This work was supported by NewGround LLC

Preprints (wwwpreprintsorg) | NOT PEER-REVIEWED | Posted 28 January 2019

copy 2019 by the author(s) Distributed under a Creative Commons CC BY license

Preprints (wwwpreprintsorg) | NOT PEER-REVIEWED | Posted 28 January 2019 doi1020944preprints2019010282v1

copy 2019 by the author(s) Distributed under a Creative Commons CC BY license

the internal structure of the swarmrsquos consciousness is not amonolith but rather consist of a complex hierarchy of micro-intelligent agents that interact with each other to perform datacollection data processing and decision-making according tothe current environmental conditions and tasks to be com-pleted

The dynamics of the swarm will be controlled with the helpof attractors - the active components of the system that willlead the cooperative passive components Passive componentsare implemented in such a way as to assist any external actioninitiated by a friendly active unit or person Active componentsare dedicated swarm units that have improved sensorimotorcapabilities and higher level of autonomy due to the morepowerful cognitive models implemented in them

Thus while each component of the swarm has a certainlevel of autonomy the full potential of the system is revealed inthe synergistic interaction of all its components As a result theArtificial Swarm Intelligence (ASI) [14] system is emergingwhich has a cognitive potential that exceeds the sum of all itscomponents [15]

An appropriate swarm hierarchy will emerge in the processof evolution of the system from seeds of basic units Eachseed of a specific robotic unit describes its basic sensorimotorcapabilitiesconfiguration in terms of the topology of theArtificial Neural Network (ANN) controlling it The neuro-evolution [11] [12] process will be applied to the mentionedseed configurations to create specialized ANN modules [13]depending on the sensorimotor dynamics of the system andthe observed environmental characteristics

The evolutionary potential of each ANN module will bebounded by the hardware on which robotic unit operatesThus as in biological systems the most potent units will beevolved where the most of the computing power and energy isavailable Such inherent limitations will contribute to a targetedevolution of the hierarchical modular system where each parthas certain processing capabilities in accordance with its placein the swarm hierarchy

A The Command and Control Center

Strategic planning of the general behavior of the swarm iscarried out by the Command and Control (CNC) center wheresensory data from all swarm units are collected and processedThe decision-making process of the CNC center use collecteddata to model the next state of the environment in whichswarm operates After that the modelling error is correctedusing actual sensory inputs received from the swarm units(posterior probabilities distribution) after performing a specificaction based on the model predictions (prior probabilitiesdistribution) Thus it becomes possible to create a highlyadaptive system that is able to quickly learn the propertiesof novel environments and quickly adapt the behavior of theswarm to effectively achieve the goals In addition with eachnew environment learned the knowledge base of the CNCsystem will be expanded which will allow it to easy masternovel environments using transfer learning methods

Our idea is to use the Minimal Criterion Novelty Search[4] optimization method for a comprehensive exploration ofthe environment With this method of learning the exploring

agent will be rewarded depending on the level of novelty ofsolution that was found At the same time the minimal fitnesscriterion is introduced which allows to optimize the trainingtime and avoid agent sticking in the some noisy parts of theenvironment which have a high potential for the novelty butlow objective fitness

The ultimate goal of the swarm systemrsquos training is tominimize sensory surprise (free-energy) [9] of the CNC mod-ule while roaming in the environment The minimal value ofsurprise implies that the CNC system is fully aware of theproperties of the surrounding environment in the context ofthe actual sensory inputs and the probability distributions ofthe causes of the inputs Thus the CNC system is able tomaintain a reliable model of the environment and make theright inference about the outcomes of actions that will beperformed And with this awareness it becomes possible tobuild an optimal strategy for using the observed properties ofthe environment to achieve specific goals and accomplish theobjectives

Another important aspect of the CNC module is that itmaintains a library of already explored environments and canuse this prior knowledge as a stepping stone for improvingperformance in a different environment

The CNC module will control the overall dynamics of theswarm through the attractors - the dedicated active units Butat the same time to perform micro-tasks that require highaccuracy it will be able to directly control each unit of theswarm

B Micro-Intelligent Agents and Information Channels

Recently researchers have shown [21] that in simple modelsof neural networks the amount of effective information [26]increases as you coarse-grained the model of neuron interac-tions in the neural network that is consider groups of them asa whole The possible states of these interconnected units forma causal structure in which transitions between states can bemathematically modeled using so-called Markov chains [22]At a certain macroscopic scale effective information reachesa peak This is the scale at which states of the system havethe most causal power predicting future states in the mostreliable and effective manner By increasing the scale furtheryou begin to lose important details about the causal structureof the system The neural groups with most casual power atthe optimal macroscopic scale we refer to as Micro-IntelligentAgents (MIA)

MIA agents are the foundation of the cognitive hierarchyof the swarm system They are able to process tiny amountsof input data (bit sets words) and aggregate them into higher-order information units that will be passed to the next level ofprocessing Physically MIA agents are represented as highlyspecialized software modules evolved to process certain kindsof input data received from specific set of sensors or fromInformation Channels (ICH) [21]

Information Channels in turn facilitate the exchange ofinformation between different parts of the swarm and encodethe casual structure of the sensory inputs Information Chan-nels evolve along with Micro-Intelligent Agents in the processof maturation of Artificial Swarm Intelligence and become

2

Preprints (wwwpreprintsorg) | NOT PEER-REVIEWED | Posted 28 January 2019 Preprints (wwwpreprintsorg) | NOT PEER-REVIEWED | Posted 28 January 2019 doi1020944preprints2019010282v1

an important part of the swarm system configuration whichdetermines its ability to ingress environmental informationfrom its sensorium

Fig 1 The Information Channel connecting two MIA agentsone of which encodes the signal on the input side and theother decodes on the output side respectively Thus ICH withassociated MIA agents becomes an informative representation(coding) of a particular sensory input for a particular type andstate of the environment

The Information Channel consist of two finite sets Xand Y and is a collection of transition probabilities p(y|x)for each x isin X such that for every x and y p(y|x) ge 0and for every x

sumy p(y|x) = 1 (set known as the channel

matrix) X and Y are the input and output of the channelrespectively [23] The channel is governed by a channel matrixwhich is a fixed entity represented by a specific ArtificialNeural Network emerging during the corresponding neuro-evolutionary process Both channels and causal structures canbe represented as Transition Probability Matrixes (TPM) [22]and in a situation where the channel matrix contains the sametransition probabilities for a certain set of state transitions theTPMs will be identical The causal structure is a matrix thattransforms previous states into future ones

The TPM associates each state si in S with the probabilitydistribution of past states (SP |S = si) which could led toit and the probability distribution of future states to whichit could lead to (SF |S = si) In these terms the ED (EffectDistribution which is an ASI modelrsquos internal states describingsensory inputs) can be formally expressed as expectation of(SF |do(S = si)foralli isin 1 n) given some ID (InterventionDistribution which is the sensor casuals respectively) Wheredo(x) is an operator that formalizes interventions [29] whichset a system (or variables in a causal model) into a specificstate which facilitates the identification of the causal effectsof that state on the system disrupting its relationship with theobserved history

To send a message through a channel matrix with theseproperties it is necessary to define some encodingdecodingfunction implemented through the specialized MIA agents inour architecture (see Figure 1) A message can be some binarystring like 110111010001 generated via the application ofsome ID (Intervention Distribution) The encoding functionφ message rarr encoder is a rule that associates some

channel input with some output along with some decodingfunction ψ The encodingdecoding functions together create acode-book which is a concise informative description of theenvironmental properties sampled by the ASI agent from itssensory inputs

According to the Shannonrsquos theory of communication [25]communication channels have a certain capacity that is theability of the channel to convert inputs into outputs in themost informative and reliable way Also he found that the rateof information transmission over the channel correlates withthe changes in the input probability distribution p(X) Thusthe channel capacity (C) is determined by a set of inputs thatmaximizes the mutual information I(XY ) By maximizingchannel capacity it is possible to effectively increase itstransmission rate

C = maxp(X)I(XY ) (1)

where I(XY ) is the mutual information for representingthe transmission rate of information over the channel

I(XY ) = H(X)minusH(X|Y ) (2)

where H(X) represents the total possible entropy of thesource and the conditional entropy H(X|Y ) indicates howmuch information is left about X once Y is taken intoaccount Therefore H(X|Y ) has a clear causal interpretationas the amount of information lost during the processing andtransmission of sensory inputs

Taking into account the discoveries of Shannon it can beseen that the use of MIA agents as a encoderdecoder forinputoutput of the data bits from the information channeleffectively increases its capacity and transmission rate Whichis crucial given the distributed nature of the swarm intelligence

To draw parallels with biology swarm intelligence can beviewed as full body consciousness in which MIA agents canbe viewed as individual cells or cell conglomerates (sensingand pre-processing organs etc) [1] [2] and information chan-nels as a signaling pathways in the form of peripheral neuralsystem or information exchange routines through chemicalmarkers

IV SWARM LEARNING MODEL

It is assumed that swarm intelligence is not fully deter-mined by a constant learned set of skills - a priori pre-trainedknowledge but rather is a combination of prior knowledgewith continuous lifelong learning abilities The lifelong worldexploration is based on a certain kind of visual consciousnessmediated by inherent knowledge of the swarm system about itssensorimotor contingencies [5] and proprioception Here thevisual consciousness is not only about the sensation of visibleelectromagnetic specter but rather full gamma of sensationscollected from the environment and processed as a visualiza-tion model where internal visualization of the environmentis compared with actual data received from sensors In thisaspect it is important that swarm system evolve to maturitywith all its constituents incorporated in order to learn its uniquesensorimotor configuration which combines the perception of

3


external world with the available actuating mechanisms to acton it This unique configuration will be encoded by controlANN emerged during evolution Thus the experience gainedduring evolution of the swarm system becomes its uniquebubble of experience [6] which serves as a starting point forself-awareness development

The learning model will be optimized to minimize surprise(negative log probability) of some sensory data given a modelof how this data were generated which is called activeinference [7] The bounds of sensory surprise in a quantitativeterms can be expressed as a free-energy [8] Free-energymeasures the difference between the probability distributionof environmental factors affecting our system and an arbitrarydistribution encoded by the swarm systemrsquos configuration iethe topology of the control ANN its connection strengthsand modules included The swarm intelligence can minimizefree-energy (sensory surprise) by changing its configuration(topology of the control artificial neural network) to affectthe way it samples the environment (eg when we movethrough in a dark room we build a mental model of whatto expect in the next step and then compare what we feel withthese prior expectations) Thus the adaptive exchange with theenvironment will be established allowing swarm intelligence toconstruct prior expectations in a dynamic and context-sensitivefashion

According to the recently proposed free-energy principleframework the following key points characterize the model ofadaptive learning (biological)

bull Adaptive agents must occupy a limited reper-toire of states and therefore minimize thelong-term average value of surprise associ-ated with sensory exchanges with the worldMinimizing surprise allows them to resist thenatural tendency to disorder

bull Surprise is based on predictions about sensa-tions that depend on the internal generativemodel of the world Although surprise cannotbe measured directly the free-energy boundon surprise can be suggesting that agentsminimize free-energy by changing their pre-dictions (perception) or by changing the pre-dicted sensory inputs (action)

bull Perception optimizes predictions by mini-mizing free-energy with respect to the ac-tivity of neural units (perceptual inference)efficacy (learning and memory) and gain(attention and salience) This gives Bayes-optimal (probabilistic) representations aboutwhat caused the sensations (providing a linkto the Bayesian brain hypothesis)

bull Learning under the free-energy principle canbe formulated in terms of optimizing theconnection strengths in hierarchical modelsof the sensorium This is based on associa-tive plasticity for encoding causal regularitiesand address the same synaptic mechanismsas those underlying the formation of a cellassembly

bull Action under the free-energy principle isreduced to the suppression of sensory pre-

diction errors that depend on the predicted(expected or desired) movement trajectoriesThis provides a simple account of motor con-trol in which action is enslaved by perceptual(proprioceptive) predictions

bull Perceptual predictions are based on prior ex-pectations regarding the trajectory or move-ment through the agentrsquos state space Thesepriors can be acquired (as empirical priorsduring hierarchical inference) or they can beinnate (epigenetic) and therefore subject toselective pressure

bull The predicted movements or state transitionsrealized by the action correspond to policiesin the optimal control theory and reinforce-ment learning In this context value is in-versely proportional to surprise (and implic-itly free energy) and the rewards correspondto innate priors that constrain policies

( [9])

In this work our idea is to integrate this biologicallyinspired model of adaptive learning into a field of artificialintelligence in order to achieve our goal - to develop anintelligent distributed swarm system that can maintain itssensory states within the physical bounds of its components inface of constant environmental flux In short terms this meansavoiding sensory surprise which relates not only to the currentstate of the swarm system (which cannot be changed) butalso to the transition from one state to another (which can bechanged) The guiding principle for the mentioned transitionsbetween states which is compatible with the swarm survival(eg the flight of a swarm of drones within a small margin oferror) can be considered as a global random attractor [10] Inthe learning process we will be optimizing the movements ofthis attractor by minimizing free-energy bound on the sensorysurprise of the swarm

Fig 2 Information exchange between the environment and theASI agent The swarm learning model will be represent as a setof internal states that control actions and make prediction errorcorrections in accordance with sensations caused by externalenvironmental states

The dependencies between the inputsoutputs of a swarm

4


along with its internal states and the external states of theenvironment allows us to formulate a swarm learning model(see Figure 2)

The internal states of the Artificial Swarm Intelligencesystem (3) and its actions (4) work to minimize the free-energyF (s micro) which is a function of sensory inputs s(t) (5) and aprobabilistic representation q(Θ|micro) of its causes

micro = argminF (s micro) (3)

α = argminF (s micro) (4)

s = g(xΘ) + z (5)

The environment (6) is described by equation of motionthat specify the trajectory of its hidden states

˙x = f(x αΘ) + w (6)

Where causes Θ sup xΘ γ of sensory inputs includehidden states x(t) parameters Θ and precision γ controllingthe amplitude of the random fluctuations z(t) and w(t)

Free-energy depends on two probability densities therecognition density q(Θ|micro) and the one that generates sensorysamples and their causes p(sΘ|m) The latter represents aprobabilistic generative model (denoted by m) the form ofwhich is determined by the controlling ANN of the ASI agent

The sensory surprise of an ASI agent is a negative log-likelihood [17] associated with its sensory states s(t) that havebeen caused by some unknown quantities Θ sup xΘ

minus ln p(s(t)|m) = minus ln

intp(s(t)Θ) dΘ (7)

It can be shown [9] that the free-energy can be representedas a surprise minus ln p(s(t)|m) plus a Kullback-Leibler [16]divergence between the recognition q(Θ|micro) and conditionalp(Θ|s) densities

F = DKL(q(Θ|micro) p(Θ|s)) minus ln p(s|m) (8)

In this case the free-energy can be minimized by trainingASI agent model to infer causes of the sensory samples inBayes-optimal fashion

With another form of free-energy presentation (9) it can beshown that it can be minimized by affecting the environment(action) in accordance with the current internal state of theASI agent followed by sampling of sensory data After thatthe ASI agent will reconfigure its sensor circuits to sampleinputs which are predicted by its recognition density in orderto minimize prediction error

F = DKL(q(Θ|micro) p(Θ)) minus 〈 ln p(s(α)|Θm) 〉q (9)

V EXAMPLE OF THE SWARM SYSTEM IMPLEMENTATION

In this part we will look at how to apply the proposeddesign of the swarm robotic system to build very basicexploratory system that is able to develop awareness of thespatial environment and accomplish the tasks assigned

Such a system should at least have the following compo-nents (see Figure 3)

bull The command-and-control (CNC) module with highpower capacity and advanced computational capabili-ties which has reliable communication channels withother members of the swarm as well as with humanoperators

bull The light-weight mobile (LWM) units which can beused to quickly explore environment and deploy spa-tial coordinate grid for independent local positioningsystem

bull The heavy-weight mobile (HWM) units that canphysically reach various areas of the environmentand perform assigned tasks They will use the spatialcoordinate grid provided by the LWM units to confirmtheir position and direction

Fig 3 The structure of the Swarm Exploratory System

The role of light-weight mobile units is to quickly explorean unknown environment and occupy certain positions to formspatial grid coordinate system that will be used to facilitatespatial awareness of the swarm The LWM units must behighly mobile so that they can be deployed as quickly aspossible This means that depending on the properties ofthe environment they can be physically presented as droneslightweight gyro-boards etc

The heavy-weight mobile platforms will complement theLWM units with additional sensor arrays capable of collectingthe most comprehensive environmental data Thus with thedeployment of HWM platforms the swarmrsquos awareness ofenvironmental dynamics must be sufficient enough to fulfillthe task set by the human operators even in an autonomousmode

The positions of the HWM platforms will be visuallycorrelated with respect to the spatial grid formed by the LWMunits (see Figure 4) which makes the swarm independent ofany external positioning system (GPS GLONAS etc) Alsothe special configuration of the infrared LED arrays placedon the body of the HWM platforms at certain places willallow to determine their direction relatively to the spatial gridwithout any reference to the geomagnetic data Furthermore itis possible to use other variants of local positioning based forexample on near-field radio signals The advantage of using avisual signal is that it is much more difficult to suppress thana radio signal To further enhance the reliability of the localcoordinate system visual positioning can be supplementedwith acoustic or seismic signals depending on the propertiesof the environment in which it is deployed This is especiallyuseful in underground or underwater environments where lightand radio signals propagation is limited

5


With a local grid coordinate system the swarm can operatein any environment where there are no other means to obtainreliable positioning from external systems Thus swarm canbe deployed in many harsh environments underground under-water in the interplanetary space or in places where signalsfrom an external positioning systems are suppressed

The CNC module could be implemented as a stationarystructure with a powerful battery array and capable processingunits To increase the resilience and survivability of the swarmit is possible to use multiple CNC systems deployed on HWMplatforms which use a master-slave architecture The currentmaster CNC module will manage the swarm and the slaveCNC modules will maintain their state by listening to regularupdates from the master In the event of masterrsquos failure a newmaster will be elected according to a specific election protocol[18] [19]

VI CONCLUSIONS

In this paper we presented our ideas on the develop-ment of robotic systems controlled by an Artificial SwarmIntelligence We looked at how swarm can be trained toobtain full sensorimotor awareness about all of its constituentparts (proprioception) endowing it with sense of relative partspositions movement and acceleration We also consideredhow to support lifelong learning which is one of the mostimportant features of the ASI system that allows it to quicklyadapt to the conditions of novel environments using transfer-learning methods

We looked at the Artificial Swarm Intelligence as a macro-scale hierarchy of the Micro-Intelligent Agents and InformationChannels which according to the theory of casual emergence[24] provides more informative casual model of the environ-ment than micro-scale systems that take into account every bitof input data Ie map (macro-scale) is better than territory(micro-scale) in terms of information models

We suggested using evolutionary computational [20] meth-ods (eg neuro-evolution) for evolving of specific controlneural networks from basic (seed) ANN configurations Weassume that during the evolutionary process the specific typeof control ANN configuration will be evolved that are ableto take full advantage of the sensorimotor capabilities of aparticular robotic unit The overall configuration of the swarmis also the subject of an evolutionary development in whichvarious types of robotic units will co-evolve complementingeach otherrsquos characteristics

We have described a learning model based on the notionof minimizing the sensory surprise of a swarm system whenit interacts with the environment And applied the free-energyprinciple [8] to determine the optimization function which canbe used as a guiding principle during the lifelong training ofthe ASI model

Finally we provided an overview of the implementationof a simple swarm system with the definition of its majorconstituent parts and their interactions

Summing up our research the following key conceptsshould be considered when designing the Artificial SwarmIntelligence system

Fig 4 The grid structure of the coordinate system is formedby light-weight mobile units The hexagonal spatial structureallows to generate grid code that determines the coordinates inthe local coordinate system relative to the master CNC module

bull The sensorimotor configuration of the swarm systemmust be constant during the period of primary trainingItrsquos important that during maturation the ArtificialSwarm Intelligence becomes fully aware of its phys-ical boundaries and learns how to act within them(proprioception) An embodiment allows ASI to buildcomplex models of internal sensory experiences linkedwith the causes of the specific sensory inputs whichallows the use of active inference [7] optimizationtechniques in order to support lifelong learning

bull The ASI system should be designed as a modularhierarchical architecture in which data collectiondata processing and decision-making procedures aredistributed across multiple hierarchical levels and spanacross multiple physical units of the swarm The exactarchitecture structure and format of communication(information) channels should be evolved in the pro-cess of ASI maturation as part of the early epistemo-logical experiments with the environment checkingphysical boundaries predicting outcomes of physicalactions etc

bull The dynamics of the swarm should be controlled bythe interaction of active and passive units Active unitsserve as a random attractors [10] which shape thebehavior of passive units By doing this they reducethe computational load on the command-and-controlunit when building the optimal strategy for executionof the current task

bull It is important to have energy- and computationally-rich Command-and-Control units to maintain proprio-ception of the swarm system and control execution ofassigned tasks on the high level

bull Another important aspect of Artificial Swarm Intelli-gence is to maintain the library of already exploredenvironments which can be uses for transfer-learningwhen building a model of novel environments

bull The swarm must include various types of robotic units

6


with different types of sensorimotor configurationsand specializations This provides boost during anASI evolution and allows the selection of appropri-ate configurations that are beneficial for a particularenvironment Such flexibility allows to use a trainedswarm system to explore any type of environment forwhich it has an appropriate sensorium and actuators

bull The ASI system should evolve local spatial awarenessbased on its component parts This implies that it haslight-weight mobile units that can be quickly deployedinto strategic places to form a local spatial coordinategrid with hexagonal cells After that it will have along-range coordinate system for determining the rel-ative positions of the swarm units without any externalcues (see Figure 4) It is important that mentionedhexagons is not uniform but differ in sizes directionsand shifts relative to one another Due to such complexgeometry they will provide unique coding patterns(grid codes) for each possible spatial coordinates [28]relative to the master CNC module

VII FUTURE WORK

Right now we are working on the implementation of thedescribed swarm system architecture We already developedmany components of the software framework which can beused to implement control ANN modules

The software is written in GO programming language [27]which makes it highly interoperable that can be executed onwide range of hardware platforms Also our emphasis onusing neuro-evolution methods to evolve specific topology andconfiguration of control neural networks results in the modesthardware requirements exposed

We open sourced various components of our softwareframework at GitHub

bull httpsgithubcomyaricomgoNEAT

bull httpsgithubcomyaricomgoNEAT NS

bull httpsgithubcomyaricomgoESHyperNEAT

In parallel we are working on the hardware components oflight-weight mobile platform to begin field experiments withautonomous behavior

REFERENCES

[1] Marijuan PC Navarro J and del Moral R (2010) On prokaryoticintelligence Strategies for sensing the environment Biosystems 9994ndash103 (2010) httpsdoiorg101016jbiosystems200909004

[2] Lyon P (2015) The cognitive cell Bacterial behaviour reconsideredFrontiers in Microbiology 6 264 (2015) httpsdoiorg103389fmicb201500264

[3] Parr Thomas and Friston Karl J (2018) The Anatomy of InferenceGenerative Models and Brain Structure Frontiers in ComputationalNeuroscience 12 90 (2018) httpsdoiorg103389fncom201800090

[4] Joel Lehman and Kenneth O Stanley (2011) Novelty Search and theProblem with Objectives Genetic Programming Theory and PracticeIX (GPTP 2011) New York NY Springer 2011 httpeplexcsucfedupaperslehmangptp11pdf

[5] OrsquoRegan J K and Noe A (2001) A sensorimotor account of visionand visual consciousness Behavioral and Brain Sciences 24(5)939ndash73 httpsdoiorg101017S0140525X01000115

[6] David Gamez (2018) Human and Machine Consciousness Cam-bridge UK Open Book Publishers 2018 httpsdoiorg1011647OBP0107

[7] Adams R A Shipp S and Friston K J (2013) Predictions notcommands active inference in the motor system Brain Struct Funct218 611ndash643 httpsdoiorg101007s00429-012-0475-5

[8] Friston K Kilner J Harrison L (2006) A free energy principle for thebrain J Physiol Paris 2006 Jul-Sep 100(1-3)70-87 Epub 2006 Nov13 httpsdoiorg101016jjphysparis200610001

[9] Friston K (2010) The free-energy principle a unified brain theoryNature Reviews Neuroscience mdash Aop published online 13 Jan 2010volume 11 httpsdoiorg101038nrn2787

[10] Crauel H and Flandoli F (1994) Attractors for random dynamicalsystems Probab Theory Relat Fields 100 365ndash393 (1994) httpsdoiorg101007BF01193705

[11] K O Stanley and R Miikkulainen (2004) Competitive coevolutionthrough evolutionary complexification Journal of Artificial IntelligenceResearch 2163ndash100 2004 httpdlacmorgcitationcfmid=16224671622471

[12] K O Stanley B D Bryant and R Miikkulainen (2005) Real-timeneuroevolution in the NERO video game IEEE Transactions onEvolutionary Computation Special Issue on Evolutionary Computationand Games 9(6)653ndash668 2005 httpsdoiorg101109TEVC2005856210

[13] J C Bongard (2002) Evolving modular genetic regulatory networksIn Proceedings of the 2002 Congress on Evolutionary Computation2002 httpsdoiorg101109CEC20021004528

[14] Beni G Wang J (1993) Swarm Intelligence in Cellular RoboticSystems In Dario P Sandini G Aebischer P (eds) Robots and Bio-logical Systems Towards a New Bionics NATO ASI Series (Series FComputer and Systems Sciences) vol 102 Springer Berlin Heidelberghttpsdoiorg101007978-3-642-58069-7 38

[15] Nasuto S Bishop M (2008) Stabilizing Swarm Intelligence Search viaPositive Feedback Resource Allocation In Krasnogor N NicosiaG Pavone M Pelta D (eds) Nature Inspired Cooperative Strategiesfor Optimization (NICSO 2007) Studies in Computational Intelli-gence vol 129 Springer Berlin Heidelberg httpsdoiorg101007978-3-540-78987-1 11

[16] Kullback S Leibler R A (1951) On Information and SufficiencyAnn Math Statist 22 (1951) no 1 79ndash86 httpsdoiorg101214aoms1177729694

[17] R A Fisher and Edward John Russell On the mathematical foundationsof theoretical statistics 222 Philosophical Transactions of the RoyalSociety of London Series A Containing Papers of a Mathematical orPhysical Character httpdoiorg101098rsta19220009

[18] H Attiya and J Welch (2004) Distributed Computing FundamentalsSimulations and Advance Topics John Wiley amp Sons inc 2004 chap3 httpsdoiorg1010020471478210

[19] Haeupler Bernhard amp Ghaffari Mohsen (2013) Near Optimal LeaderElection in Multi-Hop Radio Networks Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithm httpsdoiorg1011371978161197310554

7


[20] Back Thomas and Fogel David B and Michalewicz Zbigniew (1997)Handbook of Evolutionary Computation 1st ed IOP Publishing LtdBristol UK UK ISBN 978-0-7-503-0895-3

[21] Erik P Hoel (2017) When the Map Is Better Than the TerritoryEntropy 2017 19(5)188 httpsdoiorg103390e19050188

[22] Gagniuc Paul A (2017) Markov Chains From Theory to Implemen-tation and Experimentation USA NJ John Wiley amp Sons pp1ndash235ISBN 978-1-119-38755-8

[23] Cover TM Thomas JA (2012) Elements of Information TheoryJohn Wiley amp Sons Hoboken NJ USA 2012 ISBN 978-0-471-24195-9

[24] Hope LR Korb KB (2005) An information-theoretic causal powertheory In Australasian Joint Conference on Artificial IntelligenceSpringer BerlinHeidelberg Germany 2005 pp 805ndash811 httpsdoiorg101007978-0-387-84816-7 10

[25] Shannon CE (1948) A mathematical theory of communication BellSyst Tech J 1948 27 623ndash666 httpsdoiorg101002j1538-73051948tb01338x

[26] Tononi G Sporns O (2003) Measuring information integrationBMC Neurosci 2003 4 31 httpsdoiorg1011861471-2202-4-31

[27] Robert Griesemer Rob Pike Ken Thompson (2009) The Go Pro-gramming Language Google GO team 2009-2018 Retrieved fromhttpsgolangorg

[28] Jacob L S Bellmund Peter Gardenfors Edvard I Moser ChristianF Doeller (2018) Navigating cognition Spatial codes for humanthinking Science 09 Nov 2018 Vol 362 Issue 6415 eaat6766httpsdoiorg101126scienceaat6766

[29] Neuberg L (2003) CAUSALITY MODELS REASONING ANDINFERENCE by Judea Pearl Cambridge University Press2000 Econometric Theory 19(4) 675-685 httpsdoiorg101017S0266466603004109

8


INTRODUCTION

Swarm Structure - Physical Level

Swarm Hierarchy - Logical Level

The Command and Control Center

Micro-Intelligent Agents and Information Channels

Swarm Learning Model

Example of the Swarm System Implementation

Conclusions

Future Work

References

the internal structure of the swarmrsquos consciousness is not amonolith but rather consist of a complex hierarchy of micro-intelligent agents that interact with each other to perform datacollection data processing and decision-making according tothe current environmental conditions and tasks to be com-pleted

The dynamics of the swarm will be controlled with the helpof attractors - the active components of the system that willlead the cooperative passive components Passive componentsare implemented in such a way as to assist any external actioninitiated by a friendly active unit or person Active componentsare dedicated swarm units that have improved sensorimotorcapabilities and higher level of autonomy due to the morepowerful cognitive models implemented in them

Thus while each component of the swarm has a certainlevel of autonomy the full potential of the system is revealed inthe synergistic interaction of all its components As a result theArtificial Swarm Intelligence (ASI) [14] system is emergingwhich has a cognitive potential that exceeds the sum of all itscomponents [15]

An appropriate swarm hierarchy will emerge in the processof evolution of the system from seeds of basic units Eachseed of a specific robotic unit describes its basic sensorimotorcapabilitiesconfiguration in terms of the topology of theArtificial Neural Network (ANN) controlling it The neuro-evolution [11] [12] process will be applied to the mentionedseed configurations to create specialized ANN modules [13]depending on the sensorimotor dynamics of the system andthe observed environmental characteristics

The evolutionary potential of each ANN module will bebounded by the hardware on which robotic unit operatesThus as in biological systems the most potent units will beevolved where the most of the computing power and energy isavailable Such inherent limitations will contribute to a targetedevolution of the hierarchical modular system where each parthas certain processing capabilities in accordance with its placein the swarm hierarchy

A The Command and Control Center

Strategic planning of the general behavior of the swarm iscarried out by the Command and Control (CNC) center wheresensory data from all swarm units are collected and processedThe decision-making process of the CNC center use collecteddata to model the next state of the environment in whichswarm operates After that the modelling error is correctedusing actual sensory inputs received from the swarm units(posterior probabilities distribution) after performing a specificaction based on the model predictions (prior probabilitiesdistribution) Thus it becomes possible to create a highlyadaptive system that is able to quickly learn the propertiesof novel environments and quickly adapt the behavior of theswarm to effectively achieve the goals In addition with eachnew environment learned the knowledge base of the CNCsystem will be expanded which will allow it to easy masternovel environments using transfer learning methods

Our idea is to use the Minimal Criterion Novelty Search[4] optimization method for a comprehensive exploration ofthe environment With this method of learning the exploring

agent will be rewarded depending on the level of novelty ofsolution that was found At the same time the minimal fitnesscriterion is introduced which allows to optimize the trainingtime and avoid agent sticking in the some noisy parts of theenvironment which have a high potential for the novelty butlow objective fitness

The ultimate goal of the swarm systemrsquos training is tominimize sensory surprise (free-energy) [9] of the CNC mod-ule while roaming in the environment The minimal value ofsurprise implies that the CNC system is fully aware of theproperties of the surrounding environment in the context ofthe actual sensory inputs and the probability distributions ofthe causes of the inputs Thus the CNC system is able tomaintain a reliable model of the environment and make theright inference about the outcomes of actions that will beperformed And with this awareness it becomes possible tobuild an optimal strategy for using the observed properties ofthe environment to achieve specific goals and accomplish theobjectives

Another important aspect of the CNC module is that itmaintains a library of already explored environments and canuse this prior knowledge as a stepping stone for improvingperformance in a different environment

The CNC module will control the overall dynamics of theswarm through the attractors - the dedicated active units Butat the same time to perform micro-tasks that require highaccuracy it will be able to directly control each unit of theswarm

B Micro-Intelligent Agents and Information Channels

Recently researchers have shown [21] that in simple modelsof neural networks the amount of effective information [26]increases as you coarse-grained the model of neuron interac-tions in the neural network that is consider groups of them asa whole The possible states of these interconnected units forma causal structure in which transitions between states can bemathematically modeled using so-called Markov chains [22]At a certain macroscopic scale effective information reachesa peak This is the scale at which states of the system havethe most causal power predicting future states in the mostreliable and effective manner By increasing the scale furtheryou begin to lose important details about the causal structureof the system The neural groups with most casual power atthe optimal macroscopic scale we refer to as Micro-IntelligentAgents (MIA)

MIA agents are the foundation of the cognitive hierarchyof the swarm system They are able to process tiny amountsof input data (bit sets words) and aggregate them into higher-order information units that will be passed to the next level ofprocessing Physically MIA agents are represented as highlyspecialized software modules evolved to process certain kindsof input data received from specific set of sensors or fromInformation Channels (ICH) [21]

Information Channels in turn facilitate the exchange ofinformation between different parts of the swarm and encodethe casual structure of the sensory inputs Information Chan-nels evolve along with Micro-Intelligent Agents in the processof maturation of Artificial Swarm Intelligence and become

2



















3













( [9])




4






s = g(xΘ) + z (5)


˙x = f(x αΘ) + w (6)





















5




VI CONCLUSIONS














6




VII FUTURE WORK








REFERENCES




















7












8


INTRODUCTION







Conclusions

Future Work

References


















3













( [9])




4






s = g(xΘ) + z (5)


˙x = f(x αΘ) + w (6)





















5




VI CONCLUSIONS














6




VII FUTURE WORK








REFERENCES




















7












8


INTRODUCTION







Conclusions

Future Work

References












( [9])




4






s = g(xΘ) + z (5)


˙x = f(x αΘ) + w (6)





















5




VI CONCLUSIONS














6




VII FUTURE WORK








REFERENCES




















7












8


INTRODUCTION







Conclusions

Future Work

References





s = g(xΘ) + z (5)


˙x = f(x αΘ) + w (6)





















5




VI CONCLUSIONS














6




VII FUTURE WORK








REFERENCES




















7












8


INTRODUCTION







Conclusions

Future Work

References



VI CONCLUSIONS














6




VII FUTURE WORK








REFERENCES




















7












8


INTRODUCTION







Conclusions

Future Work

References



VII FUTURE WORK








REFERENCES




















7












8


INTRODUCTION







Conclusions

Future Work

References











8


INTRODUCTION







Conclusions

Future Work

References

Artiﬁcial Swarm Intelligence and Cooperative Robotic Systems*

Documents

Transcript of Artiﬁcial Swarm Intelligence and Cooperative Robotic Systems*