Grounded Symbolic Communication between Heterogeneous Cooperating Robots


Autonomous Robots 8, 269–292, 2000. © 2000 Kluwer Academic Publishers. Manufactured in The Netherlands.


DAVID JUNG
Center for Engineering Science Advanced Research (CESAR), Computer Science and Mathematics Division (CSMD), Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
[email protected]; http://pobox.com/~david.jung

ALEXANDER ZELINSKY
Robotic Systems Laboratory (RSL), Department of Systems Engineering, Research School of Information Sciences and Engineering, The Australian National University, Canberra, ACT 0200, Australia
[email protected]; http://syseng.anu.edu.au/rsl/

Abstract. In this paper, we describe the implementation of a heterogeneous cooperative multi-robot system that was designed with a goal of engineering a grounded symbolic representation in a bottom-up fashion. The system comprises two autonomous mobile robots that perform cooperative cleaning. Experiments demonstrate successful purposive navigation, map building and the symbolic communication of locations in a behavior-based system. We also examine the perceived shortcomings of the system in detail and attempt to understand them in terms of contemporary knowledge of human representation and symbolic communication. From this understanding, we propose the Adaptive Symbol Grounding Hypothesis as a conception for how symbolic systems can be envisioned.

Keywords: cooperative robotics, heterogeneous systems, symbolic communication, symbol grounding, learning, representation, behavior-based, mobile robots

1. Introduction

The behavior-based approach to robotics has proven that it is possible to build systems that can achieve tasks robustly, react in real-time and operate reliably. The sophistication of applications implemented ranges from simple reactivity to tasks involving topological map building and navigation. Conversely, the classical AI approach to robotics has attempted to construct symbolic representational systems based on token manipulation. There has been some success in this endeavor also. While more powerful, these systems are generally slow, brittle, unreliable and do not scale well—as their ‘symbols’ are ungrounded.

In this paper, we present an approach for engineering grounded symbolic communication between heterogeneous cooperating robots. It involves designing behavior that develops shared groundings between them. We demonstrate a situated, embodied, behavior-based multi-robot system that implements a cooperative cleaning task using two autonomous mobile robots. They develop shared groundings that allow them to ground a symbolic relationship between positions consistently. We show that this enables symbolic communication of locations between them.

The subsequent part of the paper critically examines the system and its limitations. The understanding of the system we arrive at shows that our approach will not scale to complex symbolic systems. We argue that it is impossible for complex symbolic representational systems to be responsible for appropriate behavior in situated agents. We propose the Adaptive Symbol Grounding Hypothesis as a conception of how systems that communicate symbolically can be envisioned.


Before presenting the system we have developed, the first section briefly discusses cooperation and communication generally and looks at some instances of biological cooperation in particular. From this, we determine the necessary attributes of symbolic systems.

2. Cooperation and Communication

Cooperation and communication are closely tied. Communication is an inherent part of the agent interactions underlying cooperative behavior, whether implicit or explicit. If we are to implement a concrete cooperative task that requires symbolic-level communication, we must first identify the relationship between communication and cooperative behavior. This section introduces a framework for classifying communication and uses it to examine some examples of cooperative behavior in biological systems. From this examination, we draw conclusions about the mechanisms necessary to support symbolic communication. These mechanisms are utilized in our implementation of the cooperative cleaning system, as described in the subsequent section.

It is important to realize that ‘cooperation’ is a word—a label for a human concept. In this case, the concept refers to a category of human and possibly animal behavior. It does not follow that this behavior is necessarily beneficial to the agents involved. Since evolution selects behavioral traits that promote the genes that encourage them, cooperation will be beneficial to the genes but not necessarily to the organism or species. Human cooperative behavior, for example, is a conglomerate of various behavioral tendencies selected for different reasons (and influenced by cultural knowledge). Because the design of cooperative robot systems is in a different context altogether, we need to understand which aspects are peculiar to biological systems.

2.1. Some Characteristics of Communication

Many authors have proposed classifications for the types of communication found in biological and artificial systems (e.g. see Arkin and Hobbs, 1992b; Balch and Arkin, 1994; Cao et al., 1995; Dudek et al., 1993; Kube and Zhang, 1994; Matarić, 1997). Like any classification, these divide the continuous space of communication characteristics into discrete classes in a specific way—and hence are only useful within the context for which they are created. We find it necessary to introduce another classification here.

A communicative act is an interaction whereby a signal is generated by an emitter and ‘interpreted’ by a receiver. We view communication in terms of the following four characteristics.

• Interaction distance—This is the distance between the agents during the communicative interaction. It can range from direct physical contact, to visual range, hearing range, or long range.
• Interaction simultaneity—The period between the signal emission and reception. It can be immediate in the case of direct contact, or possibly a long time in the case of scent markers, for example.
• Signaling explicitness—This is an indication of the explicitness of the emitter’s signaling behavior. The signaling may be a side effect of an existing behavior (implicit), or an existing behavior may have been modified slightly to enhance the signal through evolution or learning. The signaling may also be the result of sophisticated behavior that was specifically evolved, learnt, or in the case of a robot, designed, for it.
• Sophistication of interpretation—This can be applied to either the emitter or the receiver. It is an indication of the complexity of the interpretation process that gives meaning to the signal. For example, a chemical signal may invoke a relatively simple chain of chemical events in a receiving bacterium. It is possible that a signal has a very different meaning to the emitter and receiver—the signal may have no meaning at all to its emitter. Conversely, the process of interpretation of human language is the most complex example known.

2.2. Representation

It is not possible to measure the sophistication of the interpretive process by observing the signal alone. Access to the mechanics of the process within an agent would be necessary. Unfortunately, our current understanding of the brain mechanisms underlying communication in most animals is poor, at best. Therefore, our only approach is to examine the structure of the communicated signals. Luckily, there appears to be some correlation between the structural complexity of communicated signals and the sophistication of their interpretive processes. Insect mating calls are simple in structure and we posit a simple interpretive process.


At the other end of the spectrum, human language has a complex structure and we consider its interpretation amongst the most sophisticated processes known. Bird song and human music are possible exceptions, as they are often complex in structure, yet have a relatively simple interpretation. This is due to other evolutionary selection pressures, since song also provides fitness information about its emitter to prospective mates and, in the case of birds, serves to distinguish between members of different species.

Science, through the discipline of linguistics, has learned much about the structure of the signals generated by humans that we call language (see Robins, 1997). We utilize a small part of that here by describing a conception of the observed structure of language. Using Deacon’s terms, we define three types of reference, or levels of representation: iconic, indexical and symbolic (Deacon, 1997).

Iconic representation is by physical similarity to what it represents. The medium may be physically external to the agent—for example, an orange disc painted on a cave wall may represent the sun. Alternatively, it may be part of the agent, such as some repeatable configuration of sensory neurons, or “internal analog transforms of the projections of distal objects on our sensory surfaces” (Shepard and Cooper, 1982).

Indexical reference represents a correlation or association between icons. All animals are capable of iconic and indexical representation to varying degrees. For example, an animal may learn to correlate the icon for smoke with that for fire. Hence, smoke will come to be an index for fire.

Figure 1. Levels of representation.

Even insects probably have limited indexical capabilities. Empirical demonstrations are a mechanism for creating indexical references in others. Pointing, for example, creates an association between the icon for the physical item being indicated and the object of a sentence. The second part of this paper describes how we have used empirical demonstration to enable the communication of locations between robots.

The third level of representation is symbolic. A symbol is a relationship between icons, indices and other symbols. It is the representation of a higher-level pattern underlying sets of relationships. It is hypothesized that language can be represented as a symbolic hierarchy (Newell and Simon, 1972). We will use the term sub-symbolic to refer to representations that need only iconic and indexical references for their interpretation. If the interpretation of a symbol requires following references that all eventually lead to icons, the symbol is said to be grounded. That is, the symbols at the top of Fig. 1 ultimately refer to relationships between the icons at the bottom. A symbol’s grounding is the set of icons, indices and other symbols necessary to interpret it.
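To make the grounding condition concrete, the following sketch (our illustration, not from the paper) models icons, indices and symbols as nodes in a reference graph and tests whether every chain of references from a symbol terminates at icons; all class and function names are our own.

```python
# A minimal sketch (ours) of the three levels of reference. A reference
# is grounded if every chain of references from it ends at icons.

class Icon:
    """Represents sensory data directly; grounded by definition."""
    def __init__(self, name):
        self.name = name
        self.refers_to = []  # icons refer to nothing further

class Index:
    """An association (correlation) between two icons."""
    def __init__(self, name, a, b):
        self.name = name
        self.refers_to = [a, b]

class Symbol:
    """A relationship over icons, indices and other symbols."""
    def __init__(self, name, referents):
        self.name = name
        self.refers_to = list(referents)

def is_grounded(ref, path=()):
    """True if every reference chain from ref terminates at icons."""
    if isinstance(ref, Icon):
        return True
    if id(ref) in path:
        return False  # a circular reference never reaches an icon
    path = path + (id(ref),)
    return bool(ref.refers_to) and all(is_grounded(r, path)
                                       for r in ref.refers_to)

smoke, fire = Icon("smoke"), Icon("fire")
smoke_means_fire = Index("smoke->fire", smoke, fire)
danger = Symbol("danger", [smoke_means_fire, fire])
print(is_grounded(danger))  # True: every chain ends at an icon
```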

The problems associated with trying to synthesize intelligence from ungrounded symbol systems—the classical AI approach—have been well documented in the literature. One such problem is termed the frame problem (see Ford and Hayes, 1991; Pylyshyn, 1987). The importance of situated and embodied agents has been actively espoused by members of the behavior-based robotics community, in recognition of these problems, for many years (see Brooks (1991), Matarić (1992b), Pfeifer (1995) and Steels (1996) for a selection).


This hypothesis is called the physical grounding hypothesis (Brooks, 1990). Consequently, we adopted the behavior-based approach for our implementation.

2.3. Cooperative Biological Systems

In this subsection, we describe five selected biological cooperative systems and classify the communication each employs using the scheme introduced above. From these, in the following subsection, we identify the necessary mechanisms for symbolic communication, which were transferred to the implementation of the cooperative multi-robot cleaning system.

2.3.1. Bacteria. Cooperation between simple organisms is almost as old as life on earth itself. Over a billion years ago there existed bacteria similar to contemporary bacteria that have recently been observed to exhibit primitive cooperation. Biologists have long understood that bacteria live in colonies. Only recently has it become evident that most bacteria communicate using a number of sophisticated chemical signals and engage in altruistic behavior (Kaiser and Losick, 1993). For example, Myxobacteria assemble into multi-cellular structures known as fruiting bodies. These structures are assembled in a number of stages, each mediated by a different chemical signaling system. In these cases the bacteria emit and react to chemicals in a genetically determined way that evolved explicitly for cooperation. Hence, we classify the signaling as explicit. The interaction distance is moderate compared to the size of a bacterium, and the simultaneity is determined by the speed of chemical propagation. The mechanism for interpretation is necessarily simple in bacteria.

We can consider this communication without meaning preservation—the meaning of the signal is different for emitter and receiver. The emitter generates a signal without interpreting it at all (hence, it has no meaning to the emitter). The receiver interprets it iconically.1 This type of communication has been implemented and studied in multi-robot systems. For example, Balch and Arkin have implemented a collective multi-robot system, both in simulation and with real robots, to investigate to what extent communication can increase the robots’ capabilities (Balch and Arkin, 1994). The tasks they implemented were based on eusocial insect tasks, such as forage, consume, and graze. One scheme employed was the explicit signaling of the emitter’s state to the receiver. They showed that this improves performance, as we might expect. Specifically, it provides the greatest benefit when the receiver cannot easily sense the emitter’s state implicitly. This finding was also observed by Parker in the implementation of a puck-moving task where each robot broadcast its state periodically (Parker, 1995), and by Kube and Zhang with their collective box-pushing system (Kube and Zhang, 1994). The second part of this paper will demonstrate that the result also holds for our system.

2.3.2. Ants. Of the social insect societies, the most thoroughly studied are those of ants, termites, bees and wasps (Wilson, 1971; Wilson, 1975; Crespi and Choe, 1997). Ants display a large array of cooperative behaviors. For example, as described in detail by Pasteels et al. (1987), upon discovering a new food source, a worker ant leaves a pheromone trail during its return to the nest. Recruited ants will follow this trail to the food source with some variation, laying down their own pheromone as they go. Any chance variations that result in a shorter trail to the food will be reinforced at a slightly faster rate, as the traversal time back and forth is less. Hence, it has been shown that a near-optimal shortest path is quickly established as an emergent consequence of simple trail following with random variation.

In this case, the interaction distance is local—the receiver senses the pheromone at the location where it was emitted. As the signal persists in the environment for long periods, there may be significant delay between emission and reception. The signaling mechanism is likely to be explicit and the interpretation, while more complex than for bacteria, is still relatively simple. The ants also communicate by signaling directly from antennae to antennae.

Since both emitter and receiver can interpret the signal in the same way, we consider it communication with meaning preservation. The crucial element is that both agents share the same grounding for the signal. In this case, the grounding is probably genetically determined—through identical sensors and neural processes. This mechanism can also be applied to multi-robot systems. For example, if two robots shared identical sensors, they could simply signal their sensor values. This constitutes an iconic representation, and it is grounded directly in the environment for both robots identically. Nothing special needs to be done to ensure a shared grounding for the signal.

2.3.3. Wolves. A social mammal of the canine family, wolves are carnivores that form packs with strict social hierarchies and mating systems (Stains, 1984).


Wolves are territorial. Territory marking occurs through repeated urination on objects on the periphery of and within the territories. This is a communication scheme reminiscent of our ants and their chemical trails. Wolves also communicate with pheromones excreted via glands near the dorsal surface of the tail.

Wolves hunt in packs. During a pack hunt, individuals cooperate by closely observing the actions of each other and, in particular, of the dominant male, who directs the hunt to some extent. Each wolf knows all the pack members and can identify them individually, both visually and by smell. Communication can be directed at particular individuals and consists of a combination of specific postures and vocalizations. The interaction distance in this case is the visual or auditory range, respectively, and the emission and reception are effectively simultaneous. The signals may be implicit, in the case of observing locomotory behavior, for example, or more explicit in the case of posturing, vocalizing and scent marking. It seems likely that the signals in each of these cases are interpreted similarly by the emitter and receiver.

Again, this is an instance of communication with meaning preservation. A significant difference is that the shared grounding enabling the uniform interpretation of some signals (e.g. vocalizations and postures) is not wholly genetically determined. Instead, a specific mechanism exists such that the grounding is partially learnt during development—in a social environment sufficiently similar for both agents that a shared meaning is ensured.

2.3.4. Non-Human Primates. Primates display sophisticated cooperative behavior. The majority of interactions involve passive observation of collaborators via visual and auditory cues, which are interpreted as actions and intentions. As Bond writes in reference to Vervet monkeys, “They are acutely and sensitively aware of the status and identity of other monkeys, as well as their temperaments and current dispositional states” (Bond, 1996). Higher primates are able to represent the internal goals, plans, dispositions and intentions of others and to construct collaborative plans jointly through acting socially (Cheney and Seyfarth, 1990). In this case, the interaction is simultaneous and occurs within visual or auditory range. The signaling is implicit, but the sophistication of interpretation for the receiver is considerable. Some explicit posing and gesturing is also employed, to establish and control ongoing cooperative interactions.

As with the wolves, we observe communication with meaning preservation through a shared grounding that is developed through a developmental process. In this case, the groundings are more sophisticated, as is the developmental process required to attain them.

2.3.5. Humans. In addition to the heritage of our primate ancestors, humans make extensive use of communication, both written and spoken, that is explicitly evolved or learnt. There is almost certainly some a priori physiological support for language learning in the developing human brain (Bruner, 1982). Humans cooperate in many and varied ways. We display a basic level of altruism toward all humans and sometimes animals. We enter into cooperative relationships—symbolic contracts—with mates, kin, friends, organizations, and societies whereby we exchange resources for mutual benefit. In many cases, we provide resources with no reward except the promise that the other party, by honoring the contract, will provide resources when we need them, if possible. We are able to keep track of all the transactions and the reliability with which others honor contracts (see Deacon (1997) for a discussion).

Humans also use many types of signaling for communication. Like our primate cousins, we make extensive use of implicit communication, such as posturing (body language). We also use explicit gesturing—pointing, for example. Facial expressions are a form of explicit signaling that has evolved from existing expressions to enhance the signaling reliability and repertoire. Posturing, gesturing and speaking all involve simultaneous interaction. However, with the advent of symbolic communication we learned to utilize longer-term interactions. A physically realized icon, such as a picture, a ring or body decoration, is more permanent. The ultimate extension of this is written language. The coming of telephones, radio and the Internet has obviously extended the interaction distances considerably.

While symbolic communication requires considerable sophistication of interpretation, humans also use signals that can be interpreted more simply. For example, laughter has the same meaning to all humans, but not to other animals. We can make the necessary connection with the emotional state since we can hear and observe others and ourselves laughing—we share the same innate involuntary laugh behavior.

The developmental process that provides the shared groundings for human symbolic communication is an extension of the processes present in our non-human primate ancestors (Hendriks-Jansen, 1996).


The major differences are in the complexity due to the sheer number of groundings we need to learn, and in the intrinsic power of symbolic representation over exclusively indexical and iconic representation. Symbolic representations derive their power because they provide a degree of independence from the symbolic, indexical and iconic references that generated the relationship represented. New symbols can be learnt using language metaphor (Lakoff and Johnson, 1980; Johnson, 1991).

2.4. Symbolic Communication and its Prerequisites

Evolution does not have the luxury of being able to make simultaneous independent changes to the design of an organism while also ensuring their mutual consistency (in terms of the viability of the organism). For this reason, once a particular mechanism has evolved, it is built upon rather than significantly re-designed to effect a new mechanism. It is only when selection pressures change enough to render a mechanism a liability that it may be discarded. This is why layering is observed in natural systems (e.g. Mallot, 1995).

In the examples above, we can perceive a layering of communication mechanisms that are built up as we look at each in turn—from bacteria to humans. Each layer leverages the mechanisms developed in the previous one. The ants’ use of chemical pheromone trails to implement longer-duration interactions is supported by direct-contact chemical communication, pioneered by their distant bacterial ancestors. Wolves also employ this type of communication, which provides an environment that supports the developmental process for learning other shared groundings. The sophistication of such developmental processes is greater in non-human primates and significantly so in humans. However, even for humans, these processes still leverage the simpler processes that provide the scaffolding of shared iconic and indexical groundings (see Thelen and Smith, 1994; Hendriks-Jansen, 1996).

We believe such layering is integral to the general robustness of biological systems. If a more sophisticated mechanism fails to perform, the lesser ones will still operate. We emulate this layering in the implementation of our system for this reason.

From our examination, the following seem to be necessary for symbolic communication between two agents.

• Some iconic representations in common (e.g. by possessing some physically identical sensory-motor apparatus).
• Either a shared grounding for some indexical representations, a common process that develops shared indexical groundings, or a combination of both (e.g. a mechanism for learning the correlation between icons—such as correlating ‘smoke’ with ‘fire’).
• A common process that develops shared symbolic groundings (e.g. mother and infant ‘innate’ behavior that scaffolds language development—turn-taking, intentional interpretation, mimicking etc.).

Additionally, unless the symbol repertoire is to be fixed with specific processes for acquiring each symbol, it seems necessary to have:

• A mechanism for learning new symbols by communicating known ones (e.g. interpretation and learning through metaphor).

The implementation of this last necessity in a robot system is currently beyond the state of the art. However, the first three are implemented in the cooperative cleaning system, as described in the following section.

3. The System

Our research involved the development of an architecture for behavior-based agents that supports cooperation (Jung, 1998; Jung and Zelinsky, 1999a).2 To validate the architecture, we implemented a cooperative cleaning task using the two Yamabico mobile robots pictured in Fig. 2 (Yuta et al., 1991). The task is to clean our laboratory floor space. Our laboratory is a cluttered environment, so the system must be capable of dealing with movable obstacles, people and other hazards.

3.1. The Robots

As we are interested in heterogeneous cooperation, we built each robot with a different set of sensors and actuators, and devised the cleaning task such that it cannot be accomplished by either robot alone. One of the robots, ‘Joh’, has a vacuum cleaner that can be turned on and off via software. Joh’s task is to vacuum piles of ‘litter’ from the laboratory floor.


Figure 2. The two Yamabicos ‘Flo’ and ‘Joh’.

As our aim was not to design a high-performance cleaning system per se, chopped Styrofoam serves as ‘litter’. Joh cannot vacuum close to walls or furniture, as the vacuum is mounted between the drive wheels. It has the capability to ‘see’ piles of litter using a CCD camera and a video transmitter that sends video to a Fujitsu MEP tracking vision system. The vision system uses template correlation, and can match about 100 templates at frame rate. The vision system can communicate with the robot, via a UNIX host, over a radio modem. Visual obstacle-avoidance behavior has been demonstrated at speeds of up to 600 mm/sec (Cheng and Zelinsky, 1996).

The other robot, ‘Flo’, has a brush tool that is dragged over the floor to sweep distributed litter into larger piles for Joh to pick up. It navigates around the perimeter of the laboratory, where Joh cannot vacuum, and deposits the litter in open floor space. Sensing is primarily via four specially developed passive tactile ‘whiskers’ (Jung and Zelinsky, 1996). The whiskers provide values proportional to their angle of deflection. Both robots are also fitted with ultrasonic range sensors and wheel encoders.

3.2. A Layered Solution

We implemented the cleaning task by layering solutions involving more complex behavior over simpler solutions (see Fig. 3). This provides a robust final solution, reduces the complexity of implementation and allows us to compare the system performance at intermediate stages of development.

Figure 3. Layered solution.


The first layer involves all the basic behavior required to clean the floor, but does not include any capacity to purposefully navigate, explicitly communicate or cooperate. Flo sweeps up litter and periodically deposits it into piles where it is accessible by Joh. Joh uses its vision system to detect the piles and vacuum them up.


Therefore, the signaling—depositing litter piles—is implicit in this case, as it is normal cleaning behavior. The interaction is not simultaneous, as Joh doesn’t necessarily see the piles as soon as they are deposited. The interaction distance ranges over the size of the laboratory. Flo doesn’t interpret the piles of litter as a signal at all—and in fact has no way of sensing them. Joh has a simple interpretation—the visual iconic representation of the pile acts as a releaser to vacuum over it.

The second layer gives Joh an awareness of Flo. We added the capability for Joh to visually detect and track the motion of Flo. This is another communication mechanism that provides state information about Flo to Joh. In this case, the signaling is again implicit, the interaction distance is visual range and the interaction is simultaneous. Joh uses the visual iconic representation of Flo to ground an indexical reference for the likely location of the pile of litter deposited. Figure 4 shows Joh visually observing Flo via a distinctive pattern. Details of the implementation of the visual behavior we employed can be found in (Jung et al., 1998). A top view of typical trajectories of the robots is shown in Fig. 5.

Figure 4. Joh visually tracking Flo (no vacuum attached).

The third layer introduces explicit communication. Specifically, upon depositing a pile of litter, Flo signals via radio the position (distance and orientation) of the pile relative to its body, and the relative positions of the last few piles deposited. Flo and Joh both have identical wheel encoders, so we are ensured of a shared grounding for the interpretation of the communicated relative distance and orientation to piles. Although odometry has a cumulative error, this can be ignored over such short distances. The catch is that the positions are relative to Flo. Hence, Joh must transform them to egocentric positions based on the observed location of Flo. If Flo is not currently in view, the information is ignored. A typical set of trajectories is shown in Fig. 6.
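The frame transformation this layer requires can be sketched as follows (our illustration, not the paper’s code); the pose representation and function names are assumptions.

```python
import math

# A minimal sketch (ours) of the layer-3 coordinate transform: Flo
# reports a pile at (distance, bearing) relative to its own body; Joh
# knows Flo's pose (x, y, heading) in Joh's egocentric frame from
# visual tracking, and recovers the pile position in its own frame.

def pile_in_joh_frame(flo_pose, pile_range, pile_bearing):
    """flo_pose: (x, y, heading) of Flo in Joh's frame (m, m, rad).
    pile_range/pile_bearing: pile position relative to Flo's body."""
    fx, fy, fheading = flo_pose
    # The bearing is measured from Flo's heading, so rotate it into
    # Joh's frame before adding the offset to Flo's position.
    angle = fheading + pile_bearing
    return (fx + pile_range * math.cos(angle),
            fy + pile_range * math.sin(angle))

# Example: Flo is 2 m ahead and 1 m left of Joh, facing 90 degrees;
# it reports a pile 0.5 m away, 45 degrees left of its own heading.
print(pile_in_joh_frame((2.0, 1.0, math.pi / 2), 0.5, math.pi / 4))
```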

The fourth and final layer involves the communication of litter locations by Flo to Joh even when Flo cannot be seen. This is accomplished by using a symbolic interpretation for a specific geometric relationship of positions to each other. What is communicated to convey a location is analogous to ‘litter position is 〈specific-geometric-relation-between〉 〈position-A〉 〈position-B〉 〈direction〉 〈distance〉’.


Figure 5. Typical trajectories when Joh can observe Flo depositing litter (Layer 2).

The positions are indexical references that are themselves grounded through a shared process, a location-labeling behavior, described below. The distance and direction are in fact raw encoder data, hence an iconic reference, relying on the shared wheel encoders. There is no signal communicated for the symbolic relation itself (like a word); since there is only one symbol in the system, it is unambiguous. Obviously, if more symbols were known, or a mechanism for learning new symbols were available, labels for the symbols would need to be generated and signaled (and perhaps a syntax established).

First, we describe the action selection scheme employed, as it is the basis for the navigation and map building mechanism, which in turn is the basis for the location-labeling behavior.

3.3. Action Selection

We needed to design an action selection mechanism that is distributed, grounded in the environment, and employs a uniform action selection mechanism over all behavior components. Because the design was undertaken in the context of cooperative cleaning, we also required the mechanism to be capable of cooperative behavior and communication, in addition to navigation. Each of these requires some ability to plan. This implies that the selection of which action to perform next must be made in the context of which actions may follow—that is, within the context of an ongoing plan. In order to be reactive, flexible and opportunistic, however, a plan cannot be a rigid sequence of pre-defined actions to be carried out. Instead, a plan must include alternatives, have flexible sub-plans, and each action must be contingent on a number of factors. Each action in a planned sequence must be contingent on internal and external circumstances, including the anticipated effects of the successful completion of previous actions. Other important properties are that the agent should not stop behaving while planning occurs and should learn from experience.

There were no action selection mechanisms in the literature capable of fulfilling all our requirements. As our research is more concerned with cooperation than action selection per se, we adopted Maes’ spreading activation algorithm and modified it to suit our needs.


Figure 6. Typical trajectories when explicit communication is utilized (Layer 3).

Her theory “models action selection as an emergent property of an activation/inhibition dynamics among the actions the agent can select and between the actions and the environment” (Maes, 1991).

3.3.1. Components and Interconnections. The behavior of a system is expressed as a network that consists of two types of nodes—Competence Modules and Feature Detectors. Competence modules (CMs) are the smallest units of behavior selectable, and feature detectors (FDs) deliver information about the external or internal environment. A CM implements a component behavior that links sensors with actuators in some arbitrarily complex way. Only one CM can be executing at any given time—a winner-take-all scheme. A CM is not limited to information supplied by FDs—the FDs are only separate entities in the architecture to make explicit the information involved in the action selection calculation.

The graphical notation is shown in Fig. 7, where rectangles represent CMs and rounded rectangles represent FDs. Although there can be much exchange of information between CMs and FDs, the interconnections shown in this notation only represent the logical organization of the network for the purpose of action selection.

Each FD provides a single Condition with a confidence [0..1] that is continuously updated from the environment (sensors or internal states). Each CM has an associated Activation, and the CM selected for execution is the one with the highest activation of all Ready CMs whose activations are over the current global threshold. A CM is Ready if all of its preconditions are satisfied. The activations are continuously updated by a spreading activation algorithm.
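A minimal sketch of this winner-take-all selection rule (ours; the paper’s actual data structures are not given):

```python
# A minimal sketch (ours) of winner-take-all action selection over
# competence modules (CMs): the Ready CM with the highest activation
# over the global threshold is selected.

class CompetenceModule:
    def __init__(self, name, preconditions, activation=0.0):
        self.name = name
        self.preconditions = preconditions  # list of (fd_name, wanted_bool)
        self.activation = activation

    def ready(self, fd_conditions):
        """Ready iff every (possibly negated) precondition is satisfied."""
        return all(fd_conditions.get(fd, False) == wanted
                   for fd, wanted in self.preconditions)

def select_cm(cms, fd_conditions, threshold):
    """Return the Ready CM with the highest activation over threshold."""
    candidates = [cm for cm in cms
                  if cm.ready(fd_conditions) and cm.activation > threshold]
    return max(candidates, key=lambda cm: cm.activation, default=None)

cms = [CompetenceModule("Follow", [("ObstacleOnLeft", True)], 0.7),
       CompetenceModule("ReverseTurn", [("FrontHit", True)], 0.9)]
fds = {"ObstacleOnLeft": True, "FrontHit": False}
print(select_cm(cms, fds, threshold=0.5).name)  # Follow
```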

The system behavior is designed by creating CMs and FDs and connecting them with precondition links. These are shown in the diagram as solid lines from an FD to a CM, ending with a white square. It is possible to have negative preconditions, which must be false before the CM can be Ready. There also exist correlation links, dotted lines in the figure, from a CM to an FD. The correlations can take values in [−1..1] and are updated at run-time according to a learning algorithm.


Figure 7. Network components and interconnections.

A positive correlation implies that the execution of the CM causes, somehow, a change in the environment that makes the FD condition true. A negative correlation implies that the condition becomes false. The designer usually initializes some correlation links to bootstrap learning.

Together these two types of links, the precondition links and the correlation links, completely determine how activation spreads through the network. The other activation links that are shown in Fig. 7 are determined by these two and exist to better describe and understand the network and the activation spreading patterns. The activation links dictate how activation spreads and are determined as follows.

• There exists a successor link from CM p to CM s for every FD condition in s’s preconditions list that is positively correlated with the activity of p.
• There exists a predecessor link in the opposite direction of every successor link.
• There exists a conflictor link from CM x to CM y for every FD condition in y’s preconditions list that is negatively correlated with the activity of x.

The successor, predecessor and conflictor links resulting from the preconditions and correlations are shown in Fig. 7.

In summary, a CM s has a predecessor CM p if p’s execution is likely to make one of s’s preconditions true. A CM x has a conflictor CM y if y’s execution is likely to make one of x’s preconditions false.
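The sketch below (ours, with assumed data structures) derives the successor, predecessor and conflictor links from precondition lists and learned correlations, following the three rules above.

```python
# A minimal sketch (ours) of deriving activation links from
# precondition links and learned CM-FD correlations.

def derive_links(cms, preconds, corr):
    """cms: CM names. preconds: cm -> list of FD names (preconditions).
    corr: (cm, fd) -> correlation in [-1, 1].
    Returns successor, predecessor and conflictor link lists."""
    successors, conflictors = [], []
    for s in cms:
        for fd in preconds.get(s, []):
            for p in cms:
                if p == s:
                    continue
                c = corr.get((p, fd), 0.0)
                if c > 0:    # p's activity tends to make s's precondition true
                    successors.append((p, s))
                elif c < 0:  # p's activity tends to make s's precondition false
                    conflictors.append((p, s))
    predecessors = [(s, p) for (p, s) in successors]
    return successors, predecessors, conflictors

cms = ["Follow", "DumpLitter"]
preconds = {"DumpLitter": ["Timer"], "Follow": ["ObstacleOnLeft"]}
corr = {("Follow", "Timer"): 0.8}
print(derive_links(cms, preconds, corr))
# successor Follow->DumpLitter, predecessor DumpLitter->Follow
```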

3.3.2. The Spreading of Activation. A rigorous description of the spreading activation algorithm is beyond the scope of this paper. The algorithm has been detailed in previous publications (Jung, 1998; Jung and Zelinsky, 1999a). The activation rules can be more concisely described in terms of the activation links. The main spreading activation rules can be simply stated:

• Unready CMs increase the activation of predecessors and decrease the activation of conflictors, and
• Ready CMs increase the activation of successors.

In addition, these special rules change the activation of the network from outside, in response to goals and the current situation:

• Goals increase the activation of CMs that can satisfy them and decrease the activation of those that conflict with them, and
• FDs increase the activation of CMs for which they satisfy a precondition.
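A minimal sketch of one synchronous spreading step under these four rules (ours; the paper’s actual update equations, decay and normalization are not reproduced):

```python
# A minimal sketch (ours) of one step of spreading activation under
# the four rules above; real implementations also decay and normalize.

def spread_step(activation, ready, successors, conflictors,
                goals_for, fd_satisfies, rate=0.1):
    """activation: cm -> float. ready: cm -> bool.
    successors: (p, s) links; conflictors: (x, y) links.
    goals_for: CMs that satisfy a goal. fd_satisfies: cm -> number of
    currently-true preconditions. Returns updated activations."""
    delta = {cm: 0.0 for cm in activation}
    for p, s in successors:
        if not ready[s]:
            delta[p] += rate          # unready CM boosts its predecessor
        if ready[p]:
            delta[s] += rate          # ready CM boosts its successor
    for x, y in conflictors:
        if not ready[y]:
            delta[x] -= rate          # unready CM inhibits its conflictor
    for cm in goals_for:
        delta[cm] += rate             # goals inject activation
    for cm, n_true in fd_satisfies.items():
        delta[cm] += rate * n_true    # FDs inject activation
    return {cm: activation[cm] + delta[cm] for cm in activation}

acts = {"Follow": 0.5, "DumpLitter": 0.2}
ready = {"Follow": True, "DumpLitter": False}
print(spread_step(acts, ready, [("Follow", "DumpLitter")], [],
                  goals_for=["Follow"], fd_satisfies={"Follow": 3}))
```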

To get a feel for how it works, we describe part of a network that implements the cleaning task for Flo, as shown in Fig. 8. With some of the components shown, a crude perimeter-following behavior is possible.

The rectangles are basic behaviors (CMs), the ovals are feature detectors (FDs), and only the correlation and precondition links are shown (the small circles indicate negation of a precondition). The goal is Cleaning. This occurs when Flo roughly follows the perimeter of the room, using Follow to follow walls and ReverseTurn to reverse and turn away from the perimeter when an obstacle obstructs the path. Periodically, the litter that has accumulated in the sweeper is deposited away from the perimeter by DumpLitter.


Figure 8. Partial network for Flo (produced by our GUI).

The spreading activation algorithm ‘injects’ activation into the network CMs via goals and via FDs that meet a precondition. Therefore, the Cleaning goal causes an increase in the activation of Follow, DumpLitter and ReverseTurn. Suppose Flo is in a situation where its left whiskers are against a wall (ObstacleOnLeft is true) and there are no obstacles in front (ObstacleAhead and FrontHit both false). In this case, the activation of Follow will be increased by all the FDs in its precondition set (including Timer, which is false before being triggered). Being the only ready CM, it is scheduled for execution until the situation changes. Once the Timer FD becomes true, Follow is no longer ready, but DumpLitter becomes ready and is executed. Follow and DumpLitter also decrease each other’s activation, as they conflict—each is correlated with the opposite state of Timer.

Although the selection of CMs in this example depends mainly on the FD states, when the selection of CMs depends more on the activation spread from other CMs, the networks can exhibit ‘planning’—as Maes has shown. This is the basis for action planning in our networks, and gives rise to path planning, as will be described below.

From the rules, we can imagine activation spreading backward through a network from the goals, through CMs with unsatisfied preconditions via the precondition links, until a ready CM is encountered. Activation will tend to accumulate at the ready CM, as it is feeding activation forward while its successor is feeding it backward. Eventually it may be selected for execution, after which its activation is reset to zero. If its execution was successful, the precondition of its successor will have been satisfied and the successor may be executed (if it has no further unsatisfied preconditions). We can imagine multiple routes through the network, with activation building up faster via shorter paths. These paths of higher activation represent ‘plans’ within the network. The goals act like a ‘homing signal’, filtering out through the network and arriving at the current ‘situation’.

One important difference between our networks and Maes’ is that in ours the flow of activation is weighted according to the correlations—which are updated continuously at run-time according to previous experience. The mechanism for adjusting the correlation between a given CM-FD pair is simple. Each time the CM becomes active, the value of the FD’s condition is recorded. When the CM is subsequently deactivated, the current value of the condition is compared with the recorded value. It is classified as one of: Became True, Became False, Remained True or Remained False. A count of these cases is maintained (Bt, Bf, Rt, Rf). The correlation is then:

corr = (2Bt + Rt)/(2N) − (2Bf + Rf)/(2N)

where the total number of samples N = Bt + Bf + Rt + Rf. To keep the network plastic, the counts are decayed so that recent samples have a greater effect than historic ones.
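The following sketch (ours; the decay constant is an assumed parameter) maintains the four counts with exponential decay and computes the correlation as defined above.

```python
# A minimal sketch (ours) of the CM-FD correlation learning above.
# decay < 1 makes recent samples outweigh historic ones.

class Correlation:
    def __init__(self, decay=0.95):
        self.decay = decay
        self.Bt = self.Bf = self.Rt = self.Rf = 0.0

    def update(self, before, after):
        """Record one CM activation: FD condition before and after."""
        for k in ("Bt", "Bf", "Rt", "Rf"):
            setattr(self, k, getattr(self, k) * self.decay)
        if before != after:
            if after: self.Bt += 1   # became true
            else:     self.Bf += 1   # became false
        else:
            if after: self.Rt += 1   # remained true
            else:     self.Rf += 1   # remained false

    def value(self):
        n = self.Bt + self.Bf + self.Rt + self.Rf
        if n == 0:
            return 0.0
        return ((2 * self.Bt + self.Rt) / (2 * n)
                - (2 * self.Bf + self.Rf) / (2 * n))

c = Correlation()
c.update(before=False, after=True)   # the CM's execution made the FD true
print(c.value())                     # 1.0
```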


Figure 9. (a) Geometric vs (b) topological path planning.

3.4. Navigation and Map Building

3.4.1. Spatial and Topological Path Planning. There are two main approaches to navigational path planning. One method utilizes a geometric representation of the robot environment, perhaps implemented using a tree structure. Usually a classical path planner is used to find shortest routes through the environment. The distance transform method falls into this category (Zelinsky et al., 1993). These geometric modeling approaches do not fit with the behavior-based philosophy of only using categorizations of the robot-environment system that are natural for its description, rather than anthropocentric ones. Hence, numerous behavior-based systems use a topological representation of the environment in terms only of the robot’s behavior and sensing (e.g. see Mataric, 1992a). While these approaches are more robust than the geometric modeling approach, they suffer from non-optimal performance for shortest path planning. This is because the robot has no concept of space directly, and often has to discover the adjacency of locations.

Consider the example in Fig. 9, where the robot in (a) has a geometric map and its planner can directly calculate the path of least Cartesian distance, directly from A to D. However, the robot in (b) has a topological map with nodes representing the points A, B, C and D, connected by a follow-wall behavior. Since it has never previously traversed directly from A to D, the shortest path through its map is A-B-C-D.

Consequently, our aim was to combine the benefits of geometric and topological map representations in a behavior-based system using our architecture.

3.4.2. A Self-Organizing Map. In keeping with the behavior-based philosophy, we found no need to explicitly specify a representation for a map or a specific mechanism for path planning. Instead, by introducing the key notion of location feature detectors (location FDs), the correlation learning and action selection naturally gave rise to map building and path planning—for ‘free’.

A location feature detector is a component of our architecture specialized to respond when the robot is in a particular location (the detector’s characteristic location). We employ many detectors, and the locations to which they respond are non-uniformly distributed over the laboratory floor space. Each location FD contains a vector v, whose components are elements of the robot state vector:

x = [g, s, fds]

where

g = (x, y, θ), the global odometry coordinates
s = the sensor values
fds = the non-location FD values

The variable g contains global Cartesian coordinates and orientation estimated from the wheel encoders and a model of the locomotion controller. The sensors include ultrasonic range readings and, in Flo’s case, tactile whisker values. The fds component contains the condition values of all FDs in the system, except for the location FDs themselves. For example, in Joh’s case this includes visual landmark FDs.

The condition confidence value of each location FD is updated by comparing it to the current state of the robot’s sensors and other non-location FDs. A weighted Euclidean norm Nw is used—with the (x, y) coordinate weights dominating.

ci = 1 − Nw(v, x)

Hence, the vector of the location FD whose condition is true with highest confidence is considered to represent the ‘current location’ of the robot. The detectors are iconic representations of locations (see Fig. 10).
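A sketch of this confidence computation (ours; the weight values and the clamping to [0, 1] are assumptions, as the paper specifies only the weighted norm):

```python
import math

# A minimal sketch (ours) of location FD confidence: one minus a
# weighted Euclidean norm between the FD's stored vector v and the
# robot state x, with the (x, y) weights dominating.

def confidence(v, x, weights):
    dist = math.sqrt(sum(w * (vi - xi) ** 2
                         for w, vi, xi in zip(weights, v, x)))
    return max(0.0, 1.0 - dist)   # clamping to [0, 1] is our assumption

def current_location(location_fds, x, weights):
    """Return the index of the location FD with highest confidence."""
    scores = [confidence(v, x, weights) for v in location_fds]
    return max(range(len(scores)), key=scores.__getitem__)

# Two location FDs over a state [x, y, sonar]; (x, y) weights dominate.
fds = [[0.0, 0.0, 0.3], [2.0, 1.0, 0.8]]
print(current_location(fds, [1.9, 1.1, 0.5], weights=[1.0, 1.0, 0.1]))  # 1
```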

The location FD vectors v are initialized such that the (x, y) components are distributed as a regular grid over the laboratory floor space, and the other components are randomly distributed over the vector space. During operation of the system, the location FD vectors are updated using Kohonen’s self-organizing map (SOM) algorithm (Kohonen, 1990). This causes the spatial distribution of the location FD vectors to approximate the frequency distribution of the robot’s state vector over time. Figure 11 shows how the detectors have organized themselves to represent one of our laboratories. One useful property of a SOM is that it preserves topology—nodes that are adjacent in the representation are neighboring locations in the vector space.
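One SOM update step might look as follows (our sketch; the learning rate, neighborhood radius and grid layout are assumed parameters):

```python
# A minimal sketch (ours) of one Kohonen SOM update over location FD
# vectors arranged on a grid; rate and radius are assumed parameters.

def som_update(vectors, grid_pos, x, rate=0.1, radius=1):
    """vectors: list of FD vectors. grid_pos: (row, col) of each vector.
    x: current robot state vector. Moves the best-matching unit and its
    grid neighbors toward x, so vector density tracks visit frequency."""
    dist2 = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    bmu = min(range(len(vectors)), key=lambda i: dist2(vectors[i], x))
    br, bc = grid_pos[bmu]
    for i, (r, c) in enumerate(grid_pos):
        if abs(r - br) <= radius and abs(c - bc) <= radius:
            vectors[i] = [vi + rate * (xi - vi)
                          for vi, xi in zip(vectors[i], x)]
    return bmu
```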


Figure 10. Schematic of location detector SOM and current location index.


Since the location FD vectors v are continuously matched with the robot state vector x, in which the (x, y) coordinates are estimated via odometry, there is a major drawback: the odometry error in (x, y) is cumulative. We remedy this by updating the robot state vector coordinates. Specifically, the system has feature detectors for various landmark types that are automatically correlated with the location FDs by the correlation learning described above. If a landmark FD becomes true with high confidence and is strongly correlated with a location FD neighboring the location FD for the ‘current location’, then the state vector (x, y) component is updated. The coordinates are simply moved closer to the coordinates of the location FD with which the landmark is correlated. Assuming the landmarks don’t move over moderate periods, this serves to keep the location FD (x, y) components registered with the physical floor space.
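A sketch of this landmark-based correction (ours; the confidence and correlation thresholds and the blending gain are assumptions):

```python
# A minimal sketch (ours) of landmark-based odometry correction: when a
# strongly-correlated landmark FD fires near the current location, pull
# the state estimate toward the associated location FD's coordinates.

def correct_odometry(state_xy, landmark_conf, corr_to_location, loc_xy,
                     conf_thresh=0.8, corr_thresh=0.7, gain=0.5):
    """state_xy: odometry (x, y) estimate. landmark_conf: confidence of
    a landmark FD. corr_to_location: its learned correlation with a
    location FD neighboring the current one. loc_xy: that FD's (x, y)."""
    if landmark_conf > conf_thresh and corr_to_location > corr_thresh:
        return tuple(s + gain * (l - s) for s, l in zip(state_xy, loc_xy))
    return state_xy

print(correct_odometry((3.2, 1.9), 0.9, 0.8, (3.0, 2.0)))  # (3.1, 1.95)
```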

The system also maintains an indexical reference that represents the robot’s current location. Recall that an indexical reference is a correlation between icons. The robots each have a sense of time—in terms of the ordering relation between sensed events (which is shared to the extent that the ordering of external events is perceived to be the same by both robots). Hence, the current location index is an association between the most active location detector and the current time.

It is clear this mechanism fulfills our requirement for spatial mapping. The topological mapping derives again from the correlation learning in the architecture. Specifically, the system learns by experience that a particular behavior can take the robot from one state to another—for example, by changing the current location index in a consistent way. Over time, behavior such as follow-wall becomes correlated with the start and end locations of a wall segment. The spreading activation will cause the behavior to be activated when the system needs to ‘plan’ a sub-path from the start to the end.


Figure 11. SOM generated by Flo over part of our Lab.

Similarly, simple motion behavior becomes correlated with moving the robot from one location to one of its neighbors.

3.4.3. Navigation. Once we have feature detectors that respond to specific locations, it is straightforward to add spatial and topological navigation. Each time a behavior (a CM) is activated, the identity of the current location FD before and after its execution is recorded. A new instance of the CM is created and initialized with the ‘source’ location FD as a precondition and the ‘destination’ as a positive correlate. Hence, the system remembers which behavior can take it from one specific location to another.


Figure 12. Behavioral adjacency of locations via Follow (FL).

If the CM does not consistently do this, its correlation with the destination location FD will soon fall. If it falls to zero, the CM is removed from the network. Changes in the environment also cause correlations to change, thus allowing the system to adapt.

With this mechanism, the system learns the topological adjacency of locations in terms of behavior. For example, if the activation of the Follow CM consistently takes the robot from the location FD corresponding to the start of a wall to the one at the end of the wall, then the links in Fig. 12 will be created.

The spreading activation algorithm for action selection is able to plan a sequence of CM activations to achieve navigation between any pair of locations.

Spatial navigation is achieved by initializing the network so that a simple Forward behavior links each location FD with its eight neighbors, in both directions. Hence, initially the system ‘thinks’ it can move in a straight line between any locations that are neighbors in the SOM. If the presence of an obstacle blocks the straight-line path from one location to its neighbor, then this will be learnt through a loss of correlation between the corresponding Forward CM and the ‘destination’ FD. The mechanisms described here for map building and navigation are presented in detail in (Jung, 1998; Jung and Zelinsky, 1999b).
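The sketch below (ours; names are illustrative) initializes the navigation network as described, with one Forward link per ordered pair of 8-neighbors in the location-FD grid.

```python
# A minimal sketch (ours) of initializing spatial navigation: a Forward
# CM instance links each location FD to each of its eight grid neighbors,
# with the source as precondition and the destination as positive correlate.

def init_forward_links(rows, cols):
    """Return (source, destination) pairs for every 8-neighbor link."""
    links = []
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr, dc) != (0, 0) and 0 <= nr < rows and 0 <= nc < cols:
                        links.append(((r, c), (nr, nc)))
    return links

links = init_forward_links(3, 3)
print(len(links))  # 40 directed links in a 3x3 grid
```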

3.5. A Shared Grounding for Locations

For layer 4 of the implementation, we wanted to add the capability for Flo to communicate the locations of litter piles in a more general way, such that it would be useful to Joh even if Flo were not in view, or were even in another room. In the system as described thus far, Flo and Joh do not share any representations except the iconic representations of their shared sensors (odometry and ultrasonic). The location feature detectors may be correlated with visual landmarks in Joh’s map, and whisker landmarks in Flo’s (among other information).

Hence, before we can communicate Flo’s representation of a location, we need a procedure to establish a shared grounding with Joh. For this purpose, we have implemented a location-labeling procedure. Location labeling is essentially behavior whereby Flo teaches Joh a location by empirical demonstration. It proceeds as follows.

If Joh is tracking Flo in its visual field at a particular time and there are no previously labeled locations nearby, then Joh signals Flo, indicating that Flo’s current location should be labeled. Although an arbitrary signal could be generated and communicated to serve as a common labeling icon for the location, in this specific case no signal is necessary. Because there are only two robots, the time ordering of the labeling procedures is identical for each. Hence, a time-ordered sequence number maintained by each robot serves as the labeling icon with a shared grounding. The first location is labeled ‘1st Label’, the next ‘2nd Label’, etc. If Joh receives a confirmation signal from Flo, it associates the label icon with Flo’s current location. Joh calculates Flo’s location based on its own location and a calculation of Flo’s range from visual tracking information. Flo also labels its own location index in the same way. This procedure creates an indexical representation of specific locations that are associations between a location detector icon and the label icon (the shared sequence number).


Figure 13. Iconic references that represent sensory data, indexical references that associate pairs of icons (a label with a location), and a symbol (see text). The fine lines between location feature detectors show their adjacency in the SOM; the pairs of arrow-headed lines from indexical references define which two icons they associate; and the two sets of arrow-headed lines from the symbol designate two ‘exemplars’ (see text).

Although the locations themselves are not represented using the same icons by Flo and Joh, they represent the same physical location. Figure 13 shows the situation after the labeling procedure has occurred four times (the symbol is explained below).
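The sketch below (ours; the handshake details and data structures are assumptions) illustrates the essence of the labeling procedure: both robots bind the same shared sequence number, each to its own icon for the physical location.

```python
# A minimal sketch (ours) of the location-labeling handshake. Each robot
# binds the next shared sequence number to its own location estimate;
# the sequence number is the shared labeling icon.

class Robot:
    def __init__(self, name):
        self.name = name
        self.labels = {}        # sequence number -> own location estimate
        self.next_label = 1

    def bind_label(self, location):
        self.labels[self.next_label] = location
        self.next_label += 1

joh, flo = Robot("Joh"), Robot("Flo")

def label_location(flo_pos_seen_by_joh, flo_own_location_index):
    """Joh signals Flo; on confirmation both bind the same label number."""
    joh.bind_label(flo_pos_seen_by_joh)      # from visual tracking
    flo.bind_label(flo_own_location_index)   # Flo's own location FD index

label_location((2.1, 0.9), 17)   # '1st Label'
label_location((4.0, 3.2), 42)   # '2nd Label'
print(joh.labels[1], flo.labels[1])  # same place, different own icons
```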

3.6. A Symbol for a Relationship Between Locations

The next step is to endow both Joh and Flo with the ability to represent an arbitrary location in relationship to already-known locations. Recall that a symbol is defined as a relationship between other symbolic, indexical and iconic references.

Ideally, symbols should be learnt, as in biological systems. The relationship a symbol represents is a generalization from a set of observed ‘exemplars’—specific relationships between other symbols, indices and icons. How this can be accomplished is still an open research area. For this reason, and because we only need a single symbol that will not be referenced by higher-level symbols, we chose to simply provide the necessary relationship. We can consider the symbol a ‘first-level symbol’, as it is not dependent on any other symbols, but grounded directly in iconic and indexical representations. As symbol systems go, ours is as impoverished as it can be.

The relationship represented by the symbol is between two known location indices and a distance and orientation in terms of wheel encoder data. The two known locations define a line segment that provides an origin for position and orientation. The wheel encoder data then provides a distance and orientation relative to this, which together define a unique location (see Fig. 14). For example, a pile could be specified as being approximately 5 m away from the 2nd labeled location at an angle of 30° relative to the direction of the 1st labeled location from the 2nd. The top of Fig. 13 shows the symbol in the context of the overall system.
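To make the geometry concrete, here is a minimal Python sketch (ours, not the authors' implementation) that recovers a location from the relationship. It abstracts the raw wheel encoder data as a metric distance and angle, and assumes 2-D positions in the robot's own odometric frame.

```python
import math


def location_from_symbol(label_a, label_b, distance, angle):
    """Recover the unique location defined by two labeled locations plus
    a distance and orientation relative to the segment between them."""
    # The segment's direction (from label_b toward label_a) provides the
    # reference orientation; label_b provides the origin.
    base = math.atan2(label_a[1] - label_b[1], label_a[0] - label_b[0])
    heading = base + angle
    return (label_b[0] + distance * math.cos(heading),
            label_b[1] + distance * math.sin(heading))


# The example from the text: a pile roughly 5 m from the 2nd labeled
# location, at 30 degrees relative to the direction of the 1st label
# from the 2nd (the label coordinates here are invented for illustration).
first, second = (0.0, 0.0), (4.0, 0.0)
pile = location_from_symbol(first, second, 5.0, math.radians(30.0))
```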

3.7. Symbolic Communication

Finally, we are in a position to see how a location can be symbolically communicated from Flo to Joh. With a particular pile location in mind, Flo first calculates the representation for it using the symbolic relationship above. It selects the two closest locations, previously labeled, as the indexical references and computes the corresponding iconic wheel encoder data that will yield the desired pile location. This information is then signaled to Joh by signaling the labels for each of the known locations in turn, followed by the raw encoder data. This signal is grounded in both robots, as the labels were grounded through the location labeling procedure, and the wheel encoders are a shared sense. Hence, the meaning is preserved. Joh can recover the location by re-grounding the labels and reversing the computation.


Figure 14. Schematic of the ⟨specific-geometric-relation-between⟩ symbol used to communicate locations.

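Continuing the sketch above (again ours, with a metric distance and angle standing in for the raw encoder data), the exchange could look as follows. Each robot holds its own labels list, indexed by the shared sequence numbers, so only the two indices, the distance and the angle need to be signaled.

```python
import math


def encode_pile_location(pile_xy, labels):
    """Flo's side: pick the two closest previously labeled locations as
    the indexical references and compute the distance/angle that will
    yield the pile location. Returns the content of the signal."""
    i, j = sorted(range(len(labels)),
                  key=lambda k: math.dist(pile_xy, labels[k]))[:2]
    a, b = labels[i], labels[j]
    base = math.atan2(a[1] - b[1], a[0] - b[0])  # direction from b toward a
    distance = math.dist(pile_xy, b)
    angle = math.atan2(pile_xy[1] - b[1], pile_xy[0] - b[0]) - base
    return i, j, distance, angle


def decode_pile_location(i, j, distance, angle, labels):
    """Joh's side: re-ground the labels in its *own* label table and
    reverse the computation (location_from_symbol is the helper from
    the earlier sketch)."""
    return location_from_symbol(labels[i], labels[j], distance, angle)
```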

Figure 15. Typical trajectories during cooperation.

3.8. Results

The typical trajectories in Fig. 15 show that Joh is able to successfully vacuum the litter in the pile to the left.


Figure 16. Performance of layered cleaning solutions.

This occurs after the location of the pile has been communicated symbolically by Flo. The pile was initially obscured by the cardboard box, but Joh was able to correctly compute its location and plan a path around the box using its map. This can be contrasted with the layer 3 solution shown in Fig. 6, where no symbolic communication or map was utilized. If the box were blocking the straight-line path to the litter pile in that case, Joh would not have been able to navigate to within visual range to locate it.

As the system was not designed as a floor cleaning system per se, rigorous experiments to record its cleaning performance were not conducted. However, we did run experiments that suggest the addition of symbolic communication does improve cleaning performance. We expect this intuitively, as the governing factor in vacuuming performance is the path length between litter piles. The ability to navigate purposively from one known litter pile location to the next, instead of having to rely on an obstacle-free path or chance discovery of the pile locations, shortens the average path length.

We also ran experiments utilizing each layer in turn (including the lower ones on which it builds). We recorded the percentage of the floor cleaned every two minutes from 3–15 minutes. It was difficult to run all of the experiments consistently for more than 15 minutes due to problems with hardware reliability. The results are plotted in Fig. 16. Initially, about 30% of the 'litter' was distributed around the perimeter and the remainder scattered approximately uniformly over the rest of the floor. The percentage cleaned was estimated by dividing the floor into a grid and counting how many tiles had been cleaned.

Clearly, the addition of each layer improves the cleaning performance. In particular, layer 4, utilizing symbolic communication, initially falls behind as some time is used to perform location labeling rather than cleaning. This starts to pay off later, after a number of locations have been labeled.

This experiment also shows the robustness gained by layering the solution. The implementation of layer 1 is robust due to its simplicity. If any of the mechanisms employed in the subsequent layers were to fail, we have demonstrated that the system will continue to perform the cleaning task, although not as quickly.

4. A Critical Examination

4.1. The Limitations of Our System

There are two obvious limitations to the approach we have described for developing grounded symbolic communication between robots. The first is that the common process by which a shared symbol grounding is developed is the design process. That is, the shared grounding was established by identical design and implementation of the mechanism for its interpretation. This is an impractical way to develop sophisticated symbol systems, as the mechanism for the interpretation of each symbol must be designed in turn.



Is this just a practicality problem, or is it impossible in principle? When we designed the system, we believed that it was possible, if impractical, to build general symbol systems in this way: by explicitly designing the process of interpretation for each symbol. We hypothesized that all that was missing was a mechanism to learn the symbolic representations, to effectively automate the process. However, we argue below that it is in fact impossible in principle (for all but the simplest symbol systems, like the one presented).

The second obvious limitation is a related one. The approach doesn't include a mechanism for learning new symbols, even if it had an existing symbol repertoire designed in.

4.2. Symbols Revisited

Our definition of grounded from Section 2.2 contained a hidden assumption. We defined a symbol to be grounded if its interpretation required following references that all eventually lead to icons. Recall that the structure formed by symbols and their relationships to each other and to indices and icons is a linguistic one. It is the empirically observed structure of the signals that humans generate and interpret. This grammatical structure of spoken and written language is a relatively persistent one (ignoring the fact that languages change slowly over time). The hidden assumption, which we now believe to be incorrect, was that this somehow implies that a similarly persistent analogous structure must be present within the mind of the humans that generate signals conforming to the structure. That is, just because there is a relatively persistent symbolic system present in human cultural artifacts, such as books, paintings, buildings, music, etc., this does not imply that any symbol system persists within the human mind. It was with this invalid assumption that we proceeded to construct just such a symbol system within the robots, by representing and relating icons, indices and symbols.

We believe there is ample evidence that no persistent symbolic structure mirroring the structure of human language exists within the human mind, although this remains to be established. Dennett has argued strongly against the idea of a Cartesian theater, a place in the mind where all the distributed information is integrated for a central decision-maker (Dennett, 1993). It seems that distributed information about the external world (possibly contradictory) need not be integrated unless a particular discrimination is necessary for performance (for example, to speak or behave). Even then, only the information necessary for the discrimination need be integrated.

Even if humans don’t use the equivalent of a persis-tent cognitive grammar to reason about the world, whycan’t robots use one?

4.3. Symbolic Representation is Not Situated

A symbol represents a discrete category in the continuous space of sensory-motor experience. Hence it defines a boundary such that points in the space lie either within the category or outside of it; there are no gray areas. Therefore, a symbol system is a way of characterizing sensory-motor experience in terms of membership of the categories it defines. Symbols derive their power by conferring a degree of independence from the context-dependent, dynamic and situated experiences from which they are learnt. This allows symbolic communication to preserve its meaning when the interaction is extended in time (e.g., the period between these words being written and you reading them).
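A trivial, hypothetical example makes the point: a symbol in this sense is just a crisp membership predicate over the sensory space, with a static boundary.

```python
def near_wall(sonar_range_m):
    """A symbol as a discrete category: every sensory point is either
    inside the category or outside it; there are no gray areas."""
    return sonar_range_m < 0.3  # an arbitrary fixed boundary (our choice)
```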

Suppose we build a robot for a particular task that necessitates symbolic communication, and endow it with a symbolic representation system according to the approach we have outlined, whereby static symbol groundings are designed in. The robot is situated in the sense that the task for which it is designed provides a context for its interaction with the environment (from the theory of situated action; Mills, 1940; Suchman, 1987). The robot is an embodied agent and has a grounded symbol system. It satisfies the criteria of the physical grounding hypothesis (Brooks, 1990).

We argue that this approach to building a robot will not necessarily work, except in the simplest cases. The task in which the robot is situated dictates the discriminations it must make in order to behave appropriately; it must behave in terms of its affordances (Gibson, 1986). Since the discriminations it can make are determined by the categories defined by its symbol system, which is necessarily static, it will only work if the task is very specific, ensuring the appropriate discriminations don't change. This is precisely the situation in which our system operates: in the situated context defined by a statically specified cleaning task.


A robot capable of operating flexibly in a dynamic situated context must continually adapt the discriminations it makes. If it uses a symbolic representation system, this implies the categories defined by the symbols, and hence the meaning of the symbols themselves, must change.3 However, a dynamic symbol system loses its power for communication, one of the main reasons for endowing the robot with a symbol system in the first place.

Consequently, a robot that utilizes a static symbolic representation system (like the one we presented) cannot be situated if its task is to behave flexibly in a dynamic context. Hence, our approach of designing in the robot's symbol groundings does not scale from systems designed to achieve simple specific tasks to more general flexible behavior.

We also see a more pragmatic way in which larger symbol systems built via our approach can become unsituated. In order to manage complexity in the design process, we often structure a system by categorizing and applying linguistic labels to design components (i.e., we need to name elements of our designs). Although this activity is logically independent from the way the system functions, the anthropocentric groundings we use in our interpretation of the linguistic labels inevitably affect the design.

For example, by naming a behavior component WallFollowing, we may accidentally allow hidden assumptions from our understanding of 'walls' to come into play, despite being aware of this pitfall. If the robot possesses anything that could be called a concept for a 'wall', it is surely impoverished compared to our human understanding of 'walls'. We contend that avoiding this pitfall becomes harder, to the point of practical impossibility, as the symbol systems become more complex and the discrepancy between our labels and the robot's representations grows.

4.4. Adaptive Symbol Grounding Hypothesis

There is increasing evidence that humans do not reason about the world and behave using symbolic representations (Hendriks-Jansen, 1996 provides a thorough argument). Instead, like other biological systems, we represent4 the world in terms of changing affordances, dictated by our situatedness. We make only the discriminations necessary to behave appropriately. The symbols we use to communicate seem to be generated during language production and interpretation by a dynamic process that grounds them in our adaptive internal representations while preserving their static, public, statistically persistent meaning. Hence, the symbols we generate are influenced by our situated representations during production, and they have the power to influence those representations during interpretation. The symbolic representations themselves are only transient. We refer to this conception as the Adaptive Symbol Grounding Hypothesis.

By this conception, we envisage the process of learning new concepts as follows. A process within the emitter wishing to communicate a new concept dynamically generates a transient symbolic representation that best approximates it, by matching the internal representation with learnt static linguistic symbol relationships. This structure is reflected in the signal. The interpretation process within the receiver causes a similar transient symbolic structure to emerge. Again, an approximate match is made between the symbolic structure and the internal representation, which influences the representations. In this case, the influence causes a new concept to be discovered. The symbolic structure provides the scaffolding necessary to get the receiver thinking in the right way to discover the new concept.
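Purely as an illustration of this hypothesized flow (not an implementation of it), the following sketch treats symbol groundings as prototype vectors that are matched transiently during production and nudged by use during interpretation; the names and the update rule are our assumptions.

```python
import math


def nearest_symbol(concept, lexicon):
    """Emitter: generate a transient symbol by matching the situated
    internal representation against learnt symbol prototypes."""
    return min(lexicon, key=lambda w: math.dist(concept, lexicon[w]))


def reground(word, lexicon, concept, rate=0.1):
    """Receiver: the symbol's prototype influences the internal
    representation, and the grounding itself adapts with use."""
    proto = lexicon[word]
    # Influence the situated representation toward the prototype...
    concept = [c + rate * (p - c) for c, p in zip(concept, proto)]
    # ...and track long-term common usage by nudging the prototype.
    lexicon[word] = [p + rate * (c - p) for p, c in zip(proto, concept)]
    return concept
```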

So the essential points of the Adaptive Symbol Grounding Hypothesis can be summarized as follows.

• The persistent relationships between icons, indices and symbols that comprise the hierarchical structure of language (e.g., grammar) are only observed in the communicated signals.
• Agents engaging in symbolic communication do not need to maintain an explicit representation analogous to the symbolic structure of the language.
• Symbol grounding is transient and adaptive. Explicit symbolic representations and their situated groundings only persist during the generation and interpretation of the signals of symbolic communication. The specific groundings with which icons for particular symbols are associated depend upon a history of use. The mapping adapts both to the immediate context and to track long-term common usage within a community of language users.

4.5. Implications for Cooperative Robotics

In the future, we will require increasingly complex tasks to be carried out by multi-robot teams. Hence, the behavioral sophistication of the individual robots will be greater. If we wish to engineer multi-robot systems that can cooperate in complex ways, they will eventually require symbolic communication.



The Adaptive Symbol Grounding Hypothesis implies that all symbols are learnt. Hence, we advocate the ubiquitous use of learning in engineering all robotic systems. Without it, we don't believe symbolic communication of significance is possible.

Multi-robot systems are usually classified as either homogeneous or heterogeneous. This classification is usually based upon physical attributes, such as sensors and actuators, but can equally be applied to the computational and behavioral ability of the robots. A robot system is classified as heterogeneous if one or more agents are different from the others. Balch proposes a metric to measure the diversity in multi-robot systems, which he calls social entropy; it also recognizes physically identical robots that differ only in their behavioral repertoire (Balch, 1997).

If robots are engineered with an emphasis on learning and are consequently more a product of their experience, as we suggest above, then even physically homogeneous teams will have significant social entropy. The teams will necessarily be heterogeneous in terms of their representation of the world and hence behavior. Therefore, we don't envisage homogeneous multi-robot systems playing a large role in the cooperative robotics domain in the long term.

5. Summary

In the first part of the paper, we defined what we mean by grounded and provided a framework for talking about symbols in terms of indexical and iconic references. We also introduced the classification scheme for communication involving the characteristics interaction distance, interaction simultaneity, signaling explicitness and sophistication of interpretation. We discussed cooperation and communication in bacteria, ants, wolves, primates and humans in these terms to deduce some prerequisites for symbolic communication.

If we are not interested in preserving the meaning of a signal between emitter and receiver, then the implementation is straightforward. If we wish to preserve meaning, then we have to ensure a shared grounding between the agents. In the case of iconic representations, as they are essentially grounded directly in sensory information, this can only be ensured if the sensors are identical between the agents. In the case of indexical and symbolic representations, a specific mechanism for establishing a shared grounding is needed. For indexical representations, an empirical demonstration can serve to ground them to appropriate icons. The location labeling procedure we implemented on our robots takes this form.

We described the implementation of the cooperative cleaning system, including the spreading activation action-selection mechanism and purposive navigation, in order to provide an understanding of the communication mechanism. The symbolic communication relies on:

• the shared grounding of icons through common sensors,
• the shared grounding for locations, developed through a specific process (the location labeling behavior), and
• the shared grounding for the symbol representing a specific relationship between locations, provided by design.

In the final part of the paper, we critically examined the system and its limitations. Specifically, one obvious limitation is that the system only contains a single symbol, and it was provided at design time, with no mechanism for learning further symbols. By looking again at the notion of a symbol we were able to understand that this approach cannot scale to larger symbol systems.

We argued that situated, embodied agents cannot use symbolic representations of the world to interactively behave in it. The Adaptive Symbol Grounding Hypothesis was introduced as an alternative conception for how symbol systems might be used in situated agents. Finally, we concluded that symbol grounding must be learnt. Consequently, we advocate the ubiquitous use of learning in heterogeneous multi-robot systems, because without it symbolic communication is not possible. We believe this would be a severe limitation to the sophistication of cooperation in the future.

Acknowledgments

This research was funded primarily by The University of Wollongong and The Australian National University, and also in part by the Engineering Research Program, Office of Basic Energy Sciences, of the U.S. Department of Energy, under contract DE-AC05-00OR22725 with UT-Battelle, LLC.


The submitted manuscript has been authored by a contractor of the U.S. Government under the above contract. Accordingly, the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes.

Notes

1. The stereotypical way a particular chemical receptor on the bacterium's surface triggers a chain of chemical events within is an icon for the presence of the external chemical signal.

2. See also http://pobox.com/~david.jung/thesis.html.
3. It may be possible in principle for an agent to use a static symbol system that covers all possible categorizations and hence accommodates any possible discrimination needed for appropriate behavior in any situated context. However, we dismiss this as impossible in practice due to computational intractability.

4. We do not mean to imply that biological agents represent the world to themselves. Of course, any observation of the internal states of an agent can be said to represent something: if we as scientific observers interpret it, it represents something to us.

References

Arkin, R.C. and Hobbs, J.D. 1992b. Dimensions of communication and social organization in multi-agent robotic systems. In Proc. Simulation of Adaptive Behavior 92, Honolulu, HI.

Balch, T. 1997. Social entropy: A new metric for learning multi-robot teams. In Proc. 10th International FLAIRS Conference (FLAIRS-97).

Balch, T. and Arkin, R.C. 1994. Communication in reactive multi-agent robotic systems. Autonomous Robots, 1:27–52.

Bond, A.H. 1996. An architectural model of the primate brain. Dept. of Computer Science, University of California, Los Angeles, CA 90024-1596.

Brooks, R.A. 1990. Elephants don't play chess. Robotics and Autonomous Systems, 6:3–15.

Brooks, R.A. 1991. Intelligence without reason. MIT AI Lab Memo 1293. Prepared for Computers and Thought, IJCAI-91.

Bruner, J.S. 1982. The organisation of action and the nature of adult-infant transaction. In The Analysis of Action, M. von Cranach and R. Harré (Eds.), Cambridge University Press: Cambridge, pp. 313–328.

Cao, Y.U., Fukunaga, A.S., Kahng, A.B., and Meng, F. 1995. Cooperative mobile robotics: Antecedents and directions. IEEE 0-8186-7108-4/95.

Cheney, D.L. and Seyfarth, R.M. 1990. How Monkeys See the World, University of Chicago Press.

Cheng, G. and Zelinsky, A. 1996. Real-time visual behaviours for navigating a mobile robot. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vol. 2, p. 973.

Crespi, B.J. and Choe, J.C. (Eds.) 1997. The Evolution of Social Behaviour in Insects and Arachnids, Cambridge University Press: Cambridge.

Deacon, T. 1997. The Symbolic Species: The Co-Evolution of Language and the Human Brain, Penguin Books. ISBN 0-713-99188-7.

Dennett, D.C. 1993. Consciousness Explained, Penguin Books. ISBN 0-14-012867-0.

Dudek, G., Jenkin, M., Milios, E., and Wilkes, D. 1993. A taxonomy for swarm robots. In Proc. International Conference on Intelligent Robots and Systems (IROS), San Francisco, CA, pp. 1151–1156.

Ford, K. and Hayes, P. (Eds.) 1991. Reasoning Agents in a Dynamic World: The Frame Problem, JAI Press.

Gibson, J.J. 1986. The Ecological Approach to Visual Perception, Lawrence Erlbaum Associates: London. ISBN 0-89859-958-X.

Hendriks-Jansen, H. 1996. Catching Ourselves in the Act, A Bradford Book, MIT Press: Cambridge, Massachusetts. ISBN 0-262-08246-2.

Johnson, M. 1991. Knowing through the body. Philosophical Psychology, 4:3–18.

Jung, D. 1998. An architecture for cooperation among autonomous agents. Ph.D. Thesis, Intelligent Robotics Laboratory, University of Wollongong, Australia. For further information on this research see the web site http://pobox.com/~david.jung/thesis.html.

Jung, D., Heinzmann, J., and Zelinsky, A. 1998. Range and pose estimation for visual servoing on a mobile robotic target. In Proc. IEEE International Conference on Robotics and Automation (ICRA), Leuven, Belgium, Vol. 2, pp. 1226–1231.

Jung, D. and Zelinsky, A. 1996. Whisker-based mobile robot navigation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vol. 2, pp. 497–504.

Jung, D. and Zelinsky, A. 1999a. An architecture for distributed cooperative planning in a behaviour-based multi-robot system. Journal of Robotics and Autonomous Systems (RA&S), 26:149–174.

Jung, D. and Zelinsky, A. 1999b. Integrating spatial and topological navigation in a behavior-based multi-robot application. In Proceedings of the International Conference on Intelligent Robots and Systems (IROS99), Kyongju, Korea, pp. 323–328.

Kaiser, D. and Losick, R. 1993. How and why bacteria talk to each other. Cell, 73(5):875–885.

Kohonen, T. 1990. The self-organising map. Proceedings of the IEEE, 78(9):1464–1479.

Kube, C.R. and Zhang, H. 1994. Collective robotics: From social insects to robots. Adaptive Behaviour, 2(2):189–218.

Lakoff, G. and Johnson, M. 1980. Metaphors We Live By, University of Chicago Press: Chicago.

Maes, P. 1991. Situated agents can have goals. In Designing Autonomous Agents, P. Maes (Ed.), MIT-Bradford Press. ISBN 0-262-63135-0. Also published as a special issue of the Journal for Robotics and Autonomous Systems, Vol. 6, No. 1, North-Holland.

Mallot, H.A. 1995. Layered computation in neural networks. In The Handbook of Brain Theory and Neural Networks, M.A. Arbib (Ed.), MIT Press, Bradford Books, p. 513. ISBN 0-262-01148-4.



Mataric, M.J. 1992a. Integration of representation into goal-driven behavior-based robots. IEEE Transactions on Robotics and Automation, 8(3):304–312.

Mataric, M.J. 1992b. Behavior-based systems: Key properties and implications. In Proceedings, IEEE International Conference on Robotics and Automation, Workshop on Architectures for Intelligent Control Systems, Nice, France, pp. 46–54.

Mataric, M.J. 1997. Using communication to reduce locality in distributed multi-agent learning. In Proceedings, AAAI-97, Providence, Rhode Island, July 27–31, pp. 643–648.

Mills, C.W. 1940. Situated actions and vocabularies of motive. American Sociological Review, 5:904–913.

Newell, A. and Simon, H.A. 1972. Human Problem Solving, Prentice-Hall, Inc.: Englewood Cliffs, NJ.

Parker, L.E. 1995. The effect of action recognition and robot awareness in cooperative robotic teams. IEEE 0-8186-7108-4/95.

Pasteels, J.M., Deneubourg, J., and Goss, S. 1987. Self-organization mechanisms in ant societies: Trail recruitment to newly discovered food sources. In From Individual to Collective Behavior in Social Insects, J.M. Pasteels and J. Deneubourg (Eds.), Birkhäuser Verlag: Basel.

Pfeifer, R. 1995. Cognition: Perspectives from autonomous agents. Robotics and Autonomous Systems, 15:47–70.

Pylyshyn, Z.W. (Ed.) 1987. The Robot's Dilemma: The Frame Problem in Artificial Intelligence, Theoretical Issues in Cognitive Science, Vol. 4, Ablex Pub. Corp.

Robins, R.H. 1997. A Short History of Linguistics, 4th ed. (Longman Linguistics Library), Addison-Wesley Pub. Co. ISBN 0582249945.

Shepard, R.N. and Cooper, L.A. 1982. Mental Images and their Transformations, MIT Press/Bradford: Cambridge.

Stains, H.J. 1984. Carnivores. In Orders and Families of Recent Mammals of the World, S. Anderson and J.K. Jones, Jr. (Eds.), John Wiley and Sons: NY, pp. 491–521.

Steels, L. 1996. The origins of intelligence. In Proceedings of the Carlo Erba Foundation, Meeting on Artificial Life, Fondazione Carlo Erba: Milano.

Suchman, L.A. 1987. Plans and Situated Actions: The Problem of Human-Machine Communication, Cambridge University Press: Cambridge.

Thelen, E. and Smith, L.B. 1994. A Dynamic Systems Approach to the Development of Cognition and Action, A Bradford Book, MIT Press. ISBN 0-262-20095-3.

Wilson, E.O. 1971. The Insect Societies: Their Origin and Evolution, Harcourt, Brace & Co: New York.

Wilson, E.O. 1975. Sociobiology: The New Synthesis, Harvard University Press.
Yuta, S., Suzuki, S., and Iida, S. 1991. Implementation of a small size experimental self-contained autonomous robot: sensors, vehicle control, and description of sensor based behavior. In Proc. Experimental Robotics, Toulouse, France, LAAS/CNRS.

Zelinsky, A., Kuniyoshi, Y., and Tsukue, H. 1993. A qualitative approach to achieving robust performance by a mobile agent. Robotics Society of Japan Conference, Japan.

David Jung received his Ph.D. from the Intelligent Robotics Laboratory, University of Wollongong, Australia, in 1998, for work on cooperative multi-robot systems. He conducted much of this research at the Robotic Systems Laboratory at the Australian National University. He obtained his Honours degree in Computer Science from Adelaide University, Australia in 1991, after spending some time working in industry and studying software engineering at the University of South Australia. He is currently employed by UT-Battelle, LLC, and sponsored by the US Department of Energy at the Oak Ridge National Laboratory in Tennessee. He is working with cooperative multi-robot systems in the Computer Science and Mathematics Division. His interests include learning and the developmental approach to robotics using the tools of dynamical systems theory.

Dr. Jung is also a member of the IEEE Computer Society and the Planetary Society.

Alexander Zelinsky worked for BHP Information Technology as a Computer Systems Engineer for six years before joining the University of Wollongong, Department of Computer Science, as a Lecturer in 1984. After joining Wollongong University he became an active researcher in the robotics field, and he obtained his Ph.D. in Robotics in 1991. He spent three years (1992–1995) working in Japan as a research scientist at Tsukuba University and the Electrotechnical Laboratory.

Since October 1996 he has headed the Robotics research program in the Research School of Information Sciences and Engineering at the Australian National University. In January 2000 he was promoted to Professor and Head of the Department of Systems Engineering. Prof. Zelinsky is continuing his research in mobile robotics, multiple cooperating robots and human-robot interaction. He is a member of the IEEE Robotics and Automation Society and is currently President of the Australian Robotics and Automation Association.