Robust Adaptive Coverage Control for Robotic Sensor Networks

Mac Schwager, Member, IEEE, Michael P. Vitus, Member, IEEE, Samantha Powers, Daniela Rus, Fellow, IEEE, and Claire J. Tomlin, Fellow, IEEE

2325-5870 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2015.2512326, IEEE Transactions on Control of Network Systems.

Abstract—This paper presents a distributed control algorithm to drive a group of robots to spread out over an environment and provide adaptive sensor coverage of that environment. The robots use an on-line learning mechanism to estimate the areas in the environment that require more concentrated sensor coverage, while simultaneously providing this coverage. The algorithm differs from previous approaches in that provable robustness is emphasized in the learning mechanism, and decentralization over a communication network is emphasized in the control. To achieve a provable bound on the learning error, the robots explore the environment to gather sufficient data. They then switch to a Voronoi-based coverage controller to position themselves for sensing. Robots coordinate their learning, control, and mode switching in a fully distributed fashion over their mesh network. It is proved that the robots approximate the weighting function with a known bounded error, and that they converge to locations that are locally optimal for sensing. Simulations and multiple experiments with six iRobot Create robots are presented to illustrate the performance of the method.

I. INTRODUCTION

IN this paper, we present a distributed control algorithm to drive a group of robots to explore an unknown environment, while providing adaptive sensor coverage of interesting areas within the environment. This algorithm can drive a team of robots to perform a range of sensing tasks, such as search and rescue missions, environmental monitoring, or automatic surveillance of rooms, buildings, or towns. Our main emphasis in this work over previous works is in providing provable robustness in learning the areas of importance in the environment. We also carefully consider the details of decentralizing the algorithm over a wireless mesh network. Specifically, we consider a function, which we call the weighting function, to capture the fact that some areas of the environment are more important than others. Our algorithm drives the robots to explore the environment, thereby collecting the necessary information to learn the weighting function with a provably bounded maximum error, and finally to move to positions that provide locally optimal sensor coverage over the environment, as illustrated in Fig. 1. The coordination between the robots is carried out in a decentralized way over their network. We provide experimental results with six iRobot Creates as well as simulated results using experimentally collected data to demonstrate the algorithm.

This work was funded in part by ONR MURI Grants N00014-07-1-0829, N00014-09-1-1051, and N00014-09-1-1031, and the SMART Future Mobility project. We are grateful for this financial support.

M. Schwager is with the Department of Aeronautics and Astronautics, Stanford University, Stanford, CA 94305, USA, [email protected]. M. P. Vitus and C. J. Tomlin are with the Department of Electrical Engineering and Computer Sciences, UC Berkeley, Berkeley, CA 94708, USA, {vitus,tomlin}@eecs.berkeley.edu. S. Powers and D. Rus are with the Computer Science and Artificial Intelligence Lab, MIT, Cambridge, MA 02139, USA, [email protected], [email protected].

Fig. 1. As shown by the robot trajectories in 1(a), our coverage algorithm drives the robots to fully explore the environment. Different robots are indicated by different line styles and colors. This leads to a provably robust reconstruction of the weighting function, as shown in 1(b). The robots then switch asynchronously into a coverage mode to provide provable sensor coverage over the environment. (a) Robot Trajectories. (b) Estimated Weighting Function.

This work is motivated by the need in many applications to have a provable bound on the estimation quality of the weighting function, giving a guarantee on the quality of the coverage provided by the sensing robots. As an example, consider the earthquake and consequent tsunami that struck the northern coast of Japan in March 2011, causing serious damage to the nuclear reactors at Fukushima. Following the disaster, human workers were critically limited in their ability to assess the radiation levels in the area, and braved the risk of dangerous radiation exposure as a result. In this example, the unknown weighting function represents the radiation levels in the area. Had the technology been available, a team of robots could have been deployed to monitor the levels of radiation around the reactor. Using the algorithm proposed in this paper, sensing robots would learn the radiation levels, and then concentrate on areas where the radiation was most dangerous. Most importantly, a rigorous bound on the precision of the radiation estimate could be established a priori. This information could be used to direct human workers, ensuring that they remain safe from the danger of radiation exposure. If existing coverage algorithms were used, without the robustification proposed in this paper, the radiation levels could be approximated arbitrarily poorly, with large errors in regions not visited by the robots. In pathological cases, the approximation could even become unstable, providing unbounded radiation estimates, and leading to saturated control inputs for the robots. Our robust adaptive coverage algorithm prevents these effects and leads the robots to converge to



locally optimal positions, and to approximate the weighting function with a provable maximum error.

Sensor coverage algorithms have received a great deal of attention in recent years. Cortes et al. [1] proposed a distributed control algorithm for finding a locally optimal sensing configuration for a group of sensing robots, adapting concepts from locational optimization [2], [3]. There have been several extensions to this formulation of coverage control. The paper [4] used a distributed interpolation scheme to recursively estimate the weighting function so that it did not have to be known a priori. In [5], the robots were assumed to have a limited sensing or communication range. Pimenta et al. [6] incorporated heterogeneous robots, and extended the algorithm to handle nonconvex environments. Other extensions to nonconvex environments were proposed in [7], [8]. Similar frameworks have also been formulated for a stochastic setting [9]–[11]. Most relevant to our work, [12] removed the requirement of knowing the weighting function a priori by learning a basis function approximation of the weighting function on-line. This strategy has provable convergence properties, but requires that the weighting function lie in a known set of functions (the set spanned by the basis functions). If the function does not lie in this known set, the algorithm from [12] can lead to a divergent weighting function approximation and saturated control inputs for the robots; or, conversely, it may cause the robots to remain in a local neighborhood of their initial positions, resulting in a poor global approximation of the weighting function. The purpose of the present work is to guarantee a high-quality estimate of the weighting function, and a locally optimal coverage configuration for the robots, regardless of the weighting function.

The algorithm we propose here drives the robots to explore the entire space, which enables them to learn the weighting function with provable robustness. Specifically, the contributions of this paper are as follows:

1) We propose a new distributed motion control algorithm in which the robots proceed asynchronously through three different modes (partition, explore, and cover), by coordinating over their communication network.

2) We introduce an online distributed function approximation algorithm with a dead-zone based robustification mechanism that the robots use to learn the weighting function.

3) We prove that the combination of the motion control algorithm and the robust function approximation algorithm leads to (a) the robots learning the weighting function with bounded maximum error, and (b) the robots approaching final positions that are locally optimal for coverage.

4) We show results of hardware experiments with the algorithm running on six iRobot Creates, as well as simulation results with experimentally collected sensor data.

The robots first partition the environment in partition mode using a distributed K-means clustering algorithm. They then asynchronously switch to perform a distributed exploration of the environment using an approximate Traveling Salesperson Problem (TSP) solver in explore mode. Finally, they asynchronously switch to a distributed cover mode in which the robots move to a centroidal Voronoi configuration, which is locally optimal for sensing. The switching between these modes is accomplished with a network algorithm that ensures that the last robot finishes one mode before the first robot starts the next. Concurrently with these partition, explore, and cover modes, the robots use an on-line learning algorithm to approximate the weighting function from sensor measurements. Since we do not assume the robots can perfectly approximate the weighting function, the parameter adaptation law for learning this weighting function uses a dead-zone modification to be robust to function approximation errors.
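As an illustration, the mode sequence with its barrier condition can be sketched as a simple state machine. This is a centralized stand-in for the paper's distributed network barrier, and the class and function names are our own:

```python
from enum import Enum, auto

class Mode(Enum):
    PARTITION = auto()  # distributed K-means partitioning of the environment
    EXPLORE = auto()    # approximate-TSP exploration tour
    COVER = auto()      # Voronoi-based coverage control

NEXT = {Mode.PARTITION: Mode.EXPLORE, Mode.EXPLORE: Mode.COVER}

class Robot:
    def __init__(self):
        self.mode = Mode.PARTITION
        self.done = False  # set True when this robot finishes its current mode

def try_advance(robots):
    """Advance all robots to the next mode, but only once the *last* robot
    has finished the current one (the barrier condition in the text)."""
    if all(r.done for r in robots) and robots[0].mode in NEXT:
        for r in robots:
            r.mode = NEXT[r.mode]
            r.done = False
```

In the paper this barrier is realized over the mesh network rather than by a shared variable, but the invariant enforced is the same: no robot enters explore (or cover) until every robot has completed the previous mode.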

In real-world applications the form of the weighting function is not known a priori, and if the learning algorithm does not have an explicit robustness mechanism, then it may chatter between models or even diverge [13]. Several techniques have been proposed in the adaptive control literature to handle this kind of robustness, including the dead-zone modification [14], [15], the σ-modification [13], the e1-modification [16], and the projection modification [17]. We adapt a method inspired by a dead-zone technique from adaptive control. We prove that this modification, together with the exploration phase executed by the robots, results in all robots learning a function that has bounded maximum point-wise error from the true function. Furthermore, all robots' learned functions approach a common function, and the robots converge to positions that are locally optimal for sensing with respect to the learned function.
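The dead-zone idea can be sketched in a minimal, generic form: freeze the parameter update whenever the prediction error is within the known approximation-error bound, so that irreducible approximation error cannot drive parameter drift. This is an illustrative single-measurement gradient update, not the paper's actual adaptation law (which is developed in Section III); the gain gamma and the update form here are assumptions:

```python
import numpy as np

def deadzone_update(a, K_q, phi_meas, gamma, eps_max, a_min, a_max):
    """Dead-zone gradient update for the weights a in phi_hat(q) = K(q)^T a.

    Only adapt when |error| exceeds the known approximation-error bound
    eps_max; otherwise leave a unchanged.  Finally project onto the box
    [a_min, a_max]^m.  (Illustrative form, not the paper's exact law.)
    """
    err = K_q @ a - phi_meas            # prediction error at the measured point
    if abs(err) <= eps_max:             # inside the dead zone: no update
        return a
    a_new = a - gamma * err * K_q       # gradient step on 0.5 * err**2
    return np.clip(a_new, a_min, a_max) # respect the parameter box constraints
```

The dead zone is what makes the scheme robust: small persistent errors caused by the basis mismatch φε(q) are simply ignored, instead of being integrated into unbounded parameter motion.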

There are a number of other notions of multi-robot sensor coverage different from the formulation in this paper (e.g. [18], [19] and [20], [21]). Another influential method that uses a basis function-based approximation scheme and distributed Kalman filtering is described in [22]. Other works have considered the related problem of finding peaks, or sources, in unknown scalar fields with a group of robots [23]–[25]. Some researchers have also used the gradient of mutual information as an objective for controlling robots to build estimates of environmental fields, as in [26]–[28], and to avoid hazardous regions while building a field estimate, as in [29]. In this paper we adopt the locational optimization approach to coverage control for its interesting possibilities for analysis and its compatibility with existing ideas in adaptive control [30]–[32]. We also utilize Gaussian networks for function approximation, which is a well-known technique that has been used in many other contexts, for example [33].

This paper includes an updated and expanded version of much of the material from [34], as well as new theoretical, experimental, and simulation results. In Section II we introduce notation and formulate the problem. In Section III we describe the function approximation strategy and the control algorithm, and we prove the main convergence result. Section IV gives the results of a numerical simulation with a weighting function that was determined from empirical measurements of light intensity in a room. Section V describes the hardware implementation on six iRobot Creates, and gives results from a total of 14 experimental runs in two different experimental configurations. Finally, conclusions and future work are discussed in Section VI.

II. PROBLEM FORMULATION

In this section we build a model of the robotic sensing network, the environment, and the weighting function defining areas of importance in the environment. Much of this material is background from existing work, for example [12]. We then formulate the robust adaptive coverage problem with respect to this model. The main objective is to design a distributed control and function approximation algorithm by which the robots can (i) explore the environment, (ii) build an approximation of the weighting function over the environment with provable error bounds, and (iii) move to positions that provide sensor coverage of the environment, as illustrated in Fig. 1.

Let there be n robots with positions p_i(t) in a planar environment Q ⊂ R². The environment is assumed to be compact and convex. We call the tuple of all robot positions P = (p_1, …, p_n) ∈ Qⁿ the configuration of the multi-robot system, and we assume that the robots move with integrator dynamics

$$\dot{p}_i = u_i, \qquad (1)$$

so that we can control their velocities directly through the control input u_i. We define the Voronoi partition of the environment to be V(P) = {V_1(P), …, V_n(P)}, where

$$V_i(P) = \{\, q \in Q \mid \|q - p_i\| \le \|q - p_j\|, \ \forall j \ne i \,\},$$

and ‖·‖ is the ℓ₂-norm. We think of each robot i as being responsible for sensing in its associated Voronoi cell V_i. Next, we define the communication network as an undirected graph, in which all robots whose Voronoi cells touch share an edge in the graph. This graph is known as the Delaunay graph. Then the set of neighbors of robot i is defined as N_i := {j ≠ i | V_i ∩ V_j ≠ ∅}.
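On a discretized environment, the Voronoi cells and the Delaunay neighbor sets N_i can be computed directly from the definitions above. A numerical sketch on the unit square (function names and the grid resolution are our own choices):

```python
import numpy as np

def voronoi_partition(points, G=100):
    """Discrete Voronoi partition of the unit square Q = [0, 1]^2.

    points: (n, 2) robot positions p_i.  Returns (owner, qs), where
    owner[i, j] is the index of the robot nearest to sample qs[i, j].
    """
    xs = (np.arange(G) + 0.5) / G
    qs = np.stack(np.meshgrid(xs, xs, indexing="ij"), axis=-1)          # (G, G, 2)
    d2 = ((qs[:, :, None, :] - points[None, None, :, :]) ** 2).sum(-1)  # (G, G, n)
    return d2.argmin(-1), qs

def delaunay_neighbors(owner, n):
    """Neighbor sets N_i = {j | V_i and V_j touch}, read off the discrete
    partition: robots whose cells contain grid-adjacent samples."""
    nbrs = {i: set() for i in range(n)}
    for a, b in [(owner[:-1, :], owner[1:, :]), (owner[:, :-1], owner[:, 1:])]:
        mask = a != b
        for i, j in zip(a[mask], b[mask]):
            nbrs[int(i)].add(int(j))
            nbrs[int(j)].add(int(i))
    return nbrs
```

In the paper each robot computes only its own cell and exchanges information with its Delaunay neighbors; the grid-based computation here is for illustration of the geometry only.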

We now define a weighting function over the environment, φ : Q → R₊ (where R₊ denotes the strictly positive real numbers). This weighting function is not known by the robots. Intuitively, we want a high density of robots in areas where φ(q) is large and a lower density where it is small. The weighting function represents regions of importance or interest. Examples may be radiation concentration levels, temperature, concentration of a chemical contaminant, or the density of a quantity of interest such as population density or traffic density. Finally, suppose that the robots have sensors with which they can measure the value of the weighting function at their own position, φ(p_i), with high precision (i.e. without noise), but that their quality of sensing at arbitrary points, φ(q), degrades quadratically in the distance between q and p_i. That is to say, the cost of a robot at p_i sensing a point at q is given by (1/2)‖q − p_i‖². This can be extended to non-quadratic sensing models as described in [5], but here we assume the quadratic model for notational simplicity. Since each robot is responsible for sensing in its own Voronoi cell, the cost of all robots sensing over all points in the environment is given by

$$\mathcal{H}(P) = \sum_{i=1}^{n} \int_{V_i(P)} \frac{1}{2}\,\|q - p_i\|^2\,\phi(q)\,dq. \qquad (2)$$

This is the overall objective function that we would like to minimize by controlling the configuration of the multi-robot system.¹
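For concreteness, the cost (2) can be evaluated numerically by midpoint quadrature on a grid, using the fact that each point q is sensed by its nearest robot (its Voronoi owner). A sketch on the unit square, with φ supplied as a callable (names and grid size are our own):

```python
import numpy as np

def coverage_cost(points, phi, G=80):
    """Numerically evaluate H(P) of Eq. (2) on a G x G grid over Q = [0,1]^2."""
    xs = (np.arange(G) + 0.5) / G
    qs = np.stack(np.meshgrid(xs, xs, indexing="ij"), -1).reshape(-1, 2)
    d2 = ((qs[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # ||q - p_i||^2
    nearest = d2.min(axis=1)       # each q contributes via its Voronoi owner
    area = 1.0 / G**2              # quadrature cell area
    return 0.5 * (nearest * phi(qs)).sum() * area
```

For a uniform weighting function and one robot, the cost is minimized at the center of the square; adding robots always lowers the achievable cost, which is why the gradient controller below spreads the team out.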

The gradient of H can be shown² to be given by

$$\frac{\partial \mathcal{H}}{\partial p_i} = -\int_{V_i(P)} (q - p_i)\,\phi(q)\,dq = -M_i(P)\,\big(C_i(P) - p_i\big), \qquad (3)$$

where we define

$$M_i(P) := \int_{V_i(P)} \phi(q)\,dq \quad \text{and} \quad C_i(P) := \frac{1}{M_i(P)} \int_{V_i(P)} q\,\phi(q)\,dq.$$

We call M_i the mass of the Voronoi cell i and C_i its centroid, and for efficiency of notation we will henceforth write these without the dependence on P. We would like to control the robots to move to their Voronoi centroids, p_i = C_i for all i, since from (3), this is a critical point of H, and if we reach such a configuration using gradient descent, we know it will be a local minimum. Global optimization of H is known to be computationally intractable (NP-hard for a given discretization of the environment), hence it is standard in the literature to only consider local optimality.
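For intuition, one gradient step toward the Voronoi centroids of Eq. (3) can be approximated on a grid. This sketch assumes, for illustration only, that φ is known; the paper's controller instead uses each robot's learned approximation of φ. The function name, gain, and grid size are our own:

```python
import numpy as np

def lloyd_step(points, phi, G=60, gain=1.0):
    """One step of u_i = gain * (C_i - p_i), the descent direction from
    Eq. (3), evaluated on a G x G discretization of Q = [0, 1]^2."""
    xs = (np.arange(G) + 0.5) / G
    qs = np.stack(np.meshgrid(xs, xs, indexing="ij"), -1).reshape(-1, 2)
    w = phi(qs)                                                    # phi(q) at samples
    owner = ((qs[:, None, :] - points[None, :, :]) ** 2).sum(-1).argmin(-1)
    new_pts = points.astype(float)
    for i in range(len(points)):
        cell = owner == i
        Mi = w[cell].sum()                                  # mass M_i (up to cell area)
        if Mi > 0:
            Ci = (w[cell, None] * qs[cell]).sum(0) / Mi     # centroid C_i
            new_pts[i] = points[i] + gain * (Ci - points[i])
    return new_pts
```

Iterating this map drives the team toward a centroidal Voronoi configuration, i.e. a critical point of H; with gain = 1 it is exactly a discrete Lloyd iteration.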

A. Approximate Weighting Function

Note that the cost function (2) and its gradient (3) rely on the weighting function φ(q), which is not known to the robots. In this paper we provide a means by which the robots can robustly approximate φ(q) online in a distributed way, and move to decrease (2) with respect to this approximate φ(q).

To be more precise, each robot maintains a separate approximation of the weighting function, which we denote φ̂_i(q, t). These approximate weighting functions are generated from a linear combination of m static basis functions, K(q) = [K_1(q) ⋯ K_m(q)]ᵀ, where each basis function is a radially symmetric Gaussian of the form

$$K_j(q) = \frac{1}{2\pi\sigma^2} \exp\!\left( -\frac{\|q - \mu_j\|^2}{2\sigma^2} \right), \qquad (4)$$

with fixed width σ and fixed center μ_j. Furthermore, the centers are arranged in a regular grid over Q. Each robot then forms its approximate weighting function as a weighted sum of these basis functions, φ̂_i(q, t) = K(q)ᵀa_i(t), where a_i(t) is the parameter vector of robot i. This function approximation scheme is illustrated in Fig. 2.³ Each element in the parameter vector is constrained to lie within some lower and upper bounds 0 < a_min < a_max < ∞, so that a_i(t) ∈ [a_min, a_max]^m. These bounds are set based on standard design considerations so as to prevent memory overflow and control actuator saturation. This does not limit the class of functions φ(q) that can be approximated; it only increases the minimum approximation error. Robot i's approximation of its Voronoi cell mass and centroid can then be defined as

$$\hat{M}_i(P, t) := \int_{V_i(P)} \hat{\phi}_i(q, t)\,dq \quad \text{and} \quad \hat{C}_i(P, t) := \frac{1}{\hat{M}_i(P, t)} \int_{V_i(P)} q\,\hat{\phi}_i(q, t)\,dq,$$

respectively. Again, we will drop the dependence of M̂_i and Ĉ_i on (P, t) for notational simplicity.

Fig. 2. The weighting function approximation is illustrated in this simplified 2-D schematic. The true weighting function φ(q) is approximated by robot i to be φ̂_i(q, t) = K(q)ᵀa_i(t). The basis function vector K(q) is shown as three Gaussians (dashed curves), and the parameter vector a_i(t) denotes the weighting of each Gaussian.

¹We have pursued an intuitive development of this cost function, though more rigorous arguments can also be made [35]. This function is known in several fields of study including the placement of retail facilities [3] and data compression [36].

²The computation of this gradient is more complex than it may seem, because the Voronoi cells V_i(P) depend on P, which results in extra integral terms. Fortunately, these extra terms all sum to zero, as shown in, e.g., [37].

³This is a well-known form of universal function approximator. There are several delicate choices that must be made in designing such a function approximation network, such as values for the Gaussian width σ and the number and density of basis functions to use. Issues such as these have been addressed in many previous works, for example [33].
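A sketch of the Gaussian basis network of Eq. (4) and the resulting approximation φ̂_i(q, t) = K(q)ᵀa_i(t), with centers on a regular grid over the unit square (the grid size and σ below are arbitrary illustrative choices, not values from the paper):

```python
import numpy as np

def gaussian_basis(q, centers, sigma):
    """K(q): radially symmetric Gaussian basis functions of Eq. (4).

    q: (..., 2) query points; centers: (m, 2) fixed centers mu_j.
    Returns (..., m) values K_j(q) = exp(-||q - mu_j||^2 / (2 sigma^2)) / (2 pi sigma^2).
    """
    d2 = ((q[..., None, :] - centers) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)

def phi_hat(q, a, centers, sigma):
    """Robot i's approximation phi_hat_i(q) = K(q)^T a_i."""
    return gaussian_basis(q, centers, sigma) @ a

# Centers mu_j on a regular 5 x 5 grid over Q = [0, 1]^2 (illustrative):
g = (np.arange(5) + 0.5) / 5
centers = np.stack(np.meshgrid(g, g), -1).reshape(-1, 2)
```

Each robot carries only its own parameter vector a_i(t); the basis K(q) is identical across the team, which is what allows the robots' learned functions to be compared and to approach a common function.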

We measure the difference between an approximate weighting function and the true weighting function as the L∞ function norm of their difference, so that the best approximation is given by

$$a := \arg\min_{a \in [a_{\min}, a_{\max}]^m} \ \max_{q \in Q} \ \big| K(q)^{\mathrm{T}} a - \phi(q) \big|, \qquad (5)$$

and the optimal function approximation error is given by

$$\phi_\varepsilon(q) := K(q)^{\mathrm{T}} a - \phi(q). \qquad (6)$$

It will be shown in the proof of the main theorem, Theorem 1, that the L∞ norm gives the tightest approximation bound with our proof technique. The only restriction that we put on φ(q) is that it is bounded over the environment, or equivalently, that the approximation error is bounded, |φ_ε(q)| ≤ φ_εmax < ∞. We assume that the robots have knowledge of this bound, φ_εmax. In practice, the parameter φ_εmax is essentially an error tolerance set by the user, and it must be chosen through domain expertise. Such parameters are common in adaptive control with robustification features, as described in, e.g., [30]. The theoretical analysis in the previous work [12] was not robust to function approximation errors in that it required φ_ε(q) ≡ 0. One of the main contributions here is to formulate an algorithm that is provably robust to function approximation errors. We only require that the robots have a known bound for the function approximation error, φ_εmax.

Function approximation errors can occur in real-world applications for many reasons. The field may be too complex to be represented exactly by a finite set of basis functions. For example, the field may be noisy, or the robots may have noisy sensors, leading to a "spiky" field that cannot be represented exactly without some error, as in (6). Hence one of the main motivations for our robust approach is robustness to noise in real-world applications.

Finally, we define the parameter error as $\tilde{a}_i(t) := a_i(t) - a$. In what follows we describe an online tuning law by which robot i can tune its parameters, ai, to approach a neighborhood of the optimal parameters, a. Our proposed controller then causes the robots to converge to their approximate centroids, pi → Ci for all i. An overview of the geometrical objects involved in our setup is shown in Fig. 3.


Fig. 3. A graphical overview of the quantities involved in the controller. The robots move to cover a bounded, convex environment Q; their positions are pi(t), and each robot has a Voronoi cell Vi(P) with a true centroid Ci(P) and an estimated centroid Ci(P, t). The true centroid is determined by the weighting function φ(q), which indicates the relative importance of points q in Q. The robots do not know φ(q), so they calculate an estimated centroid using an approximation φi(q, t) learned from sensor measurements of φ(q).

III. ROBUST ADAPTIVE COVERAGE ALGORITHM

In this section we describe the algorithm that drives the robots to spread out over the environment while simultaneously approximating the weighting function online. The algorithm naturally decomposes into two parts: (1) the parameter adaptation law, by which each robot updates its approximate weighting function, and (2) the control algorithm, which drives the robots to explore the environment before moving to their final positions for coverage. We describe these two parts in separate sections and then prove performance guarantees for the two working together.

Throughout this section, we assume that each robot has access to its own Voronoi cell Vi, from which it can compute the mass, centroid, and edge lengths of Vi as needed. The robots can compute Vi in a distributed way using well-known distributed algorithms [1]. These algorithms require that each robot can communicate directly with its Voronoi neighbors (which tend to be the closest neighbors in the network). In other words, we assume that the graph describing the communication network contains the Delaunay graph.

A. Online Function Approximation

The parameters ai used to calculate φi(q, t) are adjusted according to a set of adaptation laws, which are introduced below. First, we define two quantities,

$$\Lambda_i(t) = \int_{s \in \Omega_i(t)} K(s) K(s)^T \, ds, \quad (7)$$
$$\lambda_i(t) = \int_{s \in \Omega_i(t)} K(s) \phi(s) \, ds, \quad (8)$$
where $\Omega_i(t) = \{ s \mid s = p_i(\tau),\ 0 \le \tau \le t \}$ is the set of points in the trajectory of pi from time 0 to time t. The quantities in (7) and (8) can be calculated differentially by robot i using $\dot\Lambda_i(t) = K_i(t) K_i(t)^T \|\dot p_i(t)\|$ and $\dot\lambda_i(t) = K_i(t) \phi_i(t) \|\dot p_i(t)\|$ with zero initial conditions, where we introduce the shorthand notation $K_i(t) := K(p_i(t))$ and $\phi_i(t) := \phi(p_i(t))$. As previously stated,


robot i can measure φi(t) continuously in time with its sensors. Now we define another quantity
$$F_i = \frac{\int_{V_i} K(q)(q - p_i)^T \, dq \; \int_{V_i} (q - p_i) K(q)^T \, dq}{\int_{V_i} \phi_i(q) \, dq}. \quad (9)$$

Notice that Fi can also be computed by robot i, as it does not require any knowledge of the true weighting function, φ.
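The trajectory integrals (7) and (8) can be accumulated with the differential forms above. The following sketch integrates them along a straight-line path with forward Euler; the basis layout, step size, and the stand-in weighting function `phi_true` are illustrative assumptions.

```python
import numpy as np

def K(q, centers, sigma):
    """Gaussian basis vector K(q), one entry per center mu_j."""
    return np.exp(-np.sum((centers - q) ** 2, axis=1) / (2.0 * sigma ** 2))

# illustrative basis layout and a stand-in for the measured weighting function
centers = np.array([[0.25, 0.25], [0.75, 0.25], [0.25, 0.75], [0.75, 0.75]])
sigma = 0.3
def phi_true(q):
    return 1.0 + np.exp(-10.0 * np.sum((q - 0.5) ** 2))

# forward-Euler integration of Lambda_dot = K K^T ||p_dot|| and
# lambda_dot = K phi ||p_dot|| along the robot's trajectory
m = len(centers)
Lam, lam = np.zeros((m, m)), np.zeros(m)
p, v, dt = np.array([0.0, 0.0]), np.array([1.0, 1.0]), 0.01
for _ in range(100):                    # drive the diagonal of [0,1]^2
    Ki = K(p, centers, sigma)
    speed = np.linalg.norm(v)           # ||p_dot||
    Lam += np.outer(Ki, Ki) * speed * dt
    lam += Ki * phi_true(p) * speed * dt
    p = p + v * dt
print(np.linalg.eigvalsh(Lam))          # Lambda_i is symmetric PSD
```

By construction Λi is a sum of outer products, so it is symmetric positive semidefinite; it becomes positive definite only once the trajectory excites every basis function, which is the role of the explore mode below.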

The "pre" adaptation law for ai is now defined as
$$\dot{a}^{pre}_i = -\gamma B^{dz}_i(\Lambda_i a_i - \lambda_i) - \zeta \sum_{j \in N_i} l_{ij}(a_i - a_j) - k F_i a_i, \quad (10)$$
where γ, ζ, and k are positive gains, lij is the length of the shared Voronoi edge between robots i and j, and $B^{dz}_i(\cdot)$ is a dead-zone function which gives zero if its argument is below some value. We will give $B^{dz}_i$ careful attention in what follows, as it is the main tool to ensure robustness to function approximation errors. Before describing the dead zone in detail, we note that the three terms in (10) have an intuitive interpretation. The first term is an integral of the function approximation error over the robot's trajectory, so that the parameter ai is tuned to decrease this error. The second is the difference between the robot's parameters and its neighbors' parameters; this term will be shown to lead to parameter consensus, in which the parameter vectors of all robots approach a common vector. The third term compensates for uncertainty in the centroid position estimate, and will be shown to ensure convergence of the robots to their estimated centroids. A more in-depth explanation of each of these terms can be found in [12].

Finally, we give the parameter adaptation law by restricting the "pre" adaptation law so that the parameters remain within their prescribed limits [amin, amax] using a projection operator. We introduce a diagonal matrix $I^{proj}_i$ defined element-wise as
$$I^{proj}_i(j) := \begin{cases} 0 & \text{for } a_{\min} < a_i(j) < a_{\max} \\ 0 & \text{for } a_i(j) = a_{\min} \text{ and } \dot{a}^{pre}_i(j) \ge 0 \\ 0 & \text{for } a_i(j) = a_{\max} \text{ and } \dot{a}^{pre}_i(j) \le 0 \\ 1 & \text{otherwise}, \end{cases} \quad (11)$$

where (j) denotes the jth element for a vector and the jth diagonal element for a matrix. The entries of $I^{proj}_i$ are only nonzero if the parameter is about to exceed its bounds. Now the parameters are changed according to the adaptation law
$$\dot{a}_i = \Gamma \left( \dot{a}^{pre}_i - I^{proj}_i \dot{a}^{pre}_i \right), \quad (12)$$
where $\Gamma \in \mathbb{R}^{m \times m}$ is a diagonal, positive definite gain matrix. Although the adaptation law given by (12) and (10) is notationally complicated, it has a straightforward interpretation: it is of low computational complexity, and it is composed entirely of quantities that robot i can compute.
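One discrete-time step of the projected update (12), with the projection logic of (11), can be sketched as follows. The gains, bounds, and step size are illustrative assumptions, not values from the paper.

```python
import numpy as np

def adaptation_step(a, a_pre_dot, a_min, a_max, Gamma, dt):
    """One forward-Euler step of (12). Components of a_pre_dot that would
    push a parameter past its bound are zeroed, mirroring I_proj in (11)."""
    blocked = ((a <= a_min) & (a_pre_dot < 0)) | ((a >= a_max) & (a_pre_dot > 0))
    a_dot = Gamma @ np.where(blocked, 0.0, a_pre_dot)
    return np.clip(a + a_dot * dt, a_min, a_max)

a = np.array([0.1, 0.5, 1.0])            # first and last sit on the bounds
a_pre_dot = np.array([-1.0, 0.2, 1.0])   # would push both boundary entries outward
a = adaptation_step(a, a_pre_dot, a_min=0.1, a_max=1.0, Gamma=np.eye(3), dt=0.05)
print(a)   # boundary entries stay pinned; only the interior entry moves
```

The projection only disables outward motion at a bound; a pre-update that points back into the interior passes through unchanged, exactly as the zero cases of (11) allow.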

As mentioned above, the key innovation in this adaptation law compared with the one in [12] is the dead-zone function $B^{dz}_i$. We design this function so that the parameters are changed only in response to function errors that could be reduced with different parameters. Without this modification, the algorithm may appear to work in some instances, but in other cases the parameters may grow unbounded or chatter, leading to divergent robot trajectories. More specifically, the minimal function error that can be achieved is φε, as shown in (6). Therefore, if the integrated parameter error (Λiai − λi) is less than φε integrated over the robot's path, we have no reason to change the parameters. We will show that the correct form for the dead zone to prevent unnecessary parameter adaptation is
$$B^{dz}_i(x) = \begin{cases} 0 & \text{if } \|x\| < \|\beta_i\| \phi_{\varepsilon\max} \\ \Lambda_i x \left( \|x\| - \|\beta_i\| \phi_{\varepsilon\max} \right) & \text{otherwise}, \end{cases} \quad (13)$$

where $\beta_i := \int_{s \in \Omega_i(t)} K(s) \, ds$. The dead-zone function can be evaluated by robot i, since βi(t) can be computed differentially from $\dot\beta_i = K_i \|\dot p_i\|$ with zero initial conditions. With this definition, using (7) and (8), we also have that
$$B^{dz}_i(\Lambda_i a_i - \lambda_i) = B^{dz}_i(\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}), \quad (14)$$
where $\lambda_{\varepsilon_i} := \int_{s \in \Omega_i(t)} K(s) \phi_\varepsilon(s) \, ds$.
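The dead zone (13) is simple to implement. The sketch below uses illustrative values for Λi, βi, and φεmax to show both branches: errors below the achievable-accuracy threshold produce no adaptation at all.

```python
import numpy as np

def B_dz(x, Lam, beta, phi_eps_max):
    """Dead-zone operator (13): zero whenever ||x|| is below the threshold
    ||beta|| * phi_eps_max; otherwise Lam @ x scaled by the excess."""
    threshold = np.linalg.norm(beta) * phi_eps_max
    norm_x = np.linalg.norm(x)
    if norm_x < threshold:
        return np.zeros_like(x)
    return (Lam @ x) * (norm_x - threshold)

# illustrative values, not from the paper
Lam = np.array([[2.0, 0.0], [0.0, 1.0]])
beta = np.array([0.3, 0.4])     # ||beta|| = 0.5
phi_eps_max = 2.0               # user-set error tolerance -> threshold 1.0
inside = B_dz(np.array([0.3, 0.4]), Lam, beta, phi_eps_max)   # ||x|| = 0.5 < 1
outside = B_dz(np.array([3.0, 4.0]), Lam, beta, phi_eps_max)  # ||x|| = 5 >= 1
print(inside, outside)
```

Note the continuity at the boundary: as ‖x‖ approaches the threshold from above, the scaling factor goes to zero, so the operator does not chatter between branches.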

B. Control Algorithm

We propose to use a control algorithm that is composed of a set of control modes, with switching conditions to determine when the robots change from one mode to the next. The robots first move to partition the basis function centers among one another, so that each center is assigned to one robot; then each robot executes a Traveling Salesperson Problem (TSP) tour through all of the basis function centers assigned to it. This tour provides sufficient information for the weighting function to be estimated well over all of the environment. Then the robots carry out a centroidal Voronoi controller using the estimated weighting function to drive to final positions. We call the first mode the "partition" mode, the second the "explore" mode, and the third the "cover" mode. This sequence of control modes is executed asynchronously in a distributed fashion, during which the function approximation parameters are updated continually with (12) and (10).

We would like the robots to switch between the three modes in sequence. However, in order to ensure that the required task from one mode is globally completed before proceeding to the next mode, we require a provably correct global coordination algorithm for the robots. To do this in a distributed and asynchronous way, in this paper we let every robot maintain an estimate of the mode of every other robot, and we update this estimate using a flooding communication protocol. More formally, for each robot we define a mode variable Mii ∈ {partition, explore, cover}. In order to coordinate their mode switches, each robot also maintains an estimate of the modes of all the other robots, so that Mij is the estimate by robot i of robot j's mode, and Mi := (Mi1, . . . , Min) is an n-tuple of robot i's estimates of all robots' modes. Furthermore, the modes are ordered with respect to one another by partition < explore < cover, so that max(Mi, Mj) is a tuple containing the element-wise maximum of the two mode estimate tuples, Mi and Mj, according to this ordering. These mode estimates are updated using the flooding communication protocol described below in Algorithm 2. The conditions for a robot to switch from one mode to another are given formally in Algorithm 1.


Most importantly, Algorithm 1 ensures that the last robot has finished the partition mode before the first robot begins the explore phase, guaranteeing that each basis function center is visited by one of the robots. The algorithm provides a mechanism for "artificially synchronizing" the robots, even though they do not operate on synchronous clocks. The two algorithms run concurrently in different threads. The control laws within each mode are then defined as follows.

Algorithm 1 Switching Control Algorithm (executed by robot i)
Require: Communication with Voronoi neighbors.
Require: Knowledge of position, pi, in global coordinate frame.
Require: Knowledge of the total number of robots n.
Require: Knowledge of flooding algorithm (Algorithm 2) update period, T.
Require: Access to mode estimates Mi updated from Algorithm 2.
while Mi ≠ (explore, . . . , explore) do
    if Mii == explore then
        ui = [0, 0]^T
    else
        ui = ui^partition
    end if
    if Distance to mean of basis function centers < εpartition and Mii == partition then
        Mii = explore
    end if
end while
Compute TSP tour through basis function centers N^µ_i
Wait for nT seconds with ui = [0, 0]^T
Execute TSP tour
Mii = cover
while Mii == cover do
    ui = ui^cover
end while

Algorithm 2 Mode Estimate Flooding (executed by robot i)
Require: The network is connected.
Require: The robots have synchronized clocks with which they broadcast during a pre-assigned time slot.
Initialize Mi = (partition, . . . , partition)
while 1 do
    if Broadcast received from robot j then
        Mi = max(Mi, Mj) (where partition < explore < cover)
    end if
    if Robot i's turn to broadcast then
        Broadcast Mi
    end if
end while
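The merge step at the heart of Algorithm 2 is just an element-wise maximum under the mode ordering. A minimal sketch (tuple values are illustrative):

```python
# mode ordering: partition < explore < cover
ORDER = {"partition": 0, "explore": 1, "cover": 2}

def merge_modes(Mi, Mj):
    """Element-wise max of two mode-estimate n-tuples under the ordering
    partition < explore < cover, as used in Algorithm 2."""
    return tuple(a if ORDER[a] >= ORDER[b] else b for a, b in zip(Mi, Mj))

Mi = ("explore", "partition", "cover")
Mj = ("partition", "explore", "cover")
merged = merge_modes(Mi, Mj)
print(merged)   # ('explore', 'explore', 'cover')
```

Because modes only ever advance, the max operation makes every robot's estimate monotonically nondecreasing, which is why flooding over a connected network propagates the true modes to all robots within n broadcast periods.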

a) Partition: In the partition mode, each robot uses the controller
$$u^{partition}_i = k \left( \frac{1}{|N^\mu_i|} \sum_{j \in N^\mu_i} \mu_j - p_i \right), \quad (15)$$
where $N^\mu_i := \{ \mu_j \mid \|\mu_j - p_i\| \le \|\mu_j - p_k\| \ \forall k \ne i \}$ is the set of basis function centers closest to robot i, µj are the basis function centers from (4), and $|N^\mu_i|$ is the number of elements in $N^\mu_i$. Since each robot knows its Voronoi cell, it can determine which basis function centers lie inside the cell, and these will be the ones closer to it than to any other robot. The positions pi(t), pk(t) and the set $N^\mu_i$ evolve continuously in time. Together with (1), this implements a continuous-time K-means clustering algorithm. This mode serves to divide the basis function centers into n approximately equal-sized groups, one group for each robot, in order to "balance" the load in the explore mode. However, if the groups are not equally sized, this has no bearing on the correctness or convergence of our control algorithm.

b) Explore: In the explore mode, each robot drives a tour at a constant speed $\|\dot p_i\| = v$ through each basis function center in its neighborhood, µj for $j \in N^\mu_i$. Any tour that visits all basis function centers is sufficient for learning (as proved in Lemma 1 below). We choose the TSP tour since it is the shortest such tour, and hence conserves the robots' energy. Although solving the TSP is NP-hard in the worst case, in practice TSP tours can be computed efficiently with freely available exact or approximate TSP solvers (e.g., the Concorde TSP Solver, http://www.math.uwaterloo.ca/tsp/concorde/).
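Since any tour visiting all assigned centers suffices for learning, even a simple heuristic works in place of an exact solver. The sketch below orders centers by repeated nearest neighbour; it is a greedy stand-in for Concorde, with an illustrative center layout.

```python
import numpy as np

def greedy_tour(start, centers):
    """Order the assigned basis-function centers by repeated nearest
    neighbour: a greedy heuristic stand-in for an exact TSP solver."""
    remaining = list(range(len(centers)))
    tour, p = [], np.asarray(start, dtype=float)
    while remaining:
        j = min(remaining, key=lambda c: np.linalg.norm(centers[c] - p))
        tour.append(j)
        p = centers[j]
        remaining.remove(j)
    return tour

centers = np.array([[0.0, 1.0], [0.0, 2.0], [0.0, 0.5]])
tour = greedy_tour([0.0, 0.0], centers)
print(tour)   # [2, 0, 1]: always drives to the closest remaining center
```

A longer-than-optimal tour only costs energy and time; the learning guarantee of Lemma 1 is unaffected, since it requires only that every center be visited.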

c) Cover: Finally, for the cover mode, each robot moves toward the estimated centroid of its Voronoi cell using
$$u^{cover}_i = k (C_i - p_i), \quad (16)$$
where k is the same positive gain from (10).

Using the above control and function approximation algorithm, we can prove that all robots converge to the estimated centroids of their Voronoi cells, that all robots' function approximation parameters converge to the same parameter vector, and that this parameter vector has a bounded error with respect to the optimal parameter vector. This is stated formally in the following theorem.

Theorem 1 (Convergence). A network of robots with dynamics (1), using Algorithm 1 for control, Algorithm 2 for communication, and (10) and (12) for online function approximation, has the following convergence guarantees:
$$\lim_{t \to \infty} \| p_i(t) - C_i(P, t) \| = 0 \quad \forall i, \quad (17)$$
$$\lim_{t \to \infty} \| a_i(t) - a_j(t) \| = 0 \quad \forall i, j, \quad (18)$$
and
$$\lim_{t \to \infty} \| a_i(t) - a \| \le \frac{2 \sum_{j=1}^n \|\beta_j(t)\| \phi_{\varepsilon\max}}{\mathrm{mineig}\left( \sum_{j=1}^n \Lambda_j(t) \right)} \quad \forall i. \quad (19)$$

The proof of this theorem depends upon the following two lemmas, whose proofs are given in the Appendix.

Lemma 1 (Complete Exploration). If every basis function center is visited by at least one robot, then $\sum_{i=1}^n \Lambda_i(t) > 0$ for all $t > \max_j t_j$, where tj is the first time that the jth basis function center was visited.


Lemma 2 (Uniform Continuity). The following functions are uniformly continuous:
$$\psi_1(t) = -\sum_{i=1}^n \| C_i - p_i \|^2 k M_i, \qquad \psi_2(t) = -\zeta \sum_{j=1}^m \alpha_j^T L(P) \alpha_j,$$
and
$$\psi_3(t) = -\sum_{i=1}^n \gamma \tilde{a}_i^T B^{dz}_i(\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}).$$

Proof of Theorem 1: The proof has two parts. The first part is to show that all robots reach "cover" mode and stay in "cover" mode. The second part uses a Lyapunov-type proof technique similar to the one in [12] to show that once all robots are in "cover" mode, the convergence claims (17), (18), and (19) follow.

Firstly, the "partition" mode simply implements a K-means clustering algorithm [38], in which the basis function centers are the points to be clustered, and the robots move to the cluster means. This algorithm is well known to converge in the sense that for any εpartition there exists a time $T^{partition}_i$ at which the distance between the robot and the centers' mean is less than εpartition; therefore, according to Algorithm 1, all robots will reach Mii = explore at some finite time. After this time, according to Algorithm 1, a robot will remain stopped until all of its mode estimates have switched to "explore."

Now we prove that for any robot, at the time when all of its mode estimates have switched to "explore," all of the other robots have completed the "partition" mode and are stopped in fixed positions, thus ensuring a correct partition of the environment. Suppose the first robot to achieve Mi = (explore, . . . , explore) does so at time Tf. This means that at some time in the past all the other robots, j, have switched to Mjj = explore and stopped moving, but none of them have Mj = (explore, . . . , explore) (otherwise they would be the first). Therefore at Tf all robots have completed "partition" mode and are stopped. Suppose the last robot to achieve Mi = (explore, . . . , explore) does so at Tl. From the properties of the standard flooding algorithm, Algorithm 2, we know that Tl − Tf ≤ nT (the maximum time between the first robot obtaining Mi = (explore, . . . , explore) and the last robot doing so is nT). Therefore, at time Tl the first robot to have Mi = (explore, . . . , explore) will still be stopped, because it waits for nT seconds after achieving Mi = (explore, . . . , explore). We have established that all robots are stopped in the time interval [Tf, Tl]. When any robot obtains Mi = (explore, . . . , explore), it does so at some time Ti satisfying Tf ≤ Ti ≤ Tl; hence all robots are stopped at this time. At time Ti robot i computes its TSP tour, according to Algorithm 1. Even though the robots may compute their TSP tours at different times, they are all at the same fixed positions when they do so, according to the above reasoning. Therefore, each basis function center is in at least one robot's TSP tour.

After all robots finish their TSP tours in "explore" mode, we have $\mathrm{mineig}(\sum_{i=1}^n \Lambda_i) > 0$ from Lemma 1, since all basis function centers have been visited by at least one robot. This ensures that the bound in (19) is meaningful. Intuitively, when the robots have completely explored the environment, $\mathrm{mineig}(\sum_{i=1}^n \Lambda_i)$ is large, making the bound in (19) small.

Each tour is of finite length, so it terminates in finite time; hence each robot eventually enters the "cover" mode. Furthermore, when some robots are in "cover" mode and some are still in "explore" mode, the robots in "cover" mode remain inside the environment Q. This is because Q is convex, and since Ci ∈ Vi ⊂ Q and pi ∈ Q, by convexity the segment connecting the two is in Q. Since the robots have integrator dynamics (1), they stay within the union of these segments over time, $p_i(t) \in \cup_{\tau > 0} [p_i(\tau), C_i(\tau)]$, and therefore remain in the environment Q. Thus at some finite time, $T^{cover}$, all robots reach "cover" mode and are at positions inside Q. Furthermore, they never change from "cover" mode once they have reached it, as indicated in Algorithm 1, which completes the first part of the proof.

Now for the second part of the proof, define a Lyapunov-like function
$$V = H + \frac{1}{2} \sum_{i=1}^n \tilde{a}_i^T \Gamma^{-1} \tilde{a}_i, \quad (20)$$
which incorporates the sensing cost H and is quadratic in the parameter errors ãi. We will use several applications of Barbalat's lemma [39], [40] to prove (17), (18), and (19). Barbalat's lemma states that for a function ψ(t), if the limit of its integral exists and is bounded, $\int_0^\infty \psi(\tau) \, d\tau = \Psi_\infty$ with $|\Psi_\infty| < \infty$, and if ψ(t) is uniformly continuous, then $\lim_{t \to \infty} \psi(t) = 0$.

Notice that V ≥ 0. We will first show that $\dot V \le 0$, which implies that $\lim_{t \to \infty} V = V_\infty$ exists and is finite (this is a basic theorem of real analysis). Taking the time derivative of V along the trajectories of the system and simplifying with (3) gives
$$\dot V = \sum_{i=1}^n \left( -\| C_i - p_i \|^2 k M_i + \tilde{a}_i^T k F_i a_i + \tilde{a}_i^T \Gamma^{-1} \dot{\tilde{a}}_i \right).$$
Substituting for $\dot{\tilde{a}}_i$ with (12), (10), and (14) gives
$$\dot V = -\sum_{i=1}^n \Big( \| C_i - p_i \|^2 k M_i + \gamma \tilde{a}_i^T B^{dz}_i(\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}) + \zeta \tilde{a}_i^T \sum_{j \in N_i} l_{ij}(a_i - a_j) + \tilde{a}_i^T I^{proj}_i \dot{a}^{pre}_i \Big).$$
Rearranging terms we get
$$\dot V = -\sum_{i=1}^n \Big( \| C_i - p_i \|^2 k M_i + \gamma \tilde{a}_i^T B^{dz}_i(\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}) + \tilde{a}_i^T I^{proj}_i \dot{a}^{pre}_i \Big) - \zeta \sum_{j=1}^m \alpha_j^T L(P) \alpha_j, \quad (21)$$
where $\alpha_j := [\tilde{a}_{1j} \cdots \tilde{a}_{nj}]^T$ is the vector formed by stacking the jth element of every robot's parameter error vector, and L(P) ≥ 0 is the graph Laplacian of the Delaunay graph which defines the robots' communication network (please refer to the proof of Theorem 2 in [12] for details). Define $\psi_1(t) := -\sum_{i=1}^n \| C_i - p_i \|^2 k M_i$, $\psi_2(t) := -\zeta \sum_{j=1}^m \alpha_j^T L(P) \alpha_j$,


$\psi_3(t) := -\sum_{i=1}^n \gamma \tilde{a}_i^T B^{dz}_i(\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i})$, and $\psi_4(t) := -\sum_{i=1}^n \tilde{a}_i^T I^{proj}_i \dot{a}^{pre}_i$, which gives $\dot V = \psi_1(t) + \psi_2(t) + \psi_3(t) + \psi_4(t)$. Notice that ψ1(t) is a negative sum of squared norms, and therefore non-positive, and ψ2(t) is non-positive because L(P) ≥ 0. Furthermore, ψ4(t) is non-positive by the design of the projection operation in (11) (this is shown in the proof of Theorem 1 in [12]). Therefore, ψ3(t) is the only term in question, and this term distinguishes the Lyapunov construction here from the one in the proof of Theorem 1 in [12].

We now show that ψ3(t) is also non-positive by design. Suppose $\| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \| < \|\beta_i\| \phi_{\varepsilon\max}$ from (13) for some i. Then $B^{dz}_i(\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}) = 0$ and the ith contribution to the term ψ3(t) is zero. Now suppose $\| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \| \ge \|\beta_i\| \phi_{\varepsilon\max}$ for some i. In that case we have
$$0 \le \| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \| - \Big\| \int_{\Omega_i(t)} K(s) \phi_\varepsilon(s) \, ds \Big\|,$$
and therefore, multiplying by $\| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \|$ and applying the Cauchy–Schwarz inequality,
$$0 \le \| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \|^2 - \lambda_{\varepsilon_i}^T (\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}) = \tilde{a}_i^T \Lambda_i (\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}). \quad (22)$$
Then from the definition of the dead-zone operator, $B^{dz}_i(\cdot)$, we have
$$\tilde{a}_i^T B^{dz}_i(\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}) = \tilde{a}_i^T \Lambda_i (\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}) \left( \| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \| - \|\beta_i\| \phi_{\varepsilon\max} \right) \ge 0,$$
since $\tilde{a}_i^T \Lambda_i (\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i})$ was shown to be non-negative in (22), and $\| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \| - \|\beta_i\| \phi_{\varepsilon\max}$ is non-negative by supposition. Therefore the ith contribution to ψ3(t) is non-positive in this case. Then ψ3(t) is a sum of non-positive quantities, hence ψ3(t) ≤ 0. We conclude therefore that $\dot V \le 0$, and since V ≥ 0, this means $\lim_{t \to \infty} V = V_\infty$ exists and is finite.

We now apply Barbalat's lemma to ψ1(t), ψ2(t), and ψ3(t) to get the three main results of the theorem. We already established that each of the ψi(t) terms is non-positive; therefore their integrals must be non-positive, $\Psi_i(t) := \int_0^t \psi_i(\tau) \, d\tau \le 0$ for all t and for all i = 1, 2, 3, 4. We have that $V = \sum_{i=1}^4 \Psi_i(t) + V_0 \ge 0$, hence $\Psi_i(t) \ge -V_0$ for all i. Now since the Ψi(t) are all lower bounded by −V0 and have non-positive time derivatives, ψi(t) ≤ 0, we conclude that each one reaches a finite limit Ψi,∞ (again, this is a fundamental theorem of real analysis). Now, to apply Barbalat's lemma to ψ1(t), ψ2(t), and ψ3(t), we only need to show that they are uniformly continuous.⁴ This is given by Lemma 2, which is proved in the Appendix; hence we conclude that ψk(t) → 0 for k = 1, 2, 3.

⁴In fact, ψ4(t) is not uniformly continuous, but this is inconsequential to our proof.

Now we show that ψk(t) → 0, k = 1, 2, 3, implies the convergence claims stated in the theorem. It is clear that ψ1(t) → 0 gives the position convergence (17), and ψ2(t) → 0 gives parameter consensus (18) (again, see [12] for more details on these). Finally, we verify the parameter error convergence (19). We know that the dead-zone term approaches zero, ψ3(t) → 0; therefore for all i either $\lim_{t \to \infty} \tilde{a}_i^T \Lambda_i (\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}) = 0$, or $\lim_{t \to \infty} ( \| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \| - \|\beta_i\| \phi_{\varepsilon\max} ) < 0$. We already saw from (22) that $\tilde{a}_i^T \Lambda_i (\Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i}) = 0$ implies $\| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \| - \|\beta_i\| \phi_{\varepsilon\max} \le 0$, thus we need only consider the latter case. Then from $\lim_{t \to \infty} ( \| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \| - \|\beta_i\| \phi_{\varepsilon\max} ) \le 0$ we have

$$0 \ge \lim_{t \to \infty} \left( \| \Lambda_i \tilde{a}_i + \lambda_{\varepsilon_i} \| - \|\beta_i\| \phi_{\varepsilon\max} \right),$$
and because of parameter consensus (18), $0 \ge \lim_{t \to \infty} ( \| \Lambda_j \tilde{a}_i + \lambda_{\varepsilon_j} \| - \|\beta_j\| \phi_{\varepsilon\max} )$ for all j. Then summing over j, we have

$$0 \ge \lim_{t \to \infty} \Big( \sum_{j=1}^n \| \Lambda_j \tilde{a}_i + \lambda_{\varepsilon_j} \| - \sum_{j=1}^n \|\beta_j\| \phi_{\varepsilon\max} \Big) \ge \lim_{t \to \infty} \Big( \Big\| \sum_{j=1}^n \Lambda_j \tilde{a}_i + \sum_{j=1}^n \lambda_{\varepsilon_j} \Big\| - \sum_{j=1}^n \|\beta_j\| \phi_{\varepsilon\max} \Big) \ge \lim_{t \to \infty} \Big( \Big| \Big\| \sum_{j=1}^n \Lambda_j \tilde{a}_i \Big\| - \Big\| \sum_{j=1}^n \lambda_{\varepsilon_j} \Big\| \Big| - \sum_{j=1}^n \|\beta_j\| \phi_{\varepsilon\max} \Big).$$

The last condition admits two possibilities. Either $\lim_{t \to \infty} \| \sum_{j=1}^n \Lambda_j \tilde{a}_i \| > \lim_{t \to \infty} \| \sum_{j=1}^n \lambda_{\varepsilon_j} \|$, in which case
$$0 \ge \lim_{t \to \infty} \Big( \Big\| \sum_{j=1}^n \Lambda_j \tilde{a}_i \Big\| - \Big\| \sum_{j=1}^n \lambda_{\varepsilon_j} \Big\| - \sum_{j=1}^n \|\beta_j\| \phi_{\varepsilon\max} \Big) \ge \lim_{t \to \infty} \Big( \Big\| \sum_{j=1}^n \Lambda_j \tilde{a}_i \Big\| - 2 \sum_{j=1}^n \|\beta_j\| \phi_{\varepsilon\max} \Big), \quad (23)$$
where the last inequality uses the fact that $\| \sum_{j=1}^n \lambda_{\varepsilon_j} \| \le \sum_{j=1}^n \|\beta_j\| \phi_{\varepsilon\max}$. Otherwise, $\lim_{t \to \infty} \| \sum_{j=1}^n \Lambda_j \tilde{a}_i \| < \lim_{t \to \infty} \| \sum_{j=1}^n \lambda_{\varepsilon_j} \|$, which implies $\lim_{t \to \infty} \| \sum_{j=1}^n \Lambda_j \tilde{a}_i \| < \sum_{j=1}^n \|\beta_j\| \phi_{\varepsilon\max}$, which in turn implies (23); thus we need only consider (23). This expression then leads to

$$0 \ge \lim_{t \to \infty} \Big( \mathrm{mineig}\Big( \sum_{j=1}^n \Lambda_j \Big) \| \tilde{a}_i \| - 2 \sum_{j=1}^n \|\beta_j\| \phi_{\varepsilon\max} \Big),$$
and dividing both sides by $\mathrm{mineig}\big( \sum_{j=1}^n \Lambda_j \big)$ (which is strictly positive according to Lemma 1) gives (19).

Remark 1 (Optimality). The algorithm drives the robots to converge to a locally optimal sensing configuration with respect to their learned weighting function. Furthermore, all robots converge to the same learned weighting function, as ensured by (18). This means that their positions locally optimize the coverage cost function (2) where φ(q) is replaced with the learned weighting function $\phi(q) = \lim_{t \to \infty} K(q)^T a_i(t)$. This locally optimal performance is the state of the art, even when the weighting function is fully known beforehand [1]. In fact, it is known that the optimization of (2) is NP-hard when φ(q) is known [3]; hence we cannot expect to find the global optimum, particularly when φ(q) is not known.

In practice we find that most locally optimal positions are of nearly equal quality, and they are not appreciably worse than the global optimum. If better local optima are desired, one can implement a simulated-annealing-style algorithm by adding a small amount of noise to the robots' control input, with asymptotically decreasing magnitude. Indeed, in any hardware implementation there will be actuation noise, which


will lead the robots to dislodge themselves from undesirable local minima.

Remark 2 (Choosing Parameters). The algorithm has several parameters that must be tuned by the user, including γ, Γ, k, and ζ. The results of the theorem hold regardless of the specific values of these gains. The appropriate choices for the gains depend greatly upon the specific hardware and electronics one uses. In practice they should be tuned roughly in simulation, and then fine-tuned experimentally. The values used in our experiments are recorded in Sec. V.

Furthermore, one may wonder how the performance depends on the number of basis functions m and on the size of the known function error bound φεmax. It is straightforward to show that the algorithm execution time scales linearly in m. Also, the function approximation error grows no worse than linearly in φεmax, as evident from (19) in Theorem 1.

Remark 3 (Failure Conditions). The results of Theorem 1 hold strictly under the modeling assumptions we describe above. In practice these assumptions will not hold exactly, so it is natural to wonder about the algorithm's performance in those cases. Empirically, we find that the algorithm does not "fail" in the sense of a divergence or an instability. In the case of noise, robot dynamics, or an incorrect function approximation bound φεmax, the algorithm's performance degrades gently. This is evidenced by our hardware experiments, in which the algorithm performs as expected despite the presence of these sources of error.

Remark 4 (Robot Initial Positions). Strictly speaking, robots do not have to start within the region Q. If a robot starts outside Q, one of two things may happen: (i) the robot never moves, because its Voronoi cell is empty, or (ii) it eventually enters Q over the course of the execution of the controller, and once it enters it does not leave.

Remark 5 (Extensions to Nonconvex and Time-Varying Environments). The algorithm can also be directly extended to several classes of non-convex environments, including ones with occlusions and obstacles, by using the coverage control methods from, e.g., [6]–[8] in place of the basic coverage controller (16). Furthermore, the algorithm will perform well for a time-varying weighting function φ(q, t), as long as the weighting function changes slowly compared with the speed of the robots and the convergence rate of the learning algorithm.

IV. SIMULATIONS

The proposed algorithm is tested using data collected from previous experiments [41], in which two incandescent office lights were placed at the position (0, 0) of the environment, and the robots used on-board light sensors to measure the light intensity. The data collected during these previous experiments was used to generate a realistic weighting function which cannot be reconstructed exactly by the chosen basis functions. In the simulation, there are 10 robots and the basis functions are arranged on a 15 × 15 grid over the environment.

The proposed robust algorithm was compared against the standard algorithm from [12], which assumes that the weighting function can be matched exactly. The true and optimally reconstructed (according to (5)) weighting functions are shown in Fig. 4(a) and (b), respectively. As shown in Fig. 4(c) and (d), for the robust and standard algorithms respectively, the proposed robust algorithm significantly outperforms the standard algorithm and reconstructs the true weighting function well. The robot trajectories for the robust and standard adaptive coverage algorithms are shown in Fig. 5(a) and (b), respectively. Since the standard algorithm does not include the exploration phase, the robots get stuck in a local area around their starting positions, which causes them to be unsuccessful in learning an acceptable model of the weighting function. In contrast, the robots with the robust algorithm explore the entire space and reconstruct the true weighting function well.

IEEE TRANSACTIONS ON ROBOTICS, VOL. XX, NO. XX, MONTH 20XX 8

ψ1(t) → 0 gives the position convergence (16), and ψ3(t) → 0 gives parameter consensus (17) (again, see [12] for more details on these). Finally, we verify the parameter error convergence (18). We know that the dead-zone term approaches zero, therefore for all i either

    lim_{t→∞} a_iᵀ Λ_i (Λ_i a_i + λ_εi) = 0,   or   lim_{t→∞} ( ‖Λ_i a_i + λ_εi‖ − ‖β_i‖ φ_εmax ) < 0.

We already saw from (21) that a_iᵀ Λ_i (Λ_i a_i + λ_εi) = 0 implies ‖Λ_i a_i + λ_εi‖ − ‖β_i‖ φ_εmax ≤ 0, thus we only consider the latter case. Then from lim_{t→∞} ( ‖Λ_i a_i + λ_εi‖ − ‖β_i‖ φ_εmax ) ≤ 0 we have

    0 ≥ lim_{t→∞} ( ‖Λ_i a_i + λ_εi‖ − ‖β_i‖ φ_εmax ),

and because of parameter consensus (17),

    0 ≥ lim_{t→∞} ( ‖Λ_j a_i + λ_εj‖ − ‖β_j‖ φ_εmax )   for all j.

Then summing over j, we have

    0 ≥ lim_{t→∞} ( Σ_{j=1}^n ‖Λ_j a_i + λ_εj‖ − Σ_{j=1}^n ‖β_j‖ φ_εmax )
      ≥ lim_{t→∞} ( ‖ Σ_{j=1}^n Λ_j a_i + Σ_{j=1}^n λ_εj ‖ − Σ_{j=1}^n ‖β_j‖ φ_εmax )
      ≥ lim_{t→∞} ( | ‖ Σ_{j=1}^n Λ_j a_i ‖ − ‖ Σ_{j=1}^n λ_εj ‖ | − Σ_{j=1}^n ‖β_j‖ φ_εmax ),

where the second and third inequalities use the triangle inequality and the reverse triangle inequality, respectively. The last condition admits two possibilities. Either lim_{t→∞} ‖ Σ_{j=1}^n Λ_j a_i ‖ > lim_{t→∞} ‖ Σ_{j=1}^n λ_εj ‖, in which case

    0 ≥ lim_{t→∞} ( ‖ Σ_{j=1}^n Λ_j a_i ‖ − ‖ Σ_{j=1}^n λ_εj ‖ − Σ_{j=1}^n ‖β_j‖ φ_εmax )
      ≥ lim_{t→∞} ( ‖ Σ_{j=1}^n Λ_j a_i ‖ − 2 Σ_{j=1}^n ‖β_j‖ φ_εmax ),    (22)

where the last inequality uses the fact that ‖ Σ_{j=1}^n λ_εj ‖ ≤ Σ_{j=1}^n ‖β_j‖ φ_εmax. Otherwise, lim_{t→∞} ‖ Σ_{j=1}^n Λ_j a_i ‖ < lim_{t→∞} ‖ Σ_{j=1}^n λ_εj ‖, which implies lim_{t→∞} ‖ Σ_{j=1}^n Λ_j a_i ‖ < Σ_{j=1}^n ‖β_j‖ φ_εmax, which in turn implies (22); thus we only need to consider (22). This expression then leads to

    0 ≥ lim_{t→∞} ( mineig( Σ_{j=1}^n Λ_j ) ‖a_i‖ − 2 Σ_{j=1}^n ‖β_j‖ φ_εmax ),

and dividing both sides by mineig( Σ_{j=1}^n Λ_j ) (which is strictly positive according to Lemma 1) gives (18).
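The chain of norm inequalities above can be sanity-checked numerically. The sketch below is an illustration only, not part of the proof; the dimensions, distributions, and all variable names are arbitrary choices. It draws random positive semi-definite matrices Λ_j and disturbance vectors λ_εj satisfying ‖λ_εj‖ ≤ ‖β_j‖ φ_εmax, then verifies the triangle inequality, the reverse triangle inequality, and the disturbance bound used to reach (22):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 8                      # parameter dimension, number of robots
phi_eps_max = 0.3

a = rng.normal(size=m)           # a common parameter error vector a_i
Lam, lam_eps, beta_norms = [], [], []
for _ in range(n):
    B = rng.normal(size=(m, m))
    Lam.append(B @ B.T)          # positive semi-definite by construction
    b_norm = rng.uniform(0.5, 2.0)        # stands in for ||beta_j||
    v = rng.normal(size=m)
    v *= rng.uniform(0, 1) * b_norm * phi_eps_max / np.linalg.norm(v)
    lam_eps.append(v)            # guarantees ||lam_eps_j|| <= ||beta_j|| phi_eps_max
    beta_norms.append(b_norm)

S_Lam_a = sum(L @ a for L in Lam)    # sum_j Lam_j a_i
S_lam = sum(lam_eps)                 # sum_j lam_eps_j
bound = phi_eps_max * sum(beta_norms)

# triangle inequality: ||sum_j (Lam_j a + lam_j)|| <= sum_j ||Lam_j a + lam_j||
assert np.linalg.norm(S_Lam_a + S_lam) <= sum(
    np.linalg.norm(L @ a + v) for L, v in zip(Lam, lam_eps)) + 1e-9
# reverse triangle inequality: | ||x|| - ||y|| | <= ||x + y||
assert abs(np.linalg.norm(S_Lam_a) - np.linalg.norm(S_lam)) \
       <= np.linalg.norm(S_Lam_a + S_lam) + 1e-9
# disturbance bound used for (22): ||sum_j lam_j|| <= sum_j ||beta_j|| phi_eps_max
assert np.linalg.norm(S_lam) <= bound + 1e-9
print("all inequality steps verified")
```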

IV. SIMULATIONS

The proposed algorithm was tested using data collected from previous experiments [41], in which two incandescent office lights were placed at the position (0, 0) of the environment and the robots used on-board light sensors to measure the light intensity. The data collected during these previous experiments were used to generate a realistic weighting function that cannot be reconstructed exactly by the chosen basis functions. In the simulation, there are 10 robots, and the basis functions are arranged on a 15×15 grid in the environment.

The proposed robust algorithm was compared against the standard algorithm from [12], which assumes that the weighting function can be matched exactly. The true and optimally reconstructed (according to (5)) weighting functions are shown in Fig. 4(a) and (b), respectively. As shown in Fig. 4(c) and (d) for the robust and standard algorithms, respectively, the proposed robust algorithm significantly outperforms the standard algorithm and reconstructs the true weighting function well. The robot trajectories for the robust and standard adaptive coverage algorithms are shown in Fig. 5(a) and (b), respectively. Since the standard algorithm does not include the exploration phase, the robots get stuck in a local area around their starting positions, which causes them to be unsuccessful in learning an acceptable model of the weighting function. In contrast, the robots with the robust algorithm explore the entire space and reconstruct the true weighting function well.
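The reconstruction setup can be sketched as follows. This is a hedged illustration, not the paper's code: the kernel width, the example weighting function, and the sampling grid are our assumptions; only the 15×15 grid of Gaussian basis functions and the idea of an optimal least-squares fit, as in (5), come from the text.

```python
import numpy as np

# Gaussian basis functions on a g x g grid over the unit square; the
# weighting function is approximated as phi_hat(q) = a^T K(q).
g = 15                                    # 15 x 15 grid of basis centers, as in the simulation
centers = np.array([(x, y) for x in np.linspace(0, 1, g)
                           for y in np.linspace(0, 1, g)])
sigma = 1.0 / g                           # kernel width (an assumption, not from the paper)

def K(q):
    """Vector of Gaussian basis functions evaluated at point q."""
    d2 = np.sum((centers - q) ** 2, axis=1)
    return np.exp(-d2 / (2 * sigma ** 2))

def phi_true(q):
    """An example 'true' weighting function that the basis cannot match exactly:
    a bright spot near (0, 0), standing in for the measured light intensity."""
    return 1.0 + 5.0 * np.exp(-8.0 * np.sum(q ** 2))

# Offline least-squares fit, analogous to the optimal reconstruction in (5):
samples = np.array([(x, y) for x in np.linspace(0, 1, 40)
                           for y in np.linspace(0, 1, 40)])
Phi = np.array([K(q) for q in samples])          # regressor matrix
y = np.array([phi_true(q) for q in samples])
a_opt, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # optimal parameters

q_test = np.array([0.1, 0.1])
print(abs(a_opt @ K(q_test) - phi_true(q_test)))  # residual of the fit at a test point
```

The gap between `phi_true` and `a_opt @ K(q)` plays the role of the bounded function error φ_εmax that the robust algorithm is designed to tolerate.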

Fig. 4. A comparison of the weighting functions. (a) The true weighting function. (b) The optimal, offline, centrally reconstructed weighting function for the chosen basis functions. (c) The weighting function for the proposed algorithm with dead-zone and exploration. (d) The previously proposed algorithm without dead-zone or exploration (note the different z-axis scale).

Fig. 5. The vehicle trajectories for the different control strategies and the values of their Lyapunov functions: (a) the robust algorithm, (b) the standard algorithm, (c) the Lyapunov function over time. The initial and final vehicle positions are marked by a circle and a cross, respectively. It is evident from the Lyapunov function values that the robust algorithm significantly outperforms the standard algorithm.


V. HARDWARE EXPERIMENTS

In addition to verifying the algorithm in simulation, we implemented the algorithm on a group of robots to demonstrate its practical feasibility. Six iRobot Creates with Asus Eee PC 1015PX netbooks were used as the robot platforms. The robots sensed light intensity as the weighting function, measured by an Advanced Photonix PDV-P9008 photocell mounted on each robot. The Robot Operating System (ROS) was used for communication, and Python was chosen as the implementation language. For localizing the robots, a room outfitted with a Vicon motion capture system was used. As is usually the case, several modifications to the theoretical algorithm were required for the hardware implementation; the most significant are described here.

Collisions: While collisions were not treated directly in the development of the algorithm and were not accounted for in the simulation, they had to be addressed in the hardware implementation. Our method for avoiding collisions was to draw a line between the robot and its goal. The robot's


position is then projected along this line for two robot radii to check for imminent collisions. If a collision is detected, the robot determines the relative heading of the other robot with respect to itself, turns 90 degrees in the opposite direction, and travels a fixed distance. This simple algorithm proved sufficiently robust in practice, though a more sophisticated collision avoidance scheme would be preferable if collisions are a critical concern, such as for quadrotors or other aircraft.
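The collision-avoidance rule described above can be sketched as follows. The function and its parameters are our naming, and the robot radius and travel distance are placeholder values, not the ones used in the experiments:

```python
import numpy as np

ROBOT_RADIUS = 0.17   # approximate iRobot Create radius in meters (placeholder value)

def avoidance_waypoint(p_self, p_goal, p_other, heading):
    """Project the robot two radii along the line to its goal; if that point
    would hit the other robot, turn 90 degrees away from it and travel a
    fixed distance instead of proceeding to the goal."""
    to_goal = p_goal - p_self
    dist = np.linalg.norm(to_goal)
    if dist < 1e-9:
        return p_goal
    probe = p_self + to_goal / dist * min(2 * ROBOT_RADIUS, dist)
    if np.linalg.norm(probe - p_other) >= 2 * ROBOT_RADIUS:
        return p_goal                      # no imminent collision: proceed to the goal
    # relative heading of the other robot with respect to this one
    rel = p_other - p_self
    bearing = np.arctan2(rel[1], rel[0]) - heading
    s = np.sign(np.sin(bearing))
    if s == 0:
        s = 1.0                            # other robot dead ahead: pick a side
    # turn 90 degrees in the opposite direction and travel a fixed distance
    turn = heading - s * np.pi / 2
    return p_self + 2 * ROBOT_RADIUS * np.array([np.cos(turn), np.sin(turn)])
```

With the path clear, the function simply returns the goal; with another robot on the line, it returns a sidestep waypoint.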

Integration: In implementing the time integration required for the controller and parameter adaptation laws, there were two modifications from the mathematical model. First, a fixed time step of 0.01 s was used in computing the integration, although in actuality the time taken for a single update of the equations varied considerably from one time step to the next, due to variable wireless transmission delays and other hard-to-predict delays in the system. We found that assuming a fixed time step provided a more robust solution despite the variation in actual update times.
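The first modification amounts to a fixed-step Euler update applied regardless of the actual loop period. A minimal sketch, with illustrative names and a placeholder adaptation rate:

```python
import numpy as np

DT = 0.01  # fixed integration time step, used regardless of the actual loop timing

def integrate_params(a, a_dot, dt=DT):
    """One fixed-step Euler update of an adaptation law a_dot = f(a, ...).
    The real loop period varies with network delays, but a fixed dt is assumed."""
    return a + dt * a_dot

a = np.zeros(3)
for _ in range(100):                 # 1 s of simulated adaptation
    a = integrate_params(a, np.ones(3))
print(a)                             # each entry has accumulated 100 * 0.01 = 1.0
```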

Second, the dead-zone function used in the experiments was a normalized version of the one described in the theoretical development above. Specifically, the dead-zone was computed as

    Bdz_i(x) = 0                 if C(x) < 0,
    Bdz_i(x) = (x / ‖x‖) C(x)    otherwise,

with

    C(x) = ‖Λ_i^{1/2}‖ ( ‖Λ_i^{−1/2} x‖ − ‖Λ_i^{1/2} β_i‖ φ_εmax − ‖Λ_i^{1/2} Λ_0‖ a_max ),

where Λ_0 > 0 is a small but positive definite initial condition for Λ_i. Although it looks more complicated, this dead-zone is a normalized version of (13). The reason for the change is that the normalized version was found to require less fine tuning of the adaptation gain γ in practice. We chose to use the unnormalized dead-zone in the theoretical development,

both because it has a simpler mathematical form, and because it is more straightforward to prove some of the technicalities required for the parameter convergence result in Theorem 1.
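The normalized dead-zone can be transcribed almost literally into code. In the sketch below we take ‖·‖ to be the Euclidean norm for vectors and the spectral norm for matrices (an assumption; the text does not specify), and the values used to exercise it are illustrative, not from the experiments:

```python
import numpy as np

def mat_sqrt(M):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(w)) @ V.T

def dead_zone(x, Lam, beta, Lam0, a_max, phi_eps_max):
    """Normalized dead-zone Bdz_i(x): zero when C(x) < 0, otherwise a
    vector of magnitude C(x) pointing along x."""
    L_half = mat_sqrt(Lam)
    L_inv_half = np.linalg.inv(L_half)
    C = np.linalg.norm(L_half, 2) * (
        np.linalg.norm(L_inv_half @ x)
        - np.linalg.norm(L_half @ beta) * phi_eps_max
        - np.linalg.norm(L_half @ Lam0, 2) * a_max)
    nx = np.linalg.norm(x)
    if C < 0 or nx == 0:
        return np.zeros_like(x)
    return x / nx * C

# Illustrative values: a small argument falls inside the dead-zone,
# a large one passes through with its direction preserved.
Lam = 4.0 * np.eye(3)
Lam0 = 0.001 * np.eye(3)
beta = 0.1 * np.ones(3)
dz_small = dead_zone(np.array([0.1, 0.0, 0.0]), Lam, beta, Lam0, 10.0, 0.3)
dz_large = dead_zone(np.array([10.0, 0.0, 0.0]), Lam, beta, Lam0, 10.0, 0.3)
```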

Delay: Another difficulty not accounted for directly in the analysis or simulations is delay in communication. The algorithm is fully decentralized; however, to avoid practical difficulties with mesh network hardware, we simulated the network on a base computer and computed the control actions of each robot on this base computer. We then used 802.11 (Wi-Fi) to transmit waypoint commands to the iRobot Creates from the base station. Wi-Fi transmission rates are known to depend on interference and room geometry in hard-to-predict ways. We mitigated the effects of variable delays due to wireless transmission by choosing our control and adaptation gains to be suitably low, and by choosing an environment discretization coarse enough to give relatively small wireless packets while preserving control performance.

Robot Motion: In our analysis and simulation results, robots are represented as noiseless point integrators. In practice, however, the robots' motion incorporates noise, and a controller is required to move the robots to desired positions. For the experiments, a waypoint controller was implemented to rotate the robot to the desired heading within some tolerance, and then drive directly to the waypoint.

A. Experimental Setup

Two groups of experiments were conducted with the six iRobot Creates; the first tested for robustness to real-world complications such as sensing and actuation noise, while the second tested specifically for sensitivity to positional errors in the robots. The control and adaptation gains used in the control algorithm were k = 3, ζ = 30, γ = 0.1, and Γ = I. For estimating the weighting function we used an 8×8 grid of Gaussian basis functions (hence m = 64) with standard deviation σ = 0.55 m. The function error bound was φεmax = 0.3, and the minimum and maximum parameters were amin = 0.001 and amax = 10, respectively.

The required angular accuracy of the waypoint controller depends on the distance to the goal: the further away the robot is, the more critical angular accuracy becomes for converging on the desired position. Thus an equation was developed that varies the required angular accuracy inversely with the distance from the goal:

    error_angle,max = | arcsin( tol / dist ) |,

where tol is the tolerance in the final position of the robot (5 cm in the experiments), and dist is the distance from the robot to the desired location. Note, however, that since the allowable angle decreases with distance, at some distance it becomes smaller than the robot can reasonably achieve, so a minimum angular error was introduced. Conceptually, this minimum error simply directs the robot in roughly the right direction, so that the distance to the goal decreases and the allowable angular error grows. This method also has the advantage of requiring less accuracy of the robot as it approaches its goal, allowing for faster run times.
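The distance-dependent angular tolerance with the minimum-error floor can be sketched as follows; the 5 cm positional tolerance is from the experiments, while the floor value and function name are illustrative assumptions:

```python
import numpy as np

MIN_ANGLE_TOL = np.radians(5.0)   # floor on the angular tolerance (illustrative value)

def max_angle_error(dist, tol=0.05):
    """Allowable heading error before driving: |arcsin(tol / dist)|, floored
    at MIN_ANGLE_TOL. tol is the positional tolerance at the goal (5 cm)."""
    if dist <= tol:
        return np.pi                  # already within tolerance: any heading is fine
    return max(abs(np.arcsin(tol / dist)), MIN_ANGLE_TOL)
```

Far from the goal the floor dominates, so the robot only needs to head roughly the right way; close to the goal the arcsin term dominates and the tolerance relaxes.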


Fig. 6. The environment used in the experiments, with one robot shown. There are light sources in the bottom left and upper right corners, though the source in the lower left is not shown in the frame of the photograph. The robots' environment is the 3m×3m region outlined in black.

Experimental Setup 1: The first experimental setup was run 11 times, each with the same light placement, shown in Fig. 6. The environment was a 3m×3m square shown in the figure. A 10×10 grid was used for the spatial integration to compute $M_i$, $C_i$ and $F_i$, as required for the control algorithm. Six robots were used in each run, with their starting locations varied from one run to the next. The light sensors (PDV-P9008 photoresistors) were placed at the center of the robot, so that the robot's position was the same as the sensor's position.
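The quantities $M_i$ (mass) and $C_i$ (centroid) of robot $i$'s Voronoi cell can be approximated on such a grid with a simple nearest-robot quadrature. This is our own sketch, assuming the weighting function is available as a callable `phi`:

```python
import numpy as np

def mass_and_centroid(positions, i, phi, width=3.0, n=10):
    """Approximate M_i and C_i for robot i by summing phi over the
    n x n grid points that fall in robot i's Voronoi cell."""
    ticks = (np.arange(n) + 0.5) * (width / n)   # grid point coordinates
    dA = (width / n) ** 2                        # area element of one cell
    M, C = 0.0, np.zeros(2)
    for x in ticks:
        for y in ticks:
            q = np.array([x, y])
            # q belongs to the Voronoi cell of the nearest robot
            if np.argmin(np.linalg.norm(positions - q, axis=1)) == i:
                w = phi(q) * dA
                M += w
                C += w * q
    return M, C / M
```

With a uniform weighting and two robots placed symmetrically, each cell gets half the total mass and the centroid sits at the center of each half, which is a quick sanity check on the quadrature.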


Experimental Setup 2: This experimental setup was repeated for three runs and was the same as Experimental Setup 1 in all respects, except that the light sensors were offset 15 cm from the center of the robot. This placement introduced a significant positional error in the location of the sensor, depending both on the location and the angle of the robot. This was to test the robustness of the algorithm to a significant source of positional error.

B. Results

The experimental results indicate that the robots are able to repeatably reconstruct the light field and position themselves at locally-optimal positions for sensing, despite significant function approximation errors, as predicted by our theoretical results.

Robot trajectories and the learned light intensity function from one run with Experimental Setup 1 are shown in Fig. 7. Please see the accompanying movie to view the animated experimental data. The left frame shows the state of the 6 iCreate robots after the partition mode, when they have divided the basis function centers amongst themselves. The center frame shows the trajectories of the robots in the explore mode, during which they make a tour through all the basis function centers. The rightmost frame shows the robots in their final coverage configuration at the end of the cover mode, with their approximate trajectories drawn. The final learned weighting function is represented by the underlying heat map.

Fig. 7. This figure shows robot positions and trajectories for one of the runs from Experimental Setup 1 with 6 iCreate robots. Green dots indicate robot positions, while yellow dots show the next robot waypoint. Left: The state of the system is shown at the end of partition mode, with all waypoints partitioned amongst the robots. Center: The trajectories of the robots during explore mode, visiting all basis function centers. Right: The final robot positions in cover mode, with the learned weighting function shown as a heat map.
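The three-mode sequence just described (partition, explore, cover) can be summarized with a minimal state-machine sketch. The switching predicates here are simplified placeholders; in the actual system the robots coordinate the switching conditions in a distributed way over their mesh network:

```python
from enum import Enum, auto

class Mode(Enum):
    PARTITION = auto()  # divide basis function centers among the robots
    EXPLORE = auto()    # tour through assigned centers, collecting data
    COVER = auto()      # Voronoi-based coverage with the learned weighting

def next_mode(mode, centers_assigned, tour_complete):
    """Hypothetical per-robot mode transition rule."""
    if mode is Mode.PARTITION and centers_assigned:
        return Mode.EXPLORE
    if mode is Mode.EXPLORE and tour_complete:
        return Mode.COVER
    return mode  # COVER is terminal; otherwise wait for the predicate
```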

The performance of the algorithm in reconstructing the light field in the two experimental setups is shown in Figs. 8 and 9. The left panels in Figs. 8 and 9 show an ideal centralized, off-line reconstruction of the field, based on the light intensity data collected by one robot over all trials for Experiments 1 and 2, respectively. The middle panels show the light field as reconstructed with our algorithm from one robot, averaged over all trials for Experiments 1 and 2, respectively. The right panels show the difference between the ideal and estimated fields for Experiments 1 and 2, respectively. The figures show that the light field was estimated well in both cases, but the positional errors introduced in Experimental Setup 2 noticeably degrade the learning performance.
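A centralized off-line baseline of this kind amounts to a regularized least-squares fit of the parameter vector against all collected samples. A sketch under our own naming, with a small ridge term added for numerical stability:

```python
import numpy as np

def offline_fit(samples, values, centers, sigma=0.55, reg=1e-8):
    """Centralized baseline: fit parameters a minimizing
    ||Phi a - y||^2 + reg ||a||^2, where row t of Phi is K(q_t)^T
    for sample location q_t and measured intensity y_t."""
    d2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    m = centers.shape[0]
    # normal equations with ridge regularization
    a = np.linalg.solve(Phi.T @ Phi + reg * np.eye(m), Phi.T @ values)
    return a, Phi
```

The distributed on-line estimates in the middle panels are then naturally compared against this batch solution, which uses all the data at once.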


Fig. 8. The field estimation results for Experimental Setup 1 are shown here. Left: The light intensity field as reconstructed offline from all measurements. Middle: The light intensity field estimated by one robot averaged over the 11 runs. Right: The error between the two fields. The shade of red represents light intensity relative to the maximum measurable by the sensor, and the white boxes represent the standard deviation.


Fig. 9. This figure shows the field estimation results for Experimental Setup 2. Left: The light intensity field as reconstructed offline from all measurements. Middle: The light intensity field estimated by one robot averaged over the 3 runs. Right: The error between the two fields. The shade of red represents light intensity relative to the maximum measurable by the sensor, and the white boxes represent the standard deviation.

The function approximation error, averaged over all runs and all robots, was 4.6% (standard deviation 3.4%) for Experimental Setup 1, and 5.8% (standard deviation 4.1%) for Experimental Setup 2. Similarly, the maximum errors were 18.3% and 18.1%, respectively. This indicates that the algorithm performed slightly better in Experimental Setup 1, as would be expected; however, the performance did not decrease significantly in Experimental Setup 2, despite the significant positional error of the sensors, indicating that the algorithm is robust to such errors.



(a) Experiment 1



(b) Experiment 2

Fig. 10. The error in the estimated light intensity field as measured by one robot over time is shown here for all runs and for both experimental setups. In all cases, the error decreases over time and settles to a small non-zero value, as predicted by Theorem 1.

The evolution of the function approximation error over time from one robot for each of the 11 trials in Experimental Setup 1, and for each of the 3 trials in Experimental Setup 2, is shown in Fig. 10. One can see that for each trial, the error decreases over time, and eventually settles to a value that is small, but not zero, as predicted by Theorem 1. Performance in Experimental Setup 1 is marginally better than in Experimental


Setup 2, as a slightly lower value of the error is attained. This is what we would expect, since we intentionally introduced positional errors for the sensors in Experimental Setup 2. These results again indicate that the algorithm is robust to real-world errors.

VI. CONCLUSIONS

In this paper, we formulated a distributed control and function approximation algorithm for deploying a robotic sensor network to adaptively monitor an environment. The robots robustly learn a weighting function over the environment representing where sensing is most needed. The function learning is enabled by the robots exploring the environment in a systematic and distributed way, and is provably robust to function approximation errors. After exploring the environment, the robots drive to positions that locally minimize a cost function representing the sensing quality of the network. The performance of the algorithm is proven in a theorem, and demonstrated in a numerical simulation with an empirical weighting function derived from light intensity measurements in a room. The algorithm was also implemented on six iRobot Create platforms. Robust performance under realistic conditions was demonstrated in a total of 14 experimental trials in 2 different configurations.

APPENDIX

Theorem 2 (Gaussian Kernel Matrix is Positive Definite). The matrix whose entries are given by

$$k_{ij} = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{\|\mu_i - \mu_j\|^2}{2\sigma^2}\right),$$

is positive definite for any $m$ unique vectors $\mu_i \in \mathbb{R}^2$, $i = 1, \ldots, m$.
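The theorem is easy to check numerically for a particular set of centers. A small sanity check of our own, using a 4×4 grid of distinct centers over a 3m×3m region:

```python
import numpy as np

def gaussian_kernel_matrix(mu, sigma=0.55):
    """k_ij = exp(-||mu_i - mu_j||^2 / (2 sigma^2)) / (2 pi sigma^2)."""
    d2 = ((mu[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)

# 16 distinct centers on a 4x4 grid over a 3m x 3m region
ticks = (np.arange(4) + 0.5) * 0.75
mu = np.column_stack([g.ravel() for g in np.meshgrid(ticks, ticks)])
Kmat = gaussian_kernel_matrix(mu)
min_eig = np.linalg.eigvalsh(Kmat).min()  # strictly positive by Theorem 2
```

Such a check is no substitute for the theorem, of course, but it is a useful guard in an implementation, since clustered centers can make the matrix numerically ill-conditioned even though it remains positive definite in exact arithmetic.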

Proof of Lemma 1: The proof relies on the above well-known theorem from the theory of kernel machines. The theorem follows directly from [42] Corollary 2.1 (c), or more recently, [43] Propositions 6 and 8.

We will show that after $t > \max_j t_j$, the matrix $\sum_{i=1}^n \Lambda_i(t)$ can be expressed as the sum of two quantities

$$\sum_{i=1}^n \Lambda_i(t) = \sum_{j=1}^m \int_{s \in \omega_j} K(s)K(s)^T \, ds + \sum_{i=1}^n \int_{s \in \Omega_i(t) \setminus \cup_{j=1}^m \omega_j} K(s)K(s)^T \, ds, \quad (24)$$

where the first matrix sum on the right hand side is positive definite, and the second is positive semi-definite, hence their sum is positive definite.

Let $i_j$ be the robot for which $p_{i_j} = \mu_j$ at time $t_j$. Then we have $\mu_j \in \Omega_{i_j}(t > t_j)$ (recall that $\Omega_i(t)$ is the set of points visited by robot $i$ up until time $t$). Consider a neighborhood $\omega_j \subset \Omega_{i_j}$ centered at $\mu_j$ of length $\delta s > 0$. More formally,

$$\omega_j := \{ s \mid s = p_{i_j}(t_j + \tau), \ \tau \in [-\delta s/2v, \delta s/2v] \},$$

where $v$ is the constant speed at which the robots traverse their TSP tour. Also, let $\delta s < \min_{j \neq k} \|\mu_j - \mu_k\|$, which ensures that all $\omega_j$ are disjoint, $\omega_j \cap \omega_k = \emptyset$, for all $j \neq k$. Now we can divide each $\Omega_i(t)$ into a union of disjoint sets $\Omega_i(t) = \left( \cup_{j \mid p_i(t_j) = \mu_j}\, \omega_j \right) \cup \Omega_i(t) \setminus \cup_{j=1}^m \omega_j$, which leads to

$$\Lambda_i(t) = \sum_{j \mid p_i(t_j) = \mu_j} \int_{\omega_j} K(s)K(s)^T \, ds + \int_{\Omega_i(t) \setminus \cup_{j=1}^m \omega_j} K(s)K(s)^T \, ds.$$

Summing over all $i$ gives the decomposition in (24). Both sums on the right hand side of (24) are positive semi-definite since $K(s)K(s)^T \geq 0$ for all $s$, as it is an outer product of a vector with itself.

We only have to show that $\sum_{j=1}^m \int_{\omega_j} K(s)K(s)^T \, ds > 0$. To this end, re-write this as

$$\sum_{j=1}^m \int_{\omega_j} K(s)K(s)^T \, ds = \sum_{j=1}^m K(\mu_j)K(\mu_j)^T \delta s + \int_{-\delta s/2v}^{\delta s/2v} \sum_{j=1}^m \left[ K\left(p_{i_j}(t_j + \tau)\right)K\left(p_{i_j}(t_j + \tau)\right)^T - K(\mu_j)K(\mu_j)^T \right] v \, d\tau. \quad (25)$$

By continuity of the Gaussian and the fact that $\mathrm{maxeig}(\cdot)$ is a continuous function, we know that for any $\epsilon > 0$, there exists $\delta s > 0$ such that

$$\mathrm{maxeig}\left( K\left(p_{i_j}(t_j + \tau)\right)K\left(p_{i_j}(t_j + \tau)\right)^T - K\left(p_{i_j}(t_j)\right)K\left(p_{i_j}(t_j)\right)^T \right) < \epsilon$$

for all $\tau \in [-\delta s/2v, \delta s/2v]$. Noting that $\mathrm{maxeig}(\cdot)$ is convex, we can use Jensen's inequality$^5$ to move the maxeig outside the sum and the integral to get

$$\mathrm{maxeig}\left( \int_{-\delta s/2v}^{\delta s/2v} \sum_{j=1}^m \left[ K\left(p_{i_j}(t_j + \tau)\right)K\left(p_{i_j}(t_j + \tau)\right)^T - K(\mu_j)K(\mu_j)^T \right] v \, d\tau \right) < m \delta s \epsilon,$$

and taking the mineig of (25) and bounding the right hand side, we have

$$\mathrm{mineig}\left( \sum_{j=1}^m \int_{\omega_j} K(s)K(s)^T \, ds \right) > \mathrm{mineig}\left( \sum_{j=1}^m K(\mu_j)K(\mu_j)^T \right) \delta s - m \delta s \epsilon. \quad (26)$$

Finally, we show that $\sum_{j=1}^m K(\mu_j)K(\mu_j)^T > 0$ as follows. Let $K(\mu_j) = [k_{1j}, \ldots, k_{mj}]^T$, and we have $K(\mu_j)K(\mu_j)^T = [k_{kj}k_{lj}]_{k,l}$, where $[\cdot]_{k,l}$ denotes the matrix with terms ranging over rows $k$ and columns $l$. Therefore

$$\sum_{j=1}^m K(\mu_j)K(\mu_j)^T = \left[ \sum_{j=1}^m k_{kj}k_{lj} \right]_{k,l} = \left[ K(\mu_k)^T K(\mu_l) \right]_{k,l} = \left[ K(\mu_1) \cdots K(\mu_m) \right]\left[ K(\mu_1) \cdots K(\mu_m) \right]^T = [k_{kl}]_{k,l}^2,$$

$^5$Jensen's inequality states that $f\left(\int g(x) \, dx\right) \leq \int f(g(x)) \, dx$ for any convex function $f(\cdot)$.
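This last identity, that the sum of outer products equals the squared kernel matrix, can be verified numerically. A small sanity check of our own with 9 centers on a grid:

```python
import numpy as np

sigma = 0.55
ticks = (np.arange(3) + 0.5) * 1.0
mu = np.column_stack([g.ravel() for g in np.meshgrid(ticks, ticks)])  # m = 9 centers

def K(q):
    """Basis vector K(q); its k-th entry is the Gaussian kernel of mu_k and q."""
    d2 = ((mu - q) ** 2).sum(axis=1)
    return np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)

G = np.column_stack([K(c) for c in mu])       # [K(mu_1) ... K(mu_m)] = [k_kl]
S = sum(np.outer(K(c), K(c)) for c in mu)     # sum_j K(mu_j) K(mu_j)^T
lam_min = np.linalg.eigvalsh(G).min()         # G is symmetric, so eigvalsh applies
```

Since $G$ is symmetric positive definite by Theorem 2, $S = GG^T = G^2$ has minimum eigenvalue $\lambda_{\min}(G)^2$, exactly the bound used in the proof.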


where the second equality comes from the fact that $k_{kj} = k_{jk}$ because the Gaussian is radially symmetric. From Theorem 2, we know that $[k_{kl}]_{k,l} > 0$. Denote the minimum eigenvalue of this matrix by $\lambda_{\min} = \mathrm{mineig}([k_{kl}]_{k,l}) > 0$ to obtain $\mathrm{mineig}\left( \sum_{j=1}^m K(\mu_j)K(\mu_j)^T \right) \geq \lambda_{\min}^2$. Finally, from (26) we have

$$\mathrm{mineig}\left( \sum_{j=1}^m \int_{\omega_j} K(\mu_j)K(\mu_j)^T \, ds \right) > \lambda_{\min}^2 \delta s - m \delta s \epsilon,$$

and letting $\epsilon < \lambda_{\min}^2/m$ (since we can choose it to be arbitrarily small) we get the desired result.

Proof of Lemma 2: Following a similar line of reasoning as in [12], Lemmas 1 and 2, we show that (i) $\psi_1(t)$, $\psi_2(t)$, and $\psi_3(t)$ are continuous, (ii) their non-differentiabilities are only at isolated points in time (i.e. there are no pathological intervals of non-differentiability as in the Weierstrass function), and (iii) wherever their derivatives exist, those derivatives are uniformly bounded. These three facts imply that the functions are uniformly continuous.

(i) Firstly, note that $p_i(t)$, $a_i(t)$, $\Lambda_i(t)$, and $\lambda_{\epsilon_i}(t)$ are continuous, being the solutions to differential equations, and $C_i(P,t)$, $M_i(P,t)$, $F_i(P,t)$, and $\mathcal{L}(P)$ are continuous because the Voronoi cells are continuous functions of the positions of the robots, and $a_i(t)$ is continuous. The dead-zone function $B^{\mathrm{dz}}_i(\cdot)$ is also continuous by design. Therefore the only discontinuous quantity in the system is in $I^{\mathrm{proj}}_i$, appearing on the right hand side of $\dot{a}_i$. Neither $I^{\mathrm{proj}}_i$ nor $\dot{a}_i$ appear in any of $\psi_1(t)$, $\psi_2(t)$, or $\psi_3(t)$, hence we conclude that these are all continuous functions.

(ii) The non-differentiabilities in the system only appear in three varieties. The first variety is in the dead-zone $B^{\mathrm{dz}}_i(\cdot)$ when $C(\Lambda_i a_i + \lambda_{\epsilon_i})$ transitions to 0, which can only occur at an isolated time according to the differential equations defining $\Lambda$, $a_i$, and $\lambda_{\epsilon_i}$. The second is in $I^{\mathrm{proj}}_i$ when one of the parameters $a_{ij}$ transitions to the upper or lower limit, $a_{\max}$ or $a_{\min}$, which again can only happen at an isolated time due to the differential equation governing $a_i(t)$. The third variety occurs when a robot gains or loses Voronoi neighbors, causing a non-differentiability in $\mathcal{L}(P)$, which can only occur at an isolated time due to the differential equation governing $p_i$.

(iii) The boundedness of the derivatives $|\dot{\psi}_k(t)| < \dot{\psi}^{\max}_k$, $k = 1, 2, 3$, can be established directly as follows. Firstly, it was shown in the proof of Theorem 1 that $\psi_k(t)$, $k = 1, 2, 3$, are all bounded, and $C_i, p_i \in Q$ are bounded since $Q$ is bounded, therefore $\dot{p}_i(t)$, $\Lambda_i(t)$, $M_i(t)$, $a_i(t)$, $\mathcal{L}(P)$, and $\lambda_{\epsilon_i}(t)$ are all bounded, as is $B^{\mathrm{dz}}_i(\Lambda_i a_i + \lambda_{\epsilon_i})$. This implies that $\dot{a}_i$ from (12) and (10) is bounded because it is a polynomial function of these quantities. Also $K(q)$ is a Gaussian function, hence it is bounded, so $K_i(t)$ is bounded, and $\dot{\Lambda}_i = K_i K_i^T \|\dot{p}_i\|$, $\dot{\lambda}_{\epsilon_i} \leq K_i \|\dot{p}_i\| \phi_{\epsilon\max}$, and $\dot{\beta} = K_i \|\dot{p}_i\|$ are all bounded, therefore $\dot{\psi}_3(t)$ is bounded. Since the length of the edge of a Voronoi cell $l_{ij}$ changes as a bounded function of the robots' positions $p_i$ and $p_j$, $\dot{\mathcal{L}}(P)$ is bounded, therefore $\dot{\psi}_2(t)$ is bounded. Finally, $\dot{C}_i$ and $\dot{M}_i$ are bounded, as they are polynomial functions of the above quantities, and the $M_i$ in the denominator of $\dot{C}_i$ is bounded away from zero because $a_{\min} > 0$. Therefore $\dot{\psi}_1(t)$ is bounded. We have shown (i), (ii), and (iii), therefore we conclude that $\psi_k(t)$, $k = 1, 2, 3$, are uniformly continuous.

REFERENCES

[1] J. Cortes, S. Martinez, T. Karatas, and F. Bullo, "Coverage control for mobile sensing networks," IEEE Transactions on Robotics and Automation, vol. 20, no. 2, pp. 243–255, April 2004.

[2] A. Weber, Theory of the Location of Industries. Chicago, IL: The University of Chicago Press, 1929, translated by Carl. J. Friedrich.

[3] Z. Drezner, Facility Location: A Survey of Applications and Methods, ser. Springer Series in Operations Research. New York: Springer-Verlag, 1995.

[4] S. Martinez, "Distributed interpolation schemes for field estimation by mobile sensor networks," IEEE Transactions on Control Systems Technology, vol. 18, no. 2, pp. 419–500, March 2010.

[5] J. Cortes, S. Martinez, and F. Bullo, "Spatially-distributed coverage optimization and control with limited-range interactions," ESAIM: Control, Optimisation and Calculus of Variations, vol. 11, pp. 691–719, 2005.

[6] L. C. A. Pimenta, V. Kumar, R. C. Mesquita, and G. A. S. Pereira, "Sensing and coverage for a network of heterogeneous robots," in Proceedings of the IEEE Conference on Decision and Control, Cancun, Mexico, December 2008.

[7] A. Breitenmoser, M. Schwager, J. C. Metzger, R. Siegwart, and D. Rus, "Voronoi coverage of non-convex environments with a group of networked robots," in Proceedings of the International Conference on Robotics and Automation (ICRA 10), Anchorage, Alaska, USA, May 2010, pp. 4982–4989.

[8] C. Caicedo and M. Zefran, "Performing coverage on nonconvex domains," in Proceedings of the 2008 IEEE Multi-Conference on Systems and Control, 2008, pp. 1019–1024.

[9] A. Arsie and E. Frazzoli, "Efficient routing of multiple vehicles with no explicit communications," International Journal of Robust and Nonlinear Control, vol. 18, no. 2, pp. 154–164, January 2007.

[10] J. Le Ny and G. J. Pappas, "Sensor-based robot deployment algorithms," in Proceedings of the IEEE Conference on Decision and Control. IEEE, 2010, pp. 5486–5492.

[11] ——, "Adaptive algorithms for coverage control and space partitioning in mobile robotic networks," University of Pennsylvania, ESE, Tech. Rep., 2010.

[12] M. Schwager, D. Rus, and J. J. Slotine, "Decentralized, adaptive coverage control for networked robots," International Journal of Robotics Research, vol. 28, no. 3, pp. 357–375, March 2009.

[13] P. Ioannou and P. V. Kokotovic, "Instability analysis and improvement of robustness of adaptive control," Automatica, vol. 20, no. 5, pp. 583–594, 1984.

[14] B. Peterson and K. Narendra, "Bounded error adaptive control," IEEE Transactions on Automatic Control, vol. 27, no. 6, pp. 1161–1168, December 1982.

[15] C. Samson, "Stability analysis of adaptively controlled systems subject to bounded disturbances," Automatica, vol. 19, no. 1, pp. 81–86, 1983.

[16] K. Narendra and A. Annaswamy, "A new adaptive law for robust adaptation without persistent excitation," IEEE Transactions on Automatic Control, vol. 32, no. 2, pp. 134–145, Feb. 1987.

[17] K. Tsakalis, "Robustness of model reference adaptive controllers: an input-output approach," IEEE Transactions on Automatic Control, vol. 37, no. 5, pp. 556–565, May 1992.

[18] H. Choset, "Coverage for robotics—A survey of recent results," Annals of Mathematics and Artificial Intelligence, vol. 31, pp. 113–126, 2001.

[19] D. T. Latimer IV, S. Srinivasa, V. Shue, S. S. and H. Choset, and A. Hurst, "Towards sensor based coverage with robot teams," in Proceedings of the IEEE International Conference on Robotics and Automation, vol. 1, May 2002, pp. 961–967.

[20] Z. J. Butler and D. Rus, "Controlling mobile sensors for monitoring events with coverage constraints," in Proceedings of the IEEE International Conference of Robotics and Automation, New Orleans, LA, April 2004, pp. 1563–1573.

[21] P. Ogren, E. Fiorelli, and N. E. Leonard, "Cooperative control of mobile sensor networks: Adaptive gradient climbing in a distributed environment," IEEE Transactions on Automatic Control, vol. 49, no. 8, pp. 1292–1302, August 2004.

[22] K. M. Lynch, I. B. Schwartz, P. Yang, and R. A. Freeman, "Decentralized environmental modeling by mobile sensor networks," IEEE Transactions on Robotics, vol. 24, no. 3, pp. 710–724, June 2008.


[23] J. Choi, S. Oh, and R. Horowitz, "Distributed learning and cooperative control for multi-agent systems," Automatica, vol. 45, no. 12, pp. 2802–2814, 2009.

[24] M. Jadaliha, J. Lee, and J. Choi, "Adaptive control of multiagent systems for finding peaks of uncertain static fields," ASME Journal of Dynamic Systems, Measurement, and Control, vol. 134, no. 5, pp. 051007–051007-8, 2012.

[25] N. Atanasov, J. Le Ny, and G. J. Pappas, "Distributed algorithms for stochastic source seeking with mobile robot networks," ASME Journal of Dynamic Systems, Measurement, and Control, vol. 137, no. 3, pp. 031004–031004-9, 2014.

[26] B. Grocholsky, A. Makarenko, and H. Durrant-Whyte, "Information-theoretic control of multiple sensor platforms," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 03), vol. 1, Taipei, Taiwan, September 2003, pp. 1521–1526.

[27] B. Grocholsky, "Information-theoretic control of multiple sensor platforms," Ph.D. dissertation, University of Sydney, 2002.

[28] G. M. Mathews, "Asynchronous decision making for decentralised autonomous systems," Ph.D., University of Sydney, March 2008.

[29] M. Schwager, P. Dames, D. Rus, and V. Kumar, "A multi-robot control policy for information gathering in the presence of unknown hazards," in Proceedings of the International Symposium on Robotics Research (ISRR 11), Flagstaff, AZ, USA, August 2011.

[30] K. S. Narendra and A. M. Annaswamy, Stable Adaptive Systems. Englewood Cliffs, NJ: Prentice-Hall, 1989.

[31] S. S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence, and Robustness. Upper Saddle River, NJ: Prentice-Hall, Inc., 1989.

[32] J. J. E. Slotine and W. Li, Applied Nonlinear Control. Upper Saddle River, NJ: Prentice-Hall, 1991.

[33] R. Sanner and J. Slotine, "Gaussian networks for direct adaptive control," IEEE Transactions on Neural Networks, vol. 3, no. 6, pp. 837–863, 1992.

[34] M. Schwager, M. P. Vitus, D. Rus, and C. J. Tomlin, "Robust adaptive coverage for robotic sensor networks," in Proceedings of the International Symposium on Robotics Research (ISRR 11), Flagstaff, AZ, USA, August 2011.

[35] M. Schwager, D. Rus, and J. J. Slotine, "Unifying geometric, probabilistic, and potential field approaches to multi-robot deployment," International Journal of Robotics Research, vol. 30, no. 3, pp. 371–383, March 2011.

[36] S. P. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.

[37] L. C. A. Pimenta, M. Schwager, Q. Lindsey, V. Kumar, D. Rus, R. C. Mesquita, and G. A. S. Pereira, "Simultaneous coverage and tracking (SCAT) of moving targets with robot networks," in Proceedings of the Eighth International Workshop on the Algorithmic Foundations of Robotics (WAFR 08), Guanajuato, Mexico, December 2008.

[38] R. Duda, P. Hart, and D. Stork, Pattern Classification. New York: Wiley and Sons, 2001.

[39] I. Barbalat, "Systemes d'equations differentielles d'oscillations non lineaires," Revue de Mathematiques Pures et Appliquees, vol. 4, pp. 267–270, 1959.

[40] P. A. Ioannou and J. Sun, Robust Adaptive Control. Englewood Cliffs, NJ: Prentice-Hall, 1996.

[41] M. Schwager, J. McLurkin, J. J. E. Slotine, and D. Rus, "From theory to practice: Distributed coverage control experiments with groups of robots," in Experimental Robotics: The Eleventh International Symposium, vol. 54. Springer-Verlag, 2008, pp. 127–136.

[42] C. Micchelli, "Interpolation of scattered data: distance matrices and conditionally positive definite functions," Constructive Approximation, vol. 2, no. 1, pp. 11–22, 1986.

[43] T. Hofmann, B. Scholkopf, and A. Smola, "Kernel methods in machine learning," The Annals of Statistics, pp. 1171–1220, 2008.

Mac Schwager is an assistant professor with the Aeronautics and Astronautics Department at Stanford University. He obtained his BS degree in 2000 from Stanford University, his MS degree from MIT in 2005, and his PhD degree from MIT in 2009. He was a postdoctoral researcher working jointly in the GRASP lab at the University of Pennsylvania and CSAIL at MIT from 2010 to 2012, and was an assistant professor at Boston University from 2012 to 2015. His research interests are in distributed algorithms for control, perception, and learning in groups of robots and animals. He received the NSF CAREER award in 2014.

Michael P. Vitus is a Software Engineer at Google developing visual inertial navigation algorithms for Project Tango. He received his Ph.D. from Stanford University in 2012, his M.S. from Stanford University in 2006, both in Aeronautics and Astronautics, and his B.S. in 2004 from Rensselaer Polytechnic Institute in Mechanical and Aeronautical Engineering. His research interests include simultaneous localization and mapping, sensor fusion, deterministic and stochastic motion planning, unmanned aerial vehicles, and optimization.

Samantha Powers received a B.S. in mechanical engineering from the Massachusetts Institute of Technology, Cambridge, MA, USA, in 2012. She was a software engineer with Microsoft from 2013 until 2015, and is now pursuing interests in 3D content creation and human computer interfaces.

Daniela Rus is the Andrew (1956) and Erna Viterbi Professor of Electrical Engineering and Computer Science and Director of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. Rus's research interests are in robotics, mobile computing, and data science. Rus is a Class of 2002 MacArthur Fellow, a fellow of ACM, AAAI and IEEE, and a member of the National Academy of Engineering. She earned her PhD in Computer Science from Cornell University. Prior to joining MIT, Rus was a professor in the Computer Science Department at Dartmouth College.

1028 IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, VOL. 23, NO. 3, MAY 2015

[14] G. C. Chasparis and J. S. Shamma, “Linear-programming-basedmulti-vehicle path planning with adversaries,” in Proc. IEEE Amer.Control Conf., Jun. 2005, pp. 1072–1077.

[15] R. Isaacs, Differential Games. New York, NY, USA: Wiley, 1967.[16] T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory,

2nd ed. Philadelphia, PA, USA: SIAM, 1999.[17] I. M. Mitchell, A. M. Bayen, and C. J. Tomlin, “A time-dependent

Hamilton–Jacobi formulation of reachable sets for continuous dynamicgames,” IEEE Trans. Autom. Control, vol. 50, no. 7, pp. 947–957,Jul. 2005.

[18] M. Falcone, “Numerical methods for differential games based onpartial differential equations,” Int. Game Theory Rev., vol. 8, no. 2,pp. 231–272, 2006.

[19] J. Ding, J. Sprinkle, S. S. Sastry, and C. J. Tomlin, “Reacha-bility calculations for automated aerial refueling,” in Proc. 47thIEEE Int. Conf. Decision Control, Cancun, Mexico, Dec. 2008,pp. 3706–3712.

[20] L. C. Evans and P. E. Souganidis, “Differential games and representationformulas for solutions of Hamilton–Jacobi–Isaacs equations,” IndianaUniv. Math. J., vol. 33, no. 5, pp. 773–797, 1984.

[21] M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and ViscositySolutions of Hamilton–Jacobi–Bellman Equations (Systems Control:Foundations Applications). Boston, MA, USA: Birkhäuser, 1997.

[22] R. J. Elliott and N. J. Kalton, The Existence of Value in DifferentialGames. Providence, RI, USA: AMS, 1972.

[23] J. A. Sethian, Level Set Methods and Fast Marching Methods (Cam-bridge Monographs on Applied and Computational Mathematics), vol. 3,2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 1999.

[24] S. Osher and R. Fedkiw, Level Set Methods and Dynamic ImplicitSurfaces. New York, NY, USA: Springer-Verlag, 2002.

Haomiao Huang received the B.Sc. degree in electrical engineering from the California Institute of Technology, Pasadena, CA, USA, in 2005, and the Ph.D. degree in aeronautics and astronautics from Stanford University, Stanford, CA, USA, in 2012.

He is currently an Entrepreneur and a Technologist in the San Francisco Bay Area. He is interested in ubiquitous computing systems that are embedded into and interact with the physical environment, including sensor networks and distributed robotics. His current research interests include applications of these technologies in unmanned aerial vehicle control, multiagent games, and guidance for human agents.

Jerry Ding received the B.Sc. degree in electrical engineering from the University of Wisconsin-Madison, Madison, WI, USA, in 2006, and the Ph.D. degree in electrical engineering and computer sciences from the University of California at Berkeley (UC Berkeley), Berkeley, CA, USA, in 2012.

He was a Post-Doctoral Scholar with the Department of Electrical Engineering and Computer Sciences, UC Berkeley, from 2012 to 2013. He is currently a Research Engineer with the United Technologies Research Center, East Hartford, CT, USA. He is interested in the development of model-based design and analysis methods for safety-critical systems, with a focus toward control and verification techniques for nonlinear, hybrid, and stochastic systems. His current research interests include applications of these techniques to problems in autonomous vehicle control, air traffic management, and multiagent robot motion planning.

Wei Zhang received the B.E. degree in automatic control from the University of Science and Technology of China, Hefei, China, in 2003, and the M.S. degree in statistics and the Ph.D. degree in electrical engineering from Purdue University, West Lafayette, IN, USA, in 2009.

He was a Post-Doctoral Researcher with the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, CA, USA, from 2010 to 2011. He is currently an Assistant Professor with the Department of Electrical and Computer Engineering, Ohio State University, Columbus, OH, USA. His current research interests include control and estimation of hybrid and multiagent systems, and their applications in various engineering fields, in particular, power systems, robotics, and intelligent transportation.

Claire J. Tomlin received the B.A.Sc. degree in electrical engineering from the University of Waterloo, Waterloo, ON, Canada, the M.Sc. degree in electrical engineering from Imperial College London, London, U.K., and the Ph.D. degree in electrical engineering and computer sciences from the University of California at Berkeley, Berkeley, CA, USA.

She was an Assistant, an Associate, and a Full Professor with the Department of Aeronautics and Astronautics, Stanford University, Stanford, CA, USA, from 1998 to 2007. She has held visiting researcher positions with the NASA Ames Research Center, Mountain View, CA, USA, and Honeywell International, Inc., Morristown, NJ, USA. She is currently the Charles A. Desoer Professor with the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley. Her current research interests include hybrid control systems, with applications to air traffic systems, unmanned aerial vehicles, and systems biology.

Dr. Tomlin was a recipient of the MacArthur Fellowship in 2006, the Okawa Foundation Research Grant in 2006, and the Eckman Award from the American Automatic Control Council in 2003.
