Finite-time convergent gradient flows with applications to network consensus


Automatica 42 (2006) 1993–2000
www.elsevier.com/locate/automatica

Brief paper

Finite-time convergent gradient flows with applications to network consensus

Jorge Cortés
Department of Applied Mathematics and Statistics, Baskin School of Engineering, University of California, Santa Cruz, CA 95064, USA

Received 1 September 2005; received in revised form 12 April 2006; accepted 26 June 2006
Available online 28 August 2006

Abstract

This paper introduces the normalized and signed gradient dynamical systems associated with a differentiable function. Extending recent results on nonsmooth stability analysis, we characterize their asymptotic convergence properties and identify conditions that guarantee finite-time convergence. We discuss the application of the results to consensus problems in multi-agent systems and show how the proposed nonsmooth gradient flows achieve consensus in finite time.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Gradient flows; Nonsmooth analysis; Finite-time convergence; Network consensus; Multi-agent systems

1. Introduction

Consider the gradient flow ẋ = −grad(f)(x) of a differentiable function f : Rd → R, d ∈ N. It is well known (see e.g. Hirsch & Smale, 1974) that the strict minima of f are stable equilibria of this system, and that, if the level sets of f are bounded, then the trajectories converge asymptotically to the set of critical points of f. Gradient dynamical systems are employed in a wide range of applications, including optimization, parallel computing and motion planning. In robotics, potential field methods are used to autonomously navigate a robot in a cluttered environment. Gradient algorithms enjoy many features: they are naturally robust to perturbations and measurement errors, amenable to asynchronous implementations, and admit efficient numerical approximations.

In this note, we provide an answer to the following question: how could one modify the gradient vector field above so that the trajectories converge to the critical points of the function in finite time, as opposed to over an infinite-time horizon? There are a number of settings where finite-time convergence

This work was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor Andrew R. Teel under the direction of Editor H.K. Khalil. Research supported by NSF CAREER award ECS-0546871.

E-mail address: [email protected].

0005-1098/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.automatica.2006.06.015

is a desirable property. We study this problem with the aim of designing gradient coordination algorithms for multi-agent systems that achieve the desired task in finite time.

Our answer to the question above is the flows

ẋ = −grad(f)(x)/‖grad(f)(x)‖2,    ẋ = −sgn(grad(f)(x)),

where ‖·‖2 denotes the Euclidean norm and sgn(x) = (sgn(x1), …, sgn(xd)). Using tools from nonsmooth stability analysis, we show that, under some assumptions on f, both systems are guaranteed to reach the set of critical points in finite time.
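As a quick numerical illustration (ours, not the paper's), both flows can be integrated with a forward-Euler scheme for the quadratic f(x) = ½‖x‖2², whose gradient is x; for this f, the normalized flow reaches the origin at time ‖x0‖2. The step size, tolerance, and stopping rule below are ad hoc choices:

```python
import numpy as np

def grad(x):
    # Gradient of f(x) = 0.5 * ||x||_2^2.
    return x

def hitting_time(field, x0, dt=1e-3, t_max=10.0, tol=1e-2):
    """Forward-Euler integration; returns the first time ||grad f(x)||_2 <= tol."""
    x, t = np.array(x0, dtype=float), 0.0
    while t < t_max:
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            return t
        x += dt * field(g)
        t += dt
    return None

normalized = lambda g: -g / np.linalg.norm(g)   # flow (6a)
signed = lambda g: -np.sign(g)                  # flow (6b)

x0 = np.array([3.0, -4.0])                      # ||grad f(x0)||_2 = 5
t_norm = hitting_time(normalized, x0)
t_sign = hitting_time(signed, x0)
print(t_norm, t_sign)  # both finite; t_norm is close to ||x0||_2 = 5
```

By contrast, the smooth flow ẋ = −grad(f)(x) would need time growing like log(‖x0‖2/tol) to meet the same tolerance, which diverges as tol → 0.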

Literature review: Guidelines on how to design dynamical systems for optimization purposes, with a special emphasis on gradient systems, are described in Helmke and Moore (1994). The book by Bertsekas and Tsitsiklis (1997) thoroughly discusses gradient descent flows in distributed computation in settings with fixed communication topologies. Nonsmooth analysis studies the notion and computational properties of the generalized gradient (Clarke, 1983). Tools for establishing stability and convergence properties of nonsmooth dynamical systems via Lyapunov functions are presented in Bacciotti and Ceragioli (1999), Clarke (2004), Clarke, Ledyaev, Rifford, and Stern (2000) and Shevitz and Paden (1994) (see also references therein). Finite-time discontinuous feedback stabilizers


for a class of planar systems are proposed in Ryan (1991). Finite-time stability of continuous autonomous systems is rigorously studied in Bhat and Bernstein (2000). Coron (1995) developed finite-time stabilization strategies based on time-varying feedback. Previous work on motion coordination of multi-agent systems has proposed cooperative algorithms based on gradient flows to achieve tasks such as cohesiveness (Gazi & Passino, 2003; Ögren, Fiorelli, & Leonard, 2004; Tanner, Jadbabaie, & Pappas, 2003), deployment (Cortés & Bullo, 2005; Cortés, Martínez, & Bullo, 2005) and consensus (Olfati-Saber, Fax, & Murray, submitted; Olfati-Saber & Murray, 2004; Ren, Beard, & Atkins, 2005). The distributed algorithms in these works achieve the desired coordination task over an infinite-time horizon.

Statement of contributions: In this paper we introduce the normalized and signed gradient descent flows associated with a differentiable function. We characterize their convergence properties via nonsmooth stability analysis. We also identify general conditions under which these flows reach the set of critical points of the function in finite time. To do this, we extend recent results on the convergence properties of general nonsmooth dynamical systems via locally Lipschitz and regular Lyapunov functions (e.g., Bacciotti & Ceragioli, 1999; Cortés & Bullo, 2005; Shevitz & Paden, 1994). In particular, we develop two novel results involving second-order information on the evolution of the Lyapunov function to establish finite-time convergence. These results are not restricted to gradient flows, and can indeed be used in other setups with discontinuous vector fields and locally Lipschitz functions. We discuss in detail the application of these results to network consensus problems. We propose two coordination algorithms based on the Laplacian of the network graph that achieve consensus in finite time. The normalized gradient descent of the Laplacian potential is not distributed over the network graph and achieves average-consensus, i.e., consensus on the average of the initial agents' states. The signed gradient descent of the Laplacian potential is distributed over the network graph and achieves average–max–min-consensus, i.e., consensus on the average of the maximum and the minimum values of the initial agents' states. We also consider networks with switching connected network topologies.

Organization: Section 2 introduces differential equations with discontinuous right-hand sides and presents various nonsmooth tools for stability analysis. Section 3 introduces the normalized and signed versions of the gradient descent flow of a function and characterizes their convergence properties. Conditions are given under which these flows converge in finite time. Section 4 discusses the application of the results to network consensus problems. Section 5 presents our conclusions.

Notation. The set of strictly positive natural (resp. real) numbers is denoted by N (resp. R+). For d ∈ N, e1, …, ed is the standard orthonormal basis of Rd. For x ∈ Rd, let sgn(x) = (sgn(x1), …, sgn(xd)) ∈ Rd, let x′ be the transpose of x, and let ‖x‖1 and ‖x‖2 be the 1-norm and the 2-norm of x, resp. For x, y ∈ Rd, let x · y be the inner product. Let 1 = (1, …, 1)′ ∈ Rd. For S ⊂ Rd, let co(S) denote its convex closure. Define also diag((Rd)n) = {(x, …, x) ∈ (Rd)n | x ∈ Rd} for n ∈ N. Given a positive semidefinite d × d matrix A, let H0(A) ⊂ Rd denote the eigenspace corresponding to the eigenvalue 0 (if A is positive definite, set H0(A) = {0}). We denote by πH0(A) : Rd → H0(A) the orthogonal projection onto H0(A). Let λ2(A) and λd(A) be the smallest nonzero and the greatest eigenvalue of A, resp., i.e., λ2(A) = min{λ | λ > 0 and λ eigenvalue of A} and λd(A) = max{λ | λ eigenvalue of A}. It is easy to see that

x′Ax ≥ λ2(A)‖x − πH0(A)(x)‖2², x ∈ Rd. (1)
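Inequality (1) can be sanity-checked numerically. The sketch below (our illustration; the matrix is random test data) builds a positive semidefinite A with a one-dimensional kernel, forms the orthogonal projection onto H0(A), and tests (1) on random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random 4x4 positive semidefinite matrix with rank 3 (one-dimensional kernel).
B = rng.standard_normal((4, 3))
A = B @ B.T

eigvals, eigvecs = np.linalg.eigh(A)
null_mask = eigvals < 1e-10
lambda2 = eigvals[~null_mask].min()   # smallest nonzero eigenvalue
V0 = eigvecs[:, null_mask]            # orthonormal basis of H0(A)
proj_H0 = V0 @ V0.T                   # orthogonal projection onto H0(A)

# Check x'Ax >= lambda2(A) * ||x - pi_{H0(A)}(x)||_2^2 on random vectors.
ok = all(
    x @ A @ x >= lambda2 * np.linalg.norm(x - proj_H0 @ x) ** 2 - 1e-9
    for x in rng.standard_normal((100, 4))
)
print(ok)  # True
```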

2. Nonsmooth stability analysis

This section introduces discontinuous differential equations and presents various nonsmooth tools to analyze their stability properties. We present two novel results on the second-order evolution of locally Lipschitz functions and on finite-time convergence.

2.1. Discontinuous differential equations

For differential equations with discontinuous right-hand sides we understand the solutions in terms of differential inclusions, following Filippov (1988). Let F : Rd → 2^(Rd) be a set-valued map. Consider the differential inclusion

ẋ ∈ F(x). (2)

A solution to this equation on an interval [t0, t1] ⊂ R is defined as an absolutely continuous function x : [t0, t1] → Rd such that ẋ(t) ∈ F(x(t)) for almost all t ∈ [t0, t1]. Now, consider the differential equation

ẋ(t) = X(x(t)), (3)

where X : Rd → Rd is measurable and essentially locally bounded (Filippov, 1988). We understand the solution of (3) in the Filippov sense. For each x ∈ Rd, consider the set

K[X](x) = ⋂_{δ>0} ⋂_{μ(S)=0} co{X(Bd(x, δ)\S)}, (4)

where μ denotes the usual Lebesgue measure in Rd. Intuitively, this set is the convexification of the limits of the values of the vector field X around x. A Filippov solution of (3) on an interval [t0, t1] ⊂ R is defined as a solution of the differential inclusion

ẋ ∈ K[X](x). (5)

Since the set-valued map K[X] : Rd → 2^(Rd) is upper semicontinuous with nonempty, compact, convex values and locally bounded, the existence of Filippov solutions of (3) is guaranteed (cf. Filippov, 1988). A maximal solution is a Filippov solution whose domain of existence is maximal, i.e., cannot be extended any further. A set M is weakly invariant (resp. strongly invariant) for (3) if for each x0 ∈ M, M contains a maximal solution (resp. all maximal solutions) of (3).
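As a concrete example, for the scalar field X(x) = sgn(x), definition (4) gives K[sgn](0) = [−1, 1] and K[sgn](x) = {sgn(x)} for x ≠ 0. The sketch below only illustrates this by brute force, sampling values of X on a small ball and taking their hull; it is not a rigorous evaluation of (4):

```python
import numpy as np

def filippov_interval(X, x, delta=1e-3, samples=1000):
    """Approximate K[X](x) for scalar X as the hull of values of X on B(x, delta).
    Excluding a measure-zero set (e.g. {x} itself) does not change the result."""
    pts = x + delta * (2.0 * np.random.rand(samples) - 1.0)
    vals = X(pts[pts != x])
    return float(vals.min()), float(vals.max())

lo, hi = filippov_interval(np.sign, 0.0)
print(lo, hi)    # -1.0 1.0 : K[sgn](0) = [-1, 1]
lo1, hi1 = filippov_interval(np.sign, 1.0)
print(lo1, hi1)  # 1.0 1.0 : K[sgn](1) = {1}
```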

2.2. Stability via nonsmooth Lyapunov functions

Let f : Rd → R be locally Lipschitz and regular (see Clarke, 1983 for detailed definitions). From Rademacher's Theorem


(Clarke, 1983), we know that locally Lipschitz functions are differentiable a.e. Let Ωf ⊂ Rd denote the set of points where f fails to be differentiable. The generalized gradient of f at x ∈ Rd (cf. Clarke, 1983) is defined by

∂f(x) = co{ lim_{i→+∞} df(xi) | xi → x, xi ∉ S ∪ Ωf },

where S can be any set of zero measure. Note that if f is continuously differentiable, then ∂f(x) = {df(x)}.

Given a locally Lipschitz function f, the set-valued Lie derivative of f with respect to X at x (cf. Bacciotti & Ceragioli, 1999; Cortés & Bullo, 2005) is

LXf(x) = {a ∈ R | ∃v ∈ K[X](x) with ζ · v = a for all ζ ∈ ∂f(x)}.

For x ∈ Rd, LXf(x) is a closed and bounded interval in R, possibly empty. If f is continuously differentiable at x, then LXf(x) = {df(x) · v | v ∈ K[X](x)}. If, in addition, X is continuous at x, then LXf(x) reduces to the singleton consisting of the usual Lie derivative of f in the direction of X at x. The next result, from Bacciotti and Ceragioli (1999), shows that this Lie derivative measures the evolution of a function along the Filippov solutions.

Theorem 1. Let x : [t0, t1] → Rd be a Filippov solution of (3). Let f be a locally Lipschitz and regular function. Then t → f(x(t)) is absolutely continuous, d/dt(f(x(t))) exists a.e., and d/dt(f(x(t))) ∈ LXf(x(t)) a.e.

Sometimes, we can also look at second-order information on the evolution of a function along the Filippov solutions. This is what we prove in the next result.

Proposition 2. Let x : [t0, t1] → Rd be a Filippov solution of (3). Let f be a locally Lipschitz and regular function. Assume that LXf : Rd → 2^R is single-valued, i.e., it takes the form LXf : Rd → R, and assume it is Lipschitz and regular. Then d²/dt²(f(x(t))) exists a.e. and d²/dt²(f(x(t))) ∈ LX(LXf)(x(t)) a.e.

Proof. Applying Theorem 1 to f and LXf, resp., we deduce that (i) t → f(x(t)) is absolutely continuous and d/dt(f(x(t))) = LXf(x(t)) a.e., and (ii) t → LXf(x(t)) is absolutely continuous and d/dt(LXf(x(t))) = LX(LXf)(x(t)) a.e. Since t → LXf(x(t)) is continuous, the expression

f(x(t)) = f(x(t0)) + ∫_{t0}^{t} d/ds(f(x(s))) ds = f(x(t0)) + ∫_{t0}^{t} LXf(x(s)) ds,

together with the second fundamental theorem of calculus, implies that t → f(x(t)) is continuously differentiable, and therefore d/dt(f(x(t))) = LXf(x(t)) for all t. Now, using (ii), we conclude that t → d/dt(f(x(t))) is differentiable a.e. and d²/dt²(f(x(t))) ∈ LX(LXf)(x(t)) a.e. □

The following result is a generalization of the LaSalle invariance principle to discontinuous differential equations (3) with nonsmooth Lyapunov functions. The formulation is taken from Bacciotti and Ceragioli (1999), and slightly generalizes that of Shevitz and Paden (1994).

Theorem 3 (LaSalle Invariance Principle). Let f : Rd → R be a locally Lipschitz and regular function. Let x0 ∈ S ⊂ Rd, with S compact and strongly invariant for (3). Assume that either max LXf(x) ≤ 0 or LXf(x) = ∅ for all x ∈ S. Let ZX,f = {x ∈ Rd | 0 ∈ LXf(x)}. Then, any solution x : [t0, +∞) → Rd of (3) starting from x0 converges to the largest weakly invariant set M contained in ZX,f ∩ S. Moreover, if the set M is a finite collection of points, then the limit of all solutions starting at x0 exists and equals one of them.

The following result is taken from Cortés and Bullo (2005).

Proposition 4 (Finite-time convergence with first-order information). Under the same assumptions of Theorem 3, further assume that there exists a neighborhood U of ZX,f ∩ S in S such that max LXf ≤ −ε < 0 a.e. on U\(ZX,f ∩ S). Then, any solution x : [t0, +∞) → Rd of (3) starting at x0 ∈ S reaches ZX,f ∩ S in finite time. Moreover, if U = S, then the convergence time is upper bounded by (f(x0) − min_{x∈S} f(x))/ε.

Oftentimes, first-order information is inconclusive for assessing finite-time convergence. The next result uses second-order information to arrive at a satisfactory answer.

Theorem 5 (Finite-time convergence with second-order information). Under the same assumptions of Theorem 3, further assume that (i) x ∈ Rd → LXf(x) is single-valued, Lipschitz and regular; and (ii) there exists a neighborhood U of ZX,f ∩ S in S such that min LX(LXf) ≥ ε > 0 a.e. on U\(ZX,f ∩ S). Then, any solution x : [t0, +∞) → Rd of (3) starting at x0 ∈ S reaches ZX,f ∩ S in finite time. Moreover, if U = S, then the convergence time is upper bounded by −LXf(x0)/ε.

Proof. Since x ∈ Rd → LXf(x) is single-valued and continuous, the set ZX,f = {x ∈ Rd | LXf(x) = 0} is closed. Let x : [t0, +∞) → Rd be a solution of (3) starting from x0 ∈ S\ZX,f. We reason by contradiction. Assume there does not exist T such that x(T) ∈ ZX,f. By Theorem 3, x(t) → M ⊂ ZX,f ∩ S as t → +∞, and therefore there exists t∗ ≥ t0 with x(t) ∈ U for all t ≥ t∗. Using Proposition 2 (cf. assumption (i)) combined with assumption (ii), we write, for g(t) = d/dt(f(x(t))),

g(t) = g(t∗) + ∫_{t∗}^{t} d/ds(g(s)) ds ≥ g(t∗) + ε(t − t∗), t > t∗.

Since x(t∗) ∉ ZX,f by hypothesis, g(t∗) < 0. Noting that t → g(t) is continuous, we deduce that there exists T ≤ t∗ − g(t∗)/ε such that g(T) = 0, i.e., x(T) ∈ ZX,f, which is a contradiction. The upper bound on the convergence time can be deduced using similar arguments. □


3. Finite-time convergent gradient flows

Here, we formally introduce the normalized and signed gradient flows of a differentiable function, and characterize their convergence properties. We build on Section 2 to identify conditions for finite-time convergence. Let

ẋ = −grad(f)(x)/‖grad(f)(x)‖2, (6a)

ẋ = −sgn(grad(f)(x)). (6b)

Both equations have discontinuous right-hand sides. Hence, we understand their solutions in the Filippov sense. Note that the trajectories of (6a) and of ẋ = −grad(f)(x) describe the same paths.

Lemma 6. The Filippov set-valued maps associated with the discontinuous vector fields (6a) and (6b) are

K[grad(f)/‖grad(f)‖2](x) = co{ lim_{i→+∞} grad(f)(xi)/‖grad(f)(xi)‖2 | xi → x, grad(f)(xi) ≠ 0 },

K[sgn(grad(f))](x) = {v ∈ Rd | vi = sgn(gradi(f)(x)) if gradi(f)(x) ≠ 0, and vi ∈ [−1, 1] if gradi(f)(x) = 0, for i ∈ {1, …, d}}.

Note that K[grad(f)/‖grad(f)‖2](x) = {grad(f)(x)/‖grad(f)(x)‖2} if grad(f)(x) ≠ 0.

The proof of this result follows from the definition (4) of the operator K and the particular forms of (6a) and (6b).

For a differentiable function f, let Critical(f) = {x ∈ Rd | grad(f)(x) = 0} denote the set of its critical points, and let Hess(f)(x) denote its Hessian matrix at x ∈ Rd. The next result establishes the general asymptotic properties of the flows in (6).

Proposition 7. Let f : Rd → R be a differentiable function. Let x0 ∈ S ⊂ Rd, with S compact and strongly invariant for (6a) (resp., for (6b)). Then each solution of Eq. (6a) (resp. Eq. (6b)) starting from x0 asymptotically converges to Critical(f).

Proof. For Eq. (6a), if grad(f)(x) ≠ 0, then

Lgrad(f)/‖grad(f)‖2 f(x) = {(grad(f)(x)/‖grad(f)(x)‖2) · grad(f)(x)} = {‖grad(f)(x)‖2}.

If, instead, grad(f)(x) = 0, then Lgrad(f)/‖grad(f)‖2 f(x) = {0}. Therefore, we deduce

L−grad(f)/‖grad(f)‖2 f(x) = {−‖grad(f)(x)‖2} for all x ∈ Rd.

Consequently, Z−grad(f)/‖grad(f)‖2,f = Critical(f) is closed, and the LaSalle invariance principle (cf. Theorem 3) implies that each solution of (6a) starting from x0 asymptotically converges to the largest weakly invariant set M contained in Critical(f) ∩ S, which is Critical(f) ∩ S itself.

For Eq. (6b), we have a ∈ Lsgn(grad(f)) f(x) if and only if there exists v ∈ K[sgn(grad(f))](x) such that a = v · grad(f)(x). From Lemma 6, we deduce that a = sgn(grad1(f)(x)) · grad1(f)(x) + ⋯ + sgn(gradd(f)(x)) · gradd(f)(x) = ‖grad(f)(x)‖1. Therefore, we deduce

L−sgn(grad(f)) f(x) = {−‖grad(f)(x)‖1}.

Consequently, Z−sgn(grad(f)),f = Critical(f) is closed, and the LaSalle invariance principle implies that each solution of (6b) starting from x0 asymptotically converges to the largest weakly invariant set M contained in Critical(f) ∩ S, which is Critical(f) ∩ S itself. □

Let us now discuss the finite-time convergence properties of the vector fields (6). Note that Proposition 4 cannot be applied. Indeed,

max L−grad(f)/‖grad(f)‖2 f(x) = −‖grad(f)(x)‖2,

max L−sgn(grad(f)) f(x) = −‖grad(f)(x)‖1,

and both inf_{x∈U\(Critical(f)∩S)} ‖grad(f)(x)‖2 = 0 and inf_{x∈U\(Critical(f)∩S)} ‖grad(f)(x)‖1 = 0 for any neighborhood U of Critical(f) ∩ S in S. Hence, the hypotheses of Proposition 4 are not verified by either (6a) or (6b).

Under additional conditions, one can establish stronger convergence properties of (6). We show this next.

Theorem 8. Let f : Rd → R be a second-order differentiable function. Let x0 ∈ S ⊂ Rd, with S compact and strongly invariant for (6a) (resp., for (6b)). Assume there exists a neighborhood V of Critical(f) ∩ S in S where either one of the following conditions holds:

(i) for all x ∈ V, Hess(f)(x) is positive definite; or

(ii) for all x ∈ V\(Critical(f) ∩ S), Hess(f)(x) is positive semidefinite, the multiplicity of the eigenvalue 0 is constant, and grad(f)(x) is orthogonal to the eigenspace of Hess(f)(x) corresponding to 0.

Then each solution of (6a) (resp. (6b)) starting from x0 converges in finite time to a critical point of f. Furthermore, if V = S, then the convergence time of the solutions of (6a) (resp. (6b)) starting from x0 is upper bounded by

(1/ε0)‖grad(f)(x0)‖2 (resp. (1/ε0)‖grad(f)(x0)‖1),

where ε0 = min_{x∈S} λ2(Hess(f)(x)).

Proof. Our strategy is to show that the hypotheses of Theorem 5 are verified by both vector fields. From Proposition 7, we know that each solution of (6a) (resp. (6b)) starting from x0 converges to Critical(f). Let us take an open set U ⊂ S such that Critical(f) ∩ S ⊂ U ⊂ Ū ⊂ V. Since S is compact, Ū is also compact. By continuity, under either assumption (i) or assumption (ii), the function λ2(Hess(f)) : Ū → R,


x → λ2(Hess(f)(x)), reaches its minimum on Ū, i.e., there exists ε0 > 0 such that λ2(Hess(f)(x)) ≥ ε0 for all x ∈ Ū. Moreover, from (1), we have, for all u ∈ Rd,

u′Hess(f)(x)u ≥ λ2(Hess(f)(x))‖u − πH0(Hess(f)(x))(u)‖2². (7)

For (6a), recall from the proof of Proposition 7 that the function x ∈ Rd → Lgrad(f)/‖grad(f)‖2 f(x) = ‖grad(f)(x)‖2 is single-valued, locally Lipschitz and regular, and hypothesis (i) in Theorem 5 is satisfied. Additionally, Z−grad(f)/‖grad(f)‖2,f = Critical(f). Let us take x ∉ Critical(f), and let us compute Lgrad(f)/‖grad(f)‖2(Lgrad(f)/‖grad(f)‖2 f)(x). Noting

∂(‖grad(f)‖2)(x) = {Hess(f)(x)grad(f)(x)/‖grad(f)(x)‖2},

we deduce

Lgrad(f)/‖grad(f)‖2(Lgrad(f)/‖grad(f)‖2 f)(x) = (grad(f)(x)′/‖grad(f)(x)‖2) Hess(f)(x) (grad(f)(x)/‖grad(f)(x)‖2). (8)

Let x ∈ U\(Critical(f) ∩ S). Under either assumption (i) or (ii) in the theorem, πH0(Hess(f)(x))(grad(f)(x)/‖grad(f)(x)‖2) = 0. Then, using (7) in Eq. (8), we conclude

Lgrad(f)/‖grad(f)‖2(Lgrad(f)/‖grad(f)‖2 f)(x) ≥ λ2(Hess(f)(x))‖grad(f)(x)/‖grad(f)(x)‖2‖2² = λ2(Hess(f)(x)) ≥ ε0 > 0,

for x ∈ U\(Critical(f) ∩ S). Hence, hypothesis (ii) in Theorem 5 is also verified, and we deduce that the set Critical(f) is reached in finite time, which in particular implies that the limit of any solution of Eq. (6a) starting from x0 ∈ S exists and is reached in finite time.

For (6b), recall from Proposition 7 that the function x ∈ Rd → Lsgn(grad(f)) f(x) = ‖grad(f)(x)‖1 is single-valued, locally Lipschitz and regular, and hypothesis (i) in Theorem 5 is satisfied. Additionally, Z−sgn(grad(f)),f = Critical(f). Let us take x ∉ Critical(f), and compute Lsgn(grad(f))(Lsgn(grad(f)) f)(x). By definition, a ∈ Lsgn(grad(f))(Lsgn(grad(f)) f)(x) if and only if there exists v ∈ K[sgn(grad(f))](x) such that a = v · ζ for any ζ ∈ ∂(‖grad(f)‖1)(x). Note that

∂(‖grad(f)‖1)(x) = {ζ ∈ Rd | ζ = Hess(f)(x)η for some η ∈ Rd with ηi = sgn(gradi(f)(x)) if gradi(f)(x) ≠ 0, and ηi ∈ [−1, 1] if gradi(f)(x) = 0, for i ∈ {1, …, d}}.

In particular, Hess(f)(x)v ∈ ∂(‖grad(f)‖1)(x). Then a = v′Hess(f)(x)v. Let us now decompose v as v = πH0(x)(v) + (v − πH0(x)(v)), where H0(x) is shorthand for H0(Hess(f)(x)), πH0(x)(v) ∈ H0(x) and v − πH0(x)(v) ∈ H0(x)⊥. Because v ∈ K[sgn(grad(f))](x), we deduce v · grad(f)(x) = ‖grad(f)(x)‖1. Let x ∈ U\(Critical(f) ∩ S). Under either assumption (i) or (ii),

‖grad(f)(x)‖1 = v · grad(f)(x) = (v − πH0(x)(v)) · grad(f)(x) ≤ ‖v − πH0(x)(v)‖2 ‖grad(f)(x)‖2.

Using ‖u‖1 ≥ ‖u‖2 for any u ∈ Rd, we deduce from this equation that ‖v − πH0(x)(v)‖2 ≥ 1. Therefore, using (7),

a = v′Hess(f)(x)v ≥ λ2(Hess(f)(x))‖v − πH0(x)(v)‖2² ≥ λ2(Hess(f)(x)) ≥ ε0 > 0,

for x ∈ U\(Critical(f) ∩ S). Consequently, we get min Lsgn(grad(f))(Lsgn(grad(f)) f) ≥ ε0 > 0 on U\(Critical(f) ∩ S). Hence, hypothesis (ii) in Theorem 5 is also verified, and we deduce that the set Critical(f) is reached in finite time, which in particular implies that the limit of any solution of Eq. (6b) starting from x0 ∈ S exists and is reached in finite time. The upper bounds on the convergence time of the solutions of both flows also follow from Theorem 5. □

Corollary 9. Let f : Rd → R be a second-order differentiable function. Let x0 ∈ S ⊂ Rd, with S compact and strongly invariant for (6a) (resp., for (6b)). Assume that for each x ∈ Critical(f) ∩ S, Hess(f)(x) is positive definite. Then each solution of (6a) (resp. (6b)) starting from x0 converges in finite time to a minimum of f.
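For a quadratic f(x) = ½x′Qx with Q positive definite, Hess(f)(x) = Q for all x, so condition (i) of Theorem 8 holds with V = S, and ε0 = λ2(Q) is simply the smallest eigenvalue of Q. The Euler simulation below (our sketch; the matrix, step size, and tolerance are arbitrary choices) checks that the normalized flow (6a) reaches Critical(f) = {0} within the stated bound ‖grad(f)(x0)‖2/ε0:

```python
import numpy as np

Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])           # positive definite Hessian
eps0 = np.linalg.eigvalsh(Q).min()   # = lambda_2(Hess(f)(x)) for every x

def hit_time_normalized(x0, dt=1e-4, tol=1e-3, t_max=100.0):
    """Forward-Euler integration of (6a) for f(x) = 0.5 x'Qx."""
    x, t = np.array(x0, dtype=float), 0.0
    while t < t_max:
        g = Q @ x
        if np.linalg.norm(g) <= tol:
            return t
        x -= dt * g / np.linalg.norm(g)
        t += dt
    return None

x0 = np.array([4.0, -2.0])
bound = np.linalg.norm(Q @ x0) / eps0   # Theorem 8 bound for (6a)
t_hit = hit_time_normalized(x0)
print(t_hit, bound)  # hitting time is finite and below the bound
```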

4. Applications to network consensus

The results of the preceding sections on nonsmooth gradient dynamical systems can be applied to any multi-agent coordination algorithm whose design involves the gradient of meaningful aggregate objective functions. As an illustration, we discuss here the application to network consensus problems, and defer the treatment of other coordination tasks to future work.

Consider a network of n agents. The state of the ith agent, denoted pi ∈ R, evolves according to the first-order dynamics ṗi(t) = ui. Let G = ({1, …, n}, E) be an undirected graph with n vertices, describing the topology of the network. The graph Laplacian matrix LG associated with G (see, for instance, Olfati-Saber & Murray, 2004) is defined as LG = ΔG − AG, where ΔG is the degree matrix and AG is the adjacency matrix of the graph. When the graph G is clear from the context, we will simply denote it by L. The Laplacian matrix is symmetric, positive semidefinite and has 0 as an eigenvalue with eigenvector 1. More importantly, G is connected if and only if rank(L) = n − 1, i.e., if the eigenvalue 0 has multiplicity one. This is the reason why λ2(L) = min{λ | λ > 0 and λ eigenvalue of L} is termed the algebraic connectivity of G.
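A minimal sketch of this construction (the path graph is an arbitrary example chosen by us, not from the paper):

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = Delta - A of an undirected graph on vertices 1..n."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i - 1, j - 1] = A[j - 1, i - 1] = 1.0
    return np.diag(A.sum(axis=1)) - A       # degree matrix minus adjacency

L = laplacian(4, [(1, 2), (2, 3), (3, 4)])  # path graph on 4 vertices

print(np.allclose(L, L.T))                  # True: symmetric
print(np.allclose(L @ np.ones(4), 0))       # True: 1 is an eigenvector for 0
print(np.linalg.matrix_rank(L))             # 3 = n - 1: the graph is connected
lambda2 = np.linalg.eigvalsh(L)[1]          # algebraic connectivity
print(lambda2)
```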

Two agents pi and pj agree if and only if pi = pj. The Laplacian potential ΦG : Rn → R+ associated with G (see Olfati-Saber & Murray, 2004) quantifies the group disagreement,

ΦG(p1, …, pn) = ½ P′LP = ½ Σ_{(i,j)∈E} (pj − pi)²,


with P = (p1, …, pn)′ ∈ Rn. Clearly, ΦG(p1, …, pn) = 0 if and only if all neighboring nodes in the graph G agree. If G is connected, then all nodes agree and a consensus is reached. Therefore, we want the network to reach the critical points of ΦG. Assume G is connected. The Laplacian potential is smooth, and its gradient is grad(ΦG)(P) = LP. The gradient algorithm

ṗi(t) = −∂ΦG/∂pi = Σ_{j∈NG,i} (pj(t) − pi(t)), (9)

for i ∈ {1, …, n}, is distributed over G, i.e., each agent can implement it with the information provided by its neighbors in the graph G (see Cortés et al., 2005 for a more thorough exposition of spatially distributed algorithms). The algorithm (9) asymptotically converges to the critical points of ΦG, i.e., asymptotically achieves consensus. Actually, since the system is linear, the convergence is exponential with rate lower bounded by λ2(L). Additionally, the fact that 1 · (LP) = 0 implies that Σ_{i=1}^n pi is constant along the solutions. Therefore, each solution of (9) converges to a point of the form (p∗, …, p∗), with p∗ = (1/n)Σ_{i=1}^n pi(0) (this is called average-consensus).
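A quick simulation of (9) illustrates this behavior (the 4-agent cycle graph and initial states below are illustrative, not from the paper):

```python
import numpy as np

# Cycle graph on 4 agents: edges (1,2), (2,3), (3,4), (4,1).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

p = np.array([1.0, 5.0, -2.0, 4.0])
avg = p.mean()                          # = 2.0, preserved since 1'L = 0

dt, T = 0.01, 40.0
for _ in range(int(T / dt)):
    p = p + dt * (-L @ p)               # Euler step of algorithm (9)

print(np.allclose(p, avg, atol=1e-6))   # True: average-consensus
```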

Following (6), consider the discontinuous algorithms

ṗi(t) = Σ_{j∈NG,i} (pj(t) − pi(t)) / ‖LP(t)‖2, (10a)

ṗi(t) = sgn( Σ_{j∈NG,i} (pj(t) − pi(t)) ), (10b)

for i ∈ {1, …, n}. Note that the algorithm (10b) is distributed over G, whereas the algorithm (10a) is not. Before analyzing their convergence properties, we identify a conserved quantity for each of these flows.
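The same toy network (4-agent cycle, illustrative data of ours) shows the finite-time behavior of (10a) and (10b) under Euler discretization; the stopping tolerance is deliberately loose to absorb the chattering of the discretized sign function:

```python
import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A        # cycle graph on 4 agents
p0 = np.array([1.0, 5.0, -2.0, 4.0])

def run(flow, p0, dt=1e-3, T=20.0, tol=0.05):
    """Euler integration of p_dot = flow(L p); stops once ||L p||_2 <= tol."""
    p = p0.copy()
    for k in range(int(T / dt)):
        g = L @ p
        if np.linalg.norm(g) <= tol:
            return p, k * dt          # consensus reached (up to tol)
        p += dt * flow(g)
    return p, None

pa, ta = run(lambda g: -g / np.linalg.norm(g), p0)  # flow (10a)
pb, tb = run(lambda g: -np.sign(g), p0)             # flow (10b)

print(ta, pa)  # finite time; states near mean(p0) = 2.0
print(tb, pb)  # finite time; states near (max + min)/2 = 1.5
```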

Proposition 10. Define g1 : Rn → R, g2 : Rn → R by

g1(p1, …, pn) = Σ_{i=1}^n pi,

g2(p1, …, pn) = max_{i∈{1,…,n}} {pi} + min_{i∈{1,…,n}} {pi}.

Then g1 is constant along the solutions of (10a) and g2 is constant along the solutions of (10b).

Proof. The function g1 is differentiable, with grad(g1)(P) = 1. For any P = (p1, …, pn) ∈ Rn, L−grad(ΦG)/‖grad(ΦG)‖2 g1(P) = {0}. Therefore, from Theorem 1, we conclude that g1 is constant along the solutions of (10a). On the other hand, from Clarke (1983, Proposition 2.3.1), one deduces that g2 is locally Lipschitz, with

∂g2(P) = co{ ej ∈ Rn | j such that pj = min_{i∈{1,…,n}} {pi} } + co{ ek ∈ Rn | k such that pk = max_{i∈{1,…,n}} {pi} }.

Let a ∈ L−sgn(grad(ΦG)) g2(P). By definition, there exists v ∈ K[−sgn(grad(ΦG))](P) with

a = v · ζ for all ζ ∈ ∂g2(P). (11)

If P ∈ diag(Rn), then, for (11) to hold, necessarily v = (0, …, 0). Therefore, a = 0. If P ∉ diag(Rn), there exist j, k ∈ {1, …, n} with pj = min_{i∈{1,…,n}} {pi}, pk = max_{i∈{1,…,n}} {pi} such that

Σ_{i∈NG,j} (pi − pj) > 0,  Σ_{i∈NG,k} (pi − pk) < 0,

and hence, from Lemma 6, vj = 1 and vk = −1. Therefore, we deduce a = v · (ej + ek) = 1 − 1 = 0. Note that L−sgn(grad(ΦG)) g2(P) ≠ ∅ because sgn(grad(ΦG)) · ζ = 0 for all ζ ∈ ∂g2(P), and hence 0 ∈ L−sgn(grad(ΦG)) g2(P). Finally, we conclude L−sgn(grad(ΦG)) g2(P) = {0}, and therefore g2 is constant along the solutions of (10b). □

The following theorem completely characterizes the asymptotic convergence properties of the flows in (10).

Theorem 11. Let G = ({1, …, n}, E) be a connected undirected graph. Then, the flows in (10) achieve consensus in finite time. More specifically, for P0 = ((p1)0, …, (pn)0) ∈ Rn,

(i) the solutions of (10a) starting from P0 converge in finite time to (p∗, …, p∗), with p∗ = (1/n)Σ_{i=1}^n (pi)0 (average-consensus). The convergence time is upper bounded by ‖LP0‖2/λ2(L);

(ii) the solutions of (10b) starting from P0 converge in finite time to (p∗, …, p∗), with p∗ = ½(max_{i∈{1,…,n}} {(pi)0} + min_{i∈{1,…,n}} {(pi)0}) (average–max–min-consensus). The convergence time is upper bounded by ‖LP0‖1/λ2(L).

Proof. Our strategy is to verify the assumptions in Theorem 8. Let G−1(≤G(P0)) = {(p1, . . . , pn) ∈ Rn | G(p1, . . . , pn) ≤ G(P0)}. Clearly, this set is strongly invariant for both flows. Since L is positive semidefinite, G(p1, . . . , pn) ≥ λ2(L)‖P − πH0(L)(P)‖2^2. Then, ‖P − πH0(L)(P)‖2^2 ≤ G(P0)/λ2(L) for P ∈ G−1(≤G(P0)). Consider also the closed set

W(P0) = {P ∈ Rn | mini∈{1,...,n}{(pi)0} ≤ (1/n) P · 1 ≤ maxi∈{1,...,n}{(pi)0}}.

One can see that W(P0) is strongly invariant for (10a) and for (10b). Now, define the set S = W(P0) ∩ G−1(≤G(P0)). From the preceding discussion, we deduce that S is strongly invariant for (10a) and (10b). Clearly, S is closed. Furthermore, using P = πH0(L)(P) + P − πH0(L)(P), and noting πH0(L)(P) = ((P · 1)/n)1,


[Figure] Fig. 1. From left to right, evolution of (9), (10a) and (10b) for 10 agents starting from a randomly generated initial configuration with pi ∈ [−7, 7], i ∈ {1, . . . , 10}. The graph G = ({1, . . . , 10}, E) has edge set E = {(1, 4), (1, 10), (2, 10), (3, 6), (3, 9), (4, 8), (5, 6), (5, 9), (7, 10), (8, 9)}. The algebraic connectivity of G is λ2(L) = 0.12.

we deduce

‖P‖2 ≤ ‖πH0(L)(P)‖2 + ‖P − πH0(L)(P)‖2 ≤ √n max{|mini∈{1,...,n}{(pi)0}|, |maxi∈{1,...,n}{(pi)0}|} + √(G(P0)/λ2(L)),

for P ∈ S. Therefore, S is bounded, and hence compact. Now, Hess(G)(P) = L is positive semidefinite at any P ∈ Rn, with the eigenvalue 0 having multiplicity 1 (not depending on P). Additionally, for P ∉ Critical(G), grad(G)(P) = LP ≠ 0 is orthogonal to span{1}, the eigenspace of L corresponding to the eigenvalue 0. Finally, Theorem 8(ii) with V = S (together with Proposition 10) yields the result. □
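The finite-time claim of Theorem 11(i) can also be probed numerically. The sketch below (our illustration, not the paper's code, with an assumed path graph, step size and stopping threshold) integrates (10a) with forward Euler and checks that the states reach the initial average (1/n)∑ni=1(pi)0 in finite time.

```python
# Illustrative check of Theorem 11(i) (not from the paper): Euler integration of
# the normalized flow (10a) on a 4-node path graph. The states should reach the
# initial average in finite time; we stop once ||LP||_2 drops below a threshold.

def laplacian_times(p, edges):
    """Return L p for the Laplacian L of the undirected graph given by edges."""
    out = [0.0] * len(p)
    for i, j in edges:
        out[i] += p[i] - p[j]
        out[j] += p[j] - p[i]
    return out

edges = [(0, 1), (1, 2), (2, 3)]
p = [0.0, 1.0, 2.0, 5.0]
average = sum(p) / len(p)          # 2.0: the predicted consensus value
h = 1e-4                           # small step to limit chattering near consensus
t_star = None                      # first time ||LP||_2 < 0.01 (near-consensus)
for k in range(200_000):           # horizon t = 20, well past the predicted bound
    lp = laplacian_times(p, edges)
    norm = sum(x * x for x in lp) ** 0.5
    if norm < 0.01:
        t_star = k * h
        break
    p = [pi - h * x / norm for pi, x in zip(p, lp)]

print(t_star)                      # finite: consensus reached before the horizon
print(max(abs(pi - average) for pi in p))
```

The threshold crossing happens well within the horizon, and the final states agree with the average of the initial condition, as the theorem predicts.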

Fig. 1 illustrates the evolution of the differential equations (9), (10a) and (10b). As stated in Theorem 11, the agents' states evolving under (10a) achieve consensus in finite time at (1/n)∑ni=1(pi)0, and the agents' states evolving under (10b) achieve consensus in finite time at (1/2)(maxi∈{1,...,n}{(pi)0} + mini∈{1,...,n}{(pi)0}).

Networks with switching network topologies: For networks

with switching connected topologies, one can derive a similar result to Theorem 11. Consider, following Olfati-Saber and Murray (2004), the next setup. Let Γn be the finite set of connected undirected graphs with vertices {1, . . . , n},

Γn = {G = ({1, . . . , n}, E) | G connected, undirected}.

Let IΓn ⊂ N be an index set associated with the elements of Γn. A switching signal σ is a map σ : R+ → IΓn. For each time t ∈ R+, the switching signal σ establishes the network graph Gσ(t) ∈ Γn employed by the network agents. Now, consider a network subject to the switching network topology defined by σ and executing one of the coordination algorithms introduced above. In other words, consider the switching system

ṗi(t) = −∂Gσ(t)/∂pi = ∑j∈NGσ(t),i (pj(t) − pi(t)), (12)

for i ∈ {1, . . . , n}, and the switching systems

ṗi(t) = ∑j∈NGσ(t),i (pj(t) − pi(t)) / ‖LGσ(t) P(t)‖2, (13a)

ṗi(t) = sgn( ∑j∈NGσ(t),i (pj(t) − pi(t)) ). (13b)
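The behavior of these switching flows can be sketched numerically. The example below (our illustration under assumed parameters, not from the paper) integrates the signed flow (13b) with forward Euler while the topology alternates periodically between two connected graphs on four vertices; by Corollary 12 below, the states should still reach (1/2)(max + min) of the initial condition in finite time.

```python
# Illustrative simulation (not from the paper): Euler integration of the signed
# switching flow (13b), alternating every 50 steps between two connected graphs
# on four vertices. Consensus at (max + min)/2 of the initial states is expected.

def laplacian_times(p, edges):
    """Return L p for the Laplacian L of the undirected graph given by edges."""
    out = [0.0] * len(p)
    for i, j in edges:
        out[i] += p[i] - p[j]
        out[j] += p[j] - p[i]
    return out

def sgn(x):
    return (x > 0) - (x < 0)

path = [(0, 1), (1, 2), (2, 3)]    # connected graph G_1
star = [(0, 1), (0, 2), (0, 3)]    # connected graph G_2
p = [0.0, 1.0, 2.0, 5.0]
target = 0.5 * (max(p) + min(p))   # 2.5: predicted consensus value
h = 1e-3
for k in range(3000):              # horizon t = 3 > (max - min)/2 = 2.5
    edges = path if (k // 50) % 2 == 0 else star  # periodic switching signal
    lp = laplacian_times(p, edges)
    p = [pi - h * sgn(x) for pi, x in zip(p, lp)]

print(max(abs(pi - target) for pi in p))
```

Despite the switching topology, all states end up (to within discretization chatter) at the midpoint of the initial maximum and minimum.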

The switching system (12) asymptotically achieves average-consensus for an arbitrary switching signal σ. Let G∗ ∈ Γn be such that

λ2(LG∗)/λn(LG∗) = minG∈Γn {λ2(LG)/λn(LG)}.

For the systems in (13), we have the next result.

Corollary 12. Let σ : R+ → IΓn be a switching signal. Then, the flow (13a) achieves average-consensus in finite time upper bounded by (λn(LG∗)/λ2(LG∗))‖P(0) − ((1/n)∑ni=1(pi)0)1‖2, and the flow (13b) achieves average–max–min-consensus in finite time equal to (1/2)(maxi∈{1,...,n}{(pi)0} − mini∈{1,...,n}{(pi)0}).

Proof. For the flow (13a), consider the candidate Lyapunov function V1 : Rn → R,

V1(P) = (1/2)‖P − (1/n)∑ni=1 pi 1‖2^2.

The first-order evolution of this function along the network trajectories is determined by L−grad(G)/‖grad(G)‖2 V1(P) = −P′LGP/‖LGP‖2, for each G ∈ Γn, which is single-valued, Lipschitz and regular. Additionally, for any G ∈ Γn,

L−grad(G)/‖grad(G)‖2 V1(P) ≤ −(λ2(LG)/λn(LG)) ‖P − (1/n)∑ni=1 pi 1‖2 ≤ 0.

The application of the LaSalle Invariance Principle ensures that the flow (13a) achieves average-consensus. From the preceding inequality and the definition of V1,

‖P(t) − (1/n)∑ni=1 pi 1‖2 ≤ ‖P(0) − (1/n)∑ni=1 pi 1‖2 − (λ2(LG∗)/λn(LG∗)) t,

which implies the result.


For the flow (13b), consider the candidate Lyapunov function V2 : Rn → R,

V2(P) = ‖P − (1/2)(maxi∈{1,...,n}{pi} + mini∈{1,...,n}{pi})1‖∞ = (1/2)(maxi∈{1,...,n}{pi} − mini∈{1,...,n}{pi}).

This function is locally Lipschitz and regular. Let a ∈ L−sgn(grad(G)) V2(P). Then, there exists v ∈ K[−sgn(grad(G))](P) with a = v · ζ, for all ζ ∈ ∂V2(P). Take P with V2(P) ≠ 0. Let j, k ∈ {1, . . . , n} such that pj = mini∈{1,...,n}{pi}, pk = maxi∈{1,...,n}{pi}. Then (1/2)(ek − ej) ∈ ∂V2(P), vj = 1 and vk = −1. Therefore a = −1. The result follows from Proposition 4. □

Remark 13. Note that in the proof of Corollary 12, we have explicitly computed the convergence time of the flow (10b) to achieve average–max–min-consensus to be (1/2)(maxi∈{1,...,n}{(pi)0} − mini∈{1,...,n}{(pi)0}).
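This exact convergence time is easy to probe numerically. The sketch below (our illustration with an assumed step size, graph and tolerance, not from the paper) integrates (10b) on a 4-node path graph with initial spread max − min = 5, and records the first time the spread falls below a small tolerance; by Remark 13 this should occur near t = (1/2)(5 − 0) = 2.5.

```python
# Illustrative check of Remark 13 (not from the paper): the signed flow (10b)
# should reach consensus in time (1/2)(max - min) of the initial states.
# We integrate with forward Euler and record when the spread max - min first
# drops below a small tolerance.

def laplacian_times(p, edges):
    """Return L p for the Laplacian L of the undirected graph given by edges."""
    out = [0.0] * len(p)
    for i, j in edges:
        out[i] += p[i] - p[j]
        out[j] += p[j] - p[i]
    return out

def sgn(x):
    return (x > 0) - (x < 0)

edges = [(0, 1), (1, 2), (2, 3)]
p = [0.0, 1.0, 2.0, 5.0]
predicted = 0.5 * (max(p) - min(p))   # 2.5, by Remark 13
h = 1e-3
t_star = None
for k in range(4000):                 # horizon t = 4
    if max(p) - min(p) < 0.05:
        t_star = k * h
        break
    lp = laplacian_times(p, edges)
    p = [pi - h * sgn(x) for pi, x in zip(p, lp)]

print(t_star)                         # close to the predicted time 2.5
```

The measured time matches the predicted (1/2)(max − min) up to the tolerance and the discretization error.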

5. Conclusions

We have introduced the normalized and signed versions of the gradient descent flow of a differentiable function. We have characterized the asymptotic convergence properties of these nonsmooth gradient flows, and identified suitable conditions that guarantee that convergence to the critical points is achieved in finite time. In doing so, we have built on two novel nonsmooth analysis results on finite-time convergence and second-order information on the evolution of Lyapunov functions. These results are not restricted to gradient flows, and can indeed be used in other setups involving discontinuous vector fields. We have discussed the application of the results to network consensus problems.

Future work will be devoted to explore (i) the use of the upper bounds on the convergence time of the proposed nonsmooth flows in assessing the time complexity of coordination algorithms; (ii) the application of the results to consensus-based sensor fusion algorithms, and other coordination problems such as formation control, deployment and rendezvous; (iii) the robustness properties of the proposed consensus flows under asynchronism, delays and network topologies that are not connected at all times; and (iv) the identification of other nonsmooth distributed algorithms based on gradient information with similar convergence properties.

References

Bacciotti, A., & Ceragioli, F. (1999). Stability and stabilization of discontinuous systems and nonsmooth Lyapunov functions. ESAIM: Control, Optimisation & Calculus of Variations, 4, 361–376.

Bertsekas, D. P., & Tsitsiklis, J. N. (1997). Parallel and distributed computation: Numerical methods. Belmont, MA: Athena Scientific.

Bhat, S. P., & Bernstein, D. S. (2000). Finite-time stability of continuous autonomous systems. SIAM Journal on Control and Optimization, 38(3), 751–766.

Clarke, F. H. (1983). Optimization and nonsmooth analysis. Canadian Mathematical Society series of monographs and advanced texts. New York: Wiley.

Clarke, F. H. (2004). Lyapunov functions and feedback in nonlinear control. In M. S. de Queiroz, M. Malisoff, & P. Wolenski (Eds.), Optimal control, stabilization and nonsmooth analysis. Lecture notes in control and information sciences (Vol. 301, pp. 267–282). New York: Springer.

Clarke, F. H., Ledyaev, Y., Rifford, L., & Stern, R. (2000). Feedback stabilization and Lyapunov functions. SIAM Journal on Control and Optimization, 39, 25–48.

Coron, J. M. (1995). On the stabilization in finite time of locally controllable systems by means of continuous time-varying feedback laws. SIAM Journal on Control and Optimization, 33, 804–833.

Cortés, J., & Bullo, F. (2005). Coordination and geometric optimization via distributed dynamical systems. SIAM Journal on Control and Optimization, 44(5), 1543–1574.

Cortés, J., Martínez, S., & Bullo, F. (2005). Spatially-distributed coverage optimization and control with limited-range interactions. ESAIM: Control, Optimisation & Calculus of Variations, 11(4), 691–719.

Filippov, A. F. (1988). Differential equations with discontinuous righthand sides. Mathematics and its applications (Vol. 18). Dordrecht, The Netherlands: Kluwer Academic Publishers.

Gazi, V., & Passino, K. M. (2003). Stability analysis of swarms. IEEE Transactions on Automatic Control, 48(4), 692–697.

Helmke, U., & Moore, J. B. (1994). Optimization and dynamical systems. New York: Springer.

Hirsch, W. M., & Smale, S. (1974). Differential equations, dynamical systems and linear algebra. New York: Academic Press.

Ögren, P., Fiorelli, E., & Leonard, N. E. (2004). Cooperative control of mobile sensor networks: Adaptive gradient climbing in a distributed environment. IEEE Transactions on Automatic Control, 49(8), 1292–1302.

Olfati-Saber, R., Fax, J. A., & Murray, R. M. (submitted). Consensus and cooperation in multi-agent networked systems. Proceedings of the IEEE, submitted for publication.

Olfati-Saber, R., & Murray, R. M. (2004). Consensus problems in networks of agents with switching topology and time-delays. IEEE Transactions on Automatic Control, 49(9), 1520–1533.

Ren, W., Beard, R. W., & Atkins, E. M. (2005). A survey of consensus problems in multi-agent coordination. In American control conference (pp. 1859–1864), Portland, OR, June 2005.

Ryan, E. P. (1991). Finite-time stabilization of uncertain nonlinear planar systems. Dynamics and Control, 1, 83–94.

Shevitz, D., & Paden, B. (1994). Lyapunov stability theory of nonsmooth systems. IEEE Transactions on Automatic Control, 39(9), 1910–1914.

Tanner, H., Jadbabaie, A., & Pappas, G. J. (2003). Stable flocking of mobile agents, Part I: Fixed topology. In IEEE conference on decision and control (pp. 2010–2015), Maui, HI, December 2003.

Jorge Cortés received the Licenciatura degree in mathematics from the Universidad de Zaragoza, Zaragoza, Spain, in 1997, and the Ph.D. degree in engineering mathematics from the Universidad Carlos III de Madrid, Madrid, Spain, in 2001. He was a Postdoctoral Research Associate at the Systems, Signals and Control Department, University of Twente, The Netherlands (from January to June 2002), and at the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign (from August 2002 to September 2004). He is currently an Assistant Professor at the Department of Applied Mathematics and Statistics, Baskin School of Engineering, University of California at Santa Cruz. He is the author of the book "Geometric, control and numerical aspects of nonholonomic systems" (New York: Springer Verlag, 2002).