
SIAM J. CONTROL AND OPTIMIZATION
Vol. 27, No. 6, pp. 1260-1278, November 1989

© 1989 Society for Industrial and Applied Mathematics
002

A SEQUENTIAL LINEAR PROGRAMMING ALGORITHM FOR SOLVING MONOTONE VARIATIONAL INEQUALITIES*

PATRICE MARCOTTE† AND JEAN-PIERRE DUSSAULT‡

Abstract. Applied to strongly monotone variational inequalities, Newton's algorithm achieves local quadratic convergence. In this paper it is shown how the basic Newton method can be modified to yield an algorithm whose global convergence can be guaranteed by monitoring the monotone decrease of the "gap function" associated with the variational inequality. Each iteration consists in the solution of a linear program in the space of primal-dual variables and of a linesearch. Convergence does not depend on strong monotonicity. However, under strong monotonicity and geometric stability assumptions, the set of active constraints at the solution is implicitly identified, and quadratic convergence is achieved.

Key words. mathematical programming, variational inequalities, nonlinear complementarity, Newton's method

AMS(MOS) subject classifications. 49D05, 49D10, 49D15, 49D35

0. Introduction. In this paper we consider the variational inequality problem defined on a convex compact polyhedron in $\mathbb{R}^n$. Since this problem can be formulated as a fixed-point problem involving an upper semicontinuous mapping, it can be solved by simplicial or homotopy methods for which there already exists a vast literature (see Zangwill [16], Todd [14], Saigal [13]). For large-scale problems, however, these algorithms tend to become inefficient, both in terms of computer memory and running time requirements. This explains the renewed interest in algorithms closely related to procedures originally devised for iteratively solving systems of nonlinear equations (Ortega and Rheinboldt [10]) such as the Jacobi, Gauss-Seidel, and Newton schemes (see Pang and Chan [11], Josephy [5], Robinson [12]) or projection algorithms (Bertsekas and Gafni [2], Dafermos [4]) where the cost function is approximated, at each iteration, by a simpler, e.g., linear, separable, or symmetric function. Local or global convergence of the latter methods usually hinges on the a priori knowledge of lower bounds for the Lipschitz constant of the cost function, either in a neighborhood of a solution (for local convergence) or uniformly on the feasible domain (for global convergence). These conditions are difficult, while not impossible, to verify in practice.

Our approach is basically different. We choose as a merit function the complementary term (or gap function) associated with the primal-dual formulation of the variational inequality and find its global minimum by application of a first-order minimization algorithm. For monotone cost functions, we show that the algorithm converges globally to an equilibrium solution and possesses the finite termination property if the function is affine. Furthermore, under geometric stability and strong monotonicity assumptions, the algorithm implicitly identifies the set of constraints that are binding at the equilibrium solution, and convergence toward the equilibrium solution is quadratic. Numerical results comparing this method to Newton's method with and without linesearch (Marcotte and Dussault [9]) are provided.

* Received by the editors February 25, 1985; accepted for publication (in revised form) February 24, 1989.

† Département de Mathématiques, Collège Militaire Royal de Saint-Jean, Richelain, Québec, J0J 1R0, Canada. This research was supported by Natural Sciences and Engineering Research Council of Canada grants 5491 and 5789, and Academic Research Program of the Department of National Defense grant FUHBP.

‡ Département de Mathématiques et d'Informatique, Université de Sherbrooke, Boul. Université, Sherbrooke, Québec, J1K 2R1, Canada.


1. Problem formulation. Notation and basic definitions. Let $\Phi = \{x : Bx \le b\}$, where $B$ is an $m \times n$ matrix ($m > n$), represent a nonempty convex compact polyhedron in $\mathbb{R}^n$ and let $F$ be a continuously differentiable function from $\Phi$ into $\mathbb{R}^n$ with Jacobian $F'$. The variational inequality problem (VIP) associated with $F$ and $\Phi$ consists in finding some vector $x^*$ in $\Phi$, called an equilibrium solution, satisfying the variational inequality (VI):

(1) $(x^* - x)^t F(x^*) \le 0$

for all $x$ in $\Phi$. Since an equilibrium solution is a fixed point of the upper semicontinuous mapping defined by $x \mapsto T(x) = \{\arg\max_{y \in \Phi} (x - y)^t F(x)\}$, it follows from Kakutani's theorem [6] and the compactness of $\Phi$ that the set $S$ of equilibria is nonempty.

If the Jacobian $F'(x)$ is symmetric for all $x$ in $\Phi$ then the function $F(x)$ is the gradient of some function $f : \mathbb{R}^n \to \mathbb{R}$, and (1) is the mathematical expression of the first-order necessary conditions corresponding to the optimization problem:

(2) $\min_{x \in \Phi}\; f(x) = \int_0^x F(t)^t\, dt$,

where the line integral is independent of the path of integration and is therefore unambiguously defined.
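For instance, in the affine case $F(x) = Qx + c$ with $Q$ symmetric (a simple special case given here only for illustration), the line integral in (2) evaluates to the familiar quadratic objective
\[
f(x) \;=\; \int_0^x (Qt + c)^t\, dt \;=\; \tfrac{1}{2}\, x^t Q x + c^t x,
\]
so that (2) becomes a quadratic program over $\Phi$, convex whenever $Q$ is positive semidefinite.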

In order that a feasible point $x$ be an equilibrium, it is necessary and sufficient that $x$ be optimal for the linear program

(3) $\min_{y \in \Phi}\; y^t F(x)$.

The optimality conditions for (3) are met by x if and only if we have

$\lambda \ge 0$, $F(x) + B^t\lambda = 0$  (dual feasibility),

(4) $\lambda^t (Bx - b) = 0$  (complementary slackness),

$Bx \le b$  (primal feasibility).

In the following, (4) will be referred to as the complementarity formulation of VIP. If $F'$ is symmetric, (4) corresponds to the Kuhn-Tucker necessary optimality conditions for the optimization problem (2). If the constraint set is not polyhedral, a formulation similar to (4) can be obtained by imposing a suitable constraint qualification condition on the problem. The constraints $Bx \le b$ will be referred to as the structural constraints associated with the variational inequality problem, and the constraints $F(x) + B^t\lambda = 0$, $\lambda \ge 0$ as the nonstructural constraints.
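As a small illustration, the three conditions in (4) can be checked directly for a candidate primal-dual pair; the following sketch (our own, using NumPy and an arbitrary tolerance) is one way to do it.

import numpy as np

def satisfies_complementarity_formulation(F, B, b, x, lam, tol=1e-8):
    """Check the complementarity formulation (4) for a candidate pair (x, lam)."""
    dual_feasible = bool(np.all(lam >= -tol)) and np.linalg.norm(F(x) + B.T @ lam) <= tol
    complementary = abs(lam @ (B @ x - b)) <= tol
    primal_feasible = bool(np.all(B @ x - b <= tol))
    return dual_feasible and complementary and primal_feasible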

DEFINITION 1. The function $F$ is
(i) monotone on $\Phi$ if $(x - y)^t (F(x) - F(y)) \ge 0$ for all $x, y$ in $\Phi$;
(ii) strictly monotone on $\Phi$ if $(x - y)^t (F(x) - F(y)) > 0$ for all $x, y$ in $\Phi$ ($x \ne y$);
(iii) strongly monotone on $\Phi$ if there exists a positive number $K$ such that $(x - y)^t (F(x) - F(y)) \ge K \|x - y\|^2$ for all $x, y$ in $\Phi$.

When $F$ is the gradient of some differentiable function $f$, then the various concepts of monotonicity previously defined correspond, respectively, to convexity, strict convexity, and strong convexity of $f$ on $\Phi$. For differentiable functions, we also have the following characterization (see Auslender [1]):
(i) monotonicity on $\Phi$: $(x - y)^t F'(x)(x - y) \ge 0$ for all $x, y$ in $\Phi$;
(ii) strong monotonicity on $\Phi$: $(x - y)^t F'(x)(x - y) \ge \kappa \|x - y\|^2$ for all $x, y$ in $\Phi$,

for some positive number $\kappa$.
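In particular, for an affine mapping $F(x) = Ax + a$ (the case that reappears in Proposition 3 below), the characterization above bears only on the symmetric part of $A$:
\[
(x - y)^t A (x - y) \;=\; \tfrac{1}{2}\,(x - y)^t (A + A^t)(x - y),
\]
so $F$ is monotone exactly when $A + A^t$ is positive semidefinite, and strongly monotone when $A + A^t$ is positive definite.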


The solution set $S$ of (1) is nonempty, as noted earlier, convex if $F$ is monotone, and a singleton if $F$ is strictly monotone.

DEFINITION 2. The gap function associated with a VIP is defined, for $x$ in $\Phi$, as

$g(x) = \max_{y \in \Phi}\; (x - y)^t F(x)$.

It is clear that a feasible point $x$ is a solution of VIP if and only if it is a global minimizer for the gap function, i.e., $g(x) = 0$. Using this concept, VIP can be formulated as the linearly constrained optimization problem

(5) $\min_{x \in \Phi}\; g(x)$.

Although, in general, neither quasiconvex nor differentiable, it will be shown in Lemma 3 that any stationary point of (5) is an equilibrium solution. In particular, a globally convergent algorithm using the gap function as a merit function has been proposed by Marcotte [7].
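Since evaluating $g$ at a point only requires solving the linear program (3), the computation can be sketched as follows (an illustrative fragment of ours, using SciPy's linprog as the LP solver):

import numpy as np
from scipy.optimize import linprog

def gap(F, B, b, x):
    """Gap function g(x) = max_{y : By <= b} (x - y)^t F(x), via the linear program (3)."""
    c = F(x)                      # minimize y^t F(x) over the polyhedron {y : By <= b}
    res = linprog(c, A_ub=B, b_ub=b, bounds=(None, None), method="highs")
    return float((x - res.x) @ c)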

DEFINITION 3. The dual gap function associated with VIP is defined as

$\bar g(x) = \max_{y \in \Phi}\; (x - y)^t F(y)$.

The dual gap function is convex, but its evaluation requires the solution of a nonconvex (in contrast with linear for the gap function) mathematical program. Under a monotonicity assumption, any global minimizer of the problem $\min_{x \in \Phi} \bar g(x)$ is a solution to VIP. A solution algorithm based on direct minimization of the dual gap function can be found in Nguyen and Dupuis [17].

DEFINITION 4. We say that VIP is geometrically stable if $(y - x^*)^t F(x^*) \le 0$ for any equilibrium solution $x^*$ implies that $y$ lies in the optimal face $T^*$, i.e., the minimal face of $\Phi$ containing the set $S$ of all solutions to VIP.

The above stability condition, especially useful when $S$ is a singleton, ensures that $T^*$ is stable under slight perturbations to the cost function $F$. It is implied by the generalization to VIP of the usual strict complementarity condition:

(6) $\{(Bx^*)_i = b_i \;\Rightarrow\; \lambda^*_i > 0\}$,

where $\lambda^*$ is an optimal dual vector corresponding to $x^*$ in the complementarity formulation (4). If $F$ is strongly monotone, then geometric stability implies the strong regularity condition of Robinson [12]. Also, under geometric stability, there must exist at least one solution of VIP satisfying the strict complementarity condition (6); however it need not be unique, and there might exist optimal primal-dual couples that are not strictly complementary. Figure 1 provides examples where geometric stability holds while strict complementarity is not satisfied. In the first case, the problem is caused by a redundant constraint, while in the second case it is due to the linear dependence of the constraints' gradients at $x^*$.

2. Newton's algorithm. Since Newton's method is central to our local convergence analysis we recall its definition and main properties. Applied to VIP, Newton's method generates a sequence of iterates $\{x^k\}$ where $x^0$ is any vector in $\Phi$ and $x^{k+1}$ ($k \ge 0$) is a solution to the VIP obtained by replacing $F$ by its first-order Taylor expansion around $x^k$, i.e.,

(7) $(x^{k+1} - y)^t \bigl( F(x^k) + F'(x^k)(x^{k+1} - x^k) \bigr) \le 0 \quad \forall y \in \Phi$.

The linearized problem will be denoted LVIP$(x^k)$ and its (nonempty) set of solutions


FIG. 1. Geometric stability does not imply strict complementarity. (a) Redundant constraint. (b) Dependent constraints' gradients.

NEW$(x^k)$. The gap function associated with LVIP$(x^k)$ (the linearized gap function) will be denoted $\mathrm{Lg}(x^k, x)$ and its mathematical expression is

(8) $\mathrm{Lg}(x^k, x) = \max_{y \in \Phi}\; (x - y)^t \bigl( F(x^k) + F'(x^k)(x - x^k) \bigr)$.

In a similar fashion we define the linearized dual gap function $\mathrm{L}\bar g(x^k, x)$:

(9) $\mathrm{L}\bar g(x^k, x) = \max_{y \in \Phi}\; (x - y)^t \bigl( F(x^k) + F'(x^k)(y - x^k) \bigr)$.

When $F$ is strongly monotone and its Jacobian $F'$ is Lipschitzian, it can be shown that Newton's method is locally quadratically convergent. We quote Pang and Chan's [11] version of this result, also obtained by Josephy [5].

THEOREM 1. If the matrix $F'(x^*)$ is positive definite and the function $F'$ is Lipschitz continuous at $x^*$ then there exists a neighborhood $N$ of $x^*$ such that if $x^k \in N$ then the sequence $\{x^k\}$ is well-defined and converges quadratically to $x^*$, i.e., there exists a constant $c$ such that

(10) $\|x^{k+1} - x^*\| \le c\, \|x^k - x^*\|^2 \quad \forall k$ such that $x^k \in N$,

where $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^n$.

The next result shows that Newton's algorithm has the capability of identifying $T^*$. Actually we will prove this result for a broad class of approximation algorithms where, at each iteration, $x^{k+1}$ is defined as a solution to a VI where $F(x)$ is replaced by the function $G(x, x^k)$ parameterized in $x^k$ and such that

(11) (i) G(x, y) is strictly monotone in x;

(12) (ii) G(x, y) is continuous as a function of (x, y);

(13) (iii) $G(x, x) = F(x)$.

Property (i) above ensures that $x^{k+1}$ is unambiguously defined. Property (iii) ensures that if $x^{k+1} = x^k$ then $x^k$ is the solution to the original VIP. In many practical situations, $G$ is chosen as a strongly monotone function with symmetric Jacobian. Popular choices


for $G$ are:

$G_i(x, y) = F_i(y_1, \dots, y_{i-1}, x_i, y_{i+1}, \dots, y_n)$, $\quad i = 1, \dots, n$ (Jacobi iteration),

$G_i(x, y) = F_i(x_1, \dots, x_i, y_{i+1}, \dots, y_n)$, $\quad i = 1, \dots, n$ (Gauss-Seidel iteration),

$G(x, y) = F(y) + F'(y)(x - y)$ (Newton's method),

$G(x, y) = Ax + \rho\, [F(y) - Ay]$ (projection method),

where $\rho > 0$ and $A$ is a symmetric positive definite matrix. Other choices for $G$ may be found in Pang and Chan [11] and Marcotte [8]; a schematic implementation of the general scheme is sketched below.
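The following fragment (our own illustrative code) shows the outer loop of this approximation scheme, instantiated with the projection-method choice of $G$ and $A = I$ (our simplification), in which case the subproblem reduces to the Euclidean projection $x^{k+1} = \mathrm{Proj}_\Phi(x^k - \rho F(x^k))$; the step $\rho$ must be small enough, relative to the monotonicity and Lipschitz constants of $F$, for the iteration to converge.

import numpy as np
from scipy.optimize import minimize

def project_onto_polyhedron(z, B, b):
    """Euclidean projection of z onto {x : Bx <= b}, computed here as a small QP with SLSQP."""
    res = minimize(lambda x: 0.5 * np.sum((x - z) ** 2), x0=z,
                   jac=lambda x: x - z,
                   constraints=[{"type": "ineq", "fun": lambda x: b - B @ x}],
                   method="SLSQP")
    return res.x

def approximation_scheme(F, B, b, x0, rho=0.1, tol=1e-8, max_iter=200):
    """x^{k+1} solves the VI with cost G(., x^k); with G(x, y) = x + rho*(F(y) - y)
    (projection method, A = I) this is x^{k+1} = Proj(x^k - rho*F(x^k))."""
    x = x0
    for _ in range(max_iter):
        x_next = project_onto_polyhedron(x - rho * F(x), B, b)
        if np.linalg.norm(x_next - x) <= tol:   # property (iii): a fixed point solves VIP
            return x_next
        x = x_next
    return x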

PROPOSITION 1. Assume that $F$ is monotone and that geometric stability holds for VIP. Let $x^{k+1}$ be a solution to the VI:

$(x^{k+1} - y)^t\, G(x^{k+1}, x^k) \le 0 \quad$ for all $y \in \Phi$,

where $G$ satisfies (11), (12), (13). Then, for each optimal solution $x^*$ of VIP there exists a neighborhood $V$ of $x^*$ such that if $x^k \in V$ then $x^{k+1} \in T^*$.

Proof. Assume that the result does not hold. Then there exists an extreme point $u$ of $\Phi$ outside $T^*$ and a subsequence $\{x^k\}_{k \in I}$ converging to some $x^*$ such that

$(x^{k+1} - u)^t\, G(x^{k+1}, x^k) \ge 0 \quad$ for all $k \in I$.

Taking the limit as $k \to \infty$ ($k \in I$) we obtain

$(u - x^*)^t F(x^*) = (u - x^*)^t G(x^*, x^*) \le 0$,

implying, by geometric stability, that $u \in T^*$, a contradiction. $\square$

3. A linear approximation algorithm. In this section we present a model algorithm for solving VIP based on its complementarity formulation (4) that proceeds by successive linear approximations of both the objective and the nonstructural, usually nonlinear, constraints. Throughout this section the function $F$ will be assumed monotone with Lipschitz continuous Jacobian $F'$.

Any solution to (4) is clearly a global minimizer for the following (usually) nonconvex, nonlinearly constrained mathematical program:

(14) $\min_{x, \lambda}\; h(x, \lambda) := \lambda^t (b - Bx) = x^t F(x) + b^t\lambda$
     subject to $F(x) + B^t\lambda = 0$, $Bx \le b$, $\lambda \ge 0$.

The following lemma relates the objective in (14) to the gap function.

LEMMA 1. We have

(15) $g(x) = \min_{\lambda}\; h(x, \lambda)$ subject to $F(x) + B^t\lambda = 0$, $\lambda \ge 0$.

Proof.

$g(x) = \max_{y \in \Phi}\; (x - y)^t F(x)$

$\quad\; = x^t F(x) - \min_{By \le b}\; y^t F(x)$

$\quad\; = x^t F(x) - \max_{\mu \le 0}\; \{\, b^t\mu : B^t\mu = F(x) \,\}$


by linear programming duality theory. Hence

$g(x) = x^t F(x) + \min_{\lambda \ge 0}\; \{\, b^t\lambda : F(x) + B^t\lambda = 0 \,\}$

after setting $\lambda = -\mu$, and the result follows if we replace $F(x)$ by the equivalent term $-B^t\lambda$. $\square$

The next lemma, basic to our global convergence analysis, states that any stationary point of the mathematical program (14) is actually an equilibrium solution to VIP and justifies the use of an algorithm based on identifying points satisfying first-order conditions of (14). The proof does not rely on any sort of constraint qualification for the nonlinearly constrained problem (14).

LEMMA 2. Let $(\bar x, \bar\lambda)$ be a vector satisfying the first-order necessary optimality conditions for (14). Then $\bar x$ is a solution to VIP.

Proof. It suffices to show that $h(\bar x, \bar\lambda) = 0$. Assume that $h(\bar x, \bar\lambda) > 0$. Without loss of generality we also assume that $h(\bar x, \bar\lambda) = g(\bar x)$; otherwise, $\bar\lambda$ would not be optimal for the linear program

(16) $\min_{\lambda}\; h(\bar x, \lambda)$ subject to $F(\bar x) + B^t\lambda = 0$, $\lambda \ge 0$,

and an optimal $\lambda$-solution to (16) would constitute, together with $\bar x$, an obvious descent direction for $h$ at $(\bar x, \bar\lambda)$.

Consider the linearized problem LVIP$(\bar x)$ with its gap function $\mathrm{Lg}(\bar x, \cdot)$ and complementarity formulation:

(17) $\min_{x, \lambda}\; x^t \bigl[ F(\bar x) + F'(\bar x)(x - \bar x) \bigr] + b^t\lambda$
     subject to $F(\bar x) + F'(\bar x)(x - \bar x) + B^t\lambda = 0$, $Bx \le b$, $\lambda \ge 0$.

Problem (17) constitutes a positive semidefinite quadratic program whose optimal solution's primal vector corresponds to a (not necessarily unique) Newton direction. Consider a Frank-Wolfe direction $d = (\bar y - \bar x,\, \hat\lambda - \bar\lambda)$ for (17) at the point $(\bar x, \bar\lambda)$. Direction $d$ is a feasible descent direction for the linearized gap function Lg at $\bar x$. Since $\nabla h(\bar x, \bar\lambda)$ is identical with the gradient of the objective of (17) at $(\bar x, \bar\lambda)$, and so are the directional derivatives of Lg and $g$, it follows that $\bar y - \bar x$ is also a feasible descent direction for $g$ at $\bar x$, contradicting the stationarity of $(\bar x, \bar\lambda)$. $\square$

We are now in a position to give a precise statement of our algorithm.

ALGORITHM N.
Initialization. Let $x^0$ be any vector in $\Phi$ and

$\lambda^0 \in \arg\min\; \{\, b^t\lambda : F(x^0) + B^t\lambda = 0,\; \lambda \ge 0 \,\}$,

and set $k \leftarrow 0$.
while convergence criterion not met
do 1) Find descent direction $d$.

Page 7: A SEQUENTIAL INEQUALITIES*marcotte/ARTIPS/1989_SIAM_Control.pdf · 2009. 3. 17. · SIAM J. CONTROLANDOPTIMIZATION Vol. 27, No. 6, pp. 1260-1278, November1989 1989 Society for Industrial

1266 P. MARCOTTE AND J.-P. DUSSAULT

Let $(d_x(x^k), d_\lambda(x^k))$ be an extremal solution to the linear program

(18) $\min_{x \in \Phi,\, \lambda \ge 0}\; x^t \bigl( F(x^k) + F'^t(x^k)\, x^k \bigr) + b^t\lambda$
     subject to $F(x^k) + F'(x^k)(x - x^k) + B^t\lambda = 0$.

Set $d \leftarrow (d_x(x^k) - x^k,\; d_\lambda(x^k) - \lambda^k)$.
2) Perform arc search on the gap function.

(19) if $g(d_x(x^k)) \le \tfrac{1}{2}\, g(x^k)$ then $\bar\theta \leftarrow 1$
     else $\bar\theta \leftarrow \arg\min_{\theta \in [0, 1]}\; g\bigl[ x^k + \theta\,(d_x(x^k) - x^k) \bigr]$.

3) Update.

$x^{k+1} \leftarrow x^k + \bar\theta\,(d_x(x^k) - x^k)$,
$\lambda^{k+1} \in \arg\min\; \{\, b^t\lambda : F(x^{k+1}) + B^t\lambda = 0,\; \lambda \ge 0 \,\}$,
$k \leftarrow k + 1$.
endwhile.

Some comments are in order:

(1) At step 2) of Algorithm N, the minimization, with respect to the primal vector $x$, of the nondifferentiable objective $g$ could be seen as a search along an arc in the space of primal-dual variables $(x, \lambda)$. Since dual vectors $\lambda$ have to be computed repeatedly, this operation can be carried out efficiently using reoptimization techniques of linear programming.

(2) It is not required, or even advisable, that the arc search be carried out exactly. For instance, the Armijo-Goldstein stepsize rule, or any rule guaranteeing a "sufficient" decrease of the objective along the search direction, could be implemented.

(3) For affine functions $F(x) = Ax + a$, Algorithm N reduces to the standard Frank-Wolfe procedure for solving quadratic programming problems, as then the nonstructural constraints become linear.
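To make the structure of the iteration concrete, here is a minimal sketch of Algorithm N (our own illustrative code, with SciPy's linprog as the LP solver; the dual update is folded into the gap evaluations, and the exact arc search is replaced by a coarse grid search, as comment (2) permits):

import numpy as np
from scipy.optimize import linprog

def gap(F, B, b, x):
    """g(x) = max_{y : By <= b} (x - y)^t F(x), evaluated via the linear program (3)."""
    res = linprog(F(x), A_ub=B, b_ub=b, bounds=(None, None), method="highs")
    return float((x - res.x) @ F(x))

def algorithm_N(F, Fprime, B, b, x0, tol=1e-8, max_iter=50, grid=21):
    """Direction finding by the linear program (18) in (x, lambda), then step rule (19)."""
    m, n = B.shape
    x = x0
    for _ in range(max_iter):
        g_x = gap(F, B, b, x)
        if g_x <= tol:
            break
        Fk, Jk = F(x), Fprime(x)
        # LP (18) over z = (x, lambda): minimize x^t(F(x^k) + J^t x^k) + b^t lambda
        # subject to J x + B^t lambda = J x^k - F(x^k), Bx <= b, lambda >= 0.
        c = np.concatenate([Fk + Jk.T @ x, b])
        A_eq = np.hstack([Jk, B.T])
        b_eq = Jk @ x - Fk
        A_ub = np.hstack([B, np.zeros((m, m))])
        bounds = [(None, None)] * n + [(0, None)] * m
        res = linprog(c, A_ub=A_ub, b_ub=b, A_eq=A_eq, b_eq=b_eq,
                      bounds=bounds, method="highs")
        if res.status != 0:
            break
        d_x = res.x[:n]
        # Step rule (19): full step if it halves the gap, otherwise a crude arc search.
        if gap(F, B, b, d_x) <= 0.5 * g_x:
            theta = 1.0
        else:
            thetas = np.linspace(0.0, 1.0, grid)
            theta = min(thetas, key=lambda t: gap(F, B, b, x + t * (d_x - x)))
        x = x + theta * (d_x - x)
    return x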

4. Convergence analysis. We first state and prove a global convergence result for Algorithm N.

PROPOSITION 2. Any point of accumulation of a sequence generated by Algorithm N is an equilibrium solution.

Proof. If $g(d_x(x^k)) \le \tfrac{1}{2}\, g(x^k)$ infinitely often at (19), then $\lim_{k\to\infty} g(x^k) = 0$. Otherwise the linesearch in (19) is asymptotically always performed and, to prove global convergence, we will strive to check the conditions behind Zangwill's global convergence theorem, namely:

(i) All points generated by the algorithm lie in a compact set.
(ii) The algorithmic map is closed outside the set of solution points $S$.
(iii) At each iteration, strict decrease of the objective function occurs.

(i) Since $\Phi$ is compact by assumption, it is sufficient to show that the sequence $\{d_\lambda(x^k)\}$ is bounded. By definition of the sequence $\{d_\lambda(x^k)\}$ we have

(20) $d_\lambda(x^k) \in \arg\min_{\lambda \ge 0}\; b^t\lambda$ subject to $B^t\lambda = -H(x^k)$,

where $H(x^k) := F(x^k) + F'(x^k)\bigl(d_x(x^k) - x^k\bigr) \in \mathbb{R}^n$.


First observe that the linear program

(21) $\min_{\lambda \ge 0}\; b^t\lambda$ subject to $B^t\lambda = -H(x^k)$

is the dual of the linear program

(22) $\max_{x \in \Phi}\; -x^t H(x^k)$,

which is feasible and bounded; hence, by linear programming duality, we have that (21) is also feasible and bounded, i.e., that (21) possesses at least one optimal basic solution. Let $\{N_e\}_{e=1,\dots,p}$ denote the set of full rank square submatrices (bases) of $B^t$. Since $d_\lambda(x^k)$ is extremal, we have

$d_\lambda(x^k) = -N_e^{-1} H(x^k)$

for some $e \in \{1, \dots, p\}$. From the continuity of $F$ and $F'$ we deduce that $H(x^k)$ must lie in some compact set $K$ independent of $x^k$. Therefore $d_\lambda(x^k) \in C := \bigcup_{e=1}^{p} -N_e^{-1} K$, which is bounded. The same continuity argument is then used to show boundedness of the sequence $\{\lambda^{k+1}\}$.

(ii) The closedness of the algorithmic map follows directly from the continuity of $F'$ and the closedness of the linesearch strategy used.

(iii) We must prove that $h(x^{k+1}, \lambda^{k+1}) < h(x^k, \lambda^k)$ if the latter term is positive (not zero). This is a direct consequence of Lemma 2. $\square$

PROPOSITION 3. If $F$ is monotone on $\Phi$ and affine, then Algorithm N converges in a finite number of iterations.

Proof. Replacing $F(x)$ by $Ax + a$ in (14) yields a quadratic programming problem. Its solution set is a face $T$ of the polyhedron $\{(x, \lambda) : Ax + a + B^t\lambda = 0,\; Bx \le b,\; \lambda \ge 0\}$. For some iterate $k$ we must have that $(d_x(x^k), d_\lambda(x^k))$ lies in $T$ (otherwise the iterates would always be bounded away from $T$, contradicting global convergence of the method). When $(d_x(x^k), d_\lambda(x^k)) \in T$ we have $\bar\theta = 1$ and $(x^{k+1}, \lambda^{k+1}) \in T$. $\square$

Remark. The preceding result is also valid under the assumption that $T^*$ is a singleton ($F$ monotone but not necessarily affine). The proof is similar.

To obtain a rate-of-convergence result for Algorithm N we assume, until explicitly stated otherwise, that the function $F$ is strongly monotone in a neighborhood of the solution $x^*$ with strong monotonicity coefficient $\kappa$ and that the geometric stability condition is satisfied at $x^*$. This implies that the entire sequence $\{x^k\}$ converges to the unique solution $x^*$. Under these assumptions we will show that Algorithm N is locally equivalent to Newton's method, thus implying quadratic convergence and implicit identification of the set of active constraints at $x^*$. We first show that the descent direction $d$ obtained from Algorithm N satisfies $d_x(x^k) \in \mathrm{NEW}(x^k)$ if $x^k$ is sufficiently close to $x^*$. The following lemmas will be used in the proof.

LEMMA 3. The optimal dual vector $y(x^k)$ associated with the nonstructural constraint $F(x^k) + F'(x^k)(x - x^k) + B^t\lambda = 0$ of (18) satisfies $\lim_{k\to\infty} y(x^k) = x^*$.

Proof. Write the Lagrangian dual of the linear program (18):

$\max_{y}\; \min_{x \in \Phi,\, \lambda \ge 0}\; x^t\bigl[ F(x^k) + F'^t(x^k)\, x^k \bigr] + b^t\lambda - y^t\bigl[ F(x^k) + F'(x^k)(x - x^k) + B^t\lambda \bigr]$.

Then observe that the inner minimum has value $-\infty$ unless $By \le b$, in which case the minimum over nonnegative $\lambda$ is achieved when $\lambda$ is zero, yielding

$\max_{y \in \Phi}\; \min_{x \in \Phi}\; x^t\bigl[ F(x^k) + F'^t(x^k)\, x^k \bigr] - y^t\bigl[ F(x^k) + F'(x^k)(x - x^k) \bigr]$.


This expression is equivalent, modulo a constant term, to

(23) $\max_{y \in \Phi}\; \min_{x \in \Phi}\; (x - y)^t\bigl[ F(x^k) + F'(x^k)(x - x^k) \bigr] - (x - x^k)^t F'(x^k)(x - x^k)$

and constitutes a quadratic perturbation of the linearized dual gap function $\mathrm{L}\bar g$ at $x^k$. Since $y(x^k)$ is dual-optimal for (18) it must correspond to the $y$-part of a solution to (23). If $y(x^k)$ does not converge to $x^*$ then there exists a subsequence $\{x^k\}_{k \in I}$ such that $\lim_{k\to\infty,\, k\in I} y(x^k) = \bar y \ne x^*$. Passing to the limit in (23) we obtain, after setting $x$ to $\bar y$:

$\lim_{k\to\infty,\, k\in I}\; \max_{y \in \Phi}\; \min_{x \in \Phi}\; (x - y)^t\bigl[ F(x^k) + F'(x^k)(x - x^k) \bigr] - (x - x^k)^t F'(x^k)(x - x^k)$

$\quad \le (\bar y - \bar y)^t\bigl[ F(x^*) + F'(x^*)(\bar y - x^*) \bigr] - (\bar y - x^*)^t F'(x^*)(\bar y - x^*)$

$\quad \le -\kappa\, \|\bar y - x^*\|^2$  by strong monotonicity

$\quad < 0$.

But this contradicts the optimality of the sequence $\{y(x^k)\}_{k \in I}$ since we obtain, by taking $y = x^*$:

$\lim_{k\to\infty}\; \min_{x \in \Phi}\; (x - x^*)^t\bigl[ F(x^k) + F'(x^k)(x - x^k) \bigr] - (x - x^k)^t F'(x^k)(x - x^k)$

$\quad = \min_{x \in \Phi}\; (x - x^*)^t F(x^*)$

$\quad = 0$  by definition of $x^*$. $\square$

LEMMA 4. There exists an index $K$ such that $k \ge K$ implies $d_x(x^k) \in T^*$.

Proof. From (23) we get

(24) $d_x(x^k) \in D(x^k) := \arg\min_{x \in \Phi}\; (x - y(x^k))^t\bigl[ F(x^k) + F'(x^k)(x - x^k) \bigr] - (x - x^k)^t F'(x^k)(x - x^k)$.

Since $y(x^k) \to x^*$ as $k \to \infty$ (Lemma 3), (24) represents, for $x^k$ close to $x^*$, a small quadratic perturbation of the linear program $\min_{x \in \Phi} x^t F(x^*)$. It follows from the geometric stability assumption that $d_x(x^k) \in T^*$. $\square$

COROLLARY. $d_x(x^*) \in T^*$.

Proof. Since $F'$ is continuous, the point-to-set mapping $\bar x \mapsto \{d_x(\bar x)\}$ is upper semicontinuous. Hence $d_x(x^*) \in \{d_x(\lim_{k\to\infty} x^k)\} = \lim_{k\to\infty} \{d_x(x^k)\} \subseteq T^*$. $\square$

LEMMA 5. $\lim_{k\to\infty} d_x(x^k) = x^*$.

Proof. From the proof of Lemma 2, we have

$g'(x^k;\, d_x(x^k) - x^k) < 0$.

Passing to the limit and using upper semicontinuity there comes

$g'(x^*;\, d_x(x^*) - x^*) \le 0$.

But, by Danskin’s rule of differentiation of max-functions (see [18]), we have

$g'(x^*;\, d_x(x^*) - x^*) = \max_{y \in T^*}\; \bigl[ d_x(x^*) - x^* \bigr]^t \bigl[ F(x^*) - F'^t(x^*)(y - x^*) \bigr]$.

Assume that $T^*$ is not the singleton $\{x^*\}$ (otherwise the result follows trivially from Lemma 4) and let $\varepsilon$ be a positive number such that $\hat y := x^* - \varepsilon\,(d_x(x^*) - x^*) \in T^*$ (see


Fig. 2). Then we have

$0 \ge g'(x^*;\, d_x(x^*) - x^*)$

$\quad \ge \bigl[ d_x(x^*) - x^* \bigr]^t \bigl[ F(x^*) + \varepsilon F'^t(x^*)(d_x(x^*) - x^*) \bigr]$

$\quad \ge \varepsilon\kappa\, \|d_x(x^*) - x^*\|^2$,

implying that $d_x(x^*) = x^*$. $\square$

FIG. 2.

LEMMA 6. There exists an index $K$ such that for $k \ge K$, $d_x(x^k) = \mathrm{NEW}(x^k)$.

Proof. From (18), $d_\lambda(x^k)$ is an optimal dual vector for the linear program

(25) $\min_{z \in \Phi}\; z^t\bigl[ F(x^k) + F'(x^k)(d_x(x^k) - x^k) \bigr]$.

For $k$ large, $d_x(x^k)$ is close to $x^*$ (Lemma 5) and problem (25) is an arbitrarily small perturbation of the linear program

$\min_{z \in \Phi}\; z^t F(x^*)$,

whose set of optimal solutions is $T^*$, by definition. Therefore the optimal solutions to (25) lie in $T^*$ by geometric stability. From the complementary slackness theorem of linear programming we can write

$d_\lambda(x^k)^t\,\bigl( B\, d_x(x^k) - b \bigr) = 0$.

We conclude that the couple $(d_x(x^k), d_\lambda(x^k))$ is optimal for the quadratic program (17). Since its solution is unique in $x$ and equal by definition to $\mathrm{NEW}(x^k)$ we conclude that $d_x(x^k) = \mathrm{NEW}(x^k)$. $\square$

PROPOSITION 4. There exist positive constants $\alpha$ and $\beta$ such that

$\alpha\, \|x - x^*\| \le g(x) \le \beta\, \|x - x^*\|$ for every $x$ in $\Phi$.


Proof. It suffices to prove the result in a neighborhood of $x^*$.

(26) (i) Proof that $g(x) \le \beta\, \|x - x^*\|$.

$g(x) = \max_{y \in \Phi}\; (x - y)^t F(x)$

$\quad\; = (x - x^*)^t F(x) + \max_{y \in \Phi}\; (x^* - y)^t F(x)$

$\quad\; \le \|x - x^*\| \cdot \|F(x)\| + \max_{y \in \Phi}\; (x^* - y)^t F(x^*) + \max_{y \in \Phi}\; (x^* - y)^t \bigl( F(x) - F(x^*) \bigr)$

$\quad\; \le \|x - x^*\| \cdot \|F(x)\| + D\, \|x - x^*\| \sup_{\xi, \eta \in \Phi} \frac{\|F(\xi) - F(\eta)\|}{\|\xi - \eta\|}$

$\quad\; \le (M + M'D)\, \|x - x^*\|$,

where $M := \sup_{x \in \Phi} \|F(x)\|$, $M' := \sup_{\xi, \eta \in \Phi} \|F(\xi) - F(\eta)\| / \|\xi - \eta\|$, and $D$ is the diameter of $\Phi$ (the middle maximum in the third line vanishes by definition of $x^*$). Then set $\beta = M + M'D$.

(27) (ii) Proof that $g(x) \ge \alpha\, \|x - x^*\|$.

We consider three mutually exclusive cases.

Case 1. $T^* = \{x^*\}$. (Fig. 3.) For $x$ sufficiently close to $x^*$ we have

$g(x) \ge (x - x^*)^t F(x)$

$\quad\; = (x - x^*)^t F(x^*) + (x - x^*)^t \bigl( F(x) - F(x^*) \bigr)$

$\quad\; \ge \|x - x^*\| \cdot \|F(x^*)\| \cos(x - x^*, F(x^*)) + \kappa\, \|x - x^*\|^2$.

FIG. 3. Case 1.

Since $F(x^*)$ is orthogonal to no feasible direction from $x^*$ into $\Phi$ (by geometric stability) we must have that $\cos(x - x^*, F(x^*))$ is positive and bounded away from zero. Hence (27) holds with

$\alpha := \inf_{\xi \in \Phi,\, \xi \ne x^*}\; \{\cos(\xi - x^*, F(x^*))\}\, \|F(x^*)\| > 0$.


Case 2. $T^* \ne \{x^*\}$ and $x \in T^*$ (Fig. 4). Let $\rho$ be a positive number such that the mapping $x \mapsto \mathrm{Proj}_\Phi\bigl(x - (1/\rho)F(x)\bigr)$, defined for $x \in \Phi$, is contracting, where $\mathrm{Proj}_\Phi$ denotes the projection operator onto $\Phi$, in the usual Euclidean norm. The existence of such a number $\rho$ is a consequence of, say, Example 3.1 of Dafermos [4].

FIG. 4. Case 2.

For $x$ sufficiently close to $x^*$, $p := \mathrm{Proj}_\Phi\bigl(x - (1/\rho)F(x)\bigr)$ lies in $T^*$ (see Proposition 1). Let $\theta \in [0, 1)$ be the contraction constant, dependent on $\rho$; we have $\|p - x^*\| \le \theta\, \|x - x^*\|$. We have

(28) $\|p - x\| \ge \|x - x^*\| - \|x^* - p\| \ge (1 - \theta)\, \|x - x^*\|$ by the triangle inequality.

Also by construction of $p$

(29) $(x - p)^t F(x) \ge \rho\, \|x - p\|^2$.

Define

(30) $\bar\phi = \max\, \{\phi \mid x + \phi\,(p - x) \in T^*\}$

($\bar\phi$ must be positive since $x$ lies in the relative interior $\mathrm{ri}(T^*)$) and $\bar y = x + \bar\phi\,(p - x)$. We have

$(x - \bar y)^t F(x) = \bar\phi\, (x - p)^t F(x)$

$\quad\; \ge \bar\phi\rho\, \|x - p\|^2$  by (29)

$\quad\; \ge \rho\, \bar\phi\|x - p\|\, (1 - \theta)\, \|x - x^*\|$  by (28).

Now $\bar\phi\|x - p\| = \|x - \bar y\|$ must be bounded from below by some positive number $s$ since $x$ lies in $\mathrm{ri}(T^*)$ and $\bar y$ is on the boundary of $T^*$. It follows that

$g(x) \ge (x - \bar y)^t F(x) \ge \rho\,(1 - \theta)\, s\, \|x - x^*\|$,

and the result holds with $\alpha = \rho s\,(1 - \theta)$.

Case 3. $x \notin T^*$ (consequently $T^* \ne \Phi$). (Fig. 5.) Define $p = \mathrm{Proj}_{T^*}(x)$. First we will show that $\cos(x - p, F(x^*))$ is bounded below by some positive number $\gamma$.

Define, for $x \notin T^*$, the function $x \mapsto \eta(x)$ where $\eta(x)$ is the intersection of the line going through the segment $[p, x]$ with the boundary of $\Phi$, in the direction $x - p$. Let $B_\varepsilon(x^*)$ be a ball of radius $\varepsilon$ about $x^*$, $H = B_\varepsilon(x^*) \cap (\Phi \setminus T^*)$, and $E$ the closure of $\eta(H)$ (see Fig. 6). We have $E \cap T^* = \emptyset$ and

(31) $\cos(x - p, F(x^*)) = \cos(\eta(x) - p, F(x^*)) \ge \min_{v \in E}\; \cos(v - p, F(x^*))$.


FIG. 5. Case 3.

FIG. 6.

But $\cos(v - p, F(x^*)) > 0$ ($p \in T^*$) for each $v \notin T^*$ by geometric stability. Hence $\cos(x - p, F(x^*)) \ge \gamma > 0$.

We then write $(x - p)^t F(x^*) \ge \gamma\, \|x - p\| \cdot \|F(x^*)\|$. Thus

(32) $(x - p)^t F(x) \ge \frac{\gamma}{2}\, \|x - p\| \cdot \|F(x^*)\|$

for $x$ sufficiently close to $x^*$. Now consider the following two subcases.

Case 3.1. $\|x - p\| \le \bar C\, \|p - x^*\|$ with $\bar C = \rho s\,(1 - \theta)/2\theta DM'$. Define $\bar y$ as in Case 2

(see Fig. 4). Then

$g(x) \ge (x - \bar y)^t F(x) = (x - p)^t F(x) + (p - \bar y)^t F(x)$

$\quad\; \ge 0 + (p - \bar y)^t F(p) + (p - \bar y)^t \bigl( F(x) - F(p) \bigr)$

$\quad\; \ge \rho s\,(1 - \theta)\, \|x - x^*\| - \bar C\, \|p - x^*\|\, DM'$  since $p \in T^*$ (see Case 2)

$\quad\; \ge \bigl( \rho s\,(1 - \theta) - \bar C\theta DM' \bigr)\, \|x - x^*\|$

$\quad\; = \frac{\rho s\,(1 - \theta)}{2}\, \|x - x^*\|$.

Set $\alpha = \rho s\,(1 - \theta)/2$.

Case 3.2. $\|x - p\| \ge \bar C\, \|p - x^*\|$. We have

(33) $\|x - x^*\| \le \|x - p\| + \|p - x^*\| \le (1 + 1/\bar C)\, \|x - p\|$.


We obtain

$g(x) \ge (x - p)^t F(x)$

$\quad\; \ge \frac{\gamma}{2}\, \|x - p\| \cdot \|F(x^*)\|$  by (32)

$\quad\; \ge \frac{\gamma}{2}\, \frac{\bar C}{1 + \bar C}\, \|x - x^*\| \cdot \|F(x^*)\|$  by (33),

with $F(x^*) \ne 0$, and the result holds with $\alpha = \gamma \bar C\, \|F(x^*)\| / 2(1 + \bar C)$. $\square$

Remark 1. The above general proof does not require differentiability of the cost mapping $F$. If $F$ is differentiable, the proof of Proposition 4 can be somewhat streamlined (see Dussault and Marcotte [21]).

Remark 2. Proposition 4 strengthens a result of Pang [19] who derives an estimate of the form

$\|x - x^*\| \le \omega \sqrt{g(x)}$

for some positive constant $\omega$.

PROPOSITION 5. Let $\{x^k\}$ be a sequence generated by Algorithm N. Then there exists an index $K$ such that for $k \ge K$, $x^{k+1} = \mathrm{NEW}(x^k)$, the Newton iterate.

Proof. We must prove that $g(\mathrm{NEW}(x^k)) \le \tfrac{1}{2}\, g(x^k)$ for $k \ge K$, in which case Algorithm N will set $x^{k+1}$ to $d_x(x^k)$, which is equal to $\mathrm{NEW}(x^k)$ by Lemma 6:

$g(\mathrm{NEW}(x^k)) \le \beta\, \|\mathrm{NEW}(x^k) - x^*\|$  by Proposition 4

$\quad\; \le \beta c\, \|x^k - x^*\|^2$  from (10)

$\quad\; \le \frac{\beta c}{\alpha}\, \|x^k - x^*\|\, g(x^k)$  by Proposition 4

$\quad\; \le \tfrac{1}{2}\, g(x^k)$

as soon as $\|x^k - x^*\| \le \alpha/2\beta c$. $\square$

The preceding results can be summarized in a theorem.

THEOREM 2. Consider a VIP with monotone cost function $F$ and let $\{x^k\}$ be a

sequence generated by Algorithm N. If VIP is geometrically stable, then
(i) $g(x^{k+1}) < g(x^k)$ if $g(x^k) \ne 0$;
(ii) $\lim_{k\to\infty} g(x^k) = 0$;
(iii) if $F$ is affine or $T^*$ is a singleton then there exists an index $K$ such that $g(x^k) = 0$ for $k \ge K$ (finite convergence);
(iv) if $F$ is strongly monotone then the sequence $\{x^k\}$ converges quadratically to the point $x^*$ and there exists an index $K$ such that $x^k \in T^*$ whenever $k \ge K$.

5. Numerical results. A working version of Algorithm N has been developed, using a standard linear programming code, and contrasted against Newton's method, with or without linesearch. The asymmetric linear complementarity subproblems in Newton's method have been solved by Lemke's Complementary Pivoting Algorithm.


In the test problems, $\Phi$ has been taken as the unit simplex $\{x : \sum_{i=1}^{n} x_i = 1,\; x \ge 0\}$ and the mapping $F$ assumed the general form

$F(x) = (A - A^t)x + B^tBx + \gamma\, C(x) + b$,

where the entries of matrices $A$ and $B$ are randomly generated uniform variates, $C(x)$ is a nonlinear diagonal mapping with components $C_i(x) = \arctan(x_i)$, and the constant vector $b$ is chosen such that the exact optimum is known a priori. The parameter $\gamma$ is used to vary the asymmetry and nonlinearity of the cost function.
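A test mapping of this general form can be generated along the following lines (an illustrative sketch of ours; in particular, choosing $b$ so that $F(x^*) = 0$ at a prescribed interior point is only one simple way of knowing the optimum a priori and is not necessarily the authors' construction):

import numpy as np

def make_test_problem(n, gamma, seed=0):
    """F(x) = (A - A^t)x + B^tB x + gamma*C(x) + b on the unit simplex,
    with b chosen so that a prescribed point x_star is an equilibrium."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, n))
    Bmat = rng.uniform(size=(n, n))
    M = (A - A.T) + Bmat.T @ Bmat               # skew-symmetric part + monotone part
    x_star = np.full(n, 1.0 / n)                # prescribed solution, interior of the simplex
    b = -(M @ x_star + gamma * np.arctan(x_star))   # forces F(x_star) = 0
    def F(x):
        return M @ x + gamma * np.arctan(x) + b
    def Fprime(x):
        return M + gamma * np.diag(1.0 / (1.0 + x ** 2))
    return F, Fprime, x_star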

Sixteen five-dimensional and sixteen 15-dimensional problems have been generated, with $\gamma$-values ranging from 10 to 40. Newton's search direction differs from Algorithm N's direction in 18 of the 32 problems. In some instances (Figs. 7-10) Algorithm N yields a direction as good as or better than Newton's direction. For some other problems (Figs. 11-14) Newton's direction is slightly superior. In all cases, the difference in the number of iterations required to achieve a very low gap value is small.

FIG. 7. Dimension 15, $\gamma = 40$.

FIG. 8. Dimension 15, $\gamma = 30$.

FIG. 9. Dimension 5, $\gamma = 30$.

FIG. 10. Dimension 5, $\gamma = 30$.

FIG. 11. Dimension 15, $\gamma = 10$.

FIG. 12. Dimension 5, $\gamma = 20$.

FIG. 13. Dimension 15, $\gamma = 20$.

FIG. 14. Dimension 5, $\gamma = 40$.

(In Figs. 7-14, curves for ALG. N, NEWTON, and NEWTON (with step) are plotted against the iteration count $k$.)


This preliminary testing shows some promise for the linearization algorithm. Its direction-finding subproblem involves a linear program, versus an asymmetric linear complementarity problem for Newton's method. The linear subproblem bears close resemblance to the linear program that must be solved to evaluate the gap function, and as such could benefit from some fine tuning of the computer code. Moreover, it may well prove unnecessary to solve the subproblem exactly, yielding another area for further improvement. In contrast, solving linear complementarity problems yields a feasible solution only at termination, therefore making the implementation of an inexact strategy more difficult.

Finally, let us mention that Marcotte and Guélat [20] have successfully implemented Algorithm N to solve large-scale network equilibrium problems when the mapping $F$, i.e., its Jacobian matrix, is highly asymmetric.

6. Conclusion. The main result of this paper has been to prove global and quadratic convergence of an algorithm for solving monotone variational inequalities. The algorithm operates by solving linear programs in the space of primal-dual variables. Computational experiments show that the algorithm is efficient for solving both small-scale and large-scale problems.

Acknowledgments. The authors are indebted to anonymous referees for relevant comments on an earlier version of this paper that led to numerous improvements.

REFERENCES

[1] A. AUSLENDER, Optimisation. Méthodes numériques, Masson, Paris, 1976.
[2] D. BERTSEKAS AND E. M. GAFNI, Projection methods for variational inequalities with application to the traffic assignment problem, Math. Programming Stud., 17 (1982), pp. 139-159.
[3] R. W. COTTLE AND G. B. DANTZIG, Positive (semi) definite programming, in Nonlinear Programming, J. Abadie, ed., North-Holland, Amsterdam, 1967.
[4] S. C. DAFERMOS, An iterative scheme for variational inequalities, Math. Programming, 26 (1983), pp. 40-47.
[5] N. H. JOSEPHY, Newton's method for generalized equations, Technical Report 1966, Mathematics Research Center, University of Wisconsin, Madison, WI, 1979.
[6] S. KAKUTANI, A generalization of Brouwer's fixed point theorem, Duke Math. J., 8 (1941), pp. 457-459.
[7] P. MARCOTTE, A new algorithm for solving variational inequalities, with application to the traffic assignment problem, Math. Programming, 33 (1985), pp. 339-351.
[8] P. MARCOTTE, Algorithms for the network oligopoly problem, J. Oper. Res. Soc., 38 (1987), pp. 1051-1065.
[9] P. MARCOTTE AND J.-P. DUSSAULT, A note on a globally convergent method for solving monotone variational inequalities, Oper. Res. Lett., 6 (1987), pp. 35-42.
[10] J. M. ORTEGA AND W. C. RHEINBOLDT, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[11] J. S. PANG AND D. CHAN, Iterative methods for variational and complementarity problems, Math. Programming, 24 (1982), pp. 284-313.
[12] S. M. ROBINSON, Generalized equations, in Mathematical Programming: The State of the Art, A. Bachem, M. Grötschel, and B. Korte, eds., Springer-Verlag, Berlin, New York, 1983, pp. 346-367.
[13] R. SAIGAL, Fixed point computing methods, in Operations Research Support Methodology, A. G. Holzman, ed., Marcel Dekker, New York, 1979.
[14] M. J. TODD, The Computation of Fixed Points and Applications, Springer-Verlag, Berlin, New York, 1976.
[15] W. I. ZANGWILL, Nonlinear Programming: A Unified Approach, Prentice-Hall, Englewood Cliffs, NJ, 1979.
[16] W. I. ZANGWILL AND C. B. GARCIA, Equilibrium programming: the path-following approach and dynamics, Math. Programming, 21 (1981), pp. 262-289.
[17] S. NGUYEN AND C. DUPUIS, An efficient method for computing traffic equilibria in networks with asymmetric transportation costs, Transportation Sci., 18 (1984), pp. 185-202.
[18] J. M. DANSKIN, The theory of max-min, with applications, SIAM J. Appl. Math., 14 (1966), pp. 641-664.
[19] J. S. PANG, A posteriori error bounds for the linearly-constrained variational inequality problem, Math. Oper. Res., 12 (1987), pp. 474-484.
[20] P. MARCOTTE AND J. GUÉLAT, Adaptation of a modified Newton method for solving the asymmetric traffic equilibrium problem, Transportation Sci., 22 (1988), pp. 112-124.
[21] J.-P. DUSSAULT AND P. MARCOTTE, Conditions de régularité géométrique pour les inéquations variationnelles, RAIRO Rech. Opér., 23 (1988), pp. 1-16.