A new extension of local search applied to the Dial-A-Ride Problem

Elsevier, European Journal of Operational Research 83 (1995) 83-104


Theory and Methodology


Patrick Healy a,*, Robert Moll b

a Department of Computer Science and Information Systems, University of Limerick, Limerick, Ireland
b Department of Computer Science, University of Massachusetts, Amherst, MA 01003, USA

Received February 1993; revised July 1993

Abstract

In this paper we propose a cheap yet effective extension to the traditional local improvement algorithm. We present this extension in the context of the Dial-A-Ride Problem, a variant of the Traveling Salesman Problem. This extension, which we call sacrificing, yields significant improvements over its plain local improvement counterpart without adversely affecting the algorithm's running time.

Keywords: Dial-A-Ride Problem; Heuristics; Local search

1. Introduction

Local search is a general-purpose heuristic for solving combinatorial optimization problems. It has been applied to a diverse range of problems with considerable success. However, while it is generally fast running and easy to code, it suffers from one important drawback: solution quality.

In this paper we apply a variant of local search to the Dial-A-Ride Problem (DARP) [17,18,22]. We will demonstrate significant improvements in the quality of solutions found by our algorithm when compared with the traditional local search algorithm. While the subject of this paper is the Dial-A-Ride Problem, we believe that our technique, which we call sacrificing, is generalizable and applicable to a wide range of problems.

The paper is organized as follows: below we briefly describe local search and how it is applied to the TSP; a brief motivation and description of sacrificing then follows; we complete the introduction with a summary of the main contributions and results of the paper. Section 2 introduces DARP and describes how local search is applied to the problem. In Section 3 we describe how local search in DARP may be improved upon by sacrificing solution quality at key times during the search in return for more search opportunities. Our experimental results appear in Section 4. Section 5 concludes our findings and suggests how sacrificing might be applied to a range of other problems.

* Corresponding author.

0377-2217/95/$09.50 © 1995 Elsevier Science B.V. All rights reserved SSDI 0377-2217(93)E0292-6

1.1. Traditional local search

Papadimitriou and Steiglitz [16] define a combinatorial optimization problem, P, to be a pair (F, c), where F is the set of all valid, or feasible, solutions and c is a cost metric that assigns some real-valued cost to every solution s ∈ F. From this set, we seek the solution having lowest cost.¹

Local search proceeds by finding some starting solution, s ∈ F, and attempting to improve upon this solution by perturbing it in some small, or local, way that yields a solution t of lower cost. The process repeats until a solution is reached that will not yield a local improvement. This solution is returned as the champion. Since the solution returned is optimal with respect to all local perturbations it is referred to as a local optimum.

A neighbor, t, is generated by perturbing s in some small way that is specific to the given problem. By applying the known set of perturbations to a given solution, we generate the solution's neighborhood, N(s), with size |N(s)| or, when our meaning is clear, |N|.
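The procedure just described can be sketched generically as follows. This is a minimal illustration, not the authors' code: the function name `local_search`, its parameters, and the toy integer problem are assumptions introduced here, and the first-improvement acceptance rule is one of several possible choices.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")


def local_search(start: S,
                 cost: Callable[[S], float],
                 neighbors: Callable[[S], Iterable[S]]) -> S:
    """First-improvement local search.

    Repeatedly moves to the first neighbor of strictly lower cost and
    stops when no neighbor improves on the current solution, i.e. at a
    local optimum with respect to the given neighborhood.
    """
    current = start
    improved = True
    while improved:
        improved = False
        for t in neighbors(current):
            if cost(t) < cost(current):
                current = t
                improved = True
                break
    return current


# Toy usage (hypothetical problem): minimise (x - 7)^2 over the integers,
# where the neighbors of x are x - 1 and x + 1.
best = local_search(0, lambda x: (x - 7) ** 2, lambda x: (x - 1, x + 1))
```

On this convex toy problem the local optimum is also the global one; on problems such as the TSP that is generally not the case, which is the drawback the paper addresses.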

For example, when dealing with TSP-related problems, the following scheme, known as 2-opt, is a useful perturbation strategy. Given a tour consisting of arcs that link the cities together, break two of these arcs at random. The tour is now broken into two segments, which may be reconnected to form a different tour. In replacing two arcs with two new arcs we generate a new tour that is considered a neighbor of the previous tour. This new tour differs from the original in that the segment bounded by the two broken arcs is now reversed, i.e. the cities that made up that segment are now visited in the reverse order. We note that the size of 2-opt's neighborhood, |N|, is $O(n^2)$.
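As an illustration of the 2-opt perturbation, the sketch below represents a tour as a Python list and identifies the two broken arcs by the list positions of the cities they leave. The names `two_swap` and `two_opt_neighbors` are assumptions made for this example, and no attempt is made to remove duplicate tours from the enumeration.

```python
def two_swap(tour, i, j):
    """Break the arcs leaving positions i and j (i < j) and reconnect
    the tour so that the cities at positions i+1 .. j are visited in
    reverse order."""
    return tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]


def two_opt_neighbors(tour):
    """Yield every neighbor whose reversed segment holds at least two
    cities; the double loop makes the O(n^2) neighborhood size explicit."""
    n = len(tour)
    for i in range(n - 1):
        for j in range(i + 2, n):
            yield two_swap(tour, i, j)


tour = [0, 1, 2, 3, 4, 5, 6]
neighbor = two_swap(tour, 1, 4)          # reverses the cities 2, 3, 4
count = sum(1 for _ in two_opt_neighbors(tour))
```

The generator can be passed directly as the neighborhood function of any local search routine that expects an iterable of candidate solutions.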

This scheme generalizes well. Instead of breaking two edges we might break k, bearing in mind that the size of the neighborhood now increases to $O(n^k)$. There is an obvious tradeoff here: the larger the neighborhood we search at each iteration, the greater our chances of finding the global optimum; yet the larger the neighborhood, the longer the search will take. Since we know that the algorithm must, on the last iteration, search the entire neighborhood before concluding that it cannot find an improvement, neighborhoods larger than $O(n^3)$ are rarely considered. For extremely large problem instances even 3-opt may be unreasonable [1]. Compromises that maintain the $O(n^2)$ running time of 2-opt while permitting some of the greater strength of 3-opt have been reached, where a third arc may be broken provided that it is within a bounded number of arcs from the second broken arc [15,20].

1.2. Our variant of local search

The technique we call sacrificing was motivated by local search's tendency to return poor solutions. We sought a variant that would be competitive time-wise but would return higher quality optima. In this way our approach differs from other variants of the algorithm, such as simulated annealing [3,13] and tabu search [5,9,6]. It is generally recognized that the strength of these other variants of local search lies in their being allowed ample search time.

With our sacrificing algorithm we postulate a secondary metric, which gives us a broader view of solution quality. Our algorithm alternates between the primary, traditional problem metric, and this secondary metric. When the algorithm is advancing according to the second metric, it may visit solutions of higher secondary quality, which are strictly poorer with respect to the metric that really matters, namely the primary metric. Thus, such a move is a sacrifice.

¹ If the problem is one of maximization, we may replace c(s) by -c(s).

For the class of combinatorial search problems with a non-uniform neighborhood structure (and, as we shall see, DARP is in this class), a secondary metric based on neighborhood size seems especially relevant. In this setting, local search 1) reaches a local optimum according to the primary metric; 2) moves away from this optimum by finding a neighbor with a worse primary score, but a larger neighborhood; and 3) returns to the primary metric algorithm of 1). We speculate that this domain-independent approach to problems with a non-uniform neighborhood structure will be a valuable enhancement to the traditional algorithm, and especially so in cases where neighborhood size can be assessed quickly and accurately. One such example, rectangle packing, is reported on in [7,8].

1.3. Contributions and results

It is common for the local search algorithm described above to be used as a subroutine by a function that generates random starting solutions and finds a local optimum from that starting point. The best solution over multiple runs is then output. In this scenario, it is not uncommon to perform as many runs of the subroutine as possible in a given time interval. Clearly then, efficiency is a concern.

DARP differs from TSP - its generalization - in that we must ensure the feasibility of a selected neighbor in addition to its being of shorter length. Psaraftis [18] presents a preprocessing step that is performed each time that a solution is accepted as the best seen to date. This step computes a data structure that quickly determines the resulting feasibility of any proposed perturbation. For the cases of k = 2, 3 his preprocessing step runs in time $O(n^k)$. Since the size of the neighborhood is $O(n^k)$ to begin with, Psaraftis argues that this is reasonable. In Section 2 we present algorithms that compute this data structure in $O(n^{k-1})$ time. This is significant because, when the neighborhood search strategy is first-improvement (that is, accept the first feasible perturbed solution that has shorter length), improvements are usually found quickly in the earlier stages of the search.

The goal of this work was to calibrate the performance of sacrificing in the context of the Dial-A-Ride-Problem. We restricted our attention to the standard k-opt algorithms, 2-opt and 3-opt, and did not compare our work to other variants of the algorithm, such as Tabu search and simulated annealing. Our justification for this was, as we have already described, that we sought an algorithm that would be competitive with 2-opt on the basis of running time.

Our experimentation focused, then, on 2-OPT, 3-OPT and our sacrificing alternative, 2-SAC. The former two are our implementations - with improved solution feasibility checking - of the standard DARP local search algorithms described by Psaraftis. Our sacrificing algorithm, 2-SAC, was an augmentation of 2-OPT, so we would expect it to be superior to 2-OPT. This we found to be the case: 2-SAC ran in time comparable to the 2-OPT algorithm, yet returned markedly superior solutions. We found that 2-SAC ran for no longer than a constant times 2-OPT's running time.

A more significant question was whether 2-SAC would match the quality of solution found by 3-OPT. This would be quite useful, for 2-SAC runs significantly faster than 3-OPT, as Fig. 10 demonstrates. However, it seems that the sheer strength of 3-OPT is too much for any augmentation of 2-OPT (we tried a number of sacrificing variants). This still holds even when we take account of the significantly longer running time of 3-OPT. We offer some reasons for why this is so in our conclusions in Section 5.

In summary then, we present an algorithm that is strictly intermediate to the previously known 2-OPT and 3-OPT algorithms. From the data we present in Section 4, we believe that the algorithm yields solutions more in line with 3-OPT's while costing running-time within a constant factor of 2-OPT.


2. The Dial-A-Ride Problem

Many routing problems can be formulated as a Traveling Salesman Problem with some type of additional constraint. The Dial-A-Ride Problem (DARP) is one such problem. DARP may be formulated by stating that we are given a 'base' city and n pairs of cities. A bus is dispatched from the base city and must visit the remaining 2n cities by the shortest route. Each pair of cities consists of a passenger pick-up city and a passenger drop-off city and the pick-up city must be visited before the drop-off city - hence there are n precedence constraints. These constraints provide an interesting twist to the standard TSP problem in that an overwhelming majority of the (2n)! tours possible correspond to infeasible solutions.

In view of its resemblance to the TSP, we base our solution to DARP on the k-opt local search technique described earlier. This perturbation strategy has the property that it preserves the structural constraint of the tour, i.e. that every solution be a permutation of the 2n + 1 cities², but does not, in general, respect the precedence constraints. As a result, many of the perturbations or 'swaps' that would otherwise generate legitimate tours no longer do so. Since the constraints' effect is to reduce the pool of solutions that are feasible candidates, the performance of the local search algorithm is diminished. Interestingly, not all solutions are equally affected by the precedence constraints, i.e. some solutions have more feasible neighbors than others. This is a feature we will exploit later.

2.1. Background

DARP may be defined more precisely as follows: given 2n + 1 cities, $C = \{c_0\} \cup \{c_{p_i}, c_{d_i} \mid 1 \le i \le n\}$, where $c_{p_i}$ and $c_{d_i}$ denote the cities at which a passenger, $P_i$, is to be picked up and dropped off, respectively, and $c_0$ denotes the city at which the bus is based, find the tour of minimal cost through the 2n + 1 cities that starts and finishes at $c_0$ subject to $c_{p_i} \prec c_{d_i}$, $\forall i$, i.e. that every passenger is picked up before he or she is dropped off.

DARP is NP-hard because TSP reduces to it in the following way. Given a TSP instance on n cities, $C_T = \{c_i \mid 1 \le i \le n\}$, create a dummy base city, $c_0$, and n dummy DARP cities, $C_D = \{d_i \mid 1 \le i \le n\}$. Construct a DARP problem with cities $C = \{c_0\} \cup C_T \cup C_D$ by pairing off $c_i$ and $d_i$ and letting $c_i$ be a 'pick-up' point and $d_i$ its corresponding 'drop-off' point for dummy passenger i. Now for all cities x, and all i, let $d(c_0, x) = 0$, $d(c_i, d_i) = 0$ and $d(c_i, x) = d(d_i, x)$.

Given an optimal DARP tour we can easily see that removing the drop-off cities and the depot yields a valid TSP tour. This must be an optimal TSP tour, for if there is one cheaper we can construct a DARP tour of equal cost by inserting the depot randomly and inserting the n drop-offs immediately after their (TSP) pick-up city. This contradicts the optimality of the earlier DARP tour.

2.1.1. k-opt local optimization for TSP-type problems

While k-opt was first proposed for the TSP by Croes [4] and Bock [2] in the late 1950's, it was Lin [14] and Lin and Kernighan [12] who spectacularly demonstrated its effectiveness. More recently Bentley [1] has used the strategy for solving problems involving hundreds of thousands of cities.

² Although we will usually refer to the size of the problem in terms of passenger count, n, on occasion we will denote problem size in terms of city count, N, for compactness. In all cases N = 2n + 1.

Fig. 1. A TSP tour before (a) and after (b) the application of 2-swap (6, 3).

Fig. 1a illustrates a TSP tour on a 7-city problem. We may perturb the tour shown by deleting a number of edges and 'rejoining' the resulting subsequences, or segments, in some manner to form a new tour. We call such a perturbation a k-swap. Fig. 1b shows the result of deleting 2 edges and reconnecting the segments to form a different tour - thus called a 2-swap. By convention we denote a k-swap by a k-tuple of the cities at the beginnings of the k edges being deleted. The 2-swap shown would then be denoted by (6, 3) - even though the interval that gets reversed could be more accurately denoted by (6, 3]. The swap can be seen to simply reverse the traversal order of the segment (6, 3). This swap can also be denoted by (0 6 / 5 1 3 / 4 2), with '/' indicating that the arc exiting that city has been broken. Although any number of edges may be deleted, values of k greater than 3 are considered to be unreasonable for large problem sizes, since the number of tours in a k-swap neighborhood is approximately

$$\binom{N}{k}\, 2^{k-1}\, (k-1)! \tag{1}$$

Although this quantity is not exact since it double counts some tours, it illustrates how the neighborhood grows with k. Given k arcs to break, there are $\binom{N}{k}$ ways to make the breaks in the tour. Now with the tour broken into k pieces or segments, a new tour may be created by 1) either reversing the traversal order of a segment or leaving it as it was; or 2) reordering the segments, so that a complete segment is visited earlier or later in the tour; or 3) both 1) and 2). There are $2^k$ ways to combine the traversal order of the segments in item 1) and there are k! ways to reorder the visitation of the segments as per item 2). However, to keep the resulting tours unique we must "anchor" one segment and maintain its traversal order, say, the segment that contains the base city. Although some tours are still doubly counted we get approximately $2^{k-1}(k-1)!$ ways to reconnect the k segments.

In the case of k = 3, then, there are 8 ways of reconnecting a tour that has 3 edges deleted. However, it is possible to replicate some of those tours by choosing a different set of edges to break and reconnecting them.
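The growth of this estimate with k can be checked numerically. The short sketch below evaluates the approximate count "choose k of N arcs, then reconnect in 2^(k-1) (k-1)! ways"; the function names are assumptions introduced for the example, and the result inherits the double counting noted in the text.

```python
from math import comb, factorial


def reconnections(k):
    """Approximate number of ways to reconnect k segments when one
    segment is anchored: 2^(k-1) * (k-1)!."""
    return 2 ** (k - 1) * factorial(k - 1)


def k_swap_estimate(N, k):
    """Approximate k-swap neighborhood size for a tour on N cities."""
    return comb(N, k) * reconnections(k)


# The 7-city tour of Fig. 1: estimates for 2-swaps and 3-swaps.
estimate_2 = k_swap_estimate(7, 2)
estimate_3 = k_swap_estimate(7, 3)
```

For k = 3 the reconnection factor evaluates to 8, in agreement with the count quoted above.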

Fig. 2 illustrates a DARP tour with a 2-swap that will result in an infeasible tour. We label the base city, 0, passenger i's pick-up by i and passenger i's drop-off by -i. This swap is infeasible for its effect is to reverse the segment [3, -3, 2], thus attempting to drop off passenger 3 before they have boarded the bus.

The fact that not all proposed k-swaps yield legal tours complicates the application of local optimization in two ways. In addition to the effective neighborhood size being reduced by the infeasibility of many of the neighbors, the more immediate problem of checking whether or not a proposed k-swap is feasible must also be addressed. We focus, then, on how well local optimization can cope in this setting and how the algorithm may be augmented to better deal with it.

Fig. 2. An infeasible 2-swap for a DARP tour.

Although our sole requirement is that the final tour be feasible, we insist that every intermediate tour considered by the local search algorithm be feasible also. For the Graph Coloring Problem Johnson et al. [10] invoke a clever penalty function scheme which guarantees that a local optimum will also be feasible, yet allows the consideration of intermediate infeasible solutions. Psaraftis suggests opening the search up to consider infeasible solutions as a possible research direction for DARP. However, given that the density of feasible solutions is so low, save for some scheme that prevents unlimited deviations from a feasible tour, this may not be so promising. Our limited experiments in this direction tended to support this.

2.2. Solution feasibility

We can verify the feasibility of a given tour in O(n) time. We build a vector, I, of size 2n that records the stop number at which each pick-up and drop-off occurs. We refer to this data structure as the tour's inverse. This is motivated by the tour data structure, T, which is a vector of size 2n, indexed by stop number. T can be thought of as a mapping from stop numbers to pick-up and drop-off events. The reverse mapping, or inverse, is a vector indexed by event that gives the stop number of that event. A tour is feasible if and only if each passenger's pick-up precedes their drop-off. This is easily verified in 2n probes of I.
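A minimal sketch of this check follows. It assumes the labelling used in the figures (0 for the base city, i for passenger i's pick-up, -i for the drop-off) and, for brevity, stores the inverse in a dictionary rather than a vector; the function names are illustrative only.

```python
def tour_inverse(tour):
    """Map each event (0 = base, i = pick-up, -i = drop-off) to the
    stop number at which it occurs."""
    return {event: stop for stop, event in enumerate(tour)}


def is_feasible(tour):
    """A tour is feasible iff every pick-up precedes its drop-off.
    Two probes of the inverse per passenger, i.e. 2n probes in all."""
    inv = tour_inverse(tour)
    n = (len(tour) - 1) // 2
    return all(inv[p] < inv[-p] for p in range(1, n + 1))


original = [0, 1, 3, -3, 2, -1, -2]      # the tour of Fig. 2
swapped = [0, 1, 2, -3, 3, -1, -2]       # after reversing [3, -3, 2]
```

Applied to the tour of Fig. 2, the check accepts the original tour and rejects the tour obtained by reversing the segment [3, -3, 2].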

Consider now how the best solution in the current solution's neighborhood might be found. For 2-opt we would need to consider (at worst) all $O(n^2)$ perturbations and check the feasibility of each before proceeding further. If we used the tour-inverse approach the time taken could be $O(n^3)$ per iteration, i.e. to find a lower cost neighbor. Since a run of a local optimization algorithm may be expected to run for a number of iterations proportional to the size of the neighborhood [23], this approach is impractical.

For the k = 2, 3 cases, Psaraftis [18] presents an algorithm that can check feasibility in constant time after an initialization time of $O(n^k)$. His algorithm constructs a (k - 1)-dimensional array, FIRSTDELk, that, when given the first k - 1 components of a proposed k-swap, returns the city before which the k-th component must occur. In the special case of k = 2, he refers to it as simply FIRSTDEL³. For compactness, we will refer to the arrays as FDk and FD respectively. For the case of k = 2, FD[i] will be the stop number of the first drop-off of a passenger who boarded the bus at or after stop i. The second component of a proposed 2-swap must occur before this city, for otherwise at least one pick-up/drop-off pair will be reversed by the 2-swap.

Referring to Fig. 2, choosing either city 1 (pick up passenger 1) or city 2 (pick up passenger 3) as the city after which the first break is made would necessitate breaking the tour so that -3 is not part of the segment that gets reversed. Thus FD[0] = FD[1] = FD[2] = 3 (the stop number associated with dropping off passenger 3).
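To make the definition of FD concrete, the sketch below computes it directly from the definition (first drop-off of a passenger who boarded at or after stop i) and uses it for the constant-time reversal test. This brute-force construction takes quadratic time and is given only to illustrate the definition; the function names and the sentinel value 2n + 1 for "no such drop-off" are assumptions of this example.

```python
def fd_by_definition(tour):
    """FD[i] = stop number of the first drop-off of any passenger who
    boards at or after stop i, or 2n + 1 if there is none."""
    inv = {event: stop for stop, event in enumerate(tour)}
    size = len(tour)                         # 2n + 1 stops, numbered 0 .. 2n
    fd = []
    for i in range(size):
        drops = [inv[-e] for e in tour[i:] if e > 0]
        fd.append(min(drops, default=size))
    return fd


def reversal_is_safe(fd, i, j):
    """2-swap (i, j): break the arcs leaving stops i and j and reverse
    stops i+1 .. j. Safe when the second break precedes stop FD[i]."""
    return j < fd[i]


fig2_tour = [0, 1, 3, -3, 2, -1, -2]
fd = fd_by_definition(fig2_tour)
```

For the tour of Fig. 2 this gives FD[0] = FD[1] = FD[2] = 3, as stated above, and the swap that reverses [3, -3, 2] (breaks after stops 1 and 4) is rejected with a single comparison.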

³ FIRSTDEL is a mnemonic name for the first passenger delivery, or drop-off, of interest in the algorithm.


Fig. 3. A 3-swap performed on tour (0 1 3 -3 2 -1 -2).

Things are further complicated for the k = 3 case, for, as we have described in the computation of Eq. (1), in addition to either of the 2 intermediate segments being reversible, their visitation order may be reversed, as illustrated in Fig. 3. Given the tour (0 1 / 3 -3 / 2 -1 / -2) with arc deletions shown, two intermediate segments, [3, -3] and [2, -1], result. Then, in addition to having to check for a pick-up/drop-off pair occurring within one of the segments should that segment be reversed, we must also check that a precedence relation does not get violated if the two segments are reordered. The safety of reversing the traversal order of a segment can be verified using FD, while probing the appropriate entries of FD3 will verify the safety of reordering the relative appearances, or visitation order, of segments.

We present algorithms below that reduce the initialization time of FDk to $O(n^{k-1})$ for k = 2, 3 and conjecture that Psaraftis' algorithm may also be improved upon when k > 3. In either case a (k - 1)-dimensional array is returned that is indexed into by the first k - 1 components of a proposed swap and contains at that entry the stop number before which the k-th edge deletion must be performed. The operation of the algorithms hinges upon the fact that for cities occurring at positions i and j, FD[i] ≤ FD[j] when i < j. We prove this fact in Lemma 1.

Lemma 1. If i < j then FD[i] ≤ FD[j].

Proof. Assume the lemma is false. Then for some pair, i < j, FD[i] > FD[j]. Since it is always true that k < FD[k], the ordering of the integers is i < j < FD[j] < FD[i]. Now by the definition of FD the 2-swap (j, FD[j]) is illegal, for it would violate some precedence relation in the segment (j, ..., FD[j]]. But the 2-swap (i, FD[i] - 1) is legal, yet it reverses the segment (j, ..., FD[j]]. This contradicts the feasibility of the 2-swap (i, FD[i] - 1). □

Algorithm 1. Given as input a feasible tour, T, of length 2n + 1, output a vector, FD, of length 2n + 1. The i-th entry reports the upper limit on the second edge-deletion that guarantees safety of reversal if the first edge-deletion occurs after stop i.

Step 1. Initialize all elements of FD to 2n + 1;
Step 2. Construct the inverse, I, of tour T;
Step 3. for i := 2 to 2n do
            if drop-off(T[i]) then
                p := I[-T[i]]; {p is stop where corresponding pick-up took place}
                FD[p] := i;
            end if
        end for
Step 4. for i := 2n - 1 downto 0 do
            if FD[i] > FD[i + 1]
                then FD[i] := FD[i + 1];
        end for

We can argue the monotonicity property informally as follows. Call a 2-swap where we know only the first arc we intend breaking a partial 2-swap; denote a partial 2-swap with the edge leaving city i as the first edge to break by (i, ...). If we know that the furthest stop at which the tour may be broken in order to make the partial 2-swap (i, ...) legal is stop number j, then the furthest such stop for the partial 2-swap (i - 1, ...) can be no later than stop j and may even be forced to be earlier if stop i is a pick-up whose corresponding drop-off occurs on or before stop j.

Based on these observations, Algorithm 1 proceeds as follows. First, FD is initialized and T's inverse, I, is created. Then the algorithm proceeds in two main phases (Steps 3 and 4), each costing time O(n); in the first phase we compute some of the entries of FD; the second phase fills in the remaining entries as well as applying Lemma 1 to adjust any entries that may have been incorrect before. In Step 3 we ask: "If the second divider was placed after the current stop, where would the first one need to be placed to ensure feasibility?". If the current stop, i, was a drop-off then certainly the entry in FD for the associated pick-up, I[-T[i]], must be at most i.

Then for the tour illustrated in Fig. 4a, the array FD at the end of Step 3 is as illustrated in Fig. 4b. Clearly FD[1] (the entry associated with picking up passenger 1) is incorrect in Fig. 4b, for selecting any edge after the dropping off of passenger 2 as the second edge for deletion would result in that passenger's precedence relation being violated. We correct for this in Step 4 by enforcing the monotonicity requirement, i.e. that the entries of FD form a monotonically non-decreasing sequence. Fig. 4c illustrates the contents of FD after Step 4.
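Algorithm 1 translates almost line for line into the following sketch. The tour labelling (0 for the base, i and -i for passenger i's pick-up and drop-off), the dictionary used for the inverse, and the function name `compute_fd` are choices made for this illustration; the function also returns the intermediate array so that both panels of Fig. 4 can be compared.

```python
def compute_fd(tour):
    """Sketch of Algorithm 1. Returns (FD after Step 3, FD after Step 4)."""
    size = len(tour)                         # 2n + 1 stops, numbered 0 .. 2n
    fd = [size] * size                       # Step 1: initialise to 2n + 1
    inv = {event: stop for stop, event in enumerate(tour)}   # Step 2
    for i in range(2, size):                 # Step 3: i = 2 .. 2n
        if tour[i] < 0:                      # stop i is a drop-off
            p = inv[-tour[i]]                # stop of the matching pick-up
            fd[p] = i
    after_step3 = fd[:]
    for i in range(size - 2, -1, -1):        # Step 4: i = 2n - 1 downto 0
        if fd[i] > fd[i + 1]:                # enforce monotonicity (Lemma 1)
            fd[i] = fd[i + 1]
    return after_step3, fd


fig4_tour = [0, 1, 2, -2, 3, -3, 4, -4, -1]
step3, step4 = compute_fd(fig4_tour)
```

Both loops make a single pass over the tour, so the construction takes O(n) time, after which a proposed 2-swap (i, j) can be tested by comparing j with FD[i].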

(a)  i    0  1  2  3  4  5  6  7  8
     T    0  1  2 -2  3 -3  4 -4 -1
(b)  FD   9  8  3  9  5  9  7  9  9
(c)  FD   3  3  3  5  5  7  7  9  9

Fig. 4. FD (b) after Step 3 and (c) after Step 4 of Algorithm 1 applied to T (a).

Algorithm 2. Given as input a feasible tour, T, of length 2n + 1, output an array, FD3, of dimensions (2n - 1) x (2n - 1). The (i, j)-th entry indicates, if the first two edge-deletions occur after stops numbered i and j, what the upper limit on the third edge-deletion may be to guarantee segment reordering safety.

Step 1. Initialize the upper-triangular portion of FD3 to 2n + 1;
Step 2. Construct the inverse, I, of tour T;
Step 3. for i := 1 to 2n - 1 do
            for j := i + 1 to 2n do
                p := I[-T[j]];
                if drop-off(T[j]) and p ≤ i
                    then FD3[p - 1, i] := j;
                end if
            end for
        end for
Step 4. for j := 1 to 2n do
            for i := j - 2 downto 0 do
                if FD3[i, j] > FD3[i + 1, j]
                    then FD3[i, j] := FD3[i + 1, j];
                end if
            end for
        end for

(a)  i    0  1  2  3  4  5  6  7  8
     T    0  1 -1  2  3  4 -3 -2 -4

Fig. 5. FD3 (b) after Step 3 and (c) after Step 4 of Algorithm 2 applied to T (a).

Similar to Algorithm 1, the calculation of FD3 in Algorithm 2 hinges on filling in 'key' elements of the matrix and taking advantage of the monotonicity property to fill in each column of the array in reverse order. In Step 3, we ask: "If the second edge-deletion occurred after stop number i and the third occurred after stop number j, and the event occurring at j was a drop-off, would this prevent the reordering of two segments and if so, which pair?".

Fig. 5b illustrates the contents of the array FD3 after Step 3 of the algorithm, if the initial tour was that illustrated in Fig. 5a. For clarity the entries that are as yet untouched are displayed as hyphens. The (i, j)-th entry of the matrix corresponds to making the first arc deletion after stop number i and the second deletion after stop j, i < j. If FD3[i, j] = k then by definition there must be a pick-up event in the segment [i + 1, ..., j] whose drop-off occurs in segment [j + 1, ..., k - 1]. The monotonicity property then requires that FD3[i - 1, j] ≤ k.

The final contents of FD3 are shown in Fig. 5c. From it we can tell that if the first two arcs broken are after cities 3 and 6, that is, after the events, pick up passenger 2 and drop off passenger 3, the third and final arc deletion must occur before the eighth stop, or before dropping off passenger 4. Deleting the arc departing the eighth stop and reordering the two segments would create a tour that attempted to drop off passenger 4 before picking him or her up.

The algorithm's running time is clearly $O(n^2)$ and thus is optimal for the array construction approach. We remark again that construction of the array FD to check the legality of segment reversals is still necessary in the 3-opt case.
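A sketch of the FD3 construction is given below under stated assumptions. The tour is the one read from Fig. 5a, (0 1 -1 2 3 4 -3 -2 -4); the Step 3 test is taken to be p ≤ i, so that a pick-up occurring at the second break itself is recorded; and a full square array is allocated instead of only the upper triangle. The function name `compute_fd3` is illustrative.

```python
def compute_fd3(tour):
    """Sketch of Algorithm 2: fd3[i][j] bounds the third edge-deletion
    when the first two occur after stops i and j (i < j)."""
    size = len(tour)                              # 2n + 1
    inv = {event: stop for stop, event in enumerate(tour)}
    fd3 = [[size] * size for _ in range(size)]    # Step 1 (full square here)
    for i in range(1, size - 1):                  # Step 3: i = 1 .. 2n - 1
        for j in range(i + 1, size):              #         j = i + 1 .. 2n
            if tour[j] < 0:                       # stop j is a drop-off
                p = inv[-tour[j]]                 # stop of matching pick-up
                if p <= i:                        # assumed test: p <= i
                    fd3[p - 1][i] = j
    for j in range(1, size):                      # Step 4: fill each column
        for i in range(j - 2, -1, -1):            # from the bottom upwards
            if fd3[i][j] > fd3[i + 1][j]:
                fd3[i][j] = fd3[i + 1][j]
    return fd3


fig5_tour = [0, 1, -1, 2, 3, 4, -3, -2, -4]
fd3 = compute_fd3(fig5_tour)
bound = fd3[3][6]       # breaks after stops 3 and 6
```

With breaks after stops 3 and 6 the computed bound is 8, matching the reading of Fig. 5c described above: the third deletion must occur before the stop at which passenger 4 is dropped off.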


2.3. Neighborhood size

We have already seen that many solutions in the neighborhood of a particular solution get ruled out on the grounds of infeasibility. In the presence of n binary precedence constraints, the space of feasible solutions is reduced from (2n)! to $(2n)!/2^n$. The larger the problem size we consider, then, the greater the impact of the constraints on the feasible neighborhood size. Of course, we are always assured of feasible solutions since the numerator grows at a faster rate than the denominator.

The precedence constraints affect solutions differently. Consider two solutions, x = (0 2 4 3 1 -2 -4 -3 -1) and y = (0 2 -2 4 -4 3 -3 1 -1). It is obvious that solution y has n - 1 non-trivial 2-swap neighbors, yet solution x has

$$3\binom{n}{2}$$

non-trivial 2-opt neighbors. Since these are the largest and smallest neighborhoods possible, and since the total number of non-trivial 2-swaps is

$$\binom{2n}{2},$$

the largest ratio of feasible neighborhood to entire neighborhood we can attain will be

$$3\binom{n}{2} \Big/ \binom{2n}{2} \approx \frac{3}{4}. \tag{2}$$

Similarly the smallest neighborhood attainable will approach

$$(n-1) \Big/ \binom{2n}{2} \approx \frac{1}{2n}. \tag{3}$$

For both the largest and smallest neighborhoods of a given problem-size, therefore, we see that the relative number of feasible solutions is diminishing. Finding a handle on the expected number of neighbors of a given solution is difficult. While it is clear that the probability of finding a reversible 2-swap of fixed length k increases as the number of cities, n, increases and conversely decreases as k increases and n is fixed, we will say little more. However, in Section 4 we will present evidence that the probability of a random 2-swap (or 3-swap) being feasible decreases as the problem size increases, which is consistent with what we are saying here.
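The two extreme cases can be verified by direct enumeration for n = 4. The sketch below counts, for the solutions x and y given earlier, the 2-swaps that reverse at least two non-base stops and leave the tour feasible. It assumes the base city stays fixed at stop 0; the function names are illustrative.

```python
from math import comb


def is_feasible(tour):
    """Every pick-up i must occur before its drop-off -i."""
    pos = {event: stop for stop, event in enumerate(tour)}
    return all(pos[e] < pos[-e] for e in tour if e > 0)


def feasible_two_swaps(tour):
    """Count non-trivial 2-swaps (reversed segment of at least two
    non-base stops) whose result is still a feasible tour."""
    count = 0
    size = len(tour)
    for i in range(size - 1):
        for j in range(i + 2, size):
            candidate = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
            count += is_feasible(candidate)
    return count


n = 4
x = [0, 2, 4, 3, 1, -2, -4, -3, -1]      # all pick-ups, then all drop-offs
y = [0, 2, -2, 4, -4, 3, -3, 1, -1]      # each pick-up followed by its drop-off
size_x = feasible_two_swaps(x)
size_y = feasible_two_swaps(y)
total = comb(2 * n, 2)                   # all non-trivial 2-swaps
```

For n = 4 the feasible fraction ranges from 3/28 for y to 18/28 for x; the upper figure moves towards 3/4 and the lower towards 1/(2n) as n grows.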

3. Sacrificing for profit

We now describe how we apply sacrificing to DARP. As we have already observed, different solutions can have differing numbers of feasible 2-swap or 3-swap neighbors. By biasing our search in the direction of solutions with larger neighborhoods we hope to improve the overall quality of the local optima found.

3.1. DARP sacrifices

From the previous section we have seen that solutions may have substantially different sized neighborhoods. Given two solutions that are sub-optimal, one with a neighborhood of $O(n^2)$ size and the other of size O(n), in the absence of any correlation between neighborhood size and solution cost, the former ought to be more attractive, since from it we have a larger set of feasible neighbors from which to find one cheaper. Our sacrificing strategy is driven by this observation.

Our search strategy alternates back and forth between an optimize phase and a sacrifice phase.


Fig. 6. A sample run of the 2-SAC algorithm. (Horizontal axis: iterations; left axis: tour cost; right axis: feasible neighborhood ratio, %.)

Optimizing refers to finding a local optimum using traditional local optimization with tour-length as the solution metric. Once we have found a locally optimal solution we search beyond it by entering a sacrifice phase. As we look at neighbors of the current solution we accept the first one that is superior in a sacrificing sense to the current one. Our sacrificing metric is defined as $C'(x) = C(x)/|N(x)|$, where C(x) is the cost, or length, of solution x and |N(x)| is the size of x's feasible neighborhood. Then we say that solution x is superior to solution y if C'(x) < C'(y). The intuition here is that a solution's neighborhood size should be allowed some influence on the choice of successor. We still keep our eye on the overall cost of solution but solutions with large neighborhoods now become more attractive.

Sacrificing may then be thought of as performing local optimization with respect to this new solution metric, where we are willing to sacrifice the length of the tour for a tour that has more neighbors. Of course, a solution of shorter length is not precluded from being selected in this phase. When we reach local optimality with respect to the sacrificing metric we revert to an optimizing phase, or as we shall shortly see, we may choose to cut the search short in the sacrificing phase. This alternation continues until the optimizing phase cannot recover from the sacrifices in tour-length that were made in the previous phase, i.e. the length of the tour at the end of the optimization phase is at least as large as the best seen at the end of the previous round.
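The alternation of phases can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: it uses Euclidean distances on random points, computes the feasible neighborhood size exactly by enumeration, runs each sacrifice phase to a local optimum of C' rather than cutting it short, and all names (`cost`, `descend`, `two_sac`, and so on) are assumptions of this example.

```python
import math
import random


def cost(tour, pts):
    """Length of the closed tour through the given points."""
    return sum(math.dist(pts[a], pts[b])
               for a, b in zip(tour, tour[1:] + tour[:1]))


def feasible(tour):
    pos = {event: stop for stop, event in enumerate(tour)}
    return all(pos[e] < pos[-e] for e in tour if e > 0)


def neighbors(tour):
    """All feasible non-trivial 2-swap neighbors (base stays at stop 0)."""
    out = []
    for i in range(len(tour) - 1):
        for j in range(i + 2, len(tour)):
            t = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
            if feasible(t):
                out.append(t)
    return out


def descend(tour, metric):
    """First-improvement local search with respect to `metric`."""
    improved = True
    while improved:
        improved = False
        current = metric(tour)
        for t in neighbors(tour):
            if metric(t) < current - 1e-12:      # tolerance against float ties
                tour, improved = t, True
                break
    return tour


def two_sac(start, pts):
    """Alternate optimize and sacrifice phases until an optimize phase
    fails to beat the best tour from the previous round."""
    length = lambda t: cost(t, pts)
    # Sacrificing metric C'(x) = C(x) / |N(x)|.
    sacrificed = lambda t: cost(t, pts) / max(len(neighbors(t)), 1)
    best = descend(start, length)                # initial optimize phase
    while True:
        candidate = descend(descend(best, sacrificed), length)
        if length(candidate) >= length(best) - 1e-12:
            return best
        best = candidate


# Hypothetical 5-passenger instance on random points in the unit square.
rng = random.Random(1)
n = 5
events = [0] + [s * p for p in range(1, n + 1) for s in (1, -1)]
pts = {e: (rng.random(), rng.random()) for e in events}
start = [0] + [e for p in range(1, n + 1) for e in (p, -p)]
plain = descend(start, lambda t: cost(t, pts))   # traditional 2-opt result
result = two_sac(start, pts)
```

By construction the tour returned is never longer than the plain local optimum found by the first optimize phase; how much shorter it is depends on the instance.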

Fig. 6 illustrates a sample run on a 51-city DARP problem, where the cost and neighborhood size of every solution that the algorithm accepted were recorded. The dotted line indicates solution cost; the jagged line measures the size of each solution's feasible neighborhood. From Eq. (2) earlier we note that the largest possible percentage of feasible neighbors in a solution's neighborhood is 75%. The circled points denote local optima in the traditional sense, i.e. solutions whose cost is better than every solution in its neighborhood. The triangulated points refer to the end of a sacrifice phase. It should be noted that traditional local optimization would have terminated at the first circled data point - after approximately 75 search iterations. Thus sacrificing further improved the solution cost from 1087.60 to 762.55 - a reduction of approximately 30%.

3.2. Computing the neighborhood size

In order to test a solution's worth for sacrificing we need to estimate, if not compute exactly, the size of its neighborhood, whether it be 2-opt or 3-opt. Intuitively, the further apart (in terms of bus stops; as


initial solution [19,21] or the need to search the neighborhood in a best-improvement, or steepest-descent manner, as opposed to the faster first-improvement algorithm. (Yannakakis [24] refers to the method of neighbor selection as the pivoting strategy.) Surprisingly, from random starts, our initial optimization phase performed equally well on a run-for-run basis when run using first-improvement as opposed to best-improvement neighborhood search. First-improvement was a clear winner when the two algorithms were allowed equal amounts of computing time.

Psaraftis [18] reports improved performance for both 2-opt and 3-opt when he generated his initial solutions using a Minimum Spanning Tree heuristic. However, it is unlikely that this would hold up when the algorithm is allowed to start from multiple random initial solutions. Our observations with a somewhat weaker Nearest Neighbor heuristic algorithm support this claim.

The behavior in sacrificing mode was similar: using a best-improvement strategy yielded little improvement over first-improvement, and when the significantly longer running time of best-improvement was factored in, first-improvement was a clear winner.

3.3.2. Terminating sacrificing

We have already described the basic flow of the algorithm: alternate between an optimization phase and a sacrificing phase. While it is clear that the optimization phase should terminate upon discovery of a local optimum, it is not as obvious when sacrificing should terminate. Finding a local optimum with respect to the sacrificing metric is of little value - we are not ultimately interested in solutions with large neighborhoods. It might seem then that continuing the search beyond a few iterations of sacrificing is wasted and expensive effort. On the other hand, if we allow only a (small) constant number of sacrifice iterations, there is the danger that we will not escape the previous local optimum's basin of attraction and when we return to the optimization phase we will just return to the same local optimum again.

To escape the basin of attraction of the previous local optimum we sum the lengths of the segments reversed in each sacrificed 2-swap. When this sum exceeds 2n, where n is the number of customers on the bus, we terminate the sacrificing phase. Looking at Fig. 6 the sacrifice phase appeared to run for a similar number of iterations to the optimize phase when the search settled down. While this heuristic appeared to give satisfactory results, others exist. Although an interesting question, we have not followed this line of research.
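The stopping rule above amounts to a running budget. A minimal sketch, assuming that the length of a reversed segment is measured in stops and that each accepted swap is recorded as an index pair (i, j) reversing tour[i..j]; the function name is hypothetical:

```python
def sacrifice_iterations(swaps, n):
    """Number of accepted sacrifice swaps applied before the phase is terminated.

    swaps: list of (i, j) pairs, each reversing the segment tour[i..j]
    n:     number of customers on the bus
    """
    total = 0
    for k, (i, j) in enumerate(swaps, start=1):
        total += j - i + 1  # assumed measure: number of stops in the reversed segment
        if total > 2 * n:
            return k  # budget exceeded: stop sacrificing after this swap
    return len(swaps)  # budget never exceeded


print(sacrifice_iterations([(0, 2), (1, 4), (2, 3)], n=3))
```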

3.3.3. Miscellaneous parameters

It would be foolish to make unbounded sacrifices. To forestall runaway losses we imposed a maximum sacrifice limit of 1.25 times the current solution length. It turned out, however, that this parameter was of negligible importance and was rarely called upon. This may be explained by noting that our sacrificing metric is the quotient of the solution length and the solution's neighborhood size: in order for a sacrifice to incur a loss of greater than 25% in solution length, a corresponding gain of more than 25% in neighborhood size would have been necessary. This proved to be extremely rare.
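The 25% argument follows directly from the definition of the sacrificing metric. As a short worked step, assuming positive tour lengths and non-empty feasible neighborhoods, with x the current solution and y an accepted sacrifice:

```latex
\begin{align*}
C'(y) < C'(x)
  &\iff \frac{C(y)}{|N(y)|} < \frac{C(x)}{|N(x)|} \\
  &\iff \frac{C(y)}{C(x)} < \frac{|N(y)|}{|N(x)|}.
\end{align*}
% Hence C(y) > 1.25\,C(x) can only be accepted when |N(y)| > 1.25\,|N(x)|.
```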

4. Experimental procedure and results

We compared the performance of our sacrificing method with the performance of the traditional local search algorithm. As we have mentioned previously we did not include the popular simulated annealing and tabu search variations of local search since their running times are not competitive with ordinary local search.

Psaraftis [18] previously demonstrated the superiority of the 3-opt algorithm over 2-opt for DARP, which is consistent with the results already known for TSP. Given this, we became interested in


Table 1
Inter-city length distributions for DARP data-sets DS0-DS4

  N    DS0            DS1            DS2            DS3            DS4
 21    48.64 (22.7)   44.52 (21.2)   58.77 (26.8)   56.63 (25.0)   49.97 (26.0)
 31    49.65 (23.0)   54.73 (24.9)   50.04 (23.1)   51.41 (27.2)   53.15 (24.8)
 41    52.92 (25.2)   49.14 (23.5)   53.59 (24.7)   50.52 (23.5)   52.16 (23.7)
 51    55.13 (25.3)   52.30 (26.5)   49.69 (23.5)   51.32 (25.2)   51.06 (24.2)
 61    51.76 (24.9)   50.18 (23.6)   49.25 (24.6)   49.18 (23.2)   51.30 (24.7)
 71    52.45 (24.6)   52.82 (26.1)   51.69 (24.2)   51.27 (24.7)   53.25 (24.9)
 81    51.98 (24.9)   54.34 (25.5)   51.65 (24.1)   49.62 (23.9)   50.91 (23.9)
 91    49.66 (23.7)   49.55 (23.5)   50.55 (24.3)   48.32 (23.0)   49.90 (24.5)
101    51.87 (25.0)   52.63 (25.2)   52.36 (24.5)   51.52 (24.4)   51.94 (24.7)
111    50.32 (24.2)   53.29 (25.2)   50.32 (24.2)   50.16 (23.9)   52.89 (24.9)
121    50.60 (24.3)   53.01 (25.8)   50.87 (24.3)   50.01 (24.0)   52.53 (25.2)
131    54.00 (25.3)   52.36 (24.6)   50.42 (24.2)   50.42 (24.2)   49.99 (23.6)
141    52.00 (24.8)   52.61 (25.0)   49.60 (23.5)   50.60 (23.9)   50.44 (24.0)
151    51.94 (24.5)   52.93 (24.9)   52.62 (24.6)   52.39 (24.5)   50.83 (23.9)
161    52.41 (24.4)   52.05 (24.8)   51.43 (24.8)   49.73 (24.0)   52.54 (24.9)
171    51.62 (24.5)   53.48 (25.6)   51.25 (24.4)   52.84 (24.6)   51.63 (24.7)
181    51.17 (24.6)   53.70 (25.5)   51.99 (24.7)   52.76 (24.8)   50.22 (24.3)
191    53.06 (25.0)   52.33 (24.5)   53.39 (25.1)   51.97 (24.9)   50.39 (24.2)
201    51.81 (24.8)   50.60 (24.3)   53.21 (25.0)   53.03 (25.2)   52.03 (24.8)

attempting to bridge the gap in solution quality (which increases with problem-size) between the two choices of k by augmenting the 2-opt algorithm with sacrificing. Although the results to follow indicate that 3-opt is still the champion, our sacrificing algorithm made 2-opt a considerably more powerful algorithm and makes sacrificing a worthwhile strategy in many situations. In what follows, 3-OPT refers to the traditional local search algorithm using a 3-swap neighborhood generation scheme and using our O(n^2) algorithm for computing FD3. 2-OPT similarly uses 2-swap as its perturbation strategy and our O(n) algorithm for computing FD. By 2-SAC we mean 2-OPT augmented with our sacrifice scheme, as described in the previous section.

4.1. Experimental procedure

We considered problem sizes varying, in increments of 10, from 21 cities to 201 cities, or put another way, from 10 to 100 passengers, in steps of 5. For each value of N, the number of cities, we generated 5 instances (see footnote 4), each comprising N points from a square of length 100. The base (city c_0) was defined to be at location (50, 50) and the remaining N - 1 cities were generated by choosing pairs of random coordinates uniformly generated over the dimensions of the square. The Euclidean distance metric was used throughout. Each city was then identified with picking up or dropping off one of the n passengers by the following relationship. Passenger i is picked up at city 2i - 1 and dropped off at city 2i. Table 1 gives the mean inter-city length generated by this step for each of the five data-sets. In this table and subsequent tables standard-deviations are given in parentheses.
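The instance-generation procedure can be sketched in Python as follows. This is an illustration only: the function names, the seeding scheme and the use of the population standard deviation are assumptions, and the sketch makes no attempt to reproduce the actual data-sets.

```python
import math
import random
from itertools import combinations
from statistics import mean, pstdev


def generate_instance(n_cities, seed=None, side=100.0):
    """Base city 0 at the centre of the square, remaining cities uniform over it."""
    rng = random.Random(seed)
    coords = [(side / 2, side / 2)]
    coords += [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n_cities - 1)]
    return coords


def pickup_city(i):
    """Passenger i is picked up at city 2i - 1."""
    return 2 * i - 1


def dropoff_city(i):
    """Passenger i is dropped off at city 2i."""
    return 2 * i


def intercity_stats(coords):
    """Mean and (population) standard deviation of all pairwise Euclidean distances."""
    distances = [math.dist(a, b) for a, b in combinations(coords, 2)]
    return mean(distances), pstdev(distances)


coords = generate_instance(21, seed=0)
print(intercity_stats(coords))
```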

On each of the instances and at each of the values of N we generated 10 runs of 3-OPT and as many runs of the 2-OPT and 2-SAC algorithms as would run in that amount of time. Since 3-OPT considers a larger neighborhood at each iteration, a deeper and longer-running search results, so it is only

4 We will refer to the set of 0th instances over all 19 values of N as the 0th data-set, or DS 0. Data-sets DS1-DS 4 are defined similarly.


Table 2
Mean solution length as returned by 2-OPT, 2-SAC and 3-OPT over data-sets DS0-DS4. Standard-deviations are listed in parentheses

  N    Initial             3-OPT             2-SAC              2-OPT
 21     1090.83 (161.1)     461.94 (50.0)     507.53 (62.1)      638.80 (94.9)
 31     1601.52 (130.2)     584.53 (35.1)     664.29 (54.1)      835.35 (105.0)
 41     2101.33 (149.6)     681.10 (50.3)     764.56 (69.4)     1016.29 (129.4)
 51     2649.67 (177.0)     751.94 (61.5)     898.84 (92.4)     1163.96 (129.2)
 61     3062.13 (168.0)     808.62 (35.4)     979.22 (94.3)     1317.50 (148.9)
 71     3710.35 (189.8)     895.81 (39.5)    1114.14 (111.9)    1502.28 (149.3)
 81     4171.55 (228.0)     939.29 (46.1)    1230.59 (143.5)    1687.66 (166.4)
 91     4513.69 (195.6)     987.14 (41.4)    1294.35 (143.2)    1748.24 (160.3)
101     5261.01 (219.0)    1045.47 (49.2)    1397.92 (157.5)    1913.69 (170.1)
111     5719.54 (273.5)    1072.08 (47.3)    1446.07 (166.3)    2015.50 (185.6)
121     6232.33 (264.4)    1134.69 (57.4)    1587.47 (203.8)    2189.96 (207.6)
131     6746.18 (319.9)    1172.13 (57.6)    1633.82 (194.2)    2262.61 (207.1)
141     7183.30 (282.9)    1229.20 (40.1)    1709.43 (203.6)    2378.90 (201.4)
151     7872.98 (281.1)    1278.93 (49.2)    1835.71 (219.0)    2560.65 (216.7)
161     8310.81 (322.3)    1325.16 (52.4)    1912.66 (229.3)    2638.95 (226.7)
171     8934.90 (326.0)    1380.86 (43.4)    2015.75 (252.3)    2794.33 (221.8)
181     9413.87 (357.9)    1408.22 (50.6)    2087.89 (263.9)    2893.64 (232.4)
191     9962.02 (365.9)    1454.96 (59.2)    2158.38 (267.9)    2998.31 (236.9)
201    10474.83 (361.9)    1524.25 (73.0)    2313.55 (300.0)    3159.62 (253.8)

fair that the comparisons be with respect to the same amount of computing resources. These data represented the basis for our comparisons of performance as reported in the next section.

In all cases the initial starting solutions were chosen at random.

Our experiments were run on a Texas Instruments Explorer II Lisp machine, using the Common Lisp programming language.

4.2. Experimental results

We report on the performance of the three algorithms according to three parameters: average tour length, best tour length and average running times of the algorithms. In view of the fact that local search algorithms are generally run more than once, keeping the best run, we want to know what is the best behavior of the algorithm, as opposed to the average behavior. In addition we report on average neighborhood sizes for the three contending algorithms.

4.2.1. Tour length performance

In this section we analyze the behavior of the three contending algorithms from the solution-cost point of view. Table 2 illustrates the average improvements observed after running the three algorithms on all five data-sets. The column labeled 'Initial' refers to the average length of the randomly chosen initial solutions. The three subsequent columns indicate the average final solution of the labeled algorithm. The quality of the solutions returned by 2-SAC fluctuates more than that of either of the other two algorithms. This is a trend that we will often see repeated: while 2-SAC is likely to have a better average performance than 2-OPT, its behavior over multiple runs is likely to be more erratic.

Fig. 7 illustrates the results of our experiments when the algorithms were run for an equal amount of time on data-set 0, i.e. the 0th generated instance at each problem size. Rather than illustrate this on each of the other four data-sets we have distilled the peak performance of the algorithms by looking at


Fig. 7. Best solution found by 2-OPT, 2-SAC and 3-OPT on data-set DS0. [Plot: horizontal axis, N; legend: + 2-OPT, x 2-SAC, o 3-OPT.]

the percentage improvement of the best-run with respect to the average solution returned. The average of the five "peak-to-mean" ratios at each value of N and for each of the algorithms is shown in Fig. 8. The low deviations from the mean that 3-OPT exhibits are another indication of the power of the 3-opt search strategy. In this figure we also see a widening disparity between the average solution length and the best solution length as returned by 2-SAC. This is consistent with the variances shown in Table 2.

4.2.2. Running time behavior

Table 3 illustrates the means and standard deviations (in parentheses) for the running times of each of the three algorithms as computed over all five data-sets. The final entry of this table is a least squares estimate of the running times over the five data-sets. The estimates predicted the running times for a small number of runs of a 501-city problem for 2-OPT and 2-SAC.

Fig. 8. Average peak-to-mean ratio on data-sets DS0-DS4. [Plot: horizontal axis, N; legend: + 2-OPT, x 2-SAC, o 3-OPT.]


Table 3
Algorithmic running-times in seconds on data-sets DS0-DS4

  N    3-OPT                   2-SAC                   2-OPT
 21       5.41 (0.9)             1.81 (0.8)             0.15 (0.0)
 31      18.24 (2.4)             4.88 (2.5)             0.36 (0.1)
 41      43.49 (7.6)            10.79 (6.7)             0.65 (0.1)
 51      84.66 (14.8)           15.39 (9.6)             1.04 (0.3)
 61     145.67 (24.5)           20.73 (13.5)            1.40 (0.3)
 71     210.08 (33.6)           29.05 (18.5)            1.96 (0.4)
 81     311.22 (44.9)           32.26 (21.1)            2.36 (0.4)
 91     447.77 (56.5)           41.03 (26.8)            3.20 (0.6)
101     624.24 (86.6)           54.78 (36.1)            3.99 (0.7)
111     846.78 (132.9)          69.08 (47.7)            4.82 (1.0)
121    1053.60 (152.3)          71.71 (48.9)            5.68 (1.0)
131    1333.14 (176.1)          93.34 (59.9)            6.66 (1.2)
141    1728.92 (233.7)         102.44 (67.1)            7.67 (1.4)
151    2026.41 (258.2)         112.23 (69.4)            8.63 (1.5)
161    2501.53 (448.2)         125.91 (73.5)           10.13 (1.7)
171    2940.61 (399.4)         141.19 (81.3)           11.35 (1.9)
181    3841.27 (635.9)         160.65 (98.2)           12.94 (2.1)
191    4232.62 (589.2)         186.08 (109.4)          14.27 (2.4)
201    4620.11 (575.3)         192.17 (112.0)          15.98 (2.7)
  *    6.3 x 10^-4 N^2.99      5.3 x 10^-3 N^1.98      3.4 x 10^-4 N^2.03

In light of the previous section, the 3-OPT averages are based on 50 random starts, while the figures for 2-OPT and 2-SAC are based on as many random starts as were possible in this amount of time. Again note the wide spread in running times of the 2-SAC algorithm as reflected in its comparatively larger variances.

Fig. 9 displays these results graphically, plotted on a log-log scale, where we include a plot of f(N) = N^2 as a high-water mark. The starred entry of Table 3 quantifies these estimates. It is significant

Fig. 9. Average running times for 2-OPT, 2-SAC and 3-OPT on data-sets DS0-DS4. [Log-log plot: horizontal axis, log(N); legend: o 3-OPT, x 2-SAC, + 2-OPT, and the reference curve N^2.]


Table 4
Feasible neighborhood ratios for 2-OPT, 2-SAC and 3-OPT on data-sets DS0-DS4. Each column represents the percentage ratio of feasible k-swaps. For k = 2, the columns labeled 2-Opt and 2-Sac refer to the ratios after optimization and the column labeled U. Bd. indicates the theoretical upper bound

       3-OPT               2-OPT/2-SAC
  N    Before    After     Before    2-Opt     2-Sac     U. Bd.
 21    22.055    23.935    23.992    26.849    34.632    71.05
 31    17.082    19.140    22.593    24.317    33.716    72.41
 41    14.007    16.880    21.244    23.500    33.262    73.08
 51    12.919    14.966    20.166    22.110    30.989    73.47
 61    10.393    12.184    18.303    19.289    28.672    73.73
 71     9.414    10.754    17.577    18.800    27.918    73.91
 81     8.578     9.302    16.535    16.310    25.102    74.05
 91     7.926     9.757    15.854    16.913    24.708    74.16
101     7.552     8.599    15.521    15.913    24.294    74.24
111     6.805     8.740    14.820    15.733    23.899    74.31
121     6.554     7.639    14.350    14.553    22.012    74.37
131     6.147     7.535    13.930    14.697    22.585    74.42
141     5.786     7.226    13.543    14.007    21.585    74.46
151     5.300     6.463    12.960    13.253    20.418    74.50
161     5.188     6.321    12.703    13.273    20.468    74.53
171     4.746     6.209    12.402    12.731    19.666    74.56
181     4.876     6.081    12.114    12.497    19.102    74.58
191     4.527     5.672    11.830    12.435    19.072    74.60
201     4.288     5.181    11.633    11.917    17.952    74.62

that 2-SAC represents a slowdown of only a constant factor over the 2-OPT algorithm on the average case: approximately 14 runs of 2-OPT are possible in the same time as one run of 2-SAC.

4.2.3. Neighborhood size

Table 4 compares feasible neighborhood sizes for the three algorithms. For 3-OPT we report the averages both before and after the algorithm. For 2-OPT the first two columns report these figures also. The average effect of sacrificing on neighborhood size is then reported. For the purposes of comparison we append to each row the upper bound on feasible neighborhood size for any solution when expressed as a percentage of all 2-swaps. Asymptotically this approaches 0.75, as we have seen from Eq. (2).

It is interesting that for both 2-OPT and 3-OPT the feasible neighborhood increases unaided, which is in contrast with Morgenstern and Shapiro's work with the Graph Coloring Problem where they report that as a local optimum is approached, the 'useful' 5 neighborhood diminishes. The algorithms themselves have a tendency to accumulate long runs of pick-ups at the beginning of the tour and likewise long runs of drop-offs towards the end. This leads to a larger neighborhood. Of course, it is exactly this phenomenon that we encourage when we perform our sacrificing. The average neighborhood size at the end of sacrificing is approximately 50% larger than before.

In Section 2.3 we have shown that the precedence constraints restrict the space of feasible solutions to be no more than 75% of the total number of k-swaps in the case of k = 2. However the final two columns of Table 4 show that sacrificing never seems to get anywhere near this upper limit. From the table it is clear that as a percentage of the swaps possible the feasible neighborhood shrinks towards zero. This may be explained by the fact that for an n passenger problem the total number of feasible

5 By useful they mean the number of neighbors that have a 'reasonable chance of being accepted as the next solution'.


solutions is 1/2^n of the entire space; as the problem size grows, the number of feasible solutions in a given neighborhood diminishes likewise.
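The 1/2^n fraction can be checked by brute force for small n. The sketch below assumes the city numbering used in the experiments (passenger i is picked up at city 2i - 1 and dropped off at city 2i) and enumerates every ordering of the 2n stops; the function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def is_feasible(tour):
    """True when every pick-up (city 2i - 1) precedes its drop-off (city 2i)."""
    position = {city: k for k, city in enumerate(tour)}
    n = len(tour) // 2
    return all(position[2 * i - 1] < position[2 * i] for i in range(1, n + 1))


def count_feasible(n):
    """Count feasible orderings of the 2n stops by exhaustive enumeration."""
    cities = range(1, 2 * n + 1)
    return sum(is_feasible(tour) for tour in permutations(cities))


for n in range(1, 5):
    # Each passenger's precedence constraint independently halves the space.
    print(n, count_feasible(n), factorial(2 * n) // 2 ** n)
```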

4.3. Discussion

Psaraftis [18] reports that there was no appreciable difference in the running times of best-improvement and first-improvement. This is in contrast to what was known for k-swap algorithms for the TSP and also to our experience with DARP. We speculate that the reason for his observation is the running time associated with the calculation of the FIRSTDEL matrix. His algorithm takes O(n^k) to build the table for k-swaps - the same time as it takes to search an entire neighborhood, as best-improvement must do at every iteration. Our improved algorithm allows us to compute this matrix an order of magnitude faster, thus allowing first-improvement significantly faster execution.

The data presented in Table 2 is generally in line with Psaraftis, although comparisons are difficult based on the scant evidence that he presents. From this table we see that sacrificing on average does better than halving the performance gap between 2-OPT and 3-OPT. On a limited amount of data we compared the performances of the three algorithms on values of n in the 250- to 500-city range. The trends were similar: 2-SAC apparently halved the (widening) gap between 2-OPT and 3-OPT.
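The gap-halving claim can be checked against Table 2 with a few lines of arithmetic. The sketch below copies a sample of rows from that table; the function name `gap_closed` and the choice of rows are illustrative.

```python
# N: (3-OPT, 2-SAC, 2-OPT) mean solution lengths, copied from Table 2.
TABLE_2 = {
    21: (461.94, 507.53, 638.80),
    51: (751.94, 898.84, 1163.96),
    101: (1045.47, 1397.92, 1913.69),
    151: (1278.93, 1835.71, 2560.65),
    201: (1524.25, 2313.55, 3159.62),
}


def gap_closed(three_opt, two_sac, two_opt):
    """Fraction of the 2-OPT to 3-OPT gap that 2-SAC recovers."""
    return (two_opt - two_sac) / (two_opt - three_opt)


for n, row in TABLE_2.items():
    print(f"N={n:3d}  gap closed by 2-SAC: {gap_closed(*row):.1%}")
```

A value above 50% means 2-SAC lies closer to 3-OPT than to 2-OPT at that problem size.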

Looking at the final entry of Table 3, the performance of 2-SAC looks even better: from the average running time estimates presented 2-OPT gets in only a small constant number of runs (approximately 14) for every run of 2-SAC.
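The ratios quoted in this discussion can be reproduced from the fitted estimates in the starred row of Table 3. The sketch below simply evaluates those fits; the dictionary and function names are hypothetical.

```python
# Least squares estimates from the starred row of Table 3: (coefficient, exponent),
# giving a running time in seconds of coefficient * N ** exponent.
FITS = {
    "3-OPT": (6.3e-4, 2.99),
    "2-SAC": (5.3e-3, 1.98),
    "2-OPT": (3.4e-4, 2.03),
}


def fitted_time(algorithm, n_cities):
    """Estimated running time in seconds for a problem with n_cities cities."""
    coefficient, exponent = FITS[algorithm]
    return coefficient * n_cities ** exponent


def time_ratio(slow, fast, n_cities):
    """How many runs of `fast` fit into one run of `slow` according to the fits."""
    return fitted_time(slow, n_cities) / fitted_time(fast, n_cities)


for n_cities in (21, 101, 201):
    print(n_cities,
          round(time_ratio("2-SAC", "2-OPT", n_cities), 1),
          round(time_ratio("3-OPT", "2-SAC", n_cities), 1))
```

Over the tested range the fitted 2-SAC to 2-OPT ratio stays roughly constant (about 12 to 13, broadly in line with the figure of approximately 14 quoted above), while the 3-OPT to 2-SAC ratio grows roughly linearly in N.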

It is also interesting to compare Tovey's predictions for general local search algorithms to our predicted running times. He predicts that the number of iterations of a first-improvement search algorithm will be linear in the size of the neighborhood to be searched at each iteration. This is generally in line with our observations for 2-OPT and 3-OPT.

When we compare 2-SAC to 3-OPT, the outcome is also in 2-SAC's favor given that we can run O(N) runs of 2-SAC in the same time as one run of 3-OPT. Although 3-OPT unquestionably produces superior solutions, the running time in finding these is often prohibitive: the average running time of 3-OPT on a 201-city problem was in the region of 1¼ hours. The corresponding running time for the 2-SAC algorithm was in the region of 3 minutes. (Coding the algorithms in Common LISP certainly hindered the running times of all our algorithms.)

It would appear that the computational burden of DARP 3-OPT is too great for large problems. Therefore we must make do with the poorer solutions that a faster algorithm generally provides. To this end, getting many random starts of a faster algorithm seems an attractive option. Fig. 10 makes this point by plotting the costs and times taken to find the best solutions using 3-OPT and 2-SAC on a 201-city problem. It would be hard to justify 3-OPT on the basis of this graphic: the best 3-OPT solution is 27% cheaper than the best 2-SAC but 12 times the effort was expended in achieving it. It should be pointed out, however, that comparing the best runs achieved tends to bias the picture in favor of 2-SAC - the lower variance of the 3-OPT algorithm means that a random run is more likely to be similar in performance to the plot shown than would be the case for the 2-SAC algorithm.

4.3.1. Other strategies

Sacrificing, as we have already pointed out, has a tendency to load up the majority of passengers in the first half of the tour and service mainly drop-offs in the second half. Indeed we have observed final tours in which as many as 90% of the pick-ups occur in the first half of the tour. This suggests the non-sacrificial strategy of randomly picking up everyone in the first n stops and randomly dropping them off in the second half. With this random initial solution we could perform local search. Although the tours returned by this algorithm appeared to be better than using a completely random starting point


Fig. 10. Best solutions and associated running times for 3-OPT and 2-SAC. [Plot: horizontal axis, Time (s).]

thanks to the larger neighborhoods explored, once a local optimum is reached, sacrificing for its neighborhood size can do little: it already has a large neighborhood, and cannot benefit from further increases.
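The pick-ups-first starting solution described in this subsection can be sketched as follows, assuming the same city numbering as in the experiments (pick-up at city 2i - 1, drop-off at city 2i); the function name and seeding are illustrative.

```python
import random


def pickups_first_tour(n, seed=None):
    """Random tour that visits all n pick-ups first, then all n drop-offs.

    Every such tour is feasible, since each pick-up precedes every drop-off.
    """
    rng = random.Random(seed)
    pickups = [2 * i - 1 for i in range(1, n + 1)]
    dropoffs = [2 * i for i in range(1, n + 1)]
    rng.shuffle(pickups)
    rng.shuffle(dropoffs)
    return pickups + dropoffs


print(pickups_first_tour(5, seed=1))
```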

A popular heuristic for TSP problems is the 'Nearest Neighbor' strategy [11]. From a given city, the algorithm dictates that we next visit the closest unvisited city. A modified version of this is possible for DARP, where we would visit the nearest feasible unvisited city. Using this as an initial solution, we apply 2-SAC to it. This algorithm performed well and returned solution lengths above the average returned by 2-SAC. Its drawback is, however, that only one run is possible. In keeping with observations made heretofore and in [16], a randomized algorithm may on average perform worse than it but will surely win out ultimately by virtue of its random initial starting solutions.
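A minimal sketch of the modified Nearest Neighbor construction is given below. It assumes Euclidean distances, the base at index 0, the pick-up/drop-off numbering used in the experiments, and ties broken by city index; these details and the function name are assumptions, since the text does not specify them.

```python
import math


def nearest_feasible_neighbor_tour(coords):
    """Greedy tour: from the current city, visit the nearest feasible unvisited city.

    coords[0] is the base; cities 2i - 1 and 2i are the pick-up and drop-off of
    passenger i. A drop-off is feasible only after its pick-up has been visited.
    """
    n = (len(coords) - 1) // 2
    unvisited = set(range(1, 2 * n + 1))
    picked_up = set()
    tour = []
    current = 0
    while unvisited:
        candidates = [c for c in unvisited if c % 2 == 1 or (c // 2) in picked_up]
        # Ties are broken by city index so that the result is deterministic.
        nxt = min(candidates, key=lambda c: (math.dist(coords[current], coords[c]), c))
        if nxt % 2 == 1:
            picked_up.add((nxt + 1) // 2)
        unvisited.remove(nxt)
        tour.append(nxt)
        current = nxt
    return tour


print(nearest_feasible_neighbor_tour([(50, 50), (60, 50), (0, 0), (55, 50), (100, 100)]))
```

The candidate list is never empty while cities remain: any unvisited pick-up is feasible, and once all pick-ups are visited every remaining drop-off is feasible.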

Although we spent a considerable effort fine-tuning the sacrificing algorithm in the hopes that it might come close to or surpass 3-OPT's performance, perhaps it is not surprising that we did not succeed in this goal. From the point of view of neighborhood size, we have introduced a bias into the search algorithm so that the 2-SAC neighborhood, which ranges between O(n) and O(n^2), spends as much of its time as is possible in larger neighborhoods. In contrast, 3-OPT's neighborhood ranges from O(n^2) to O(n^3). It seems that sacrificing has its limitations and for all the extra muscle that it brings it just cannot compete with the larger neighborhood that 3-OPT enjoys.

5. Conclusions

In this paper we have presented a technique that yielded significant improvements for DARP over the local optimization algorithm on which it is based. It did so without drastically altering the time required to find a solution. The quality of solutions found by 2-SAC, our extension to the famous 2-OPT algorithm, fell squarely between 2-OPT and its longer running but superior variant, 3-OPT. 2-SAC did so with no more than a constant increase in running time over 2-OPT.

The Dial-A-Ride-Problem is characterized by the uneven landscape of neighbors' neighborhoods: we have seen that the number of 2-OPT neighbors that a solution can have varies from O(n) to O(n^2). Using inter-city distances randomly chosen from a uniform distribution, we were able to achieve improvement in the performance of the algorithm by sacrificing shorter-length solutions for solutions that had larger-sized neighborhoods.

These results are not simply good fortune, we believe. We have been able to achieve similar gains using the technique to solve a rectangle layout problem [7,8]. The setting here was somewhat different in that we were able to choose from a family of strategies how to perturb or transform the current solution. Nonetheless we found that by making rectangle swaps that increased the number of possible transformations allowed we were able to achieve higher quality solutions. An increased number of possible transformations corresponds to a larger neighborhood.

These observations suggest an important problem-independent metric for one broad class of combinatorial optimization problems, namely problems with a non-uniform neighborhood structure. For problems in this class, our universal secondary metric is neighborhood size. For such problems we will move away from a primary metric local optimum to a neighboring solution if 1) the new solution's primary metric evaluation isn't much worse; and 2) the new solution has a larger feasible local neighborhood.

We believe that sacrificing can be an effective search strategy for problems of this class. Moreover, we conjecture that when neighborhood size for such a problem can be calculated or estimated quickly, an optimize/sacrifice improvement cycle like the one we report on here constitutes a general and powerful extension to local search.

References

[1] Bentley, J.L., "Experiments on traveling salesman heuristics", in: Proceedings of First Annual ACM-SIAM Symposium on Discrete Algorithms, 1990, 91-99.

[2] Bock, F., "An algorithm for solving 'traveling-salesman' and related network optimization problems", Presented at the 14th National Meeting of the Op. Res. Soc. of America, St. Louis, MO, Oct. 24, 1958.

[3] Collins, N.E., Eglese, R.W. and Golden, B.L., "Simulated annealing: An annotated bibliography", Working Paper Series MS/S 88-019, College of Business and Management, University of Maryland, October 1988.

[4] Croes, G.A., "A method for solving traveling-salesman problems", Operations Research 6/6 (1958).

[5] Friden, C., Hertz, A., and de Werra, D., "STABULUS: A technique for finding stable sets in large graphs with tabu search", Computing 42 (1989) 35-44.

[6] Glover, F., "Tabu search - Part I", ORSA Journal on Computing 1/3 (1989) 190-205.

[7] Healy, P., "Rule-based local search in rectangle layout", Master's Thesis, University of Massachusetts, Amherst, May 1988.

[8] Healy, P., "Sacrificing: An augmentation of local search", Ph.D. Thesis, University of Massachusetts, Amherst, May 1991.

[9] Hertz, A., and de Werra, D., "Using tabu search techniques for graph coloring", Computing 39 (1987) 345-351.

[10] Johnson, D.S., Aragon, C.R., McGeoch, L.A. and Schevon, C., "Optimization by simulated annealing: An experimental evaluation, Part 1 (graph partitioning)", Operations Research 37/6 (1989) 865-892.

[11] Johnson, D.S., "Local optimization and the traveling salesman problem", in: Proc. 17th Colloquium on Automata, Languages and Programming, Springer-Verlag, Berlin, 1990, 446-461.

[12] Kernighan, B.W., and Lin, S., "An efficient heuristic procedure for partitioning graphs", The Bell System Technical Journal 49/2 (1970).

[13] Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P., "Optimization by simulated annealing", Science 220 (1983) 671-680.

[14] Lin, S., "Computer solutions of the traveling salesman problem", The Bell System Technical Journal 44/10 (1965).

[15] Or, I., "Traveling Salesman-type combinatorial problems and their relation to the logistics of blood banking", Ph.D. Thesis, Department of Industrial Engineering and Management Science, Northwestern University, Evanston, IL, 1976.

[16] Papadimitriou, C.H., and Steiglitz, K., Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, Englewood Cliffs, NJ 07632, 1982.

[17] Psaraftis, H.N., "A dynamic programming solution to the single-vehicle many-to-many immediate request dial-a-ride problem", Transportation Science 14 (1980) 130-154.

[18] Psaraftis, H.N., "k-Interchange procedures for local search in a precedence-constrained routing problem", European Journal of Operational Research 13 (1983) 391-402.

[19] Rothfarb, B., Frank, H., Rosenbaum, D.M., Steiglitz, K. and Kleitman, D.J., "Optimal design of offshore natural-gas pipeline systems", Operations Research 18/6 (1970).

[20] Savelsbergh, M.W.P., "An efficient implementation of local search algorithms for constrained routing problems", European Journal of Operational Research 47/1 (1990) 75-85.

[21] Steiglitz, K., Weiner, P., and Kleitman, D.J., "The design of minimum-cost survivable networks", IEEE Transactions on Circuit Theory 16/4 (1969).

[22] Stein, D.M., "Scheduling dial-a-ride transportation systems", Transportation Science 12/3 (1978) 232-249.

[23] Tovey, C.A., "Hill climbing with multiple optima", SIAM Journal on Algebraic and Discrete Methods 6/3 (1985).

[24] Yannakakis, M., "The analysis of local search problems and their heuristics", in: Proc. 1990 Symp. Theoretical Aspects of Comp. Sci., 1990, 298-311.
