Instituto de Engenharia de Sistemas e Computadores de Coimbra

Institute of Systems Engineering and Computers

INESC - Coimbra

Tavares Pereira, F., Figueira, J., Mousseau, V., Roy, B.

Multiple Criteria Districting Problems, Models, Algorithms, and Applications: The Public Transportation Paris Region Pricing System

No. 21 2004

ISSN: 1645-2631

Rua Antero de Quental, 199; 3000-033 Coimbra; Portugal

www.inescc.pt

Multiple Criteria Districting Problems, Models, Algorithms, and Applications: The Public Transportation Paris Region Pricing System

Fernando Tavares Pereira ∗‡§, Jose Figueira †‡§, Vincent Mousseau § and Bernard Roy §

December 19, 2004

∗ Dept. of Mathematics, The University of Beira Interior, Rua Marques D’Avilla e Bolama, 6201-001, Covilha, Portugal, Phone/Fax: (+351) 275 31 97 32, E-mail: [email protected]

† Faculty of Economics, The University of Coimbra, Av. Dias da Silva, Coimbra, Portugal, Phone: (+351) 239 790 500, Fax: (+351) 239 790 514, E-mail: [email protected]

‡ INESC - Coimbra, Rua Antero de Quental, 199, 3000-033 Coimbra, Portugal

§ LAMSADE, Paris-Dauphine University, Place du Marechal De Lattre de Tassigny, 75775 Paris Cedex 16, France, Phone: (+33-1) 44-05-44-01, Fax: (+33-1) 44-05-40-91, E-mail: {fpereira,figueira,mousseau,roy}@lamsade.dauphine.fr

Contents

Abstract

1 Introduction

2 Problem statement
2.1 Modelling issues
2.2 Districting Criteria and Constraints
2.2.1 Homogeneity Criteria
2.2.2 Geographical Criteria
2.2.3 Flow Criteria
2.2.4 Conformity Criteria
2.2.5 Constraints
2.3 Multiple criteria background: concepts, definitions and notation

3 A local search evolutionary algorithm
3.1 Representing a solution
3.2 Defining the initial population by using a merging procedure
3.3 Assigning a fitness value to each individual
3.4 Selecting individuals
3.5 The crossover operator
3.6 The mutation operator
3.7 A local search procedure
3.8 Outline of the algorithm
3.9 Particular implementation issues

4 Illustrative examples
4.1 A small-size example
4.1.1 Evaluating criteria
4.1.2 The bi-criteria Mehrotra-like model and the ε-constraint method
4.1.3 The entire non-dominated set
4.1.4 Heuristic
4.1.5 Results
4.2 A large-size example
4.2.1 Evaluating criteria and constraints
4.2.2 Heuristic
4.2.3 Results

5 Case study: The public transportation pricing system problem
5.1 Data set
5.2 Criteria and constraints
5.3 Results

6 Conclusions and future research

Acknowledgements

Appendix A: Data set

Appendix B: Graphical representation of the solutions

Abstract

Districting problems are of high importance in a wide range of areas. Multiple criteria districting problems are a more realistic representation of the problems encountered in practice. This paper deals with the problem of partitioning a territory into zones. Each zone is composed of a set of elementary territorial units or atoms. A district map is formed by partitioning the set of atoms into connected zones without inclusions. This problem can be modelled using graph theory and 0-1 mathematical programming concepts and techniques. Each atom is associated with a vertex of the graph, while each pair of contiguous atoms defines an edge of the graph. Numerical values are also associated with the edges and/or vertices. The problem of enumerating all the efficient solutions for such a model is known to be NP-hard, which “imposes” abandoning exact methods for solving large-size instances. In this paper we propose a new method to approximate the efficient frontier, based on an evolutionary algorithm with local search. The algorithm introduces a new solution representation and new crossover/mutation operators. The algorithm was applied to a real-world problem concerning public transportation in the Paris region.

Keywords: Multiple criteria; Districting Problems; Evolutionary algorithms; Local search; Combinatorial optimization.

1 Introduction

Over the last three decades, many researchers, academics and practitioners from distinct fields have developed models, built algorithms and implemented solutions concerning the so-called partition or districting problem. It can be viewed as a process of grouping elementary units or atoms of a territory into larger pieces or clusters, giving rise to a map or configuration.

There are many practical questions related to this problem:

• To define the electoral districts of a country [5, 15, 20, 24].

• To establish the different working zones for a travel salesperson team [11, 18, 33, 39].

• To define areas in metropolitan internet networks to install hubs [29].

• To define the areas for manufactured and consumer goods [13].

The same kind of questions also occur in police districting [9], school districting [12], districting of salt spreading operations [27], defining electrical power zones [2], and many other domains. These are only some of the frequent real-world decision making questions and concerns in territory partition problems.

When carefully observing and analyzing the large scope of application areas dealing with districting problems, we cannot be indifferent to the crucial importance of this kind of decision making situation to our societies. Most of the applications cited above deal with real-life decision making situations that contribute to the development of our societies in a large variety of fields. By nature, these problems involve multiple criteria, which are frequently incommensurable and conflicting.

On the definition of the problem. More precisely, partitioning a territory into different “homogeneous” zones under multiple criteria consists of grouping elementary units of territory in order to form a set of districts, zones, or clusters. Thus a territory is composed of zones, each zone resulting from a grouping process of elementary units or atoms. The zones should fulfill certain technical, ethical, ecological, social and other constraints. Different configurations, plans or maps of a territory form a set of different solutions, each of which is evaluated on the basis of a family of criteria. Thus, the search for an optimal solution makes no sense and the “best” solution is, in general, a compromise where an improvement on a given criterion leads to a degradation of the evaluations on at least one of the remaining criteria. This is the concept of non-dominated solution.

A brief historical view on territory partition problems. Historically, among the different types of territory partition problems, it was the so-called electoral districting problem that impelled the use of scientific methodologies that sought the construction of political districts or zones as close as possible in terms of voting power, in order to generate trust in the partition process and ensure its impartiality. The importance of this last aspect is due to the fact that it is possible to design a partition favoring a certain political, social or ethnic group. It is well known in the history of U.S. politics that the Governor of the state of Massachusetts, Elbridge Gerry (1744-1814), in an attempt to guarantee his re-election, manipulated the division of his state to concentrate his voters in numbers just sufficient to elect a representative and to group a large number of his opponents in a few districts [24]. As a result, one of the districts had the shape of a salamander, as was suggested in the Boston Gazette on March 26, 1812. This fact originated the expression “gerrymander”, a contraction of Gerry and salamander.

The danger that this type of practice represents for democratic systems justified the dedication and the zeal of some researchers in the construction of new methodologies, as independent as possible of human intervention, that make it possible to draw electoral districts for all the political groups.

One of the first works [37] on this very topic appeared in 1961. It describes a heuristic process where one zone is built at each iteration. Each zone results from the successive inclusion of adjacent elementary units until a given number of voters is reached. This number is the ratio between the total population not yet assigned to any zone and the number of zones to be built. Some years later, in 1965, the first mathematical programming model was proposed [19], formulating the problem as a location/allocation model.

Apart from the political districting problem, the problem that has received the most attention in the area is that of designing zones for salespersons. Since 1971 different works have been published where the main objective consists of balancing the workload across the different zones [11, 18, 33, 39].

The frequent criteria and constraints. It is not easy to distinguish between criteria and constraints. Certain criteria are considered constraints in different models. Without going into the details of such a distinction, let us consider that all the partition problems we are concerned with can be modelled through an undirected and connected graph, where the vertices represent the elementary units and the edges the adjacency relation between elementary units. Our problem is a particular graph partition problem where each part of the partition is a connected subgraph. The following constraints should be considered:

1. Integrity. A vertex cannot belong to several subgraphs at the same time. It belongs to one and only one subgraph of the partition;

2. Contiguity. Any two vertices of a given zone can be joined by a path that does not pass through a different zone;

3. Absence of holes. It is not possible to obtain embedded connected subgraphs.

In what follows, we only consider these three issues as constraints.

As for the criteria, many classifications were published in the scientific literature, mainly for political districting problems [16, 25, 38]:

1. Grofman (1985) proposed the following classification: formal (equal population, contiguity, compactness, etc.); racial intent (no intent to dilute minority voting power); political intent (no intent to favor a political party, no intent to favor an incumbent, etc.); racial outcome/anticipated outcome (no retrogression in representation, no dilutions of racial minority voting strength and ‘racial proportionality’); political outcome/anticipated outcome (no “imposed” bias in favor of a party, no incumbent-centered partisan bias, etc.).

2. Morrill (1981) proposed to classify the political criteria into four classes: constitutional (equal population and equal probability of representation); geographic (compactness, contiguity and integrity of political of interest); political-geographic (representation of political units and integrity of political boundaries); political (minimal changes to old plans and no political “gerrymandering”).

3. Williams (1995) proposed the following classification: demographic (equal population and proportional minority representation); geographic (contiguity, compactness and community integrity); political (proportionality, safe vs. swing districts and similarity to the old plan).

This type of classification is specific to the political districting problem. In other districting problems there are similar criteria that have an analogous interpretation. For example, while in the political districting problem we search for zones as close as possible in terms of voting power, in the sales territory alignment problem we are looking for zones as close as possible in terms of workload. These similarities allow us to propose a new classification:

1. Homogeneity Criteria. Criteria aiming at distributing/homogenizing some attribute (aspect, characteristic, . . . ) in the partition as a whole or in each zone individually.

2. Geographical Criteria. Criteria aiming at defining districts that respect any geographical attribute, i.e., they account for the spatial position of the territorial units.

3. Flow Criteria. Assuming that there is a flow of any type between zones, we may wish to optimize the flow between zones. This criterion type reflects such an aspect.

4. Conformity Criteria. Criteria aiming at defining districts “compatible” with an existing territorial partition.

It should be remarked that this classification will be developed below (see Section 2.2).

Solving districting problems. When solving districting problems there are three main aspects to be considered: the “techniques” applied to solve this kind of problem, the “models” and the “algorithms”. These aspects are related to the following questions: How shall we deal with the partition of a territory, by agglomerating elementary units or by dividing the whole piece? How many criteria should be considered, a single one or several? Should we be concerned with exact solutions, or will approximate solutions suffice? Let us consider each point individually.

The different “techniques” for districting problems can be divided into two big families, one based on the concept of division and the other on the notion of agglomeration [8]:

1. In the division techniques the territory is considered as a whole and the districting procedure works by dividing it into pieces [6, 14].

2. In the agglomerative techniques the territory is composed of a set of elementary territorial units and a district is a subset of units forming a connected piece of land [10, 15, 19, 20, 28, 37].

On the other hand we can classify the districting problems in terms of the number of “criteria”:

1. Single criterion models. Models involving only one criterion. Generally, voting potential equality, workload equality, etc., is the only objective function considered. However, several models take into account some criteria, like compactness, as constraints [15, 19, 20].

2. Multiple criteria models. To deal with conflicting criteria some authors adopted models with more than one criterion. A district map with very good values for voting potential equality can perform very badly in terms of compactness and vice-versa. There are several strategies where the criteria are considered according to a fixed hierarchy reflecting the decision maker's preferences. In other cases the purpose is to build a mixed objective function combining all the objectives [1, 3, 5, 10].

The different works in this field can also be classified into exact and non-exact algorithms:

1. Exact techniques for single criterion problems are provided in [23], where a decomposition and column generation scheme is applied for their solution. Computing exact efficient solutions for districting problems is a very hard task since in general the size of real-world problems is unmanageable.

2. A new avenue for dealing with this problem is the use of meta-heuristics to approximate the exact non-dominated frontier. Indeed, this has already been done by several researchers [2, 27].

Most of the works published in the area deal with a specific districting problem without establishing a common “platform” for the different territory partitioning problems. The heuristics applied are also specific to each problem. Despite the very nature of reality, which is mainly multi-dimensional (since there is “always” more than one criterion to be optimized), there are still several works applying single criterion models, which amalgamate all the dimensions into the same scale and more often than not provide meaningless conclusions. We can say that, in general, exact techniques are of interest only from a theoretical point of view. They can only be applied to small-size instances, which are unrealistic.

Main features of the proposed approach. Our approach is an evolutionary algorithm combined with local search. The adopted representation for the individuals or solutions is the closest possible one to the solutions themselves. Each solution is a set of subsets where each subset represents a zone. This allows us to guide the operators (crossover and mutation) according to the specific criteria of each kind of problem. The main features of the algorithm are the following:

1. It solves large-size instances in a reasonable CPU time and for different kinds of problems.

2. It deals with multiple criteria.

3. It can take into account constraints that are specific to each type of problem.

We intend to create a new platform that allows our algorithm to be adapted, with little effort, to any districting problem.

The Paris region case study. The pricing of public transportation tickets in the Paris region, the “carte orange”, is calculated on the basis of a partition of the Paris region into concentric zones. The only criterion considered is the distance from the center. The price increases with this distance, but it is the same within each concentric zone. A study undertaken by the Syndicat des Transports Parisien (STP) showed that this districting map no longer corresponds to the needs of the users. With the current tendency of moving services from the center of big cities to the suburbs, many users of public transportation only use it inside suburban zones, without needing to travel to the center of the town. This fact, along with some observations of a socio-economic nature, led researchers at STP and LAMSADE to study a reform of the current ticket pricing system.

The outline of the paper. This paper is organized as follows. Section 2 describes how this problem can be modelled by using graph theory concepts, proposes a taxonomy of the criteria/constraints and presents the main concepts, definitions and notation. The local search evolutionary algorithm is described in Section 3. This is followed by an example for checking its effectiveness in Section 4. Section 5 is devoted to the study of a real-world problem. Finally, several important model implementation issues are discussed in Section 6 and some suggestions for future work are also given.

Figure 1: Contiguity graph (panels (a) and (b), each with units labelled 1 to 7)

2 Problem statement

In this section we will present a model for this problem that uses a connected graph. An original criterion taxonomy is also presented.

2.1 Modelling issues

Given a territory composed of indivisible elementary territorial units or atoms, we define a contiguity graph (see Fig. 1) as a connected graph G = (V, E), where V = {1, 2, . . . , n} denotes the set of vertices representing territorial units and E = {e1, e2, . . . , ek, . . . , em} ⊂ V × V denotes the set of edges, where ek = {i, j} represents two adjacent elementary units i and j, i.e., units having a common boundary.

Once the contiguity graph G = (V, E) is defined, a district map can be viewed as a partition of V into connected subsets of vertices. Furthermore, all the values associated with the territorial units (population, surface, etc.) can be associated with the corresponding vertices, i.e., for each $i \in V$ we have a vector $c_i = (c_i^1, c_i^2, \ldots, c_i^r)$ of $r$ values. Analogously, the values associated with a pair of contiguous territorial units (length of the common frontier, for example) can be associated with the corresponding edge. Thus, for each $e_j \in E$ we can have an $s$-vector $d_j = (d_j^1, d_j^2, \ldots, d_j^s)$. We can also have, for each pair of vertices $(i, j)$, one or more values $f_{ij}$ representing a flow from $i$ to $j$. It should be noticed that $i$ and $j$ need not represent contiguous territorial units.

In this context a solution Y will be represented as a partition of V, as follows:

\[
Y = \{y_1, y_2, \ldots, y_K\}, \quad \text{where } y_u \cap y_v = \emptyset \text{ for } u \neq v \text{ and } \bigcup_{1 \le u \le K} y_u = V. \tag{1}
\]
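As a minimal illustration of this representation (a sketch, not the authors' implementation), a contiguity graph can be stored as adjacency sets and a candidate solution as a list of vertex sets. The six-atom graph and the name `is_partition` are hypothetical; the function only checks the two conditions in (1).

```python
from typing import Dict, List, Set

# Hypothetical contiguity graph with six atoms (illustrative data only,
# not the graph drawn in Fig. 1).
ADJACENCY: Dict[int, Set[int]] = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 5},
    4: {2, 5, 6},
    5: {3, 4, 6},
    6: {4, 5},
}


def is_partition(vertices: Set[int], zones: List[Set[int]]) -> bool:
    """Check (1): zones are pairwise disjoint and their union is V.

    Empty zones are rejected as well, which is an extra assumption.
    """
    covered: Set[int] = set()
    for zone in zones:
        if not zone or covered & zone:
            return False  # empty zone, or an atom assigned twice
        covered |= zone
    return covered == vertices


V = set(ADJACENCY)
print(is_partition(V, [{1, 2, 3}, {4, 5, 6}]))     # valid two-zone map
print(is_partition(V, [{1, 2, 3}, {3, 4, 5, 6}]))  # atom 3 appears twice
```

Connectivity of each zone is a separate requirement and is not tested by this function.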

2.2 Districting Criteria and Constraints

Various types of criteria can be expressed to define a territorial partition. In addition, different types of constraints can also be imposed. This section aims at proposing a taxonomy for such criteria and constraints. In [4] and [8] we can find an exhaustive list of political districting problem criteria that can be generalized to other districting problems. We propose to classify districting criteria into 4 categories:

1. Homogeneity Criteria. Criteria aiming at distributing/homogenizing some attributes (aspects, characteristics, effects, etc.) in the partition as a whole or in each zone individually;

2. Geographical Criteria. Criteria aiming at defining districts that respect any geographical attribute, i.e., they account for the spatial position of the territorial units;

3. Flow Criteria. Assuming that there is a movement of flow of any type between zones (population, goods, etc.), we may wish to optimize the flow movement between zones;

4. Conformity Criteria. Criteria aiming at defining districts “compatible” with an existing territorial partition.

2.2.1 Homogeneity Criteria

Homogeneity criteria can be divided into three different sub-families:

Inter-zone criteria. We may want a partition where some attributes are uniformly distributed over all zones. For the political districting problem the number of voters in each zone should be balanced according to the “one-person-one-vote” principle. Hence, the number of voters in each district must be homogenized. If $c_i$ represents the number of voters in the unit associated with vertex $i$, then $p_u = \sum_{i \in z_u} c_i$ represents the total number of voters in zone $z_u$. This can be implemented in different ways:

1. Minimizing the difference between the number of voters of the zone containing the maximum number of voters and the one comprising the minimum number of voters:

\[
\min \left\{ \max_{1 \le u \le K} \{p_u\} - \min_{1 \le u \le K} \{p_u\} \right\} \tag{2}
\]

2. Another possibility is to minimize the sum of the deviations from the average:

\[
\min \sum_{u=1}^{K} |p_u - \bar{p}| \tag{3}
\]

where

\[
\bar{p} = \frac{1}{K} \sum_{i=1}^{n} c_i .
\]

We call this type of homogeneity inter-zone homogeneity criteria.
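The two inter-zone measures can be evaluated directly from the zone totals. The sketch below is only illustrative: the function names and the four-unit data set are assumptions, and a solution is taken to be a list of vertex sets that partitions the keys of `c`.

```python
from typing import Dict, List, Set


def zone_totals(c: Dict[int, float], zones: List[Set[int]]) -> List[float]:
    """p_u: total attribute value (e.g. voters) in each zone."""
    return [sum(c[i] for i in zone) for zone in zones]


def range_criterion(c: Dict[int, float], zones: List[Set[int]]) -> float:
    """Criterion (2): largest zone total minus smallest zone total."""
    p = zone_totals(c, zones)
    return max(p) - min(p)


def deviation_criterion(c: Dict[int, float], zones: List[Set[int]]) -> float:
    """Criterion (3): sum of absolute deviations from the average total."""
    p = zone_totals(c, zones)
    p_bar = sum(c.values()) / len(zones)  # (1/K) * sum of all c_i
    return sum(abs(p_u - p_bar) for p_u in p)


voters = {1: 10, 2: 20, 3: 30, 4: 40}  # illustrative data
balanced = [{1, 4}, {2, 3}]            # zone totals 50 and 50
unbalanced = [{1, 2}, {3, 4}]          # zone totals 30 and 70
print(range_criterion(voters, balanced), deviation_criterion(voters, balanced))
print(range_criterion(voters, unbalanced), deviation_criterion(voters, unbalanced))
```

Both measures are to be minimized, so the first map is preferred on either of them.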

Intra-zone criteria. We may want a partition where each zone is as uniform as possible according to a certain attribute, regardless of the “value” of this attribute in the remaining zones when considered individually. For example, when considering a district map for a team of salespersons, one of the criteria frequently used is to partition the territory into zones that are uniform according to some attribute of the population (e.g., academic qualifications), which facilitates the specialized tasks of each salesperson or agent. Two possible criteria for modelling this situation are the following:

1. To minimize the sum of the differences between the maximum and the minimum in each zone:

\[
\min \sum_{u=1}^{K} (M_u - m_u) \tag{4}
\]

where $M_u = \max_{i \in z_u} \{c_i\}$ and $m_u = \min_{i \in z_u} \{c_i\}$.

2. To minimize the “worst” difference between the maximum and the minimum in each zone:

\[
\min \left\{ \max_{1 \le u \le K} (M_u - m_u) \right\} \tag{5}
\]

We call this type of homogeneity intra-zone homogeneity criteria.
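A short sketch of criteria (4) and (5) follows, under the same assumptions as before: a solution is a list of non-empty vertex sets and `c` maps each unit to its attribute value. The names and data are hypothetical.

```python
from typing import Dict, List, Set


def zone_ranges(c: Dict[int, float], zones: List[Set[int]]) -> List[float]:
    """M_u - m_u for each zone: spread of the attribute inside the zone."""
    return [max(c[i] for i in zone) - min(c[i] for i in zone) for zone in zones]


def sum_of_ranges(c: Dict[int, float], zones: List[Set[int]]) -> float:
    """Criterion (4): sum over zones of (M_u - m_u)."""
    return sum(zone_ranges(c, zones))


def worst_range(c: Dict[int, float], zones: List[Set[int]]) -> float:
    """Criterion (5): largest (M_u - m_u) over all zones."""
    return max(zone_ranges(c, zones))


attribute = {1: 10, 2: 20, 3: 30, 4: 40}  # illustrative data
uniform = [{1, 2}, {3, 4}]                # similar units grouped together
mixed = [{1, 4}, {2, 3}]                  # extreme units share a zone
print(sum_of_ranges(attribute, uniform), worst_range(attribute, uniform))
print(sum_of_ranges(attribute, mixed), worst_range(attribute, mixed))
```

On this toy data the map that is best for inter-zone balance (`mixed`) is the worse one for intra-zone uniformity, which illustrates why such criteria can conflict.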

Distributing criteria. This type of criteria covers the situation where we want balanced zones with respect to some attribute. Suppose that each $c_i$ represents the production level of some service and $r_i$ the demand level. We may wish each zone to have the same production level per unit of demand. Therefore, if $C$ is the total production and $R$ the total demand, we can minimize the sum of the deviations of the production level per demand in each zone from the average:

\[
\min \sum_{u=1}^{K} \left| \frac{\sum_{i \in z_u} c_i}{\sum_{i \in z_u} r_i} - \frac{C}{R} \right| \tag{6}
\]
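Criterion (6) can be evaluated as in the sketch below. The function name and data are illustrative assumptions, and every zone is assumed to have strictly positive total demand so that the ratios are defined.

```python
from typing import Dict, List, Set


def distribution_criterion(
    c: Dict[int, float], r: Dict[int, float], zones: List[Set[int]]
) -> float:
    """Criterion (6): sum over zones of |production/demand - C/R|.

    Assumes every zone has strictly positive total demand.
    """
    overall = sum(c.values()) / sum(r.values())  # C / R
    total = 0.0
    for zone in zones:
        production = sum(c[i] for i in zone)
        demand = sum(r[i] for i in zone)
        total += abs(production / demand - overall)
    return total


production = {1: 10, 2: 20, 3: 30, 4: 40}  # illustrative c_i
demand = {1: 10, 2: 10, 3: 10, 4: 10}      # illustrative r_i
print(distribution_criterion(production, demand, [{1, 2}, {3, 4}]))
print(distribution_criterion(production, demand, [{1, 4}, {2, 3}]))
```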

2.2.2 Geographical Criteria

The geographical criteria aim at obtaining partitions composed of zones as similar as possible to a given geometric shape, or zones that include some attributes. This type of criteria is important when, for example, the zones should be visited by some salesperson. So it is convenient that the zones be as compact as possible (close in shape to a circle or square) to minimize the distances covered. Obviously, it would be counter-productive for a salesperson to visit a long and skinny zone. This type of criterion is called compactness.

Several authors have proposed measures to evaluate the degree of compactness of a partition. An exhaustive list of compactness indicators can be found in [21].

Any measure that evaluates the compactness of a partition must be based on a measure that evaluates the compactness of a zone. Therefore, the compactness of the partition can be measured by the sum of the values obtained for each zone, or by taking into account the worst of these values.

To measure the compactness of a zone we can define the ratio between the area of the zone and the area of the circle with a diameter equal to the maximum distance between two points in the zone. If the area of $z_u$ is $A_u$ and its greatest distance is $D_u$, then we maximize

\[
\sum_{u=1}^{K} \frac{4 A_u}{\pi D_u^2}. \tag{7}
\]

This measure takes values between 0 and K, with the value K attained in the case of maximum compactness, when every zone coincides with a circle.
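The per-zone ratio in (7) is easy to compute once the area and the greatest internal distance of each zone are known. In the sketch below both quantities are assumed to be given as inputs; the function names are hypothetical.

```python
import math
from typing import List, Tuple


def zone_compactness(area: float, diameter: float) -> float:
    """Ratio 4*A_u / (pi * D_u^2): zone area over the area of the circle
    whose diameter is the greatest distance inside the zone."""
    return 4.0 * area / (math.pi * diameter ** 2)


def partition_compactness(zones: List[Tuple[float, float]]) -> float:
    """Sum of the per-zone ratios, as in (7); zones are (area, diameter) pairs."""
    return sum(zone_compactness(a, d) for a, d in zones)


disc = (math.pi, 2.0)           # unit-radius disc: area pi, diameter 2
square = (1.0, math.sqrt(2.0))  # unit square: area 1, diagonal sqrt(2)
print(zone_compactness(*disc))    # ratio of a perfectly circular zone
print(zone_compactness(*square))  # a square scores 2/pi
print(partition_compactness([disc, square]))
```

A perfectly circular zone scores 1, so values of the sum closer to K indicate more compact zones.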

When some units have special attributes (central hospitals, central stations, etc.) we may want to put these units at the center of their zones. To evaluate this aspect we need to know the distance between two territorial units. Let $d_{ij}$ be the distance between $i$ and $j$, and $B_{z_u}$ the set of border units of the zone $z_u$. The set $SZ$ denotes the zones with some service and $s_{z_u}$ denotes the unit hosting the service. So, we want to maximize

\[
\sum_{z_u \in SZ} m(z_u), \quad \text{where } m(z_u) = \min_{i \in B_{z_u}} \{ d_{s_{z_u} i} \}, \tag{8}
\]

that is, we want to maximize the sum of the minimum distances between each service unit and the border units of its zone.
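A minimal sketch of measure (8) is given below. The data structures are assumptions made for illustration: `dist` maps ordered pairs of units to distances, `service_unit` maps each zone of SZ to its service unit, and `border` maps each zone to its set of border units.

```python
from typing import Dict, Hashable, Set, Tuple


def service_centrality(
    dist: Dict[Tuple[Hashable, Hashable], float],
    service_unit: Dict[Hashable, Hashable],
    border: Dict[Hashable, Set[Hashable]],
) -> float:
    """Measure (8): for every zone that hosts a service, take the smallest
    distance from the service unit to a border unit, and sum these values."""
    total = 0.0
    for zone, s in service_unit.items():
        total += min(dist[(s, i)] for i in border[zone])
    return total


# Illustrative data: two zones, each with one service unit and two border units.
dist = {("s", "a"): 2.0, ("s", "b"): 3.0, ("t", "c"): 1.0, ("t", "d"): 4.0}
service_unit = {"z1": "s", "z2": "t"}
border = {"z1": {"a", "b"}, "z2": {"c", "d"}}
print(service_centrality(dist, service_unit, border))
```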

2.2.3 Flow Criteria

Let us consider the flow between territorial units. In this situation it would be convenient to optimize the transfers between units of different zones. For example, in the case of a partition into zones for public transport ticket pricing, knowing the number of people commuting between each pair of units, it would be convenient to minimize the number of people commuting between different zones. For a partition $Z$, $E_Z = \{\{i, j\} : i \in z_u, j \in z_v, u \neq v\}$ represents all pairs of vertices that are in different zones. So we want to minimize

\[
\sum_{\{i,j\} \in E_Z} f_{ij} \tag{9}
\]

where $f_{ij}$ represents the number of people going from $i$ to $j$. Criteria of this kind are called flow criteria.
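The flow criterion can be sketched as follows. One assumption is made explicit here: flows are stored per ordered pair (i, j), and both directions are counted when i and j lie in different zones. The function name and data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def cross_zone_flow(
    flows: Dict[Tuple[int, int], float], zones: List[Set[int]]
) -> float:
    """Criterion (9): total flow between units assigned to different zones."""
    zone_of = {i: u for u, zone in enumerate(zones) for i in zone}
    return sum(f for (i, j), f in flows.items() if zone_of[i] != zone_of[j])


# Illustrative commuter counts between four units.
flows = {(1, 2): 5, (2, 1): 3, (1, 3): 7, (3, 4): 2, (4, 1): 1}
print(cross_zone_flow(flows, [{1, 2}, {3, 4}]))  # only 1->3 and 4->1 cross
print(cross_zone_flow(flows, [{1}, {2, 3, 4}]))
```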

2.2.4 Conformity Criteria

Finally, assuming that there is a current partition of a territory, we may want the new partition to be as “compatible” as possible with the current one. In this case we must use a measure of compatibility between partitions.

2.2.5 Constraints

For this kind of problem there are two constraints strongly associated with its nature. One is contiguity and the other is the absence of holes or inclusions in each zone. The first one means that each zone is a single portion of land such that every part is reachable from every other part without crossing the frontiers of the zone. From the contiguity graph point of view this means that the sub-graph associated with each zone must be connected. The second one means that no zone can have other zones “inside of it”.

Most of the literature considers contiguity without providing a mathematical formulation in the context of mathematical programming. Enforcing contiguity in a districting map model requires a number of constraints that is exponential in the number of vertices of the graph [24].

Formulating the constraint associated with the absence of holes seems to be even more difficult. The typical approach to this type of difficulty, when a practical application is intended, uses algorithmic techniques. From the algorithmic point of view it is not difficult to test the contiguity of a zone or to verify whether there are other zones inside it.
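Both algorithmic tests can be sketched with a breadth-first search on the contiguity graph. This is only an illustration: the function names, the toy graph, and in particular the `outer_atoms` argument (the atoms lying on the outer boundary of the territory, assumed to be known) are assumptions that do not come from the paper.

```python
from collections import deque
from typing import Dict, Set


def component(adj: Dict[int, Set[int]], start: int, allowed: Set[int]) -> Set[int]:
    """Vertices reachable from start while staying inside `allowed` (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def is_contiguous(adj: Dict[int, Set[int]], zone: Set[int]) -> bool:
    """A zone is contiguous if its induced sub-graph is connected."""
    if not zone:
        return False
    return component(adj, next(iter(zone)), zone) == zone


def has_hole(adj: Dict[int, Set[int]], zone: Set[int], outer_atoms: Set[int]) -> bool:
    """True if some atom outside the zone cannot reach any outer atom
    without crossing the zone, i.e. the zone encloses other territory."""
    rest = set(adj) - zone
    reachable: Set[int] = set()
    for atom in outer_atoms:
        if atom in rest and atom not in reachable:
            reachable |= component(adj, atom, rest)
    return reachable != rest


# Toy territory: atom 0 in the middle, surrounded by the ring 1-2-3-4.
ADJ = {0: {1, 2, 3, 4}, 1: {0, 2, 4}, 2: {0, 1, 3}, 3: {0, 2, 4}, 4: {0, 1, 3}}
OUTER = {1, 2, 3, 4}
print(is_contiguous(ADJ, {1, 2, 3, 4}), has_hole(ADJ, {1, 2, 3, 4}, OUTER))
print(is_contiguous(ADJ, {1, 3}), has_hole(ADJ, {1, 2}, OUTER))
```

Each test visits every vertex and edge at most once per search, which is consistent with the remark that these checks are not difficult algorithmically.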

Normally the constraint related to the absence of holes can be ignored when we have criteria that homogenize the surface and compactness. It is clear that if we have two embedded zones of similar size, the outer zone will have a bad compactness value.

Another kind of constraint results from the imposition of upper and/or lower bounds on one or more criteria.

2.3 Multiple criteria background: concepts, definitions and notation

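A multiple objective linear program (MOLP) may be written as follows:

\[
\begin{array}{l}
\max \{ f_1(x) = z_1 \} \\
\max \{ f_2(x) = z_2 \} \\
\quad \vdots \\
\max \{ f_k(x) = z_k \} \\
\text{s.t.: } x \in X
\end{array}
\]

or

\[
\text{``max''} \; Z = \{ F(x) = z \in \mathbb{R}^k \mid x \in X \}
\]

where: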

$k$ is the number of criteria or objectives;

$x$ is the vector of $n$ decision variables;

$f_i$ is a real function defined in $\mathbb{R}^n$ representing the $i$th objective;

$z_i$ is the criterion value (objective function value) of the $i$th objective;

$X$ is the feasible region in the decision space;

“max” means that the purpose is to maximize all the objectives simultaneously;

$F$ is a vector function composed of the $k$ functions $f_i$, $i = 1, 2, \ldots, k$;

$z$ is the objective function vector;

$Z$ is the feasible region in the criteria space.

Definition 2.1 (Dominance relation) Let $z^1, z^2 \in \mathbb{R}^k$ be two criteria vectors. Then, $z^1$ dominates $z^2$ iff $z^1 \ge z^2$ and $z^1 \ne z^2$, i.e., $z^1_i \ge z^2_i$ for all $i$ and $z^1_i > z^2_i$ for at least one $i$.

Definition 2.2 (Non-dominated solution) A point $\bar{z} \in Z$ is non-dominated iff there is no other $z \in Z$ such that $z \ge \bar{z}$ and $z \ne \bar{z}$. Otherwise, $\bar{z}$ is a dominated criterion vector.

Definition 2.3 (Non-dominated set) The set $Z_{nd} \subseteq Z$ of all non-dominated criteria vectors is called the non-dominated set.

Definition 2.4 (Efficient solution) A solution $x \in X$ is efficient (Pareto-optimal) if its criterion vector, $z = F(x)$, is non-dominated.

Definition 2.5 (Efficient set) The set of all efficient solutions is called the efficient set and is represented by $X_{eff}$.
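For a finite set of criteria vectors, Definitions 2.1 to 2.3 translate directly into code. The sketch below assumes, as in the MOLP above, that all criteria are maximized; the function names and sample points are illustrative.

```python
from typing import List, Sequence, Tuple


def dominates(z1: Sequence[float], z2: Sequence[float]) -> bool:
    """Definition 2.1 for maximization: z1 >= z2 componentwise and z1 != z2."""
    return all(a >= b for a, b in zip(z1, z2)) and any(
        a > b for a, b in zip(z1, z2)
    )


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Definition 2.3 on a finite set: keep the points dominated by no other."""
    return [z for z in points if not any(dominates(w, z) for w in points)]


points = [(1, 3), (2, 2), (3, 1), (1, 1), (2, 1)]  # illustrative vectors
print(non_dominated(points))
```

This pairwise filter takes quadratic time in the number of points, which is adequate for small examples.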

In multiple criteria linear integer programming, two types of non-dominated solutions can be distinguished:

Definition 2.6 (non-dominated supported solution) A point $z \in Z$ is a non-dominated supported solution if it can be obtained by solving the following parametric mathematical programming problem:

\[
\min_{z \in Z} \sum_{i=1}^{k} \lambda_i z_i \quad \text{for } \sum_{i=1}^{k} \lambda_i = 1, \; \lambda_i > 0 \text{ for } i = 1, \ldots, k.
\]

Definition 2.7 (non-dominated unsupported solution) A point $z \in Z$ is a non-dominated unsupported solution if it belongs to the interior of the convex hull of $Z$, $\mathrm{Conv}(Z)$.

In general, for large-size instances, it is not possible to enumerate the whole set $Z_{nd}$; therefore we must approximate it using non-exact methods. The symbol $\hat{Z}_{nd}$ represents the approximation of $Z_{nd}$. This brings us to the concept of potential non-dominated solution.

Definition 2.8 (Potential non-dominated solution) A point $\bar{z}$ is potential non-dominated with respect to a feasible subset $\hat{Z} \subseteq Z$ iff there is no other $z \in \hat{Z}$ such that $z \ge \bar{z}$ and $z \ne \bar{z}$. Otherwise, $\bar{z}$ is a dominated criterion vector.

When the subset $\hat{Z}$ is omitted, it is understood from the context. Very often the subset $\hat{Z}$ represents the solutions determined by an algorithm.

Definition 2.9 (Potential non-dominated set) The set $\hat{Z}_{nd} \subseteq \hat{Z}$ of all potential non-dominated criteria vectors with respect to $\hat{Z}$ is called the potential non-dominated set.

3 A local search evolutionary algorithm

A local search evolutionary algorithm (LSEA) results from the combination of an evolutionary algorithm with a local search. The expression hybrid evolutionary algorithm is also used in this context. There are no rules about the way these combinations are done. Normally, each author applies his or her own skills when choosing a particular technique.

Over recent years, combinations of different heuristics have been used, thus opening the research path of hybrid algorithms [30]. They have shown their ability to provide high quality local optima. In general, genetic operators are not well suited to locating a better solution close to another one [32]. Therefore, the combination of evolutionary algorithms and local search seems to be very profitable [31]. For multiple criteria optimization problems, evolutionary algorithms also seem particularly adequate because they deal simultaneously with a set of potential solutions, which allows us to approximate several solutions of the efficient set in a single run.

The main feature of our approach is to find, in a low CPU time, a set of potential non-dominated solutions that approximate the exact non-dominated frontier. In each iteration, after attributing a fitness value to each individual, two different solutions are “randomly” selected in order to apply the crossover and mutation operators and thus form a new generation. At the same time a list of potential non-dominated solutions is built. When a new solution is successfully inserted in this list, a local search is then applied to it.

This section comprises the following subsections: representing a solution; defining the initial population by using a merging procedure; assigning a fitness value to each individual; selecting individuals; the crossover operator; the mutation operator; a local search procedure; the outline of the algorithm; and some implementation issues.

Figure 2: Graph (vertices 1 to 8 grouped into the zones y1, y2 and y3)

3.1 Representing a solution

A solution, Y, is represented as a partition, Y = {y1, y2, . . . , yK}, of the set of vertices, V. Each subset yi is the set of vertices of a connected subgraph of G. Thus, a partition is implemented by a list of lists where each list represents an element of Y. The following partition,

Y = { y1 = {1, 3}, y2 = {2, 5, 4}, y3 = {6, 7, 8} }

is represented in Fig. 2. Each element, yi, of Y is called a zone.
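A minimal sketch of the list-of-lists representation, together with a check that a candidate solution is a partition of V into connected zones. The edge list below is an assumption made for illustration (the edges of Fig. 2 are not listed in the text), and all function names are hypothetical.

```python
from collections import deque

# Hypothetical edges for an 8-vertex graph; NOT the actual edges of Fig. 2.
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (4, 5),
         (3, 6), (5, 6), (4, 7), (6, 8), (7, 8)]


def adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def is_connected(zone, adj):
    """Breadth-first search restricted to the vertices of one zone."""
    zone = set(zone)
    if not zone:
        return False
    start = next(iter(zone))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in zone and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == zone


def is_valid_solution(solution, adj):
    """Every vertex appears in exactly one zone and every zone is connected."""
    vertices = [v for zone in solution for v in zone]
    covers_once = len(vertices) == len(set(vertices)) and set(vertices) == set(adj)
    return covers_once and all(is_connected(zone, adj) for zone in solution)


ADJ = adjacency(EDGES)
Y = [[1, 3], [2, 5, 4], [6, 7, 8]]   # a solution stored as a list of lists
valid = is_valid_solution(Y, ADJ)
```

A list such as [[1, 3], [2, 5, 4], [2, 7, 8]] is rejected by this check, since vertex 2 would belong to two zones and vertex 6 to none.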

3.2 Defining the initial population by using a merging procedure

In many applications, the subsets of vertices belonging to the partition are constrained. For example, there are problems where the vertices have an associated non-negative weight and the total weight of each zone is constrained by lower and/or upper bounds. When we have any type of constraint, we refer to these problems as constrained districting problems. Thus, in the process of generating new solutions we must deal with two main concerns:

1. To reach feasible solutions;

2. To obtain the best possible solution according to the criteria.

The initial population, P0, is composed of a set of individuals or solutions,

P0 = {Y1, Y2, . . . , YN}.

To generate an individual we start from the trivial solution where each subset of the partition is composed of a unique vertex. The order of the zones is randomly generated for each individual. Then a procedure called “Merging” is applied, which consists in grouping two neighboring zones until reaching the number of zones fixed a priori (see Algorithm 1). This procedure is also used in the crossover and mutation operators (see Fig. 3). If we have constraints concerning the number of vertices in each zone, the zone comprising the lowest number of vertices is merged with one of its neighbors. The strategy of choosing the neighboring zone depends generally on the criteria or the constraints of each problem. Normally, we use greedy heuristics that depend on a weighted-sum of the two criteria. In order to promote diversity in the set of initial solutions, the order of the incident edges to each vertex is randomly modified each time.

Figure 3: Merging solutions (diagram labels: Merging, crossover, mutation, new solutions)

Algorithm 1 Merging

Input: A solution Y = {y1, y2, . . . , yK}
Output: A solution W = {w1, w2, . . . , wL} such that L ≤ K

while (K > the fixed number of zones) do
    Let yi, yj be two neighboring zones, chosen according to a heuristic rule defined for each type of criteria;
    Merge yj and yi;
    K ← K − 1;
end while
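The merging loop can be sketched as follows. The neighbor-selection rule used here (merge the smallest zone with its smallest neighboring zone) is only a placeholder assumption; in the paper the rule is a problem-specific heuristic. The path graph and the function names are hypothetical.

```python
def neighbors_of(zone_idx, zones, adj):
    """Indices of the zones that share at least one edge with zones[zone_idx]."""
    zone = zones[zone_idx]
    result = []
    for j, other in enumerate(zones):
        if j != zone_idx and any(v in adj[u] for u in zone for v in other):
            result.append(j)
    return result


def merging(zones, adj, target):
    """Merge neighboring zones until only `target` zones remain."""
    zones = [list(z) for z in zones]          # work on a copy
    while len(zones) > target:
        # Placeholder rule (assumption): pick the zone with the fewest vertices ...
        i = min(range(len(zones)), key=lambda t: len(zones[t]))
        candidates = neighbors_of(i, zones, adj)
        if not candidates:
            raise ValueError("zone has no neighbor; the graph is disconnected")
        # ... and merge it with its smallest neighboring zone.
        j = min(candidates, key=lambda t: len(zones[t]))
        zones[j].extend(zones[i])
        del zones[i]
    return zones


# Hypothetical path graph 1-2-3-4-5-6, starting from the trivial solution.
PATH_ADJ = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
trivial = [[v] for v in PATH_ADJ]
merged = merging(trivial, PATH_ADJ, target=3)
```

Because only neighboring zones are merged, every zone produced this way stays connected.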

3.3 Assigning a fitness value to each individual

The selected strategy for evaluating the fitness value of each solution makes use of a well-known technique suggested by Srinivas and Deb [34] called Non-dominated Sorting Genetic Algorithm (NSGA).

This technique is based on the Pareto ranking where the individuals from the entire population are classified into several levels according to the concept of dominance. The potential non-dominated individuals belonging to the population are identified first. These individuals form the first front of the potential non-dominated frontier. Afterwards, we assign to each of them a large dummy fitness value, F. In order to preserve the diversity of the population, these individuals, classified in different levels, are then shared according to their dummy fitness values.

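The share value of each individual, $f_i$, is determined by dividing its original fitness value by the quantity

\[ m_i = \sum_{j \in ND} sh(d_{ij}), \]

where

\[ sh(d_{ij}) = \begin{cases} 1 - \left( \dfrac{d_{ij}}{\sigma_{share}} \right)^2 & \text{if } d_{ij} < \sigma_{share} \\ 0 & \text{otherwise.} \end{cases} \]

This quantity is proportional to the number of individuals around it. Thus,

\[ f_i = \frac{F}{m_i}. \]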

The value dij is the Euclidean distance between two solutions Yi and Yj, and σshare is the maximum distance allowed between any two solutions to become members of a niche (a set of solutions that have common features). Afterwards, this front of potential non-dominated individuals is temporarily ignored in order to process the other members of the population, with the goal of identifying the second front. The new potential non-dominated individuals are then assigned a new dummy fitness value, which is kept smaller than the minimum shared dummy fitness of the previous front. This method continues until the entire population is classified into several fronts and no more fronts can be identified. This technique is presented in Algorithm 2.

Algorithm 2 Evaluation

Input: A population P = {Y1, Y2, . . . , YN}
Output: A fitness value fi for each Yi

Paux ← P
F ← M, where M is a large dummy fitness value
while Paux ≠ ∅ do
    Znd ← all potential non-dominated individuals in Paux
    for all Yi ∈ Znd do
        Calculate mi
        fi ← F / mi
    end for
    Paux ← Paux − Znd
    F ← x such that x ≤ min{fi : Yi ∈ Znd}
end while
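A compact sketch of this evaluation, working directly on criterion vectors under maximization. The distance is taken in the criteria space, and the next dummy value is set to 90% of the smallest shared fitness of the current front, following the rule stated in Section 3.9. The function names, the default dummy value and the sample data are assumptions made for illustration.

```python
import math


def dominates(z1, z2):
    """z1 dominates z2 under maximization."""
    return (all(a >= b for a, b in zip(z1, z2))
            and any(a > b for a, b in zip(z1, z2)))


def sharing(d, sigma_share):
    """Sharing function sh(d): positive inside the niche radius, zero outside."""
    return 1.0 - (d / sigma_share) ** 2 if d < sigma_share else 0.0


def evaluate(points, sigma_share, big=1000.0, decay=0.9):
    """Assign a shared fitness to each criterion vector, front by front."""
    fitness = [0.0] * len(points)
    remaining = set(range(len(points)))
    dummy = big
    while remaining:
        # Current front: individuals not dominated by any remaining individual.
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining)]
        for i in front:
            # Niche count m_i; it includes sh(0) = 1 for the individual itself.
            m_i = sum(sharing(math.dist(points[i], points[j]), sigma_share)
                      for j in front)
            fitness[i] = dummy / m_i
        remaining -= set(front)
        # Next dummy value: strictly below the worst shared fitness so far.
        dummy = decay * min(fitness[i] for i in front)
    return fitness


# Hypothetical criterion vectors: three form the first front, one the second.
sample = [(3, 1), (1, 3), (2, 2), (1, 1)]
isolated = evaluate(sample, sigma_share=0.5)   # niches do not overlap
crowded = evaluate(sample, sigma_share=2.0)    # (2, 2) shares with both ends
```

With the larger niche radius, the middle point (2, 2) receives a lower fitness than the two extreme points, which is the penalization of crowded regions that the sharing scheme is meant to produce.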

Figure 4: First front evaluation (axes z1 and z2; legend: solutions already evaluated, second front, remaining solutions).

The idea of this technique is to penalize, that is, to decrease the fitness value of, the solutions that are too close, in the criteria space, to other solutions in the current front. Fig. 4 represents the calculation of fi for the first front. It should be noticed that the size of each point is inversely proportional to the number of solutions in its niche.

3.4 Selecting individuals

To select each pair of individuals we apply the roulette wheel method. The fitness value of any individual is guaranteed to be smaller than the minimum fitness value of the previous potential non-dominated front. Therefore, the elements in the first fronts have a higher probability of being selected for the next generation than the remaining solutions.
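Roulette wheel selection of a pair of distinct individuals can be sketched as below. Only the standard library is used; the function names and the fitness values are hypothetical, and the sketch assumes at least two individuals with positive fitness.

```python
import random


def roulette_select(fitness, rng):
    """Pick one index with probability proportional to its fitness."""
    total = sum(fitness)
    pick = rng.uniform(0.0, total)
    accumulated = 0.0
    for index, value in enumerate(fitness):
        accumulated += value
        if pick <= accumulated:
            return index
    return len(fitness) - 1          # guard against floating-point round-off


def select_pair(fitness, rng):
    """Draw two different individuals with the roulette wheel."""
    first = roulette_select(fitness, rng)
    second = roulette_select(fitness, rng)
    while second == first:           # redraw until the two parents differ
        second = roulette_select(fitness, rng)
    return first, second


# Hypothetical shared fitness values: a first front, a second front, a poor one.
fitness = [1000.0, 1000.0, 900.0, 1.0]
rng = random.Random(0)
parents = select_pair(fitness, rng)
```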

3.5 The crossover operator

The crossover operator takes two parents (solutions) and exchanges sections of their chromosomes. In our case, in order to generate an offspring solution from two parents,

\[ Y_1 = \{y^1_1, y^1_2, \ldots, y^1_{K_1}\} \quad \text{and} \quad Y_2 = \{y^2_1, y^2_2, \ldots, y^2_{K_2}\}, \]

we first choose the $k$ best zones, $y^1_{i_1}, y^1_{i_2}, \ldots, y^1_{i_k}$, from partition $Y_1$, according to a weighted-sum. The number $k$ is chosen randomly within the range $[\frac{1}{4}K_1, K_1 - \frac{1}{4}K_1]$. These $k$ zones will belong to the offspring solution.

Figure 5: Crossover operator (panels: Solution 1, Solution 2, Phase I, Phase II).

Afterwards, we define an equivalence relation $\sim$ in the set

\[ V \setminus \Big( \bigcup_{j=1}^{k} y^1_{i_j} \Big) \]

as follows:

• $v_1, v_2 \in V \setminus \bigcup_{j=1}^{k} y^1_{i_j}$;

• $v_1 \sim v_2 \Leftrightarrow \exists\, l \in \{1, \ldots, K_2\}$ such that there is a path between $v_1$ and $v_2$ in $y^2_l \setminus \bigcup_{j=1}^{k} y^1_{i_j}$.

In other words, $v_1$ is equivalent to $v_2$ if and only if, for some zone $y^2_l$ in $Y_2$, there is a path between $v_1$ and $v_2$ in $y^2_l$ that avoids the vertices of $\bigcup_{j=1}^{k} y^1_{i_j}$.

The offspring candidate solution is composed of $y^1_{i_1}, y^1_{i_2}, \ldots, y^1_{i_k}$ plus the equivalence classes of the relation $\sim$ defined in $V \setminus \bigcup_{j=1}^{k} y^1_{i_j}$. Generally, at this point, the number of zones is greater than the fixed one. Consequently the “Merging” procedure is applied to group zones as described in Algorithm 1 (see Fig. 3). As we can see in Fig. 5, one chooses a zone from solution 1, composed of vertices 2, 4 and 5. In Phase I, the equivalence classes defined by the relation $\sim$ in {1, 3, 6, 7, 8} are {1}, {3, 6} and {7, 8}. Afterwards, the zones {1} and {3, 6} are merged, as we can see in Phase II.
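The equivalence classes are simply the connected components of each zone of the second parent once the vertices of the kept zones are removed. The sketch below computes them; the edge list and the second parent are assumptions chosen so that the classes match the ones quoted for Fig. 5, since the actual graph and Solution 2 are not listed in the text.

```python
from collections import deque

# Hypothetical adjacency map for an 8-vertex graph (not the real Fig. 5 graph).
ADJ = {1: {2, 3}, 2: {1, 3, 4, 5}, 3: {1, 2, 6}, 4: {2, 5, 7},
       5: {2, 4, 6}, 6: {3, 5, 8}, 7: {4, 8}, 8: {6, 7}}


def equivalence_classes(kept_zones, parent2, adj):
    """Connected components of each zone of parent2 after removing kept vertices."""
    removed = {v for zone in kept_zones for v in zone}
    classes = []
    for zone in parent2:
        rest = set(zone) - removed
        while rest:
            start = min(rest)
            component = {start}
            queue = deque([start])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v in rest and v not in component:
                        component.add(v)
                        queue.append(v)
            classes.append(sorted(component))
            rest -= component
    return classes


kept = [[2, 4, 5]]                               # zone taken from the first parent
parent2 = [[1, 2, 4], [3, 5, 6], [7, 8]]         # hypothetical second parent
classes = equivalence_classes(kept, parent2, ADJ)
candidate = kept + classes                       # 4 zones, to be reduced by Merging
```

The candidate offspring has more zones than the fixed number, so the “Merging” procedure would be applied to it next.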

Figure 6: Mutation operator (panels: Initial Solution, Phase I, Phase II).

3.6 The mutation operator

The mutation operator takes one solution and randomly modifies the chromosome. Typically, there is a probability associated with this operator. In our case this probability is 1, since we can control the way the new solution is obtained without incurring great costs.

The operator starts by breaking up a set of zones (the worst zones according to a weighted-sum) in such a way that each vertex constitutes one zone. After this operation the number of zones is greater than the fixed one. Then the “Merging” procedure is applied as described in Algorithm 1 (see Fig. 3). Fig. 6 shows the different phases of the mutation. Firstly, the zone with nodes 2, 4 and 5 is chosen and is broken up into three zones (Phase I). In Phase II the zone with node 4 is merged with the zone with node 5, and the zone with node 2 is merged with the zone comprising nodes 1 and 3.

3.7 A local search procedure

In the local search procedure we use the concept of neighborhood structure. From a current solution we can determine all the neighbor solutions. Neighbor solutions are determined by moving at most one vertex from a certain zone in the current solution to one of its neighboring zones, as Fig. 7 shows.

The possibility of moving a vertex to a neighboring zone is checked. If the move is possible, it is performed, yielding a different solution that is then checked to determine whether it is a potential non-dominated one. If this is the case the solution is added to the list.

As Fig. 7 shows, when moving vertex 2 from zone 2 to zone 1, two new zones are obtained. This move is feasible, but others are not. For example, vertex 8 cannot be moved from zone 3.

Each new potential non-dominated solution, resulting either from the crossover and mutation operators or from the application of local search, is placed in a queue data structure. Local search will be applied to it later on.

Figure 7: Neighborhood structure (two panels showing vertices 1 to 8 grouped into zones z1, z2 and z3).

As we can see, the neighborhood structure can be searched by using an O(n) polynomial time algorithm.
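A simple way to enumerate the feasible single-vertex moves is sketched below: a vertex may leave its zone only if the zone stays non-empty and connected, and it may only join a zone that contains one of its neighbors. The graph is a hypothetical stand-in for Fig. 7, and this straightforward version makes no attempt to reach the O(n) bound mentioned above.

```python
from collections import deque

# Hypothetical adjacency map (not the real Fig. 7 graph).
ADJ = {1: {2, 3}, 2: {1, 3, 4, 5}, 3: {1, 2, 6}, 4: {2, 5, 7},
       5: {2, 4, 6}, 6: {3, 5, 8}, 7: {4, 8}, 8: {6, 7}}


def is_connected(zone, adj):
    """Breadth-first search restricted to the vertices of one zone."""
    zone = set(zone)
    if not zone:
        return False
    start = next(iter(zone))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in zone and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == zone


def feasible_moves(zones, adj):
    """List the moves (vertex, source zone index, target zone index)."""
    zone_of = {v: i for i, zone in enumerate(zones) for v in zone}
    moves = []
    for v, src in zone_of.items():
        rest = [u for u in zones[src] if u != v]
        if not is_connected(rest, adj):
            continue                  # source zone would be empty or split
        targets = {zone_of[u] for u in adj[v]} - {src}
        moves.extend((v, src, dst) for dst in sorted(targets))
    return moves


zones = [[1, 3], [2, 5, 4], [6, 7, 8]]     # zone indices 0, 1, 2
moves = feasible_moves(zones, ADJ)
```

With these assumed edges, vertex 2 can move from the second zone to the first, whereas vertex 8 cannot leave the third zone because vertices 6 and 7 would be disconnected.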

3.8 Outline of the algorithm

The pseudo-code procedure for the Local Search Evolutionary Algorithm is outlined in Algorithm 3.

Algorithm 3 Outline of the algorithm

t ← 0
Find an initial population, P0
Compute Fitness in P0
From P0, initialize the set of potential non-dominated solutions, Znd
while stop condition is false do
    Children ← Crossover(Pt)
    Mutant ← Mutation(Children)
    Update Znd with Children ∪ Mutant
    Local Search(Znd)
    Compute Fitness in Pt ∪ Children ∪ Mutant
    Pt+1 ← the best of Pt ∪ Children ∪ Mutant
    t ← t + 1
end while

3.9 Particular implementation issues

During the process of conceptualization (design) and implementation of this algorithm some issues were treated in a very particular way. We emphasize three main issues:

1. Assigning a fitness value. One important rule in assigning a fitness value is to guarantee that the fitness value of any individual is kept smaller than the minimum fitness value of the previous potential non-dominated front. When we have to choose the next dummy fitness value it is not enough to keep the minimum fitness value of the previous front because, in some cases, the whole population would end up with the same final fitness value. To avoid this situation, the next dummy fitness value is 90% of the minimum fitness value of the previous front. This percentage was chosen after an empirical study. With this value we get a very good distribution of the fitness values among the whole population.

2. Selecting individuals. The crossover operator is applied to a couple of previously selected solutions. One of them is chosen in the population while the second one is selected from the set Znd that contains the potential efficient solutions. The main reason is that the new potential efficient solutions resulting from local search are kept in the set Znd. They do not go directly into the population. Thus, the genetic operators still have an opportunity to be applied to them.

3. Applying local search. Each new potential efficient solution resulting from genetic operators and local search is stored in a queue structure. When the process of applying the genetic operators stops, and while the queue is not empty, a local search is applied to the first solution in the queue, then to the next one, and so on until the last of the remaining solutions. This process stops before the queue is emptied when an upper bound on the number of local search iterations is reached.

4 Illustrative examples

Our main concern in this section is to test how the algorithm behaves in practice. Frequently, instances with known results are used to evaluate the performance of metaheuristics (Section 6.3.4 in [36]). However, there are no available instances for our problem; consequently, we had to build some examples.

In this section we propose a heuristic for each kind of criterion to be tested (each criterion requires a specific heuristic). The main feature distinguishing each heuristic is the rule for choosing the two neighboring zones that will be merged.

4.1 A small-size example

We start by building a small-size bi-criteria instance with 22 vertices and 49 edges (see Fig. 8). For this instance we can calculate the entire set of efficient solutions by using the ε-constraint technique (see [35]) and the Mehrotra mathematical programming model with two criteria.

4.1.1 Evaluating criteria

Let us associate values $c^1_{ij}$ and $c^2_{ij}$ to each edge {i, j}. We want to maximize the two criteria

\[ f_l(Y) = \sum_{\{i,j\} \in E_Y} c^l_{ij} \quad \text{for } l = 1, 2, \]

where

\[ E_Y = \{\{i, j\} \in E : i, j \in y_u, \ u = 1, 2, \ldots, K\} \]

represents the subset of edges between vertices belonging to the same zone.

Figure 8: Test Graph (22 vertices; the edge labels are pairs of weights)
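Evaluating the two edge criteria for a given partition amounts to summing the weights of the edges whose endpoints lie in the same zone. The small graph and its weight pairs below are hypothetical and only serve to exercise the formula; the function name is an assumption.

```python
def edge_criteria(zones, weights):
    """Return (f1, f2): sums of the two weights over intra-zone edges.

    `weights` maps an edge (i, j) to its pair of values (c1, c2).
    """
    zone_of = {v: idx for idx, zone in enumerate(zones) for v in zone}
    f1 = f2 = 0
    for (i, j), (c1, c2) in weights.items():
        if zone_of[i] == zone_of[j]:      # the edge belongs to E_Y
            f1 += c1
            f2 += c2
    return f1, f2


# Hypothetical 4-cycle with a pair of weights on each edge.
weights = {(1, 2): (32, 18), (2, 3): (20, 16), (3, 4): (12, 47), (1, 4): (45, 15)}
f_a = edge_criteria([[1, 2], [3, 4]], weights)   # keeps edges (1,2) and (3,4)
f_b = edge_criteria([[1, 4], [2, 3]], weights)   # keeps edges (1,4) and (2,3)
```

In this toy instance neither partition dominates the other: the first is better on the second criterion and the second is better on the first.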

4.1.2 The bi-criteria Mehrotra-like model and the ε-constraint method

Let $x = (x^k_i)$ and $r = (r^k_{ij})$ be the decision variables defined as follows:

\[ x^k_i = \begin{cases} 1 & \text{if vertex } i \text{ belongs to zone } k \\ 0 & \text{otherwise,} \end{cases} \qquad r^k_{ij} = \begin{cases} 1 & \text{if edge } \{i, j\} \text{ belongs to zone } k \\ 0 & \text{otherwise.} \end{cases} \]

Assuming that the number of zones to be formed is K and that every zone must have at least L vertices, the Mehrotra model, in the bi-criteria case, with costs $c^1_{ij}$ and $c^2_{ij}$ for each {i, j} ∈ E, can be defined as follows:

\[
\begin{array}{llll}
\max & z_1 = f_1(x, r) = \displaystyle\sum_{k=1}^{K} \sum_{\{i,j\} \in E} c^1_{ij}\, r^k_{ij} & & \\
\max & z_2 = f_2(x, r) = \displaystyle\sum_{k=1}^{K} \sum_{\{i,j\} \in E} c^2_{ij}\, r^k_{ij} & & \\
\text{s.t.} & r^k_{ij} \leq x^k_i, \quad r^k_{ij} \leq x^k_j & \{i, j\} \in E & (10) \\
 & r^k_{ij} \geq x^k_i + x^k_j - 1 & \{i, j\} \in E & (11) \\
 & \displaystyle\sum_{k=1}^{K} x^k_i = 1 & i = 1, 2, \ldots, n & (12) \\
 & \displaystyle\sum_{i=1}^{n} x^k_i \geq L & k = 1, 2, \ldots, K & (13) \\
 & x^k_i, \ r^k_{ij} \in \{0, 1\} & &
\end{array}
\]

Constraints (10) and (11) guarantee the consistency of the variables: they ensure that an edge is inside a zone if and only if both of its vertices are in that zone. Constraint (12) ensures that each vertex belongs to exactly one zone and (13) ensures that each zone has at least L vertices.

The ε-constraint problem associated with the previous bi-criteria problem can be stated as follows,

\[
\begin{array}{ll}
\max & z_1 = f_1(x, r) \\
\text{s.t.} & (x, r) \in X \\
 & z_2 \geq \varepsilon
\end{array}
\qquad (14)
\]

where ε is a scalar.

In the ε-method, ε varies among all the values for which (14) remains feasible. So, in order to identify a set of efficient solutions, a sequence of problems (14) is solved, one for each different value of ε [7]. For the integer bi-criteria linear problem, the entire non-dominated set $Y^{nd}$ can be determined by solving a sequence of problems (14).
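The sequence of ε-constraint problems can be illustrated on a finite list of criterion vectors, where "solving" problem (14) reduces to a filtered maximum. In the paper each problem is solved with a mathematical programming solver; the enumeration below is only a stand-in, the sample points are hypothetical, and the step ε ← z2 + 1 assumes integer-valued criteria.

```python
def epsilon_constraint(points):
    """Enumerate the non-dominated (z1, z2) vectors of a finite set (maximization)."""
    front = []
    eps = min(z2 for _, z2 in points)          # start with a non-binding bound
    while True:
        feasible = [p for p in points if p[1] >= eps]
        if not feasible:                       # the bounded problem became infeasible
            return front
        # Maximize z1; among ties keep the largest z2, as required for efficiency.
        best = max(feasible, key=lambda p: (p[0], p[1]))
        front.append(best)
        eps = best[1] + 1                      # integer data assumed: tighten the bound


# Hypothetical integer criterion vectors.
points = [(5, 1), (5, 2), (4, 3), (3, 3), (2, 5), (1, 4)]
front = epsilon_constraint(points)
```

The tie-breaking on z2 matters: without it, the first iteration could return (5, 1), which is dominated by (5, 2).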

Theorem 4.1 (Theorem of equivalence [17]) Consider ε ≤ max f2(x, r). If the solution (x∗, r∗) solves problem (14) and, when (x∗, r∗) is not unique, it leads to a maximal value for criterion z2, then (x∗, r∗) solves the original bi-criteria problem, that is, (x∗, r∗) is an efficient solution.

Ranges                   Problem   K   Min. vertices   Non-dominated sol.
c1e, c2e ∈ [1, 9]        1         4   4               16
                         2         5   3               19
c1e, c2e ∈ [1, 20]       3         4   4               15
                         4         5   3               17
c1e, c2e ∈ [10, 50]      5         4   4               20
                         6         5   3               37

Table 1: Tests Problems

4.1.3 The entire non-dominated set

For the calculation of the non-dominated set we used the LINGO [22] solver. For the same graph (see Fig. 8) we used three different weight ranges, [1, 9], [1, 20] and [10, 50] (see Appendix A), and for each one we imposed partitions with 4 zones, where each zone must have at least 4 vertices, and partitions with 5 zones, with a minimum of 3 vertices per zone. Therefore, we have six different instances for the same graph. Table 1 contains the number of exact non-dominated solutions for each instance.

4.1.4 Heuristic

For this type of criteria the heuristic developed can be summarized as follows:

1. Choose the zone with the lowest number of vertices, ym.

2. For each neighbor, $y_k$, of $y_m$ determine the sum $\sigma_1$ ($\sigma_2$) of the edge weights $c^1_{ij}$ ($c^2_{ij}$) linking $y_k$ and $y_m$. A weighted-sum $\lambda_1\sigma_1 + \lambda_2\sigma_2$ is used to rank all neighbors in decreasing order.

3. Among its neighbors, choose the first zone yNm for which the constraint associated with the number of vertices per zone is not violated.

4. Merge zones ym and yNm.

The objective of steps 1 and 3 is to obtain solutions that fulfill the constraint associated with the number of vertices per zone. Step 2 aims at choosing the best contribution to the criteria.

4.1.5 Results

The parameters for the LSEA are the following:

1. population size pop_size = 400;

2. crossover probability cp = 0.3;

3. maximum generations max_gen = 20.

For each instance we have run the algorithm ten times.

As for the first five problems, the algorithm was able to find all the exact efficient solutions. Indeed, almost all the solutions were found with the heuristic used to determine the initial population. As for the last problem, the algorithm, after running ten times, found all the exact solutions 7 times and identified 36 exact solutions in the remaining runs.

These results allow us to conclude that the heuristic used to determine the initial population is effective, since almost all the non-dominated solutions were determined. But this fact did not allow us to test the “work” of the genetic operators.

Figure 9: Paris region contiguity graph (legend: dot = elementary territorial unit; line = adjacent elementary territorial units)

4.2 A large-size example

To develop a more rigorous evaluation of the performance of our algorithm, we decided touse a large-size instance with 1300 vertices and 3719 edges (see Fig. 9), which correspondsindeed to the real-world graph concerning the Paris region transportation problem.


4.2.1 Evaluating criteria and constraints

To have some idea of how the algorithm behaves with this data we used the following strategy, which allows us to know two exact solutions of the efficient set: the optimal solution for each criterion. One instance, with two values $c^1_i$ and $c^2_i$ for each vertex, was built as follows:

1. Consider two homogeneity criteria, z1 and z2, as the ones described in Section 2.2.1, “measure” 4;

2. From a large set of solutions, previously generated, select the two most distant solutions $Y^{*1} = \{y^1_1, y^1_2, \ldots, y^1_k\}$ and $Y^{*2} = \{y^2_1, y^2_2, \ldots, y^2_k\}$ according to a distance in the decision space defined a priori.

3. Use the solution $Y^{*1}$ ($Y^{*2}$) to set the values of $c^1_i$ ($c^2_i$). For each zone give the same value to all its vertices, although different values are assigned to each zone.

Therefore, we have the guarantee that $Y^{*1}$ ($Y^{*2}$) is the optimum solution for the first (second) criterion with value 0. That is,

\[ f_l(Y^{*l}) = \sum_{u=1}^{k} \Big( \max_{i \in y^l_u} \{c^l_i\} - \min_{i \in y^l_u} \{c^l_i\} \Big) = 0 \quad \text{for } l = 1, 2. \]
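The construction can be checked on a toy case: assigning one constant value per zone of a reference solution makes the range-based criterion vanish for that solution, while a different partition generally gets a positive value. The two partitions and the values 10, 20, ... are hypothetical, as are the function names.

```python
def range_criterion(zones, c):
    """Sum over zones of (max - min) of the vertex values in the zone."""
    return sum(max(c[i] for i in zone) - min(c[i] for i in zone) for zone in zones)


def values_from_solution(zones):
    """Give every vertex of zone u the same value, different for each zone."""
    return {v: 10 * (u + 1) for u, zone in enumerate(zones) for v in zone}


# Two hypothetical partitions of the vertices {1, 2, 3, 4}.
Y_star_1 = [[1, 2], [3, 4]]
Y_star_2 = [[1, 3], [2, 4]]

c1 = values_from_solution(Y_star_1)        # values built from the first solution
f1_own = range_criterion(Y_star_1, c1)     # 0: every zone is perfectly homogeneous
f1_other = range_criterion(Y_star_2, c1)   # positive: zones mix different values
```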

The constraints are the following:

1. number of zones: 30;

2. number of vertices per zone: between 20 and 70.

4.2.2 Heuristic

For this type of criteria the heuristic developed can be summarized as follows:

1. Choose the zone with the lowest number of vertices, zm.

2. For each neighbor, $z_k$, determine the difference $\delta_1$ ($\delta_2$) between the maximum and the minimum weight $c^1_i$ ($c^2_i$) after a possible merging with $z_m$. A weighted-sum $\lambda_1\delta_1 + \lambda_2\delta_2$ is used to rank all the neighbors in increasing order.

3. Among its neighbors, choose the first zone zNm for which the constraint associated with the number of vertices per zone is not violated.

4. Merge the zones zm and zNm.

In this case, each step has the same meaning as in the previous heuristic.

Figure 10: Initial potential non-dominated solutions (axes z1 and z2; each dot is one potential non-dominated solution; Y∗1 and Y∗2 are marked)

4.2.3 Results

The parameters for the LSEA were fixed as follows:




When the criteria weights (λ1, λ2) are set to (1, 0) and (0, 1), the heuristic applied for generating a solution determines, almost always, the two efficient solutions Y∗1 and Y∗2. Fig. 10 shows the 59 potential non-dominated solutions extracted from the initial population. As we can see, the solutions Y∗1 and Y∗2 were found.

Fig. 11 shows the potential non-dominated solutions identified after 50 generations. 190 potential efficient solutions were found, although many of them have the same value in the criteria space. The figure illustrates the performance of the genetic operators.

Nevertheless, it is not possible to determine the set Znd for this instance, which would allow us to evaluate the performance of our algorithm rigorously. Our intuition that we obtained good results is based on two points:

1. The heuristic that chooses the two neighboring zones to be merged has revealed a very good behavior. Therefore, in the generation process of new solutions, and when requested, this heuristic was able to determine, with high frequency, solutions Y∗1 and Y∗2;

2. Since this type of criteria has a “smooth” variation (the weights belong to a small range), it is foreseeable that the curve of the potential non-dominated solutions presents a certain “smoothness”, as shown in Fig. 11.

Figure 11: Final potential non-dominated solutions (axes z1 and z2; legend: + Initial potential non-dominated solutions, · Final potential non-dominated solutions)

5 Case study: The public transportation pricing system problem

Observation of social and economic trends (falling population in the inner city of Paris, increased commuter flows between the center and the suburbs, and demand for local ticket prices) led the Syndicat des Transports d’Ile de France (STIF, the Paris transportation authority) to re-examine the current ticket pricing system.

The pricing system is grounded on the definition of geographical zones. The current districting map, or zoning, is defined by concentric zones, which does not correspond to the use of the transportation network (see Fig. 12). Therefore one of the first objectives of the reform is to modify the zoning map on which the ticket prices are based. Such a problem considers approximately 1300 atoms or elementary units (the municipalities in the Paris region), and each zone of the new zoning map is supposed to represent an autonomous unit as far as public transportation is concerned. This autonomy of zones is modelled by several criteria (see [26]). Hence this problem involves multiple criteria and has a combinatorial structure. We applied our algorithm to this real-world problem.


Figure 12: The current “Carte Orange” map

5.1 Data set

The descriptors were defined to highlight the acceptability of a zone in a districting map choice and were validated by the “stakeholders” in the ticket pricing reform process. They are grouped according to the type of concern they refer to. Thus, for each territorial unit we have some real data that allow us to build the criteria from the following descriptors:

1. Location of the zone with respect to the network

• Number of stations in rail network (rsi);

• Number of buses on road service;

• Density of the internal offer;

• Density of the external offer on the rail network;

• Density of the bus external offer;

• Location of the stations in rail network.

2. Mobility structure within a zone

• Access to the rail network;


• Commuting;

• Presence of public services.

3. Zone corresponding to administrative structures

• Conformity to the current department boundaries;

• Conformity to the current urban community boundaries.

4. Centers of attraction in the zone

• Location of shopping centers and malls;

• Location of healthcare centers.

5. Social nature

• population (popi);

• active population (act_popi);

• homes without cars (h0ci);

• homes with one car (h1ci);

• homes with two or more cars (h2ci).

6. Geographical nature

• surface (surf i).

From the descriptors h0ci, h1ci and h2ci on each unit i we constructed two new descriptors:

1. The proportion of homes with two or more cars,

\[ ph2c_i = \frac{h2c_i}{h0c_i + h1c_i + h2c_i}; \]

2. The proportion of homes with one or more cars,

\[ ph1c_i = \frac{h2c_i + h1c_i}{h0c_i + h1c_i + h2c_i}. \]

5.2 Criteria and constraints

The need to create criteria for comparing zones is mainly due to the fact that several districting map choices have to be made by stakeholders. Stakeholders must, therefore, be able to compare their choices of zoning using criteria which have been accepted by consensus as a basis for comparison. The criteria were chosen by all and were deemed suitable for this task. Some of them are presented below.

With the available data we built a set of criteria. Each criterion is formed in two stages:

1. A value for each zone yu is determined.

2. The set of values is aggregated into a single number representing the value of the criterion for a solution Y = {y1, y2, . . . , yk}.

Data       | Evaluation of $y_u$                                                    | Evaluation of Y                                                     | Max/Min
surf_i     | $S(y_u) = \sum_{i \in y_u} surf_i$                                     | $f_1(Y) = \max_{y_u \in Y} S(y_u) - \min_{y_u \in Y} S(y_u)$        | Min
pop_i      | $P(y_u) = \sum_{i \in y_u} pop_i$                                      | $f_2(Y) = \max_{y_u \in Y} P(y_u) - \min_{y_u \in Y} P(y_u)$        | Min
act_pop_i  | $AP(y_u) = \sum_{i \in y_u} act\_pop_i$                                | $f_3(Y) = \max_{y_u \in Y} AP(y_u) - \min_{y_u \in Y} AP(y_u)$      | Min
rs_i       | $RS(y_u) = \sum_{i \in y_u} rs_i$                                      | $f_4(Y) = \min_{y_u \in Y} RS(y_u)$                                 | Max
ph2c_i     | $H2(y_u) = \max_{i \in y_u} ph2c_i - \min_{i \in y_u} ph2c_i$          | $f_5(Y) = \max_{y_u \in Y} H2(y_u)$                                 | Min
ph1c_i     | $H1(y_u) = \max_{i \in y_u} ph1c_i - \min_{i \in y_u} ph1c_i$          | $f_6(Y) = \max_{y_u \in Y} H1(y_u)$                                 | Min

Table 2: The criteria

Tab. 2 shows the criteria. All of them are homogeneity criteria (see Section 2.2.1).

The inter-zone homogeneity criteria are:

• f1, surface homogenization;

• f2, population homogenization;

• f3, active population homogenization;

• f4, rail station homogenization.

The intra-zone homogeneity criteria are:

• f5, homogenization of the proportion of homes with 2 or more cars;

• f6, homogenization of the proportion of homes with 1 or more cars.

To build up bi-criteria problems we have coupled only the relevant pairs. These pairs are the following: (f1, f2), (f1, f3), (f1, f4), (f1, f5) and (f1, f6). Although we are able to apply the algorithm with all the criteria, we opted for choosing couples of criteria. In this way we can visualize the set of potential non-dominated solutions, which the “stakeholders” found a very interesting starting point.

The constraints are related to the compactness, the number of zones and the number of units per zone. The degree of compactness C(Y) of a district map, Y = {y1, y2, . . . , yk}, is equal to the degree of compactness c(yu) of its worst zone, i.e.

\[ C(Y) = \min_{y_u \in Y} c(y_u). \]

The degree of compactness of a zone is the quotient between its surface, $S(y_u)$, and the surface of the smallest circle that encloses it, $S(y^{\circ}_u)$, i.e.

\[ c(y_u) = \frac{S(y_u)}{S(y^{\circ}_u)}. \]

We decided to choose empirically an acceptable limit for compactness and to make tests with the following characteristics:

1. A fixed number of zones: 20, 25 and 30;

2. A variable number of zones: between 20 and 30;

3. A number of units per zone: between 20 and 110.

Couple      K = 20   K = 25   K = 30   K ∈ [20, 30]
(f1, f2)    199      329      167      250
(f1, f3)    82       307      160      158
(f1, f4)    37       16       51       57
(f1, f5)    72       35       334      97
(f1, f6)    347      567      125      988

Table 3: Number of potential efficient solutions.

5.3 Results

In this case the parameters for the LSEA were the following:




Table 3 presents the number of potential efficient solutions found for each couple ofcriteria. In many cases the number of the corresponding solutions in the criteria spaceis very small. For example, couple (f1, f6) when K ∈ [20, 30] has 988 potential efficientsolutions, but in the criteria space these solutions correspond only to 7 points.

In Figures 13–16 we can see the graphs with the initial and final non-dominated solutions for the pair (f1, f3) (the homogenization of surface and active population) when K = 20, K = 25, K = 30 and K ∈ [20, 30]. For all of them the progress made by the LSEA is clear. In some cases the values of f1 and f3 improved by more than 50%. The graphs of the remaining couples are presented in Appendix B. Fig. 17 represents the best district map concerning surface homogenization, criterion f1, when K = 25.

Figure 13: Potential non-dominated solutions: K = 20 (axes f1 (100.000) and f3 (100.000), ticks from 0 to 15; legend: · Final potential non-dominated solutions)

The results reported in this section reveal more or less how the decision making processevolved since we start the analysis of the case study, in particular, the elements thatconcern the “resolution” of the “problem”. Several aspects should be taken into accountfor a better understanding of the decision making process:

1. The current model comprises 6 criteria built according to the descriptors presented in Section 5.1.

2. The current algorithm is able to deal with all the criteria simultaneously.

3. From a practical point of view, however, the “stakeholders” at STIF were unable to grasp several aspects of the problem and proposed starting the analysis at an elementary level, making the real-world decision making process easier to understand.

4. Following their suggestion, we decided to analyze the problem taking into account only pairs of criteria and to observe what happens in the criteria space.

5. The generation of the potential non-dominated frontier was well accepted by the “stakeholders”, but they always wanted to focus their study on a particular region of that frontier.

6. After that region was located, some solutions were selected and the associated maps were built.
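Steps 5 and 6 above can be pictured as a simple filter over the frontier obtained for one pair of criteria. The Python sketch below is only an illustration under stated assumptions: the function name, the solution labels, the criterion values and the bounds of the region of interest are all hypothetical, and the way the region was actually specified in the study is not described in the text.

```python
def select_region(frontier, bounds):
    """Keep the solutions whose criterion values fall inside `bounds`.

    `frontier` maps a solution label to its criterion values;
    `bounds` gives one (low, high) interval per criterion.
    """
    return {
        label: point
        for label, point in frontier.items()
        if all(low <= value <= high for value, (low, high) in zip(point, bounds))
    }


# Hypothetical frontier for one pair of criteria (values invented).
frontier = {
    "s1": (2.0, 9.0),
    "s2": (4.0, 6.0),
    "s3": (5.0, 5.0),
    "s4": (9.0, 2.0),
}

# Hypothetical region of interest chosen by the stakeholders.
region = select_region(frontier, bounds=[(3.0, 6.0), (4.0, 7.0)])
print(sorted(region))
```

The solutions that remain after filtering are the candidates for which district maps would then be drawn.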

[Figures 14 and 15 (captions not recovered): two plots of f1 (100.000) against f3 (100.000), axis ticks from 0 to 15. Following the order given in the text, these presumably show the potential non-dominated solutions for K = 25 and K = 30.]

[Plot: f1 (100.000) against f3 (100.000), axis ticks from 0 to 15.]

Figure 16: Potential non-dominated solutions: K ∈ [20, 30]

Figure 17: (f1, f3) = (162382, 498071): The best surface homogenization when K = 25


6 Conclusions and future research

In view of the wide range of application areas involving districting problems and the increasing number of published works in this field, it is necessary to develop a new terminology and to build bridges among the different districting problems. This research represents an initial attempt to characterize all districting problems. In this paper we presented a new taxonomy for the different kinds of criteria, classifying them into four classes. A new local search evolutionary algorithm (LSEA) for this type of problem was also presented. Our algorithm is a hybridization of recombination operators with local search that allows the use of local heuristics taking into account the nature of each criterion. We used a solution coding that allowed us to guide the genetic operators towards the criteria. The combination with local search makes it possible to check all efficient solutions near the newly found solutions.

Computational experiments showed that the new algorithm can effectively generate all exact efficient solutions for small-size instances. For large-size instances the algorithm generates high-quality potential efficient solutions from all regions of the non-dominated set. The tests performed showed that evolutionary algorithms and local search combine very well. The possibility of guiding the genetic operators also produced a marked increase in convergence speed, which allowed good solutions to be found very quickly. On the whole, this research not only advances the study of the districting problem but also shows the potential power of EAs for combinatorial problems with multiple criteria.

On the one hand, we believe that our work has great potential for development. Its great flexibility allows it to be adapted to many kinds of problems of a different nature. On the other hand, the LSEA can be improved in some respects: the search in the neighborhood structure can be made more efficient, because there is a high level of intersection between neighborhoods. In the future, a more or less “automatic” interactive procedure should be implemented in order to find the “best” solution according to the “stakeholders’” preferences.

Acknowledgements

This work has benefited from the Luso-French grant PESSOA 2004 (GRICES/Ambassade de France au Portugal). The first two authors would like to acknowledge the partial support from the MONET research project (POCTI/GES/37707/2001), and the second author acknowledges the support of the grant SFRH/BDP/6800/2001 (Fundação para a Ciência e Tecnologia, Portugal).


References

[1] P.K. Bergey, C.T. Ragsdale, and M. Hoskote. A decision support system for theelectrical power districting problem. Decision Support Systems, 36:1–17, 2003.

[2] P.K. Bergey, C.T. Ragsdale, and M. Hoskote. A simulated annealing genetic algorithmfor the electrical power districting problem. Annals of Operations Research, 121:33–55, 2003.

[3] J.M. Bourjolly, G. Laporte, and J.M. Rousseau. Decoupage electoral automatise:Application a l’ile de montreal. INFOR, 19:113–124, 1981.

[4] B. Bozkaya. Political Districting: A Tabu Search Algorithm and Geografical Inter-faces. PhD thesis, University of Alberta, 1999.

[5] B. Bozkaya, E. Erkut, and G. Laporte. A tabu search heuristic and adaptive memoryprocedure for political districting. European Journal of Operational Research, 144:12–26, 2003.

[6] C.W. Chance. Representation and reaportionment. Political studies: Number 2,Dept. of Political Science, Ohio State University, Columbus, 1965.

[7] V. Chancon and V.V. Haimes. Multiobjective Decision Making: Theory and Method-ology. Elsevier, North-Holland, New York, 1983.

[8] P.G. Cortona, C. Manzi, A. Pennisi, F. Ricca, and B. Simeone. Evaluation andOptimization of Electoral Systems. SIAM Monographs on Discrete Mathematics andApplications, Philadelphia, 1999.

[9] S.J. D’Amico, S.J. Wang, R. Batta, and C.M. Rump. A simulated annealing approachto police district design. Computers & Operations Research, 29:667–684, 2002.

[10] R.F. Deckro. Multiple objetive districting: A general heuristic approach using mul-tiple criteria. Operational Research Quarterly, 28:953–961, 1979.

[11] C. Easingwood. A heuristic approach to selecting sales regions and territories. Oper-ational Research Quarterly, 24(4):527–534, 1973.

[12] J.A. Ferland and G. Guenette. Decision support system for the school districtingproblem. Operations Research, 38:15–21, 1990.

[13] B. Fleischmann and J.N. Paraschis. Solving a large scale districting problem: A casereport. Computers & Operations Research, 15(6):521–533, 1988.

[14] L. Forrest. Apportionment by computer. American Behavioral Science, 7, 1964.

36

[15] R.S. Garfinkel and G.L. Nemhauser. Optimal political districting by implicit enumer-ation techniques. Management Science, 16(8):495–508, 1970.

[16] B. Grofman. Criteria for districting: A social science prespective. UCLA Law Review,33:77–183, 1985.

[17] Y.Y. Haimes, L.S. Lasdon, and D.A. Wismer. On a bicriterion formulation of theproblem of integrated system identification and system optimization. IEEE Transac-tions on Systems, Man, and Cybernetics, 1:296–297, 1971.

[18] S.W. Hess and S.A. Samuels. Experiences with a sales districting model: Criteriaand implementation. Management Science, 18(4):41–54, 1971.

[19] S.W. Hess, J.B. Siegfeldt, J.N. Whelan, and P.A. Zitlau. Nonpartisan political redis-tricting by computer. Operations Research, 13(6):998–1006, 1965.

[20] M. Hojati. Optimal political districting. Computers & Operations Research,23(12):1147–1161, 1996.

[21] L.D. Horn, C.R. Hampton, and A.J. Vanderberg. Pratical application of districtingcompactness. Political Geography, 12:103–120, 1993.

[22] LINGO. LINGO, the Modeling Language and optimizer. Lindo Systems INC.,Chicago, 1999.

[23] A. Mehrotra. Constrained Graph. PhD thesis, Georgia Institute of Technology, 1992.

[24] A. Mehrotra, E.L. Johnson, and G.L. Nemhauser. An optimization based heuristicfor political districting. Management Science, 44(8):1100–1114, 1998.

[25] R.L. Morrill. Political Redistricting and Geographic Theory. Association of AmericanGeographers, Washington D.C., 1981.

[26] V. Mousseau, B. Roy, and I. Sommerlatt. Development of a decision aiding tool forthe evolution of public transport ticket pricing in the Paris region. In M. ParucciniA. Colorni and B. Roy, editors, A-MCD-A Aide Multicritere a la Decision - MultipleCriteria Decision Aiding, pages 213–230. Joint Research Center, European Commi-sion, Luxembourg, 2001.

[27] L. Muyldermans, D. Cattrysse, D.V. Oudheusden, and T. Lotan. Districting for saltspreading operations. European Journal of Operational Research, 139:521–532, 2002.

[28] B. Nygreen. European assembly constituencies for wales. comparations of methodsfor solving a political districting problem. Mathematical Programming, 42:159–169,1988.

37

[29] K. Park, K. Lee, S. Park, and H.Lee. Telecommunication node clustering withnode compatibility and network survivability requirements. Management Science,46(3):363–374, 2000.

[30] P. Preux and E.G. Talbi. Towards hybrid evolutionary algorithms. InternationalTransactions in Operational Research, 6:557–570, 1999.

[31] C.R. Reeves. Genetic algorithms for the operations researcher. INFORMS Journalon Computing, 9(3):231–250, 1997.

[32] P. Ross. What are genetic algorithms good at? INFORMS Journal on Computing,9(3):260–262, 1997.

[33] R.J. Shanker, R.E. Turner, and A.A. Zoltners. Sales territory design: An integratedapproach. Management Science, 22(3):309–320, 1975.

[34] N. Srinivas and K. Deb. Multiobjective optimization using nondominated sortinggenetic algorithms. Evolutionary Computation, 2/3:221–248, 1995.

[35] R. E. Steuer. Multiple Criteria Optimization: Theory, Computation and Application.John Wiley & Sons, New York, 1986.

[36] D. A. Van Veldhuizen. Multiobjective Evolutionary Algorithms: Classifications, Anal-yses, and New Innovations. PhD thesis, Department of Electrical and Computer En-gineering, Graduate School of Engineering, Air Force Institute of Technology, Wright-Patterson AFB, Ohio, 1999.

[37] W. Vickrey. On the preventions of gerrymandering. Political Science Quarterly,76(1):105–110, 1961.

[38] J.C.Jr. Williams. Political redistricting: A review. Papers in Regional Science,74(1):13–40, 1995.

[39] A.A. Zoltners and P. Sinha. Sales territory alignment: A review and model. Man-agement Science, 29(3):1237–1256, 1983.

38

Appendix A: Data set

Data concerning the graph G = (V, E) of Fig. 8. For each edge {i, j} ∈ E there are three pairs of costs $(c^1_{ij}, c^2_{ij})$, one for each range of values.

Edges      [1, 9]    [1, 20]    [10, 50]
{1,2}      2,8       19,10      32,18
{1,3}      9,2       10,2       12,47
{1,4}      4,6       16,6       20,16
{2,4}      3,9       15,4       13,50
{2,5}      5,5       3,15       45,15
{3,4}      8,3       13,7       11,39
{3,9}      2,8       6,10       37,19
{4,5}      2,8       5,7        29,21
{4,7}      1,1       4,3        14,44
{4,8}      9,2       5,17       10,43
{4,9}      1,1       7,15       50,10
{5,6}      9,2       12,16      41,17
{5,7}      8,3       11,15      18,38
{6,7}      6,4       8,19       44,12
{6,15}     3,8       17,2       24,47
{7,8}      1,1       2,4        37,11
{7,13}     2,8       18,7       10,45
{7,14}     3,6       14,1       13,42
{7,15}     2,9       5,3        38,23
{8,9}      3,9       20,14      16,41
{8,10}     2,8       18,13      15,39
{8,13}     1,1       1,5        17,47
{8,22}     2,9       8,1        50,13
{9,10}     7,6       15,10      38,20
{10,11}    8,3       10,8       31,10
...
{10,22}    5,5       7,6        41,14
{11,12}    6,4       4,18       28,41
{11,18}    3,8       13,5       46,16
{11,22}    9,2       9,13       19,36
{12,13}    8,2       9,16       31,12
{12,17}    1,1       5,1        20,33
{12,18}    2,9       1,2        40,19
{12,19}    1,1       6,2        12,48
{12,22}    8,3       16,14      43,13
{13,14}    2,9       16,4       14,38
{13,17}    1,1       7,5        12,42
{13,22}    9,2       4,20       47,21
{14,15}    4,7       11,8       42,15
{14,16}    9,3       7,18       34,12
{14,17}    8,2       2,7        27,43
{15,16}    8,2       3,20       38,21
{16,17}    5,6       10,6       10,43
{16,20}    6,5       18,5       16,50
{17,19}    2,8       8,17       19,46
{17,20}    7,4       13,12      37,18
{17,21}    3,9       11,5       50,10
{18,19}    8,2       15,3       18,47
{19,21}    5,4       9,19       46,13
{20,21}    2,8       17,11      18,39
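For readers who wish to reuse this data set, one possible in-memory encoding is sketched below in Python. Only the first three rows of the table are copied; the names `RANGES`, `COSTS` and `edge_cost` are hypothetical and merely illustrate one way of indexing the cost pairs by edge and by range.

```python
# The three ranges of cost values, in the column order of the table.
RANGES = ("[1, 9]", "[1, 20]", "[10, 50]")

# Undirected edges are stored as frozensets so that {i, j} == {j, i}.
# Each edge maps to three cost pairs (c1, c2), one per range.
COSTS = {
    frozenset({1, 2}): ((2, 8), (19, 10), (32, 18)),
    frozenset({1, 3}): ((9, 2), (10, 2), (12, 47)),
    frozenset({1, 4}): ((4, 6), (16, 6), (20, 16)),
}


def edge_cost(i, j, range_index):
    """Return the cost pair (c1, c2) of edge {i, j} for the given range."""
    return COSTS[frozenset({i, j})][range_index]


print(edge_cost(2, 1, 0))  # the edge can be queried in either orientation
```

Using `frozenset` keys avoids storing each undirected edge twice; the remaining rows of the table can be added in the same way.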


Appendix B: Graphical representation of the solutions

B.1 Homogenization of surface and population (f1, f2)

[Four plots of f1 (100.000), axis ticks from 0 to 15, against f2 (100.000), axis ticks from 6 to 30; captions not recovered. By analogy with Figures 13–16, the plots presumably correspond to K = 20, K = 25, K = 30 and K ∈ [20, 30].]

B.2 Homogenization of surface and number of stations (f1, f4)

[Four plots of f1 (100.000) against f4, axis ticks from 0 to 10; captions not recovered. By analogy with Figures 13–16, the plots presumably correspond to K = 20, K = 25, K = 30 and K ∈ [20, 30].]

B.3 Homogenization of the surface and proportion of homes with 2 or more cars (f1, f5)

[Four plots of f1 (100.000), axis ticks from 0 to 10, against f5, axis ticks from 0.2 to 1.0; the legend marks the initial potential non-dominated solutions with "+". Captions not recovered. By analogy with Figures 13–16, the plots presumably correspond to K = 20, K = 25, K = 30 and K ∈ [20, 30].]

B.4 Homogenization of the surface and proportion of homes with 1 or more cars (f1, f6)

[Four plots of f1 (100.000), axis ticks from 0 to 10, against f6, axis ticks from 0.2 to 1.0; captions not recovered. By analogy with Figures 13–16, the plots presumably correspond to K = 20, K = 25, K = 30 and K ∈ [20, 30].]

B.5 Graphical representation of district maps

Figure 34: (f1, f2) = (259834.53, 574417): The best surface homogenization when K ∈ [20, 30]


Figure 35: (f1, f2) = (990189.06, 158377): The best population homogenization when K ∈ [20, 30]
