Research Collection
Doctoral Thesis
A continuous relaxation based heuristic for a class of constrained semi-assignment problems
Author(s): Burkard, Michael
Publication Date: 2000
Permanent Link: https://doi.org/10.3929/ethz-a-003925104
Rights / License: In Copyright - Non-Commercial Use Permitted
This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use.
ETH Library
Diss. ETH No. 13756
A Continuous
Relaxation Based Heuristic
for a Class of Constrained
Semi-Assignment Problems
A dissertation submitted to the
Swiss Federal Institute Of Technology
Zurich
for the degree of
Doctor of Technical Sciences
presented by
Michael Burkard
Dipl.-Ing., TU-Graz
born 7th August 1971
citizen of Austria
accepted on the recommendation of
Prof. Dr. H.-J. Lüthi, examiner
Dr. A. Gaillard, co-examiner
Prof. Dr. T. Liebling, co-examiner
2000
Acknowledgement
Special thanks go to Prof. H.-J. Lüthi, who made this work possible by offering me
a position at the Institute of Operations Research (IFOR). His personal engagement
has helped me greatly to finish this work.
Furthermore I am grateful to Prof. T. Liebling at EPFL for his friendly support
and his willingness to referee my thesis.
I am indebted to A. Gaillard for her constant support and her kind supervision of
this thesis. I also wish to thank M. Cochand for introducing me to this topic and
sharing his expertise on this subject with me. His critical review of the work has
decisively improved this thesis.
At EPFL I also want to thank Prof. A. Hertz and D. Kobler for their development
and implementation of a Tabu Search method for the C-SAP which made the
comparisons in Chapter 3 possible.
Moreover, B. Mateev's implementation of a visualisation tool for label placement
problems was a great help to me and made the comparisons of different models in
Chapter 4 much easier. It also enabled me to include some nice pictures illustrating
impressively the quality of different placements.
I wish to thank my family and all my friends in Switzerland and abroad for their
moral support, encouragement and much beyond ...
Finally I wish to thank all members at IFOR who made my time in Zurich enjoyable.
I am especially obliged to my colleagues G. Studer and L. Finschi for numerous
stimulating discussions which often brought up new ideas and insights. Last but not
least I wish to thank our system administrators, D. Moser and W. Schlickenrieder,
who always kept my system running, supported me with their excellent UNIX
know-how and provided several useful Perl scripts.
Contents
Acknowledgement i
Glossary of Notation xi
Abstract xiii
Zusammenfassung xv
1 Semi-Assignment Problems: A Survey 1
1.1 Introduction 1
1.2 Problem Statement and Overview 2
1.2.1 The Constrained Semi-Assignment Problem 2
1.2.2 Heuristic Solution Methods for the C-SAP 5
1.2.3 Dynamical Systems in Combinatorial Optimization 6
1.2.4 Overview 8
1.3 Combinatorial Problems Formulated as C-SAP's 9
1.4 Idea of the Algorithm 15
2 The Semi-Assignment Problem 21
2.1 Introduction 21
2.2 On the Objective Function of the R-SAP 25
2.2.1 Basic Properties 25
2.2.2 Equivalent $\Delta$-Polynomials 26
2.2.3 Relationship between Discrete and Continuous Local Optima 28
2.2.4 Saddle Points, Karush-Kuhn-Tucker Points and Nash Equilibria 33
2.3 Properties of the Iteration Sequence 38
2.3.1 Growth Transformations 39
2.3.2 Fixed Points and Accumulation Points 43
2.3.3 On the Convergence 47
2.3.4 Relations between Local Optima and Attractors 56
2.4 Guaranteed Region of Attraction for $F[\Gamma]$ 59
2.4.1 An Introductory Example 60
2.4.2 Characterization of the Guaranteed Region of Attraction 62
2.4.3 The Special Case of Quadratic $\Delta$-Polynomials 65
2.5 A Region of Attraction for $F[\xi]$ 73
2.6 Computational Experiments with the Max-Cut Problem 77
3 The Constrained Semi-Assignment Problem 83
3.1 Introduction and Algorithm for C-SAP 83
3.2 Properties of the Operator for C-SAP 86
3.3 Implementation and Numerical Results 93
3.3.1 Implementation 94
3.3.2 Instance Generation 96
3.3.3 Numerical Results 97
4 The Point Feature Label Placement Problem 103
4.1 Label Placement - State of the Art 103
4.2 Methods behind Label Placement 106
4.3 Problem Descriptions 108
4.4 Preprocessing Strategies 110
4.5 Models for Label Placement 114
4.5.1 Minimizing the Number of Pairwise Overlaps (P1) 114
4.5.2 Minimizing the Number of Conflicting Labels (P2) 115
4.5.3 Models for Label Placement with Point Selection (P3) 118
4.6 Postprocessing Strategies 120
4.7 Ambiguity 122
4.8 Computational Results 126
4.8.1 Test Set 128
4.8.2 Reduction by Preprocessing 128
4.8.3 Comparison of the Models 130
4.8.4 Comparison to Other Heuristics 132
4.8.5 Implementation Details and Parameter Settings 137
A Gauss-Seidel Version of the Operator 139
B The Concept of KKT Points 143
C A Gradient Approach 145
D Definitions 149
D.1 General Definitions 149
D.2 Specific Notations 150
List of Tables
2.1 Results for 10 Max-Cut instances 79
2.2 Results for large Max-Cut instances 81
3.1 Composition of weighted partial assignments (T, w) 97
3.2 Comparison of FPH to TS 101
4.1 Desired properties of an aesthetically attractive placement (from [FA87]) 104
4.2 Average reduction of constraints by pre-processing for problems with
n = 750 129
4.3 Average reduction of constraints by pre-processing for problems with
n = 1500 130
4.4 Comparison of (M1) and (M2) 131
4.5 Varying the number of iterations for (M2) with 750 point features 132
4.6 Comparison of methods for instances without point selection. Taken
from Figure 11 in [CMS95] 134
4.7 Comparison of different heuristics for random maps without point selection 135
4.8 Comparison of different heuristics for random maps with point selection. 135
List of Figures
1.1 SAP and C-SAP with optimal assignments $A_1$ and $A_2$ 10
1.2 Trajectories of three different dynamics in the $(p_{11}, p_{21})$-plane 20
2.1 Plots of the graph $z(p) = 3p_{11}p_{21} - p_{11} - p_{21} + 1$ 23
2.2 Trajectories and regions of attraction 24
2.3 Objective values of $z$ in the $(p_{11}, p_{21})$-plane 32
2.4 Nash equilibria of a Max-3-Cut instance 37
2.5 Example illustrating sets $U_\varepsilon$ of Corollary 2.31 46
2.6 Example with a discrete local maximum $p^*$ which is not an accumulation
point of any sequence started in $\Delta^0$ 58
2.7 Regions of attraction for $z_1$ and $z_2$ with $M := 5000$; black region
corresponds to $p^*$, light gray region to $q^*$ (numerically determined) 60
2.8 Region of monotone increase; arrows indicate directions of increase 61
2.9 Leaving $M_h(p^*)$ because of too large step length 63
2.10 Examples for $Q_i(p, p^*)$ 64
2.11 Two examples of sets $M_h(p^*)$ and GRAx(p*) 67
2.12 Extreme points pu of $Q_1(p, p^*)$ and $Q_2(p, p^*)$ for $n = 1$, $k = 3$ with
$p = (p_1, p_2, p_3)$ 68
2.13 Construction of the guaranteed region of attraction GRA^(p*) 70
2.14 Iterations vs. quality: solid=min, dashed=avg 78
3.1 Graph of fijijix))) and the line $y = x$ 92
3.2 Behavior of FPH for re, F29 and r36 (left: without greedy, right: with greedy) 99
3.3 Behavior of FPH for P36, T3e2 and r363 (left: without greedy, right: with greedy) 100
4.1 Position priorities suggested by Yoeli [Yoe72] 105
4.2 Example demonstrating the difference between the objectives 'minimize
pairwise overlaps' and 'minimize conflicting labels' 110
4.3 Example of why (R1) must not be used for (P1) or (P2) when optimality
should be preserved 113
4.4 Motivating example for the objective function of (P2) 115
4.5 Contradicting objectives and improvement by postprocessing (PP1) 121
4.6 Resolving ambiguous placements 122
4.7 Neighborhood Rz(j) for point feature $i$ used by (AMBI1) and (AMBI2) 123
4.8 Placement where (AMBI1) does not work 124
4.9 Distance rules for placement with (AMBI2) 125
4.10 Placements of ambiguity avoiding strategies (AMBI1) and (AMBI2) 127
4.11 Comparison of (M1) and (M2) with the example of a random map
with 750 point features with point selection prohibited 133
4.12 Comparison of different methods for n = 750 (left) and n = 1500
(right) features 136
Glossary of Notation
$\alpha, \beta$    exponents of $\Gamma$ and $\Theta$ used in $F[\Gamma^\alpha \Theta^\beta]$, p. 18
$\Gamma$    fitness function, p. 18
$\Gamma_{ir}(z,p), \Gamma_{ir}(p)$    partial derivative of $z$ with respect to $p_{ir}$, (1.21)
$\Delta$    cross product of $n$ simplices $\Delta_k$, (1.9)
$\Delta_k$    unit simplex in $k$-space, p. 14
$\Delta^I$    integer vertices of $\Delta$, (1.8)
$\Delta^0$    relative interior of $\Delta$, p. 14
$\Delta^\varepsilon$    relative interior of $\Delta$ with $\varepsilon$-boundary, (3.8)
$\Delta_r$    $\{p \in \Delta \mid \operatorname{supp}(p^*) \subseteq \operatorname{supp}(p)\}$, (2.84)
$\Theta$    fitness function, p. 18
$\Theta_{ir}(p)$    function used to define repellor dynamics $F[\Theta]$, (1.22)
$\xi$    fitness function used by operator $F$, p. 17
S.(p)    denominator of $F$, (1.20)
**(p)    function $\Delta \to \Delta^\varepsilon$ used in the definition of $F^\varepsilon[\xi](p)$, (3.10)
$A, \mathcal{A}$    assignment, set of all assignments, p. 3
^nr iV    coefficients of $h_l(p)$ and $s_l(p)$, p. 66
$\operatorname{aff}(\Delta)$    affine subspace of $\Delta$, p. 33
$c, c_j$    constants, $c_j$ defined in (2.98)
Cone()    cone of the vectors, (2.93)
Conv()    convex hull of the vectors, (1.9)
$C(T), C(R)$    clauses corresp. to partial assignments $T$ and $R$, (1.4), (1.5)
$F[\xi](p)$    operator $F$ with fitness function $\xi$, (1.20)
$F(z,p)$    short for $F[\xi(z,p)](p)$, used to distinguish different $z$
nap)    Gauss-Seidel version of operator $F[\xi](p)$, (A.1)
$F^\varepsilon[\xi](p)$    $\varepsilon$-restricted version of $F[\xi](z,p)$, (3.11)
$\{F^t(p^0)\}_{t \ge 0}$    sequence generated by $F$ with starting point $p^0$
^    face of $\Delta$ which contains $p^*$, p. 31
$G, G_j$    monotone transformation (2.35), mostly used as $\xi = G(\Gamma)$
$G = (V, E)$    undirected graph with vertex set $V$ and edge set $E$, p. 4
$\mathrm{GRA}(p^*)$    guaranteed region of attraction of $p^*$, (2.87), (2.103)
$h_l(p)$    linear functions describing $M_h(p^*)$, p. 66
$i, j \in N$    indices of variables, p. 3
$(i, r)$    assignment, short for $x_i = r$, p. 3
$[i, j]$    edge between nodes $i$ and $j$ in an undirected graph, p. 4
$K = \{1, \ldots, k\}$    set of possible values to be assigned, p. 3
$l \in L$    $L$ index set of halfspaces in $M_h(p^*)$, p. 66
$m$    degree of the $\Delta$-polynomial $z(p)$, p. 25
$M_h(p^*)$    region of monotone increase, (2.85), (2.102)
$N = \{1, \ldots, n\}$    index set of decision variables, p. 3
$\mathcal{N}(p^*)$    discrete neighborhood of $p^* \in \Delta^I$, (2.16)
$p, q \in \Delta$    variables, $n \times k$-matrices, p. 14
$p_{ir} \in [0, 1]$    probability for $x_i = r$, p. 15
$p_i$    $i$-th row of matrix $p$, p. 15
$p^t = F^t(p^0)$    element in iteration sequence $\{F^t(p^0)\}_{t \ge 0}$, p. 16
$p' = F(p)$    next point in iteration sequence, $p'_{ir} = u_{ir} p_{ir}$, p. 38
$P_N$    set of all $\Delta$-polynomials which contain all variables in $N$, p. 25
tjN    subset of $P_N$ for which $F$ is defined, p. 62
$\mathrm{PRA}(z, p^*)$    form-dependent subset of $\mathrm{RA}(z, p^*)$, p. 75
$q^T$    transpose of $q$, p. 16
$Q(p, p^*)$    subset of $\Delta$ corresp. to feasible direction of $F$, (2.86), (2.93)
$r, s \in K$    (indices) of assigned values, p. 3
$R_i(p)$    rest term in affine linear writing of $z(p)$, (2.10)
$R \in \mathcal{R}$    forbidden partial assignment (constraint of C-SAP), p. 3
$\mathcal{R}$    set of all forbidden partial assignments, p. 3
$\mathbb{R}_+$    set of positive reals
$\mathrm{RA}(z, p^*)$    region of attraction of $p^*$, p. 56
Round$(p)$    matrix in $\Delta^I$, where largest component is rounded to 1, (3.12)
$s_l(p)$    linear functions describing $\mathrm{GRA}(p^*)$, (2.91), (2.106)
$s_i(p^*) \in K$    index for $p^* \in \Delta^I$ with $p_{i s_i(p^*)} = 1$, $s(p^*) \in K^n$, p. 62
supp$(p)$    support of $p$, p. 33
$T \in \mathcal{T}$    partial assignment for the objective function $z(p)$, p. 3
$\mathcal{T}$    set of all partial assignments for the objective function, p. 3
$(T, w_T)$    weighted partial assignment, p. 3
$u_{ir}$    factor in $F$ by which $p_{ir}$ is multiplied: $F(p)_{ir} = u_{ir} p_{ir}$, (2.39)
$U_\varepsilon(p^*)$    connected component of $U'_\varepsilon(p^*)$ which contains $p^*$, p. 45
$U'_\varepsilon(p^*)$    $\{p \in \Delta \mid z(p) > z(p^*) - \varepsilon\}$, (2.64)
$U(p^*)$    neighborhood of $p^*$
Ueip*)    open ball with center at $p^*$ and radius $\varepsilon > 0$
$\mathrm{URA}(p^*)$    universal region of attraction of $p^*$ (form-independent), p. 62
$v_i$    Lagrangean multiplier, $v_i := \max_{r \in K} \Gamma_{ir}(p)$, p. 33
$w_T, w_{ir}$    weights in the objective function $z(p)$, p. 3
$x_i$    $i$-th decision variable, p. 3
$z(p)$    objective function of C-SAP, (1.10)
$z_1 \equiv z_2$    equivalent $\Delta$-polynomials, $z_1$ and $z_2$ agree on $\Delta$, p. 26
Abstract
In this thesis we consider the constrained semi-assignment problem (C-SAP). It is
a generalization of the well-known Pseudo-Boolean Optimization problem where
the Boolean variables are replaced by discrete decision variables. Additionally,
constraints are given by a set of clauses (similar to those of the Satisfiability
problem) which prohibit certain assignments.
Many well-known combinatorial optimization problems can be formulated as a
C-SAP, and the problem also occurs frequently in real-world applications. Since the
C-SAP is NP-hard, we cannot expect to determine the global optimum
efficiently. One goal of this work was therefore the development
of a heuristic which can be used efficiently for some of those C-SAP instances
for which local search methods are doomed to failure due to their myopia. The
newly developed fixed point heuristic (FPH) of this thesis generalizes a method
of Cochand for the Generalized Maximum Satisfiability problem (G-Max-Sat)
such that it can also be applied to constrained maximization problems such as
the C-SAP. Using the point feature label placement problem as an example, we
show that FPH also finds good approximate solutions quickly for real-world problems.
Essentially, FPH is based on a discrete dynamical system which results from iterating
an appropriately defined operator. For any fixed starting point a sequence of points
is generated in this way. The operator should be chosen such that this sequence
converges, for a large portion of starting points, to a good local maximum of the problem.
By an appropriate choice of the operator in FPH, global information can be
included in the solution process. Thus FPH has for certain C-SAP instances a big
advantage over local search methods, which would fail in such situations due to
their local vision and lack of orientation.
The major part of this thesis deals with the semi-assignment problem (SAP), where
the constraints are not taken into account. We present a continuous embedding
of the problem and discuss the relationship between continuous and discrete local
maxima. Then we define an operator for FPH and prove some interesting properties
with respect to its convergence. Based on the fact that attractors are equivalent
to strict local maxima, we study the regions of attraction of these points and
characterize some subregions thereof for the special case of quadratic objective functions.
Further on, the operator used for the SAP is extended such that it can
be applied to the C-SAP. To this end the operator receives a new component
which has a repelling effect in the neighborhood of certain infeasible solutions. We
investigate this behavior and some other properties of this operator and then
discuss implementation details of FPH. Finally, we compare FPH with a Tabu Search
method which was developed specifically for the C-SAP.
The final part of this thesis concentrates on an application of the C-SAP, the
so-called point feature label placement problem. Its goal is to attach text elements to
given points on a map. This text should be placed clearly in such a way that overlaps
and ambiguous assignments are avoided whenever possible. For this reason we derive
different models for this problem, all of which are based on the C-SAP formulation.
Additionally we discuss various pre- and postprocessing strategies which improve
the efficiency of the algorithm and the quality of its solutions. We conclude with a
comparison of the results of our models with the best known approach for this task.
Zusammenfassung
In this thesis we study the constrained semi-assignment problem (C-SAP). The
C-SAP is a generalization of the well-known Pseudo-Boolean Optimization problem
in which discrete decision variables are used instead of Boolean variables.
In addition, we have constraints in the form of clauses (similar to those of the
Satisfiability problem) which prohibit certain assignments.
Many well-known combinatorial optimization problems can be formulated as a
C-SAP, and this problem is also encountered frequently in practice. Since the
C-SAP is an NP-hard problem, it cannot be expected that exact solutions can be
determined efficiently. For this reason one goal of this work was to develop a
heuristic which can be used efficiently especially for those C-SAP instances on which
local search methods fail due to their myopia. The fixed point heuristic (FPH)
newly developed in this thesis generalizes a method developed by Cochand for the
Generalized Maximum Satisfiability problem (G-Max-Sat) such that it can also be
applied to constrained maximization problems such as the C-SAP. Using the map
labeling problem as an example, we will show that FPH quickly delivers good
approximate solutions for practical problems as well.
Essentially, FPH is based on a discrete dynamical system. By iterating a
suitable operator, a sequence of points is generated from a starting point.
The operator is to be defined such that this sequence converges to a good local
maximum of the problem for as many starting points as possible.
By a suitable choice of the operator in FPH, global information about the problem
can also be incorporated into the solution process. For certain C-SAP instances
this yields a big advantage over local search methods, which in such cases often
lose their orientation and fail because of their local view.
A large part of this thesis deals with the semi-assignment problem (SAP), in which
the constraints are disregarded. We present a continuous embedding of the problem,
on which FPH will operate, and discuss the relationship between continuous and
discrete local maxima. Then we define an operator for FPH and prove some
interesting properties related to its convergence behavior. Based on the fact that
attractors and strict local maxima are equivalent, we finally study the regions of
attraction of these points and characterize some subregions thereof for the
special case of quadratic objective functions.
Subsequently, the operator used for the SAP is extended so that it can be applied
to the C-SAP. In doing so the operator receives a new component with a repelling
effect in the neighborhood of certain infeasible solutions. We investigate this
and further properties of this operator and describe some implementation details
of FPH. Finally, we compare FPH with a Tabu Search method developed specifically
for the C-SAP.
The last part of this thesis deals with an application of the C-SAP, the
so-called label placement problem (Landkarten-Beschriftungsproblem). The goal is
to label given points on a map with text. The text should be placed as clearly as
possible, avoiding overlaps and ambiguous assignments. To accomplish this task,
different models for the problem are derived which are based on the C-SAP
formulation. Additionally we discuss various pre- and postprocessing strategies
which improve the efficiency of the algorithm and the quality of the solutions.
We conclude by presenting the results of a comparison between our models and the
best known method for this task.
Chapter 1
Semi-Assignment Problems: A Survey
1.1 Introduction
In the commonwealth of combinatorial optimization problems we have two major
kingdoms: One consists of efficiently solvable problems for which an algorithm exists
that determines an optimal solution in polynomial time. The other vast kingdom
consists of NP-hard optimization problems which nowadays can only be solved
optimally by more or less implicit enumeration of all feasible solutions. The
semi-assignment problem which will be discussed in this thesis belongs to this latter class.
Since NP-hard problems are so difficult to solve optimally, heuristics play a crucial
role in this area. In general a heuristic does not find the global optimum of a
problem. For this reason, when we write in this thesis that "a heuristic solves a
problem", we mean only that it searches for a good approximate solution.
An important class of heuristics consists of the so-called local search methods
whose most prominent representatives are Simulated Annealing and Tabu Search.
These methods search in neighborhoods of feasible solutions and determine there a
local optimum. Though in general local search methods perform quite well, there
nevertheless exist instances for which they behave poorly.
A typical drawback of local search methods is their inherent myopia: their local
vision does not allow them to process the global information of a problem
instance. This phenomenon can be observed impressively with instances which
are characterized by a very large global maximum, surrounded only by points
with equally small objective values. We can imagine the landscape of the solution
space as a vast constant plateau (with small objective values) in the midst of which a
geyser (corresponding to our global maximum) erupts. For this reason we will call
objective functions of this type geyser functions; they will be described later in
more detail. Since there are no points of reference within the whole plateau which
may indicate the direction to the global maximum, local search methods cannot
deal with this problem type and will fail to find the optimum.
In order to overcome this drawback of local search methods, we design in this thesis
a new method, called the fixed point heuristic (FPH), which also takes global features
of the problem into account. Our hope with FPH was that by including global
information in the solution process, it would not be necessary to comb such a large
portion of the search space as local search methods may have to do. FPH uses the
global information to direct the search towards a hopefully large local optimum.
In order to follow up this idea, we refrain from using a neighborhood and define
FPH instead as a specialized discrete dynamical system which is a generalization of
Cochand's algorithm for the Generalized Maximum Satisfiability problem [Coc93].
Based on a continuous embedding of the problem and thanks to the special
structure of the objective function, this allows us to include some global
information in the solution process. Thus, in some extreme cases such as geyser
functions, FPH is able to find the global optimum independently of the starting
point, where local search methods would certainly fail.
Naturally FPH was not only designed for geyser functions; as a general
purpose heuristic it should also provide results comparable to those of local
search methods for 'non-structured' problems. In the last chapter of this thesis,
we will show the broad applicability and the effectiveness of FPH with the
practical example of the point feature label placement problem, where we compare
our results with those of Simulated Annealing.
1.2 Problem Statement and Overview
In this section we present the formal definition of the constrained semi-assignment
problem and discuss solution methods for this difficult problem. We conclude in
Section 1.2.4 with an overview of this thesis.
1.2.1 The Constrained Semi-Assignment Problem
The constrained semi-assignment problem C-SAP generalizes the Pseudo-Boolean
Optimization problem and combines it with the Satisfiability problem. However, in
contrast to these problems, the C-SAP uses, instead of Boolean variables,
decision variables which take values in an arbitrary finite subset of $\mathbb{N}$. In order
to proceed with a formal description of the problem, we introduce the following
notation, which will be used throughout this thesis:
Let some decision variables $x_i$ for $i \in N := \{1, 2, \ldots, n\}$ and a set of possible values
$K := \{1, 2, \ldots, k\}$ be given. In the C-SAP we assign to each variable $x_i$, $i \in N$,
exactly one value $r(i) \in K$, which will further on be denoted by $(i, r(i))$. As a
convention we always use $i, j$ to refer to indices of variables and $r, s$ for (indices of)
assigned values.
Definition 1.1 (Assignments)
Let $N$ be the index set of the decision variables and let $K$ be the set of possible
values for each variable.
1. An assignment $A$ is a set of the form $A := \{(i, r(i)) \mid i = 1, \ldots, n,\ r(i) \in K\} \subset N \times K$.
Moreover, we denote by $\mathcal{A}$ the set of all possible assignments.
2. A partial assignment $T$ is a set of the form $T := \{(i, r(i)) \mid i \in M,\ r(i) \in K\}$
for some $M \subseteq N$. Moreover, if a positive weight $w_T \in \mathbb{R}_+$ is given, we call
$(T, w_T)$ a weighted partial assignment.
3. A partial assignment $T$ is satisfied by a given assignment $A \in \mathcal{A}$ if $T \subseteq A$.
Now the following problems are defined.
Definition 1.2 (C-SAP, SAP)
The constrained semi-assignment problem C-SAP for the decision variables $x_i$, $i \in N$,
with possible values in $K$ is defined by a 5-tuple $(n, k, \mathcal{R}, \mathcal{T}, w)$ where
1. $\mathcal{R}$ is a set of (forbidden) partial assignments with $|R| \ge 2$ for all $R \in \mathcal{R}$
(defining the constraints)
2. $(\mathcal{T}, w)$ is a class of weighted partial assignments (defining the objective
function)
and consists in
$\max\ z(A) = \sum_{T \in \mathcal{T}} \{ w_T \mid T \subseteq A \}$    (1.1)
$\text{s.t.}\ A \in \mathcal{A}$    (1.2)
$R \not\subseteq A \quad \forall R \in \mathcal{R}$.    (1.3)
An assignment $A \in \mathcal{A}$ which satisfies (1.3) is called a feasible assignment.
The semi-assignment problem SAP is the special case of the C-SAP in which $\mathcal{R} = \emptyset$;
it is given by the 4-tuple $(n, k, \mathcal{T}, w)$.
If $\mathcal{T} = \emptyset$ then the C-SAP reduces to the so-called Generalized Satisfiability problem
(G-Sat).
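Definitions 1.1 and 1.2 can be made concrete with a small sketch. The code below is an illustration only, not the method of this thesis: it represents assignments and partial assignments as sets of pairs $(i, r)$, evaluates $z(A)$ as in (1.1), checks constraint (1.3), and solves a toy instance by complete enumeration. The function names and the instance data are hypothetical.

```python
from itertools import product

def objective(assignment, weighted_partials):
    """z(A) as in (1.1): sum of w_T over all T that are contained in A."""
    return sum(w for T, w in weighted_partials if T <= assignment)

def is_feasible(assignment, forbidden):
    """Constraint (1.3): no forbidden partial assignment R is contained in A."""
    return not any(R <= assignment for R in forbidden)

def all_assignments(n, k):
    """Enumerate all assignments: exactly one value r(i) in K per variable i in N."""
    for values in product(range(1, k + 1), repeat=n):
        yield frozenset(zip(range(1, n + 1), values))

def solve_by_enumeration(n, k, forbidden, weighted_partials):
    """Exhaustive search; only sensible for tiny instances (k**n assignments)."""
    feasible = [A for A in all_assignments(n, k) if is_feasible(A, forbidden)]
    best = max(feasible, key=lambda A: objective(A, weighted_partials))
    return best, objective(best, weighted_partials)

# Hypothetical toy instance with n = 2 variables and k = 2 values.
T = [(frozenset({(1, 1)}), 2.0),
     (frozenset({(1, 2), (2, 2)}), 5.0),
     (frozenset({(2, 1)}), 1.0)]
R = [frozenset({(1, 2), (2, 2)})]  # forbids x_1 = 2 together with x_2 = 2

best, value = solve_by_enumeration(2, 2, R, T)
unconstrained_value = solve_by_enumeration(2, 2, [], T)[1]
```

Dropping the constraint set turns the same instance into a SAP, whose optimum here is larger because the forbidden pair carries the largest weight.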
The C-SAP is a unifying framework for a broad class of well-known combinatorial
optimization problems. On the one hand the Maximum-Satisfiability (Max-Sat),
Maximum-$k$-Cut (Max-$k$-Cut) and $k$-Coloring problem can be formulated as SAP's.
On the other hand Satisfiability (Sat), Maximum-Clique (Max-Clique) and label
placement are typical representatives of the C-SAP. All these problems and their
corresponding C-SAP formulations will be described in Section 1.3.
An important special case of the SAP is the well-known Pseudo-Boolean
Optimization problem, which uses Boolean decision variables and can therefore be
formulated as a SAP with $K = \{1, 2\}$. First investigations of this problem go back
to Hammer et al. during the 60s, and since then many papers have been written
on this subject. Special classes of Pseudo-Boolean functions are characterized in
[Cra89]. For some of them efficient algorithms for the optimization problem have
been found (see e.g. [Rhy70, BM85, HS86]).
There also exist many real world problems which can be formulated as SAP's or
C-SAP's. Especially the task allocation problem is a classical representative of the
semi-assignment problems. Many applications for this problem exist, including
assignments of professors to departments [GS91], assignments of tasks to processors
[BCS92] as well as classical allocation problems in production systems and fleet
assignments.
In order to give a first impression of how such problems can be modeled, let us
discuss the task assignment problem in a heterogeneous multiple processor system.
This problem has been investigated in [BCS92]. Now we will show how it can be
formulated as a SAP. Let a set of $k$ non-identical processors $K := \{1, \ldots, k\}$ and
a set of $n$ tasks $N := \{1, \ldots, n\}$ be given. The tasks have to be assigned to the
processors in order to be executed. Some of these tasks have to communicate with
each other. These intertask communications are represented by an undirected graph
$G = (N, E)$, where $[i, j] \in E$ if and only if tasks $i$ and $j$ communicate with each
other. For each edge $[i, j] \in E$ we have communication costs $c_{ij}$ if the tasks $i$ and $j$
are assigned to different processors. If both tasks are executed by the same
processor, then their communication costs are negligible. Finally we have execution costs
$w_{ir}$ if task $i$ is assigned to processor $r$. The objective of this task allocation problem
is to assign each task to exactly one processor and to minimize the sum of execution
and communication costs. Note that the order in which the tasks are executed does
not matter!
To model this problem as a SAP we define $\mathcal{T} := \mathcal{T}_1 \cup \mathcal{T}_2$, where the partial
assignments in $(\mathcal{T}_1, w_1)$ correspond to the execution costs and $(\mathcal{T}_2, w_2)$ models the intertask
communications. For the execution costs we simply have to add the weights of the
assigned tasks and therefore $(\mathcal{T}_1, w_1) := \{(\{(i, r)\}, w_{ir}) \mid \forall (i, r) \in N \times K\}$. Regarding
the intertask communication we have costs $c_{ij}$ for each pair of tasks being executed
on different processors. Hence we define
$(\mathcal{T}_2, w_2) := \{(\{(i, r), (j, s)\}, c_{ij}) \mid \forall [i, j] \in E,\ \forall r, s \in K,\ r \ne s\}$.
By minimizing $z(A) := \sum_{T \in \mathcal{T}} \{w_T \mid T \subseteq A\}$ subject to $A \in \mathcal{A}$
we solve the stated task allocation problem. Of course this minimization problem
can be converted to a maximization problem, but this transformation will result in
negative weights. However, since we defined the SAP only for positive weights, a
further transformation will be necessary. We will see in Section 2.2.2 that negative
weights can always be eliminated and thus the problem can indeed be formulated
as a SAP.
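The construction of $(\mathcal{T}_1, w_1)$ and $(\mathcal{T}_2, w_2)$ can be sketched in a few lines. The following is a minimal illustration under stated assumptions: the cost data are invented, the helper names are hypothetical, and the minimization is done by brute force rather than by any method discussed in this thesis.

```python
from itertools import product

def task_allocation_sap(exec_cost, comm_cost, k):
    """Build the weighted partial assignments T1 (execution) and T2 (communication).

    exec_cost[(i, r)] plays the role of w_ir; comm_cost[(i, j)] is c_ij for edge [i, j].
    """
    T1 = [(frozenset({(i, r)}), w) for (i, r), w in exec_cost.items()]
    T2 = [(frozenset({(i, r), (j, s)}), c)
          for (i, j), c in comm_cost.items()
          for r in range(1, k + 1)
          for s in range(1, k + 1)
          if r != s]  # cost is paid only if the two tasks sit on different processors
    return T1 + T2

def z(assignment, weighted_partials):
    """Total cost: sum of w_T over all T contained in the assignment."""
    return sum(w for T, w in weighted_partials if T <= assignment)

# Hypothetical data: n = 3 tasks, k = 2 processors.
n, k = 3, 2
exec_cost = {(1, 1): 4, (1, 2): 1, (2, 1): 2, (2, 2): 3, (3, 1): 1, (3, 2): 5}
comm_cost = {(1, 2): 3, (2, 3): 1}  # edges [1,2] and [2,3]

T = task_allocation_sap(exec_cost, comm_cost, k)
assignments = [frozenset(zip(range(1, n + 1), values))
               for values in product(range(1, k + 1), repeat=n)]
best = min(assignments, key=lambda A: z(A, T))
best_cost = z(best, T)
```

In this toy instance the cheapest allocation places tasks 1 and 2 together on processor 2, which avoids the expensive communication edge between them.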
1.2.2 Heuristic Solution Methods for the C-SAP
During the last years much time and effort has been spent on structure and
complexity analyses of NP-hard optimization problems and on the development of algorithms
to handle them. The NP-hardness of the above-mentioned combinatorial optimization
problems was already proven during the 70s [Kar72, Kar75, SG76, GJ79] and
only a few easily solvable special cases [Bok81, Bok87, Mal94, MP94] are known.
Hence, in the following years research concentrated on the development of
heuristics. Several different methods like genetic algorithms, neural networks,
evolutionary strategies and local search methods have been developed to attack these hard
problems. Of these heuristics especially local search methods like Simulated
Annealing, a special Metropolis process going back to Kirkpatrick et al. [KGV83] and
Cerny [Cer85] (see also [BR84, CHdW87, LA87, AK89, Con92]), and Tabu Search
[HdW87, dWH89, Glo89, GL92] achieved a breakthrough. On the one hand these
methods are applicable to a broad class of optimization problems, on the other hand
they often provide well accepted results in very short time and are therefore used
frequently for hard combinatorial optimization problems like the C-SAP.
In spite of all these advantages, there exist C-SAP instances where local search
methods are doomed to failure due to their local vision and their inability
to orient themselves in the solution space. One such problem is the so-called geyser function
which we mentioned already in the introduction. We will describe this function here
in detail:
Example 1.3 (Geyser Function)
Let $N$, $K$ be given, $\mathcal{A}' \subseteq \mathcal{A}$ be the set of feasible assignments and $A^* \in \mathcal{A}'$ be a
unique strict global maximum with objective value $z(A^*) := M > 0$. Moreover,
let $z(A) := 0$ for all other feasible assignments $A \in \mathcal{A}' \setminus \{A^*\}$. The problem of
finding a feasible assignment of maximal value can be formulated as a C-SAP with
$(\mathcal{T}, w) := \{(A^*, M)\}$ and $\mathcal{R} := \mathcal{A} \setminus \mathcal{A}'$.
We see that geyser functions have a large (discrete), constant plateau (many
neighboring solutions with equal objective values). There is only one single point
therein which has a larger objective value. This is the reason why neighborhood
search methods cannot deal efficiently with this problem type.
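The plateau effect can be counted explicitly on a tiny instance. The sketch below is illustrative only and makes two simplifying assumptions that are not part of Example 1.3: every assignment is feasible, and the neighborhood consists of all assignments that differ in exactly one variable. The names and the chosen $A^*$ are hypothetical.

```python
from itertools import product

def neighbors(A, k):
    """All assignments (as tuples of values) differing from A in exactly one variable."""
    for i, r in enumerate(A):
        for s in range(1, k + 1):
            if s != r:
                yield A[:i] + (s,) + A[i + 1:]

def geyser(A, A_star, M):
    """Geyser objective: M at the unique maximum A_star, 0 everywhere else."""
    return M if A == A_star else 0

n, k, M = 4, 3, 100
A_star = (2, 1, 3, 1)  # hypothetical global maximum

all_A = list(product(range(1, k + 1), repeat=n))

# Points from which a one-exchange local search sees an improving move.
informative = [A for A in all_A
               if A != A_star
               and any(geyser(B, A_star, M) > geyser(A, A_star, M)
                       for B in neighbors(A, k))]
plateau_fraction = 1 - len(informative) / (len(all_A) - 1)
```

Only the $n(k-1)$ direct neighbors of $A^*$ offer an improving move; from every other point the objective gives no hint of where the maximum lies, and this share of uninformative points grows quickly with $n$.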
For this reason one main goal of this work was the development and study of a
general purpose heuristic for solving C-SAP's, which will be called 'fixed point
heuristic' (FPH). Our hope is that this heuristic will overcome the above-mentioned
weakness and disadvantage of local search heuristics and thus represents a good
alternative to these methods in some difficult cases, as for example for geyser
functions.
FPH is based on a continuous relaxation of the C-SAP. We embed the set of
all solutions in continuous space thus getting a polytope whose extremal points
correspond to all possible assignments. Then we construct an appropriate operator
in the interior of this polytope such that the evolving discrete time dynamical
system generates a sequence of points which converges under certain assumptions
to a local optimum of the problem (see Section 1.4).
The advantage of such a dynamical system approach lies in the fact that global
information about the problem is directly included during the solution process. In
this way the search for good solutions is not restricted by a discrete neighborhood
system as it is the case with local search methods.
In the sequel we discuss some dynamical systems used for combinatorial optimization
problems, where we concentrate especially on the class of replicator dynamics which
plays an important role in the definition of the operator used in FPH.
1.2.3 Dynamical Systems in Combinatorial Optimization
The use of dynamical systems in combinatorial optimization certainly got an
impulse from the success of interior point methods in convex optimization. An
excellent survey of dynamical systems in optimization can be found in the book of
Helmke and Moore [HM96]. The possibility of efficiently shortcutting through the
interior of a feasible region opened some hope for dealing with NP-hard problems
in a similar way.
Brockett and Wong [BW91] used a gradient flow approach to a special class of
assignment problems of the type $\min_{P \in \mathcal{P}_n} \operatorname{tr}(C^T P)$, where $C$ is a given cost matrix
and $\mathcal{P}_n$ the set of $n \times n$ permutation matrices. Though this assignment problem
is polynomially solvable, Wong [Won94, Won95] has shown that this approach can easily be extended to a larger class of combinatorial optimization problems
including the Travelling Salesman problem and the Graph Partition problem.
To solve the C-SAP, FPH uses a discrete dynamical system which is closely related
to the so-called adjusted replicator dynamics. This dynamics has its origin in the
field of theoretical biology where it was studied in its single- and multi-population form. The single-population replicator dynamics goes back to Taylor and Jonker
[TJ78] whereas its multi-population counterpart was introduced by Maynard Smith
[MS82]. Both dynamics were also studied in their discrete and continuous version
in [Aki79, LA83, Sig87, HS88]. During the 90s important relations between the
fields of theoretical biology, evolutionary game theory [Wei96] and optimization
theory [BPG97] were investigated and many similarities were found. Some of these
important results regarding the stability of equilibria and their relationship to local
optima will be described in Section 1.4.
Independently of these investigations, Cochand [Coc93] used the adjusted replicator
dynamics in a heuristic approach to the Generalized Maximum Satisfiability
(G-Max-Sat) problem (see p. 12), followed a few years later by work of Bomze
[Bom96, Bom97] for the quadratic programming and the Max-Clique problem.
This recent work on the G-Max-Sat and the Max-Clique problem has shown once
more the efficiency of the adjusted replicator dynamics as a promising approach for attacking hard combinatorial optimization problems. In a similar way as the
adjusted replicator dynamics was used by Cochand, we will define operators for
FPH in order to use the resulting discrete dynamical system to deal with the SAP
and the C-SAP.
Derivation of Operators for FPH
The operator used in FPH for the C-SAP is a generalization of Cochand's operator
[Coc93] for the G-Max-Sat. Based on the author's suggestion that similar operators
may be suited for solving constrained maximization problems as well, we combined
the operator for the G-Max-Sat with an additional part being responsible for the
maximization of the objective function. Thus the two goals of the C-SAP, the
satisfaction of all constraints and the maximization of the objective function are
reflected directly in the definition of the operator of FPH.
Moreover, later during our research we came upon the works of Baum and Eagon
[BE67] as well as Baum and Sell [BS68], who investigated some properties of the
maximization part of the operator. Though their works were not aimed at solving combinatorial optimization problems, we could build on their papers and derive
in this way some generalizations of their results.
Our hope with such a dynamical system approach lies not so much in shortcutting
through the interior as in finding suitable dynamics for which the region
of attraction of points with large objective values is also large. Unfortunately, such methods may also have some disadvantages, like uninterpretable fixed
points in the interior of the feasible domain or numerical instability during the computation.
We will see in this work that for the operator used in FPH for the SAP, fixed points in the interior are always unstable and attractors of the operator are
equivalent to strict local maxima. For the C-SAP a small modification of the operator
proposed in [Coc93] guarantees as well that all fixed points in the interior are
unstable. Numerical instability cannot always be avoided and depends heavily on
the data of the given C-SAP instance.
1.2.4 Overview
Section 1.3 describes how some well-known combinatorial optimization problems fit into the setting of the C-SAP. The transition of the discrete problem to a
continuous model, as well as the basic ideas of the associated dynamical system
constitute Section 1.4.
The rest of this thesis is divided into three parts each presented in a separate chapter.
The first part discusses the SAP. It starts in Section 2.1 with a short problem
description and continues with a repetition of the algorithmic main issues and
the definition of a basic operator for the SAP. Certain properties of the objective
function and a special class of polynomials are outlined in Section 2.2. Moreover, this section introduces an important equivalence relation between these polynomials and then characterizes discrete and continuous local optima. In Section 2.3
we review important aspects of the dynamical system introduced by Baum and Sell
[BS68]. This discussion brings up the terminology of growth transformations,
basically an operator generating a sequence of points with monotonically increasing
objective values. We define a new, more general class of operators and show
sufficient conditions under which these operators are growth transformations.
The next subsections deal with properties of the iteration sequence, especially
concentrating on the relationship between fixed points, attractors and local
optima. Moreover, we derive some convergence results and deal with the special
case of a SAP with linear objective function. Investigations of the regions of
attraction are carried out in Section 2.4. We will see that in case of the SAP, the
objective function is not uniquely defined over the set of feasible points. Hence
we can define different 'forms' of the objective function which all agree over some
domain, but have different regions of attraction. We investigate the influence
of such modifications of the objective function on the regions of attraction and
we characterize a so-called guaranteed region of attraction which turns out to
be form-independent. This leads to a stopping criterion for the algorithm which
additionally speeds up the computation. In Section 2.5 we look once more at the
regions of attraction, now however for a class of more general operators. Again we present a stopping criterion for the algorithm and focus in the sequel on
connections with the guaranteed region of attraction defined for the SAP. Finally, numerical results with the example of the Max-Cut problem are given in Section 2.6.
In the second part of this thesis, in Chapter 3, we add constraints $\mathcal{R}$ in the form of
forbidden partial assignments to the previously studied SAP. This extended problem,
the C-SAP, is introduced in Section 3.1. There we also treat the construction
of the new operator dealing with the additional constraint set. The new part of
the operator is adapted from Cochand [Coc93] and as we will outline in Section 3.2
it is able to repel infeasible points under certain conditions. Moreover, we show
in this section that if the objective value of a feasible, global maximum is large
enough, then FPH converges to this optimum for any starting point lying not too
close to the boundary; it thus overcomes one weakness of local search methods
described before. The second part concludes with some implementation details of
the algorithm, a presentation of numerical results and a comparison with Tabu
Search in Section 3.3.
The final part, Chapter 4, presents the example of the point feature label placement
problem as an application of the theory developed in the previous two chapters. After
an introduction in Section 4.1 stating the problem and discussing its complexity,
we give in Section 4.2 an overview of the wide variety of well-known algorithms
developed for this task. In Section 4.3 we model the label placement problem as a
C-SAP and describe furthermore the problem of point selection. Section 4.4 introduces
four preprocessing rules which reduce the number of constraints in the C-SAP model.
Moreover these rules are optimality preserving in the sense that an optimal solution
with the same objective value as before, also exists in the reduced search space. In
Section 4.5 different models for the label placement problem based on variants of
the C-SAP are discussed. Section 4.6 presents two postprocessing strategies which
can be combined with the previously derived models, thus additionally improving an already found placement. Besides the aspect of placing labels without overlaps, aesthetic criteria often play an important role. For this reason we discuss in
Section 4.7 two models based on the C-SAP whose goals are to avoid ambiguous situations. Towards this end the objective function is designed such that labels are
placed as far apart from each other as possible. Finally, several numerical results are
shown in Section 4.8. Besides a discussion of the reduction by the preprocessing rules,
we also depict various placements and compare our models to one another and other
heuristics especially designed for this task. Parameter settings and implementation details conclude this chapter.
1.3 Combinatorial Problems Formulated as C-SAP's
We have already seen that the C-SAP is given by a 5-tuple $(n, k, \mathcal{R}, \mathcal{T}, w)$, where
$(\mathcal{T}, w)$ is a set of weighted partial assignments defining the objective function, and
$\mathcal{R}$ is a set of forbidden assignments forming the constraints. The following example illustrates the definitions of Section 1.2:
Example 1.4
Consider a SAP instance $(n, k, \mathcal{T}, w)$ and a C-SAP instance $(n, k, \mathcal{R}, \mathcal{T}, w)$ with five
decision variables, $n = 5$, $N := \{1,\dots,5\}$, and four possible values, $k = 4$, $K := \{1,\dots,4\}$.
The set of weighted partial assignments $(\mathcal{T}, w)$ with $\mathcal{T} := \{T_1,\dots,T_4\}$ for both the SAP and the C-SAP and the constraints $\mathcal{R} := \{R_1, R_2\}$ for the C-SAP
are given as shown in the following table.
Figure 1.1: SAP and C-SAP with optimal assignments $A_1$ and $A_2$. (a) SAP given by $(n, k, \mathcal{T}, w)$, $A_1 = \{\bullet\}$. (b) C-SAP given by $(n, k, \mathcal{R}, \mathcal{T}, w)$, $A_2 = \{\bullet\}$. Horizontal axis: decision variables $x_1,\dots,x_5$ ($N$).
objective function                              constraints
weights   partial assignments                   partial assignments
   5      T_1: (1,3) (2,4)                      R_1: (1,2) (5,1)
   7      T_2: (1,2) (2,2) (3,3)                R_2: (2,4) (3,4) (4,2)
  11      T_3: (4,2) (5,1)
   2      T_4: (5,4)
Figure 1.1(a) shows $(\mathcal{T}, w)$ for the SAP, whereas Figure 1.1(b) additionally includes the constraints $\mathcal{R}$ for the C-SAP.
In both figures we depict horizontally the decision variables and vertically the
possible values. Moreover, they show the sets $T_1,\dots,T_4$ with their corresponding
weights (solid lines), and Figure 1.1(b) also shows the constraints $R_1$ and $R_2$ (dashed lines).
To get an assignment $A$ we have to select in each column exactly one point. Moreover,
$A$ is a feasible assignment for the C-SAP if it does not include all points of
either of the constraints $R_1, R_2$. If all points of a partial assignment $T_i$, $i = 1,\dots,4$, are contained in an assignment $A$, then its corresponding weight $w_{T_i}$ is added to the objective value of $A$.
For the SAP the assignment $A_1 := \{(1,2), (2,2), (3,3), (4,2), (5,1)\}$ has objective
value 18, because it satisfies the weighted partial assignments $(T_2, 7)$ and $(T_3, 11)$. It is shown in Figure 1.1(a) by the solid dots and is optimal for the SAP.
However, $A_1$ is infeasible for the C-SAP because $R_1$ is a subset of $A_1$
($R_1$ violates (1.3)). An optimal assignment for the C-SAP is $A_2 :=
\{(1,3), (2,4), (3,1), (4,2), (5,1)\}$ (in Figure 1.1(b) depicted by solid dots) with objective
value $z(A_2) = 16$. Note that although $A_2$ includes two of the points in $R_2$, it
does not contain them all and therefore does not violate constraint (1.3) for $R_2$.
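The numbers in Example 1.4 are small enough to check by enumeration. The sketch below is only an illustration: it copies the table of the example into Python sets (1-based pairs as in the text), evaluates both assignments, and brute-forces all 4^5 assignments. The function names are hypothetical.

```python
from itertools import product

# Weighted partial assignments (T, w) and constraints R from Example 1.4,
# written as sets of 1-based (variable, value) pairs.
T = {
    frozenset({(1, 3), (2, 4)}): 5,
    frozenset({(1, 2), (2, 2), (3, 3)}): 7,
    frozenset({(4, 2), (5, 1)}): 11,
    frozenset({(5, 4)}): 2,
}
R = [frozenset({(1, 2), (5, 1)}), frozenset({(2, 4), (3, 4), (4, 2)})]

def objective(A):
    """Sum of the weights of all partial assignments contained in A."""
    return sum(w for t, w in T.items() if t <= A)

def feasible(A):
    """A is feasible if it contains no forbidden partial assignment completely."""
    return not any(r <= A for r in R)

def all_assignments(n=5, k=4):
    """Every way of choosing exactly one value per decision variable."""
    for values in product(range(1, k + 1), repeat=n):
        yield frozenset(zip(range(1, n + 1), values))

A1 = frozenset({(1, 2), (2, 2), (3, 3), (4, 2), (5, 1)})
A2 = frozenset({(1, 3), (2, 4), (3, 1), (4, 2), (5, 1)})

best_sap = max(objective(A) for A in all_assignments())
best_csap = max(objective(A) for A in all_assignments() if feasible(A))
print(objective(A1), feasible(A1), objective(A2), feasible(A2), best_sap, best_csap)
```

Enumeration is of course only viable for toy sizes; it serves here to confirm the optimal values quoted in the example.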
Since combinatorial optimization problems are often formulated on the basis of
logical clauses, we show next the relationship between partial assignments and
logical clauses:
Any partial assignment $T \in \mathcal{T}$ with $T = \{(i, r(i)) \mid i \in M, r(i) \in K\}$ for $M \subseteq N$
can also be identified by a logical clause $C(T)$ of the form
\[ C(T) = \bigwedge_{i \in M} (i, r(i)) \tag{1.4} \]
where the clause $C(T)$ is satisfied by an assignment $A$ if and only if $T \subseteq A$.
For the constraints, each partial assignment $R \in \mathcal{R}$ given by $R = \{(i, r(i)) \mid i \in M, r(i) \in K\}$ for $M \subseteq N$ can be interpreted as a clause $C(R)$:
\[ C(R) = \neg\Big(\bigwedge_{i \in M} (i, r(i))\Big) \tag{1.5} \]
which is satisfied by an assignment $A$ if and only if $R \not\subseteq A$.
Using this relationship between partial assignments and logical clauses, we present some combinatorial optimization problems which can be formulated as SAP's:
• Max-Sat
For a given set of Boolean variables $x_i$, their negates $\bar{x}_i$, $i \in N$, and a collection
$C(\mathcal{T})$ of clauses, the Max-Sat problem (in conjunctive normal form) consists
in finding a truth assignment for the variables which maximizes the number
of satisfied clauses in $C(\mathcal{T})$.
Let $K := \{1, 2\}$; then a truth assignment for $x$ corresponds to an assignment $A(x)$: For $i \in N$
\[ x_i = \text{True} :\Leftrightarrow (i, 1) \in A(x), \qquad \bar{x}_i = \text{True} :\Leftrightarrow (i, 2) \in A(x). \tag{1.6} \]
A clause $C(T) \in C(\mathcal{T})$ of the Max-Sat problem in conjunctive form is given by
\[ C(T) = \bigwedge_{i \in M_1} x_i \wedge \bigwedge_{i \in M_2} \bar{x}_i, \qquad M_1 \cap M_2 = \emptyset, \quad M_1 \cup M_2 \subseteq N. \]
By (1.6) there exists to each $C(T)$ a corresponding partial assignment $T$:
\[ C(T) = \bigwedge_{i \in M_1} (i, 1) \wedge \bigwedge_{i \in M_2} (i, 2) \;\Leftrightarrow\; T = \{(i, 1) \mid i \in M_1\} \cup \{(i, 2) \mid i \in M_2\}. \tag{1.7} \]
Let $\mathcal{T}$ be the set of partial assignments corresponding to $C(\mathcal{T})$. We see that a
clause $C(T)$ is satisfied by $x$ if and only if $T$ is satisfied by the assignment $A(x)$, and therefore the Max-Sat problem can be written as a SAP with weights $w_T = 1$ for all $T \in \mathcal{T}$.
The generalization of the Max-Sat problem where $|K| > 2$ and the decision
variables $x_i$ may take any value $r_i \in K$ is called the Generalized Maximum-
Satisfiability (G-Max-Sat) problem (see also [Coc93]).
• Pseudo-Boolean Optimization
A pseudo-Boolean function is a real-valued function of 0-1 variables, which
can be expressed as a polynomial in the variables $x_1,\dots,x_n$ and their
complements $\bar{x}_1,\dots,\bar{x}_n$. The Pseudo-Boolean Optimization problem consists
in maximizing the function value of such a polynomial.
Again to each monomial $w_T \prod_{i \in M_1} x_i \prod_{i \in M_2} \bar{x}_i$ of the objective function
with $M_1 \cap M_2 = \emptyset$ and $M_1 \cup M_2 \subseteq N$ there corresponds the weighted partial assignment $(T, w_T)$ with
\[ T = \{(i, 1) \mid i \in M_1\} \cup \{(i, 2) \mid i \in M_2\}. \]
Let $(\mathcal{T}, w)$ be the set of weighted partial assignments corresponding to the set
of monomials of the objective function, $K := \{1, 2\}$, and let the correspondence between a truth assignment $x$ and an assignment $A(x)$ be given by (1.6). Then
the objective function can be written as follows:
\[ \max \sum_{T \in \mathcal{T}} w_T \prod_{(i,1) \in T} x_i \prod_{(i,2) \in T} \bar{x}_i = \max \sum_{T \in \mathcal{T}} \{ w_T \mid T \subseteq A(x) \} \]
and therefore the Pseudo-Boolean Optimization problem is a special case of
the SAP, too.
• Max-$k$-Cut
Given a graph $G = (N, E)$ with vertex set $N$, edge set $E$ and positive edge weights $w_{ij}$ for the edges $[i, j] \in E$, the Max-$k$-Cut problem consists in finding a partition of $N$ into $k$ subsets such that the sum of the weights of edges whose
endpoints lie in different sets is maximal.
Let $K := \{1,\dots,k\}$ represent the sets of the partition. Each node $i \in N$
must be assigned to a set $r(i) \in K$, which can be represented by a mapping $r : N \to K$. Then the objective function of the Max-$k$-Cut problem is
\[ \max \sum_{[i,j] \in E:\; r(i) \neq r(j)} w_{ij}. \]
We write for any $i \in N$, $r \in K$: $(i, r)$ for the predicate that node $i$ belongs to set $r$. Then the SAP $(n, k, \mathcal{T}, w)$ corresponds to the Max-$k$-Cut problem if $\mathcal{T}$ consists of all partial assignments of the form $T = \{(i, r), (j, s)\}$ for all
$[i, j] \in E$, $r, s \in K$, $r \neq s$, and $w_T := w_{ij}$.
• $k$-Coloring
Given a graph $G = (N, E)$ and a set of colors $K := \{1,\dots,k\}$, the
$k$-Coloring problem consists in finding an assignment of $k$ colors to the nodes
$i \in N$ such that the number of edges between equally colored nodes is minimal.
The objective is equivalent to maximizing the number of edges between differently
colored nodes. We see immediately the relation to the unweighted Max-$k$-Cut problem, where all nodes in the same set of the partition get the same
node color.
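The Max-$k$-Cut construction above can be checked on a small graph. The following sketch assumes a weighted triangle and k = 2, uses 0-based node and set indices for convenience, and compares the SAP objective with the directly computed cut weight; all names are illustrative.

```python
from itertools import product

# Hypothetical toy graph: a triangle with edge weights 1, 2 and 3.
edges = {(0, 1): 1, (1, 2): 2, (0, 2): 3}
n, k = 3, 2

# One weighted partial assignment {(i, r), (j, s)} per edge [i, j] and per
# ordered pair of different sets r != s, each carrying the edge weight.
T = {
    frozenset({(i, r), (j, s)}): w
    for (i, j), w in edges.items()
    for r in range(k)
    for s in range(k)
    if r != s
}

def sap_objective(assign):
    """SAP objective: weights of all partial assignments contained in A."""
    A = set(enumerate(assign))
    return sum(w for t, w in T.items() if t <= A)

def cut_weight(assign):
    """Direct definition: total weight of edges whose endpoints are separated."""
    return sum(w for (i, j), w in edges.items() if assign[i] != assign[j])

best = max(sap_objective(a) for a in product(range(k), repeat=n))
print(len(T), best)
```

For every assignment exactly one of the k(k-1) partial assignments of a cut edge is satisfied, which is why both objective values coincide.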
Some well-known problems that can be formulated as C-SAP's are
• Sat
Given a set of Boolean variables $x_i$, $i \in N$, and a collection of clauses in the
form of disjunctions of some literals, the Sat problem consists in deciding whether there exists a truth assignment that simultaneously satisfies all these
clauses, or not.
Each disjunction can also be represented as the negate of a conjunction as
described in (1.5). We denote the set of all these clauses by $C(\mathcal{R})$ and the
corresponding set of partial assignments by $\mathcal{R}$.
This problem is a special case of the C-SAP, because it does not have an
objective function ($\mathcal{T} = \emptyset$), but consists only of the constraint set $\mathcal{R}$. Again let $K := \{1, 2\}$ and let the connection between a truth assignment $x$ and a
feasible assignment $A(x)$ be given by (1.6). Then $A(x)$ satisfies (1.3) if and
only if all clauses $C(R)$ with $R \in \mathcal{R}$ are satisfied, and therefore is a feasible
assignment for the C-SAP.
As for the G-Max-Sat problem, we call the Sat problem where more than two potential values can be assigned to the variables $x_i$ the Generalized Satisfiability (G-Sat) problem.
• Max-Clique
Let a graph $G = (N, E)$ be given. The Max-Clique problem consists in finding the largest complete subgraph in $G$.
We introduce for every node $i \in N$ a 0-1 variable $x_i$ and define
\[ x_i = 1 :\Leftrightarrow i \text{ belongs to a clique.} \]
Then the Max-Clique problem can be stated as follows
\[ \max \sum_{i \in N} x_i \]
\[ \text{s.t.} \quad x_i x_j = 0 \quad \forall [i, j] \notin E \]
\[ x_i \in \{0, 1\} \quad \forall i \in N. \]
The objective function maximizes the number of nodes in a clique, whereas
the constraints guarantee that only nodes of one clique are counted: If two
nodes $i$ and $j$ are not connected by an edge then they cannot both belong to
the same clique.
Let $K := \{0, 1\}$; then the objective function is given by the set of weighted partial
assignments $(\mathcal{T}, w)$ with $\mathcal{T} := \{\{(i, 1)\}, i \in N\}$ and $w_T = 1$ for all $T \in \mathcal{T}$.
Analogously the constraint set $\mathcal{R}$ is formulated by $\mathcal{R} := \{\{(i, 1), (j, 1)\}, i, j \in
N, [i, j] \notin E\}$. Then $(n, 2, \mathcal{R}, \mathcal{T}, w)$ describes the Max-Clique problem as
a C-SAP.
• Label Placement (see Chapter 4)
Let n points in the plane be given, denoting points of interest on a map.
Moreover, for each point a label (description of the point) is given which has
to be placed at one of k given positions in order to mark the corresponding
point. The label placement problem consists in selecting one of the k label
positions for each point such that when placing the labels at these positions,
they do not overlap, ambiguity is avoided and other aesthetical criteria are
fulfilled.
Preferences of label positions as well as ambiguity can be modeled by some
suitable objective function, overlapping by the constraint set. Different models
will be discussed in Chapter 4.
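As an illustration of the C-SAP formulations in this list, the Max-Clique model can be built and solved by enumeration on a small graph. This is a sketch under the assumption of a hypothetical 4-node graph; node indices are 0-based, the values are K = {0, 1} as in the text, and the helper names are not taken from the thesis.

```python
from itertools import combinations, product

# Hypothetical toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 2.
n = 4
edges = {(0, 1), (0, 2), (1, 2), (2, 3)}

# Objective: one singleton partial assignment {(i, 1)} of weight 1 per node.
T = {frozenset({(i, 1)}): 1 for i in range(n)}
# Constraints: {(i, 1), (j, 1)} is forbidden whenever [i, j] is not an edge.
R = [
    frozenset({(i, 1), (j, 1)})
    for i, j in combinations(range(n), 2)
    if (i, j) not in edges
]

def objective(A):
    return sum(w for t, w in T.items() if t <= A)

def feasible(A):
    return not any(r <= A for r in R)

def solve():
    """Brute-force the C-SAP; only sensible for very small n."""
    best_value, best_x = -1, None
    for x in product((0, 1), repeat=n):
        A = frozenset(enumerate(x))
        if feasible(A) and objective(A) > best_value:
            best_value, best_x = objective(A), x
    return best_value, best_x

print(solve())
```

The feasible assignments are exactly the indicator vectors of cliques, so the optimal value is the clique number of the toy graph.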
All problems mentioned above are NP-hard. In order to find good approximate solutions for these problems, we want to include global information in the solution
process. Towards this end we consider a continuous formulation of the C-SAP and
introduce a variable $p \in \{0, 1\}^{n \times k}$ which represents uniquely any assignment $A \in \mathcal{A}$
by setting $p_{ir} = 1$ if and only if $(i, r) \in A$.
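Definition 1.5 (Feasible Points: $\Delta^I$, $\Delta$, $\Delta^0$, $\Delta^k$)
The set of all assignments is defined by
\[ \Delta^I := \Big\{ p \in \{0, 1\}^{n \times k} \;\Big|\; \sum_{r=1}^{k} p_{ir} = 1, \ \forall i \in N \Big\}. \tag{1.8} \]
Moreover, we denote its convex hull by
\[ \Delta := \operatorname{Conv}(\Delta^I) = \Big\{ p \in [0, 1]^{n \times k} \;\Big|\; \sum_{r=1}^{k} p_{ir} = 1, \ \forall i \in N \Big\} \tag{1.9} \]
and its relative interior by $\Delta^0$. One single simplex will be denoted by
\[ \Delta^k := \Big\{ q \in [0, 1]^{k} \;\Big|\; \sum_{r=1}^{k} q_r = 1 \Big\}. \]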
We see that $p$ is an $n \times k$ matrix whose rows correspond to the decision variables
and whose columns correspond to the values to be assigned. If $p \in \Delta^I$, then it
represents uniquely an assignment $A$. On the other hand, if $p \in \Delta$, then $p_{ir} \in [0, 1]$ can be interpreted as the probability that value $r$ is assigned to decision variable
$x_i$. We denote the rows of $p$ by $p_i \in \Delta^k$ for $i \in N$.
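Using this characterization of the set of all assignments, we formulate the C-SAP
given by $(n, k, \mathcal{R}, \mathcal{T}, w)$ as in Definition 1.2 by the following integer program
\[ \max z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) \tag{1.10} \]
\[ \text{s.t.} \quad p \in \Delta^I \tag{1.11} \]
\[ \prod_{(i,r) \in R} p_{ir} = 0 \quad \forall R \in \mathcal{R}. \tag{1.12} \]
Note that then a SAP $(n, k, \mathcal{T}, w)$ is just given by (1.10) and (1.11).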
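To see how the integer program above acts on a matrix $p$, the polynomial objective (1.10) and the constraint products (1.12) can be evaluated directly. The sketch below reuses the data of Example 1.4 as an assumed test instance; `indicator` is a hypothetical helper that builds the 0-1 matrix of an assignment, and the uniform matrix is included only to show that the same formula also makes sense for fractional $p$.

```python
from math import prod

# Example 1.4 again, as tuples of 1-based (variable, value) pairs.
T = {
    ((1, 3), (2, 4)): 5,
    ((1, 2), (2, 2), (3, 3)): 7,
    ((4, 2), (5, 1)): 11,
    ((5, 4),): 2,
}
R = [((1, 2), (5, 1)), ((2, 4), (3, 4), (4, 2))]

def z(p):
    """Objective (1.10): sum over T of w_T times the product of the p_ir in T."""
    return sum(w * prod(p[i - 1][r - 1] for (i, r) in t) for t, w in T.items())

def constraint_products(p):
    """Left-hand sides of (1.12); all of them must vanish for a feasible point."""
    return [prod(p[i - 1][r - 1] for (i, r) in c) for c in R]

def indicator(values, k=4):
    """0-1 matrix of the assignment that gives variable i the value values[i-1]."""
    return [[1.0 if r == v else 0.0 for r in range(1, k + 1)] for v in values]

p_A1 = indicator([2, 2, 3, 2, 1])          # assignment A_1 of Example 1.4
p_A2 = indicator([3, 4, 1, 2, 1])          # assignment A_2 of Example 1.4
uniform = [[0.25] * 4 for _ in range(5)]   # barycenter of the relaxed domain

print(z(p_A1), constraint_products(p_A1))
print(z(p_A2), constraint_products(p_A2))
print(z(uniform))
```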
1.4 Idea of the Algorithm
A common and often successful approach for solving the problems presented in
the previous section are local search heuristics, like Simulated Annealing and Tabu
Search. However, as we have seen in Example 1.3, their inherent myopia may prove their undoing for certain instances. For this reason one of our goals in the development
of a new heuristic was to include the ability to 'orient itself', from which
especially problems like those in Example 1.3 will profit. Towards this end we embed
the set of all assignments in the continuous space by relaxing the integer constraints
(1.11). Thus for a C-SAP $(n, k, \mathcal{R}, \mathcal{T}, w)$, we get the following relaxed problem:
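\[ \max z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) \tag{1.13} \]
\[ \text{s.t.} \quad \sum_{r=1}^{k} p_{ir} = 1 \quad \forall i \in N \tag{1.14} \]
\[ 0 \le p_{ir} \le 1 \quad \forall i \in N, r \in K \tag{1.15} \]
\[ \prod_{(i,r) \in R} p_{ir} = 0 \quad \forall R \in \mathcal{R}. \tag{1.16} \]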
Analogously for the SAP a relaxed problem is defined by (1.13)—(1.15) and will be
denoted by R-SAP.
Working in the interior of $\Delta$, we define for our heuristic a continuous mapping
$F : \Delta^0 \to \Delta^0$ and consider the discrete time dynamical system which results
from iterating $F$. Thus the basic concept of our algorithm, called the 'fixed point
heuristic', is the following:
Fixed Point Heuristic (FPH):
• Choose a starting point $p^0 \in \Delta^0$.
• Compute the sequence $p^t := F(p^{t-1})$, $t = 1, 2, \dots, l$ for some $l$.
• Choose as solution $p^* \in \Delta^I$ the assignment $p^*$ which is 'closest' to $p^l$.
Subsequently we will concentrate on the definition of $F$, which is central to obtaining a
sequence of points converging to one of those feasible points in $\Delta^I$ corresponding to
a 'good' solution of the given problem for a reasonably large proportion of starting
points $p^0 \in \Delta^0$.
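The three steps of FPH translate into a very small driver loop. The sketch below is not the operator studied in this thesis: `make_linear_operator` is a hypothetical toy operator for a SAP with linear objective (positive coefficients `c`), chosen only so that the loop has something to iterate. 'Closest' is interpreted here as the row-wise largest component, which is the Euclidean-nearest vertex of each simplex.

```python
def fph(F, p0, steps):
    """Iterate p^t = F(p^{t-1}) for a fixed number of steps and return p^l."""
    p = [row[:] for row in p0]
    for _ in range(steps):
        p = F(p)
    return p

def closest_assignment(p):
    """Round p to an assignment by taking the largest entry in every row."""
    return [max(range(len(row)), key=row.__getitem__) for row in p]

def make_linear_operator(c):
    """Toy operator (assumption): F(p)_ir = p_ir c_ir / sum_s p_is c_is, c_ir > 0."""
    def F(p):
        out = []
        for p_i, c_i in zip(p, c):
            norm = sum(x * y for x, y in zip(p_i, c_i))
            out.append([x * y / norm for x, y in zip(p_i, c_i)])
        return out
    return F

c = [[1.0, 3.0, 2.0], [4.0, 1.0, 1.0]]   # hypothetical linear weights
p0 = [[1 / 3] * 3, [1 / 3] * 3]          # interior starting point
p_final = fph(make_linear_operator(c), p0, 50)
print(closest_assignment(p_final))
```

With this toy operator each row concentrates on the value with the largest coefficient, so the rounded result picks value index 1 for the first variable and 0 for the second.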
The success of such discrete dynamical systems as an alternative to local search
methods can, among other papers, be put down to the works of Cochand [Coc93] and Bomze et al. [BPG97] dealing with the Max-Sat and Max-Clique problem,
respectively. Both of these papers use the described concept, where the dynamics is a special case of the adjusted replicator dynamics. Since we will work with this
dynamics as well, we describe it below in more detail:
In its single-population version the replicator dynamics goes back to Taylor and Jonker
[TJ78] and describes in the context of theoretical biology the evolution of a population over time [Aki79, Sig87, HS88]. It is defined for a population state $p^t \in \Delta^k$ at
time $t$ by
\[ p_r^{t+1} = p_r^t \, \frac{(A p^t)_r}{(p^t)^T A p^t}, \qquad r = 1, \dots, k, \tag{1.17} \]
where each $r \in K$ denotes some property and the components $p_r^t$ represent the
population share of the individuals at time $t$. In theoretical biology $A$ is a symmetric matrix with non-negative elements, and in this case the so-called Fundamental
Theorem of Natural Selection [LA83, Kur90] says that $(p^t)^T A p^t$, the average fitness, increases along every non-stationary trajectory of (1.17). Moreover, it is known
that in this case every trajectory converges to an equilibrium.
Bomze et al. [BPG97, BBPP99] used the dynamics (1.17) as an approach to the Max-
Clique problem. Let $A$ denote the adjacency matrix of a given graph $G$. Motzkin
and Straus [MS65] have shown that $(1 - 2z^*)^{-1}$ is the size of the maximum clique, if $z^*$ denotes the optimal objective value of the following quadratic optimization
problem
\[ \max z(p) := \tfrac{1}{2}\, p^T A p \quad \text{s.t.} \quad p \in \Delta^k. \tag{1.18} \]
Note that if G has k nodes, then A is a symmetric k x k matrix and so in this case
the Fundamental Theorem of Natural Selection holds. Besides investigating this
quadratic optimization problem, the authors also interpret dynamics (1.17) in the
game theoretical context, where matrix A represents the payoff matrix of the players.
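The behavior of (1.17) on problem (1.18) can be observed on a tiny graph. The following sketch assumes the path graph on three nodes, whose maximum clique has size 2, and an arbitrary interior starting point; it records the average fitness along the trajectory and converts the final value into a clique-size estimate via the Motzkin and Straus relation. In general the dynamics may stop at a local optimum, so the estimate is only a lower bound on such instances.

```python
def replicator_step(A, p):
    """One step of (1.17): p_r <- p_r (A p)_r / (p^T A p)."""
    Ap = [sum(a * x for a, x in zip(row, p)) for row in A]
    avg = sum(x * y for x, y in zip(p, Ap))
    return [x * y / avg for x, y in zip(p, Ap)]

def average_fitness(A, p):
    """The quadratic form p^T A p."""
    k = len(p)
    return sum(p[r] * A[r][s] * p[s] for r in range(k) for s in range(k))

def run(A, p, steps):
    """Iterate the dynamics and record the average fitness after every step."""
    history = [average_fitness(A, p)]
    for _ in range(steps):
        p = replicator_step(A, p)
        history.append(average_fitness(A, p))
    return p, history

# Adjacency matrix of the path 0 - 1 - 2 (assumed toy graph).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
p, history = run(A, [0.2, 0.3, 0.5], 30)

# With z = p^T A p / 2, the relation (1 - 2 z*)^(-1) becomes 1 / (1 - p^T A p).
clique_estimate = 1.0 / (1.0 - history[-1])
print(history[0], history[-1], clique_estimate)
```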
Independently of the interpretation of dynamics (1.17), its equilibria or fixed points,
i.e. points for which $F(p) = p$, are of special interest. Obviously, they are given by the solutions of the equations
\[ p_r \, [ (Ap)_r - p^T A p ] = 0, \qquad r = 1, \dots, k. \tag{1.19} \]
In case of the quadratic optimization problem (1.18) with symmetric matrix A,
it was shown by [BPG97] that not only are all solutions of (1.18) among these
equilibria, but also that the following relations between equilibria and local optima hold: Strict local optima of (1.18) are equivalent to asymptotically stable equilibria of (1.17). Moreover, every local optimum of (1.18) is a Nash equilibrium, and Nash
equilibria and the Karush-Kuhn-Tucker points of (1.18) are equivalent.
These results show in the case of the single-population replicator dynamics how
the characterization of the equilibria also determines the local optima of the corresponding
optimization problem. Subsequently we will turn our attention to its
multi-population counterpart: Instead of working only over one simplex $\Delta^k$, it is defined
on the cross product of simplices, $\Delta$. Let us introduce the notion of the fitness
function:
Definition 1.6 (Fitness Function)
A vector $\xi = (\xi_{ir})$ $(i \in N, r \in K)$, where $\xi_{ir} : \Delta^0 \to \mathbb{R}_+$ are continuous functions, is
called fitness function on $\Delta^0$.
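The adjusted multi-population replicator dynamics ([Tay79, MS82]) is defined for
$p \in \Delta^0$ by the operator $F[\xi](p)$:
Definition 1.7 (Operator $F[\xi](p)$)
Let $\xi$ be a fitness function on $\Delta^0$. Then the operator $F[\xi] : \Delta^0 \to \Delta^0$ is defined for
all $i \in N$, $r \in K$ by
\[ F[\xi](p)_{ir} := \frac{p_{ir}\, \xi_{ir}(p)}{\Xi_i(p)}, \qquad \text{where } \Xi_i(p) := \sum_{s=1}^{k} p_{is}\, \xi_{is}(p). \tag{1.20} \]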
In the setting of multi-population dynamics where $n$ populations are considered
whose members can adopt one of $k$ possible strategies, this dynamics describes
the evolution of the proportion of individuals in population $i$ adopting strategy $r$
(see [Wei96]). This evolution equation expresses the fact that the relative increase
$\frac{F[\xi](p)_{ir} - p_{ir}}{p_{ir}}$ of $p_{ir}$ equals the excess fitness $\frac{\xi_{ir}(p) - \Xi_i(p)}{\Xi_i(p)}$ of subpopulation $(i, r)$ over the
average fitness of population $i$.
Cochand [Coc93] has used an operator of type (1.20) for solving the G-Max-Sat
problem. He has chosen an appropriate fitness function $\xi$ such that truth assignments which satisfy all clauses correspond to asymptotically stable fixed points of
the dynamical system.
Since the C-SAP is a generalization of the G-Max-Sat problem, we adopt the fitness
function used in [Coc93]. On the one hand, it should be responsible for the feasibility of the solutions by making logically forbidden assignments repellors. On the other
hand, we want to get a good feasible solution. For this reason the first fitness
function is combined with a second one whose task is to attract the system towards
assignments with high objective values. This will be achieved by using the partial derivatives of the objective function as fitness function. Hence, we define
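Definition 1.8 (Gradient/Repellor Dynamics)
Let $z$ be the objective function of a SAP instance $(n, k, \mathcal{T}, w)$ with $\Gamma = (\Gamma_{ir})$ defined
by
\[ \Gamma_{ir}(z, p) := \frac{\partial z}{\partial p_{ir}}(p) = \sum_{T \in \mathcal{T}:\, (i,r) \in T} w_T \prod_{(j,s) \in T \setminus (i,r)} p_{js} \qquad \forall i \in N, r \in K. \tag{1.21} \]
If $\Gamma$ is a fitness function on $\Delta^0$, i.e. $\Gamma_{ir}(p) > 0$ for all $p \in \Delta^0$, then we call the
dynamics defined by $F[\Gamma^\alpha]$, $\alpha > 0$, a gradient-type dynamics.
Moreover, if a set of forbidden partial assignments $\mathcal{R}$ is given, we define the fitness
function $\Theta = (\Theta_{ir})$ as in [Coc93] by
\[ \Theta_{ir}(p) := \prod_{R \in \mathcal{R}:\, (i,r) \in R} \Big( 1 - \prod_{(j,s) \in R \setminus (i,r)} p_{js} \Big) \qquad \forall i \in N, r \in K \tag{1.22} \]
and call the dynamics defined by $F[\Theta^\beta]$, $\beta > 0$, a repellor dynamics.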
Note that by definition of the C-SAP (SAP), all weights $w_T$ of the objective function
$z(p)$ in (1.13) are positive. Hence, a sufficient criterion for $\Gamma = (\Gamma_{ir})$ being a fitness
function on $\Delta^0$ is that all variables in $z(p)$ exist. We will see in Section 2.2.2 that
this can always be achieved by an appropriate transformation of $z$. Moreover, we
observe that $0 < \Theta_{ir}(p) \le 1$ for all $p \in \Delta^0$ and therefore $\Theta$ is also a fitness function
on $\Delta^0$.
For the C-SAP we will use the operator $F[\Gamma^\alpha \Theta^\beta]$, combining the gradient-type dynamics and the repellor dynamics. In Chapter 3 we will discuss properties of this
operator and present some results of [BCG98].
We conclude this introduction with two examples to illustrate the behavior of the
three different dynamics.
Example 1.9
Let the following C-SAP instance with $n = k = 2$ be given.
\[ \max z(p) = 5 p_{11} p_{21} + 4 p_{12} p_{22} + 3 p_{12} p_{21} + 2 p_{11} p_{22} \]
\[ \text{s.t.} \quad p_{11} p_{21} = 0, \quad p_{12} p_{22} = 0, \quad p \in \Delta^I. \]
Here, the set of forbidden assignments is $\mathcal{R} = \{\{(1,1), (2,1)\}, \{(1,2), (2,2)\}\}$. Moreover, we see immediately that $p_1^* := \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ is the point in $\Delta^I$ with the
largest objective value $z(p_1^*) = 5$ and $p_2^* := \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ has objective value $z(p_2^*) = 4$.
However, both points, $p_1^*$ and $p_2^*$, are infeasible because they violate the constraints
corresponding to the forbidden assignments in $\mathcal{R}$.
Figure 1.2 depicts the behavior of the three dynamics in the $p_{11}$ and $p_{21}$ coordinates
along the x- and y-axis, respectively. Each subfigure shows the iteration paths
(trajectories) of the corresponding dynamics where arrows mark their orientation.
In Figure 1.2(a) we see the trajectories for the gradient-type dynamics $F[\Gamma^{0.5}]$ used
to maximize $z$. The satisfiability constraints are ignored and only the SAP is considered.
We observe that the majority of the trajectories converges to $p_1^*$, the point with the largest objective value, and a smaller part converges to $p_2^*$.
Figure 1.2(b) shows the repellor dynamics $F[\Theta]$. It ignores the objective function
and is only designed to satisfy the constraints in $\mathcal{R}$. Hence this case corresponds to
the Sat problem ($\mathcal{T} = \emptyset$) given only by the constraint set $\mathcal{R}$.
Finally, Figure 1.2(c) depicts the trajectories of the combined dynamics of $F[\Gamma^{2} \Theta^{0.5}]$. None of them converges to a forbidden point and their majority converges to the
feasible point in the upper left corner, which is the global maximum of the C-SAP.
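The fitness functions of Definition 1.8 are easy to evaluate on the instance of Example 1.9. The following sketch is an illustration under stated assumptions: indices are 0-based, the index sets of the sums and products follow the reading "all partial assignments containing (i, r), with the factor of (i, r) itself left out", and the exponents 2 and 0.5 are taken from the combined dynamics discussed for Figure 1.2(c). The function names are hypothetical.

```python
from math import prod

# Example 1.9 with 0-based indices: pair (i, r) means variable i takes value r.
T = {
    frozenset({(0, 0), (1, 0)}): 5.0,   # 5 p11 p21
    frozenset({(0, 1), (1, 1)}): 4.0,   # 4 p12 p22
    frozenset({(0, 1), (1, 0)}): 3.0,   # 3 p12 p21
    frozenset({(0, 0), (1, 1)}): 2.0,   # 2 p11 p22
}
R = [frozenset({(0, 0), (1, 0)}), frozenset({(0, 1), (1, 1)})]

def gamma(p, i, r):
    """Partial derivative of z with respect to p_ir, as in (1.21)."""
    return sum(
        w * prod(p[j][s] for (j, s) in t if (j, s) != (i, r))
        for t, w in T.items() if (i, r) in t
    )

def theta(p, i, r):
    """Repellor fitness in the spirit of (1.22)."""
    return prod(
        1.0 - prod(p[j][s] for (j, s) in c if (j, s) != (i, r))
        for c in R if (i, r) in c
    )

def step(p, alpha=2.0, beta=0.5):
    """One application of an operator of type (1.20) with fitness gamma^alpha * theta^beta."""
    new_p = []
    for i, row in enumerate(p):
        fit = [gamma(p, i, r) ** alpha * theta(p, i, r) ** beta
               for r in range(len(row))]
        norm = sum(x * f for x, f in zip(row, fit))
        new_p.append([x * f / norm for x, f in zip(row, fit)])
    return new_p

p0 = [[0.5, 0.5], [0.5, 0.5]]
p1 = step(p0)
print(gamma(p0, 0, 0), gamma(p0, 1, 0), theta(p0, 0, 0), p1)
```

Starting from the barycenter, the first row stays balanced after one step while the second row already moves towards $p_{21}$; repeated calls of `step` produce a trajectory of the kind drawn in the figure.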
The following example shows that for the quadratic maximization problem (1.18) over one simplex, the replicator dynamics (1.17) coincides with the gradient-type
dynamics of $F[\Gamma]$.
Example 1.10
Let $A$ be a positive, symmetric $k \times k$ matrix and the quadratic optimization problem
(1.18) be given. Then the gradient $\Gamma$ of the objective function $z(p)$ is $Ap$ and
therefore we get for $F[\Gamma](p)$:
\[ F[\Gamma](p)_r = \frac{p_r\, \Gamma_r(p)}{\sum_{s=1}^{k} p_s\, \Gamma_s(p)} = \frac{p_r\, (Ap)_r}{p^T A p}. \]
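The individual steps behind Example 1.10 can be written out as a short worked derivation. It assumes, as in the example, that $A$ is symmetric, and it uses the single-simplex form of the operator (1.20) with the normalizing sum written here as $\Xi(p)$, a notation introduced only for this illustration.

```latex
\begin{align*}
\Gamma_r(p) &= \frac{\partial}{\partial p_r}\Big(\tfrac{1}{2}\sum_{s,u=1}^{k} A_{su}\,p_s\,p_u\Big)
             = \tfrac{1}{2}\sum_{u=1}^{k}\big(A_{ru} + A_{ur}\big)\,p_u
             = (Ap)_r, \\
\Xi(p)      &= \sum_{s=1}^{k} p_s\,\Gamma_s(p) = p^T A p, \\
F[\Gamma](p)_r &= \frac{p_r\,\Gamma_r(p)}{\Xi(p)} = \frac{p_r\,(Ap)_r}{p^T A p}.
\end{align*}
```

The last line is exactly the right-hand side of the replicator dynamics (1.17), which is the coincidence claimed in the example.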
In Chapter 2 we will study the gradient-type dynamics of $F[\Gamma]$. However, in contrast
to the quadratic optimization problem over one simplex as in (1.18), we investigate
$F[\Gamma]$ for objective functions $z(p)$ of the SAP and we are therefore working on $\Delta$, the cross product of simplices.
Figure 1.2: Trajectories of three different dynamics in the $(p_{11}, p_{21})$-plane. (a) Gradient-type dynamics: $F[\Gamma^{0.5}]$. (b) Repellor dynamics: $F[\Theta]$. (c) Combined dynamics: $F[\Gamma^{2} \Theta^{0.5}]$.
Chapter 2
The Semi-Assignment Problem
2.1 Introduction
The aim of this chapter is the development and study of a continuous relaxation
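based heuristic for the SAP. We recall from Chapter 1 that a SAP instance is defined
by a 4-tuple $(n, k, \mathcal{T}, w)$, where w is a function $w : \mathcal{T} \to \mathbb{R}_+$ and every $T \in \mathcal{T}$ is a set
of the form $T = \{(i_1, r_1), \ldots, (i_{|T|}, r_{|T|})\}$ with $i_l < i_{l'}$ for $1 \le l < l' \le |T| \le n$ and $1 \le r_l \le k$.
Throughout this chapter we will use the sets $N := \{1, \ldots, n\}$, $K := \{1, \ldots, k\}$ and
denote the weights by $w_T := w(T)$. A SAP instance $(n, k, \mathcal{T}, w)$ can be expressed
by the nonlinear integer program
$$\max\; z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) \qquad (2.1)$$
$$\text{s.t.}\quad p \in \Delta^I, \qquad (2.2)$$
where
$$\Delta^I = \Big\{ p \in \{0,1\}^{n \times k} \;\Big|\; \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \Big\}.$$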
We will call 'relaxed semi-assignment problem' (R-SAP) the problem obtained by
relaxing the integrality constraints in (2.2) to
$$p \in \Delta = \Big\{ p \in [0,1]^{n \times k} \;\Big|\; \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \Big\}. \qquad (2.3)$$
The objective function z in (2.1) will be investigated in detail in Section 2.2,
whose first subsections are dedicated to structural properties, transformations and
equivalent formulations of z on $\Delta$. The last two subsections discuss the relationship between the SAP and the R-SAP and therefore concentrate especially on the
locations of local optima of both problems.
In order to 'solve' the SAP we will deal with its relaxation R-SAP, where our approach is based on the concept of growth transformations:
Definition 2.1 (Growth Transformation)
Let z be a continuous function on $\Delta^0 \subseteq \mathbb{R}^{nk}$. We say that a continuous mapping
$F : \Delta^0 \to \Delta^0$ is a growth transformation for z iff
$$z(p) \le z(F(p)) \quad \forall p \in \Delta^0.$$
If additionally
$$z(p) = z(F(p)) \;\Rightarrow\; p = F(p) \qquad (2.4)$$
F is called a strict growth transformation.
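In this chapter we study transformations $F[\xi] : \Delta^0 \to \Delta^0$, where $\xi = (\xi_{ir})$ is a
vector of continuous functions $\xi_{ir} : \Delta^0 \to \mathbb{R}_+$, and $F[\xi]$ is for all $i \in N$, $r \in K$ of the
following form:
$$F[\xi](p)_{ir} = \frac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)}. \qquad (2.5)$$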
Recall from Section 1.4 that $\xi$ is a fitness function, which needs to be defined
according to the optimization problem to be solved. In case of the SAP, let z(p) be the objective function (2.1) and let us assume that every variable $p_{ir}$ actually occurs in z(p).
Moreover, we recall from (1.21) the definition of $\Gamma$ as the partial derivatives of z(p), i.e. $\Gamma_{ir}(z,p) = \frac{\partial z}{\partial p_{ir}}(p)$ for all $i \in N$, $r \in K$. We see that under these assumptions
all $\Gamma_{ir}$ are positive on $\Delta^0$ and therefore $\Gamma = (\Gamma_{ir})$ can be used as fitness function in
(2.5) for the R-SAP $(n, k, \mathcal{T}, w)$.
If there is no danger of confusion, then we will simply write $\Gamma_{ir}(p)$ instead of $\Gamma_{ir}(z,p)$.
The mapping $F[\Gamma]$ was already discussed in a paper by Baum and Sell [BS68], where
the authors investigated $F[\Gamma]$ for arbitrary polynomials with positive coefficients and
derived the following important result.
Theorem 2.2 (Baum, Sell)
Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a polynomial in the variables $p_{ir}$, $i \in N$, $r \in K$ with positive coefficients. Then $F[\Gamma]$ is a strict growth transformation on $\Delta^0$.
We will adopt this mapping and use F in the context of combinatorial optimization
problems which can be formulated as R-SAP instances. The objective function z(p) as in (2.1) has some special properties to be discussed in Section 2.2. Moreover, in
Section 2.3 we will introduce other fitness functions $\xi$. Based on the idea of Baum
and Sell we will prove a generalization of Theorem 2.2 by giving some sufficient
conditions under which $F[\xi]$ is a growth transformation. Furthermore we will
investigate the behavior of $F[\Gamma]$ and show that local maxima and saddle points
are always fixed points of F. After further discussion of properties of the iteration sequence and convergence of $F[\Gamma]$ we will finally characterize attractors of $F[\Gamma]$.
Let us conclude this introduction with the following small example which demonstrates the typical behavior of the operator $F[\Gamma]$:
Example 2.3
Let the following SAP with N := {1, 2} and K := {1, 2} be given:
$$\max\; z(p) = 2p_{11}p_{21} + p_{12}p_{22} + p_{11} + p_{12} + p_{21} + p_{22} - 2$$
$$\text{s.t.}\quad p_{i1} + p_{i2} = 1, \quad i = 1, 2$$
$$p_{ir} \in \{0, 1\}, \quad i = 1, 2,\ r = 1, 2.$$
This problem has its global maximum in $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ with objective value $z(p^*) = 2$
and a local maximum in $q^* = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ with $z(q^*) = 1$. Moreover, we observe that its
relaxation R-SAP does not have any local optima in the interior $\Delta^0$, a phenomenon which holds for all polynomials of the form (2.1), as we will see in the next section.
Figure 2.1: Plots of $z(p) = 3p_{11}p_{21} - p_{11} - p_{21} + 1$: (a) graph of z(p) in the $p_{11}, p_{21}$-plane; (b) contour plot of z(p) and gradient vector field.
Figures 2.1(a) and 2.1(b) show the restriction of z(p) to the $p_{11}, p_{21}$-plane. Substituting $p_{i2} = 1 - p_{i1}$ for i = 1, 2 in z(p) we get
$$z(p) = 3p_{11}p_{21} - p_{11} - p_{21} + 1.$$
We see that by such substitutions we can construct new polynomials $\tilde z$ which have
a different 'form' (representation) from z, but whose values agree with those of z on $\Delta$. This observation will lead to the definition of an equivalence relation for these polynomials.
The graph of z(p) is depicted in Figure 2.1(a) and the corresponding contour plot with
the gradient vector field in Figure 2.1(b). From both pictures we clearly see that
there exists a saddle point at $p := \frac{1}{3}\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$.
In this example we get for the partial derivatives
$$\Gamma(z,p) = \begin{pmatrix} 2p_{21} + 1 & p_{22} + 1 \\ 2p_{11} + 1 & p_{12} + 1 \end{pmatrix} \qquad (2.6)$$
and since all coefficients of z are positive, it follows that $\Gamma_{ir}(p) > 0$ for i = 1, 2, r = 1, 2 and all $p \in \Delta^0$. Note that this would also have been true if z(p) consisted only of the first two monomials, i.e. $z(p) = 2p_{11}p_{21} + p_{12}p_{22}$, which is another polynomial that agrees with z(p) on $\Delta$. However, by adding the other monomials in z(p) we
guarantee that the partial derivatives $\Gamma_{ir}(z,p)$ are strictly positive for all $p \in \Delta$ and
therefore $F[\Gamma]$ can also be continuously extended to the boundary of $\Delta$.
Figure 2.2: Trajectories and regions of attraction: (a) trajectories of $F[\Gamma]$; (b) regions of attraction for the strict local maxima p* and q* (numerically determined).
Figure 2.2(a) shows the trajectories of $F[\Gamma]$ for this example. We observe that the
steps become very small in a neighborhood of a saddle point or local maximum.
Furthermore, both local maxima p* and q* have a region of attraction shown in
Figure 2.2(b): the black region is the region of attraction of the global maximum
p*, whereas the light gray region belongs to q*. Note, however, that in the general case we cannot determine the regions of attraction explicitly.
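The behavior described for Example 2.3 can be reproduced with a few lines of code. The sketch below iterates the operator (2.5) with the partial derivatives (2.6) as fitness; the starting points, the number of steps and all function names are assumptions made for illustration.

```python
def z(p):
    """Objective of Example 2.3 for a 2x2 matrix p given as nested lists."""
    (p11, p12), (p21, p22) = p
    return 2 * p11 * p21 + p12 * p22 + p11 + p12 + p21 + p22 - 2

def gamma(p):
    """Partial derivatives (2.6) of the objective of Example 2.3."""
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + 1, p22 + 1],
            [2 * p11 + 1, p12 + 1]]

def growth_step(p):
    """One application of the operator (2.5) with fitness gamma."""
    new_p = []
    for row, g_row in zip(p, gamma(p)):
        weighted = [x * g for x, g in zip(row, g_row)]
        total = sum(weighted)
        new_p.append([w / total for w in weighted])
    return new_p

def iterate(p, steps=200):
    """Apply growth_step repeatedly and record the objective values."""
    values = [z(p)]
    for _ in range(steps):
        p = growth_step(p)
        values.append(z(p))
    return p, values

# Two hypothetical starting points on either side of the saddle point.
p_end, values = iterate([[0.6, 0.4], [0.5, 0.5]])
q_end, _ = iterate([[0.1, 0.9], [0.2, 0.8]])
```

With these starting points the first run approaches p* and the second approaches q*, and the recorded objective values never decrease, in line with Theorem 2.2.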
This example has already touched many interesting questions in connection with
important properties of the operator and the objective function. In Section 2.4
we will focus on the regions of attraction of strict local maxima and we will be
able to explicitly describe a polytopal subset thereof. This knowledge can then
advantageously be used for an additional stopping criterion in FPH.
2.2 On the Objective Function of the R-SAP
In this section we first investigate some properties of the objective function and its
partial derivatives $\Gamma_{ir}$, which largely influence the behavior of $F[\Gamma]$. Then we study the relationship between the SAP and the R-SAP where we concentrate mainly on
the locations of local optima and saddle points.
2.2.1 Basic Properties
The objective function z in (2.1) is a polynomial function of the following specific form.
Definition 2.4 (A-Polynomial, $P^N$)
• Let a 4-tuple $(n, k, \mathcal{T}, w)$ as for a SAP instance, together with a constant $c \in \mathbb{R}$
be given. Then we call $z : \mathbb{R}^{nk} \to \mathbb{R}$,
$$z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) + c \qquad (2.7)$$
an assignment-polynomial (short: A-polynomial), where $p_{ir}$ are variables with
$i \in N$, $r \in K$. The degree m of the A-polynomial z(p) is defined by $m := \max_{T \in \mathcal{T}} |T|$.
• Let $U \subseteq N$, then we denote by $P^U$ the set of all A-polynomials (2.7) which
contain variables $p_{ir}$ with $i \in U$.
• Let $z \in P^N$. If k = 2 then we call z a Boolean A-polynomial; if m = 2 then z
is referred to as a quadratic A-polynomial.
If not mentioned otherwise, we will assume that any A-polynomial z lies in $P^N$, i.e.
for all $i \in N$ there exists a monomial in z with non-zero weight which contains a
variable $p_{ir}$ for some $r \in K$.
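We observe that A-polynomials are affine linear functions in the variables $p_{i\cdot}$ for all
$i \in N$. Hence, for every fixed $i \in N$ we can write
$$z(p) = \sum_{r=1}^{k} p_{ir}\,\Gamma_{ir}(p) + R_i(p) \qquad (2.8)$$
where
$$\Gamma_{ir}(p) = \sum_{T \in \mathcal{T}:\,(i,r) \in T} w_T \prod_{(j,s) \in T \setminus \{(i,r)\}} p_{js} = \frac{\partial z}{\partial p_{ir}}(p) \quad \forall i \in N,\ r \in K \qquad (2.9)$$
and
$$R_i(p) := \sum_{T \in \mathcal{T}:\,(i,\cdot) \notin T} w_T \prod_{(j,s) \in T} p_{js} + c \quad \forall i \in N. \qquad (2.10)$$
Note that $\Gamma_{ir}(p)$ and $R_i(p)$ are for all $i \in N$, $r \in K$ A-polynomials which do not
depend on the variables $p_{i\cdot}$ and therefore $\Gamma_{ir}(p), R_i(p) \in P^{N \setminus i}$.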
2.2.2 Equivalent A-Polynomials
We have seen in Example 2.3 that there exist different A-polynomials which all agree
on $\Delta$. This is a consequence of the fact that we are working with polynomials in $\mathbb{R}^{nk}$, whereas the dimension of $\Delta$ is only n(k − 1), and therefore there exists some degree of freedom. We define the following equivalence relation between A-polynomials:
Definition 2.5 (Equivalent A-polynomials)
Let $z_1, z_2 : \mathbb{R}^{nk} \to \mathbb{R}$ be two A-polynomials. Then
$$z_1 \equiv z_2 \;:\Leftrightarrow\; z_1(p) = z_2(p) \quad \forall p \in \Delta$$
and we call $z_1$ a form of $z_2$.
Subsequently we will characterize all A-polynomials z(p) which are zero on $\Delta$. For
this purpose we use the 'Division Algorithm for Polynomials' (see e.g. [CLO97]).
Let $\mathbb{R}[p_{11}, \ldots, p_{nk}]$ be the set of all polynomials in $p_{11}, \ldots, p_{nk}$ with coefficients in $\mathbb{R}$.
Theorem 2.6 (Division Algorithm)
Let a monomial order be fixed and let $F = (g_1, \ldots, g_s)$ be an ordered s-tuple of
polynomials in $\mathbb{R}[p_{11}, \ldots, p_{nk}]$. Then every $z \in \mathbb{R}[p_{11}, \ldots, p_{nk}]$ can be written as
$$z = f_1 g_1 + \cdots + f_s g_s + h,$$
where $f_i, h \in \mathbb{R}[p_{11}, \ldots, p_{nk}]$ and either h = 0 or h is a linear combination, with
coefficients in $\mathbb{R}$, of monomials, none of which is divisible by any of the leading terms of $g_1, \ldots, g_s$.
Proposition 2.7
Let z be an A-polynomial. If z(p) = 0 for all $p \in \Delta$ then
$$z(p) = \sum_{i=1}^{n} \Big( \sum_{r=1}^{k} p_{ir} - 1 \Big) f_i(p), \qquad (2.11)$$
where for all $j \in N$ the $f_j(p)$ are affine linear polynomials in $p_{i\cdot}$ which do not depend on $p_{j\cdot}$.
Proof
Let us define $g_i(p) := \sum_{r=1}^{k} p_{ir} - 1$ for all $i \in N$ and let us fix a monomial order
given by the lexicographic order, where the variables are ordered as follows:
$$p_{11} < \cdots < p_{1k} < \cdots < p_{n1} < \cdots < p_{nk}.$$
For this order the leading term of each polynomial $g_i(p)$ is just given by the variable
$p_{ik}$. Applying the 'Division Algorithm for Polynomials' it follows that any polynomial z(p) can be written as
$$z(p) = \sum_{i=1}^{n} f_i(p)\,g_i(p) + h(p)$$
where $f_i(p)$, $i \in N$ and h(p) are polynomials and either h(p) = 0 or h(p) is the sum
of monomials none of which has $p_{ik}$, $i \in N$ as a factor.
If z(p) = 0 on $\Delta$, then it follows by construction that h(p) = 0 on $\Delta$ and therefore
$h(p) \equiv 0$. To see this, note that h(p) does not depend on $p_{ik}$ for any $i \in N$, so we can define
$M := \{p \in \mathbb{R}^{n(k-1)} \mid p_{ir} \ge 0\ \forall i \in N, r \in K \setminus \{k\},\ 1 - \sum_{s=1}^{k-1} p_{is} \ge 0\ \forall i \in N\}$ with
$M^0 \neq \emptyset$ and observe that h(p) = 0 on M. Moreover, note that the Taylor series of
h agrees with h. Since the partial derivatives $\Gamma_{ir}(h,p) = 0$ for all $p \in M^0$ it follows
that $h(p) \equiv 0$ on $\mathbb{R}^{n(k-1)}$.
This proposition allows us to construct all objective functions which agree on $\Delta$.
However, regarding FPH, such a change of the objective function has far-reaching
consequences on the partial derivatives and therefore on the behavior of the operator
$F[\Gamma]$ as well. As we will see in Section 2.4, we can thus influence the step length of the
iteration sequence, hence also the regions of attraction of strict local maxima, and
all this only by constructing equivalent A-polynomials. For this reason it becomes
especially important to know how the partial derivatives of equivalent A-polynomials can differ, which can be derived directly from Proposition 2.7:
Corollary 2.8
Let an A-polynomial $z_1 \in P^N$ be given.
1. If $f_i \in P^{N \setminus i}$ for all $i \in N$ are given, then there exists $z_2 \in P^N$ with $z_1 \equiv z_2$
such that $\Gamma_{ir}(z_1,p) = f_i + \Gamma_{ir}(z_2,p)$ for all $i \in N$, $r \in K$.
2. If $z_2 \in P^N$ with $z_1 \equiv z_2$ is given, then there exist functions $f_i \in P^{N \setminus i}$ for all
$i \in N$ such that $\Gamma_{ir}(z_1,p) = f_i + \Gamma_{ir}(z_2,p)$ holds for all $i \in N$, $r \in K$.
A direct consequence of this corollary is the following remark regarding the partial derivatives of two equivalent A-polynomials (which will often be used in Section 2.4).
Remark 2.9
Let $z_1, z_2$ be two equivalent A-polynomials. Then for all $i \in N$, $r, s \in K$ and any
$p \in \Delta$ the following equivalence holds:
$$\Gamma_{is}(z_1,p) > \Gamma_{ir}(z_1,p) \;\Leftrightarrow\; \Gamma_{is}(z_2,p) > \Gamma_{ir}(z_2,p). \qquad (2.12)$$
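As a worked illustration, the two forms of the objective function from Example 2.3 can be compared directly. The labels $z_1$ and $z_2$ below are chosen here for illustration only.

```latex
\begin{align*}
z_1(p) &= 2p_{11}p_{21} + p_{12}p_{22} + p_{11} + p_{12} + p_{21} + p_{22} - 2, \\
z_2(p) &= 2p_{11}p_{21} + p_{12}p_{22}, \\
z_1(p) - z_2(p) &= (p_{11} + p_{12} - 1)\cdot 1 + (p_{21} + p_{22} - 1)\cdot 1 ,
\end{align*}
% so the difference has the form (2.11) with f_1 = f_2 = 1, and
\begin{align*}
\Gamma_{11}(z_1,p) &= 2p_{21} + 1 = 1 + \Gamma_{11}(z_2,p), &
\Gamma_{12}(z_1,p) &= p_{22} + 1 = 1 + \Gamma_{12}(z_2,p).
\end{align*}
% Both forms therefore order the derivatives of row 1 identically, as in (2.12):
\[
\Gamma_{11}(z_1,p) > \Gamma_{12}(z_1,p)
\;\Leftrightarrow\; 2p_{21} > p_{22}
\;\Leftrightarrow\; \Gamma_{11}(z_2,p) > \Gamma_{12}(z_2,p).
\]
```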
The results of this subsection can also be used to construct equivalent A-polynomials with positive partial derivatives.
Transformation 2.10 (Positive Derivatives)
Let $z_1$ be a polynomial as in (2.7) with possibly negative weights $w_T$ and some non-positive partial derivatives. We define $M := 1 - \sum_{T:\,w_T < 0} w_T$ and an A-polynomial
$z_2$ by
$$z_2(p) = z_1(p) + M \sum_{i=1}^{n} \sum_{r=1}^{k} p_{ir} - nM. \qquad (2.13)$$
By construction $z_2$ has for all $p \in \Delta$ positive $\Gamma$-values, because $\Gamma_{ir}(z_2,p) =
\Gamma_{ir}(z_1,p) + M$ for all $i \in N$, $r \in K$. Moreover, $z_2$ agrees with $z_1$ on $\Delta$.
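Transformation 2.10 can be sketched in code as follows. The representation of an A-polynomial as a list of `(weight, monomial)` pairs with 0-based indices, the sample polynomial (the restricted form $3p_{11}p_{21} - p_{11} - p_{21} + 1$ from Example 2.3) and all function names are assumptions made for this illustration.

```python
from math import prod

def evaluate(terms, const, p):
    """z(p) = sum_T w_T * prod_{(i,r) in T} p[i][r] + const, as in (2.7)."""
    return sum(w * prod(p[i][r] for i, r in T) for w, T in terms) + const

def gamma(terms, p, i, r):
    """Partial derivative of z with respect to p[i][r], as in (2.9)."""
    return sum(w * prod(p[j][s] for j, s in T if (j, s) != (i, r))
               for w, T in terms if (i, r) in T)

def positive_form(terms, const, n, k):
    """Transformation 2.10: add M * sum_{i,r} p_ir - n*M, M = 1 - (sum of negative weights)."""
    M = 1 - sum(w for w, _ in terms if w < 0)
    linear = [(M, [(i, r)]) for i in range(n) for r in range(k)]
    return terms + linear, const - n * M, M

# z1(p) = 3*p11*p21 - p11 - p21 + 1, written with 0-based (row, column) indices.
terms1 = [(3.0, [(0, 0), (1, 0)]), (-1.0, [(0, 0)]), (-1.0, [(1, 0)])]
terms2, const2, M = positive_form(terms1, 1.0, n=2, k=2)

p = [[0.25, 0.75], [0.1, 0.9]]          # a sample point of the relaxed feasible set
z1_value = evaluate(terms1, 1.0, p)
z2_value = evaluate(terms2, const2, p)
gammas2 = [gamma(terms2, p, i, r) for i in range(2) for r in range(2)]
```

At the sample point both forms take the same value, while every partial derivative of the transformed form is shifted up by M and is therefore positive.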
2.2.3 Relationship between Discrete and Continuous Local Optima
In this section we study the relationship between the SAP (2.1), (2.2) and its
R-SAP relaxation and investigate properties of local maxima of A-polynomials in
the discrete and continuous case.
The next proposition describes a construction procedure which converts any solution of the R-SAP into an at least equally good solution of the SAP:
Proposition 2.11
Let z be an A-polynomial. Then for any point $p \in \Delta$ we can construct in polynomial time a point $p^* \in \Delta^I$ with $z(p^*) \ge z(p)$. Hence, in particular
$$\max_{p \in \Delta} z(p) = \max_{p \in \Delta^I} z(p).$$
Proof
Let $p \in \Delta \setminus \Delta^I$ and let $i \in N$ be fixed. W.l.o.g. we assume that $\Gamma_{i1}(p) \ge \Gamma_{ir}(p)$ for
all $r \in K$. We know from (2.8) that z(p) can be rewritten for any fixed $i \in N$ as
$z(p) = \sum_{r=1}^{k} p_{ir}\Gamma_{ir}(p) + R_i(p)$. Thus we get
$$z(p) = \sum_{r=1}^{k} p_{ir}\Gamma_{ir}(p) + R_i(p) \le \Gamma_{i1}(p) \sum_{r=1}^{k} p_{ir} + R_i(p) = \Gamma_{i1}(p) + R_i(p). \qquad (2.14)$$
Note that in (2.14) neither $\Gamma_{i1}(p)$ nor $R_i(p)$ contains any of the variables $p_{ir}$, $r \in K$.
We define a new vector $\bar p \in \Delta$ by
$$\bar p_{jr} := \begin{cases} 1, & \text{if } j = i,\ r = 1 \\ 0, & \text{if } j = i,\ r > 1 \\ p_{jr}, & \text{otherwise.} \end{cases} \qquad (2.15)$$
(2.14) implies immediately that $z(\bar p) = \Gamma_{i1}(\bar p) + R_i(\bar p) = \Gamma_{i1}(p) + R_i(p) \ge z(p)$. By
replacing p by $\bar p$ and repeating this procedure for each non-integer vector $p_{j\cdot}$, $j \in N$
we construct an integer solution $p^* \in \Delta^I$ with $z(p^*) \ge z(p)$.
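The construction used in this proof can be sketched as a short routine. The version below is written for the objective of Example 2.3; for simplicity it processes every row (including rows that are already integer, which by (2.14) cannot decrease z either). The starting point and all names are assumptions chosen for illustration.

```python
def z(p):
    """Objective of Example 2.3."""
    (p11, p12), (p21, p22) = p
    return 2 * p11 * p21 + p12 * p22 + p11 + p12 + p21 + p22 - 2

def gamma(p):
    """Partial derivatives (2.6) of the objective of Example 2.3."""
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + 1, p22 + 1],
            [2 * p11 + 1, p12 + 1]]

def round_to_vertex(p):
    """Row by row, move all mass of row i to a column with maximal Gamma_ir(p)."""
    p = [row[:] for row in p]
    for i in range(len(p)):
        g_row = gamma(p)[i]                      # recomputed after each row update
        best = max(range(len(g_row)), key=g_row.__getitem__)
        p[i] = [1.0 if r == best else 0.0 for r in range(len(g_row))]
    return p

start = [[0.5, 0.5], [0.2, 0.8]]                 # hypothetical fractional point
vertex = round_to_vertex(start)
```

For this starting point the routine ends in the local maximum q* of Example 2.3, with an objective value that is not smaller than the one it started from.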
In the sequel we look at the relationship between discrete and continuous local
maxima of the SAP and the R-SAP, respectively. For this purpose recall the definition
of a continuous local maximum:
Definition 2.12 (Continuous Local Maximum)
A point $p^* \in \Delta$ is a (strict) local maximum of the R-SAP, if there exists a neighborhood $U(p^*)$ such that $\forall q \in U(p^*) \cap \Delta$: $z(q) \le z(p^*)$ ($z(q) < z(p^*)$).
Moreover, in the discrete case we use the following definition of a neighborhood:
Definition 2.13 (Discrete Neighborhood)
Let an integer vertex $p \in \Delta^I$ be given. Then the discrete neighborhood of p is
defined by
$$\mathcal{N}(p) := \{ q \in \Delta^I \mid \exists i \in N : q_{i\cdot} \neq p_{i\cdot},\ q_{j\cdot} = p_{j\cdot}\ \forall j \in N \setminus \{i\} \}. \qquad (2.16)$$
Note that if z is an A-polynomial and $p \in \Delta^I$, then from the linearity of z(p) in $p_{i\cdot}$
for all $i \in N$ it follows that for all $q \in \mathcal{N}(p)$, $\lambda \in [0,1]$
$$z(\lambda p + (1 - \lambda) q) = \lambda z(p) + (1 - \lambda) z(q). \qquad (2.17)$$
Using this neighborhood $\mathcal{N}$ we define the discrete local maximum by
Definition 2.14 (Discrete Local Maximum)
A point $p^* \in \Delta^I$ is a discrete (strict) local maximum of the SAP, if for all $q \in \mathcal{N}(p^*)$: $z(q) \le z(p^*)$ ($z(q) < z(p^*)$).
We have the following relationship between discrete and continuous strict local maxima.
Proposition 2.15
Let z be an A-polynomial and $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for all $i \in N$ be given. Then
the following statements are equivalent:
(1) p* is a discrete strict local maximum of the SAP
(2) p* is a strict local maximum of the R-SAP
(3) $\Gamma_{i1}(p^*) > \Gamma_{ir}(p^*)$ for all $i \in N$, r > 1.
Proof
(1)$\Rightarrow$(3): Let $q^* \in \mathcal{N}(p^*)$ with $q^*_{ir} = 1$ for some r > 1. Since $\Gamma_{ir}(p)$ and $R_i(p)$ do not
depend on $p_{i\cdot}$, we have $R_i(q^*) = R_i(p^*)$ and $\Gamma_{ir}(q^*) = \Gamma_{ir}(p^*)$. Hence, for the discrete
strict local maximum p*
$$z(p^*) = \Gamma_{i1}(p^*) + R_i(p^*) > \Gamma_{ir}(p^*) + R_i(p^*) = z(q^*)$$
for all $q^* \in \mathcal{N}(p^*)$ and therefore $\Gamma_{i1}(p^*) > \Gamma_{ir}(p^*)$ for all $i \in N$, r > 1.
(3)$\Rightarrow$(2): Let $\Gamma_{i1}(p^*) > \Gamma_{ir}(p^*)$ for all $i \in N$, r > 1. By continuity it follows that
there exists an $\varepsilon$-neighborhood $U_\varepsilon(p^*)$ such that
$$\forall q \in U_\varepsilon(p^*) \cap \Delta : \Gamma_{i1}(q) > \Gamma_{ir}(q) \quad \forall i \in N,\ r > 1. \qquad (2.18)$$
Let $q \in U_\varepsilon(p^*) \cap \Delta$ and apply the iteration procedure used in the proof of
Proposition 2.11 to get an integer point q* from q. We see that then q* = p* and
$z(p^*) > z(q)$ because any intermediate point in the iteration procedure remains in
$U_\varepsilon(p^*) \cap \Delta$ and (2.18) still holds.
(2)$\Rightarrow$(1): Since z(p) is linear along the line from p* to any discrete neighbor
$q^* \in \mathcal{N}(p^*)$ by (2.17) and $p^* \in \Delta^I$ is a strict local maximum, z(p) is strictly decreasing
along these lines and therefore p* is also a discrete strict local maximum.
Regarding the locations of continuous local optima we will prove an even stronger result in the following theorem, which implies that every strict local maximum is an
integer vertex. More precisely, this theorem will show that there do not exist local
optima in the interior of $\Delta$, unless z is constant on $\Delta$. This property is a basic result
in harmonic function theory (see e.g. [ABR92], Theorems 1.4, 1.5). We prove this
result for A-polynomials, thus taking advantage of the fact that they can be written
as in (2.8).
Theorem 2.16
Let $z \in P^N$ be an A-polynomial and for all $i \in N$ let a convex set $L_i$ in the simplex of row i with
non-empty relative interior $L_i^0$ be given, and let $S_n := L_1 \times \cdots \times L_n \subseteq \Delta$. If z(p) is
not constant, then z has no local optima in the relative interior $S_n^0$.
Proof
We prove by induction over the number n of convex sets $L_i$, $i = 1, \ldots, n$ that if z(p) has a local optimum in $S_n^0$ then it must be constant. For easier writing we define
$$S_i^0 := L_1^0 \times \cdots \times L_i^0, \quad i = 1, \ldots, n$$
to denote the relative interior of the cross product of the first i given convex sets.
Since we build up matrix p successively by adding rows, we use here the notation
p = (u, v), where v always corresponds to the last row of p.
Let n = 1, then $z \in P^N$ is just the linear function $z(v) = \sum_{r=1}^{k} w_r v_r + w_0$ which
has local optima in $S_1^0$ only if z(v) is constant on $S_1^0$.
Now we assume that our induction hypothesis holds for all A-polynomials $z \in P^{N \setminus n}$.
For $u \in S_{n-1}^0$, $v \in L_n^0$ we have $(u,v) \in S_n^0$ and any A-polynomial $z \in P^N$ can be
written as
$$z((u,v)) = \sum_{r=1}^{k} v_r z_r(u) + z_0(u), \qquad (2.19)$$
where $z_r \in P^{N \setminus n}$ for $r = 0, \ldots, k$ are A-polynomials, $u \in S_{n-1}^0$ and $v \in L_n^0$. Now let
us assume that $(u^*, v^*) \in S_n^0$ is a local optimum in $S_n^0$. If we let $v^* \in L_n^0$ be fixed, we get
$$z((u,v^*)) = \sum_{r=1}^{k} v_r^* z_r(u) + z_0(u) \qquad (2.20)$$
which is an A-polynomial in $P^{N \setminus n}$ and since $u^* \in S_{n-1}^0$ is a local optimum of (2.20) we have by the induction hypothesis $z((u,v^*)) = c$ on $S_{n-1}^0$. This allows us to express
$z_0(u)$ from (2.20) by $z_0(u) = c - \sum_{r=1}^{k} v_r^* z_r(u)$ and after re-substitution in (2.19) we
get
$$z((u,v)) = \sum_{r=1}^{k} (v_r - v_r^*) z_r(u) + c. \qquad (2.21)$$
If there exists a neighborhood $U_\varepsilon((u^*,v^*))$ such that for all points $(u,v) \in
U_\varepsilon((u^*,v^*)) \cap S_n^0$: $z((u,v)) = c$, then $z((u,v)) = c$ on $S_n$, by the same argument as
already used at the end of the proof of Proposition 2.7. In the sequel we use the
negation of this result and assume $z((u,v)) \not\equiv c$ on $S_n$:
$$z((u,v)) \not\equiv c \;\Rightarrow\; \forall \varepsilon > 0\ \exists (\hat u, \hat v) \in U_\varepsilon((u^*,v^*)) \cap S_n^0 : z((\hat u, \hat v)) \neq c.$$
W.l.o.g. we define $\tilde v := v^* - (\hat v - v^*)$ and choose $\varepsilon > 0$ small enough such that
$\hat v, \tilde v \in L_n^0 \cap U_\varepsilon(v^*)$. If we assume that $z((\hat u, \hat v)) < c$ then it follows from (2.21) that
$z((\hat u, \tilde v)) > c$. Hence z((u,v)) in (2.21) attains in any neighborhood $U_\varepsilon((u^*,v^*))$ values larger and smaller than c and therefore $(u^*,v^*)$ cannot be a local optimum.
Assumption
From now on, we will assume that z(p) is not constant on $\Delta$.
We denote for a point $p^* \in \Delta$ the face of $\Delta$ in which it lies by
$$F_{p^*} := \{p \in \Delta \mid p^*_{ir} = 0 \Rightarrow p_{ir} = 0\}.$$
Since $F_{p^*}$ is again a cross product of simplices, Theorem 2.16 implies the following assertions about local optima.
Corollary 2.17
Let z be an A-polynomial. Then:
(1) There are no local optima in $\Delta^0$.
(2) If p* is a non-strict local optimum, then z is constant on the whole face $F_{p^*}$.
(3) If p* is a strict local optimum then $p^* \in \Delta^I$.
Proof
(1) follows directly from Theorem 2.16 by setting $L_i$ to the full simplex of row i for all $i \in N$. Moreover,
Theorem 2.16 implies that a local optimum p* cannot lie in the interior of $F_{p^*}$ unless z is constant on the whole face. If p* is a strict local optimum, p* cannot
even lie in the interior of a constant face $F_{p^*}$ and therefore $p^* \in \Delta^I$.
Note that in this corollary, (2) does not necessarily imply that all points in $F_{p^*}$ are
local optima (see Example 2.30 on page 44).
Finally we show with an example the importance of the strictness of local maxima
in Proposition 2.15. Weakening this assumption such that p* is only a non-strict
local maximum, we will see that equivalence between discrete and continuous local
maxima does not hold anymore.
Example 2.18
Let the following A-polynomial be given
$$z(p) = p_{11}p_{21} + p_{11} + p_{21} + 2(p_{12} + p_{22}).$$
Depicting the objective values in the projection to the $p_{11}$ and $p_{21}$ coordinates ($p_{12} = 1 - p_{11}$ and $p_{22} = 1 - p_{21}$ are given implicitly) we see in Figure 2.3 that there exists
a discrete strict global maximum $q^* = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ with objective value $z(q^*) = 4$ and a
discrete local maximum p* in the upper right corner of the drawing, with $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ and objective value $z(p^*) = 3$. Though all points p on the connecting edges of p* to
its discrete neighbors have constant objective values z(p) = 3, there does not exist
a neighborhood $U(p^*)$ such that for all $q \in U(p^*) \cap \Delta$: $z(q) \le z(p^*)$.
Figure 2.3: Objective values of z in the $p_{11}, p_{21}$-plane (global maximum with value 4, local maximum p* with value 3).
The restriction of the objective function z to the $p_{11}, p_{21}$-plane is given by
$$z(p_{11}, p_{21}) = p_{11}p_{21} + p_{11} + p_{21} + 2((1 - p_{11}) + (1 - p_{21})) = p_{11}p_{21} - p_{11} - p_{21} + 4$$
with $p_{11}, p_{21} \in [0,1]$. Let us compute the objective values along the diagonal connecting the global and the local maximum. Setting $p_{11} := p_{21}$ we get $z(p_{11}, p_{11}) = p_{11}^2 - 2p_{11} + 4$. In any neighborhood $U(p^*)$ we get for points $q \in U \cap \Delta$ on this diagonal with $q_{11} := 1 - \varepsilon$ objective values $z(q_{11}, q_{11}) = (1 - \varepsilon)^2 - 2 + 2\varepsilon + 4 = 3 + \varepsilon^2 > 3$ for
all $\varepsilon > 0$ and therefore there cannot exist a neighborhood $U(p^*)$ with $z(q) \le z(p^*)$ for all $q \in U(p^*) \cap \Delta$.
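The two observations of Example 2.18 are easy to confirm numerically. The sketch below evaluates the restricted objective on one edge leaving p* and on the diagonal towards q*; the sample grid and the chosen values of the offset are assumptions made for illustration.

```python
def z(p11, p21):
    """Objective of Example 2.18 with p12 = 1 - p11 and p22 = 1 - p21 substituted."""
    p12, p22 = 1 - p11, 1 - p21
    return p11 * p21 + p11 + p21 + 2 * (p12 + p22)

# Edge from p* = (1, 1) to the discrete neighbor with p21 = 0: constant value 3.
edge_values = [z(1.0, t / 10) for t in range(11)]

# Points on the diagonal at distance eps from p*: value 3 + eps**2.
offsets = (0.1, 0.01, 0.001)
diag_values = [z(1 - eps, 1 - eps) for eps in offsets]
```

The edge values are all equal to 3, whereas every diagonal point close to p* has a strictly larger value, so p* is a discrete local maximum but not a continuous one.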
2.2.4 Saddle Points, Karush-Kuhn-Tucker Points and Nash Equilibria
As we will see in Section 2.3.2 there exists a close relation between fixed points
of $F[\Gamma]$ and stationary points of z(p). For this reason we characterize in this
section the stationary points of z, where we distinguish between saddle points,
Karush-Kuhn-Tucker (KKT) points and Nash equilibria.
Let us first define the stationary points and saddle points of an A-polynomial z.
Definition 2.19 (Stationary Point, Saddle Point)
A point $p^* \in \Delta$ is called a stationary point of z, if the gradient of z projected on the
affine subspace aff($\Delta$) is zero in p*, i.e. $\Gamma_{ir}(p^*) = v_i$ for all $i \in N$, $r \in K$ and some
constants $v_i \in \mathbb{R}$.
Moreover, we call $p^* \in \Delta$ a saddle point of z on aff($\Delta$), if p* is a stationary point and
in any neighborhood $U(p^*) \cap \mathrm{aff}(\Delta)$ there exist points p, q with $z(p) < z(p^*) < z(q)$.
Since the R-SAP is a nonlinear programming problem on $\Delta$, first-order necessary
conditions for local maxima are given by the KKT conditions (see Appendix B). Moreover, since A-polynomials are in general not concave functions, sufficiency is
not implied.
For a point $p \in \Delta$ we define the support supp(p) by
$$\mathrm{supp}(p) := \{(i,r) \in N \times K \mid p_{ir} > 0\}.$$
In order to derive the KKT conditions for the R-SAP, we use the Lagrangian multipliers $u_{ir}$ for the non-negativity constraints on $p_{ir}$, and $v_i$ for the equality constraints
describing $\Delta$. Then the Lagrangian function for the R-SAP is given by
$$L(p, u, v) = -z(p) - \sum_{i=1}^{n} \sum_{r=1}^{k} u_{ir} p_{ir} + \sum_{i=1}^{n} v_i \Big( \sum_{r=1}^{k} p_{ir} - 1 \Big).$$
From this we derive directly the KKT conditions for the R-SAP in p*. We get for
all $i \in N$, $r \in K$:
$$-\Gamma_{ir}(p^*) - u_{ir} + v_i = 0 \qquad (2.22)$$
$$-u_{ir}\,p^*_{ir} = 0 \qquad (2.23)$$
$$u_{ir} \ge 0. \qquad (2.24)$$
We see that for all $(i,r) \in \mathrm{supp}(p^*)$, (2.23) implies that $u_{ir} = 0$ and in this case
(2.22) simplifies to $\Gamma_{ir}(p^*) = v_i$. On the other hand points p* on the boundary have
components with $p^*_{ir} = 0$ and then $\Gamma_{ir}(p^*) = v_i - u_{ir} \le v_i$, because of (2.22) and
(2.24). Consequently a KKT point p* of the R-SAP is characterized by
$$\Gamma_{ir}(p^*) = \begin{cases} v_i, & \text{if } (i,r) \in \mathrm{supp}(p^*) \\ v_i - u_{ir}, & \text{otherwise} \end{cases} \qquad (2.25)$$
where $v_i := \max_{r \in K} \Gamma_{ir}(p^*)$. Now we see from (2.25) that every stationary point of
z is a KKT point on $\Delta$.
Next we introduce the so-called Nash equilibria which are defined as follows:
Definition 2.20 (Nash Equilibrium)
A point $p^* \in \Delta$ is a Nash equilibrium of the R-SAP, if
$$\forall i \in N,\ \forall q \in \{p \in \Delta \mid p_{j\cdot} = p^*_{j\cdot}\ \forall j \in N \setminus \{i\}\} : z(q) \le z(p^*).$$
In our context Nash equilibria will thus be viewed as local maxima with regard to
one component. From the linearity of z(p) in $p_{i\cdot}$ for all $i \in N$, it follows directly that the set of all integer Nash equilibria $p^* \in \Delta^I$ of the R-SAP is equal to the
set of discrete local maxima of the SAP.
Proposition 2.21
Let z be an A-polynomial. Then $p \in \Delta$ is a Nash equilibrium of the R-SAP, if and
only if
$$\forall i \in N\ \exists v_i \in \mathbb{R} \text{ such that } \begin{cases} \Gamma_{ir}(p) = v_i, & \text{if } (i,r) \in \mathrm{supp}(p) \\ \Gamma_{ir}(p) \le v_i, & \text{otherwise.} \end{cases} \qquad (2.26)$$
Proof
'$\Rightarrow$': Let $p \in \Delta$ be a Nash equilibrium. W.l.o.g. let for all $i \in N$: $\Gamma_{i1}(p) =
\max_{r \in K} \Gamma_{ir}(p)$ and we define $v_i := \Gamma_{i1}(p)$. Thus $\Gamma_{ir}(p) \le v_i$ for all $i \in N$, $r \in K$.
Assume there exists $(i,s) \in \mathrm{supp}(p)$ with $\Gamma_{is}(p) < v_i = \Gamma_{i1}(p)$, which would contradict our proposition. Then we can define a new vector $\bar p$ as in (2.15) and get a point
with larger objective value: $z(\bar p) > z(p)$, since
$$z(p) = \sum_{r=1}^{k} p_{ir}\Gamma_{ir}(p) + R_i(p) < \Gamma_{i1}(p) + R_i(p) = \Gamma_{i1}(\bar p) + R_i(\bar p) = \sum_{r=1}^{k} \bar p_{ir}\Gamma_{ir}(\bar p) + R_i(\bar p) = z(\bar p).$$
This contradicts our assumption that p is a Nash equilibrium. Thus $\Gamma_{ir}(p) = v_i$, if
$(i,r) \in \mathrm{supp}(p)$ and $\Gamma_{ir}(p) \le v_i$ otherwise, which proves the first direction.
'$\Leftarrow$': Let for all $i \in N$: $\Gamma_{ir}(p) = v_i$ for all $(i,r) \in \mathrm{supp}(p)$, and $\Gamma_{ir}(p) \le v_i$ otherwise.
Moreover, let $i \in N$ be fixed and $\tilde p$ be a vector with $\tilde p_{j\cdot} = p_{j\cdot}$ for all $j \in N \setminus \{i\}$ and
$\tilde p_{i\cdot}$ an arbitrary point of the simplex of row i. We show in (2.27) and (2.28) that then p is a Nash equilibrium.
$$z(\tilde p) = \sum_{r=1}^{k} \tilde p_{ir}\Gamma_{ir}(\tilde p) + R_i(\tilde p) = \sum_{r=1}^{k} \tilde p_{ir}\Gamma_{ir}(p) + R_i(p) \le \sum_{r=1}^{k} \tilde p_{ir} v_i + R_i(p) \qquad (2.27)$$
$$= v_i + R_i(p) = \sum_{r=1}^{k} p_{ir}\Gamma_{ir}(p) + R_i(p) = z(p). \qquad (2.28)$$
The inequality in (2.27) follows from the assumption that $\Gamma_{ir}(p) \le v_i$ for all
$i \in N$, $r \in K$. In (2.28) we use the fact that $\Gamma_{ir}(p) = v_i$ for all $(i,r) \in \mathrm{supp}(p)$. It
follows that p is a Nash equilibrium.
Comparing Nash equilibria to KKT points, we have just proven the following theorem.
Theorem 2.22
Let z be an A-polynomial and $p^* \in \Delta^0$. Then the following three statements are
equivalent:
(1) p* is a Nash equilibrium
(2) p* is a KKT point
(3) p* is a saddle point
of the R-SAP.
Since for quadratic A-polynomials all partial derivatives $\Gamma_{ir}(p)$ are linear functions, the next corollary follows immediately from Theorem 2.22 and (2.26).
Corollary 2.23
Let z be a quadratic A-polynomial, then the Nash equilibria in $\Delta^0$ form a polyhedron.
We demonstrate subsequently how Corollary 2.23 can be used to compute the set
of Nash equilibria in $\Delta^0$ for the (unweighted) Max-k-Cut problem. (For a detailed
description of the Max-k-Cut problem see Section 1.3.)
Let an unweighted graph G = (N, E) be given. Then for K := {1, ..., k}, the
objective function of the Max-k-Cut problem can be expressed by the following quadratic A-polynomial
$$z(p) = \sum_{[i,j] \in E}\ \sum_{r,s \in K,\ r \neq s} p_{ir}\,p_{js}. \qquad (2.29)$$
Let us denote for all $i \in N$ by $V(i) := \{j : [i,j] \in E\}$ the neighborhood given by the edges of G and by $d_i := |V(i)|$ the degree of vertex i. Then a vector $p \in \Delta$ is a
Nash equilibrium, if and only if for all $i \in N$
$$(i,r) \in \mathrm{supp}(p) \;\Rightarrow\; \sum_{j \in V(i)} p_{jr} = \gamma_i$$
$$(i,r) \notin \mathrm{supp}(p) \;\Rightarrow\; \sum_{j \in V(i)} p_{jr} \ge \gamma_i \qquad (2.30)$$
where $\gamma_i = d_i - \max_{r \in K} \Gamma_{ir}(p)$.
(2.30) can be derived directly from (2.26): For (2.29) we have $\Gamma_{ir} =
\sum_{j \in V(i)} \sum_{s:\,s \neq r} p_{js}$ and since $\sum_{s:\,s \neq r} p_{js} = 1 - p_{jr}$ we get $\Gamma_{ir} = \sum_{j \in V(i)} (1 - p_{jr}) = d_i - \sum_{j \in V(i)} p_{jr}$. Hence p is a Nash equilibrium, if and only if for all $i \in N$ there
exist $v_i$ such that
$$(i,r) \in \mathrm{supp}(p) \;\Rightarrow\; \Gamma_{ir} = d_i - \sum_{j \in V(i)} p_{jr} = v_i$$
$$(i,r) \notin \mathrm{supp}(p) \;\Rightarrow\; \Gamma_{ir} = d_i - \sum_{j \in V(i)} p_{jr} \le v_i. \qquad (2.31)$$
Using $v_i = \max_{r} \Gamma_{ir}$ and $\gamma_i := d_i - v_i$, (2.31) is equivalent to (2.30).
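With the help of this result we can easily compute the set of Nash equilibria in $\Delta^0$, as
we demonstrate here for the Max-3-Cut instance whose graph is shown in Figure 2.4.
The corresponding R-SAP has the following objective function
$$\begin{aligned} z(p) = {} & p_{11}p_{22} + p_{11}p_{23} + p_{12}p_{21} + p_{12}p_{23} + p_{13}p_{21} + p_{13}p_{22} \\ & + p_{11}p_{32} + p_{11}p_{33} + p_{12}p_{31} + p_{12}p_{33} + p_{13}p_{31} + p_{13}p_{32} \\ & + p_{11}p_{42} + p_{11}p_{43} + p_{12}p_{41} + p_{12}p_{43} + p_{13}p_{41} + p_{13}p_{42} \end{aligned}$$
from which we compute the partial derivatives
$$\Gamma_{1r} = \sum_{i=2}^{4} \sum_{s:\,s \neq r} p_{is} \quad \forall r \in K \qquad (2.32)$$
$$\Gamma_{ir} = \sum_{s:\,s \neq r} p_{1s} \quad i = 2, 3, 4,\ \forall r \in K. \qquad (2.33)$$
Figure 2.4: Graph of a Max-3-Cut instance (vertex 1, labeled $(\frac13, \frac13, \frac13)$, is joined to the vertices 2, 3 and 4, labeled $(p_{21}, p_{22}, p_{23})$, $(p_{31}, p_{32}, p_{33})$ and $(p_{41}, p_{42}, p_{43})$) where the given vectors correspond to
Nash equilibria (in the interior), whenever $p_{2r} + p_{3r} + p_{4r} = 1$ for r = 1, 2, 3.
From (2.33) it follows immediately for i = 2, 3, 4
$$\Gamma_{ir} = (1 - p_{1r}) = v_i \quad \forall r \in K \;\Rightarrow\; p_{11} = p_{12} = p_{13} = \tfrac13,\quad v_2 = v_3 = v_4 = \tfrac23.$$
(2.32) is equivalent to $\Gamma_{1r} = \sum_{i=2}^{4} (1 - p_{ir}) = 3 - \sum_{i=2}^{4} p_{ir} = v_1$ for all $r \in K$ and
therefore
$$\sum_{i=2}^{4} p_{ir} = 3 - v_1 \quad \forall r \in K. \qquad (2.34)$$
In order to compute $v_1$ we substitute $p_{i1} = 1 - p_{i2} - p_{i3}$ and get using (2.34) for
r = 1:
$$v_1 = \Gamma_{11} = 3 - \sum_{i=2}^{4} p_{i1} = \sum_{i=2}^{4} p_{i2} + \sum_{i=2}^{4} p_{i3} = 2(3 - v_1)$$
and thus $v_1 = 2$. The Nash equilibria in $\Delta^0$ of the graph depicted in Figure 2.4 are
given by
$$NE := \Big\{ p \in \Delta^0 \;\Big|\; p_{1\cdot} = \tfrac13 (1, 1, 1),\ \sum_{i=2}^{4} p_{ir} = 1\ \forall r \in K \Big\}.$$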
In order to show that there exist directions where the objective value increases and
others where it decreases, we have to disturb at least two rows of a matrix $p \in NE$.
Let $p \in NE$ and let us define $\tilde p_{11} := \tfrac13 + \varepsilon_1$, $\tilde p_{12} := \tfrac13 - \varepsilon_1$, $\tilde p_{21} := p_{21} + \varepsilon_2$, $\tilde p_{22} :=
p_{22} - \varepsilon_2$ for some small $\varepsilon_1, \varepsilon_2$ and $\tilde p_{ir} := p_{ir}$ for all other elements, such that $\tilde p \in \Delta^0$.
Computing now $z(\tilde p)$ we get
$$z(\tilde p) = z(p) - 2\varepsilon_1 \varepsilon_2$$
which shows that p is a saddle point, since choosing $\varepsilon_1$ and $\varepsilon_2$ with different
signs yields an increase of z whereas a choice with same signs decreases z.
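The saddle-point computation for the Max-3-Cut instance can be verified numerically. In the sketch below the vertices are renumbered from 0, and the particular element `p` of NE, the perturbation sizes and all function names are assumptions chosen for illustration.

```python
EDGES = [(0, 1), (0, 2), (0, 3)]   # star graph of Figure 2.4, vertices renumbered from 0

def z(p):
    """Max-3-Cut objective (2.29): sum over edges and over color pairs r != s."""
    return sum(p[i][r] * p[j][s]
               for i, j in EDGES
               for r in range(3) for s in range(3) if r != s)

def gamma(p, i, r):
    """Gamma_ir = sum over neighbors j of (1 - p_jr); valid when every row of p sums to 1."""
    neighbors = [b if a == i else a for a, b in EDGES if i in (a, b)]
    return sum(1 - p[j][r] for j in neighbors)

def perturb(p, e1, e2):
    """Shift mass e1 between colors 1 and 2 of row 1 and mass e2 in row 2 (text numbering)."""
    q = [row[:] for row in p]
    q[0][0] += e1
    q[0][1] -= e1
    q[1][0] += e2
    q[1][1] -= e2
    return q

third = 1 / 3
# Hypothetical element of NE: first row uniform, remaining rows with column sums 1.
p = [[third, third, third], [0.5, 0.3, 0.2], [0.3, 0.3, 0.4], [0.2, 0.4, 0.4]]

up = z(perturb(p, 0.01, -0.01))     # different signs
down = z(perturb(p, 0.01, 0.01))    # same signs
```

For this point all derivatives of the first row equal $v_1 = 2$, those of the other rows equal 2/3, and the two perturbations change z by $+2\varepsilon_1|\varepsilon_2|$ and $-2\varepsilon_1\varepsilon_2$ respectively.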
2.3 Properties of the Iteration Sequence
In this section we discuss properties of the operator $F[\xi]$, its convergence behavior
and relations between local maxima, fixed points and attractors.
Let $G_i : \mathbb{R}_+ \to \mathbb{R}_+$ be continuous, strictly increasing functions for all $i \in N$. If $\Gamma$ is
a fitness function, then the following monotone transformation
$$G : \mathbb{R}^{nk} \to \mathbb{R}^{nk}, \quad G := (G_1, \ldots, G_1, \ldots, G_n, \ldots, G_n) \qquad (2.35)$$
defines a new fitness function $\xi = G(\Gamma)$ which will be used in some of the following theorems. Note that then from $\xi_{ir} = G_i(\Gamma_{ir})$ it follows immediately that for any
fixed $p \in \Delta$:
$$\Gamma_{ir}(p) < \Gamma_{is}(p) \;\Leftrightarrow\; \xi_{ir}(p) < \xi_{is}(p), \quad i \in N,\ r, s \in K. \qquad (2.36)$$
Moreover, the definition of $F[\xi](p)$ implies that unless p is a fixed point, there
always exists (at least) one component of p which increases under F. Using (2.36) we can give the following sufficient condition for a component's increase:
Let $z \in P^N$ and $p' := F[\xi](p)$, where $\xi := G(\Gamma)$. Moreover, let $s \in K^n$, $i \in N$ and
$p \in \Delta^0$. Then
$$\Gamma_{i s_i}(p) > \Gamma_{ir}(p)\ \ \forall r \neq s_i \;\Rightarrow\; p'_{i s_i} > p_{i s_i}. \qquad (2.37)$$
Moreover, if z is a Boolean A-polynomial, then equivalence holds and we have
$$\Gamma_{i1}(p) > \Gamma_{i2}(p) \;\Leftrightarrow\; p'_{i1} > p_{i1}. \qquad (2.38)$$
To prove (2.37) we define $u_{ir}$ such that $p'_{ir} = u_{ir} p_{ir}$ for all $i \in N$, $r \in K$. Then we
simply have to show that $u_{i s_i} > 1$, which follows immediately from (2.36) since
$$u_{i s_i} := \frac{\xi_{i s_i}}{\sum_{r=1}^{k} p_{ir}\,\xi_{ir}} > \frac{\xi_{i s_i}}{\xi_{i s_i} \sum_{r=1}^{k} p_{ir}} = 1. \qquad (2.39)$$
For the Boolean case we have $u_{i1} := \frac{\xi_{i1}}{p_{i1}\xi_{i1} + (1 - p_{i1})\xi_{i2}}$ for all $i \in N$ and therefore
$$u_{i1} > 1 \;\Leftrightarrow\; \xi_{i1} > p_{i1}\xi_{i1} + (1 - p_{i1})\xi_{i2} \;\Leftrightarrow\; (1 - p_{i1})\xi_{i1} > (1 - p_{i1})\xi_{i2}$$
$$\Leftrightarrow\; \xi_{i1} > \xi_{i2}\ \ \forall p_{i1} \neq 1 \;\Leftrightarrow\; \Gamma_{i1} > \Gamma_{i2}\ \ \forall p_{i1} \neq 1$$
and hence (2.38) holds.
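Conditions (2.37) and (2.38) can be observed on the Boolean A-polynomial of Example 2.3. The sketch below applies one step of the operator with the identity and with the square root as monotone transformation; the choice of the square root, the sample point and all names are assumptions made for illustration.

```python
from math import sqrt

def gamma(p):
    """Partial derivatives (2.6) of the objective of Example 2.3."""
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + 1, p22 + 1],
            [2 * p11 + 1, p12 + 1]]

def step(p, G):
    """One application of the operator (2.5) with fitness xi_ir = G(Gamma_ir)."""
    new_p = []
    for row, g_row in zip(p, gamma(p)):
        xi = [G(g) for g in g_row]
        weighted = [x * f for x, f in zip(row, xi)]
        total = sum(weighted)
        new_p.append([w / total for w in weighted])
    return new_p

# Sample interior point: Gamma_11 > Gamma_12 in row 1, but Gamma_21 < Gamma_22 in row 2.
p = [[0.3, 0.7], [0.6, 0.4]]
plain = step(p, lambda g: g)    # G = identity, i.e. fitness Gamma itself
damped = step(p, sqrt)          # G = square root, strictly increasing on R_+
```

In both cases the first component of row 1 grows and the first component of row 2 shrinks, as (2.38) predicts; in this sample the square-root fitness takes the smaller step.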
2.3.1 Growth Transformations
In Definition 2.1 we introduced the terminology of (strict) growth transformations.
If F is a strict growth transformation, then we have the nice property that for any
point $p^0 \in \Delta^0$ the sequence $\{F^t(p^0)\}$, $t \ge 0$ does not cycle. In Theorem 2.2 we have
already seen that $F[\Gamma]$ is a strict growth transformation. This result was first derived
for homogeneous polynomials by Baum and Eagon [BE67] and later extended to
arbitrary polynomials with positive coefficients in Baum and Sell [BS68]. However, the authors gave an even stronger result which we will derive here directly from
Theorem 2.2:
Theorem 2.24 (Baum, Sell)
Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a polynomial with positive coefficients in the variables $p_{ir}$, $i \in
N$, $r \in K$ and let $F := F[\Gamma]$. Then for any $0 < \lambda_i \le 1$ ($i \in N$) the point $\bar p$ defined
by
$$\bar p_{i\cdot} := (1 - \lambda_i)\,p_{i\cdot} + \lambda_i F(p)_{i\cdot}, \quad \forall i \in N \qquad (2.40)$$
satisfies
$$z(p) \le z(\bar p) \qquad (2.41)$$
and equality in (2.41) holds if and only if F(p) = p.
Proof
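Let $\lambda_i \in (0,1]$ for all $i \in N$ be fixed and $\tilde p$ be given by (2.40). We define a new polynomial function $\tilde z(p)$ which agrees with $z(p)$ on $\Delta$ by
$$\tilde z(p) := z(p) + \sum_{i=1}^{n} K_i \Big( \sum_{r=1}^{k} p_{ir} \Big) - \sum_{i=1}^{n} K_i$$
where
$$K_i := \frac{(1 - \lambda_i)\, S_i(p)}{\lambda_i} \quad \text{with} \quad S_i(p) := \sum_{s=1}^{k} p_{is} \Gamma_{is}(z,p).$$
Obviously, for all $p \in \Delta$ the following relationship between the partial derivatives of $z$ and $\tilde z$ holds:
$$\Gamma_{ir}(\tilde z, p) = \Gamma_{ir}(z,p) + K_i \qquad \forall i \in N,\ r \in K.$$
Hence, with $F(z,p)_{ir} = p_{ir}\Gamma_{ir}(z,p) / S_i(p)$ we get
$$F(\tilde z, p)_{ir} = \frac{(\Gamma_{ir}(z,p) + K_i)\, p_{ir}}{S_i(p) + K_i} = \frac{S_i(p)}{S_i(p) + K_i}\, F(z,p)_{ir} + \frac{K_i}{S_i(p) + K_i}\, p_{ir} = \lambda_i F(z,p)_{ir} + (1 - \lambda_i)\, p_{ir},$$
that is, $F(\tilde z, p) = \tilde p$. This proves the theorem, since by Theorem 2.2 $\tilde z(F(\tilde z, p)) \ge \tilde z(p)$, $\tilde z$ agrees with $z$ on $\Delta$, and property (2.4) is fulfilled.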
Note that in this theorem it is not assumed that z is an A-polynomial. Moreover,
an important consequence of this theorem is that F cannot leave a 'local hill' as
will be shown in Corollary 2.31.
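The relaxed step (2.40) can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes $n = k = 2$, uses the polynomial $2p_{11}p_{21} + 2p_{12}p_{22} + 3p_{12}p_{21} + p_{11}p_{22}$ as an illustrative objective, and the helper names (`gamma`, `F`, `relaxed_step`, `random_point`) are hypothetical.

```python
import random

def z(p):
    """Illustrative polynomial with positive coefficients; p = [[p11, p12], [p21, p22]]."""
    (p11, p12), (p21, p22) = p
    return 2*p11*p21 + 2*p12*p22 + 3*p12*p21 + p11*p22

def gamma(p):
    """Partial derivatives of z with respect to p_ir, arranged like p."""
    (p11, p12), (p21, p22) = p
    return [[2*p21 + p22, 2*p22 + 3*p21],
            [2*p11 + 3*p12, 2*p12 + p11]]

def F(p):
    """One step of F[Gamma]: F(p)_ir = p_ir * Gamma_ir / sum_s p_is * Gamma_is."""
    out = []
    for row, g in zip(p, gamma(p)):
        s = sum(a * b for a, b in zip(row, g))
        out.append([a * b / s for a, b in zip(row, g)])
    return out

def relaxed_step(p, lam):
    """Step (2.40): row i moves only a fraction lam[i] of the way towards F(p)_i."""
    q = F(p)
    return [[(1 - l) * a + l * b for a, b in zip(row, qrow)]
            for row, qrow, l in zip(p, q, lam)]

def random_point(rng):
    """A random interior point of the product of two 1-simplices."""
    a, b = rng.uniform(0.01, 0.99), rng.uniform(0.01, 0.99)
    return [[a, 1 - a], [b, 1 - b]]

rng = random.Random(0)
p = random_point(rng)
print(z(p), z(relaxed_step(p, [0.3, 0.8])))
```

Different step lengths per simplex are allowed, which is why `lam` is a list with one entry per row.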
The following theorem is a generalization of Theorem 2.2 regarding the fitness functions of the operator $F[\xi]$.
Definition 2.25 (Homogeneous Function)
A function $z : \mathbb{R}^n \to \mathbb{R}$ is called homogeneous of degree $d$, if $z(\alpha p) = \alpha^d z(p)$ for all $\alpha \in \mathbb{R}$.
Theorem 2.26
Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a polynomial with positive coefficients, $F := F[\xi]$ where $\xi := G(\Gamma)$ as in (2.35) and $G_i$ are concave, strictly increasing, twice differentiable functions for all $i \in N$. Then $F$ is a growth transformation for $z$.
Proof
Following the proof given by Baum and Sell [BS68] we first establish this theorem
for homogeneous polynomials z.
Let a homogeneous polynomial $z(p)$ with degree $m$ be given. We first introduce some notations and identities which we will use in the following proof.
Any homogeneous polynomial $z(p)$ can be written as
$$z(p) = \sum_t w_t m_t(p), \quad \text{where } m_t(p) := \prod_{i,r} p_{ir}^{t(i,r)}$$
and $t(i,r)$ are non-negative integers with $\sum_{i,r} t(i,r) = m$.
Moreover we have:
$$z(p) = \frac{1}{m} \sum_{i,r} p_{ir} \Gamma_{ir} \tag{2.42}$$
$$\sum_t t(i,r)\, w_t m_t(p) = p_{ir} \Gamma_{ir} \tag{2.43}$$
$$m_t(p)^{1/m} \le \frac{1}{m} \sum_{i,r} t(i,r)\, p_{ir} \tag{2.44}$$
where (2.42) is Euler's theorem for homogeneous functions, (2.43) is a straightforward computation and (2.44) is the inequality of the geometric and arithmetic means.
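Let us denote the new point $F(p)$ by $q$ with
$$q_{ir} := \frac{p_{ir}\xi_{ir}}{\sum_{s=1}^{k} p_{is}\xi_{is}} \quad \text{for } i \in N,\ r \in K.$$
We define an auxiliary function $v_t(p)$ by
$$v_t(p) := w_t m_t(p)\, \big(w_t m_t(q)\big)^{-\frac{1}{m+1}} \tag{2.45}$$
and get for the objective function
$$z(p) = \sum_t \big(w_t m_t(q)\big)^{\frac{1}{m+1}} v_t(p). \tag{2.46}$$
Applying Hölder's inequality to (2.46) we get
$$z(p) \le \Big( \sum_t \big(w_t m_t(q)\big)^{\frac{1}{m+1}(m+1)} \Big)^{\frac{1}{m+1}} \Big( \sum_t v_t(p)^{\frac{m+1}{m}} \Big)^{\frac{m}{m+1}} \tag{2.47}$$
$$= z(q)^{\frac{1}{m+1}} \Big( \sum_t v_t(p)^{\frac{m+1}{m}} \Big)^{\frac{m}{m+1}}. \tag{2.48}$$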
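To give an estimate for the second term in (2.48) we use (2.45), then apply inequality (2.44) to the componentwise quotient $p/q$ and finally use identity (2.43). Thus we get
$$\sum_t v_t(p)^{\frac{m+1}{m}} = \sum_t w_t m_t(p) \Big( \frac{m_t(p)}{m_t(q)} \Big)^{\frac{1}{m}} = \sum_t w_t m_t(p)\, \Big( m_t\big(\tfrac{p}{q}\big) \Big)^{\frac{1}{m}} \tag{2.49}$$
$$\le \frac{1}{m} \sum_t w_t m_t(p) \sum_{i=1}^{n} \sum_{r=1}^{k} t(i,r)\, \frac{p_{ir}}{q_{ir}} \tag{2.50}$$
$$= \frac{1}{m} \sum_t w_t m_t(p) \sum_{i=1}^{n} \sum_{r=1}^{k} t(i,r)\, \frac{p_{ir} \sum_{s=1}^{k} p_{is}\xi_{is}}{p_{ir}\xi_{ir}} \tag{2.51}$$
$$= \frac{1}{m} \sum_{i=1}^{n} \sum_{r=1}^{k} p_{ir}\Gamma_{ir}\, \frac{\sum_{s=1}^{k} p_{is}\xi_{is}}{\xi_{ir}}. \tag{2.52}$$
The proof is finished if we can prove that (2.52) is less than or equal to $z(p)$, because then
$$\sum_t v_t(p)^{\frac{m+1}{m}} \le z(p) \tag{2.53}$$
and substituting (2.53) into (2.48) we get
$$z(p) \le z(q)^{\frac{1}{m+1}}\, z(p)^{\frac{m}{m+1}}$$
which implies that $z(p) \le z(q)$ and thus proves monotonicity.
We see that if we now set $\xi_{ir} := \Gamma_{ir}$ in (2.52) for all $i \in N$, $r \in K$, then (2.52) is by (2.42) equal to $z(p)$. Together with the second part, the extension from homogeneous polynomials to arbitrary polynomials with positive coefficients, this is the proof of Baum and Sell of Theorem 2.2.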
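In our more general context, where $\xi = G(\Gamma)$, it remains to show that if $G_i : \mathbb{R}_+ \to \mathbb{R}_+$ is for all $i \in N$ a concave, strictly increasing, twice differentiable function then (2.53) holds. We simplify (2.53) using the following equivalences:
$$z(p) = \frac{1}{m} \sum_{i=1}^{n} \sum_{r=1}^{k} p_{ir}\Gamma_{ir} \ge \frac{1}{m} \sum_{i=1}^{n} \sum_{r=1}^{k} p_{ir}\Gamma_{ir}\, \frac{\sum_{s=1}^{k} p_{is}\xi_{is}}{\xi_{ir}} \tag{2.54}$$
$$\iff \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{ir} p_{is} \Gamma_{ir} \ge \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{ir} p_{is} \Gamma_{ir}\, \frac{\xi_{is}}{\xi_{ir}} \tag{2.55}$$
$$\iff \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{ir} p_{is} \Gamma_{ir} \Big( 1 - \frac{\xi_{is}}{\xi_{ir}} \Big) \ge 0 \tag{2.56}$$
$$\iff \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{ir} p_{is}\, \frac{\Gamma_{ir}}{\xi_{ir}}\, (\xi_{ir} - \xi_{is}) \ge 0 \tag{2.57}$$
$$\iff \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=r+1}^{k} p_{ir} p_{is} \Big( \frac{\Gamma_{ir}}{\xi_{ir}} - \frac{\Gamma_{is}}{\xi_{is}} \Big) (\xi_{ir} - \xi_{is}) \ge 0. \tag{2.58}$$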
Now let us introduce a shorter notation for easier writing: we denote for any fixed $i \in N$: $\Gamma_{ir}$ by $x_r$ and $\xi_{ir}$ by $f(x_r)$. Of course now the assumptions on $G_i$ hold for $f$ and we have $x \ge 0$ and $f(x) \ge 0$. We will prove that if $f$ is strictly increasing and concave then
$$\Big( \frac{x_1}{f(x_1)} - \frac{x_2}{f(x_2)} \Big) \big( f(x_1) - f(x_2) \big) \ge 0. \tag{2.59}$$
If we define $g(x) := x / f(x)$ and use the strict monotonicity of $f$, then (2.59) is fulfilled if
$$f(x_1) < f(x_2) \iff x_1 < x_2 \implies g(x_1) \le g(x_2),$$
that is, if $g(x)$ is a monotone increasing function. We show that $f(0) \ge 0$, $f'(x) > 0$ and $f''(x) \le 0$ for all $x \ge 0$ imply that $g'(x) \ge 0$:
$$g'(x) = \frac{f(x) - x f'(x)}{f(x)^2} \ge 0 \iff f(x) - x f'(x) \ge 0.$$
However, defining $h(x) := f(x) - x f'(x)$ we see that $h(0) = f(0) \ge 0$. Furthermore its derivative is $h'(x) = f'(x) - f'(x) - x f''(x) = -x f''(x)$, and the concavity of $f(x)$, i.e. $f''(x) \le 0$, gives $h'(x) \ge 0$, so $h(x)$ is a monotone increasing function. Thus $h(x) \ge 0$ for all $x \ge 0$ and therefore $g'(x) \ge 0$, which proves this theorem for homogeneous polynomials.
In a second step we show that this theorem does not only hold for homogeneous polynomials, but for arbitrary polynomials with positive coefficients. For this reason we rewrite $z(p)$ as $z(p) = \sum_{d=0}^{m} h_d(p)$ where $h_d(p)$ are homogeneous polynomials of degree $d$. Now we introduce a new variable $q := (q_1, \dots, q_k) \in \Delta_k$ and enlarge the domain $\Delta$ to $\Delta \times \Delta_k$. We define a new polynomial $\tilde z$ by
$$\tilde z((p,q)) := \sum_{d=0}^{m} h_d(p) \Big( \sum_{r=1}^{k} q_r \Big)^{m-d} \tag{2.60}$$
which is a homogeneous polynomial of degree $m$ with positive coefficients and agrees with $z$ on $\Delta$ for any fixed $q \in \Delta_k$. We get
$$F(\tilde z, p)_{ir} = \frac{p_{ir}\,\xi_{ir}(\tilde z, p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(\tilde z, p)} = \frac{p_{ir}\,\xi_{ir}(z, p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(z, p)} = F(z,p)_{ir} \tag{2.61}$$
$$F(\tilde z, q)_r = \frac{q_r\,\xi_r(\tilde z, q)}{\sum_{s=1}^{k} q_s\,\xi_s(\tilde z, q)} = \frac{q_r\,\xi_r(\tilde z, q)}{\xi_r(\tilde z, q) \sum_{s=1}^{k} q_s} = q_r. \tag{2.62}$$
(2.61) follows immediately from (2.60) and the fact that $\Gamma_{ir}(\tilde z, p) = \Gamma_{ir}(z, p)$ for all $i \in N$, $r \in K$. (2.62) is a consequence of the fact that all $\Gamma_r(\tilde z, q)$, $r \in K$ have the same value, because
$$\Gamma_r(\tilde z, q) = \sum_{d=0}^{m} h_d(p)\,(m-d) \Big( \sum_{t=1}^{k} q_t \Big)^{m-d-1} = \Gamma_s(\tilde z, q) \qquad \forall r, s \in K.$$
An important consequence of (2.62) is that $q$ is fixed. Since we know already that this theorem is true for homogeneous polynomials we can apply it now to $\tilde z$ and using (2.61) and (2.62) we get
$$z(p) = \tilde z((p,q)) \le \tilde z\big(F(\tilde z, (p,q))\big) = \tilde z\big((F(z,p), q)\big) = z(F(z,p))$$
which proves the theorem for arbitrary polynomials with positive coefficients.
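Theorem 2.26 can be probed numerically for particular choices of $G_i$. The sketch below is an illustration under assumptions that are not in the original: it takes $n = k = 2$, the objective $2p_{11}p_{21} + 2p_{12}p_{22} + 3p_{12}p_{21} + p_{11}p_{22}$, and the two concave, strictly increasing functions $G(x) = \sqrt{x}$ and $G(x) = \log(1+x)$; the function names are hypothetical.

```python
import math
import random

def z(p):
    (p11, p12), (p21, p22) = p
    return 2*p11*p21 + 2*p12*p22 + 3*p12*p21 + p11*p22

def gamma(p):
    """Partial derivatives of z, arranged like p."""
    (p11, p12), (p21, p22) = p
    return [[2*p21 + p22, 2*p22 + 3*p21],
            [2*p11 + 3*p12, 2*p12 + p11]]

def F_xi(p, G):
    """One step of F[xi] with fitness xi_ir = G(Gamma_ir)."""
    out = []
    for row, g in zip(p, gamma(p)):
        xi = [G(v) for v in g]
        s = sum(a * b for a, b in zip(row, xi))
        out.append([a * b / s for a, b in zip(row, xi)])
    return out

# Smallest observed change z(F(p)) - z(p) over random interior points.
rng = random.Random(1)
worst = float("inf")
for _ in range(1000):
    a, b = rng.uniform(0.01, 0.99), rng.uniform(0.01, 0.99)
    p = [[a, 1 - a], [b, 1 - b]]
    for G in (math.sqrt, math.log1p):
        worst = min(worst, z(F_xi(p, G)) - z(p))
print("smallest observed increase:", worst)
```

A random search of this kind cannot prove the theorem; it only checks that no sampled point violates the growth property.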
2.3.2 Fixed Points and Accumulation Points
In this subsection we concentrate on fixed points of the iteration sequence. It will turn out that in the interior fixed points coincide with saddle points. Furthermore we prove that on the boundary KKT points are fixed points, but not vice versa.
In order to speak about fixed points on the boundary, we assume from now on that $F[\xi]$ is well defined on $\Delta$. If $\xi = G(\Gamma)$, then a sufficient condition for $F[\xi] : \Delta \to \Delta$ being well defined is that $\Gamma$ is a fitness function on $\Delta$ and therefore $\Gamma_{ir}(z,p) > 0$ for all $i \in N$, $r \in K$, $p \in \Delta$ (which can always be achieved by Transformation 2.10 on page 28).
Definition 2.27 ((Unstable) Fixed Point)
Let $F : \Delta \to \Delta$ be given, then $p^* \in \Delta$ is called a fixed point, if $F(p^*) = p^*$. Moreover, a fixed point $p^* \in \Delta$ is called unstable, if there exists a neighborhood $U(p^*)$ such that for every neighborhood $U_1(p^*)$ in $U(p^*)$ there exists at least one starting point $p \in U_1(p^*) \cap \Delta$ such that the sequence $\{F^t(p)\}, t \ge 0$ does not lie entirely in $U(p^*)$.
We start with an investigation of the relations between KKT points and fixed points, where we distinguish explicitly between points in the interior $\Delta^\circ$ and those on the boundary.
Theorem 2.28
Let $z$ be an A-polynomial, $F := F[\xi]$, where $\xi := G(\Gamma)$ as in (2.35) and $p^* \in \Delta$.
(1) If $p^*$ is a KKT point, then $p^*$ is a fixed point.
(2) Moreover, if $p^*$ is a fixed point in $\Delta^\circ$, then $p^*$ is a KKT point, i.e. a saddle point.
Proof
A point $p \in \Delta$ is a fixed point, if $F(p)_{ir} = p_{ir}$ for all $i \in N$, $r \in K$, which is equivalent to
$$\frac{p_{ir}\xi_{ir}}{\sum_s p_{is}\xi_{is}} = p_{ir} \iff \begin{cases} \xi_{ir} = \sum_s p_{is}\xi_{is} =: v_i, & (i,r) \in \operatorname{supp}(p) \\ \xi_{ir} \text{ arbitrary}, & (i,r) \notin \operatorname{supp}(p). \end{cases} \tag{2.63}$$
The strict monotonicity of $G_i$ guarantees that for any fixed $p \in \Delta$ the order relation (2.36) is kept. For $p^* \in \Delta$ we see from (2.25) that every KKT point is a fixed point. Moreover, if $p^* \in \Delta^\circ \iff \operatorname{supp}(p^*) = N \times K$ then by (2.63) the fixed point set in $\Delta^\circ$ is equivalent to the set of KKT points in $\Delta^\circ$.
Of course this theorem implies that local maxima (and by the same argument also local minima) are fixed points. However, in general equivalence between local optima and fixed points does not hold on the boundary, because all integer vertices are fixed points, but not necessarily KKT points (or local minima), as the following example shows:
Example 2.29
Let $z = 2p_{11}p_{21} + 2p_{12}p_{22} + 3p_{12}p_{21} + p_{11}p_{22}$. The integer vertex $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ is no local optimum, but nevertheless a fixed point of $F[\Gamma]$. We see that $\Gamma(p^*) = \begin{pmatrix} 2 & 3 \\ 2 & 1 \end{pmatrix}$ which therefore contradicts (2.25), the characterization of KKT points.
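The statement that every integer vertex is a fixed point while only some vertices are KKT points can be checked by enumeration. The sketch below reads the polynomial of Example 2.29 as $2p_{11}p_{21} + 2p_{12}p_{22} + 3p_{12}p_{21} + p_{11}p_{22}$ and assumes that (2.25), which is not reproduced in this section, has the usual form for maximization: $\Gamma_{ir} = v_i$ on the support and $\Gamma_{ir} \le v_i$ elsewhere. The helper names are hypothetical.

```python
from itertools import product

def gamma(p):
    """Partial derivatives of the polynomial, arranged like p."""
    (p11, p12), (p21, p22) = p
    return [[2*p21 + p22, 2*p22 + 3*p21],
            [2*p11 + 3*p12, 2*p12 + p11]]

def F(p):
    """One step of F[Gamma]."""
    out = []
    for row, g in zip(p, gamma(p)):
        s = sum(a * b for a, b in zip(row, g))
        out.append([a * b / s for a, b in zip(row, g)])
    return out

def is_kkt(p, tol=1e-12):
    """Assumed form of (2.25): Gamma_ir = v_i if p_ir > 0, Gamma_ir <= v_i if p_ir = 0."""
    for row, g in zip(p, gamma(p)):
        v = sum(a * b for a, b in zip(row, g))  # v_i = sum_s p_is * Gamma_is
        for a, b in zip(row, g):
            if a > 0 and abs(b - v) > tol:
                return False
            if a == 0 and b > v + tol:
                return False
    return True

# All four integer vertices of the product of two 1-simplices.
vertices = [[list(r1), list(r2)] for r1, r2 in product([(1, 0), (0, 1)], repeat=2)]
for v in vertices:
    print(v, "fixed point:", F(v) == v, "KKT:", is_kkt(v))
```

Under these assumptions only the vertex with $p_{12} = p_{21} = 1$ passes the KKT test, although all four vertices are reproduced by `F`.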
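Subsequently we want to point out that not all fixed points $p \in \Delta \setminus (\Delta^\circ \cup \Delta^I)$ are necessarily local optima.
Example 2.30
Let $z = 2p_{11}p_{21} + 3p_{12}p_{22} + 2p_{12}p_{21} + p_{11}p_{22}$. For the point $p^* = \begin{pmatrix} \frac12 & \frac12 \\ 1 & 0 \end{pmatrix}$ we have $\Gamma(p^*) = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$ and therefore $p^*$ is a fixed point.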
[Figure: sketch in the $(p_{11}, p_{21})$-plane.]
However, the objective values along $g_1 : p_{21} = 2p_{11}$ are strictly increasing as $p_{11}$ moves from $\frac12$ to $0$, and those along $g_2 : p_{21} = 2 - 2p_{11}$ are strictly decreasing for $p_{11} \in [\frac12, 1]$. Hence in any neighborhood $U_\varepsilon(p^*)$ there are points with larger and smaller value than $z(p^*)$ and therefore $p^*$ is a fixed point which is no local optimum.
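The claims of Example 2.30 are easy to verify by direct evaluation. The sketch below reads the example's data as $z = 2p_{11}p_{21} + 3p_{12}p_{22} + 2p_{12}p_{21} + p_{11}p_{22}$ with $p^* = (p_{11}, p_{21}) = (\frac12, 1)$, parametrizes points by $(p_{11}, p_{21})$, and uses hypothetical helper names.

```python
def z(p11, p21):
    """Objective of Example 2.30 in the free coordinates (p11, p21)."""
    p12, p22 = 1 - p11, 1 - p21
    return 2*p11*p21 + 3*p12*p22 + 2*p12*p21 + p11*p22

def F(p11, p21):
    """One step of F[Gamma], returned in the coordinates (p11, p21)."""
    p12, p22 = 1 - p11, 1 - p21
    g11, g12 = 2*p21 + p22, 3*p22 + 2*p21   # dz/dp11, dz/dp12
    g21, g22 = 2*p11 + 2*p12, 3*p12 + p11   # dz/dp21, dz/dp22
    return (p11 * g11 / (p11 * g11 + p12 * g12),
            p21 * g21 / (p21 * g21 + p22 * g22))

p_star = (0.5, 1.0)
print("F(p*) =", F(*p_star), " z(p*) =", z(*p_star))

# Points close to p* on the two lines g1: p21 = 2*p11 and g2: p21 = 2 - 2*p11.
for a in (0.49, 0.4):
    print("on g1:", a, z(a, 2 * a))
for a in (0.51, 0.6):
    print("on g2:", a, z(a, 2 - 2 * a))
```

Along $g_1$ the objective reduces to $4a^2 - 4a + 3$ and along $g_2$ to $-4a^2 + 4a + 1$ with $a = p_{11}$, both equal to $2$ at $a = \frac12$, which is what the evaluation above reflects.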
Having discussed the relationship between local maxima and fixed points, we will now present a corollary about the behavior of $F$ in some neighborhood of a fixed point. This result is a direct consequence of Theorem 2.24:
Corollary 2.31
Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a continuous function, $F$ be a growth transformation for $z$ and $p^* \in \Delta$ be a fixed point of $F$. Moreover, we define for any fixed $\varepsilon > 0$
$$U'_\varepsilon(p^*) := \{ p \in \Delta \mid z(p) \ge z(p^*) - \varepsilon \} \tag{2.64}$$
and denote the connected component of $U'_\varepsilon(p^*)$ containing $p^*$ by $U_\varepsilon(p^*) \subseteq U'_\varepsilon(p^*)$. Then we have $F(U_\varepsilon) \subseteq U_\varepsilon$.
The following example shows that Corollary 2.31 does not imply pointwise convergence of $\{F^t(p)\}, t \ge 0$.
Example 2.32
Let the following A-polynomial be objective function of an R-SAP instance
$$z(p) = 3p_{21}(p_{11} + p_{12}) + 2p_{12}p_{22} + p_{11}p_{22}.$$
It can easily be verified that all points $p = \begin{pmatrix} p_{11} & p_{12} \\ 1 & 0 \end{pmatrix}$, with $p_{1\cdot} \in \Delta_k$ arbitrary, are local maxima with $z(p) = 3$. The graph of $z(p)$ in the $p_{11}, p_{21}$-coordinates is depicted in Figure 2.5(a).
Now, let $p^*$ be a local maximum and therefore a fixed point. Then the sets $U'_\varepsilon(p^*)$ defined by (2.64) are connected and $U_\varepsilon(p^*) = U'_\varepsilon(p^*)$. For different values of $\varepsilon$, these regions $U_\varepsilon(p^*)$, bounded by the level curves of $z$, are shown in Figure 2.5(b). We see that these regions converge for $\varepsilon \to 0$ to $\begin{pmatrix} p_{11} & p_{12} \\ 1 & 0 \end{pmatrix}$, $p_{1\cdot} \in \Delta_k$, the connected component of local maxima containing $p^*$, represented in Figure 2.5(a) by the upper line.
Finally, we investigate accumulation points of a strict growth transformation F.
(a) Plot of $z(p) = 3p_{21}(p_{11} + p_{12}) + 2p_{12}p_{22} + p_{11}p_{22}$. (b) Interlocking regions $U_\varepsilon$ converging to the upper edge.
Figure 2.5: Example illustrating sets $U_\varepsilon$ of Corollary 2.31.
Proposition 2.33
Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a continuous function and $F$ be a strict growth transformation for $z$. Then all accumulation points of a sequence $\{p^t := F^t(p^0)\}, t \ge 0$ for a fixed $p^0 \in \Delta^\circ$ have the same function value and are fixed points of $F$.
Proof
Note first that by the compactness of $\Delta$, accumulation points of the sequence exist. Since $F$ is a growth transformation it follows that all accumulation points must have the same objective value. Now let $p^*$ be an accumulation point. Then there exists a subsequence $\{p^{t_l}\}, l \ge 0$ which converges to $p^*$. Moreover, continuity of $F$ implies that $\{p^{t_l + 1} = F(p^{t_l})\}$ converges to $F(p^*)$ and since $z(p^*) = z(F(p^*))$ it follows from (2.4) that $p^*$ is a fixed point.
Using this proposition we can show an even stronger result than in Theorem 2.28(2).
Proposition 2.34
Let $z$ be an A-polynomial and $F := F[\xi]$ be a strict growth transformation for $z$. Moreover, let $p^* \in \Delta^\circ$ be a saddle point. Then $p^*$ is an unstable fixed point of $F$.
Proof
Since $p^*$ is a saddle point in $\Delta^\circ$ there always exists a neighborhood $U(p^*)$ which does not contain another saddle point $q \in U(p^*) \cap \Delta^\circ$ with $z(q) > z(p^*)$. Hence, by Theorem 2.28(2) there is also no fixed point in $U(p^*) \cap \Delta^\circ$ with larger objective value than $p^*$. Moreover, in every neighborhood $U_1(p^*) \cap \Delta^\circ \subseteq U(p^*)$ there exists a point $p$ with $z(p) > z(p^*)$. Since $F$ is a strict growth transformation and by Proposition 2.33 every accumulation point is a fixed point, it follows that $\{F^t(p)\}, t \ge 0$ leaves the neighborhood $U(p^*)$ and therefore $p^*$ is an unstable fixed point.
2.3.3 On the Convergence
In this subsection we discuss some convergence results for the R-SAP regarding operator $F := F[\xi]$, where $\xi := G(\Gamma)$ as in (2.35). We first concentrate on the special case of R-SAPs with linear objective functions and afterwards turn our attention to the general case.
Linear Case
Let us investigate the special case of an R-SAP with a linear objective function. It
should be noted here that such problems can easily be solved and they are discussed
here only for algorithmic reasons.
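For given positive weights $c \in (\mathbb{R}_+)^{nk}$ we have the R-SAP
$$\max_{p \in \Delta} \sum_{i=1}^{n} \sum_{r=1}^{k} c_{ir} p_{ir} = \sum_{i=1}^{n} \max_{p_{i\cdot} \in \Delta_k} \sum_{r=1}^{k} c_{ir} p_{ir}. \tag{2.65}$$
We see that in order to solve (2.65) it suffices to solve for each $i \in N$ the following reduced problem over one simplex:
$$\max \sum_{r=1}^{k} c_r p_r \quad \text{s.t. } p \in \Delta_k. \tag{2.66}$$
An optimal solution $p^*$ of (2.66) is given by $p^*_r = 1$, if $c_r = \max_{s \in K} c_s$, and $p^*_s = 0$ for all $s \in K \setminus \{r\}$.
Since $\Gamma_r(p) = c_r$ is constant, $\xi_r = G(\Gamma_r(p))$ is constant, too. This allows us to compute the $t$-th element of the iteration sequence generated by operator $F[\xi]$ (here exceptionally denoted by $p^{(t)}$ instead of $p^t$ in order to avoid confusion) explicitly:
$$p_r^{(t)} = \frac{p_r^{(t-1)}\xi_r}{\sum_{s=1}^{k} p_s^{(t-1)}\xi_s} = \frac{\big(p_r^{(t-2)}\xi_r / \Sigma\big)\,\xi_r}{\sum_{s=1}^{k} \big(p_s^{(t-2)}\xi_s / \Sigma\big)\,\xi_s} = \frac{p_r^{(t-2)}(\xi_r)^2}{\sum_{s=1}^{k} p_s^{(t-2)}(\xi_s)^2} = \dots = \frac{p_r\,(\xi_r)^t}{\sum_{s=1}^{k} p_s\,(\xi_s)^t} \tag{2.67}$$
where $\Sigma := \sum_{s=1}^{k} p_s^{(t-2)}\xi_s$. Now we will show that (2.67) always converges to a global maximal solution of (2.66) (even if there exist non-strict local maxima).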
Theorem 2.35
Let the linear optimization problem (2.66), i.e. $\max \sum_{r=1}^{k} c_r p_r$ s.t. $p \in \Delta_k$, be given, $F := F[\xi]$ where $\xi := G(\Gamma)$ as in (2.35) and $I := \{ r \in K \mid c_r = \max_{s \in K} c_s \}$. Then $F^t(p)_r = p_r^{(t)}$ is given by (2.67) and for any starting point $p \in \Delta_k$, $\{F^t(p)\}$ converges to an optimal point $p^*$ with
$$p^*_r = \begin{cases} \dfrac{p_r}{\sum_{s \in I} p_s}, & r \in I \\ 0, & r \in K \setminus I. \end{cases} \tag{2.68}$$
If the problem is not degenerate, i.e. all integer vertices have different objective values and therefore $|I| = 1$, then the optimal solution of (2.66) is a unique integer vertex $p^* \in \Delta^I$.
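Proof
W.l.o.g. we assume
$$c_1 = c_2 = \dots = c_{l-1} > c_l \ge \dots \ge c_k > 0, \tag{2.69}$$
hence $z(p)$ is not constant on $\Delta$. Since $\Gamma_r = c_r$ for all $r \in K$ and $G : \mathbb{R}_+ \to \mathbb{R}_+$ is strictly increasing, the same order as in (2.69) holds for $\xi_r$, $r \in K$ as well.
We prove that for any $r \in I$ and any $\varepsilon > 0$ there exists an index $t_0$ such that for all $t \ge t_0$: $|p_r^{(t)} - p^*_r| < \varepsilon$. We get from (2.67) and (2.68) for all $r \in I$
$$|p_r^{(t)} - p^*_r| = \left| \frac{p_r(\xi_r)^t}{\sum_s p_s(\xi_s)^t} - \frac{p_r}{\sum_{s \in I} p_s} \right| = \left| \frac{p_r(\xi_r)^t \big(\sum_{s \in I} p_s\big) - p_r \big(\sum_{s \in I} p_s(\xi_s)^t\big) - p_r \big(\sum_{s \notin I} p_s(\xi_s)^t\big)}{\big(\sum_s p_s(\xi_s)^t\big)\big(\sum_{s \in I} p_s\big)} \right|$$
$$= \frac{p_r \big(\sum_{s \notin I} p_s(\xi_s)^t\big)}{\big(\sum_{s \in I} p_s\big)\big(\sum_s p_s(\xi_s)^t\big)} \le \frac{p_r \sum_{s \notin I} p_s(\xi_s)^t}{\big(\sum_{s \in I} p_s\big)\, p_r(\xi_r)^t} = \frac{\sum_{s \notin I} p_s \big(\xi_s / \xi_r\big)^t}{\sum_{s \in I} p_s}$$
because $\xi_r = \xi_s$ for all $r, s \in I$.
By (2.69) we have $\xi_s < \xi_r$ for all $r \in I$, $s \notin I$ and therefore the last term converges to zero as $t$ goes to infinity. This proves the convergence for all components $p_r$, $r \in I$. However, since $p^* \in \Delta$ and $\sum_{r \in I} p^*_r = 1$, all other components of $p^*$ must be zero, which finishes the proof of the theorem.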
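The closed form (2.67) and the limit (2.68) can be compared with the iteration itself. The following minimal sketch uses made-up data that is not from the original: weights $c = (3, 3, 2, 1)$, so that $I = \{1, 2\}$, the fitness $G(x) = \sqrt{x}$, and a starting point $p = (0.1, 0.3, 0.4, 0.2)$; the function names are hypothetical.

```python
import math

def step(p, xi):
    """One step of F[xi] on a single simplex with constant fitness xi."""
    s = sum(a * b for a, b in zip(p, xi))
    return [a * b / s for a, b in zip(p, xi)]

def closed_form(p0, xi, t):
    """Right-hand side of (2.67): p_r * xi_r**t / sum_s p_s * xi_s**t."""
    w = [a * b ** t for a, b in zip(p0, xi)]
    s = sum(w)
    return [x / s for x in w]

def limit(p0, c):
    """Limit point (2.68): the starting mass on I is renormalized, the rest vanishes."""
    cmax = max(c)
    mass = sum(a for a, cr in zip(p0, c) if cr == cmax)
    return [a / mass if cr == cmax else 0.0 for a, cr in zip(p0, c)]

c = [3.0, 3.0, 2.0, 1.0]            # illustrative weights, I = {1, 2}
xi = [math.sqrt(v) for v in c]      # xi = G(Gamma) with G = sqrt and Gamma_r = c_r
p0 = [0.1, 0.3, 0.4, 0.2]

p = p0
for _ in range(200):
    p = step(p, xi)

print("iterated   :", p)
print("closed form:", closed_form(p0, xi, 200))
print("limit      :", limit(p0, c))
```

Because the maximum weight is attained twice, the limit is not an integer vertex; it keeps the ratio $p_1 : p_2$ of the starting point, as (2.68) states.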
Our next goal is to study convergence for arbitrary A-polynomials $z$. We will show pointwise convergence of $\{F[\xi]^t(p)\}, t \ge 0$ for $\xi = G(\Gamma)$ and we will estimate the convergence rate.
General Case
Let $z$ be an A-polynomial. We call a local maximum $p^*$ a local maximum with maximal support, if for every local maximum $q^* \in \Delta$: $\operatorname{supp}(p^*) \subseteq \operatorname{supp}(q^*) \Rightarrow \operatorname{supp}(p^*) = \operatorname{supp}(q^*)$. This definition implies the following property for a local maximum $p^*$ with maximal support, to be used in the Convergence Theorem 2.37:
$$p^*_{ir} = 0 \implies \Gamma_{ir}(p^*) < v_i. \tag{2.70}$$
Moreover, note that the negation of the KKT characterization (2.25) implies that equivalence holds in (2.70).
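To prove this observation let $p^*$ be a local maximum where we assume w.l.o.g. that $p^*_{11} = 0$ and $p^*_{12} > 0$. Now we will show that if $\Gamma_{11}(p^*) = v_1$ then there exists a local maximum $q^*$ with $q^*_{11} > 0$ and thus $\operatorname{supp}(q^*) \supsetneq \operatorname{supp}(p^*)$. Since $p^*$ is a local maximum, there exists a neighborhood $U_\varepsilon(p^*)$ such that for all points $p \in U_\varepsilon(p^*) \cap \Delta$: $z(p) \le z(p^*)$. Let us define some $\tilde\varepsilon$ with $0 < \tilde\varepsilon < \varepsilon$ and $\tilde\varepsilon < p^*_{12}$. Then we can define a new point $q^*$ by $q^*_{11} := \tilde\varepsilon$, $q^*_{12} := p^*_{12} - \tilde\varepsilon > 0$ and $q^*_{ir} := p^*_{ir}$ for all other components. Note that now $\operatorname{supp}(q^*) \supsetneq \operatorname{supp}(p^*)$ and by the definition of $q^*$ it follows that $\Gamma_{1r}(p^*) = \Gamma_{1r}(q^*)$ for all $r \in K$ and $R_1(p^*) = R_1(q^*)$. We get for the objective value at $q^*$:
$$z(q^*) = \sum_{r=1}^{k} q^*_{1r}\Gamma_{1r}(q^*) + R_1(q^*) = \sum_{r : p^*_{1r} > 0} q^*_{1r} \underbrace{\Gamma_{1r}(p^*)}_{= v_1 \text{ by (2.25)}} + q^*_{11} \underbrace{\Gamma_{11}(p^*)}_{= v_1 \text{ by assumption}} + R_1(p^*) = v_1 + R_1(p^*) = z(p^*)$$
and since $q^* \in U_\varepsilon(p^*)$ it follows that $q^*$ is a local maximum with larger support than $p^*$. This finishes the proof that every local maximum $p^*$ with maximal support satisfies property (2.70).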
Definition 2.36 (Geometric Convergence)
Let a sequence $\{p^t\}, t \ge 0$ with $\lim_{t \to \infty} p^t = p^*$ be given. Then the sequence $\{p^t\}$ is geometrically convergent, if there exist $c > 0$, $0 < \rho < 1$ and an index $t_0$ such that $\|p^t - p^*\| \le c\rho^t$ for all $t \ge t_0$.
The following result was shown for the operator $F[\Gamma]$ by Cochand and is adapted here to fitness functions $\xi = G(\Gamma)$.
Theorem 2.37
Let $z$ be an A-polynomial, $F := F[\xi]$, where $\xi := G(\Gamma)$ as in (2.35) and $G_i$ are strictly increasing Lipschitz functions for all $i \in N$. Moreover, let $p^* \in \Delta \setminus \Delta^\circ$ be a non-strict local maximum with maximal support. If $p^*$ is an accumulation point of the sequence $\{F^t(p^0)\}, t \ge 0$ for some $p^0 \in \Delta$, then $\lim_{t \to \infty} F^t(p^0) = p^*$ and $\{F^t(p^0)\}$ is geometrically convergent.
Proof
Let us define for all $i \in N$, $p \in \Delta$
$$\Gamma_i(p) := \max_{r \in K} \Gamma_{ir}(p) \quad \text{and} \quad \xi_i(p) := \max_{r \in K} \xi_{ir}(p).$$
We have already shown in (2.70) that a local maximum $p^*$ with maximal support has the following property:
$$p^*_{ir} = 0 \implies \Gamma_{ir}(p^*) < \Gamma_i(p^*). \tag{2.71}$$
Let us denote by $B_\delta(p^*)$ the closed ball with center $p^*$ and radius $\delta$. By continuity we can find $\delta > 0$ such that for $p \in B_\delta(p^*) \cap \Delta$, $i \in N$, $r \in K$ the following properties hold for $\Gamma_{ir}$ and, because of the strict monotonicity of $G_i$, for $\xi_{ir}$ as well:
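• $p^*_{ir} > 0 \Rightarrow p_{ir} > 0$
• $\Gamma_{ir}(p^*) < \Gamma_{is}(p^*) \Rightarrow \Gamma_{ir}(p) < \Gamma_{is}(p) \Rightarrow \xi_{ir}(p) < \xi_{is}(p)$
• $0 < \Gamma_{ir}(p^*) \Rightarrow 0 < \Gamma_{ir}(p) \Rightarrow 0 < \xi_{ir}(p)$
Moreover, we define for all $i \in N$ the sets
$$U_i := \{ (i,r) \mid \Gamma_{ir}(p^*) = \Gamma_i(p^*),\ 1 \le r \le k \}, \qquad U := \bigcup_{i \in N} U_i$$
$$V_i := \{ (i,r) \mid \Gamma_{ir}(p^*) < \Gamma_i(p^*),\ 1 \le r \le k \}, \qquad V := \bigcup_{i \in N} V_i.$$
Note that $U = \operatorname{supp}(p^*)$ and $V = (N \times K) \setminus \operatorname{supp}(p^*)$.
For $p \in \Delta$ we define
$$\varepsilon_i(p) := \sum_{(i,r) \in V_i} p_{ir} \quad \forall i \in N, \qquad \varepsilon(p) := \max_{i \in N} \varepsilon_i(p) \tag{2.72}$$
and
$$d(p) := \sqrt{\sum_{(i,r) \in V} p_{ir}^2}. \tag{2.73}$$
Since $\sum_{(i,r) \in V} p_{ir}^2 \ge \max_{(i,r) \in V} p_{ir}^2$ it follows that
$$d(p) \ge \max_{(i,r) \in V} p_{ir} \tag{2.74}$$
and therefore we get from $\varepsilon_i(p) = \sum_{(i,r) \in V_i} p_{ir} \le k \max_{(i,r) \in V_i} p_{ir}$ that
$$\varepsilon(p) = \max_{i \in N} \varepsilon_i(p) \le k \max_{(i,r) \in V} p_{ir} \le k\, d(p). \tag{2.75}$$
Consider the mapping $\Pi : \Delta^\circ \to F_{p^*}$ where $\bar p := \Pi(p)$ is defined componentwise by
$$\bar p_{ir} := \begin{cases} 0, & \text{for } (i,r) \in V \\ \dfrac{p_{ir}}{1 - \varepsilon_i(p)}, & \text{otherwise.} \end{cases}$$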
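Claim 2.38
For all $(i,r) \in N \times K$
$$|\bar p_{ir} - p_{ir}| \le \varepsilon(p). \tag{2.76}$$
Proof of Claim 2.38
If $(i,r) \in V$, then $\bar p_{ir} = 0$ and (2.76) follows directly from (2.72), because $p_{ir} \le \varepsilon_i(p) \le \varepsilon(p)$. Moreover, (2.72) also implies that for all $i \in N$
$$1 - \varepsilon(p) \le 1 - \varepsilon_i(p) = \sum_{(i,r) \in U_i} p_{ir}$$
and therefore we have for $(i,r) \in U_i$
$$\bar p_{ir} - p_{ir} = p_{ir}\, \frac{1 - (1 - \varepsilon_i(p))}{1 - \varepsilon_i(p)} \le \varepsilon_i(p) \le \varepsilon(p).$$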
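Note that $p^*$ lies in the relative interior of $F_{p^*}$, since it is by assumption a local maximum with maximal support. We choose $\delta < 1$ small enough such that all points in $B_\delta(p^*) \cap F_{p^*}$ are local maxima.
Claim 2.39
Let $\delta' := \dfrac{\delta}{nk^2 + 1}$. Then
$$p \in B_{\delta'}(p^*) \cap \Delta^\circ \implies \bar p \in B_\delta(p^*).$$
Proof of Claim 2.39
For $p \in B_{\delta'}(p^*)$ we have
$$d(p) = \sqrt{\sum_{(i,r) \in V} (p_{ir} - p^*_{ir})^2} \le \sqrt{\sum_{(i,r) \in V \cup U} (p_{ir} - p^*_{ir})^2} \le \delta',$$
since $p^*_{ir} = 0$ for all $(i,r) \in V$. Hence, it follows from (2.75) that $\varepsilon(p) \le k\delta'$ and therefore
$$\|p^* - \bar p\| \le \|p^* - p\| + \|p - \bar p\| \le \delta' + nk\,\varepsilon(p) \le \delta' + nk \cdot k\delta' = \delta'(nk^2 + 1) = \delta$$
which proves that $\bar p \in B_\delta(p^*)$.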
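Let $M$ be defined by
$$M^{-1} := \min \{ \xi_{ir}(p) \mid p \in B_\delta(p^*) \cap \Delta,\ (i,r) \in U \} \tag{2.77}$$
and
$$\rho := \max \left\{ \frac{\xi_{ir}(p)}{\xi_{is}(p)} \;\middle|\; p \in B_\delta(p^*) \cap \Delta,\ (i,r) \in V,\ (i,s) \in U \right\}.$$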
Let L' := 2nmaxiejvc?2 (Erer^) ana- ^ := ^' wnere C is the largest Lipschitzconstant of the functions G% for all 2 G Ar. Now we choose À with 0 < A < ~ such
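that $\rho\, \frac{1}{1 - \lambda} < 1$. Since $\lambda$ can be chosen arbitrarily small and $\rho < 1$ by definition, such a $\lambda$ always exists. Hence, there also exists $\bar\rho$ with $\rho\, \frac{1}{1 - \lambda} < \bar\rho < 1$. Moreover, we define
$$V_\lambda := \{ p \in B_\delta(p^*) \cap \Delta \mid \varepsilon(p) < \lambda \} \tag{2.78}$$
and consider a point $p \in V_\lambda \cap \Delta^\circ$. One could think of $p$ as a point $F^t(p^0)$ for some $t$. For this point $p$ we define $\varepsilon := \varepsilon(p)$.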
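Claim 2.40
For $p \in B_{\delta'}(p^*) \cap \Delta^\circ$ we have
$$|\xi_{ir}(p) - \xi_{ir}(\bar p)| \le \varepsilon L. \tag{2.79}$$
Proof of Claim 2.40
We just have to prove that
$$|\Gamma_{ir}(p) - \Gamma_{ir}(\bar p)| \le \varepsilon L'. \tag{2.80}$$
Then the claim follows directly from Lipschitz continuity of the functions $G_i$, $i \in N$, because we have for all $i \in N$, $r \in K$
$$|\xi_{ir}(p) - \xi_{ir}(\bar p)| \le C\, |\Gamma_{ir}(p) - \Gamma_{ir}(\bar p)| \le \varepsilon L' C = \varepsilon L.$$
Since the coefficients of $\Gamma_{ir}(p)$ are also coefficients of $z$ they can be bounded by $\sum_{T \in \mathcal{T}} w_T$. Thus it is sufficient to derive a bound for a monomial $T$ in $\Gamma_{ir}(p)$ with degree $m < n$. Using (2.76) we get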
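$$\Big| w_T \prod_{(l,s) \in T} p_{ls} - w_T \prod_{(l,s) \in T} \bar p_{ls} \Big| \le w_T \Big( \prod_{(l,s) \in T} (p_{ls} + \varepsilon) - \prod_{(l,s) \in T} p_{ls} \Big) \le w_T \Big( \prod_{(l,s) \in T} p_{ls} + \sum_{j=1}^{m} \binom{m}{j} \varepsilon^j \Big) - w_T \prod_{(l,s) \in T} p_{ls}$$
$$\le w_T \sum_{j=1}^{m} \binom{m}{j} \varepsilon^j \le w_T\, \varepsilon\, 2^m \le w_T\, \varepsilon\, 2^n.$$
By summation over $T \in \mathcal{T}$ we get the upper bound in (2.80).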
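Since by (2.77) for $(i,r) \in U_i$ we have $\xi_{ir}(\bar p) M \ge 1$, it follows from (2.79) that for $p \in B_{\delta'}(p^*) \cap \Delta^\circ \cap V_\lambda$
$$\xi_{ir}(p) \le \xi_{ir}(\bar p) + \varepsilon L \le \xi_{ir}(\bar p)(1 + \varepsilon LM)$$
and symmetrically
$$\xi_{ir}(p) \ge \xi_{ir}(\bar p) - \varepsilon L \ge \xi_{ir}(\bar p)(1 - \varepsilon LM).$$
For $p \in B_{\delta'}(p^*) \cap \Delta^\circ$ we know from Claim 2.39 that $\bar p$ lies in $B_\delta(p^*) \cap F_{p^*}$ and by our choice of $\delta$ it is a local maximum. This implies
• $p^*_{ir} > 0 \Rightarrow \bar p_{ir} > 0$
• $\Gamma_{ir}(\bar p) = \Gamma_i(\bar p)$ and $\xi_{ir}(\bar p) = \xi_i(\bar p)$ for $\bar p_{ir} > 0$, i.e. $(i,r) \in U_i$,
• $\Gamma_{ir}(\bar p) < \Gamma_i(\bar p)$ and $\xi_{ir}(\bar p) < \xi_i(\bar p)$ for $(i,r) \in V_i$.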
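Lemma 2.41
Consider $p' := F(p)$, where $p \in B_{\delta'}(p^*) \cap \Delta^\circ \cap V_\lambda$. Then
(i) there exists a constant $K$ such that $|p'_{ir} - p_{ir}| \le Kk\,d(p)$ for $(i,r) \in U_i$
(ii) $p'_{ir} \le \bar\rho\, p_{ir}$ for $(i,r) \in V_i$
(iii) $|d(p) - d(p')| \ge (1 - \bar\rho)\, d(p)$.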
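Proof of Lemma 2.41 (i)
Case 1: If $p'_{ir} \ge p_{ir}$ we need an upper bound for $p'_{ir}$:
$$p'_{ir} = \frac{p_{ir}\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\xi_{is}(p)} \le \frac{p_{ir}\xi_{ir}(p)}{\sum_{(i,s) \in U_i} p_{is}\xi_{is}(p)} \le \frac{p_{ir}\,\xi_{ir}(\bar p)(1 + \varepsilon LM)}{\sum_{(i,s) \in U_i} p_{is}\,\xi_{is}(\bar p)(1 - \varepsilon LM)}$$
$$= \frac{p_{ir}\,\xi_i(\bar p)(1 + \varepsilon LM)}{\xi_i(\bar p)(1 - \varepsilon LM) \sum_{(i,s) \in U_i} p_{is}} \le \frac{p_{ir}(1 + \varepsilon LM)}{(1 - \varepsilon LM)(1 - \varepsilon)}$$
and therefore
$$p'_{ir} - p_{ir} \le p_{ir} \Big( \frac{1 + \varepsilon LM}{(1 - \varepsilon LM)(1 - \varepsilon)} - 1 \Big) = p_{ir}\, \frac{(1 + \varepsilon LM) - (1 - \varepsilon LM)(1 - \varepsilon)}{(1 - \varepsilon LM)(1 - \varepsilon)}$$
$$\le \frac{(1 + \varepsilon LM) - (1 - \varepsilon LM) + \varepsilon(1 - \varepsilon LM)}{(1 - \varepsilon LM)(1 - \varepsilon)} \le \frac{2\varepsilon LM + \varepsilon}{(1 - \varepsilon LM)(1 - \varepsilon)} \le K\varepsilon \le Kk\,d(p)$$
with
$$K := \frac{2LM + 1}{(1 - \lambda LM)(1 - \lambda)}.$$
Case 2: If $p'_{ir} < p_{ir}$ we need a lower bound for $p'_{ir}$:
$$p'_{ir} = \frac{p_{ir}\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\xi_{is}(p)} \ge \frac{p_{ir}\xi_{ir}(p)}{\max_{(i,s)} \xi_{is}(p)} \ge p_{ir}\, \frac{\xi_i(\bar p)(1 - \varepsilon LM)}{\xi_i(\bar p)(1 + \varepsilon LM)}$$
and therefore
$$p_{ir} - p'_{ir} \le p_{ir} \Big( 1 - \frac{1 - \varepsilon LM}{1 + \varepsilon LM} \Big) = p_{ir}\, \frac{(1 + \varepsilon LM) - (1 - \varepsilon LM)}{1 + \varepsilon LM} \le \frac{2\varepsilon LM}{1 + \varepsilon LM} \le \frac{2LM + 1}{1 - \varepsilon LM}\, \varepsilon \le K\varepsilon \le Kk\,d(p).$$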
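Proof of Lemma 2.41 (ii)
For $(i,r) \in V$ we have
$$p'_{ir} \le \frac{p_{ir}\xi_{ir}(p)}{\sum_{(i,s) \in U_i} p_{is}\xi_{is}(p)} \le \frac{p_{ir}\xi_{ir}(p)}{\min_{(i,s) \in U_i} \{\xi_{is}(p)\} \sum_{(i,s) \in U_i} p_{is}} = p_{ir} \cdot \frac{\xi_{ir}(p)}{\min_{(i,s) \in U_i} \xi_{is}(p)} \cdot \frac{1}{1 - \varepsilon_i(p)} \le p_{ir}\, \rho\, \frac{1}{1 - \varepsilon} \le \bar\rho\, p_{ir}$$
for $\varepsilon < \lambda$.
Proof of Lemma 2.41 (iii)
From the definition of $d(p)$ in (2.73) and the fact that by Lemma 2.41 (ii) $p'_{ir} \le \bar\rho\, p_{ir}$ for $(i,r) \in V_i$, it follows immediately that
$$d(p') = \sqrt{\sum_{(i,r) \in V} (p'_{ir})^2} \le \bar\rho\, d(p) \tag{2.81}$$
and therefore
$$d(p) - d(p') \ge (1 - \bar\rho)\, d(p).$$
Summarizing, we have shown that for $p' := F(p)$ where $p \in B_{\delta'} \cap V_\lambda$, we come closer to the face $F_{p^*}$ by a step of length at least $(1 - \bar\rho)\,d(p)$ while the progress for $(i,r) \in U_i$ is bounded by $Kk\,d(p)$.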
Under the assumption that our hypothesis still holds for $p'$ and all further iterates, we shall approach $F_{p^*}$ while remaining in a kind of 'cone' with vertex $p$. Then Theorem 2.37 follows from the existence of a point $\tilde p^0 := F^{t_0}(p^0)$ which lies close enough to $p^*$ so that the corresponding cone is entirely contained in $B_{\delta'} \cap V_\lambda$ and does not contain a presumably existing second accumulation point.
It remains to show that if $p^0$ lies close enough to $p^*$, then all iterates $p^s$ lie in $B_{\delta'} \cap V_\lambda$. In order to formalize this idea we define
$$\theta := \frac{Kk}{1 - \bar\rho} \quad \text{and} \quad \delta'' := \frac{\delta'}{nk\theta + 1}. \tag{2.82}$$
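Claim 2.42
Assume that $p^0 \in B_{\delta''}(p^*) \cap V_\lambda$. Then $p^s := F^s(p^0) \in B_{\delta'}(p^*) \cap V_\lambda$ for all $s \ge 1$.
Proof of Claim 2.42:
We prove Claim 2.42 by induction and assume therefore that $p^s \in B_{\delta'}(p^*) \cap V_\lambda$ for $0 \le s \le t$. We must show that $p^{t+1} \in B_{\delta'}(p^*) \cap V_\lambda$.
Since Lemma 2.41 holds for all $p^s$ with $0 \le s \le t$, the following statements are implied:
1. For $(i,r) \in V$ we have
(a) $p^{s+1}_{ir} \le \bar\rho\, p^s_{ir}$ for all $s \le t$ (Lemma 2.41 (ii))
(b) $p^0_{ir} \ge p^1_{ir} \ge \dots \ge p^t_{ir} \ge p^{t+1}_{ir}$ (follows from 1a and $\bar\rho < 1$)
(c) $\sum_{s=0}^{t} |p^{s+1}_{ir} - p^s_{ir}| = p^0_{ir} - p^{t+1}_{ir} \le p^0_{ir} \le d(p^0)$ (follows from 1b and (2.74))
(d) $\varepsilon(p^{s+1}) \le \bar\rho\lambda$ for all $s \le t$ and therefore $p^{t+1} \in V_\lambda$ (follows from 1a, (2.78) and the fact that by induction hypothesis $p^0 \in B_{\delta''}(p^*) \cap V_\lambda$ and therefore $\varepsilon(p^0) < \lambda$)
2. We get for the distances $d$
(a) $d(p^{s+1}) \le \bar\rho\, d(p^s)$ for all $s \le t$ (follows from (2.81))
(b) $d(p^s) - d(p^{s+1}) \ge (1 - \bar\rho)\, d(p^s)$ for all $s \le t$ (by Lemma 2.41 (iii))
(c) $d(p^0) \ge d(p^1) \ge \dots \ge d(p^t) \ge d(p^{t+1})$ (follows from 2a)
3. For $(i,r) \in U$ we have
(a) $|p^{s+1}_{ir} - p^s_{ir}| \le Kk\, d(p^s) \le \frac{Kk}{1 - \bar\rho} \big( d(p^s) - d(p^{s+1}) \big) = \theta \big( d(p^s) - d(p^{s+1}) \big)$ for all $s \le t$ (by Lemma 2.41 (i), 2b and the definition of $\theta$ in (2.82))
(b) $\sum_{s=0}^{t} |p^{s+1}_{ir} - p^s_{ir}| \le \theta \sum_{s=0}^{t} \big( d(p^s) - d(p^{s+1}) \big) = \theta \big( d(p^0) - d(p^{t+1}) \big) \le \theta\, d(p^0)$ (by 3a and 2c)
4. The difference between $p^{t+1}$ and $p^0$ is bounded:
$$\|p^{t+1} - p^0\| \le \sum_{s=0}^{t} \|p^{s+1} - p^s\| \le \sum_{s=0}^{t} \sum_{(i,r) \in V \cup U} |p^{s+1}_{ir} - p^s_{ir}|$$
$$= \sum_{(i,r) \in V} \sum_{s=0}^{t} |p^{s+1}_{ir} - p^s_{ir}| + \sum_{(i,r) \in U} \sum_{s=0}^{t} |p^{s+1}_{ir} - p^s_{ir}| \quad \text{(by 1c, 3b)}$$
$$\le \sum_{(i,r) \in V} d(p^0) + \sum_{(i,r) \in U} \theta\, d(p^0) \le \theta\, d(p^0)\,(|V| + |U|) \quad \text{(since } \theta \ge 1\text{)}$$
$$= nk\theta\, d(p^0)$$
5. $p^0 \in B_{\delta''}(p^*)$ implies $d(p^0) \le \delta''$ so that
$$\|p^{t+1} - p^*\| \le \|p^{t+1} - p^0\| + \|p^0 - p^*\| \le nk\theta\delta'' + \delta'' \ \text{(by 4)} = (nk\theta + 1)\delta'' = \delta' \ \text{(by (2.82))}$$
and therefore $p^{t+1} \in B_{\delta'}(p^*)$.
Now Claim 2.42 follows immediately from 1d and 5 and thus finishes the proof of convergence of Theorem 2.37. Note, in the last statements we have not only shown that $p^s \in B_{\delta'}(p^*) \cap V_\lambda$ for all $s \ge 1$, but also in 4 that the length of the trajectory is bounded by $nk\theta\, d(p^0)$.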
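To prove geometric convergence we use for $p^0 \in B_{\delta''}(p^*) \cap V_\lambda$ the estimates derived in 1a, 2a and 3 and get
$$(i,r) \in V: \quad |p^t_{ir} - p^*_{ir}| \le \bar\rho^{\,t} p^0_{ir} \le \bar\rho^{\,t} d(p^0) \quad (\text{since } p^*_{ir} = 0)$$
$$(i,r) \in U: \quad |p^t_{ir} - p^*_{ir}| \le \sum_{s=0}^{\infty} |p^{t+s+1}_{ir} - p^{t+s}_{ir}| \le \sum_{s=0}^{\infty} Kk\, d(p^{t+s}) \le Kk\, d(p^t) \sum_{s=0}^{\infty} \bar\rho^{\,s} = \frac{Kk}{1 - \bar\rho}\, d(p^t) \le \theta\, d(p^0)\, \bar\rho^{\,t}.$$
Since there exists $t_0$ such that $p^{t_0} \in B_{\delta''}(p^*) \cap V_\lambda$ we have $d(p^{t_0}) \le \delta''$ and get by summation over all indices
$$\|p^t - p^*\| \le nk\theta\delta''\, \bar\rho^{\,t} \qquad \forall t \ge t_0$$
which finishes the proof of Theorem 2.37.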
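The geometric decay of $d(p^t)$ can be observed on the objective of Example 2.32, $z(p) = 3p_{21}(p_{11} + p_{12}) + 2p_{12}p_{22} + p_{11}p_{22}$, with $F = F[\Gamma]$. The sketch below is an illustration only: it assumes that the limit lies in the relative interior of the edge $p_{21} = 1$, so that $V = \{(2,2)\}$ and $d(p) = p_{22}$, and it uses an arbitrarily chosen starting point. For this example and $p_{22} \le \frac12$ the contraction factor is at most $2/(3 - p_{22}) \le 0.8$, since $\Gamma_{21} = 3$ and $\Gamma_{22} \le 2$ on $\Delta$.

```python
def gamma(p):
    """Partial derivatives of z(p) = 3*p21*(p11 + p12) + 2*p12*p22 + p11*p22."""
    (p11, p12), (p21, p22) = p
    return [[3*p21 + p22, 3*p21 + 2*p22],
            [3*p11 + 3*p12, p11 + 2*p12]]

def F(p):
    """One step of F[Gamma]."""
    out = []
    for row, g in zip(p, gamma(p)):
        s = sum(a * b for a, b in zip(row, g))
        out.append([a * b / s for a, b in zip(row, g)])
    return out

p = [[0.9, 0.1], [0.5, 0.5]]   # illustrative interior starting point
d = [p[1][1]]                  # d(p) = p22 under the assumption V = {(2, 2)}
for _ in range(30):
    p = F(p)
    d.append(p[1][1])

ratios = [b / a for a, b in zip(d, d[1:])]
print("largest contraction factor:", max(ratios))
print("final point:", p)
```

The printed factor plays the role of $\bar\rho$ in Lemma 2.41 (iii); the first row of the final point is a non-integer point of $\Delta_k$, i.e. a non-strict local maximum.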
Until now we have always discussed the discrete time dynamical system defined by
operator F. However, this system can also be interpreted as a discretized version
of a continuous dynamical system where for a small step length the iteration
sequence generated by the discrete dynamics follows essentially the trajectories of the corresponding continuous dynamical system. This aspect will be discussed
briefly in Appendix C where we will show that this continuous dynamical system is
a gradient flow with respect to an appropriate metric.
2.3.4 Relations between Local Optima and Attractors
First we introduce the notion of attractors:
Definition 2.43 (Attractor)
Let $F : \Delta \to \Delta$ be an operator. A point $p^* \in \Delta$ is called an attractor of $F$ if there exists a neighborhood $U(p^*)$ such that for any starting point $p^0 \in U(p^*) \cap \Delta$ the sequence $\{F^t(p^0)\}, t \ge 0$ converges to $p^*$. In this case
$$RA(p^*) := \{ p \in \Delta \mid \lim_{t \to \infty} F^t(p) = p^* \} \tag{2.83}$$
is called the region of attraction of $p^*$ with respect to $F$.
If we are speaking about a SAP instance with objective function $z$ and $F := F[\Gamma]$, then we will write $RA(z, p^*)$ instead of $RA(p^*)$ in order to make clear which objective function we are working with.
In the sequel we investigate the locations of attractors and their relations with local
maxima. Moreover, we present some sufficient conditions characterizing a subset of
the related region of attraction.
Lemma 2.44
Let $F : \Delta^\circ \to \Delta^\circ$ be a strict growth transformation for a continuous function $z : \mathbb{R}^{nk} \to \mathbb{R}$ and let $p^* \in \Delta^I$ be a fixed point of $F$. If there exists a closed set $M(p^*) \subseteq \Delta$ which fulfills the following properties
(R1) $\exists \varepsilon > 0 : (U_\varepsilon(p^*) \cap \Delta) \subseteq M(p^*)$
(R2) $F(M(p^*)) \subseteq M(p^*)$
(R3) $p^*$ is the only fixed point in $M(p^*)$
then $p^*$ is an attractor and $M(p^*) \subseteq RA(p^*)$.
Proof
From the compactness of $M(p^*)$ and (R2) it follows that for any $p \in M(p^*)$ the sequence $\{F^t(p)\}, t \ge 0$ has at least one accumulation point in $M(p^*)$. Moreover, we know already from Proposition 2.33 that every accumulation point of $\{F^t(p)\}, t \ge 0$ is also a fixed point. However, by (R3) the only fixed point in $M(p^*)$ is $p^*$ which is therefore also the only accumulation point in $M(p^*)$ and hence $\lim_{t \to \infty} F^t(p) = p^*$ for all $p \in M(p^*)$. Since by (R1) $M(p^*)$ has a non-empty relative interior it follows that $p^*$ is an attractor and hence $M(p^*) \subseteq RA(p^*)$.
Using the results of Lemma 2.44 and Corollary 2.31 we can now completely characterize the attractors of the R-SAP:
Theorem 2.45
Let $z$ be an A-polynomial and $F := F[\Gamma]$. Then $p^* \in \Delta$ is an attractor of $F$ if and only if $p^*$ is a strict local maximum of $z$ on $\Delta$.
Proof
'$\Rightarrow$': Assume $p^* \in \Delta$ is not a strict local maximum. Then for any neighborhood $U(p^*)$ there exists $q \in U(p^*) \cap \Delta$ with $z(q) \ge z(p^*)$ and $q \ne p^*$. Now we fix any such $U(p^*)$ and define a sequence $\{p^t := F^t(p^0)\}, t \ge 0$ with $p^0 := q$. Then $\{p^t\}$ does not converge to $p^*$, because either $z(p^t) = z(p^0)$ for all $t \ge 0$ and then $p^0$ is a fixed point, or $z(p^t) > z(p^0) = z(q) \ge z(p^*)$ for all $t > 0$ and $\{z(p^t)\}, t \ge 0$ increases monotonically.
'$\Leftarrow$': This part of the proof makes use of Corollary 2.31. Let $p^*$ be a strict local maximum with $p^*_{i1} = 1$ for all $i \in N$ and let $U_\varepsilon(p^*)$ as in (2.64) be the connected component of $\{ p \in \Delta \mid z(p) \ge z(p^*) - \varepsilon \}$ containing $p^*$. We choose $\varepsilon > 0$ small enough such that for all $p \in U_\varepsilon(p^*)$: (i) $p_{i1} > 0$ and (ii) $\Gamma_{i1}(p) > \Gamma_{ir}(p)$ for all $r > 1$, which is possible because of the continuity of $z$ and the fact that $p^*$ is a strict local maximum.
From (2.63) we know that a point $p \in \Delta$ is a fixed point of $F$ if and only if for all $r \in K$ with $p_{ir} \ne 0$: $\Gamma_{ir}(p) = v_i$ for all $i \in N$. Hence (i) and (ii) imply directly that the only fixed point in $U_\varepsilon(p^*)$ is $p^*$. By Corollary 2.31 we have $F(U_\varepsilon(p^*)) \subseteq U_\varepsilon(p^*)$ and hence $U_\varepsilon(p^*)$ fulfills for the chosen $\varepsilon$ the properties (R1)-(R3) of Lemma 2.44 which implies that $p^*$ is an attractor.
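The '$\Leftarrow$' direction can be illustrated on a small instance. The sketch below is not from the original: it uses the polynomial $2p_{11}p_{21} + 2p_{12}p_{22} + 3p_{12}p_{21} + p_{11}p_{22}$, whose unique maximum on $\Delta$ is the vertex with $p_{12} = p_{21} = 1$ (value 3), and iterates $F[\Gamma]$ from randomly drawn interior points; the helper names are hypothetical.

```python
import random

def gamma(p):
    """Partial derivatives of 2*p11*p21 + 2*p12*p22 + 3*p12*p21 + p11*p22."""
    (p11, p12), (p21, p22) = p
    return [[2*p21 + p22, 2*p22 + 3*p21],
            [2*p11 + 3*p12, 2*p12 + p11]]

def F(p):
    """One step of F[Gamma]."""
    out = []
    for row, g in zip(p, gamma(p)):
        s = sum(a * b for a, b in zip(row, g))
        out.append([a * b / s for a, b in zip(row, g)])
    return out

def iterate(p, steps=200):
    """Apply F repeatedly and return the last iterate."""
    for _ in range(steps):
        p = F(p)
    return p

# On the simplex Gamma_12 - Gamma_11 = p21 + p22 = 1 and Gamma_21 - Gamma_22 = 1,
# so p12 and p21 grow in every step from any interior starting point.
rng = random.Random(2)
limits = []
for _ in range(20):
    a, b = rng.uniform(0.01, 0.99), rng.uniform(0.01, 0.99)
    limits.append(iterate([[a, 1 - a], [b, 1 - b]]))
print(limits[0])
```

In this particular instance the region of attraction contains the whole interior; in general Theorem 2.45 only guarantees a neighborhood of the strict local maximum.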
The following example shows that there may exist non-strict local maxima which are not even accumulation points of any sequence $\{F^t(p)\}, t \ge 0$ with $p \in \Delta^\circ$.
Example 2.46 (Example 2.32 continued)
For the R-SAP with
$$z(p) = 3p_{21}(p_{11} + p_{12}) + 2p_{12}p_{22} + p_{11}p_{22}$$
the local maxima are given by the points $p = \begin{pmatrix} p_{11} & p_{12} \\ 1 & 0 \end{pmatrix}$, with $p_{1\cdot} \in \Delta_k$. In Figure 2.6(a) we see the graph of $z(p)$ in the $p_{11}, p_{21}$-plane and Figure 2.6(b) shows its vector field.
(a) Plot of $z(p) = 3p_{21}(p_{11} + p_{12}) + 2p_{12}p_{22} + p_{11}p_{22}$. (b) Vector field of $z(p)$.
Figure 2.6: Example with a discrete local maximum $p^*$ which is not an accumulation point of any sequence started in $\Delta^\circ$.
Computing $\Gamma(p)$ we get
$$\Gamma(p) = \begin{pmatrix} 3p_{21} + p_{22} & 3p_{21} + 2p_{22} \\ 3p_{11} + 3p_{12} & p_{11} + 2p_{12} \end{pmatrix}$$
and we see that $\Gamma_{11}(p) < \Gamma_{12}(p)$ for all $p \in \Delta^\circ$ and therefore also $F(p)_{11} < p_{11}$. Hence, the local maximum $p^* := \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ cannot be an accumulation point of $\{F[\Gamma]^t(p)\}, t \ge 0$ for any $p \in \Delta^\circ$.
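The monotone decrease of $p_{11}$ in Example 2.46 is easy to observe. The following sketch iterates $F[\Gamma]$ for the example's objective from one illustrative interior starting point close to the row $(1, 0)$; the starting point and the helper names are assumptions made for the illustration.

```python
def gamma(p):
    """Gamma(p) of Example 2.46."""
    (p11, p12), (p21, p22) = p
    return [[3*p21 + p22, 3*p21 + 2*p22],
            [3*p11 + 3*p12, p11 + 2*p12]]

def F(p):
    """One step of F[Gamma]."""
    out = []
    for row, g in zip(p, gamma(p)):
        s = sum(a * b for a, b in zip(row, g))
        out.append([a * b / s for a, b in zip(row, g)])
    return out

p = [[0.9, 0.1], [0.5, 0.5]]   # illustrative interior starting point
p11_path = [p[0][0]]
for _ in range(20):
    p = F(p)
    p11_path.append(p[0][0])

print("p11 along the sequence:", p11_path[:5], "...")
print("point after 20 steps  :", p)
```

The sequence approaches the edge $p_{21} = 1$, but its first row stays strictly below the starting value of $p_{11}$, so the vertex with $p_{11} = p_{21} = 1$ is not approached.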
Summing up, we have shown the following implications where zigzag lines in the
diagram denote some implications that do not hold:
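[Diagram: implications between strict local maximum (discrete), strict local maximum (continuous), local maximum (continuous), attractor, accumulation point, KKT point and fixed point (equilibrium); the link between strict local maximum (continuous) and attractor is labelled Thm. 2.45.]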
2.4 Guaranteed Region of Attraction for $F[\Gamma]$
We know already from Theorem 2.45 that each strict local maximum of an R-SAP
instance has a region of attraction for F[Y] which, in general, depends on the
form of the objective function z. For this reason we investigate in this section the
step length and step direction of the iteration sequence, which both influence its
convergence behavior. We will prove that the intersection of all regions of attraction
over all forms of z where F is defined, is non-empty and therefore there exists for
each strict local maximum also a so-called form-independent region of attraction.
So we concentrate on finding a subset of this region, called the guaranteed region of
attraction GRA with the goal that for quadratic A-polynomials this region can be
characterized explicitly and thus provide us in this case with an additional stoppingcriterion for FPH.
Throughout this section we use the operator $F := F[\Gamma]$. Moreover, we denote the set of all A-polynomials in $P_N$ for which F is well defined by $P_F$. Since we are working here with different forms of z we define:
Definition 2.47 (Form-Independence)
Let $z \in P_F$ be fixed and for all $\tilde z \in P_F$ with $\tilde z \equiv z$ let a set $M(\tilde z, p)$ depending on $\tilde z$ and $p \in \Delta$ be given. If for all $p \in \Delta$ and $\tilde z \in P_F$ with $\tilde z \equiv z$: $M(\tilde z, p) = M(z, p)$, then we define $M(p) := M(z, p)$ and call the set $M(p)$ form-independent.
All pictures in this section represent Boolean, quadratic objective functions with
two variables (n = k = m = 2). The x-axis corresponds to the $p_{11}$ component and the y-axis to $p_{21}$.
2.4.1 An Introductory Example
Let us continue here with Example 2.3 from page 23 and demonstrate some interesting aspects regarding the regions of attraction of an R-SAP instance. Recall that in this example N = K = {1,2} and the objective function was given by
$$z(p) = 2p_{11}p_{21} + p_{12}p_{22} + p_{11} + p_{12} + p_{21} + p_{22} - 2.$$
This R-SAP has a global maximum at $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ with $z(p^*) = 2$ and a local maximum at $q^* = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ with $z(q^*) = 1$ (see also the graph in Figure 2.1(a)). Moreover, there exists a saddle point $\bar p = \frac{1}{3}\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$.
Now we will study two A-polynomials $z_1$ and $z_2$ which are equivalent to z. We define for some large constant M:
$$z_1(p) := z(p) + M(p_{11} + p_{12}) - M$$
$$z_2(p) := z(p) + M(p_{21} + p_{22}) - M$$
and it follows immediately that $z \equiv z_1 \equiv z_2$.
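The equivalence of the three forms can be checked numerically. The following is a minimal sketch, assuming the matrix layout p = ((p11, p12), (p21, p22)); the helper name `point` and the choice M = 5000 (the value used for Figure 2.7) are illustrative only. On the product of simplices each row sums to one, so the added terms cancel.

```python
import random

M = 5000.0  # large constant; 5000 is the value quoted for Figure 2.7


def z(p):
    """Objective function of the introductory example."""
    (p11, p12), (p21, p22) = p
    return 2 * p11 * p21 + p12 * p22 + p11 + p12 + p21 + p22 - 2


def z1(p):
    """Form z_1: adds M * (p11 + p12) - M, which vanishes on the simplex."""
    (p11, p12), _ = p
    return z(p) + M * (p11 + p12) - M


def z2(p):
    """Form z_2: adds M * (p21 + p22) - M, which vanishes on the simplex."""
    _, (p21, p22) = p
    return z(p) + M * (p21 + p22) - M


def point(x, y):
    """Hypothetical helper: the feasible point with p11 = x and p21 = y."""
    return ((x, 1.0 - x), (y, 1.0 - y))


random.seed(0)
for _ in range(5):
    p = point(random.random(), random.random())
    # All three forms agree on feasible points (up to rounding).
    assert abs(z(p) - z1(p)) < 1e-6 and abs(z(p) - z2(p)) < 1e-6

# Values at the global maximum p*, the local maximum q* and the saddle point.
print(z(point(1, 1)), z(point(0, 0)), z(point(1 / 3, 1 / 3)))
```

The three forms differ only off the feasible set, which is why their gradients, and hence the iteration sequences, differ although the optimization problem is the same.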
The partial derivatives of z were already computed in (2.6) and those of the two forms $z_1$ and $z_2$ are given by
$$\Gamma(z_1,p) = \Gamma(z,p) + \begin{pmatrix} M & M \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad \Gamma(z_2,p) = \Gamma(z,p) + \begin{pmatrix} 0 & 0 \\ M & M \end{pmatrix}.$$
We see in Figure 2.7 (also compare with Figure 2.2(b)) how the regions of attraction of p* and q* are changed by the new forms $z_1$ and $z_2$.
[Figure 2.7: (a) Region of attraction for $z_1$. (b) Region of attraction for $z_2$.]
Figure 2.7: Regions of attraction for $z_1$ and $z_2$ with M := 5000; black region corresponds to p*, light gray region to q* (numerically determined).
In order to discuss the regions of attraction, we define four regions on $[0,1]^2$ which correspond uniquely to regions on $\Delta$. Within each of these regions the step direction of the iteration sequence is fixed as shown by the arrows in Figure 2.8.
[Figure 2.8: the four regions $R_1, \dots, R_4$ of $[0,1]^2$.]
Figure 2.8: Region of monotone increase; arrows indicate directions of increase.
Comparing the three regions of attraction in Figures 2.7(a), 2.7(b) and 2.2(b) for the three different forms $z_1$, $z_2$ and z, we observe that all the black regions as well as all the light gray regions have a non-empty intersection. Indeed, we will see later that for every strict local maximum there exists a non-empty intersection of the regions of attraction of all forms of z. This form-independent region is defined in Definition 2.48 and will be called the universal region of attraction. In this example the universal regions of attraction are given by $R_2$ and $R_4$ for the strict local maxima p* and q*, respectively. Note, however, that in the general case the universal region of attraction cannot be determined explicitly. For this reason we will investigate in the following subsections some subsets thereof.
Aside from these two universal regions of attraction there also exist the two form-dependent regions $R_1$ and $R_3$. For points in those two regions we cannot say in advance to which point the iteration sequence will converge. Since the step direction is fixed in each of these regions, the behavior of the iteration sequence there can only be influenced by the step length.
By adding a large constant to the partial derivatives of some $p_i$, we make the step length in these directions small. In this example we added for $z_1$ a large constant to the derivatives of $p_1$, thus making the corresponding step length in $R_3$ in $-p_{11}$-direction but also in $R_1$ in $p_{11}$-direction very small. Hence, in Figure 2.7(a) the steps in $p_{21}$-direction are large compared to those in $p_{11}$-direction and therefore the sequence reaches the universal region of attraction $R_2$ for nearly all points in $R_3$ and none in $R_1$. Exactly the reverse situation is depicted in Figure 2.7(b) for the function $z_2$.
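The influence of the form on the step length can be reproduced with a few lines of code. This is a minimal sketch, not the implementation used for Figure 2.7: it assumes the update rule $F(p)_{ir} = p_{ir}\Gamma_{ir}(p) / \sum_s p_{is}\Gamma_{is}(p)$ (the form of the operator that appears in (2.113)), and the names `gamma`, `step` and `attractor` are hypothetical. The arguments `m1` and `m2` are the constants added to the derivatives of row 1 and row 2, so (M, 0) corresponds to $z_1$ and (0, M) to $z_2$.

```python
def gamma(p, m1=0.0, m2=0.0):
    """Partial derivatives of the introductory z, shifted by m1 (row 1) and m2 (row 2)."""
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + 1 + m1, p22 + 1 + m1],
            [2 * p11 + 1 + m2, p12 + 1 + m2]]


def step(p, m1=0.0, m2=0.0):
    """One application of the assumed operator: rescale each row by its derivatives."""
    g = gamma(p, m1, m2)
    new_p = []
    for row, g_row in zip(p, g):
        weights = [x * gx for x, gx in zip(row, g_row)]
        total = sum(weights)
        new_p.append([w / total for w in weights])
    return new_p


def attractor(x, y, m1=0.0, m2=0.0, iterations=200):
    """Iterate from (p11, p21) = (x, y) and round each row to its largest component.

    Returns the column index (0 or 1) of the rounded unit component per row,
    so (0, 0) stands for p* and (1, 1) for q*.
    """
    p = [[x, 1.0 - x], [y, 1.0 - y]]
    for _ in range(iterations):
        p = step(p, m1, m2)
    return tuple(max(range(2), key=lambda r: row[r]) for row in p)


M = 5000.0
start = (0.6, 0.1)  # a point of R_3: p11 > 1/3 and p21 < 1/3
print(attractor(*start, m1=M))  # form z_1
print(attractor(*start, m2=M))  # form z_2
```

For the starting point chosen in $R_3$ the form $z_1$ keeps $p_{11}$ almost fixed while $p_{21}$ grows past 1/3, so the sequence ends at p*; with $z_2$ the roles are exchanged and the sequence ends at q*.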
In the following subsections we will concentrate on characterizing some subsets of the universal region of attraction and do not discuss the issues of the step length further on.
2.4.2 Characterization of the Guaranteed Region of Attraction
We have seen in the introductory example that the regions of attraction of different forms of A-polynomials may vary greatly. However, from Theorem 2.45 it follows that every attractor of an A-polynomial z is also an attractor of any form $\tilde z \in P_F$, $\tilde z \equiv z$. Hence we can define:
Definition 2.48 (Universal Region of Attraction: URA(p*))
Let $z \in P_F$ and p* be an attractor of F. Then the universal region of attraction URA of p* is defined by
$$URA(p^*) := \bigcap_{\tilde z \in P_F,\ \tilde z \equiv z} RA(\tilde z, p^*),$$
where $RA(\tilde z, p^*)$ is the region of attraction of p* for $\tilde z$, as defined on page 56.
Let $p^* \in \Delta^I$ be a strict local maximum and thus also be an attractor of F. We denote by $s_i(p^*)$ for all $i \in N$ that component of p* which is one, i.e. $p^*_{i s_i(p^*)} = 1$. We will see that we can define a neighborhood $GRA_1(p^*)$ such that for all $p \in GRA_1(p^*)$ the sequence $\{F^t(p)\}$, $t \ge 0$, does not only converge to p*, but even the components $p_{i s_i(p^*)}$ increase monotonically while converging to one. For this reason we first define
$$\Delta_{p^*} := \{p \in \Delta \mid \mathrm{supp}(p^*) \subseteq \mathrm{supp}(p)\} = \{p \in \Delta \mid p_{i s_i(p^*)} > 0,\ \forall i \in N\}. \tag{2.84}$$
The definition of $\Delta_{p^*}$ is motivated by the following observation: If a point p lies in a face D of $\Delta$, then all further points of the iteration sequence $\{F^t(p)\}$, $t \ge 0$, also lie in D. Hence, a necessary condition for convergence of the sequence $\{F^t(p)\}$ with $p \in D$ to p* is that $p^* \in D$. The definition of $\Delta_{p^*}$ eliminates exactly those faces of $\Delta$ for which this condition is not fulfilled. This implies that p* is the only integer vertex in $\Delta_{p^*}$.
As in the introductory example we define some region indicating the step direction, where we will take here only the components $(i, s_i(p^*))$ into account. For this reason we define for each $p^* \in \Delta^I$ the following region of monotone increase
$$MI_1(p^*) := \{p \in \Delta_{p^*} \mid \Gamma_{i s_i(p^*)}(z,p) > \Gamma_{ir}(z,p),\ \forall i \in N,\ r \ne s_i(p^*)\}. \tag{2.85}$$
If $F(p)_{ir} = u_{ir}p_{ir}$, then it follows from (2.39) that for points in $MI_1(p^*)$ the components $(i, s_i(p^*))$ do not only increase, but are also multiplied by the largest factor $u_{i s_i(p^*)}$. Hence each point in $\Delta$ is contained in at most one set $MI_1(p^*)$, $p^* \in \Delta^I$. Moreover, $MI_1(p^*)$ has the following important properties:
• $MI_1(p^*)$ is form-independent (Remark 2.9)
• For $p \in MI_1(p^*)$: $F(p)_{i s_i(p^*)} \ge p_{i s_i(p^*)}$ (by (2.37))
• p* is a strict local maximum $\iff$ $p^* \in MI_1(p^*)$ (Proposition 2.15)
• p* is a strict local maximum $\implies$ p* is the only fixed point in $MI_1(p^*)$ (by (2.63))
These properties imply that if p* is a strict local maximum, then there exists a neighborhood $U_\varepsilon(p^*)$ such that $U_\varepsilon(p^*) \cap \Delta \subseteq MI_1(p^*)$ and therefore for all points in $U_\varepsilon(p^*)$ the components $(i, s_i(p^*))$ increase under F.
Example 2.49 (Introductory example continued)
Let us look at some regions $MI_1$ in our introductory example. Since $MI_1$ is form-independent we can work with the objective function $z(p) = 2p_{11}p_{21} + p_{12}p_{22}$. For the strict local maximum $p^* := \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ we get $MI_1(p^*) = \{p \in \Delta_{p^*} \mid 2p_{21} - p_{22} > 0,\ 2p_{11} - p_{12} > 0\}$ and see that $p^* \in MI_1(p^*)$. This region corresponds to $R_2 := (\frac{1}{3}, 1] \times (\frac{1}{3}, 1]$ in Figure 2.8.
However, for $v^* := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, which is not a strict local maximum, we get $MI_1(v^*) = \{p \in \Delta_{v^*} \mid p_{22} - 2p_{21} > 0,\ 2p_{11} - p_{12} > 0\}$ and we observe that $v^* \notin MI_1(v^*)$. This can also be seen in Figure 2.8, where the region corresponding to $MI_1(v^*)$ is the rectangle $R_3 := (\frac{1}{3}, 1) \times (0, \frac{1}{3})$.
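The membership test for a region of monotone increase is a direct comparison of partial derivatives. The sketch below assumes the form $z(p) = 2p_{11}p_{21} + p_{12}p_{22}$ used in this example; the function names are hypothetical, and a vertex is encoded by the tuple `s` of (zero-based) column indices of its unit components, so `(0, 0)` stands for p* and `(1, 0)` for v*.

```python
def gamma(p):
    """Partial derivatives of z(p) = 2*p11*p21 + p12*p22 (rows i, columns r)."""
    (p11, p12), (p21, p22) = p
    return [[2 * p21, p22], [2 * p11, p12]]


def in_mi1(p, s):
    """True if p lies in MI_1 of the vertex whose unit components sit in columns s[i]."""
    g = gamma(p)
    for i, row in enumerate(p):
        if row[s[i]] <= 0:  # p must lie in Delta_{p*}, cf. (2.84)
            return False
        # The derivative of the distinguished component must be strictly largest, cf. (2.85).
        if any(g[i][s[i]] <= g[i][r] for r in range(len(row)) if r != s[i]):
            return False
    return True


p_star = [[1.0, 0.0], [1.0, 0.0]]
v_star = [[0.0, 1.0], [1.0, 0.0]]
print(in_mi1(p_star, (0, 0)))  # p* lies in its own region
print(in_mi1(v_star, (1, 0)))  # v* does not
print(in_mi1([[0.5, 0.5], [0.2, 0.8]], (1, 0)))  # a point of R_3
```

The last call uses a point with $p_{11} > \frac{1}{3}$ and $p_{21} < \frac{1}{3}$, i.e. a point of the rectangle $R_3$.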
In this example for n = k = m = 2, we could simply define $GRA_1(p^*) = MI_1(p^*)$, thus getting a form-independent subset of URA(p*) for any strict local maximum p*. However, in the general case we are still confronted with the problem that a sequence $\{F^t(p)\}$, $t \ge 0$, with $p \in MI_1(p^*)$ does not necessarily stay in $MI_1(p^*)$. Assume a situation as depicted in Figure 2.9.
Figure 2.9: Leaving $MI_1(p^*)$ because of too large a step length.
Let $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ be a strict local maximum and let $MI_1(p^*)$ be given by the gray region in Figure 2.9. Though $p^* \in MI_1(p^*)$ and the step direction is fixed (the sequence always goes to the upper right corner), the problem in this example is that there may exist critical points $p \in MI_1(p^*)$ with $p' := F(p) \notin MI_1(p^*)$. A pair of such points is shown in Figure 2.9 by p and p'. To avoid these situations, we will define $GRA_1(p^*)$ such that all critical points of $MI_1(p^*)$ are cut off, as shown in Figure 2.9 by the dashed line.
Hence we define for any point $p \in MI_1(p^*)$ a set $Q_1(p,p^*)$ which potentially covers all directions $F(\tilde z,p) - p$ for some form $\tilde z \equiv z$ of a given A-polynomial $z \in P_F$. Let $p^* \in \Delta^I$ with $p^*_{i s_i(p^*)} = 1$ be given and $p \in \Delta$, then we define
$$Q_1(p,p^*) = \{q \in \Delta \mid q_{i s_i(p^*)} \ge p_{i s_i(p^*)},\ \forall i \in N\}. \tag{2.86}$$
For any point $p \in MI_1(p^*)$ we know that $p' := F(p)$ fulfills $p'_{i s_i(p^*)} \ge p_{i s_i(p^*)}$ for all $i \in N$. Therefore this definition guarantees that $p' \in Q_1(p,p^*)$.
[Figure 2.10: (a) n = k = 2 in the $p_{11}$-$p_{21}$-plane. (b) n = 1, k = 3.]
Figure 2.10: Examples for $Q_1(p,p^*)$.
Note that for Boolean A-polynomials the projection of the set $Q_1(p,p^*)$ to the $p_{i1}$ coordinates for $i \in N$ describes the axis-parallel hyper-cuboid in $[0,1]^n$ with opposing corners p and p* (see Figure 2.10).
Using (2.86) we can define $GRA_1(p^*)$ as a subset of $MI_1(p^*)$ which is closed with respect to $Q_1(p,p^*)$.
Definition 2.50 (Guaranteed Region of Attraction: $GRA_1(p^*)$)
Let $z \in P_F$ and $p^* \in \Delta^I$ with $p^*_{i s_i(p^*)} = 1$ for $i \in N$ be an attractor. We define the guaranteed region of attraction $GRA_1(p^*)$ by
$$GRA_1(p^*) := \{p \in MI_1(p^*) \mid Q_1(p,p^*) \subseteq MI_1(p^*)\}. \tag{2.87}$$
We see that $GRA_1(p^*)$ is form-independent and we will show now that it is indeed a region of attraction and therefore a subset of $URA(p^*)$.
Theorem 2.51
Let $z \in P_F$ be an A-polynomial, $F := F[\Gamma]$ and $p^* \in \Delta^I$ with $p^*_{i s_i(p^*)} = 1$ be a strict local maximum. Moreover, let $GRA_1(p^*)$ be defined by (2.87). Then for any starting point $p \in GRA_1(p^*)$
$$\lim_{t \to \infty} F^t(p) = p^*.$$
Proof
Since p* is a strict local maximum, it lies in $GRA_1(p^*)$ and it follows from Theorem 2.45 that there always exists a neighborhood $U_\varepsilon(p^*)$ such that $U_\varepsilon(p^*) \cap \Delta \subseteq GRA_1(p^*)$ and therefore $GRA_1(p^*)$ is not only the singleton {p*}. Furthermore we know by construction that for any $p \in GRA_1(p^*)$ the iteration sequence $\{F^t(p)\}$, $t \ge 0$, stays in $Q_1(p,p^*)$. However, $Q_1(p,p^*)$ is a compact subset of $MI_1(p^*)$ and p* is the only fixed point in $Q_1(p,p^*)$. Hence, $Q_1(p,p^*)$ fulfills (R1)-(R3) of Lemma 2.44 which proves the theorem.
Though we are still confronted with the same problem as before for $URA(p^*)$, of not being able to characterize $GRA_1(p^*)$ explicitly, we gained by the definition of $GRA_1(p^*)$ some advantage: the polytope $Q_1(p,p^*)$ has a very simple structure and $MI_1(p^*)$ is described by a system of nonlinear inequalities. Fortunately, this allows us to derive some stronger results for the special case of quadratic A-polynomials.
2.4.3 The Special Case of Quadratic A-Polynomials
If z is a quadratic A-polynomial, then its partial derivatives $\Gamma_{ir}(p)$ are linear functions and $MI_1(p^*)$ is described by the intersection of open halfspaces with $\Delta$. We will see in this section that the closure of $GRA_1(p^*)$ is a polytope for which we will present an explicit characterization by its hyperplanes. Thus we get a stopping criterion, since the test whether a given point in $\Delta$ belongs to $GRA_1(p^*)$ or not can now be carried out efficiently.
In the second part of this section we construct, in a similar way as already done for $GRA_1(p^*)$, another form-independent subset, $GRA_2(p^*)$, of the universal region of attraction of p*. As before, $GRA_2(p^*)$ will be a bounded polyhedral set and we will be able to derive its defining hyperplanes.
Throughout this section we work with quadratic A-polynomials z and we assume that $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for all $i \in N$ is a strict local maximum and therefore an attractor. Moreover, we denote by $e^{ir}$ the (i, r)-th unit vector of length $n \times k$.
Construction of $GRA_1(p^*)$
Let us denote by L the index set of all open halfspaces defining $MI_1(p^*)$ of (2.85), i.e. for $l = (i,r)$
$$h_l(p) := \Gamma_{i1}(p) - \Gamma_{ir}(p) \tag{2.88}$$
such that
$$MI_1(p^*) = \{p \in \Delta_{p^*} \mid h_l(p) > 0,\ \forall l \in L\}. \tag{2.89}$$
Moreover, for any $l \in L$ we write $h_l(p)$ as
$$h_l(p) = \sum_{i=1}^{n} \sum_{r=1}^{k} a_{ir}p_{ir} + b, \tag{2.90}$$
where we skip the index l for the coefficients $a_{ir}$ and b for reasons of easier writing. Furthermore, we assume in accordance with (2.89) that $h_l(p^*) > 0$, which determines implicitly the signs of the coefficients $a_{ir}$ and b in (2.90).
In order to describe $GRA_1(p^*)$ explicitly, we will characterize the extreme points $p^U$ of $Q_1(p,p^*)$ and determine the vertex $p^{U_l}$ for which $h_l(p^{U_l})$ attains its minimum over $Q_1(p,p^*)$.
Proposition 2.52
Let z be a quadratic A-polynomial and let $p^* \in \Delta^I$ be an attractor. Moreover, let for $l \in L$, $h_l(p)$ be given by (2.88) and (2.90) and
$$s_l(p) := \sum_{i=1}^{n} \Big( a_{i1}p_{i1} + (1 - p_{i1}) \min_{r>1} a_{ir} \Big) + b. \tag{2.91}$$
Then
$$GRA_1(p^*) = \{p \in \Delta_{p^*} \mid s_l(p) > 0,\ \forall l \in L\}.$$
Proof
First note that since $h_l(p)$ is a linear function for which the minimum over $Q_1(p,p^*)$ has to be determined, we can separate $h_l(p)$ over the simplices $\Delta_i$ and write $h_l(p) = \sum_{i=1}^{n} h_l^i(p) + b$. Thus we only need to compute the minima over the functions $h_l^i(p)$ and therefore we let $i \in N$ be fixed for the moment and work only over one simplex $\Delta_i$.
Let $p \in MI_1(p^*)$ be given. In order to compute the extreme points of $Q_1(p,p^*)$ we intersect $\Delta_i$ with the hyperplane $q_{i1} = p_{i1}$ (see Figure 2.12(a) on page 68). We observe that any extreme point $p^U \ne p^*$ of $Q_1(p,p^*)$ has exactly two non-zero components: $p^U_{i1} = p_{i1}$ and exactly one of the other components $p^U_{ir} = 1 - p_{i1}$ (r > 1 fixed). Hence, now again working on $\Delta$, we associate with each set U of the form $\{(i, r_i) \mid i = 1,\dots,n,\ r_i \in K \setminus \{1\}\}$ an extreme point $p^U$ of $Q_1(p,p^*)$ by
$$p^U_{ir} := \begin{cases} p_{i1}, & \text{if } r = 1 \\ 1 - p_{i1}, & \text{if } (i,r) \in U \\ 0, & \text{otherwise} \end{cases} \tag{2.92}$$
whereby all $n(k-1)$ extreme points of $Q_1(p,p^*)$ other than p* are characterized.
Now we will determine the extreme point $p^{U_l}$ for which $h_l(p^U)$ attains its minimum on $Q_1(p,p^*)$, i.e. $p^{U_l} := \arg\min_{p^U} h_l(p^U)$. Since we have $h_l(p^U) = \sum_{i=1}^{n} a_{i1}p_{i1} + \sum_{(i,r) \in U} a_{ir}(1 - p_{i1}) + b$ it follows immediately that
$$(i, r_i) \in U_l \iff r_i = \min \arg\min_{s>1} a_{is}, \quad i = 1,\dots,n.$$
If $h_l(p^{U_l}) > 0$ for all $l \in L$, then $h_l(p^U) > 0$ for any arbitrary extreme point $p^U$ and therefore $Q_1(p,p^*) \subseteq MI_1(p^*)$. This proves the proposition, since $s_l(p) = h_l(p^{U_l})$ for all $l \in L$.
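To illustrate how Proposition 2.52 turns into an efficient membership test, the following sketch evaluates $s_l(p)$ of (2.91) from the coefficients of (2.90). It assumes, as in this section, that the first component of every row is the one with $p^*_{i1} = 1$ (column index 0 in the code); the function names and the coefficient encoding are illustrative assumptions. As data it uses the two halfspaces of the introductory example, $h(p) = 2p_{21} - p_{22}$ and $h(p) = 2p_{11} - p_{12}$.

```python
def s_value(a, b, p):
    """s_l(p) of (2.91); a[i][r] and b are the coefficients of h_l in (2.90)."""
    total = b
    for a_row, p_row in zip(a, p):
        # First component keeps its coefficient, the remaining mass 1 - p_i1
        # is charged with the smallest of the other coefficients.
        total += a_row[0] * p_row[0] + (1 - p_row[0]) * min(a_row[1:])
    return total


def in_gra1(halfspaces, p):
    """Membership test from Proposition 2.52: p in Delta_{p*} and all s_l(p) > 0."""
    in_face = all(row[0] > 0 for row in p)
    return in_face and all(s_value(a, b, p) > 0 for a, b in halfspaces)


# Halfspaces of MI_1(p*) for z(p) = 2*p11*p21 + p12*p22 and p* = ((1, 0), (1, 0)).
halfspaces = [
    ([[0, 0], [2, -1]], 0),  # Gamma_11 - Gamma_12 = 2*p21 - p22
    ([[2, -1], [0, 0]], 0),  # Gamma_21 - Gamma_22 = 2*p11 - p12
]

print(in_gra1(halfspaces, [[0.4, 0.6], [0.4, 0.6]]))
print(in_gra1(halfspaces, [[0.4, 0.6], [0.3, 0.7]]))
```

In this Boolean case $s_l$ reduces to $3p_{21} - 1$ and $3p_{11} - 1$, so the test accepts exactly the points with $p_{11} > \frac{1}{3}$ and $p_{21} > \frac{1}{3}$.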
Figure 2.11 shows two examples for $MI_1(p^*)$ and $GRA_1(p^*)$. The drawings illustrate how exactly those points of the region $MI_1(p^*)$ are cut off by $Q_1(p,p^*)$ for which too large a step length of F would have led out of $MI_1(p^*)$.
[Figure 2.11: (a) Case 1: $Q_1(p,p^*)$ cuts off points in both directions. (b) Case 2: $Q_1(p,p^*)$ cuts off points in one direction.]
Figure 2.11: Two examples of sets $MI_1(p^*)$ and $GRA_1(p^*)$.
In the next subsection we describe another form-independent set $GRA_2(p^*)$ which is constructed in a similar way as $GRA_1(p^*)$ and which defines different subsets of the universal region of attraction, depending on the value of c.
Construction of $GRA_2(p^*)$
In the definition of $MI_1(p^*)$ we have required an increase of the first components $p_{i1}$, but we have not had any further restrictions on the other components. Now we will define a new region $MI_2^c(p^*)$ as the set of points with increasing first components and decreasing $p_{ir}$ components for all $i \in N$, r > 1.
Similarly to (2.86), these properties should hold for all points in a set $Q_2(p,p^*)$. Hence we define
$$Q_2(p,p^*) = \{q \in \Delta \mid q_{i1} \ge p_{i1},\ q_{ir} \le p_{ir}\ \forall i \in N, r > 1\} = \{p + \mathrm{cone}\{e^{i1} - e^{ir} \mid \forall i \in N, r > 1\}\} \cap \Delta. \tag{2.93}$$
Note that for any $p \in \Delta$, $Q_2(p,p^*) \subseteq Q_1(p,p^*)$ (see Figure 2.12).
[Figure 2.12: (a) Any extreme point $p^U \ne p^*$ has always exactly (k-2) zero-components and $p^U_1 = p_1$. (b) Extreme point $p^U$ may have any number of zero-components, at all positions but $p^U_1$. Their p-values are all added to $p_1$.]
Figure 2.12: Extreme points $p^U$ of $Q_1(p,p^*)$ and $Q_2(p,p^*)$ for n = 1, k = 3 with $p = (p_1, p_2, p_3)$.
In order to preserve these properties for all points of $\{F^t(p)\}$, $t \ge 0$, we require that the following inequalities hold:
$$\Gamma_{i1}(p) \ge \Sigma_i(p) \quad \forall i \in N \tag{2.94}$$
$$\Gamma_{ir}(p) < \Sigma_i(p) \quad \forall i \in N,\ r > 1. \tag{2.95}$$
We observe that these inequalities are, even for quadratic A-polynomials, not linear. For this reason we derive subsequently some linear estimates for (2.94) and (2.95) which will then be used to define $MI_2^c(p^*)$.
We know already from (2.85) that $\Gamma_{i1}(p) > \Gamma_{ir}(p)$ for all $i \in N$, r > 1 implies (2.94). Moreover, we have
$$\Sigma_i = \sum_{s=1}^{k} p_{is}\Gamma_{is} = p_{i1}\Gamma_{i1} + \sum_{s=2}^{k} p_{is}\Gamma_{is} \ge p_{i1}\Gamma_{i1} + (1 - p_{i1}) \min_{s>1} \Gamma_{is}.$$
Hence (2.95) is implied by the condition
$$p_{i1}\Gamma_{i1} + (1 - p_{i1}) \min_{s>1} \Gamma_{is} > \Gamma_{ir} \quad \forall i \in N,\ r > 1. \tag{2.96}$$
Since $\Gamma_{i1} > \Gamma_{ir}$ for all $i \in N$, r > 1 and (2.96) is linear in $p_{i1}$, we can choose $c_i \in (0,1]$, $i \in N$, and get thus for $p_{i1} \ge c_i$ the following linear inequality:
$$p_{i1}\Gamma_{i1} + (1 - p_{i1}) \min_{s>1} \Gamma_{is} \ge c_i\Gamma_{i1} + (1 - c_i) \min_{s>1} \Gamma_{is} > \Gamma_{ir} \quad \forall i \in N,\ r > 1. \tag{2.97}$$
Moreover, assume that p* satisfies (2.97), then we get
$$c_i > \max_{r>1} \left\{ \frac{\Gamma_{ir}(p^*) - \min_{s>1}\Gamma_{is}(p^*)}{\Gamma_{i1}(p^*) - \min_{s>1}\Gamma_{is}(p^*)} \right\} =: c_i^* \quad \forall i \in N. \tag{2.98}$$
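Now we can define for quadratic A-polynomials another subset of $URA(p^*)$:
Definition 2.53 ($MI_2^c(p^*)$, $GRA_2^c(p^*)$)
Let $p^* \in \Delta^I$ be a strict local maximum and $c^*$ be defined by (2.98). Then we define for any $c \in \mathbb{R}^n$ with $c_i^* < c_i < 1$ and
$$p_{i1} \ge c_i \quad \forall i \in N \tag{2.99}$$
$$c_i\Gamma_{i1} + (1 - c_i)\Gamma_{is} > \Gamma_{ir} \quad \forall i \in N,\ r,s \in K \setminus \{1\},\ r \ne s \tag{2.100}$$
$$\Gamma_{i1} > \Gamma_{ir} \quad \forall i \in N,\ r > 1 \tag{2.101}$$
the set $MI_2^c(p^*)$ by
$$MI_2^c(p^*) := \{p \in \Delta \mid (2.99), (2.100), (2.101)\} \tag{2.102}$$
and a corresponding subset of the universal region of attraction by
$$GRA_2^c(p^*) := \{p \in MI_2^c(p^*) \mid Q_2(p,p^*) \subseteq MI_2^c(p^*)\}. \tag{2.103}$$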
We observe that by (2.98) $c_i > 0$ for all $i \in N$ and therefore (2.99) guarantees that $p \in \Delta_{p^*} = \{p \in \Delta \mid p_{i1} > 0,\ \forall i \in N\}$. For any strict local maximum p*, $c_i^*$ in (2.98) is strictly smaller than one and hence we can always choose $c_i < 1$.
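The bounds $c_i^*$ of (2.98) depend only on the gradient at p*. A minimal sketch of the computation, using exact rational arithmetic, is given below; the function name `c_star` is hypothetical and column 0 of each row is assumed to be the component with $p^*_{i1} = 1$. As input it takes the gradient at p* of the polynomial of Example 2.56, whose rows evaluate to (5, 0, 1) and (3, 0, 2).

```python
from fractions import Fraction


def c_star(gamma_at_pstar):
    """Lower bounds c_i^* of (2.98) from the rows of Gamma(p*)."""
    bounds = []
    for row in gamma_at_pstar:
        first = Fraction(row[0])
        rest = [Fraction(x) for x in row[1:]]
        low = min(rest)  # min over s > 1 of Gamma_is(p*)
        bounds.append(max((g - low) / (first - low) for g in rest))
    return bounds


print(c_star([[5, 0, 1], [3, 0, 2]]))  # [Fraction(1, 5), Fraction(2, 3)]
print(c_star([[2, 0], [2, 0]]))        # Boolean case: both bounds are 0
```

For k = 2 the numerator vanishes identically, which is the reason why $c_i^* = 0$ in the Boolean case.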
In the Boolean case it follows from (2.98) that $c_i^* = 0$ for all $i \in N$ and since (2.100) does not apply (since k = 2 < 3) we have for all $c \in (0,1)^n$: $MI_2^c(p^*) \subseteq MI_1(p^*)$ as defined in (2.85). Moreover, $Q_2(p,p^*) = Q_1(p,p^*)$, because if $p_{i1}$ increases then $p_{i2}$ must decrease; hence for quadratic, Boolean A-polynomials $GRA_2^c(p^*) \subseteq GRA_1(p^*)$ for all $c \in (0,1)^n$.
Subsequently we will assume that c with $c_i^* < c_i < 1$ for all $i \in N$ is fixed and we construct the hyperplanes defining the set $GRA_2^c(p^*)$. Towards this end, let us use the same notation as before: Let L be the index set of all halfspaces in (2.102) defining $MI_2^c(p^*)$ and for any $l \in L$, $h_l(p) = \sum_{i=1}^{n} \sum_{r=1}^{k} a_{ir}p_{ir} + b$ such that
$$MI_2^c(p^*) = \{p \in \Delta_{p^*} \mid h_l(p) > 0,\ \forall l \in L\}. \tag{2.104}$$
Moreover, we define for $h_l$, $l \in L$ an index set
$$I_l := \{(i,r) \in N \times K \mid a_{i1} - a_{ir} < 0\} \tag{2.105}$$
and a linear function $s_l(p)$ by
$$s_l(p) := \sum_{(i,r) \notin I_l} a_{ir}p_{ir} + \sum_{(i,r) \in I_l} a_{i1}p_{ir} + b. \tag{2.106}$$
Using these definitions we will show in Proposition 2.55 that
$$GRA_2^c(p^*) = \{p \in \Delta_{p^*} \mid s_l(p) > 0,\ \forall l \in L\}.$$
Geometrically interpreted, $I_l$ corresponds exactly to those directions for which the angle between the normal vector of $h_l$ (pointing into $MI_2(p^*)$) and $e^{i1} - e^{ir}$ (defining $Q_2(p,p^*)$) is larger than 90°. These directions are the critical ones, in which the iteration sequence might leave $MI_2(p^*)$ if the step length is too large. In order to avoid such situations we adapt the coefficients in $h_l(p)$ for these directions, resulting in the definition of the linear functions $s_l(p)$ given by (2.106).
Example 2.54
We see in Figure 2.13 that on the one hand we have no critical directions for $h_1$ and therefore it need not be replaced (we set $s_1 := h_1$). On the other hand, for $h_2$ the positive $e^1$ direction is critical and there exist points $p \in MI_2(p^*)$ and $\lambda > 0$ such that $\tilde p := p + \lambda e^1 \notin MI_2(p^*)$. By replacing $h_2$ by $s_2$ we cut off all those critical points such that $MI_2(p^*)$ is reduced until finally only the desired set $GRA_2(p^*)$ remains.
Figure 2.13: Construction of the guaranteed region of attraction $GRA_2(p^*)$.
The proof of the following proposition is based on the same idea as already used in Proposition 2.52 for $GRA_1(p^*)$. We compute the extreme points of $Q_2(p,p^*)$ and then derive for each linear function $h_l(p)$ a condition which guarantees that $Q_2(p,p^*) \subseteq MI_2(p^*)$. It will turn out that this results exactly in the definition of the functions $s_l(p)$.
Proposition 2.55
Let z be a quadratic A-polynomial and p* be an attractor. Moreover, for a fixed c, let $MI_2^c(p^*)$ be given by (2.104) and $s_l(p)$ be defined by (2.106), then
$$GRA_2^c(p^*) = \{p \in \Delta_{p^*} \mid s_l(p) > 0,\ \forall l \in L\}. \tag{2.107}$$
Proof
Let $p \in MI_2(p^*)$ be given. Since $Q_2(p,p^*)$ is an affine linear image of a hypercube, every extreme point $p^U$ of $Q_2(p,p^*)$ is uniquely characterized by its zero-components (see also Figure 2.12(b)). Hence, we associate with each subset U of $N \times (K \setminus \{1\})$ an extreme point $p^U$ of $Q_2(p,p^*)$
$$p^U_{ir} := \begin{cases} p_{i1} + \sum_{(i,s) \in U} p_{is}, & \text{if } r = 1 \\ 0, & \text{if } (i,r) \in U \\ p_{ir}, & \text{otherwise} \end{cases}$$
whereby all extreme points of $Q_2(p,p^*)$ are characterized. For every U and any fixed $l \in L$ we get
$$h_l(p^U) = \sum_{i=1}^{n} a_{i1}p_{i1} + \sum_{(i,r) \in U} a_{i1}p_{ir} + \sum_{(i,r) \notin U,\ r>1} a_{ir}p_{ir} + b = \sum_{(i,r) \in U} a_{i1}p_{ir} + \sum_{(i,r) \notin U} a_{ir}p_{ir} + b. \tag{2.108}$$
Now, let us determine the extreme point $p^{U_l}$ for which (2.108) attains its minimum on $Q_2(p,p^*)$, i.e. $p^{U_l} := \arg\min_{p^U} h_l(p^U)$. Since $p_{ir}$ is fixed in (2.108), it follows immediately that
$$(i,r) \in U_l \iff a_{ir} > a_{i1}.$$
Now we see that $U_l = I_l$, the set of critical directions for $h_l$ as defined in (2.105), and therefore $h_l(p^{U_l}) = s_l(p)$ and (2.107) follows.
We have already seen that in the Boolean case $GRA_2^c(p^*) \subseteq GRA_1(p^*)$ for all $c \in (0,1)^n$. Finally we show with an example that depending on the choice of c there may also exist points $p \in GRA_2^c(p^*)$ which do not lie in $GRA_1(p^*)$ and hence neither region is always contained in the other.
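Example 2.56
Let n = 2, k = 3 and let the following A-polynomial z be given:
$$z(p) = 3p_{11}p_{21} + 2p_{11}p_{23} + p_{12}p_{22} + 2p_{11} + 2p_{12}p_{23} + p_{13}.$$
We see that z has a strict local maximum at $p^* = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}$. Moreover, its partial derivatives are given by
$$\Gamma(p) = \begin{pmatrix} 3p_{21} + 2p_{23} + 2 & p_{22} + 2p_{23} & 1 \\ 3p_{11} & p_{12} & 2p_{11} + 2p_{12} \end{pmatrix}.$$
Let us first compute $GRA_1(p^*)$ as described on page 66:

l | halfspace | $h_l$ for $MI_1(p^*)$ as in (2.89) | $s_l$ as in (2.91) | bound
1 | $\Gamma_{11} - \Gamma_{12} > 0$ | $3p_{21} - p_{22} + 2 > 0$ | $3p_{21} - (1 - p_{21}) + 2 > 0$ | $p_{21} > -\frac{1}{4}$
2 | $\Gamma_{11} - \Gamma_{13} > 0$ | $3p_{21} + 2p_{23} + 1 > 0$ | $3p_{21} + 1 > 0$ | $p_{21} > -\frac{1}{3}$
3 | $\Gamma_{21} - \Gamma_{22} > 0$ | $3p_{11} - p_{12} > 0$ | $3p_{11} - (1 - p_{11}) > 0$ | $p_{11} > \frac{1}{4}$
4 | $\Gamma_{21} - \Gamma_{23} > 0$ | $p_{11} - 2p_{12} > 0$ | $p_{11} - 2(1 - p_{11}) > 0$ | $p_{11} > \frac{2}{3}$

Since $p \in \Delta_{p^*}$ is equivalent to $p_{i1} > 0$, $i \in N$, we get
$$GRA_1(p^*) = \{p \in \Delta \mid p_{11} > \tfrac{2}{3},\ p_{21} > 0\}.$$
In order to compute $GRA_2^c(p^*)$, we first have to determine the bounds $c_i^*$ for $c_i$, $i \in N$, according to (2.98). We have $\Gamma(p^*) = \begin{pmatrix} 5 & 0 & 1 \\ 3 & 0 & 2 \end{pmatrix}$ and therefore $c_1^* = \frac{1}{5}$ and $c_2^* = \frac{2}{3}$.
The inequalities $h_l(p) > 0$ characterizing $MI_2^c(p^*)$ which correspond to (2.101) are the same as for $GRA_1(p^*)$. However, since in all cases the first coefficient is the largest, we get by (2.105) $I_l = \emptyset$ and therefore $s_l(p) = h_l(p)$ for l = 1,...,4.
For the other inequalities in $MI_2^c(p^*)$, corresponding to (2.100), we get:

l | halfspace | $h_l$ for $MI_2^c(p^*)$ as in (2.104)
5 | $c_1\Gamma_{11} + (1 - c_1)\Gamma_{12} > \Gamma_{13}$ | $3c_1p_{21} + (1 - c_1)p_{22} + 2p_{23} + 2c_1 - 1 > 0$
6 | $c_1\Gamma_{11} + (1 - c_1)\Gamma_{13} > \Gamma_{12}$ | $3c_1p_{21} - p_{22} + 2(c_1 - 1)p_{23} + 1 + c_1 > 0$
7 | $c_2\Gamma_{21} + (1 - c_2)\Gamma_{22} > \Gamma_{23}$ | $(3c_2 - 2)p_{11} - (1 + c_2)p_{12} > 0$
8 | $c_2\Gamma_{21} + (1 - c_2)\Gamma_{23} > \Gamma_{22}$ | $(2 + c_2)p_{11} + (1 - 2c_2)p_{12} > 0$

Choosing $c_1 := 0.5 > c_1^*$ and $c_2 := 0.9 > c_2^*$, we guarantee that the first coefficient of the linear functions $h_6$, $h_7$ and $h_8$ is always the largest and thus $s_l(p) = h_l(p)$ for l = 6, 7, 8. For l = 5 we get
$$h_5(p) = \tfrac{3}{2}p_{21} + \tfrac{1}{2}p_{22} + 2p_{23} \quad\text{and}\quad s_5(p) = \tfrac{3}{2}p_{21} + \tfrac{1}{2}p_{22} + \tfrac{3}{2}p_{23}.$$
Choosing
$$p = \begin{pmatrix} 0.6 & 0.1 & 0.3 \\ 0.9 & 0.05 & 0.05 \end{pmatrix}$$
it can be verified that $s_l(p) > 0$ for l = 1,...,8 and since p also fulfills (2.99) it lies for c = (0.5, 0.9) in $GRA_2^c(p^*)$. However, since $p_{11} = 0.6 < \frac{2}{3}$, the point p does not lie in $GRA_1(p^*)$.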
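The verification left to the reader in this example can be carried out mechanically. The sketch below encodes the eight halfspaces by their coefficients as in (2.90) and applies (2.91) and (2.106); the function names and the coefficient layout (one list per row of p, column 0 being the component with $p^*_{i1} = 1$) are illustrative assumptions, not part of the original text.

```python
def s1(a, b, p):
    """s_l(p) as in (2.91), used for GRA_1."""
    return b + sum(ar[0] * pr[0] + (1 - pr[0]) * min(ar[1:]) for ar, pr in zip(a, p))


def s2(a, b, p):
    """s_l(p) as in (2.106): coefficients of critical directions are replaced by a_i1."""
    total = b
    for ar, pr in zip(a, p):
        for coef, x in zip(ar, pr):
            total += (ar[0] if ar[0] - coef < 0 else coef) * x
    return total


def halfspaces_2101():
    """h_1..h_4: Gamma_i1 - Gamma_ir > 0 for the polynomial of this example."""
    zero = [0, 0, 0]
    return [([zero, [3, -1, 0]], 2), ([zero, [3, 0, 2]], 1),
            ([[3, -1, 0], zero], 0), ([[1, -2, 0], zero], 0)]


def halfspaces_2100(c1, c2):
    """h_5..h_8: the inequalities (2.100) for given c = (c1, c2)."""
    zero = [0, 0, 0]
    return [([zero, [3 * c1, 1 - c1, 2]], 2 * c1 - 1),
            ([zero, [3 * c1, -1, 2 * (c1 - 1)]], 1 + c1),
            ([[3 * c2 - 2, -(1 + c2), 0], zero], 0),
            ([[2 + c2, 1 - 2 * c2, 0], zero], 0)]


def in_gra1(p):
    return all(row[0] > 0 for row in p) and all(s1(a, b, p) > 0 for a, b in halfspaces_2101())


def in_gra2(p, c):
    hs = halfspaces_2101() + halfspaces_2100(*c)
    return all(row[0] >= ci for row, ci in zip(p, c)) and all(s2(a, b, p) > 0 for a, b in hs)


p = [[0.6, 0.1, 0.3], [0.9, 0.05, 0.05]]
print(in_gra1(p), in_gra2(p, (0.5, 0.9)))
```

Conversely, a point such as $\begin{pmatrix} 0.7 & 0.2 & 0.1 \\ 0.5 & 0.3 & 0.2 \end{pmatrix}$ passes the $GRA_1$ test but violates (2.99) for c = (0.5, 0.9), which illustrates that neither region contains the other.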
2.5 A Region of Attraction for $F[\xi]$
In the previous section we have dealt with the problem of finding and characterizing a subset of the universal region of attraction for operator $F[\Gamma]$. Now we will investigate the regions of attraction in a more general setting for operators $F[\xi]$. Again we will be able to define a subset of the region of attraction of p* for $F[\xi]$, called PRA(z,p*). This region is described by a set of inequalities and can therefore also be used as a stopping criterion in FPH. At the end of this section we compare for A-polynomials and operator $F[\Gamma]$ the regions PRA(z,p*) and $GRA_1(p^*)$.
Proposition 2.57
For all $i \in N$, $r \in K$ let $\xi_{ir}: \mathbb{R}^{nk} \to \mathbb{R}$ be polynomials with positive coefficients and constant, $F := F[\xi]$ as in (2.5), $p^* \in \Delta^I$ with $p^*_{i s_i(p^*)} = 1$ and $p^0 \in \Delta^0$ with $p^0_{i s_i(p^*)} > p^0_{ir}$ for all $i \in N$, $r \ne s_i(p^*)$ be given. Moreover let for $i \in N$, $r \in K$, $\xi_{ir} = \xi'_{ir} + \xi''_{ir}$, where $\xi'_{ir}$ consists of the constant and those monomials in $\xi_{ir}$ for which all variables have second index equal to $s_i(p^*)$. Then a sufficient condition for convergence of the sequence $\{F^t(p^0)\}$, $t \ge 0$, to p* is given by
$$\xi'_{i s_i(p^*)}(p^0) > \xi'_{ir}(p^*) + \xi''_{ir}(q) \quad \forall i \in N,\ r \ne s_i(p^*) \tag{2.109}$$
where for $i \in N$: $q_{i s_i(p^*)} = 1$ and $q_{ir} = 1 - p^0_{i s_i(p^*)}$ for all $r \ne s_i(p^*)$.
Proof
W.l.o.g. we assume $s_i(p^*) = 1$ for all $i \in N$, i.e. $p^*_{i1} = 1$ and $p^0_{i1} > p^0_{ir}$ for all r > 1. Moreover, note that $\xi_{ir}(p^*) = \xi'_{ir}(p^*)$ for all $i \in N$, $r \in K$. Throughout this proof we will write $p^t := F^t(p^0)$.
Furthermore we use the following two inequalities. For all $i \in N$, $r \in K$
$$\xi'_{ir}(p^*) \ge \xi'_{ir}(p) \quad \forall p \in \Delta^0 \tag{2.110}$$
$$\xi''_{ir}(q) \ge \xi''_{ir}(p') \quad \forall p' \in \Delta^0 \text{ with } p'_{i1} \ge p^0_{i1}. \tag{2.111}$$
(2.110) follows directly from the fact that $\xi'_{ir}(p)$ contains only variables with second index equal to 1 and therefore it attains its maximum over $\Delta$ in p*. (2.111) is true, because $q_{i1} = 1$ is always larger than $p'_{i1}$ and for all $p' \in \Delta^0$ with $p'_{i1} \ge p^0_{i1}$:
$$q_{ir} = 1 - p^0_{i1} = \sum_{s>1} p^0_{is} \ge \sum_{s>1} p'_{is} \ge p'_{ir} \quad \forall i \in N,\ r > 1. \tag{2.112}$$
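Finally we will use the following property which holds for all $i \in N$ and $p \in \Delta^0$:
$$\xi_{i1}(p) \ge \xi_{ir}(p)\ \ \forall r > 1 \implies F(p)_{i1} = \frac{\xi_{i1}(p)\,p_{i1}}{\sum_{s \in K} \xi_{is}(p)\,p_{is}} \ge \frac{\xi_{i1}(p)\,p_{i1}}{\xi_{i1}(p) \sum_{s \in K} p_{is}} = p_{i1}. \tag{2.113}$$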
To prove that (2.109) is a sufficient condition for the convergence we show by induction over t that for all $i \in N$
$$\xi_{i1}(p^t) > \xi_{ir}(p^t) \quad \forall r > 1,\ t \ge 0 \tag{2.114}$$
$$p^{t-1}_{i1} \le p^t_{i1} \quad \forall t \ge 1. \tag{2.115}$$
To show the induction basis we need the following two bounds for all $i \in N$
$$\xi'_{i1}(p^0) \le \xi_{i1}(p^0) \tag{2.116}$$
$$\xi'_{ir}(p^*) + \xi''_{ir}(q) \ge \xi'_{ir}(p^0) + \xi''_{ir}(p^0) = \xi_{ir}(p^0) \quad \forall r > 1. \tag{2.117}$$
The lower bound (2.116) holds, because $\xi''_{i1}(p^0) \ge 0$. The upper bound (2.117) follows directly by summation of (2.110) and (2.111), both applied to $p^0$.
Combining the two bounds (2.116) and (2.117) with the stopping criterion (2.109) we get the induction basis (2.114) for t = 0:
$$\xi_{i1}(p^0) \ge \xi'_{i1}(p^0) > \xi'_{ir}(p^*) + \xi''_{ir}(q) \ge \xi_{ir}(p^0) \quad \forall i \in N,\ r > 1. \tag{2.118}$$
By definition of F this implies on the one hand that $p^1_{i1} > p^1_{ir}$ for all $i \in N$, r > 1 and on the other hand it follows from (2.113) that $p^1_{i1} \ge p^0_{i1}$.
In the induction step we show that under the assumption that (2.114) and (2.115) hold for $t' \le t$, they also hold for t + 1. From (2.114) and (2.113) it follows immediately that (2.115) is fulfilled for t + 1, i.e. $p^t_{i1} \le p^{t+1}_{i1}$. It remains to show that the following two bounds hold:
$$\xi'_{i1}(p^0) \le \xi_{i1}(p^{t+1}) \quad \forall i \in N, \tag{2.119}$$
$$\xi'_{ir}(p^*) + \xi''_{ir}(q) \ge \xi_{ir}(p^{t+1}) \quad \forall i \in N,\ r > 1. \tag{2.120}$$
The lower bound (2.119) follows from
$$\xi'_{i1}(p^0) \le \xi'_{i1}(p^t) \le \xi'_{i1}(p^{t+1}) \le \xi_{i1}(p^{t+1}) \quad \forall i \in N$$
because $\xi'_{i1}(p^{t+1})$ contains only variables with second index equal to 1, (2.115) for $t' \le t$ and $\xi''_{i1}(p^{t+1}) \ge 0$. The upper bound (2.120) is a direct consequence of (2.110) and (2.111).
Using (2.109), (2.119) and (2.120) we have for all $i \in N$
$$\xi_{i1}(p^{t+1}) \ge \xi'_{i1}(p^0) > \xi'_{ir}(p^*) + \xi''_{ir}(q) \ge \xi_{ir}(p^{t+1}) \quad \forall r > 1 \tag{2.121}$$
which proves (2.114) for t + 1 and completes the induction step.
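To guarantee convergence of the sequence $\{p^t\}$, $t \ge 0$, we will show that for some $\delta > 0$ either the quotient $p^{t+1}_{i1} / p^t_{i1}$ is bounded from below by a constant $K_i(\delta) > 1$ or $p^t_{i1} \ge 1 - \delta$ for all $i \in N$. Let $\delta > 0$ be given and let us assume that there exists $i \in N$ such that for some t, $p^t_{i1} < p^*_{i1} - \delta = 1 - \delta$. Moreover, let $\xi_{i1}(p) \le M_i$ for all $p \in \Delta$ and
$$\varepsilon_i := \min_{r>1} \big( \xi'_{i1}(p^0) - \xi'_{ir}(p^*) - \xi''_{ir}(q) \big).$$
We see that by (2.109) $\varepsilon_i > 0$ and that (2.121) implies
$$\xi_{i1}(p^t) \ge \xi_{ir}(p^t) + \varepsilon_i \quad \forall r > 1. \tag{2.122}$$
Using (2.122) it follows that the step length for i is bounded:
$$\frac{p^{t+1}_{i1}}{p^t_{i1}} = \frac{\xi_{i1}(p^t)}{\sum_{s=1}^{k} p^t_{is}\xi_{is}(p^t)} \ge \frac{\xi_{i1}(p^t)}{p^t_{i1}\xi_{i1}(p^t) + (1 - p^t_{i1})(\xi_{i1}(p^t) - \varepsilon_i)} = \frac{\xi_{i1}(p^t)}{\xi_{i1}(p^t) - \varepsilon_i + \varepsilon_i p^t_{i1}} \ge \frac{M_i}{M_i - \varepsilon_i + \varepsilon_i(1 - \delta)} = \frac{M_i}{M_i - \varepsilon_i\delta} =: K_i(\delta) > 1$$
which concludes the proof of the proposition.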
If for some starting point $p^0 \in \Delta^0$ the sequence $\{F^t(p^0)\}$, $t \ge 0$, converges to $p^* \in \Delta^I$ and furthermore $\xi_{i1}(p^*) > \xi_{ir}(p^*)$ for all $i \in N$ and r > 1, simple continuity arguments show that there exists $t \ge 0$ and $p = F^t(p^0)$ for which (2.109) is fulfilled. Then p* is an attractor and (2.109) describes a subset of RA(z,p*) which we denote by
$$PRA(z,p^*) := \{p \in \Delta \mid p_{i1} > p_{ir},\ \xi'_{i1}(z,p) > \xi'_{ir}(z,p^*) + \xi''_{ir}(z,q),\ \forall i \in N,\ r > 1\}.$$
Thus we have a criterion which guarantees convergence and therefore could be used as part of a stopping criterion in an implementation. Since the conditions on $\xi_{ir}$ in Proposition 2.57 are rather weak, PRA(z,p*) could be used in a much more general context than only for SAPs with operator $F[\Gamma]$. On the other hand, we have defined in the previous section for SAPs the guaranteed region of attraction $GRA_1(p^*)$. Now we will briefly investigate the relationship between these two regions of attraction.
Let z be an A-polynomial with positive coefficients and $F := F[\Gamma]$. Moreover, let $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for all $i \in N$ be a strict local maximum. Then
$$PRA(z,p^*) = \{p \in \Delta \mid p_{i1} > p_{ir},\ \Gamma'_{i1}(z,p) > \Gamma'_{ir}(z,p^*) + \Gamma''_{ir}(z,q),\ \forall i \in N,\ r > 1\} \tag{2.123}$$
where for all $i \in N$, $q_{i1} = 1$, $q_{ir} = 1 - p_{i1}$ for all r > 1 and $\Gamma_{ir}(z,p) = \Gamma'_{ir}(z,p) + \Gamma''_{ir}(z,p)$ is the decomposition as in Proposition 2.57.
The following example shows that PRA(z,p*) is, in contrast to $GRA_1(p^*)$, not form-independent.
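Example 2.58
Let $z_1, z_2$ with $z_1 \equiv z_2$ be the following two equivalent quadratic, Boolean A-polynomials with positive $\Gamma_{ir}$-coefficients and let $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$.
• $z_1(p) = 2p_{11}p_{21} + p_{12}p_{22}$, and we get for its partial derivatives:
$$\Gamma(z_1,p) = \begin{pmatrix} 2p_{21} & p_{22} \\ 2p_{11} & p_{12} \end{pmatrix}, \quad \Gamma'(z_1,p) = \begin{pmatrix} 2p_{21} & 0 \\ 2p_{11} & 0 \end{pmatrix}, \quad \Gamma''(z_1,p) = \begin{pmatrix} 0 & p_{22} \\ 0 & p_{12} \end{pmatrix}.$$
Computing $PRA(z_1,p^*)$ according to (2.123) and taking into account that $p_{i1} > p_{i2}$ for all $i \in N$, we get:
i = 1: $2p_{21} > 0 + q_{22} \implies p_{21} > \frac{1}{3}$
i = 2: $2p_{11} > 0 + q_{12} \implies p_{11} > \frac{1}{3}$
$$PRA(z_1,p^*) = \{p \in \Delta \mid p_{11} > \tfrac{1}{2},\ p_{21} > \tfrac{1}{2}\}.$$
• $z_2(p) = 5p_{11}p_{21} + 3p_{11}p_{22} + p_{12}p_{22} + 3p_{12} - 3$, and we have
$$\Gamma(z_2,p) = \begin{pmatrix} 5p_{21} + 3p_{22} & p_{22} + 3 \\ 5p_{11} & 3p_{11} + p_{12} \end{pmatrix}, \text{ with } \Gamma'(z_2,p) = \begin{pmatrix} 5p_{21} & 3 \\ 5p_{11} & 3p_{11} \end{pmatrix}, \quad \Gamma''(z_2,p) = \begin{pmatrix} 3p_{22} & p_{22} \\ 0 & p_{12} \end{pmatrix}.$$
Again we can compute $PRA(z_2,p^*)$ and get
i = 1: $5p_{21} > 3 + q_{22} \implies p_{21} > \frac{2}{3}$
i = 2: $5p_{11} > 3 + q_{12} \implies p_{11} > \frac{2}{3}$
$$PRA(z_2,p^*) = \{p \in \Delta \mid p_{11} > \tfrac{2}{3},\ p_{21} > \tfrac{2}{3}\}.$$
We see that though $z_1 \equiv z_2$, both functions have different regions PRA, which therefore depend on the form of z. Comparing these regions to $GRA_1(p^*) = MI_1(p^*) = \{p \in \Delta \mid p_{11} > \frac{1}{3},\ p_{21} > \frac{1}{3}\}$ we observe that in this example $PRA(z_1,p^*)$ and $PRA(z_2,p^*)$ are subsets of $GRA_1(p^*)$.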
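The form dependence of PRA can be made concrete by evaluating the conditions of (2.123) for both forms at the same point. The following sketch hard-codes the decomposition of the partial derivatives for the two forms of this example; the function names are hypothetical and a point is given by its first components $(p_{11}, p_{21})$.

```python
def in_pra_z1(p11, p21):
    """Condition (2.123) for the form z_1(p) = 2*p11*p21 + p12*p22."""
    q12, q22 = 1 - p11, 1 - p21
    dominant = p11 > 0.5 and p21 > 0.5  # p_i1 > p_i2
    return dominant and 2 * p21 > 0 + q22 and 2 * p11 > 0 + q12


def in_pra_z2(p11, p21):
    """Condition (2.123) for the form z_2(p) = 5*p11*p21 + 3*p11*p22 + p12*p22 + 3*p12 - 3."""
    q12, q22 = 1 - p11, 1 - p21
    dominant = p11 > 0.5 and p21 > 0.5
    return dominant and 5 * p21 > 3 + q22 and 5 * p11 > 3 + q12


def in_gra1(p11, p21):
    """GRA_1(p*) = MI_1(p*) for this Boolean example."""
    return p11 > 1 / 3 and p21 > 1 / 3


for point in [(0.4, 0.4), (0.6, 0.6), (0.7, 0.7)]:
    print(point, in_pra_z1(*point), in_pra_z2(*point), in_gra1(*point))
```

The point (0.6, 0.6) is accepted for $z_1$ but rejected for the equivalent form $z_2$, while all three points lie in the form-independent region $GRA_1(p^*)$.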
Since on the one hand PRA(z,p*) depends on the form of z and on the other hand, in general, $PRA(z,p^*) \ne GRA_1(p^*)$, the question arises whether $PRA(\tilde z,p^*) \subseteq GRA_1(p^*)$ for all $\tilde z \equiv z$. An answer to this question is given by the following corollary.
Corollary 2.59
Let $z \in P_F$ be an A-polynomial and $F := F[\Gamma]$. Then for every strict local maximum $p^* \in \Delta^I$
$$PRA(\tilde z,p^*) \subseteq GRA_1(p^*)$$
for all forms $\tilde z \in P_F$ with $\tilde z \equiv z$.
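Proof
Let $p \in PRA(\tilde z,p^*)$. If we assume w.l.o.g. that $p^*_{i1} = 1$ for all $i \in N$ then we have $\Gamma'_{i1}(p) > \Gamma'_{ir}(p^*) + \Gamma''_{ir}(q)$ for all $i \in N$, r > 1. It follows from (2.118) that in this case $\Gamma_{i1}(p) > \Gamma_{ir}(p)$ for all $i \in N$, r > 1 and therefore $p \in MI_1(p^*)$. Now, let $\hat p \in Q_1(p,p^*)$. Then $\hat p_{i1} \ge p_{i1}$ for all $i \in N$ and by (2.110) and (2.111) it follows that $\Gamma'_{ir}(p^*) + \Gamma''_{ir}(q) \ge \Gamma_{ir}(\hat p)$. On the other hand we have $\Gamma'_{i1}(\hat p) \ge \Gamma'_{i1}(p)$. Thus we get
$$\Gamma_{i1}(\hat p) \ge \Gamma'_{i1}(\hat p) \ge \Gamma'_{i1}(p) > \Gamma'_{ir}(p^*) + \Gamma''_{ir}(q) \ge \Gamma_{ir}(\hat p)$$
which implies that $\hat p \in MI_1(p^*)$ and therefore $Q_1(p,p^*) \subseteq MI_1(p^*)$. Moreover we have for $p \in PRA(\tilde z,p^*)$: $p_{i1} > 0$ for all $i \in N$ and thus $p \in \Delta_{p^*}$. Now the form-independence of $GRA_1(p^*)$ implies immediately that $PRA(\tilde z,p^*) \subseteq GRA_1(p^*)$ for all $\tilde z \in P_F$ with $\tilde z \equiv z$.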
2.6 Computational Experiments with the Max-Cut Problem
The Max-Cut problem was already discussed in Section 1.3 on page 12: Given a
graph $G = (N, E)$ and edge weights $w_{ij}$ for $[i,j] \in E$, the Max-Cut problem consists in finding
\[ \max_{S \subseteq N} \sum_{[i,j] \in E,\ i \in S,\ j \notin S} w_{ij}. \]
We know already that if all edge weights $w_{ij}$, $[i,j] \in E$, are positive, then this problem can be formulated as a quadratic SAP. In this case we define $K := \{1, 2\}$ with the
interpretation $p_{i1} = 1, p_{i2} = 0$ if node $i$ belongs to $S$ and $p_{i1} = 0, p_{i2} = 1$ if $i \in N \setminus S$.
The objective function can be written as the following $\Delta$-polynomial
\[ z(p) = \sum_{[i,j] \in E} w_{ij} (p_{i1}p_{j2} + p_{j1}p_{i2}). \]
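The equivalence between the cut weight and the quadratic objective can be checked numerically. The following is a minimal sketch, not code from the thesis: the edge-list representation, the 0-based indices and the function names are assumptions made for illustration.

```python
def cut_weight(edges, in_s):
    """Weight of the cut: sum of w_ij over edges with exactly one endpoint in S."""
    return sum(w for (i, j, w) in edges if in_s[i] != in_s[j])


def z_maxcut(edges, p):
    """Quadratic objective z(p) = sum w_ij (p_i1 p_j2 + p_j1 p_i2).

    Column 0 of p plays the role of r = 1 (node in S), column 1 of r = 2.
    """
    return sum(w * (p[i][0] * p[j][1] + p[j][0] * p[i][1]) for (i, j, w) in edges)


def indicator(in_s):
    """Integer point of Delta^I encoding the node set S."""
    return [[1.0, 0.0] if s else [0.0, 1.0] for s in in_s]


# Hypothetical triangle with weights 3, 5, 2 and S = {0, 2}.
edges = [(0, 1, 3.0), (1, 2, 5.0), (0, 2, 2.0)]
in_s = [True, False, True]
value = z_maxcut(edges, indicator(in_s))
```

On integer points both functions agree, which is what allows the relaxed polynomial to stand in for the combinatorial objective.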
To solve the Max-Cut problem we used FPH as described on page 16. In more
detail, the implemented heuristic runs as follows:
1. Choose $m$ starting points with components randomly drawn from the interval
[0.4995, 0.5005].
2. For each starting point iterate $F[\xi]$ until one of the following three stopping criteria is fulfilled:
(a) The current point lies in the guaranteed region of attraction $GRA_1(p^*)$ for some $p^* \in \Delta^I$.
(b) For all $i \in N$ the largest component of the current point $p$ satisfies
$\max\{p_{i1}, p_{i2}\} \geq 1 - 10^{-5}$.
(c) The number of iterations reaches a given iteration limit.
In (2b) and (2c) the components of the final point $p$ of the iteration sequence
are rounded to $p^*_\nu \in \Delta^I$ for $\nu = 1, \ldots, m$.
3. Return as solution $p^* := p^*_{\nu^*}$ with $z(p^*_{\nu^*}) = \max\{ z(p^*_\nu) \mid \nu \in \{1, \ldots, m\} \}$.
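The three steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis implementation: it uses the fitness function $\xi = \Gamma^\alpha$ with the update $p_{ir} \leftarrow p_{ir}\xi_{ir}(p) / \sum_s p_{is}\xi_{is}(p)$, it omits stopping criterion (2a) entirely, and the dense weight matrix and all function names are hypothetical.

```python
import random


def gamma(weights, p):
    """Partial derivatives of z: Gamma_i1 = sum_j w_ij p_j2, Gamma_i2 = sum_j w_ij p_j1."""
    n = len(p)
    return [[sum(weights[i][j] * p[j][1] for j in range(n)),
             sum(weights[i][j] * p[j][0] for j in range(n))] for i in range(n)]


def step(weights, p, alpha):
    """One simultaneous application of F[Gamma^alpha]."""
    new_p = []
    for (a, b), (ga, gb) in zip(p, gamma(weights, p)):
        xa, xb = a * ga ** alpha, b * gb ** alpha
        total = xa + xb
        # An isolated node has zero fitness; leave its row unchanged.
        new_p.append([xa / total, xb / total] if total > 0 else [a, b])
    return new_p


def cut_value(weights, in_s):
    n = len(in_s)
    return sum(weights[i][j] for i in range(n) for j in range(i + 1, n)
               if in_s[i] != in_s[j])


def fph_maxcut(weights, m=10, alpha=0.5, itlimit=1000, seed=0):
    rng = random.Random(seed)
    n = len(weights)
    best_value, best_side = -1.0, None
    for _ in range(m):
        # Step 1: starting point close to the barycenter.
        p = []
        for _ in range(n):
            a = rng.uniform(0.4995, 0.5005)
            p.append([a, 1.0 - a])
        # Step 2: iterate until criterion (2b) or (2c) holds.
        for _ in range(itlimit):
            p = step(weights, p, alpha)
            if all(max(row) >= 1 - 1e-5 for row in p):
                break
        in_s = [row[0] >= row[1] for row in p]  # rounding
        value = cut_value(weights, in_s)
        # Step 3: keep the best rounded point over all starting points.
        if value > best_value:
            best_value, best_side = value, in_s
    return best_value, best_side
```

The guaranteed-region-of-attraction test (2a) would slot into the inner loop as an additional early exit.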
We first tested the algorithm for Max-Cut instances on randomly generated graphs with $n = 50$ nodes, edge probability of 0.2 and edge weights $w_{ij}$, $[i,j] \in E$,
uniformly distributed in the interval [1, 100]. These problems can be solved
optimally using an interior point algorithm in a branch and bound framework and
therefore the optima are known [Bur94, HRVW96].
In our tests we set the iteration limit to 1000 and used as fitness function $\xi := \Gamma^\alpha$
with $\alpha \in \{0.5, 1\}$. Experiments have shown that $F[\Gamma^{0.5}]$ gave much better results
than $\alpha = 1$, so we present only these results here.
[Figure 2.14, panels (a) Problem 1 and (b) Problem 2; x-axis: number of starting points (10 to 50), y-axis: percentage.]
Figure 2.14: Iterations vs. quality: solid=min, dashed=avg.
We first consider the qualitative behavior of the algorithm depending on the
number of starting points $m$. We solved ten instances, each 20 times (= 20 runs) for $m = 5, 10, \ldots, 50$ starting points. For each run we generated a list of 50 starting points, where the run with $m$ starting points uses the first $m$ generated points of this list. Figure 2.14 shows for two typical instances how the relative error
(optimum − solution)/optimum of the minimum (solid line) and average (dashed line) of the 20 runs
decreases as the number of starting points (depicted along the x-axis) increases.
The maximum of the 20 runs is not mentioned here, since for all $m$ the optimum of
the Max-Cut problem was found.
Of course, the quality increases with the number of starting points; however, already with 10 starting points the minimum over all runs gave results within 2% of the
optimum. Therefore we have chosen $m = 10$ for the following tests.
Table 2.1 shows the results for $\alpha = 0.5$, $m = 10$ for 10 instances P1, ..., P10. Again each instance is solved 20 times (= 20 runs). The first column ('worst') of the
table gives the relative error of the worst solution out of the 20 runs, the second
column ('average') shows the average of the relative error of the 20 runs and the
last column ('#opt') how often the optimum was found. The last two rows give the
average and the worst case over the 10 instances for each column. One run took
about 5 seconds on a Pentium Pro 200 processor.
             worst    average   #opt
P1           1.20%    0.12%     18
P2           1.27%    0.54%     8
P3           1.59%    0.36%     6
P4           0.60%    0.08%     14
P5           0.26%    0.05%     17
P6           1.29%    0.44%     8
P7           1.08%    0.35%     9
P8           0.68%    0.12%     13
P9           0.00%    0.00%     20
P10          0.94%    0.36%     8
average      0.89%    0.24%     12.1
worst case   1.59%    0.54%     6
Table 2.1: Results for 10 Max-Cut instances.
We see that the results of Figure 2.14 are confirmed: in all runs we were at most
2% off the optimum and on average less than 0.25%. Moreover, in all cases the
optimum has been found in these 20 runs.
Stopping criterion (2a) tests whether the sequence of generated points has reached
the guaranteed region of attraction of a strict local maximum. In order to reduce
computing time, the algorithm checks only every 10th iteration whether the current point already lies in the guaranteed region of attraction of a point $p^* \in \Delta^I$. Moreover, this test is only carried out if the sequence is 'close' to an integer point, where
'close' means that for all $i \in N$ the largest component satisfies $\max\{p_{i1}, p_{i2}\} > 0.8$. These
region of attraction tests are time consuming; however, they sometimes make it possible to
determine quite early and with certainty to which local maximum the algorithm
converges. Compared to the case where the guaranteed region of attraction part of the stopping criteria (2a) is omitted, the region of attraction criterion gives a
total speed-up of the algorithm of about 80%. Hence this criterion not only has the
advantage of being a 'clean' stopping criterion, where we know precisely to which
point the algorithm converges, but it also has a practical running time advantage.
In further experiments, we tried to construct a new fitness function $\xi$ for the
operator $F[\xi]$, with the goal of shrinking the regions of attraction of already visited
points in $\Delta^I$. Assume that we have already explored the first $m'$ starting points
leading to solutions $p^1, \ldots, p^{m'} \in \Delta^I$. We try to influence the behavior of operator
$F[\xi]$ by changing the functions $\xi_{ir}$ for all $i \in N, r \in K$ in order to get a smaller
region of attraction for $p^1, \ldots, p^{m'}$. The idea is the following: Assume that (after
some iterations) the algorithm is in a point $q$ and we do not want to visit $p^1$ again. Now, if for an $i \in N$ we have $p^1_{i1} = 1$ and $\Gamma_{i1}(q) > \Gamma_{i2}(q)$, then the first component will
increase, i.e. $F(q)_{i1} > q_{i1}$, and potentially we are approaching $p^1$. To minimize this
chance we want to make only a small step in this direction, which can be achieved
by taking for example $\xi_{ir} = (\Gamma_{ir} + \kappa_i)^\alpha$, where the larger the constant $\kappa_i$ is chosen, the smaller the step becomes. For a given $i \in N$, any $p^\nu$, $\nu \in \{1, \ldots, m'\}$, for which
the $i$-th component of $F(q)$ gets closer to $p^\nu$ will contribute to $\kappa_i$, and the nearer $q$
is to $p^\nu$, the larger is this contribution.
Formally, let $p^1, \ldots, p^{m'}$ be already visited points in $\Delta^I$ and let $i \in N$ be fixed.
Define for a point $q \in \Delta$
\[ I_{i1}(q) := \{ \nu \in \{1, \ldots, m'\} \mid \Gamma_{i1}(q) > \Gamma_{i2}(q) \text{ and } p^\nu_{i1} = 1 \}, \]
\[ I_{i2}(q) := \{ \nu \in \{1, \ldots, m'\} \mid \Gamma_{i2}(q) > \Gamma_{i1}(q) \text{ and } p^\nu_{i2} = 1 \}. \]
In order to define a growth transformation $F[\xi]$, we require that the fitness function
$\xi = (\xi_{ir})$ satisfies the assumptions of Theorem 2.26. We define $\xi_{ir} = G_i(\Gamma_{ir})$, $i \in N, r \in K$, where the $G_i$ are strictly increasing, concave functions, by
\[ \xi_{ir}(q) := \big( \Gamma_{ir}(q) + 2|I_{i1}(q)|\, q_{i1} + 2|I_{i2}(q)|\, q_{i2} \big)^\alpha \quad \forall i \in N, r \in K \]
and call FPH with this new fitness function 'point repulsion' (PR). We solved
the same 10 instances with operator $F[\xi]$ as defined above and used the same
parameters as before. Again each instance was solved 20 times. In general, the
global maximum was found more often with PR than before; however, the worst
and the average values of the 20 runs were about the same in both versions. Since
the computing time for PR is about four times higher than without it, it is more
profitable (for a fixed amount of computing time) to work with more starting points without PR than to use PR.
Finally we solved larger instances of Max-Cut problems using the test set proposed in [Bur94]. These graphs are chosen from the following three different data sets:
Data set A(prob): Unweighted graphs with edge probability prob.
Data set B(cmax): Complete graphs with random integer edge weights in
[0, cmax].
Data set C(cmax): Planar graphs with random integer edge weights in [1, cmax].
Note that large problems with more than 100 nodes often cannot be solved optimally.
For this reason we compared the results found by FPH with strong upper
bounds derived by a semi-definite programming relaxation [Bur94, HRVW96]. This
comparison is summarized in Table 2.2. The first two columns specify the problem
type and size, whereas the next three depict the minimum, average and maximum objective
value obtained by FPH for solving each instance from 20 different starting points. The column entitled 'bound' gives the value of the best upper bound
found for the instance, where a star beside the number indicates a proven optimal value. The next column shows either the percentage gap between the best FPH
solution and the upper bound or, if the optimum is known and was also found by FPH, how often it was found (number in brackets). Finally, the last column refers
to the time (minutes and seconds) needed by FPH to solve one Max-Cut instance
for one starting point on a Sun Ultra Sparc workstation.
type      nodes   min     avg     max     bound    gap/opt   time
A(0.1)    100     335     338     341     341*     (1)       0:11
A(0.25)   100     776     779     781     782      0.13      0:09
A(0.5)    100     1420    1424    1427    1427*    (2)       0:09
B(10)     100     13577   13591   13602   13608*   0.04      0:09
C(1)      100     196     196     196     196*     (20)      0:10
C(10)     100     1106    1114    1122    1122*    (2)       0:08
A(0.5)    150     3114    3120    3127    3135     0.26      0:21
A(0.5)    200     5491    5500    5511    5543     0.58      0:37
A(0.5)    250     8450    8462    8471    8523     0.61      0:58
A(0.5)    300     12138   12165   12193   12262    0.56      1:24
A(0.5)    400     21298   21327   21369   21498    0.60      2:31
A(0.5)    500     33055   33086   33122   33343    0.66      3:56
Table 2.2: Results for large Max-Cut instances.
Using branch and bound techniques we could prove that all upper bounds in Table 2.2
are at most 0.5% off the optimum. Moreover, we see that all but one of the
instances with 100 nodes could be solved optimally within the 20 runs, provided the
optimum was known. Even for the larger instances with $n = 150, \ldots, 500$ nodes
the results of FPH always lie within 0.7% of the upper bound. Together with the given tolerance of 0.5% of the upper bounds, these results verify once more that even for
large Max-Cut problems the SAP model together with FPH is a successful approach to this hard combinatorial optimization problem.
Chapter 3
The Constrained Semi-Assignment Problem
3.1 Introduction and Algorithm for C-SAP
In contrast to the previous chapter, we now discuss an extension of the SAP
with added constraints. These constraints are given in the form of a set $\mathcal{R}$, the so-called
forbidden partial assignments, which must not be satisfied (see Definition 1.1).
Let us keep our notation of the previous chapters and denote the decision variables
by $x_i$, $i \in N = \{1, \ldots, n\}$, and the set of values to be assigned by $K = \{1, \ldots, k\}$. We recall from Chapter 1 that the C-SAP is defined as follows:
Let $(n, k, \mathcal{T}, w)$ be a SAP instance and let additionally a set of forbidden partial assignments $\mathcal{R}$ be given. Then the C-SAP $(n, k, \mathcal{R}, \mathcal{T}, w)$ is given by
\[ \max\ z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) \tag{3.1} \]
\[ \text{s.t.}\quad p \in \Delta^I \tag{3.2} \]
\[ \prod_{(i,r) \in R} p_{ir} = 0 \quad \forall R \in \mathcal{R}. \tag{3.3} \]
Here, again, we denote by
\[ \Delta^I = \Big\{ p \in \{0,1\}^{n \times k} \ \Big|\ \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \Big\} \]
the set of all possible assignments and call the relaxed problem, where $\Delta^I$ in (3.2)
is replaced by $\Delta = \mathrm{Conv}(\Delta^I)$, the relaxed constrained semi-assignment problem.
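Our approach to the C-SAP again uses the operator $F[\xi]$ of Definition 1.7 given by
the mapping $F[\xi] : \Delta^0 \to \Delta^0$, where for all $i \in N, r \in K$
\[ F[\xi](p)_{ir} = \frac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)}. \tag{3.4} \]
However, in contrast to the SAP we will now define a new fitness function, taking into
account the newly added constraints in $\mathcal{R}$. Choosing such a new fitness function, we
combine the effects of the gradient-type dynamics and the repellor dynamics. Hence, we
recall from Section 1.4 the definitions of the corresponding fitness functions $\Gamma = (\Gamma_{ir})$ and $\Theta = (\Theta_{ir})$:
\[ \Gamma_{ir}(z,p) = \frac{\partial z}{\partial p_{ir}} = \sum_{T \in \mathcal{T}:\,(i,r) \in T} w_T \prod_{(j,s) \in T \setminus (i,r)} p_{js} \quad \forall i \in N, r \in K \tag{3.5} \]
\[ \Theta_{ir}(p) = \prod_{R \in \mathcal{R}:\,(i,r) \in R} \Big( 1 - \prod_{(j,s) \in R \setminus (i,r)} p_{js} \Big) \quad \forall i \in N, r \in K. \tag{3.6} \]
Again we will write $\Gamma_{ir}(p)$ instead of $\Gamma_{ir}(z,p)$ if only one objective function $z$ is
considered. Moreover, we will assume that $z$ is always given such that $\Gamma$ is a fitness
function on $\Delta^0$. Furthermore, we observe that for all $p \in \Delta^0$, $\Theta_{ir}(p) \in (0,1]$, because for any forbidden partial assignment $R \in \mathcal{R}$: $|R| \geq 2$. Hence, $\Theta$ is a fitness
function on $\Delta^0$, too, and $F[\Theta]$ is a continuous mapping.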
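The two fitness functions (3.5) and (3.6) can be evaluated directly from their definitions. The sketch below is illustrative only: the data layout (partial assignments as sets of 0-based `(i, r)` pairs, `terms` as a list of `(T, w_T)` pairs) and the function names are assumptions, not the representation used in the thesis.

```python
from math import prod


def gamma(i, r, terms, p):
    """Gamma_ir(p) of (3.5): sum over weighted partial assignments containing (i, r)."""
    return sum(w * prod(p[j][s] for (j, s) in T if (j, s) != (i, r))
               for (T, w) in terms if (i, r) in T)


def theta(i, r, forbidden, p):
    """Theta_ir(p) of (3.6): product over forbidden partial assignments containing (i, r).

    An empty product equals 1, so a pair (i, r) that occurs in no
    forbidden assignment is not penalized.
    """
    return prod(1.0 - prod(p[j][s] for (j, s) in R if (j, s) != (i, r))
                for R in forbidden if (i, r) in R)


# Hypothetical instance with n = k = 2: one weighted term and two forbidden pairs.
terms = [({(0, 0), (1, 0)}, 3.0)]
forbidden = [{(0, 0), (1, 1)}, {(0, 1), (1, 0)}]
p = [[0.6, 0.4], [0.7, 0.3]]
```

For this point, `gamma(0, 0, terms, p)` evaluates $3 p_{21}$ and `theta(0, 0, forbidden, p)` evaluates $1 - p_{22}$, in the 1-based notation of the text.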
The repellor dynamics was introduced in Cochand [Coc93] as an approach to
the G-Max-Sat problem. It has the property that if $p$ is close to an integer vertex $p^* \in \Delta^I$ which violates some constraint $R \in \mathcal{R}$, then for $(i,r) \in R$
the fitness $\Theta_{ir}(p)$ is, as desired, very small and thus contributes to making $p_{ir}$
smaller. On the other hand, if $p$ lies in a neighborhood of a feasible point, then
$\Theta_{ir}(p) \approx 1$ and thus it hardly influences the computation of the new value for $p_{ir}$.
Now we define the combined dynamics $F[\Gamma^\alpha\Theta^\beta]$ for $\alpha, \beta > 0$ by
\[ F[\Gamma^\alpha\Theta^\beta](p)_{ir} = \frac{p_{ir}\,\Gamma_{ir}(p)^\alpha\,\Theta_{ir}(p)^\beta}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(p)^\alpha\,\Theta_{is}(p)^\beta} \quad \forall i \in N, r \in K. \tag{3.7} \]
A good performance of FPH using $F[\Gamma^\alpha\Theta^\beta]$ depends on a good choice of the exponents
$\alpha, \beta > 0$. With the help of these two parameters we can control the influence
of one or the other dynamics. Intuitively, a large value of $\alpha$ leads to assignments with large objective values; unfortunately, the larger $\alpha$ becomes, the more
constraints may be unsatisfied. On the other hand, a large exponent $\beta$ helps to find
feasible assignments, but all too often with small objective values.
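Note that $\Theta$ defined by (3.6) is normally not a fitness function on $\Delta$, because on the
boundary some functions $\Theta_{ir}(p)$ may become zero. However, even for points $p \in \Delta^0$,
$\Theta_{ir}(p)$ may become very small in the neighborhood of an infeasible assignment. For
this reason we will restrict the domain of all the operators involving $\Theta$ as fitness
function to
\[ \Delta_\varepsilon := \Big\{ p \in [\varepsilon, 1 - (k-1)\varepsilon]^{n \times k} \ \Big|\ \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \Big\} \tag{3.8} \]
where an appropriate choice for $\varepsilon$ will be discussed later. The range of the operator
must be $\Delta_\varepsilon$ as well and for this reason we define $\Psi_\varepsilon$ and $F_\varepsilon$ as follows.
Definition 3.1 ($\Psi_\varepsilon(p)$, $F_\varepsilon[\xi]$ and Round($p$))
Let $\varepsilon$ with $0 < \varepsilon < \tfrac{1}{k}$ be fixed, $p \in \Delta$ and $r(i)$ for all $i \in N$ be the smallest index
such that $p_{ir(i)} \geq p_{is}$ for all $s \in K$.
1. We define a mapping $\Psi_\varepsilon : \Delta \to \Delta_\varepsilon$. Let
\[ \bar p_{ir} := \begin{cases} \varepsilon, & \text{if } p_{ir} < \varepsilon \\ \min\{p_{ir}, 1 - (k-1)\varepsilon\}, & \text{otherwise.} \end{cases} \]
Moreover, let $0 \leq \rho \leq 1$ be defined by
\[ \rho := \frac{1 - \bar p_{ir(i)} - (k-1)\varepsilon}{\sum_{r \neq r(i)} \bar p_{ir} - (k-1)\varepsilon} \tag{3.9} \]
if the denominator in (3.9) does not vanish, and $\rho := 0$ otherwise. Then
\[ \Psi_\varepsilon(p)_{ir} := \begin{cases} \bar p_{ir}, & \text{if } r = r(i) \\ \varepsilon + \rho(\bar p_{ir} - \varepsilon), & \text{otherwise.} \end{cases} \tag{3.10} \]
2. Furthermore, let a fitness function $\xi$ on $\Delta^0$ and the operator $F[\xi]$ as in (3.4) be given. Then we define a new operator $F_\varepsilon : \Delta^0 \to \Delta_\varepsilon$ by
\[ F_\varepsilon[\xi] := \Psi_\varepsilon \circ F[\xi] = \Psi_\varepsilon(F[\xi]). \tag{3.11} \]
3. We define a function Round: $\Delta \to \Delta^I$ which rounds points in the interior to
integer vertices in $\Delta^I$ by
\[ \mathrm{Round}(p)_{ir} := \begin{cases} 1, & \text{if } r = r(i) \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in N. \tag{3.12} \]
Note that the definition of $\Psi_\varepsilon$ guarantees that the smallest component of $\Psi_\varepsilon(p)$ is
at least $\varepsilon$ and $r(i)$ remains the smallest index for which $p_{ir(i)} \geq p_{is}$ for all $s \in K$.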
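A short sketch may make the clipping-and-rescaling of Definition 3.1 concrete. It follows (3.9), (3.10) and (3.12) as reconstructed above, treating each row $i$ separately; the function names, the list-of-rows layout and the 0-based indices are assumptions for illustration.

```python
def psi_eps(p, eps):
    """Map each row of p into [eps, 1 - (k-1) eps] while keeping the row sum at 1."""
    k = len(p[0])
    cap = 1.0 - (k - 1) * eps
    out = []
    for row in p:
        top = row.index(max(row))  # r(i): smallest index of a largest component
        bar = [eps if x < eps else min(x, cap) for x in row]
        denom = sum(bar[r] for r in range(k) if r != top) - (k - 1) * eps
        rho = (1.0 - bar[top] - (k - 1) * eps) / denom if denom > 0 else 0.0
        out.append([bar[r] if r == top else eps + rho * (bar[r] - eps)
                    for r in range(k)])
    return out


def round_assignment(p):
    """Round(p): put 1 on the smallest index of the largest component of each row."""
    return [[1 if r == row.index(max(row)) else 0 for r in range(len(row))]
            for row in p]


# Hypothetical point with k = 3; the first row lies too close to a vertex.
p = [[0.999, 0.0005, 0.0005], [0.2, 0.5, 0.3], [0.6, 0.395, 0.005]]
q = psi_eps(p, eps=0.01)
```

Rows that already lie inside the admissible box, such as the second row here, are left essentially unchanged because $\rho = 1$ for them.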
3.2 Properties of the Operator for C-SAP
Example 1.3 on page 5 has shown a C-SAP instance which was characterized by a
large, nearly constant plateau with only a single point with a large objective value.
As already discussed, local search heuristics have difficulty orienting themselves in
such solution spaces due to their local view. However, FPH additionally uses global information in its iterative process and therefore has an advantage over such methods.
The following proposition proves that if the objective value of a maximum of the
C-SAP is large enough, then it will be found by FPH.
Proposition 3.2
Let $(n, k, \mathcal{R}, \mathcal{T}, w)$ be a C-SAP instance, $F := F[\Gamma^\alpha\Theta^\beta]$ with $\alpha > 0$, $\beta > 0$ and for a
given $\varepsilon > 0$, $F_\varepsilon := \Psi_\varepsilon(F)$. Moreover let $M := \sum_{T \in \mathcal{T}} w_T$, $p^* \in \Delta^I$ be a feasible point with $p^*_{ir(i)} = 1$ for all $i \in N$ and $\delta > \varepsilon$. If $(T', w_{T'})$ is a weighted partial assignment with $T' := \{(i, r(i)) \mid i \in N\}$ and weight $w_{T'}$ satisfying (3.13),
and $(\mathcal{T}', w') := (\mathcal{T}, w) \cup (T', w_{T'})$, then for this so extended instance $(n, k, \mathcal{R}, \mathcal{T}', w')$ the sequence $\{F_\varepsilon^t(p^0)\}$, $t \geq 0$, converges to $\Psi_\varepsilon(p^*)$ for any $p^0 \in \Delta^0$ with $p^0_{ir(i)} \geq \delta$ for
all $i \in N$.
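Proof
Observe that $p^*$ is a strict local maximum of the SAP $(n, k, \mathcal{T}', w')$, since $w_{T'} > w_T$
for all $T \in \mathcal{T}$.
Moreover, let $z$ be the objective function using the partial assignments in $\mathcal{T}$ and
$z'$ the one corresponding to $\mathcal{T}'$. W.l.o.g. we assume that $p^*_{i1} = 1$ for all $i \in N$.
Furthermore, let $p \in \Delta^0$ with $p_{i1} \geq \delta$ for all $i \in N$ and $p' := F[\Gamma(z',p)^\alpha\Theta(p)^\beta](p)$, then we
can write
\[ p'_{ir} = \frac{\Gamma_{ir}(z',p)^\alpha\,\Theta_{ir}(p)^\beta\, p_{ir}}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(z',p)^\alpha\,\Theta_{is}(p)^\beta} = u_{ir}\,p_{ir} \quad\text{where}\quad u_{ir} := \frac{\Gamma_{ir}(z',p)^\alpha\,\Theta_{ir}(p)^\beta}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(z',p)^\alpha\,\Theta_{is}(p)^\beta}. \]
We will show subsequently that $u_{ir} \leq \tfrac{1}{2}$ for all $i \in N, r \in K \setminus \{1\}$.
To prove this claim we use the following three inequalities for $i \in N$:
• $\Gamma_{ir}(z',p) \leq M$ for all $i \in N, r \in K \setminus \{1\}$, which holds due to (3.5) and the fact
that $p_{ir} \leq 1$.
• Since $\Gamma(z,p)$ is a fitness function and $\mathcal{T}$ and $\mathcal{T}'$ differ only by the partial assignment $T' = \{(1,1), (2,1), \ldots, (n,1)\}$, we get
\[ \Gamma_{i1}(z',p) = \sum_{T \in \mathcal{T}':\,(i,1) \in T} w_T \prod_{(j,s) \in T \setminus (i,1)} p_{js} = \underbrace{\Gamma_{i1}(z,p)}_{\geq 0} + w_{T'} \underbrace{\prod_{j \neq i} p_{j1}}_{\geq \delta^{n-1}} \geq w_{T'}\,\delta^{n-1}, \]
where the last inequality follows from $p_{j1} \geq \delta$ for all $j \in N$.
• Finally we derive the following lower bound for $\Theta_{i1}(p)$:
\[ \Theta_{i1}(p) = \prod_{R \in \mathcal{R}:\,(i,1) \in R} \Big( 1 - \prod_{(j,s) \in R \setminus (i,1)} p_{js} \Big) \geq \delta^{|\mathcal{R}|}. \tag{3.14} \]
We show that the expression in brackets in (3.14) is larger than or equal to $\delta$: $p^*$
is a feasible point and $|R| \geq 2$, hence $(i,1) \in R$ implies that there exists
$(j', s') \in R$ with $s' > 1$ and therefore the second product in (3.14) is not empty.
Hence, $\big(1 - \prod_{(j,s) \in R \setminus (i,1)} p_{js}\big) \geq 1 - p_{j's'} \geq \delta$, because $p_{j'1} \geq \delta$ implies that
$p_{j's'} \leq \sum_{s \neq 1} p_{j's} \leq 1 - \delta$, and (3.14) follows.
Combining these inequalities for $i \in N, r > 1$ and using (3.13) we get
\[ u_{ir} = \frac{\Gamma_{ir}(z',p)^\alpha\,\Theta_{ir}(p)^\beta}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(z',p)^\alpha\,\Theta_{is}(p)^\beta} \leq \frac{M^\alpha \cdot 1}{p_{i1}\,\Gamma_{i1}(z',p)^\alpha\,\Theta_{i1}(p)^\beta} \leq \frac{M^\alpha}{w_{T'}^\alpha\,\delta\,\delta^{\alpha(n-1)}\,\delta^{|\mathcal{R}|\beta}} \leq \frac{M^\alpha}{2M^\alpha} = \frac{1}{2}. \tag{3.15} \]
If $p'_{ir} > \varepsilon$, then it follows from (3.10) (since $\rho \leq 1$) that $\Psi_\varepsilon(p')_{ir} \leq p'_{ir}$. Thus on
the one hand we have for $r > 1$ with $p'_{ir} > \varepsilon$: $F_\varepsilon(p)_{ir} = \Psi_\varepsilon(p')_{ir} \leq p'_{ir} \leq \tfrac{1}{2}p_{ir}$. On
the other hand we get for $r > 1$ with $p'_{ir} \leq \varepsilon$: $F_\varepsilon(p)_{ir} = \Psi_\varepsilon(p')_{ir} = \varepsilon$. It follows
that the first components $F_\varepsilon^t(p)_{i1}$ are increasing and therefore $F_\varepsilon^t(p^0)_{i1} \geq \delta$ for all
$i \in N, t \geq 0$. This ensures that the inequalities (3.15) hold in all points of the
iteration sequence and $\lim_{t \to \infty} F_\varepsilon^t(p^0) = \Psi_\varepsilon(p^*)$.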
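We have just seen that if the objective value of a feasible point $p^* \in \Delta^I$ is large enough, then all trajectories started within a $\delta$-boundary, for a fixed $\delta > 0$, converge to it. Responsible for this nice behavior is the factor $\Gamma$ of the gradient-type dynamics. Using the bound (3.14) we can prove the following proposition about attractors:
Proposition 3.3
Let $(n, k, \mathcal{R}, \mathcal{T}, w)$ be a C-SAP instance and $F := F[\Gamma^\alpha\Theta^\beta]$ with $\alpha > 0$, $\beta > 0$. Then
every strict local maximum $p^*$ of the C-SAP is an attractor of $F$.
Proof
W.l.o.g. let $p^*_{i1} = 1$ for all $i \in N$. Since $p^*$ is a strict local maximum we have
$\Gamma_{i1}(p^*) > \Gamma_{ir}(p^*)$ for all $i \in N, r > 1$ and it follows from continuity that there exists
an $\varepsilon$-neighborhood $\mathcal{U}_\varepsilon(p^*)$ such that
\[ \forall p \in \mathcal{U}_\varepsilon(p^*) \cap \Delta^0 : \Gamma_{i1}(p) > \Gamma_{ir}(p), \quad \forall i \in N, r > 1. \]
Let $p \in \mathcal{U}_\varepsilon(p^*) \cap \Delta^0$ and $\delta := 1 - \varepsilon$, then $p_{i1} \geq \delta$ for all $i \in N$. Writing $F(p)_{ir} = u_{ir}\,p_{ir}$ and using (3.14), we get for all $i \in N, r > 1$:
\[ u_{ir} = \frac{\Gamma_{ir}(p)^\alpha\,\Theta_{ir}(p)^\beta}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(p)^\alpha\,\Theta_{is}(p)^\beta} \leq \frac{\Gamma_{ir}(p)^\alpha}{p_{i1}\,\Gamma_{i1}(p)^\alpha\,\Theta_{i1}(p)^\beta} \leq \Big( \frac{\Gamma_{ir}(p)}{\Gamma_{i1}(p)} \Big)^{\alpha} \frac{1}{\delta^{|\mathcal{R}|\beta + 1}}. \]
From the continuity of $z$ it follows that for $\varepsilon \to 0$: $\frac{\Gamma_{ir}(p)}{\Gamma_{i1}(p)} \to \frac{\Gamma_{ir}(p^*)}{\Gamma_{i1}(p^*)} < 1$. Hence
we can choose $\tilde\delta$ with $\delta \leq \tilde\delta < 1$ large enough and define $\tilde\varepsilon := 1 - \tilde\delta$ such that for
all points $p \in \mathcal{U}_{\tilde\varepsilon}(p^*) \cap \Delta^0$: $u_{ir} < 1$ for all $i \in N, r > 1$. Thus we have shown
$\mathcal{U}_{\tilde\varepsilon}(p^*) \cap \Delta^0 \subseteq RA(p^*)$. ∎
Note that though we can influence the size of a region of attraction of a strict local
maximum of a C-SAP instance by raising the corresponding objective value, other
regions of attraction (of strict local maxima) can never vanish completely because
of Proposition 3.3.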
Subsequently we will show with an example the advantage of working with a combination
of the gradient-type dynamics ($\Gamma$-part) and the repellor dynamics ($\Theta$-part), rather than using only the gradient-type dynamics and including the constraints
with penalizing costs in the objective function. The example will show that in the
latter case the nice property of Proposition 3.2 will be lost.
Example 3.4
Let $(n, k, \mathcal{R}, \mathcal{T}, w)$ be a C-SAP instance and let $z(p)$ be its objective function given by (3.1). We construct a new objective function $\tilde z$ which contains the forbidden
assignments of $\mathcal{R}$ with penalizing costs $M$:
\[ \tilde z(p) = z(p) - M \Big( \sum_{R \in \mathcal{R}} \prod_{(i,r) \in R} p_{ir} \Big) \]
and maximize it by some gradient procedure. If we want $M$ to be large enough for a hill climbing procedure to prefer a feasible assignment with objective value
zero to any infeasible one, then a reasonable choice for $M$ would be the sum of the
coefficients in $z$.
Consider now the following C-SAP instance with $n = k = 2$ and $c > 0$:
\[ \max\ c\,p_{11}p_{21} \quad \text{s.t.}\quad p_{11}p_{22} = 0,\ p_{12}p_{21} = 0,\ p \in \Delta^I. \]
Constructing the penalized objective function $\tilde z$ as described above, we get
\[ \tilde z(p) = c\,p_{11}p_{21} - c\,(p_{11}p_{22} + p_{12}p_{21}). \]
However, now we see that the direction of the gradient of $\tilde z$ no longer depends on
$c$, nor does $c$ influence the region of attraction of $p^* := \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ anymore. Hence the
property of Proposition 3.2 is lost, since now even a large value $c$ cannot guarantee
convergence of $\{F_\varepsilon^t(p)\}$, $t \geq 0$, to $\Psi_\varepsilon(p^*)$ for all $p \in \Delta_\delta$.
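The claim about the gradient direction can be verified by a short computation; the following worked derivation is added for illustration and uses only the penalized objective of the example. Differentiating $\tilde z$ with respect to each component gives

```latex
\[
\frac{\partial \tilde z}{\partial p_{11}} = c\,(p_{21} - p_{22}), \qquad
\frac{\partial \tilde z}{\partial p_{12}} = -c\,p_{21}, \qquad
\frac{\partial \tilde z}{\partial p_{21}} = c\,(p_{11} - p_{12}), \qquad
\frac{\partial \tilde z}{\partial p_{22}} = -c\,p_{11},
\]
\[
\nabla \tilde z(p) = c \begin{pmatrix} p_{21} - p_{22} & -p_{21} \\ p_{11} - p_{12} & -p_{11} \end{pmatrix}.
\]
```

Every entry is $c$ times a quantity that does not involve $c$, so $\nabla \tilde z(p) / \lVert \nabla \tilde z(p) \rVert$ is the same for every $c > 0$: scaling the reward also scales the penalty, and the direction of steepest ascent is unchanged.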
The last two propositions have investigated the behavior of $F[\Gamma^\alpha\Theta^\beta]$ in the
neighborhood of feasible points in $\Delta^I$ with large objective values. Now the question
arises: What happens if an infeasible point $p^* \in \Delta^I$ has such a large objective value?
The next proposition investigates the behavior of the repelling part of operator
$F_\varepsilon[\Gamma^\alpha\Theta^\beta]$ in the neighborhood of an infeasible point $p^* \in \Delta^I$ which has a discrete
neighbor with fewer violated constraints in $\mathcal{R}$. It was shown by Cochand [Coc93] that
in such situations the repellor dynamics $F_\varepsilon[\Theta^\beta]$ jumps (under certain assumptions) to another point which is less infeasible. This so-called 'rejecting' effect holds as
well for the combined dynamics $F_\varepsilon[\Gamma^\alpha\Theta^\beta]$:
Proposition 3.5
Let $F := F[\Gamma^\alpha\Theta^\beta]$ with $\alpha > 0$, $\beta > 1$ and $F_\varepsilon := \Psi_\varepsilon(F)$. Moreover, let $\Gamma_{ir}(p) > 1$
for all $i \in N, r \in K, p \in \Delta^0$. If $p^* \in \Delta^I$ has a discrete neighbor $q^* \in \mathcal{N}(p^*)$ which
violates fewer constraints in $\mathcal{R}$ than $p^*$, then for an appropriate choice of $\varepsilon$ there
exists a neighborhood $\mathcal{U}$ of $\Psi_\varepsilon(p^*)$ such that for all $p \in \mathcal{U} \cap \Delta^0$: $\mathrm{Round}(F_\varepsilon(p)) \neq
\mathrm{Round}(p) = p^*$.
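Proof
First note that the assumption $\Gamma_{ir} > 1$ is not restrictive, since by Transformation 2.10
we can always transform an objective function such that it satisfies the
hypothesis.
Let $p^* \in \Delta^I$ be an assignment for which $v_1$ constraints are violated and let $q^*$, obtained
from $p^*$ by changing the value of one decision variable, be an assignment which
violates $v_2 < v_1$ constraints. W.l.o.g. we assume that $p^*_{i1} = 1$ for all $i \in N$ and
\[ q^*_{ir} = \begin{cases} 1, & \text{if } i = 1,\ r = 2 \\ 0, & \text{if } i = 1,\ r \neq 2 \\ p^*_{ir}, & \text{otherwise} \end{cases} \quad \forall i \in N, r \in K. \]
Let $p := \Psi_\varepsilon(p^*)$. To get bounds for $\Theta_{1r}(p) = \prod_{R \in \mathcal{R}:\,(1,r) \in R} \big( 1 - \prod_{(j,s) \in R \setminus (1,r)} p_{js} \big)$, $r \in K$,
we define the following two sets: $R^s_{1r}$ is the set of all forbidden partial assignments
containing $(1,r)$ which are satisfied by $p^*$ regardless of the value of $p^*_{1r}$, and $R^u_{1r}$ are
those assignments which are unsatisfied for $p^*_{1r} = 1$:
\[ R^s_{1r} := \{ R \in \mathcal{R} \mid (1,r) \in R,\ \exists (j,s) \in R \setminus (1,r) : p^*_{js} = 0 \} \quad \forall r \in K \]
\[ R^u_{1r} := \{ R \in \mathcal{R} \mid (1,r) \in R,\ \forall (j,s) \in R \setminus (1,r) : p^*_{js} = p^*_{j1} = 1 \} \quad \forall r \in K, \]
and correspondingly $\Theta_{1r}(p) = \Theta^s_{1r}(p)\,\Theta^u_{1r}(p)$ with
\[ \Theta^s_{1r}(p) := \prod_{R \in R^s_{1r}} \Big( 1 - \prod_{(j,s) \in R \setminus (1,r)} p_{js} \Big) \quad \forall r \in K \tag{3.16} \]
\[ \Theta^u_{1r}(p) := \prod_{R \in R^u_{1r}} \Big( 1 - \prod_{(j,s) \in R \setminus (1,r)} p_{js} \Big) \quad \forall r \in K. \tag{3.17} \]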
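Since $p = \Psi_\varepsilon(p^*)$ it follows from (3.10) that $p_{ir}$ is either $\varepsilon$ or $(1 - (k-1)\varepsilon)$. Hence the
second product of $\Theta^s_{1r}(p)$ contains at least one factor $\varepsilon$ (because $R \in R^s_{1r}$ and $|R| \geq 2$).
Analogously, the second product of $\Theta^u_{1r}(p)$ consists only of factors $(1 - (k-1)\varepsilon)$ and has at least one such factor. We have
\[ \Theta^s_{1r}(p) = \prod_{R \in R^s_{1r}} \big( 1 - \varepsilon^{\alpha_R}(1 - (k-1)\varepsilon)^{\beta_R} \big) \quad \forall r \in K, \]
\[ \Theta^u_{1r}(p) = \prod_{R \in R^u_{1r}} \big( 1 - (1 - (k-1)\varepsilon)^{\gamma_R} \big) \quad \forall r \in K, \]
for some $\alpha_R, \gamma_R \geq 1$, $\beta_R \geq 0$ and $\alpha_R, \beta_R, \gamma_R \in \mathbb{N}$, $R \in R^s_{1r} \cup R^u_{1r}$.
We denote by $\mu_{1r} := |R^s_{1r}|$ and $\nu_{1r} := |R^u_{1r}|$ the cardinalities of the sets and define
$L := \max_{R \in \mathcal{R}} |R| - 1$ such that $L + 1$ is the maximum number of elements contained
in a forbidden partial assignment $R \in \mathcal{R}$.
We continue by giving an upper and lower bound for (3.16) and (3.17) for all $r \in K$:
\[ (1 - \varepsilon)^{\mu_{1r}} \leq \Theta^s_{1r}(p) \leq 1 \tag{3.18} \]
\[ ((k-1)\varepsilon)^{\nu_{1r}} \leq \Theta^u_{1r}(p) \leq (L(k-1)\varepsilon)^{\nu_{1r}}. \tag{3.19} \]
The inequalities in (3.18) hold because $\varepsilon^{\alpha_R}(1 - (k-1)\varepsilon)^{\beta_R} \leq \varepsilon^{\alpha_R} \leq \varepsilon$ for
$\beta_R \geq 0$, $\alpha_R \geq 1$ and all factors are less than or equal to 1. The first inequality in (3.19) is true because $(1 - (k-1)\varepsilon)^{\gamma_R} \leq (1 - (k-1)\varepsilon)$ for $\gamma_R \geq 1$. To prove the second
inequality we use Bernoulli's inequality: $(1 - (k-1)\varepsilon)^{\gamma_R} \geq 1 + \gamma_R(-(k-1)\varepsilon)$,
which is applicable because $\gamma_R \geq 1$, $\gamma_R \in \mathbb{N}$ and $-(k-1)\varepsilon \geq -1$. From the fact
that $\gamma_R \leq L$, (3.19) follows immediately.
Note that $\nu_{11} \leq v_1$ and $\nu_{12} = \nu_{11} + (v_2 - v_1)$, because by assumption $p^*$ and $q^*$ differ only in the components $(1,1)$ and $(1,2)$. From this observation and (3.18) and
(3.19) it follows that
\[ \frac{\Theta_{12}(p)}{\Theta_{11}(p)} \geq \frac{(1-\varepsilon)^{\mu_{12}}\,((k-1)\varepsilon)^{\nu_{12}}}{(L(k-1)\varepsilon)^{\nu_{11}}} = \frac{(1-\varepsilon)^{\mu_{12}}\,((k-1)\varepsilon)^{v_2 - v_1}}{L^{\nu_{11}}}. \tag{3.20} \]
Finally, let $M$ be an upper bound for $\Gamma_{ir}$, $i \in N, r \in K$, and $\beta > 1$ be the exponent of $\Theta$ used by the operator, then we get from (3.20)
\[ \frac{F(p)_{12}}{F(p)_{11}} \geq \frac{\Theta_{12}^\beta\,p_{12}}{M^\alpha\,\Theta_{11}^\beta\,p_{11}} \geq \frac{\Theta_{12}^\beta\,\varepsilon}{M^\alpha\,\Theta_{11}^\beta} \geq \frac{(1-\varepsilon)^{\beta\mu_{12}}\,(k-1)^{\beta(v_2 - v_1)}}{M^\alpha\,L^{\beta\nu_{11}}}\;\varepsilon^{\beta(v_2 - v_1) + 1}. \tag{3.21} \]
Note that $\beta(v_2 - v_1) + 1 < 0$ and therefore we can choose $\varepsilon > 0$ small enough to ensure that $F(p)_{12} > F(p)_{11}$. By continuity, the result also holds in some
neighborhood $\mathcal{U}(p)$. ∎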
Note that $\varepsilon$ depends on the instance and can be computed a priori for each C-SAP
instance (on the basis of the number of possible values $k$, the maximal size of the
clauses $L$, the number of clauses and an upper bound $M$ for the $\Gamma_{ir}$, all assumed to be
greater than one). However, if we want to construct an operator $F_\varepsilon$ which satisfies
Proposition 3.5, then for an instance as in Proposition 3.2 $\varepsilon$ will be smaller than $\delta$.
Therefore, in general it is not possible to have convergence in the whole region $\Delta_\varepsilon$
and simultaneously the 'rejecting' effect.
By Proposition 3.5, $F_\varepsilon[\Gamma^\alpha\Theta^\beta]$ cannot converge to an infeasible point which has a
better neighbor from the feasibility point of view. But how about convergence in
general?
Using Sarkovskii's theorem [Dev89], we can show that there exist examples with
cycles of any period:
Theorem 3.6 (Sarkovskii)
Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function. Suppose $f$ has a periodic point of period three. Then $f$ has periodic points of all other periods.
The following example shows that cycling of $F[\Gamma^\alpha\Theta^\beta]$ is possible:
Example 3.7
Let the following C-SAP instance be given:
\[ \max\ z(p) = p_{12} + p_{22} + 5000\,(p_{11} + p_{21}) \quad \text{s.t.}\quad p_{11}p_{21} = 0,\ p \in \Delta^I, \tag{3.22} \]
and let us consider $F := F[\Gamma\Theta^4]$. Note that the set $D := \{ p \in \Delta \mid p_{11} = p_{21} \}$ is
invariant under $F$ due to the symmetry of the instance and we have for $p \in D$:
$p_{12} = p_{22} = 1 - p_{11}$. We define for $p \in D$
\[ f(p_{11}) := F\begin{pmatrix} p_{11} & 1 - p_{11} \\ p_{11} & 1 - p_{11} \end{pmatrix}_{11}. \]
In this example we have $\Gamma = \begin{pmatrix} 5000 & 1 \\ 5000 & 1 \end{pmatrix}$ and $\Theta = \begin{pmatrix} 1 - p_{21} & 1 \\ 1 - p_{11} & 1 \end{pmatrix}$ and thus we get for $p \in D$:
\[ f(p_{11}) = F(p)_{11} = \frac{5000\,p_{11}(1 - p_{11})^4}{5000\,p_{11}(1 - p_{11})^4 + (1 - p_{11})} = \frac{5000\,p_{11}(1 - p_{11})^3}{5000\,p_{11}(1 - p_{11})^3 + 1}. \]
Figure 3.1: Graph of f(f(f(x))) and the line y = x.
It can be shown that $f$ has a cycle of period three. For this reason we compute
$g(x) := f(f(f(x)))$, which is depicted in Figure 3.1 together with the line $y = x$.
Of course, all intersection points are fixed points of $g$; those which are not fixed points of $f$ itself therefore belong to a
3-cycle of $f$. One such cycle is for example given by $f(0.174) \approx 0.9979$, $f(0.9979) \approx
4 \cdot 10^{-5}$, $f(4 \cdot 10^{-5}) \approx 0.174$, and by Sarkovskii's theorem it follows that cycles of any
period exist.
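The approximate 3-cycle can be reproduced numerically from the closed form of $f$. The snippet below is a minimal sketch; the function names and default parameters are chosen here for illustration.

```python
def f(x, c=5000.0, beta=4):
    """Restriction of F[Gamma Theta^beta] to the invariant set D of Example 3.7."""
    a = c * x * (1.0 - x) ** (beta - 1)
    return a / (a + 1.0)


def g(x):
    """Third iterate of f; fixed points of g that are not fixed points of f lie on 3-cycles."""
    return f(f(f(x)))


# Follow the orbit starting at the value quoted in the text.
x0 = 0.174
orbit = [x0, f(x0), f(f(x0)), g(x0)]
```

The orbit visits a point near 0.998, then one of order $10^{-5}$, and returns close to the starting value, in line with the cycle quoted above; the quoted values are rounded, so the return is only approximate.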
Finally, we discuss the stability of fixed points in the interior $\Delta^0$. We have seen
that fixed points in the interior are unstable for the SAP using $F[\Gamma]$. Using a small
technical modification of the operator $F[\xi]$, Cochand [Coc93] has proven that the
same result holds as well for this class of more general operators.
Proposition 3.8 (Cochand)
Let $\xi_{ir} : \mathbb{R}^{nk} \to \mathbb{R}$, $i \in N, r \in K$, be continuously differentiable functions with
$\xi_{ir}(p) > 0$ for all $p \in \Delta^0$ and $\frac{\partial \xi_{ir}}{\partial p_{is}}(p) = 0$ for all $i \in N, r \in K$ and $s \in K$. Moreover, let
\[ \Omega_{ir}(p) := \prod_{s=1,\ s \neq r}^{k} (1 - p_{is}) \quad \forall i \in N, r \in K \]
and $F := F[\xi\Omega^\gamma]$ be the operator as in (3.4). If $\gamma > 0$ then any fixed point $p$ in the
interior $\Delta^0$ is unstable.
Proof
Since $F$ is a differentiable mapping, a sufficient condition for a fixed point $p$ to be
unstable is that at least one eigenvalue of $DF(p)$ (the derivative of $F$ at $p$) has
absolute value larger than one.
For a fixed point $p \in \Delta^0$ we have $F(p)_{ir} = \frac{p_{ir}\,\xi_{ir}\,\Omega_{ir}^\gamma}{\sum_{s} p_{is}\,\xi_{is}\,\Omega_{is}^\gamma} = p_{ir}$ for all $i \in N, r \in K$
and it follows that
\[ \xi_{ir}\,\Omega_{ir}^\gamma = E_i := \sum_{s=1}^{k} \xi_{is}\,\Omega_{is}^\gamma\,p_{is} \quad \forall i \in N, r \in K. \tag{3.23} \]
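To compute the differential $DF(p)$ we use for $i \in N, r \in K$
\[ \frac{\partial \xi_{ir}}{\partial p_{is}} = 0 \quad \forall s \in K \tag{3.24} \]
\[ \frac{\partial \Omega_{is}^\gamma}{\partial p_{ir}} = -\gamma\,(1 - p_{ir})^{\gamma - 1} \prod_{t \neq r, s} (1 - p_{it})^\gamma = -\gamma\,\frac{\Omega_{is}^\gamma}{1 - p_{ir}} \quad \forall s \neq r. \tag{3.25} \]
Using these derivatives we get for the diagonal elements of the differential $DF(p)$ of
$F[\xi\Omega^\gamma]$ at the fixed point $p$
\begin{align*}
\frac{\partial F[\xi\Omega^\gamma]_{ir}}{\partial p_{ir}}
&= \frac{1}{E_i^2} \Big( \xi_{ir}\Omega_{ir}^\gamma E_i - p_{ir}\,\xi_{ir}\Omega_{ir}^\gamma \Big( \xi_{ir}\Omega_{ir}^\gamma - \frac{\gamma}{1 - p_{ir}} \sum_{s \neq r} p_{is}\,\xi_{is}\Omega_{is}^\gamma \Big) \Big) \\
&\overset{(3.23)}{=} \frac{1}{E_i^2} \Big( E_i^2 - p_{ir} E_i \Big( E_i - \frac{\gamma}{1 - p_{ir}}\,E_i\,(1 - p_{ir}) \Big) \Big) \\
&= 1 - p_{ir}(1 - \gamma) = 1 + (\gamma - 1)\,p_{ir}.
\end{align*}
It follows that the trace of $DF(p)$ is $\sum_{i=1}^{n} \sum_{r=1}^{k} [1 + (\gamma - 1)p_{ir}] = n(k-1) + n\gamma$,
viewing $F[\xi\Omega^\gamma]$ as a mapping with domain $\mathbb{R}^{nk}$. However, note that $\Delta^0$, the range
of $F[\xi\Omega^\gamma]$, has dimension $n(k-1)$ and therefore $DF(p)$ has at least $n$ vanishing
eigenvalues. This implies that there are at most $n(k-1)$ non-vanishing eigenvalues. Since the trace, which is the sum of all eigenvalues, is $n(k-1) + n\gamma$ and $\gamma > 0$,
it follows that there exists at least one eigenvalue strictly larger than one and
therefore the fixed point $p$ is unstable.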
Note that though $\Omega_{ir}$ is similar to $\Theta_{ir}$, the proposition does not hold for $\gamma = 0$.
However, the slight modification from $F[\xi]$ to $F[\xi\Omega^\gamma]$ is only an artificial modification
and of theoretical interest. Practically, $\gamma$ can be chosen small enough such that it
has a negligible effect and does not alter the results in $\Delta^0$. This $\Omega$-modification was
not used in the implementation.
3.3 Implementation and Numerical Results
We tested FPH on a class of randomly generated C-SAP instances for which feasible
solutions exist, but cannot trivially be determined. Subsequently we describe in the
first subsection the algorithm FPH and some details of its implementation. Then
we present the test set used in our experiments and finally we conclude with a
comparison of FPH to Tabu Search.
3.3.1 Implementation
In this subsection we describe the implementation of FPH which has been used
subsequently to solve C-SAP instances.
We recall from Section 1.4 that the basic concept of FPH was given as follows:
• Choose a starting point $p^0 \in \Delta^0$.
• Compute the sequence $p^t := F[\xi](p^{t-1})$, $t = 1, 2, \ldots, l$ for some $l$.
• Choose as solution $p^* \in \Delta^I$ the assignment which is 'closest' to $p^l$.
As an appropriate fitness function for the C-SAP we have chosen $\xi := \Gamma^\alpha\Theta^\beta$, as
discussed in the previous sections. However, regarding the operator, experiments have shown that the following modification improves the performance of the algorithm and it was hence used in our tests: Instead of changing all components of a point at once, we update them sequentially. This variant of $F[\xi]$ will be denoted by $F'[\xi]$ and, due to its componentwise update, $F'[\xi]$ will be called the Gauss-Seidel version.
Formally, $F'[\xi]$ is defined as follows: Let for $j \in N$
\[ F^{(j)}[\xi](p)_{ir} := \begin{cases} \dfrac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)}, & i = j \\ p_{ir}, & \text{otherwise} \end{cases} \quad \forall i \in N, r \in K \tag{3.26} \]
then
\[ F'[\xi] := F^{(\pi(1))}[\xi] \circ \cdots \circ F^{(\pi(n))}[\xi], \tag{3.27} \]
where $\pi$ is a permutation of the index set $N$.
A more detailed description of $F'[\xi]$ and some of its properties can be found in
Appendix A.
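The difference between the simultaneous update (3.4) and the sequential Gauss-Seidel variant can be sketched as follows. This is an illustrative sketch only: the fitness function is passed in as a hypothetical callback `xi(i, r, p)`, indices are 0-based, and rows are visited in the order in which they appear in `order`.

```python
def gauss_seidel_sweep(p, xi, order):
    """One sweep: rows are renormalized one after another, each using the rows already updated."""
    p = [row[:] for row in p]  # work on a copy
    for i in order:
        weights = [p[i][r] * xi(i, r, p) for r in range(len(p[i]))]
        total = sum(weights)
        p[i] = [w / total for w in weights]
    return p


def simultaneous_step(p, xi):
    """The plain operator F[xi] of (3.4): every row is computed from the old point."""
    new_p = []
    for i, row in enumerate(p):
        weights = [row[r] * xi(i, r, p) for r in range(len(row))]
        total = sum(weights)
        new_p.append([w / total for w in weights])
    return new_p


# Hypothetical two-node Max-Cut fitness: xi_ir(p) = p_{j, 3-r} for the other node j.
def xi(i, r, p):
    return p[1 - i][1 - r]


p0 = [[0.6, 0.4], [0.7, 0.3]]
seq = gauss_seidel_sweep(p0, xi, order=[0, 1])
sim = simultaneous_step(p0, xi)
```

In the sequential sweep the second row already sees the updated first row, which is why the two results differ after a single pass.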
In our numerical experiments we used exclusively the Gauss-Seidel version $F'[\Gamma^\alpha\Theta^\beta]$, where we allowed the permutation $\pi$ in (3.27) to be changed during the algorithm. Moreover, we extended the basic concept of FPH by some additional features
improving the algorithm.
Subsequently we will first present some pseudo-code of the algorithm and explain afterwards the new procedures used therein:
FPH
 1  Choose a starting point p ∈ Δ_ε
 2  for t := 1 to itlimit do
 3      if t mod recit = 0
 4          π := permute(N)
 5          if opt(p) > 0.8 · opt(best)
 6              greedy(p)
 7          recenter(p)
 8      for j := 1 to n do
 9          i := π[j]
10          for r := 1 to k do
11              p_ir := p_ir Γ_ir^α Θ_ir^β
12          normalize(p_i)
13      Compute new values uc and z for p
14      Keep best solution in best
15  greedy(p)
We see that the second part (lines 8-14) describes the standard FPH for $F'[\Gamma^\alpha\Theta^\beta]$, where in normalize($p_i$) the components of $p_{i\cdot}$ are normalized according to (3.10)
such that $p \in \Delta_\varepsilon$.
ImprovementIn each iteration t of the algorithm the current objective value zip) and the number
of unsatisfied constraints uc(p) are computed (line 13). Since feasibility is more
important than a large objective value, we use the lexicographical order on the
vector (—ueip), z(p)) m order to compare 'solutions' of FPH.
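As a small illustration of this lexicographical comparison, the following Python sketch compares candidate solutions by the pair (-uc, z). The dictionary layout and the helper name `fph_key` are assumptions made for the example; the numbers are arbitrary sample values.

```python
def fph_key(uc, z):
    """Comparison key: fewer unsatisfied constraints first, then larger objective."""
    return (-uc, z)


candidates = [
    {"name": "a", "uc": 2, "z": 43888},   # high objective, but infeasible
    {"name": "b", "uc": 0, "z": 29529},   # feasible
    {"name": "c", "uc": 0, "z": 35504},   # feasible with larger objective
]

# Python compares tuples lexicographically, so max() realizes the order directly.
best = max(candidates, key=lambda s: fph_key(s["uc"], s["z"]))
```

Here candidate "c" is selected: both "b" and "c" beat "a" on feasibility, and "c" has the larger objective value of the two.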
Procedures
In the first part of the algorithm (lines 3-7) we included some additional procedures
which are carried out every recit iterations:
permute(N): Computes a new permutation $\pi$ of the index set N = {1,..., n}. This
permutation is used in line 9 to change the order in which the decision variables
in (3.27) are updated. In several tests we tried to improve FPH by changing
this order depending on the current values of the variables. However, it turned
out that a random permutation works best.
greedy(p): If the objective value of the current solution is not too far off the best
solution found so far, then this greedy algorithm may be applied (line 6).
Moreover, the same greedy algorithm is applied once more at the end of
FPH (line 15).
greedy() first constructs an integer solution out of p using Round(p) of (3.12).
Then it tests for all discrete neighbors $q \in M(p)$ whether they achieve an
improvement or not. We set p to that neighbor q which yields the largest
improvement and repeat this process until no further improvement is possible.
If by this procedure a new best assignment is found in line 6, then this solution
is reconverted into a matrix $p \in \Delta_\varepsilon$ favoring the components of the
assignment with a high value. This point p is then used as a new starting point in FPH.
recenter(p): The goal of this procedure is to prevent FPH from reaching the boundary
of $\Delta_\varepsilon$ too fast. Besides the numerical instability close to the boundary,
the gradient-type dynamics loses its influence there. Hence, if FPH reaches the
boundary too fast, the gradient-type dynamics has not had any chance of
contributing to the result, which leads to assignments with low objective values.
For this reason we draw the point p back to the center of $\Delta$, however, without
destroying the order of its values. Towards this end, we add a constant reccon
to each value $p_{ir}$ and normalize afterwards using normalize($p_i$). The larger
reccon is chosen, the closer the new point lies to the center $p_{ir} = 1/k$ for all
$i \in N$, $r \in K$.
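The recentering step can be sketched in a few lines of Python. This is a simplified illustration under one explicit assumption: normalization is taken to be a plain division by the row sum, whereas the normalization (3.10) used in FPH may differ in detail.

```python
def recenter(p, reccon):
    """Add reccon to every component and renormalize each row to sum 1.

    Assumption: plain row-sum normalization stands in for (3.10).
    """
    result = []
    for row in p:
        shifted = [v + reccon for v in row]
        total = sum(shifted)
        result.append([v / total for v in shifted])
    return result


# One variable with k = 5 values, already close to the boundary.
p = [[0.90, 0.05, 0.03, 0.01, 0.01]]
q = recenter(p, reccon=0.4)
```

In this example the dominant component shrinks from 0.90 to roughly 0.43 while the ranking of the five values is unchanged, so the point moves towards the center 1/k = 0.2 without losing the information about which value is currently preferred.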
Data Structure
This algorithm was implemented in C where the following data structures were used:
Each monomial of the objective function is a structure which contains the weight of
the monomial, its current value, the number of variables (degree of the monomial)
and two list pointers. The first list links all monomials, thus constructing the
objective function. The second list contains the indices of all variables in the
monomial. This allows a fast update of the value of a monomial when the value
of some variable $p_{ir}$ changes. The same structure as for the objective function is
also used for the constraints, resulting in a list of forbidden assignments. p itself is
implemented as an array where each element contains, besides its value, a list
of pointers to the monomials it is contained in.
Note that although the greedy algorithm is rather time consuming, we could improve
its performance thanks to the data structure used, which allows us to compute the
difference of the objective values of two discrete neighbors efficiently.
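The idea behind this data structure can be sketched as follows. The sketch is in Python rather than C, handles only discrete assignments, and all class and method names (`Monomial`, `Objective`, `delta`) are hypothetical. It shows how a reverse index from variables to monomials lets the objective difference between two discrete neighbors be computed by touching only the monomials that contain the changed variable.

```python
from dataclasses import dataclass


@dataclass
class Monomial:
    weight: float
    literals: list          # list of (variable i, value r) pairs


class Objective:
    def __init__(self, n, monomials):
        self.monomials = monomials
        # Reverse index: for each variable, the monomials it occurs in.
        self.occurs = [[] for _ in range(n)]
        for m in monomials:
            for i, _ in m.literals:
                self.occurs[i].append(m)

    @staticmethod
    def satisfied(m, x):
        return all(x[i] == r for i, r in m.literals)

    def value(self, x):
        """Full evaluation: total weight of all satisfied monomials."""
        return sum(m.weight for m in self.monomials if self.satisfied(m, x))

    def delta(self, x, i, new_r):
        """Objective change if variable i switches to new_r (x is restored)."""
        old_r = x[i]
        before = sum(m.weight for m in self.occurs[i] if self.satisfied(m, x))
        x[i] = new_r
        after = sum(m.weight for m in self.occurs[i] if self.satisfied(m, x))
        x[i] = old_r
        return after - before


obj = Objective(3, [
    Monomial(10, [(0, 1)]),
    Monomial(100, [(0, 1), (1, 0)]),
    Monomial(500, [(1, 2), (2, 2)]),
])
x = [1, 0, 2]
gain = obj.delta(x, 1, 2)   # only the two monomials containing variable 1 are visited
```

A greedy neighborhood search would call `delta` for every candidate move and apply the best one, so the cost per move depends on the number of monomials per variable rather than on the size of the whole objective.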
3.3.2 Instance Generation
The data of a C-SAP instance $(n, k, \mathcal{R}, T, w)$ consists of the number of variables n,
the number of values k, a set $\mathcal{R}$ of partial assignments defining the constraints, and
a set (T, w) of weighted partial assignments for the objective function.
We generated instances with n = 100 variables, k = 5 values and 10000 randomly
generated partial assignments of cardinality 3 as constraints. The objective function
is given by (T, w), which contains randomly generated partial assignments with
cardinality between 2 and 5 and all possible assignments with cardinality 1:
cardinality    number of clauses    weights in
1              500                  [10,20]
2              800                  [100,200]
3              1000                 [500,750]
4              800                  [750,1500]
5              200                  [1500,2500]
Table 3.1: Composition of weighted partial assignments (T, w).
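A generator for instances of this composition might look like the following Python sketch. Several details are assumptions, since they are not specified above: weights are drawn as uniformly distributed integers, duplicate partial assignments are not filtered out, and the function names are hypothetical.

```python
import random

# cardinality: (number of clauses, (minimum weight, maximum weight)), as in Table 3.1
COMPOSITION = {
    1: (500, (10, 20)),
    2: (800, (100, 200)),
    3: (1000, (500, 750)),
    4: (800, (750, 1500)),
    5: (200, (1500, 2500)),
}


def random_partial_assignment(rng, n, k, cardinality):
    """Pick distinct variables and assign a random value to each of them."""
    variables = rng.sample(range(n), cardinality)
    return frozenset((i, rng.randrange(k)) for i in variables)


def generate_instance(n=100, k=5, n_constraints=10000, seed=0):
    rng = random.Random(seed)
    constraints = [random_partial_assignment(rng, n, k, 3)
                   for _ in range(n_constraints)]
    objective = []
    for card, (count, (lo, hi)) in COMPOSITION.items():
        if card == 1:
            # All possible single assignments (n * k = 500 for the default sizes).
            clauses = [frozenset([(i, r)]) for i in range(n) for r in range(k)]
        else:
            clauses = [random_partial_assignment(rng, n, k, card)
                       for _ in range(count)]
        # Assumption: integer weights, uniform on [lo, hi].
        objective += [(c, rng.randint(lo, hi)) for c in clauses]
    return constraints, objective


constraints, objective = generate_instance()
```

Fixing the seed makes an instance reproducible, which is convenient when several heuristics are to be compared on the same data.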
This choice was motivated by the following observations: First, there is no obvious
relation between the best solutions found for the C-SAP and those for the
corresponding SAP. Second, the mixture of the partial assignments in T, satisfied by the
best solutions found, has no apparent pattern.
3.3.3 Numerical Results
Parameters
After having fixed the problem class, we performed some experiments to determine
the parameters of FPH. Regarding the recentering, we set recit := 8 and
reccon := 0.4. If recit is chosen too large, then it has a negligible effect, and if it
is too small, then the result is similar to choosing a completely new starting point.
Moreover, reccon determines how much the values of p are recentered.
As a stopping criterion we tested several different possibilities and their
combinations. We tried to stop when the maximum component $p_{ir}$ has reached a specified
limit, or when the difference between two consecutive p-values or objective values
is small enough. However, it turned out that the larger the number of
iterations is, the better the solutions found. Hence, for a given time limit, it was
only a question of finding a trade-off between the number of starting points and the
number of iterations. In these experiments we have chosen 25 starting points and
we set itlimit := 800.
Two further important parameters are the exponents $\alpha$ and $\beta$, which essentially
influence the behavior of FPH. Intuitively one can say that a large $\alpha$-value will increase
the objective value and a large $\beta$-value improves the feasibility. Unfortunately,
however, these two goals conflict, such that an improvement achieved
by raising one of the values mostly results in a worsening of the other.
Experiments with different combinations of these values have confirmed these statements.
The following diagrams in Figures 3.2 and 3.3 show for one instance the results for
100 starting points. Each bar in these diagrams corresponds to the solution of one
starting point, where its height corresponds to the objective value. All solutions
are sorted (along the x-axis) with respect to the number of unsatisfied constraints.
The values below the diagrams show how many feasible solutions (sat) were found
and how the objective values among them were distributed (minimum, median,
maximum). Besides demonstrating the behavior of the exponents, these figures
should also point out the great influence of the greedy algorithm (line 15) on the
number of feasible solutions. For this reason we depict for each pair $(\alpha, \beta)$ in the
left column the result obtained without greedy, and in the right column the one
with greedy.
In Figure 3.2 we have fixed $\beta = 1$ and we vary only the $\alpha$-exponent. The pictures
of each row correspond to the results for $F'[\Gamma\Theta]$, $F'[\Gamma^2\Theta]$ and
$F'[\Gamma^3\Theta]$, respectively. As expected, these figures demonstrate how the objective
value improves when raising $\alpha$ (compare med/max in each column); however, at the same
time the number of unsatisfied constraints increases as well.
FPH behaves similarly when $\alpha = 3$ is fixed and only the $\beta$-exponent varies. As
Figure 3.3 shows, the number of feasible assignments increases with increasing $\beta$,
while simultaneously the objective values decrease.
In further experiments it turned out that sometimes some additional minor
improvement can be achieved if $\beta < 1$ is chosen. Moreover, we observed that there
also exists an upper limit for $\alpha$ beyond which the objective values
begin to decrease. In the following tests we have chosen $\alpha := 4$ and $\beta := 0.8$.
As an alternative idea we also tried to change the exponents $\alpha$ and $\beta$ dynamically
during the algorithm, instead of fixing them in advance. However, this approach
of self-adapting parameters failed, since it only increased the running time without
improving the quality of the solutions, so that we refrained from using this concept
in FPH.
Comparisons of FPH to standard Simulated Annealing and Tabu Search (TS) have
shown that FPH is on average slightly superior to these heuristics (for the
described test set). Hence, we were interested in whether some additional modifications
of these methods and further refinement of their parameters would outperform
FPH. Towards this end A. Hertz and D. Kobler [HK99] developed, as an internal
challenge, a TS which was especially adapted for solving C-SAP instances of the
described type. Next we briefly describe the main concept of this TS and
subsequently present the results of the comparison.
[Figure 3.2 (bar charts); summary values of the six panels:]
panel   plot title             sat   min     med     max
(a)     oAg100t100 - normal    92    20459   29283   36730
(b)     oAg100t100 - greedy    97    20668   29532   37026
(c)     oAg200t100 - normal    35    26865   35060   40670
(d)     oAg200t100 - greedy    66    27082   35399   41459
(e)     oAg300t100 - normal     2    35392   39640   43888
(f)     oAg300t100 - greedy    36    29529   35504   45138
Figure 3.2: Behavior of FPH for $\Gamma\Theta$, $\Gamma^2\Theta$ and $\Gamma^3\Theta$ (left: without
greedy, right: with greedy).
[Figure 3.3 (bar charts); summary values of the six panels:]
panel   plot title             sat   min     med     max
(a)     oAg300t100 - normal     2    35392   39640   43888
(b)     oAg300t100 - greedy    36    29529   35504   45138
(c)     oAg300t200 - normal    28    26943   32975   41211
(d)     oAg300t200 - greedy    66    26334   31921   41362
(e)     oAg300t300 - normal    41    23523   31127   41899
(f)     oAg300t300 - greedy    83    24602   31022   41902
Figure 3.3: Behavior of FPH for $\Gamma^3\Theta$, $\Gamma^3\Theta^2$ and $\Gamma^3\Theta^3$ (left:
without greedy, right: with greedy).
Tabu Search (Hertz, Kobler)
The search space of the TS is $K^n$. Moreover, infeasible assignments x are penalized
by a function $f_\mathcal{R}(x)$ counting the number of unsatisfied constraints. On the other
hand, a function $f_T(x)$ is defined which favors assignments in which the satisfied
clauses in T have a large total weight and which additionally takes into account
how far unsatisfied clauses are from being satisfied. The cost function
is then defined by $f(x) := \lambda f_\mathcal{R}(x) - f_T(x)$, where $\lambda$ is a self-adjusting
parameter. The base length of the tabu list is set to 5 and in each iteration at most 70
neighbor solutions are visited, where preference is given to those assignments which modify
the value of a conflicting variable. In total 25000 iterations are performed. Since in
each iteration only a subset of neighbors is considered, at the end of TS a steepest
ascent method, similar to our greedy(p), is run.
Comparison
We generated 10 C-SAP instances as described in Section 3.3.2 and solved each of
them 10 times. We report the minimum, average and maximum value over these
10 runs. All these tests were carried out on a Sun Sparc Ultra 60 workstation
with 330 MHz. One run of TS with 25000 iterations took about 11 minutes, which
we have taken as a time limit. In approximately the same time we ran FPH for
25 starting points with 800 iterations each (9:30 minutes) and we took the best
solution found. In all runs feasible solutions have been found by both methods.
Table 3.2 shows the objective values found by FPH and TS, respectively. We see
that on average FPH was about 1.7% worse than TS. Nevertheless, FPH yielded
in one case a better maximum, in three cases a better average and in five cases
a better minimum value.
        FPH                        Tabu Search
        min      avg      max      min      avg      max
P1      45451    47930    50350    45437    48414    51937
P2      52683    54735    58800    53180    55369    58800
P3      54066    54910    57125    54114    56672    58645
P4      53025    54533    55910    54607    57794    59355
P5      54197    55699    56891    54002    57494    60652
P6      53399    54516    56606    52422    54990    57763
P7      52349    54568    56052    52189    54420    57857
P8      52525    54840    57183    53488    54727    57057
P9      54389    56515    57914    51256    56217    58711
P10     51642    53980    56345    53693    56048    60738
        0%       -1.7%    -3.1%
Table 3.2: Comparison of FPH to TS.
In all our tests, no cycling of FPH was observed. Moreover, FPH is very stable with
respect to shortage of CPU time. Even in a small fraction of the time needed by TS, FPH is
still able to find quite good, feasible solutions, whereas TS is completely lost, since
it does not have enough time to investigate a reasonable amount of the search space.
For both algorithms there exist a few parameters which need special care, since they
have a great influence on the performance of the heuristic. Most critical for TS is the
length of the Tabu list, where even a small change can improve (or worsen) the
results by several per cent. Similarly, we have seen in Figures 3.2 and 3.3 that for FPH
a good choice of the exponents $\alpha$ and $\beta$ is essential in order to find a good balance
between the number of feasible solutions found and the size of their objective values.
Summing up, we have seen that if TS is given all the time it needs, then it outperforms
FPH by about 1.7%. On the other hand, FPH has the advantage that it has quite
good worst case behavior and is able to find good, feasible solutions in a fraction of
the time needed by TS. Hence, FPH is an interesting alternative which may also be
considered in combination with other local search heuristics, due to its completely
different approach.
Chapter 4
The Point Feature Label
Placement Problem
4.1 Label Placement - State of the Art
Originally, the label placement problem consisted in placing text elements on
geographic maps, a task carried out by cartographers. To this day it is among
the most time-consuming processes in map manufacturing. However, nowadays
geographic maps are by no means the only area where label placement plays an
important role. Lately, technical maps in particular have become more and more important
and have influenced the layout design enormously. One typical example of such
a technical map can be found in the paper of Wagner and Wolff [WW97], who deal
with the problem of labeling groundwater drillholes with a block of measuring results.
Problem
The label placement problem consists of a set of (geographic) positions which
should be marked not only by symbols (features), but also with a corresponding
text giving a more detailed description of the location. Typical examples thereof are
the labeling of cities, mountains, but also rivers or countries. Therefore one generally
distinguishes between three different types of features: point features (cities,
mountain tops), line features (rivers, border lines) and area features (countries,
oceans).
Such placements should naturally fulfill a whole string of criteria such that the
final labeling looks clear and aesthetically attractive. Numerous authors have
investigated this topic and presented various sets of rules which describe desired
properties for placements. Of course, depending on the exact problem formulation
the importance of these rules varies.
In this chapter we consider only point features and we assume that a fixed number
of potential label positions for each point feature is given. The task of label
placement is to select for each point feature exactly one label position in order
to get an aesthetically attractive placement. In a mathematical model properties
describing such aesthetic attractiveness can either be formulated as constraints
(e.g. no overlaps allowed) or modeled in the objective function. In Section 4.3 we
present three different problem formulations, which can be modeled as C-SAPs
and which will be solved by FPH. All these problems have in common that
they want to find a placement which minimizes the number of overlapping labels;
however, they differ in the counting arguments for these overlaps and in additionally
desired aesthetic properties. One such criterion could be that not all potential
label positions are equally desirable. In this case priorities are often assigned to
the label positions, which can then be included in the objective function.
An optimal solution then corresponds to a placement with a minimum number
of overlaps which additionally fulfills the position priority list as well as possible.
The case of weighted label positions is discussed in many papers and even if not
all authors agree on the same priority ranking of certain label positions, they
nevertheless recognize a certain set of basic rules. Some aspects that could be used
to generate such position priorities are given by the following rules suggested by
Freeman and Ahn [FA87]:
1. Names should be placed horizontally and read from west to east.
2. Neither the characters nor the words of a point feature name should be spread
out.
3. A name should be placed close to the point which it refers to, with a fairly
small tolerance between allowed minimum and maximum distances.
4. Preference should be given to placement that causes a name to be read away
from the point feature rather than towards it.
5. Preference should be given to placement slightly above the feature over
placement slightly below the feature.
Table 4.1: Desired properties of an aesthetically attractive placement (from [FA87]).
Besides the described classical formulation of the label placement problem there
exist several related, no less difficult problem formulations which are also of great
practical relevance. These problems range from the generation of aesthetically
attractive label positions for a feature to the construction of different mathematical
models. One question of particular interest deals with finding the maximum
label size such that all features can be labeled without overlap (Wagner, Wolff
[WW95, WW97]). But the reverse question is also of interest: what scale is
necessary to place all labels (with a fixed minimum font size) without overlaps?
Finally, another problem which often arises in the context of label placement is the
problem of 'point selection'. If there does not exist a placement without overlaps
(or none could be found), then point selection can be applied, whose task consists in
finding a minimum number of features that may remain unlabelled in order to get
a labeling without overlaps.
State of the Art
In the sequel we want to give a brief survey of existing literature and the
development of label placement during the last 40 years.
One of the first papers dealing with label placement was published in the early
60s by Imhof [Imh62, Imh75]. He introduced the already mentioned division into
three feature types (points, lines and areas) and moreover drew some attention
to aesthetic criteria which clarify the layout and thus enhance the effectiveness
with which a map can be read. Exactly at this point Yoeli [Yoe72] continues
10 years later. He assumes that a fixed number of label positions is given and
shows examples of good and bad placements. This resulted in the development
of a priority list for potential label positions around a point feature as shown in
Figure 4.1 (low numbers refer to more desirable positions). Even if during the
following years occasional papers suggest the use of other priorities, most of them
are nevertheless based on those suggested by Yoeli.
2 1
4 3
Figure 4.1: Position priorities suggested by Yoeli [Yoe72].
Generally one can say that until the beginning of the 80s the label placement
problem was dealt with only occasionally from a mathematical point of view.
However, since then the interest in this field has increased rapidly, not least thanks to
increasing computation power. Researchers and scientists of different areas, ranging
from geographers and cartographers to computer scientists and mathematicians, have tried
their hand at this hard problem. Since nowadays especially technical maps with
continuously increasing sizes are needed, the use of computers for this problem
has gained interest. While an experienced cartographer needs several minutes to place
one label satisfactorily (which can take up to 50% of the complete production
time of maps), modern computers are able to place several thousand labels in the
same time. Unfortunately, despite this advantage in speed and although the quality
of the techniques has steadily improved, computer generated maps still do not reach
the quality of manually produced maps. For this reason even nowadays many well
designed maps are generated interactively.
During the last 20 years several different algorithms and methods were developed
to attack the label placement problem, some of which we will present in Section 4.2.
One idea of special interest was introduced by Hirsch [Hir82] who dispensed with
the fixed position hypothesis and developed a floating position strategy. Other
approaches ranged from integer programming (Zoraster [Zor86, Zor90]) to expert
systems ([Zor91]). Finally, there exists a broad class of so-called rule-based systems
([FA87, CJ90, WB91, DF92]) which are used successfully for dense maps. Before
we direct our attention to some of these methods in the next section, we want to
discuss the complexity of label placement.
Complexity
The following point feature label placement problem will be used throughout the
next sections. Its NP-hardness was proven independently by [KI88, FW91, MS91]:
Let a set of point features be given, where each of them has exactly four potential
label positions. We want to assign to every feature exactly one of these label
positions such that the number of labels which either overlap another point feature
or overlap one another is minimized. Note that this problem formulation is
completely independent of the shape of the labels (we can assume axis-parallel
rectangles of fixed size).
To this day only very few polynomially solvable special cases are known,
and those are of practically no significance. One such special case was stated by
Christensen et al. [CMS95] and consists of instances where no potential label
position overlaps more than one other potential label position. Another class
was found by Formann and Wagner [FW91], who could prove that the point feature
label placement problem where every label has at most two potential positions is
polynomially solvable.
Because of the difficulty of the problem, researchers dealt not only with the development
of heuristics but also with the approximation of this problem. Agarwal et al. [AvKS98]
presented a polynomial time approximation algorithm for problems with axis-parallel,
rectangular labels of arbitrary height and width with a ratio of $1/O(\log n)$.
Moreover, if the label height is fixed, then van Kreveld et al. [vKSW98] have proven that
the problem can be approximated with a factor of 1/2 in $O(n \log n)$ time.
4.2 Methods behind Label Placement
In this section we present some of the best known methods for the point feature
label placement problem. In this process we do not want to go down to the last
detail, but present a rough overview of the essential ideas. All methods presentedhere have in common that they can be used to solve the standard problem of placing
exactly one label to each feature such that the number of overlaps is minimized.
As already mentioned, in contrast to all other methods in this section, the
improvement method of Hirsch [Hir82] does not use a fixed number of label positions. Here
every point feature has an infinite set of potential label positions which are arranged
around the point. In case of a conflict, as for example the overlap of two labels, a
so-called overlap vector is computed whose length and direction describe where a
label should be moved to in order to resolve the conflict. If a label is involved in
more than one conflict, then the sum of all overlap vectors is taken. The resulting
vector is then used in some strategies to reposition overlapping labels, thus freeing
them from their conflicts. Unfortunately, this method has two disadvantages: the
algorithm may get stuck in a local optimum or oscillate between two configurations
without being able to resolve the conflict, or one overlap vector becomes so large
that there is not sufficient space left to move the corresponding label far enough in
the necessary direction.
Another completely different approach was taken by Zoraster [Zor86, Zor90]. He
formulates the problem as a 0-1 integer program where each point feature $i$,
$i = 1, \ldots, n$, has $k_i$ potential label positions. A 0-1 decision variable $x$ is used
to represent a labeling, where $x_{ir} = 1$, $1 \le i \le n$, $1 \le r \le k_i$, denotes a label
at position $r$ of feature $i$. Since we have to assign to every feature exactly one label
position, $\sum_{r=1}^{k_i} x_{ir} = 1$ must hold for all $1 \le i \le n$. Moreover, pairwise
overlaps of labels should be avoided, so we assume that the forbidden pairwise
combinations are given by a set $R$. The additional constraints
$x_{j_q s_q} + x_{j'_q s'_q} \le 1$ for each potential overlap $q \in R$, where $(j_q, s_q)$ and
$(j'_q, s'_q)$ denote the two label positions involved in $q$, guarantee that no two labels
overlap. Finally, each position is given a weight (priority) $w_{ir}$ such that the objective
is to minimize the overall weight. Since 0-1 integer programming itself is NP-hard, a
Lagrangian relaxation is formed by relaxing the last mentioned constraints and
introducing Lagrangian multipliers $d_q \ge 0$:
$$\begin{aligned}
\min\ & \sum_{i=1}^{n} \sum_{r=1}^{k_i} w_{ir} x_{ir} + \sum_{q \in R} \left( x_{j_q s_q} + x_{j'_q s'_q} - 1 \right) d_q \\
\text{s.t.}\ & \sum_{r=1}^{k_i} x_{ir} = 1, \qquad 1 \le i \le n, \\
& x_{ir} \ge 0, \qquad 1 \le i \le n,\ 1 \le r \le k_i, \qquad d_q \ge 0, \quad q \in R.
\end{aligned}$$
This relaxation is solved by subgradient methods with the aid of numerous tricks
to speed up convergence. Disadvantages of this method are that it may get stuck
in a local optimum and that it is very time consuming to solve such relaxations for
large problems.
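To make the structure of this relaxation more concrete, the following Python sketch performs one generic subgradient iteration. It is not Zoraster's implementation and omits all of his convergence tricks; the function names, the fixed step size and the tiny example data are assumptions. For fixed multipliers the relaxed problem separates by feature, so every feature simply picks the position with the smallest adjusted weight.

```python
def solve_relaxation(weights, overlaps, d):
    """Minimize the Lagrangian for fixed multipliers d.

    weights  : weights[i][r] = weight of position r of feature i
    overlaps : list of pairs ((i, r), (j, s)) of conflicting positions
    d        : list of multipliers, one per overlap, all >= 0
    """
    cost = [list(row) for row in weights]
    for (a, b), dq in zip(overlaps, d):
        cost[a[0]][a[1]] += dq
        cost[b[0]][b[1]] += dq
    # With the overlap constraints relaxed, each feature chooses independently.
    choice = [min(range(len(row)), key=row.__getitem__) for row in cost]
    bound = sum(cost[i][r] for i, r in enumerate(choice)) - sum(d)
    return choice, bound


def subgradient_step(choice, overlaps, d, step):
    """Raise the multiplier of violated overlaps, lower it otherwise (kept >= 0)."""
    new_d = []
    for (a, b), dq in zip(overlaps, d):
        g = (choice[a[0]] == a[1]) + (choice[b[0]] == b[1]) - 1
        new_d.append(max(0.0, dq + step * g))
    return new_d


# Two features with two positions each; the two cheap positions overlap.
weights = [[1, 2], [1, 2]]
overlaps = [((0, 0), (1, 0))]
d0 = [0.0]
choice0, bound0 = solve_relaxation(weights, overlaps, d0)
d1 = subgradient_step(choice0, overlaps, d0, step=1.5)
choice1, bound1 = solve_relaxation(weights, overlaps, d1)
```

In this toy example the first relaxed solution uses both overlapping positions, the multiplier is raised, and the second solution avoids the overlap. The bound improves from 2 to 2.5, and both values stay below the optimum 3 of the constrained problem.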
Also widespread are rule-based methods [FA87, CJ90, WB91, DF92]. These systems
normally deal with a very general form of the label placement problem (including
point, line and area features), where aesthetic aspects in particular are in the foreground.
The goal of these methods is to embody the expertise of a human cartographer in an
automated system, which is often achieved by the development of an expert system
and the construction of a rule database. Of course such rules cover all necessary
conditions like 'no overlaps are allowed', but also further aesthetical criteria like
those shown in Table 4.1. All these rules are evaluated and then applied according
to their priorities until a satisfactory labeling is found. If none of the placements is
acceptable then either the next-best rule is evaluated or a backtracking becomes
necessary.
Besides the already mentioned methods, the two local search methods Simulated
Annealing and Tabu Search play an important role. Neither algorithm needs any
further explanation, however, it should be stressed here that especially Simulated
Annealing came off extremely well in a comparison with other methods for attacking
the label placement problem ([CMS93, CMS95, CFMS97]). Further details on this
comparison will be given in Section 4.8.
The last method which we present here was developed for label placement with a
discrete set of potential label positions by Wagner and Wolff [WW98]. The approach
consists essentially of carefully designed preprocessing strategies operating on the
so-called conflict graph (where edges represent conflicting placements). By
checking certain (graph-)properties in the conflict graph, the number of potential label
positions can be reduced greatly, and simultaneously the operations used guarantee
that in this step the optimal solution is never destroyed. If no further reduction
is possible, then a simple heuristic is used to eliminate further potential positions
where still more than one possibility for a placement exists. This further reduces
the number of potential label positions, but no longer guarantees optimality. The
preprocessing part and the eliminating heuristic are applied alternately
until a feasible placement is found. A great advantage of this method is its high
speed, since the algorithm runs in linear/quadratic time (depending on the set of
operations being used).
4.3 Problem Descriptions
In this section we introduce some basic notation used throughout this chapter.
Moreover, we define three slightly different variants of the label placement problem
and discuss their similarities and differences.
The setting is the same for all problems: Let N := {1,..., n} be the set of point
features to be labeled. Every feature $i$ has a finite number of potential label positions
given by the set $K_i := \{1, \ldots, k_i\}$. For each of these label positions there exists
an additional positive priority or weight $w_{ir}$ for all $i \in N$, $r \in K_i$, where larger
weights correspond to more desirable label positions. In analogy to the previous
two chapters we denote by (i, r) an assignment of position $r \in K_i$ to feature $i \in N$.
Using this notation we can define the set $\mathcal{R}$ by all combinations of at most two
label positions which will overlap. Every element $R \in \mathcal{R}$, which we will also refer to
as a constraint, is either {(i, r)}, if the label of feature i at position r overlaps some
other point feature, or {(i, r), (j, s)}, $i \ne j$, if the corresponding two labels overlap
each other. Note that here partial assignments $R \in \mathcal{R}$ with |R| = 1 are allowed (in
contrast to Chapters 2 and 3) and we call $\mathcal{R}$ the set of forbidden assignments. The
goal of the point feature label placement problem is to assign to each point feature
exactly one of its potential positions such that all constraints in $\mathcal{R}$ are fulfilled.
Such an assignment is then called a placement or labeling. If in a placement all
positions of a forbidden assignment $R \in \mathcal{R}$ are used, then we speak of a
conflict.
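The construction of the set of forbidden assignments can be sketched in Python for the common four-position model. The geometric conventions are assumptions made for this illustration: all labels are axis-parallel rectangles of the same size attached to their point at a corner, the position numbering is arbitrary, and labels that merely touch do not count as overlapping.

```python
from itertools import combinations


def label_rect(point, pos, width, height):
    """Rectangle (x1, y1, x2, y2) of a label attached to the point at a corner.

    Hypothetical numbering: 0 = upper right, 1 = upper left,
    2 = lower right, 3 = lower left.
    """
    x, y = point
    x1 = x if pos in (0, 2) else x - width
    y1 = y if pos in (0, 1) else y - height
    return (x1, y1, x1 + width, y1 + height)


def rects_overlap(a, b):
    """True if the open rectangles intersect (touching does not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def covers_point(rect, q):
    return rect[0] < q[0] < rect[2] and rect[1] < q[1] < rect[3]


def forbidden_assignments(points, width, height, n_pos=4):
    rects = {(i, r): label_rect(p, r, width, height)
             for i, p in enumerate(points) for r in range(n_pos)}
    forbidden = set()
    # Constraints of cardinality one: a label covers another point feature.
    for (i, r), rect in rects.items():
        if any(covers_point(rect, q) for j, q in enumerate(points) if j != i):
            forbidden.add(frozenset([(i, r)]))
    # Constraints of cardinality two: labels of different features overlap.
    for (a, ra), (b, rb) in combinations(rects.items(), 2):
        if a[0] != b[0] and rects_overlap(ra, rb):
            forbidden.add(frozenset([a, b]))
    return forbidden


R = forbidden_assignments([(0, 0), (3, 0)], width=2, height=1)
R2 = forbidden_assignments([(0, 0), (1, 0.5)], width=2, height=1)
```

For the two points (0, 0) and (3, 0) only the labels facing each other overlap, which yields two constraints of cardinality two. In the second example the point (1, 0.5) lies inside the upper right label of the first point, so that position becomes a forbidden position.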
Subsequently we define the three most important problems studied in this chapter.
(P1) Minimize the number of overlaps
Find a placement such that the number of pairwise overlaps between two labels
plus the number of overlaps between a label and a point feature is minimized
while simultaneously the sum of the assigned position priorities is maximized.
(P2) Minimize the number of labels involved in a conflict
Find a placement such that the number of labels involved in a conflict is
minimized.
(P3) Point selection
By leaving a minimum number of point features unlabelled, find a conflict-free
placement for the remaining points.
Remarks
1. In problem (P1) two different objectives are given. In this case we mean an
optimization in lexicographical order, i.e. first minimize the number of pairwise
overlaps and among all assignments with such a minimum number find the one
which maximizes the given priority function.
2. In problem (P2) the formulation of 'minimizing the number of labels involved
in a conflict' is equivalent to 'maximizing the number of well placed labels'.
3. If it is not mentioned explicitly that position priorities are given for (P1), then we
will assume that all weights are equal, i.e. all positions are equally desirable
and therefore we use $w_{ir} = 1$ for all $i \in N$, $r \in K_i$. In this case the second part
of the objective in (P1) can be neglected.
In the sequel we want to clarify the difference between the objectives in (P1) and in
(P2). We show that even if there are no overlaps between labels and features, i.e.
$|R| = 2$ for all $R \in \mathcal{R}$, a reduction of the number of labels involved in a conflict does
not necessarily mean a reduction of the number of pairwise overlaps. The following
figure illustrates this difference with an example:
(a) 2 pairwise overlaps and 1 well placed label.
(b) 2 pairwise overlaps and 2 well placed labels.
Figure 4.2: Example demonstrating the difference between the objectives: 'minimize
pairwise overlaps' and 'minimize conflicting labels'.
In Figure 4.2, white rectangles refer to unused potential label positions, light gray
rectangles denote labels placed without overlaps and dark gray ones denote placed
labels which overlap. We see that in both figures there exist two pairwise overlaps;
however, in Figure 4.2(a) there is only one well placed label, whereas in Figure 4.2(b)
two labels are placed without overlap.
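The difference between the two objectives can also be checked on a small abstract instance. The following Python sketch uses invented data (five features, two positions each, three forbidden pairs) that reproduces the counts of Figure 4.2 but not its geometry; the function names are hypothetical.

```python
def conflicts(placement, forbidden):
    """Forbidden assignments whose positions are all used by the placement."""
    return [R for R in forbidden if all(placement[i] == r for i, r in R)]


def count_overlaps(placement, forbidden):
    """Objective of (P1) with equal weights: number of conflicts."""
    return len(conflicts(placement, forbidden))


def count_conflicting_labels(placement, forbidden):
    """Objective of (P2): number of labels involved in at least one conflict."""
    return len({i for R in conflicts(placement, forbidden) for i, _ in R})


# Invented forbidden pairs, each written as ((feature, position), (feature, position)).
forbidden = [
    ((0, 0), (1, 0)),
    ((2, 0), (3, 0)),
    ((1, 0), (2, 1)),
]
placement_a = [0, 0, 0, 0, 0]   # overlaps 0-1 and 2-3: four labels in conflict
placement_b = [0, 0, 1, 1, 0]   # overlaps 0-1 and 1-2: three labels in conflict

n = len(placement_a)
well_placed_a = n - count_conflicting_labels(placement_a, forbidden)
well_placed_b = n - count_conflicting_labels(placement_b, forbidden)
```

Both placements have two pairwise overlaps, so they are equivalent with respect to (P1), yet the second one has two well placed labels instead of one and is therefore better with respect to (P2).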
For general label placement problems it cannot be guaranteed that there always
exists a conflict-free placement. For this reason both (P1) and (P2) only seek to
minimize the number of conflicting labels. In contrast to this, (P3), the problem of
point selection, asks directly for a conflict-free placement. If not all labels can be
placed without conflict, then some point features are allowed to remain unlabeled
in order to avoid overlaps. Note, however, that even the problem of removing a
minimum number of labels from a given labeling is NP-hard, since this problem
corresponds to the vertex cover problem.
Before we start with a detailed description of the models in Section 4.5, we first
present in the next section a set of rules which can be applied to the problem in a
preprocessing step in order to reduce the search space.
4.4 Preprocessing Strategies
In this section we introduce several rules to preprocess the set $\mathcal{R}$ of forbidden
assignments in order to reduce the number of potential label positions (and therefore
the search space for our heuristic, which in turn speeds up the algorithm).
Primarily these rules are designed for the label placement problem (P3) with point
selection. In this case they guarantee that an optimal solution with the same objective
value as before the reduction also exists in the reduced search space. However, at
the end of this section we will briefly discuss how some of these rules can also be
used for (P1) or (P2) if all potential label positions are equally desirable.
Let us assume that forbidden combinations are given by the set $\mathcal{R}$, consisting of
constraints $R \in \mathcal{R}$ with $|R| \in \{1, 2\}$ as described in Section 4.3. Constraints of
cardinality one correspond to label positions which overlap another point feature.
We will refer to such constraints $R = \{(i,r)\}$ as forbidden positions since in this
case position $r$ of point feature $i$ must not be used. Constraints of cardinality two
denote pairs of overlapping labels.
Since we have only constraints of cardinality at most two, they can
also be represented by a so-called conflict graph. This graph is sometimes used to
help with the fast detection of certain label configurations that allow an elimination
of some label positions. More details on conflict graphs can be found in [WW98].
However, we will not deal here with details of the conflict graph; instead we present in
the sequel a set of preprocessing rules which is directly derived from the set $\mathcal{R}$ of
forbidden assignments.
Each rule is also illustrated by a figure, where white rectangles correspond to
potential label positions and light gray ones to positions which will be deleted after
the rule's application. A dot in a rectangle marks a forbidden position and finally,
labels which can be fixed are shown dark gray.
(R1) Position deletion:
If there exist point features with forbidden positions then all those positions
can be deleted. More precisely, for all features $i \in N$ with $\{(i,r)\} \in \mathcal{R}$
we delete position $r$ by setting $K_i := K_i \setminus \{r\}$. Additionally we remove all
constraints in $\mathcal{R}$ where $(i, r)$ is involved. Thus this procedure eliminates all
constraints $R \in \mathcal{R}$ with $|R| = 1$.
[Figure: illustration of rule (R1) with a forbidden position $(i, r)$.]
Note that an overlap of position $(i, r)$ with a feature $j \neq i$ does not necessarily
imply that $(i, r)$ overlaps some potential label positions of feature $j$ as well,
as can be seen in the right figure above.
Special case (Point deletion):
If all potential label positions of a point feature $i$ are forbidden, then this point
cannot be labeled without overlap and therefore $i$ has to stay unlabeled. All
constraints where $i$ is involved can be removed from $\mathcal{R}$.
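Rule (R1), including the special case of point deletion, can be sketched as follows. This is only an illustration under assumed data structures: `positions` maps each feature to its set $K_i$, and the constraint set is a set of frozensets of (feature, position) pairs; the function name `apply_r1` is hypothetical.

```python
def apply_r1(positions, constraints):
    """Delete forbidden positions and every constraint that involves them.

    positions:   dict feature -> set of potential positions (the sets K_i)
    constraints: set of frozensets of (feature, position) pairs, |R| in {1, 2}
    Returns (reduced positions, reduced constraints, features left unlabeled).
    """
    # Constraints of cardinality one are the forbidden positions.
    forbidden = {q for c in constraints if len(c) == 1 for q in c}
    new_positions = {
        i: {r for r in K if (i, r) not in forbidden} for i, K in positions.items()
    }
    # Remove all constraints in which a deleted position is involved.
    new_constraints = {c for c in constraints if not (c & forbidden)}
    # Point deletion: a feature without remaining positions stays unlabeled.
    unlabeled = {i for i, K in new_positions.items() if not K}
    return new_positions, new_constraints, unlabeled


positions = {1: {1, 2}, 2: {1}}
constraints = {
    frozenset({(1, 1)}),
    frozenset({(2, 1)}),
    frozenset({(1, 1), (2, 1)}),
    frozenset({(1, 2), (2, 1)}),
}
reduced = apply_r1(positions, constraints)
print(reduced)  # ({1: {2}, 2: set()}, set(), {2})
```

After the call no constraint of cardinality one remains, and feature 2 is reported as unlabeled because its only position was forbidden.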
(R2) Preselection:
If there exists a point feature i which has at least one potential position r
without conflict with another label or point feature, then the label can be
fixed at this position and all constraints where i is involved can be removed
from $\mathcal{R}$.
[Figure: illustration of rule (R2) with a conflict-free position $(i, r)$.]
(R3) Dominated positions:
If there exists a point feature $i$ with at least two non-forbidden positions $r$ and
$\bar r$, then we say $(i,\bar r)$ is dominated by $(i, r)$ if
$\{(j,s) \mid \{(i,\bar r), (j,s)\} \in \mathcal{R}\} \subseteq \{(j,s) \mid \{(i,r), (j,s)\} \in \mathcal{R}\}$.
In other words this means that all labels which
are in conflict with position $(i,\bar r)$ are also in conflict with position $(i,r)$. In
this case we can remove position $(i, r)$ as potential label position by setting
$K_i := K_i \setminus \{r\}$ and eliminate all constraints in $\mathcal{R}$ which contain $(i, r)$.
[Figure: illustration of rule (R3) with positions $(i, r)$ and $(i, \bar r)$.]
By this reduction optimal solutions are preserved (see also Wagner, Wolff
[WW98]) because if we have to place the label either at $(i,r)$ or at $(i, \bar r)$ then
placing it at $(i, \bar r)$ always results in fewer or equally many overlapped other labels
(see figure above).
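The subset test behind rule (R3) can be sketched as below. The sketch assumes that (R1) has already been applied, so only constraints of cardinality two remain, and it uses the same hypothetical encoding of constraints as frozensets of (feature, position) pairs. When two positions have identical conflict sets, this sketch removes the one with the smaller index, which is an arbitrary tie-breaking choice not specified in the text.

```python
def partners(q, constraints):
    """Positions that are in conflict with q = (feature, position)."""
    return {o for c in constraints if len(c) == 2 and q in c for o in c if o != q}


def apply_r3(positions, constraints):
    """Repeatedly remove a position (i, r) whose conflict set contains the
    conflict set of another position (i, r_bar) of the same feature."""
    positions = {i: set(K) for i, K in positions.items()}
    constraints = set(constraints)
    while True:
        removable = next(
            (
                (i, r)
                for i, K in positions.items()
                for r in sorted(K)
                for r_bar in sorted(K - {r})
                if partners((i, r_bar), constraints) <= partners((i, r), constraints)
            ),
            None,
        )
        if removable is None:
            return positions, constraints
        i, r = removable
        positions[i].discard(r)
        constraints = {c for c in constraints if removable not in c}


# Position (1, 1) conflicts with (2, 1); position (1, 2) has no conflicts.
result = apply_r3({1: {1, 2}, 2: {1}}, {frozenset({(1, 1), (2, 1)})})
print(result)  # ({1: {2}, 2: {1}}, set())
```

In the example, position (1, 1) is removed because every conflict partner of (1, 2) (there is none) is also a conflict partner of (1, 1).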
(R4) Pairwise conflicting solution:
This reduction was introduced in [WW98]. If there exists a point feature $i$
with a position $r$ which is only in conflict with some $(j, s)$, and $j \neq i$ has a
potential position $\bar s \neq s$ which is only overlapped by $(i, \bar r)$, $r \neq \bar r$, then we set
the labels $(i, r)$ and $(j, \bar s)$ and delete all other potential positions of $i$ and $j$.
[Figure: illustration of rule (R4) with positions $(i, r)$, $(i, \bar r)$, $(j, s)$ and $(j, \bar s)$.]
In the preprocessing phase we first apply rule (R1) on the constraint set to remove
superfluous constraints. Then for every point feature rules (R2)-(R4) are used
repeatedly until no further reduction is possible.
If all potential label positions are equally desirable then rules (R2)-(R4) can also be
applied to problems (P1) and (P2) (with the restriction that (R3) is only applied
to non-forbidden positions). However, (R1) must not be applied. On the one hand,
this is because of the special case point deletion, where all positions would be eliminated (a
situation only allowed when using point selection). On the other hand, even if there
exists a non-forbidden position, (R1) must not be used if we want to guarantee that
an optimal solution is obtained. Figure 4.3 shows a situation where placing a label
at position $(i, r)$ yields a better result, even though this is a forbidden position.
Figure 4.3: Example of why (R1) must not be used for (P1) or (P2) when optimality
should be preserved.
Finally, it should be mentioned that when some potential label positions of a feature
$i$ are deleted, then they are removed from the set $K_i$. In this case we also have
to reduce the number of potential positions, i.e. $k_i := k_i - 1$, and besides, we will
renumber the remaining label positions such that there exist no gaps and all integers
up to $k_i$ are used, i.e. $K_i = \{1, \ldots, k_i\}$.
4.5 Models for Label Placement
Having thoroughly discussed the different problem types and preprocessing strategies
in the previous sections, we concentrate now on the development of different
models for solving the described problems (P1)-(P3). All three problems will
be formulated as variants of a C-SAP, where depending on the model, forbidden
assignments are either included as constraints or in form of an objective function.
Possible combinations of the solutions of the C-SAP with additional postprocessing
procedures will be discussed in Section 4.6.
In this chapter we handle the notion of the 'C-SAP' loosely and allow in contrast to
Definition 1.2 that
(i) there may exist partial assignments $R \in \mathcal{R}$ with $|R| = 1$
(ii) each label $i \in N$ may have a different number $k_i$ of positions.
4.5.1 Minimizing the Number of Pairwise Overlaps (P1)
The objective of problem (P1) is to minimize the number of pairwise overlaps
and labels overlapping other point features while simultaneously maximizing a
position priority function. If (P1) has a solution without overlaps, then it can be
formulated by a C-SAP. For this reason we introduce for every point feature a
decision variable $x_i$, $i \in N$. Moreover, each potential label position corresponds to
a value $r \in K_i$ where we denote assignments $x_i = r$ by $(i,r)$. The constraint set
for the C-SAP is given by the set of forbidden assignments $\mathcal{R}$. Introducing finally
variables $p_{ir} \in \{0,1\}$, we get the following C-SAP for (P1):
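Model (M1):
$$
\begin{aligned}
\max\quad & z_1(p) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} w_{ir}\, p_{ir} && (4.1)\\
\text{s.t.}\quad & \sum_{r=1}^{k_i} p_{ir} = 1 \quad \forall i \in N && (4.2)\\
& p_{ir} \in \{0,1\} \quad \forall i \in N,\ r \in K_i && (4.3)\\
& \prod_{(i,r) \in R} p_{ir} = 0 \quad \forall R \in \mathcal{R}. && (4.4)
\end{aligned}
$$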
Note that in the special case when no position priorities $w$ are given (or all of them
are equal), the C-SAP (4.1)-(4.4) simplifies to a 2-SAT problem (since the
objective function is constant).
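To make model (M1) concrete, the sketch below evaluates the objective (4.1) and counts violated constraints (4.4) for an assignment $x_i = r$, and solves a tiny instance by enumeration. It is not FPH; the brute-force search is swapped in purely for illustration, and the names `evaluate_m1` and `brute_force_m1` as well as the instance data are hypothetical.

```python
from itertools import product


def evaluate_m1(assignment, weights, constraints):
    """Return (objective value (4.1), number of violated constraints (4.4)).

    assignment:  dict feature -> position, i.e. p_ir = 1 iff assignment[i] == r
    weights:     dict (feature, position) -> w_ir
    constraints: iterable of frozensets of (feature, position) pairs
    """
    chosen = set(assignment.items())
    # A constraint R is violated iff the product of p_ir over R equals one.
    violated = sum(1 for c in constraints if c <= chosen)
    objective = sum(weights[q] for q in chosen)
    return objective, violated


def brute_force_m1(positions, weights, constraints):
    """Enumerate all semi-assignments; return the best feasible one or None."""
    features = sorted(positions)
    best = None
    for combo in product(*(sorted(positions[i]) for i in features)):
        assignment = dict(zip(features, combo))
        objective, violated = evaluate_m1(assignment, weights, constraints)
        if violated == 0 and (best is None or objective > best[0]):
            best = (objective, assignment)
    return best


positions = {1: {1, 2}, 2: {1, 2}}
weights = {(1, 1): 8, (1, 2): 2, (2, 1): 8, (2, 2): 2}
constraints = [frozenset({(1, 1), (2, 1)})]

print(evaluate_m1({1: 1, 2: 1}, weights, constraints))  # (16, 1)
print(brute_force_m1(positions, weights, constraints))  # (10, {1: 1, 2: 2})
```

The assignment with the highest priority sum is infeasible here, so the best feasible solution gives up the preferred position for one of the two features.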
Moreover, we want to stress here that there is no guarantee whatsoever
that the C-SAP (4.1)-(4.4) has a feasible solution. However, when trying to solve
it by FPH, developed in the previous two chapters, we know that in order to find
a feasible assignment for $p$, the number of unsatisfied constraints (and therefore
the pairwise overlaps) is minimized and the objective value maximized. Thus FPH
performs exactly the two optimization tasks required by this problem.
(P1) is one of the most commonly studied formulations of the label placement
problem. The suggested C-SAP model (M1) together with FPH will also be used as a
basic approach for solving the other versions of the label placement problem. For
example, we will see in Section 4.6 that C-SAP (4.1)-(4.4) can also be combined
with a postprocessing procedure and then be used as a solution method for problem
(P2).
4.5.2 Minimizing the Number of Conflicting Labels (P2)
The second model is designed for problem (P2): minimize the number of
conflicting labels. We know already from Section 4.3 that (P2) is equivalent to
maximizing the number of well placed labels. On the other hand the examples
depicted in Figures 4.4 and 4.5 clearly demonstrate the difference between the
objectives of (P1) and (P2). For this reason we use a different approach
regarding the forbidden assignments in the model described in this section. We
refrain from using constraints (4.4) as in (M1) and model conflicting situations now
directly by a new objective function, thus transforming (P2) into an (extended) SAP.
Figure 4.4: Motivating example for the objective function of (P2). (a) 2 pairwise
conflicts with 4 bad labels. (b) 2 pairwise conflicts with 3 bad labels.
Let us first study the situation depicted in Figure 4.4 in order to construct criteria
for the new objective function. We see that there exist two possibilities to place the
label of feature 2: $(2,t)$ and $(2,b)$, where $t$ denotes the top position and $b$ stands for
bottom. Both placements have two unsatisfied constraints; however, the placement
in Figure 4.4(b) is preferable for (P2), since it results in a placement with one well
placed label.
Moreover, we see that features 3 and 4 are always overlapping each other and
therefore involved in a conflict. Hence our new objective function should prefer
position $(2,b)$ to $(2,t)$, because in the first case the number of well placed labels (1)
minus the number of additionally generated conflicts (0) is maximal.
Mathematically we can describe this situation as follows:
Let us denote the set of all potential conflict partners of a label $(i, r)$ by
$$
\mathcal{L}(i,r) := \{(j,s) \mid \{(i,r),(j,s)\} \in \mathcal{R}\}. \qquad (4.5)
$$
Consider a label placement with $p_{ir} = 1$, i.e. a label placed at position $(i,r)$ for
some $i \in N$, $r \in K_i$. The contribution of $(i,r)$ to the objective function (counting
the number of well placed labels) has to satisfy the following two criteria:
• If $(i, r)$ overlaps another label, then $(i, r)$ is not well placed and the contribution
to the objective function is zero. In this case the number of overlaps of $(i, r)$
with other labels is irrelevant.
• On the contrary, if none of the label positions in $\mathcal{L}(i, r)$ is going to be used,
then $(i, r)$ will be a well placed label and we add one to the objective value.
An objective function satisfying the described criteria is given by
$$
\bar z_2(p) := \sum_{i=1}^{n} \sum_{r=1}^{k_i} p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}). \qquad (4.6)
$$
We have already seen that for a feasible 0-1 assignment for $p$, $\bar z_2(p)$ counts the
number of well placed labels if $|R| = 2$ for all $R \in \mathcal{R}$. Moreover, for arbitrary
values $p_{ir} \in [0,1]$ for all $i \in N$, $r \in K_i$ the product in (4.6) is always less than or equal
to one and the semi-assignment constraints imply that for a fixed $i$ the second
sum in (4.6) is also less than or equal to one. Hence, for a fixed $i$, the maximum of this
sum is one, which is certainly attained if a label of feature $i$ is placed without conflict.
By the following small modification of $\bar z_2(p)$ we can also include constraints $R$ with
$|R| = 1$ in our objective function. Let us define
$$
c_{ir} := \begin{cases} 0, & \text{if } \{(i,r)\} \in \mathcal{R} \\ 1, & \text{otherwise} \end{cases} \qquad \forall i \in N,\ r \in K_i. \qquad (4.7)
$$
Then the objective function
$$
z_2(p) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} c_{ir}\, p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}) \qquad (4.8)
$$
counts the number of well placed labels.
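Hence, the problem of minimizing the number of conflicting labels (P2) can be
modeled by the following optimization problem:
Model (M2):
$$
\begin{aligned}
\max\quad & z_2(p) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} c_{ir}\, p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}) && (4.9)\\
\text{s.t.}\quad & \sum_{r=1}^{k_i} p_{ir} = 1 \quad \forall i \in N && (4.10)\\
& p_{ir} \in \{0,1\} \quad \forall i \in N,\ r \in K_i && (4.11)
\end{aligned}
$$
where $c_{ir}$ is defined by (4.7).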
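The objective (4.9) is easy to evaluate directly. The following sketch does so for the situation described for Figure 4.4, under an assumed encoding: features 1, 3 and 4 have a single position called "x" (a hypothetical name), feature 2 has the positions "t" and "b", and the conflicts are those implied by the text (feature 1 with $(2,t)$, feature 3 with $(2,b)$, and feature 3 with feature 4).

```python
from math import prod


def z2(p, constraints):
    """Evaluate the objective (4.9) for values p[(i, r)] in [0, 1].

    constraints: iterable of frozensets of (feature, position) pairs;
                 singletons are forbidden positions (c_ir = 0).
    """
    constraints = list(constraints)
    forbidden = {q for c in constraints if len(c) == 1 for q in c}
    total = 0.0
    for q, p_q in p.items():
        c_q = 0.0 if q in forbidden else 1.0
        # Conflict partners of q: the other element of each pair containing q.
        partners = {o for c in constraints if len(c) == 2 and q in c for o in c if o != q}
        total += c_q * p_q * prod(1.0 - p[o] for o in partners)
    return total


conflicts = [
    frozenset({(1, "x"), (2, "t")}),
    frozenset({(2, "b"), (3, "x")}),
    frozenset({(3, "x"), (4, "x")}),
]
base = {(1, "x"): 1.0, (3, "x"): 1.0, (4, "x"): 1.0}
placement_a = {**base, (2, "t"): 1.0, (2, "b"): 0.0}  # as in Figure 4.4(a)
placement_b = {**base, (2, "t"): 0.0, (2, "b"): 1.0}  # as in Figure 4.4(b)

print(z2(placement_a, conflicts))  # 0.0
print(z2(placement_b, conflicts))  # 1.0
```

For these 0-1 assignments the value equals the number of well placed labels: none in the first placement, one (feature 1) in the second.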
Remarks
1. We will refer to (4.9)-(4.11) as an extended SAP. The difference to the classical
SAP is that (4.9) is not an A-polynomial as given by Definition 2.4. Though
all exponents of the variables in (4.9) are still zero or one, there may now
exist monomials of the form $p_{ir}\, p_{js}\, p_{jt}$ with $s \neq t$. This is a consequence of
the definition of $\mathcal{L}(i, r)$ in (4.5) and the fact that label $(i, r)$ may overlap two
neighboring labels of the same point feature: $(j, s)$ and $(j, t)$.
However, though this destroys the 'linearity in $p_i$' property of the objective
function, other properties derived in Chapter 2 are still preserved. In particular,
the fact that F[Y] is a strict growth transformation still holds, since Theorem 2.2
was proven by Baum and Sell for arbitrary polynomials with positive
coefficients.
2. Let us compare the operators F[Q] for the unweighted C-SAP (4.2)-(4.4)
and F[Y] for the extended SAP (4.9)-(4.11). In this comparison we assume
that there are no forbidden positions and therefore $c_{ir} = 1$ for all $i \in N$, $r \in K_i$.
Using the notation of Chapter 3 we can express the objective function of the
extended SAP also by $z_2(p) = \sum_i \sum_r p_{ir}\, \Theta_{ir}(p)$ where $\Theta_{ir}(p)$ is defined as in
(3.6). If we denote by $Y_{ir}(z_2, p)$ the partial derivatives of $z_2$, we see immediately
that
$$
Y_{ir}(z_2, p) = \Theta_{ir}(p) - Q_{ir}(p) \qquad (4.12)
$$
with
$$
Q_{ir}(p) := \sum_{(j,s):\ (i,r) \in \mathcal{L}(j,s)} p_{js} \prod_{(u,v) \in \mathcal{L}(j,s) \setminus \{(i,r)\}} (1 - p_{uv}). \qquad (4.13)
$$
Thus the operator F[Y] of the extended SAP (4.9)-(4.11) and the operator
F[Q] of the unweighted C-SAP (4.2)-(4.4) of the previous model differ only
in the terms contained in $Q = (Q_{ir})$.
Note that if we assume that $p$ is a feasible 0-1 assignment, then we get the
following interpretation for $Q_{ir}(p)$: The sum in (4.13) goes over all potential
conflicting neighbors of $(i, r)$. Thus it counts the labels for which a conflict
would be resolved by deletion of label $(i, r)$.
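3. Unfortunately, the objective function $z_2$ in (4.9) contains negative coefficients
which may result in negative partial derivatives $Y_{ir}(z_2, p)$ as we have seen in
(4.12). An easy way to get rid of this problem lies in substituting the equality
$(1 - p_{js}) = \sum_{t \neq s} p_{jt}$ in the objective function. We get
$$
\tilde z_2(p) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} c_{ir}\, p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} \ \sum_{\substack{t=1 \\ t \neq s}}^{k_j} p_{jt} \qquad (4.14)
$$
which agrees with $z_2(p)$ on $\Delta_1 \times \cdots \times \Delta_n$ where
$\Delta_i := \{p \in [0,1]^{k_i} \mid \sum_{r=1}^{k_i} p_r = 1\}$ for all $i \in N$. Then its partial derivatives are given for all $i \in N$, $r \in K_i$ by
$$
Y_{ir}(\tilde z_2, p) = c_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}) \; + \sum_{\substack{(j,s):\ (i,t) \in \mathcal{L}(j,s) \\ t \neq r}} c_{js}\, p_{js} \prod_{(u,v) \in \mathcal{L}(j,s) \setminus \{(i,t)\}} (1 - p_{uv})
$$
and are therefore non-negative.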
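The claim that the substituted objective (4.14) agrees with (4.9) whenever the semi-assignment constraints hold can be checked numerically. The sketch below assumes there are no forbidden positions (all $c_{ir} = 1$) and uses a small hypothetical instance in which position (1, 1) conflicts with two positions of feature 2, so that a monomial of the form $p_{ir} p_{js} p_{jt}$ occurs; all function names are illustrative.

```python
import random
from math import prod


def partners(q, conflicts):
    """Conflict partners of q = (feature, position)."""
    return {o for c in conflicts if q in c for o in c if o != q}


def z2(p, conflicts):
    """Objective (4.9) with all c_ir = 1."""
    return sum(p[q] * prod(1.0 - p[o] for o in partners(q, conflicts)) for q in p)


def z2_tilde(p, positions, conflicts):
    """Objective (4.14): every factor (1 - p_js) is replaced by the sum of p_jt, t != s."""
    return sum(
        p[q] * prod(sum(p[j, t] for t in positions[j] if t != s)
                    for (j, s) in partners(q, conflicts))
        for q in p
    )


def random_simplex_point(positions, rng):
    """Random p with p_ir >= 0 and sum_r p_ir = 1 for every feature i."""
    p = {}
    for i, K in positions.items():
        raw = [rng.random() + 1e-9 for _ in K]
        for r, value in zip(K, raw):
            p[i, r] = value / sum(raw)
    return p


positions = {1: [1, 2], 2: [1, 2, 3], 3: [1, 2]}
conflicts = [
    frozenset({(1, 1), (2, 1)}),
    frozenset({(1, 1), (2, 2)}),
    frozenset({(2, 3), (3, 1)}),
    frozenset({(1, 2), (3, 2)}),
]
rng = random.Random(0)
gaps = []
for _ in range(100):
    p = random_simplex_point(positions, rng)
    gaps.append(abs(z2(p, conflicts) - z2_tilde(p, positions, conflicts)))
print(max(gaps) < 1e-12)  # True
```

Since (4.14) is a sum of products of non-negative variables with coefficients $c_{ir} \ge 0$, all of its coefficients are non-negative, which is what the growth transformation argument requires.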
4.5.3 Models for Label Placement with Point Selection (P3)
To model problem (P3), we extend each point feature $i \in N$ with a virtual label
position $(i, 0)$, thus increasing the set of potential label positions to $K_i \cup \{0\}$. This
new position stands for the predicate that 'feature $i$ stays unlabeled'.
Since we want to guarantee the existence of a feasible solution we do not constrain
these new positions and leave the constraint set $\mathcal{R}$ unchanged. However, in order to
avoid that these new positions are used too frequently and therefore too many
features remain unlabeled, we use an objective function which penalizes the positions
$(i, 0)$. In fact, we can use objective function (4.1) where we set $w_{i0} = a$, $w_{ir} = b$ for
all $r \in K_i$, $i \in N$ with $0 \le a < b$. Choosing $a = 0$ and $b = 1$, the objective function
counts for any feasible 0-1 vector $p$ the number of (well) placed labels. Hence,
the following C-SAP describes the label placement problem with point selection (P3):
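Model (M3):
$$
\begin{aligned}
\max\quad & \sum_{i=1}^{n} \sum_{r=1}^{k_i} p_{ir} = \sum_{i=1}^{n} (1 - p_{i0}) && (4.15)\\
\text{s.t.}\quad & \sum_{r=0}^{k_i} p_{ir} = 1 \quad \forall i \in N && (4.16)\\
& p_{ir} \in \{0,1\} \quad \forall i \in N,\ r \in K_i \cup \{0\} && (4.17)\\
& \prod_{(i,r) \in R} p_{ir} = 0 \quad \forall R \in \mathcal{R}. && (4.18)
\end{aligned}
$$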
Note that this C-SAP always has a feasible solution, since in the worst case all
features can be left unlabeled. The choice of the weights $a$ and $b$ has a non-negligible
effect on the performance of FPH. In our numerical experiments in Section 4.8
we have shifted the weights in order to avoid vanishing $p_{i0}$ terms in the objective
function: we set $w_{i0} = 1$ and $w_{ir} = 2$ for all $r \in K_i$, $i \in N$.
In our second model for label placement with point selection we combine the ideas
of the extended SAP of Section 4.5.2 and the concept of virtual label positions of
the previous model. The objective function consists once more of the maximization
of the number of well placed labels, however now with the additional possibility to
leave certain features unlabeled. This leads to the following extended SAP:
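$$
\begin{aligned}
\max\quad & z_3(p) := \sum_{i=1}^{n} \sum_{r=1}^{k_i} p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} \ \sum_{\substack{t=0 \\ t \neq s}}^{k_j} p_{jt} && (4.19)\\
\text{s.t.}\quad & \sum_{r=0}^{k_i} p_{ir} = 1 \quad \forall i \in N && (4.20)\\
& p_{ir} \in \{0,1\} \quad \forall i \in N,\ r \in K_i \cup \{0\}. && (4.21)
\end{aligned}
$$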
We know that $p_{i0}$ determines whether feature $i$ will stay unlabeled or not, and
therefore the same arguments as used for objective function (4.9) guarantee that
(4.19) counts the number of well placed labels correctly.
Note that this formulation has no constraint set. All constraints $R \in \mathcal{R}$ with
$|R| = 1$ can be eliminated in a preprocessing step by rule (R1). Constraints with
$|R| = 2$ are already covered by the objective function.
As numerical tests have shown, FPH (for the extended SAP) yields very poor results
for this model. The reason most probably lies in the structure of the
model. We use the virtual label positions $(i, 0)$ to deal with unlabeled features.
However, since this is an undesired state, the corresponding variables do not occur
in the objective function (they only enter after the transformation to obtain positive
coefficients). As a result, the partial derivatives corresponding to these variables
contain very little of the essential information that the
operator F[Y] depends on.
In order to overcome this problem we develop in the next section a different approach
for (P3) which is still based on the extended SAP (4.9)-(4.11), but does not use
virtual positions.
4.6 Postprocessing Strategies
In the sequel we describe how some of the models presented in the previous section
can be combined with a postprocessing procedure in order to improve the result
obtained by FPH. The first postprocessing is responsible for the task of minimizing
the number of overlapping labels. Postprocessing 2 converts an arbitrary labeling
into a conflict-free placement by removing a small number of conflicting labels, thus
satisfying the point selection criterion of problem (P3).
Postprocessing (PP1):
This postprocessing is used to minimize the number of badly placed labels for
a given label placement. To this end we reposition labels by a greedy
method in such a way that the number of (already existing) overlaps is increased
while simultaneously some other labels in a neighborhood of this conflict are freed. More
precisely, as long as we can improve the number of well placed labels, we repeat:
Choose a badly placed label and try to reposition it without producing a new
conflict. If such a repositioning frees another label of a conflict, then fix it at the
new position and repeat the step.
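One possible reading of (PP1) is sketched below. It is a simplification, not the author's implementation: a single badly placed label is moved whenever the set of well placed features strictly grows (so no well placed label is lost and at least one is gained). The function names and the data encoding, with conflicts as frozensets of (feature, position) pairs and the single position of features 1, 3 and 4 called "x", are assumptions; the instance mirrors the situation described for Figure 4.4.

```python
def well_placed(assignment, conflicts):
    """Set of features whose label is involved in no active conflict."""
    placed = set(assignment.items())
    bad = set()
    for c in conflicts:
        if c <= placed:
            bad |= {i for i, _ in c}
    return set(assignment) - bad


def postprocess_pp1(assignment, positions, conflicts):
    """Greedy single-label repositioning (simplified reading of (PP1))."""
    assignment = dict(assignment)
    improved = True
    while improved:
        improved = False
        good = well_placed(assignment, conflicts)
        for i in sorted(set(assignment) - good):           # badly placed labels
            for r in sorted(positions[i]):
                trial = {**assignment, i: r}
                if good < well_placed(trial, conflicts):   # strict gain, no loss
                    assignment, improved = trial, True
                    break
            if improved:
                break
    return assignment


positions = {1: ["x"], 2: ["t", "b"], 3: ["x"], 4: ["x"]}
conflicts = [
    frozenset({(1, "x"), (2, "t")}),
    frozenset({(2, "b"), (3, "x")}),
    frozenset({(3, "x"), (4, "x")}),
]
start = {1: "x", 2: "t", 3: "x", 4: "x"}   # placement as in Figure 4.4(a)
result = postprocess_pp1(start, positions, conflicts)
print(result[2], well_placed(result, conflicts))  # b {1}
```

The loop terminates because the set of well placed features grows strictly with every accepted move.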
Figure 4.5 shows even more drastically than Figure 4.2 that minimizing the number
of pairwise overlaps (P1) may even contradict the objective of maximizing the
number of conflict-free labels (P2). Figure 4.5(a) depicts a placement with only
three pairwise overlaps and no well placed labels. After application of (PP1) we get
the labeling shown in Figure 4.5(b) with four pairwise conflicts but also one well
placed label.
Figure 4.5: Contradicting objectives and improvement by postprocessing (PP1).
(a) Placement after the optimization phase. (b) Placement improved by postprocessing 1.
It should be obvious that this method just tries to locally reposition one label in
order to maximize the number of well placed labels. This is also the reason why,
for example, the placement shown in Figure 4.2(a) cannot be improved by this
postprocessing. In that situation two labels have to be repositioned in order to gain an
improved placement. Changing only one label does not improve (and may even worsen)
the number of well placed labels, which is the reason why (PP1) does not work there.
Postprocessing (PP2):
Most algorithms dealing with point selection (P3) try in a first step to minimize
the number of overlapping labels, e.g. by solving (P1) or (P2), and afterwards perform
a postprocessing in which conflicting labels are removed. The goal of (PP2) is
to transform an arbitrary placement into a conflict-free placement satisfying the
point selection criterion by removing a small number of labels.
The main idea of (PP2) is to delete labels in decreasing node-degree order in the
conflict graph of a fixed placement. Hence, first labels which overlap many other
labels are deleted, and only at the end pairwise conflicts are resolved. In detail, the
procedure works as follows:
1. Delete all labels which overlap another point feature (apply (R1))
2. Compute for the label of each point feature i the number d(i) of labels which
it overlaps
3. Remove labels in decreasing d(i) order until all conflicts are resolved
Figure 4.4 shows that it can be advantageous to perform postprocessing
(PP1) before continuing with the point selection procedure described above. In
the first labeling shown in Figure 4.4(a), $d(i) = 1$ for $i = 1, \ldots, 4$ and two labels
(e.g. of features 1 and 3) have to be removed in order to get a conflict-free labeling.
In Figure 4.4(b) we have $d(1) = 0$, $d(2) = d(4) = 1$ and $d(3) = 2$. Therefore we first
remove the label of feature 3, thus already resolving all conflicts and obtaining three well
placed labels.
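The three steps of (PP2) and the Figure 4.4 example can be sketched as follows. Two details are assumptions of this sketch and not stated in the text: the degrees $d(i)$ are recomputed after every removal, and ties are broken in favour of the smallest feature index. The encoding (single position "x" for features 1, 3 and 4) is hypothetical.

```python
from collections import Counter


def postprocess_pp2(assignment, conflicts):
    """Remove labels in decreasing conflict degree until no conflict is left."""
    placed = set(assignment.items())
    # Step 1: delete labels that overlap a point feature (|R| = 1).
    placed -= {q for c in conflicts if len(c) == 1 for q in c}
    while True:
        active = [c for c in conflicts if len(c) == 2 and c <= placed]
        if not active:
            return dict(placed)
        # Step 2: d(i) = number of placed labels overlapped by the label of i.
        degree = Counter(q for c in active for q in c)
        # Step 3: remove a label of maximum degree (ties: smallest feature).
        placed.remove(max(sorted(degree), key=degree.__getitem__))


conflicts = [
    frozenset({(1, "x"), (2, "t")}),
    frozenset({(2, "b"), (3, "x")}),
    frozenset({(3, "x"), (4, "x")}),
]
labeling_a = {1: "x", 2: "t", 3: "x", 4: "x"}   # as in Figure 4.4(a)
labeling_b = {1: "x", 2: "b", 3: "x", 4: "x"}   # as in Figure 4.4(b)

print(sorted(postprocess_pp2(labeling_a, conflicts)))  # [2, 4]
print(sorted(postprocess_pp2(labeling_b, conflicts)))  # [1, 2, 4]
```

Starting from the first labeling two labels are removed, while starting from the second labeling only the label of feature 3 is removed, which matches the counts given above.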
4.7 Ambiguity
In the case of sparse maps, conflict-free placements can normally be found easily.
Hence, in these situations the problem is not so much the satisfaction of all
feasibility constraints in $\mathcal{R}$, but more the issue of avoiding ambiguous placements.
When solving problem (P1) we use position priorities as weights for the label
positions, which leads in the corresponding C-SAP model to the linear objective
function (4.1). Obviously, (4.1) does not avoid ambiguity, as the placement in
Figure 4.6 shows:
Figure 4.6: Resolving ambiguous placements. (a) Very little space between labels may
result in ambiguous placements. (b) Alternative placement without danger of ambiguity.
If we assume the position priorities suggested by Yoeli [Yoe72] (Figure 4.1, upper
right position is best), then the placement shown in Figure 4.6(a) is optimal for
problem (P1). However, the reader may find maps where labels lie close together
difficult to read and prefer the placement shown in Figure 4.6(b). Especially in
situations where two (or more) different corners of labels lie close to the center of
one point feature (here 1 and 3), the placement may become ambiguous such that
the reader cannot be sure which label belongs to which feature.
In order to minimize the occurrence of such ambiguous placements we present in
the sequel two problem formulations dealing with this aspect. In both cases we
want to avoid labels being placed too close to each other, thus preventing the
reader from mixing up different names and avoiding confusion due to ambiguity.
We assume in the remainder of this section that all labels are given by axis-parallel
rectangles where the (bounding box of the) label of feature $i$ has length $l_i$ and
height $h_i$. Moreover, we do not take into account any given position priorities
in the following two problem formulations, but discuss at the end of this section
a possible modification of the objective function of the second model, such that
position priorities can also be used.
Since ambiguity is only a local property, the new objective functions of the
problems will also be based on local observations. Therefore we restrict the search
for potentially ambiguous placements between two features $i$ and $j$ with $i > j$ to only
those features $j$ which lie in a neighborhood of $i$. This neighborhood is given by an
axis-parallel rectangle $R_i(j)$ whose size depends on the dimensions of the labels of
features $i$ and $j$. The threshold for the critical distance between two labels within
which ambiguity can occur is set to 10% of the label's length and height. Therefore
we define the neighborhood $R_i(j)$ for a feature $i$ with respect to $j$ by the axis-parallel
rectangle of length $2.2(l_i + l_j)$ and height $2.2(h_i + h_j)$, centered at feature $i$ (see
Figure 4.7). This neighborhood was chosen such that point features which lie far apart
(more than 10% of their length/height) are neglected, since they do not influence
each other. On the other hand, if feature $j$ lies in $R_i(j)$ it is regarded as a
potential candidate for an ambiguous placement and in this case we write $f(j) \in R_i(j)$.
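The membership test $f(j) \in R_i(j)$ amounts to two interval checks. A minimal sketch, assuming features are given by their coordinates and labels by (length, height) pairs; the function name `in_neighborhood` and the parameter `factor` are hypothetical, and points on the boundary are counted as inside.

```python
def in_neighborhood(feature_i, size_i, feature_j, size_j, factor=2.2):
    """Check whether feature j lies in the rectangle R_i(j) centered at feature i.

    feature_*: (x, y) coordinates of the point feature
    size_*:    (length, height) of the feature's label
    R_i(j) has length factor * (l_i + l_j) and height factor * (h_i + h_j).
    """
    (xi, yi), (li, hi) = feature_i, size_i
    (xj, yj), (lj, hj) = feature_j, size_j
    half_length = factor * (li + lj) / 2
    half_height = factor * (hi + hj) / 2
    return abs(xj - xi) <= half_length and abs(yj - yi) <= half_height


label = (50, 15)  # label size used for the instance discussed later in this section
print(in_neighborhood((0, 0), label, (100, 30), label))  # True
print(in_neighborhood((0, 0), label, (120, 0), label))   # False
print(in_neighborhood((0, 0), label, (0, 40), label))    # False
```

For two labels of size 50 x 15 the rectangle has length 220 and height 66, so a feature is considered only if it is at most 110 units away horizontally and 33 units vertically.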
Figure 4.7: Neighborhood $R_i(j)$ for point feature $i$ used by (AMBI1) and (AMBI2).
The objective of the first problem (AMBI1) is, aside from minimizing the number
of overlaps, to maximize the sum of the distances between the centers of all
non-forbidden label positions $(i,r), (j,s)$ for all features $j < i$ which lie in $R_i(j)$.
This can also be formulated by means of a quadratic A-polynomial used as an
objective function for the basic C-SAP of (M1) presented in Section 4.3.
Let $(i, r)$ and $(j, s)$ be two non-conflicting label positions. Then we define the weight
$w_{irjs}$ as the distance between the centers of the two labels $(i, r)$ and $(j, s)$. Thus the
new objective function is given by
$$
z_4(p) := \sum_{i=1}^{n} \ \sum_{\substack{j \in N,\ j < i: \\ f(j) \in R_i(j)}} \ \sum_{\substack{r \in K_i,\ s \in K_j: \\ \{(i,r),(j,s)\} \notin \mathcal{R}}} w_{irjs}\, p_{ir}\, p_{js} \qquad (4.22)
$$
and the C-SAP (4.22), (4.2)-(4.4) models (AMBI1). When solving this C-SAP by
FPH we search for a conflict-free placement while simultaneously maximizing the
distances between pairs of labels in the given neighborhood. One typical result of
this method is shown in Figure 4.6(b) where the distances between the label centers
are maximal.
Unfortunately, this formulation does not cope with all possible situations:
though it normally works very well, there exist situations where aesthetically
non-attractive placements cannot be avoided by (AMBI1). In some cases two labels
are placed close to each other since their distances to a third label sum up to a
larger objective value (see Figure 4.8): Assume all labels are 4 x 2 rectangles and
the centers of the label positions are given by the following coordinates: L1: (2,1),
L2top: (4,3), L2bottom: (4,5), L3: (7,9). Then an easy computation shows that the
objective function is
$$
z_4(p) = \sqrt{8}\, p_1 p_{2t} + \sqrt{20}\, p_1 p_{2b} + \sqrt{45}\, p_{2t} p_3 + 5\, p_{2b} p_3.
$$
Therefore the placement shown in Figure 4.8(a) with $p_{2t} = 1$ has objective value
$\sqrt{8} + \sqrt{45} \approx 9.54$, which is slightly larger than the objective value of the
placement shown in Figure 4.8(b) with $p_{2b} = 1$, which is $5 + \sqrt{20} \approx 9.47$.
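The arithmetic of this example can be verified with a few lines. The sketch uses the center coordinates given above and, like the objective function in the text, only the pairs that involve label 2; the dictionary keys and the function name `ambi1_value` are illustrative names.

```python
from math import dist

# Centers of the label positions as given in the example.
centers = {"L1": (2, 1), "L2top": (4, 3), "L2bottom": (4, 5), "L3": (7, 9)}


def ambi1_value(position_of_label_2):
    """Sum of center distances from the chosen position of label 2 to L1 and L3."""
    c2 = centers[position_of_label_2]
    return dist(centers["L1"], c2) + dist(c2, centers["L3"])


print(round(ambi1_value("L2top"), 2))     # 9.54
print(round(ambi1_value("L2bottom"), 2))  # 9.47
```

The top position wins by about 0.06 even though it places label 2 much closer to label 1, which is the weakness of (AMBI1) illustrated in Figure 4.8.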
Figure 4.8: Placement where (AMBI1) does not work. (a) Placement with larger objective
value, found by (AMBI1). (b) Aesthetically more attractive placement.
In order to avoid such situations we propose a second problem formulation called
(AMBI2).
The goal of (AMBI2) is to avoid placements where label edges lie very close
together. Hence, we do not compute the distances between the centers of two
labels, but the distances between the closest parallel edges. Identifying such
situations is straightforward for any person looking at a placement, but not at all
so clear for a computer. Let us look at the three placements depicted in Figure 4.9.
It is obvious that in Figure 4.9(a) we should compute the distance between the
horizontal edges and in Figure 4.9(b) the one between the vertical edges of the two
labels. However, in Figure 4.9(c) the smallest distance between two parallel edges
would be between the horizontal edges (equal to zero), though the two labels lie
horizontally far apart.
Figure 4.9: Distance rules for placement with (AMBI2). (a) Vertical distance between
two labels. (b) Horizontal distance between two labels. (c) Horizontal distance,
though the vertical distance is smaller (zero).
For this reason we have to refine our criterion by additionally taking into account
the relative positions of the labels to each other. We sort the labels by
$(x, y)$-coordinates and compute the line through both label centers. Then we check
whether it intersects first the horizontal or the vertical edge of the first label. In
the first case we compute the vertical distance between the labels, in the second
one the horizontal (see distance $d$ in Figure 4.9).
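The distance rule just described can be sketched as follows for axis-parallel labels given by center and (length, height). This is an illustration under assumptions: the function name `ambi2_distance` is hypothetical, a line leaving the first label exactly through a corner is treated like the vertical-edge case, and negative gaps are clamped to zero.

```python
def ambi2_distance(center_a, size_a, center_b, size_b):
    """Distance between the relevant parallel edges of two axis-parallel labels."""
    # Sort by (x, y) so that "the first label" is well defined.
    (c1, s1), (c2, s2) = sorted([(center_a, size_a), (center_b, size_b)])
    (x1, y1), (l1, h1) = c1, s1
    (x2, y2), (l2, h2) = c2, s2
    dx, dy = abs(x2 - x1), abs(y2 - y1)
    # The line through both centers leaves the first label through a horizontal
    # edge iff it reaches height h1/2 before length l1/2: dy/(h1/2) > dx/(l1/2).
    if dy * l1 > dx * h1:
        return max(0, dy - (h1 + h2) / 2)   # vertical distance
    return max(0, dx - (l1 + l2) / 2)       # horizontal distance


size = (50, 15)
print(ambi2_distance((0, 0), size, (0, 20), size))    # 5.0  (stacked labels)
print(ambi2_distance((0, 0), size, (60, 0), size))    # 10.0 (side by side)
print(ambi2_distance((0, 0), size, (100, 15), size))  # 50.0 (vertical gap would be 0)
```

The third call corresponds to the situation of Figure 4.9(c): the horizontal edges touch, but the labels lie horizontally far apart, so the horizontal distance is returned.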
Let $d_{irjs}$ denote the appropriate distance between two parallel edges of the two labels
at positions $(i, r)$ and $(j, s)$ as described before. Moreover, we define a maximum
distance $d_{\max}$, where it is assumed that any label farther apart than $d_{\max}$ is
non-critical with respect to danger of confusion. Now we can define the weights
$w_{irjs} \in [0,1]$ for all non-conflicting pairs of potential label positions $(i, r), (j, s)$ with $i > j$
and $f(j) \in R_i(j)$ by
$$
w_{irjs} := \begin{cases} 1, & d_{irjs} > d_{\max} \\ d_{irjs}/d_{\max}, & \text{otherwise} \end{cases} \qquad (4.23)
$$
Using these weights in objective function (4.22), (AMBI2) can be modeled by
the same C-SAP as (AMBI1), given by (4.22), (4.2)-(4.4). Note, however, that in
contrast to (AMBI1), in the objective function of the second model often several of
the weights $w_{irjs}$ computed for the features $i$ and $j$ may be equal. This comes from
the fact that, on the one hand, distances are measured with regard to the same edge
even for different label positions of the same feature and, on the other hand, large
distances are ignored and all set to one.
If, in addition to the described ambiguity distances, we also want to handle given
position priorities, we can proceed with the following modification of the objective
function. Instead of using the weights suggested in (4.23), we multiply them by
the position priorities and include in this way also the placement preferences in the
objective function (4.22). Of course, the weights of all label positions of features $j$
which do not lie in the neighborhood $R_i(j)$ remain unchanged and are therefore set
to their corresponding position priority.
It should be mentioned here that the two described methods could either be applied
to the whole problem or used in a postprocessing step for any region of the map
which the reader may find difficult to read.
The three pictures in Figure 4.10 represent typical placements for an instance
with 150 point features on a map of size 792 x 612. Every point feature has four
potential label positions and each label has a fixed size of 50 x 15. Moreover,
the critical distance $d_{\max}$ for (AMBI2) is set to the label height, $d_{\max} := 15$, and
the weights for the objective function in (AMBI2) are those defined by (4.23)
multiplied by 100. Figure 4.10(a) shows a placement found by FPH for model (M1)
without any postprocessing. It uses Yoeli's position priorities with the weights
$w_{i1} = 8$, $w_{i2} = 2$, $w_{i3} = 1$, $w_{i4} = 4$, if numbered clockwise starting with the upper
right position. Figures 4.10(b) and 4.10(c) depict placements found by (AMBI1)
and (AMBI2) respectively, where we refrained from using position priorities
in order to demonstrate the results of these methods without any
side effects. We see that in all three cases all labels have been placed without overlap.
Since avoidance of ambiguity is one of the aesthetic criteria for label placement,
every problem formulation dealing with this aspect represents only one subjective view. Hence, it is not possible to compare the results of different problem formulations
and say which one is best. If this were possible and such a criterion for
comparison existed, then we could use it directly as the objective function
of a new problem formulation. However, as we see from Figures 4.10(b) and
4.10(c), both ambiguity avoiding methods distribute the labels much more evenly
throughout the free space than the standard method in Figure 4.10(a). We see that both methods, (AMBI1) and (AMBI2), have regions where labels are
placed very well and others where the other method found a better placement. In
most of these situations a close placement of two labels is unavoidable and the
only choice is whether to place a label close to one or the other label. For this
reason, and since these methods could also be used locally, it is up to the designer of the map to choose the appropriate placement strategy depending on the current
situation.
4.8 Computational Results
The following subsections summarize the results of our numerical tests and our
experience with the different models developed in the previous sections. We start in
Subsection 4.8.1 by describing the test set used in these experiments.
Subsequently we discuss the usage of the preprocessing rules (R1)-(R4) and point out
their effectiveness regarding the reduction of the search space. In Subsection 4.8.3
we compare the placements obtained by using FPH on models (M1) with postprocessing
(PP1) and (M2), as well as (M3) and (M4) in case of point selection. Then these
results are compared to the best-known method from the literature for this test set, where
again both cases, with point selection allowed and prohibited, are discussed. Finally,
we conclude this section with some implementation details and a further description
of the parameters used in this comparison.

Figure 4.10: Placements of ambiguity avoiding strategies (AMBI1) and (AMBI2). (a) 150 labels placed by (M1) with position priorities. (b) Placement found by (AMBI1). (c) Placement found by (AMBI2).
4.8.1 Test Set
Our numerical experiments were carried out on a test set introduced by Christensen
et al. [CMS95]. In recent years this test set has become a major benchmark
for comparing different methods for the label placement
problem and is therefore widely used.
An instance consists of a set of randomly placed point features on a map of fixed
size, where the dimensions of the map and labels were chosen by the authors to
reflect a typical map scale for an 11 by 8.5 inch page. Hence, for one
instance n point features are placed randomly on a map of size 792 by 612. Each
point feature has four potential label positions (upper left, upper right, lower left
and lower right) and all labels are axis-parallel rectangles of fixed size 30 x 7 with no position
priorities given. The objective always consists in either solving problem (P2), i.e.
finding a placement which maximizes the number of well placed labels, or solving
problem (P3) when point selection is allowed.
Tests were run in particular for medium dense (n = 750) and dense (n = 1500) maps.
In all figures of the following subsections, labels overlapping other labels
or point features are shown in black and labels placed without conflict in light gray.
By common agreement we assume that in all examples conflict-free labels crossing the bounding box of the map are counted as well placed labels.
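The test set described above can be sketched in a few lines. The following is only an illustration under stated assumptions: the position numbering, the use of a uniform random generator and the strict-interior overlap test are choices of this sketch, not details taken from [CMS95]. Labels that cross the map border are not penalized, in line with the convention above.

```python
import random

MAP_W, MAP_H = 792, 612      # map size from the test set description
LABEL_W, LABEL_H = 30, 7     # fixed label size

def label_rect(point, pos):
    """Axis-parallel label rectangle (x0, y0, x1, y1) with one corner at the point.
    pos: 0 = upper right, 1 = upper left, 2 = lower left, 3 = lower right
    (this numbering is an assumption of the sketch)."""
    x, y = point
    x0 = x if pos in (0, 3) else x - LABEL_W
    y0 = y if pos in (0, 1) else y - LABEL_H
    return (x0, y0, x0 + LABEL_W, y0 + LABEL_H)

def rects_overlap(a, b):
    """True if the open interiors of two rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def point_in_rect(p, r):
    """True if point p lies strictly inside rectangle r."""
    return r[0] < p[0] < r[2] and r[1] < p[1] < r[3]

def count_bad_labels(points, placement):
    """Number of labels overlapping another label or another point feature."""
    rects = [label_rect(p, pos) for p, pos in zip(points, placement)]
    bad = 0
    for i, r in enumerate(rects):
        hits_label = any(j != i and rects_overlap(r, s) for j, s in enumerate(rects))
        hits_point = any(j != i and point_in_rect(q, r) for j, q in enumerate(points))
        bad += hits_label or hits_point
    return bad

def random_instance(n, rng):
    """n point features placed uniformly at random on the map."""
    return [(rng.uniform(0, MAP_W), rng.uniform(0, MAP_H)) for _ in range(n)]
```

A random placement for such an instance is obtained with `[rng.randrange(4) for _ in points]`, which corresponds to the trivial baseline used later in the comparison.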
4.8.2 Reduction by Preprocessing
In this section we investigate the effectiveness of the preprocessing rules (R1)-(R4) presented in Section 4.4. We applied these rules to 10 instances of the test set
described in Section 4.8.1 for n = 750 and n = 1500, respectively. Table 4.2 shows
the average results for the smaller problem and Table 4.3 those for the larger one,
where we additionally distinguish between the cases in which point selection is allowed
or forbidden. Since every point feature has four potential label positions, every
instance of the smaller problem has a total of 3000 potential positions.
with point selection     fixed point features    removed label positions    eliminated constraints
                         number       %          number       %             number       %
orig. problem               750                    3000                       7734
Rule 1                        6     0.80            931    31.03              5538    71.61
Rules 2-4                   497    66.27           1555    51.83              1678    21.70
total                       503    67.07           2486    82.86              7216    93.31
remaining                   247    32.93            514    17.14               518     6.69

without point selection  fixed point features    removed label positions    eliminated constraints
                         number       %          number       %             number       %
orig. problem               750                    3000                       7734
Rules 2-4                   226    30.13            962    32.07              1750    22.63
remaining                   524    69.87           2038    67.93              5984    77.37

Table 4.2: Average reduction of constraints by pre-processing for problems with
n = 750.
The first table shows the results for instances where point selection is allowed.
Only in that case may rule (R1) be applied. Moreover, it is always used before
the other rules, since it reduces the constraint set and therefore also influences
the reduction achieved by rules (R2)-(R4). However, once it has been applied, it
need not be applied again. In contrast to this, (R2)-(R4) have to be
applied several times in order to achieve a maximum reduction (see also Section 4.4).
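Why reduction rules of this kind must be applied repeatedly can be illustrated with a small sketch. The rule used here is only a stand-in that is common in the label placement literature (fix a label position that has no remaining conflict and drop the other positions of that feature); it is not claimed to be one of the thesis's rules (R2)-(R4), whose definitions are given in Section 4.4. The data layout and the function name are likewise assumptions of this sketch.

```python
def reduce_until_stable(candidates, conflicts):
    """candidates: dict feature -> set of position ids (modified in place).
    conflicts: set of frozenset({(i, r), (j, s)}) pairs that must not be chosen together.
    Repeatedly applies one stand-in rule: if some position of a feature has no
    remaining conflict, fix it and drop the feature's other positions."""
    fixed = {}
    changed = True
    while changed:                      # repeat until a full pass changes nothing
        changed = False
        for i in list(candidates):
            if i in fixed:
                continue
            for r in sorted(candidates[i]):
                if not any((i, r) in c for c in conflicts):
                    fixed[i] = r
                    dropped = {(i, s) for s in candidates[i] if s != r}
                    candidates[i] = {r}
                    # conflicts involving a dropped position disappear,
                    # which may free positions of other features
                    conflicts = {c for c in conflicts if not (c & dropped)}
                    changed = True
                    break
    return fixed, candidates, conflicts
```

In the small example used to check this sketch, feature 2 can only be fixed in a second pass, after the conflicting position of feature 1 has been dropped in the first pass.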
The values in the first column correspond to the number of labels which could be
fixed (and thus completely eliminated) during the preprocessing phase. However, if point selection is allowed there are two possibilities to fix a label: either a point feature stays unlabeled or, on the contrary, its label can be placed without conflict.
The first situation can only occur after application of rule (R1), hence the number
in the row for (R1) corresponds to the number of unlabeled point features. On
the other hand, fixing of label positions with respect to conflict-free placements is
carried out in rules (R2) and (R4). The corresponding row in the tables states the
number of labels placed without overlap. In both cases the corresponding variables
need not be included in the optimization phase.
The second column gives the number of removed label positions, which is the sum of all positions that are either eliminated or fixed.
Finally, the last column in Table 4.2 shows the number of eliminated constraints.
As expected, the major reduction is achieved by rule (R1), and only about 20% could
be eliminated by the other rules. In total, however, more than 90% of all constraints
could be removed by the preprocessing rules (R1)-(R4).
Comparing the situation where point selection is allowed to the one where it is
forbidden (lower part of Table 4.2), we see that in the first case about 2/3 of all
points could be fixed in advance, whereas in the second case only about 1/3 of
them were fixed. This is a direct consequence of the application of rule (R1) in the
first case, which eliminates a large number of label positions (and constraints) and
therefore makes subsequent placement easier.
with point selection     fixed point features    removed label positions    eliminated constraints
                         number       %          number       %             number       %
orig. problem              1500                    6000                      30521
Rule 1                      115     7.61           3113    51.88             25760    84.40
Rules 2-4                   141     9.40            475     7.92               832     2.73
total                       256    17.01           3588    59.80             26592    87.13
remaining                  1244    82.99           2412    40.20              3929    12.87

without point selection  fixed point features    removed label positions    eliminated constraints
                         number       %          number       %             number       %
orig. problem              1500                    6000                      30521
Rules 2-4                    42     2.80            196     3.27               583     1.91
remaining                  1458    97.20           5804    96.73             29938    98.09

Table 4.3: Average reduction of constraints by pre-processing for problems with
n = 1500.
Comparing the results of Table 4.2 to those of the larger instances shown in
Table 4.3, we see that in both cases the largest number of constraints is eliminated
by rule (R1). It can also be observed that the percentage of this reduction is
higher for dense problems, because in that case the probability of the occurrence of
forbidden positions (overlapped point features) is much higher.
All other reductions are smaller for the larger instances, since it is much more difficult
to fix labels or remove positions when there is little free space left. The situation
becomes even worse for dense problems with forbidden point selection (see lower
part of Table 4.3). In that case all reductions are less than 5%. However, since
the preprocessing rules can be carried out extremely fast they are used even in this
situation.
4.8.3 Comparison of the Models
To compare our different models we used the test set described in Section 4.8.1 and
generated 25 instances with 750 and 1500 point features each.
Models
Throughout the remaining three subsections we denote the models and corresponding
methods as follows: (M1) is the C-SAP (4.1)-(4.4) for label placement without
point selection as described in Section 4.5.1. The second model without point
selection, (M2), is defined by the extended SAP (4.9)-(4.11).
For label placement with point selection we have the C-SAP (4.15)-(4.18) as (M3). Furthermore we define another new method, (M4), in order to overcome the weakness
of the model given by (4.19)-(4.21). (M4) does not use virtual label positions,
but is based on an extended SAP. First, we apply the preprocessing rules (R1)-(R4) to the constraints and then we solve the extended SAP (4.9)-(4.11) as described
in Section 4.5.2. After each run of FPH we fix all well placed labels. This reduces
the problem size and also modifies the constraint set. We continue by solving the
remaining problem (again applying (R2)-(R4)) and repeat this procedure as long as an improvement can be achieved. Only then do we apply postprocessing (PP2) to
get a final, conflict-free placement.
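The control flow of (M4) described above can be sketched as a simple loop. This is only an outline under assumptions: the three callables are hypothetical placeholders for one FPH run, for the test which labels are well placed, and for fixing labels followed by renewed preprocessing; "improvement" is interpreted here as "at least one additional well placed label was found".

```python
def iterate_fix_and_resolve(problem, solve, well_placed, restrict):
    """All three callables are hypothetical placeholders:
    solve(problem) -> dict feature -> position (one heuristic run, e.g. FPH),
    well_placed(problem, placement) -> set of features whose labels have no conflict,
    restrict(problem, fixed) -> smaller problem after fixing and re-preprocessing."""
    fixed = {}
    while True:
        placement = solve(problem)
        good = well_placed(problem, placement) - set(fixed)
        if not good:
            # no further improvement: hand the rest over to postprocessing
            return fixed, placement
        fixed.update({i: placement[i] for i in good})
        problem = restrict(problem, fixed)
```

The function returns the labels fixed so far together with the last placement of the remaining features, which would then be passed to a postprocessing step such as (PP2).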
All C-SAPs and extended SAPs are solved by FPH. Further details on these
methods and their parameter settings for FPH can be found in Section 4.8.5.
Comparison
The following table summarizes the test results for the two models without point selection. Since the objective function in (M1) minimizes only the pairwise overlaps, we performed postprocessing (PP1) after the optimization phase. In (M2) the objective function already maximizes the number of well placed labels directly, and therefore no postprocessing is necessary. For each instance we used 10
different starting points and performed 1000 iterations for each of them. From
these 10 values we determined the best, average and worst solution.
                        750 point features      1500 point features
                        bad labels      %       bad labels      %
(M1)         worst        100.24     13.37       1078.16     71.88
             average       97.11     12.95       1071.48     71.43
             best          94.44     12.59       1064.80     70.99
(M1)+(PP1)   worst         77.80     10.37        817.48     54.50
             average       75.00     10.00        808.63     53.91
             best          72.24      9.63        799.48     53.30
(M2)         worst         67.80      9.04        673.32     44.89
             average       64.84      8.65        667.10     44.47
             best          62.00      8.27        661.32     44.09

Table 4.4: Comparison of (M1) and (M2).
Table 4.4 shows the average number of badly placed labels over the 25 instances
for the worst, average and best label placement found. The second column of each block
shows the percentage of bad labels with respect to the total number of point features.
We clearly see the improvement from top to bottom: the worst case
of each new model is always better than the best solution of the previous one.
It is not surprising that (M1) used without postprocessing achieves quite poor
results, since its objective of minimizing the number of pairwise overlaps gives at best just an idea of where to search for good solutions. (M2) performs very
well and its results come close to those of the best method (Simulated Annealing)
of the comparison carried out by Christensen et al. [CMS95] (see Section 4.8.4).
Besides these results we also experimented with the number of iterations in order to
see whether an increased number of iterations (and running time) can achieve better
results. The average results of the 25 instances for 200, 1000 and 10000 iterations
are shown for n = 750 in Table 4.5.
(M2)          200 iterations       1000 iterations      10000 iterations
750 points    bad labels     %     bad labels     %     bad labels     %
worst            71.64     9.55       67.80     9.04       65.64     8.75
average          67.96     9.06       64.84     8.65       62.92     8.39
best             64.52     8.60       62.00     8.27       60.44     8.06

Table 4.5: Varying the number of iterations for (M2) with 750 point features.
Solving an instance for one starting point with 1000 iterations took about 22
seconds on a Sun Ultra Sparc workstation. Hence, it follows from the above results
that if time is not a critical factor, the number of iterations can be increased
in order to improve the quality of the label placement. However, if time is critical, then the improvement of less than 1% probably does not justify
the longer running times.
Two typical placements are depicted in Figure 4.11. In the first picture, Figure 4.11(a), we see the placement found after 1000 iterations by (M1) with postprocessing
(PP1). It has 79 badly placed labels. Figure 4.11(b) shows the result
for the same instance solved by (M2), which is slightly better and has only 67 bad
labels.
4.8.4 Comparison to Other Heuristics
Since we have seen in the experiments of the last subsection that model (M2) works best for label placement without point selection, we now want to compare this best
model with other heuristics developed for this task.

Figure 4.11: Comparison of (M1) and (M2) on a random map with 750 point features, point selection prohibited. (a) (M1) with postprocessing (PP1): 79 bad labels. (b) (M2) after 1000 iterations: 67 bad labels.
As already mentioned in Section 4.8.1 we use the test set introduced by Christensen
et al. [CMS95] of 25 randomly generated maps with n = 750 and n = 1500 point features. In their paper the authors compared several different heuristics on this
test set and in all cases Simulated Annealing achieved the best results.
A summary of their results for the problem without point selection is given in Table 4.6.
It shows the ranking of the three best methods, the number of conflicting labels and the approximate fraction of badly placed labels. The names of the methods
refer to those described in Section 4.2.
           750 point features                 1500 point features
ranking    method            bad     %        method              bad     %
1.         Sim. Annealing     60     8        Sim. Annealing      615    41
2.         Hirsch            135    18        Gradient Descent    975    65
3.         Zoraster          157    21        Hirsch             1140    76

Table 4.6: Comparison of methods for instances without point selection. Taken from
Figure 11 in [CMS95].
To carry out a fair comparison on the same platform we also implemented the
described Simulated Annealing and compared its results to our method. Moreover, we included in our tests a random placement, in order to get a trivial lower bound
on the achievable quality, and a greedy procedure consisting of our postprocessing (PP1) applied to a random
placement.
The following tables summarize the results of our tests. The parameters for
Simulated Annealing were chosen as suggested in [CMS95] and are described in
Section 4.8.5. However, in contrast to their paper, where every instance was solved only once, we solve each of the 25 instances 10 times, taking the best, average and
worst result of these 10 runs.
On a Sun Ultra Sparc workstation, one run of Simulated Annealing took 18 seconds
for n = 750 and 84 seconds for n = 1500, slightly less time than needed for 1000
iterations of (M2).
For some instances the best placements were found by (M2) and for others by Simulated Annealing. As we can see from Table 4.7, Simulated Annealing is on
average slightly better than (M2). However, for both problem sizes the difference is
at most 0.7%, in the best as well as in the worst case. Hence, (M2) can be placed second in the ranking shown in Table 4.6. Some typical placements for one instance
of this test series are shown in Figure 4.12.
                          750 point features      1500 point features
                          bad labels      %       bad labels      %
Sim. Ann.      worst         63.08      8.41        663.13     44.21
               average       60.75      8.10        656.92     43.79
               best          58.32      7.78        650.80     43.39
(M2)           worst         67.80      9.04        673.32     44.89
               average       64.84      8.65        667.10     44.47
               best          62.00      8.27        661.32     44.09
Greedy (PP1)   worst        214.40     28.59       1002.80     66.85
               average      198.39     26.45        982.16     65.48
               best         182.36     24.31        962.88     64.19
Random         worst        550.84     73.45       1383.04     92.20
               average      532.78     71.04       1365.52     91.03
               best         513.12     68.42       1350.24     90.02

Table 4.7: Comparison of different heuristics for random maps without point selection.
Table 4.8 shows the results of a comparison of (M3) and (M4) with Simulated
Annealing for problem (P3), label placement with point selection. Both C-SAP
models are combined with postprocessing (PP2) in order to guarantee a conflict-free
placement. All three methods achieved very similar results, with only small differences in solution quality. (M4) found slightly better results for the smaller problems, whereas Simulated Annealing was better for
the larger ones.
                          750 point features      1500 point features
                          bad labels      %       bad labels      %
(SA)           worst         48.08      6.41        548.80     36.59
               average       45.85      6.11        541.77     36.12
               best          43.60      5.81        535.44     35.70
(M3)+(PP2)     worst         46.88      6.25        554.32     36.95
               average       45.91      6.12        550.80     36.72
               best          45.12      6.03        547.52     36.50
(M4)+(PP2)     worst         46.24      6.17        554.32     36.95
               average       44.90      5.99        549.53     36.64
               best          43.64      5.82        545.72     36.38

Table 4.8: Comparison of different heuristics for random maps with point selection.
A point of greater difference is the speed of the algorithms: Simulated Annealing needs about 35 seconds per run for the smaller and 122 seconds for the larger
problem, whereas (M4) needs only 6 and 50 seconds, respectively. The fastest method (due to the large elimination of label positions in the preprocessing) was (M3) with only
4 and 22 seconds.

Figure 4.12: Comparison of different methods for n = 750 (left) and n = 1500 (right) features. Black labels (number in brackets) indicate conflicting labels. Panels: Simulated Annealing (64 / 667), (M2) (67 / 679), Greedy (PP1) (198 / 973), Random (528 / 1364).
Finally, it should be mentioned that, with our re-implementation of Simulated
Annealing, we could not reproduce on our test instances the good results reported in
[CMS95], namely only about 27% badly placed labels for 1500 point features.
Summing up, we see that the results of (M2) and (M4) are fully comparable to
those of Simulated Annealing, the best-known method for this benchmark test set.
Moreover, the combination of an extended A-polynomial as objective function and
a set of forbidden assignments as constraint set also allows this model to be combined
with additional aspects, e.g. aesthetic ones, as we have seen for the
avoidance of ambiguity in Section 4.7. Many other aspects, such as the size of cities
(i.e. the importance that a point is labeled) or label preferences, could be included
in the objective function in this way.
4.8.5 Implementation Details and Parameter Settings
In this subsection we discuss some implementation aspects and describe the
parameters of FPH used in our tests.
All methods were implemented in C and run on a Sun Ultra Sparc workstation.
The visualization tool was implemented by B. Mateev in Objective-C under
NeXT-Step on a PP200. Instances to be visualized were generated by the visualization tool, which determined the sets of weighted partial assignments for the objective function and the forbidden partial assignments for the constraints.
These two sets were then used as input for the algorithm (which can also be used
stand-alone), which computed a placement in the form of an assignment vector.
In the sequel we describe the parameters of FPH which were used in our comparisons
for solving the C-SAPs (and their extensions). The algorithm was implemented as described in Section 3.3.1. For this reason we report here only the parameter
settings and the combination with pre- and postprocessing.
Recentering:
All methods recenter the variables $p_{ir}$, $i \in N$, $r \in K_i$, every 10 iterations.
The goal of this recentering is to prevent the algorithm from getting stuck too early at
the boundary of the feasible region. For this reason we add the constant 0.6 to all
values in $p$ and normalize afterwards, such that $\sum_{r \in K_i} p_{ir} = 1$ for all $i \in N$.
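The recentering step just described amounts to a shift followed by a row-wise normalization. A minimal sketch, assuming that $p$ is stored as a list of rows (one row per point feature) and that `recenter` is a name chosen only for this illustration:

```python
def recenter(p, shift=0.6):
    """Add a constant to every entry and renormalize each row to sum 1.
    p: list of rows, one per point feature; each row holds the values p_ir."""
    out = []
    for row in p:
        shifted = [x + shift for x in row]
        total = sum(shifted)
        out.append([x / total for x in shifted])
    return out
```

For a row that has already converged to a vertex, e.g. `[1.0, 0.0, 0.0, 0.0]`, the step returns `[1.6/3.4, 0.6/3.4, 0.6/3.4, 0.6/3.4]`, so every position regains a strictly positive weight and the iterate moves away from the boundary.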
Operator/Iteration/Computation:
As described in Section 3.3 we again use the Gauss-Seidel version $F'[\xi]$ of
the operator with fitness function $\xi := Y^{\alpha} \Theta^{\beta}$ for the C-SAP. Again the exponents
$\alpha, \beta > 0$ were used to control the intensity of either dynamics. We experimented with many different values for the exponents and finally set them to $\alpha := 10$ and
$\beta := 0.5$, which seems to work best for these label placement instances.
To avoid numerical instability as far as possible, we initialize $\Theta$ with a large constant $\Theta_{init} := 10^{100}$, such that successive multiplication with factors of the form $1 - \prod(\cdot)$ will not
cause numerical problems too soon. The stopping criterion was a fixed
number of iterations, which was set to 1000 unless stated otherwise.
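The effect of a large initial constant can be illustrated with plain double precision arithmetic. The sketch below is not the thesis's update rule; it merely multiplies a start value repeatedly by a hypothetical factor below one and shows that a larger start value postpones underflow to zero.

```python
def repeated_product(start, factor, steps):
    """Multiply `start` by `factor` a given number of times (double precision)."""
    value = start
    for _ in range(steps):
        value *= factor
    return value

# Starting at 1.0, 1200 halvings underflow to exactly 0.0 in IEEE double precision,
# whereas starting at 1e100 the result is still a positive, representable number.
small_start = repeated_product(1.0, 0.5, 1200)
large_start = repeated_product(1e100, 0.5, 1200)
```

Since the dynamics only depend on ratios of such values within a normalization, scaling all of them by the same constant is harmless as long as no overflow occurs; this last remark is an assumption about how the values are used, not a statement taken from the implementation.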
Combination with Pre- and Postprocessing:
The two models (M1) and (M2) for label placement without point selection
are generally used together with the preprocessing rules (R2)-(R4). However,
(M1) reduces in the unweighted case to a Satisfiability problem, using only the
$\Theta$-dynamics, and (M2) corresponds to an extended SAP which does not use the
constraints in $\mathcal{R}$. Since (M1) does not minimize the number of badly placed labels
directly, it is normally used in combination with postprocessing (PP1).
The other two methods, (M3) and (M4), are designed for label placement with point selection. (M3) consists of an objective function and the constraint set $\mathcal{R}$ and
is therefore a classical C-SAP. The weights used in the objective function are set to
$w_{i0} = 1$ and $w_{ir} = 2$ for all $r \neq 0$, $i \in N$. (M4) is, in analogy to (M2), an extended
SAP which does not use virtual label positions. Both methods use the preprocessing rules (R1)-(R4). However, since in (M4) FPH is applied more than once, each time
fixing the well placed labels in between, we use (R1) only the first time. Due to
the application of preprocessing rule (R1), the extended SAP for (P3) can be solved
much faster than for (P2). Moreover, (M3) and (M4) are both combined with
postprocessing (PP2) to ensure a conflict-free placement.
Simulated Annealing:
The parameters for Simulated Annealing were taken from the paper [CMS95]. The
feasible set for configuration changes was the entire set of labels. The initial temperature
$T_0$ in the annealing schedule was chosen such that a worse placement is
accepted with probability $p = \exp(-\Delta E / T) = 2/3$ when the change in the objective function is $\Delta E = 1$. At each temperature a maximum of $20n$ labels are repositioned and then the temperature is decreased by 10 per cent. If more than $5n$ successful
configuration changes are made at any temperature, the temperature is immediately decreased. This process is repeated for at most 50 temperature stages; it
terminates earlier if the algorithm stays at a particular temperature for the full $20n$
steps without accepting a single label repositioning.
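The annealing schedule described above can be sketched as follows. This is an outline under assumptions, not the implementation used in the experiments: `propose` and `commit` are hypothetical callbacks supplied by the caller, and the initial acceptance probability is an adjustable parameter whose default of 2/3 is the value attributed here to [CMS95].

```python
import math
import random

def initial_temperature(p_accept=2 / 3, delta_e=1.0):
    """Solve p_accept = exp(-delta_e / T0) for T0."""
    return -delta_e / math.log(p_accept)

def anneal(n, propose, commit, rng, max_stages=50, cooling=0.9):
    """n: number of point features.
    propose() -> (move, delta_e): a candidate repositioning and its change in the objective.
    commit(move): applies an accepted move. Returns the final temperature."""
    temperature = initial_temperature()
    for _ in range(max_stages):
        accepted = 0
        for _ in range(20 * n):                 # at most 20n repositionings per stage
            move, delta_e = propose()
            if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
                commit(move)
                accepted += 1
                if accepted > 5 * n:            # more than 5n successes: cool down at once
                    break
        if accepted == 0:                       # full stage without any accepted move
            break
        temperature *= cooling                  # decrease temperature by 10 per cent
    return temperature
```

With the default parameters the initial temperature is $T_0 = -1/\ln(2/3) \approx 2.47$.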
Appendix A
Gauss-Seidel Version of the
Operator
In most of our numerical experiments we have used a variant $F'[\xi]$ of operator $F[\xi]$ of (1.20), which updates the components of a point $p$ sequentially. For this reason
$F'[\xi]$ is called the Gauss-Seidel version of $F[\xi]$ and is defined as follows:

Definition A.1 (Operator $F'[\xi]$)
Let $\xi = (\xi_{ir})$ be a fitness function on $\Delta^0$. We define for each $j \in N$ a mapping
$F_j[\xi] : \Delta^0 \to \Delta^0$ by
\[
F_j[\xi](p)_{ir} :=
\begin{cases}
\dfrac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)} & i = j \\[2ex]
p_{ir} & \text{otherwise}
\end{cases}
\qquad \forall i \in N,\ r \in K.
\]
Then the operator $F'[\xi] : \Delta^0 \to \Delta^0$ is defined by
\[
F'[\xi] := F_n[\xi] \circ \dots \circ F_1[\xi]. \tag{A.1}
\]

We see that between $p$ and $F'[\xi](p)$, $n - 1$ so-called intermediate points are visited, which we denote as follows: For all $j \in N$ and any fixed $p \in \Delta^0$ we define
\[
p^{(j)} := F_j[\xi](p^{(j-1)}), \quad \text{with } p^{(0)} := p,
\]
and we therefore get $p^{(n)} = F'[\xi](p)$.
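The difference between the simultaneous operator and its Gauss-Seidel version can be made concrete with a small sketch. The toy objective $z(p) = p_{11}p_{21} + 2\,p_{12}p_{22}$ and its gradient, used as fitness function, are assumptions chosen only for illustration (indices are zero-based in the code).

```python
def toy_fitness(p):
    """Gradient of the toy objective z(p) = p[0][0]*p[1][0] + 2*p[0][1]*p[1][1]."""
    return [[p[1][0], 2 * p[1][1]],
            [p[0][0], 2 * p[0][1]]]

def update_row(p, xi, j):
    """F_j: rescale row j by its fitness and renormalize; other rows are unchanged."""
    f = xi(p)
    denom = sum(p[j][s] * f[j][s] for s in range(len(p[j])))
    q = [row[:] for row in p]
    q[j] = [p[j][r] * f[j][r] / denom for r in range(len(p[j]))]
    return q

def gauss_seidel_step(p, xi):
    """F' = F_n o ... o F_1: rows are updated one after another and the
    fitness is re-evaluated at every intermediate point."""
    for j in range(len(p)):
        p = update_row(p, xi, j)
    return p

def simultaneous_step(p, xi):
    """F: all rows are updated at once from the same fitness values."""
    f = xi(p)
    return [[pr * fr / sum(a * b for a, b in zip(row, frow))
             for pr, fr in zip(row, frow)]
            for row, frow in zip(p, f)]
```

Starting from the barycenter `[[0.5, 0.5], [0.5, 0.5]]`, both versions map the first row to `[1/3, 2/3]`, but the Gauss-Seidel version already uses this new row when updating the second one and yields `[0.2, 0.8]` instead of `[1/3, 2/3]`.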
Regarding the fixed points of the operators F and F', the following property holds:
Proposition A.2
Let $\xi$ be a fitness function on $\Delta^0$. Then the operators $F := F[\xi]$ and $F' := F'[\xi]$ have the same fixed points.

Proof
'$\Rightarrow$': Let $F(p) = p$. Since $F_1(p)$ changes the components $p_{1\cdot}$ in the same way as
$F$ and leaves all other components unchanged, we have $p^{(1)}_{1\cdot} = F(p)_{1\cdot} = p_{1\cdot}$ and
therefore $p^{(1)} = p$. Continuing analogously for all remaining components $i \in N$ we
get
\[
p = p^{(1)} = \dots = p^{(n)} = F'(p). \tag{A.2}
\]
'$\Leftarrow$': From $F'(p) = p$ it follows by the same argument as above that (A.2) holds.
However, since all these intermediate points are equal, it makes no difference whether we
change the components one after the other or all at once, and therefore $F(p) = p$.
The next theorem gives, similarly to Theorem 2.26 for $F[\xi]$, a sufficient
condition under which $F'[\xi]$ is a strict growth transformation:

Theorem A.3
Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a polynomial such that $Y = (Y_{ir})$ with $Y_{ir}(p) = \frac{\partial z}{\partial p_{ir}}(p)$ is a
fitness function on $\Delta^0$. Moreover, let $F' := F'[\xi]$ where $\xi := G(Y)$ as in (2.35). Then
$F'$ is a strict growth transformation for $z$.
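Proof
Let $i \in N$ be fixed and $p := p^{(i-1)}$. Subsequently we show that $z(F_i(p)) \ge z(p)$. From (2.8) it follows that
\[
z(F_i(p)) \ge z(p) \;\Longleftarrow\; \sum_{r=1}^{k} \frac{p_{ir}\,\xi_{ir}}{\sum_{s=1}^{k} p_{is}\,\xi_{is}}\, Y_{ir} \;\ge\; \sum_{r=1}^{k} p_{ir}\, Y_{ir}. \tag{A.3}
\]
Multiplication with the positive denominator yields
\[
\sum_{r=1}^{k} p_{ir} Y_{ir} \xi_{ir} \;\ge\; \Big( \sum_{r=1}^{k} p_{ir} Y_{ir} \Big) \Big( \sum_{s=1}^{k} p_{is} \xi_{is} \Big)
= \sum_{r=1}^{k} p_{ir}^2\, Y_{ir} \xi_{ir} + \sum_{r=1}^{k} \sum_{\substack{s=1 \\ s \ne r}}^{k} p_{ir} p_{is}\, Y_{ir} \xi_{is},
\]
or equivalently
\[
\sum_{r=1}^{k} p_{ir} Y_{ir} \xi_{ir} \underbrace{(1 - p_{ir})}_{=\sum_{s \ne r} p_{is}} \;\ge\; \sum_{r=1}^{k} \sum_{\substack{s=1 \\ s \ne r}}^{k} p_{ir} p_{is}\, Y_{ir} \xi_{is}
\]
\[
\Longleftrightarrow\quad \sum_{r=1}^{k} \sum_{\substack{s=1 \\ s \ne r}}^{k} p_{ir} p_{is}\, Y_{ir} (\xi_{ir} - \xi_{is}) \ge 0 \tag{A.4}
\]
\[
\Longleftrightarrow\quad \sum_{r=1}^{k} \sum_{s=r+1}^{k} p_{ir} p_{is}\, (Y_{ir} - Y_{is})(\xi_{ir} - \xi_{is}) \ge 0. \tag{A.5}
\]
Since $G_i$ is strictly increasing it follows from (2.36) that (A.5) is always true and
therefore $F'$ is a growth transformation.
Moreover, we see that equality in (A.4) holds if and only if $\xi_{ir} = \xi_{is}$ for all $r, s \in K$, $r \ne s$
with $p_{ir}, p_{is} \ne 0$. Since this corresponds exactly to the fixed point characterization of (2.63) it follows that $F'[\xi]$ is a strict growth transformation.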
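The step from (A.4) to (A.5) in the proof above can be checked by pairing each ordered pair $(r, s)$ with its mirror image $(s, r)$. The following short derivation is added for illustration; the final remark assumes that $\xi_{ir} = G_i(Y_{ir})$ is applied componentwise with $G_i$ strictly increasing, which is how (2.35) is read here.

```latex
\begin{align*}
\sum_{r=1}^{k} \sum_{\substack{s=1 \\ s \ne r}}^{k} p_{ir} p_{is}\, Y_{ir} (\xi_{ir} - \xi_{is})
  &= \sum_{r=1}^{k} \sum_{s=r+1}^{k} p_{ir} p_{is}
     \bigl[ Y_{ir} (\xi_{ir} - \xi_{is}) + Y_{is} (\xi_{is} - \xi_{ir}) \bigr] \\
  &= \sum_{r=1}^{k} \sum_{s=r+1}^{k} p_{ir} p_{is}\, (Y_{ir} - Y_{is})(\xi_{ir} - \xi_{is}).
\end{align*}
% If \xi_{ir} = G_i(Y_{ir}) with G_i strictly increasing, the two factors
% (Y_{ir} - Y_{is}) and (\xi_{ir} - \xi_{is}) always have the same sign,
% so every summand is nonnegative.
```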
Note that the assumptions on $G$ in this theorem are much weaker than those used
in Theorem 2.26 for operator $F$. Hence, it follows that if $\xi$ satisfies the assumptions
of Theorem 2.26 then not only $F[\xi]$, but also $F'[\xi]$ is a growth transformation and
therefore several results of Section 2.3 hold as well for $F'$.
Moreover, if $z$ is an A-polynomial, then it is linear along the line connecting two
intermediate points $p^{(i-1)}$ and $p^{(i)}$. Hence, Theorem A.3 implies that there exists a
path from $p$ to $F'(p)$ along which $z$ increases. This observation generalizes the result
of Theorem 2.24.
Appendix B
The Concept of KKT Points
For nonlinear programming problems, the KKT conditions are first order necessary
conditions for local optima (see e.g. [BSS93]). Since we have used these conditions
in Section 2.2.4 for the R-SAP, we describe here briefly how they are derived from
the general setting.
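Let $I$ and $J$ denote two index sets, let $X \subseteq \mathbb{R}^n$ be a nonempty, open set and let continuously differentiable functions $f$, $g_i$ for $i \in I$ and $h_j$ for $j \in J$ from $\mathbb{R}^n$ to $\mathbb{R}$ be given. The
problem is to find solutions of the following problem
\[
\begin{array}{lll}
\max & f(x) & \\
\text{s.t.} & g_i(x) \le 0 & \forall i \in I \\
 & h_j(x) = 0 & \forall j \in J \\
 & x \in X. &
\end{array}
\tag{B.1}
\]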
Theorem B.1 (Karush-Kuhn-Tucker Necessary Condition)
Let $x^*$ be a local solution of (B.1) and $L := \{i \in I \mid g_i(x^*) = 0\}$. Suppose that
$\nabla g_i(x^*)$ for all $i \in L$ and $\nabla h_j(x^*)$ for all $j \in J$ are linearly independent. Then
there exist unique Lagrangian multipliers $u_i$ for $i \in L$ and $v_j$ for $j \in J$, such that $x^*$
(which is called a KKT point) satisfies the following conditions:
\[
\begin{array}{rcll}
-\nabla f(x^*) + \sum_{i \in I} u_i \nabla g_i(x^*) + \sum_{j \in J} v_j \nabla h_j(x^*) & = & 0 & \\
u_i\, g_i(x^*) & = & 0 & \forall i \in I \\
u_i & \ge & 0 & \forall i \in I.
\end{array}
\tag{B.2}
\]
In case of the R-SAP the region of feasibility is given by $\Delta$ and therefore we set
$g_{ir} := -p_{ir}$ for all $(i, r) \in N \times K =: I$ and $h_j := \sum_{r=1}^{k} p_{jr} - 1$ for all $j \in N =: J$. We
see now that $\nabla g_{ir}(p)$ and $\nabla h_j(p)$ are linearly independent for all points $p \in \Delta$ and
therefore (B.2) are necessary conditions for a local maximum. By setting $f := z(p)$, we get for all $i \in N$, $r \in K$ the following KKT conditions for a local maximum $p^*$ of
the R-SAP:
\[
\begin{array}{rcl}
-Y_{ir}(p^*) - u_{ir} + v_i & = & 0 \\
-u_{ir}\, p^*_{ir} & = & 0 \\
u_{ir} & \ge & 0,
\end{array}
\]
which are further discussed in (2.22)-(2.24) on page 34.
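The three KKT conditions for the R-SAP can be turned into a simple numerical test. Eliminating $u_{ir} = v_i - Y_{ir}(p^*)$ shows that a point is a KKT point exactly when, in every row $i$, all $Y_{ir}$ with $p_{ir} > 0$ share a common value $v_i$ and $Y_{ir} \le v_i$ holds wherever $p_{ir} = 0$. The sketch below implements this reading; the function names, the tolerance and the toy objective $z(p) = p_{11}p_{21} + 2\,p_{12}p_{22}$ are assumptions made for illustration.

```python
def toy_gradient(p):
    """Gradient Y of the toy objective z(p) = p[0][0]*p[1][0] + 2*p[0][1]*p[1][1]."""
    return [[p[1][0], 2 * p[1][1]],
            [p[0][0], 2 * p[0][1]]]

def is_kkt_point(p, Y, tol=1e-9):
    """p, Y: n x k lists; Y[i][r] is the partial derivative of z w.r.t. p_ir at p.
    Each row of p is assumed to be nonnegative and to sum to one."""
    for p_row, y_row in zip(p, Y):
        support = [y for x, y in zip(p_row, y_row) if x > tol]
        v = support[0]                       # candidate multiplier v_i
        if any(abs(y - v) > tol for y in support):
            return False                     # u_ir = 0 forces Y_ir = v_i on the support
        if any(y - v > tol for x, y in zip(p_row, y_row) if x <= tol):
            return False                     # u_ir = v_i - Y_ir must be nonnegative
    return True
```

For the toy objective, the vertices `[[0, 1], [0, 1]]` and `[[1, 0], [1, 0]]` pass the test, whereas the mixed vertex `[[1, 0], [0, 1]]` does not.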
Appendix C
A Gradient Approach
In Section 2.3 we have studied properties of the discrete time dynamical system
defined by iteration of the mapping $F[Y]$. We show now that this system can also
be interpreted as a discretized version of a continuous gradient dynamical system in
$\Delta^0$, that is, of the form $\dot{p} = v$, where $v$ is the gradient vector field of $z$ with respect
to a well chosen inner product.
We recall now the necessary definitions and notations as needed in our context.
Let $M$ be a relatively open subset in an affine subspace of $\mathbb{R}^n$. The tangent space
of $M$ at a point $p$ (denoted by $T_pM$) can be viewed as the set of tangent vectors at
$p$ to the curves in $M$ passing through $p$. Moreover, by an inner product $\langle \cdot, \cdot \rangle$ on $M$
we mean a family $\{\langle \cdot, \cdot \rangle_p \mid p \in M\}$ of inner products with $\langle \cdot, \cdot \rangle_p$ defined on $T_pM$ such that $\langle \cdot, \cdot \rangle_p$ depends smoothly on $p$.
Let $z : M \to \mathbb{R}$ be a smooth function defined on $M$. The differential $Dz$ associates
to every point $p \in M$ the linear map
\[
Dz(p) : T_pM \to \mathbb{R}, \quad \xi \mapsto Dz(p) \cdot \xi,
\]
where $Dz(p)$ is the best linear approximation to $z - z(p)$ in the neighborhood of $p$.
A vector field $v$ on $M$ is defined by assigning a vector $v(p) \in \mathbb{R}^n$ to every point
$p \in M$.
Definition C.1 (Gradient Vector Field)
Given a smooth function $z : M \to \mathbb{R}$. The gradient vector field $\operatorname{grad} z$ of $z$ with
respect to the inner product $\langle \cdot, \cdot \rangle$ is the unique vector field satisfying the following two properties:
(i) $\operatorname{grad} z(p) \in T_pM$, $\forall p \in M$
(ii) $Dz(p) \cdot \xi = \langle \operatorname{grad} z(p), \xi \rangle_p$, $\forall \xi \in T_pM$.
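Note that if $M = \mathbb{R}^n$ (which implies that $T_pM = \mathbb{R}^n$) is endowed with the Euclidean
inner product, then the associated gradient field is the column vector
\[
\operatorname{grad} z(p) = \nabla z(p) = \Big( \frac{\partial z}{\partial p_1}(p), \dots, \frac{\partial z}{\partial p_n}(p) \Big)^{T}.
\]
In the sequel we discuss the situation for SAPs and therefore consider $M = \Delta^0$.
The tangent space of $\Delta^0$ in a point $p$, denoted by $T_p\Delta^0$, is given by
\[
T_p\Delta^0 := \Big\{ \xi \in \mathbb{R}^{n \times k} \;\Big|\; \sum_{r=1}^{k} \xi_{ir} = 0,\ \forall i \in N \Big\}. \tag{C.1}
\]
Let $z : \Delta^0 \to \mathbb{R}$ be the objective function of a SAP $(n, k, \mathcal{T}, w)$, i.e.
\[
z(p) = \sum_{T \in \mathcal{T}} w_T \prod_{(i,r) \in T} p_{ir},
\]
and let $F := F[Y]$ be the operator defined by (1.20) and (1.21). Consequently we have
for $p^{t+1} := F(p^t)$
\[
p^{t+1}_{ir} = F(p^t)_{ir} = \frac{p^t_{ir}\, Y_{ir}(p^t)}{\sum_{s=1}^{k} p^t_{is}\, Y_{is}(p^t)} \qquad \forall i \in N,\ r \in K \tag{C.2}
\]
and we get for the difference of two successive points
\[
p^{t+1}_{ir} - p^t_{ir} = \frac{p^t_{ir}}{\sum_{s=1}^{k} p^t_{is}\, Y_{is}(p^t)} \Big( Y_{ir}(p^t) - \sum_{s=1}^{k} p^t_{is}\, Y_{is}(p^t) \Big) \qquad \forall i \in N,\ r \in K. \tag{C.3}
\]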
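Since $Y$ is a fitness function, $\sum_{s=1}^{k} p_{is} Y_{is} > 0$ for all $i \in N$, $p \in \Delta^0$. Hence we can
define the following inner product $\langle \cdot, \cdot \rangle$ on $\Delta^0$ by
\[
\langle \xi, \eta \rangle_p := \xi^{T} Q(p)\, \eta, \tag{C.4}
\]
where $Q(p) = \operatorname{diag}(q)$ is the $nk \times nk$ diagonal matrix defined on $\Delta^0$ with $q \in (\mathbb{R}^{nk})_+$
given by
\[
q_{ir} := \frac{\sum_{s=1}^{k} p_{is} Y_{is}}{p_{ir}} \qquad \forall i \in N,\ r \in K.
\]
Since the numerator is positive and $p \in \Delta^0$, it follows immediately that $Q(p)$ is
positive definite. (C.4) can be written as
\[
\langle \xi, \eta \rangle_p = \sum_{i=1}^{n} \sum_{r=1}^{k} \frac{\sum_{s=1}^{k} p_{is} Y_{is}}{p_{ir}}\, \xi_{ir}\, \eta_{ir}. \tag{C.5}
\]
Note that the inner product defined above does not only depend on the point $p$, but
also on the (form of the) objective function $z$ defining $Y_{ir}$.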
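Theorem C.2
The vector field $v(p) = F(p) - p$ given by
$$v(p)_{ir} = F(p)_{ir} - p_{ir} = \frac{p_{ir}}{\sum_{s=1}^{k} p_{is}\Upsilon_{is}} \Big( \Upsilon_{ir} - \sum_{s=1}^{k} p_{is}\Upsilon_{is} \Big), \quad i \in N,\ r \in K \tag{C.6}$$
is the gradient vector field of $z$ with respect to the inner product (C.5) on $\Delta^0$.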
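Proof
We verify directly that $v(p)$ satisfies Definition C.1 (i) and (ii).
(i) $v(p) \in T_p\Delta^0$ for all $p \in \Delta^0$ follows from
$$\sum_{r=1}^{k} v(p)_{ir} = \sum_{r=1}^{k} \frac{p_{ir}\Upsilon_{ir}}{\sum_{s=1}^{k} p_{is}\Upsilon_{is}} - \sum_{r=1}^{k} p_{ir} = 1 - 1 = 0, \quad \forall i \in N,$$
and the fact that $T_p\Delta^0 = \{\xi \mid \sum_{r=1}^{k} \xi_{ir} = 0,\ \forall i \in N\}$ as derived in (C.1).
(ii) Finally we show that $Dz(p) \cdot \xi = \langle v(p), \xi \rangle_p$:
$$\begin{aligned}
\langle v(p), \xi \rangle_p &= \sum_{i=1}^{n} \sum_{r=1}^{k} \frac{\sum_{s=1}^{k} p_{is}\Upsilon_{is}}{p_{ir}} \cdot \frac{p_{ir}}{\sum_{s=1}^{k} p_{is}\Upsilon_{is}} \Big( \Upsilon_{ir} - \sum_{s=1}^{k} p_{is}\Upsilon_{is} \Big)\, \xi_{ir} \\
&= \sum_{i=1}^{n} \sum_{r=1}^{k} \Upsilon_{ir}\,\xi_{ir} - \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{is}\Upsilon_{is}\,\xi_{ir} \\
&= \sum_{i=1}^{n} \sum_{r=1}^{k} \Upsilon_{ir}\,\xi_{ir} = \sum_{i=1}^{n} \sum_{r=1}^{k} \frac{\partial z}{\partial p_{ir}}\,\xi_{ir} = Dz(p) \cdot \xi.
\end{aligned}$$
The first equality of the last row holds because $\xi \in T_p\Delta^0$ and therefore, by (C.1), the second term is zero.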
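The two properties verified in the proof can also be checked numerically. The following is a minimal sketch, assuming a small hypothetical SAP instance with n = 2 variables and k = 2 values (the partial assignments and weights in `TERMS` are invented for illustration and use 0-based indices). It evaluates the fitness as the partial derivatives of z, builds v(p) as in (C.6) and the inner product as in (C.5), and compares both sides of property (ii) for a random interior point and a random tangent vector.

```python
import random
from math import prod

# Hypothetical instance: each entry is (partial assignment T, weight w_T).
TERMS = [
    (((0, 0),), 1.0),
    (((0, 1), (1, 0)), 2.0),
    (((0, 0), (1, 1)), 0.5),
    (((1, 1),), 1.5),
]
N, K = 2, 2

def fitness(p):
    """Partial derivatives dz/dp_ir: w_T times the product of the other factors of T."""
    ups = [[0.0] * K for _ in range(N)]
    for T, w in TERMS:
        for (i, r) in T:
            ups[i][r] += w * prod(p[j][s] for (j, s) in T if (j, s) != (i, r))
    return ups

def step_field(p):
    """v(p) = F(p) - p, written as in (C.6)."""
    ups = fitness(p)
    v = []
    for i in range(N):
        total = sum(p[i][s] * ups[i][s] for s in range(K))
        v.append([p[i][r] * (ups[i][r] - total) / total for r in range(K)])
    return v

def inner(p, xi, eta):
    """Inner product (C.5) with diagonal weights q_ir = (sum_s p_is * fitness_is) / p_ir."""
    ups = fitness(p)
    result = 0.0
    for i in range(N):
        total = sum(p[i][s] * ups[i][s] for s in range(K))
        result += sum(total / p[i][r] * xi[i][r] * eta[i][r] for r in range(K))
    return result

def random_interior_point(rng):
    """Each row is strictly positive and sums to one."""
    rows = []
    for _ in range(N):
        raw = [rng.uniform(0.1, 1.0) for _ in range(K)]
        rows.append([x / sum(raw) for x in raw])
    return rows

def random_tangent(rng):
    """Each row sums to zero, as required by (C.1)."""
    rows = []
    for _ in range(N):
        raw = [rng.uniform(-1.0, 1.0) for _ in range(K)]
        mean = sum(raw) / K
        rows.append([x - mean for x in raw])
    return rows

rng = random.Random(0)
p = random_interior_point(rng)
xi = random_tangent(rng)
ups = fitness(p)
lhs = sum(ups[i][r] * xi[i][r] for i in range(N) for r in range(K))  # Dz(p) applied to xi
rhs = inner(p, xi, step_field(p))                                   # <v(p), xi>_p
row_sums = [sum(row) for row in step_field(p)]                       # property (i)
```

Up to rounding, `lhs` and `rhs` agree and every entry of `row_sums` vanishes, which mirrors properties (ii) and (i) for this particular instance only.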
Appendix D
Definitions
D.1 General Definitions
Definition D.1 (Homogeneous Function (Def. 2.25))
A function $z(p)$ is called homogeneous of degree $d$ if $z(\alpha p) = \alpha^d z(p)$ for all $\alpha$.
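Definition D.2 (Discrete Neighborhood (Def. 2.13))
Let an integer vertex $p \in \Delta^I$ be given. Then the discrete neighborhood of $p$ is
defined by
$$\mathcal{M}(p) := \{q \in \Delta^I \mid \exists i \in N : q_{i\cdot} \neq p_{i\cdot},\ q_{j\cdot} = p_{j\cdot},\ \forall j \in N \setminus \{i\}\}.$$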
Definition D.3 (Discrete Local Maximum (Def. 2.14))
A point $p^* \in \Delta^I$ is a discrete (strict) local maximum of the SAP if for all $q \in \mathcal{M}(p^*)$: $z(q) \le z(p^*)$ ($z(q) < z(p^*)$).
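The discrete neighborhood and the local maximum test can be sketched in a few lines. This is an illustration under assumptions, not code from the thesis: an integer vertex is represented by the tuple of chosen values (one per variable, 0-based), and the weighted partial assignments in `TERMS` form a hypothetical toy instance.

```python
# Hypothetical toy instance: (partial assignment T, weight w_T), 0-based indices.
TERMS = [
    (((0, 0),), 1.0),
    (((0, 1), (1, 0)), 2.0),
    (((0, 0), (1, 1)), 0.5),
    (((1, 1),), 1.5),
]

def neighbors(assignment, k):
    """Yield every assignment that differs from `assignment` in exactly one variable."""
    for i, r in enumerate(assignment):
        for s in range(k):
            if s != r:
                yield assignment[:i] + (s,) + assignment[i + 1:]

def objective(assignment, terms):
    """Sum the weights of all partial assignments contained in the assignment."""
    chosen = set(enumerate(assignment))
    return sum(w for T, w in terms if set(T) <= chosen)

def is_discrete_local_max(assignment, k, terms, strict=False):
    """Compare the objective value with that of every discrete neighbor."""
    z_star = objective(assignment, terms)
    if strict:
        return all(objective(q, terms) < z_star for q in neighbors(assignment, k))
    return all(objective(q, terms) <= z_star for q in neighbors(assignment, k))
```

In this toy instance both `(0, 1)` and `(1, 0)` are strict discrete local maxima, with objective values 3.0 and 2.0, so a local maximum need not be a global one.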
Definition D.4 (Continuous Local Maximum (Def. 2.12))
A point $p^* \in \Delta$ is a (strict) local maximum of the R-SAP if there exists a neighborhood $U(p^*)$ such that for all $q \in U(p^*) \cap \Delta$: $z(q) \le z(p^*)$ ($z(q) < z(p^*)$).
Definition D.5 (Stationary Point, Saddle Point (Def. 2.19))
A point $p^* \in \Delta$ is called a stationary point of $z$ if the gradient of $z$ projected on the
affine subspace $\operatorname{aff}(\Delta)$ is zero in $p^*$, i.e. $\Upsilon_{ir}(p^*) = v_i$ for all $i \in N$, $r \in K$ and some
constants $v_i \in \mathbb{R}$.
Moreover, we call $p^* \in \Delta$ a saddle point of $z$ on $\operatorname{aff}(\Delta)$ if $p^*$ is a stationary point and
in any neighborhood $U_\varepsilon(p^*) \cap \operatorname{aff}(\Delta)$ there exist points $p, q$ with $z(p) < z(p^*) < z(q)$.
Definition D.6 (Nash Equilibrium (Def. 2.20))
A point $p^* \in \Delta$ is a Nash equilibrium of the R-SAP if
$$\forall i \in N,\ \forall q \in \{p \in \Delta \mid p_{j\cdot} = p^*_{j\cdot},\ \forall j \in N \setminus \{i\}\} : \quad z(q) \le z(p^*).$$
Definition D.7 (Growth Transformation (Def. 2.1))
Let $z$ be a continuous function on $\Delta^0 \subseteq \mathbb{R}^{nk}$. We say that a continuous mapping $F : \Delta^0 \to \Delta^0$ is a growth transformation for $z$ iff
$$z(p) \le z(F(p)) \quad \forall p \in \Delta^0.$$
If additionally
$$z(p) = z(F(p)) \;\Rightarrow\; p = F(p),$$
$F$ is called a strict growth transformation.
Definition D.8 ((Unstable) Fixed Point (Def. 2.27))
Let $F : \Delta \to \Delta$ be given; then $p^* \in \Delta$ is called a fixed point if $F(p^*) = p^*$.
Moreover, a fixed point $p^* \in \Delta$ is called unstable if there exists a neighborhood
$U(p^*)$ such that for every neighborhood $U_1(p^*) \subseteq U(p^*)$ there exists at least one
starting point $p \in U_1(p^*) \cap \Delta$ such that the sequence $\{F^t(p)\}$, $t \ge 0$, does not lie
entirely in $U(p^*)$.
Definition D.9 (Attractor (Def. 2.43))
Let $F : \Delta \to \Delta$ be an operator. A point $p^* \in \Delta$ is called an attractor of $F$ if there
exists a neighborhood $U(p^*)$ such that for any starting point $p^0 \in U(p^*) \cap \Delta$ the
sequence $\{F^t(p^0)\}$, $t \ge 0$, converges to $p^*$. In this case
$$RA(p^*) := \{p \in \Delta \mid \lim_{t \to \infty} F^t(p) = p^*\}$$
is called the region of attraction of $p^*$ with respect to $F$.
Definition D.10 (Geometric Convergence (Def. 2.36))
Let a sequence $\{p^t\}$, $t \ge 0$, with $\lim_{t \to \infty} p^t = p^*$ be given. Then the sequence $\{p^t\}$ is
geometrically convergent if there exist $c > 0$, $0 < \rho < 1$ and an index $t_0$ such that
$\|p^t - p^*\| \le c\rho^t$ for all $t \ge t_0$.
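As a small illustration of geometric convergence, consider a single simplex with k = 2 and a constant fitness vector (1, 3); both choices are assumptions made only for this sketch. The normalised update then gives a first component of 1/(1 + 3^t) after t steps when started at (1/2, 1/2), so the Euclidean distance to the vertex (0, 1) is sqrt(2)/(1 + 3^t), which is bounded by c·rho^t with c = sqrt(2) and rho = 1/3.

```python
from math import sqrt

def step(p, xi):
    """One normalised multiplicative update with a constant fitness vector xi."""
    norm = sum(a * b for a, b in zip(p, xi))
    return [a * b / norm for a, b in zip(p, xi)]

p = [0.5, 0.5]
xi = [1.0, 3.0]        # hypothetical constant fitness, chosen for illustration
target = [0.0, 1.0]    # limit point of the iteration

distances = []
for t in range(15):
    distances.append(sqrt(sum((a - b) ** 2 for a, b in zip(p, target))))
    p = step(p, xi)

c, rho = sqrt(2.0), 1.0 / 3.0
bound_holds = all(d <= c * rho ** t for t, d in enumerate(distances))
```

Here the bound holds from t_0 = 0 onward; in general the definition only asks for it from some index t_0.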
D.2 Specific Notations
Definition D.11 (Assignments (Def. 1.1))
Let $N$ be the index set of the decision variables and let $K$ be the set of possible values for each variable.
1. An assignment $A$ is a set of the form $A := \{(i, r(i)) \mid i = 1, \ldots, n,\ r(i) \in K\} \subseteq N \times K$. Moreover, we denote by $\mathcal{A}$ the set of all possible assignments.
2. A partial assignment $T$ is a set of the form $T := \{(i, r(i)) \mid i \in M,\ r(i) \in K\}$ for some $M \subseteq N$. If moreover a positive weight $w_T \in \mathbb{R}_+$ is given, we call $(T, w_T)$ a weighted partial assignment.
3. A partial assignment $T$ is satisfied by a given assignment $A \in \mathcal{A}$ if $T \subseteq A$.
Definition D.12 (C-SAP, SAP (Def. 1.2))
The constrained semi-assignment problem C-SAP for the decision variables $x_i$, $i \in N$, with possible values in $K$ is defined by a 5-tuple $(n, k, \mathcal{R}, \mathcal{T}, w)$ where
1. $\mathcal{R}$ is a set of (forbidden) partial assignments with $|R| \ge 2$ for all $R \in \mathcal{R}$ (defining the constraints)
2. $(\mathcal{T}, w)$ is a set of weighted partial assignments (defining the objective function)
and consists in
$$\begin{aligned}
\max\ & z(A) = \sum_{T \in \mathcal{T}} \{w_T \mid T \subseteq A\} \\
\text{s.t.}\ & A \in \mathcal{A} \\
& R \not\subseteq A \quad \forall R \in \mathcal{R}.
\end{aligned} \tag{D.1}$$
An assignment $A \in \mathcal{A}$ which satisfies (D.1) is called a feasible assignment.
The semi-assignment problem SAP is the special case of the C-SAP in which $\mathcal{R} = \emptyset$; it is given by the 4-tuple $(n, k, \mathcal{T}, w)$.
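To make the definition concrete, the sketch below evaluates the objective and the feasibility condition for a hypothetical toy instance and finds the best feasible assignment by plain enumeration. The instance data and the function names are assumptions for illustration; enumeration is only workable for tiny n and k and is not the heuristic developed in the thesis.

```python
from itertools import product

# Hypothetical toy instance (0-based indices): n = 2 variables, k = 2 values.
WEIGHTED = [
    (((0, 0),), 1.0),
    (((0, 1), (1, 0)), 2.0),
    (((0, 0), (1, 1)), 0.5),
    (((1, 1),), 1.5),
]
FORBIDDEN = [((0, 0), (1, 1))]

def satisfies(assignment, partial):
    """A partial assignment is satisfied iff it is a subset of the assignment."""
    return set(partial) <= assignment

def objective(assignment, weighted):
    """Sum of the weights of all satisfied weighted partial assignments."""
    return sum(w for T, w in weighted if satisfies(assignment, T))

def is_feasible(assignment, forbidden):
    """Feasible iff no forbidden partial assignment is contained in the assignment."""
    return not any(satisfies(assignment, R) for R in forbidden)

def best_by_enumeration(n, k, forbidden, weighted):
    """Try all k**n assignments and keep the best feasible one."""
    best_values, best_z = None, float("-inf")
    for values in product(range(k), repeat=n):
        assignment = set(enumerate(values))
        if not is_feasible(assignment, forbidden):
            continue
        z = objective(assignment, weighted)
        if z > best_z:
            best_values, best_z = values, z
    return best_values, best_z
```

With the forbidden partial assignment the best feasible choice is `(1, 0)` with value 2.0; dropping the constraint (the SAP case) gives `(0, 1)` with value 3.0.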
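Definition D.13 (Feasible Points: $\Delta^I$, $\Delta$, $\Delta^0$, $\Delta_k$ (Def. 1.5))
The set of all assignments is defined by
$$\Delta^I := \Big\{p \in \{0,1\}^{n \times k} \;\Big|\; \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N\Big\}.$$
Moreover, we denote its convex hull by
$$\Delta := \operatorname{Conv}(\Delta^I) = \Big\{p \in [0,1]^{n \times k} \;\Big|\; \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N\Big\}$$
and its relative interior by $\Delta^0$. One single simplex will be denoted by
$$\Delta_k := \Big\{u \in [0,1]^{k} \;\Big|\; \sum_{r=1}^{k} u_r = 1\Big\}.$$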
Definition D.14 (Fitness Function (Def. 1.6))
A vector $\xi = (\xi_{ir})$ ($i \in N$, $r \in K$), where $\xi_{ir} : \Delta^0 \to \mathbb{R}_+$ are continuous functions, is
called a fitness function on $\Delta^0$.
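Definition D.15 (Operator $F[\xi](p)$ (Def. 1.7))
Let $\xi$ be a fitness function on $\Delta^0$. Then the operator $F[\xi] : \Delta^0 \to \Delta^0$ is for all
$i \in N$, $r \in K$ defined by
$$F[\xi](p)_{ir} := \frac{p_{ir}\,\xi_{ir}(p)}{\Xi_i(p)} \quad \text{with} \quad \Xi_i(p) := \sum_{s=1}^{k} p_{is}\,\xi_{is}(p).$$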
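The operator is a row-wise multiplicative update followed by a normalisation. A minimal sketch, assuming the point is stored as a list of rows (one row per variable) and the fitness is passed as a function returning a list of the same shape; the name `apply_operator` and the constant fitness in the usage example are hypothetical:

```python
def apply_operator(p, fitness):
    """Multiply each entry by its fitness, then rescale every row to sum to one."""
    xi = fitness(p)
    new_p = []
    for p_i, xi_i in zip(p, xi):
        norm = sum(a * b for a, b in zip(p_i, xi_i))  # row-wise normalising sum
        new_p.append([a * b / norm for a, b in zip(p_i, xi_i)])
    return new_p

def constant_fitness(p):
    """Hypothetical fitness that ignores p; used only to exercise the operator."""
    return [[1.0, 3.0] for _ in p]

p0 = [[0.5, 0.5]]
p1 = apply_operator(p0, constant_fitness)
p2 = apply_operator(p1, constant_fitness)
```

Starting from the barycentre, the first step gives the row (0.25, 0.75) and the second (0.1, 0.9): the entry with the larger fitness gains weight while each row stays on its simplex.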
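Definition D.16 (Gradient/Repellor Dynamics (Def. 1.8))
Let $z$ be the objective function of a SAP instance $(n, k, \mathcal{T}, w)$ such that $\Upsilon = (\Upsilon_{ir})$ defined by
$$\Upsilon_{ir}(z, p) := \frac{\partial z}{\partial p_{ir}}(p) = \sum_{T \in \mathcal{T} :\, (i,r) \in T} w_T \prod_{(j,s) \in T \setminus (i,r)} p_{js}, \quad \forall i \in N,\ r \in K$$
is a fitness function. Then we call the dynamic defined by $F[\Upsilon^\alpha]$, $\alpha > 0$, the gradient-type dynamic.
Moreover, if a set of forbidden partial assignments $\mathcal{R}$ is given, we define the fitness
function $\Theta = (\Theta_{ir})$ by
$$\Theta_{ir}(p) := \prod_{R \in \mathcal{R} :\, (i,r) \in R} \Big( 1 - \prod_{(j,s) \in R \setminus (i,r)} p_{js} \Big), \quad \forall i \in N,\ r \in K$$
and call the dynamic corresponding to $F[\Theta^\beta]$, $\beta > 0$, the repellor dynamic.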
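The two fitness functions can be sketched as follows, assuming the reading of the definition in which the gradient fitness is the partial derivative of z and the repellor fitness is a product of terms of the form one minus the product of the remaining entries of a forbidden partial assignment. The instance data, the function names, and the use of a plain power for the exponent are assumptions made for illustration.

```python
from math import prod

# Hypothetical toy instance (0-based indices): n = 2 variables, k = 2 values.
TERMS = [
    (((0, 0),), 1.0),
    (((0, 1), (1, 0)), 2.0),
    (((0, 0), (1, 1)), 0.5),
    (((1, 1),), 1.5),
]
FORBIDDEN = [((0, 0), (1, 1))]
N, K = 2, 2

def z(p):
    """Objective: weighted sum of products over the partial assignments."""
    return sum(w * prod(p[i][r] for (i, r) in T) for T, w in TERMS)

def gradient_fitness(p):
    """dz/dp_ir: for each T containing (i, r), w_T times the product of its other entries."""
    ups = [[0.0] * K for _ in range(N)]
    for T, w in TERMS:
        for (i, r) in T:
            ups[i][r] += w * prod(p[j][s] for (j, s) in T if (j, s) != (i, r))
    return ups

def repellor_fitness(p):
    """For each forbidden R containing (i, r): factor 1 - product of the other entries of R."""
    th = [[1.0] * K for _ in range(N)]
    for R in FORBIDDEN:
        for (i, r) in R:
            th[i][r] *= 1.0 - prod(p[j][s] for (j, s) in R if (j, s) != (i, r))
    return th

def step(p, fitness, exponent=1.0):
    """One application of the normalised update with the fitness raised to `exponent`."""
    xi = [[x ** exponent for x in row] for row in fitness(p)]
    new_p = []
    for p_i, xi_i in zip(p, xi):
        norm = sum(a * b for a, b in zip(p_i, xi_i))
        new_p.append([a * b / norm for a, b in zip(p_i, xi_i)])
    return new_p

start = [[0.5, 0.5], [0.25, 0.75]]
values = [z(start)]
p = start
for _ in range(20):
    p = step(p, gradient_fitness)
    values.append(z(p))
repelled = step(start, repellor_fitness)
```

For this instance the recorded objective values do not decrease along the gradient-type iteration, and one repellor step lowers the weight of the entry (0, 0), whose partner in the forbidden partial assignment currently carries high weight.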
Definition D.17 (A-Polynomial, $P_N$, $P_N^F$ (Def. 2.4))
• Let a 4-tuple $(n, k, \mathcal{T}, w)$ as for a SAP instance, together with a constant $c \in \mathbb{R}$, be given. Then we call $z : \mathbb{R}^{nk} \to \mathbb{R}$
$$z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) + c \tag{D.2}$$
an assignment-polynomial (short: A-polynomial), where $p_{ir}$ are variables with $i \in N$, $r \in K$. The degree $m$ of the A-polynomial $z(p)$ is defined by $m := \max_{T \in \mathcal{T}} |T|$.
• Let $U \subseteq N$; then we denote by $P_U$ the set of all A-polynomials (D.2) which contain variables $p_{ir}$ with $i \in U$.
• Let $z \in P_N$. If $k = 2$ then we call $z$ a Boolean A-polynomial; if $m = 2$ then $z$ is referred to as a quadratic A-polynomial.
• If moreover an operator $F := F[\xi]$ is given, then we denote the set of all A-polynomials in $P_N$ for which $F$ is well defined by $P_N^F$.
Definition D.18 (Equivalent A-polynomials (Def. 2.5))
Let $z_1, z_2 : \mathbb{R}^{nk} \to \mathbb{R}$ be two A-polynomials. Then
$$z_1 \equiv z_2 \;:\Leftrightarrow\; z_1(p) = z_2(p) \quad \forall p \in \Delta$$
and we call $z_1$ a form of $z_2$.
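A simple pair of equivalent forms arises from multiplying a term by a row sum, which equals one on the feasible set. In the hypothetical example below (0-based indices, n = 2, k = 2), the first polynomial is the single variable for the pair (0, 0), and the second multiplies it by the sum of the two variables of row 1. The sketch evaluates both on a point whose rows sum to one and on a point where they do not.

```python
from math import prod

def evaluate(terms, p, const=0.0):
    """Evaluate an A-polynomial given as a list of (partial assignment, weight) pairs."""
    return sum(w * prod(p[i][r] for (i, r) in T) for T, w in terms) + const

# Two hypothetical forms: z1 = p_00 and z2 = p_00 * p_10 + p_00 * p_11.
Z1 = [(((0, 0),), 1.0)]
Z2 = [(((0, 0), (1, 0)), 1.0), (((0, 0), (1, 1)), 1.0)]

on_simplex = [[0.3, 0.7], [0.6, 0.4]]    # every row sums to one
off_simplex = [[0.3, 0.7], [0.6, 0.6]]   # second row sums to 1.2

gap_on = abs(evaluate(Z1, on_simplex) - evaluate(Z2, on_simplex))
gap_off = abs(evaluate(Z1, off_simplex) - evaluate(Z2, off_simplex))
```

The two forms agree wherever the rows sum to one and differ elsewhere. Their partial derivatives with respect to the variables of row 1 also differ, which is one way to see why quantities built from the derivatives can depend on the chosen form.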
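Definition D.19 (Form-Independence (Def. 2.47))
Let $F := F[\xi]$ and $z \in P_N^F$ be fixed. Moreover, let for all $\tilde z \in P_N^F$ with $\tilde z \equiv z$ a
set $\mathcal{M}(\tilde z, p)$ depending on $\tilde z$ and $p \in \Delta$ be given. If for all $p \in \Delta$ and $\tilde z \in P_N^F$ with
$\tilde z \equiv z$: $\mathcal{M}(\tilde z, p) = \mathcal{M}(z, p)$, then we define $\mathcal{M}(p) := \mathcal{M}(z, p)$ and call the set $\mathcal{M}(p)$ form-independent.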
Definition D.20 (Universal Region of Attraction: $URA(p^*)$ (Def. 2.48))
Let $F := F[\xi]$, $z \in P_N^F$ and $p^*$ be an attractor of $F$. Then the universal region of
attraction URA of $p^*$ is defined by
$$URA(p^*) := \bigcap_{\tilde z \in P_N^F,\ \tilde z \equiv z} RA(\tilde z, p^*),$$
where $RA(\tilde z, p^*)$ is the region of attraction of $p^*$ for $\tilde z$.
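Definition D.21 (Guaranteed Region of Attraction: $GRA_1(p^*)$ (Def. 2.50))
Let $F := F[\Upsilon]$, $z \in P_N^F$ and $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for $i \in N$ be an attractor of $F$.
Moreover, let
$$\Delta_{p^*} := \{p \in \Delta \mid p_{i1} > 0,\ \forall i \in N\},$$
$$\mathcal{M}^F_1(p^*) := \{p \in \Delta_{p^*} \mid \Upsilon_{i1}(z,p) > \Upsilon_{ir}(z,p),\ \forall i \in N,\ r > 1\},$$
$$Q_1(p, p^*) := \{q \in \Delta \mid q_{i1} > p_{i1},\ \forall i \in N\}.$$
Then the guaranteed region of attraction $GRA_1(p^*)$ is defined by
$$GRA_1(p^*) := \{p \in \mathcal{M}^F_1(p^*) \mid Q_1(p, p^*) \subseteq \mathcal{M}^F_1(p^*)\}.$$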
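Definition D.22 ($\mathcal{M}^F_{2,c}(p^*)$, $GRA_{2,c}(p^*)$ (Def. 2.53))
Let $F := F[\Upsilon]$, $z \in P_N^F$ and $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for all $i \in N$ be an attractor of $F$.
Moreover, let $c_i^* := \max_{r>1}\{\,\cdot\,\}$ for all $i \in N$ [the expression inside the maximum is illegible in the source]. Then we define for
any $c \in \mathbb{R}^n$ with $c_i^* < c_i < 1$ and
$$p_{i1} > c_i \quad \forall i \in N \tag{D.3}$$
$$c_i \Upsilon_{i1} + (1 - c_i)\Upsilon_{is} > \Upsilon_{ir} \quad \forall i \in N,\ r, s \in K \setminus \{1\},\ r \neq s \tag{D.4}$$
$$\Upsilon_{i1} > \Upsilon_{ir} \quad \forall i \in N,\ r > 1 \tag{D.5}$$
the polytope $\mathcal{M}^F_{2,c}(p^*)$ by
$$\mathcal{M}^F_{2,c}(p^*) := \{p \in \Delta \mid \text{(D.3), (D.4), (D.5)}\}.$$
If $Q_2(p, p^*)$ is given by
$$Q_2(p, p^*) := \{q \in \Delta \mid q_{i1} > p_{i1},\ q_{ir} < p_{ir}\ \ \forall i \in N,\ r > 1\}$$
then a subset of the universal region of attraction corresponding to $\mathcal{M}^F_{2,c}(p^*)$ is defined
by
$$GRA_{2,c}(p^*) := \{p \in \mathcal{M}^F_{2,c}(p^*) \mid Q_2(p, p^*) \subseteq \mathcal{M}^F_{2,c}(p^*)\}.$$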
Bibliography
[ABR92] Sheldon Axler, Paul Bourdon, and Wade Ramey. Harmonic Function
Theory. Graduate Texts in Mathematics; 137. Springer-Verlag, 1992.
[AK89] E.H.L. Aarts and J.H.M. Korst. Simulated Annealing and Boltzmann
machines. Wiley, Chichester, 1989.
[Aki79] Ethan Akin. The Geometry of Population Genetics. Lecture Notes in
Biomathematics 31. Springer-Verlag, 1979.
[AvKS98] Pankaj K. Agarwal, Marc van Kreveld, and Subhash Suri. Label place¬ment by maximum independent set in rectangles. Computational Ge¬
ometry: Theory and Applications, 11:209-218, 1998.
[BBPP99] I.M. Bomze, M. Budinich, P.M. Pardalos, and M. Pelillo. The maxi¬
mum clique problem. In D.-Z. Du and P.M. Pardalos, editors, Handbook
of Combinatorial Optimization, Suppl. Vol. A, pages 1-74. Dordrecht:
Kluwer Academic Publ., 1999.
[BCG98] Michael Burkard, Maurice Cochand, and Ariette Gaillard. A dynam¬ical system based heuristic for a class of constrained semi-assignment
problems. In P. Kail and H.-J. Liithi, editors, Operations Research Pro¬
ceedings 1998, pages 182-191. Springer-Verlag, 1998.
[BCS92] A. Billionnet, M.C. Costa, and A. Sutter. An efficient algorithm for
a task allocation problem. Journal of the Association for Computing
Machinery, 39(3):502-518, 7 1992.
[BE67] Leonard E. Baum and J.A. Eagon. An inequality with applications to
statistical estimation for probabilistic functions of markov processes and
to a model for ecology. Bulletin of the American Mathematical Society,
73:360-363, 1967.
[BM85] A. Billionnet and M. Minoux. Maximizing a supermodular pseudo-boolean function: A polynomial algorithm for supermodular cubic func¬
tions. Discrete Applied Mathematics, 12:1-11, 1985.
[Bok81] S.H. Bokhari. A shortest tree algorithm for optimal assignments across
space and time in a distributed processor system. fEEE Transactions
on Software Engineering, 7:583-589, 1981.
156 Bibliography
[Bok87] S.H. Bokhari. Assignment problems in parallel and distributed comput¬
ing. Kluwer Academic Publishers, 1987.
[Bom96] Immanuel M. Bomze. Regularization in evolutionary optimization pro¬
cesses of standard QPs. Department of Statistics, Operations Research
and Computer Science, University of Vienna, Vienna, Austria, 1996.
[Bom97] Immanuel M. Bomze. Evolution towards the maximum clique. Journal
of Global Optimization, 10(2):143-164, 1997.
[BPG97] Immanuel Bomze, Marcello Pelillo, and Robert Giacomini. Evolutionary
approach to the maximum clique problem: Empirical evidence on a lagerscale. Nonconvex Optimization and fts Applications, 18:95-108, 1997.
[BR84] R.E. Burkard and F. Rendl. A thermodynamically motivated simulation
procedure for combinatorial optimization problems. European Journal
of Operational Research, 17:169-174, 1984.
[BS68] Leonard E. Baum and George R. Sell. Growth transformations for func¬
tions on manifolds. Pacific Journal of Mathematics, 27(2), 1968.
[BSS93] Mokhtar S. Bazaraa, Hanif D. Sherali, and C. M. Shetty. Nonlinear
Programming: Theory and Algorithms. John Wiley & Sons, 1993.
[Bur94] Michael Burkard. An interior point algorithm for solving max-cut prob¬lems. Diploma thesis, Technical University of Graz, April 1994.
[BW91] Roger W. Brockett and Wing Shing Wong. A gradient flow for the
assignment problem. In G. Conte and B. Wyman, editors, Progress in
System Control Theory, pages 170-177, 1991.
[Cer85] V. Cerny. Thermodynamical approach to the travelling salesman prob¬lem: An efficient simulation algorithm. Journal of Optimization Theoryand Applications, 45:41-51, 1985.
[CFMS97] Jon Christensen, Stacy Friedman, Joe Marks, and Stuart Shieber. Em¬
pirical testing of algorithms for variable-sized label placement. In Pro¬
ceedings of the 13th Annual ACM Symposium on Computational Geom¬
etry, pages 415-417, 1997.
[CHdW87] M. Chams, A. Hertz, and D. de Werra. Some experiments with simulated
annealing for coloring graphs. European Jounal of Operational Research,
32:260-266, 1987.
[CJ90] Anthony C. Cook and Christopher B. Jones. A Prolog rule-based system
for cartographic name placement. Computer Graphics Forum, 9(2):109-126, 1990.
Bibliography 157
[CL097] David Cox, John Little, and Donal O'Shea. Ideals, Varieties, and Al¬
gorithms: An Introduction to Computational Algebraic Geometry and
Commutative Algebra. Undergraduate texts in mathematics. Springer
Verlag, New York, 1997.
[CMS93] Jon Christensen, Joe Marks, and Stuart Shieber. Algorithms for carto¬
graphic label placement. In Proceedings of the American Congress on
Surveying and Mapping 1, pages 75-89, 1993.
[CMS95] Jon Christensen, Joe Marks, and Stuart Shieber. An empirical studyof algorithms for point-feature label placement. ACM Transactions on
Graphics, 14(3):203-232, 1995.
[Coc93] Maurice Cochand. A fixed point operator for the generalised maximum
satisfiability problem. Discrete Applied Mathematics, 46:117-132, 1993.
[Con92] David Connolly. General purpose simulated annealing. Journal of the
Operational Research Society, 43(5):495-505, 1992.
[Cra89] Yves Crama. Recognition problems for special classes of polynomials in
0-1 variables. Mathematical Programming, 44:139-155, 1989.
[Dev89] Robert L. Devaney. An Introduction to Chaotic Dynamical Systems.
Addison-Wesley, 1989.
[DF92] Jeffrey S. Doerschler and Herbert Freeman. A rule-based system for
dense-map name placement. Communications of the ACM, 35:68-79,1992.
[dWH89] D. de Werra and A. Hertz. Tabu search techniques. A tutorial and an
application to neural networks. OR Spektrum, 11(3).T31-141, 1989.
[FA87] Herbert Freeman and John Ahn. On the problem of placing names in
a geographic map. International Journal of Pattern Recognition and
Artifical Intelligence, 1(1):121-140, 1987.
[FW91] Michael Formann and Frank Wagner. A packing problem with appli¬cations to lettering of maps. In Proceedings of the 7th Annual ACM
Symposium on Computational Geometry, pages 281-288, 1991.
[GJ79] Michael R. Garey and David S. Johnson. Computers and Intractabil¬
ity. A Guide to the Theory of NP-Completeness. A series of books in
the mathematical sciences. W.H.Freeman and Company, San Francisco,1979.
[GL92] Fred Glover and Manuel Laguna. Modern Heuristic Techniques for Com¬
binatorial Problems. 1992.
[Glo89] F. Glover. Tabu search - part I. ORSA Journal on Computing, 1:190-
200, 1989.
158 Bibliography
[GS91] G. Gallo and B. Simeone. Optimal grouping of researchers into depart¬
ments. Ricerca Operativa, 57:45-69, 1991.
[HdW87] A. Hertz and D. de Werra. Using tabu search techniques for graph
coloring. Computing, 29:345-351, 1987.
[Hir82] Stephen A. Hirsch. An algorithm for automatic name placement around
point data. The American Cartographer, 9(1):5-17, 1982.
[HK99] Alain Hertz and Daniel Kobler. A tabu search for the constrained semi-
assignment problem, to appear, 1999.
[HM96] Uwe Helmke and John B. Moore. Optimization and Dynamical Systems.
Springer, 3rd printing, 1996.
[HRVW96] Christoph Helmberg, Franz Rendl, Robert Vanderbei, and HenryWolkowicz. An interior-point method for semidefinite programming.SIAM Journal of Optimization, 6(2):342-361, 1996.
[HS86] P. Hansen and B. Simeone. Unimodular functions. Discrete Applied
Mathematics, 14:269-281, 1986.
[HS88] Josef Hofbauer and Karl Sigmund. The Theory of Evolution and Dynam¬
ical Systems. London Mathematical Society Student Texts 7. Cambridge
University Press, 1988.
[Imh62] Eduard Imhof. Die Anordnung der Namen in der Karte. In International
Yearbook of Cartography, pages 93-129, Bonn Bad Godesberg, 1962.
Kirschbaum.
[Imh75] Eduard Imhof. Positioning names on maps. The American Cartographer,
2(2):128-144, 1975.
[Kar72] R.M. Karp. Reducibility among combinatorial problems. In R.E. Miller
and J.W. Thatcher, editors, Complexity of Computer Computation,
pages 85-103. Plenum Press, New York, 1972.
[Kar75] R.M. Karp. On the computational complexity of combinatorial prob¬lems. Networks, 5:45-68, 1975.
[KGV83] S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi. Optimization bysimulated annealing. Science, 220(4598):671-680, May 1983.
[KI88] T. Kato and H. Imai. The NP-completeness of the character placement
problem of 2 or 3 degrees of freedom. In Record of Joint Conference ofElectrical and Electronic Engineers in Kyushu, page 1138, 1988.
[Kur90] Petr Kurka. Game theoretical models of mutation and selection. In
J. Maynard Smith and G. Vida, editors, Organizational constraints on
the dynamics of evolution, Proceedings in non linear science, chapter 16,
pages 213-220. Manchester University Press, 1990.
Bibliography 159
[LA83] V. Losert and E. Akin. Dynamics of games and genes: Discrete versus
continuous time. Journal of Mathematical Biology, 17:241-251, 1983.
[LA87] P.J.M. Laarhoven and E.H.L. Aarts. Simulated Annealing: Theory and
Applications. Reidel Publishing Company, 1987.
[Mal94] Federico Malucelli. A polynomially solvable class of quadratic semi-
assignment problems. Technical report, Dipartimento di Informatica,Université di Pisa, 1994.
[MP94] Federico Malucelli and Daniele Pretolani. Quadratic semi-assignment
problems on structured graphs. Ricerca Operativa, 69:57-78, 1994.
[MS65] T.S. Motzkin and E.G. Straus. Maxima for graphs and a new proof of a
theorem of Turan. Canadian Journal of Mathematics, 17:533-540, 1965.
[MS82] John Maynard Smith. Evolution and the Theory of Games. Cambridge
University Press, 1982.
[MS91] Joe Marks and Stuart Shieber. The computational complexity of car¬
tographic label placement. Technical Report TR-05-91, Harvard CS,1991.
[Rhy70] J. Rhys. A selection problem of shared fixed costs and networks. Man¬
agement Science, 17:200-207, 1970.
[SG76] S. Sahni and T. Gonzales. P-complete approximation problems. Journal
of the ACM, 23:555-565, 1976.
[Sig87] Karl Sigmund. Game dynamics, mixed strategies, and gradient systems.
Theoretical Population Biology, 32:114-126, 1987.
[Tay79] P. Taylor. Evolutionary stable strategies with two types of players. Jour¬
nal of Applied Probability, 16:76-83, 1979.
[TJ78] P. Taylor and L. Jonker. Evolutionary stable strategies and game dy¬namics. Mathematical Biosciences, 40:145-156, 1978.
[vKSW98] Marc van Kreveld, Tycho Strijk, and Alexander Wolff. Point set labelingwith sliding labels. In Proceedings of the l^th Annual ACM Symposiumon Computational Geometry, pages 337-346, 7-10 June 1998.
[WB91] Chyan Victor Wu and Barbara Pfeil Buttenfield. Reconsidering rules
for point-feature name placement. Cartographica, 28(l):10-27, 1991.
[Wei96] Jürgen W. Weibull. Evolutionary Game Theory. The MIT Press, 1996.
[Won94] Wing Sing Wong. Gradient flows for local minima of combinatorial
optimization problems. Fields Institute Communications, 3, 1994.
160 Bibliography
[Won95] Wing Sing Wong. Matrix representation and gradient flows for NP-hard
problems. Journal of Optimization Theory and Applications, 87(1):197-220, 1995.
[WW95] Frank Wagner and Alexander Wolff. Map labeling heuristics: Provably
good and practically useful. In Proceedings of the 11th Annual ACM
Symposium on Computational Geometry, pages 109-118, 5-7 June 1995.
[WW97] Frank Wagner and Alexander Wolff. A practical map labeling algorithm.
Computational Geometry: Theory and Applications, 7:387-404, 1997.
[WW98] Frank Wagner and Alexander Wolff. A combinatorial framework for map
labeling. In Sue H. Whitesides, editor, Proceedings of the Symposium on
Graph Drawing '98, volume 1547 of Lecture Notes in Computer Science,
pages 316-331. Springer-Verlag, 13-15 August 1998.
[Yoe72] Pinhas Yoeli. The logic of automated map lettering. The Cartographic
Journal, 9:99-108, 1972.
[Zor86] Steven Zoraster. Integer programming applied to the map label place¬ment problem. Cartographica, 23(3):16-27, 1986.
[Zor90] Steven Zoraster. The solution of large 0-1 integer programming problemsencountered in automated cartography. Operations Research, 38(5):752-759, 1990.
[Zor91] Steven Zoraster. Expert systems and the map label placement problem.
Cartographica, 28(l):l-9, 1991.
Curriculum Vitae
Personal Data
Name: Michael Burkard
Date of birth: August 7th, 1971
Place of birth: Graz, Austria
Nationality: Austria
Marital status: Single
Education
1977-1981 Primary school in Berg. Gladbach, Germany
1981-1989 High school in Graz, Austria
1990-1994 Study of Technical Mathematics at TU Graz, Austria
1994-2000 Assistant at the Institute for Operations Research (IFOR), ETH
Zürich. Writing a dissertation at IFOR on a heuristic for constrained
semi-assignment problems (Prof. H.-J. Lüthi).