
Research Collection

Doctoral Thesis

A continuous relaxation based heuristic for a class of constrained semi-assignment problems

Author(s): Burkard, Michael

Publication Date: 2000

Permanent Link: https://doi.org/10.3929/ethz-a-003925104

Rights / License: In Copyright - Non-Commercial Use Permitted


Diss. ETH No. 13756

A Continuous

Relaxation Based Heuristic

for a Class of Constrained

Semi-Assignment Problems

A dissertation submitted to the

Swiss Federal Institute Of Technology

Zurich

for the degree of

Doctor of Technical Sciences

presented by

Michael Burkard

Dipl.-Ing., TU-Graz

born 7th August 1971

citizen of Austria

accepted on the recommendation of

Prof. Dr. H.-J. Lüthi, examiner

Dr. A. Gaillard, co-examiner

Prof. Dr. T. Liebling, co-examiner

2000

Acknowledgement

Special thanks go to Prof. H.-J. Lüthi, who made this work possible by offering me

a position at the Institute of Operations Research (IFOR). His personal engagement

has helped me greatly to finish this work.

Furthermore I am grateful to Prof. T. Liebling at EPFL for his friendly support

and his willingness to referee my thesis.

I am indebted to A. Gaillard for her constant support and her kind supervision of

this thesis. I also wish to thank M. Cochand for introducing me to this topic and

sharing his expertise on this subject with me. His critical review of the work has

decisively improved this thesis.

At EPFL I also want to thank Prof. A. Hertz and D. Kobler for their development and implementation of a Tabu Search method for the C-SAP, which made the comparisons in Chapter 3 possible.

Moreover, B. Mateev's implementation of a visualisation tool for label placement

problems was a great help to me and made the comparisons of different models in

Chapter 4 much easier. It also enabled me to include some nice pictures illustrating

impressively the quality of different placements.

I wish to thank my family and all my friends in Switzerland and abroad for their

moral support, encouragement and much beyond ...

Finally I wish to thank all members at IFOR who made my time in Zurich enjoyable. I am especially obliged to my colleagues G. Studer and L. Finschi for numerous

stimulating discussions which often brought up new ideas and insights. Last but not

least I wish to thank our system administrators, D. Moser and W. Schlickenrieder, who always kept my system running, supported me with their excellent UNIX know-how and provided several useful Perl scripts.

Contents

Acknowledgement
Glossary of Notation
Abstract
Zusammenfassung

1 Semi-Assignment Problems: A Survey
1.1 Introduction
1.2 Problem Statement and Overview
1.2.1 The Constrained Semi-Assignment Problem
1.2.2 Heuristic Solution Methods for the C-SAP
1.2.3 Dynamical Systems in Combinatorial Optimization
1.2.4 Overview
1.3 Combinatorial Problems Formulated as C-SAP's
1.4 Idea of the Algorithm

2 The Semi-Assignment Problem
2.1 Introduction
2.2 On the Objective Function of the R-SAP
2.2.1 Basic Properties
2.2.2 Equivalent $\Delta$-Polynomials
2.2.3 Relationship between Discrete and Continuous Local Optima
2.2.4 Saddle Points, Karush-Kuhn-Tucker Points and Nash Equilibria
2.3 Properties of the Iteration Sequence
2.3.1 Growth Transformations
2.3.2 Fixed Points and Accumulation Points
2.3.3 On the Convergence
2.3.4 Relations between Local Optima and Attractors
2.4 Guaranteed Region of Attraction for $F[\Gamma]$
2.4.1 An Introductory Example
2.4.2 Characterization of the Guaranteed Region of Attraction
2.4.3 The Special Case of Quadratic $\Delta$-Polynomials
2.5 A Region of Attraction for $F[\xi]$
2.6 Computational Experiments with the Max-Cut Problem

3 The Constrained Semi-Assignment Problem
3.1 Introduction and Algorithm for C-SAP
3.2 Properties of the Operator for C-SAP
3.3 Implementation and Numerical Results
3.3.1 Implementation
3.3.2 Instance Generation
3.3.3 Numerical Results

4 The Point Feature Label Placement Problem
4.1 Label Placement - State of the Art
4.2 Methods behind Label Placement
4.3 Problem Descriptions
4.4 Preprocessing Strategies
4.5 Models for Label Placement
4.5.1 Minimizing the Number of Pairwise Overlaps (P1)
4.5.2 Minimizing the Number of Conflicting Labels (P2)
4.5.3 Models for Label Placement with Point Selection (P3)
4.6 Postprocessing Strategies
4.7 Ambiguity
4.8 Computational Results
4.8.1 Test Set
4.8.2 Reduction by Preprocessing
4.8.3 Comparison of the Models
4.8.4 Comparison to Other Heuristics
4.8.5 Implementation Details and Parameter Settings

A Gauss-Seidel Version of the Operator
B The Concept of KKT Points
C A Gradient Approach
D Definitions
D.1 General Definitions
D.2 Specific Notations

List of Tables

2.1 Results for 10 Max-Cut instances
2.2 Results for large Max-Cut instances
3.1 Composition of weighted partial assignments $(\mathcal{T}, w)$
3.2 Comparison of FPH to TS
4.1 Desired properties of an aesthetically attractive placement (from [FA87])
4.2 Average reduction of constraints by pre-processing for problems with $n = 750$
4.3 Average reduction of constraints by pre-processing for problems with $n = 1500$
4.4 Comparison of (M1) and (M2)
4.5 Varying the number of iterations for (M2) with 750 point features
4.6 Comparison of methods for instances without point selection. Taken from Figure 11 in [CMS95]
4.7 Comparison of different heuristics for random maps without point selection
4.8 Comparison of different heuristics for random maps with point selection

List of Figures

1.1 SAP and C-SAP with optimal assignments $A_1$ and $A_2$
1.2 Trajectories of three different dynamics in the $(p_{11}, p_{21})$-plane
2.1 Plots of the graph $z(p) = 3p_{11}p_{21} - p_{11} - p_{21} + 1$
2.2 Trajectories and regions of attraction
2.3 Objective values of $z$ in the $(p_{11}, p_{21})$-plane
2.4 Nash equilibria of a Max-3-Cut instance
2.5 Example illustrating sets $U_\varepsilon$ of Corollary 2.31
2.6 Example with a discrete local maximum $p^*$ which is not an accumulation point of any sequence started in $\Delta^0$

2.7 Regions of attraction for $z_1$ and $z_2$ with $M := 5000$; black region corresponds to $p^*$, light gray region to $q^*$ (numerically determined)
2.8 Region of monotone increase; arrows indicate directions of increase
2.9 Leaving $M_h(p^*)$ because of too large step length
2.10 Examples for $Q_1(p, p^*)$
2.11 Two examples of sets $M_h(p^*)$ and $GRA(p^*)$
2.12 Extreme points $p^u$ of $Q_1(p, p^*)$ and $Q_2(p, p^*)$ for $n = 1$, $k = 3$ with $p = (p_1, p_2, p_3)$
2.13 Construction of the guaranteed region of attraction $GRA(p^*)$
2.14 Iterations vs. quality: solid = min, dashed = avg
3.1 Graph of $f(f(f(x)))$ and the line $y = x$
3.2 Behavior of FPH for $\Gamma_6$, $\Gamma_{29}$ and $\Gamma_{36}$ (left: without greedy, right: with greedy)
3.3 Behavior of FPH for $\Gamma_{36}$, $\Gamma_{36}^2$ and $\Gamma_{36}^3$ (left: without greedy, right: with greedy)

4.1 Position priorities suggested by Yoeli [Yoe72]
4.2 Example demonstrating the difference between the objectives 'minimize pairwise overlaps' and 'minimize conflicting labels'
4.3 Example of why (R1) must not be used for (P1) or (P2) when optimality should be preserved
4.4 Motivating example for the objective function of (P2)
4.5 Contradicting objectives and improvement by postprocessing (PP1)
4.6 Resolving ambiguous placements
4.7 Neighborhood $R_i(j)$ for point feature $i$ used by (AMBI1) and (AMBI2)
4.8 Placement where (AMBI1) does not work
4.9 Distance rules for placement with (AMBI2)
4.10 Placements of ambiguity avoiding strategies (AMBI1) and (AMBI2)
4.11 Comparison of (M1) and (M2) with the example of a random map with 750 point features with point selection prohibited
4.12 Comparison of different methods for $n = 750$ (left) and $n = 1500$ (right) features

Glossary of Notation

$\alpha, \beta$  exponents of $\Gamma$ and $\Theta$ used in $F[\Gamma^{\alpha}\Theta^{\beta}]$, p. 18
$\Gamma$  fitness function, p. 18
$\Gamma_{ir}(z,p), \Gamma_{ir}(p)$  partial derivative of $z$ with respect to $p_{ir}$, (1.21)
$\Delta$  cross product of $n$ simplices $\Delta_k$, (1.9)
$\Delta_k$  unit simplex in $k$-space, p. 14
$\Delta^I$  integer vertices of $\Delta$, (1.8)
$\Delta^0$  relative interior of $\Delta$, p. 14
$\Delta_\varepsilon$  relative interior of $\Delta$ with $\varepsilon$-boundary, (3.8)
$\Delta_r$  $\{p \in \Delta \mid \mathrm{supp}(p^*) \subseteq \mathrm{supp}(p)\}$, (2.84)
$\Theta$  fitness function, p. 18
$\Theta_{ir}(p)$  function used to define repellor dynamics $F[\Theta]$, (1.22)
$\xi$  fitness function used by operator $F$, p. 17
$\Xi_i(p)$  denominator of $F$, (1.20)
**(p)  function $\Delta \to \Delta_\varepsilon$ used in the definition of $F_\varepsilon[\xi](p)$, (3.10)
$A, \mathcal{A}$  assignment, set of all assignments, p. 3
^nr iV  coefficients of $h_l(p)$ and $s_l(p)$, p. 66
$\mathrm{aff}(\Delta)$  affine subspace of $\Delta$, p. 33
$c, c_i$  constants, $c_i$ defined in (2.98)
Cone()  cone of the vectors, (2.93)
Conv()  convex hull of the vectors, (1.9)
$C(T), C(R)$  clauses corresponding to partial assignments $T$ and $R$, (1.4), (1.5)
$F[\xi](p)$  operator $F$ with fitness function $\xi$, (1.20)
$F(z,p)$  short for $F[\xi(z,p)](p)$, used to distinguish different $z$
nap)  Gauss-Seidel version of operator $F[\xi](p)$, (A.1)
$F_\varepsilon[\xi](p)$  $\varepsilon$-restricted version of $F[\xi](z,p)$, (3.11)
$\{F^t(p^0)\}_{t>0}$  sequence generated by $F$ with starting point $p^0$
^  face of $\Delta$ which contains $p^*$, p. 31
$G, G_i$  monotone transformation (2.35), mostly used as $\xi = G(\Gamma)$
$G = (V, E)$  undirected graph with vertex set $V$ and edge set $E$, p. 4
$GRA(p^*)$  guaranteed region of attraction of $p^*$, (2.87), (2.103)
$h_l(p)$  linear functions describing $M_h(p^*)$, p. 66
$i, j \in N$  indices of variables, p. 3

$(i, r)$  assignment, short for $x_i = r$, p. 3
$[i, j]$  edge between nodes $i$ and $j$ in an undirected graph, p. 4
$K = \{1, \dots, k\}$  set of possible values to be assigned, p. 3
$l \in L$  $L$ index set of halfspaces in $M_h(p^*)$, p. 66
$m$  degree of the $\Delta$-polynomial $z(p)$, p. 25
$M_h(p^*)$  region of monotone increase, (2.85), (2.102)
$N = \{1, \dots, n\}$  index set of decision variables, p. 3
$\mathcal{N}(p^*)$  discrete neighborhood of $p^* \in \Delta^I$, (2.16)
$p, q \in \Delta$  variables, $n \times k$-matrices, p. 14
$p_{ir} \in [0, 1]$  probability for $x_i = r$, p. 15
$p_i$  $i$-th row of matrix $p$, p. 15
$p^t = F^t(p^0)$  element in iteration sequence $\{F^t(p^0)\}$, $t > 0$, p. 16
$p' = F(p)$  next point in iteration sequence, $p'_{ir} = u_{ir} p_{ir}$, p. 38
$P_N$  set of all $\Delta$-polynomials which contain all variables in $N$, p. 25
tjN  subset of $P_N$ for which $F$ is defined, p. 62
$PRA(z, p^*)$  form-dependent subset of $RA(z, p^*)$, p. 75
$q^T$  transpose of $q$, p. 16
$Q(p, p^*)$  subset of $\Delta$ corresponding to feasible direction of $F$, (2.86), (2.93)
$r, s \in K$  (indices of) assigned values, p. 3
$R_i(p)$  rest term in affine linear writing of $z(p)$, (2.10)
$R \in \mathcal{R}$  forbidden partial assignment (constraint of C-SAP), p. 3
$\mathcal{R}$  set of all forbidden partial assignments, p. 3
$\mathbb{R}_+$  set of positive reals
$RA(z, p^*)$  region of attraction of $p^*$, p. 56
$\mathrm{Round}(p)$  matrix in $\Delta^I$, where largest component is rounded to 1, (3.12)
$s_l(p)$  linear functions describing $GRA(p^*)$, (2.91), (2.106)
$s_i(p^*) \in K$  index for $p^* \in \Delta^I$ with $p^*_{i s_i(p^*)} = 1$, $s(p^*) \in K^n$, p. 62
$\mathrm{supp}(p)$  support of $p$, p. 33
$T \in \mathcal{T}$  partial assignment for the objective function $z(p)$, p. 3
$\mathcal{T}$  set of all partial assignments for the objective function, p. 3
$(T, w_T)$  weighted partial assignment, p. 3
$u_{ir}(p)$  factor in $F$ by which $p_{ir}$ is multiplied: $F(p)_{ir} = u_{ir} p_{ir}$, (2.39)
$U_\varepsilon(p^*)$  connected component of $U'_\varepsilon(p^*)$ which contains $p^*$, p. 45
$U'_\varepsilon(p^*)$  $\{p \in \Delta \mid z(p) > z(p^*) - \varepsilon\}$, (2.64)
$U(p^*)$  neighborhood of $p^*$
$\mathcal{U}_\varepsilon(p^*)$  open ball with center at $p^*$ and radius $\varepsilon > 0$
$URA(p^*)$  universal region of attraction of $p^*$ (form-independent), p. 62
$v_i$  Lagrangean multiplier, $v_i := \max_{r \in K} \Gamma_{ir}(p)$, p. 33
$w_T, w_{ir}$  weights in the objective function $z(p)$, p. 3
$x_i$  $i$-th decision variable, p. 3
$z(p)$  objective function of C-SAP, (1.10)
$z_1 \equiv z_2$  equivalent $\Delta$-polynomials, $z_1$ and $z_2$ agree on $\Delta$, p. 26

Abstract

In this thesis we consider the constrained semi-assignment problem (C-SAP). It is a generalization of the well-known Pseudo-Boolean Optimization problem in which the Boolean variables are replaced by discrete decision variables. Additionally, constraints are given by a set of clauses (similar to those of the Satisfiability problem) which prohibit certain assignments.

Many well-known combinatorial optimization problems can be formulated as a C-SAP, and the problem also occurs frequently in real-world applications. Since the C-SAP is an NP-hard problem, it cannot be expected that the global optimum can be determined efficiently. For this reason one goal of this work was the development of a heuristic which can be used efficiently for some of those C-SAP instances for which local search methods are doomed to failure due to their myopia. The newly developed fixed point heuristic (FPH) of this thesis generalizes a method of Cochand for the Generalized Maximum Satisfiability problem (G-Max-Sat) such that it can also be applied to constrained maximization problems such as the C-SAP. Using the example of the point feature label placement problem, we will show that FPH quickly determines good approximate solutions for real-world problems as well.

Essentially FPH is based on a discrete dynamical system which results from iterating an appropriately defined operator. For any fixed starting point a sequence of points is generated in this way. The operator should be chosen such that, for a large portion of starting points, this sequence converges to a good local maximum of the problem.

By an appropriate choice of the operator in FPH, global information can be included in the solution process. Thus for certain C-SAP instances FPH has a big advantage over local search methods, which would fail in such situations due to their local vision and lack of orientation.

The major part of this thesis deals with the semi-assignment problem (SAP), where the constraints are not taken into account. We present a continuous embedding of the problem and discuss the relationship between continuous and discrete local maxima. Then we define an operator for FPH and prove some interesting properties with respect to its convergence. Based on the fact that attractors are equivalent to strict local maxima, we study the regions of attraction of these points and characterize some subregions thereof for the special case of quadratic objective functions.

Further on, the operator used for the SAP is extended such that it can be applied to the C-SAP. To this end the operator gets a new component which has a repelling effect in the neighborhood of certain infeasible solutions. We investigate this behavior and some other properties of this operator and then discuss implementation details of FPH. Finally, we compare FPH with a Tabu Search method which was developed especially for the C-SAP.

The final part of this thesis concentrates on an application of the C-SAP, the so-called point feature label placement problem. Its goal is to attach text elements to given points on a map. This text should be placed clearly, in such a way that overlaps and ambiguous assignments are avoided whenever possible. For this reason we derive different models for this problem, all of which are based on the C-SAP formulation. Additionally we discuss various pre- and postprocessing strategies which improve the efficiency of the algorithm and the quality of its solutions. We conclude with a comparison of the results of our models with the best known approach for this task.

Zusammenfassung

In this thesis we deal with the constrained semi-assignment problem (C-SAP). The C-SAP is a generalization of the well-known Pseudo-Boolean Optimization problem in which discrete decision variables are used instead of Boolean variables. In addition, we have constraints in the form of clauses (similar to those of the Satisfiability problem) which forbid certain assignments.

Many well-known combinatorial optimization problems can be formulated as a C-SAP, and the problem is also frequently encountered in practice. Since the C-SAP is an NP-hard problem, it cannot be expected that exact solutions can be determined efficiently. For this reason one goal of this work was to develop a heuristic that can be used efficiently especially for those C-SAP instances for which local search methods fail because of their short-sightedness. The fixed point heuristic (FPH) newly developed in this thesis generalizes a method developed by Cochand for the Generalized Maximum Satisfiability problem (G-Max-Sat) in such a way that it can also be used for constrained maximization problems such as the C-SAP. Using the map labeling problem, we will show that FPH quickly delivers good approximate solutions for practical problems as well.

Essentially, FPH is based on a discrete dynamical system. By iterating a suitable operator, a sequence of points is generated from a starting point. The operator is to be defined so that this sequence converges to a good local maximum of the problem for as many starting points as possible.

By a suitable choice of the operator in FPH, global information about the problem can also be fed into the solution process. For certain C-SAP instances this brings a great advantage over local search methods, which in such cases often lose their orientation and fail because of their local view.

A large part of this thesis deals with the semi-assignment problem (SAP), in which the constraints are disregarded. We present a continuous embedding of the problem, on which FPH will operate, and discuss the relationship between continuous and discrete local maxima. We then define an operator for FPH and prove some interesting properties related to its convergence behavior. Based on the fact that attractors and strict local maxima are equivalent, we finally study the regions of attraction of these points and characterize some subregions of them for the special case of quadratic objective functions.

Subsequently, the operator used for the SAP is extended so that it can be applied to the C-SAP. In doing so, the operator receives a new component with a repelling effect in the neighborhood of certain infeasible solutions. We investigate this and further properties of the operator and describe some implementation details of FPH. Finally, we compare FPH with a Tabu Search method developed specifically for the C-SAP.

The last part of this thesis deals with an application of the C-SAP, the so-called map labeling problem (label placement problem). The goal is to label given points on a map with text. The text should be placed as clearly as possible, avoiding overlaps and ambiguous assignments. To accomplish this task, different models for the problem are derived, all based on the C-SAP formulation. In addition, we discuss various pre- and postprocessing strategies which improve the efficiency of the algorithm and the quality of the solutions. In conclusion, we present the results of a comparison between our models and the best known method for this task.

Chapter 1

Semi-Assignment Problems: A Survey

1.1 Introduction

In the commonwealth of combinatorial optimization problems we have two major kingdoms. One consists of efficiently solvable problems, for which an algorithm exists that determines an optimal solution in polynomial time. The other vast kingdom consists of NP-hard optimization problems, which nowadays can only be solved optimally by more or less implicit enumeration of all feasible solutions. The semi-assignment problem discussed in this thesis belongs to this latter class.

Since NP-hard problems are so difficult to solve optimally, heuristics play a crucial role in this area. In general a heuristic does not find the global optimum of a problem. For this reason, when we write in this thesis that "a heuristic solves a problem", we mean only the search for a good approximate solution.

An important class of heuristics consists of the so-called local search methods, whose most prominent representatives are Simulated Annealing and Tabu Search. These methods search in neighborhoods of feasible solutions and determine a local optimum there. Though local search methods in general perform quite well, there nevertheless exist instances for which they behave poorly.

A typical drawback of local search methods is their inherent myopia: their local vision does not allow them to process the global information of a problem instance. This phenomenon can be observed impressively with instances which are characterized by a very large global maximum surrounded only by points with equally small objective values. We can imagine the landscape of the solution space as a vast constant plateau (with small objective values) in the midst of which a geyser (corresponding to our global maximum) erupts. For this reason we will call objective functions of this type geyser functions; they will be described later in more detail. Since there are no points of reference within the whole plateau which may indicate the direction to the global maximum, local search methods cannot deal with this problem type and will fail to find the optimum.

In order to overcome this drawback of local search methods, we design in this thesis a new method, called fixed point heuristic (FPH), which also takes global features of the problem into account. Our hope with FPH was that by including global information in the solution process, it would not be necessary to comb such a large portion of the search space as local search methods may have to do. FPH uses the global information to direct the search towards a hopefully large local optimum.

In order to follow up this idea, we refrain from using a neighborhood and define FPH instead as a specialized discrete dynamical system which is a generalization of Cochand's algorithm for the Generalized Maximum Satisfiability problem [Coc93]. Based on a continuous embedding of the problem, and thanks to the special structure of the objective function, this allows us to include some global information in the solution process. Thus in some extreme cases, such as for geyser functions, FPH is able to find the global optimum independently of the starting point, where local search methods would certainly fail.

Naturally FPH was not only designed for geyser functions, but should, as a general purpose heuristic, also provide results comparable to those of local search methods for 'non-structured' problems. In the last chapter of this thesis, we will show the broad applicability and the effectiveness of FPH with the practical example of the point feature label placement problem, where we compare our results with those of Simulated Annealing.

1.2 Problem Statement and Overview

In this section we present the formal definition of the constrained semi-assignment

problem and discuss solution methods for this difficult problem. We conclude in

Section 1.2.4 with an overview of this thesis.

1.2.1 The Constrained Semi-Assignment Problem

The constrained semi-assignment problem (C-SAP) generalizes the Pseudo-Boolean Optimization problem and combines it with the Satisfiability problem. In contrast to these problems, however, the C-SAP uses, instead of Boolean variables, decision variables which take values in an arbitrary finite subset of $\mathbb{N}$. In order to proceed with a formal description of the problem, we introduce the following notation, which will be used throughout this thesis:

Let decision variables $x_i$ for $i \in N := \{1, 2, \dots, n\}$ and a set of possible values $K := \{1, 2, \dots, k\}$ be given. In the C-SAP we assign to each variable $x_i$, $i \in N$, exactly one value $r(i) \in K$; this assignment will further on be denoted by $(i, r(i))$. As a convention we always use $i, j$ to refer to indices of variables and $r, s$ for (indices of) assigned values.

Definition 1.1 (Assignments)
Let $N$ be the index set of the decision variables and let $K$ be the set of possible values for each variable.

1. An assignment $A$ is a set of the form $A := \{(i, r(i)) \mid i = 1, \dots, n,\ r(i) \in K\} \subset N \times K$. Moreover, we denote by $\mathcal{A}$ the set of all possible assignments.

2. A partial assignment $T$ is a set of the form $T := \{(i, r(i)) \mid i \in M,\ r(i) \in K\}$ for some $M \subseteq N$. Moreover, if a positive weight $w_T \in \mathbb{R}_+$ is given, we call $(T, w_T)$ a weighted partial assignment.

3. A partial assignment $T$ is satisfied by a given assignment $A \in \mathcal{A}$ if $T \subseteq A$.

Now the following problems are defined.

Definition 1.2 (C-SAP, SAP)
The constrained semi-assignment problem C-SAP for the decision variables $x_i$, $i \in N$, with possible values in $K$ is defined by a 5-tuple $(n, k, \mathcal{R}, \mathcal{T}, w)$ where

1. $\mathcal{R}$ is a set of (forbidden) partial assignments with $|R| \ge 2$ for all $R \in \mathcal{R}$ (defining the constraints)

2. $(\mathcal{T}, w)$ is a class of weighted partial assignments (defining the objective function)

and consists in

$$\max\ z(A) = \sum_{T \in \mathcal{T}} \{ w_T \mid T \subseteq A \} \qquad (1.1)$$
$$\text{s.t.}\quad A \in \mathcal{A} \qquad (1.2)$$
$$R \not\subseteq A \quad \forall R \in \mathcal{R}. \qquad (1.3)$$

An assignment $A \in \mathcal{A}$ which satisfies (1.3) is called a feasible assignment.
The semi-assignment problem SAP is the special case of the C-SAP in which $\mathcal{R} = \emptyset$; it is given by the 4-tuple $(n, k, \mathcal{T}, w)$.
If $\mathcal{T} = \emptyset$ then the C-SAP reduces to the so-called Generalized Satisfiability problem (G-Sat).
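To make Definition 1.2 concrete, the following minimal Python sketch evaluates the objective (1.1) and the constraint (1.3) on a toy instance and finds the optimum by brute-force enumeration. The set-of-pairs representation, the function names and the instance data are illustrative assumptions, not taken from the thesis; enumeration is only viable for tiny $n$ and $k$.

```python
from itertools import product

def objective(assignment, weighted_partials):
    """z(A) from (1.1): sum of w_T over all partial assignments T contained in A."""
    return sum(w for T, w in weighted_partials if T <= assignment)

def is_feasible(assignment, forbidden):
    """Constraint (1.3): no forbidden partial assignment R may be contained in A."""
    return not any(R <= assignment for R in forbidden)

def solve_by_enumeration(n, k, forbidden, weighted_partials):
    """Enumerate all k**n assignments and return a best feasible one."""
    best, best_value = None, float("-inf")
    for values in product(range(1, k + 1), repeat=n):
        A = frozenset((i + 1, r) for i, r in enumerate(values))
        if is_feasible(A, forbidden):
            value = objective(A, weighted_partials)
            if value > best_value:
                best, best_value = A, value
    return best, best_value

# Hypothetical toy instance with n = 3 variables and k = 2 values.
weighted_partials = [
    (frozenset({(1, 1), (2, 2)}), 3.0),
    (frozenset({(2, 2), (3, 1)}), 2.0),
    (frozenset({(1, 2)}), 0.5),
]
forbidden = [frozenset({(1, 1), (3, 1)})]  # x_1 = 1 and x_3 = 1 must not occur together

best, best_value = solve_by_enumeration(3, 2, forbidden, weighted_partials)
```

In this toy instance the assignment $x = (1, 2, 1)$ would collect both large weights, but it contains the forbidden partial assignment, so the constrained optimum is a different assignment with a smaller value.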


The C-SAP is a unifying framework for a broad class of well-known combinatorial optimization problems. On the one hand the Maximum-Satisfiability (Max-Sat), Maximum-$k$-Cut (Max-$k$-Cut) and $k$-Coloring problems can be formulated as SAP's. On the other hand Satisfiability (Sat), Maximum-Clique (Max-Clique) and label placement are typical representatives of the C-SAP. All these problems and their corresponding C-SAP formulations will be described in Section 1.3.

An important special case of the SAP is the well-known Pseudo-Boolean Optimization problem, which uses Boolean decision variables and can therefore be

formulated as a SAP with K = {1,2}. First investigations of this problem go back

to Hammer et al. during the 60s, and since then many papers have been written

on this subject. Special classes of Pseudo-Boolean functions are characterized in

[Cra89]. For some of them efficient algorithms for the optimization problem have

been found (see e.g. [Rhy70, BM85, HS86]).

There also exist many real world problems which can be formulated as SAP's or

C-SAP's. Especially the task allocation problem is a classical representative of the

semi-assignment problems. Many applications for this problem exist, including

assignments of professors to departments [GS91], assignments of tasks to processors

[BCS92] as well as classical allocation problems in production systems and fleet

assignments.

In order to give a first impression on how such problems can be modeled, let us

discuss the task assignment problem in a heterogeneous multiple processor system.

This problem has been investigated in [BCS92]. Now we will show how it can be formulated as a SAP. Let a set of $k$ non-identical processors $K := \{1, \dots, k\}$ and a set of $n$ tasks $N := \{1, \dots, n\}$ be given. The tasks have to be assigned to the processors in order to be executed. Some of these tasks have to communicate with each other. These intertask communications are represented by an undirected graph $G = (N, E)$, where $[i, j] \in E$ if and only if tasks $i$ and $j$ communicate with each other. For each edge $[i, j] \in E$ we incur communication costs $c_{ij}$ if the tasks $i$ and $j$ are assigned to different processors. If both tasks are executed by the same processor, their communication costs are negligible. Finally we have execution costs $w_{ir}$ if task $i$ is assigned to processor $r$. The objective of this task allocation problem is to assign each task to exactly one processor and to minimize the sum of execution and communication costs. Note that the order in which the tasks are executed does not matter.

To model this problem as a SAP we define $\mathcal{T} := \mathcal{T}_1 \cup \mathcal{T}_2$, where the partial assignments in $(\mathcal{T}_1, w_1)$ correspond to the execution costs and $(\mathcal{T}_2, w_2)$ models the intertask communications. For the execution costs we simply have to add the weights of the assigned tasks, and therefore

$$(\mathcal{T}_1, w_1) := \{ (\{(i, r)\}, w_{ir}) \mid \forall (i, r) \in N \times K \}.$$

Regarding the intertask communication we have costs $c_{ij}$ for each pair of communicating tasks being executed on different processors. Hence we define

$$(\mathcal{T}_2, w_2) := \{ (\{(i, r), (j, s)\}, c_{ij}) \mid \forall [i, j] \in E,\ \forall r, s \in K,\ r \neq s \}.$$

By minimizing $z(A) := \sum_{T \in \mathcal{T}} \{ w_T \mid T \subseteq A \}$ subject to $A \in \mathcal{A}$ we solve the stated task allocation problem. Of course this minimization problem can be converted to a maximization problem, but this transformation results in negative weights. Since we defined the SAP only for positive weights, a further transformation is necessary. We will see in Section 2.2.2 that negative weights can always be eliminated and thus the problem can indeed be formulated as a SAP.
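The task allocation model can be checked on a small example. The Python sketch below builds the singleton partial assignments for the execution costs and the pair partial assignments for the communication costs, and verifies that the resulting $z(A)$ equals the directly computed cost for every assignment. The cost data and all function names are hypothetical, chosen only for illustration; the sketch keeps the minimization form and does not perform the conversion to positive-weight maximization mentioned in the text.

```python
from itertools import product

def build_task_allocation_sap(n, k, exec_cost, comm_cost):
    """Weighted partial assignments: singletons carry w_ir, pairs carry c_ij."""
    singles = [(frozenset({(i, r)}), exec_cost[i, r])
               for i in range(1, n + 1) for r in range(1, k + 1)]
    pairs = [(frozenset({(i, r), (j, s)}), c_ij)
             for (i, j), c_ij in comm_cost.items()
             for r in range(1, k + 1) for s in range(1, k + 1) if r != s]
    return singles + pairs

def z(assignment, weighted_partials):
    """Sum of the weights of all partial assignments contained in the assignment."""
    return sum(w for T, w in weighted_partials if T <= assignment)

def direct_cost(values, exec_cost, comm_cost):
    """Execution plus communication cost; values[i-1] is the processor of task i."""
    execution = sum(exec_cost[i, r] for i, r in enumerate(values, start=1))
    communication = sum(c for (i, j), c in comm_cost.items()
                        if values[i - 1] != values[j - 1])
    return execution + communication

# Hypothetical data: 3 tasks, 2 processors, communicating pairs [1,2] and [2,3].
n, k = 3, 2
exec_cost = {(1, 1): 4, (1, 2): 2, (2, 1): 1, (2, 2): 3, (3, 1): 2, (3, 2): 1}
comm_cost = {(1, 2): 5, (2, 3): 1}
weighted_partials = build_task_allocation_sap(n, k, exec_cost, comm_cost)

costs = {}
for values in product(range(1, k + 1), repeat=n):
    A = frozenset(enumerate(values, start=1))
    costs[values] = z(A, weighted_partials)
    # The SAP objective and the direct cost must agree for every assignment.
    assert costs[values] == direct_cost(values, exec_cost, comm_cost)

cheapest = min(costs, key=costs.get)
```

For each communicating pair assigned to different processors, exactly one of the pair partial assignments with $r \neq s$ is contained in $A$, which is why the cost $c_{ij}$ is counted once.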

1.2.2 Heuristic Solution Methods for the C-SAP

During the last years much time and effort has been spent on structure and complexity analyses of NP-hard optimization problems and on the development of algorithms to handle them. The NP-hardness of the above-mentioned combinatorial optimization problems was already proven during the 70s [Kar72, Kar75, SG76, GJ79], and only a few easily solvable special cases [Bok81, Bok87, Mal94, MP94] are known. Hence, in the following years research concentrated on the development of heuristics. Several different methods like genetic algorithms, neural networks, evolutionary strategies and local search methods have been developed to attack these hard problems. Of these heuristics, especially local search methods achieved a breakthrough: Simulated Annealing, a special Metropolis process going back to Kirkpatrick et al. [KGV83] and Cerny [Cer85] (see also [BR84, CHdW87, LA87, AK89, Con92]), and Tabu Search [HdW87, dWH89, Glo89, GL92]. On the one hand these methods are applicable to a broad class of optimization problems; on the other hand they often provide well-accepted results in very short time and are therefore used frequently for hard combinatorial optimization problems like the C-SAP.

In spite of all these advantages, there exist C-SAP instances where local search methods are doomed to failure due to their local vision and their inability to orient themselves in the solution space. One such problem is the so-called geyser function, which we already mentioned in the introduction. We describe this function here in detail:

Example 1.3 (Geyser Function)
Let $N$, $K$ be given, let $\mathcal{A}' \subseteq \mathcal{A}$ be the set of feasible assignments and let $A^* \in \mathcal{A}'$ be a unique strict global maximum with objective value $z(A^*) := M > 0$. Moreover, let $z(A) := 0$ for all other feasible assignments $A \in \mathcal{A}' \setminus \{A^*\}$. The problem of finding a feasible assignment of maximal value can be formulated as a C-SAP with $(\mathcal{T}, w) := \{(A^*, M)\}$ and $\mathcal{R} := \mathcal{A} \setminus \mathcal{A}'$.

We see that geyser functions have a large (discrete), constant plateau (many neighboring solutions with equal objective values). There is only one single point therein which has a larger objective value. This is the reason why neighborhood search methods cannot deal efficiently with this problem type.
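The plateau effect can be illustrated numerically. The Python sketch below builds a small geyser function and counts how many assignments have at least one improving neighbor. It assumes, purely for illustration, that all assignments are feasible and that the neighborhood consists of changing the value of exactly one variable; neither the neighborhood nor the instance sizes are taken from the thesis.

```python
from itertools import product

def geyser(a_star, M):
    """Objective with value M at a_star and 0 everywhere else."""
    def z(assignment):
        return M if assignment == a_star else 0
    return z

def neighbors(assignment, k):
    """All assignments that differ from `assignment` in exactly one variable."""
    for i, r in enumerate(assignment):
        for s in range(1, k + 1):
            if s != r:
                yield assignment[:i] + (s,) + assignment[i + 1:]

# Hypothetical sizes: n = 6 variables, k = 3 values, geyser height M = 100.
n, k, M = 6, 3, 100
a_star = (1,) * n
z = geyser(a_star, M)

all_assignments = list(product(range(1, k + 1), repeat=n))

# Assignments from which a single exchange step can see any improvement at all.
informative = [a for a in all_assignments
               if any(z(b) > z(a) for b in neighbors(a, k))]
```

Only the $n(k-1)$ direct neighbors of $A^*$ see an improving move; from every other point of the plateau all neighbors look identical, so a neighborhood search has no indication of where to go.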

For this reason one main goal of this work was the development and study of a general purpose heuristic for solving C-SAP's, which will be called 'fixed point heuristic' (FPH). Our hope is that this heuristic will overcome the above-mentioned weakness of local search heuristics and thus represent a good alternative to these methods in some difficult cases, for example for geyser functions.

FPH is based on a continuous relaxation of the C-SAP. We embed the set of

all solutions in continuous space thus getting a polytope whose extremal points

correspond to all possible assignments. Then we construct an appropriate operator

in the interior of this polytope such that the evolving discrete time dynamical

system generates a sequence of points which converges under certain assumptions

to a local optimum of the problem (see Section 1.4).

The advantage of such a dynamical system approach lies in the fact that global

information about the problem is directly included during the solution process. In

this way the search for good solutions is not restricted by a discrete neighborhood

system as it is the case with local search methods.

In the sequel we discuss some dynamical systems used for combinatorial optimization

problems, where we concentrate especially on the class of replicator dynamics which

plays an important role in the definition of the operator used in FPH.

1.2.3 Dynamical Systems in Combinatorial Optimization

The use of dynamical systems in combinatorial optimization certainly got an

impulse from the success of interior point methods in convex optimization. An

excellent survey of dynamical systems in optimization can be found in the book of

Helmke and Moore [HM96]. The possibility of efficiently shortcutting through the

interior of a feasible region opened some hope for dealing with NP-hard problems

in a similar way.

Brockett and Wong [BW91] used a gradient flow approach to a special class of assignment problems of the type $\min_{P \in \mathcal{P}_n} \operatorname{tr}(C^T P)$, where $C$ is a given cost matrix and $\mathcal{P}_n$ the set of $n \times n$ permutation matrices. Though this assignment problem is polynomially solvable, Wong [Won94, Won95] has shown that this approach can easily be extended to a larger class of combinatorial optimization problems including the Travelling Salesman problem and the Graph Partition problem.

To solve the C-SAP, FPH uses a discrete dynamical system which is closely related

to the so-called adjusted replicator dynamics. This dynamics has its origin in the

field of theoretical biology where it was studied in its single- and multi-population form. The single-population replicator dynamics goes back to Taylor and Jonker

[TJ78] whereas its multi-population counterpart was introduced by Maynard Smith

[MS82]. Both dynamics were also studied in their discrete and continuous version

in [Aki79, LA83, Sig87, HS88]. During the 90s important relations between the

fields of theoretical biology, evolutionary game theory [Wei96] and optimization

theory [BPG97] were investigated and many similarities were found. Some of these


important results regarding the stability of equilibria and their relationship to local

optima will be described in Section 1.4.

Independently of these investigations, Cochand [Coc93] used the adjusted replicator

dynamics in a heuristic approach to the Generalized Maximum Satisfiability

(G-Max-Sat) problem (see p. 12), followed a few years later by work of Bomze

[Bom96, Bom97] for the quadratic programming and the Max-Clique problem.

This recent work on the G-Max-Sat and the Max-Clique problem has shown once

more the efficiency of the adjusted replicator dynamics as a promising approach for attacking hard combinatorial optimization problems. In a similar way as the

adjusted replicator dynamics was used by Cochand, we will define operators for

FPH in order to use the resulting discrete dynamical system to deal with the SAP

and the C-SAP.

Derivation of Operators for FPH

The operator used in FPH for the C-SAP is a generalization of Cochand's operator

[Coc93] for the G-Max-Sat. Based on the author's suggestion that similar operators

may be suited for solving constrained maximization problems as well, we combined

the operator for the G-Max-Sat with an additional part being responsible for the

maximization of the objective function. Thus the two goals of the C-SAP, the

satisfaction of all constraints and the maximization of the objective function are

reflected directly in the definition of the operator of FPH.

Moreover, later during our research we came upon the works of Baum and Eagon

[BE67] as well as Baum and Sell [BS68], who investigated some properties of the

maximization part of the operator. Though their works were not aimed at solving combinatorial optimization problems, we could build on their papers and derive in this way some generalizations of their results.

Our hope with such a dynamical system approach lies not so much in shortcutting through the interior as in finding suitable dynamics for which the region of attraction of points with large objective values is also large. Unfortunately, such methods may also have disadvantages, like uninterpretable fixed points in the interior of the feasible domain or numerical instability during the computation. We will see in this work that for the operator used in FPH for the SAP, fixed points in the interior are always unstable and attractors of the operator are equivalent to strict local maxima. For the C-SAP a small modification of the operator proposed in [Coc93] guarantees as well that all fixed points in the interior are unstable. Numerical instability cannot always be avoided and depends heavily on the data of the given C-SAP instance.


1.2.4 Overview

Section 1.3 describes how some well-known combinatorial optimization problems fit into the setting of the C-SAP. The transition from the discrete problem to a

continuous model, as well as the basic ideas of the associated dynamical system

constitute Section 1.4.

The rest of this thesis is divided into three parts each presented in a separate chapter.

The first part discusses the SAP. It starts in Section 2.1 with a short problem

description and continues with a repetition of the algorithmic main issues and

the definition of a basic operator for the SAP. Certain properties of the objective

function and a special class of polynomials are outlined in Section 2.2. Moreover, this section introduces an important equivalence relation between these polynomials and then characterizes discrete and continuous local optima. In Section 2.3

we review important aspects of the dynamical system introduced by Baum, Sell

[BS68]. This discussion brings up the terminology of growth transformations,

basically an operator generating a sequence of points with monotonically increasing

objective values. We define a new, more general class of operators and show

sufficient conditions under which these operators are growth transformations.

The next subsections deal with properties of the iteration sequence, especially

concentrating on the relationship between fixed points, attractors and local

optima. Moreover, we derive some convergence results and deal with the special

case of a SAP with linear objective function. Investigations of the regions of

attraction are carried out in Section 2.4. We will see that in case of the SAP, the

objective function is not uniquely defined over the set of feasible points. Hence

we can define different 'forms' of the objective function which all agree over some

domain, but have different regions of attraction. We investigate the influence

of such modifications of the objective function on the regions of attraction and

we characterize a so-called guaranteed region of attraction which turns out to

be form-independent. This leads to a stopping criterion for the algorithm which

additionally speeds up the computation. In Section 2.5 we look once more at the

regions of attraction, now however for a class of more general operators. Again we present a stopping criterion for the algorithm and focus in the sequel on

connections with the guaranteed region of attraction defined for the SAP. Finally, numerical results with the example of the Max-Cut problem are given in Section 2.6.

In the second part of this thesis, in Chapter 3, we add constraints $\mathcal{R}$ in the form of forbidden partial assignments to the previously studied SAP. This extended problem, the C-SAP, is introduced in Section 3.1. There we also treat the construction

of the new operator dealing with the additional constraint set. The new part of

the operator is adapted from Cochand [Coc93] and as we will outline in Section 3.2

it is able to repel infeasible points under certain conditions. Moreover, we show

in this section that if the objective value of a feasible, global maximum is large


enough, then FPH converges to this optimum for any starting point lying not too

close to the boundary; it thus overcomes one weakness of local search methods

described before. The second part concludes with some implementation details of

the algorithm, a presentation of numerical results and a comparison with Tabu

Search in Section 3.3.

The final part, Chapter 4, presents the example of the point feature label placement

problem as an application of the theory developed in the previous two chapters. After

an introduction in Section 4.1 stating the problem and discussing its complexity,

we give in Section 4.2 an overview of the wide variety of well-known algorithms

developed for this task. In Section 4.3 we model the label placement problem as a C-SAP

and describe furthermore the problem of point selection. Section 4.4 introduces

four preprocessing rules which reduce the number of constraints in the C-SAP model.

Moreover these rules are optimality preserving in the sense that an optimal solution

with the same objective value as before, also exists in the reduced search space. In

Section 4.5 different models for the label placement problem based on variants of

the C-SAP are discussed. Section 4.6 presents two postprocessing strategies which

can be combined with the previously derived models, thus additionally improving an already found placement. Besides the aspect of placing labels without overlaps, aesthetic criteria often play an important role. For this reason we discuss in Section 4.7 two models based on the C-SAP whose goals are to avoid ambiguous situations. Towards this end the objective function is designed such that labels are placed as far apart from each other as possible. Finally, several numerical results are shown in Section 4.8. Besides a discussion of the reduction by the preprocessing rules, we also depict various placements and compare our models to one another and to other heuristics especially designed for this task. Parameter settings and implementation details conclude this chapter.

1.3 Combinatorial Problems Formulated as C-SAP's

We have already seen that the C-SAP is given by a 5-tuple $(n, k, \mathcal{R}, \mathcal{T}, w)$, where $(\mathcal{T}, w)$ is a set of weighted partial assignments defining the objective function, and $\mathcal{R}$ is a set of forbidden assignments forming the constraints. The following example illustrates the definitions of Section 1.2:

Example 1.4

Consider a SAP instance $(n, k, \mathcal{T}, w)$ and a C-SAP instance $(n, k, \mathcal{R}, \mathcal{T}, w)$ with five decision variables, $n = 5$, $N := \{1, \dots, 5\}$, and four possible values, $k = 4$, $K := \{1, \dots, 4\}$. The set of weighted partial assignments $(\mathcal{T}, w)$ with $\mathcal{T} := \{T_1, \dots, T_4\}$ for both the SAP and the C-SAP and the constraints $\mathcal{R} := \{R_1, R_2\}$ for the C-SAP are given as shown in the following table.


objective function                              constraints
weights   partial assignments                   partial assignments
5         T1: (1,3) (2,4)                       R1: (1,2) (5,1)
7         T2: (1,2) (2,2) (3,3)                 R2: (2,4) (3,4) (4,2)
11        T3: (4,2) (5,1)
2         T4: (5,4)

[Figure 1.1: SAP and C-SAP with optimal assignments $A_1$ and $A_2$. (a) SAP given by $(n, k, \mathcal{T}, w)$, $A_1 = \{\bullet\}$. (b) C-SAP given by $(n, k, \mathcal{R}, \mathcal{T}, w)$, $A_2 = \{\bullet\}$. Horizontal axis: decision variables $x_1, \dots, x_5$; vertical axis: possible values.]

Figure 1.1(a) shows $(\mathcal{T}, w)$ for the SAP, whereas Figure 1.1(b) additionally includes the constraints $\mathcal{R}$ for the C-SAP.

In both figures we depict the decision variables horizontally and the possible values vertically. Moreover, they show the sets $T_1, \dots, T_4$ with their corresponding weights (solid lines), and Figure 1.1(b) also shows the constraints $R_1$ and $R_2$ (dashed lines).

To get an assignment $A$ we have to select exactly one point in each column. Moreover, $A$ is a feasible assignment for the C-SAP if it does not include all points of either of the constraints $R_1, R_2$. If all points of a partial assignment $T_i$, $i = 1, \dots, 4$, are contained in an assignment $A$, then its corresponding weight $w_{T_i}$ is added to the objective value of $A$.

For the SAP the assignment $A_1 := \{(1,2), (2,2), (3,3), (4,2), (5,1)\}$ has objective value 18, because it satisfies the weighted partial assignments $(T_2, 7)$ and $(T_3, 11)$. It is shown in Figure 1.1(a) by the solid dots and is optimal for the SAP.

However, $A_1$ is infeasible for the C-SAP because $R_1$ is a subset of $A_1$ ($R_1$ violates (1.3)). An optimal assignment for the C-SAP is $A_2 := \{(1,3), (2,4), (3,1), (4,2), (5,1)\}$ (depicted in Figure 1.1(b) by solid dots) with objective value $z(A_2) = 16$. Note that although $A_2$ includes two of the points in $R_2$, it does not contain them all and therefore does not violate constraint (1.3) for $R_2$.
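The numbers of Example 1.4 can be checked by complete enumeration, since there are only $4^5 = 1024$ assignments. The following sketch uses the data of the table; the set-based representation and the helper names `objective`, `feasible` and `all_assignments` are illustrative assumptions.

```python
from itertools import product

# Data of Example 1.4: weighted partial assignments (T, w) and constraints R.
T = [
    (5,  {(1, 3), (2, 4)}),
    (7,  {(1, 2), (2, 2), (3, 3)}),
    (11, {(4, 2), (5, 1)}),
    (2,  {(5, 4)}),
]
R = [{(1, 2), (5, 1)}, {(2, 4), (3, 4), (4, 2)}]
n, k = 5, 4

def objective(A):
    """Sum of the weights of all partial assignments completely contained in A."""
    return sum(w for w, t in T if t <= A)

def feasible(A):
    """A is feasible if it contains no forbidden partial assignment completely."""
    return not any(r <= A for r in R)

def all_assignments():
    """Enumerate all assignments: exactly one value per decision variable."""
    for values in product(range(1, k + 1), repeat=n):
        yield {(i + 1, r) for i, r in enumerate(values)}

A1 = {(1, 2), (2, 2), (3, 3), (4, 2), (5, 1)}
A2 = {(1, 3), (2, 4), (3, 1), (4, 2), (5, 1)}
best_sap = max(objective(A) for A in all_assignments())
best_csap = max(objective(A) for A in all_assignments() if feasible(A))
```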

Since combinatorial optimization problems are often formulated on the basis of

logical clauses, we show next the relationship between partial assignments and

logical clauses:

Any partial assignment $T \in \mathcal{T}$ with $T = \{(i, r(i)) \mid i \in M, r(i) \in K\}$ for $M \subseteq N$ can also be identified by a logical clause $C(T)$ of the form

$$C(T) = \bigwedge_{i \in M} (i, r(i)) \qquad (1.4)$$

where the clause $C(T)$ is satisfied by an assignment $A$ if and only if $T \subseteq A$.

For the constraints, each partial assignment $R \in \mathcal{R}$ given by $R = \{(i, r(i)) \mid i \in M, r(i) \in K\}$ for $M \subseteq N$ can be interpreted as a clause $C(R)$:

$$C(R) = \neg \Big( \bigwedge_{i \in M} (i, r(i)) \Big) \qquad (1.5)$$

which is satisfied by an assignment $A$ if and only if $R \not\subseteq A$.

Using this relationship between partial assignments and logical clauses, we present some combinatorial optimization problems which can be formulated as SAP's:

• Max-Sat

For a given set of Boolean variables $x_i$, their negations $\bar{x}_i$, $i \in N$, and a collection $C(\mathcal{T})$ of clauses, the Max-Sat problem (in conjunctive normal form) consists in finding a truth assignment for the variables which maximizes the number of satisfied clauses in $C(\mathcal{T})$.

Let $K := \{1, 2\}$; then a truth assignment for $x$ corresponds to an assignment $A(x)$: for $i \in N$,

$$x_i = \text{True} \;:\Leftrightarrow\; (i, 1) \in A(x), \qquad \bar{x}_i = \text{True} \;:\Leftrightarrow\; (i, 2) \in A(x). \qquad (1.6)$$

A clause $C(T) \in C(\mathcal{T})$ of the Max-Sat problem in conjunctive form is given by

$$C(T) = \bigwedge_{i \in M_1} x_i \wedge \bigwedge_{i \in M_2} \bar{x}_i, \qquad M_1 \cap M_2 = \emptyset, \quad M_1 \cup M_2 \subseteq N.$$

By (1.6) there exists for each $C(T)$ a corresponding partial assignment $T$:

$$C(T) = \bigwedge_{i \in M_1} (i, 1) \wedge \bigwedge_{i \in M_2} (i, 2) \;\Leftrightarrow\; T = \{(i, 1) \mid i \in M_1\} \cup \{(i, 2) \mid i \in M_2\}. \qquad (1.7)$$


Let $\mathcal{T}$ be the set of partial assignments corresponding to $C(\mathcal{T})$. We see that a clause $C(T)$ is satisfied by $x$ if and only if $T$ is satisfied by the assignment $A(x)$, and therefore the Max-Sat problem can be written as a SAP with weights $w_T = 1$ for all $T \in \mathcal{T}$.

The generalization of the Max-Sat problem where $|K| > 2$ and the decision variables $x_i$ may take any value $r_i \in K$ is called the Generalized Maximum-Satisfiability (G-Max-Sat) problem (see also [Coc93]).

• Pseudo-Boolean Optimization

A pseudo-Boolean function is a real-valued function of 0-1 variables, which can be expressed as a polynomial in the variables $x_1, \dots, x_n$ and their complements $\bar{x}_1, \dots, \bar{x}_n$. The Pseudo-Boolean Optimization problem consists in maximizing the function value of such a polynomial.

Again, to each monomial $w_T \prod_{i \in M_1} x_i \prod_{i \in M_2} \bar{x}_i$ of the objective function with $M_1 \cap M_2 = \emptyset$ and $M_1 \cup M_2 \subseteq N$ there corresponds the weighted partial assignment $(T, w_T)$ with

$$T = \{(i, 1) \mid i \in M_1\} \cup \{(i, 2) \mid i \in M_2\}.$$

Let $(\mathcal{T}, w)$ be the set of weighted partial assignments corresponding to the set of monomials of the objective function, $K := \{1, 2\}$, and let the correspondence between a truth assignment $x$ and an assignment $A(x)$ be given by (1.6). Then the objective function can be written as follows:

$$\max \sum_{T \in \mathcal{T}} w_T \prod_{(i,1) \in T} x_i \prod_{(i,2) \in T} \bar{x}_i \;=\; \max \sum_{T \in \mathcal{T}} \{ w_T \mid T \subseteq A(x) \}$$

and therefore the Pseudo-Boolean Optimization problem is a special case of the SAP, too.

• Max-k-Cut

Let a graph $G = (N, E)$ with vertex set $N$, edge set $E$ and positive edge weights $w_{ij}$ for the edges $[i, j] \in E$ be given. The Max-k-Cut problem consists in finding a partition of $N$ into $k$ subsets such that the sum of the weights of edges whose endpoints lie in different sets is maximal.

Let $K := \{1, \dots, k\}$ represent the sets of the partition. Each node $i \in N$ must be assigned to a set $r(i) \in K$, which can be represented by a mapping $r : N \to K$. Then the objective function of the Max-k-Cut problem is

$$\max \sum_{[i,j] \in E:\; r(i) \neq r(j)} w_{ij}.$$

We write for any $i \in N$, $r \in K$: $(i, r)$ for the predicate that node $i$ belongs to set $r$. Then the SAP $(n, k, \mathcal{T}, w)$ corresponds to the Max-k-Cut problem if $\mathcal{T}$ consists of all partial assignments of the form $T = \{(i, r), (j, s)\}$ for all $[i, j] \in E$, $r, s \in K$, $r \neq s$, and $w_T := w_{ij}$.


• k-Coloring

Let a graph $G = (N, E)$ and a set of colors $K := \{1, \dots, k\}$ be given. The k-Coloring problem consists in finding an assignment of $k$ colors to the nodes $i \in N$ such that the number of edges between equally colored nodes is minimal.

The objective is equivalent to maximizing the number of edges between differently colored nodes. We see immediately the relation to the unweighted Max-k-Cut problem, where all nodes in the same partition will get the same node color.
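The Max-k-Cut reduction above can be written down directly. The sketch below builds the weighted partial assignments for a small, made-up graph and verifies that the SAP objective agrees with the cut weight for every partition; the function names and the example graph are illustrative assumptions.

```python
from itertools import product

def max_k_cut_as_sap(edges, k):
    """Build (T, w): one partial assignment {(i, r), (j, s)} per edge [i, j] and pair r != s."""
    sap = []
    for (i, j), w in edges.items():
        for r, s in product(range(1, k + 1), repeat=2):
            if r != s:
                sap.append((w, frozenset({(i, r), (j, s)})))
    return sap

def sap_objective(sap, A):
    """SAP objective: total weight of partial assignments contained in A."""
    return sum(w for w, t in sap if t <= A)

def cut_weight(edges, part):
    """Weight of all edges whose endpoints lie in different sets."""
    return sum(w for (i, j), w in edges.items() if part[i] != part[j])

edges = {(1, 2): 3, (2, 3): 1, (1, 3): 2, (3, 4): 5}  # hypothetical instance
nodes = [1, 2, 3, 4]
k = 2
sap = max_k_cut_as_sap(edges, k)

agree = all(
    sap_objective(sap, set(zip(nodes, values)))
    == cut_weight(edges, dict(zip(nodes, values)))
    for values in product(range(1, k + 1), repeat=len(nodes))
)
best = max(
    sap_objective(sap, set(zip(nodes, values)))
    for values in product(range(1, k + 1), repeat=len(nodes))
)
```

For every edge with differently assigned endpoints exactly one of the $k(k-1)$ partial assignments of that edge is contained in $A$, which is why the two objectives coincide.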

Some well-known problems that can be formulated as C-SAP's are

• Sat

Given a set of Boolean variables $x_i$, $i \in N$, and a collection of clauses in the form of disjunctions of some literals, the Sat problem consists in deciding whether or not there exists a truth assignment that simultaneously satisfies all these clauses.

Each disjunction can also be represented as the negation of a conjunction as described in (1.5). We denote the set of all these clauses by $C(\mathcal{R})$ and the corresponding set of partial assignments by $\mathcal{R}$.

This problem is a special case of the C-SAP, because it does not have an objective function ($\mathcal{T} = \emptyset$), but consists only of the constraint set $\mathcal{R}$. Again let $K := \{1, 2\}$ and the connection between a truth assignment $x$ and a feasible assignment $A(x)$ be given by (1.6). Then $A(x)$ satisfies (1.3) if and only if all clauses $C(R)$ with $R \in \mathcal{R}$ are satisfied, and $A(x)$ is therefore a feasible assignment for the C-SAP.

As for the G-Max-Sat problem, we call the Sat problem where more than two potential values can be assigned to the variables $x_i$ the Generalized Satisfiability (G-Sat) problem.

• Max-Clique

Let a graph $G = (N, E)$ be given. The Max-Clique problem consists in finding the largest complete subgraph in $G$.

We introduce for every node $i \in N$ a 0-1 variable $x_i$ and define

$$x_i = 1 \;:\Leftrightarrow\; i \text{ belongs to a clique.}$$

Then the Max-Clique problem can be stated as follows:

$$\max \sum_{i \in N} x_i$$
$$\text{s.t.} \quad x_i x_j = 0 \quad \forall [i, j] \notin E$$
$$x_i \in \{0, 1\} \quad \forall i \in N.$$

The objective function maximizes the number of nodes in a clique, whereas the constraints guarantee that only nodes of one clique are counted: If two nodes $i$ and $j$ are not connected by an edge then they cannot both belong to the same clique.

Let $K := \{0, 1\}$; then the objective function is given by the set of weighted partial assignments $(\mathcal{T}, w)$ with $\mathcal{T} := \{\{(i, 1)\}, i \in N\}$ and $w_T = 1$ for all $T \in \mathcal{T}$. Analogously, the constraint set $\mathcal{R}$ is formulated by $\mathcal{R} := \{\{(i, 1), (j, 1)\}, i, j \in N, [i, j] \notin E\}$. Then $(n, 2, \mathcal{R}, \mathcal{T}, w)$ describes the Max-Clique problem as a C-SAP.

• Label Placement (see Chapter 4)

Let n points in the plane be given, denoting points of interest on a map.

Moreover, for each point a label (description of the point) is given which has

to be placed at one of k given positions in order to mark the corresponding

point. The label placement problem consists in selecting one of the k label

positions for each point such that when placing the labels at these positions,

they do not overlap, ambiguity is avoided and other aesthetical criteria are

fulfilled.

Preferences of label positions as well as ambiguity can be modeled by some

suitable objective function, overlapping by the constraint set. Different models

will be discussed in Chapter 4.
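As an illustration of a C-SAP formulation from this list, the following sketch builds $(\mathcal{T}, w)$ and $\mathcal{R}$ for the Max-Clique problem on a small, made-up graph and solves the resulting C-SAP by enumeration. The graph and the helper names are assumptions for demonstration only; enumeration is of course not the method proposed in this work.

```python
from itertools import combinations, product

def max_clique_as_csap(nodes, edges):
    """Return (T, R) over values K = {0, 1}: T rewards chosen nodes, R forbids non-adjacent pairs."""
    edge_set = {frozenset(e) for e in edges}
    T = [(1, frozenset({(i, 1)})) for i in nodes]
    R = [frozenset({(i, 1), (j, 1)})
         for i, j in combinations(nodes, 2)
         if frozenset((i, j)) not in edge_set]
    return T, R

def solve_by_enumeration(nodes, T, R):
    """Brute-force search over all 0-1 assignments (only sensible for tiny instances)."""
    best_value, best_A = -1, None
    for values in product((0, 1), repeat=len(nodes)):
        A = set(zip(nodes, values))
        if any(r <= A for r in R):
            continue  # A contains a forbidden partial assignment
        value = sum(w for w, t in T if t <= A)
        if value > best_value:
            best_value, best_A = value, A
    return best_value, best_A

nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]  # hypothetical graph
T, R = max_clique_as_csap(nodes, edges)
size, A = solve_by_enumeration(nodes, T, R)
clique = sorted(i for i, r in A if r == 1)
```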

All problems mentioned above are NP-hard. In order to find good approximate solutions for these problems, we want to include global information in the solution process. Towards this end we consider a continuous formulation of the C-SAP and introduce a variable $p \in \{0, 1\}^{n \times k}$ which uniquely represents any assignment $A \in \mathcal{A}$ by setting $p_{ir} = 1$ if and only if $(i, r) \in A$.

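Definition 1.5 (Feasible Points: $\Delta^I$, $\Delta$, $\Delta^0$, $\Delta_k$)

The set of all assignments is defined by

$$\Delta^I := \Big\{ p \in \{0, 1\}^{n \times k} \;\Big|\; \sum_{r=1}^{k} p_{ir} = 1, \; \forall i \in N \Big\}. \qquad (1.8)$$

Moreover, we denote its convex hull by

$$\Delta := \operatorname{Conv}(\Delta^I) = \Big\{ p \in [0, 1]^{n \times k} \;\Big|\; \sum_{r=1}^{k} p_{ir} = 1, \; \forall i \in N \Big\} \qquad (1.9)$$

and its relative interior by $\Delta^0$. One single simplex will be denoted by

$$\Delta_k := \Big\{ q \in [0, 1]^{k} \;\Big|\; \sum_{r=1}^{k} q_r = 1 \Big\}.$$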


We see that $p$ is an $n \times k$ matrix whose rows correspond to the decision variables and whose columns correspond to the values to be assigned. If $p \in \Delta^I$, then it uniquely represents an assignment $A$. On the other hand, if $p \in \Delta$, then $p_{ir} \in [0, 1]$ can be interpreted as the probability that value $r$ is assigned to decision variable $x_i$. We denote the rows of $p$ by $p_{i\cdot} \in \Delta_k$ for $i \in N$.

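Using this characterization of the set of all assignments, we formulate the C-SAP given by $(n, k, \mathcal{R}, \mathcal{T}, w)$ as in Definition 1.2 by the following integer program

$$\max\; z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) \qquad (1.10)$$
$$\text{s.t.} \quad p \in \Delta^I \qquad (1.11)$$
$$\prod_{(i,r) \in R} p_{ir} = 0 \quad \forall R \in \mathcal{R}. \qquad (1.12)$$

Note that a SAP $(n, k, \mathcal{T}, w)$ is then just given by (1.10) and (1.11).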
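To see the matrix formulation at work, the following sketch evaluates the polynomial objective (1.10) and the constraint products (1.12) for the data of Example 1.4, both at 0-1 matrices and at the barycenter of $\Delta$. The list-of-rows representation and the helper names are illustrative assumptions.

```python
from math import prod

# Example 1.4 with 1-based pairs (i, r); p[i - 1][r - 1] stores p_ir.
T = [(5, [(1, 3), (2, 4)]), (7, [(1, 2), (2, 2), (3, 3)]),
     (11, [(4, 2), (5, 1)]), (2, [(5, 4)])]
R = [[(1, 2), (5, 1)], [(2, 4), (3, 4), (4, 2)]]
n, k = 5, 4

def z(p):
    """Objective (1.10): weighted sum of monomials, one per partial assignment."""
    return sum(w * prod(p[i - 1][r - 1] for i, r in t) for w, t in T)

def constraint_products(p):
    """Left-hand sides of (1.12); p is feasible iff all of them are 0."""
    return [prod(p[i - 1][r - 1] for i, r in c) for c in R]

def indicator_matrix(values):
    """0-1 matrix in Delta^I for the assignment x_i = values[i - 1]."""
    return [[1 if r == v else 0 for r in range(1, k + 1)] for v in values]

p_A1 = indicator_matrix([2, 2, 3, 2, 1])
p_A2 = indicator_matrix([3, 4, 1, 2, 1])
barycenter = [[1 / k] * k for _ in range(n)]
```

At the barycenter every monomial of degree $d$ contributes $w_T / k^d$, so $z$ remains well defined on all of $\Delta$, which is what the relaxation in Section 1.4 exploits.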

1.4 Idea of the Algorithm

A common and often successful approach for solving the problems presented in the previous section are local search heuristics, like Simulated Annealing and Tabu Search. However, as we have seen in Example 1.3, their inherent myopia may prove their undoing for certain instances. For this reason one of our goals in the development of a new heuristic was to include the ability to 'orient itself', from which especially problems like those in Example 1.3 will profit. Towards this end we embed the set of all assignments in the continuous space by relaxing the integer constraints (1.11). Thus for a C-SAP $(n, k, \mathcal{R}, \mathcal{T}, w)$, we get the following relaxed problem:

$$\max\; z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) \qquad (1.13)$$
$$\text{s.t.} \quad \sum_{r=1}^{k} p_{ir} = 1 \quad \forall i \in N \qquad (1.14)$$
$$0 \le p_{ir} \le 1 \quad \forall i \in N, \; r \in K \qquad (1.15)$$
$$\prod_{(i,r) \in R} p_{ir} = 0 \quad \forall R \in \mathcal{R}. \qquad (1.16)$$

Analogously, for the SAP a relaxed problem is defined by (1.13)-(1.15) and will be denoted by R-SAP.


Working in the interior of $\Delta$, we define for our heuristic a continuous mapping $F : \Delta^0 \to \Delta^0$ and consider the discrete time dynamical system which results from iterating $F$. Thus the basic concept of our algorithm, called the 'fixed point heuristic', is the following:

Fixed Point Heuristic (FPH):

• Choose a starting point $p^0 \in \Delta^0$.

• Compute the sequence $p^t := F(p^{t-1})$, $t = 1, 2, \dots, l$ for some $l$.

• Choose as solution $p^* \in \Delta^I$ the assignment which is 'closest' to $p^l$.

Subsequently we will concentrate on the definition of $F$, which is central to obtaining a sequence of points converging to one of those feasible points in $\Delta^I$ that correspond to a 'good' solution of the given problem, for a reasonably large proportion of starting points $p^0 \in \Delta^0$.
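The three steps of FPH translate into a very short loop. The sketch below is only a skeleton under stated assumptions: 'closest' is interpreted as taking the largest entry in every row, and `toy_operator` is a placeholder mapping (square and renormalize each row) that merely stands in for the operator $F$ defined later; it is not the operator of this thesis.

```python
def fixed_point_heuristic(F, p0, iterations):
    """Iterate p^t = F(p^{t-1}) and round the last iterate row by row."""
    p = p0
    for _ in range(iterations):
        p = F(p)
    # 'Closest' assignment, here taken as the largest entry of every row (an assumption).
    assignment = [max(range(len(row)), key=row.__getitem__) + 1 for row in p]
    return assignment, p

def toy_operator(p):
    """Placeholder F: square every entry and renormalize each row."""
    squared = [[x * x for x in row] for row in p]
    return [[x / sum(row) for x in row] for row in squared]

p0 = [[0.4, 0.35, 0.25], [0.2, 0.3, 0.5]]  # interior starting point, rows sum to 1
assignment, p_final = fixed_point_heuristic(toy_operator, p0, iterations=20)
```

Any mapping that keeps the rows on their simplices can be plugged in for `F`; the quality of the heuristic depends entirely on that choice.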

The success of such discrete dynamical systems as an alternative to local search methods can, among other papers, be put down to the works of Cochand [Coc93] and Bomze et al. [BPG97] dealing with the Max-Sat and Max-Clique problem, respectively. Both of these papers use the described concept, where the dynamics is a special case of the adjusted replicator dynamics. Since we will work with this dynamics as well, we describe it below in more detail:

In its single-population version the replicator dynamics goes back to Taylor and Jonker [TJ78] and describes in the context of theoretical biology the evolution of a population over time [Aki79, Sig87, HS88]. It is defined for a population state $p^t \in \Delta_k$ at time $t$ by

$$p_r^{t+1} = \frac{(A p^t)_r}{(p^t)^T A p^t} \, p_r^t =: F(p^t)_r, \qquad r = 1, \dots, k, \qquad (1.17)$$

where each $r \in K$ denotes some property and the components $p_r^t$ represent the population shares of the individuals at time $t$. In theoretical biology $A$ is a symmetric matrix with non-negative elements, and in this case the so-called Fundamental Theorem of Natural Selection [LA83, Kur90] says that $(p^t)^T A p^t$, the average fitness, increases along every non-stationary trajectory of (1.17). Moreover, it is known that in this case every trajectory converges to an equilibrium.
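The monotonicity of the average fitness can be observed numerically. The following sketch iterates (1.17) for a small symmetric matrix with non-negative entries; the matrix, the starting point and the helper names are made-up illustrations.

```python
def replicator_step(p, A):
    """One step of (1.17): p_r <- p_r (Ap)_r / (p^T A p)."""
    Ap = [sum(a * x for a, x in zip(row, p)) for row in A]
    mean_fitness = sum(x * f for x, f in zip(p, Ap))
    return [x * f / mean_fitness for x, f in zip(p, Ap)]

def average_fitness(p, A):
    """The quadratic form p^T A p."""
    k = len(p)
    return sum(p[r] * A[r][s] * p[s] for r in range(k) for s in range(k))

A = [[1.0, 2.0, 0.5],   # symmetric, non-negative (hypothetical data)
     [2.0, 1.0, 3.0],
     [0.5, 3.0, 2.0]]
p = [0.6, 0.3, 0.1]
history = [average_fitness(p, A)]
for _ in range(50):
    p = replicator_step(p, A)
    history.append(average_fitness(p, A))
```

The list `history` is non-decreasing up to rounding errors, in line with the Fundamental Theorem of Natural Selection quoted above.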

Bomze et al. [BPG97, BBPP99] used the dynamics (1.17) as an approach to the Max-Clique problem. Let $A$ denote the adjacency matrix of a given graph $G$. Motzkin and Straus [MS65] have shown that $(1 - 2z^*)^{-1}$ is the size of the maximum clique, if $z^*$ denotes the optimal objective value of the following quadratic optimization problem

$$\max\; z(p) := \tfrac{1}{2} \, p^T A p \quad \text{s.t.} \quad p \in \Delta_k. \qquad (1.18)$$

Note that if $G$ has $k$ nodes, then $A$ is a symmetric $k \times k$ matrix and so in this case the Fundamental Theorem of Natural Selection holds. Besides investigating this quadratic optimization problem, the authors also interpret dynamics (1.17) in the game theoretical context, where matrix $A$ represents the payoff matrix of the players.
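The relation between (1.18) and the clique size can be illustrated on a tiny graph. The sketch below runs the dynamics (1.17) from the barycenter on a triangle with one pendant node (a made-up instance) and evaluates $(1 - 2z)^{-1}$ at the limit. On this particular graph the iteration reaches the maximum clique; in general the dynamics is only guaranteed to reach an equilibrium, so this is an illustration rather than a general procedure.

```python
def replicator_step(p, A):
    """One step of (1.17): p_r <- p_r (Ap)_r / (p^T A p)."""
    Ap = [sum(a * x for a, x in zip(row, p)) for row in A]
    mean_fitness = sum(x * f for x, f in zip(p, Ap))
    return [x * f / mean_fitness for x, f in zip(p, Ap)]

# Adjacency matrix: triangle {0, 1, 2} plus a pendant node 3 attached to node 2.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]

p = [0.25] * 4
for _ in range(200):
    p = replicator_step(p, A)

Ap = [sum(a * x for a, x in zip(row, p)) for row in A]
z_value = 0.5 * sum(x * f for x, f in zip(p, Ap))   # objective of (1.18)
clique_size = 1.0 / (1.0 - 2.0 * z_value)           # Motzkin-Straus relation
```

The limit point spreads its mass uniformly over the triangle, giving $z = 1/3$ and hence a clique size of 3.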

Independently of the interpretation of dynamics (1.17), its equilibria or fixed points, i.e. points for which $F(p) = p$, are of special interest. Obviously, they are given by the solutions of the equations

$$p_r \left[ (Ap)_r - p^T A p \right] = 0, \qquad r = 1, \dots, k. \qquad (1.19)$$

In the case of the quadratic optimization problem (1.18) with symmetric matrix $A$, it was shown in [BPG97] that not only are all solutions of (1.18) among these equilibria, but also that the following relations between equilibria and local optima hold: Strict local optima of (1.18) are equivalent to asymptotically stable equilibria of (1.17). Moreover, every local optimum of (1.18) is a Nash equilibrium, and Nash equilibria and the Karush-Kuhn-Tucker points of (1.18) are equivalent.

These results show, in the case of the single-population replicator dynamics, how the characterization of the equilibria also determines the local optima of the corresponding optimization problem. Subsequently we will turn our attention to its multi-population counterpart: Instead of working only over one simplex $\Delta_k$, it is defined on the cross product of simplices, $\Delta$. Let us introduce the notion of the fitness function:

Definition 1.6 (Fitness Function)

A vector $\xi = (\xi_{ir})$ ($i \in N$, $r \in K$), where $\xi_{ir} : \Delta^0 \to \mathbb{R}_+$ are continuous functions, is called fitness function on $\Delta^0$.

The adjusted multi-population replicator dynamics ([Tay79, MS82]) is defined for $p \in \Delta^0$ by the operator $F[\xi](p)$:

Definition 1.7 (Operator $F[\xi](p)$)

Let $\xi$ be a fitness function on $\Delta^0$. Then the operator $F[\xi] : \Delta^0 \to \Delta^0$ is defined for all $i \in N$, $r \in K$ by

$$F[\xi](p)_{ir} := \frac{p_{ir} \, \xi_{ir}(p)}{\Xi_i(p)} \quad \text{where} \quad \Xi_i(p) := \sum_{s \in K} p_{is} \, \xi_{is}(p). \qquad (1.20)$$

In the setting of multi-population dynamics where $n$ populations are considered whose members can adopt one of $k$ possible strategies, this dynamics describes the evolution of the proportion of individuals in population $i$ adopting strategy $r$ (see [Wei96]). This evolution equation expresses the fact that the relative increase $\frac{F[\xi](p)_{ir} - p_{ir}}{p_{ir}}$ of $p_{ir}$ equals the excess fitness $\frac{\xi_{ir}(p) - \Xi_i(p)}{\Xi_i(p)}$ of subpopulation $(i, r)$ over the average fitness $\Xi_i(p)$ of population $i$.
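A direct transcription of the operator (1.20) may help to fix ideas. In the sketch below the fitness function is passed in as an ordinary Python function returning one value per pair $(i, r)$; the two example fitness functions are hypothetical and serve only to show the renormalization and the excess-fitness interpretation.

```python
def apply_operator(p, fitness):
    """Operator (1.20): F[xi](p)_ir = p_ir * xi_ir(p) / sum_s p_is * xi_is(p)."""
    xi = fitness(p)
    result = []
    for p_row, xi_row in zip(p, xi):
        average = sum(x * f for x, f in zip(p_row, xi_row))  # average fitness of population i
        result.append([x * f / average for x, f in zip(p_row, xi_row)])
    return result

def constant_fitness(p):
    """All strategies equally fit: every point is a fixed point."""
    return [[1.0 for _ in row] for row in p]

def biased_fitness(p):
    """Hypothetical fitness that always favors the first strategy of every population."""
    return [[2.0] + [1.0] * (len(row) - 1) for row in p]

p = [[0.5, 0.5], [0.2, 0.8]]
same = apply_operator(p, constant_fitness)
shifted = apply_operator(p, biased_fitness)

# Relative increase of p_11 versus excess fitness of (1, 1) over the average 1.5.
relative_increase = (shifted[0][0] - p[0][0]) / p[0][0]
excess_fitness = (2.0 - 1.5) / 1.5
```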

Cochand [Coc93] has used an operator of type (1.20) for solving the G-Max-Sat problem. He has chosen an appropriate fitness function $\xi$ such that truth assignments which satisfy all clauses correspond to asymptotically stable fixed points of the dynamical system.

Since the C-SAP is a generalization of the G-Max-Sat problem, we adopt the fitness function used in [Coc93]. On the one hand, it should be responsible for the feasibility of the solutions by making logically forbidden assignments repellors. On the other hand, we want to get a good feasible solution. For this reason the first fitness function is combined with a second one whose task is to attract the system towards assignments with high objective values. This will be achieved by using the partial derivatives of the objective function as fitness function. Hence, we define

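Definition 1.8 (Gradient/Repellor Dynamics)

Let $z$ be the objective function of a SAP instance $(n, k, \mathcal{T}, w)$ with $\Gamma = (\Gamma_{ir})$ defined by

$$\Gamma_{ir}(z, p) := \frac{\partial z}{\partial p_{ir}}(p) = \sum_{T \in \mathcal{T}:\, (i,r) \in T} w_T \prod_{(j,s) \in T \setminus (i,r)} p_{js} \qquad \forall i \in N, \; r \in K. \qquad (1.21)$$

If $\Gamma$ is a fitness function on $\Delta^0$, i.e. $\Gamma_{ir}(p) > 0$ for all $p \in \Delta^0$, then we call the dynamics defined by $F[\Gamma^{\alpha}]$, $\alpha > 0$, a gradient-type dynamics.

Moreover, if a set of forbidden partial assignments $\mathcal{R}$ is given, we define the fitness function $\Theta = (\Theta_{ir})$ as in [Coc93] by

$$\Theta_{ir}(p) := \prod_{R \in \mathcal{R}:\, (i,r) \in R} \Big( 1 - \prod_{(j,s) \in R \setminus (i,r)} p_{js} \Big) \qquad \forall i \in N, \; r \in K \qquad (1.22)$$

and call the dynamics defined by $F[\Theta^{\beta}]$, $\beta > 0$, a repellor dynamics.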

Note that by definition of the C-SAP (SAP), all weights $w_T$ of the objective function $z(p)$ in (1.13) are positive. Hence, a sufficient criterion for $\Gamma = (\Gamma_{ir})$ being a fitness function on $\Delta^0$ is that all variables occur in $z(p)$. We will see in Section 2.2.2 that this can always be achieved by an appropriate transformation of $z$. Moreover, we observe that $0 < \Theta_{ir}(p) \le 1$ for all $p \in \Delta^0$ and therefore $\Theta$ is also a fitness function on $\Delta^0$.

For the C-SAP we will use the operator $F[\Gamma^{\alpha} \Theta^{\beta}]$, combining the gradient-type dynamics and the repellor dynamics. In Chapter 3 we will discuss properties of this operator and present some results of [BCG98].

We conclude this introduction with two examples to illustrate the behavior of the

three different dynamics.

Example 1.9

Let the following C-SAP instance with $n = k = 2$ be given.

$$\max\; z(p) = 5 p_{11} p_{21} + 4 p_{12} p_{22} + 3 p_{12} p_{21} + 2 p_{11} p_{22}$$
$$\text{s.t.} \quad p_{11} p_{21} = 0, \quad p_{12} p_{22} = 0, \quad p \in \Delta^I.$$

Here, the set of forbidden assignments is $\mathcal{R} = \{\{(1,1), (2,1)\}, \{(1,2), (2,2)\}\}$. Moreover, we see immediately that $p_1^* := \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ is the point in $\Delta^I$ with the largest objective value $z(p_1^*) = 5$ and $p_2^* := \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ has objective value $z(p_2^*) = 4$. However, both points $p_1^*$ and $p_2^*$ are infeasible because they violate the constraints corresponding to the forbidden assignments in $\mathcal{R}$.

Figure 1.2 depicts the behavior of the three dynamics, with the $p_{11}$ and $p_{21}$ coordinates along the x- and y-axis, respectively. Each subfigure shows the iteration paths (trajectories) of the corresponding dynamics, where arrows mark their orientation.

In Figure 1.2(a) we see the trajectories for the gradient-type dynamics $F[\Gamma^{0.5}]$ used to maximize $z$. The satisfiability constraints are ignored and only the SAP is considered. We observe that the majority of the trajectories converges to $p_1^*$, the point with the largest objective value, and a smaller part converges to $p_2^*$.

Figure 1.2(b) shows the repellor dynamics $F[\Theta]$. It ignores the objective function and is only designed to satisfy the constraints in $\mathcal{R}$. Hence this case corresponds to the Sat problem ($\mathcal{T} = \emptyset$) given only by the constraint set $\mathcal{R}$.

Finally, Figure 1.2(c) depicts the trajectories of the combined dynamics of $F[\Gamma^{2} \Theta^{0.5}]$. None of them converges to a forbidden point and their majority converges to the feasible point in the upper left corner, which is the global maximum of the C-SAP.
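The qualitative behavior described for Example 1.9 can be reproduced with a few lines of code. The sketch below makes several assumptions that should be kept in mind: indices are 0-based, the repellor fitness is taken as the product over all constraints containing $(i, r)$ of one minus the product of the remaining variables of that constraint, and the exponents and the single starting point are illustrative choices rather than the settings behind Figure 1.2.

```python
# Example 1.9 with 0-based pairs (i, r): p[i][r] stores p_{i+1, r+1}.
T = [(5, [(0, 0), (1, 0)]), (4, [(0, 1), (1, 1)]),
     (3, [(0, 1), (1, 0)]), (2, [(0, 0), (1, 1)])]
R = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]

def product_without(p, pairs, skip):
    """Product of p_js over all (j, s) in pairs except the pair `skip`."""
    value = 1.0
    for j, s in pairs:
        if (j, s) != skip:
            value *= p[j][s]
    return value

def gamma(p, i, r):
    """Partial derivative of z with respect to p_ir, cf. (1.21)."""
    return sum(w * product_without(p, t, (i, r)) for w, t in T if (i, r) in t)

def theta(p, i, r):
    """Repellor fitness in the assumed form of (1.22)."""
    value = 1.0
    for c in R:
        if (i, r) in c:
            value *= 1.0 - product_without(p, c, (i, r))
    return value

def step(p, alpha, beta):
    """One application of F[Gamma^alpha Theta^beta] as in (1.20)."""
    new_p = []
    for i, row in enumerate(p):
        fit = [gamma(p, i, r) ** alpha * theta(p, i, r) ** beta for r in range(len(row))]
        average = sum(x * f for x, f in zip(row, fit))
        new_p.append([x * f / average for x, f in zip(row, fit)])
    return new_p

def run(p, alpha, beta, iterations=200):
    for _ in range(iterations):
        p = step(p, alpha, beta)
    return p

start = [[0.3, 0.7], [0.6, 0.4]]                      # p_11 = 0.3, p_21 = 0.6
gradient_limit = run(start, alpha=0.5, beta=0.0)      # objective only
repellor_limit = run(start, alpha=0.0, beta=1.0)      # constraints only
combined_limit = run(start, alpha=2.0, beta=0.5)      # both parts
```

From this starting point the gradient-type dynamics ends at the infeasible point $p_1^*$, whereas the repellor and the combined dynamics end at $p_{11} = 0$, $p_{21} = 1$, the upper left corner of the $(p_{11}, p_{21})$-plane.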

The following example shows that for the quadratic maximization problem (1.18) over one simplex, the replicator dynamics (1.17) coincides with the gradient-type dynamics of $F[\Gamma]$.

Example 1.10

Let $A$ be a positive, symmetric $k \times k$ matrix and the quadratic optimization problem (1.18) be given. Then the gradient $\Gamma$ of the objective function $z(p)$ is $Ap$ and therefore we get for $F[\Gamma](p)$:

$$F[\Gamma](p)_r = \frac{p_r \, \Gamma_r(p)}{\sum_{s \in K} p_s \, \Gamma_s(p)} = \frac{p_r \, (Ap)_r}{p^T A p}, \qquad r = 1, \dots, k.$$

In Chapter 2 we will study the gradient-type dynamics of $F[\Gamma]$. However, in contrast to the quadratic optimization problem over one simplex as in (1.18), we investigate $F[\Gamma]$ for objective functions $z(p)$ of the SAP and we are therefore working on $\Delta$, the cross product of simplices.

[Figure 1.2: Trajectories of three different dynamics in the $(p_{11}, p_{21})$-plane. (a) Gradient-type dynamics: $F[\Gamma^{0.5}]$. (b) Repellor dynamics: $F[\Theta]$. (c) Combined dynamics: $F[\Gamma^{2} \Theta^{0.5}]$.]

Chapter 2

The Semi-Assignment Problem

2.1 Introduction

The aim of this chapter is the development and study of a continuous relaxation

based heuristic for the SAP. We recall from Chapter 1 that a SAP instance is defined by a 4-tuple $(n, k, \mathcal{T}, w)$, where $w$ is a function $w : \mathcal{T} \to \mathbb{R}_+$ and every $T \in \mathcal{T}$ is a set of the form $T = \{(i_1, r_1), \dots, (i_{|T|}, r_{|T|})\}$ with $i_l < i_{l'}$ for $1 \le l < l' \le |T| \le n$ and $1 \le r_l \le k$. Throughout this chapter we will use the sets $N := \{1, \dots, n\}$, $K := \{1, \dots, k\}$ and denote the weights by $w_T := w(T)$. A SAP instance $(n, k, \mathcal{T}, w)$ can be expressed by the nonlinear integer program

$$\max\ z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) \tag{2.1}$$

$$\text{s.t.}\quad p \in \Delta^I, \tag{2.2}$$

where

$$\Delta^I = \Big\{ p \in \{0,1\}^{n \times k} : \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \Big\}.$$

We will call 'relaxed semi-assignment problem' (R-SAP) the problem obtained by relaxing the integrality constraints in (2.2) to

$$p \in \Delta = \Big\{ p \in [0,1]^{n \times k} : \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \Big\}. \tag{2.3}$$

The objective function $z$ in (2.1) will be investigated in detail in Section 2.2, whose first subsections are dedicated to structural properties, transformations and equivalent formulations of $z$ on $\Delta$. The last two subsections discuss the relationship between the SAP and the R-SAP and therefore concentrate especially on the locations of local optima of both problems.


In order to 'solve' the SAP we will deal with its relaxation R-SAP, where our approach is based on the concept of growth transformations:

Definition 2.1 (Growth Transformation)
Let $z$ be a continuous function on $\Delta^\circ \subseteq \mathbb{R}^{nk}$. We say that a continuous mapping $F : \Delta^\circ \to \Delta^\circ$ is a growth transformation for $z$ iff

$$z(p) \le z(F(p)) \quad \forall p \in \Delta^\circ.$$

If additionally

$$z(p) = z(F(p)) \ \Rightarrow\ p = F(p), \tag{2.4}$$

$F$ is called a strict growth transformation.

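In this chapter we study transformations $F[\xi] : \Delta^\circ \to \Delta^\circ$, where $\xi = (\xi_{ir})$ is a vector of continuous functions $\xi_{ir} : \Delta^\circ \to \mathbb{R}_+$, and $F[\xi]$ is for all $i \in N$, $r \in K$ of the following form:

$$F[\xi](p)_{ir} = \frac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)}. \tag{2.5}$$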
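The update (2.5) acts row by row: each row of $p$ is reweighted by its fitness values and renormalized. A minimal sketch in plain Python, assuming $p$ is stored as a list of $n$ rows of length $k$ and the fitness is passed as a callable (the names `apply_F`, `constant_fitness` and `skewed_fitness` are hypothetical):

```python
def apply_F(p, xi):
    """Apply (2.5): p'_ir = p_ir * xi_ir(p) / sum_s p_is * xi_is(p).

    p  : list of n rows, each a list of k nonnegative numbers summing to 1
    xi : callable mapping p to an n-by-k list of positive fitness values
    """
    fitness = xi(p)
    new_p = []
    for row, fit_row in zip(p, fitness):
        norm = sum(x * f for x, f in zip(row, fit_row))
        new_p.append([x * f / norm for x, f in zip(row, fit_row)])
    return new_p


def constant_fitness(p):
    """Hypothetical fitness that is constant, so every point is a fixed point."""
    return [[1.0 for _ in row] for row in p]


def skewed_fitness(p):
    """Hypothetical fitness favoring the first column of every row."""
    return [[2.0] + [1.0] * (len(row) - 1) for row in p]


p = [[0.2, 0.8], [0.5, 0.5]]
unchanged = apply_F(p, constant_fitness)
shifted = apply_F(p, skewed_fitness)
```

Because each row is divided by its own weighted sum, the image again has rows summing to one, so the map stays on the product of simplices.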

Recall from Section 1.4 that $\xi$ is a fitness function, which needs to be defined according to the optimization problem to be solved. In case of the SAP, let $z(p)$ be the objective function (2.1) and let us assume that all variables $p_{ir}$ occur in $z(p)$. Moreover, we recall from (1.21) the definition of $\Gamma$ as the partial derivatives of $z(p)$, i.e. $\Gamma_{ir}(z,p) = \frac{\partial z}{\partial p_{ir}}(p)$ for all $i \in N$, $r \in K$. We see that under these assumptions all $\Gamma_{ir}$ are positive on $\Delta^\circ$ and therefore $\Gamma = (\Gamma_{ir})$ can be used as fitness function in (2.5) for the R-SAP $(n, k, \mathcal{T}, w)$.

If there is no danger of confusion, we will simply write $\Gamma_{ir}(p)$ instead of $\Gamma_{ir}(z,p)$.

The mapping $F[\Gamma]$ was already discussed in a paper by Baum and Sell [BS68], where the authors investigated $F[\Gamma]$ for arbitrary polynomials with positive coefficients and derived the following important result.

Theorem 2.2 (Baum, Sell)
Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a polynomial in the variables $p_{ir}$, $i \in N$, $r \in K$, with positive coefficients. Then $F[\Gamma]$ is a strict growth transformation on $\Delta^\circ$.

We will adopt this mapping and use it in the context of combinatorial optimization problems which can be formulated as R-SAP instances. The objective function $z(p)$ as in (2.1) has some special properties to be discussed in Section 2.2. Moreover, in Section 2.3 we will introduce other fitness functions $\xi$. Based on the idea of Baum and Sell we will prove a generalization of Theorem 2.2 by giving some sufficient conditions under which $F[\xi]$ is a growth transformation. Furthermore we will investigate the behavior of $F[\Gamma]$ and show that local maxima and saddle points are always fixed points of $F$. After further discussion of properties of the iteration sequence and convergence of $F[\Gamma]$ we will finally characterize attractors of $F[\Gamma]$.

Let us conclude this introduction with the following small example which demonstrates the typical behavior of the operator $F[\Gamma]$:

Example 2.3
Let the following SAP with $N := \{1, 2\}$ and $K := \{1, 2\}$ be given:

$$\begin{aligned}
\max\quad & z(p) = 2p_{11}p_{21} + p_{12}p_{22} + p_{11} + p_{12} + p_{21} + p_{22} - 2 \\
\text{s.t.}\quad & p_{i1} + p_{i2} = 1, \quad i = 1, 2 \\
& p_{ir} \in \{0, 1\}, \quad i = 1, 2,\ r = 1, 2.
\end{aligned}$$

This problem has its global maximum in $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ with objective value $z(p^*) = 2$ and a local maximum in $q^* = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ with $z(q^*) = 1$. Moreover, we observe that its relaxation R-SAP does not have any local optima in the interior $\Delta^\circ$, a phenomenon which holds for all polynomials of the form (2.1), as we will see in the next section.

Figure 2.1: Plots of the graph of $z(p) = 3p_{11}p_{21} - p_{11} - p_{21} + 1$. (a) Graph of $z(p)$ in the $(p_{11}, p_{21})$-plane. (b) Contour plot of $z(p)$ and gradient vector field.

Figures 2.1(a) and 2.1(b) show the restriction of $z(p)$ to the $(p_{11}, p_{21})$-plane. Substituting $p_{i2} = 1 - p_{i1}$ for $i = 1, 2$ in $z(p)$ we get

$$\tilde z(p) = 3p_{11}p_{21} - p_{11} - p_{21} + 1.$$

We see that by such substitutions we can construct new polynomials $\tilde z$ which have a different 'form' (representation) from $z$, but whose values agree with those of $z$ on $\Delta$. This observation will lead to the definition of an equivalence relation for these polynomials.


The graph of $z(p)$ is depicted in Figure 2.1(a) and the corresponding contour plot with the gradient vector field in Figure 2.1(b). From both pictures we clearly see that there exists a saddle point at $\bar p := \frac{1}{3}\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$.

In this example we get for the partial derivatives

$$\Gamma(z,p) = \begin{pmatrix} 2p_{21} + 1 & p_{22} + 1 \\ 2p_{11} + 1 & p_{12} + 1 \end{pmatrix} \tag{2.6}$$

and since all monomial weights of $z$ are positive, it follows that $\Gamma_{ir}(p) > 0$ for $i = 1, 2$, $r = 1, 2$ and all $p \in \Delta^\circ$. Note that this would also have been true if $z(p)$ consisted only of the first two monomials, i.e. $\hat z(p) = 2p_{11}p_{21} + p_{12}p_{22}$, which is another polynomial that agrees with $z(p)$ on $\Delta$. However, by adding the other monomials in $z(p)$ we guarantee that the partial derivatives $\Gamma_{ir}(z,p)$ are strictly positive for all $p \in \Delta$ and therefore $F[\Gamma]$ can also be continuously extended to the boundary of $\Delta$.

Figure 2.2: Trajectories and regions of attraction. (a) Trajectories of $F[\Gamma]$. (b) Regions of attraction for the strict local maxima $p^*$ and $q^*$ (numerically determined).

Figure 2.2(a) shows the trajectories of $F[\Gamma]$ for this example. We observe that the steps become very small in a neighborhood of a saddle point or local maximum. Furthermore, both local maxima $p^*$ and $q^*$ have a region of attraction shown in Figure 2.2(b): the black region is the region of attraction of the global maximum $p^*$, whereas the light gray region belongs to $q^*$. Note, however, that in the general case we cannot determine the regions of attraction explicitly.
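The behavior described in Example 2.3 can be reproduced with a few lines of code. The following sketch iterates $F[\Gamma]$ with the partial derivatives (2.6); the two starting points and the iteration count of 200 are arbitrary choices made here for illustration.

```python
def z(p):
    """Objective of Example 2.3."""
    (p11, p12), (p21, p22) = p
    return 2 * p11 * p21 + p12 * p22 + p11 + p12 + p21 + p22 - 2


def gamma(p):
    """Partial derivatives (2.6) of z."""
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + 1, p22 + 1],
            [2 * p11 + 1, p12 + 1]]


def step(p):
    """One application of F[Gamma], row by row as in (2.5)."""
    out = []
    for row, g_row in zip(p, gamma(p)):
        norm = sum(x * g for x, g in zip(row, g_row))
        out.append([x * g / norm for x, g in zip(row, g_row)])
    return out


def iterate(p, steps=200):
    """Return the final point and the objective values along the path."""
    values = [z(p)]
    for _ in range(steps):
        p = step(p)
        values.append(z(p))
    return p, values


# Two illustrative starting points on either side of the saddle point.
p_end, vals_p = iterate([[0.6, 0.4], [0.6, 0.4]])
q_end, vals_q = iterate([[0.2, 0.8], [0.2, 0.8]])
saddle = [[1 / 3, 2 / 3], [1 / 3, 2 / 3]]
saddle_image = step(saddle)
```

Starting above the saddle point the sequence approaches $p^*$, starting below it approaches $q^*$, and the saddle point itself is left unchanged; along both paths the objective value does not decrease.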

This example has already touched many interesting questions in connection with

important properties of the operator and the objective function. In Section 2.4

we will focus on the regions of attraction of strict local maxima and we will be

able to explicitly describe a polytopal subset thereof. This knowledge can then

advantageously be used for an additional stopping criterion in FPH.


2.2 On the Objective Function of the R-SAP

In this section we first investigate some properties of the objective function and its partial derivatives $\Gamma_{ir}$, which largely influence the behavior of $F[\Gamma]$. Then we study the relationship between the SAP and the R-SAP, where we concentrate mainly on the locations of local optima and saddle points.

2.2.1 Basic Properties

The objective function $z$ in (2.1) is a polynomial function of the following specific form.

Definition 2.4 (A-Polynomial, $P_N$)
• Let a 4-tuple $(n, k, \mathcal{T}, w)$ as for a SAP instance, together with a constant $c \in \mathbb{R}$, be given. Then we call $z : \mathbb{R}^{nk} \to \mathbb{R}$,

$$z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) + c \tag{2.7}$$

an assignment-polynomial (short: A-polynomial), where $p_{ir}$ are variables with $i \in N$, $r \in K$. The degree $m$ of the A-polynomial $z(p)$ is defined by $m := \max_{T \in \mathcal{T}} |T|$.

• Let $U \subseteq N$, then we denote by $P_U$ the set of all A-polynomials (2.7) which contain variables $p_{ir}$ with $i \in U$.

• Let $z \in P_N$. If $k = 2$ then we call $z$ a Boolean A-polynomial; if $m = 2$ then $z$ is referred to as a quadratic A-polynomial.

If not mentioned otherwise, we will assume that any A-polynomial $z$ lies in $P_N$, i.e. for all $i \in N$ there exists a monomial in $z$ with non-zero weight which contains a variable $p_{ir}$ for any $r \in K$.

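We observe that A-polynomials are affine linear functions in the variables $p_{i\cdot}$ for all $i \in N$. Hence, for every fixed $i \in N$ we can write

$$z(p) = \sum_{r=1}^{k} p_{ir}\,\Gamma_{ir}(p) + R_i(p) \tag{2.8}$$

where

$$\Gamma_{ir}(p) = \sum_{T \in \mathcal{T}:\ (i,r) \in T} w_T \prod_{(j,s) \in T \setminus \{(i,r)\}} p_{js} = \frac{\partial z}{\partial p_{ir}}(p) \quad \forall i \in N,\ r \in K \tag{2.9}$$

and

$$R_i(p) := \sum_{T \in \mathcal{T}:\ (i,r) \notin T\ \forall r \in K}\ w_T \prod_{(j,s) \in T} p_{js} + c \quad \forall i \in N. \tag{2.10}$$

Note that $\Gamma_{ir}(p)$ and $R_i(p)$ are for all $i \in N$, $r \in K$ A-polynomials which do not depend on the variables $p_{i\cdot}$ and therefore $\Gamma_{ir}(p), R_i(p) \in P_{N \setminus \{i\}}$.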
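The decomposition (2.8) to (2.10) translates directly into code once an A-polynomial is stored as a map from index sets $T$ to weights $w_T$. The sketch below assumes such a dictionary representation with 0-based indices (a choice made here, not taken from the thesis) and checks (2.8) on the polynomial of Example 2.3 at an arbitrary test point.

```python
from math import prod


def evaluate(monomials, c, p):
    """z(p) = sum_T w_T * prod_{(i,r) in T} p[i][r] + c, cf. (2.7)."""
    return sum(w * prod(p[i][r] for (i, r) in T) for T, w in monomials.items()) + c


def gamma(monomials, p, i, r):
    """Partial derivative (2.9): monomials containing (i, r), with that factor removed."""
    return sum(w * prod(p[j][s] for (j, s) in T if (j, s) != (i, r))
               for T, w in monomials.items() if (i, r) in T)


def remainder(monomials, c, p, i):
    """R_i(p) from (2.10): monomials without any variable of row i, plus c."""
    return sum(w * prod(p[j][s] for (j, s) in T)
               for T, w in monomials.items()
               if all(j != i for (j, _) in T)) + c


# Polynomial of Example 2.3 with 0-based indices (i, r).
monomials = {
    frozenset({(0, 0), (1, 0)}): 2.0,
    frozenset({(0, 1), (1, 1)}): 1.0,
    frozenset({(0, 0)}): 1.0,
    frozenset({(0, 1)}): 1.0,
    frozenset({(1, 0)}): 1.0,
    frozenset({(1, 1)}): 1.0,
}
c = -2.0
p = [[0.3, 0.7], [0.6, 0.4]]  # arbitrary test point

direct = evaluate(monomials, c, p)
decomposed = [sum(p[i][r] * gamma(monomials, p, i, r) for r in range(2))
              + remainder(monomials, c, p, i) for i in range(2)]
```

Both rows give the same value as the direct evaluation, which reflects that $z$ is affine linear in each row $p_{i\cdot}$.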

2.2.2 Equivalent A-Polynomials

We have seen in Example 2.3 that there exist different A-polynomials which all agree on $\Delta$. This is a consequence of the fact that we are working with polynomials in $\mathbb{R}^{nk}$; the dimension of $\Delta$, however, is only $n(k - 1)$ and therefore there exist some degrees of freedom. We define the following equivalence relation between A-polynomials:

Definition 2.5 (Equivalent A-polynomials)
Let $z_1, z_2 : \mathbb{R}^{nk} \to \mathbb{R}$ be two A-polynomials. Then

$$z_1 \equiv z_2 \ :\Leftrightarrow\ z_1(p) = z_2(p) \quad \forall p \in \Delta$$

and we call $z_1$ a form of $z_2$.

Subsequently we will characterize all A-polynomials $z(p)$ which are zero on $\Delta$. For this purpose we use the 'Division Algorithm for Polynomials' (see e.g. [CLO97]).

Let $\mathbb{R}[p_{11}, \dots, p_{nk}]$ be the set of all polynomials in $p_{11}, \dots, p_{nk}$ with coefficients in $\mathbb{R}$.

Theorem 2.6 (Division Algorithm)
Let a monomial order be fixed and let $F = (g_1, \dots, g_s)$ be an ordered $s$-tuple of polynomials in $\mathbb{R}[p_{11}, \dots, p_{nk}]$. Then every $z \in \mathbb{R}[p_{11}, \dots, p_{nk}]$ can be written as

$$z = f_1 g_1 + \dots + f_s g_s + h,$$

where $f_i, h \in \mathbb{R}[p_{11}, \dots, p_{nk}]$ and either $h = 0$ or $h$ is a linear combination, with coefficients in $\mathbb{R}$, of monomials, none of which is divisible by any of the leading terms of $g_1, \dots, g_s$.

Proposition 2.7
Let $z$ be an A-polynomial. If $z(p) = 0$ for all $p \in \Delta$ then

$$z(p) = \sum_{i=1}^{n} \Big( \sum_{r=1}^{k} p_{ir} - 1 \Big) f_i(p), \tag{2.11}$$

where for all $j \in N$ the $f_j(p)$ are affine linear polynomials in $p_{i\cdot}$ which do not depend on $p_{j\cdot}$.

Proof
Let us define $g_i(p) := \sum_{r=1}^{k} p_{ir} - 1$ for all $i \in N$ and let us fix a monomial order given by the lexicographic order, where the variables are ordered as follows:

$$p_{11} < \dots < p_{1k} < \dots < p_{n1} < \dots < p_{nk}.$$

For this order the leading term of each polynomial $g_i(p)$ is just given by the variable $p_{ik}$. Applying the 'Division Algorithm for Polynomials' it follows that any polynomial $z(p)$ can be written as

$$z(p) = \sum_{i=1}^{n} f_i(p)\,g_i(p) + h(p)$$

where $f_i(p)$, $i \in N$, and $h(p)$ are polynomials and either $h(p) = 0$ or $h(p)$ is the sum of monomials none of which has $p_{ik}$, $i \in N$, as a factor.

If $z(p) = 0$ on $\Delta$, then it follows by construction that $h(p) = 0$ on $\Delta$, and we claim that therefore $h(p) \equiv 0$. Indeed, since $h(p)$ does not depend on $p_{ik}$ for any $i \in N$, we define $M := \{p \in \mathbb{R}^{n(k-1)} \mid p_{ir} \ge 0\ \forall i \in N, r \in K \setminus \{k\},\ 1 - \sum_{s=1}^{k-1} p_{is} \ge 0\ \forall i \in N\}$ with $M^\circ \ne \emptyset$ and observe that $h(p) = 0$ on $M$. Moreover, note that the Taylor series of $h$ agrees with $h$. Since the partial derivatives $\Gamma_{ir}(h,p) = 0$ for all $p \in M^\circ$, it follows that $h(p) \equiv 0$ on $\mathbb{R}^{n(k-1)}$.

This proposition allows us to construct all objective functions which agree on A.

However, regarding FPH, such a change of the objective function has far-reaching

consequences on the partial derivatives and therefore on the behavior of the operator

F[T] as well. As we will see in Section 2.4, we can thus influence the step length of the

iteration sequence, hence also the regions of attraction of strict local maxima - and

all this only by constructing equivalent A-polynomials. For this reason it becomes

especially important to know how the partial derivatives of equivalent A-polynomials can differ, which can be derived directly from Proposition 2.7:

Corollary 2.8
Let an A-polynomial $z_1 \in P_N$ be given.

1. If $f_i \in P_{N \setminus \{i\}}$ for all $i \in N$ are given, then there exists $z_2 \in P_N$ with $z_1 \equiv z_2$ such that $\Gamma_{ir}(z_1,p) = f_i + \Gamma_{ir}(z_2,p)$ for all $i \in N$, $r \in K$.

2. If $z_2 \in P_N$ with $z_1 \equiv z_2$ is given, then there exist functions $f_i \in P_{N \setminus \{i\}}$ for all $i \in N$ such that $\Gamma_{ir}(z_1,p) = f_i + \Gamma_{ir}(z_2,p)$ holds for all $i \in N$, $r \in K$.

A direct consequence of this corollary is the following remark regarding the partial derivatives of two equivalent A-polynomials (which will often be used in Section 2.4).

Remark 2.9
Let $z_1, z_2$ be two equivalent A-polynomials. Then for all $i \in N$, $r, s \in K$ and any $p \in \Delta$ the following equivalence holds:

$$\Gamma_{is}(z_1,p) > \Gamma_{ir}(z_1,p) \ \Leftrightarrow\ \Gamma_{is}(z_2,p) > \Gamma_{ir}(z_2,p). \tag{2.12}$$


The results of this subsection can also be used to construct equivalent A-polynomials with positive partial derivatives.

Transformation 2.10 (Positive Derivatives)
Let $z_1$ be a polynomial as in (2.7) with possibly negative weights $w_T$ and some non-positive partial derivatives. We define $M := 1 - \sum_{T:\, w_T < 0} w_T$ and an A-polynomial $z_2$ by

$$z_2(p) = z_1(p) + M \sum_{i=1}^{n} \sum_{r=1}^{k} p_{ir} - nM. \tag{2.13}$$

By construction $z_2$ has for all $p \in \Delta$ positive $\Gamma$-values, because $\Gamma_{ir}(z_2,p) = \Gamma_{ir}(z_1,p) + M$ for all $i \in N$, $r \in K$. Moreover, $z_2$ agrees with $z_1$ on $\Delta$.
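Transformation 2.10 can be illustrated with the substituted form $3p_{11}p_{21} - p_{11} - p_{21} + 1$ from Example 2.3, which has negative weights. The sketch below is one possible implementation under the assumption that an A-polynomial is stored as a dictionary from index sets to weights with 0-based indices; the helper names are hypothetical.

```python
from math import prod


def evaluate(monomials, c, p):
    """z(p) = sum_T w_T * prod_{(i,r) in T} p[i][r] + c."""
    return sum(w * prod(p[i][r] for (i, r) in T) for T, w in monomials.items()) + c


def gamma(monomials, p, i, r):
    """Partial derivative of z with respect to p[i][r]."""
    return sum(w * prod(p[j][s] for (j, s) in T if (j, s) != (i, r))
               for T, w in monomials.items() if (i, r) in T)


def make_positive(monomials, c, n, k):
    """Return (M, monomials2, c2) for z2 = z1 + M * sum_{i,r} p_ir - n*M, cf. (2.13)."""
    M = 1 - sum(w for w in monomials.values() if w < 0)
    out = dict(monomials)
    for i in range(n):
        for r in range(k):
            key = frozenset({(i, r)})
            out[key] = out.get(key, 0.0) + M
    return M, out, c - n * M


# z1 = 3 p11 p21 - p11 - p21 + 1 with 0-based indices.
z1 = {frozenset({(0, 0), (1, 0)}): 3.0,
      frozenset({(0, 0)}): -1.0,
      frozenset({(1, 0)}): -1.0}
c1 = 1.0
M, z2, c2 = make_positive(z1, c1, n=2, k=2)

p = [[0.2, 0.8], [0.1, 0.9]]  # arbitrary point of the product of simplices
```

Here $M = 3$; the two polynomials agree at the test point, while every partial derivative of $z_2$ exceeds the corresponding one of $z_1$ by $M$ and is positive.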

2.2.3 Relationship between Discrete and Continuous Local

Optima

In this section we study the relationship between the SAP (2.1), (2.2) and its

R-SAP relaxation and investigate properties of local maxima of A-polynomials in

the discrete and continuous case.

The next proposition describes a construction procedure which converts any solution of the R-SAP into an at least equally good solution of the SAP:

Proposition 2.11
Let $z$ be an A-polynomial. Then for any point $p \in \Delta$ we can construct in polynomial time a point $p^* \in \Delta^I$ with $z(p^*) \ge z(p)$. Hence, in particular

$$\max_{p \in \Delta} z(p) = \max_{p \in \Delta^I} z(p).$$

Proof
Let $p \in \Delta \setminus \Delta^I$ and let $i \in N$ be fixed. W.l.o.g. we assume that $\Gamma_{i1}(p) \ge \Gamma_{ir}(p)$ for all $r \in K$. We know from (2.8) that $z(p)$ can be rewritten for any fixed $i \in N$ as $z(p) = \sum_{r=1}^{k} p_{ir}\,\Gamma_{ir}(p) + R_i(p)$. Thus we get

$$z(p) = \sum_{r=1}^{k} p_{ir}\,\Gamma_{ir}(p) + R_i(p) \le \Gamma_{i1}(p) \sum_{r=1}^{k} p_{ir} + R_i(p) = \Gamma_{i1}(p) + R_i(p). \tag{2.14}$$

Note that in (2.14) neither $\Gamma_{i1}(p)$ nor $R_i(p)$ contains any of the variables $p_{ir}$, $r \in K$. We define a new vector $\tilde p \in \Delta$ by

$$\tilde p_{jr} := \begin{cases} 1, & \text{if } j = i,\ r = 1 \\ 0, & \text{if } j = i,\ r > 1 \\ p_{jr}, & \text{otherwise.} \end{cases} \tag{2.15}$$

(2.14) implies immediately that $z(\tilde p) = \Gamma_{i1}(\tilde p) + R_i(\tilde p) = \Gamma_{i1}(p) + R_i(p) \ge z(p)$. By replacing $p$ by $\tilde p$ and repeating this procedure for each non-integer vector $p_{j\cdot}$, $j \in N$, we construct an integer solution $p^* \in \Delta^I$ with $z(p^*) \ge z(p)$. ∎
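The proof of Proposition 2.11 is constructive and can be sketched as a short rounding routine. The version below assumes the dictionary representation of an A-polynomial with 0-based indices (an assumption made for this sketch) and processes the rows in their natural order; the function names are hypothetical.

```python
from math import prod


def evaluate(monomials, c, p):
    """z(p) = sum_T w_T * prod_{(i,r) in T} p[i][r] + c."""
    return sum(w * prod(p[i][r] for (i, r) in T) for T, w in monomials.items()) + c


def gamma(monomials, p, i, r):
    """Partial derivative of z with respect to p[i][r]."""
    return sum(w * prod(p[j][s] for (j, s) in T if (j, s) != (i, r))
               for T, w in monomials.items() if (i, r) in T)


def round_to_vertex(monomials, p):
    """Row by row, move all mass of a fractional row to an index with maximal Gamma."""
    p = [row[:] for row in p]
    for i in range(len(p)):
        k = len(p[i])
        if all(x in (0.0, 1.0) for x in p[i]):
            continue  # row is already integral
        g = [gamma(monomials, p, i, r) for r in range(k)]
        best = max(range(k), key=lambda r: g[r])
        p[i] = [1.0 if r == best else 0.0 for r in range(k)]
    return p


# Polynomial of Example 2.3 with 0-based indices.
monomials = {
    frozenset({(0, 0), (1, 0)}): 2.0,
    frozenset({(0, 1), (1, 1)}): 1.0,
    frozenset({(0, 0)}): 1.0,
    frozenset({(0, 1)}): 1.0,
    frozenset({(1, 0)}): 1.0,
    frozenset({(1, 1)}): 1.0,
}
c = -2.0
start = [[0.5, 0.5], [0.9, 0.1]]  # arbitrary fractional point
vertex = round_to_vertex(monomials, start)
```

Each row update evaluates the partial derivatives at the current (partly rounded) point, exactly as in the repeated application of (2.15), so the objective value never decreases.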

In the sequel we look at the relationship between discrete and continuous local

maxima of the SAP and the R-SAP, respectively. For this reason recall the definition

of a continuous local maximum:

Definition 2.12 (Continuous Local Maximum)
A point $p^* \in \Delta$ is a (strict) local maximum of the R-SAP, if there exists a neighborhood $U(p^*)$ : $\forall q \in U(p^*) \cap \Delta$ : $z(q) \le z(p^*)$ ($z(q) < z(p^*)$).

Moreover, in the discrete case we use the following definition of a neighborhood:

Definition 2.13 (Discrete Neighborhood)
Let an integer vertex $p \in \Delta^I$ be given. Then the discrete neighborhood of $p$ is defined by

$$\mathcal{N}(p) := \{ q \in \Delta^I \mid \exists i \in N : q_{i\cdot} \ne p_{i\cdot},\ q_{j\cdot} = p_{j\cdot}\ \forall j \in N \setminus \{i\} \}. \tag{2.16}$$

Note that if $z$ is an A-polynomial and $p \in \Delta^I$, then from the linearity of $z(p)$ in $p_{i\cdot}$ for all $i \in N$ it follows that for all $q \in \mathcal{N}(p)$, $\lambda \in [0,1]$

$$z(\lambda p + (1 - \lambda) q) = \lambda z(p) + (1 - \lambda) z(q). \tag{2.17}$$

Using this neighborhood $\mathcal{N}$ we define the discrete local maximum by

Definition 2.14 (Discrete Local Maximum)
A point $p^* \in \Delta^I$ is a discrete (strict) local maximum of the SAP, if for all $q \in \mathcal{N}(p^*)$ : $z(q) \le z(p^*)$ ($z(q) < z(p^*)$).

We have the following relationship between discrete and continuous strict local maxima.

Proposition 2.15
Let $z$ be an A-polynomial and $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for all $i \in N$ be given. Then the following statements are equivalent:

(1) $p^*$ is a discrete strict local maximum of the SAP
(2) $p^*$ is a strict local maximum of the R-SAP
(3) $\Gamma_{i1}(p^*) > \Gamma_{ir}(p^*)$ for all $i \in N$, $r > 1$.

Proof
$(1) \Rightarrow (3)$: Let $q^* \in \mathcal{N}(p^*)$ with $q^*_{ir} = 1$ for some $r > 1$. Since $\Gamma_{ir}(p)$ and $R_i(p)$ do not depend on $p_{i\cdot}$, we have $R_i(q^*) = R_i(p^*)$ and $\Gamma_{ir}(q^*) = \Gamma_{ir}(p^*)$. Hence, for the discrete strict local maximum $p^*$

$$z(p^*) = \Gamma_{i1}(p^*) + R_i(p^*) > \Gamma_{ir}(p^*) + R_i(p^*) = z(q^*)$$

for all $q^* \in \mathcal{N}(p^*)$ and therefore $\Gamma_{i1}(p^*) > \Gamma_{ir}(p^*)$ for all $i \in N$, $r > 1$.

$(3) \Rightarrow (2)$: Let $\Gamma_{i1}(p^*) > \Gamma_{ir}(p^*)$ for all $i \in N$, $r > 1$. By continuity it follows that there exists an $\varepsilon$-neighborhood $U_\varepsilon(p^*)$ such that

$$\forall q \in U_\varepsilon(p^*) \cap \Delta : \Gamma_{i1}(q) > \Gamma_{ir}(q) \quad \forall i \in N,\ r > 1. \tag{2.18}$$

Let $q \in U_\varepsilon(p^*) \cap \Delta$ and apply the iteration procedure used in the proof of Proposition 2.11 to get an integer point $q^*$ from $q$. We see that then $q^* = p^*$ and $z(p^*) > z(q)$, because any intermediate point in the iteration procedure remains in $U_\varepsilon(p^*) \cap \Delta$ and (2.18) still holds.

$(2) \Rightarrow (1)$: Since by (2.17) $z(p)$ is linear along the line from $p^*$ to any discrete neighbor $q^* \in \mathcal{N}(p^*)$ and $p^* \in \Delta^I$ is a strict local maximum, $z(p)$ is strictly decreasing along these lines and therefore $p^*$ is also a discrete strict local maximum. ∎

Regarding the locations of continuous local optima we will prove an even stronger result in the following theorem, which implies that every strict local maximum is an integer vertex. More precisely, this theorem will show that there do not exist local optima in the interior of $\Delta$, unless $z$ is constant on $\Delta$. This property is a basic result in harmonic function theory (see e.g. [ABR92], Theorems 1.4, 1.5). We prove this result for A-polynomials, thus taking advantage of the fact that they can be written as in (2.8).

Theorem 2.16
Let $z \in P_N$ be an A-polynomial and for all $i \in N$ a convex set $L_i \subseteq \Delta^k$ with non-empty relative interior $L_i^\circ$ be given and let $S_n := L_1 \times \dots \times L_n \subseteq \Delta$. If $z(p)$ is not constant, then $z$ has no local optima in the relative interior $S_n^\circ$.

Proof
We prove by induction over the number $n$ of convex sets $L_i$, $i = 1, \dots, n$, that if $z(p)$ has a local optimum in $S_n^\circ$ then it must be constant. For easier writing we define

$$S_i^\circ := L_1^\circ \times \dots \times L_i^\circ, \quad i = 1, \dots, n$$

to denote the relative interior of the cross product of the first $i$ given convex sets. Since we build up the matrix $p$ successively by adding rows, we use here the notation $p = (u, v)$, where $v$ always corresponds to the last row of $p$.

Let $n = 1$, then $z \in P_{\{1\}}$ is just the linear function $z(v) = \sum_{r=1}^{k} w_r v_r + w_0$, which has local optima in $S_1^\circ$ only if $z(v)$ is constant on $S_1^\circ$.

Now we assume that our induction hypothesis holds for all A-polynomials $z \in P_{N \setminus \{n\}}$. For $u \in S_{n-1}^\circ$, $v \in L_n^\circ$ we have $(u, v) \in S_n^\circ$ and any A-polynomial $z \in P_N$ can be written as

$$z((u,v)) = \sum_{r=1}^{k} v_r\, z_r(u) + z_0(u), \tag{2.19}$$

where $z_r \in P_{N \setminus \{n\}}$ for $r = 0, \dots, k$ are A-polynomials, $u \in S_{n-1}^\circ$ and $v \in L_n^\circ$. Now let us assume that $(u^*, v^*) \in S_n^\circ$ is a local optimum in $S_n^\circ$. If we let $v^* \in L_n^\circ$ be fixed, we get

$$z((u,v^*)) = \sum_{r=1}^{k} v_r^*\, z_r(u) + z_0(u) \tag{2.20}$$

which is an A-polynomial in $P_{N \setminus \{n\}}$, and since $u^* \in S_{n-1}^\circ$ is a local optimum of (2.20) we have by the induction hypothesis $z((u, v^*)) = c$ on $S_{n-1}$. This allows us to express $z_0(u)$ from (2.20) by $z_0(u) = c - \sum_{r=1}^{k} v_r^*\, z_r(u)$ and after re-substitution in (2.19) we get

$$z((u,v)) = \sum_{r=1}^{k} (v_r - v_r^*)\, z_r(u) + c. \tag{2.21}$$

If there exists a neighborhood $U_\varepsilon((u^*, v^*))$ such that for all points $(u, v) \in U_\varepsilon((u^*, v^*)) \cap S_n^\circ$ : $z((u, v)) = c$, then $z((u, v)) = c$ on $S_n$, by the same argument as already used at the end of the proof of Proposition 2.7. In the sequel we use the negation of this result and assume $z((u, v)) \not\equiv c$ on $S_n$:

$$z((u,v)) \not\equiv c \ \Rightarrow\ \forall \varepsilon > 0\ \exists (\bar u, \bar v) \in U_\varepsilon((u^*, v^*)) \cap S_n^\circ : z((\bar u, \bar v)) \ne c.$$

W.l.o.g. we define $\tilde v := v^* - (\bar v - v^*)$ and choose $\varepsilon > 0$ small enough such that $\bar v, \tilde v \in L_n^\circ \cap U_\varepsilon(v^*)$. If we assume that $z((\bar u, \bar v)) < c$ then it follows from (2.21) that $z((\bar u, \tilde v)) > c$. Hence $z((u, v))$ in (2.21) attains in any neighborhood $U_\varepsilon((u^*, v^*))$ values larger and smaller than $c$ and therefore $(u^*, v^*)$ cannot be a local optimum.
Assumption
From now on, we will assume that $z(p)$ is not constant on $\Delta$.

We denote for a point $p^* \in \Delta$ the face of $\Delta$ which it lies in by

$$F_{p^*} := \{ p \in \Delta \mid p^*_{ir} = 0 \Rightarrow p_{ir} = 0 \}.$$

Since $F_{p^*}$ is again a cross product of simplices, Theorem 2.16 implies the following assertions about local optima.


Corollary 2.17
Let $z$ be an A-polynomial. Then:

(1) There are no local optima in $\Delta^\circ$.
(2) If $p^*$ is a non-strict local optimum, then $z$ is constant on the whole face $F_{p^*}$.
(3) If $p^*$ is a strict local optimum then $p^* \in \Delta^I$.

Proof
(1) follows directly from Theorem 2.16 by setting $L_i := \Delta^k$ for all $i \in N$. Moreover, Theorem 2.16 implies that a local optimum $p^*$ cannot lie in the interior of $F_{p^*}$ unless $z$ is constant on the whole face. If $p^*$ is a strict local optimum, $p^*$ cannot even lie in the interior of a constant face $F_{p^*}$ and therefore $p^* \in \Delta^I$.

Note that in this corollary, (2) does not necessarily imply that all points in $F_{p^*}$ are local optima (see Example 2.30 on page 44).

Finally we show with an example the importance of the strictness of local maxima

in Proposition 2.15. Weakening this assumption such that p* is only a non-strict

local maximum, we will see that equivalence between discrete and continuous local

maxima does not hold anymore.

Example 2.18
Let the following A-polynomial be given:

$$z(p) = p_{11}p_{21} + p_{11} + p_{21} + 2(p_{12} + p_{22}).$$

Depicting the objective values in the projection to the $p_{11}$ and $p_{21}$ coordinates ($p_{12} = 1 - p_{11}$ and $p_{22} = 1 - p_{21}$ are given implicitly) we see in Figure 2.3 that there exists a discrete strict global maximum $q^* = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ with objective value $z(q^*) = 4$ and a discrete local maximum $p^*$ in the upper right corner of the drawing, with $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ and objective value $z(p^*) = 3$. Though all points $p$ on the connecting edges of $p^*$ to its discrete neighbors have constant objective values $z(p) = 3$, there does not exist a neighborhood $U(p^*)$ such that for all $q \in U(p^*) \cap \Delta$ : $z(q) \le z(p^*)$.

Figure 2.3: Objective values of $z$ in the $(p_{11}, p_{21})$-plane (global maximum with value 4; local maximum $p^*$ with value 3).

The restriction of the objective function $z$ to the $(p_{11}, p_{21})$-plane is given by

$$z(p_{11}, p_{21}) = p_{11}p_{21} + p_{11} + p_{21} + 2\big((1 - p_{11}) + (1 - p_{21})\big) = p_{11}p_{21} - p_{11} - p_{21} + 4$$

with $p_{11}, p_{21} \in [0,1]$. Let us compute the objective values along the diagonal connecting the global and the local maximum. Setting $p_{11} = p_{21}$ we get $z(p_{11}, p_{11}) = p_{11}^2 - 2p_{11} + 4$. In any neighborhood $U(p^*)$ we get for points $q \in U \cap \Delta$ on this diagonal with $q_{11} := 1 - \varepsilon$ objective values $z(q_{11}, q_{11}) = (1 - \varepsilon)^2 - 2 + 2\varepsilon + 4 = 3 + \varepsilon^2 > 3$ for all $\varepsilon > 0$ and therefore there cannot exist a neighborhood $U(p^*)$ : $\forall q \in U(p^*) \cap \Delta$ : $z(q) \le z(p^*)$.
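The two observations of Example 2.18 (constant value 3 along the edges leaving $p^*$, but values $3 + \varepsilon^2$ along the diagonal) can be confirmed numerically. A minimal sketch, with the sample grid and the values of $\varepsilon$ chosen here for illustration:

```python
def z(p11, p21):
    """Restriction of z from Example 2.18 to the (p11, p21)-plane."""
    p12, p22 = 1 - p11, 1 - p21
    return p11 * p21 + p11 + p21 + 2 * (p12 + p22)


# Along the two edges leaving p* = (1, 1) the value stays at 3 ...
edge_values = ([z(1.0, t / 10) for t in range(11)]
               + [z(t / 10, 1.0) for t in range(11)])

# ... but along the diagonal towards q* = (0, 0) it rises as 3 + eps^2.
diagonal = [(eps, z(1 - eps, 1 - eps)) for eps in (0.1, 0.01, 0.001)]
```

So every neighborhood of $p^*$ contains points with a larger objective value, although no discrete neighbor of $p^*$ is better than $p^*$.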

2.2.4 Saddle Points, Karush-Kuhn-Tucker Points and Nash

Equilibria

As we will see in Section 2.3.2 there exists a close relation between fixed points of $F[\Gamma]$ and stationary points of $z(p)$. For this reason we characterize in this section the stationary points of $z$, where we distinguish between saddle points, Karush-Kuhn-Tucker (KKT) points and Nash equilibria.

Let us first define the stationary points and saddle points of an A-polynomial $z$.

Definition 2.19 (Stationary Point, Saddle Point)
A point $p^* \in \Delta$ is called a stationary point of $z$, if the gradient of $z$ projected on the affine subspace $\mathrm{aff}(\Delta)$ is zero in $p^*$, i.e. $\Gamma_{ir}(p^*) = v_i$ for all $i \in N$, $r \in K$ and some constants $v_i \in \mathbb{R}$.

Moreover, we call $p^* \in \Delta$ a saddle point of $z$ on $\mathrm{aff}(\Delta)$, if $p^*$ is a stationary point and in any neighborhood $U(p^*) \cap \mathrm{aff}(\Delta)$ there exist points $p, q$ with $z(p) < z(p^*) < z(q)$.

Since the R-SAP is a nonlinear programming problem on $\Delta$, first-order necessary conditions for local maxima are given by the KKT conditions (see Appendix B). Moreover, since A-polynomials are in general not concave functions, sufficiency is not implied.

For a point $p \in \Delta$ we define the support $\mathrm{supp}(p)$ by

$$\mathrm{supp}(p) := \{ (i, r) \in N \times K \mid p_{ir} > 0 \}.$$

In order to derive the KKT conditions for the R-SAP, we use the Lagrangian multipliers $u_{ir}$ for the non-negativity constraints on $p_{ir}$, and $v_i$ for the equality constraints describing $\Delta$. Then the Lagrangian function for the R-SAP is given by

$$L(p, u, v) = -z(p) - \sum_{i=1}^{n} \sum_{r=1}^{k} u_{ir}\, p_{ir} + \sum_{i=1}^{n} v_i \Big( \sum_{r=1}^{k} p_{ir} - 1 \Big).$$

From this we derive directly the KKT conditions for the R-SAP in $p^*$. We get for all $i \in N$, $r \in K$:

$$-\Gamma_{ir}(p^*) - u_{ir} + v_i = 0 \tag{2.22}$$

$$-u_{ir}\, p^*_{ir} = 0 \tag{2.23}$$

$$u_{ir} \ge 0. \tag{2.24}$$

We see that for all $(i, r) \in \mathrm{supp}(p^*)$, (2.23) implies that $u_{ir} = 0$ and in this case (2.22) simplifies to $\Gamma_{ir}(p^*) = v_i$. On the other hand, points $p^*$ on the boundary have components with $p^*_{ir} = 0$ and then $\Gamma_{ir}(p^*) = v_i - u_{ir} \le v_i$, because of (2.22) and (2.24). Consequently a KKT point $p^*$ of the R-SAP is characterized by

$$\Gamma_{ir}(p^*) = \begin{cases} v_i, & \text{if } (i, r) \in \mathrm{supp}(p^*) \\ v_i - u_{ir}, & \text{otherwise} \end{cases} \tag{2.25}$$

where $v_i := \max_{r \in K} \Gamma_{ir}(p^*)$. Now we see from (2.25) that every stationary point of $z$ is a KKT point on $\Delta$.

Next we introduce the so-called Nash equilibria which are defined as follows:

Definition 2.20 (Nash Equilibrium)
A point $p^* \in \Delta$ is a Nash equilibrium of the R-SAP, if

$$\forall i \in N,\ \forall q \in \{ p \in \Delta \mid p_{j\cdot} = p^*_{j\cdot}\ \forall j \in N \setminus \{i\} \} : z(q) \le z(p^*).$$

In our context Nash equilibria will thus be viewed as local maxima with regard to one component. From the linearity of $z(p)$ in $p_{i\cdot}$ for all $i \in N$, it follows directly that the set of all integer Nash equilibria $p^* \in \Delta^I$ of the R-SAP coincides with the set of discrete local maxima of the SAP.

Proposition 2.21
Let $z$ be an A-polynomial. Then $p \in \Delta$ is a Nash equilibrium of the R-SAP, if and only if

$$\forall i \in N\ \exists v_i \in \mathbb{R} \text{ such that } \begin{cases} \Gamma_{ir}(p) = v_i, & \text{if } (i, r) \in \mathrm{supp}(p) \\ \Gamma_{ir}(p) \le v_i, & \text{otherwise.} \end{cases} \tag{2.26}$$

Proof
'$\Rightarrow$': Let $p \in \Delta$ be a Nash equilibrium. W.l.o.g. let for all $i \in N$ : $\Gamma_{i1}(p) = \max_{r \in K} \Gamma_{ir}(p)$ and define $v_i := \Gamma_{i1}(p)$. Thus $\Gamma_{ir}(p) \le v_i$ for all $i \in N$, $r \in K$.

Assume, contrary to the claim, that there exists $(i, s) \in \mathrm{supp}(p)$ with $\Gamma_{is}(p) < v_i = \Gamma_{i1}(p)$. Then we can define a new vector $\tilde p$ as in (2.15) and get a point with larger objective value, $z(\tilde p) > z(p)$, since

$$z(p) = \sum_{r=1}^{k} p_{ir}\,\Gamma_{ir}(p) + R_i(p) < \Gamma_{i1}(p) + R_i(p) = \Gamma_{i1}(\tilde p) + R_i(\tilde p) = \sum_{r=1}^{k} \tilde p_{ir}\,\Gamma_{ir}(\tilde p) + R_i(\tilde p) = z(\tilde p).$$

This contradicts our assumption that $p$ is a Nash equilibrium. Thus $\Gamma_{ir}(p) = v_i$ if $(i, r) \in \mathrm{supp}(p)$ and $\Gamma_{ir}(p) \le v_i$ otherwise, which proves the first direction.

'$\Leftarrow$': Let for all $i \in N$ : $\Gamma_{ir}(p) = v_i$ for all $(i, r) \in \mathrm{supp}(p)$, and $\Gamma_{ir}(p) \le v_i$ otherwise. Moreover, let $i \in N$ be fixed and $\tilde p$ be a vector with $\tilde p_{j\cdot} = p_{j\cdot}$ for all $j \in N \setminus \{i\}$ and $\tilde p_{i\cdot} \in \Delta^k$ arbitrary. We show in (2.27) and (2.28) that then $p$ is a Nash equilibrium.

$$z(\tilde p) = \sum_{r=1}^{k} \tilde p_{ir}\,\Gamma_{ir}(\tilde p) + R_i(\tilde p) = \sum_{r=1}^{k} \tilde p_{ir}\,\Gamma_{ir}(p) + R_i(p) \le \sum_{r=1}^{k} \tilde p_{ir}\, v_i + R_i(p) \tag{2.27}$$

$$= v_i + R_i(p) = \sum_{r=1}^{k} p_{ir}\,\Gamma_{ir}(p) + R_i(p) = z(p). \tag{2.28}$$

The inequality in (2.27) follows from the assumption that $\Gamma_{ir}(p) \le v_i$ for all $i \in N$, $r \in K$. In (2.28) we use the fact that $\Gamma_{ir}(p) = v_i$ for all $(i, r) \in \mathrm{supp}(p)$. It follows that $p$ is a Nash equilibrium.
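Condition (2.26) is easy to test for a given point: in every row the partial derivatives must attain their row maximum on the support. The sketch below is one way to write this check (the tolerance `tol` and the function names are assumptions of this sketch), applied to the partial derivatives (2.6) of Example 2.3.

```python
def gamma(p):
    """Partial derivatives (2.6) of the polynomial from Example 2.3."""
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + 1, p22 + 1],
            [2 * p11 + 1, p12 + 1]]


def is_nash(p, gamma_values, tol=1e-9):
    """Check (2.26): Gamma_ir equals the row maximum v_i wherever p_ir > 0."""
    for row, g_row in zip(p, gamma_values):
        v = max(g_row)  # off the support, Gamma_ir <= v_i then holds automatically
        for x, g in zip(row, g_row):
            if x > tol and abs(g - v) > tol:
                return False
    return True


p_star = [[1.0, 0.0], [1.0, 0.0]]
q_star = [[0.0, 1.0], [0.0, 1.0]]
saddle = [[1 / 3, 2 / 3], [1 / 3, 2 / 3]]
center = [[0.5, 0.5], [0.5, 0.5]]

results = {name: is_nash(p, gamma(p)) for name, p in
           [("p_star", p_star), ("q_star", q_star), ("saddle", saddle), ("center", center)]}
```

Taking $v_i$ as the row maximum is sufficient because the support of each row is non-empty, so any admissible $v_i$ in (2.26) must equal that maximum.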

Comparing Nash equilibria to KKT points, we have just proven the following theorem.

Theorem 2.22
Let $z$ be an A-polynomial and $p^* \in \Delta^\circ$. Then the following three statements are equivalent:

(1) $p^*$ is a Nash equilibrium
(2) $p^*$ is a KKT point
(3) $p^*$ is a saddle point

of the R-SAP.

Since for quadratic A-polynomials all partial derivatives $\Gamma_{ir}(p)$ are linear functions, the next corollary follows immediately from Theorem 2.22 and (2.26).

Corollary 2.23
Let $z$ be a quadratic A-polynomial, then the Nash equilibria in $\Delta^\circ$ form a polyhedron.


We demonstrate subsequently how Corollary 2.23 can be used to compute the set of Nash equilibria in $\Delta^\circ$ for the (unweighted) Max-$k$-Cut problem. (For a detailed description of the Max-$k$-Cut problem see Section 1.3.)

Let an unweighted graph $G = (N, E)$ be given. Then for $K := \{1, \dots, k\}$, the objective function of the Max-$k$-Cut problem can be expressed by the following quadratic A-polynomial

$$z(p) = \sum_{[i,j] \in E}\ \sum_{\substack{r,s \in K \\ r \ne s}} p_{ir}\, p_{js}. \tag{2.29}$$

Let us denote for all $i \in N$ by $V(i) := \{ j : [i,j] \in E \}$ the neighborhood given by the edges of $G$ and by $d_i := |V(i)|$ the degree of vertex $i$. Then a vector $p \in \Delta$ is a Nash equilibrium, if and only if for all $i \in N$

$$\begin{aligned}
(i, r) \in \mathrm{supp}(p) &\ \Rightarrow\ \sum_{j \in V(i)} p_{jr} = \gamma_i \\
(i, r) \notin \mathrm{supp}(p) &\ \Rightarrow\ \sum_{j \in V(i)} p_{jr} \ge \gamma_i
\end{aligned} \tag{2.30}$$

where $\gamma_i = d_i - \max_{r \in K} \Gamma_{ir}(p)$.

(2.30) can be derived directly from (2.26): For (2.29) we have $\Gamma_{ir} = \sum_{j \in V(i)} \sum_{s: s \ne r} p_{js}$ and since $\sum_{s: s \ne r} p_{js} = 1 - p_{jr}$ we get $\Gamma_{ir} = \sum_{j \in V(i)} (1 - p_{jr}) = d_i - \sum_{j \in V(i)} p_{jr}$. Hence $p$ is a Nash equilibrium, if and only if for all $i \in N$ there exist $v_i$ such that

$$\begin{aligned}
(i, r) \in \mathrm{supp}(p) &\ \Rightarrow\ \Gamma_{ir} = d_i - \sum_{j \in V(i)} p_{jr} = v_i \\
(i, r) \notin \mathrm{supp}(p) &\ \Rightarrow\ \Gamma_{ir} = d_i - \sum_{j \in V(i)} p_{jr} \le v_i.
\end{aligned} \tag{2.31}$$

Using $v_i = \max_{r} \Gamma_{ir}$ and $\gamma_i := d_i - v_i$, (2.31) is equivalent to (2.30).

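With the help of this result we can easily compute the set of Nash equilibria in $\Delta^\circ$, as we demonstrate here for the Max-3-Cut instance whose graph is shown in Figure 2.4. The corresponding R-SAP has the following objective function

$$\begin{aligned}
z(p) ={} & p_{11}p_{22} + p_{11}p_{23} + p_{12}p_{21} + p_{12}p_{23} + p_{13}p_{21} + p_{13}p_{22} \\
& + p_{11}p_{32} + p_{11}p_{33} + p_{12}p_{31} + p_{12}p_{33} + p_{13}p_{31} + p_{13}p_{32} \\
& + p_{11}p_{42} + p_{11}p_{43} + p_{12}p_{41} + p_{12}p_{43} + p_{13}p_{41} + p_{13}p_{42}
\end{aligned}$$

from which we compute the partial derivatives

$$\Gamma_{1r} = \sum_{i=2}^{4} \sum_{s: s \ne r} p_{is} \quad \forall r \in K \tag{2.32}$$

$$\Gamma_{ir} = \sum_{s: s \ne r} p_{1s} \quad i = 2, 3, 4,\ \forall r \in K. \tag{2.33}$$

Figure 2.4: Graph of a Max-3-Cut instance (vertex 1, labeled $(\tfrac13, \tfrac13, \tfrac13)$, is joined to the vertices 2, 3, 4, labeled $(p_{21}, p_{22}, p_{23})$, $(p_{31}, p_{32}, p_{33})$ and $(p_{41}, p_{42}, p_{43})$), where the given vectors correspond to Nash equilibria (in the interior), whenever $p_{2r} + p_{3r} + p_{4r} = 1$ for $r = 1, 2, 3$.

From (2.33) it follows immediately for $i = 2, 3, 4$

$$\Gamma_{ir} = (1 - p_{1r}) = v_i \quad \forall r \in K \ \Rightarrow\ p_{11} = p_{12} = p_{13} = \tfrac13,\quad v_2 = v_3 = v_4 = \tfrac23.$$

(2.32) is equivalent to $\Gamma_{1r} = \sum_{i=2}^{4} (1 - p_{ir}) = 3 - \sum_{i=2}^{4} p_{ir} = v_1$ for all $r \in K$ and therefore

$$\sum_{i=2}^{4} p_{ir} = 3 - v_1 \quad \forall r \in K. \tag{2.34}$$

In order to compute $v_1$ we substitute $p_{i1} = 1 - p_{i2} - p_{i3}$ and get using (2.34) for $r = 1$:

$$v_1 = \Gamma_{11} = 3 - \sum_{i=2}^{4} p_{i1} = \sum_{i=2}^{4} p_{i2} + \sum_{i=2}^{4} p_{i3} = 2(3 - v_1)$$

and thus $v_1 = 2$. The Nash equilibria in $\Delta^\circ$ of the graph depicted in Figure 2.4 are given by

$$NE := \Big\{ p \in \Delta^\circ \ \Big|\ p_{1\cdot} = \tfrac13 (1, 1, 1),\ \sum_{i=2}^{4} p_{ir} = 1\ \forall r \in K \Big\}.$$

In order to show that there exist directions where the objective value increases and others where it decreases, we have to disturb at least two rows of a matrix $p \in NE$.

Let $p \in NE$ and let us define $\tilde p_{11} := \tfrac13 + \varepsilon_1$, $\tilde p_{12} := \tfrac13 - \varepsilon_1$, $\tilde p_{21} := p_{21} + \varepsilon_2$, $\tilde p_{22} := p_{22} - \varepsilon_2$ for some small $\varepsilon_1, \varepsilon_2$ and $\tilde p_{ir} := p_{ir}$ for all other elements, such that $\tilde p \in \Delta^\circ$. Computing now $z(\tilde p)$ we get

$$z(\tilde p) = z(p) - 2\varepsilon_1 \varepsilon_2$$

which shows that $p$ is a saddle point, since choosing $\varepsilon_1$ and $\varepsilon_2$ with different signs yields an increase of $z$ whereas a choice with same signs decreases $z$.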
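The identity $z(\tilde p) = z(p) - 2\varepsilon_1\varepsilon_2$ can be verified numerically for one member of $NE$. In the sketch below the concrete leaf rows and the perturbation sizes are illustrative choices; indices are 0-based, so row 0 is vertex 1.

```python
def z(p):
    """Max-3-Cut objective for the star graph of Figure 2.4 (vertex 1 joined to 2, 3, 4)."""
    center, leaves = p[0], p[1:]
    return sum(center[r] * leaf[s]
               for leaf in leaves
               for r in range(3) for s in range(3) if r != s)


def perturb(p, e1, e2):
    """Shift mass e1 within row 1 and e2 within row 2 (first two columns)."""
    q = [row[:] for row in p]
    q[0][0] += e1
    q[0][1] -= e1
    q[1][0] += e2
    q[1][1] -= e2
    return q


third = 1.0 / 3.0
# One interior Nash equilibrium: the three leaf rows sum to (1, 1, 1) column-wise.
p = [[third, third, third],
     [0.5, 0.3, 0.2],
     [0.3, 0.3, 0.4],
     [0.2, 0.4, 0.4]]

same_sign = z(perturb(p, 0.01, 0.02))
opposite_sign = z(perturb(p, 0.01, -0.02))
```

At this point $z(p) = 2$; perturbations with equal signs lower the value and perturbations with opposite signs raise it, as expected at a saddle point.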


2.3 Properties of the Iteration Sequence

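In this section we discuss properties of the operator $F[\xi]$, its convergence behavior and relations between local maxima, fixed points and attractors.

Let $G_i : \mathbb{R}_+ \to \mathbb{R}_+$ be continuous, strictly increasing functions for all $i \in N$. If $\Gamma$ is a fitness function, then the following monotone transformation

$$G : \mathbb{R}^{nk} \to \mathbb{R}^{nk}, \quad G := (G_1, \dots, G_1, \dots, G_n, \dots, G_n) \tag{2.35}$$

defines a new fitness function $\xi = G(\Gamma)$ which will be used in some of the following theorems. Note that then from $\xi_{ir} = G_i(\Gamma_{ir})$ it follows immediately that for any fixed $p \in \Delta$:

$$\Gamma_{ir}(p) < \Gamma_{is}(p) \ \Leftrightarrow\ \xi_{ir}(p) < \xi_{is}(p), \quad i \in N,\ r, s \in K. \tag{2.36}$$

Moreover, the definition of $F[\xi](p)$ implies that unless $p$ is a fixed point, there always exists (at least) one component of $p$ which increases under $F$. Using (2.36) we can give the following sufficient condition for a component's increase:

Let $z \in P_N$ and $p' := F[\xi](p)$, where $\xi := G(\Gamma)$. Moreover, let $s \in K^n$, $i \in N$ and $p \in \Delta^\circ$. Then

$$\Gamma_{i s_i}(p) > \Gamma_{ir}(p)\ \ \forall r \ne s_i \ \Rightarrow\ p'_{i s_i} > p_{i s_i}. \tag{2.37}$$

Moreover, if $z$ is a Boolean A-polynomial, then equivalence holds and we have

$$\Gamma_{i1}(p) > \Gamma_{i2}(p) \ \Leftrightarrow\ p'_{i1} > p_{i1}. \tag{2.38}$$

To prove (2.37) we define $u_{ir}$ such that $p'_{ir} = u_{ir}\, p_{ir}$ for all $i \in N$, $r \in K$. Then we simply have to show that $u_{i s_i} > 1$, which follows immediately from (2.36) since

$$u_{i s_i} := \frac{\xi_{i s_i}}{\sum_{r=1}^{k} p_{ir}\, \xi_{ir}} > \frac{\xi_{i s_i}}{\xi_{i s_i} \sum_{r=1}^{k} p_{ir}} = 1. \tag{2.39}$$

For the Boolean case we have $u_{i1} := \frac{\xi_{i1}}{p_{i1}\xi_{i1} + (1 - p_{i1})\xi_{i2}}$ for all $i \in N$ and therefore

$$\begin{aligned}
u_{i1} > 1 &\ \Leftrightarrow\ \xi_{i1} > p_{i1}\xi_{i1} + (1 - p_{i1})\xi_{i2} \ \Leftrightarrow\ (1 - p_{i1})\xi_{i1} > (1 - p_{i1})\xi_{i2} \\
&\ \Leftrightarrow\ \xi_{i1} > \xi_{i2}\ \ \forall p_{i1} \ne 1 \ \Leftrightarrow\ \Gamma_{i1} > \Gamma_{i2}\ \ \forall p_{i1} \ne 1
\end{aligned}$$

and hence (2.38) holds.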
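Condition (2.37) can be observed on the polynomial of Example 2.3, both for $\xi = \Gamma$ and for a monotone transformation of it. In the sketch below the square root is used as an illustrative choice of $G_i$, and the test point is arbitrary.

```python
import math


def gamma(p):
    """Partial derivatives (2.6) of the polynomial from Example 2.3."""
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + 1, p22 + 1],
            [2 * p11 + 1, p12 + 1]]


def step(p, G=lambda x: x):
    """One application of F[xi] with xi_ir = G(Gamma_ir(p))."""
    out = []
    for row, g_row in zip(p, gamma(p)):
        xi = [G(g) for g in g_row]
        norm = sum(x * f for x, f in zip(row, xi))
        out.append([x * f / norm for x, f in zip(row, xi)])
    return out


p = [[0.6, 0.4], [0.3, 0.7]]          # arbitrary interior point
g = gamma(p)                          # approximately [[1.6, 1.7], [2.2, 1.4]]
best = [max(range(2), key=lambda r: g_row[r]) for g_row in g]

after_identity = step(p)              # xi = Gamma
after_sqrt = step(p, math.sqrt)       # xi = sqrt(Gamma), a monotone transformation
```

In both cases the component with the strictly largest partial derivative in each row grows, while the step taken with the square root is shorter.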


2.3.1 Growth Transformations

In Definition 2.1 we introduced the terminology of (strict) growth transformations. If $F$ is a strict growth transformation, then we have the nice property that for any point $p^0 \in \Delta^\circ$ the sequence $\{F^t(p^0)\}$, $t \ge 0$, does not cycle. In Theorem 2.2 we have already seen that $F[\Gamma]$ is a strict growth transformation. This result was first derived for homogeneous polynomials by Baum and Eagon [BE67] and later extended to arbitrary polynomials with positive coefficients in Baum and Sell [BS68]. However, the authors gave an even stronger result which we will derive here directly from Theorem 2.2:

Theorem 2.24 (Baum, Sell)
Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a polynomial with positive coefficients in the variables $p_{ir}$, $i \in N$, $r \in K$, and let $F := F[\Gamma]$. Then for any $0 < \lambda_i \le 1$ ($i \in N$) the point $\tilde p$ defined by

$$\tilde p_{i\cdot} := (1 - \lambda_i)\, p_{i\cdot} + \lambda_i\, F(p)_{i\cdot}, \quad \forall i \in N \tag{2.40}$$

satisfies

$$z(p) \le z(\tilde p) \tag{2.41}$$

and equality in (2.41) holds if and only if $F(p) = p$.

Proof

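Let $\lambda_i \in (0,1]$ for all $i \in N$ be fixed and $\bar p$ be given by (2.40). We define a new polynomial function $\tilde z(p)$ which agrees with $z(p)$ on $\Delta$ by

$$\tilde z(p) := z(p) + \sum_{i=1}^{n} \kappa_i \Bigl(\sum_{r=1}^{k} p_{ir}\Bigr) - \sum_{i=1}^{n} \kappa_i$$

where

$$\kappa_i := \frac{1 - \lambda_i}{\lambda_i}\,\Sigma_i(p) \qquad \text{with} \qquad \Sigma_i(p) := \sum_{s=1}^{k} p_{is}\,\Gamma_{is}(z,p).$$

Obviously, for all $p \in \Delta$ the following relationship between the partial derivatives of $z$ and $\tilde z$ holds:

$$\Gamma_{ir}(\tilde z, p) = \Gamma_{ir}(z,p) + \kappa_i \qquad \forall i \in N,\ r \in K.$$

Hence, with $F(z,p)_{ir} = \dfrac{p_{ir}\,\Gamma_{ir}(z,p)}{\Sigma_i(p)}$ we get

$$\bar p_{ir} = F(\tilde z, p)_{ir} = \frac{(\Gamma_{ir}(z,p) + \kappa_i)\,p_{ir}}{\Sigma_i(p) + \kappa_i} = \frac{\Sigma_i(p)}{\Sigma_i(p) + \kappa_i}\,F(z,p)_{ir} + \frac{\kappa_i}{\Sigma_i(p) + \kappa_i}\,p_{ir}$$

$$= \lambda_i\,F(z,p)_{ir} + (1 - \lambda_i)\,p_{ir}$$

which proves the theorem, since by Theorem 2.2 $\tilde z(F(\tilde z, p)) \ge \tilde z(p)$ and property (2.4) is fulfilled.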
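The relaxed step (2.40) is easy to try out. The sketch below is illustrative only: the polynomial, the starting point and the step lengths are assumptions chosen for the example, and `relaxed_step` is a hypothetical helper name.

```python
def z(p):
    # Illustrative polynomial with positive coefficients (n = k = 2).
    (p11, p12), (p21, p22) = p
    return 2 * p11 * p21 + 2 * p12 * p22 + 3 * p12 * p21 + p11 * p22

def gamma(p):
    # Partial derivatives Gamma_ir of z.
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + p22, 2 * p22 + 3 * p21],
            [2 * p11 + 3 * p12, 2 * p12 + p11]]

def step(p):
    # F[Gamma](p)_ir = p_ir * Gamma_ir / sum_s p_is * Gamma_is.
    new_p = []
    for p_i, g_i in zip(p, gamma(p)):
        total = sum(a * b for a, b in zip(p_i, g_i))
        new_p.append([a * b / total for a, b in zip(p_i, g_i)])
    return new_p

def relaxed_step(p, lam):
    # (2.40): row i moves only a fraction lam[i] of the way towards F(p)_i.
    q = step(p)
    return [[(1 - l) * a + l * b for a, b in zip(p_i, q_i)]
            for l, p_i, q_i in zip(lam, p, q)]

p = [[0.5, 0.5], [0.3, 0.7]]
gains = [z(relaxed_step(p, lam)) - z(p)
         for lam in ([1.0, 1.0], [0.5, 0.5], [0.1, 0.9])]
```

Since `p` is not a fixed point here, each entry of `gains` should be strictly positive, in line with (2.41); the step lengths may differ from row to row.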

Note that in this theorem it is not assumed that z is an A-polynomial. Moreover,

an important consequence of this theorem is that F cannot leave a 'local hill' as

will be shown in Corollary 2.31.

The following theorem is a generalization of Theorem 2.2 regarding the fitness functions of the operator $F[\xi]$.

Definition 2.25 (Homogeneous Function)
A function $z : \mathbb{R}^{n} \to \mathbb{R}$ is called homogeneous of degree $d$, if $z(\alpha p) = \alpha^d z(p)$ for all $\alpha \in \mathbb{R}$.

Theorem 2.26

Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a polynomial with positive coefficients, $F := F[\xi]$ where $\xi := G(\Gamma)$ as in (2.35) and $G_i$ are concave, strictly increasing, twice differentiable functions for all $i \in N$. Then $F$ is a growth transformation for $z$.

Proof

Following the proof given by Baum and Sell [BS68] we first establish this theorem

for homogeneous polynomials z.

Let a homogeneous polynomial $z(p)$ with degree $m$ be given. We first introduce some notations and identities which we will use in the following proof.

Any homogeneous polynomial $z(p)$ can be written as

$$z(p) = \sum_t w_t\,m_t(p), \qquad \text{where } m_t(p) := \prod_{i,r} p_{ir}^{\,t(i,r)}$$

and $t(i,r)$ are non-negative integers with $\sum_{i,r} t(i,r) = m$.

Moreover we have:

$$z(p) = \frac{1}{m} \sum_{i,r} p_{ir}\,\Gamma_{ir} \tag{2.42}$$

$$\sum_t t(i,r)\,w_t\,m_t(p) = p_{ir}\,\Gamma_{ir} \tag{2.43}$$

$$m_t(p)^{\frac{1}{m}} \le \frac{1}{m} \sum_{i,r} t(i,r)\,p_{ir} \tag{2.44}$$

where (2.42) is Euler's theorem for homogeneous functions, (2.43) is a straightforward computation and (2.44) is the inequality of the geometric and arithmetic means.
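Euler's identity (2.42) can be verified on a concrete homogeneous polynomial. This is a small sketch under the assumption of a quadratic example ($m = 2$); the function names are hypothetical.

```python
def z(p):
    # Homogeneous polynomial of degree m = 2 with positive coefficients.
    (p11, p12), (p21, p22) = p
    return 2 * p11 * p21 + 2 * p12 * p22 + 3 * p12 * p21 + p11 * p22

def gamma(p):
    # Partial derivatives Gamma_ir of z.
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + p22, 2 * p22 + 3 * p21],
            [2 * p11 + 3 * p12, 2 * p12 + p11]]

def euler_rhs(p, m=2):
    # Right-hand side of (2.42): (1/m) * sum_{i,r} p_ir * Gamma_ir.
    return sum(a * g for p_i, g_i in zip(p, gamma(p))
               for a, g in zip(p_i, g_i)) / m

# Homogeneity is a property on all of R^{nk}, so the last test point
# deliberately lies outside Delta.
points = [[[0.5, 0.5], [0.3, 0.7]],
          [[0.2, 0.8], [0.9, 0.1]],
          [[2.0, 1.0], [0.5, 3.0]]]
gaps = [abs(z(p) - euler_rhs(p)) for p in points]
```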

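Let us denote the new point $F(p)$ by $q$ with

$$q_{ir} := \frac{p_{ir}\,\xi_{ir}}{\sum_{s=1}^{k} p_{is}\,\xi_{is}} \qquad \text{for } i \in N,\ r \in K.$$

We define an auxiliary function $v_t(p)$ by

$$v_t(p) := w_t\,m_t(p)\,\bigl(w_t\,m_t(q)\bigr)^{-\frac{1}{m+1}} \tag{2.45}$$

and get for the objective function

$$z(p) = \sum_t \bigl(w_t\,m_t(q)\bigr)^{\frac{1}{m+1}}\,v_t(p). \tag{2.46}$$

Applying Hölder's inequality to (2.46) we get

$$z(p) \le \Bigl(\sum_t w_t\,m_t(q)\Bigr)^{\frac{1}{m+1}} \Bigl(\sum_t v_t(p)^{\frac{m+1}{m}}\Bigr)^{\frac{m}{m+1}} \tag{2.47}$$

$$= z(q)^{\frac{1}{m+1}} \Bigl(\sum_t v_t(p)^{\frac{m+1}{m}}\Bigr)^{\frac{m}{m+1}}. \tag{2.48}$$

To give an estimate for the second term in (2.48) we use (2.45), apply then inequality (2.44) and finally use identity (2.43). Thus we get

$$\sum_t v_t(p)^{\frac{m+1}{m}} = \sum_t w_t\,m_t(p) \Bigl(\frac{m_t(p)}{m_t(q)}\Bigr)^{\frac{1}{m}} = \sum_t w_t\,m_t(p) \Bigl(\prod_{i,r} \Bigl(\frac{p_{ir}}{q_{ir}}\Bigr)^{t(i,r)}\Bigr)^{\frac{1}{m}} \tag{2.49}$$

$$\le \frac{1}{m} \sum_t w_t\,m_t(p) \sum_{i=1}^{n} \sum_{r=1}^{k} t(i,r)\,\frac{p_{ir}}{q_{ir}} \tag{2.50}$$

$$= \frac{1}{m} \sum_t w_t\,m_t(p) \sum_{i=1}^{n} \sum_{r=1}^{k} t(i,r)\,\frac{\sum_{s=1}^{k} p_{is}\,\xi_{is}}{\xi_{ir}} \tag{2.51}$$

$$= \frac{1}{m} \sum_{i=1}^{n} \sum_{r=1}^{k} p_{ir}\,\Gamma_{ir}\,\frac{\sum_{s=1}^{k} p_{is}\,\xi_{is}}{\xi_{ir}}. \tag{2.52}$$

The proof is finished, if we can prove that (2.52) is less than or equal to $z(p)$, because then

$$\sum_t v_t(p)^{\frac{m+1}{m}} \le z(p) \tag{2.53}$$

and substituting (2.53) into (2.48) we get

$$z(p) \le z(q)^{\frac{1}{m+1}}\,z(p)^{\frac{m}{m+1}}$$

which implies that $z(p) \le z(q)$ and thus proves monotonicity.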

We see that if we set now in (2.52) $\xi_{ir} := \Gamma_{ir}$ for all $i \in N$, $r \in K$, then (2.52) is by (2.42) equal to $z(p)$. Together with the second part, the extension from homogeneous polynomials to arbitrary polynomials with positive coefficients, this is the proof of Baum and Sell of Theorem 2.2.

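In our more general context, where $\xi = G(\Gamma)$, it remains to show that if $G_i : \mathbb{R}_+ \to \mathbb{R}_+$ is for all $i \in N$ a concave, strictly increasing, twice differentiable function then (2.53) holds. We simplify (2.53) using the following equivalences:

$$z(p) = \frac{1}{m} \sum_{i=1}^{n} \sum_{r=1}^{k} p_{ir}\,\Gamma_{ir} \ \ge\ \frac{1}{m} \sum_{i=1}^{n} \sum_{r=1}^{k} p_{ir}\,\Gamma_{ir}\,\frac{\sum_{s=1}^{k} p_{is}\,\xi_{is}}{\xi_{ir}} \tag{2.54}$$

$$\iff \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{ir}\,\Gamma_{ir}\,p_{is} \ \ge\ \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{ir}\,\Gamma_{ir}\,p_{is}\,\frac{\xi_{is}}{\xi_{ir}} \tag{2.55}$$

$$\iff \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{ir}\,p_{is}\,\Gamma_{ir} \Bigl(1 - \frac{\xi_{is}}{\xi_{ir}}\Bigr) \ge 0 \tag{2.56}$$

$$\iff \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{ir}\,p_{is}\,\frac{\Gamma_{ir}}{\xi_{ir}}\,(\xi_{ir} - \xi_{is}) \ge 0 \tag{2.57}$$

$$\iff \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=r+1}^{k} p_{ir}\,p_{is} \Bigl(\frac{\Gamma_{ir}}{\xi_{ir}} - \frac{\Gamma_{is}}{\xi_{is}}\Bigr) (\xi_{ir} - \xi_{is}) \ge 0. \tag{2.58}$$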

Now let us introduce a shorter notation for easier writing: we denote for any fixed $i \in N$: $\Gamma_{ir}$ by $x_r$ and $\xi_{ir}$ by $f(x_r)$. Of course now the assumptions on $G_i$ hold for $f$ and we have $x \ge 0$ and $f(x) > 0$. We will prove that if $f$ is strictly increasing and concave then

$$\Bigl(\frac{x_1}{f(x_1)} - \frac{x_2}{f(x_2)}\Bigr) \bigl(f(x_1) - f(x_2)\bigr) \ge 0. \tag{2.59}$$

If we define $g(x) := \dfrac{x}{f(x)}$ and use the strict monotonicity of $f$, then (2.59) is fulfilled, if

$$f(x_1) \le f(x_2) \iff x_1 \le x_2 \implies g(x_1) \le g(x_2)$$

and therefore $g(x)$ is a monotone increasing function. We show that $f(0) \ge 0$, $f'(x) > 0$ and $f''(x) \le 0$ for all $x \ge 0$ implies that $g'(x) \ge 0$.

$$g'(x) = \frac{f(x) - x f'(x)}{f(x)^2} \ge 0 \iff f(x) - x f'(x) \ge 0.$$

However, defining $h(x) := f(x) - x f'(x)$ we see that $h(0) = f(0) \ge 0$. Furthermore we get by its derivative $h'(x) = f'(x) - f'(x) - x f''(x) = -x f''(x)$ and the concavity of $f(x)$, i.e. $f''(x) \le 0$, that $h'(x) \ge 0$ and therefore $h(x)$ is a monotone increasing function. Thus $h(x) \ge 0$ for all $x \ge 0$ and therefore $g'(x) \ge 0$ which proves this theorem for homogeneous polynomials.
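Inequality (2.59) can be probed numerically for some concrete choices of $f$. The sketch below is only an illustration: the grid and the functions are assumptions, and the convex function is included as a contrast in which the sufficient condition (2.59) fails (this alone says nothing about whether $F$ is still a growth transformation in that case).

```python
import math

def product_2_59(f, x1, x2):
    # Left-hand side of (2.59) with g(x) = x / f(x).
    g = lambda x: x / f(x)
    return (g(x1) - g(x2)) * (f(x1) - f(x2))

grid = [0.1 * j for j in range(1, 51)]  # sample points x > 0

def holds_on_grid(f):
    # Small tolerance guards against rounding noise.
    return all(product_2_59(f, a, b) >= -1e-12 for a in grid for b in grid)

# Concave, strictly increasing candidates for G_i.
concave_ok = [holds_on_grid(f) for f in (lambda x: x, math.sqrt, math.log1p)]

# A convex function: g(x) = 1/x decreases, so the product turns negative.
convex_value = product_2_59(lambda x: x * x, 1.0, 2.0)
```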

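In a second step we show that this theorem does not only hold for homogeneous polynomials, but for arbitrary polynomials with positive coefficients. For this reason we rewrite $z(p)$ as $z(p) = \sum_{d=0}^{m} h_d(p)$ where $h_d(p)$ are homogeneous polynomials of degree $d$. Now we introduce a new variable $q := (q_1, \dots, q_k) \in \Delta_k$ and enlarge the domain $\Delta$ to $\Delta \times \Delta_k$. We define a new polynomial $\tilde z$ by

$$\tilde z((p,q)) := \sum_{d=0}^{m} h_d(p) \Bigl(\sum_{r=1}^{k} q_r\Bigr)^{m-d} \tag{2.60}$$

which is a homogeneous polynomial of degree $m$ with positive coefficients and agrees with $z$ on $\Delta$ for any fixed $q \in \Delta_k$. We get

$$F(\tilde z, p)_{ir} = \frac{p_{ir}\,\xi_{ir}(\tilde z, p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(\tilde z, p)} = \frac{p_{ir}\,\xi_{ir}(z, p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(z, p)} = F(z,p)_{ir} \tag{2.61}$$

$$F(\tilde z, q)_{r} = \frac{q_r\,\xi_r(\tilde z, q)}{\sum_{s=1}^{k} q_s\,\xi_s(\tilde z, q)} = \frac{q_r\,\xi_r(\tilde z, q)}{\xi_r(\tilde z, q) \sum_{s=1}^{k} q_s} = q_r. \tag{2.62}$$

(2.61) follows immediately from (2.60) and the fact that $\Gamma_{ir}(\tilde z, p) = \Gamma_{ir}(z, p)$ for all $i \in N$, $r \in K$. (2.62) is a consequence of the fact that all $\Gamma_r(\tilde z, q)$, $r \in K$, have the same value, because

$$\Gamma_r(\tilde z, q) = \sum_{d=0}^{m} h_d(p)\,(m-d) \Bigl(\sum_{t=1}^{k} q_t\Bigr)^{m-d-1} = \Gamma_s(\tilde z, q) \qquad \forall r, s \in K.$$

An important consequence of (2.62) is that $q$ is fixed. Since we know already that this theorem is true for homogeneous polynomials we can apply it now to $\tilde z$ and using (2.61) and (2.62) we get

$$z(p) = \tilde z((p,q)) \le \tilde z(F(\tilde z, (p,q))) = \tilde z((F(\tilde z, p), q)) = z(F(z,p))$$

which proves the theorem for arbitrary polynomials with positive coefficients.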

2.3.2 Fixed Points and Accumulation Points

In this subsection we concentrate on fixed points of the iteration sequence. It will

turn out that in the interior fixed points coincide with saddle points. Furthermore we prove that on the boundary KKT points are fixed points, but not vice versa.

In order to speak about fixed points on the boundary, we assume from now on that $F[\xi]$ is well defined on $\Delta$. If $\xi = G(\Gamma)$, then a sufficient condition for $F[\xi] : \Delta \to \Delta$ being well defined is that $\Gamma$ is a fitness function on $\Delta$ and therefore $\Gamma_{ir}(z,p) > 0$ for all $i \in N$, $r \in K$, $p \in \Delta$ (which can always be achieved by Transformation 2.10 on page 28).


Definition 2.27 ((Unstable) Fixed Point)
Let $F : \Delta \to \Delta$ be given, then $p^* \in \Delta$ is called a fixed point, if $F(p^*) = p^*$. Moreover, a fixed point $p^* \in \Delta$ is called unstable, if there exists a neighborhood $U(p^*)$ such that for every neighborhood $U_1(p^*)$ in $U(p^*)$ there exists at least one starting point $p \in U_1(p^*) \cap \Delta$ such that the sequence $\{F^t(p)\}$, $t \ge 0$, does not lie entirely in $U(p^*)$.

We start with an investigation of the relations between KKT points and fixed points, where we distinguish explicitly between points in the interior $\Delta^0$ and those on the boundary.

Theorem 2.28

Let $z$ be a $\Delta$-polynomial, $F := F[\xi]$, where $\xi := G(\Gamma)$ as in (2.35) and $p^* \in \Delta$.

(1) If $p^*$ is a KKT point, then $p^*$ is a fixed point.

(2) Moreover, if $p^*$ is a fixed point in $\Delta^0$, then $p^*$ is a KKT point, i.e. a saddle point.

Proof

A point $p \in \Delta$ is a fixed point, if $F(p)_{ir} = p_{ir}$ for all $i \in N$, $r \in K$, which is equivalent to

$$\frac{p_{ir}\,\xi_{ir}}{\sum_s p_{is}\,\xi_{is}} = p_{ir} \iff \begin{cases} \xi_{ir} = \sum_s p_{is}\,\xi_{is} =: \nu_i, & (i,r) \in \operatorname{supp}(p) \\ \xi_{ir} \text{ arbitrary}, & (i,r) \notin \operatorname{supp}(p) \end{cases} \tag{2.63}$$

The strict monotonicity of $G_i$ guarantees that for any fixed $p \in \Delta$ the order relation (2.36) is kept. For $p^* \in \Delta$ we see from (2.25) that every KKT point is a fixed point. Moreover, if $p^* \in \Delta^0 \iff \operatorname{supp}(p^*) = N \times K$ then by (2.63) the fixed point set in $\Delta^0$ is equivalent to the set of KKT points in $\Delta^0$.

Of course this theorem implies that local maxima (and by the same argument also local minima) are fixed points. However, in general equivalence between local optima and fixed points does not hold on the boundary, because all integer vertices are fixed points, but not necessarily KKT points (or local minima), as the following example shows:

Example 2.29

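Let $z = 2p_{11}p_{21} + 2p_{12}p_{22} + 3p_{12}p_{21} + p_{11}p_{22}$. The integer vertex $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ is no local optimum, but nevertheless a fixed point of $F[\Gamma]$. We see that $\Gamma(p^*) = \begin{pmatrix} 2 & 3 \\ 2 & 1 \end{pmatrix}$ which therefore contradicts (2.25), the characterization of KKT points.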
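Example 2.29 can be replayed in a few lines. The sketch assumes that the vertex meant is the one with $p_{11} = p_{21} = 1$ (rows $i$, columns $r$); the helper names are hypothetical.

```python
def z(p):
    (p11, p12), (p21, p22) = p
    return 2 * p11 * p21 + 2 * p12 * p22 + 3 * p12 * p21 + p11 * p22

def gamma(p):
    # Partial derivatives Gamma_ir of z.
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + p22, 2 * p22 + 3 * p21],
            [2 * p11 + 3 * p12, 2 * p12 + p11]]

def step(p):
    # F[Gamma](p)_ir = p_ir * Gamma_ir / sum_s p_is * Gamma_is.
    new_p = []
    for p_i, g_i in zip(p, gamma(p)):
        total = sum(a * b for a, b in zip(p_i, g_i))
        new_p.append([a * b / total for a, b in zip(p_i, g_i)])
    return new_p

vertex = [[1, 0], [1, 0]]        # assumed vertex: p11 = p21 = 1
is_fixed = step(vertex) == vertex
grad = gamma(vertex)             # first row: Gamma_12 exceeds Gamma_11
neighbor = [[0, 1], [1, 0]]      # adjacent vertex with a larger value
```

Components that are zero stay zero under $F$, which is why every integer vertex is a fixed point regardless of the gradient; the larger value at `neighbor` shows that the vertex is not a local maximum.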

Subsequently we want to point out that not all fixed points $p \in \Delta \setminus (\Delta^0 \cup \Delta^I)$ are necessarily local optima.

Example 2.30
Let $z = 2p_{11}p_{21} + 3p_{12}p_{22} + 2p_{12}p_{21} + p_{11}p_{22}$. For the point $p^* = \begin{pmatrix} 1/2 & 1/2 \\ 1 & 0 \end{pmatrix}$ we have $\Gamma(p^*) = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$ and therefore $p^*$ is a fixed point.

[Figure: sketch in the $(p_{11}, p_{21})$-plane.]

However, the objective values along $g_1 : p_{21} = 2p_{11}$ are strictly increasing as $p_{11}$ decreases from $\frac{1}{2}$ to $0$, and those along $g_2 : p_{21} = 2 - 2p_{11}$ are strictly decreasing for $p_{11} \in [\frac{1}{2}, 1]$. Hence in any neighborhood $U_\varepsilon(p^*)$ there are points with larger and smaller value than $z(p^*)$ and therefore $p^*$ is a fixed point which is no local optimum.
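The behavior along the two lines of Example 2.30 can be tabulated directly. This is a minimal sketch in the Boolean coordinates $(p_{11}, p_{21})$; the sample points and names are assumptions of the example.

```python
def z(p11, p21):
    # Objective of Example 2.30 with p12 = 1 - p11 and p22 = 1 - p21.
    p12, p22 = 1 - p11, 1 - p21
    return 2 * p11 * p21 + 3 * p12 * p22 + 2 * p12 * p21 + p11 * p22

def gamma(p11, p21):
    # Partial derivatives Gamma_ir of z.
    p12, p22 = 1 - p11, 1 - p21
    return [[2 * p21 + p22, 3 * p22 + 2 * p21],
            [2 * p11 + 2 * p12, 3 * p12 + p11]]

# Walk away from p* = (p11, p21) = (1/2, 1) along both lines.
along_g1 = [z(a, 2 * a) for a in (0.5, 0.4, 0.3, 0.2, 0.1, 0.0)]
along_g2 = [z(a, 2 - 2 * a) for a in (0.5, 0.6, 0.7, 0.8, 0.9, 1.0)]
grad_at_fixed_point = gamma(0.5, 1.0)
```

On $g_1$ the objective reduces to $4p_{11}^2 - 4p_{11} + 3$ and on $g_2$ to $-4p_{11}^2 + 4p_{11} + 1$, so the values rise when leaving $p^*$ along $g_1$ and fall when leaving it along $g_2$.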

Having discussed the relationship between local maxima and fixed points, we will

now present a corollary about the behavior of F in some neighborhood of a fixed

point. This result is a direct consequence of Theorem 2.24:

Corollary 2.31

Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a continuous function, $F$ be a growth transformation for $z$ and $p^* \in \Delta$ be a fixed point of $F$. Moreover, we define for any fixed $\varepsilon > 0$

$$U'_\varepsilon(p^*) := \{p \in \Delta \mid z(p) \ge z(p^*) - \varepsilon\} \tag{2.64}$$

and denote the connected component of $U'_\varepsilon(p^*)$ containing $p^*$ by $U_\varepsilon(p^*) \subseteq U'_\varepsilon(p^*)$. Then we have $F(U_\varepsilon) \subseteq U_\varepsilon$.

The following example shows that Corollary 2.31 does not imply pointwise convergence of $\{F^t(p)\}$, $t \ge 0$.

Example 2.32

Let the following $\Delta$-polynomial be the objective function of an R-SAP instance

$$z(p) = 3p_{21}(p_{11} + p_{12}) + 2p_{12}p_{22} + p_{11}p_{22}.$$

It can easily be verified that all points $p = \begin{pmatrix} p_{11} & p_{12} \\ 1 & 0 \end{pmatrix}$, with $p_{1\cdot} \in \Delta_k$ arbitrary, are local maxima with $z(p) = 3$. The graph of $z(p)$ in the $p_{11}$, $p_{21}$-coordinates is depicted in Figure 2.5(a).

Now, let $p^*$ be a local maximum and therefore a fixed point. Then the sets $U'_\varepsilon(p^*)$ defined by (2.64) are connected and $U_\varepsilon(p^*) = U'_\varepsilon(p^*)$. For different values of $\varepsilon$, these regions $U_\varepsilon(p^*)$, bounded by the level curves of $z$, are shown in Figure 2.5(b). We see that these regions converge for $\varepsilon \to 0$ to $\begin{pmatrix} p_{11} & p_{12} \\ 1 & 0 \end{pmatrix}$, $p_{1\cdot} \in \Delta_k$, the connected component of local maxima containing $p^*$, represented in Figure 2.5(a) by the upper line.

Finally, we investigate accumulation points of a strict growth transformation F.


(a) Plot of $z(p) = 3p_{21}(p_{11} + p_{12}) + 2p_{12}p_{22} + p_{11}p_{22}$. (b) Interlocking regions $U_\varepsilon$ converging to the upper edge.

Figure 2.5: Example illustrating sets $U_\varepsilon$ of Corollary 2.31.

Proposition 2.33

Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a continuous function and $F$ be a strict growth transformation for $z$. Then all accumulation points of a sequence $\{p^t := F^t(p^0)\}$, $t \ge 0$, for a fixed $p^0 \in \Delta^0$ have the same function value and are fixed points of $F$.

Proof

Note first that by the compactness of $\Delta$, accumulation points of $F$ exist. Since $F$ is a growth transformation it follows that all accumulation points of $F$ must have the same objective value. Now let $p^*$ be an accumulation point. Then there exists a subsequence $\{p^{t_l}\}$, $l \ge 0$, which converges to $p^*$. Moreover, continuity of $F$ implies that $\{p^{t_l + 1} = F(p^{t_l})\}$ converges to $F(p^*)$ and since $z(p^*) = z(F(p^*))$ it follows from (2.4) that $p^*$ is a fixed point.

Using this proposition we can show an even stronger result than in Theorem 2.28(2).

Proposition 2.34

Let $z$ be a $\Delta$-polynomial and $F := F[\xi]$ be a strict growth transformation for $z$. Moreover, let $p^* \in \Delta^0$ be a saddle point. Then $p^*$ is an unstable fixed point of $F$.

Proof

Since $p^*$ is a saddle point in $\Delta^0$ there always exists a neighborhood $U(p^*)$ which does not contain another saddle point $q \in U(p^*) \cap \Delta^0$ with $z(q) > z(p^*)$. Hence, by Theorem 2.28(2) there is also no fixed point in $U(p^*) \cap \Delta^0$ with larger objective value than $p^*$. Moreover, in every neighborhood $U_1(p^*) \cap \Delta^0 \subseteq U(p^*)$ there exists a point $p$ with $z(p) > z(p^*)$. Since $F$ is a strict growth transformation and by Proposition 2.33 every accumulation point is a fixed point, it follows that $\{F^t(p)\}$, $t \ge 0$, leaves the neighborhood $U(p^*)$ and therefore $p^*$ is an unstable fixed point.


2.3.3 On the Convergence

In this subsection we discuss some convergence results for the R-SAP regarding

operator $F := F[\xi]$, where $\xi := G(\Gamma)$ as in (2.35). We first concentrate on the

special case of R-SAP's with linear objective functions and afterwards turn our

attention to the general case.

Linear Case

Let us investigate the special case of an R-SAP with a linear objective function. It

should be noted here that such problems can easily be solved and they are discussed

here only for algorithmic reasons.

For given positive weights $c \in (\mathbb{R}_+)^{nk}$ we have the R-SAP

$$\max_{p \in \Delta} \sum_{i=1}^{n} \sum_{r=1}^{k} c_{ir}\,p_{ir} = \sum_{i=1}^{n} \max_{p_i \in \Delta_k} \sum_{r=1}^{k} c_{ir}\,p_{ir}. \tag{2.65}$$

We see that in order to solve (2.65) it suffices to solve for each $i \in N$ the following reduced problem over one simplex:

$$\max \sum_{r=1}^{k} c_r\,p_r \qquad \text{s.t. } p \in \Delta_k. \tag{2.66}$$

An optimal solution $p^*$ of (2.66) is given by $p^*_r = 1$, if $c_r = \max_{s \in K} c_s$, and $p^*_s = 0$ for all $s \in K \setminus r$.

Since $\Gamma_r(p) = c_r$ is constant, $\xi_r = G(\Gamma_r(p))$ is constant, too. This allows us to compute the $t$-th element of the iteration sequence generated by operator $F[\xi]$ (here exceptionally denoted by $p^{(t)}$ instead of $p^t$ in order to avoid confusion) explicitly:

$$p_r^{(t)} = \frac{p_r^{(t-1)}\,\xi_r}{\sum_{s=1}^{k} p_s^{(t-1)}\,\xi_s} = \frac{\bigl(\frac{p_r\,\xi_r}{\Sigma}\bigr)\,(\xi_r)^{t-1}}{\sum_{s=1}^{k} \bigl(\frac{p_s\,\xi_s}{\Sigma}\bigr)\,(\xi_s)^{t-1}} = \frac{p_r\,(\xi_r)^{t}}{\sum_{s=1}^{k} p_s\,(\xi_s)^{t}} \tag{2.67}$$

where $\Sigma := \sum_{s=1}^{k} p_s\,\xi_s$. Now we will show that (2.67) always converges to a global maximal solution of (2.66) (even if there exist non-strict local maxima).

Theorem 2.35

Let the linear optimization problem (2.66) be given, $F := F[\xi]$ where $\xi := G(\Gamma)$ as in (2.35) and $I := \{r \in K \mid c_r = \max_{s \in K} c_s\}$. Then $F^t(p)_r = p_r^{(t)}$ is given by (2.67) and for any starting point $p \in \Delta_k$, $\{F^t(p)\}$ converges to an optimal point $p^*$ with

$$p^*_r = \begin{cases} \dfrac{p_r}{\sum_{s \in I} p_s}, & r \in I \\[1ex] 0, & r \in K \setminus I \end{cases} \tag{2.68}$$

If the problem is not degenerate, i.e. all integer vertices have different objective values and therefore $|I| = 1$, then the optimal solution of (2.66) is a unique integer vertex $p^* \in \Delta^I$.

Proof

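W.l.o.g. we assume

$$c_1 = c_2 = \dots = c_{l-1} > c_l \ge \dots \ge c_k > 0, \tag{2.69}$$

hence $z(p)$ is not constant on $\Delta$. Since $\Gamma_r = c_r$ for all $r \in K$ and $G : \mathbb{R}_+ \to \mathbb{R}_+$ is strictly increasing, the same order as in (2.69) holds for $\xi_r$, $r \in K$, as well.

We prove that for any $r \in I$ and any $\varepsilon > 0$ there exists an index $t_0$ such that for all $t \ge t_0$: $|p_r^{(t)} - p^*_r| < \varepsilon$. We get from (2.67) and (2.68) for all $r \in I$

$$\bigl|p_r^{(t)} - p^*_r\bigr| = \left| \frac{p_r\,(\xi_r)^t}{\sum_s p_s\,(\xi_s)^t} - \frac{p_r}{\sum_{s \in I} p_s} \right|$$

$$= \left| \frac{p_r\,(\xi_r)^t \bigl(\sum_{s \in I} p_s\bigr) - p_r \bigl(\sum_{s \in I} p_s\,(\xi_s)^t\bigr) - p_r \bigl(\sum_{s \notin I} p_s\,(\xi_s)^t\bigr)}{\bigl(\sum_s p_s\,(\xi_s)^t\bigr) \bigl(\sum_{s \in I} p_s\bigr)} \right|$$

$$= \frac{p_r \bigl(\sum_{s \notin I} p_s\,(\xi_s)^t\bigr)}{\bigl(\sum_{s \in I} p_s\bigr) \bigl(\sum_s p_s\,(\xi_s)^t\bigr)} \qquad \text{because } \xi_r = \xi_s \ \ \forall r, s \in I$$

$$\le \frac{p_r}{\bigl(\sum_{s \in I} p_s\bigr)^2} \sum_{s \notin I} p_s \Bigl(\frac{\xi_s}{\xi_r}\Bigr)^t.$$

By (2.69) we have $\xi_s < \xi_r$ for all $r \in I$ and $s \notin I$, and therefore the last term converges to zero as $t$ goes to infinity. This proves the convergence for all components $p_r$, $r \in I$. However, since $p^* \in \Delta$ and $\sum_{r \in I} p^*_r = 1$, all other components of $p^*$ must be zero which finishes the proof of the theorem.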
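The closed form (2.67) and the limit (2.68) can be compared against plain iteration. The following sketch is illustrative: the weights, the starting point and the choice $G = \sqrt{\cdot}$ are assumptions, and the function names are hypothetical.

```python
import math

def iterate(p, xi, t):
    # Apply F[xi] t times on one simplex; xi is constant in the linear case.
    for _ in range(t):
        total = sum(a * b for a, b in zip(p, xi))
        p = [a * b / total for a, b in zip(p, xi)]
    return p

def closed_form(p, xi, t):
    # Right-hand side of (2.67): p_r * xi_r^t / sum_s p_s * xi_s^t.
    weights = [a * b ** t for a, b in zip(p, xi)]
    total = sum(weights)
    return [w / total for w in weights]

c = [3.0, 3.0, 2.0, 1.0]          # two maximal weights, so I = {1, 2}
xi = [math.sqrt(v) for v in c]    # xi_r = G(c_r) with G = sqrt
p0 = [0.1, 0.2, 0.3, 0.4]

gap = max(abs(a - b)
          for a, b in zip(iterate(p0, xi, 25), closed_form(p0, xi, 25)))
limit = [0.1 / 0.3, 0.2 / 0.3, 0.0, 0.0]   # (2.68) for this starting point
far = iterate(p0, xi, 400)
```

In this degenerate instance the limit is not an integer vertex: the mass of the non-maximal components is redistributed over $I$ in proportion to the starting values, as (2.68) states.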

Our next goal is to study convergence for arbitrary A-polynomials z. We will show

pointwise convergence of $\{F[\xi]^t(p)\}$, $t \ge 0$, for $\xi = G(\Gamma)$ and we will estimate the

convergence rate.

General Case

Let $z$ be a $\Delta$-polynomial. We call a local maximum $p^*$ a local maximum with maximal support, if for every local maximum $q^* \in \Delta$: $\operatorname{supp}(p^*) \subseteq \operatorname{supp}(q^*) \implies \operatorname{supp}(p^*) = \operatorname{supp}(q^*)$. This definition implies the following property for a local maximum $p^*$ with maximal support, to be used in the Convergence Theorem 2.37:

$$p^*_{ir} = 0 \implies \Gamma_{ir}(p^*) < \nu_i. \tag{2.70}$$

Moreover, note that the negation of the KKT characterization (2.25) implies that equivalence holds in (2.70).


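To prove this observation let $p^*$ be a local maximum where we assume w.l.o.g. that $p^*_{11} = 0$ and $p^*_{12} > 0$. Now we will show that if $\Gamma_{11}(p^*) = \nu_1$ then there exists a local maximum $q^*$ with $q^*_{11} > 0$ and thus $\operatorname{supp}(q^*) \supsetneq \operatorname{supp}(p^*)$. Since $p^*$ is a local maximum, there exists a neighborhood $U_\varepsilon(p^*)$ such that for all points $p \in U_\varepsilon(p^*) \cap \Delta$: $z(p) \le z(p^*)$. Let us define some $\bar\varepsilon$ with $0 < \bar\varepsilon < \varepsilon$ and $\bar\varepsilon < p^*_{12}$. Then we can define a new point $q^*$ by $q^*_{11} := \bar\varepsilon$, $q^*_{12} := p^*_{12} - \bar\varepsilon > 0$ and $q^*_{ir} := p^*_{ir}$ for all other components. Note that now $\operatorname{supp}(q^*) \supsetneq \operatorname{supp}(p^*)$ and by the definition of $q^*$ it follows that $\Gamma_{1r}(p^*) = \Gamma_{1r}(q^*)$ for all $r \in K$ and $R_1(p^*) = R_1(q^*)$. We get for the objective value at $q^*$:

$$z(q^*) = \sum_{r=1}^{k} q^*_{1r}\,\Gamma_{1r}(q^*) + R_1(q^*) = \sum_{r:\,p^*_{1r} > 0} q^*_{1r}\,\underbrace{\Gamma_{1r}(p^*)}_{=\,\nu_1 \text{ by (2.25)}} + q^*_{11}\,\underbrace{\Gamma_{11}(p^*)}_{=\,\nu_1 \text{ by assumption}} + R_1(p^*)$$

$$= \nu_1 + R_1(p^*) = z(p^*)$$

and since $q^* \in U_\varepsilon(p^*)$ it follows that $q^*$ is a local maximum with larger support than $p^*$. This finishes the proof that every local maximum $p^*$ with maximal support satisfies property (2.70).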

Definition 2.36 (Geometric Convergence)
Let a sequence $\{p^t\}$, $t \ge 0$, with $\lim_{t \to \infty} p^t = p^*$ be given. Then the sequence $\{p^t\}$ is geometrically convergent, if there exist $c > 0$, $0 < \rho < 1$ and an index $t_0$ such that

$$\|p^t - p^*\| \le c\,\rho^t \qquad \text{for all } t \ge t_0.$$

The following result was shown for the operator $F[\Gamma]$ by Cochand and is adapted here to fitness functions $\xi = G(\Gamma)$.

Theorem 2.37

Let $z$ be a $\Delta$-polynomial, $F := F[\xi]$, where $\xi := G(\Gamma)$ as in (2.35) and $G_i$ are strictly increasing Lipschitz functions for all $i \in N$. Moreover, let $p^* \in \Delta \setminus \Delta^0$ be a non-strict local maximum with maximal support. If $p^*$ is an accumulation point of the sequence $\{F^t(p^0)\}$, $t \ge 0$, for some $p^0 \in \Delta$, then $\lim_{t \to \infty} F^t(p^0) = p^*$ and $\{F^t(p^0)\}$ is geometrically convergent.

Proof

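Let us define for all $i \in N$, $p \in \Delta$

$$\Gamma_i(p) := \max_{r \in K} \Gamma_{ir}(p) \qquad \text{and} \qquad \xi_i(p) := \max_{r \in K} \xi_{ir}(p).$$

We have already shown in (2.70) that a local maximum $p^*$ with maximal support has the following property:

$$p^*_{ir} = 0 \implies \Gamma_{ir}(p^*) < \Gamma_i(p^*). \tag{2.71}$$

Let us denote by $B_\delta(p^*)$ the closed ball with center $p^*$ and radius $\delta$. By continuity we can find $\delta > 0$ such that for $p \in B_\delta(p^*) \cap \Delta$, $i \in N$, $r \in K$ the following properties hold for $\Gamma_{ir}$ and, because of the strict monotonicity of $G_i$, for $\xi_{ir}$ as well:

• $p^*_{ir} > 0 \implies p_{ir} > 0$

• $\Gamma_{ir}(p^*) < \Gamma_{is}(p^*) \implies \Gamma_{ir}(p) < \Gamma_{is}(p) \implies \xi_{ir}(p) < \xi_{is}(p)$

• $0 < \Gamma_{ir}(p^*) \implies 0 < \Gamma_{ir}(p) \implies 0 < \xi_{ir}(p)$

Moreover, we define for all $i \in N$ the sets

$$U_i := \{(i,r) \mid \Gamma_{ir}(p^*) = \Gamma_i(p^*),\ 1 \le r \le k\}, \qquad U := \bigcup_{i \in N} U_i$$

$$V_i := \{(i,r) \mid \Gamma_{ir}(p^*) < \Gamma_i(p^*),\ 1 \le r \le k\}, \qquad V := \bigcup_{i \in N} V_i.$$

Note that $U = \operatorname{supp}(p^*)$ and $V = (N \times K) \setminus \operatorname{supp}(p^*)$.

For $p \in \Delta$ we define

$$\varepsilon_i(p) := \sum_{(i,r) \in V_i} p_{ir} \quad \forall i \in N, \qquad \varepsilon(p) := \max_{i \in N} \varepsilon_i(p) \tag{2.72}$$

and

$$d(p) := \sqrt{\sum_{(i,r) \in V} p_{ir}^2}. \tag{2.73}$$

Since $\sum_{(i,r) \in V} p_{ir}^2 \ge \max_{(i,r) \in V} p_{ir}^2$ it follows that

$$d(p) \ge \max_{(i,r) \in V} p_{ir} \tag{2.74}$$

and therefore we get from $\varepsilon_i(p) = \sum_{(i,r) \in V_i} p_{ir} \le k \max_{(i,r) \in V_i} p_{ir}$ that

$$\varepsilon(p) = \max_{i \in N} \varepsilon_i(p) \le k \max_{(i,r) \in V} p_{ir} \le k\,d(p). \tag{2.75}$$

Consider the mapping $\Pi : \Delta^0 \to F_{p^*}$ where $\bar p := \Pi(p)$ is defined componentwise by

$$\bar p_{ir} := \begin{cases} 0, & \text{for } (i,r) \in V \\[0.5ex] \dfrac{p_{ir}}{1 - \varepsilon_i(p)}, & \text{otherwise.} \end{cases}$$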

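Claim 2.38

For all $(i,r) \in N \times K$

$$|p_{ir} - \bar p_{ir}| \le \varepsilon(p). \tag{2.76}$$

Proof of Claim 2.38

If $(i,r) \in V$, then $\bar p_{ir} = 0$ and (2.76) follows directly from (2.72), because $p_{ir} \le \varepsilon_i(p) \le \varepsilon(p)$. Moreover, (2.72) also implies that for all $i \in N$

$$1 - \varepsilon(p) \le 1 - \varepsilon_i(p) = \sum_{(i,r) \in U_i} p_{ir}$$

and therefore we have for $(i,r) \in U_i$

$$\bar p_{ir} - p_{ir} = \frac{p_{ir}}{1 - \varepsilon_i(p)} - p_{ir} = p_{ir}\,\frac{1 - (1 - \varepsilon_i(p))}{1 - \varepsilon_i(p)} = \bar p_{ir}\,\varepsilon_i(p) \le \varepsilon(p).$$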
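The projection $\Pi$ and the bound (2.76) can be illustrated on a small instance. In the sketch below the index set $V$ and the point $p$ are assumptions made up for the example ($n = 2$, $k = 3$, 0-based indices), and `project` and `eps` are hypothetical helper names.

```python
def project(p, V):
    # p_bar = Pi(p): zero out the components in V and rescale the rest
    # of each row by 1 / (1 - eps_i(p)).
    p_bar = []
    for i, row in enumerate(p):
        eps_i = sum(v for r, v in enumerate(row) if (i, r) in V)
        p_bar.append([0.0 if (i, r) in V else v / (1 - eps_i)
                      for r, v in enumerate(row)])
    return p_bar

def eps(p, V):
    # eps(p) = max_i eps_i(p) as in (2.72).
    return max(sum(v for r, v in enumerate(row) if (i, r) in V)
               for i, row in enumerate(p))

V = {(0, 2), (1, 1), (1, 2)}               # assumed complement of supp(p*)
p = [[0.5, 0.4, 0.1], [0.8, 0.15, 0.05]]   # interior point of Delta
p_bar = project(p, V)
max_gap = max(abs(a - b)
              for row, row_bar in zip(p, p_bar)
              for a, b in zip(row, row_bar))
```

Each row of `p_bar` again sums to one, so the projected point lies on the face spanned by the components outside `V`.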

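Note that $p^*$ lies in the relative interior of $F_{p^*}$, since it is by assumption a local maximum with maximal support. We choose $\delta < 1$ small enough such that all points in $B_\delta(p^*) \cap F_{p^*}$ are local maxima.

Claim 2.39

Let $\delta' := \dfrac{\delta}{nk^2 + 1}$. Then

$$p \in B_{\delta'}(p^*) \cap \Delta^0 \implies \bar p \in B_\delta(p^*).$$

Proof of Claim 2.39

For $p \in B_{\delta'}(p^*)$ we have

$$d(p) = \sqrt{\sum_{(i,r) \in V} (p_{ir} - p^*_{ir})^2} \le \sqrt{\sum_{(i,r) \in V \cup U} (p_{ir} - p^*_{ir})^2} \le \delta',$$

since $p^*_{ir} = 0$ for all $(i,r) \in V$. Hence, it follows from (2.75) that $\varepsilon(p) \le k\delta'$ and therefore

$$\|p^* - \bar p\| \le \|p^* - p\| + \|p - \bar p\| \le \delta' + nk\,\varepsilon(p) \le \delta' + nk \cdot k\delta' = \delta'(nk^2 + 1) = \delta$$

which proves that $\bar p \in B_\delta(p^*)$.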

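Let $M$ be defined by

$$M^{-1} := \min\{\xi_{ir}(p) \mid p \in B_\delta(p^*) \cap \Delta,\ (i,r) \in U\} \tag{2.77}$$

and

$$\bar\rho := \max\left\{ \frac{\xi_{ir}(p)}{\xi_{is}(p)} \;\middle|\; p \in B_\delta(p^*) \cap \Delta,\ (i,r) \in V,\ (i,s) \in U \right\}.$$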

Let L' := 2nmaxiejvc?2 (Erer^) ana- ^ := ^' wnere C is the largest Lipschitzconstant of the functions G% for all 2 G Ar. Now we choose À with 0 < A < ~ such

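that $\bar\rho\,\frac{1}{1-\lambda} < 1$. Since $\lambda$ can be chosen arbitrarily small and $\bar\rho < 1$ by definition, such a $\lambda$ always exists. Hence, there also exists $\rho$ with $\bar\rho\,\frac{1}{1-\lambda} \le \rho < 1$. Moreover, we define

$$V_\lambda := \{p \in B_\delta(p^*) \cap \Delta \mid \varepsilon(p) \le \lambda\} \tag{2.78}$$

and consider a point $p \in V_\lambda \cap \Delta^0$. One could think of $p$ as a point $F^t(p^0)$ for some $t$. For this point $p$ we define $\varepsilon := \varepsilon(p)$.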


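Claim 2.40

For $p \in B_{\delta'}(p^*) \cap \Delta^0$ we have

$$|\xi_{ir}(p) - \xi_{ir}(\bar p)| \le \varepsilon L. \tag{2.79}$$

Proof of Claim 2.40

We just have to prove that

$$|\Gamma_{ir}(p) - \Gamma_{ir}(\bar p)| \le \varepsilon L'. \tag{2.80}$$

Then the claim follows directly from Lipschitz continuity of the functions $G_i$, $i \in N$, because we have for all $i \in N$, $r \in K$

$$|\xi_{ir}(p) - \xi_{ir}(\bar p)| \le C\,|\Gamma_{ir}(p) - \Gamma_{ir}(\bar p)| \le \varepsilon L' C = \varepsilon L.$$

Since the coefficients of $\Gamma_{ir}(p)$ are also coefficients of $z$ they can be bounded by $\sum_{t \in T} w_t$. Thus it is sufficient to derive a bound for a monomial $T$ in $\Gamma_{ir}(p)$ with degree $m < n$. Using (2.76) we get

$$\Bigl| w_T \prod_{(l,s) \in T} p_{ls} - w_T \prod_{(l,s) \in T} \bar p_{ls} \Bigr| \le w_T \Bigl( \prod_{(l,s) \in T} (\bar p_{ls} + \varepsilon) - \prod_{(l,s) \in T} \bar p_{ls} \Bigr)$$

$$\le w_T \sum_{l=1}^{m} \binom{m}{l} \varepsilon^l \le w_T\,\varepsilon\,2^m \le w_T\,\varepsilon\,2^n.$$

By summation over $T \in \mathcal{T}$ we get the upper bound in (2.80).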

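Since by (2.77) for $(i,r) \in U_i$ we have $\xi_{ir}(\bar p)\,M \ge 1$, it follows from (2.79) that for $p \in B_{\delta'}(p^*) \cap \Delta^0 \cap V_\lambda$

$$\xi_{ir}(p) \le \xi_{ir}(\bar p) + \varepsilon L \le \xi_{ir}(\bar p)\,(1 + \varepsilon L M)$$

and symmetrically

$$\xi_{ir}(p) \ge \xi_{ir}(\bar p) - \varepsilon L \ge \xi_{ir}(\bar p)\,(1 - \varepsilon L M).$$

For $p \in B_{\delta'}(p^*) \cap \Delta^0$ we know from Claim 2.39 that $\bar p$ lies in $B_\delta(p^*) \cap F_{p^*}$ and by our choice of $\delta$ it is a local maximum. This implies

• $p^*_{ir} > 0 \implies \bar p_{ir} > 0$

• $\Gamma_{ir}(\bar p) = \Gamma_i(\bar p)$ and $\xi_{ir}(\bar p) = \xi_i(\bar p)$ for $\bar p_{ir} > 0$, i.e. $(i,r) \in U_i$,

• $\Gamma_{ir}(\bar p) < \Gamma_i(\bar p)$ and $\xi_{ir}(\bar p) < \xi_i(\bar p)$ for $(i,r) \in V_i$.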


Lemma 2.41

Consider $p' := F(p)$, where $p \in B_{\delta'}(p^*) \cap \Delta^0 \cap V_\lambda$. Then

(i) there exists a constant $K$ such that $|p'_{ir} - p_{ir}| \le K k\,d(p)$ for $(i,r) \in U_i$

(ii) $p'_{ir} \le \rho\,p_{ir}$ for $(i,r) \in V_i$

(iii) $|d(p) - d(p')| \ge (1 - \rho)\,d(p)$.

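Proof of Lemma 2.41 (i)

Case 1:

If $p'_{ir} \ge p_{ir}$ we need an upper bound for $p'_{ir}$:

$$p'_{ir} = \frac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)} \le \frac{p_{ir}\,\xi_{ir}(p)}{\sum_{(i,s) \in U_i} p_{is}\,\xi_{is}(p)} \le \frac{p_{ir}\,\xi_{ir}(\bar p)\,(1 + \varepsilon L M)}{\sum_{(i,s) \in U_i} p_{is}\,\xi_{is}(\bar p)\,(1 - \varepsilon L M)}$$

$$= \frac{p_{ir}\,\xi_i(\bar p)\,(1 + \varepsilon L M)}{\xi_i(\bar p)\,(1 - \varepsilon L M) \sum_{(i,s) \in U_i} p_{is}} \le \frac{p_{ir}\,(1 + \varepsilon L M)}{(1 - \varepsilon L M)(1 - \varepsilon)}$$

and therefore

$$p'_{ir} - p_{ir} \le p_{ir} \left( \frac{1 + \varepsilon L M}{(1 - \varepsilon L M)(1 - \varepsilon)} - 1 \right) \le \frac{(1 + \varepsilon L M) - (1 - \varepsilon L M)(1 - \varepsilon)}{(1 - \varepsilon L M)(1 - \varepsilon)}$$

$$= \frac{(1 + \varepsilon L M) - (1 - \varepsilon L M) + \varepsilon (1 - \varepsilon L M)}{(1 - \varepsilon L M)(1 - \varepsilon)} \le \frac{2 \varepsilon L M + \varepsilon}{(1 - \varepsilon L M)(1 - \varepsilon)} \le K \varepsilon \le K k\,d(p)$$

with

$$K := \frac{2 L M + 1}{(1 - \lambda L M)(1 - \lambda)}.$$

Case 2:

If $p'_{ir} < p_{ir}$ we need a lower bound for $p'_{ir}$:

$$p'_{ir} = \frac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)} \ge \frac{p_{ir}\,\xi_{ir}(p)}{\max_{(i,s)} \xi_{is}(p)} \ge p_{ir}\,\frac{\xi_i(\bar p)\,(1 - \varepsilon L M)}{\xi_i(\bar p)\,(1 + \varepsilon L M)}$$

and therefore

$$p_{ir} - p'_{ir} \le p_{ir} \left( 1 - \frac{1 - \varepsilon L M}{1 + \varepsilon L M} \right) \le \frac{(1 + \varepsilon L M) - (1 - \varepsilon L M)}{1 + \varepsilon L M} = \frac{2 \varepsilon L M}{1 + \varepsilon L M} \le \frac{2 L M + 1}{1 - \varepsilon L M}\,\varepsilon \le K \varepsilon \le K k\,d(p).$$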

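Proof of Lemma 2.41 (ii)

For $(i,r) \in V$ we have

$$p'_{ir} = \frac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)} \le \frac{p_{ir}\,\xi_{ir}(p)}{\sum_{(i,s) \in U_i} p_{is}\,\xi_{is}(p)} \le \frac{p_{ir}\,\xi_{ir}(p)}{\min_{(i,s) \in U_i}\{\xi_{is}(p)\} \sum_{(i,s) \in U_i} p_{is}}$$

$$= p_{ir} \cdot \frac{\xi_{ir}(p)}{\min_{(i,s) \in U_i} \xi_{is}(p)} \cdot \frac{1}{1 - \varepsilon_i(p)} \le p_{ir}\,\bar\rho\,\frac{1}{1 - \varepsilon} \le \rho\,p_{ir}$$

for $\varepsilon \le \lambda$.

Proof of Lemma 2.41 (iii)

From the definition of $d(p)$ in (2.73) and the fact that by Lemma 2.41 (ii) $p'_{ir} \le \rho\,p_{ir}$ for $(i,r) \in V_i$, it follows immediately that

$$d(p') = \sqrt{\sum_{(i,r) \in V} (p'_{ir})^2} \le \rho\,d(p) \tag{2.81}$$

and therefore

$$d(p) - d(p') \ge (1 - \rho)\,d(p).$$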

Summarizing, we have shown that for $p' := F(p)$ where $p \in B_{\delta'} \cap V_\lambda$, we come closer to the face $F_{p^*}$ by a step of length at least $(1 - \rho)\,d(p)$ while the progress for $(i,r) \in U_i$ is bounded by $K k\,d(p)$.

Under the assumption that our hypothesis still holds for $p'$ and all further iterates, we shall approach $F_{p^*}$ while remaining in a kind of 'cone' with vertex $p$. Then Theorem 2.37 follows from the existence of an iterate $F^t(p^0)$ which lies close enough to $p^*$ so that the corresponding cone is entirely contained in $B_{\delta'} \cap V_\lambda$ and does not contain a presumably existing second accumulation point.

It remains to show that if $p^0$ lies close enough to $p^*$, then all iterates $p^s$ lie in $B_{\delta'} \cap V_\lambda$. In order to formalize this idea we define

$$\theta := \frac{K k}{1 - \rho} \qquad \text{and} \qquad \delta'' := \frac{\delta'}{nk\theta + 1}. \tag{2.82}$$

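Claim 2.42

Assume that $p^0 \in B_{\delta''}(p^*) \cap V_\lambda$. Then $p^s := F^s(p^0) \in B_{\delta'}(p^*) \cap V_\lambda$ for all $s \ge 1$.

Proof of Claim 2.42:

We prove Claim 2.42 by induction and assume therefore that $p^s \in B_{\delta'}(p^*) \cap V_\lambda$ for $0 \le s \le t$. We must show that $p^{t+1} \in B_{\delta'}(p^*) \cap V_\lambda$.

Since Lemma 2.41 holds for all $p^s$ with $0 \le s \le t$, the following statements are implied:

1. For $(i,r) \in V$ we have

(a) $p^{s+1}_{ir} \le \rho\,p^s_{ir}$ for all $s \le t$ (Lemma 2.41 (ii))

(b) $p^0_{ir} \ge p^1_{ir} \ge \dots \ge p^t_{ir} \ge p^{t+1}_{ir}$ (follows from 1a and $\rho < 1$)

(c) $\sum_{s=0}^{t} |p^{s+1}_{ir} - p^s_{ir}| = p^0_{ir} - p^{t+1}_{ir} \le p^0_{ir} \le d(p^0)$ (follows from 1b and (2.74))

(d) $\varepsilon(p^{s+1}) \le \rho\lambda$ for all $s \le t$ and therefore $p^{t+1} \in V_\lambda$ (follows from 1a, (2.78) and the fact that by induction hypothesis $p^0 \in B_{\delta''}(p^*) \cap V_\lambda$ and therefore $\varepsilon(p^0) \le \lambda$)

2. We get for the distances $d$

(a) $d(p^{s+1}) \le \rho\,d(p^s)$ for all $s \le t$ (follows from (2.81))

(b) $d(p^s) - d(p^{s+1}) \ge (1 - \rho)\,d(p^s)$ for all $s \le t$ (by Lemma 2.41 (iii))

(c) $d(p^0) \ge d(p^1) \ge \dots \ge d(p^t) \ge d(p^{t+1})$ (follows from 2a)

3. For $(i,r) \in U$ we have

(a) $|p^{s+1}_{ir} - p^s_{ir}| \le K k\,d(p^s) \le \frac{K k}{1 - \rho}\,\bigl(d(p^s) - d(p^{s+1})\bigr) = \theta\,\bigl(d(p^s) - d(p^{s+1})\bigr)$ for all $s \le t$ (by Lemma 2.41 (i), 2b and the definition of $\theta$ in (2.82))

(b) $\sum_{s=0}^{t} |p^{s+1}_{ir} - p^s_{ir}| \le \theta \sum_{s=0}^{t} \bigl(d(p^s) - d(p^{s+1})\bigr) = \theta\,\bigl(d(p^0) - d(p^{t+1})\bigr) \le \theta\,d(p^0)$ (by 3a and 2c)

4. The difference between $p^{t+1}$ and $p^0$ is bounded:

$$\|p^{t+1} - p^0\| \le \sum_{s=0}^{t} \|p^{s+1} - p^s\| \le \sum_{s=0}^{t} \sum_{(i,r) \in V \cup U} |p^{s+1}_{ir} - p^s_{ir}|$$

$$= \sum_{(i,r) \in V} \sum_{s=0}^{t} |p^{s+1}_{ir} - p^s_{ir}| + \sum_{(i,r) \in U} \sum_{s=0}^{t} |p^{s+1}_{ir} - p^s_{ir}| \qquad \text{(by 1c, 3b)}$$

$$\le \sum_{(i,r) \in V} d(p^0) + \sum_{(i,r) \in U} \theta\,d(p^0) \le \theta\,d(p^0)\,(|V| + |U|) \qquad \text{(since } \theta > 1\text{)}$$

$$= nk\theta\,d(p^0)$$

5. $p^0 \in B_{\delta''}(p^*)$ implies $d(p^0) \le \delta''$ so that

$$\|p^{t+1} - p^*\| \le \|p^{t+1} - p^0\| + \|p^0 - p^*\| \le nk\theta\delta'' + \delta'' \ \text{(by 4)} \ = (nk\theta + 1)\,\delta'' = \delta' \ \text{(by (2.82))}$$

and therefore $p^{t+1} \in B_{\delta'}(p^*)$.

Now Claim 2.42 follows immediately from 1d and 5 and thus finishes the proof of convergence of Theorem 2.37. Note that in the last statements we have not only shown that $p^s \in B_{\delta'}(p^*) \cap V_\lambda$ for all $s \ge 1$, but also in 4 that the length of the trajectory is bounded by $nk\theta\,d(p^0)$.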


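To prove geometric convergence we use for $p^0 \in B_{\delta''}(p^*) \cap V_\lambda$ the estimates derived in 1a, 2a and 3 and get

$$(i,r) \in V: \quad |p^t_{ir} - p^*_{ir}| \le \rho^t\,p^0_{ir} \le \rho^t\,d(p^0) \qquad \text{(since } p^*_{ir} = 0\text{)}$$

$$(i,r) \in U: \quad |p^t_{ir} - p^*_{ir}| \le \sum_{s=0}^{\infty} |p^{t+s+1}_{ir} - p^{t+s}_{ir}| \le \sum_{s=0}^{\infty} K k\,d(p^{t+s}) \le K k\,d(p^t) \sum_{s=0}^{\infty} \rho^s = \frac{K k}{1 - \rho}\,d(p^t) \le \theta\,d(p^0)\,\rho^t.$$

Since there exists $t_0$ such that $p^{t_0} \in B_{\delta''}(p^*) \cap V_\lambda$ we have $d(p^{t_0}) \le \delta''$ and get by summation over all indices

$$\|p^t - p^*\| \le nk\theta\delta''\rho^t \qquad \forall t \ge t_0$$

which finishes the proof of Theorem 2.37.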
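The contraction of the components outside the support, as in Lemma 2.41 (ii), can be observed on the polynomial of Example 2.32. This is only an illustrative sketch for $F[\Gamma]$ from one assumed starting point; it does not compute the constants of the proof, and the helper names are hypothetical.

```python
def gamma(p):
    # Partial derivatives of z = 3*p21*(p11 + p12) + 2*p12*p22 + p11*p22.
    (p11, p12), (p21, p22) = p
    return [[3 * p21 + p22, 3 * p21 + 2 * p22],
            [3 * (p11 + p12), p11 + 2 * p12]]

def step(p):
    # F[Gamma](p)_ir = p_ir * Gamma_ir / sum_s p_is * Gamma_is.
    new_p = []
    for p_i, g_i in zip(p, gamma(p)):
        total = sum(a * b for a, b in zip(p_i, g_i))
        new_p.append([a * b / total for a, b in zip(p_i, g_i)])
    return new_p

p = [[0.5, 0.5], [0.5, 0.5]]
ratios = []                      # per-step contraction factor of p22
for _ in range(40):
    q = step(p)
    ratios.append(q[1][1] / p[1][1])
    p = q
```

Here $p_{22}$ is the only component that vanishes on the set of local maxima, so it plays the role of $d(p)$. Its per-step ratio stays bounded away from one, which is the geometric decay that Theorem 2.37 formalizes.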

Until now we have always discussed the discrete time dynamical system defined by

operator F. However, this system can also be interpreted as a discretized version

of a continuous dynamical system where for a small step length the iteration

sequence generated by the discrete dynamics follows essentially the trajectories of the corresponding continuous dynamical system. This aspect will be discussed

briefly in Appendix C where we will show that this continuous dynamical system is

a gradient flow with respect to an appropriate metric.

2.3.4 Relations between Local Optima and Attractors

First we introduce the notion of attractors:

Definition 2.43 (Attractor)
Let $F : \Delta \to \Delta$ be an operator. A point $p^* \in \Delta$ is called an attractor of $F$ if there exists a neighborhood $U(p^*)$ such that for any starting point $p^0 \in U(p^*) \cap \Delta$ the sequence $\{F^t(p^0)\}$, $t \ge 0$, converges to $p^*$. In this case

$$RA(p^*) := \{p \in \Delta \mid \lim_{t \to \infty} F^t(p) = p^*\} \tag{2.83}$$

is called the region of attraction of $p^*$ with respect to $F$.

If we are speaking about a SAP instance with objective function $z$ and $F := F[\Gamma]$, then we will write $RA(z, p^*)$ instead of $RA(p^*)$ in order to make clear which objective function we are working with.

In the sequel we investigate the locations of attractors and their relations with local

maxima. Moreover, we present some sufficient conditions characterizing a subset of

the related region of attraction.


Lemma 2.44

Let $F : \Delta^0 \to \Delta^0$ be a strict growth transformation for a continuous function $z : \mathbb{R}^{nk} \to \mathbb{R}$ and let $p^* \in \Delta^I$ be a fixed point of $F$. If there exists a closed set $M(p^*) \subseteq \Delta$ which fulfills the following properties

(R1) $\exists \varepsilon > 0 : (\mathcal{U}_\varepsilon(p^*) \cap \Delta) \subseteq M(p^*)$

(R2) $F(M(p^*)) \subseteq M(p^*)$

(R3) $p^*$ is the only fixed point in $M(p^*)$

then $p^*$ is an attractor and $M(p^*) \subseteq RA(p^*)$.

Proof

From the compactness of $M(p^*)$ and (R2) it follows that for any $p \in M(p^*)$ the sequence $\{F^t(p)\}$, $t \ge 0$, has at least one accumulation point in $M(p^*)$. Moreover, we know already from Proposition 2.33 that every accumulation point of $\{F^t(p)\}$, $t \ge 0$, is also a fixed point. However, by (R3) the only fixed point in $M(p^*)$ is $p^*$ which is therefore also the only accumulation point in $M(p^*)$ and hence $\lim_{t \to \infty} F^t(p) = p^*$ for all $p \in M(p^*)$. Since by (R1) $M(p^*)$ has a non-empty relative interior it follows that $p^*$ is an attractor and hence $M(p^*) \subseteq RA(p^*)$.

Using the results of Lemma 2.44 and Corollary 2.31 we can now completely characterize the attractors of the R-SAP:

Theorem 2.45

Let $z$ be a $\Delta$-polynomial and $F := F[\Gamma]$. Then $p^* \in \Delta$ is an attractor of $F$ if and only if $p^*$ is a strict local maximum of $z$ on $\Delta$.

Proof

'$\Rightarrow$': Assume $p^* \in \Delta$ is not a strict local maximum. Then for any neighborhood $U(p^*)$ there exists $q \in U(p^*) \cap \Delta$ with $z(q) \ge z(p^*)$ and $q \ne p^*$. Now we fix any such $U(p^*)$ and define a sequence $\{p^t := F^t(p^0)\}$, $t \ge 0$, with $p^0 := q$. Then $\{p^t\}$ does not converge to $p^*$, because either $z(p^t) = z(p^0)$ for all $t \ge 0$ and then $p^0$ is a fixed point, or $z(p^t) > z(p^0) = z(q) \ge z(p^*)$ for all $t > 0$ and $\{z(p^t)\}$, $t \ge 0$, increases monotonically.

'$\Leftarrow$': This part of the proof makes use of Corollary 2.31. Let $p^*$ be a strict local maximum with $p^*_{i1} = 1$ for all $i \in N$ and let $U_\varepsilon(p^*)$ as in (2.64) be the connected component of $\{p \in \Delta \mid z(p) \ge z(p^*) - \varepsilon\}$ containing $p^*$. We choose $\varepsilon > 0$ small enough such that for all $p \in U_\varepsilon(p^*)$: (i) $p_{i1} > 0$ and (ii) $\Gamma_{i1}(p) > \Gamma_{ir}(p)$ for all $r > 1$, which is possible because of the continuity of $z$ and the fact that $p^*$ is a strict local maximum.

From (2.63) we know that a point $p \in \Delta$ is a fixed point of $F$ if and only if for all $r \in K$ with $p_{ir} \ne 0$: $\Gamma_{ir}(p) = \nu_i$ for all $i \in N$. Hence (i) and (ii) imply directly that the only fixed point in $U_\varepsilon(p^*)$ is $p^*$. By Corollary 2.31 we have $F(U_\varepsilon(p^*)) \subseteq U_\varepsilon(p^*)$ and hence $U_\varepsilon(p^*)$ fulfills for the chosen $\varepsilon$ the properties (R1)-(R3) of Lemma 2.44 which implies that $p^*$ is an attractor.
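Theorem 2.45 can be illustrated with the instance of Example 2.3 (restated in Section 2.4.1), whose two strict local maxima are the vertices with $p_{11} = p_{21} = 1$ and $p_{12} = p_{22} = 1$. The sketch below is an assumption-laden illustration: the starting points and the iteration count are arbitrary choices, and the helper names are hypothetical.

```python
def gamma(p):
    # Partial derivatives of
    # z = 2*p11*p21 + p12*p22 + p11 + p12 + p21 + p22 - 2.
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + 1, p22 + 1],
            [2 * p11 + 1, p12 + 1]]

def step(p):
    # F[Gamma](p)_ir = p_ir * Gamma_ir / sum_s p_is * Gamma_is.
    new_p = []
    for p_i, g_i in zip(p, gamma(p)):
        total = sum(a * b for a, b in zip(p_i, g_i))
        new_p.append([a * b / total for a, b in zip(p_i, g_i)])
    return new_p

def run(p, steps=100):
    for _ in range(steps):
        p = step(p)
    return p

# One start on each side of the saddle point (1/3, 2/3; 1/3, 2/3).
near_global = run([[0.6, 0.4], [0.6, 0.4]])
near_local = run([[0.1, 0.9], [0.1, 0.9]])
```

On the diagonal $p_{11} = p_{21}$ the first component grows exactly when $p_{11} > \frac{1}{3}$, so the two starting points end up at different attractors.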

The following example shows that there may exist non-strict local maxima which are not even accumulation points of any sequence $\{F^t(p)\}$, $t \ge 0$, with $p \in \Delta^0$.

Example 2.46 (Example 2.32 continued)
For the R-SAP with

$$z(p) = 3p_{21}(p_{11} + p_{12}) + 2p_{12}p_{22} + p_{11}p_{22}$$

the local maxima are given by the points $p = \begin{pmatrix} p_{11} & p_{12} \\ 1 & 0 \end{pmatrix}$, with $p_{1\cdot} \in \Delta_k$. In Figure 2.6(a) we see the graph of $z(p)$ in the $p_{11}$, $p_{21}$-plane and Figure 2.6(b) shows its vector field.

(a) Plot of $z(p) = 3p_{21}(p_{11} + p_{12}) + 2p_{12}p_{22} + p_{11}p_{22}$. (b) Vector field of $z(p)$.

Figure 2.6: Example with a discrete local maximum $p^*$ which is not an accumulation point of any sequence started in $\Delta^0$.

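Computing $\Gamma(p)$ we get

$$\Gamma(p) = \begin{pmatrix} 3p_{21} + p_{22} & 3p_{21} + 2p_{22} \\ 3p_{11} + 3p_{12} & p_{11} + 2p_{12} \end{pmatrix}$$

and we see that $\Gamma_{11}(p) < \Gamma_{12}(p)$ for all $p \in \Delta^0$ and therefore also $F(p)_{11} < p_{11}$. Hence, the local maximum $p^* := \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ cannot be an accumulation point of $\{F[\Gamma]^t(p)\}$, $t \ge 0$, for any $p \in \Delta^0$.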
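The monotone decrease of $p_{11}$ in Example 2.46 can be checked on a grid of interior points. A minimal sketch; the grid is an arbitrary assumption and the helper names are hypothetical.

```python
def gamma(p):
    # Partial derivatives of z = 3*p21*(p11 + p12) + 2*p12*p22 + p11*p22.
    (p11, p12), (p21, p22) = p
    return [[3 * p21 + p22, 3 * p21 + 2 * p22],
            [3 * p11 + 3 * p12, p11 + 2 * p12]]

def step(p):
    # F[Gamma](p)_ir = p_ir * Gamma_ir / sum_s p_is * Gamma_is.
    new_p = []
    for p_i, g_i in zip(p, gamma(p)):
        total = sum(a * b for a, b in zip(p_i, g_i))
        new_p.append([a * b / total for a, b in zip(p_i, g_i)])
    return new_p

grid = [0.1 * j for j in range(1, 10)]   # interior values 0.1, ..., 0.9
shrinks = all(step([[a, 1 - a], [b, 1 - b]])[0][0] < a
              for a in grid for b in grid)
```

Since $\Gamma_{12}(p) - \Gamma_{11}(p) = p_{22} > 0$ in the interior, the first component of row 1 loses weight in every step, so a trajectory started in $\Delta^0$ can never approach a point with $p_{11} = 1$.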

Summing up, we have shown the following implications where zigzag lines in the

diagram denote some implications that do not hold:


[Diagram: implications between strict local max. (discr.), local max., strict local max. (cont.), local max. (cont.), attractor (Thm. 2.45), KKT point, accumulation point, and fixed point / equilibrium.]

2.4 Guaranteed Region of Attraction for $F[\Gamma]$

We know already from Theorem 2.45 that each strict local maximum of an R-SAP

instance has a region of attraction for F[Y] which, in general, depends on the

form of the objective function z. For this reason we investigate in this section the

step length and step direction of the iteration sequence, which both influence its

convergence behavior. We will prove that the intersection of all regions of attraction

over all forms of z where F is defined, is non-empty and therefore there exists for

each strict local maximum also a so-called form-independent region of attraction.

So we concentrate on finding a subset of this region, called the guaranteed region of attraction GRA, with the goal that for quadratic A-polynomials this region can be characterized explicitly and thus provides us in this case with an additional stopping criterion for FPH.

Throughout this section we use the operator $F := F[\Gamma]$. Moreover, we denote the set of all A-polynomials in $P_N$ for which $F$ is well defined by $P_F$. Since we are working here with different forms of $z$ we define:

Definition 2.47 (Form-Independence)
Let $z \in P_F$ be fixed and for all $\tilde z \in P_F$ with $\tilde z \equiv z$ let a set $M(\tilde z, p)$ depending on $\tilde z$ and $p \in \Delta$ be given. If for all $p \in \Delta$ and $\tilde z \in P_F$ with $\tilde z \equiv z$: $M(\tilde z, p) = M(z, p)$, then we define $M(p) := M(z, p)$ and call the set $M(p)$ form-independent.

All pictures in this section represent Boolean, quadratic objective functions with two variables ($n = k = m = 2$). The x-axis corresponds to the $p_{11}$ component and the y-axis to $p_{21}$.


2.4.1 An Introductory Example

Let us continue here with Example 2.3 from page 23 and demonstrate some interesting aspects regarding the regions of attraction of an R-SAP instance. Recall that in this example $N = K = \{1, 2\}$ and the objective function was given by

$z(p) = 2p_{11}p_{21} + p_{12}p_{22} + p_{11} + p_{12} + p_{21} + p_{22} - 2.$

This R-SAP has a global maximum at $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ with $z(p^*) = 2$ and a local maximum at $q^* = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ with $z(q^*) = 1$ (see also the graph in Figure 2.1(a)). Moreover, there exists a saddle point $\bar p = \frac{1}{3}\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$.

Now we will study two A-polynomials $z_1$ and $z_2$ which are equivalent to $z$. We define for some large constant $M$:

$z_1(p) := z(p) + M(p_{11} + p_{12}) - M$

$z_2(p) := z(p) + M(p_{21} + p_{22}) - M$

and it follows immediately that $z \equiv z_1 \equiv z_2$.

The partial derivatives of $z$ were already computed in (2.6) and those of the two forms $z_1$ and $z_2$ are given by

$\Gamma(z_1, p) = \Gamma(z, p) + \begin{pmatrix} M & M \\ 0 & 0 \end{pmatrix}$ and $\Gamma(z_2, p) = \Gamma(z, p) + \begin{pmatrix} 0 & 0 \\ M & M \end{pmatrix}$.

We see in Figure 2.7 (also compare with Figure 2.2(b)) how the regions of attraction of $p^*$ and $q^*$ are changed by the new forms $z_1$ and $z_2$.

(a) Region of attraction for $z_1$. (b) Region of attraction for $z_2$.

Figure 2.7: Regions of attraction for $z_1$ and $z_2$ with $M := 5000$; black region corresponds to $p^*$, light gray region to $q^*$ (numerically determined).


In order to discuss the regions of attraction, we define four regions on $[0, 1]^2$ which correspond uniquely to regions on $\Delta$. Within each of these regions the step direction of the iteration sequence is fixed as shown by the arrows in Figure 2.8.

[Figure: the square $[0, 1]^2$ divided into the four regions $R_1, \dots, R_4$.]

Figure 2.8: Region of monotone increase; arrows indicate directions of increase.

Comparing the three regions of attraction in the Figures 2.7(a), 2.7(b) and 2.2(b) for the three different forms $z_1$, $z_2$ and $z$, we observe that all the black regions as well as all the light gray regions have a non-empty intersection. Indeed, we will see later that for every strict local maximum there exists a non-empty intersection of the regions of attraction of all forms of $z$. This form-independent region is defined in Definition 2.48 and will be called the universal region of attraction. In this example the universal regions of attraction are given by $R_2$ and $R_4$ for the strict local maxima $p^*$ and $q^*$, respectively. Note, however, that in the general case the universal region of attraction cannot be determined explicitly. For this reason we will investigate in the following subsections some subsets thereof.

Aside from these two universal regions of attraction there also exist the two form-dependent regions $R_1$ and $R_3$. For points in those two regions we cannot say in advance to which point the iteration sequence will converge. Since the step direction is fixed in each of these regions, the behavior of the iteration sequence there can only be influenced by the step length.

By adding a large constant to the partial derivatives with respect to the components of some $p_i$, we make the step length in these directions small. In this example we added for $z_1$ a large constant to the derivatives of $p_1$, thus making the corresponding step length in $R_3$ in $-p_{11}$-direction but also in $R_1$ in $p_{11}$-direction very small. Hence, in Figure 2.7(a) the steps in $p_{21}$-direction are large compared to those in $p_{11}$-direction and therefore the sequence reaches the universal region of attraction $R_2$ for nearly all points in $R_3$ and none in $R_1$. Exactly the reverse situation is depicted in Figure 2.7(b) for the function $z_2$.

In the following subsections we will concentrate on characterizing some subsets of the universal region of attraction and do not discuss the issues of the step length further on.
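The form dependence described in this example can be reproduced with a few lines of code. The sketch below is only an illustration under stated assumptions: it assumes the multiplicative form $F(p)_{ir} = p_{ir}\Gamma_{ir}(p) / \sum_s p_{is}\Gamma_{is}(p)$ for the operator (definition (2.5) lies outside this section), it differentiates the objective of the introductory example directly, and the names `gamma`, `step`, `limit_vertex` as well as the starting point in $R_3$ are hypothetical choices.

```python
def gamma(p, shift):
    """Partial derivatives of z(p) = 2 p11 p21 + p12 p22 + p11 + p12 + p21 + p22 - 2.

    shift[i] is the constant added to both derivatives of row i:
    (M, 0) corresponds to the form z_1, (0, M) to the form z_2.
    """
    (p11, p12), (p21, p22) = p
    return [[2 * p21 + 1 + shift[0], p22 + 1 + shift[0]],
            [2 * p11 + 1 + shift[1], p12 + 1 + shift[1]]]


def step(p, shift):
    """One application of the assumed operator F[Gamma]."""
    new_p = []
    for row, g_row in zip(p, gamma(p, shift)):
        s = sum(x * g for x, g in zip(row, g_row))
        new_p.append([x * g / s for x, g in zip(row, g_row)])
    return new_p


def limit_vertex(p, shift, iters=60_000):
    """Iterate F and round the final point to the nearest integer vertex."""
    for _ in range(iters):
        p = step(p, shift)
    return [[round(x) for x in row] for row in p]


M = 5000
start = [[0.6, 0.4], [0.2, 0.8]]          # (p11, p21) = (0.6, 0.2) lies in R_3

limit_z1 = limit_vertex(start, (M, 0))    # form z_1: small steps in p11-direction
limit_z2 = limit_vertex(start, (0, M))    # form z_2: small steps in p21-direction
```

From the same starting point, the run for $z_1$ ends at $p^*$ and the run for $z_2$ ends at $q^*$: only the step lengths differ between the two forms, not the step directions.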


2.4.2 Characterization of the Guaranteed Region of Attraction

We have seen in the introductory example that the regions of attraction of different forms of A-polynomials may vary greatly. However, from Theorem 2.45 it follows that every attractor of an A-polynomial $z$ is also an attractor of any form $\tilde z \in P_F$, $\tilde z \equiv z$. Hence we can define:

Definition 2.48 (Universal Region of Attraction: $URA(p^*)$)
Let $z \in P_F$ and $p^*$ be an attractor of $F$. Then the universal region of attraction URA of $p^*$ is defined by

$URA(p^*) := \bigcap_{\tilde z \in P_F,\ \tilde z \equiv z} RA(\tilde z, p^*),$

where $RA(\tilde z, p^*)$ is the region of attraction of $p^*$ for $\tilde z$, as defined on page 56.

Let $p^* \in \Delta^I$ be a strict local maximum and thus also be an attractor of $F$. We denote by $s_i(p^*)$ for all $i \in N$ that component of $p^*$ which is one, i.e. $p^*_{i s_i(p^*)} = 1$. We will see that we can define a neighborhood $GRA_1(p^*)$ such that for all $p \in GRA_1(p^*)$ the sequence $\{F^t(p)\}$, $t \ge 0$ does not only converge to $p^*$, but even the components $p_{i s_i(p^*)}$ increase monotonically while converging to one. For this reason we first define

$\Delta_{p^*} := \{p \in \Delta \mid \mathrm{supp}(p^*) \subseteq \mathrm{supp}(p)\} = \{p \in \Delta \mid p_{i s_i(p^*)} > 0,\ \forall i \in N\}.$ (2.84)

The definition of $\Delta_{p^*}$ is motivated by the following observation: If a point $p$ lies in a face $D$ of $\Delta$, then all further points of the iteration sequence $\{F^t(p)\}$, $t \ge 0$ also lie in $D$. Hence, a necessary condition for convergence of the sequence $\{F^t(p)\}$ with $p \in D$ to $p^*$ is that $p^* \in D$. The definition of $\Delta_{p^*}$ eliminates exactly those faces of $\Delta$ for which this condition is not fulfilled. This implies that $p^*$ is the only integer vertex in $\Delta_{p^*}$.

As in the introductory example we define some region indicating the step direction, where we will take here only the components $(i, s_i(p^*))$ into account. For this reason we define for each $p^* \in \Delta^I$ the following region of monotone increase

$MI_1(p^*) := \{p \in \Delta_{p^*} \mid \Gamma_{i s_i(p^*)}(z, p) > \Gamma_{ir}(z, p),\ \forall i \in N,\ r \ne s_i(p^*)\}.$ (2.85)

If $F(p)_{ir} = u_{ir} p_{ir}$, then it follows from (2.39) that for points in $MI_1(p^*)$ the components $(i, s_i(p^*))$ do not only increase, but are also multiplied by the largest factor $u_{i s_i(p^*)}$. Hence each point in $\Delta$ is contained in at most one set $MI_1(p^*)$, $p^* \in \Delta^I$.

Moreover, $MI_1(p^*)$ has the following important properties:

• $MI_1(p^*)$ is form-independent (Remark 2.9)

• For $p \in MI_1(p^*)$: $F(p)_{i s_i(p^*)} \ge p_{i s_i(p^*)}$ (by (2.37))

• $p^*$ is strict local maximum $\Leftrightarrow$ $p^* \in MI_1(p^*)$ (Proposition 2.15)

• $p^*$ is strict local maximum $\Rightarrow$ $p^*$ is the only fixed point in $MI_1(p^*)$ (by (2.63))

These properties imply that if $p^*$ is a strict local maximum, then there exists a neighborhood $U_\varepsilon(p^*)$ such that $U_\varepsilon(p^*) \cap \Delta \subseteq MI_1(p^*)$ and therefore for all points in $U_\varepsilon(p^*)$ the components $(i, s_i(p^*))$ increase under $F$.

Example 2.49 (Introductory example continued)
Let us look at some regions $MI_1$ in our introductory example. Since $MI_1$ is form-independent we can work with the objective function $z(p) = 2p_{11}p_{21} + p_{12}p_{22}$. For the strict local maximum $p^* := \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ we get $MI_1(p^*) = \{p \in \Delta_{p^*} \mid 2p_{21} - p_{22} > 0,\ 2p_{11} - p_{12} > 0\}$ and see that $p^* \in MI_1(p^*)$. This region corresponds to $R_2 := (\frac{1}{3}, 1] \times (\frac{1}{3}, 1]$ in Figure 2.8.

However, for $v^* := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, which is not a strict local maximum, we get $MI_1(v^*) = \{p \in \Delta_{v^*} \mid p_{22} - 2p_{21} > 0,\ 2p_{11} - p_{12} > 0\}$ and we observe that $v^* \notin MI_1(v^*)$. This can also be seen in Figure 2.8, where the region corresponding to $MI_1(v^*)$ is the rectangle $R_3 := (\frac{1}{3}, 1) \times (0, \frac{1}{3})$.

In this example for $n = k = m = 2$, we could simply define $GRA_1(p^*) = MI_1(p^*)$, thus getting a form-independent subset of $URA(p^*)$ for any strict local maximum $p^*$. However, in the general case we are still confronted with the problem that a sequence $\{F^t(p)\}$, $t \ge 0$ with $p \in MI_1(p^*)$ does not necessarily stay in $MI_1(p^*)$. Assume a situation as depicted in Figure 2.9.

Figure 2.9: Leaving $MI_1(p^*)$ because of too large step length.

Let $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ be a strict local maximum and let $MI_1(p^*)$ be given by the gray region in Figure 2.9. Though $p^* \in MI_1(p^*)$ and the step direction is fixed (the sequence always goes to the upper right corner), the problem in this example is that there may exist critical points $p \in MI_1(p^*)$ with $p' := F(p) \notin MI_1(p^*)$. A pair of such points is shown in Figure 2.9 by $p$ and $p'$. To avoid these situations, we will define $GRA_1(p^*)$ such that all critical points of $MI_1(p^*)$ are cut off, as shown in Figure 2.9 by the dashed line.

Hence we define for any point $p \in MI_1(p^*)$ a set $Q_1(p, p^*)$ which potentially covers all directions $F(\tilde z, p) - p$ for some form $\tilde z \equiv z$ of a given A-polynomial $z \in P_F$. Let $p^* \in \Delta^I$ with $p^*_{i s_i(p^*)} = 1$ be given and $p \in \Delta$, then we define

$Q_1(p, p^*) = \{q \in \Delta \mid q_{i s_i(p^*)} \ge p_{i s_i(p^*)},\ \forall i \in N\}.$ (2.86)

For any point $p \in MI_1(p^*)$ we know that $p' := F(p)$ fulfills $p'_{i s_i(p^*)} \ge p_{i s_i(p^*)}$ for all $i \in N$. Therefore this definition guarantees that $p' \in Q_1(p, p^*)$.

(a) $n = k = 2$ in the $p_{11}$-$p_{21}$-plane. (b) $n = 1$, $k = 3$.

Figure 2.10: Examples for $Q_1(p, p^*)$.

Note that for Boolean A-polynomials the projection of the set $Q_1(p, p^*)$ to the $p_{i1}$ coordinates for $i \in N$ describes the axis-parallel hyper-cuboid in $[0, 1]^n$ with opposing corners $p$ and $p^*$ (see Figure 2.10).

Using (2.86) we can define $GRA_1(p^*)$ as a subset of $MI_1(p^*)$ which is closed with respect to $Q_1(p, p^*)$.

Definition 2.50 (Guaranteed Region of Attraction: $GRA_1(p^*)$)
Let $z \in P_F$ and $p^* \in \Delta^I$ with $p^*_{i s_i(p^*)} = 1$ for $i \in N$ be an attractor. We define the guaranteed region of attraction $GRA_1(p^*)$ by

$GRA_1(p^*) := \{p \in MI_1(p^*) \mid Q_1(p, p^*) \subseteq MI_1(p^*)\}.$ (2.87)

We see that $GRA_1(p^*)$ is form-independent and we will show now that it is indeed a region of attraction and therefore a subset of $URA(p^*)$.


Theorem 2.51
Let $z \in P_F$ be an A-polynomial, $F := F[\Gamma]$ and $p^* \in \Delta^I$ with $p^*_{i s_i(p^*)} = 1$ be a strict local maximum. Moreover, let $GRA_1(p^*)$ be defined by (2.87). Then for any starting point $p \in GRA_1(p^*)$

$\lim_{t \to \infty} F^t(p) = p^*.$

Proof
Since $p^*$ is a strict local maximum, it lies in $GRA_1(p^*)$ and it follows from Theorem 2.45 that there always exists a neighborhood $U_\varepsilon(p^*)$ such that $U_\varepsilon(p^*) \cap \Delta \subseteq GRA_1(p^*)$ and therefore $GRA_1(p^*)$ is not only the singleton $\{p^*\}$. Furthermore we know by construction that for any $p \in GRA_1(p^*)$ the iteration sequence $\{F^t(p)\}$, $t \ge 0$ stays in $Q_1(p, p^*)$. However, $Q_1(p, p^*)$ is a compact subset of $MI_1(p^*)$ and $p^*$ is the only fixed point in $Q_1(p, p^*)$. Hence, $Q_1(p, p^*)$ fulfills (R1)-(R3) of Lemma 2.44, which proves the theorem.

Though we are still confronted with the same problem as before for $URA(p^*)$, of not being able to characterize $GRA_1(p^*)$ explicitly, we gained by the definition of $GRA_1(p^*)$ some advantage: the polytope $Q_1(p, p^*)$ has a very simple structure and $MI_1(p^*)$ is described by a system of (in general nonlinear) inequalities. Fortunately, this allows us to derive some stronger results for the special case of quadratic A-polynomials.

2.4.3 The Special Case of Quadratic A-Polynomials

If $z$ is a quadratic A-polynomial, then its partial derivatives $\Gamma_{ir}(p)$ are linear functions and $MI_1(p^*)$ is described by the intersection of open halfspaces with $\Delta$. We will see in this section that the closure of $GRA_1(p^*)$ is a polytope for which we will present an explicit characterization by its hyperplanes. Thus we get a stopping criterion, since the test whether a given point in $\Delta$ belongs to $GRA_1(p^*)$ or not can now be carried out efficiently.

In the second part of this section we construct in a similar way as already done for $GRA_1(p^*)$ another form-independent subset, $GRA_2^c(p^*)$, of the universal region of attraction of $p^*$. As before $GRA_2^c(p^*)$ will be a bounded polyhedral set and we will be able to derive its defining hyperplanes.

Throughout this section we work with quadratic A-polynomials $z$ and we assume that $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for all $i \in N$ is a strict local maximum and therefore an attractor. Moreover, we denote by $e^{ir}$ the $(i, r)$-th unit vector of length $n \times k$.


Construction of $GRA_1(p^*)$

Let us denote by $L$ the index set of all open halfspaces defining $MI_1(p^*)$ of (2.85), i.e. for $l = (i, r)$

$h_l(p) := \Gamma_{i1}(p) - \Gamma_{ir}(p)$ (2.88)

such that

$MI_1(p^*) = \{p \in \Delta_{p^*} \mid h_l(p) > 0,\ \forall l \in L\}.$ (2.89)

Moreover, for any $l \in L$ we write $h_l(p)$ as

$h_l(p) = \sum_{i=1}^{n} \sum_{r=1}^{k} a_{ir} p_{ir} + b,$ (2.90)

where we skip the index $l$ for the coefficients $a_{ir}$ and $b$ for reasons of easier writing. Furthermore, we assume in accordance with (2.89) that $h_l(p^*) > 0$, which determines implicitly the signs of the coefficients $a_{ir}$ and $b$ in (2.90).

In order to describe $GRA_1(p^*)$ explicitly, we will characterize the extreme points $p^U$ of $Q_1(p, p^*)$ and determine the vertex $p^{U_l}$ for which $h_l(p^{U})$ attains its minimum over $Q_1(p, p^*)$.

Proposition 2.52
Let $z$ be a quadratic A-polynomial and let $p^* \in \Delta^I$ be an attractor. Moreover, let for $l \in L$, $h_l(p)$ be given by (2.88) and (2.90) and

$s_l(p) := \sum_{i=1}^{n} \left[ a_{i1} p_{i1} + (1 - p_{i1}) \min_{r > 1} a_{ir} \right] + b.$ (2.91)

Then

$GRA_1(p^*) = \{p \in \Delta_{p^*} \mid s_l(p) > 0,\ \forall l \in L\}.$

Proof
First note that since $h_l(p)$ is a linear function for which the minimum over $Q_1(p, p^*)$ has to be determined, we can separate $h_l(p)$ over the simplices $i$ and write $h_l(p) = \sum_{i=1}^{n} h_l^i(p) + b$. Thus we only need to compute the minima over the functions $h_l^i(p)$ and therefore we let $i \in N$ be fixed for the moment and work only over one simplex $\Delta_i$.

Let $p \in MI_1(p^*)$ be given. In order to compute the extreme points of $Q_1(p, p^*)$ we intersect $\Delta_i$ with the hyperplane $q_{i1} = p_{i1}$ (see Figure 2.12(a) on page 68). We observe that any extreme point $p^U \ne p^*$ of $Q_1(p, p^*)$ has exactly two non-zero components: $p^U_{i1} = p_{i1}$ and exactly one of the other components $p^U_{ir} = 1 - p_{i1}$ ($r > 1$ fixed). Hence, now again working on $\Delta$, we associate with each set $U$ of the form $\{(i, r_i) \mid i = 1, \dots, n,\ r_i \in K \setminus \{1\}\}$ an extreme point $p^U$ of $Q_1(p, p^*)$ by

$p^U_{ir} = \begin{cases} p_{i1}, & \text{if } r = 1 \\ 1 - p_{i1}, & \text{if } (i, r) \in U \\ 0, & \text{otherwise} \end{cases}$ (2.92)

whereby all $n(k - 1)$ extreme points of $Q_1(p, p^*)$ other than $p^*$ are characterized.

Now we will determine the extreme point $p^{U_l}$ for which $h_l(p^U)$ attains its minimum on $Q_1(p, p^*)$, i.e. $p^{U_l} := \arg\min_{p^U} h_l(p^U)$. Since we have $h_l(p^U) = \sum_{i=1}^{n} a_{i1} p_{i1} + \sum_{(i,r) \in U} a_{ir}(1 - p_{i1}) + b$ it follows immediately that

$(i, r_i) \in U_l \Leftrightarrow r_i = \min \arg\min_{s > 1} a_{is}, \quad i = 1, \dots, n.$

If $h_l(p^{U_l}) > 0$ for all $l \in L$, then $h_l(p^U) > 0$ for any arbitrary extreme point $p^U$ and therefore $Q_1(p, p^*) \subseteq MI_1(p^*)$. This proves the proposition, since $s_l(p) = h_l(p^{U_l})$ for all $l \in L$. ∎

Figure 2.11 shows two examples for $MI_1(p^*)$ and $GRA_1(p^*)$. The drawings illustrate how exactly those points of the region $MI_1(p^*)$ are cut off by $Q_1(p, p^*)$, for which too large a step length of $F$ would have led out of $MI_1(p^*)$.

(a) Case 1: $Q_1(p, p^*)$ cuts off points in both directions. (b) Case 2: $Q_1(p, p^*)$ cuts off points in one direction.

Figure 2.11: Two examples of sets $MI_1(p^*)$ and $GRA_1(p^*)$.

In the next subsection we describe another form-independent set $GRA_2^c(p^*)$ which is constructed in a similar way as $GRA_1(p^*)$ and which defines different subsets of the universal region of attraction, depending on the value of $c$.


Construction of $GRA_2^c(p^*)$

In the definition of $MI_1(p^*)$ we have required an increase of the first components $p_{i1}$, but we have not had any further restrictions on the other components. Now we will define a new region $MI_2^c(p^*)$ as the set of points with increasing first components and decreasing $p_{ir}$ components for all $i \in N$, $r > 1$.

Similarly to (2.86), these properties should hold for all points in a set $Q_2(p, p^*)$. Hence we define

$Q_2(p, p^*) = \{q \in \Delta \mid q_{i1} \ge p_{i1},\ q_{ir} \le p_{ir}\ \forall i \in N,\ r > 1\} = \{p + \mathrm{Cone}\{e^{i1} - e^{ir} \mid \forall i \in N,\ r > 1\}\} \cap \Delta.$ (2.93)

Note that for any $p \in \Delta$, $Q_2(p, p^*) \subseteq Q_1(p, p^*)$ (see Figure 2.12).

(a) Any extreme point $p^U \ne p^*$ has always exactly $(k - 2)$ zero-components and $p^U_1 = p_1$.

(b) Extreme point $p^U$ may have any number of zero-components, at all positions but $p_1$. Their $p$-values are all added to $p_1$.

Figure 2.12: Extreme points $p^U$ of $Q_1(p, p^*)$ and $Q_2(p, p^*)$ for $n = 1$, $k = 3$ with $p = (p_1, p_2, p_3)$.

In order to preserve these properties for all points of $\{F^t(p)\}$, $t \ge 0$, we require that the following inequalities hold:

$\Gamma_{i1}(p) > S_i(p) \quad \forall i \in N$ (2.94)

$\Gamma_{ir}(p) < S_i(p) \quad \forall i \in N,\ r > 1.$ (2.95)

We observe that these inequalities are not linear, even for quadratic A-polynomials. For this reason we derive subsequently some linear estimates for (2.94) and (2.95) which will then be used to define $MI_2^c(p^*)$.

We know already from (2.85) that $\Gamma_{i1}(p) > \Gamma_{ir}(p)$ for all $i \in N$, $r > 1$ implies (2.94). Moreover, we have

$S_i = \sum_{s=1}^{k} p_{is}\Gamma_{is} = p_{i1}\Gamma_{i1} + \sum_{s=2}^{k} p_{is}\Gamma_{is} \ge p_{i1}\Gamma_{i1} + (1 - p_{i1}) \min_{s > 1} \Gamma_{is}.$

Hence (2.95) is implied by the condition

$p_{i1}\Gamma_{i1} + (1 - p_{i1}) \min_{s > 1} \Gamma_{is} > \Gamma_{ir} \quad \forall i \in N,\ r > 1.$ (2.96)

Since $\Gamma_{i1} > \Gamma_{ir}$ for all $i \in N$, $r > 1$ and (2.96) is linear in $p_{i1}$, we can choose $c_i \in (0, 1]$, $i \in N$ and get thus for $p_{i1} \ge c_i$ the following linear inequality:

$p_{i1}\Gamma_{i1} + (1 - p_{i1}) \min_{s > 1} \Gamma_{is} \ge c_i\Gamma_{i1} + (1 - c_i) \min_{s > 1} \Gamma_{is} > \Gamma_{ir} \quad \forall i \in N,\ r > 1.$ (2.97)

Moreover, assume that $p^*$ satisfies (2.97), then we get

$c_i > \max_{r > 1} \left\{ \frac{\Gamma_{ir}(p^*) - \min_{s > 1} \Gamma_{is}(p^*)}{\Gamma_{i1}(p^*) - \min_{s > 1} \Gamma_{is}(p^*)} \right\} =: c_i^*, \quad \forall i \in N.$ (2.98)

Now we can define for quadratic A-polynomials another subset of $URA(p^*)$:

Definition 2.53 ($MI_2^c(p^*)$, $GRA_2^c(p^*)$)
Let $p^* \in \Delta^I$ be a strict local maximum and $c^*$ be defined by (2.98). Then we define for any $c \in \mathbb{R}^n$ with $c_i^* < c_i < 1$ and

$p_{i1} \ge c_i \quad \forall i \in N$ (2.99)

$c_i\Gamma_{i1} + (1 - c_i)\Gamma_{is} > \Gamma_{ir} \quad \forall i \in N,\ r, s \in K \setminus \{1\},\ r \ne s$ (2.100)

$\Gamma_{i1} > \Gamma_{ir} \quad \forall i \in N,\ r > 1$ (2.101)

the set $MI_2^c(p^*)$ by

$MI_2^c(p^*) := \{p \in \Delta \mid (2.99), (2.100), (2.101)\}$ (2.102)

and a corresponding subset of the universal region of attraction by

$GRA_2^c(p^*) = \{p \in MI_2^c(p^*) \mid Q_2(p, p^*) \subseteq MI_2^c(p^*)\}.$ (2.103)

We observe that by (2.98) $c_i > 0$ for all $i \in N$ and therefore (2.99) guarantees that $p \in \Delta_{p^*} = \{p \in \Delta \mid p_{i1} > 0,\ \forall i \in N\}$. $c_i^*$ in (2.98) is for any strict local maximum $p^*$ strictly smaller than one and hence we can always choose $c_i < 1$.

In the Boolean case it follows from (2.98) that $c_i^* = 0$ for all $i \in N$ and since (2.100) does not apply (since $k = 2 < 3$) we have for all $c \in (0, 1)^n$: $MI_2^c(p^*) \subseteq MI_1(p^*)$ as defined in (2.85). Moreover, $Q_2(p, p^*) = Q_1(p, p^*)$, because if $p_{i1}$ increases then $p_{i2}$ must decrease, hence for quadratic, Boolean A-polynomials $GRA_2^c(p^*) \subseteq GRA_1(p^*)$ for all $c \in (0, 1)^n$.

Subsequently we will assume that $c$ with $c_i^* < c_i < 1$ for all $i \in N$ is fixed and we construct the hyperplanes defining the set $GRA_2^c(p^*)$. Towards this end, let us use the same notation as before: Let $L$ be the index set of all halfspaces in (2.102) defining $MI_2^c(p^*)$ and for any $l \in L$, $h_l(p) = \sum_{i=1}^{n} \sum_{r=1}^{k} a_{ir} p_{ir} + b$ such that

$MI_2^c(p^*) = \{p \in \Delta_{p^*} \mid h_l(p) > 0,\ \forall l \in L\}.$ (2.104)

Moreover, we define for $h_l$, $l \in L$ an index set

$I_l := \{(i, r) \in N \times K \mid a_{i1} - a_{ir} < 0\}$ (2.105)

and a linear function $s_l(p)$ by

$s_l(p) := \sum_{(i,r) \notin I_l} a_{ir} p_{ir} + \sum_{(i,r) \in I_l} a_{i1} p_{ir} + b.$ (2.106)

Using these definitions we will show in Proposition 2.55 that

$GRA_2^c(p^*) = \{p \in \Delta_{p^*} \mid s_l(p) > 0,\ \forall l \in L\}.$

Geometrically interpreted, $I_l$ corresponds exactly to those directions for which the angle between the normal vector of $h_l$ (pointing into $MI_2^c(p^*)$) and $e^{i1} - e^{ir}$ (defining $Q_2(p, p^*)$) is larger than 90°. These directions are the critical ones, in which the iteration sequence might leave $MI_2^c(p^*)$ if the step length is too large. In order to avoid such situations we adapt the coefficients in $h_l(p)$ for these directions, resulting in the definition of the linear functions $s_l(p)$ given by (2.106).

Example 2.54
We see in Figure 2.13 that on the one hand we have no critical directions for $h_1$ and therefore it need not be replaced (we set $s_1 := h_1$). On the other hand, for $h_2$ the positive $e^1$ direction is critical and there exist points $p \in MI_2^c(p^*)$ and $\lambda > 0$ such that $\tilde p := p + \lambda e^1 \notin MI_2^c(p^*)$. By replacing $h_2$ by $s_2$ we cut off all those critical points such that $MI_2^c(p^*)$ is reduced until finally only the desired set $GRA_2^c(p^*)$ remains.

Figure 2.13: Construction of the guaranteed region of attraction $GRA_2^c(p^*)$ (in the figure, $h_1 = s_1$).


The proof of the following proposition is based on the same idea as already used in Proposition 2.52 for $GRA_1(p^*)$. We compute the extreme points of $Q_2(p, p^*)$ and then derive for each linear function $h_l(p)$ a condition which guarantees that $Q_2(p, p^*) \subseteq MI_2^c(p^*)$. It will turn out that this results exactly in the definition of the functions $s_l(p)$.

Proposition 2.55
Let $z$ be a quadratic A-polynomial and $p^*$ be an attractor. Moreover, for a fixed $c$, let $MI_2^c(p^*)$ be given by (2.104) and $s_l(p)$ be defined by (2.106), then

$GRA_2^c(p^*) = \{p \in \Delta_{p^*} \mid s_l(p) > 0,\ \forall l \in L\}.$ (2.107)

Proof
Let $p \in MI_2^c(p^*)$ be given. Since $Q_2(p, p^*)$ is an affine linear image of a hypercube, every extreme point $p^U$ of $Q_2(p, p^*)$ is uniquely characterized by its zero-components (see also Figure 2.12(b)). Hence, we associate with each subset $U$ of $N \times (K \setminus \{1\})$ an extreme point $p^U$ of $Q_2(p, p^*)$

$p^U_{ir} = \begin{cases} p_{i1} + \sum_{(i,s) \in U} p_{is}, & \text{if } r = 1 \\ 0, & \text{if } (i, r) \in U \\ p_{ir}, & \text{otherwise} \end{cases}$

whereby all extreme points of $Q_2(p, p^*)$ are characterized. For every $U$ and any fixed $l \in L$ we get

$h_l(p^U) = \sum_{i=1}^{n} a_{i1} p_{i1} + \sum_{(i,r) \in U} a_{i1} p_{ir} + \sum_{(i,r) \notin U,\ r > 1} a_{ir} p_{ir} + b = \sum_{(i,r) \in U} a_{i1} p_{ir} + \sum_{(i,r) \notin U} a_{ir} p_{ir} + b.$ (2.108)

Now, let us determine the extreme point $p^{U_l}$ for which (2.108) attains its minimum on $Q_2(p, p^*)$, i.e. $p^{U_l} := \arg\min_{p^U} h_l(p^U)$. Since $p_{ir}$ is fixed in (2.108), it follows immediately that

$(i, r) \in U_l \Leftrightarrow a_{ir} > a_{i1}.$

Now we see that $U_l = I_l$, the set of critical directions for $h_l$ as defined in (2.105), and therefore $h_l(p^{U_l}) = s_l(p)$ and (2.107) follows. ∎

We have already seen that in the Boolean case $GRA_2^c(p^*) \subseteq GRA_1(p^*)$ for all $c \in (0, 1)^n$. Finally we show with an example that depending on the choice of $c$ there may also exist points $p \in GRA_2^c(p^*)$ which do not lie in $GRA_1(p^*)$ and hence neither region is always contained in the other.


Example 2.56
Let $n = 2$, $k = 3$ and let the following A-polynomial $z$ be given:

$z(p) = 3p_{11}p_{21} + 2p_{11}p_{23} + p_{12}p_{22} + 2p_{11} + 2p_{12}p_{23} + p_{13}.$

We see that $z$ has a strict local maximum at $p^* = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}$. Moreover, its partial derivatives are given by

$\Gamma(p) = \begin{pmatrix} 3p_{21} + 2p_{23} + 2 & p_{22} + 2p_{23} & 1 \\ 3p_{11} & p_{12} & 2p_{11} + 2p_{12} \end{pmatrix}.$

Let us first compute $GRA_1(p^*)$ as described on page 66:

halfspace                          $h_l$ for $MI_1(p^*)$ as in (2.89)    $s_l$ as in (2.91)                 bound
$\Gamma_{11} - \Gamma_{12} > 0$    $3p_{21} - p_{22} + 2 > 0$            $3p_{21} - (1 - p_{21}) + 2 > 0$   $p_{21} > -\frac{1}{4}$
$\Gamma_{11} - \Gamma_{13} > 0$    $3p_{21} + 2p_{23} + 1 > 0$           $3p_{21} + 1 > 0$                  $p_{21} > -\frac{1}{3}$
$\Gamma_{21} - \Gamma_{22} > 0$    $3p_{11} - p_{12} > 0$                $3p_{11} - (1 - p_{11}) > 0$       $p_{11} > \frac{1}{4}$
$\Gamma_{21} - \Gamma_{23} > 0$    $p_{11} - 2p_{12} > 0$                $p_{11} - 2(1 - p_{11}) > 0$       $p_{11} > \frac{2}{3}$

Since $p \in \Delta_{p^*}$ is equivalent to $p_{i1} > 0$, $i \in N$ we get

$GRA_1(p^*) = \{p \in \Delta \mid p_{11} > \frac{2}{3},\ p_{21} > 0\}.$

In order to compute $GRA_2^c(p^*)$, we first have to determine the bounds $c_i^*$ for $c_i$, $i \in N$ according to (2.98). We have $\Gamma(p^*) = \begin{pmatrix} 5 & 0 & 1 \\ 3 & 0 & 2 \end{pmatrix}$ and therefore $c_1^* = \frac{1}{5}$ and $c_2^* = \frac{2}{3}$.

The inequalities $h_l(p) > 0$ characterizing $MI_2^c(p^*)$ which correspond to (2.101) are the same as for $GRA_1(p^*)$. However, since in all cases the first coefficient is the largest we get by (2.105) $I_l = \emptyset$ and therefore $s_l(p) = h_l(p)$ for $l = 1, \dots, 4$.

For the other inequalities in $MI_2^c(p^*)$, corresponding to (2.100), we get:

$l$    halfspace                                                $h_l$ for $MI_2^c(p^*)$ as in (2.104)
5      $c_1\Gamma_{11} + (1 - c_1)\Gamma_{12} > \Gamma_{13}$    $3c_1 p_{21} + (1 - c_1)p_{22} + 2p_{23} + 2c_1 - 1 > 0$
6      $c_1\Gamma_{11} + (1 - c_1)\Gamma_{13} > \Gamma_{12}$    $3c_1 p_{21} - p_{22} + 2(c_1 - 1)p_{23} + 1 + c_1 > 0$
7      $c_2\Gamma_{21} + (1 - c_2)\Gamma_{22} > \Gamma_{23}$    $(3c_2 - 2)p_{11} - (1 + c_2)p_{12} > 0$
8      $c_2\Gamma_{21} + (1 - c_2)\Gamma_{23} > \Gamma_{22}$    $(2 + c_2)p_{11} + (1 - 2c_2)p_{12} > 0$

Choosing $c_1 := 0.5 > c_1^*$ and $c_2 := 0.9 > c_2^*$, we guarantee that the first coefficient of the linear functions $h_6$, $h_7$ and $h_8$ is always the largest and thus $s_l(p) = h_l(p)$ for $l = 6, 7, 8$. For $l = 5$ we get

$h_5(p) = \frac{3}{2}p_{21} + \frac{1}{2}p_{22} + 2p_{23}$ and $s_5(p) = \frac{3}{2}p_{21} + \frac{1}{2}p_{22} + \frac{3}{2}p_{23}.$

Choosing

$p = \begin{pmatrix} 0.6 & 0.1 & 0.3 \\ 0.9 & 0.05 & 0.05 \end{pmatrix}$

it can be verified that $s_l(p) > 0$ for $l = 1, \dots, 8$ and since $p$ also fulfills (2.99) it lies for $c = (0.5, 0.9)$ in $GRA_2^c(p^*)$. However, since $p_{11} = 0.6 < \frac{2}{3}$, the point $p$ does not lie in $GRA_1(p^*)$.
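The membership tests of Example 2.56 are simple enough to be checked mechanically. The following Python sketch evaluates the functions $s_l$ from (2.91) and (2.106) for the point $p$ of the example, assuming as in this section that $p^*_{i1} = 1$ for all $i$. The coefficient lists are transcribed from the two tables of the example; the function and variable names (`s1`, `s2`, `h1_to_4`, `h5_to_8`) are illustrative and not part of the thesis.

```python
def s1(a, b, p):
    """s_l(p) as in (2.91): the mass 1 - p_i1 of each row gets its worst coefficient."""
    return b + sum(a_i[0] * p_i[0] + (1 - p_i[0]) * min(a_i[1:])
                   for a_i, p_i in zip(a, p))


def s2(a, b, p):
    """s_l(p) as in (2.106): every coefficient a_ir > a_i1 is replaced by a_i1."""
    return b + sum(min(a_ir, a_i[0]) * p_ir
                   for a_i, p_i in zip(a, p)
                   for a_ir, p_ir in zip(a_i, p_i))


Z = [0, 0, 0]
# h_1, ..., h_4 from (2.101), each stored as (coefficient matrix a, constant b).
h1_to_4 = [([Z, [3, -1, 0]], 2), ([Z, [3, 0, 2]], 1),
           ([[3, -1, 0], Z], 0), ([[1, -2, 0], Z], 0)]


def h5_to_8(c1, c2):
    """h_5, ..., h_8 from (2.100) for given c = (c1, c2)."""
    return [([Z, [3 * c1, 1 - c1, 2]], 2 * c1 - 1),
            ([Z, [3 * c1, -1, 2 * (c1 - 1)]], 1 + c1),
            ([[3 * c2 - 2, -(1 + c2), 0], Z], 0),
            ([[2 + c2, 1 - 2 * c2, 0], Z], 0)]


p = [[0.6, 0.1, 0.3], [0.9, 0.05, 0.05]]
c = (0.5, 0.9)

# GRA_1: p in Delta_{p*} and all s_l of (2.91) positive.
in_gra1 = (all(row[0] > 0 for row in p)
           and all(s1(a, b, p) > 0 for a, b in h1_to_4))
# GRA_2^c: condition (2.99) and all s_l of (2.106) positive.
in_gra2 = (all(row[0] >= c_i for row, c_i in zip(p, c))
           and all(s2(a, b, p) > 0 for a, b in h1_to_4 + h5_to_8(*c)))
```

With these inputs `in_gra2` is true and `in_gra1` is false: the fourth function of (2.91) evaluates to $0.6 - 2 \cdot 0.4 < 0$, which is the condition $p_{11} > \frac{2}{3}$ failing.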

2.5 A Region of Attraction for $F[\xi]$

In the previous section we have dealt with the problem of finding and characterizing a subset of the universal region of attraction for operator $F[\Gamma]$. Now we will investigate the regions of attraction in a more general setting for operators $F[\xi]$. Again we will be able to define a subset of the region of attraction of $p^*$ for $F[\xi]$, called $PRA(z, p^*)$. This region is described by a set of inequalities and can therefore also be used as a stopping criterion in FPH. At the end of this section we compare for A-polynomials and operator $F[\Gamma]$ the regions $PRA(z, p^*)$ and $GRA_1(p^*)$.

Proposition 2.57
For all $i \in N$, $r \in K$ let $\xi_{ir}: \mathbb{R}^{nk} \to \mathbb{R}$ be polynomials with positive coefficients and constant, $F := F[\xi]$ as in (2.5), $p^* \in \Delta^I$ with $p^*_{i s_i(p^*)} = 1$ and $p^0 \in \Delta^0$ with $p^0_{i s_i(p^*)} > p^0_{ir}$ for all $i \in N$, $r \ne s_i(p^*)$ be given. Moreover let for $i \in N$, $r \in K$, $\xi_{ir} = \xi'_{ir} + \xi''_{ir}$ where $\xi'_{ir}$ consists of the constant and those monomials in $\xi_{ir}$ for which all variables have second index equal to $s_i(p^*)$. Then a sufficient condition for convergence of the sequence $\{F^t(p^0)\}$, $t \ge 0$ to $p^*$ is given by

$\xi'_{i s_i(p^*)}(p^0) > \xi'_{ir}(p^*) + \xi''_{ir}(q) \quad \forall i \in N,\ r \ne s_i(p^*)$ (2.109)

where for $i \in N$: $q_{i s_i(p^*)} = 1$ and $q_{ir} = 1 - p^0_{i s_i(p^*)}$ for all $r \ne s_i(p^*)$.

Proof
W.l.o.g. we assume $s_i(p^*) = 1$ for all $i \in N$, i.e. $p^*_{i1} = 1$ and $p^0_{i1} > p^0_{ir}$ for all $r > 1$. Moreover, note that $\xi_{ir}(p^*) = \xi'_{ir}(p^*)$ for all $i \in N$, $r \in K$. Throughout this proof we will write $p^t := F^t(p^0)$.

Furthermore we use the following two inequalities. For all $i \in N$, $r \in K$

$\xi'_{ir}(p^*) \ge \xi'_{ir}(p) \quad \forall p \in \Delta^0$ (2.110)

$\xi''_{ir}(q) \ge \xi''_{ir}(p') \quad \forall p' \in \Delta^0$ with $p'_{i1} \ge p^0_{i1}.$ (2.111)

(2.110) follows directly from the fact that $\xi'_{ir}(p)$ contains only variables with second index equal to 1 and therefore it attains its maximum over $\Delta$ in $p^*$. (2.111) is true, because $q_{i1} = 1$ is always larger than $p'_{i1}$ and for all $p' \in \Delta^0$ with $p'_{i1} \ge p^0_{i1}$:

$q_{ir} = 1 - p^0_{i1} = \sum_{r > 1} p^0_{ir} \ge \sum_{r > 1} p'_{ir} \ge p'_{ir} \quad \forall i \in N,\ r > 1.$ (2.112)


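Finally we will use the following property which holds for all $i \in N$ and $p \in \Delta^0$:

$\xi_{i1}(p) > \xi_{ir}(p)\ \forall r > 1 \quad \Rightarrow \quad F(p)_{i1} = \frac{\xi_{i1}(p)\, p_{i1}}{\sum_{s} \xi_{is}(p)\, p_{is}} > \frac{\xi_{i1}(p)\, p_{i1}}{\xi_{i1}(p) \sum_{s} p_{is}} = p_{i1}.$ (2.113)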

To prove that (2.109) is a sufficient condition for the convergence we show by induction over $t$ that for all $i \in N$

$\xi_{i1}(p^t) > \xi_{ir}(p^t) \quad \forall r > 1,\ t \ge 0$ (2.114)

$p^{t-1}_{i1} < p^t_{i1} \quad \forall t \ge 1.$ (2.115)

To show the induction basis we need the following two bounds for all $i \in N$

$\xi'_{i1}(p^0) \le \xi_{i1}(p^0)$ (2.116)

$\xi'_{ir}(p^*) + \xi''_{ir}(q) \ge \xi'_{ir}(p^0) + \xi''_{ir}(p^0) = \xi_{ir}(p^0) \quad \forall r > 1.$ (2.117)

The lower bound (2.116) holds, because $\xi''_{i1}(p^0) \ge 0$. The upper bound (2.117) follows directly by summation of (2.110) for $p^0$ and (2.111).

Combining the two bounds (2.116) and (2.117) with the stopping criterion (2.109) we get the induction basis (2.114) for $t = 0$:

$\xi_{i1}(p^0) \ge \xi'_{i1}(p^0) > \xi'_{ir}(p^*) + \xi''_{ir}(q) \ge \xi_{ir}(p^0) \quad \forall i \in N,\ r > 1.$ (2.118)

By definition of $F$ this implies on the one hand that $p^1_{i1} > p^1_{ir}$ for all $i \in N$, $r > 1$ and on the other hand it follows from (2.113) that $p^1_{i1} > p^0_{i1}$.

In the induction step we show that under the assumption that (2.114) and (2.115) hold for $t' \le t$, they also hold for $t + 1$. From (2.114) and (2.113) it follows immediately that (2.115) is fulfilled for $t + 1$, i.e. $p^t_{i1} < p^{t+1}_{i1}$. It remains to show that the following two bounds hold:

$\xi'_{i1}(p^0) \le \xi_{i1}(p^{t+1}), \quad \forall i \in N,$ (2.119)

$\xi'_{ir}(p^*) + \xi''_{ir}(q) \ge \xi_{ir}(p^{t+1}), \quad \forall i \in N,\ r > 1.$ (2.120)

The lower bound (2.119) follows from

$\xi'_{i1}(p^0) \le \xi'_{i1}(p^1) \le \dots \le \xi'_{i1}(p^{t+1}) \le \xi_{i1}(p^{t+1}) \quad \forall i \in N$

because $\xi'_{i1}(p^{t+1})$ contains only variables with second index equal to 1, (2.115) for $t' \le t$ and $\xi''_{i1}(p^{t+1}) \ge 0$. The upper bound (2.120) is a direct consequence of (2.110) and (2.111).

Using (2.109), (2.119) and (2.120) we have for all $i \in N$

$\xi_{i1}(p^{t+1}) \ge \xi'_{i1}(p^0) > \xi'_{ir}(p^*) + \xi''_{ir}(q) \ge \xi_{ir}(p^{t+1}) \quad \forall r > 1$ (2.121)

which proves (2.114) for $t + 1$ and completes the induction step.

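To guarantee convergence of the sequence $\{p^t\}$, $t \ge 0$ we will show that for some $\delta > 0$ either the quotient $p^{t+1}_{i1} / p^t_{i1}$ is bounded from below by a constant $K_i(\delta) > 1$ or $p^t_{i1} \ge 1 - \delta$ for all $i \in N$. Let $\delta > 0$ be given and let us assume that there exists $i \in N$ such that for some $t$, $p^t_{i1} < p^*_{i1} - \delta = 1 - \delta$. Moreover, let $\xi_{i1}(p) \le M_i$ for all $p \in \Delta$ and

$\varepsilon_i := \min_{r > 1} \left( \xi'_{i1}(p^0) - \xi'_{ir}(p^*) - \xi''_{ir}(q) \right).$

We see that by (2.109) $\varepsilon_i > 0$ and that (2.121) implies

$\xi_{i1}(p^t) \ge \xi_{ir}(p^t) + \varepsilon_i \quad \forall r > 1.$ (2.122)

Using (2.122) it follows that the step length for $i$ is bounded:

$\frac{p^{t+1}_{i1}}{p^t_{i1}} = \frac{\xi_{i1}(p^t)}{\sum_{s=1}^{k} p^t_{is}\,\xi_{is}(p^t)} \ge \frac{\xi_{i1}(p^t)}{p^t_{i1}\,\xi_{i1}(p^t) + (1 - p^t_{i1})(\xi_{i1}(p^t) - \varepsilon_i)} = \frac{\xi_{i1}(p^t)}{\xi_{i1}(p^t) - \varepsilon_i + \varepsilon_i p^t_{i1}} \ge \frac{M_i}{M_i - \varepsilon_i + \varepsilon_i(1 - \delta)} = \frac{M_i}{M_i - \varepsilon_i\delta} =: K_i(\delta) > 1$

which concludes the proof of the proposition. ∎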

If for some starting point $p^0 \in \Delta^0$ the sequence $\{F^t(p^0)\}$, $t \ge 0$ converges to $p^* \in \Delta^I$ and furthermore $\xi_{i1}(p^*) > \xi_{ir}(p^*)$ for all $i \in N$ and $r > 1$, simple continuity arguments show that there exists $t \ge 0$ and $p = F^t(p^0)$ for which (2.109) is fulfilled. Then $p^*$ is an attractor and (2.109) describes a subset of $RA(z, p^*)$ which we denote by

$PRA(z, p^*) := \{p \in \Delta \mid p_{i1} > p_{ir},\ \xi'_{i1}(z, p) > \xi'_{ir}(z, p^*) + \xi''_{ir}(z, q),\ \forall i \in N,\ r > 1\}.$

Thus we have a criterion which guarantees convergence and therefore could be used as part of a stopping criterion in an implementation. Since the conditions on $\xi_{ir}$ in Proposition 2.57 are rather weak, $PRA(z, p^*)$ could be used in a much more general context than only for SAPs with operator $F[\Gamma]$. On the other hand, we have defined in the previous section for SAPs the guaranteed region of attraction $GRA_1(p^*)$. Now we will briefly investigate the relationship between these two regions of attraction.

Let $z$ be an A-polynomial with positive coefficients and $F := F[\Gamma]$. Moreover, let $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for all $i \in N$ be a strict local maximum. Then

$PRA(z, p^*) = \{p \in \Delta \mid p_{i1} > p_{ir},\ \Gamma'_{i1}(z, p) > \Gamma'_{ir}(z, p^*) + \Gamma''_{ir}(z, q),\ \forall i \in N,\ r > 1\}$ (2.123)

where for all $i \in N$, $q_{i1} = 1$, $q_{ir} = 1 - p_{i1}$ for all $r > 1$ and $\Gamma_{ir}(z, p) = \Gamma'_{ir}(z, p) + \Gamma''_{ir}(z, p)$ is the decomposition as in Proposition 2.57.


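The following example shows that $PRA(z, p^*)$ is, in contrast to $GRA_1(p^*)$, not form-independent.

Example 2.58
Let $z_1, z_2$ with $z_1 \equiv z_2$ be the following two equivalent quadratic, Boolean A-polynomials with positive $\Gamma_{ir}$-coefficients and let $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$.

• $z_1(p) = 2p_{11}p_{21} + p_{12}p_{22}$, and we get for its partial derivatives:

$\Gamma(z_1, p) = \begin{pmatrix} 2p_{21} & p_{22} \\ 2p_{11} & p_{12} \end{pmatrix}, \quad \Gamma'(z_1, p) = \begin{pmatrix} 2p_{21} & 0 \\ 2p_{11} & 0 \end{pmatrix}, \quad \Gamma''(z_1, p) = \begin{pmatrix} 0 & p_{22} \\ 0 & p_{12} \end{pmatrix}.$

Computing $PRA(z_1, p^*)$ according to (2.123) and taking into account that $p_{i1} > p_{i2}$ for all $i \in N$, we get:

$i = 1$: $2p_{21} > 0 + q_{22} \Rightarrow p_{21} > \frac{1}{3}$
$i = 2$: $2p_{11} > 0 + q_{12} \Rightarrow p_{11} > \frac{1}{3}$

$\Rightarrow PRA(z_1, p^*) = \{p \in \Delta \mid p_{11} > \frac{1}{2},\ p_{21} > \frac{1}{2}\}.$

• $z_2(p) = 5p_{11}p_{21} + 3p_{11}p_{22} + p_{12}p_{22} + 3p_{12} - 3$, and we have

$\Gamma(z_2, p) = \begin{pmatrix} 5p_{21} + 3p_{22} & p_{22} + 3 \\ 5p_{11} & 3p_{11} + p_{12} \end{pmatrix}$, with

$\Gamma'(z_2, p) = \begin{pmatrix} 5p_{21} & 3 \\ 5p_{11} & 3p_{11} \end{pmatrix}, \quad \Gamma''(z_2, p) = \begin{pmatrix} 3p_{22} & p_{22} \\ 0 & p_{12} \end{pmatrix}.$

Again we can compute $PRA(z_2, p^*)$ and get

$i = 1$: $5p_{21} > 3 + q_{22} \Rightarrow p_{21} > \frac{2}{3}$
$i = 2$: $5p_{11} > 3 + q_{12} \Rightarrow p_{11} > \frac{2}{3}$

$\Rightarrow PRA(z_2, p^*) = \{p \in \Delta \mid p_{11} > \frac{2}{3},\ p_{21} > \frac{2}{3}\}.$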

We see that though $z_1 \equiv z_2$, both functions have different regions PRA, which therefore depend on the form of $z$. Comparing these regions to $GRA_1(p^*) = MI_1(p^*) = \{p \in \Delta \mid p_{11} > \frac{1}{3},\ p_{21} > \frac{1}{3}\}$ we observe that in this example $PRA(z_1, p^*)$ and $PRA(z_2, p^*)$ are subsets of $GRA_1(p^*)$.
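The two regions of Example 2.58 can also be compared numerically. The sketch below evaluates condition (2.123) for the Boolean case $n = k = 2$ with $p^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$; the decompositions $\Gamma'$ and $\Gamma''$ are transcribed from the example, while the function names and the sample points are illustrative choices.

```python
def in_pra(p11, p21, gamma_prime, gamma_dprime):
    """Test condition (2.123) for n = k = 2 and p* = ((1, 0), (1, 0))."""
    p = ((p11, 1 - p11), (p21, 1 - p21))
    p_star = ((1, 0), (1, 0))
    q = ((1, 1 - p11), (1, 1 - p21))      # q_i1 = 1, q_i2 = 1 - p_i1
    gp, gp_star, gpp_q = gamma_prime(p), gamma_prime(p_star), gamma_dprime(q)
    return all(p[i][0] > p[i][1] and gp[i][0] > gp_star[i][1] + gpp_q[i][1]
               for i in range(2))


# Form z_1(p) = 2 p11 p21 + p12 p22.
def z1_prime(p):
    return ((2 * p[1][0], 0), (2 * p[0][0], 0))

def z1_dprime(p):
    return ((0, p[1][1]), (0, p[0][1]))


# Form z_2(p) = 5 p11 p21 + 3 p11 p22 + p12 p22 + 3 p12 - 3.
def z2_prime(p):
    return ((5 * p[1][0], 3), (5 * p[0][0], 3 * p[0][0]))

def z2_dprime(p):
    return ((3 * p[1][1], p[1][1]), (0, p[0][1]))


point_a = (0.6, 0.6)    # (p11, p21) between 1/2 and 2/3
point_b = (0.7, 0.7)    # (p11, p21) above 2/3
point_c = (0.45, 0.9)   # inside GRA_1 (both above 1/3) but p11 < 1/2

a_in_z1 = in_pra(*point_a, z1_prime, z1_dprime)
a_in_z2 = in_pra(*point_a, z2_prime, z2_dprime)
b_in_z2 = in_pra(*point_b, z2_prime, z2_dprime)
c_in_z1 = in_pra(*point_c, z1_prime, z1_dprime)
```

The point $(0.6, 0.6)$ passes the test for $z_1$ but not for $z_2$, which is the form dependence stated in the example; the point $(0.45, 0.9)$ lies in $GRA_1(p^*)$ but fails the test for $z_1$, so the inclusion in $GRA_1(p^*)$ is proper here.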

Since on the one hand $PRA(z, p^*)$ depends on the form of $z$ and on the other hand, in general, $PRA(z, p^*) \ne GRA_1(p^*)$, now the question arises whether $PRA(\tilde z, p^*) \subseteq GRA_1(p^*)$ for all $\tilde z \equiv z$. An answer to this question is given by the following corollary.


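Corollary 2.59
Let $z \in P_F$ be an A-polynomial and $F := F[\Gamma]$. Then for every strict local maximum $p^* \in \Delta^I$

$PRA(\tilde z, p^*) \subseteq GRA_1(p^*)$

for all forms $\tilde z \in P_F$ with $\tilde z \equiv z$.

Proof
Let $p \in PRA(z, p^*)$. If we assume w.l.o.g. that $p^*_{i1} = 1$ for all $i \in N$ then we have $\Gamma'_{i1}(p) > \Gamma'_{ir}(p^*) + \Gamma''_{ir}(q)$ for all $i \in N$, $r > 1$. It follows from (2.118) that in this case $\Gamma_{i1}(p) > \Gamma_{ir}(p)$ for all $i \in N$, $r > 1$ and therefore $p \in MI_1(p^*)$. Now, let $\tilde p \in Q_1(p, p^*)$. Then $\tilde p_{i1} \ge p_{i1}$ for all $i \in N$ and by (2.110) and (2.111) it follows that $\Gamma'_{ir}(p^*) + \Gamma''_{ir}(q) \ge \Gamma_{ir}(\tilde p)$. On the other hand we have $\Gamma'_{i1}(\tilde p) \ge \Gamma'_{i1}(p)$. Thus we get

$\Gamma_{i1}(\tilde p) \ge \Gamma'_{i1}(\tilde p) \ge \Gamma'_{i1}(p) > \Gamma'_{ir}(p^*) + \Gamma''_{ir}(q) \ge \Gamma_{ir}(\tilde p)$

which implies that $\tilde p \in MI_1(p^*)$ and therefore $Q_1(p, p^*) \subseteq MI_1(p^*)$. Moreover we have for $p \in PRA(z, p^*)$: $p_{i1} > 0$ for all $i \in N$ and thus $p \in \Delta_{p^*}$. Now the form-independence of $GRA_1(p^*)$ implies immediately that $PRA(\tilde z, p^*) \subseteq GRA_1(p^*)$ for all $\tilde z \in P_F$ with $\tilde z \equiv z$. ∎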

2.6 Computational Experiments with the Max-Cut Problem

The Max-Cut problem was already discussed in Section 1.3 on page 12: Given a graph $G = (N, E)$ and edge weights $w_{ij}$ for $[i,j] \in E$, the Max-Cut problem consists in finding

$$\max_{S \subseteq N} \sum_{[i,j] \in E,\ i \in S,\ j \notin S} w_{ij}.$$

We know already that if all edge weights $w_{ij}$, $[i,j] \in E$, are positive, then this problem can be formulated as a quadratic SAP. In this case we define $K := \{1, 2\}$ with the interpretation $p_{i1} = 1, p_{i2} = 0$ if node $i$ belongs to $S$ and $p_{i1} = 0, p_{i2} = 1$ if $i \in N \setminus S$. The objective function can be written as the following A-polynomial

$$z(p) = \sum_{[i,j] \in E} w_{ij}\,(p_{i1}p_{j2} + p_{j1}p_{i2}).$$

To solve the Max-Cut problem we used FPH as described on page 16. In more detail, the implemented heuristic runs as follows:

1. Choose $m$ starting points with components randomly drawn from the interval [0.4995, 0.5005].

2. For each starting point iterate $F[\xi]$ until one of the following three stopping criteria is fulfilled:

(a) The current point lies in the guaranteed region of attraction $GRA_1(p^*)$ for some $p^* \in \Delta^I$.

(b) For all $i \in N$ the largest component of the current point $p$ satisfies $\max\{p_{i1}, p_{i2}\} > 1 - 10^{-5}$.

(c) The number of iterations reaches a given iteration limit.

In (2b) and (2c) the components of the final point $p$ of the iteration sequence are rounded to $p^*_\nu \in \Delta^I$ for $\nu = 1, \dots, m$.

3. Return as solution $p^* := p^*_{\nu^*}$ with $z(p^*_{\nu^*}) = \max\{z(p^*_\nu) \mid \nu \in \{1, \dots, m\}\}$.
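The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used for the experiments: the names `cut_value` and `fph_maxcut` are hypothetical, the fitness is taken as $\Gamma^\alpha$ with $\Gamma_{i1} = \sum_j w_{ij}p_{j2}$ and $\Gamma_{i2} = \sum_j w_{ij}p_{j1}$, all components are updated simultaneously, and the region-of-attraction test (2a) is left out, so only criteria (2b) and (2c) stop the iteration.

```python
import random


def cut_value(weights, side):
    """Total weight of the edges whose endpoints lie on different sides."""
    n = len(weights)
    return sum(weights[i][j]
               for i in range(n) for j in range(i + 1, n)
               if side[i] != side[j])


def fph_maxcut(weights, m=10, alpha=0.5, it_limit=1000, tol=1e-5, seed=0):
    """Multi-start fixed-point heuristic with fitness Gamma**alpha (sketch).

    weights: symmetric n x n matrix, zero diagonal, non-negative entries.
    Only p_i1 is stored per node, because p_i2 = 1 - p_i1 for K = {1, 2}.
    """
    rng = random.Random(seed)
    n = len(weights)
    best_side, best_val = None, float("-inf")
    for _ in range(m):
        # Step 1: starting point close to the barycenter.
        p = [rng.uniform(0.4995, 0.5005) for _ in range(n)]
        for _ in range(it_limit):          # criterion (2c)
            new_p = []
            for i in range(n):
                # Gamma_i1 = sum_j w_ij p_j2, Gamma_i2 = sum_j w_ij p_j1
                g1 = sum(weights[i][j] * (1.0 - p[j]) for j in range(n))
                g2 = sum(weights[i][j] * p[j] for j in range(n))
                a = p[i] * g1 ** alpha
                b = (1.0 - p[i]) * g2 ** alpha
                # Guard (assumption): keep the old value if both terms vanish.
                new_p.append(a / (a + b) if a + b > 0.0 else p[i])
            p = new_p
            # Criterion (2b): every row is almost integral.
            if all(max(x, 1.0 - x) > 1.0 - tol for x in p):
                break
        side = [1 if x >= 0.5 else 0 for x in p]   # rounding step
        val = cut_value(weights, side)
        if val > best_val:                          # step 3: keep the best run
            best_side, best_val = side, val
    return best_side, best_val
```

Storing only $p_{i1}$ keeps the sketch short; for $k > 2$ a full $n \times k$ matrix would be needed.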

We first tested the algorithm for Max-Cut instances on randomly generated graphs with $n = 50$ nodes, edge probability of 0.2 and edge weights $w_{ij}$, $[i,j] \in E$, uniformly distributed in the interval [1, 100]. These problems can be solved optimally using an interior point algorithm in a branch and bound framework and therefore the optima are known [Bur94, HRVW96].

In our tests we set the iteration limit to 1000 and used as fitness function $\xi := \Gamma^\alpha$ with $\alpha \in \{0.5, 1\}$. Experiments have shown that $F[\Gamma^{0.5}]$ gave much better results than $\alpha = 1$, so we present only these results here.

[Plots not reproduced: panels (a) Problem 1 and (b) Problem 2; vertical axis in percent, horizontal axis from 10 to 50.]

Figure 2.14: Iterations vs. quality: solid=min, dashed=avg.

We first consider the qualitative behavior of the algorithm depending on the number of starting points $m$. We solved ten instances, each 20 times (= 20 runs), for $m = 5, 10, \dots, 50$ starting points. For each run we generated a list of 50 starting points, where the run with $m$ starting points uses the first $m$ generated points of this list. Figure 2.14 shows for two typical instances how the relative error with respect to the optimum of the minimum (solid line) and average (dashed line) of the 20 runs decreases as the number of starting points (depicted along the x-axis) increases. The maximum of the 20 runs is not shown, since for all $m$ the optimum of the Max-Cut problem was found.

Of course, the quality increases with the number of starting points; however, already with 10 starting points the minimum over all runs gave results within 2% of the optimum. Therefore we have chosen $m = 10$ for the following tests.

Table 2.1 shows the results for $\alpha = 0.5$, $m = 10$ for 10 instances P1, ..., P10. Again each instance is solved 20 times (= 20 runs). The first column ('worst') of the table gives the relative error of the worst solution out of the 20 runs, the second column ('average') shows the average of the relative error of the 20 runs and the last column ('#opt') how often the optimum was found. The last two rows give the average and the worst case over the 10 instances for each column. One run took about 5 seconds on a Pentium Pro 200 processor.

              worst    average   #opt
P1            1.20%    0.12%     18
P2            1.27%    0.54%     8
P3            1.59%    0.36%     6
P4            0.60%    0.08%     14
P5            0.26%    0.05%     17
P6            1.29%    0.44%     8
P7            1.08%    0.35%     9
P8            0.68%    0.12%     13
P9            0.00%    0.00%     20
P10           0.94%    0.36%     8
average       0.89%    0.24%     12.1
worst case    1.59%    0.54%     6

Table 2.1: Results for 10 Max-Cut instances.

We see that the results of Figure 2.14 are confirmed: in all runs we were at most

2% off the optimum and on the average less than 0.25%. Moreover, in all cases the

optimum has been found in these 20 runs.

Stopping criterion (2a) tests whether the sequence of generated points has reached the guaranteed region of attraction of a strict local maximum. In order to reduce computing time, the algorithm checks only every 10th iteration whether the current point already lies in the guaranteed region of attraction of a point $p^* \in \Delta^I$. Moreover, this test is only carried out if the sequence is 'close' to an integer point, where 'close' means that for all $i \in N$ the largest component satisfies $\max\{p_{i1}, p_{i2}\} > 0.8$. These region of attraction tests are time consuming; however, they sometimes make it possible to determine quite early and with certainty to which local maximum the algorithm converges. Compared to the case where criterion (2a) is omitted, the region of attraction criterion gives a total speed-up of the algorithm of about 80%. Hence this criterion has not only the advantage of being a 'clean' stopping criterion, where we know precisely to which point the algorithm converges, but it also has a practical running time advantage.

In further experiments, we tried to construct a new fitness function $\xi$ for the operator $F[\xi]$, with the goal of minimizing the regions of attraction of already visited points in $\Delta^I$. Assume that we have already explored the first $m'$ starting points leading to solutions $p^1, \dots, p^{m'} \in \Delta^I$. We try to influence the behavior of the operator $F[\xi]$ by changing the functions $\xi_{ir}$ for all $i \in N, r \in K$ in order to get a smaller region of attraction for $p^1, \dots, p^{m'}$. The idea is the following: Assume that (after some iterations) the algorithm is in a point $q$ and we do not want to visit $p^1$ again. Now, if for an $i \in N$ we have $p^1_{i1} = 1$ and $\Gamma_{i1}(q) > \Gamma_{i2}(q)$ then the first component will increase, i.e. $F(q)_{i1} > q_{i1}$, and potentially we are approaching $p^1$. To minimize this chance we want to make only a small step in this direction, which can be achieved by taking for example $\xi_{ir} = (\Gamma_{ir} + K_i)^\alpha$, where the larger the constant $K_i$ is chosen, the smaller the step becomes. For a given $i \in N$, any $p^\nu$, $\nu \in \{1, \dots, m'\}$, for which the $i$-th component of $F(q)$ gets closer to $p^\nu$ will contribute to $K_i$, and the nearer $q$ is to $p^\nu$, the larger is this contribution.

Formally, let $p^1, \dots, p^{m'}$ be already visited points in $\Delta^I$ and let $i \in N$ be fixed. Define for a point $q \in \Delta$

$$I_{i1}(q) := \{\nu \in \{1, \dots, m'\} \mid \Gamma_{i1}(q) > \Gamma_{i2}(q) \text{ and } p^\nu_{i1} = 1\}$$
$$I_{i2}(q) := \{\nu \in \{1, \dots, m'\} \mid \Gamma_{i2}(q) > \Gamma_{i1}(q) \text{ and } p^\nu_{i2} = 1\}.$$

In order to define a growth transformation $F[\xi]$, we require that the fitness function $\xi = (\xi_{ir})$ satisfies the assumptions of Theorem 2.26. We define $\xi_{ir} = G_i(\Gamma_{ir})$, $i \in N, r \in K$, where the $G_i$ are strictly increasing, concave functions, by

$$\xi_{ir}(q) = \big(\Gamma_{ir}(q) + 2|I_{i1}(q)|\,q_{i1} + 2|I_{i2}(q)|\,q_{i2}\big)^\alpha \quad \forall i \in N, r \in K$$

and call FPH with this new fitness function 'point repulsion' (PR). We solved the same 10 instances with operator $F[\xi]$ as defined above and used the same parameters as before. Again each instance was solved 20 times. In general, the global maximum was found more often with PR than before; however, the worst and the average values of the 20 runs were about the same in both versions. Since the computing time for PR is about four times higher than without it, it is more profitable (for a fixed amount of computing time) to work with more starting points without PR, instead of using PR.

Finally we solved larger instances of Max-Cut problems using the test set proposed in [Bur94]. These graphs are chosen from the following three different data sets:

Data set A(prob): Unweighted graphs with edge probability prob.

Data set B(cmax): Complete graphs with random integer edge weights in [0, cmax].

Data set C(cmax): Planar graphs with random integer edge weights in [1, cmax].

Note that large problems with more than 100 nodes often cannot be solved optimally. For this reason we compared the results found by FPH with strong upper bounds derived by a semi-definite programming relaxation [Bur94, HRVW96]. This comparison is summarized in Table 2.2. The first two columns specify the problem type and size, whereas the next three give the minimum, average and maximum objective value obtained by FPH when solving each instance from 20 different starting points. The column entitled 'bound' gives the value of the best upper bound found for the instance, where a star beside the number indicates a proven optimal value. The next column shows either the percentage of the gap between the best FPH solution and the upper bound or, if the optimum is known and was also found by FPH, how often it was found (number in brackets). Finally, the last column gives the time (minutes and seconds) needed by FPH to solve one Max-Cut instance for one starting point on a Sun Ultra Sparc workstation.

type      nodes   min      avg      max      bound     gap/opt   time
A(0.1)    100     335      338      341      341*      (1)       0:11
A(0.25)   100     776      779      781      782       0.13      0:09
A(0.5)    100     1420     1424     1427     1427*     (2)       0:09
B(10)     100     13577    13591    13602    13608*    0.04      0:09
C(1)      100     196      196      196      196*      (20)      0:10
C(10)     100     1106     1114     1122     1122*     (2)       0:08

A(0.5)    150     3114     3120     3127     3135      0.26      0:21
A(0.5)    200     5491     5500     5511     5543      0.58      0:37
A(0.5)    250     8450     8462     8471     8523      0.61      0:58
A(0.5)    300     12138    12165    12193    12262     0.56      1:24
A(0.5)    400     21298    21327    21369    21498     0.60      2:31
A(0.5)    500     33055    33086    33122    33343     0.66      3:56

Table 2.2: Results for large Max-Cut instances.
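The gap column of Table 2.2 can be reproduced from the 'max' and 'bound' columns. The short sketch below assumes that the gap is measured relative to the upper bound, which is the reading that matches the tabulated values; the function name `gap_percent` is hypothetical.

```python
def gap_percent(best_value, upper_bound):
    """Gap between the best FPH value and the upper bound, in percent of the bound."""
    return 100.0 * (upper_bound - best_value) / upper_bound


# (max, bound) pairs taken from Table 2.2 for the rows that report a gap.
rows = {
    "A(0.25), 100": (781, 782),
    "B(10), 100": (13602, 13608),
    "A(0.5), 150": (3127, 3135),
    "A(0.5), 300": (12193, 12262),
    "A(0.5), 500": (33122, 33343),
}
gaps = {name: round(gap_percent(best, bound), 2)
        for name, (best, bound) in rows.items()}
```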

Using branch and bound techniques we could prove that all upper bounds in Table 2.2 are at most 0.5% off the optimum. Moreover, we see that all but one of the instances with 100 nodes could be solved optimally within the 20 runs, provided the optimum was known. Even for the larger instances with $n = 150, \dots, 500$ nodes the results of FPH always lie within 0.7% of the upper bound. Together with the given tolerance of 0.5% of the upper bounds, these results verify once more that even for large Max-Cut problems the SAP model together with FPH is a successful approach to this hard combinatorial optimization problem.

Chapter 3

The Constrained Semi-Assignment Problem

3.1 Introduction and Algorithm for C-SAP

In contrast to the previous chapter, we now discuss an extension of the SAP with added constraints. These constraints are given in the form of a set $\mathcal{R}$, the so-called forbidden partial assignments, which must not be satisfied (see Definition 1.1).

Let us keep our notation of the previous chapters and denote the decision variables by $x_i$, $i \in N = \{1, \dots, n\}$, and the set of values to be assigned by $K = \{1, \dots, k\}$. We recall from Chapter 1 that the C-SAP is defined as follows:

Let $(n, k, \mathcal{T}, w)$ be a SAP instance and let additionally a set of forbidden partial assignments $\mathcal{R}$ be given. Then the C-SAP $(n, k, \mathcal{R}, \mathcal{T}, w)$ is given by

$$\max\ z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) \tag{3.1}$$
$$\text{s.t.}\quad p \in \Delta^I \tag{3.2}$$
$$\prod_{(i,r) \in R} p_{ir} = 0 \quad \forall R \in \mathcal{R}. \tag{3.3}$$

Here, again, we denote by

$$\Delta^I = \Big\{ p \in \{0,1\}^{n \times k} \ \Big|\ \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \Big\}$$

the set of all possible assignments and call the relaxed problem, where $\Delta^I$ in (3.2) is replaced by $\Delta = \mathrm{Conv}(\Delta^I)$, the relaxed constrained semi-assignment problem.
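For an integer assignment, (3.1) and (3.3) reduce to simple membership tests. The following sketch illustrates this under assumptions that are not part of the text: indices are 0-based, an assignment is a list mapping each variable $i$ to its value $r$, a partial assignment is a tuple of $(i, r)$ pairs, and the function names are hypothetical.

```python
def objective(assignment, weighted_partials):
    """z(p) for an integer assignment: sum of w_T over all T contained in it."""
    return sum(w for T, w in weighted_partials
               if all(assignment[i] == r for i, r in T))


def violated(assignment, forbidden):
    """Number of forbidden partial assignments R fully contained in the assignment."""
    return sum(1 for R in forbidden
               if all(assignment[i] == r for i, r in R))


# Toy instance with n = k = 2: one weighted partial assignment, two forbidden ones.
c = 7.0
partials = [(((0, 0), (1, 0)), c)]
forbidden = [((0, 0), (1, 1)), ((0, 1), (1, 0))]
```

An assignment is feasible exactly when `violated` returns 0.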

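Our approach to the C-SAP again uses the operator $F[\xi]$ of Definition 1.7 given by the mapping $F[\xi] : \Delta^0 \to \Delta^0$, where for all $i \in N, r \in K$

$$F[\xi](p)_{ir} = \frac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)}. \tag{3.4}$$

However, in contrast to the SAP we will now define a new fitness function, taking into account the newly added constraints in $\mathcal{R}$. Choosing such a new fitness function, we combine the effects of the gradient-type dynamics and the repellor dynamics. Hence, we recall from Section 1.4 the definitions of the corresponding fitness functions $\Gamma = (\Gamma_{ir})$ and $\Theta = (\Theta_{ir})$:

$$\Gamma_{ir}(z,p) = \frac{\partial z}{\partial p_{ir}} = \sum_{T \in \mathcal{T}:\,(i,r) \in T} w_T \prod_{(j,s) \in T \setminus (i,r)} p_{js} \quad \forall i \in N, r \in K \tag{3.5}$$

$$\Theta_{ir}(p) = \prod_{R \in \mathcal{R}:\,(i,r) \in R} \Big( 1 - \prod_{(j,s) \in R \setminus (i,r)} p_{js} \Big) \quad \forall i \in N, r \in K. \tag{3.6}$$

Again we will write $\Gamma_{ir}(p)$ instead of $\Gamma_{ir}(z,p)$ if only one objective function $z$ is considered. Moreover, we will assume that $z$ is always given such that $\Gamma$ is a fitness function on $\Delta^0$. Furthermore, we observe that for all $p \in \Delta^0$, $\Theta_{ir}(p) \in (0,1]$, because $|R| \ge 2$ for any forbidden partial assignment $R \in \mathcal{R}$. Hence, $\Theta$ is a fitness function on $\Delta^0$, too, and $F[\Theta]$ is a continuous mapping.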

The repellor dynamics was introduced in Cochand [Coc93] as an approach to the G-Max-Sat problem. It has the property that if $p$ is close to an integer vertex $p^* \in \Delta^I$ which violates some constraint $R \in \mathcal{R}$, then for $(i,r) \in R$ the fitness $\Theta_{ir}(p)$ is, as desired, very small and thus contributes to making $p_{ir}$ smaller. On the other hand, if $p$ lies in a neighborhood of a feasible point, then $\Theta_{ir}(p) \approx 1$ and thus it hardly influences the computation of the new value for $p_{ir}$.

Now we define the combined dynamics $F[\Gamma^\alpha\Theta^\beta]$ for $\alpha, \beta > 0$ by

$$F[\Gamma^\alpha\Theta^\beta](p)_{ir} = \frac{p_{ir}\,\Gamma_{ir}(p)^\alpha\,\Theta_{ir}(p)^\beta}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(p)^\alpha\,\Theta_{is}(p)^\beta} \quad \forall i \in N, r \in K. \tag{3.7}$$

A good performance of FPH using $F[\Gamma^\alpha\Theta^\beta]$ depends on a good choice of the exponents $\alpha, \beta > 0$. With the help of these two parameters we can control the influence of one or the other dynamics. Intuitively, a large value of $\alpha$ leads to assignments with large objective values; unfortunately, the larger $\alpha$ becomes, the more constraints may be unsatisfied. On the other hand, a large exponent $\beta$ helps to find feasible assignments, but only too often with small objective values.
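One simultaneous step of the combined dynamics (3.7) can be sketched as follows. The data layout is an assumption made for illustration only (0-based indices, `p` as a list of rows, partial assignments as tuples of `(i, r)` pairs) and the function names are hypothetical; the instance used in the check is the one of Example 3.7, for which $\Gamma$ is positive everywhere.

```python
from math import prod


def gamma(p, partials, i, r):
    """Gamma_ir(p) as in (3.5): partial derivative of z with respect to p_ir."""
    return sum(w * prod(p[j][s] for (j, s) in T if (j, s) != (i, r))
               for T, w in partials if (i, r) in T)


def theta(p, forbidden, i, r):
    """Theta_ir(p) as in (3.6); equals 1 if no forbidden R contains (i, r)."""
    return prod(1.0 - prod(p[j][s] for (j, s) in R if (j, s) != (i, r))
                for R in forbidden if (i, r) in R)


def combined_step(p, partials, forbidden, alpha, beta):
    """One application of F[Gamma^alpha Theta^beta], all rows updated at once."""
    new_p = []
    for i, row in enumerate(p):
        vals = [row[r]
                * gamma(p, partials, i, r) ** alpha
                * theta(p, forbidden, i, r) ** beta
                for r in range(len(row))]
        total = sum(vals)  # assumed positive, i.e. the fitness is positive at p
        new_p.append([v / total for v in vals])
    return new_p


# Instance of Example 3.7 with 0-based indices.
partials = [(((0, 1),), 1.0), (((1, 1),), 1.0),
            (((0, 0),), 5000.0), (((1, 0),), 5000.0)]
forbidden = [((0, 0), (1, 0))]
x = 0.174
q = combined_step([[x, 1 - x], [x, 1 - x]], partials, forbidden, 1.0, 4.0)
```

Each row of the result sums to one by construction, so the step maps the interior of $\Delta$ into itself as long as the normalizing sum is positive.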

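Note that $\Theta$ defined by (3.6) is normally not a fitness function on $\Delta$, because on the boundary some functions $\Theta_{ir}(p)$ may become zero. However, even for points $p \in \Delta^0$, $\Theta_{ir}(p)$ may become very small in the neighborhood of an infeasible assignment. For this reason we will restrict the domain of all the operators involving $\Theta$ as fitness function to

$$\Delta_\varepsilon := \Big\{ p \in [\varepsilon, 1 - (k-1)\varepsilon]^{n \times k} \ \Big|\ \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \Big\} \tag{3.8}$$

where an appropriate choice for $\varepsilon$ will be discussed later. The range of the operator must be $\Delta_\varepsilon$ as well and for this reason we define $\Psi_\varepsilon$ and $F_\varepsilon$ as follows.

Definition 3.1 ($\Psi_\varepsilon(p)$, $F_\varepsilon[\xi]$ and Round($p$))
Let $\varepsilon$ with $0 < \varepsilon < \tfrac{1}{k}$ be fixed, $p \in \Delta$ and $r(i)$ for all $i \in N$ be the smallest index such that $p_{ir(i)} \ge p_{is}$ for all $s \in K$.

1. We define a mapping $\Psi_\varepsilon : \Delta \to \Delta_\varepsilon$. Let

$$\bar p_{ir} := \begin{cases} \varepsilon, & \text{if } p_{ir} < \varepsilon \\ \min\{p_{ir}, 1 - (k-1)\varepsilon\}, & \text{otherwise.} \end{cases}$$

Moreover, let $0 \le \rho \le 1$ be defined by

$$\rho := \frac{1 - \bar p_{ir(i)} - (k-1)\varepsilon}{\sum_{r \ne r(i)} \bar p_{ir} - (k-1)\varepsilon} \tag{3.9}$$

if the denominator in (3.9) does not vanish, and $\rho := 0$ otherwise. Then

$$\Psi_\varepsilon(p)_{ir} := \begin{cases} \bar p_{ir}, & \text{if } r = r(i) \\ \varepsilon + \rho(\bar p_{ir} - \varepsilon), & \text{otherwise.} \end{cases} \tag{3.10}$$

2. Furthermore, let a fitness function $\xi$ on $\Delta^0$ and the operator $F[\xi]$ as in (3.4) be given. Then we define a new operator $F_\varepsilon : \Delta^0 \to \Delta_\varepsilon$ by

$$F_\varepsilon[\xi] := \Psi_\varepsilon \circ F[\xi] = \Psi_\varepsilon(F[\xi]). \tag{3.11}$$

3. We define a function Round: $\Delta \to \Delta^I$ which rounds points in the interior to integer vertices in $\Delta^I$ by

$$\mathrm{Round}(p)_{ir} := \begin{cases} 1, & \text{if } r = r(i) \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in N. \tag{3.12}$$

Note that the definition of $\Psi_\varepsilon$ guarantees that the smallest component of $\Psi_\varepsilon(p)$ is at least $\varepsilon$ and $r(i)$ remains the smallest index for which $p_{ir(i)} \ge p_{is}$ for all $s \in K$.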
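The mappings of Definition 3.1 translate almost literally into code. The sketch below is an illustration only: the function names are hypothetical, indices are 0-based, and a small tolerance replaces the exact test whether the denominator of (3.9) vanishes, which is an assumption made to cope with floating-point rounding.

```python
def psi_eps(p, eps):
    """Map each row of p into [eps, 1 - (k-1)*eps] while keeping the row sum 1."""
    k = len(p[0])
    cap = 1.0 - (k - 1) * eps
    out = []
    for row in p:
        # Smallest index of a largest component, r(i) in the text.
        r_max = max(range(k), key=lambda r: row[r])
        bar = [eps if x < eps else min(x, cap) for x in row]
        denom = sum(bar[r] for r in range(k) if r != r_max) - (k - 1) * eps
        # Tolerance instead of an exact zero test (assumption of this sketch).
        rho = (1.0 - bar[r_max] - (k - 1) * eps) / denom if denom > 1e-12 else 0.0
        out.append([bar[r] if r == r_max else eps + rho * (bar[r] - eps)
                    for r in range(k)])
    return out


def round_point(p):
    """Round(p): put a 1 at the smallest index of a largest component of each row."""
    k = len(p[0])
    result = []
    for row in p:
        r_max = max(range(k), key=lambda r: row[r])
        result.append([1 if r == r_max else 0 for r in range(k)])
    return result
```

Because only the non-maximal components are rescaled by $\rho$, the position of the largest component, and hence Round, is unaffected by `psi_eps`.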


3.2 Properties of the Operator for C-SAP

Example 1.3 on page 5 has shown a C-SAP instance which was characterized by a large, nearly constant plateau with only a single point with a large objective value. As already discussed, local search heuristics have difficulty orienting themselves in such solution spaces due to their local view. However, FPH additionally uses global information in its iterative process and therefore has an advantage over such methods.

The following proposition proves that if the objective value of a maximum of the C-SAP is large enough, then it will be found by FPH.

Proposition 3.2

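Let $(n, k, \mathcal{R}, \mathcal{T}, w)$ be a C-SAP instance, $F := F[\Gamma^\alpha\Theta^\beta]$ with $\alpha > 0$, $\beta > 0$ and, for a given $\varepsilon > 0$, $F_\varepsilon := \Psi_\varepsilon(F)$. Moreover let $M := \sum_{T \in \mathcal{T}} w_T$, let $p^* \in \Delta^I$ be a feasible point with $p^*_{ir(i)} = 1$ for all $i \in N$ and let $\delta > \varepsilon$. If $(T', w_{T'})$ is a weighted partial assignment with $T' := \{(i, r(i)) \mid i \in N\}$ and a weight $w_{T'}$ satisfying condition (3.13), and $(\mathcal{T}', w') := (\mathcal{T}, w) \cup (T', w_{T'})$, then for this extended instance $(n, k, \mathcal{R}, \mathcal{T}', w')$ the sequence $\{F_\varepsilon^t(p^0)\}$, $t \ge 0$, converges to $\Psi_\varepsilon(p^*)$ for any $p^0 \in \Delta^0$ with $p^0_{ir(i)} > \delta$ for all $i \in N$.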

Proof

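Observe that $p^*$ is a strict local maximum of the SAP $(n, k, \mathcal{T}', w')$, since $w_{T'} > w_T$ for all $T \in \mathcal{T}$.

Moreover, let $z$ be the objective function using the partial assignments in $\mathcal{T}$ and $z'$ the one corresponding to $\mathcal{T}'$. W.l.o.g. we assume that $p^*_{i1} = 1$ for all $i \in N$. Furthermore, let $p \in \Delta^0$ with $p_{i1} \ge \delta$ for all $i \in N$ and $p' := F[\Gamma(z',p)^\alpha\Theta^\beta](p)$, then we can write

$$p'_{ir} = \frac{\Gamma_{ir}(z',p)^\alpha\,\Theta_{ir}(p)^\beta\,p_{ir}}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(z',p)^\alpha\,\Theta_{is}(p)^\beta} = u_{ir}\,p_{ir}, \quad\text{where}\quad u_{ir} := \frac{\Gamma_{ir}(z',p)^\alpha\,\Theta_{ir}(p)^\beta}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(z',p)^\alpha\,\Theta_{is}(p)^\beta}.$$

We will show subsequently that $u_{ir} \le \tfrac{1}{2}$ for all $i \in N$, $r \in K \setminus \{1\}$.

To prove this claim we use the following three inequalities for $i \in N$:

• $\Gamma_{ir}(z',p) \le M$ for all $i \in N$, $r \in K \setminus \{1\}$, which holds due to (3.5) and the fact that $p_{ir} \le 1$.

• Since $\Gamma(z,p)$ is a fitness function and $\mathcal{T}$ and $\mathcal{T}'$ differ only by the partial assignment $T' = \{(1,1), (2,1), \dots, (n,1)\}$, we get

$$\Gamma_{i1}(z',p) = \sum_{T \in \mathcal{T}':\,(i,1) \in T} w_T \prod_{(j,s) \in T \setminus (i,1)} p_{js} = \Gamma_{i1}(z,p) + w_{T'} \prod_{j \ne i} p_{j1} \ge w_{T'}\,\delta^{n-1},$$

where the last inequality follows from $p_{j1} \ge \delta$ for all $j \in N$.

• Finally we derive the following lower bound for $\Theta_{i1}(p)$:

$$\Theta_{i1}(p) = \prod_{R \in \mathcal{R}:\,(i,1) \in R} \Big( 1 - \prod_{(j,s) \in R \setminus (i,1)} p_{js} \Big) \ge \delta^{|\mathcal{R}|}. \tag{3.14}$$

We show that the expression in brackets in (3.14) is larger than or equal to $\delta$: $p^*$ is a feasible point and $|R| \ge 2$, hence $(i,1) \in R$ implies that there exists $(j',s') \in R$ with $s' > 1$ and therefore the second product in (3.14) is not empty. Hence, $1 - \prod_{(j,s) \in R \setminus (i,1)} p_{js} \ge 1 - p_{j's'} \ge \delta$, because $p_{j'1} \ge \delta$ implies that $p_{j's'} \le \sum_{s \ne 1} p_{j's} \le 1 - \delta$, and (3.14) follows.

Combining these inequalities for $i \in N$, $r > 1$ and using (3.13) we get

$$u_{ir} = \frac{\Gamma_{ir}(z',p)^\alpha\,\Theta_{ir}(p)^\beta}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(z',p)^\alpha\,\Theta_{is}(p)^\beta} \le \frac{M^\alpha}{p_{i1}\,\Gamma_{i1}(z',p)^\alpha\,\Theta_{i1}(p)^\beta} \le \frac{M^\alpha}{w_{T'}^\alpha\,\delta\,\delta^{\alpha(n-1)}\,\delta^{|\mathcal{R}|\beta}} \le \frac{M^\alpha}{2M^\alpha} = \frac{1}{2}. \tag{3.15}$$

If $p'_{ir} > \varepsilon$, then it follows from (3.10) (since $\rho \le 1$) that $\Psi_\varepsilon(p')_{ir} \le p'_{ir}$. Thus on the one hand we have for $r > 1$ with $p'_{ir} > \varepsilon$: $F_\varepsilon(p)_{ir} = \Psi_\varepsilon(p')_{ir} \le p'_{ir} \le \tfrac{1}{2}p_{ir}$. On the other hand we get for $r > 1$ with $p'_{ir} \le \varepsilon$: $F_\varepsilon(p)_{ir} = \Psi_\varepsilon(p')_{ir} = \varepsilon$. It follows that the first components $F_\varepsilon^t(p)_{i1}$ are increasing and therefore $F_\varepsilon^t(p^0)_{i1} \ge \delta$ for all $i \in N$, $t \ge 0$. This ensures that the inequalities (3.15) hold in all points of the iteration sequence and $\lim_{t \to \infty} F_\varepsilon^t(p^0) = \Psi_\varepsilon(p^*)$. □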

We have just seen that if the objective value of a feasible point $p^* \in \Delta^I$ is large enough, then all trajectories started within a $\delta$-boundary, for a fixed $\delta > 0$, converge to it. Responsible for this nice behavior is the factor $\Gamma$ of the gradient-type dynamics. Using the bound (3.14) we can prove the following proposition about attractors:

Proposition 3.3

Let $(n, k, \mathcal{R}, \mathcal{T}, w)$ be a C-SAP instance and $F := F[\Gamma^\alpha\Theta^\beta]$ with $\alpha > 0$, $\beta > 0$. Then every strict local maximum $p^*$ of the C-SAP is an attractor of $F$.

Proof

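W.l.o.g. let $p^*_{i1} = 1$ for all $i \in N$. Since $p^*$ is a strict local maximum we have $\Gamma_{i1}(p^*) > \Gamma_{ir}(p^*)$ for all $i \in N, r > 1$ and it follows from continuity that there exists an $\varepsilon$-neighborhood $U_\varepsilon(p^*)$ such that

$$\forall p \in U_\varepsilon(p^*) \cap \Delta^0 : \Gamma_{i1}(p) > \Gamma_{ir}(p), \quad \forall i \in N, r > 1.$$

Let $p \in U_\varepsilon(p^*) \cap \Delta^0$ and $\delta := 1 - \varepsilon$, then $p_{i1} \ge \delta$ for all $i \in N$. Writing $F(p)_{ir} = u_{ir}\,p_{ir}$ and using (3.14), we get for all $i \in N, r > 1$:

$$u_{ir} = \frac{\Gamma_{ir}(p)^\alpha\,\Theta_{ir}(p)^\beta}{\sum_{s=1}^{k} p_{is}\,\Gamma_{is}(p)^\alpha\,\Theta_{is}(p)^\beta} \le \frac{\Gamma_{ir}(p)^\alpha}{p_{i1}\,\Gamma_{i1}(p)^\alpha\,\Theta_{i1}(p)^\beta} \le \Big( \frac{\Gamma_{ir}(p)}{\Gamma_{i1}(p)} \Big)^{\alpha} \frac{1}{\delta^{|\mathcal{R}|\beta + 1}}.$$

From the continuity of $z$ it follows that for $\varepsilon \to 0$: $\frac{\Gamma_{ir}(p)}{\Gamma_{i1}(p)} \to \frac{\Gamma_{ir}(p^*)}{\Gamma_{i1}(p^*)} < 1$. Hence we can choose $\tilde\delta$ with $\delta \le \tilde\delta < 1$ large enough and define $\tilde\varepsilon := 1 - \tilde\delta$ such that for all points $p \in U_{\tilde\varepsilon}(p^*) \cap \Delta^0$: $u_{ir} < 1$ for all $i \in N$, $r > 1$. Thus we have shown $U_{\tilde\varepsilon}(p^*) \cap \Delta^0 \subseteq RA(p^*)$. □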

Note that though we can influence the size of a region of attraction of a strict local

maximum of a C-SAP instance by raising the corresponding objective value, other

regions of attraction (of strict local maxima) can never vanish completely because

of Proposition 3.3.

Subsequently we will show with an example the advantage of working with a combination of the gradient-type dynamics ($\Gamma$-part) and the repellor dynamics ($\Theta$-part), rather than using only the gradient-type dynamics and including the constraints with penalizing costs in the objective function. The example will show that in the latter case the nice property of Proposition 3.2 will be lost.

Example 3.4

Let $(n, k, \mathcal{R}, \mathcal{T}, w)$ be a C-SAP instance and let $z(p)$ be its objective function given by (3.1). We construct a new objective function $\tilde z$ which contains the forbidden assignments of $\mathcal{R}$ with penalizing costs $M$:

$$\tilde z(p) = z(p) - M \Big( \sum_{R \in \mathcal{R}} \prod_{(i,r) \in R} p_{ir} \Big)$$

and maximize it by some gradient procedure. If we want $M$ to be large enough for a hill climbing procedure to prefer a feasible assignment with objective value zero to any infeasible one, then a reasonable choice for $M$ would be the sum of the coefficients in $z$.

Consider now the following C-SAP instance with $n = k = 2$ and $c > 0$:

$$\max\ c\,p_{11}p_{21}$$
$$\text{s.t.}\quad p_{11}p_{22} = 0,\quad p_{12}p_{21} = 0,\quad p \in \Delta^I.$$

Constructing the penalized objective function $\tilde z$ as described above, we get

$$\tilde z(p) = c\,p_{11}p_{21} - c\,(p_{11}p_{22} + p_{12}p_{21}).$$

However, now we see that the direction of the gradient of $\tilde z$ no longer depends on $c$, nor does $c$ influence the region of attraction of $p^* := \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ anymore. Hence the property of Proposition 3.2 is lost, since now even a large value $c$ cannot guarantee convergence of $\{F_\varepsilon^t(p)\}$, $t \ge 0$, to $\Psi_\varepsilon(p^*)$ for all $p \in \Delta_\delta$.
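The loss of the dependence on $c$ can be made explicit by writing out the partial derivatives of the penalized objective of this example; the following is a short worked derivation added for illustration:

```latex
\begin{align*}
\frac{\partial \tilde z}{\partial p_{11}} &= c\,(p_{21} - p_{22}), &
\frac{\partial \tilde z}{\partial p_{12}} &= -c\,p_{21}, \\
\frac{\partial \tilde z}{\partial p_{21}} &= c\,(p_{11} - p_{12}), &
\frac{\partial \tilde z}{\partial p_{22}} &= -c\,p_{11}.
\end{align*}
```

Every component carries the common factor $c$, so $\nabla \tilde z(p) = c \cdot g(p)$ with a vector $g(p)$ that is independent of $c$; changing $c$ rescales the gradient but leaves its direction unchanged.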

The last two propositions have investigated the behavior of $F[\Gamma^\alpha\Theta^\beta]$ in the neighborhood of feasible points in $\Delta^I$ with large objective values. Now the question arises: What happens if an infeasible point $p^* \in \Delta^I$ has such a large objective value?

The next proposition investigates the behavior of the repelling part of the operator $F_\varepsilon[\Gamma^\alpha\Theta^\beta]$ in the neighborhood of an infeasible point $p^* \in \Delta^I$ which has a discrete neighbor with fewer violated constraints in $\mathcal{R}$. It was shown by Cochand [Coc93] that in such situations the repellor dynamics $F_\varepsilon[\Theta^\beta]$ jumps (under certain assumptions) to another point which is less infeasible. This so-called 'rejecting' effect holds as well for the combined dynamics $F_\varepsilon[\Gamma^\alpha\Theta^\beta]$:

Proposition 3.5

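Let $F := F[\Gamma^\alpha\Theta^\beta]$ with $\alpha > 0$, $\beta \ge 1$ and $F_\varepsilon := \Psi_\varepsilon(F)$. Moreover, let $\Gamma_{ir}(p) \ge 1$ for all $i \in N, r \in K, p \in \Delta^0$. If $p^* \in \Delta^I$ has a discrete neighbor $q^* \in \mathcal{N}(p^*)$ which violates fewer constraints in $\mathcal{R}$ than $p^*$, then for an appropriate choice of $\varepsilon$ there exists a neighborhood $U$ of $\Psi_\varepsilon(p^*)$ such that for all $p \in U \cap \Delta^0$: $\mathrm{Round}(F_\varepsilon(p)) \ne \mathrm{Round}(p) = p^*$.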

Proof

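First note that the assumption $\Gamma_{ir} \ge 1$ is not restrictive, since by Transformation 2.10 we can always transform an objective function such that it satisfies the hypothesis.

Let $p^* \in \Delta^I$ be an assignment for which $v_1$ constraints are violated and $q^*$, obtained from $p^*$ by changing the value of one decision variable, be an assignment which violates $v_2 < v_1$ constraints. W.l.o.g. we assume that $p^*_{i1} = 1$ for all $i \in N$ and

$$q^*_{ir} = \begin{cases} 1, & \text{if } i = 1, r = 2 \\ 0, & \text{if } i = 1, r \ne 2 \\ p^*_{ir}, & \text{otherwise} \end{cases} \quad \forall i \in N, r \in K.$$

Let $p := \Psi_\varepsilon(p^*)$. To get bounds for $\Theta_{1r}(p) = \prod_{R \in \mathcal{R}:\,(1,r) \in R} \big( 1 - \prod_{(j,s) \in R \setminus (1,r)} p_{js} \big)$, $r \in K$, we define the following two sets: $R^s_{1r}$ is the set of all forbidden partial assignments containing $(1,r)$ which are satisfied by $p^*$ regardless of the value of $p^*_{1r}$, and $R^u_{1r}$ are those assignments which are unsatisfied for $p^*_{1r} = 1$:

$$R^s_{1r} := \{R \in \mathcal{R} \mid (1,r) \in R,\ \exists (j,s) \in R \setminus (1,r) : p^*_{js} = 0\} \quad \forall r \in K$$
$$R^u_{1r} := \{R \in \mathcal{R} \mid (1,r) \in R,\ \forall (j,s) \in R \setminus (1,r) : p^*_{js} = p^*_{j1} = 1\} \quad \forall r \in K,$$

and correspondingly $\Theta_{1r}(p) = \Theta^s_{1r}(p)\,\Theta^u_{1r}(p)$ with

$$\Theta^s_{1r}(p) := \prod_{R \in R^s_{1r}} \Big( 1 - \prod_{(j,s) \in R \setminus (1,r)} p_{js} \Big) \quad \forall r \in K \tag{3.16}$$
$$\Theta^u_{1r}(p) := \prod_{R \in R^u_{1r}} \Big( 1 - \prod_{(j,s) \in R \setminus (1,r)} p_{js} \Big) \quad \forall r \in K. \tag{3.17}$$

Since $p = \Psi_\varepsilon(p^*)$ it follows from (3.10) that $p_{ir}$ is either $\varepsilon$ or $(1 - (k-1)\varepsilon)$. Hence the second product of $\Theta^s_{1r}(p)$ contains at least one factor $\varepsilon$ (because $R \in R^s_{1r}$ and $|R| \ge 2$). Analogously, the second product of $\Theta^u_{1r}(p)$ consists only of factors $(1 - (k-1)\varepsilon)$ and has at least one such factor. We have

$$\Theta^s_{1r}(p) = \prod_{R \in R^s_{1r}} \big( 1 - \varepsilon^{\alpha_R}(1 - (k-1)\varepsilon)^{\beta_R} \big) \quad \forall r \in K,$$
$$\Theta^u_{1r}(p) = \prod_{R \in R^u_{1r}} \big( 1 - (1 - (k-1)\varepsilon)^{\gamma_R} \big) \quad \forall r \in K,$$

for some $\alpha_R, \gamma_R \ge 1$, $\beta_R \ge 0$ and $\alpha_R, \beta_R, \gamma_R \in \mathbb{N}$, $R \in R^s_{1r} \cup R^u_{1r}$.

We denote by $\mu_{1r} := |R^s_{1r}|$ and $\nu_{1r} := |R^u_{1r}|$ the cardinalities of the sets and define $L := \max_{R \in \mathcal{R}} |R| - 1$ such that $L + 1$ is the maximum number of elements contained in a forbidden partial assignment $R \in \mathcal{R}$.

We continue by giving an upper and lower bound for (3.16) and (3.17) for all $r \in K$:

$$(1 - \varepsilon)^{\mu_{1r}} \le \Theta^s_{1r}(p) \le 1 \tag{3.18}$$
$$((k-1)\varepsilon)^{\nu_{1r}} \le \Theta^u_{1r}(p) \le (L(k-1)\varepsilon)^{\nu_{1r}}. \tag{3.19}$$

The inequalities in (3.18) hold because $\varepsilon^{\alpha_R}(1 - (k-1)\varepsilon)^{\beta_R} \le \varepsilon^{\alpha_R} \le \varepsilon$ for $\beta_R \ge 0$, $\alpha_R \ge 1$ and all factors are less than or equal to 1. The first inequality in (3.19) is true because $(1 - (k-1)\varepsilon)^{\gamma_R} \le (1 - (k-1)\varepsilon)$ for $\gamma_R \ge 1$. To prove the second inequality we use Bernoulli's inequality: $(1 - (k-1)\varepsilon)^{\gamma_R} \ge 1 + \gamma_R(-(k-1)\varepsilon)$, which is applicable because $\gamma_R \ge 1$, $\gamma_R \in \mathbb{N}$ and $-(k-1)\varepsilon \ge -1$. From the fact that $\gamma_R \le L$, (3.19) follows immediately.

Note that $\nu_{11} \le v_1$ and $\nu_{12} = \nu_{11} + (v_2 - v_1)$, because by assumption $p^*$ and $q^*$ differ only in the components $(1,1)$ and $(1,2)$. From this observation and (3.18) and (3.19) it follows that

$$\frac{\Theta_{12}(p)}{\Theta_{11}(p)} \ge \frac{(1-\varepsilon)^{\mu_{12}}\,((k-1)\varepsilon)^{\nu_{12}}}{(L(k-1)\varepsilon)^{\nu_{11}}} = \frac{(1-\varepsilon)^{\mu_{12}}\,((k-1)\varepsilon)^{v_2 - v_1}}{L^{\nu_{11}}}. \tag{3.20}$$

Finally, let $M$ be an upper bound for $\Gamma_{ir}$, $i \in N, r \in K$, and $\beta \ge 1$ be the exponent of $\Theta$ used by the operator, then we get from (3.20)

$$\frac{F(p)_{12}}{F(p)_{11}} \ge \frac{\Theta_{12}^\beta\,p_{12}}{M^\alpha\,\Theta_{11}^\beta\,p_{11}} \ge \frac{\Theta_{12}^\beta\,\varepsilon}{M^\alpha\,\Theta_{11}^\beta} \ge \frac{(1-\varepsilon)^{\beta\mu_{12}}\,(k-1)^{\beta(v_2 - v_1)}\,\varepsilon^{\beta(v_2 - v_1) + 1}}{M^\alpha\,L^{\beta\nu_{11}}}. \tag{3.21}$$

Note that $\beta(v_2 - v_1) + 1 \le 0$ and therefore we can choose $\varepsilon > 0$ small enough to ensure that $F(p)_{12} > F(p)_{11}$. By continuity, the result also holds in some neighborhood $U(p)$. □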

Note that $\varepsilon$ depends on the instance and can be computed a priori for each C-SAP instance (on the basis of the number of possible values $k$, the maximal size of the clauses $L$, the number of clauses and an upper bound $M$ for the $\Gamma_{ir}$, all assumed to be greater than one). However, if we want to construct an operator $F_\varepsilon$ which satisfies Proposition 3.5, then for an instance as in Proposition 3.2 $\varepsilon$ will be smaller than $\delta$. Therefore, in general it is not possible to have convergence in the whole region $\Delta_\varepsilon$ and simultaneously the 'rejecting' effect.

By Proposition 3.5, $F_\varepsilon[\Gamma^\alpha\Theta^\beta]$ cannot converge to an infeasible point which has a better neighbor from the feasibility point of view. But how about convergence in general?

Using Sarkovskii's theorem [Dev89], we can show that there exist examples with cycles of any period:

Theorem 3.6 (Sarkovskii)
Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function. Suppose $f$ has a periodic point of period three. Then $f$ has periodic points of all other periods.

The following example shows that cycling of $F[\Gamma^\alpha\Theta^\beta]$ is possible:

Example 3.7
Let the following C-SAP instance be given:

$$\max\ z(p) = p_{12} + p_{22} + 5000\,(p_{11} + p_{21})$$
$$\text{s.t.}\quad p_{11}p_{21} = 0,\quad p \in \Delta^I, \tag{3.22}$$

and let us consider $F := F[\Gamma\Theta^4]$. Note that the set $D := \{p \in \Delta \mid p_{11} = p_{21}\}$ is invariant under $F$ due to the symmetry of the instance and we have for $p \in D$: $p_{12} = p_{22} = 1 - p_{11}$. We define for $p = \begin{pmatrix} p_{11} & 1 - p_{11} \\ p_{11} & 1 - p_{11} \end{pmatrix} \in D$ the function $f(p_{11}) := F(p)_{11}$.

In this example we have $\Gamma = \begin{pmatrix} 5000 & 1 \\ 5000 & 1 \end{pmatrix}$ and $\Theta = \begin{pmatrix} 1 - p_{21} & 1 \\ 1 - p_{11} & 1 \end{pmatrix}$ and thus we get for $p \in D$:

$$f(p_{11}) = F(p)_{11} = \frac{5000\,p_{11}(1 - p_{11})^4}{5000\,p_{11}(1 - p_{11})^4 + (1 - p_{11})} = \frac{5000\,p_{11}(1 - p_{11})^3}{5000\,p_{11}(1 - p_{11})^3 + 1}.$$

Figure 3.1: Graph of f(f(f(x))) and the line y = x.

It can be shown that $f$ has a cycle of period three. For this reason we compute $g(x) := f(f(f(x)))$, which is depicted in Figure 3.1 together with the line $y = x$. All intersection points are fixed points of $g$; those which are not fixed points of $f$ itself belong to a 3-cycle of $f$. One such cycle is for example given by $f(0.174) \approx 0.9979$, $f(0.9979) \approx 4 \cdot 10^{-5}$, $f(4 \cdot 10^{-5}) \approx 0.174$, and by Sarkovskii's theorem it follows that cycles of any period exist.
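The 3-cycle can be checked numerically. The sketch below uses the reduced map $f(x) = 5000x(1-x)^3 / (5000x(1-x)^3 + 1)$ of this example and locates a fixed point of $g = f \circ f \circ f$ by bisection; the bracket $[0.17, 0.18]$, the iteration count and the function names are choices made for this illustration.

```python
def f(x):
    """Restriction of F[Gamma Theta^4] to the invariant set D (Example 3.7)."""
    h = 5000.0 * x * (1.0 - x) ** 3
    return h / (h + 1.0)


def g(x):
    """Third iterate of f."""
    return f(f(f(x)))


def period_three_point(lo=0.17, hi=0.18, steps=80):
    """Bisection on g(x) - x; assumes a sign change on [lo, hi]."""
    assert (g(lo) - lo) * (g(hi) - hi) < 0.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if (g(lo) - lo) * (g(mid) - mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


x0 = period_three_point()
cycle = [x0, f(x0), f(f(x0))]
```

The point found this way is not a fixed point of $f$: for $x \in (0,1)$ the fixed-point equation reduces to $5000(1-x)^4 = 1$, which gives $x \approx 0.881$, far away from the bracket used here.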

Finally, we discuss the stability of fixed points in the interior $\Delta^0$. We have seen that fixed points in the interior are unstable for the SAP using $F[\Gamma]$. Using a small technical modification of the operator $F[\xi]$, Cochand [Coc93] has proven that the same result holds as well for this class of more general operators.

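Proposition 3.8 (Cochand)
Let $\xi_{ir} : \mathbb{R}^{nk} \to \mathbb{R}$, $i \in N, r \in K$, be continuously differentiable functions with $\xi_{ir}(p) > 0$ for all $p \in \Delta^0$ and $\frac{\partial \xi_{ir}}{\partial p_{is}}(p) = 0$ for all $i \in N, r \in K$ and $s \in K$. Moreover, let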

k

fi.r(p) == II(1 - P") Vt&N,reK5=1

and $F := F[\xi\Omega^\gamma]$ be the operator as in (3.4). If $\gamma > 0$ then any fixed point $p$ in the interior $\Delta^0$ is unstable.

Proof

Since F is a differentiable mapping, a sufficient condition for a fixed point p to be

unstable is that at least one eigenvalue of DF(p) (the derivative of F at p) has

absolute value larger than one.

For a fixed point p G A0 we have F(p)îr = ^^ = p^ for alH g N,r G AT

and it follows that

k

£„.£# = E, := Y,tistilPts Vi6N,reK. (3.23)


To compute the differential DFip) we use for i e N,r e K

5jr

n=0 VseK (3.24)

OPis

P1 = -7(1 - ftr)7"1 YK1 -^ =T~ V'^ (3-25)

CPir , ,

-*- Pirs^tr,l

OPir 1 ~ Pir

Using these derivatives we get for the diagonal elements of the differential DFip) of

F^D7] at the fixed point p

dF[ÇW]i

dpi,v

J_ (f. çv y.- „ p. Q7 (p- Q7

-

J^rPis&snl2 I Çir^îrî PirÇir^îr I Çîr"^ 1

î \ \ 1- — Pir

(3,23) 1 /2

/ 7^(1 -p^— ^2 1 î — Pir2-'!

\*->i

yi2

V —i ru—i i —i 1_ n-

£->i V V -1Ar

= 1 4- (7 - l)pir.

It follows that the trace of $DF(p)$ is $\sum_{i=1}^{n} \sum_{r=1}^{k} (1 + (\gamma - 1)p_{ir}) = n(k-1) + n\gamma$, viewing $F[\xi\Omega^\gamma]$ as a mapping with domain $\mathbb{R}^{nk}$. However, note that $\Delta^0$, the range of $F[\xi\Omega^\gamma]$, has dimension $n(k-1)$ and therefore $DF(p)$ has at least $n$ vanishing eigenvalues. This implies that there are at most $n(k-1)$ non-vanishing eigenvalues. Since the trace, which is the sum of all eigenvalues, is $n(k-1) + n\gamma$ and $\gamma > 0$, it follows that there exists at least one eigenvalue strictly larger than one and therefore the fixed point $p$ is unstable.

Note that although $\Omega_{ir}$ is similar to $\Theta_{ir}$, the proposition does not hold for $\gamma = 0$. However, the slight modification from $F[\xi]$ to $F[\xi\Omega^\gamma]$ is only an artificial modification and of theoretical interest. Practically, $\gamma$ can be chosen small enough such that it has a negligible effect and does not alter the results in $\Delta^0$. This $\Omega$-modification was not used in the implementation.

3.3 Implementation and Numerical Results

We tested FPH on a class of randomly generated C-SAP instances for which feasible

solutions exist, but cannot trivially be determined. Subsequently we describe in the

first subsection the algorithm FPH and some details of its implementation. Then

we present the test set used in our experiments and finally we conclude with a

comparison of FPH to Tabu Search.


3.3.1 Implementation

In this subsection we describe the implementation of FPH which has been used

subsequently to solve C-SAP instances.

We recall from Section 1.4 that the basic concept of FPH was given as follows:

• Choose a starting point $p^0 \in \Delta^0$.

• Compute the sequence $p^t := F[\xi](p^{t-1})$, $t = 1, 2, \dots, l$ for some $l$.

• Choose as solution $p^* \in \Delta^I$ the assignment which is 'closest' to $p^l$.

As an appropriate fitness function for the C-SAP we have chosen $\xi := \Gamma^\alpha\Theta^\beta$, as discussed in the previous sections. However, regarding the operator, experiments have shown that the following modification improves the performance of the algorithm and was hence used in our tests: Instead of changing all components of a point at once, we update them sequentially. This variant of $F[\xi]$ will be denoted by $F'[\xi]$ and, due to its componentwise update, $F'[\xi]$ will be called the Gauss-Seidel version.

Formally, $F'[\xi]$ is defined as follows: Let for $j \in N$

$$F^{(j)}[\xi](p)_{ir} := \begin{cases} \dfrac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)}, & i = j \\ p_{ir}, & \text{otherwise} \end{cases} \quad \forall i \in N, r \in K$$

then

$$F'[\xi] := F^{(\pi(1))}[\xi] \circ \dots \circ F^{(\pi(n))}[\xi], \tag{3.27}$$

where $\pi$ is a permutation of the index set $N$.

A more detailed description of $F'[\xi]$ and some of its properties can be found in Appendix A.

In our numerical experiments we used exclusively the Gauss-Seidel version $F'[\Gamma^\alpha\Theta^\beta]$, where we allowed the permutation $\pi$ in (3.27) to be changed during the algorithm. Moreover, we extended the basic concept of FPH by some additional features improving the algorithm.

Subsequently we will first present some pseudo-code of the algorithm and explain afterwards the new procedures used therein:

FPH
 1  Choose a starting point p ∈ Δ_ε
 2  for t := 1 to itlimit do
 3      if t mod recit = 0
 4          π := permute(N)
 5          if opt(p) > 0.8 · opt(best)
 6              greedy(p)
 7          recenter(p)
 8      for j := 1 to n do
 9          i := π[j]
10          for r := 1 to k do
11              p_ir := p_ir · Γ_ir(p)^α · Θ_ir(p)^β
12          normalize(p_i)
13      Compute new values uc and z for p
14      Keep best solution in best
15  greedy(p)

We see that the second part (lines 8-14) describes the standard FPH for $F'[\Gamma^\alpha\Theta^\beta]$, where in normalize($p_i$) the components of $p_i$ are normalized according to (3.10) such that $p \in \Delta_\varepsilon$.
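The inner loop (lines 8-12) differs from a simultaneous update only in that each row already sees the rows updated earlier in the same sweep. A minimal sketch, assuming a user-supplied callback `fitness_row(p, i)` that returns the values $\xi_{i1}, \dots, \xi_{ik}$ at the current point; the callback, the function names and the plain renormalization (without the $\varepsilon$-clamping of (3.10)) are assumptions of this illustration:

```python
def gauss_seidel_sweep(p, fitness_row, order):
    """One sweep of the sequential update: rows are changed in place, one by one."""
    for i in order:
        xi = fitness_row(p, i)   # evaluated at the partially updated point
        vals = [x * f for x, f in zip(p[i], xi)]
        total = sum(vals)        # assumed positive
        p[i] = [v / total for v in vals]
    return p


def edge_fitness(p, i):
    """Toy fitness for a single Max-Cut edge between nodes 0 and 1 (weight 1)."""
    j = 1 - i
    return [p[j][1], p[j][0]]


point = [[0.6, 0.4], [0.5, 0.5]]
gauss_seidel_sweep(point, edge_fitness, order=[0, 1])
```

In this toy run row 0 is left unchanged, because row 1 is still at the barycenter when row 0 is processed, whereas row 1 already reacts to row 0 within the same sweep and moves to (0.4, 0.6).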

Improvement

In each iteration $t$ of the algorithm the current objective value $z(p)$ and the number of unsatisfied constraints $uc(p)$ are computed (line 13). Since feasibility is more important than a large objective value, we use the lexicographical order on the vector $(-uc(p), z(p))$ in order to compare 'solutions' of FPH.
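
The lexicographical comparison can be illustrated with a few lines of Python. The function name `better` and the sample values are hypothetical; only the ordering on $(-uc, z)$ is taken from the text.

```python
def better(candidate, incumbent):
    """Return True if candidate beats incumbent.

    Each solution is a pair (uc, z): number of unsatisfied constraints and
    objective value. Fewer violations win; ties are broken by the larger z.
    """
    uc_c, z_c = candidate
    uc_i, z_i = incumbent
    # Python compares tuples lexicographically, first component first.
    return (-uc_c, z_c) > (-uc_i, z_i)

best = (3, 41000)
for sol in [(2, 30000), (2, 35000), (0, 20000), (1, 45000)]:
    if better(sol, best):
        best = sol
```

In this example the feasible solution with objective value 20000 is kept, although infeasible solutions with much larger objective values were seen.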

Procedures

In the first part of the algorithm (lines 3-7) we included some additional procedures which are carried out every recit iterations:

permute(N): Computes a new permutation $\pi$ of the index set $N = \{1, \dots, n\}$. This permutation is used in line 9 to change the order in which the decision variables in (3.27) are updated. In several tests we tried to improve FPH by changing this order depending on the current values of the variables. However, it turned out that a random permutation works best.

greedy(p): If the objective value of the current solution is not too far off the best solution found so far, then this greedy algorithm may be applied (line 6). Moreover, the same greedy algorithm is applied once more at the end of FPH (line 15). greedy() first constructs an integer solution out of $p$ using Round($p$) of (3.12). Then it tests for all discrete neighbors $q \in \mathcal{N}(p)$ whether they achieve an improvement or not. We set $p$ to that neighbor $q$ which yields the largest improvement and repeat this process until no further improvement is possible. If by this procedure a new best assignment is found in line 6, then this solution is reconverted into a matrix $p \in \Delta_\varepsilon$ favoring the components of the assignment with a high value. This point $p$ is then used as a new starting point in FPH.

recenter($p_i$): The goal of this procedure is to prevent FPH from reaching the boundary of $\Delta_\varepsilon$ too fast. Besides the numerical instability close to the boundary, the gradient-type dynamics loses its influence there. Hence, if FPH reaches the boundary too fast, the gradient-type dynamics has not had any chance of contributing to the result, which leads to assignments with low objective values.

For this reason we draw the point $p$ back to the center of $\Delta$, however without destroying the order of its values. To this end, we add a constant reccon to each value $p_{ir}$ and normalize afterwards using normalize($p_i$). The larger reccon is chosen, the closer the new point lies to the center $p_{ir} = \frac{1}{k}$ for all $i \in N$, $r \in K$.
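
The recentering step can be sketched in Python as follows; the function signature and the sample row are assumptions, while the value reccon = 0.4 is the one reported for the experiments in Section 3.3.3.

```python
def recenter(row, reccon):
    """Add the constant reccon to every component and renormalise."""
    shifted = [v + reccon for v in row]
    total = sum(shifted)
    return [v / total for v in shifted]

row = [0.90, 0.06, 0.03, 0.005, 0.005]   # k = 5, close to the boundary
new = recenter(row, 0.4)
```

For a row that sums to one, the new components satisfy $p'_{ir} - \frac{1}{k} = (p_{ir} - \frac{1}{k}) / (1 + k \cdot \text{reccon})$, so the distance of every component to the center shrinks by the same factor (here $1/3$) and the order of the values is preserved.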

Data Structure

This algorithm was implemented in C where the following data structures were used:

Each monomial of the objective function is a structure which contains the weight of the monomial, its current value, the number of variables (degree of the monomial) and two list pointers. The first list links all monomials, thus constructing the objective function. The second list contains the indices of all variables in the monomial. This allows a fast update of the value of a monomial when the value of some variable $p_{ir}$ changes. The same structure as for the objective function is also used for the constraints, resulting in a list of forbidden assignments. $p$ itself is implemented as an array where each element contains, besides its value, a list of pointers to the monomials it is contained in.

Note that although the greedy algorithm is rather time consuming, we could improve its performance thanks to the data structure used, which allows us to compute the difference of the objective values of two discrete neighbors efficiently.
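
The idea behind this efficient difference computation can be sketched as follows. The sketch is written in Python with a dictionary as index instead of the linked lists of the C implementation; the class and method names are hypothetical. Only the clauses that contain the old or the new value of the changed variable are inspected.

```python
from collections import defaultdict

class Objective:
    """Weighted partial assignments with an index from (i, r) to clauses."""

    def __init__(self, clauses):
        # clauses: list of (weight, [(i, r), ...])
        self.clauses = clauses
        self.containing = defaultdict(list)       # (i, r) -> clause indices
        for c, (_, pairs) in enumerate(clauses):
            for pair in pairs:
                self.containing[pair].append(c)

    def satisfied(self, c, x):
        return all(x[i] == r for i, r in self.clauses[c][1])

    def value(self, x):
        """Full evaluation: total weight of the satisfied clauses."""
        return sum(w for c, (w, _) in enumerate(self.clauses)
                   if self.satisfied(c, x))

    def delta(self, x, i, s):
        """Change of the value when variable i moves from x[i] to s."""
        r = x[i]
        if r == s:
            return 0
        lost = sum(self.clauses[c][0] for c in self.containing[(i, r)]
                   if self.satisfied(c, x))
        y = list(x)
        y[i] = s
        gained = sum(self.clauses[c][0] for c in self.containing[(i, s)]
                     if self.satisfied(c, y))
        return gained - lost

obj = Objective([(10, [(0, 1)]),
                 (100, [(0, 1), (1, 2)]),
                 (150, [(0, 0), (2, 1)])])
x = [1, 2, 1]
d = obj.delta(x, 0, 0)    # move variable 0 from value 1 to value 0
```

Here the move loses the two clauses containing $(0, 1)$ and gains the clause containing $(0, 0)$, without touching any clause that does not involve variable 0.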

3.3.2 Instance Generation

The data of a C-SAP instance $(n, k, \mathcal{R}, \mathcal{T}, w)$ consists of the number of variables $n$, the number of values $k$, a set $\mathcal{R}$ of partial assignments defining the constraints, and a set $(\mathcal{T}, w)$ of weighted partial assignments for the objective function.

We generated instances with $n = 100$ variables, $k = 5$ values and 10000 randomly generated partial assignments of cardinality 3 as constraints. The objective function is given by $(\mathcal{T}, w)$, which contains randomly generated partial assignments with cardinality between 2 and 5 and all possible assignments with cardinality 1:


cardinality    number of clauses    weights in
1              500                  [10, 20]
2              800                  [100, 200]
3              1000                 [500, 750]
4              800                  [750, 1500]
5              200                  [1500, 2500]

Table 3.1: Composition of weighted partial assignments $(\mathcal{T}, w)$.
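
A generator for instances of this composition could look as follows. This is only a sketch: the thesis does not specify the sampling distribution, so uniform sampling of variables and values, integer weights, and the possibility of duplicate partial assignments are assumptions, as are all function names.

```python
import random

# (cardinality, number of clauses, weight range) as in Table 3.1
COMPOSITION = [(1, 500, (10, 20)), (2, 800, (100, 200)),
               (3, 1000, (500, 750)), (4, 800, (750, 1500)),
               (5, 200, (1500, 2500))]

def random_partial_assignment(rng, n, k, card):
    """Pick card distinct variables and a random value for each of them."""
    variables = rng.sample(range(n), card)
    return tuple(sorted((i, rng.randrange(k)) for i in variables))

def generate_instance(seed, n=100, k=5, n_constraints=10000):
    rng = random.Random(seed)
    constraints = [random_partial_assignment(rng, n, k, 3)
                   for _ in range(n_constraints)]
    objective = []
    for card, count, (lo, hi) in COMPOSITION:
        if card == 1:
            # all n * k possible assignments of cardinality 1
            parts = [((i, r),) for i in range(n) for r in range(k)]
        else:
            parts = [random_partial_assignment(rng, n, k, card)
                     for _ in range(count)]
        objective += [(rng.randint(lo, hi), part) for part in parts]
    return constraints, objective

constraints, objective = generate_instance(seed=1)
```

Note that for $n = 100$ and $k = 5$ the 500 clauses of cardinality 1 in Table 3.1 are exactly the $n \cdot k$ possible single assignments.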

This choice was motivated by the following observations: First, there is no obvious relation between the best solutions found for the C-SAP and those for the corresponding SAP. Second, the mixture of the partial assignments in $\mathcal{T}$ satisfied by the best solutions found has no apparent pattern.

3.3.3 Numerical Results

Parameters

After having fixed the problem class, we performed some experiments to determine the parameters of FPH. Regarding the recentering, we set recit := 8 and reccon := 0.4. If recit is chosen too large, then recentering has a negligible effect, and if it is too small, then the result is similar to choosing a completely new starting point. Moreover, reccon determines how much the values of $p$ are recentered.

As a stopping criterion we tested several different possibilities and their combinations. We tried to stop when the maximum component $p_{ir}$ has reached a specified limit, or when the difference between two consecutive $p$-values or objective values is small enough. However, it turned out that the larger the number of iterations, the better the solutions found. Hence, for a given time limit, it was only a question of finding a trade-off between the number of starting points and the number of iterations. In these experiments we have chosen 25 starting points and we set itlimit := 800.

Two further important parameters are the exponents $\alpha$ and $\beta$, which influence the behavior of FPH essentially. Intuitively one can say that a large $\alpha$-value increases the objective value and a large $\beta$-value improves the feasibility. Unfortunately, these two effects work against each other, such that an improvement achieved by raising one of the exponents mostly results in a worsening of the other criterion. Experiments with different combinations of these values have confirmed these statements.

The following diagrams in Figures 3.2 and 3.3 show for one instance the results for

100 starting points. Each bar in these diagrams corresponds to the solution of one

starting point, where its height corresponds to the objective value. All solutions

are sorted (along the x-axis) with respect to the number of unsatisfied constraints.


The values below the diagrams show how many feasible solutions (sat) were found and how the objective values among them were distributed (minimum, median, maximum). Besides demonstrating the behavior of the exponents, these figures should also point out the great influence of the greedy algorithm (line 15) on the number of feasible solutions. For this reason we depict for each pair $(\alpha, \beta)$ in the left column the result obtained without greedy, and in the right column the one with greedy.

In Figure 3.2 we have fixed $\beta = 1$ and we vary only the $\alpha$-exponent. The pictures of each row correspond to the results for $F'[\Gamma\Theta]$, $F'[\Gamma^2\Theta]$ and $F'[\Gamma^3\Theta]$, respectively. As expected, these figures demonstrate how the objective value improves when raising $\alpha$ (compare med/max in each column); however, at the same time the number of unsatisfied constraints increases as well.

FPH behaves similarly when $\alpha = 3$ is fixed and only the $\beta$-exponent varies. As Figure 3.3 shows, the number of feasible assignments increases with increasing $\beta$; however, simultaneously the objective values decrease.

In further experiments it turned out that sometimes some additional minor improvement can be achieved if $\beta < 1$ is chosen. Moreover, we observed that there exists an upper limit for $\alpha$ beyond which the objective values begin to decrease. In the following tests we have chosen $\alpha := 4$ and $\beta := 0.8$.

As an alternative idea we also tried to change the exponents $\alpha$ and $\beta$ dynamically during the algorithm, instead of fixing them in advance. However, this approach of self-adapting parameters failed, since it only increased the running time without improving the quality of the solutions, so that we refrained from using this concept in FPH.

Comparisons of FPH to standard Simulated Annealing and Tabu Search (TS) have shown that FPH is on average slightly superior to these heuristics (for the described test set). Hence, we were interested in whether some additional modifications of these methods and further refinement of their parameters would outperform FPH. To this end A. Hertz and D. Kobler [HK99] developed, as an internal challenge, a TS which was especially adapted for solving C-SAP instances of the described type. Next we briefly describe the main concept of this TS and subsequently present the results of the comparison.

(a) oAg100t100 - normal: sat: 92, min: 20459, med: 29283, max: 36730.
(b) oAg100t100 - greedy: sat: 97, min: 20668, med: 29532, max: 37026.
(c) oAg200t100 - normal: sat: 35, min: 26865, med: 35060, max: 40670.
(d) oAg200t100 - greedy: sat: 66, min: 27082, med: 35399, max: 41459.
(e) oAg300t100 - normal: sat: 2, min: 35392, med: 39640, max: 43888.
(f) oAg300t100 - greedy: sat: 36, min: 29529, med: 35504, max: 45138.

Figure 3.2: Behavior of FPH for $\Gamma\Theta$, $\Gamma^2\Theta$ and $\Gamma^3\Theta$ (left: without greedy, right: with greedy).

(a) oAg300t100 - normal: sat: 2, min: 35392, med: 39640, max: 43888.
(b) oAg300t100 - greedy: sat: 36, min: 29529, med: 35504, max: 45138.
(c) oAg300t200 - normal: sat: 28, min: 26943, med: 32975, max: 41211.
(d) oAg300t200 - greedy: sat: 66, min: 26334, med: 31921, max: 41362.
(e) oAg300t300 - normal: sat: 41, min: 23523, med: 31127, max: 41899.
(f) oAg300t300 - greedy: sat: 83, min: 24602, med: 31022, max: 41902.

Figure 3.3: Behavior of FPH for $\Gamma^3\Theta$, $\Gamma^3\Theta^2$ and $\Gamma^3\Theta^3$ (left: without greedy, right: with greedy).

Tabu Search (Hertz, Kobler)

The search space of the TS is $K^n$. Infeasible assignments $x$ are penalized by a function $f_{\mathcal{R}}(x)$ counting the number of unsatisfied constraints. On the other hand, a function $f_{\mathcal{T}}(x)$ is defined which favors assignments in which the satisfied clauses in $\mathcal{T}$ have a large total weight and which additionally takes into account how far unsatisfied clauses are from being satisfied. The cost function is then defined by $f(x) := \lambda f_{\mathcal{R}}(x) - f_{\mathcal{T}}(x)$, where $\lambda$ is a self-adjusting parameter. The base length of the tabu list is set to 5, and in each iteration at most 70 neighbor solutions are visited, where preference is given to those assignments which modify the value of a conflicting variable. In total 25000 iterations are performed. Since in each iteration only a subset of neighbors is considered, a steepest ascent method, similar to our greedy(p), is run at the end of TS.

Comparison

We generated 10 C-SAP instances as described in Section 3.3.2 and solved each of them 10 times. We report the minimum, average and maximum value over these 10 runs. All these tests were carried out on a Sun Sparc Ultra 60 workstation with 330 MHz. One run of TS with 25000 iterations took about 11 minutes, which we have taken as a time limit. In approximately the same time we ran FPH for 25 starting points with 800 iterations each (9:30 minutes) and we took the best solution found. In all runs feasible solutions have been found by both methods.

Table 3.2 shows the objective values found by FPH and TS, respectively. We see that on average FPH was about 1.7% worse than TS. Nevertheless, FPH yielded in one case a better maximum, in three cases a better average and in five cases a better minimum value.

            FPH                        Tabu Search
       min     avg     max        min     avg     max
P1     45451   47930   50350      45437   48414   51937
P2     52683   54735   58800      53180   55369   58800
P3     54066   54910   57125      54114   56672   58645
P4     53025   54533   55910      54607   57794   59355
P5     54197   55699   56891      54002   57494   60652
P6     53399   54516   56606      52422   54990   57763
P7     52349   54568   56052      52189   54420   57857
P8     52525   54840   57183      53488   54727   57057
P9     54389   56515   57914      51256   56217   58711
P10    51642   53980   56345      53693   56048   60738

FPH relative to TS:    0% (min)    -1.7% (avg)    -3.1% (max)

Table 3.2: Comparison of FPH to TS.
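
The counts quoted in the text (one better maximum, three better averages, five better minima) can be rechecked directly from the table entries. The following Python sketch does so; in addition it computes a relative gap from the column sums. The thesis does not state how the percentages in the last row were aggregated, so the column-sum gap is an assumption and may deviate slightly from the reported values.

```python
# (min, avg, max) over 10 runs, copied from Table 3.2
FPH = [(45451, 47930, 50350), (52683, 54735, 58800), (54066, 54910, 57125),
       (53025, 54533, 55910), (54197, 55699, 56891), (53399, 54516, 56606),
       (52349, 54568, 56052), (52525, 54840, 57183), (54389, 56515, 57914),
       (51642, 53980, 56345)]
TS = [(45437, 48414, 51937), (53180, 55369, 58800), (54114, 56672, 58645),
      (54607, 57794, 59355), (54002, 57494, 60652), (52422, 54990, 57763),
      (52189, 54420, 57857), (53488, 54727, 57057), (51256, 56217, 58711),
      (53693, 56048, 60738)]

def wins(col):
    """Number of instances on which FPH is strictly better in column col."""
    return sum(f[col] > t[col] for f, t in zip(FPH, TS))

def relative_gap(col):
    """Relative difference of the column sums, in percent (assumed aggregation)."""
    f = sum(row[col] for row in FPH)
    t = sum(row[col] for row in TS)
    return 100.0 * (f - t) / t

better_counts = [wins(c) for c in range(3)]     # columns: min, avg, max
gaps = [relative_gap(c) for c in range(3)]
```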

In all our tests, no cycling of FPH was observed. Moreover, FPH is very stable with respect to shortage of CPU time. Even in a fraction of the time needed by TS, FPH is still able to find quite good, feasible solutions, whereas TS is completely lost, since it does not have enough time to investigate a reasonable amount of the search space.

For both algorithms there exist a few parameters which need special care, since they have a great influence on the performance of the heuristic. Most critical for TS is the length of the tabu list, where even a small change can improve (or worsen) the results by several per cent. Similarly, we have seen in Figures 3.2 and 3.3 that for FPH a good choice of the exponents $\alpha$ and $\beta$ is essential in order to find a good balance between the number of feasible solutions found and the size of their objective values.

Summing up, we have seen that if TS is given all the time it needs, then it outperforms FPH by about 1.7%. On the other hand, FPH has the advantage that it has a quite good worst case behavior and is able to find good, feasible solutions in a fraction of the time needed by TS. Hence, FPH is an interesting alternative which may also be considered in combination with other local search heuristics, due to its completely different approach.

Chapter 4

The Point Feature Label Placement Problem

4.1 Label Placement - State of the Art

Originally, the label placement problem consisted in placing text elements on geographic maps, a task carried out by cartographers. To this day it is among the most time-consuming processes in map manufacturing. However, nowadays geographic maps are by no means the only area where label placement plays an important role. Lately, technical maps in particular have become more and more important and have influenced the layout design enormously. One typical example of such a technical map can be found in the paper of Wagner and Wolff [WW97], who deal with the problem of labeling groundwater drillholes with a block of measuring results.

Problem

The label placement problem consists of a set of (geographic) positions which should be marked not only by symbols (features), but also with a corresponding text giving a more detailed description of the location. Typical examples are the labeling of cities and mountains, but also rivers or countries. Therefore one generally distinguishes between three different types of features: point features (cities, mountain tops), line features (rivers, border lines) and area features (countries, oceans).

Such placements should naturally fulfill a whole string of criteria such that the

final labeling looks clear and aesthetically attractive. Numerous authors have

investigated this topic and presented various sets of rules which describe desired

properties for placements. Of course, depending on the exact problem formulation

the importance of these rules varies.

In this chapter we consider only point features and we assume that a fixed number

of potential label positions for each point feature is given. The task of label

placement is to select for each point feature exactly one label position in order

to get an aesthetically attractive placement. In a mathematical model properties

describing such aesthetic attractiveness can either be formulated as constraints

(e.g. no overlaps allowed) or modeled in the objective function. In Section 4.3 we

present three different problem formulations, which can be modeled as C-SAP's

and which will be solved by FPH. All these problems have in common that

they want to find a placement which minimizes the number of overlapping labels;

however, they differ in the counting arguments for these overlaps and in additionally desired aesthetic properties. One such criterion could be that not all potential label positions are equally desirable. In this case priorities are often assigned to the label positions, which can then be included in the objective function. An optimal solution then corresponds to a placement with a minimum number of overlaps which additionally fulfills the position priority list as well as possible.

The case of weighted label positions is discussed in many papers, and even if not all authors agree on the same priority ranking of certain label positions, they nevertheless recognize a certain set of basic rules. Some aspects that could be used to generate such position priorities are given by the following rules suggested by Freeman and Ahn [FA87]:

1. Names should be placed horizontally and read from west to east.

2. Neither the characters nor the words of a point feature name should be spread out.

3. A name should be placed close to the point which it refers to, with a fairly small tolerance between allowed minimum and maximum distances.

4. Preference should be given to placement that causes a name to be read away from the point feature rather than towards it.

5. Preference should be given to placement slightly above the feature over placement slightly below the feature.

Table 4.1: Desired properties of an aesthetically attractive placement (from [FA87]).

Besides the described classical formulation of the label placement problem there exist several related, no less difficult problem formulations which are also of great practical relevance. These problems range from the generation of aesthetically attractive label positions for a feature to the construction of different mathematical models. One question of particular interest deals with finding the maximum label size such that all features can be labeled without overlap (Wagner, Wolff [WW95, WW97]). But the reverse question is also of interest: what scale is necessary to place all labels (with a fixed minimum font size) without overlaps?

Finally, another problem which often arises in the context of label placement is the problem of 'point selection'. If there does not exist a placement without overlaps (or none could be found), then point selection can be applied, whose task consists in finding a minimum number of features that may remain unlabeled in order to get a labeling without overlaps.

State of the Art

In the sequel we give a brief survey of the existing literature and the development of label placement during the last 40 years.

One of the first papers dealing with label placement was published in the early 60s by Imhof [Imh62, Imh75]. He introduced the already mentioned division into three feature types (points, lines and areas) and moreover drew attention to aesthetic criteria which clarify the layout and thus enhance the effectiveness with which a map can be read. Exactly at this point Yoeli [Yoe72] continued 10 years later. He assumes that a fixed number of label positions is given and shows examples of good and bad placements. This resulted in the development of a priority list for potential label positions around a point feature as shown in Figure 4.1 (low numbers refer to more desirable positions). Even if occasional papers during the following years suggest the use of other priorities, most of them are nevertheless based on those suggested by Yoeli.

2 1

4 3

Figure 4.1: Position priorities suggested by Yoeli [Yoe72].

Generally one can say that until the beginning of the 80s the label placement problem was dealt with only occasionally from a mathematical point of view. Since then, however, the interest in this field has increased rapidly, not least thanks to increasing computation power. Researchers and scientists of different areas, ranging from geographers and cartographers to computer scientists and mathematicians, have tried their hand at this hard problem. Since nowadays especially technical maps with continuously increasing sizes are needed, the use of computers for this problem gained interest. While an experienced cartographer needs several minutes to place one label satisfactorily (which can take up to 50% of the complete production time of maps), modern computers are able to place several thousand labels in the same time. Unfortunately, despite this advantage in speed and although the quality of the techniques has steadily improved, computer generated maps still do not reach the quality of manually produced maps. For this reason even nowadays many well designed maps are generated interactively.

During the last 20 years several different algorithms and methods were developed


to attack the label placement problem, some of which we will present in Section 4.2.

One idea of special interest was introduced by Hirsch [Hir82] who dispensed with

the fixed position hypothesis and developed a floating position strategy. Other

approaches ranged from integer programming (Zoraster [Zor86, Zor90]) to expert

systems ([Zor91]). Finally, there exists a broad class of so-called rule-based systems

([FA87, CJ90, WB91, DF92]) which are used successfully for dense maps. Before

we direct our attention to some of these methods in the next section, we want to

discuss the complexity of label placement.

Complexity

The following point feature label placement problem will be used throughout the next sections. Its NP-hardness was proven independently by [KI88, FW91, MS91]:

Let a set of point features be given, where each of them has exactly four potential label positions. We want to assign to every feature exactly one of these label positions such that the number of labels which either overlap another point feature or overlap one another is minimized. Note that this problem formulation is completely independent of the shape of the labels (we can assume axis-parallel rectangles of fixed size).

To this day only very few polynomially solvable special cases are known, and those are of practically no significance. One such special case was stated by Christensen et al. [CMS95] and consists of instances where no potential label position overlaps more than one other potential label position. Another class was found by Formann and Wagner [FW91], who proved that the point feature label placement problem where every label has at most two potential positions is polynomially solvable.

Because of the difficulty of the problem, researchers have dealt not only with the development of heuristics but also with the approximation of this problem. Agarwal et al. [AvKS98] presented a polynomial-time approximation algorithm for problems with axis-parallel, rectangular labels of arbitrary height and width with a ratio of $1/O(\log n)$. Moreover, if the label height is fixed, then van Kreveld et al. [vKSW98] have proven that the problem can be approximated with a factor of 1/2 in $O(n \log n)$ time.

4.2 Methods behind Label Placement

In this section we present some of the best known methods for the point feature label placement problem. We do not go into every detail, but give a rough overview of the essential ideas. All methods presented here have in common that they can be used to solve the standard problem of placing exactly one label for each feature such that the number of overlaps is minimized.

As already mentioned, in contrast to all other methods in this section, the improvement method of Hirsch [Hir82] does not use a fixed number of label positions. Here every point feature has an infinite set of potential label positions which are arranged around the point. In case of a conflict, as for example the overlap of two labels, a so-called overlap vector is computed whose length and direction describe where a label should be moved to in order to resolve the conflict. If a label is involved in more than one conflict, then the sum of all overlap vectors is taken. The resulting vector is then used in some strategies to reposition overlapping labels, thus freeing them from their conflicts. Unfortunately, this method has two disadvantages: the algorithm may get stuck in a local optimum or oscillate between two configurations without being able to resolve the conflict, and one overlap vector may become so large that there is not sufficient space left to move the corresponding label far enough in the necessary direction.

Another completely different approach was taken by Zoraster [Zor86, Zor90]. He formulates the problem as a 0-1 integer program where each point feature $i$, $i = 1, \dots, n$, has $k_i$ potential label positions. A 0-1 decision variable $x$ is used to represent a labeling, where $x_{ir} = 1$, $1 \le i \le n$, $1 \le r \le k_i$, denotes a label at position $r$ of feature $i$. Since we have to assign to every feature exactly one label position, $\sum_{r=1}^{k_i} x_{ir} = 1$ must hold for all $1 \le i \le n$. Moreover, pairwise overlaps of labels should be avoided, so we assume that the forbidden pairwise combinations are given by a set $R$. The additional constraints $x_{i_q s_q} + x_{i'_q s'_q} \le 1$ for each potential overlap $q \in R$ then guarantee that no two labels overlap. Finally, each position is given a weight (priority) $w_{ir}$ such that the objective is to minimize the overall weight. Since 0-1 integer programming itself is NP-hard, a Lagrangian relaxation is formed by relaxing the last mentioned constraints and introducing Lagrangian multipliers $d_q \ge 0$:

$$\begin{aligned} \min\ & \sum_{i=1}^{n} \sum_{r=1}^{k_i} w_{ir} x_{ir} + \sum_{q \in R} \left( x_{i_q s_q} + x_{i'_q s'_q} - 1 \right) d_q \\ \text{s.t.}\ & \sum_{r=1}^{k_i} x_{ir} = 1, \qquad 1 \le i \le n, \\ & x_{ir} \ge 0, \qquad 1 \le i \le n,\ 1 \le r \le k_i, \qquad d_q \ge 0, \quad q \in R. \end{aligned}$$

This relaxation is solved by subgradient methods with the aid of numerous tricks

to speed up convergence. Disadvantages of this method are that it may get stuck

in a local optimum and that it is very time consuming to solve such relaxations for

large problems.
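
To illustrate the principle, the following Python sketch evaluates such a Lagrangian relaxation and performs plain subgradient steps. It is a textbook-style sketch and not Zoraster's implementation: the function names, the step size rule $1/t$ and the tiny example are assumptions, and none of the convergence tricks mentioned above are included. For fixed multipliers the relaxed problem separates by feature, so it suffices to pick the cheapest position of each feature.

```python
def solve_relaxation(weights, overlaps, d):
    """Minimise the Lagrangian for fixed multipliers d.

    weights  : weights[i][r] = w_ir
    overlaps : list of pairs ((i, r), (j, s)) of conflicting positions
    d        : list of multipliers d_q >= 0, one per overlap
    """
    cost = [list(row) for row in weights]
    for (a, b), dq in zip(overlaps, d):
        cost[a[0]][a[1]] += dq
        cost[b[0]][b[1]] += dq
    # The problem separates by feature: pick the cheapest position of each.
    x = [min(range(len(row)), key=row.__getitem__) for row in cost]
    bound = sum(cost[i][x[i]] for i in range(len(x))) - sum(d)
    return x, bound

def subgradient(weights, overlaps, steps=50, step0=1.0):
    d = [0.0] * len(overlaps)
    best_bound = float("-inf")
    for t in range(1, steps + 1):
        x, bound = solve_relaxation(weights, overlaps, d)
        best_bound = max(best_bound, bound)
        for q, (a, b) in enumerate(overlaps):
            # subgradient component: violation of the relaxed constraint q
            g = (x[a[0]] == a[1]) + (x[b[0]] == b[1]) - 1
            d[q] = max(0.0, d[q] + step0 / t * g)
    return x, best_bound, d

# Two features with two positions each; the two preferred positions conflict.
weights = [[1, 2], [1, 2]]
overlaps = [((0, 0), (1, 0))]
x, best_bound, d = subgradient(weights, overlaps)
```

In this small example the best lower bound found equals 3, the weight of the best conflict-free labeling; note that the labeling returned by the relaxation itself need not be conflict-free.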

Also widespread are rule-based methods [FA87, CJ90, WB91, DF92]. These systems normally deal with a very general form of the label placement problem (including point, line and area features), where especially aesthetic aspects are to the fore. The goal of these methods is to embody the expertise of a human cartographer in an automated system, which is often achieved by the development of an expert system and the construction of a rule database. Of course such rules cover all necessary conditions like 'no overlaps are allowed', but also further aesthetic criteria like those shown in Table 4.1. All these rules are evaluated and then applied according to their priorities until a satisfactory labeling is found. If none of the placements is acceptable, then either the next-best rule is evaluated or backtracking becomes necessary.

Besides the already mentioned methods, the two local search methods Simulated Annealing and Tabu Search play an important role. Neither algorithm needs any further explanation; however, it should be stressed here that especially Simulated Annealing came off extremely well in a comparison with other methods for attacking the label placement problem ([CMS93, CMS95, CFMS97]). Further details on this comparison will be given in Section 4.8.

The last method which we present here was developed by Wagner and Wolff [WW98] for label placement with a discrete set of potential label positions. The approach consists essentially in cleverly thought-out preprocessing strategies operating on the so-called conflict graph (where edges represent conflicting placements). By checking certain (graph) properties in the conflict graph, the number of potential label positions can be reduced greatly, and the operations used guarantee that the optimal solution is never destroyed in this step. If no further reduction is possible, then a simple heuristic is used to eliminate further potential positions where still more than one possibility for a placement exists. This further reduces the number of potential label positions, but no longer guarantees optimality. The preprocessing part and the eliminating heuristic are applied alternately until a feasible placement is found. A great advantage of this method is its high speed, since the algorithm runs in linear or quadratic time (depending on the set of operations being used).

4.3 Problem Descriptions

In this section we introduce some basic notation used throughout this chapter. Moreover, we define three slightly different variants of the label placement problem and discuss their similarities and differences.

The setting is the same for all problems: Let $N := \{1, \dots, n\}$ be the set of point features to be labeled. Every feature has a finite number of potential label positions given by the set $K_i := \{1, \dots, k_i\}$. For each of these label positions there exists an additional positive priority or weight $w_{ir}$ for all $i \in N$, $r \in K_i$, where larger weights correspond to more desirable label positions. In analogy to the previous two chapters we denote by $(i, r)$ an assignment of position $r \in K_i$ to feature $i \in N$.

Using this notation we can define the set $\mathcal{R}$ as the set of all combinations of at most two label positions which overlap. Every element $R \in \mathcal{R}$, which we will also refer to as a constraint, is either $\{(i, r)\}$, if the label of feature $i$ at position $r$ overlaps some other point feature, or $\{(i, r), (j, s)\}$, $i \ne j$, if the corresponding two labels overlap each other. Note that here partial assignments $R \in \mathcal{R}$ with $|R| = 1$ are allowed (in contrast to Chapters 2 and 3), and we call $\mathcal{R}$ the set of forbidden assignments. The goal of the point feature label placement problem is to assign to each point feature exactly one of its potential positions such that all constraints in $\mathcal{R}$ are fulfilled. Such an assignment is then called a placement or labeling. If in a placement all positions of a forbidden assignment $R \in \mathcal{R}$ are used, then we speak of a conflict.
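
As an illustration of these definitions, the following Python sketch builds the set of forbidden assignments from geometric data and lists the conflicts of a placement. The representation of labels as open axis-parallel rectangles `(x1, y1, x2, y2)`, the 0-based indices and all function names are assumptions made for this sketch.

```python
from itertools import combinations

def overlaps(a, b):
    """True if the open axis-parallel rectangles a and b intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def contains(rect, pt):
    """True if the point pt lies strictly inside the rectangle rect."""
    return rect[0] < pt[0] < rect[2] and rect[1] < pt[1] < rect[3]

def forbidden_assignments(points, labels):
    """points[i] = (x, y); labels[i][r] = rectangle of position r of feature i."""
    R = set()
    for i, positions in enumerate(labels):
        for r, rect in enumerate(positions):
            if any(contains(rect, pt) for j, pt in enumerate(points) if j != i):
                R.add(frozenset({(i, r)}))        # label covers another feature
    for i, j in combinations(range(len(labels)), 2):
        for r, a in enumerate(labels[i]):
            for s, b in enumerate(labels[j]):
                if overlaps(a, b):
                    R.add(frozenset({(i, r), (j, s)}))
    return R

def conflicts(R, placement):
    """Forbidden assignments all of whose positions are used by the placement."""
    return [c for c in R if all(placement[i] == r for i, r in c)]

# Two features with labels of size 2 x 1 placed to the right (0) or left (1).
points = [(0, 0), (3, 0)]
labels = [[(0, 0, 2, 1), (-2, 0, 0, 1)],
          [(3, 0, 5, 1), (1, 0, 3, 1)]]
R = forbidden_assignments(points, labels)
```

In this example the only forbidden assignment is the pair consisting of the right position of the first feature and the left position of the second one, so the placement `[0, 1]` has one conflict and the placement `[0, 0]` has none.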

Subsequently we define the three most important problems studied in this chapter.

(P1) Minimize the number of overlaps

Find a placement such that the number of pairwise overlaps between two labels plus the number of overlaps between a label and a point feature is minimized while simultaneously the sum of the assigned position priorities is maximized.

(P2) Minimize the number of labels involved in a conflict

Find a placement such that the number of labels involved in a conflict is

minimized.

(P3) Point selection

By leaving a minimum number of point features unlabelled, find a conflict-free

placement for the remaining points.

Remarks

1. In problem (P1) two different objectives are given. In this case we mean an optimization in lexicographical order, i.e. first minimize the number of pairwise overlaps and, among all assignments with such a minimum number, find the one which maximizes the given priority function.

2. In problem (P2) the formulation 'minimizing the number of labels involved in a conflict' is equivalent to 'maximizing the number of well placed labels'.

3. If it is not mentioned explicitly that position priorities are given for (P1), then we assume that all weights are equal, i.e. all positions are equally desirable, and therefore we use $w_{ir} = 1$ for all $i \in N$, $r \in K_i$. In this case the second part of the objective in (P1) can be neglected.

In the sequel we want to clarify the difference between the objectives in (P1) and in (P2). We show that even if there are no overlaps between labels and features, i.e. $|R| = 2$ for all $R \in \mathcal{R}$, a reduction of the number of labels involved in a conflict does not necessarily mean a reduction of the number of pairwise overlaps. The following figure illustrates this difference with an example:

(a) 2 pairwise overlaps and 1 well placed label.

(b) 2 pairwise overlaps and 2 well placed labels.

Figure 4.2: Example demonstrating the difference between the objectives: 'minimize pairwise overlaps' and 'minimize conflicting labels'.

In Figure 4.2, white rectangles refer to unused potential label positions, light gray rectangles denote labels placed without overlaps and dark gray ones denote placed labels which overlap. We see that in both figures there exist two pairwise overlaps; however, in Figure 4.2(a) there is only one well placed label, whereas in Figure 4.2(b) two labels are placed without overlap.
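
The two counting arguments can also be compared on abstract conflict data. The following Python sketch uses a hypothetical set of forbidden pairs for five features with two positions each (0-based indices); it is not the instance drawn in Figure 4.2, but it reproduces the same numbers.

```python
def count_overlaps(R, placement):
    """Counting argument of (P1): number of forbidden assignments in use."""
    return sum(all(placement[i] == r for i, r in c) for c in R)

def count_conflicting_labels(R, placement):
    """Counting argument of (P2): labels involved in at least one conflict."""
    involved = set()
    for c in R:
        if all(placement[i] == r for i, r in c):
            involved.update(i for i, _ in c)
    return len(involved)

# Hypothetical conflict data for five features with positions 0 and 1.
R = [
    {(0, 0), (1, 0)}, {(2, 0), (3, 0)},   # conflicts used by placement a
    {(0, 1), (1, 1)}, {(1, 1), (2, 1)},   # conflicts used by placement b
]
a = [0, 0, 0, 0, 0]
b = [1, 1, 1, 1, 0]
```

Both placements contain two pairwise overlaps. In placement `a` four labels are involved in a conflict and only one label is well placed, whereas in placement `b` the two overlaps share a label, so that only three labels are involved and two labels are well placed.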

For general label placement problems it cannot be guaranteed that a conflict-free placement always exists. For this reason both (P1) and (P2) only seek to minimize the number of conflicting labels. In contrast to this, (P3), the problem of point selection, asks directly for a conflict-free placement. If not all labels can be placed without conflict, then some point features are allowed to remain unlabeled in order to avoid overlaps. Note, however, that even the problem of removing a minimum number of labels from a given labeling is NP-hard, since this problem corresponds to the vertex cover problem.

Before we start with a detailed description of the models in Section 4.5, we first

present in the next section a set of rules which can be applied to the problem in a

preprocessing step in order to reduce the search space.

4.4 Preprocessing Strategies

In this section we introduce several rules to preprocess the set $\mathcal{R}$ of forbidden assignments in order to reduce the number of potential label positions (and therefore the search space for our heuristic, which in turn speeds up the algorithm). Primarily these rules are designed for the label placement problem (P3) with point selection. In this case they guarantee that an optimal solution with the same objective value as before the reduction also exists in the reduced search space. However, at the end of this section we will briefly discuss how some of these rules can also be used for (P1) or (P2) if all potential label positions are equally desirable.

Let us assume that forbidden combinations are given by the set $\mathcal{R}$, consisting of constraints $R \in \mathcal{R}$ with $|R| \in \{1, 2\}$ as described in Section 4.3. Constraints of cardinality one correspond to label positions which overlap another point feature. We will refer to such constraints $R = \{(i,r)\}$ as forbidden positions, since in this case position $r$ of point feature $i$ must not be used. Constraints of cardinality two denote pairs of overlapping labels.

Since all constraints have cardinality at most two, they can also be represented by a so-called conflict graph. This graph is sometimes used to help with the fast detection of certain label configurations that allow an elimination of some label positions. More details on conflict graphs can be found in [WW98]. However, we will not deal here with the details of the conflict graph; instead we present in the sequel a set of preprocessing rules which is derived directly from the set $\mathcal{R}$ of forbidden assignments.

Each rule is also illustrated by a figure, where white rectangles correspond to potential label positions and light gray ones to positions which will be deleted after the rule's application. A dot in a rectangle marks a forbidden position and, finally, labels which can be fixed are shown in dark gray.

(R1) Position deletion:
If there exist point features with forbidden positions, then all those positions can be deleted. More precisely, for all features $i \in N$ with $\{(i,r)\} \in \mathcal{R}$ we delete position $r$ by setting $K_i := K_i \setminus \{r\}$. Additionally we remove all constraints in $\mathcal{R}$ in which $(i,r)$ is involved. Thus this procedure eliminates all constraints $R \in \mathcal{R}$ with $|R| = 1$.

[Figure: illustration of rule (R1) for a forbidden position $(i,r)$.]

Note that an overlap of position $(i,r)$ with a feature $j \neq i$ does not necessarily imply that $(i,r)$ overlaps some potential label position of feature $j$ as well, as can be seen in the right figure above.

Special case (Point deletion):
If all potential label positions of a point feature $i$ are forbidden, then this point cannot be labeled without overlap and therefore $i$ has to stay unlabeled. All constraints in which $i$ is involved can be removed from $\mathcal{R}$.
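Rule (R1), including the point deletion special case, might be sketched in Python as follows. The representation is an assumption made for illustration: positions as a dict from feature to a set of position indices, constraints as frozensets of (feature, position) pairs. The function name `apply_r1` is hypothetical, and the renumbering of the remaining positions is omitted.

```python
def apply_r1(positions, constraints):
    """Delete all forbidden positions and every constraint touching them.

    positions:   dict feature -> set of position indices (the sets K_i)
    constraints: set of frozensets of (feature, position) pairs, |R| in {1, 2}
    Returns (reduced positions, reduced constraints, features left unlabeled).
    """
    # Forbidden positions are exactly the constraints of cardinality one.
    forbidden = {next(iter(c)) for c in constraints if len(c) == 1}
    new_positions = {
        i: {r for r in K if (i, r) not in forbidden}
        for i, K in positions.items()
    }
    # Drop every constraint in which a deleted position is involved.
    new_constraints = {c for c in constraints if not (c & forbidden)}
    # Point deletion: a feature without remaining positions stays unlabeled.
    unlabeled = {i for i, K in new_positions.items() if not K}
    return new_positions, new_constraints, unlabeled


positions = {1: {1, 2}, 2: {1}}
constraints = {
    frozenset({(1, 1)}),
    frozenset({(2, 1)}),
    frozenset({(1, 1), (2, 1)}),
    frozenset({(1, 2), (2, 1)}),
}
reduced_positions, reduced_constraints, unlabeled = apply_r1(positions, constraints)
```

In this toy instance feature 2 loses its only position and therefore stays unlabeled, while feature 1 keeps position 2 with no remaining constraints.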

(R2) Preselection:
If there exists a point feature $i$ which has at least one potential position $r$ without conflict with another label or point feature, then the label can be fixed at this position and all constraints in which $i$ is involved can be removed from $\mathcal{R}$.

[Figure: illustration of rule (R2) for a conflict-free position $(i,r)$.]

(R3) Dominated positions:
If there exists a point feature $i$ with at least two non-forbidden positions $r$ and $\bar r$, then we say $(i,\bar r)$ is dominated by $(i,r)$ if
$$\{(j,s) \mid \{(i,\bar r),(j,s)\} \in \mathcal{R}\} \subseteq \{(j,s) \mid \{(i,r),(j,s)\} \in \mathcal{R}\}.$$
In other words, all labels which are in conflict with position $(i,\bar r)$ are also in conflict with position $(i,r)$. In this case we can remove position $(i,r)$ as a potential label position by setting $K_i := K_i \setminus \{r\}$ and eliminate all constraints in $\mathcal{R}$ which contain $(i,r)$.

[Figure: illustration of rule (R3) for positions $(i,r)$ and $(i,\bar r)$.]

This reduction preserves optimal solutions (see also Wagner and Wolff [WW98]): if we have to place the label either at $(i,r)$ or at $(i,\bar r)$, then placing it at $(i,\bar r)$ always results in fewer or equally many overlapped other labels (see figure above).
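The containment test behind rule (R3) can be sketched in Python. This is an illustrative sketch under assumptions: forbidden positions are taken to be already removed by (R1), constraints are stored as frozensets of (feature, position) pairs, and the helper names are hypothetical. When two positions have identical conflict sets, the sketch simply drops the one it meets first.

```python
def conflict_partners(pos, constraints):
    """All positions that form a cardinality-two constraint with pos."""
    return {q for c in constraints if len(c) == 2 and pos in c
            for q in c if q != pos}


def remove_dominated(i, positions, constraints):
    """Apply (R3) to feature i until no further position can be removed."""
    K = set(positions[i])
    changed = True
    while changed:
        changed = False
        for r in sorted(K):
            for rb in sorted(K):
                # Every conflict partner of (i, rb) also conflicts with (i, r),
                # so position r can be dropped in favour of rb.
                if rb != r and (conflict_partners((i, rb), constraints)
                                <= conflict_partners((i, r), constraints)):
                    K.discard(r)
                    constraints = {c for c in constraints if (i, r) not in c}
                    changed = True
                    break
            if changed:
                break
    return K, constraints


positions = {1: {1, 2}, 2: {1}, 3: {1}}
constraints = {
    frozenset({(1, 1), (2, 1)}),
    frozenset({(1, 1), (3, 1)}),
    frozenset({(1, 2), (2, 1)}),
}
K1, reduced = remove_dominated(1, positions, constraints)
```

Here position (1, 2) conflicts only with (2, 1), which also conflicts with (1, 1), so position 1 of feature 1 is removed together with its two constraints.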

(R4) Pairwise conflicting solution:
This reduction was introduced in [WW98]. If there exists a point feature $i$ with a position $r$ which is only in conflict with some $(j,s)$, and $j \neq i$ has a potential position $\bar s \neq s$ which is only overlapped by $(i,\bar r)$, $\bar r \neq r$, then we set the labels $(i,r)$ and $(j,\bar s)$ and delete all other potential positions of $i$ and $j$.

[Figure: illustration of rule (R4) for positions $(i,r)$, $(i,\bar r)$, $(j,s)$ and $(j,\bar s)$.]

In the preprocessing phase we first apply rule (R1) to the constraint set to remove superfluous constraints. Then rules (R2)-(R4) are applied repeatedly to every point feature until no further reduction is possible.

If all potential label positions are equally desirable, then rules (R2)-(R4) can also be applied to problems (P1) and (P2) (with the restriction that (R3) is only applied to non-forbidden positions). However, (R1) must not be applied. On the one hand, this is because of the special case of point deletion, where all positions would be eliminated (a situation only allowed when using point selection). On the other hand, even if there exists a non-forbidden position, (R1) must not be used if we want to guarantee that an optimal solution is obtained. Figure 4.3 shows a situation where placing a label at position $(i,r)$ yields a better result, even though this is a forbidden position.

Figure 4.3: Example of why (R1) must not be used for (P1) or (P2) when optimality should be preserved.

Finally, it should be mentioned that when some potential label positions of a feature $i$ are deleted, they are removed from the set $K_i$. In this case we also have to reduce the number of potential positions, i.e. $k_i := k_i - 1$, and we renumber the remaining label positions such that there are no gaps and all integers up to $k_i$ are used, i.e. $K_i = \{1, \dots, k_i\}$.


4.5 Models for Label Placement

Having thoroughly discussed the different problem types and preprocessing strategies in the previous sections, we concentrate now on the development of different models for solving the described problems (P1)-(P3). All three problems will be formulated as variants of a C-SAP, where, depending on the model, forbidden assignments are included either as constraints or in the form of an objective function. Possible combinations of the solutions of the C-SAP with additional postprocessing procedures will be discussed in Section 4.6.

In this chapter we handle the notion of the 'C-SAP' loosely and allow, in contrast to Definition 1.2, that

(i) there may exist partial assignments $R \in \mathcal{R}$ with $|R| = 1$,

(ii) each label $i \in N$ may have a different number $k_i$ of positions.

4.5.1 Minimizing the Number of Pairwise Overlaps (P1)

The objective of problem (P1) is to minimize the number of pairwise overlaps and of labels overlapping other point features while simultaneously maximizing a position priority function. If (P1) has a solution without overlaps, then it can be formulated as a C-SAP. To this end we introduce for every point feature a decision variable $x_i$, $i \in N$. Each potential label position corresponds to a value $r \in K_i$, and we denote the assignment $x_i = r$ by $(i,r)$. The constraint set of the C-SAP is given by the set of forbidden assignments $\mathcal{R}$. Introducing finally variables $p_{ir} \in \{0,1\}$, we get the following C-SAP for (P1):

Model (M1):

$$\max\ z_1(p) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} w_{ir}\, p_{ir} \tag{4.1}$$
$$\text{s.t.}\quad \sum_{r=1}^{k_i} p_{ir} = 1 \qquad \forall i \in N \tag{4.2}$$
$$p_{ir} \in \{0,1\} \qquad \forall i \in N,\ r \in K_i \tag{4.3}$$
$$\prod_{(i,r) \in R} p_{ir} = 0 \qquad \forall R \in \mathcal{R}. \tag{4.4}$$

Note that in the special case when no position priorities $w$ are given (or all of them are equal), the C-SAP (4.1)-(4.4) simplifies to a 2-SAT problem (since the objective function is constant).


Moreover, we want to stress that there is no guarantee whatsoever that the C-SAP (4.1)-(4.4) has a feasible solution. However, when trying to solve it by FPH, developed in the previous two chapters, we know that in order to find a feasible assignment for $p$, the number of unsatisfied constraints (and therefore the number of pairwise overlaps) is minimized and the objective value maximized. Thus FPH performs exactly the two optimization tasks required by this problem.

(P1) is one of the most commonly studied formulations of the label placement problem. The suggested C-SAP model (M1) together with FPH will also be used as a basic approach for solving the other versions of the label placement problem. For example, we will see in Section 4.6 that the C-SAP (4.1)-(4.4) can also be combined with a postprocessing procedure and then be used as a solution method for problem (P2).

4.5.2 Minimizing the Number of Conflicting Labels (P2)

The second model is designed for problem (P2): minimize the number of conflicting labels. We know already from Section 4.3 that (P2) is equivalent to maximizing the number of well placed labels. On the other hand, the examples depicted in Figures 4.4 and 4.5 clearly demonstrate the difference between the objectives of (P1) and (P2). For this reason we use a different approach regarding the forbidden assignments in the model described in this section. We refrain from using constraints (4.4) as in (M1) and model conflicting situations directly by a new objective function, thus transforming (P2) into an (extended) SAP.

(a) 2 pairwise conflicts with 4 bad labels.
(b) 2 pairwise conflicts with 3 bad labels.

Figure 4.4: Motivating example for the objective function of (P2).

Let us first study the situation depicted in Figure 4.4 in order to construct criteria for the new objective function. We see that there exist two possibilities to place the label of feature 2: $(2,t)$ and $(2,b)$, where $t$ denotes the top position and $b$ stands for bottom. Both placements have two unsatisfied constraints; however, the placement in Figure 4.4(b) is preferable for (P2), since it results in a placement with one well placed label.

Moreover, we see that features 3 and 4 always overlap each other and are therefore involved in a conflict. Hence our new objective function should prefer position $(2,b)$ to $(2,t)$, because in the former case the number of well placed labels (1) minus the number of additionally generated conflicts (0) is maximal.

Mathematically we can describe this situation as follows:

Let us denote the set of all potential conflict partners of a label $(i,r)$ by
$$\mathcal{L}(i,r) := \{(j,s) \mid \{(i,r),(j,s)\} \in \mathcal{R}\}. \tag{4.5}$$

Consider a label placement with $p_{ir} = 1$, i.e. a label placed at position $(i,r)$ for some $i \in N$, $r \in K_i$. The contribution of $(i,r)$ to the objective function (counting the number of well placed labels) has to satisfy the following two criteria:

• If $(i,r)$ overlaps another label, then $(i,r)$ is not well placed and the contribution to the objective function is zero. In this case the number of overlaps of $(i,r)$ with other labels is irrelevant.

• On the contrary, if none of the label positions in $\mathcal{L}(i,r)$ is used, then $(i,r)$ is a well placed label and we add one to the objective value.

An objective function satisfying the described criteria is given by
$$\tilde z_2(p) := \sum_{i=1}^{n} \sum_{r=1}^{k_i} p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}). \tag{4.6}$$

We have already seen that for a feasible 0-1 assignment $p$, $\tilde z_2(p)$ counts the number of well placed labels if $|R| = 2$ for all $R \in \mathcal{R}$. Moreover, for arbitrary values $p_{ir} \in [0,1]$, $i \in N$, $r \in K_i$, the product in (4.6) is always less than or equal to one, and the semi-assignment constraints imply that for a fixed $i$ the second sum in (4.6) is also less than or equal to one. Hence, for a fixed $i$, the maximum of this sum is one, which is certainly attained if a label of feature $i$ is placed without conflict.

By the following small modification of $\tilde z_2(p)$ we can also include constraints $R$ with $|R| = 1$ in our objective function. Let us define
$$c_{ir} := \begin{cases} 0, & \text{if } \{(i,r)\} \in \mathcal{R} \\ 1, & \text{otherwise} \end{cases} \qquad \forall i \in N,\ r \in K_i. \tag{4.7}$$

Then the objective function
$$z_2(p) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} c_{ir}\, p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}) \tag{4.8}$$
counts the number of well placed labels.

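Hence, the problem of minimizing the number of conflicting labels (P2) can be modeled by the following optimization problem:

Model (M2):

$$\max\ z_2(p) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} c_{ir}\, p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}) \tag{4.9}$$
$$\text{s.t.}\quad \sum_{r=1}^{k_i} p_{ir} = 1 \qquad \forall i \in N \tag{4.10}$$
$$p_{ir} \in \{0,1\} \qquad \forall i \in N,\ r \in K_i \tag{4.11}$$

where $c_{ir}$ is defined by (4.7).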
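As a plausibility check, the objective of (M2) can be evaluated directly for a 0-1 placement. The following Python sketch is only an illustration: the dictionary-based encoding of $p$, the helper names and the three-feature instance are assumptions, not part of the model description.

```python
def partners(pos, constraints):
    """Conflict partners of a label position: the set defined in (4.5)."""
    return {q for c in constraints if len(c) == 2 and pos in c
            for q in c if q != pos}


def z2(p, positions, constraints):
    """Objective (4.9): sum of c_ir * p_ir * prod(1 - p_js) over partners."""
    total = 0.0
    for i, K in positions.items():
        for r in K:
            # c_ir = 0 for forbidden positions, 1 otherwise, as in (4.7).
            c_ir = 0.0 if frozenset({(i, r)}) in constraints else 1.0
            prod = 1.0
            for (j, s) in partners((i, r), constraints):
                prod *= 1.0 - p[(j, s)]
            total += c_ir * p[(i, r)] * prod
    return total


def one_hot(placement, positions):
    """0-1 vector p for a placement given as dict feature -> position."""
    return {(i, r): 1.0 if placement[i] == r else 0.0
            for i, K in positions.items() for r in K}


# Hypothetical instance: one pairwise conflict and one forbidden position.
positions = {1: [1, 2], 2: [1, 2], 3: [1, 2]}
constraints = {frozenset({(1, 1), (2, 1)}), frozenset({(3, 1)})}

value_a = z2(one_hot({1: 1, 2: 1, 3: 2}, positions), positions, constraints)
value_b = z2(one_hot({1: 1, 2: 2, 3: 1}, positions), positions, constraints)
```

In the first placement labels 1 and 2 overlap, so only label 3 is well placed. In the second placement label 3 sits on a forbidden position, so labels 1 and 2 are the well placed ones.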

Remarks

1. We will refer to (4.9)-(4.11) as an extended SAP. The difference to the classical SAP is that (4.9) is not an A-polynomial as given by Definition 2.4. Though all exponents of the variables in (4.9) are still zero or one, there may now exist monomials of the form $p_{ir} p_{js} p_{jt}$ with $s \neq t$. This is a consequence of the definition of $\mathcal{L}(i,r)$ in (4.5) and the fact that label $(i,r)$ may overlap two neighboring labels of the same point feature: $(j,s)$ and $(j,t)$.
However, though this destroys the 'linearity in $p_i$' property of the objective function, other properties derived in Chapter 2 are still preserved. In particular, $F[Y]$ is still a strict growth transformation, since Theorem 2.2 was proven by Baum and Sell for arbitrary polynomials with positive coefficients.

2. Let us compare the operators $F[Q]$ for the unweighted C-SAP (4.2)-(4.4) and $F[Y]$ for the extended SAP (4.9)-(4.11). In this comparison we assume that there are no forbidden positions and therefore $c_{ir} = 1$ for all $i \in N$, $r \in K_i$. Using the notation of Chapter 3 we can express the objective function of the extended SAP also by $z_2(p) = \sum_i \sum_r p_{ir}\, \Theta_{ir}(p)$, where $\Theta_{ir}(p)$ is defined as in (3.6). If we denote by $Y_{ir}(z_2,p)$ the partial derivatives of $z_2$, we see immediately that
$$Y_{ir}(z_2,p) = \Theta_{ir}(p) - Q_{ir}(p) \tag{4.12}$$
with
$$Q_{ir}(p) := \sum_{(j,s):\ (i,r) \in \mathcal{L}(j,s)} p_{js} \prod_{(u,v) \in \mathcal{L}(j,s) \setminus \{(i,r)\}} (1 - p_{uv}). \tag{4.13}$$

Thus the operator $F[Y]$ of the extended SAP (4.9)-(4.11) and the operator $F[Q]$ of the unweighted C-SAP (4.2)-(4.4) of the previous model differ only in the terms contained in $Q = (Q_{ir})$.

Note that if we assume that $p$ is a feasible 0-1 assignment, then we get the following interpretation for $Q_{ir}(p)$: the sum in (4.13) goes over all potential conflicting neighbors of $(i,r)$. Thus it counts the labels for which a conflict would be resolved by deletion of label $(i,r)$.

3. Unfortunately, the objective function $z_2$ in (4.9) contains negative coefficients, which may result in negative partial derivatives $Y_{ir}(z_2,p)$, as we have seen in (4.12). An easy way to get rid of this problem is to substitute the equality $(1 - p_{js}) = \sum_{t \neq s} p_{jt}$ in the objective function. We get
$$\hat z_2(p) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} c_{ir}\, p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)}\ \sum_{\substack{t=1 \\ t \neq s}}^{k_j} p_{jt} \tag{4.14}$$
which agrees with $z_2(p)$ on $\Delta_1 \times \dots \times \Delta_n$, where $\Delta_i := \{p \in [0,1]^{k_i} \mid \sum_{r=1}^{k_i} p_r = 1\}$ for all $i \in N$. Then its partial derivatives are given for all $i \in N$, $r \in K_i$ by
$$Y_{ir}(\hat z_2,p) = c_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}) + \sum_{\substack{(j,s):\ (i,t) \in \mathcal{L}(j,s) \\ t \neq r}} c_{js}\, p_{js} \prod_{(u,v) \in \mathcal{L}(j,s) \setminus \{(i,t)\}} (1 - p_{uv})$$
and are therefore non-negative.
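The claim that the substituted objective coincides with $z_2$ whenever every feature's variables sum to one can be checked numerically. The Python sketch below is an illustration under assumptions: the small instance, the helper names and the random sampling scheme are hypothetical. The instance deliberately lets one label overlap two positions of the same feature, so that a cubic monomial occurs.

```python
import random


def partners(pos, constraints):
    """Conflict partners of a label position, as in (4.5)."""
    return {q for c in constraints if len(c) == 2 and pos in c
            for q in c if q != pos}


def objective(p, positions, constraints, substituted):
    """Evaluate (4.9), or its substituted form (4.14) if substituted is True."""
    total = 0.0
    for i, K in positions.items():
        for r in K:
            if frozenset({(i, r)}) in constraints:  # c_ir = 0
                continue
            prod = 1.0
            for (j, s) in partners((i, r), constraints):
                if substituted:
                    # Replace (1 - p_js) by the sum of the other p_jt.
                    prod *= sum(p[(j, t)] for t in positions[j] if t != s)
                else:
                    prod *= 1.0 - p[(j, s)]
            total += p[(i, r)] * prod
    return total


def random_simplex_point(positions, rng):
    """Random p whose entries are positive and sum to one for each feature."""
    p = {}
    for i, K in positions.items():
        raw = [rng.random() + 1e-9 for _ in K]
        norm = sum(raw)
        for r, x in zip(K, raw):
            p[(i, r)] = x / norm
    return p


positions = {1: [1, 2], 2: [1, 2, 3], 3: [1, 2]}
constraints = {
    frozenset({(1, 1), (2, 1)}),
    frozenset({(1, 1), (2, 2)}),  # (1,1) overlaps two positions of feature 2
    frozenset({(2, 3), (3, 1)}),
    frozenset({(3, 2)}),          # forbidden position
}
rng = random.Random(0)
samples = [random_simplex_point(positions, rng) for _ in range(1000)]
max_gap = max(abs(objective(p, positions, constraints, False)
                  - objective(p, positions, constraints, True))
              for p in samples)
```

Up to floating point rounding, `max_gap` is zero. Away from the simplices the two polynomials generally differ, which is why the agreement is only claimed on the product of simplices.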

4.5.3 Models for Label Placement with Point Selection (P3)

To model problem (P3), we extend each point feature $i \in N$ with a virtual label position $(i,0)$, thus enlarging the set of potential label positions to $K_i \cup \{0\}$. This new position stands for the predicate 'feature $i$ stays unlabeled'.

Since we want to guarantee the existence of a feasible solution, we do not constrain these new positions and leave the constraint set $\mathcal{R}$ unchanged. However, to prevent these new positions from being used too frequently, which would leave too many features unlabeled, we use an objective function which penalizes the positions $(i,0)$. In fact, we can use objective function (4.1) with $w_{i0} = a$ and $w_{ir} = b$ for all $r \in K_i$, $i \in N$, where $0 \le a < b$. Choosing $a = 0$ and $b = 1$, the objective function counts for any feasible 0-1 vector $p$ the number of (well) placed labels. Hence, the following C-SAP describes the label placement problem with point selection (P3):

Model (M3):

$$\max\ \sum_{i=1}^{n} \sum_{r=1}^{k_i} p_{ir} = \sum_{i=1}^{n} (1 - p_{i0}) \tag{4.15}$$
$$\text{s.t.}\quad \sum_{r=0}^{k_i} p_{ir} = 1 \qquad \forall i \in N \tag{4.16}$$
$$p_{ir} \in \{0,1\} \qquad \forall i \in N,\ r \in K_i \cup \{0\} \tag{4.17}$$
$$\prod_{(i,r) \in R} p_{ir} = 0 \qquad \forall R \in \mathcal{R}. \tag{4.18}$$

Note that this C-SAP always has a feasible solution, since in the worst case all features can be left unlabeled. The choice of the weights $a$ and $b$ has a non-negligible effect on the performance of FPH. In our numerical experiments in Section 4.8 we have shifted the weights in order to avoid vanishing $p_{i0}$ terms in the objective function: we set $w_{i0} = 1$ and $w_{ir} = 2$ for all $r \in K_i$, $i \in N$.

In our second model for label placement with point selection we combine the ideas of the extended SAP of Section 4.5.2 and the concept of virtual label positions of the previous model. The objective once more consists of maximizing the number of well placed labels, however now with the additional possibility of leaving certain features unlabeled. This leads to the following extended SAP:

$$\max\ z_3(p) := \sum_{i=1}^{n} \sum_{r=1}^{k_i} p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)} (1 - p_{js}) = \sum_{i=1}^{n} \sum_{r=1}^{k_i} p_{ir} \prod_{(j,s) \in \mathcal{L}(i,r)}\ \sum_{\substack{t=0 \\ t \neq s}}^{k_j} p_{jt} \tag{4.19}$$
$$\text{s.t.}\quad \sum_{r=0}^{k_i} p_{ir} = 1 \qquad \forall i \in N \tag{4.20}$$
$$p_{ir} \in \{0,1\} \qquad \forall i \in N,\ r \in K_i \cup \{0\}. \tag{4.21}$$

We know that $p_{i0}$ determines whether feature $i$ stays unlabeled or not, and therefore the same arguments as used for objective function (4.9) guarantee that (4.19) counts the number of well placed labels correctly.

Note that this formulation has no constraint set. All constraints $R \in \mathcal{R}$ with $|R| = 1$ can be eliminated in a preprocessing step by rule (R1). Constraints with $|R| = 2$ are already covered by the objective function.

As numerical tests have shown, FPH (for the extended SAP) yields very poor results for this model. The reason most probably lies in the structure of the model. We use the virtual label positions $(i,0)$ to deal with unlabeled features. However, since this is an undesired state, the corresponding variables do not occur in the objective function (they only enter after the transformation to obtain positive coefficients). As a result, the partial derivatives corresponding to these variables contain only very little of the essential information needed by the operator $F[Y]$.

In order to overcome this problem we develop in the next section a different approach for (P3) which is still based on the extended SAP (4.9)-(4.11), but does not use virtual positions.

4.6 Postprocessing Strategies

In the sequel we describe how some of the models presented in the previous section can be combined with a postprocessing procedure in order to improve the result obtained by FPH. The first postprocessing is responsible for the task of minimizing the number of overlapping labels. Postprocessing 2 converts an arbitrary labeling into a conflict-free placement by removing a small number of conflicting labels, thus satisfying the point selection criterion of problem (P3).

Postprocessing (PP1):
This postprocessing is used to minimize the number of badly placed labels for a given label placement. To this end we reposition labels by a greedy method in such a way that the number of overlaps at an already existing conflict may increase while, simultaneously, some other labels in the neighborhood of this conflict are freed. More precisely, as long as we can improve the number of well placed labels, we repeat: choose a badly placed label and try to reposition it without producing a new conflict. If such a repositioning frees another label from a conflict, then fix the label at the new position and repeat the step.
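One possible reading of this greedy step is sketched below in Python. It is not the thesis implementation. As an assumption, 'without producing a new conflict' is interpreted as 'no previously well placed label becomes badly placed', and a move is accepted only if the number of well placed labels strictly increases. The data layout, the function names and the four-feature instance (modeled on the counts given for Figure 4.4) are hypothetical.

```python
def well_placed(placement, constraints):
    """Features whose chosen label position violates no constraint."""
    chosen = set(placement.items())
    bad = set()
    for c in constraints:
        if c <= chosen:  # all positions of this constraint are in use
            bad.update(i for (i, _) in c)
    return set(placement) - bad


def pp1(placement, positions, constraints):
    """Greedy repositioning sketch for postprocessing (PP1)."""
    current = dict(placement)
    improved = True
    while improved:
        improved = False
        good = well_placed(current, constraints)
        for i in sorted(set(current) - good):  # badly placed labels only
            for r in sorted(positions[i]):
                if r == current[i]:
                    continue
                trial = dict(current)
                trial[i] = r
                new_good = well_placed(trial, constraints)
                # Keep every well placed label and gain at least one more.
                if good <= new_good and len(new_good) > len(good):
                    current, improved = trial, True
                    break
            if improved:
                break
    return current


positions = {1: [0], 2: [0, 1], 3: [0], 4: [0]}
constraints = {
    frozenset({(1, 0), (2, 0)}),
    frozenset({(3, 0), (4, 0)}),
    frozenset({(2, 1), (3, 0)}),
}
start = {1: 0, 2: 0, 3: 0, 4: 0}
result = pp1(start, positions, constraints)
```

Starting from four badly placed labels, the sketch moves label 2 to its second position. The number of pairwise overlaps stays at two, but label 1 becomes well placed. The loop terminates because each accepted move strictly increases the number of well placed labels.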

Figure 4.5 shows even more drastically than Figure 4.2 that minimizing the number of pairwise overlaps (P1) may even contradict the objective of maximizing the number of conflict-free labels (P2). Figure 4.5(a) depicts a placement with only three pairwise overlaps and no well placed labels. After application of (PP1) we get the labeling shown in Figure 4.5(b) with four pairwise conflicts but also one well placed label.

(a) Placement after the optimization phase.
(b) Placement improved by postprocessing 1.

Figure 4.5: Contradicting objectives and improvement by postprocessing (PP1).

It should be obvious that this method only tries to reposition one label locally in order to maximize the number of well placed labels. This is also the reason why, for example, the placement shown in Figure 4.2(a) cannot be improved by this postprocessing. In that situation two labels have to be repositioned in order to obtain an improved placement. Changing only one label does not improve (and may even worsen) the number of well placed labels, which is why (PP1) does not work there.

Postprocessing (PP2):
Most algorithms dealing with point selection (P3) first try to minimize the number of overlapping labels, e.g. by solving (P1) or (P2), and afterwards perform a postprocessing in which conflicting labels are removed. The goal of (PP2) is to transform an arbitrary placement into a conflict-free placement satisfying the point selection criterion by removing a small number of labels.

The main idea of (PP2) is to delete labels in decreasing node-degree order in the conflict graph of a fixed placement. Hence, labels which overlap many other labels are deleted first, and only at the end are pairwise conflicts resolved. In detail, the procedure works as follows:

1. Delete all labels which overlap another point feature (apply (R1)).

2. Compute for the label of each point feature $i$ the number $d(i)$ of labels which it overlaps.

3. Remove labels in decreasing $d(i)$ order until all conflicts are resolved.

Figure 4.4 shows that it can be advantageous to perform postprocessing (PP1) before continuing with the point selection procedure described above. In the first labeling, shown in Figure 4.4(a), $d(i) = 1$ for $i = 1, \dots, 4$ and two labels (e.g. of features 1 and 3) have to be removed in order to get a conflict-free labeling. In Figure 4.4(b) we have $d(1) = 0$, $d(2) = d(4) = 1$ and $d(3) = 2$. Therefore we first remove the label of feature 3, which already resolves all conflicts and yields three well placed labels.
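The three steps of (PP2) can be sketched in Python as follows. Two details are assumptions of this sketch rather than statements of the text: the degrees $d(i)$ are recomputed after every removal, so that a label whose only conflict partner has already been removed is not deleted unnecessarily, and ties are broken by the smallest feature index. The instance is hypothetical and only reproduces the degrees quoted for Figure 4.4.

```python
def pp2(placement, constraints):
    """Remove labels in decreasing conflict degree until none overlap.

    placement:   dict feature -> chosen position
    constraints: set of frozensets of (feature, position) pairs, |R| in {1, 2}
    Returns the dict of labels that are kept.
    """
    kept = dict(placement)
    # Step 1: delete labels placed on forbidden positions.
    for i, r in list(kept.items()):
        if frozenset({(i, r)}) in constraints:
            del kept[i]
    while True:
        # Step 2: d(i) = number of kept labels overlapped by the label of i.
        chosen = set(kept.items())
        degree = {i: 0 for i in kept}
        for c in constraints:
            if len(c) == 2 and c <= chosen:
                for (i, _) in c:
                    degree[i] += 1
        if not degree or max(degree.values()) == 0:
            return kept
        # Step 3: remove a label of maximum degree (smallest index on ties).
        worst = max(sorted(degree), key=degree.get)
        del kept[worst]


constraints = {
    frozenset({(1, 0), (2, 0)}),
    frozenset({(3, 0), (4, 0)}),
    frozenset({(2, 1), (3, 0)}),
}
kept_a = pp2({1: 0, 2: 0, 3: 0, 4: 0}, constraints)  # degrees 1, 1, 1, 1
kept_b = pp2({1: 0, 2: 1, 3: 0, 4: 0}, constraints)  # degrees 0, 1, 2, 1
```

In the first placement two labels have to be removed, in the second a single removal of feature 3 suffices.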


4.7 Ambiguity

In the case of sparse maps, conflict-free placements can normally be found easily. Hence, in these situations the problem is not so much the satisfaction of all feasibility constraints in $\mathcal{R}$, but rather the avoidance of ambiguous placements.

When solving problem (P1) we use position priorities as weights for the label positions, which leads in the corresponding C-SAP model to the linear objective function (4.1). Obviously, (4.1) does not avoid ambiguity, as the placement in Figure 4.6 shows:

(a) Very little space between labels may result in ambiguous placements.
(b) Alternative placement without danger of ambiguity.

Figure 4.6: Resolving ambiguous placements.

If we assume the position priorities suggested by Yoeli [Yoe72] (Figure 4.1, upper right position is best), then the placement shown in Figure 4.6(a) is optimal for problem (P1). However, the reader may find maps where labels lie close together difficult to read and prefer the placement shown in Figure 4.6(b). Especially in situations where two (or more) different corners of labels lie close to the center of one point feature (here 1 and 3), the placement may become ambiguous such that the reader cannot be sure which label belongs to which feature.

In order to minimize the occurrence of such ambiguous placements we present in the sequel two problem formulations dealing with this aspect. In both cases we want to avoid labels being placed too close to each other, thus preventing the reader from mixing up different names and avoiding confusion due to ambiguity. We assume in the remainder of this section that all labels are given by axis-parallel rectangles, where the (bounding box of the) label of feature $i$ has length $l_i$ and height $h_i$. Moreover, we do not take into account any given position priorities in the following two problem formulations, but discuss at the end of this section a possible modification of the objective function of the second model such that position priorities can also be used.

Since ambiguity is only a local property, the new objective functions of the problems will also be based on local observations. Therefore we restrict the search for potentially ambiguous placements between two features $i$ and $j$ with $i > j$ to only those features $j$ which lie in a neighborhood of $i$. This neighborhood is given by an axis-parallel rectangle $R_i(j)$ whose size depends on the dimensions of the labels of features $i$ and $j$. The threshold for the critical distance between two labels within which ambiguity can occur is set to 10% of the label's length and height. Therefore we define the neighborhood $R_i(j)$ for a feature $i$ with respect to $j$ as the axis-parallel rectangle of length $2.2(l_i + l_j)$ and height $2.2(h_i + h_j)$, centered at feature $i$ (see Figure 4.7). This neighborhood was chosen such that point features which lie far apart (more than 10% of their length/height) are neglected, since they do not influence each other. On the other hand, if feature $j$ lies in $R_i(j)$, it is regarded as a potential candidate for an ambiguous placement, and in this case we write $f(j) \in R_i(j)$.
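The neighborhood test amounts to a simple rectangle check, sketched here in Python. The function name and argument layout are hypothetical, the factor 2.2 is written as 2(1 + 0.1) to expose the 10% margin, and treating the boundary as part of the neighborhood is an assumption.

```python
def in_neighborhood(feature_i, feature_j, size_i, size_j, margin=0.1):
    """Check whether feature j lies in the rectangle R_i(j) around feature i.

    feature_i, feature_j: (x, y) coordinates of the point features
    size_i, size_j:       (length, height) of the respective labels
    """
    (xi, yi), (xj, yj) = feature_i, feature_j
    (li, hi), (lj, hj) = size_i, size_j
    length = 2 * (1 + margin) * (li + lj)  # 2.2 (l_i + l_j) for margin 0.1
    height = 2 * (1 + margin) * (hi + hj)  # 2.2 (h_i + h_j) for margin 0.1
    # The rectangle is centered at feature i, so compare with half extents.
    return abs(xj - xi) <= length / 2 and abs(yj - yi) <= height / 2


label = (50, 15)  # label size used for illustration
near = in_neighborhood((0, 0), (100, 30), label, label)
too_far_right = in_neighborhood((0, 0), (120, 0), label, label)
too_far_up = in_neighborhood((0, 0), (0, 34), label, label)
```

For two labels of size 50 x 15 the rectangle has half extents of 110 and 33, so the first feature is a candidate while the other two are ignored.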

Figure 4.7: Neighborhood $R_i(j)$ for point feature $i$ used by (AMBI1) and (AMBI2).

The objective of the first problem (AMBI1) is, aside from minimizing the number of overlaps, to maximize the sum of the distances between the centers of all non-forbidden label positions $(i,r)$, $(j,s)$ for all features $j < i$ which lie in $R_i(j)$. This can also be formulated by means of a quadratic A-polynomial used as an objective function for the basic C-SAP of (M1) presented in Section 4.5.1.

Let $(i,r)$ and $(j,s)$ be two non-conflicting label positions. Then we define the weight $w_{irjs}$ as the distance between the centers of the two labels $(i,r)$ and $(j,s)$. Thus the new objective function is given by
$$z_1(p) := \sum_{i=1}^{n}\ \sum_{\substack{j \in N,\ j < i: \\ f(j) \in R_i(j)}}\ \sum_{\substack{r \in K_i,\ s \in K_j: \\ \{(i,r),(j,s)\} \notin \mathcal{R}}} w_{irjs}\, p_{ir}\, p_{js} \tag{4.22}$$
and the C-SAP (4.22), (4.2)-(4.4) models (AMBI1). When solving this C-SAP by FPH we search for a conflict-free placement while simultaneously maximizing the distances between pairs of labels in the given neighborhood. One typical result of this method is shown in Figure 4.6(b), where the distances between the label centers are maximal.

Unfortunately, this formulation does not cope with all possible situations. Though it normally works very well, there exist situations where aesthetically unattractive placements cannot be avoided by (AMBI1). In some cases two labels are placed close to each other since their distances to a third label sum up to a larger objective value (see Figure 4.8). Assume all labels are $4 \times 2$ rectangles and the centers of the label positions are given by the coordinates L1: (2,1), L2top: (4,3), L2bottom: (4,5), L3: (7,9). Then an easy computation shows that the objective function is
$$z_1(p) = \sqrt{8}\, p_1 p_{2t} + \sqrt{20}\, p_1 p_{2b} + \sqrt{45}\, p_{2t} p_3 + 5\, p_{2b} p_3.$$
Therefore the placement shown in Figure 4.8(a) with $p_{2t} = 1$ has objective value $\sqrt{8} + \sqrt{45} \approx 9.54$, which is slightly larger than the objective value $5 + \sqrt{20} \approx 9.47$ of the placement shown in Figure 4.8(b) with $p_{2b} = 1$.
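The arithmetic of this example is easy to reproduce. The short Python check below uses only the center coordinates quoted above. The dictionary keys and the function name are chosen for illustration, and, as in the objective function stated in the text, no term for the pair L1 and L3 is included.

```python
from math import dist

# Label centers as given in the example.
centers = {"L1": (2, 1), "L2top": (4, 3), "L2bottom": (4, 5), "L3": (7, 9)}


def ambi1_value(choice_for_label_2):
    """Sum of center distances from label 2 to labels 1 and 3."""
    c2 = centers[choice_for_label_2]
    return dist(centers["L1"], c2) + dist(c2, centers["L3"])


value_top = ambi1_value("L2top")        # sqrt(8) + sqrt(45)
value_bottom = ambi1_value("L2bottom")  # sqrt(20) + 5
```

The top position wins by a margin of roughly 0.06, even though it puts label 2 much closer to label 1.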

(a) Placement with larger objective value, found by (AMBI1).
(b) Aesthetically more attractive placement.

Figure 4.8: Placement where (AMBI1) does not work.

In order to avoid such situations we propose a second problem formulation called (AMBI2).

The goal of (AMBI2) is to avoid placements where label edges lie very close together. Hence, we do not compute the distances between the centers of two labels, but the distances between the closest parallel edges. Identifying such situations is straightforward for any person looking at a placement, but not at all so clear for a computer. Let us look at the three placements depicted in Figure 4.9. It is obvious that in Figure 4.9(a) we should compute the distance between the horizontal edges and in Figure 4.9(b) the one between the vertical edges of the two labels. However, in Figure 4.9(c) the smallest distance between two parallel edges would be between the horizontal edges (equal to zero), though the two labels lie horizontally far apart.

(a) Vertical distance between two labels.
(b) Horizontal distance between two labels.
(c) Horizontal distance, though the vertical distance is smaller (zero).

Figure 4.9: Distance rules for placement with (AMBI2).

For this reason we have to refine our criterion by additionally taking into account the relative positions of the labels to each other. We sort the labels by $(x,y)$-coordinates and compute the line through both label centers. Then we check whether it first intersects a horizontal or a vertical edge of the first label. In the first case we compute the vertical distance between the labels, in the second one the horizontal distance (see distance $d$ in Figure 4.9).

Let $d_{irjs}$ denote the appropriate distance between two parallel edges of the two labels at positions $(i,r)$ and $(j,s)$ as described before. Moreover, we define a maximum distance $d_{\max}$, where it is assumed that any pair of labels farther apart than $d_{\max}$ is non-critical with respect to the danger of confusion. Now we can define the weights $w_{irjs} \in [0,1]$ for all non-conflicting pairs of potential label positions $(i,r)$, $(j,s)$ with $i > j$ and $f(j) \in R_i(j)$ by
$$w_{irjs} := \begin{cases} 1, & d_{irjs} > d_{\max} \\ d_{irjs} / d_{\max}, & \text{otherwise.} \end{cases} \tag{4.23}$$
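The distance rule and the resulting weight can be sketched in Python. Several points are assumptions of this sketch and not statements of the text: labels are given by their center and (length, height), a line through a corner is treated as hitting the vertical edge, the edge distance is clamped at zero, and below the threshold the weight is taken to grow linearly as $d / d_{\max}$. The function names are hypothetical.

```python
def ambi2_distance(center_a, size_a, center_b, size_b):
    """Distance between the closest parallel edges chosen by the line rule."""
    # Sort the two labels by (x, y) coordinates of their centers.
    (c1, s1), (c2, s2) = sorted([(center_a, size_a), (center_b, size_b)])
    dx, dy = abs(c2[0] - c1[0]), abs(c2[1] - c1[1])
    (l1, h1), (l2, h2) = s1, s2
    # The line from c1 towards c2 leaves the first label through a horizontal
    # edge iff (h1 / 2) / dy < (l1 / 2) / dx, i.e. iff dy * l1 > dx * h1.
    if dy * l1 > dx * h1:
        return max(0.0, dy - (h1 + h2) / 2)  # vertical distance
    return max(0.0, dx - (l1 + l2) / 2)      # horizontal distance


def ambi2_weight(d, d_max):
    """Weight in [0, 1]; the linear ramp below d_max is an assumption."""
    return 1.0 if d > d_max else d / d_max


size = (50, 15)
d_vertical = ambi2_distance((0, 0), size, (0, 20), size)      # one above the other
d_horizontal = ambi2_distance((0, 0), size, (60, 0), size)    # side by side
d_far_apart = ambi2_distance((0, 0), size, (200, 15), size)   # zero vertical gap
```

The third pair has a vertical edge gap of zero, but the line through the centers leaves the first label through its vertical edge, so the horizontal distance of 150 is used and the pair receives the full weight of one.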

Using these weights in objective function (4.22), (AMBI2) can be modeled by the same C-SAP as (AMBI1), given by (4.22), (4.2)-(4.4). Note, however, that in contrast to (AMBI1), in the objective function of the second model several of the weights $w_{irjs}$ computed for the features $i$ and $j$ may often be equal. On the one hand, this is because distances are measured with regard to the same edge even for different label positions of the same feature; on the other hand, it results from ignoring large distances, whose weights are all set to one.

If, in addition to the described ambiguity distances, we also want to handle given position priorities, we can modify the objective function as follows. Instead of using the weights suggested in (4.23), we multiply them by the position priorities and in this way also include the placement preferences in the objective function (4.22). Of course, the weights of all label positions of features $j$ which do not lie in the neighborhood $R_i(j)$ remain unchanged and are therefore set to their corresponding position priority.


It should be mentioned here that the two described methods could either be applied to the whole problem or used in a postprocessing step for any region of the map which the reader may find difficult to read.

The three pictures in Figure 4.10 represent typical placements for an instance with 150 point features on a map of size $792 \times 612$. Every point feature has four potential label positions and each label has a fixed size of $50 \times 15$. Moreover, the critical distance $d_{\max}$ for (AMBI2) is set to the label height, $d_{\max} := 15$, and the weights for the objective function in (AMBI2) are those defined by (4.23) multiplied by 100. Figure 4.10(a) shows a placement found by FPH for model (M1) without any postprocessing. It uses Yoeli's position priorities with the weights $w_{i1} = 8$, $w_{i2} = 2$, $w_{i3} = 1$, $w_{i4} = 4$, if numbered clockwise starting with the upper right position. Figures 4.10(b) and 4.10(c) depict placements found by (AMBI1) and (AMBI2) respectively, where we refrained from using position priorities in order to demonstrate the results of these methods without any side effects. We see that in all three cases all labels have been placed without overlap.

Since avoidance of ambiguity is one of the aesthetic criteria for label placement, every problem formulation dealing with this aspect represents only one subjective view. Hence, it is not possible to compare the results of different problem formulations and say which one is best. If this were possible and such a criterion for comparison existed, then we could use this criterion directly as the objective function of a new problem formulation. However, as we see from Figures 4.10(b) and 4.10(c), both ambiguity avoiding methods distribute the labels much more evenly throughout the free space than the standard method in Figure 4.10(a). We see that both methods, (AMBI1) and (AMBI2), have regions where labels are placed very well, and others where the other method found a better placement. In most of these situations a close placement of two labels is unavoidable and there exists only the choice of placing a label close to one or the other label. For this reason, and since these methods could also be used locally, it is up to the designer of the map to choose the appropriate placement strategy depending on the current situation.

4.8 Computational Results

The following subsections summarize the results of our numerical tests and our experience with the different models developed in the previous sections. We start in Subsection 4.8.1 by giving a description of the test set used in these experiments. Subsequently we discuss the usage of the preprocessing rules (R1)-(R4) and point out their effectiveness regarding the reduction of the search space. In Subsection 4.8.3 we compare the placements obtained by using FPH on models (M1) with postprocessing (PP1) and (M2), as well as (M3) and (M4) in case of point selection. Then these results are compared to the best-known method from literature for this test set, where again both cases, with point selection allowed and prohibited, are discussed. Finally, we conclude this section with some implementation details and further description of the parameters used in this comparison.

Figure 4.10: Placements of ambiguity avoiding strategies (AMBI1) and (AMBI2). (a) 150 labels placed by (M1) with position priorities. (b) Placement found by (AMBI1). (c) Placement found by (AMBI2).

4.8.1 Test Set

Our numerical experiments were carried out on a test set introduced by Christensen et al. [CMS95]. In recent years this test set has become a major benchmark for the comparison of different methods dealing with the label placement problem and is therefore widely used.

An instance consists of a set of randomly placed point features on a fixed size

square where the dimensions of the map and labels were chosen by the authors in

order to identify a typical map scale for an 11 by 8.5 inch page size. Hence, for one

instance n point features are placed randomly on a map of size 792 by 612. Each

point feature has four potential label positions (upper left, right and lower left,

right) and all labels are axis-parallel rectangles of fixed size 30 x 7 with no position

priorities given. The objective always consisted in either solving problem (P2), i.e.

finding a placement which maximizes the number of well placed labels, or solving

problem (P3) when point selection is allowed.
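The structure of such an instance can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the generator of [CMS95]: the uniform distribution, the coordinate convention (y pointing upwards), the order of the four positions, the rule that each label touches its point with one corner, and all function names are hypothetical.

```python
import random

MAP_W, MAP_H = 792, 612      # map size of the test set
LABEL_W, LABEL_H = 30, 7     # fixed label size of the test set

def candidate_labels(x, y):
    """Four axis-parallel rectangles (x0, y0, x1, y1) that touch the point
    with one corner: upper right, upper left, lower left, lower right.
    (The corner-touching convention is an assumption.)"""
    return [
        (x, y, x + LABEL_W, y + LABEL_H),
        (x - LABEL_W, y, x, y + LABEL_H),
        (x - LABEL_W, y - LABEL_H, x, y),
        (x, y - LABEL_H, x + LABEL_W, y),
    ]

def overlaps(a, b):
    """True if the open interiors of two rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def random_instance(n, seed=0):
    """n uniformly placed points together with their candidate labels."""
    rng = random.Random(seed)
    points = [(rng.uniform(0, MAP_W), rng.uniform(0, MAP_H)) for _ in range(n)]
    return points, [candidate_labels(x, y) for x, y in points]
```

With n = 750 this yields 3000 candidate positions, and pairs of overlapping candidates of different points correspond to the forbidden partial assignments of the models.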

Tests were run in particular for medium dense (n = 750) and dense (n = 1500) maps. In all figures of the following subsections we represent labels overlapping other labels or point features in black and labels which are placed without conflict in light gray. By common agreement we assume that in all examples conflict-free labels crossing the bounding box of the map are counted as well placed labels.

4.8.2 Reduction by Preprocessing

In this section we investigate the effectiveness of the preprocessing rules (R1)-(R4) presented in Section 4.4. We applied these rules to 10 instances of the test set

described in Section 4.8.1 for n = 750 and n = 1500, respectively. Table 4.2 shows

the average results for the smaller problem and Table 4.3 those for the larger one

where we additionally distinguish between the cases that point selection is allowed

or forbidden. Since every point feature has four potential label positions we have

for every instance of the smaller problem a total of 3000 potential positions.


With point selection:

| | number of fixed point features | % | number of removed label positions | % | number of eliminated constraints | % |
|---|---|---|---|---|---|---|
| orig. problem | 750 | | 3000 | | 7734 | |
| Rule 1 | 6 | 0.80 | 931 | 31.03 | 5538 | 71.61 |
| Rules 2-4 | 497 | 66.27 | 1555 | 51.83 | 1678 | 21.70 |
| total | 503 | 67.07 | 2486 | 82.86 | 7216 | 93.31 |
| remaining | 247 | 32.93 | 514 | 17.14 | 518 | 6.69 |

Without point selection:

| | number of fixed point features | % | number of removed label positions | % | number of eliminated constraints | % |
|---|---|---|---|---|---|---|
| orig. problem | 750 | | 3000 | | 7734 | |
| Rules 2-4 | 226 | 30.13 | 962 | 32.07 | 1750 | 22.63 |
| remaining | 524 | 69.87 | 2038 | 67.93 | 5984 | 77.37 |

Table 4.2: Average reduction of constraints by pre-processing for problems with n = 750.

The upper part of Table 4.2 shows the results for instances where point selection is allowed. Only in that case may rule (R1) be applied. Moreover, it is always used before the other rules, since it reduces the constraint set and therefore also influences the reduction achieved by rules (R2)-(R4). However, once it has been applied, it does not need to be used again. In contrast, (R2)-(R4) have to be applied several times in order to achieve a maximum reduction (see also Section 4.4).

The values in the first column correspond to the number of labels which could be fixed (and thus completely eliminated) during the preprocessing phase. However, if point selection is allowed there are two possibilities to fix a label: either a point feature stays unlabeled or, on the contrary, its label can be placed without conflict. The first situation can only occur after application of rule (R1), hence the number in the line for (R1) corresponds to the number of unlabeled point features. On the other hand, fixing of label positions with respect to conflict-free placements is carried out in rules (R2) and (R4). The corresponding line in the tables states the number of labels placed without overlap. In both cases the corresponding variables need not be included in the optimization phase.

The second column gives the number of removed label positions, which is the sum of all positions that are either eliminated or fixed.

Finally, the last column in Table 4.2 shows the number of eliminated constraints. The major reduction (about 72%) is achieved by rule (R1), and only about 20% could be eliminated by the other rules. In total, however, more than 90% of all constraints could be removed by the preprocessing rules (R1)-(R4).
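The percentages quoted for the constraint column can be re-derived from the counts in Table 4.2. The following Python sketch does so; the function name is hypothetical, and because the table reports averages over 10 instances rounded to integers, the recomputed values may differ from the printed ones in the last digit.

```python
def reduction_percentages(original, by_rule1, by_rules_2_4):
    """Share of the original constraints removed by each group of rules."""
    total = by_rule1 + by_rules_2_4

    def pct(count):
        return 100.0 * count / original

    return {
        "rule1": pct(by_rule1),
        "rules2_4": pct(by_rules_2_4),
        "total": pct(total),
        "remaining": pct(original - total),
    }

# Constraint counts for n = 750 with point selection (Table 4.2)
constraints = reduction_percentages(7734, 5538, 1678)
```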


Comparing the situation where point selection is allowed to the one where it is forbidden (lower part of Table 4.2), we see that in the first case about 2/3 of all points could be fixed in advance, whereas in the second case only about 1/3 of them were fixed. This is a direct consequence of the application of rule (R1) in the first case, which eliminates a large number of label positions (and constraints) and therefore makes subsequent placement easier.

With point selection:

| | number of fixed point features | % | number of removed label positions | % | number of eliminated constraints | % |
|---|---|---|---|---|---|---|
| orig. problem | 1500 | | 6000 | | 30521 | |
| Rule 1 | 115 | 7.61 | 3113 | 51.88 | 25760 | 84.40 |
| Rules 2-4 | 141 | 9.40 | 475 | 7.92 | 832 | 2.73 |
| total | 256 | 17.01 | 3588 | 59.8 | 26592 | 87.13 |
| remaining | 1244 | 82.99 | 2412 | 40.2 | 3929 | 12.87 |

Without point selection:

| | number of fixed point features | % | number of removed label positions | % | number of eliminated constraints | % |
|---|---|---|---|---|---|---|
| orig. problem | 1500 | | 6000 | | 30521 | |
| Rules 2-4 | 42 | 2.80 | 196 | 3.27 | 583 | 1.91 |
| remaining | 1458 | 97.20 | 5804 | 96.73 | 29938 | 98.09 |

Table 4.3: Average reduction of constraints by pre-processing for problems with n = 1500.

Comparing the results of Table 4.2 to those of the larger instances shown in Table 4.3, we see that in both cases the largest number of constraints is eliminated by rule (R1). It can also be observed that the percentage of this reduction is higher for dense problems, because in that case the probability of the occurrence of forbidden positions (overlapped point features) is much higher.

All other reductions are smaller for the larger instances, since it is much more difficult

to fix labels or remove positions when there is little free space left. The situation

becomes even worse for dense problems with forbidden point selection (see lower

part of Table 4.3). In that case all reductions are less than 5%. However, since

the preprocessing rules can be carried out extremely fast they are used even in this

situation.

4.8.3 Comparison of the Models

To compare our different models we used the test set described in Section 4.8.1 and

generated 25 instances with 750 and 1500 point features each.


Models

Throughout the remaining three subsections we denote the models and corresponding methods as follows: (M1) is the C-SAP (4.1)-(4.4) for label placement without point selection as described in Section 4.5.1. The second model without point selection, (M2), is defined by the extended SAP (4.9)-(4.11).

For label placement with point selection we have the C-SAP (4.15)-(4.18) as (M3). Furthermore we define another new method, (M4), in order to overcome the weakness of the model given by (4.19)-(4.21). (M4) does not use virtual label positions, but is based on an extended SAP. First, we apply preprocessing rules (R1)-(R4) to the constraints and then we solve the extended SAP (4.9)-(4.11) as described in Section 4.5.2. After each run of FPH we fix all well placed labels. This reduces the problem size and also modifies the constraint set. We continue by solving the remaining problem (again applying (R2)-(R4)) and repeat this procedure as long as an improvement can be achieved. Only then do we apply postprocessing (PP2) to get a final, conflict-free placement.
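The control flow of (M4) can be sketched as follows. This is a schematic outline only, not the implementation used in the tests: all four callables (`preprocess`, `solve`, `fix_well_placed`, `count_bad`) are hypothetical placeholders for the components named in the text, and the stopping test "no strict improvement" is our reading of "as long as an improvement can be achieved".

```python
def fix_and_resolve(problem, preprocess, solve, fix_well_placed, count_bad):
    """Repeat: solve, fix the well placed labels, reduce, solve again.

    preprocess(problem, first) stands for (R1)-(R4) on the first call and
    for (R2)-(R4) afterwards; solve(problem) stands for one run of FPH;
    fix_well_placed(problem, placement) returns the reduced problem;
    count_bad(placement) measures the quality of a placement.
    """
    problem = preprocess(problem, True)
    best = solve(problem)
    while True:
        reduced = preprocess(fix_well_placed(problem, best), False)
        candidate = solve(reduced)
        if count_bad(candidate) >= count_bad(best):
            return best      # no improvement; postprocessing (PP2) would follow
        problem, best = reduced, candidate

# Toy stand-ins: a "problem" and a "placement" are both just a number of bad labels.
result = fix_and_resolve(
    6,
    preprocess=lambda prob, first: prob,
    solve=lambda prob: prob,
    fix_well_placed=lambda prob, placement: max(placement - 1, 3),
    count_bad=lambda placement: placement,
)
```

In the toy run each round removes one bad label until the floor of three is reached, at which point the loop stops.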

All C-SAPs and extended SAPs are solved by FPH. Further details on these methods and their parameter settings for FPH can be found in Section 4.8.5.

Comparison

The following table summarizes the test results for the two models without point selection. Since the objective function in (M1) minimizes only the pairwise overlaps, we performed postprocessing (PP1) subsequent to the optimization phase. In (M2) the objective function already maximizes the number of well placed labels directly and therefore no postprocessing is necessary. For each instance we used 10 different starting points and performed 1000 iterations for each of them. Out of these 10 values we computed the best, average and worst solution.
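The aggregation behind the tables of this section can be written down compactly. The sketch below assumes, in line with the caption discussion of Table 4.4, that best, average and worst are first taken per instance over the starting points and then averaged over the instances; the function name and the data layout are hypothetical.

```python
def summarize(results):
    """results[i][j] = number of bad labels for instance i and starting point j.

    Returns (best, average, worst), each averaged over all instances.
    """
    n = len(results)
    best = sum(min(runs) for runs in results) / n
    average = sum(sum(runs) / len(runs) for runs in results) / n
    worst = sum(max(runs) for runs in results) / n
    return best, average, worst

# Two instances with three starting points each (made-up numbers)
best, average, worst = summarize([[3, 5, 4], [10, 8, 9]])
```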

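| model | result | 750 point features: bad labels | % | 1500 point features: bad labels | % |
|---|---|---|---|---|---|
| (M1) | worst | 100.24 | 13.37 | 1078.16 | 71.88 |
| (M1) | average | 97.11 | 12.95 | 1071.48 | 71.43 |
| (M1) | best | 94.44 | 12.59 | 1064.80 | 70.99 |
| (M1)+(PP1) | worst | 77.80 | 10.37 | 817.48 | 54.50 |
| (M1)+(PP1) | average | 75.00 | 10.00 | 808.63 | 53.91 |
| (M1)+(PP1) | best | 72.24 | 9.63 | 799.48 | 53.30 |
| (M2) | worst | 67.80 | 9.04 | 673.32 | 44.89 |
| (M2) | average | 64.84 | 8.65 | 667.10 | 44.47 |
| (M2) | best | 62.00 | 8.27 | 661.32 | 44.09 |

Table 4.4: Comparison of (M1) and (M2).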


Table 4.4 shows the average number of badly placed labels over the 25 instances for the worst, average and best label placement found. The second column of each block shows the percentage of bad labels with respect to the total number of point features.

We clearly see the increasing improvement from top to bottom. The worst case of each new model is always better than the best solution of the previous one. It is not surprising that (M1) used without postprocessing achieves quite poor results, since its objective of minimizing the number of pairwise overlaps gives at best just an idea where to search for good solutions. (M2) performs very well and its results come close to those of the best method (Simulated Annealing) of the comparison carried out by Christensen et al. [CMS95] (see Section 4.8.4).

Besides these results we also experimented with the number of iterations in order to

see whether an increased number of iterations (and running time) can achieve better

results. The average results of the 25 instances for 200, 1000 and 10000 iterations

are shown for n = 750 in Table 4.5.

| (M2), 750 points | 200 iterations: bad labels | % | 1000 iterations: bad labels | % | 10000 iterations: bad labels | % |
|---|---|---|---|---|---|---|
| worst | 71.64 | 9.55 | 67.80 | 9.04 | 65.64 | 8.75 |
| average | 67.96 | 9.06 | 64.84 | 8.65 | 62.92 | 8.39 |
| best | 64.52 | 8.60 | 62.00 | 8.27 | 60.44 | 8.06 |

Table 4.5: Varying the number of iterations for (M2) with 750 point features.

Solving an instance for one starting point with 1000 iterations took about 22 seconds on a Sun Ultra Sparc workstation. Hence, it follows from the above results that if time is not a critical factor, the number of iterations can be increased in order to improve the quality of the label placement. However, if time is critical, the improvement of less than 1% probably does not justify the longer running times.
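To make this trade-off concrete, the average values of Table 4.5 give the following gains, expressed as a share of the 750 point features. This is only a worked calculation from the reported numbers; the assumption that the running time grows roughly linearly with the number of iterations is ours and is not stated above.

$$
\frac{67.96 - 64.84}{750} \approx 0.42\,\% \quad (200 \to 1000 \text{ iterations}), \qquad
\frac{64.84 - 62.92}{750} \approx 0.26\,\% \quad (1000 \to 10000 \text{ iterations}).
$$

Under that assumption, a tenfold running time (roughly 220 instead of 22 seconds per starting point) buys about a quarter of a percentage point, which corresponds to about two additional well placed labels on average.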

Two typical placements are depicted in Figure 4.11. In the first picture, Figure 4.11(a), we see the placement found after 1000 iterations by (M1) with postprocessing (PP1). It has 79 badly placed labels. Figure 4.11(b) shows the result for the same instance solved by (M2), which is slightly better and has only 67 bad labels.

4.8.4 Comparison to Other Heuristics

Since we have seen in the experiments of the last subsection that model (M2) works best for label placement without point selection, we now want to compare this best model with other heuristics developed for this task.

Figure 4.11: Comparison of (M1) and (M2) with the example of a random map with 750 point features with point selection prohibited. (a) (M1) with postprocessing (PP1): 79 bad labels. (b) (M2) after 1000 iterations: 67 bad labels.

As already mentioned in Section 4.8.1 we use the test set introduced by Christensen et al. [CMS95] of 25 randomly generated maps with n = 750 and n = 1500 point features. In their paper the authors compared several different heuristics on this test set and in all cases Simulated Annealing achieved the best results.

A summary of their results for the problem without point selection is given in Table 4.6. It shows the ranking of the three best methods, the number of conflicting labels and the approximate fraction of badly placed labels. The names of the methods refer to those described in Section 4.2.

| ranking | 750 point features: method | bad | % | 1500 point features: method | bad | % |
|---|---|---|---|---|---|---|
| 1. | Sim. Annealing | 60 | 8 | Sim. Annealing | 615 | 41 |
| 2. | Hirsch | 135 | 18 | Gradient Descent | 975 | 65 |
| 3. | Zoraster | 157 | 21 | Hirsch | 1140 | 76 |

Table 4.6: Comparison of methods for instances without point selection. Taken from Figure 11 in [CMS95].

To carry out a fair comparison on the same platform we also implemented the described Simulated Annealing and compared its results to our method. Moreover, we included in our tests a random placement, in order to get a trivial lower bound on the placement quality, and a greedy procedure consisting of our postprocessing (PP1) applied to a random placement.

The following tables summarize the results of our tests. The parameters for Simulated Annealing were chosen as suggested in [CMS95] and are described in Section 4.8.5. However, in contrast to their paper, where every instance was solved only once, we solve each of the 25 instances 10 times, taking the best, average and worst result of these 10 runs.

One run of Simulated Annealing took 18 seconds for n = 750 and 84 seconds for n = 1500 on a Sun Ultra Sparc workstation, slightly less time than needed for 1000 iterations of (M2).

For some instances the best placements were found by (M2) and for others by Simulated Annealing. As we can see from Table 4.7, Simulated Annealing is on average slightly better than (M2). However, for both problem sizes the difference is at most 0.7 percentage points, for the best as well as for the worst case. Hence, (M2) can be placed second in the ranking shown in Table 4.6. Some typical placements for one instance of this test series are shown in Figure 4.12.


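| method | result | 750 point features: bad labels | % | 1500 point features: bad labels | % |
|---|---|---|---|---|---|
| Sim. Ann. | worst | 63.08 | 8.41 | 663.13 | 44.21 |
| Sim. Ann. | average | 60.75 | 8.10 | 656.92 | 43.79 |
| Sim. Ann. | best | 58.32 | 7.78 | 650.80 | 43.39 |
| (M2) | worst | 67.80 | 9.04 | 673.32 | 44.89 |
| (M2) | average | 64.84 | 8.65 | 667.10 | 44.47 |
| (M2) | best | 62.00 | 8.27 | 661.32 | 44.09 |
| Greedy (PP1) | worst | 214.40 | 28.59 | 1002.80 | 66.85 |
| Greedy (PP1) | average | 198.39 | 26.45 | 982.16 | 65.48 |
| Greedy (PP1) | best | 182.36 | 24.31 | 962.88 | 64.19 |
| Random | worst | 550.84 | 73.45 | 1383.04 | 92.20 |
| Random | average | 532.78 | 71.04 | 1365.52 | 91.03 |
| Random | best | 513.12 | 68.42 | 1350.24 | 90.02 |

Table 4.7: Comparison of different heuristics for random maps without point selection.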

Table 4.8 shows the results of a comparison of (M3) and (M4) with Simulated Annealing for problem (P3), label placement with point selection. Both C-SAP models are combined with postprocessing (PP2) in order to guarantee a conflict-free placement. Compared to each other, all three methods achieved very similar results, with only small differences regarding the quality of the solution. (M4) found slightly better results for the smaller problems, whereas Simulated Annealing was better for the larger ones.

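| method | result | 750 point features: bad labels | % | 1500 point features: bad labels | % |
|---|---|---|---|---|---|
| (SA) | worst | 48.08 | 6.41 | 548.80 | 36.59 |
| (SA) | average | 45.85 | 6.11 | 541.77 | 36.12 |
| (SA) | best | 43.60 | 5.81 | 535.44 | 35.70 |
| (M3)+(PP2) | worst | 46.88 | 6.25 | 554.32 | 36.95 |
| (M3)+(PP2) | average | 45.91 | 6.12 | 550.80 | 36.72 |
| (M3)+(PP2) | best | 45.12 | 6.03 | 547.52 | 36.50 |
| (M4)+(PP2) | worst | 46.24 | 6.17 | 554.32 | 36.95 |
| (M4)+(PP2) | average | 44.90 | 5.99 | 549.53 | 36.64 |
| (M4)+(PP2) | best | 43.64 | 5.82 | 545.72 | 36.38 |

Table 4.8: Comparison of different heuristics for random maps with point selection.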

A point of greater difference is the speed of the algorithms: Simulated Annealing needs about 35 seconds for one run on the smaller and 122 seconds on the larger problem, whereas (M4) needs only 6 and 50 seconds, respectively. Fastest was (M3), with only 4 and 22 seconds, due to the large elimination of label positions in the preprocessing.

Figure 4.12: Comparison of different methods for n = 750 (left) and n = 1500 (right) features. Black labels (number in brackets) indicate conflicting labels. Panels, left / right: Simulated Annealing (64) / (667); (M2) (67) / (679); Greedy (PP1) (198) / (973); Random (528) / (1364).

Finally, it should be mentioned that with our re-implemented Simulated Annealing we could not reproduce on our test instances the good results of the paper [CMS95], which reports only about 27% badly placed labels for 1500 point features.

Summing up, we see that our results for (M2) and (M4) are comparable to those of Simulated Annealing, the best-known method for this benchmark test set. Moreover, the combination of an extended A-polynomial as objective function and a set of forbidden assignments as constraint set also allows this model to be combined with additional, e.g. aesthetic, aspects, as we have seen in the case of avoidance of ambiguity in Section 4.7. Many other aspects, such as the size of cities (i.e. the importance that a point is labeled) or label preferences, could be included in the objective function in this way.

4.8.5 Implementation Details and Parameter Settings

In this subsection we discuss some implementation aspects and describe the

parameters of FPH used in our tests.

All methods were implemented in C and run on a Sun Ultra Sparc workstation. The visualization tool was implemented by B. Mateev in Objective-C under NeXT-Step on a PP200. Instances which were to be visualized were generated by the visualization tool, which determined the sets of weighted partial assignments for the objective function and the forbidden partial assignments for the constraints. These two sets were then used as input for the algorithm (which can also be used stand-alone), which computed a placement in the form of an assignment vector.

In the sequel we describe the parameters of FPH which were used in our comparisons for solving the C-SAPs (and their extensions). The algorithm was implemented as described in Section 3.3.1. For this reason we report here only the parameter settings and the combination with pre- and postprocessing.

Recentering:

All methods apply a recentering of the variables $p_{ir}$, $i \in N$, $r \in K_i$, every 10 iterations. The goal of this recentering is to prevent the algorithm from getting stuck too early at the boundary of the feasible region. For this reason we add the constant 0.6 to all values in $p$ and normalize afterwards, such that $\sum_{r \in K_i} p_{ir} = 1$ for all $i \in N$.
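The recentering step itself is a one-liner per point feature. The following Python sketch illustrates it on a list-of-rows representation of $p$; the data layout and the function name are our own choices for illustration, not those of the C implementation.

```python
def recenter(p, shift=0.6):
    """Add a constant to every entry of p and renormalize each row to sum to one.

    p[i][r] is the weight of label position r for point feature i.
    """
    recentered = []
    for row in p:
        shifted = [value + shift for value in row]
        total = sum(shifted)
        recentered.append([value / total for value in shifted])
    return recentered

# A row that has almost converged to its first position is pulled back
# towards the centre of the simplex while the ordering is preserved.
near_vertex = [[0.97, 0.01, 0.01, 0.01]]
pulled_back = recenter(near_vertex)
```

Since the same constant is added to every entry, the ranking of the positions within a row is unchanged; only the distance to the boundary of the feasible region grows.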


Operator/Iteration/Computation:

As described in Section 3.3 we use here as well the Gauss-Seidel version $F'[\xi]$ of the operator with fitness function $\xi := \Upsilon^{\alpha} \Theta^{\beta}$ for the C-SAP. Again the exponents $\alpha, \beta > 0$ were used to control the intensity of either dynamics. We experimented with many different values for the exponents and finally set them to $\alpha := 10$ and $\beta := 0.5$, which seems to work best for these label placement instances.

To avoid numerical instability as far as possible, we initialize $\Theta$ with a large constant $\Theta_{init} := 10^{100}$ such that successive multiplication with factors of the form $(1 - \prod p_{ir})$ will not cause numerical problems too soon. The stopping criterion was given by a fixed number of iterations, which was set to 1000, if not mentioned differently.
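The effect of the large initial constant can be illustrated with ordinary double precision arithmetic. The sketch below is only a toy experiment under our own assumptions (a constant factor of 0.5 and IEEE 754 doubles, as used by Python floats); it does not reproduce the actual factors of the algorithm.

```python
def steps_until_underflow(start, factor):
    """Number of multiplications by `factor` until the product becomes exactly 0.0."""
    value, steps = start, 0
    while value > 0.0:
        value *= factor
        steps += 1
    return steps

steps_from_one = steps_until_underflow(1.0, 0.5)
steps_from_large = steps_until_underflow(1e100, 0.5)
```

Starting from $10^{100}$ instead of 1 postpones the underflow by several hundred multiplications in this toy setting, which is the purpose of the initialization described above.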

Combination with Pre- and Postprocessing:

The two models (M1) and (M2) for label placement without point selection are generally used together with the preprocessing rules (R2)-(R4). However, (M1) reduces in the unweighted case to a Satisfiability problem, using only the $\Theta$-dynamics, and (M2) corresponds to an extended SAP which does not use the constraints in $\mathcal{R}$. Since (M1) does not minimize the number of badly placed labels directly, it is normally used in combination with postprocessing (PP1).

The other two methods (M3) and (M4) are designed for label placement with point selection. (M3) consists of an objective function and the constraint set $\mathcal{R}$ and is therefore a classical C-SAP. The weights used in the objective function are set to $w_{i0} = 1$ and $w_{ir} = 2$ for all $r \neq 0$, $i \in N$. (M4) is, in analogy to (M2), an extended SAP which does not use virtual label positions. Both methods use preprocessing rules (R1)-(R4). However, since in (M4) FPH is applied more than once, each time fixing the well placed labels in between, we use (R1) only the first time. Due to the application of preprocessing rule (R1), the extended SAP for (P3) can be solved much faster than for (P2). Moreover, (M3) and (M4) are both combined with postprocessing (PP2) to ensure a conflict-free placement.

Simulated Annealing:

The parameters for Simulated Annealing were taken from the paper [CMS95]. The feasible set for configuration changes was the entire set of labels. The initial temperature $T_0$ in the annealing schedule was chosen such that a worse placement is accepted with the probability $p = \exp(-\Delta E / T)$ prescribed in [CMS95] when the change in the objective function is $\Delta E = 1$. At each temperature a maximum of $20n$ labels are repositioned and then the temperature is decreased by 10 per cent. If more than $5n$ successful configuration changes are made at any temperature, the temperature is immediately decreased. This process is repeated for at most 50 temperature stages; it terminates earlier if the algorithm stays at a particular temperature for the full $20n$ steps without accepting a single label repositioning.
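The annealing schedule can be sketched in Python as follows. This is not our C implementation and not the code of [CMS95]: the function names, the callback `delta_e`, the toy objective and the value `p0 = 0.5` in the usage example are all assumptions made for illustration, and "successful change" is read as "accepted change".

```python
import math
import random

def anneal(n, k, delta_e, p0, rng):
    """Annealing schedule with n labels and k positions per label.

    delta_e(assign, i, r) returns the change of the objective if label i is
    moved to position r; p0 is the acceptance probability for Delta E = 1
    at the initial temperature.
    """
    assign = [rng.randrange(k) for _ in range(n)]
    temperature = -1.0 / math.log(p0)        # exp(-1 / T0) == p0
    stages = 0
    for _ in range(50):                       # at most 50 temperature stages
        stages += 1
        accepted = 0
        for _ in range(20 * n):               # at most 20n repositionings
            i = rng.randrange(n)
            r = (assign[i] + 1 + rng.randrange(k - 1)) % k   # a different position
            d = delta_e(assign, i, r)
            if d <= 0 or rng.random() < math.exp(-d / temperature):
                assign[i] = r
                accepted += 1
                if accepted > 5 * n:          # enough successful changes: cool now
                    break
        if accepted == 0:                     # full stage without accepted move
            break
        temperature *= 0.9                    # decrease by 10 per cent
    return assign, stages

def chain_delta(assign, i, r):
    """Toy objective: number of neighbouring labels (i, i+1) sharing a position."""
    def conflicts(position):
        return sum(1 for j in (i - 1, i + 1)
                   if 0 <= j < len(assign) and assign[j] == position)
    return conflicts(r) - conflicts(assign[i])

final_assign, used_stages = anneal(20, 4, chain_delta, 0.5, random.Random(0))
```

For the label placement problem, `delta_e` would evaluate the change in the number of conflicting labels caused by moving one label.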

Appendix A

Gauss-Seidel Version of the Operator

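In most of our numerical experiments we have used a variant $F'[\xi]$ of operator $F[\xi]$ of (1.20), which updates the components of a point $p$ sequentially. For this reason $F'[\xi]$ is called the Gauss-Seidel version of $F[\xi]$ and is defined as follows:

Definition A.1 (Operator $F'[\xi]$)
Let $\xi = (\xi_{ir})$ be a fitness function on $\Delta^0$. We define for each $j \in N$ a mapping $F_j[\xi] : \Delta^0 \to \Delta^0$ by

$$
F_j[\xi](p)_{ir} := \begin{cases} \dfrac{p_{ir}\,\xi_{ir}(p)}{\sum_{s=1}^{k} p_{is}\,\xi_{is}(p)} & \text{if } i = j, \\[2ex] p_{ir} & \text{otherwise,} \end{cases} \qquad \forall i \in N,\ r \in K.
$$

Then the operator $F'[\xi] : \Delta^0 \to \Delta^0$ is defined by

$$
F'[\xi] := F_n[\xi] \circ \cdots \circ F_1[\xi]. \qquad \text{(A.1)}
$$

We see that between $p$ and $F'[\xi](p)$, $n - 1$ so-called intermediate points are visited, which we denote as follows: For all $j \in N$ and any fixed $p \in \Delta^0$ we define

$$
p^{(j)} := F_j[\xi](p^{(j-1)}), \quad \text{with } p^{(0)} := p,
$$

and we get therefore $p^{(n)} = F'[\xi](p)$.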
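The difference between the sequential and the simultaneous update can be made explicit in a few lines of Python. The sketch below is an illustration only: the list-of-rows layout, the function names and in particular the fitness function `example_fitness` are hypothetical and chosen merely to be positive on the feasible region.

```python
def gauss_seidel_step(p, fitness):
    """One application of the Gauss-Seidel version: rows are updated one after
    another, and the fitness is re-evaluated at each intermediate point."""
    p = [row[:] for row in p]
    for j in range(len(p)):
        xi = fitness(p)
        denom = sum(p_js * xi_js for p_js, xi_js in zip(p[j], xi[j]))
        p[j] = [p_jr * xi_jr / denom for p_jr, xi_jr in zip(p[j], xi[j])]
    return p

def simultaneous_step(p, fitness):
    """One application of the original operator: all rows use the fitness at p."""
    xi = fitness(p)
    result = []
    for row, xi_row in zip(p, xi):
        denom = sum(a * b for a, b in zip(row, xi_row))
        result.append([a * b / denom for a, b in zip(row, xi_row)])
    return result

def example_fitness(p):
    """Hypothetical fitness for two features: xi_ir = 1 + p_(other feature),r."""
    return [[1.0 + p[1 - i][r] for r in range(len(p[i]))] for i in range(2)]

start = [[0.5, 0.5], [0.25, 0.75]]
gs = gauss_seidel_step(start, example_fitness)
jac = simultaneous_step(start, example_fitness)
```

In this example both variants agree on the first row, but the second row differs, because the Gauss-Seidel version already sees the updated first row when it evaluates the fitness.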

Regarding the fixed points of the operators $F$ and $F'$, the following property holds:

Proposition A.2
Let $\xi$ be a fitness function on $\Delta^0$. Then the operators $F := F[\xi]$ and $F' := F'[\xi]$ have the same fixed points.

Proof
'$\Rightarrow$': Let $F(p) = p$. Since $F_1(p)$ changes the components $p_{1\cdot}$ in the same way as $F$ and leaves all other components unchanged, we have $p^{(1)}_{1\cdot} = F(p)_{1\cdot} = p_{1\cdot}$ and therefore $p^{(1)} = p$. Continuing analogously for all remaining components $i \in N$ we get

$$
p = p^{(1)} = \ldots = p^{(n)} = F'(p). \qquad \text{(A.2)}
$$

'$\Leftarrow$': From $F'(p) = p$ it follows by the same argument as above that (A.2) holds. However, since all these intermediate points are equal, it makes no difference whether we change the components one after the other or all at once, and therefore $F(p) = p$.

The next theorem gives, similarly to Theorem 2.26 for $F[\xi]$, a sufficient condition under which $F'[\xi]$ is a strict growth transformation:

Theorem A.3
Let $z : \mathbb{R}^{nk} \to \mathbb{R}$ be a polynomial such that $\Upsilon = (\Upsilon_{ir})$ with $\Upsilon_{ir}(p) = \frac{\partial z}{\partial p_{ir}}(p)$ is a fitness function on $\Delta^0$. Moreover, let $F' := F'[\xi]$ where $\xi := G(\Upsilon)$ as in (2.35). Then $F'$ is a strict growth transformation for $z$.

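Proof
Let $i \in N$ be fixed and $p := p^{(i-1)}$. Subsequently we show that $z(F_i(p)) \ge z(p)$. From (2.8) it follows that

$$
z(F_i(p)) \ge z(p) \;\Leftarrow\; \sum_{r=1}^{k} \frac{p_{ir}\,\xi_{ir}}{\sum_{s=1}^{k} p_{is}\,\xi_{is}}\,\Upsilon_{ir} \ge \sum_{r=1}^{k} p_{ir}\,\Upsilon_{ir}. \qquad \text{(A.3)}
$$

Multiplication with the positive denominator yields

$$
\sum_{r=1}^{k} p_{ir}\,\Upsilon_{ir}\,\xi_{ir} \;\ge\; \Big(\sum_{r=1}^{k} p_{ir}\,\Upsilon_{ir}\Big)\Big(\sum_{s=1}^{k} p_{is}\,\xi_{is}\Big) = \sum_{r=1}^{k} p_{ir}^{2}\,\Upsilon_{ir}\,\xi_{ir} + \sum_{r=1}^{k} \sum_{\substack{s=1 \\ s \neq r}}^{k} p_{ir}\,p_{is}\,\Upsilon_{ir}\,\xi_{is}
$$

or equivalently, using $1 - p_{ir} = \sum_{s \neq r} p_{is}$,

$$
\sum_{r=1}^{k} p_{ir}\,\Upsilon_{ir}\,\xi_{ir}\,(1 - p_{ir}) - \sum_{r=1}^{k} \sum_{\substack{s=1 \\ s \neq r}}^{k} p_{ir}\,p_{is}\,\Upsilon_{ir}\,\xi_{is} \ge 0 \;\Leftrightarrow
$$

$$
\sum_{r=1}^{k} \sum_{\substack{s=1 \\ s \neq r}}^{k} p_{ir}\,p_{is}\,\Upsilon_{ir}\,(\xi_{ir} - \xi_{is}) \ge 0 \;\Leftrightarrow \qquad \text{(A.4)}
$$

$$
\sum_{r=1}^{k} \sum_{s=r+1}^{k} p_{ir}\,p_{is}\,(\Upsilon_{ir} - \Upsilon_{is})(\xi_{ir} - \xi_{is}) \ge 0. \qquad \text{(A.5)}
$$

Since $G_i$ is strictly increasing it follows from (2.36) that (A.5) is always true and therefore $F'$ is a growth transformation.

Moreover, we see that equality in (A.4) holds if and only if for all $r, s \in K$, $r \neq s$ with $p_{ir}, p_{is} \neq 0$: $\xi_{ir} = \xi_{is}$. Since this corresponds exactly to the fixed point characterization of (2.63) it follows that $F'[\xi]$ is a strict growth transformation.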

Note that the assumptions on $G$ in this theorem are much weaker than those used in Theorem 2.26 for operator $F$. Hence, it follows that if $\xi$ satisfies the assumptions of Theorem 2.26 then not only $F[\xi]$, but also $F'[\xi]$ is a growth transformation and therefore several results of Section 2.3 hold as well for $F'$.

Moreover, if $z$ is an A-polynomial, then it is linear along the connecting line of two intermediate points $p^{(i-1)}$ and $p^{(i)}$. Hence, Theorem A.3 implies that there exists a path from $p$ to $F'(p)$ along which $z$ increases. This observation generalizes the result of Theorem 2.24.

Appendix B

The Concept of KKT Points

For nonlinear programming problems, the KKT conditions are first order necessary

conditions for local optima (see e.g. [BSS93]). Since we have used these conditions

in Section 2.2.4 for the R-SAP, we describe here briefly how they are derived from

the general setting.

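Let $I$ and $J$ denote two index sets, let $X \subseteq \mathbb{R}^n$ be a nonempty, open set and let continuously differentiable functions $f$, $g_i$ for $i \in I$ and $h_j$ for $j \in J$ from $\mathbb{R}^n$ to $\mathbb{R}$ be given. The problem is to find solutions of the following problem

$$
\begin{array}{lll}
\max & f(x) & \\
\text{s.t.} & g_i(x) \le 0 & \forall i \in I \\
& h_j(x) = 0 & \forall j \in J \\
& x \in X. &
\end{array} \qquad \text{(B.1)}
$$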

Theorem B.1 (Karush-Kuhn-Tucker Necessary Condition)
Let $x^*$ be a local solution of (B.1) and $L := \{i \in I \mid g_i(x^*) = 0\}$. Suppose that $\nabla g_i(x^*)$ for all $i \in L$ and $\nabla h_j(x^*)$ for all $j \in J$ are linearly independent. Then there exist unique Lagrangian multipliers $u_i$ for $i \in L$ and $v_j$ for $j \in J$, such that $x^*$ (which is called a KKT point) satisfies the following conditions:

$$
\begin{array}{ll}
-\nabla f(x^*) + \sum_{i \in I} u_i \nabla g_i(x^*) + \sum_{j \in J} v_j \nabla h_j(x^*) = 0 & \\
u_i\, g_i(x^*) = 0 & \forall i \in I \\
u_i \ge 0 & \forall i \in I.
\end{array} \qquad \text{(B.2)}
$$

In case of the R-SAP the region of feasibility is given by $\Delta$ and therefore we set $g_{ir} := -p_{ir}$ for all $(i,r) \in N \times K =: I$ and $h_j := \sum_{r=1}^{k} p_{jr} - 1$ for all $j \in N =: J$. We see now that $\nabla g_{ir}(p)$ and $\nabla h_j(p)$ are linearly independent for all points $p \in \Delta$ and therefore (B.2) are necessary conditions for a local maximum. By setting $f := z(p)$, we get for all $i \in N$, $r \in K$ the following KKT conditions for a local maximum $p^*$ of the R-SAP:

$$
\begin{array}{l}
-\Upsilon_{ir}(p^*) - u_{ir} + v_i = 0 \\
-u_{ir}\, p^*_{ir} = 0 \\
u_{ir} \ge 0,
\end{array}
$$

which are further discussed in (2.22)-(2.24).
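These conditions can be checked numerically for a given point. From the first condition we get $u_{ir} = v_i - \Upsilon_{ir}(p^*) \ge 0$, and from the second one $\Upsilon_{ir}(p^*) = v_i$ whenever $p^*_{ir} > 0$; since every row of $p^*$ has a positive entry, $v_i$ must equal the maximal partial derivative in row $i$. The following Python sketch uses this reformulation; the function name, the tolerance and the list-of-rows layout are our own assumptions.

```python
def is_kkt_point(p, grad, tol=1e-9):
    """Check the KKT conditions of the R-SAP at p.

    grad[i][r] is the partial derivative of z with respect to p_ir at p.
    With v_i := max_r grad[i][r] and u_ir := v_i - grad[i][r] >= 0 the
    conditions reduce to u_ir * p_ir = 0 for all i, r.
    """
    for p_row, g_row in zip(p, grad):
        v_i = max(g_row)
        for p_ir, g_ir in zip(p_row, g_row):
            u_ir = v_i - g_ir
            if p_ir > tol and u_ir > tol:
                return False
    return True
```

For example, a vertex that selects a position with maximal partial derivative in every row passes the check, whereas a vertex selecting a position with a smaller partial derivative does not.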

Appendix C

A Gradient Approach

In Section 2.3 we have studied properties of the discrete time dynamical system defined by iteration of the mapping $F[\Upsilon]$. We show now that this system can also be interpreted as a discretized version of a continuous gradient dynamical system in $\Delta^0$, that is, a system of the form $\dot{p} = v$, where $v$ is the gradient vector field of $z$ with respect to a well chosen inner product.

We recall now the necessary definitions and notations as needed in our context.

Let $M$ be a relatively open subset in an affine subspace of $\mathbb{R}^n$. The tangent space of $M$ at a point $p$ (denoted by $T_pM$) can be viewed as the set of tangent vectors at $p$ to the curves in $M$ passing through $p$. Moreover, by an inner product $\langle\,\cdot\,,\cdot\,\rangle$ on $M$ we mean a family $\{\langle\,\cdot\,,\cdot\,\rangle_p \mid p \in M\}$ of inner products with $\langle\,\cdot\,,\cdot\,\rangle_p$ defined on $T_pM$ such that $\langle\,\cdot\,,\cdot\,\rangle_p$ depends smoothly on $p$.

Let $z : M \to \mathbb{R}$ be a smooth function defined on $M$. The differential $Dz$ associates to every point $p \in M$ the linear map

$$
Dz(p) : T_pM \to \mathbb{R}, \quad \xi \mapsto Dz(p) \cdot \xi,
$$

where $Dz(p)$ is the best linear approximation to $z - z(p)$ in the neighborhood of $p$. A vector field $v$ on $M$ is defined by assigning a vector $v(p) \in \mathbb{R}^n$ to every point $p \in M$.

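Definition C.1 (Gradient Vector Field)
Given a smooth function $z : M \to \mathbb{R}$. The gradient vector field $\operatorname{grad} z$ of $z$ with respect to the inner product $\langle\,\cdot\,,\cdot\,\rangle$ is the unique vector field satisfying the following two properties:

(i) $\operatorname{grad} z(p) \in T_pM$, $\forall p \in M$

(ii) $Dz(p) \cdot \xi = \langle \operatorname{grad} z(p), \xi \rangle_p$, $\forall \xi \in T_pM$.

Note that if $M = \mathbb{R}^n$ (which implies that $T_pM = \mathbb{R}^n$) is endowed with the Euclidean inner product, then the associated gradient field is the column vector

$$
\operatorname{grad} z(p) = \nabla z(p) = \Big(\frac{\partial z}{\partial p_1}(p), \ldots, \frac{\partial z}{\partial p_n}(p)\Big)^{T}.
$$

In the sequel we discuss the situation for SAPs and therefore consider $M = \Delta^0$. The tangent space of $\Delta^0$ in a point $p$, denoted by $T_p\Delta^0$, is given by

$$
T_p\Delta^0 := \Big\{ \xi \in \mathbb{R}^{n \times k} \;\Big|\; \sum_{r=1}^{k} \xi_{ir} = 0,\ \forall i \in N \Big\}. \qquad \text{(C.1)}
$$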

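Let $z : \Delta^0 \to \mathbb{R}$ be the objective function of a SAP $(n, k, \mathcal{T}, w)$, i.e.

$$
z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big)
$$

and let $F := F[\Upsilon]$ be the operator defined by (1.20) and (1.21). Consequently we have for $p^{t+1} := F(p^t)$

$$
p^{t+1}_{ir} = F(p^t)_{ir} = \frac{p^t_{ir}\,\Upsilon_{ir}(p^t)}{\sum_{s=1}^{k} p^t_{is}\,\Upsilon_{is}(p^t)} \qquad \forall i \in N,\ r \in K \qquad \text{(C.2)}
$$

and we get for the difference of two successive points

$$
p^{t+1}_{ir} - p^t_{ir} = \frac{p^t_{ir}}{\sum_{s=1}^{k} p^t_{is}\,\Upsilon_{is}(p^t)} \Big( \Upsilon_{ir}(p^t) - \sum_{s=1}^{k} p^t_{is}\,\Upsilon_{is}(p^t) \Big) \qquad \forall i \in N,\ r \in K. \qquad \text{(C.3)}
$$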

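Since $\Upsilon$ is a fitness function, $\sum_{s=1}^{k} p_{is}\,\Upsilon_{is} > 0$ for all $i \in N$, $p \in \Delta^0$. Hence we can define the following inner product $\langle\,\cdot\,,\cdot\,\rangle$ on $\Delta^0$ by

$$
\langle \xi, \eta \rangle_p := \xi^{T} Q(p)\, \eta \qquad \text{(C.4)}
$$

where $Q(p) = \operatorname{diag}(q)$ is the $nk \times nk$ diagonal matrix defined on $\Delta^0$ with $q \in (\mathbb{R}^{nk})_+$ given by

$$
q_{ir} := \frac{\sum_{s=1}^{k} p_{is}\,\Upsilon_{is}}{p_{ir}} \qquad \forall i \in N,\ r \in K.
$$

Since the numerator is positive and $p \in \Delta^0$, it follows immediately that $Q(p)$ is positive definite. (C.4) can be written as

$$
\langle \xi, \eta \rangle_p = \sum_{i=1}^{n} \sum_{r=1}^{k} \frac{\sum_{s=1}^{k} p_{is}\,\Upsilon_{is}}{p_{ir}}\, \xi_{ir}\, \eta_{ir}. \qquad \text{(C.5)}
$$

Note that the inner product defined above does not only depend on the point $p$, but also on the (form of the) objective function $z$ defining $\Upsilon_{ir}$.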

Theorem C.2
The vector field $v(p) = F(p) - p$ given by

$$
v(p)_{ir} = F(p)_{ir} - p_{ir} = \frac{p_{ir}}{\sum_{s=1}^{k} p_{is}\,\Upsilon_{is}} \Big( \Upsilon_{ir} - \sum_{s=1}^{k} p_{is}\,\Upsilon_{is} \Big) \qquad i \in N,\ r \in K \qquad \text{(C.6)}
$$

is the gradient vector field of $z$ with respect to the inner product (C.5) on $\Delta^0$.

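Proof

We verify directly that $v(p)$ satisfies Definition C.1 (i) and (ii).

(i) $v(p) \in T_p\Delta^0$ for all $p \in \Delta^0$ follows from

$$\sum_{r=1}^{k} v(p)_{ir} = \sum_{r=1}^{k} \frac{p_{ir}\Upsilon_{ir}}{\sum_{s=1}^{k} p_{is}\Upsilon_{is}} - \sum_{r=1}^{k} p_{ir} = 1 - 1 = 0 \quad \forall i \in N$$

and the fact that $T_p\Delta^0 = \{ \xi \mid \sum_{r=1}^{k} \xi_{ir} = 0,\ \forall i \in N \}$ as derived in (C.1).

(ii) Finally we show that $Dz(p) \cdot \xi = \langle v(p), \xi \rangle_p$:

$$
\begin{aligned}
\langle v(p), \xi \rangle_p
&= \sum_{i=1}^{n} \sum_{r=1}^{k} \frac{\sum_{s=1}^{k} p_{is}\Upsilon_{is}}{p_{ir}} \cdot \frac{p_{ir}}{\sum_{s=1}^{k} p_{is}\Upsilon_{is}} \Big( \Upsilon_{ir} - \sum_{s=1}^{k} p_{is}\Upsilon_{is} \Big) \xi_{ir} \\
&= \sum_{i=1}^{n} \sum_{r=1}^{k} \Upsilon_{ir}\, \xi_{ir} - \sum_{i=1}^{n} \sum_{r=1}^{k} \sum_{s=1}^{k} p_{is}\Upsilon_{is}\, \xi_{ir} \\
&= \sum_{i=1}^{n} \sum_{r=1}^{k} \Upsilon_{ir}\, \xi_{ir} = \sum_{i=1}^{n} \sum_{r=1}^{k} \frac{\partial z}{\partial p_{ir}}(p)\, \xi_{ir} = Dz(p) \cdot \xi.
\end{aligned}
$$

The first equality of the last row holds because $\xi \in T_p\Delta^0$ and therefore, by (C.1), the second term is zero.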
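The identity in Theorem C.2 can also be checked numerically. The following Python sketch is an illustration only: the toy instance `TERMS`, the sizes `N` and `K`, the point `p`, the tangent vector `xi` and all function names are assumptions made here and do not come from the thesis, and indices are 0-based. It evaluates the partial derivatives of z, the field (C.6) and the inner product (C.5), and computes both sides of property (ii).

```python
from math import prod

# Hypothetical toy SAP instance (n = 3 variables, k = 2 values), 0-based indices.
# Each entry is (weight w_T, partial assignment T as a set of (i, r) pairs).
TERMS = [
    (2.0, {(0, 0), (1, 0)}),
    (3.0, {(1, 1), (2, 1)}),
    (1.0, {(0, 1), (2, 0)}),
    (1.0, {(0, 0), (2, 1)}),
    (0.5, {(1, 0), (2, 0)}),
]
N, K = 3, 2


def upsilon(p):
    """Partial derivatives dz/dp_ir of the objective z at the point p."""
    y = [[0.0] * K for _ in range(N)]
    for w, T in TERMS:
        for (i, r) in T:
            y[i][r] += w * prod(p[j][s] for (j, s) in T if (j, s) != (i, r))
    return y


def row_norms(p, y):
    """Normalisers sum_s p_is * y_is, one per row i (positive in the interior)."""
    return [sum(p[i][s] * y[i][s] for s in range(K)) for i in range(N)]


def gradient_field(p):
    """v(p) = F(p) - p, written as in (C.6)."""
    y = upsilon(p)
    norms = row_norms(p, y)
    return [[p[i][r] * (y[i][r] - norms[i]) / norms[i] for r in range(K)]
            for i in range(N)]


def inner(a, b, p):
    """Inner product (C.5) of a and b at the point p."""
    y = upsilon(p)
    norms = row_norms(p, y)
    return sum(norms[i] / p[i][r] * a[i][r] * b[i][r]
               for i in range(N) for r in range(K))


p = [[0.3, 0.7], [0.6, 0.4], [0.5, 0.5]]        # interior point, rows sum to 1
xi = [[0.2, -0.2], [-0.1, 0.1], [0.05, -0.05]]  # tangent vector, rows sum to 0
v = gradient_field(p)
y = upsilon(p)

lhs = inner(v, xi, p)                                              # <v(p), xi>_p
rhs = sum(y[i][r] * xi[i][r] for i in range(N) for r in range(K))  # Dz(p) . xi
print(lhs, rhs)
```

For a vector `xi` whose rows sum to zero, `lhs` and `rhs` agree up to rounding. For a vector outside the tangent space they generally differ, which is why (C.1) is needed in the last step of the proof.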

Appendix D

Definitions

D.1 General Definitions

Definition D.1 (Homogeneous Function (Def. 2.25))
A function $z(p)$ is called homogeneous of degree $d$, if $z(\alpha p) = \alpha^d z(p)$ for all $\alpha$.

Definition D.2 (Discrete Neighborhood (Def. 2.13))
Let an integer vertex $p \in \Delta^I$ be given. Then the discrete neighborhood of $p$ is defined by

$$M(p) := \{ q \in \Delta^I \mid \exists\, i \in N : q_{i\cdot} \neq p_{i\cdot},\ q_{j\cdot} = p_{j\cdot},\ \forall j \in N \setminus \{i\} \}.$$

Definition D.3 (Discrete Local Maximum (Def. 2.14))
A point $p^* \in \Delta^I$ is a discrete (strict) local maximum of the SAP, if for all $q \in M(p^*)$: $z(q) \le z(p^*)$ ($z(q) < z(p^*)$).
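Definitions D.2 and D.3 translate directly into a small enumeration. The sketch below is an illustration under assumptions made here: the two-variable instance `TERMS` is hypothetical, an integer vertex is encoded as a tuple `a` in which `a[i]` is the value assigned to variable `i` (0-based), and the function names are not taken from the thesis.

```python
# Hypothetical toy SAP instance: (weight w_T, partial assignment T), 0-based indices.
TERMS = [
    (2.0, {(0, 0), (1, 0)}),
    (3.0, {(0, 1), (1, 1)}),
]
N, K = 2, 2


def z(a):
    """Objective value of the integer vertex a (a[i] is the value of variable i)."""
    return sum(w for w, T in TERMS if all(a[i] == r for (i, r) in T))


def neighborhood(a):
    """All integer vertices that differ from a in exactly one row (Definition D.2)."""
    return [a[:i] + (r,) + a[i + 1:]
            for i in range(N) for r in range(K) if r != a[i]]


def is_local_max(a, strict=False):
    """Discrete (strict) local maximum in the sense of Definition D.3."""
    if strict:
        return all(z(q) < z(a) for q in neighborhood(a))
    return all(z(q) <= z(a) for q in neighborhood(a))


for a in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, z(a), is_local_max(a, strict=True))
```

In this instance both `(0, 0)` and `(1, 1)` are strict discrete local maxima, although only `(1, 1)` is the global maximum. Each vertex has `N * (K - 1)` neighbors.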

Definition D.4 (Continuous Local Maximum (Def. 2.12))
A point $p^* \in \Delta$ is a (strict) local maximum of the R-SAP, if there exists a neighborhood $U(p^*)$ : $\forall q \in U(p^*) \cap \Delta$ : $z(q) \le z(p^*)$ ($z(q) < z(p^*)$).

Definition D.5 (Stationary Point, Saddle Point (Def. 2.19))
A point $p^* \in \Delta$ is called a stationary point of $z$, if the gradient of $z$ projected on the affine subspace $\operatorname{aff}(\Delta)$ is zero in $p^*$, i.e. $\Upsilon_{ir}(p^*) = \nu_i$ for all $i \in N$, $r \in K$ and some constants $\nu_i \in \mathbb{R}$.

Moreover, we call $p^* \in \Delta$ a saddle point of $z$ on $\operatorname{aff}(\Delta)$, if $p^*$ is a stationary point and in any neighborhood $U_\varepsilon(p^*) \cap \operatorname{aff}(\Delta)$ there exist points $p, q$ with $z(p) < z(p^*) < z(q)$.

Definition D.6 (Nash Equilibrium (Def. 2.20))
A point $p^* \in \Delta$ is a Nash equilibrium of the R-SAP, if

$$\forall i \in N,\ \forall q \in \{ p \in \Delta \mid p_{j\cdot} = p^*_{j\cdot},\ \forall j \in N \setminus \{i\} \} : \quad z(q) \le z(p^*).$$

Definition D.7 (Growth Transformation (Def. 2.1))
Let $z$ be a continuous function on $\Delta^0 \subseteq \mathbb{R}^{nk}$. We say that a continuous mapping $F : \Delta^0 \to \Delta^0$ is a growth transformation for $z$ iff

$$z(p) \le z(F(p)) \quad \forall p \in \Delta^0.$$

If additionally

$$z(p) = z(F(p)) \;\Rightarrow\; p = F(p),$$

$F$ is called a strict growth transformation.
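The growth property can be observed numerically for the operator (C.2) on a small example. The sketch below is only an experiment, not a proof: the instance `TERMS`, the starting point and the function names are assumptions introduced here, indices are 0-based, and all weights are positive so that the normalisers stay positive along the iteration.

```python
from math import prod

# Hypothetical toy SAP instance (n = 3, k = 2): (weight w_T, partial assignment T).
TERMS = [
    (2.0, {(0, 0), (1, 0)}),
    (3.0, {(1, 1), (2, 1)}),
    (1.0, {(0, 1), (2, 0)}),
    (1.0, {(0, 0), (2, 1)}),
    (0.5, {(1, 0), (2, 0)}),
]
N, K = 3, 2


def z(p):
    """Objective: sum over T of w_T times the product of p_ir for (i, r) in T."""
    return sum(w * prod(p[i][r] for (i, r) in T) for w, T in TERMS)


def upsilon(p):
    """Partial derivatives dz/dp_ir, used as fitness values."""
    y = [[0.0] * K for _ in range(N)]
    for w, T in TERMS:
        for (i, r) in T:
            y[i][r] += w * prod(p[j][s] for (j, s) in T if (j, s) != (i, r))
    return y


def step(p):
    """One application of the operator (C.2): rescale each row by its fitness."""
    y = upsilon(p)
    q = []
    for i in range(N):
        norm = sum(p[i][s] * y[i][s] for s in range(K))
        q.append([p[i][r] * y[i][r] / norm for r in range(K)])
    return q


p = [[0.3, 0.7], [0.6, 0.4], [0.5, 0.5]]  # interior starting point
values = [z(p)]
for _ in range(20):
    p = step(p)
    values.append(z(p))                    # objective value after each step

print(values[0], values[-1])
```

The list `values` records z along the trajectory. Under the definition above, a growth transformation must never produce a decrease between consecutive entries.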

Definition D.8 ((Unstable) Fixed Point (Def. 2.27))
Let $F : \Delta \to \Delta$ be given, then $p^* \in \Delta$ is called a fixed point, if $F(p^*) = p^*$.

Moreover, a fixed point $p^* \in \Delta$ is called unstable, if there exists a neighborhood $U(p^*)$ such that for every neighborhood $U_1(p^*) \subseteq U(p^*)$ there exists at least one starting point $p \in U_1(p^*) \cap \Delta$ such that the sequence $\{F^t(p)\}$, $t \ge 0$ does not lie entirely in $U(p^*)$.

Definition D.9 (Attractor (Def. 2.43))
Let $F : \Delta \to \Delta$ be an operator. A point $p^* \in \Delta$ is called an attractor of $F$ if there exists a neighborhood $U(p^*)$ such that for any starting point $p^0 \in U(p^*) \cap \Delta$ the sequence $\{F^t(p^0)\}$, $t \ge 0$ converges to $p^*$. In this case

$$RA(p^*) := \{ p \in \Delta \mid \lim_{t \to \infty} F^t(p) = p^* \}$$

is called the region of attraction of $p^*$ with respect to $F$.

Definition D.10 (Geometric Convergence (Def. 2.36))
Let a sequence $\{p^t\}$, $t \ge 0$ with $\lim_{t \to \infty} p^t = p^*$ be given. Then the sequence $\{p^t\}$ is geometrically convergent, if there exist $c > 0$, $0 < \rho < 1$ and an index $t_0$ such that $\| p^t - p^* \| \le c\,\rho^t$ for all $t \ge t_0$.

D.2 Specific Notations

Definition D.11 (Assignments (Def. 1.1))
Let $N$ be the index set of the decision variables and let $K$ be the set of possible values for each variable.

1. An assignment $A$ is a set of the form $A := \{ (i, r(i)) \mid i = 1, \ldots, n,\ r(i) \in K \} \subseteq N \times K$. Moreover, we denote by $\mathcal{A}$ the set of all possible assignments.

2. A partial assignment $T$ is a set of the form $T := \{ (i, r(i)) \mid i \in M,\ r(i) \in K \}$ for some $M \subseteq N$. If moreover a positive weight $w_T \in \mathbb{R}_+$ is given, we call $(T, w_T)$ a weighted partial assignment.

3. A partial assignment $T$ is satisfied by a given assignment $A \in \mathcal{A}$, if $T \subseteq A$.

Definition D.12 (C-SAP, SAP (Def. 1.2))
The constrained semi-assignment problem C-SAP for the decision variables $x_i$, $i \in N$, with possible values in $K$ is defined by a 5-tuple $(n, k, \mathcal{R}, \mathcal{T}, w)$ where

1. $\mathcal{R}$ is a set of (forbidden) partial assignments with $|R| \ge 2$ for all $R \in \mathcal{R}$ (defining the constraints)

2. $(\mathcal{T}, w)$ is a set of weighted partial assignments (defining the objective function)

and consists in

$$
\begin{aligned}
\max\ & z(A) = \sum_{T \in \mathcal{T}} \{ w_T \mid T \subseteq A \} \\
\text{s.t.}\ & A \in \mathcal{A} \\
& R \not\subseteq A \quad \forall R \in \mathcal{R}.
\end{aligned}
\tag{D.1}
$$

An assignment $A \in \mathcal{A}$ which satisfies (D.1) is called a feasible assignment.
The semi-assignment problem SAP is the special case of the C-SAP in which $\mathcal{R} = \emptyset$; it is given by the 4-tuple $(n, k, \mathcal{T}, w)$.
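For very small instances, Definition D.12 can be made concrete by enumerating all k^n assignments. The following sketch is a brute-force illustration, not the heuristic studied in the thesis. The instance data `WEIGHTED` and `FORBIDDEN` and every function name are hypothetical choices made here, with 0-based indices.

```python
from itertools import product

# Hypothetical toy C-SAP instance (n = 3 variables, k = 2 values), 0-based indices.
N, K = 3, 2
WEIGHTED = [                       # weighted partial assignments (w_T, T)
    (2.0, {(0, 0), (1, 0)}),
    (3.0, {(1, 1), (2, 1)}),
    (1.0, {(0, 1), (2, 0)}),
    (1.0, {(0, 0), (2, 1)}),
    (0.5, {(1, 0), (2, 0)}),
]
FORBIDDEN = [{(1, 1), (2, 1)}]     # forbidden partial assignments R, each with |R| >= 2


def as_set(a):
    """Turn the tuple a (a[i] = value of variable i) into the set {(i, a[i])}."""
    return {(i, r) for i, r in enumerate(a)}


def z(A):
    """Sum of the weights of all partial assignments T satisfied by A (T subset of A)."""
    return sum(w for w, T in WEIGHTED if T <= A)


def feasible(A):
    """A is feasible if it contains no forbidden partial assignment."""
    return all(not R <= A for R in FORBIDDEN)


def solve():
    """Exhaustive search over all K**N assignments; only viable for tiny instances."""
    best = None
    for a in product(range(K), repeat=N):
        A = as_set(a)
        if feasible(A) and (best is None or z(A) > z(as_set(best))):
            best = a
    return best, z(as_set(best))


print(solve())
```

Without the constraint the best assignment of this instance is `(0, 1, 1)` with value 4.0. The forbidden pair excludes it, and the search returns the best remaining feasible assignment.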

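Definition D.13 (Feasible Points: $\Delta^I$, $\Delta$, $\Delta^0$, $\Delta_k$ (Def. 1.5))
The set of all assignments is defined by

$$\Delta^I := \left\{ p \in \{0,1\}^{n \times k} \;\middle|\; \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \right\}.$$

Moreover, we denote its convex hull by

$$\Delta := \operatorname{Conv}(\Delta^I) = \left\{ p \in [0,1]^{n \times k} \;\middle|\; \sum_{r=1}^{k} p_{ir} = 1,\ \forall i \in N \right\}$$

and its relative interior by $\Delta^0$. One single simplex will be denoted by

$$\Delta_k := \left\{ u \in [0,1]^{k} \;\middle|\; \sum_{r=1}^{k} u_r = 1 \right\}.$$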

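Definition D.14 (Fitness Function (Def. 1.6))
A vector $\xi = (\xi_{ir})$ ($i \in N$, $r \in K$), where $\xi_{ir} : \Delta^0 \to \mathbb{R}_+$ are continuous functions, is called fitness function on $\Delta^0$.

Definition D.15 (Operator $F[\xi](p)$ (Def. 1.7))
Let $\xi$ be a fitness function on $\Delta^0$. Then the operator $F[\xi] : \Delta^0 \to \Delta^0$ is for all $i \in N$, $r \in K$ defined by

$$F[\xi](p)_{ir} = p_{ir}\, \frac{\xi_{ir}(p)}{\Xi_i(p)} \quad \text{with} \quad \Xi_i(p) := \sum_{s=1}^{k} p_{is}\, \xi_{is}(p).$$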

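Definition D.16 (Gradient/Repellor Dynamics (Def. 1.8))
Let $z$ be the objective function of a SAP instance $(n, k, \mathcal{T}, w)$ such that $\Upsilon = (\Upsilon_{ir})$ defined by

$$\Upsilon_{ir}(z, p) := \frac{\partial z}{\partial p_{ir}}(p) = \sum_{T \in \mathcal{T} :\, (i,r) \in T} w_T \prod_{(j,s) \in T \setminus (i,r)} p_{js} \quad \forall i \in N,\ r \in K$$

is a fitness function. Then we call the dynamic defined by $F[\Upsilon^\alpha]$, $\alpha > 0$ the gradient-type dynamic.

Moreover, if a set of forbidden partial assignments $\mathcal{R}$ is given, we define the fitness function $\Theta = (\Theta_{ir})$ by

$$\Theta_{ir}(p) := \prod_{R \in \mathcal{R} :\, (i,r) \in R} \Big( 1 - \prod_{(j,s) \in R \setminus (i,r)} p_{js} \Big) \quad \forall i \in N,\ r \in K$$

and call the dynamic corresponding to $F[\Theta^\beta]$, $\beta > 0$, the repellor dynamic.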
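The effect of the repellor dynamic can be illustrated on a single forbidden pair. The sketch below assumes that the repellor fitness of a pair (i, r) is the product, over all forbidden partial assignments R containing (i, r), of one minus the product of the remaining variables of R. This is a reading of the definition above, and the instance `FORBIDDEN`, the point `p` and the function names are hypothetical, with 0-based indices.

```python
from math import prod

# Hypothetical data: n = 3 variables, k = 2 values, one forbidden partial assignment.
N, K = 3, 2
FORBIDDEN = [{(1, 1), (2, 1)}]


def theta(p):
    """Repellor fitness: product over R containing (i, r) of (1 - prod of the others)."""
    th = [[1.0] * K for _ in range(N)]       # empty product for pairs in no R
    for R in FORBIDDEN:
        for (i, r) in R:
            th[i][r] *= 1.0 - prod(p[j][s] for (j, s) in R if (j, s) != (i, r))
    return th


def repellor_step(p, beta=1.0):
    """One step of the operator built from the fitness theta raised to the power beta."""
    th = theta(p)
    q = []
    for i in range(N):
        norm = sum(p[i][s] * th[i][s] ** beta for s in range(K))
        q.append([p[i][r] * th[i][r] ** beta / norm for r in range(K)])
    return q


p = [[0.3, 0.7], [0.6, 0.4], [0.5, 0.5]]     # interior starting point
q = repellor_step(p)
print(q)
```

Rows that take part in the forbidden pair move weight away from it, so the product `q[1][1] * q[2][1]` is smaller than `p[1][1] * p[2][1]`. The row of variable 0, which occurs in no forbidden partial assignment, is left unchanged up to rounding.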

Definition D.17 (A-Polynomial, $P_N$, $P_N^F$ (Def. 2.4))
• Let a 4-tuple $(n, k, \mathcal{T}, w)$ as for a SAP instance, together with a constant $c \in \mathbb{R}$ be given. Then we call $z : \mathbb{R}^{nk} \to \mathbb{R}$

$$z(p) = \sum_{T \in \mathcal{T}} \Big( w_T \prod_{(i,r) \in T} p_{ir} \Big) + c \tag{D.2}$$

an assignment-polynomial (short: A-polynomial), where $p_{ir}$ are variables with $i \in N$, $r \in K$. The degree $m$ of the A-polynomial $z(p)$ is defined by $m := \max_{T \in \mathcal{T}} |T|$.

• Let $U \subseteq N$, then we denote by $P_U$ the set of all A-polynomials (D.2) which contain variables $p_{ir}$ with $i \in U$.

• Let $z \in P_N$. If $k = 2$ then we call $z$ a Boolean A-polynomial; if $m = 2$ then $z$ is referred to as a quadratic A-polynomial.

• If moreover an operator $F := F[\xi]$ is given, then we denote the set of all A-polynomials in $P_N$ for which $F$ is well defined by $P_N^F$.

Definition D.18 (Equivalent A-polynomials (Def. 2.5))
Let $z_1, z_2 : \mathbb{R}^{nk} \to \mathbb{R}$ be two A-polynomials. Then

$$z_1 \equiv z_2 \;:\Leftrightarrow\; z_1(p) = z_2(p) \quad \forall p \in \Delta$$

and we call $z_1$ a form of $z_2$.
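Two A-polynomials can be different as polynomials and still be forms of each other, because every row of a feasible point sums to one. The sketch below illustrates this with two hypothetical polynomials `Z1` and `Z2` for n = 2, k = 2. All names and the test grid are assumptions made here, and indices are 0-based.

```python
from math import prod

# Two hypothetical A-polynomials, each given as a list of (weight, partial assignment).
Z1 = [(2.0, {(0, 0)})]                                     # 2 * p_00
Z2 = [(2.0, {(0, 0), (1, 0)}), (2.0, {(0, 0), (1, 1)})]    # 2 * p_00 * (p_10 + p_11)


def evaluate(terms, p):
    """Value of the A-polynomial (without constant term) at the point p."""
    return sum(w * prod(p[i][r] for (i, r) in T) for w, T in terms)


# Grid of feasible points: each of the two rows has the form (a, 1 - a).
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
points_on_delta = [[[a, 1.0 - a], [b, 1.0 - b]] for a in grid for b in grid]

# Largest difference between the two polynomials over the sampled feasible points.
max_gap = max(abs(evaluate(Z1, p) - evaluate(Z2, p)) for p in points_on_delta)

# A point whose second row sums to 2, so it is not feasible.
outside = [[1.0, 0.0], [1.0, 1.0]]
print(max_gap, evaluate(Z1, outside), evaluate(Z2, outside))
```

On the sampled feasible points the two polynomials coincide, while they differ at the infeasible point `outside`. A finite grid only illustrates the equivalence. Here it also follows algebraically, since the second row sums to one on the feasible set.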

Definition D.19 (Form-Independence (Def. 2.47))
Let $F := F[\xi]$ and $\bar{z} \in P_N^F$ be fixed. Moreover, let for all $z \in P_N^F$ with $z \equiv \bar{z}$ a set $M(z, p)$ depending on $z$ and $p \in \Delta$ be given. If for all $p \in \Delta$ and $z \in P_N^F$ with $z \equiv \bar{z}$ : $M(z, p) = M(\bar{z}, p)$, then we define $M(p) := M(\bar{z}, p)$ and call the set $M(p)$ form-independent.

Definition D.20 (Universal Region of Attraction: $URA(p^*)$ (Def. 2.48))
Let $F := F[\xi]$, $\bar{z} \in P_N^F$ and $p^*$ be an attractor of $F$. Then the universal region of attraction URA of $p^*$ is defined by

$$URA(p^*) := \bigcap_{z \in P_N^F,\ z \equiv \bar{z}} RA(z, p^*),$$

where $RA(z, p^*)$ is the region of attraction of $p^*$ for $z$.


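Definition D.21 (Guaranteed Region of Attraction: $GRA_1(p^*)$ (Def. 2.50))
Let $F := F[\Upsilon]$, $z \in P_N^F$ and $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for $i \in N$ be an attractor of $F$.

Moreover, let

$$
\begin{aligned}
\Delta_{p^*} &:= \{ p \in \Delta \mid p_{i1} > 0,\ \forall i \in N \} \\
M_1^F(p^*) &:= \{ p \in \Delta_{p^*} \mid \Upsilon_{i1}(z, p) > \Upsilon_{ir}(z, p),\ \forall i \in N,\ r > 1 \} \\
Q_1(p, p^*) &:= \{ q \in \Delta \mid q_{i1} \ge p_{i1},\ \forall i \in N \}.
\end{aligned}
$$

Then the guaranteed region of attraction $GRA_1(p^*)$ is defined by

$$GRA_1(p^*) := \{ p \in M_1^F(p^*) \mid Q_1(p, p^*) \subseteq M_1^F(p^*) \}.$$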

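Definition D.22 ($M_{2,c}^F(p^*)$, $GRA_2^c(p^*)$ (Def. 2.53))
Let $F := F[\Upsilon]$, $z \in P_N^F$ and $p^* \in \Delta^I$ with $p^*_{i1} = 1$ for all $i \in N$ be an attractor of $F$.

Moreover, let $c_i^* := \max_{r > 1} \{ [\ldots] \}$ for all $i \in N$. Then we define for any $c \in \mathbb{R}^n$ with $c_i^* < c_i < 1$ and

$$p_{i1} > c_i \quad \forall i \in N \tag{D.3}$$

$$c_i \Upsilon_{i1} + (1 - c_i) \Upsilon_{is} > \Upsilon_{ir} \quad \forall i \in N,\ r, s \in K \setminus \{1\},\ r \neq s \tag{D.4}$$

$$\Upsilon_{i1} > \Upsilon_{ir} \quad \forall i \in N,\ r > 1 \tag{D.5}$$

the polytope $M_{2,c}^F(p^*)$ by

$$M_{2,c}^F(p^*) := \{ p \in \Delta \mid \text{(D.3), (D.4), (D.5)} \}.$$

If $Q_2(p, p^*)$ is given by

$$Q_2(p, p^*) = \{ q \in \Delta \mid q_{i1} \ge p_{i1},\ q_{ir} \le p_{ir}\ \ \forall i \in N,\ r > 1 \}$$

then a subset of the universal region of attraction corresponding to $M_{2,c}^F(p^*)$ is defined by

$$GRA_2^c(p^*) := \{ p \in M_{2,c}^F(p^*) \mid Q_2(p, p^*) \subseteq M_{2,c}^F(p^*) \}.$$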



Curriculum Vitae

Personal Data

Name: Michael Burkard

Date of birth: August 7th, 1971

Place of birth: Graz, Austria

Nationality: Austria

Marital status: Single

Education

1977-1981 Primary school in Berg. Gladbach, Germany

1981-1989 High school in Graz, Austria

1990-1994 Study of Technical Mathematics at TU Graz, Austria

1994-2000 Assistant at the Institute for Operations Research (IFOR), ETH

Zürich. Writing a dissertation at IFOR on a heuristic for constrained

semi-assignment problems (Prof. H.-J. Lüthi).
