
A Heuristic for Bottleneck Crossing Minimization and its Performance on General Crossing Minimization: Hypothesis and Experimental Study

MATTHIAS F. STALLMANN, Department of Computer Science, North Carolina State University

Extensive research over the last twenty or more years has been devoted to the problem of minimizing the total number of crossings in layered directed acyclic graphs (dags). Algorithms for this problem are used for graph drawing, to implement one of the stages in the multi-stage approach proposed by Sugiyama et al. [1981]. In some applications, such as minimizing the deleterious effects of crosstalk in VLSI circuits, it may be more appropriate to minimize the maximum number of crossings over all the edges. We refer to this as the bottleneck crossing problem. This paper proposes a new heuristic, maximum crossings edge (mce), designed specifically for the bottleneck problem. It is no surprise that mce universally outperforms other heuristics with respect to bottleneck crossings. What is surprising, and the focus of this paper, is that, in many settings, the mce heuristic excels at minimizing the total number of crossings. Experiments on sparse graphs support the hypothesis that mce gives better results (vis-à-vis barycenter) when the maximum degree of the dag is large. In contrast to barycenter, the number of crossings yielded by mce is further reduced as runtime is increased. Even better results are obtained when the two heuristics are combined and/or barycenter is followed by the sifting heuristic reported in [Matuszewski et al. 1999].

Categories and Subject Descriptors: G.2.2 [Discrete Mathematics]: Graph Theory—Graph Algorithms

General Terms: Algorithms, Experimentation, Performance

Additional Key Words and Phrases: barycenter heuristic, crossing minimization, sifting heuristic

ACM Reference Format:
Stallmann, M. F. 2011. A Heuristic for Bottleneck Crossing Minimization and its Performance on General Crossing Minimization: Hypothesis and Experimental Study. ACM J. Exp. Algor. V, N, Article A (January YYYY), 32 pages.
DOI = 10.1145/0000000.0000000 http://doi.acm.org/10.1145/0000000.0000000

1. BACKGROUND

An ℓ-layer graph G = (V, E) has V = V1 ∪ ... ∪ Vℓ (a disjoint union) and E ⊆ ⋃1≤i<ℓ (Vi × Vi+1). In other words, the nodes are partitioned into ℓ layers and all edges connect vertices on adjacent layers. It is usually assumed that the graph is directed and acyclic and that a directed edge vw has v ∈ Vi and w ∈ Vi+1 for some i. Some drawing algorithms for general graphs first convert a graph into this format (see, e.g., [Sugiyama et al. 1981] and [Di Battista et al. 1999, Chapter 9]).

An embedding of a layered graph G assigns the nodes of Vi to points on the line y = i and gives a permutation πi of Vi so that, for each v ∈ Vi, the node v is mapped to the point (πi(v), i). The value πi(v) is the position of v (on its layer) – the simpler notation p(v) is used from here on. If each vw ∈ E is mapped to a straight line (or arrow), an embedding may induce crossings among the edges. In particular, the edge vw crosses xy – assuming v, x ∈ Vi and w, y ∈ Vi+1 – if p(v) < p(x) and p(w) > p(y). The crossing number of edge vw is the number of edges xy that cross vw – denote this as c(vw). The well-known crossing number problem is to minimize the total number of crossings, c(E) = (1/2) ∑e∈E c(e); the factor of 1/2 accounts for each crossing being counted at both of its edges. This is motivated, among other contexts, by the desire to create esthetically pleasing drawings of graphs [Di Battista et al. 1999, Chapter 9].

A related optimization problem, the bottleneck crossing problem, seeks to minimize ĉ(E) = maxe∈E c(e). From here on the original crossing number problem is referred to as the total problem and the bottleneck crossing problem as the bottleneck problem. The ĉ notation is extended to subsets of E, i.e., ĉ(E′) means maxe∈E′ c(e), where E′ ⊆ E.

It is easy to show that the bottleneck problem, like the total problem, is NP-hard. The proof by Garey and Johnson [1983] for the total problem can be adapted to use the


Bandwidth problem [Garey and Johnson 1979, Appendix A1.3] instead of Minimum Linear Arrangement. The primary application for the bottleneck problem is the VLSI crosstalk problem as described in [Bhatt and Leighton 1984] – the real problem is a bottleneck problem rather than a total problem: maximum delay and probability of a defect in a circuit are both related to maximum crosstalk along a single wire. See also [Takahashi et al. 2005] for a more detailed description of how wire crossings degrade the performance of a circuit.
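The definitions above are easy to make concrete. The following Python sketch (all names are hypothetical, not from the paper's implementation) computes c(e) for every edge of a two-layer embedding by testing each pair of edges for an inversion, then derives the total and bottleneck objectives:

```python
from itertools import combinations

def crossing_numbers(edges, pos_top, pos_bottom):
    """Compute c(e) for each edge of a two-layer drawing.

    edges: list of (v, w) pairs, v on the top layer, w on the bottom.
    pos_top / pos_bottom: dicts mapping each node to its position p(.).
    Edges vw and xy cross iff their endpoints appear in opposite orders
    on the two layers.
    """
    c = {e: 0 for e in edges}
    for e, f in combinations(edges, 2):
        (v, w), (x, y) = e, f
        # opposite orders on the two layers <=> the position differences
        # have opposite signs
        if (pos_top[v] - pos_top[x]) * (pos_bottom[w] - pos_bottom[y]) < 0:
            c[e] += 1
            c[f] += 1
    return c

def total_crossings(c):
    # c(E) = (1/2) * sum of c(e): each crossing is counted at both edges
    return sum(c.values()) // 2

def bottleneck(c):
    # the bottleneck objective: the largest c(e) over all edges
    return max(c.values())
```

For example, with edges (a1,b2), (a2,b1), (a3,b3) and positions 1, 2, 3 on each layer, only the first two edges cross, so the total and bottleneck are both 1.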

Theoretical results have been obtained for (a) the two-layer crossing minimization problem where one layer is fixed (see [Di Battista et al. 1999, Section 9.2] and [Munoz et al. 2001; Li and Stallmann 2002; Nagamochi 2005; Cakiroglu et al. 2007; Dujmovic et al. 2003; Dujmovic and Whitesides 2004]); (b) the two-layer problem when both layers may be permuted [Shahrokhi et al. 2000; 2001]; and (c) the k-layer problem [Dujmovic et al. 2008]. Junger and Mutzel [1997] have proposed an integer programming model for this situation and a solution technique that works for small problem instances. Later work on exact algorithms includes results by Mutzel [2001], Chimani et al. [2006; 2008; 2011] and Buchheim et al. [2006], among others. Both the Junger and Mutzel paper and one by Stallmann et al. [2001] report extensive experiments for solving the both-layer permutation problem.

Many heuristics have also been proposed for the multi-layer problem. The most popular are barycenter, median, and sifting [Bachmaier et al. 2010; Matuszewski et al. 1999]. Preprocessing to obtain a better initial ordering has also been proposed. In the two-layer case, breadth-first search and a more sophisticated variant thereof (see [Stallmann et al. 2001]) appear to work well. Depth-first search is preferred for the multi-layer case (see, e.g., [Gansner et al. 1993; Gupta and Stallmann 2010]). Some meta-heuristics have also been proposed for both the two- and multi-layer problems (see, e.g., [Valls et al. 1996; Laguna et al. 1997; Srivastava and Sharma 2008]).

The maximum crossings edge (mce) heuristic, introduced here, is specifically designed to solve the bottleneck problem. It is not surprising that our heuristic, as described below, universally outperforms other well-known heuristics (barycenter, sifting, ...) with respect to the bottleneck problem. For some of these results, see [Gupta and Stallmann 2010]. A minor modification of the integer program described by Junger and Mutzel could be used to solve small instances of the bottleneck problem on two-layer dags.

An unexpected observation about the mce heuristic is that it also outperforms well-known heuristics with respect to the total problem on many problem instances. The focus of this paper is an attempt to characterize these instances and come up with a reasonable hypothesis for this behavior. In the process we hope to gain a better understanding of some of the strengths and weaknesses of various approaches to crossing minimization, a topic that may be of relevance as parallel implementations of crossing minimization heuristics are considered.¹

The following section describes the mce heuristic. Section 3 proposes a hypothesis relating to the ability of mce to solve the total problem and gives an example to supply supporting intuition. In Section 4, the details of the experimental test bed are described. Section 5 reports the primary results, while Section 6 gives experimental data for longer execution times and combinations of heuristics. The observed outcomes and some proposals for future work are presented in Section 7.

¹Referred to here are recent discussions at the Dagstuhl workshop 11191, "Graph Drawing with Algorithm Engineering Methods." See [Ajwani et al. 2011].


2. THE MAXIMUM CROSSINGS EDGE HEURISTIC

Generally, heuristics for crossing minimization in multi-layer graphs fall into two main categories:

— Sorting heuristics: each iteration sorts the nodes on layer i based on their relationship to layer(s) i − 1 and/or i + 1. For example, the barycenter heuristic sorts using the mean position of the nodes adjacent to x as x's key.

— Node-insertion heuristics: each iteration positions a node x on its layer so that some measure is optimized. The sifting heuristic described in [Matuszewski et al. 1999; Schonfeld 2000] uses the obvious measure: minimum total crossings. As described in more detail later, the mce heuristic uses a more complicated local optimum.

Many well-known heuristics are static: the order in which layers (for sorting) or nodes (for node-insertion) are considered is fixed. Barycenter does an upward sweep, sorting layers i = 2, ..., ℓ with respect to layer i − 1, followed by a downward sweep that sorts layers i = ℓ − 1, ..., 1 with respect to layer i + 1.
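The upward sweep just described can be sketched in a few lines of Python. This is a minimal illustration under assumed data structures (lists of layers, a dict of downward adjacencies), not the paper's implementation; the downward sweep is symmetric.

```python
def barycenter_pass(layers, adj_below, positions):
    """One upward sweep of the barycenter heuristic (illustrative sketch).

    layers: list of lists of nodes, layers[0] ... layers[l-1], bottom to top.
    adj_below[v]: neighbors of v on the layer below its own.
    positions: dict node -> current position on its layer (updated in place).
    Each layer i >= 1 is sorted by the mean position of its neighbors on
    layer i - 1 (a node with no neighbors below keeps its position as key).
    """
    for i in range(1, len(layers)):
        def key(v):
            nbrs = adj_below.get(v, [])
            if not nbrs:
                return positions[v]
            return sum(positions[u] for u in nbrs) / len(nbrs)
        layers[i].sort(key=key)
        # renumber positions 1..n on the re-sorted layer
        for p, v in enumerate(layers[i], start=1):
            positions[v] = p
    return layers
```

With two layers, for instance, a top-layer node whose sole lower neighbor sits at position 1 is placed before one whose neighbor sits at position 3.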

Sifting chooses nodes in a given order π(V) and completes a pass during which every node is inserted. Passes are repeated until there is no improvement (a failure), at which point another sequence of passes is done using π(V) in reverse order, again, until a failure occurs. The heuristic then continues in the same manner until the number of failures reaches a predefined threshold. According to Matuszewski et al. [1999] the best results occurred when π(V) sorted the nodes by decreasing degree.

In his Master's thesis, Gupta [2008] reports experimental results for, among other heuristics, dynamic versions of barycenter and sifting. The former repeatedly chooses a layer i, not previously chosen, whose incident edges have the most crossings, and sorts it with respect to layers i − 1 and/or i + 1. The latter – maximum crossings node (mcn) – repeatedly chooses a node, not previously chosen, whose incident edges have the most crossings. A pass is completed when all layers, respectively nodes, have been chosen. All heuristics described here do multiple passes and have three termination options: stop when there is no improvement, stop after a fixed number of iterations, or stop after a given amount of time.

The maximum crossings edge (mce) heuristic, another dynamic node-insertion variant, was specifically designed to minimize bottleneck crossings, i.e., ĉ(E) = maxe∈E c(e). Mce differs from sifting (and mcn) in two ways:

— The choice of node for insertion is based on the edge vw that currently has the maximum number of crossings. Mce chooses either v or w or both, depending on which have not been previously chosen.

— The node insertion in mce is based on a local rather than a global optimum. Whereas sifting and mcn position a node x so that it minimizes total crossings, or, more precisely, the total number of crossings among edges E(y) for all y ∈ layer(x), mce takes a different approach: mce determines the position of x to be the one that minimizes the maximum number of crossings among the edges incident to x only.

Using the same terminology as for the other heuristics, an iteration of mce refers to the insertion/positioning of a single node and a pass inserts each node exactly once.

The outer loop of the mce heuristic, described in Figure 1(a), considers each node once, based on the number of times one of its incident edges is crossed. First an edge is chosen, the one with the most crossings, and then each of its endpoints, if it has not been previously chosen, is inserted on its layer via the EDGESIFT procedure. The most important parts of the loop body are lines (4) and (6), which can, and usually will, change the edge to be chosen in the next iteration.


MCE is
(1) while there is an edge with at least one unmarked endpoint do
(2)     let e = vw be chosen so that c(e) = maxf∈E c(f) and v, w are not both marked
(3)     if v is not marked then EDGESIFT(v)
(4)         and update c(f) for f ∈ Ei where i = layer(v) endif
(5)     if w is not marked then EDGESIFT(w)
(6)         and update c(f) for f ∈ Ei where i = layer(w) endif
(7)     mark both v and w
(8) end do

(a) The main mce algorithm.

EDGESIFT(x) is
(1) let y1, ..., yk be the nodes on layer(x) sorted by position
(2) maintain min = the minimum (local) number of bottleneck crossings so far
(3)     and p = the position at which min occurred
(4) for i = p(x) − 1 downto 1 do
(5)     swap x with yi and update c(e) for e ∈ E(x) ∪ E(yi)
(6)     let ci = maxe∈E(x)∪E(yi) c(e)
(7)     if ci < min then let min = ci and p = i
(8) end do
(9) repeat the preceding swapping loop for i = 1 to k
(10) if p < p(x) then move x before yp
(11) else if p > p(x) then move x after yp
At this point maxe∈E(x) c(e) is minimized wrt the current ordering of other layers

(b) The algorithm for a single positioning iteration.

SWAP(v, w) is
(1) let v1, ..., vdeg(v) be the nodes adjacent to v, sorted by position
(2) let w1, ..., wdeg(w) be the nodes adjacent to w, sorted by position
(3) do an insertion sort by position,
        starting with the order v1, ..., vdeg(v), w1, ..., wdeg(w):
(4)     whenever there is an inversion between vi and wj do
(5)         decrement c(vvi) and c(wwj)
(6) repeat the sort, starting with w1, ..., wdeg(w), v1, ..., vdeg(v),
        and incrementing instead of decrementing the c's

(c) The algorithm for swapping two nodes during positioning.

Fig. 1. The three functions that compose the mce algorithm.

EDGESIFT finds an optimal position for node x, one that minimizes the maximum number of crossings of any incident edge, or max(c(f)) where f is any edge incident on x. It does so by noting the effect on max(c(f)) each time x is swapped with one of its current neighbors on layer(x). In Figure 1(b), the variable min keeps track of the current minimum value of max(c(f)). The swapping sequence first moves x all the way to the left, position 1, and then all the way back to the right.
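For intuition, the position scan can be expressed as a brute-force Python sketch. This is a simplification under stated assumptions: names are hypothetical, the crossing counts over x's own edges are recomputed from scratch at every candidate slot (whereas EDGESIFT maintains them incrementally via SWAP), and ties are broken toward slots farther from the starting position, as the paper's tie-breaking rule prescribes.

```python
def edgesift_position(order, nbrs, x):
    """Locally optimal slot for x on its layer (brute-force sketch).

    order: current left-to-right order of x's layer (list of nodes).
    nbrs:  dict mapping each node on this layer to the positions of its
           neighbors on one adjacent layer.
    Returns (best_slot, best_cost) where cost is the maximum number of
    crossings over the edges incident to x only -- the local measure
    that mce optimizes.
    """
    others = [v for v in order if v != x]
    start = order.index(x)
    best_slot, best_cost = start, None
    for slot in range(len(order)):
        trial = others[:slot] + [x] + others[slot:]   # x occupies index slot
        cost = 0
        for u in nbrs[x]:                 # each edge (x, u)
            cr = 0
            for j, y in enumerate(trial):
                if y == x:
                    continue
                for t in nbrs[y]:
                    # edges (x,u) and (y,t) cross iff their endpoint
                    # orders disagree on the two layers
                    if (j < slot and t > u) or (j > slot and t < u):
                        cr += 1
            cost = max(cost, cr)
        if best_cost is None or cost < best_cost or \
           (cost == best_cost and abs(slot - start) > abs(best_slot - start)):
            best_slot, best_cost = slot, cost
    return best_slot, best_cost
```

For example, with two nodes whose edges cross in the current order, moving x to the other slot removes the crossing.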

The most important part of the EDGESIFT procedure is the interpretation of optimum position. The position of x is evaluated locally, considering only the edges incident on the two nodes being swapped. So EDGESIFT may not find the


Fig. 2. The minimum position for a node is not optimal.

real position of minimum bottleneck. Figure 2 illustrates this. In the top picture, edge (a7,b2), shown as a thick red line, is chosen – it has the most crossings: five. EDGESIFT is first applied to a7: the minimum value of c(E(a7)) is achieved when it is swapped with a6, as shown in the bottom picture. When EDGESIFT is applied to b2 its position does not change. Considering only the edges incident on a7 and b2 – shown as dashed blue lines in the bottom picture – the maximum number of crossings is 4, the number for the edge (a7,b5). The number of crossings on edge (a6,b1) has increased to 6, but that edge is not taken into account by the EDGESIFT algorithm – neither of its endpoints is an endpoint of (a7,b2).

One might consider the failure to find the true minimum position a drawback of the heuristic. It turns out, however, to be an advantage (in addition to avoiding the obvious difficulty in finding it). The position that actually optimizes the bottleneck in Figure 2 – imagine that these are just the first two layers in a larger dag – is the starting position, a common phenomenon when there are many crossings to start with. Insisting on a true bottleneck minimum for the current EDGESIFT is therefore likely to trap the heuristic in a local optimum. It is desirable to promote movement of nodes, especially in the early stages. The EDGESIFT algorithm takes this one step further: it chooses a position farthest away from the starting position when breaking ties – this detail is not shown in Figure 1(b).

We turn now to the problem of updating c(e) for e ∈ E(x) ∪ E(yi) when x and yi are swapped, line (5) of EDGESIFT. This is the SWAP procedure in Figure 1(c). SWAP assumes that v and w are in neighboring positions on the same layer. Without loss of generality, let p(v) < p(w) before the swap and p(v) > p(w) after it. The SWAP procedure is essentially the same as the O(|E| + |C|) inversion-counting algorithm of Barth et al. [2002]. However, instead of merely counting inversions among the relevant edges, we need to update c(e) for each one.

For a more detailed description, let us restrict our attention to the subset of E(v) ∪ E(w) incident on the layer above that of v, w. The same procedure is applied (separately) for the edges incident on the layer below. To reflect the fact that v will no longer be to the left of w, we decrement c(e) and c(f) if e crosses f when p(v) < p(w). Conversely, we increment c(e) and c(f) if e crosses f when p(v) > p(w).
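The update rule can be sketched directly over pairs of neighbor positions. The Python sketch below (hypothetical names) enumerates all deg(v)·deg(w) pairs, the same asymptotic cost as the insertion sort in SWAP, which touches only the inverted pairs; an edge vvi crosses wwj before the swap exactly when vi > wj, and after it exactly when vi < wj.

```python
def swap_update(c, v, w, v_nbrs, w_nbrs):
    """Update per-edge crossing counts when adjacent v, w swap (sketch).

    Assumes p(v) < p(w) before the swap.  v_nbrs / w_nbrs are the positions,
    on one adjacent layer, of the neighbors of v and w.  c maps a
    (node, neighbor-position) pair -- i.e., an edge -- to its crossing count.
    """
    for vi in v_nbrs:
        for wj in w_nbrs:
            if vi > wj:
                # crossing that disappears once v moves to the right of w
                c[(v, vi)] -= 1
                c[(w, wj)] -= 1
            elif vi < wj:
                # crossing created by the swap
                c[(v, vi)] += 1
                c[(w, wj)] += 1
            # vi == wj: a shared endpoint, no crossing either way
    return c
```

The same routine is then run for the neighbor positions on the other adjacent layer.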


Table I. The effect of max degree on relative performance of barycenter versus mce, based on classes of dags with similar structure and varying max degrees. All statistics are averages over 32 instances of the relevant class.

    max degree              6.5    7.5   10.3   14.0   20.9   32.9
    crossings (bary/mce)   0.79   0.84   0.90   0.99   1.10   1.16

The time bound for each iteration of mce has three main components: (a) choosing an edge with maximum c(e) – line (2) of MCE; (b) the updates of c(e) for all edges encountered in EDGESIFT – the insertion sorts in SWAP; and (c) the time it takes to update crossings for the edges in line (6) of MCE. For convenience, let m and n represent the number of edges and vertices of the graph, respectively, let ni = |Vi|, and let mi = |Ei| be the number of edges incident on (vertices of) layer i. Part (a) can be done in time O(m) per iteration by simply scanning through the list of edges. Part (b) requires more careful analysis. An edge sifting operation with node x will swap x with each node on layer(x) at most twice. A swap between two nodes v and w – SWAP in Figure 1(c) – takes O(deg(v) deg(w)) to do the insertion sort that counts inversions. For an iteration where x is on layer i, this gives a total time of O(mi deg(x)). Mce does part (c) by systematically counting the crossings for each edge: each time e is found to cross edge f – easily determined by the positions of their endpoints – the crossing count is incremented for both e and f. This takes O(mi²), where i = layer(v) (or layer(w), as the case may be), and dominates the time bound.

The analysis of sifting follows that of part (b) above. Parts (a) and (c) are unique to mce. So the time per iteration for sifting is O(mi deg(x)). Total crossings can be updated on the fly.

Based on [Barth et al. 2002] the time for a barycenter iteration is O(mi log mi): it takes O(ni log ni) to sort layer i and O(mi log mi) to update the total crossings – essential if we want to update the minimum so far.

Assuming the graphs are sparse and the incident edges are roughly equally distributed among the layers – both true for the dags in our study – these bounds translate to O(n²/ℓ²) for mce, O(n/ℓ) for sifting, and O((n/ℓ) log(n/ℓ)) for barycenter.

The actual implementations used in the experiments described here all have the O(n²/ℓ²) bound ascribed to mce: all heuristics update edge crossings after every iteration so that they can be compared with respect to bottleneck crossings as well as total crossings. The reader can judge, based on all subsequent experimental results, whether or not this matters. Mce still takes significantly more time per iteration than either sifting or barycenter – see Section 6.1.

3. A HYPOTHESIS AND THE INTUITION BEHIND IT

Now to a discussion of when and why mce outperforms other heuristics with respect to total crossings – here the focus is on barycenter; some results on sifting are reported later. Preliminary experiments reported in [Gupta and Stallmann 2010] yielded the results in Table I. Both the max degree and the ratio of crossing numbers are averages over classes of 100 randomly generated dags, each dag having 14 layers, 40 nodes per layer, and roughly 580 edges. A bias parameter was adjusted to obtain different max degrees – more about the generation of these types of dags later.

While the trend is not always so clear cut, many experiments done since have demonstrated the same phenomenon with respect to classes of dags having different numbers of layers and different densities. Experiments were restricted to dags with average degree no more than 2.5 – dags that are too dense have smaller variation among the performance of different heuristics.


(a) Barycenter is stuck: 28 total crossings, 16 bottleneck crossings. Numbers above and below the dag are barycenter weights.

(b) One iteration of mce: 82 total crossings, 10 bottleneck crossings. The edge with maximum number of crossings and the sifted node are indicated with thick red lines.

(c) Two iterations of mce: 84 total crossings, 12 bottleneck crossings. The other endpoint of the max crossings edge has been sifted.

Fig. 3. Barycenter ordering and two iterations of mce on a two-layer example.


(a) Three iterations of mce: 74 total crossings, 11 bottleneck crossings.

(b) The iteration preceding the final one – the one with fewest crossings – for mce: 22 total crossings, 9 bottleneck crossings.

(c) The final configuration for mce (no improvements after this point): 14 total crossings, 9 bottleneck crossings, after 157 iterations. Note: a configuration with only 5 bottleneck crossings was encountered at iteration 132.

Fig. 4. The third and the two last iterations of mce on the two-layer example.


Working Hypothesis. The relative performance of mce versus barycenter with respect to total crossings is correlated with the maximum degree of the dag. In particular, a dag with large maximum degree yields better results for mce than for barycenter.

Why should this be so? Consider the illustrative example depicted in Figures 3 and 4. The key characteristic of this example is that there are a few nodes with high degree accompanied by many nodes of degree 1 and a few others of small degree > 1.

It is useful to refer to a channel, the set of edges between two consecutive layers in a multi-layer dag. The example actually represents a single channel drawn from a multi-layer dag that exhibited poor results for the barycenter heuristic.

Let the channel degree of a node be the number of its incident edges that are in the channel. The degree discrepancy of a channel is the ratio of the maximum channel degree to the median channel degree, ignoring nodes whose channel degree is 0. For the channel in the example, the degree discrepancy is 9, the degree of b2. The vast majority of nodes have channel degree 1 even though their degrees in the larger dag from which this channel is taken are mostly > 1.
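The degree discrepancy just defined is straightforward to compute from a channel's edge list; the following Python sketch (hypothetical names) reproduces the example's value of 9 for a channel with one degree-9 node and nine degree-1 neighbors.

```python
from statistics import median

def degree_discrepancy(channel_edges):
    """Degree discrepancy of a channel: max channel degree over median
    channel degree, taken over nodes with nonzero channel degree.

    channel_edges: list of (v, w) pairs between two consecutive layers.
    """
    deg = {}
    for v, w in channel_edges:
        deg[v] = deg.get(v, 0) + 1
        deg[w] = deg.get(w, 0) + 1
    # nodes of channel degree 0 never appear in deg, so they are
    # automatically ignored, as the definition requires
    degrees = list(deg.values())
    return max(degrees) / median(degrees)
```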

Large degree discrepancies must exist in a sparse dag with large maximum degree. In multi-layer dags they exist in many channels – the maximum degree is large and the median degree is 1 – but overall degree discrepancy is difficult to quantify in a useful fashion. In particular, degree discrepancy does not capture the existence of more than one high-degree node or the presence of adjacent high-degree nodes, both important as we shall soon see.

Figure 3(a) shows a configuration at which barycenter gets stuck; call it a barycenter trap if it is far from optimum (or the best known solution). Using that configuration as a starting point, Figures 3(b) and (c) show the first two iterations of mce. The new position of b1 in Figure 3(b) reduces the number of crossings for the edge (a19,b1) from 16 to 7 while leaving the number for (a18,b1) at 10. The crossings on the remaining edges into b1 increase from 0 to 7, which does not affect the maximum number of crossings involving edges incident on b1. The number of total crossings, however, increases dramatically because almost all edges into b2 cross almost all of those into b1.

The second iteration of mce, illustrated in Figure 3(c), increases both the total crossings and the bottleneck crossings. How can this be? The increase is due to the edge (a18,b2), which is not incident on either a19 or b1. This edge becomes the maximum crossings edge for iteration 3.

The third iteration, shown in Figure 4(a), reduces the number of crossings on the edge (a18,b2) to 1, while increasing those on (a19,b1) to 11. The deciding factor is the new position of b2. There is no better position for a18.

The penultimate position for mce is in Figure 4(b). Nodes a18 and b2 are close to the center, usually a good position for high-degree nodes. The only loose end is a degree-1 node that is easily moved to a significantly better position, as Figure 4(c) shows. Mce has saved an earlier configuration with the fewest bottleneck crossings. If it is run for more than 157 iterations, up to 10,000, neither the bottleneck nor the total crossings decrease, although the configurations continue to change and the minima are encountered several times – here the next edge to be considered is either (a8,b2) or (a18,b3), both with 7 crossings.

Although 5 is the minimum number of bottleneck crossings achieved by various heuristics and combinations thereof (see Section 6.2), and is likely the optimum, all heuristics/combinations are able to find solutions with 11 total crossings, depending on the initial configuration.

The most relevant characteristic of the example is one that makes life difficult for the barycenter heuristic. A high-degree node x is “pulled” into a random position by

ACM Journal of Experimental Algorithmics, Vol. V, No. N, Article A, Publication date: January YYYY.


its degree-1 neighbors y1, . . . , yd. After that, y1, . . . , yd simply follow x, pulling it farther along that direction, i.e., farther from the middle. This is not a problem if x has no higher-degree neighbors. However, if it does, then a higher-degree neighbor z will often stay on the opposite side of the middle. The presence of other high-degree nodes can then lead to many crossings.
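The “pull” described above is easy to see in code. The following is a minimal sketch of one downward barycenter pass; it is not the paper’s implementation, and all names and the data layout are illustrative only.

```python
def barycenter_pass(layers, edges):
    """One downward pass of the barycenter heuristic (sketch): reorder
    each layer by the mean position of each node's neighbors on the
    layer above.

    layers: list of lists of nodes, top layer first.
    edges:  (u, w) pairs with u one layer above w.
    """
    pos = {v: k for layer in layers for k, v in enumerate(layer)}
    above = {}
    for u, w in edges:
        above.setdefault(w, []).append(u)
    for i in range(1, len(layers)):
        def bary(v):
            ns = above.get(v, [])
            # a node with no neighbors above keeps its current position
            return sum(pos[u] for u in ns) / len(ns) if ns else pos[v]
        layers[i].sort(key=bary)
        for k, v in enumerate(layers[i]):
            pos[v] = k
    return layers
```

A high-degree node whose neighbors above are mostly degree-1 nodes is placed wherever those neighbors happen to sit, which is exactly the pull at issue here.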

The mce heuristic, on the other hand, immediately focuses on edges that connect high-degree nodes on opposite sides.
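The contrast can be sketched as follows. This is not the paper’s implementation (Figure 1(c) gives that); it is a brute-force illustration in which one endpoint of the current maximum crossings edge is sifted to the position minimizing the bottleneck among its incident edges. The choice of endpoint and the placement criterion are simplifying assumptions.

```python
from itertools import combinations

def crossings_per_edge(edges, layer, pos):
    """Brute-force c(e). edges: (v, w) with w one layer below v;
    two edges of the same channel cross iff their endpoints interleave."""
    c = {e: 0 for e in edges}
    for (v, w), (x, y) in combinations(edges, 2):
        if layer[v] == layer[x] and (pos[v] - pos[x]) * (pos[w] - pos[y]) < 0:
            c[(v, w)] += 1
            c[(x, y)] += 1
    return c

def mce_iteration(edges, layer, pos):
    """One iteration (sketch): pick a maximum crossings edge, then try
    every position for one endpoint, keeping the placement that
    minimizes the max c(e) over that endpoint's incident edges."""
    c = crossings_per_edge(edges, layer, pos)
    v, w = max(c, key=c.get)              # the maximum crossings edge
    mates = sorted((u for u in pos if layer[u] == layer[w]), key=pos.get)
    rest = [u for u in mates if u != w]
    best_p, best_val = pos[w], None
    for p in range(len(mates)):
        trial = dict(pos)
        for k, u in enumerate(rest[:p] + [w] + rest[p:]):
            trial[u] = k
        ct = crossings_per_edge(edges, layer, trial)
        val = max(ct[e] for e in edges if w in e)
        if best_val is None or val < best_val:
            best_val, best_p = val, p
    for k, u in enumerate(rest[:best_p] + [w] + rest[best_p:]):
        pos[u] = k
    return (v, w), best_val
```

On the two-layer trap of Section 3, repeated calls behave as the figures describe: reducing the bottleneck for one edge can raise the crossing counts of edges elsewhere in the channel.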

4. EXPERIMENTAL METHODOLOGY

In [Gupta and Stallmann 2010] our primary aim was to demonstrate the effectiveness of mce for minimizing bottleneck crossings. The hypothesis described in Section 3 was an afterthought. The problem instances we chose covered a small but broad sample of dags, almost all of which had fewer than 600 nodes. These included the 100-node Rome graphs2 – all 140 of them; randomly generated dags with 560 nodes, average degrees 2.5, 3, and 4, with nodes divided evenly among either 14 or 40 layers; and random trees with 560 nodes on either 14 or 40 layers, not necessarily divided evenly – these were minimum spanning trees on random points in the unit square arranged so that each path would traverse as many layers as possible. We created a class of 100 instances of each type and aggregated results over each class. Because solution quality (both bottleneck and total crossings) was updated at the end of each iteration, we ran both mce and barycenter for a fixed number – in this case 10,000 – of iterations. Finally, the execution of each heuristic was preceded by a depth-first search, with nodes arranged on each layer according to their preorder number. This preprocessing took very little time and usually improved the performance of both heuristics, particularly with respect to bottleneck crossings.

In the experimental results reported here, the idea was to mitigate the effects of irrelevant factors and to avoid having too many variables. Thus, the new experiments

— use a 4 second time limit instead of a 10,000 iteration limit for each run; the iteration limit gave an unfair advantage to mce, which takes significantly more time per iteration;

— omit the depth-first search preprocessing and start each heuristic with a completely random ordering of the nodes on each layer; depth-first search tended to favor mce, and, in some cases, caused barycenter to produce worse results (for total crossings).

Post-processing using the adjacent exchange method produced better results for both heuristics – obviously it cannot make results worse – and tended to favor mce over barycenter. More discussion can be found in Section 6.2. We did not use post-processing in the initial experiments of [Gupta and Stallmann 2010], and do not use it here until some of the later experiments.

All experiments were done on a two-processor, single-core 3 GHz Intel Pentium 4 with 2048 KB cache (per processor) and 2 GB main memory, running Red Hat Enterprise 5 Linux (server edition)3. Software is available on the web4.

The biggest challenge was obtaining problem instances with varying maximum degrees and a range of possible “shapes”. Dags arising in applications were ruled out for three reasons:

2 Rome graphs [Di Battista et al. 1997] are derived from entity-relationship diagrams, more than half of which come from the Italian Internal Revenue Service and the Italian National Advisory Council for Computer Applications in the Government.
3 The (identical) platforms used were part of the NCSU Virtual Computing Lab cluster – see vcl.ncsu.edu for more information.
4 See /people.engr.ncsu.edu/mfms/Software. This includes source code plus scripts to generate instances and run experiments.


(1) It is difficult to find examples that are “pre-layered”. Applying standard layering strategies to application examples tends to introduce a lot of dummy nodes; these introduce an extraneous factor to the experimental results.

(2) It is difficult to control the maximum degree of instances from applications while retaining their basic structure. For example, the Rome graphs all have roughly the same maximum degree. Adding edges to them certainly increases the average degree but also makes them denser, a variable likely to have significant effect.

(3) An arbitrary collection of application instances exhibits a large variety of factors not relevant to the current study.

In this setting the usual definition of density, i.e., m/C(n, 2), where C(n, 2) is the maximum number of possible edges for an unlayered, undirected graph, is less appropriate than m/n, the number of edges per vertex. We refer to the latter as degree density since it is also the average degree of a node divided by two.

The experiments used two kinds of random layered dags: uniform and connected. The uniform dags are based on a probability that a potential edge between two layers exists. They are called uniform because the generation method specifies the number of layers and the exact number of nodes per layer.5 More specifically, for layers 2, . . . , ℓ, each of the (n/ℓ)² potential edges from layer i − 1 to layer i is chosen with a given probability p. This may induce some nodes on layer i that have no predecessors – call them locally isolated. To improve the odds that the dag will be connected and to avoid completely isolated nodes on layers 2, . . . , ℓ, we pick, for each locally isolated node y on layer i, a random node x on layer i − 1 and add edge xy. The degree density of the resulting dag is hard to control directly but remarkably stable over a class once the right value of p is determined.

A connected dag, on the other hand, is based on a randomly generated spanning tree to ensure that it is connected. The desired additional edges are added by randomly choosing a potential edge xy with x and y on adjacent layers and then checking whether the edge xy already exists. The number of nodes per layer will vary, depending on the structure of the spanning tree. Each root-leaf path in the tree starts with the root on layer 1 and continues on increasing layers until layer ℓ is reached. It then reverses direction and continues until layer 1 is reached, repeating the reversal as often as necessary.

Each method has an additional wrinkle, a mechanism by which the maximum degree of a node can be controlled. In the uniform method, control is exerted by the method of choosing the predecessor x for a locally isolated node y. If x is chosen uniformly at random, the degree of each node in the dag will be roughly the same, determined by the degree density. However, if the choice of x is restricted to a subset of the nodes on layer i − 1 – the same subset for each y – each node in that subset is likely to have larger degree than the rest. Since we are dealing with sparse dags, where the number of locally isolated nodes could be large, the maximum degree can be controlled quite easily. If the bias in these choices is b, then we choose the subset to have ⌈k/b⌉ nodes, where k is the number of nodes on a layer, and, as it turns out, the max degree will be pretty close to b (as long as b does not exceed the number of nodes per layer).
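A rough sketch of the uniform generator with bias, under the assumptions above. The interface and names are invented for illustration; the actual generator also supports pre-specified variation in layer sizes, per footnote 5.

```python
import math
import random

def uniform_layered_dag(num_layers, layer_size, p, bias, seed=0):
    """Sketch of the 'uniform' generator: layers of equal size, each
    potential edge between adjacent layers kept with probability p.
    A locally isolated node (no predecessor) is reattached to a node
    drawn from a fixed subset of ceil(k/b) 'hub' nodes on the layer
    above, so those hubs accumulate degree roughly b."""
    rng = random.Random(seed)
    layers = [[(i, j) for j in range(layer_size)] for i in range(num_layers)]
    edges = []
    for i in range(1, num_layers):
        prev = layers[i - 1]
        hubs = prev[:math.ceil(layer_size / bias)]   # same subset for every y
        for y in layers[i]:
            preds = [x for x in prev if rng.random() < p]
            if not preds:                   # y is locally isolated
                preds = [rng.choice(hubs)]  # biased reattachment
            edges.extend((x, y) for x in preds)
    return layers, edges
```

With a sparse p, most nodes are locally isolated, so nearly every reattachment lands on a hub and the maximum degree tracks the bias b, as the text describes.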

In the connected method we use the construction of the spanning tree to govern maximum degree. One input to the method is a skew factor σ. The tree is constructed as a rooted, directed tree. Along the way each node is assigned a number of children, randomly chosen between 1 and σ. There are additional complications to ensure that

5 The generation method also allows pre-specified variation of the number of nodes. It was used to emulate the SAS Activity-Based Management dags discussed in [Watson et al. 2008].


(a) u(10, 10, 1.25, 1) (b) u(10, 10, 1.25, 10)

(c) c(100, 1.25, 10, 2) (d) c(100, 1.25, 10, 4)

Fig. 5. Examples of four different types of input instances.


each desired layer is populated and that only a few layers have a small number of nodes. These constraints also cause σ to govern the variance in nodes per layer.
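With the same caveat – the complications just mentioned are omitted – the zigzag layering of the spanning tree might be sketched as follows. This is a guess at the construction, not the paper’s code, and all names are hypothetical.

```python
import random

def connected_layered_tree(n, num_layers, skew, seed=0):
    """Sketch: spanning tree of a layered dag. Each node gets between 1
    and `skew` children; a child lives one layer along its path, where
    paths zigzag: layers increase from 1 up to num_layers, then
    decrease back to 1, and so on. Edges are oriented low layer to
    high layer. (Ensuring every layer is populated is omitted.)"""
    rng = random.Random(seed)
    layer = {0: 1}          # node -> layer
    direction = {0: +1}     # +1 heading up through layers, -1 heading back
    edges = []
    frontier = [0]
    next_id = 1
    while next_id < n:
        node = frontier.pop(0)
        for _ in range(rng.randint(1, skew)):
            if next_id >= n:
                break
            d = direction[node]
            nl = layer[node] + d
            if nl > num_layers or nl < 1:   # bounce off the extreme layer
                d = -d
                nl = layer[node] + d
            layer[next_id] = nl
            direction[next_id] = d
            edges.append((node, next_id) if d == +1 else (next_id, node))
            frontier.append(next_id)
            next_id += 1
    return edges, layer
```

A larger skew makes the per-node child counts, and hence the layer sizes, more variable, which matches the role σ plays in the text.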

We use the notation u(ℓ, k, d, b) to denote a class of (32 randomly generated) uniform dags with ℓ layers, k nodes per layer, degree density d, and bias b. The notation c(n, m, ℓ, σ) is used for a class of connected dags with n nodes, m edges (i.e., degree density m/n), ℓ layers, and skew factor σ. Figure 5 shows random instances of classes u(10, 10, 1.25, 1), u(10, 10, 1.25, 10), c(100, 1.25, 10, 2), and c(100, 1.25, 10, 4) to illustrate four possible variations: uniform with small bias (and therefore small max degree), uniform with large bias, connected with small skew, and connected with large skew. The uniform dags are not necessarily connected (the isolated nodes on layer 0 are an artifact of the generation method), but one large component usually includes almost all of the nodes. The connected dags are not necessarily uniform, particularly when the skew is large.

The random dags for the experiments reported later were chosen to represent a wide range of possibilities: sparse versus dense; tall, medium, and short, i.e., varying the number of layers while keeping the number of nodes the same; and small versus large max degree as controlled by bias and skew. The size, 2000 nodes, was chosen to be “just right”: not so small as to cast doubt on scalability, but not so large as to cause a single pass of a heuristic, particularly mce, to take more than a few seconds. Every individual experiment represents a set of 32 runs – all the instances of the given class. In order to avoid any bias introduced by the generation methods – both favor mce – all layers of each instance are randomly permuted, as is the order in which edges are given in the input file.6

5. EXPERIMENTS IN SUPPORT OF THE HYPOTHESIS

The working hypothesis stated in Section 3 is inherently imprecise. It should be understood that the hypothesis is about a general trend, not a statistically validated correlation between two specific measures of merit. The experimental results reported here support the hypothesis in a general sense and for the majority of instances investigated. As will become clear, there are other factors at work, some of which can be explained more directly in terms of degree discrepancy and/or details of the example used to explain the hypothesis in Section 3.

Two measures are applied to the supporting experimental results: the ratio of the number of crossings obtained by the barycenter heuristic to that obtained by mce, bary/mce, and the maximum degree. Both are calculated for each individual instance and aggregated over the 32 instances of a class. If bary/mce is greater than 1.00, mce does a better job of reducing total crossings; if less than 1.00, barycenter performs better.

From here on, when results are reported, the word crossings refers to total crossings unless bottleneck crossings are explicitly specified.

Given the discussion of degree discrepancy in the previous section, why did we not use a measure more directly related to it instead of max degree? The maximum degree discrepancy over all channels, call it max discrepancy, is a natural choice. But it is difficult to distill all relevant factors into a single number. In the case of degree discrepancy, it may matter how many high discrepancy channels exist, how close they are to each other, how many nodes they have, how many nodes of high degree exist on either side of the channel, whether they are adjacent, etc. None of this information is captured by max discrepancy or, for that matter, max degree. High max degree or max discrepancy can only improve the likelihood that the right conditions exist for a larger

6 There are actually two input files: one gives the list of edges and the other gives the permutation of nodes on each layer. See [Stallmann et al. 2001] for more details.


[Scatter plot: bary/mce ratio (0.4 to 1.4) versus max degree (5 to 40), with series for uniform dense, uniform sparse, and connected classes.]

Fig. 6. Maximum degree versus bary/mce ratio for all classes.

bary/mce ratio. From the experiments we were able to conclude that max discrepancy is not a better predictor of bary/mce than max degree. Hence there is no reason to use the more complicated measure.

To get a general impression of the relationship between maximum degree and the bary/mce ratio of total crossings, consider the chart in Figure 6. It appears that even if we ignore all the other factors, such as number of layers and degree density, there is a clustering that exhibits the hypothesized tendency, at least among connected dags and sparse uniform ones. For reasons not yet fully understood, dense uniform classes did not follow the general trend. Because of the outliers – uniform classes with bias 20 and 40 (max degree > 20) – we also experimented with uniform classes having max degree in a mid-range, achieved by setting bias = 5. See observations (1) and (2) for some conjectures applying to dense uniform dags and the outliers, respectively.

Tables II and III give more detailed information. It is clear that when degree density and number of layers are taken into account the tendency is much more evident. For sparse dags, both uniform and connected, the increase in ratio is dramatic as max degree increases. Some additional observations follow. Much of what we see can be explained by the likelihood of channels with more than one high (channel) degree node and lots of nodes of lower degree, in most cases degree-1. When such channels exist, barycenter is likely to be trapped, as in the two-layer example of Section 3, and mce is likely to find a configuration with fewer crossings than barycenter. But max discrepancy as defined earlier does not account for any quantitative differences here.

(1) For dense uniform dags there is no relationship between max degree and bary/mce ratio. When a dag is sufficiently dense, there are likely to be fewer degree-1 nodes than necessary to effect the behavior illustrated in the two-layer example of Section 3.

(2) In the case of the outliers, observe that there cannot be many nodes of maximum degree when maximum degree is sufficiently large. It takes at least two or three large-degree nodes in a channel to create problems for the barycenter heuristic. When the odds are that most channels have only one (or none), barycenter is likely to do well, at least better than mce.


Table II. Statistics for relative performance of bary versus mce on uniform classes.

                          crossings (total)              bottleneck crossings
                   max    bary/mce       mce ≤   med.    bary/mce       mce ≤   med.
class              deg    mean   stdev   bary    val.    mean   stdev   bary    val.
u(25,80,1.05,1)     7.4   0.46   0.04       0    2346    1.53   0.14      32      19
u(25,80,1.05,5)    11.5   0.53   0.06       0    2420    1.58   0.18      32      22
u(25,80,1.05,40)   36.8   0.79   0.15       5    1588    1.44   0.15      32      28
u(50,40,1.05,1)     6.9   0.47   0.06       0     845    1.75   0.16      32       8
u(50,40,1.05,5)    11.3   0.63   0.08       0     823    1.93   0.25      32       9
u(50,40,1.05,40)   35.8   1.07   0.39      15     323    1.79   0.36      31      15
u(100,20,1.05,1)    7.0   0.54   0.07       0     392    1.88   0.27      32       4
u(100,20,1.05,5)   11.1   0.66   0.10       0     305    1.92   0.25      32       5
u(100,20,1.05,20)  20.3   1.15   0.40      16     135    1.78   0.35      31       8
u(25,80,1.25,1)     8.6   0.81   0.03       0   13582    1.57   0.11      32      36
u(25,80,1.25,5)    10.7   0.84   0.03       0   13978    1.57   0.11      32      37
u(25,80,1.25,40)   27     0.86   0.05       0   12767    1.60   0.08      32      39
u(50,40,1.25,1)     8.3   0.89   0.04       0    5678    1.72   0.11      32      16
u(50,40,1.25,5)    10.3   0.91   0.05       1    5615    1.71   0.12      32      16
u(50,40,1.25,40)   27.3   0.85   0.07       0    4147    1.54   0.09      32      19
u(100,20,1.25,1)    8.2   0.91   0.04       0    2544    1.68   0.13      32       8
u(100,20,1.25,5)   10.1   0.94   0.05       4    2448    1.67   0.11      32       9
u(100,20,1.25,20)  16.9   0.88   0.06       1    1839    1.59   0.15      32      10

Table III. Statistics for relative performance of bary versus mce on connected classes.

                            crossings (total)              bottleneck crossings
                     max    bary/mce       mce ≤   med.    bary/mce       mce ≤   med.
class                deg    mean   stdev   bary    val.    mean   stdev   bary    val.
c(2000,2100,25,4)     6.1   0.55   0.06       0    3126    1.63   0.15      32      25
c(2000,2100,25,8)    10.6   0.79   0.14       2    3988    1.96   0.65      31      47
c(2000,2100,50,4)     5.8   0.68   0.06       0    1441    1.85   0.21      32      11
c(2000,2100,50,8)    10.1   1.04   0.19      18    1622    2.35   0.65      32      17
c(2000,2100,100,4)    6.0   0.69   0.08       0     600    1.93   0.32      32       6
c(2000,2100,100,8)    9.8   1.13   0.19      24     844    2.75   0.81      32       9
c(2000,2500,25,4)     8.8   0.91   0.05       1   16659    1.62   0.14      32      47
c(2000,2500,25,8)    18.3   1.01   0.08      19   24136    1.46   0.23      32     102
c(2000,2500,50,4)     8.2   1.04   0.05      24    7855    1.79   0.20      32      21
c(2000,2500,50,8)    15.7   1.16   0.08      32   10167    1.61   0.22      32      44
c(2000,2500,100,4)    8.0   1.09   0.05      32    3516    1.83   0.20      32      11
c(2000,2500,100,8)   13.5   1.26   0.09      32    5236    1.84   0.38      32      24


(3) The dense connected dags support the hypothesis. Recall that the layers vary a lot in cardinality when skew is larger – one instance of the class c(2000,2500,100,8) has several layers with only one node, a couple with close to 100 nodes, and 9 as the median (the mean is obviously 20). On the other hand, dags with smaller skew are closer to being uniform – one instance of class c(2000,2500,100,4) has minimum layer size 4, maximum 57, and median 19.5.
For the dags with larger skew the higher cardinality channels are more likely to have large degree discrepancy; in fact, they are more likely to have a couple of nodes with high channel degree and most of the rest with channel degree 1 or 2.
The c(2000,2500,100,8) instance has five channels with degree discrepancy 10 or more and many additional ones with discrepancy 8 or 9. The median degree in any given channel is 1 and the mean ranges from 1 to 2.25. In the c(2000,2500,100,4) instance, the maximum overall channel degree discrepancy is 6 and most individual channels have discrepancy 3 or 4.

(4) For reasons related to those described in item 3, the overall performance of mce is considerably better for connected dags with larger skew than it is for any other category. These dags have considerable variance among sizes of layers – and hence channels. If a channel with high degree discrepancy, where mce has a clear advantage, is surrounded by ones with only a small number of edges, this advantage is not “dampened.”

(5) The higher bary/mce ratio in connected dags with larger skew is also related to the number of layers. If there are few layers, only a few channels will have degree discrepancy equal to or near the max discrepancy (recall also that these dags have layers with only a small number of nodes, leaving even fewer channels that might have barycenter traps). A large number of layers means a larger number of potential barycenter traps. Even though this also means a smaller average channel size, the variance in number of nodes per layer guarantees that many of these channels will still be large.

In all of these observations, factors/details not accounted for by max degree had to be examined. The explanations given are conjectures only; they deal with effects that are difficult to quantify and even more difficult to model and explain. In the following section we consider a more important property of mce – that its performance in many situations is much better than these initial experiments suggest.

6. ADDITIONAL EXPERIMENTS

Far from being the final word on the performance of mce, the previous section sought only to provide evidence for the hypothesis stated in Section 3. The experiments reported in what follows deal with other issues, issues that were motivated by the earlier experiments or are based on incidental observations made while carrying them out. To avoid an overwhelming amount of detail, attention is devoted to six classes, selected to represent a broad spectrum of types of dags and of performance of mce versus barycenter. They are c(2000,2100,25,4), c(2000,2100,50,8), c(2000,2500,100,8); and u(25,80,1.25,1), u(50,40,1.05,1), u(100,20,1.05,5). Aside from three representatives each of uniform and connected classes, there are several other dimensions that have each value represented:

— two of each possible number of layers: 25, 50, 100;
— three of each degree density: 1.05 and 1.25;
— three of each (roughly speaking) max degree category
  — low: c(2000,2100,25,4) at 6.1, u(50,40,1.05,1) at 6.9, u(25,80,1.25,1) at 8.6;
  — high: c(2000,2100,50,8) at 10.1, u(100,20,1.05,5) at 11.1, c(2000,2500,100,8) at 13.5;


Table IV. Runtimes of barycenter, mce, and sifting as number of nodes increases. Thenumbers shown are median runtimes in microseconds per iteration over three instances ofthe class with the given number of nodes.

Results for an Intel Linux machine with 2 MB cache.
nodes     1000    2,000    4,000    8,000    16,000    32,000    64,000    128,000
mce         80      130      247      690     2,469     6,095    11,500     20,995
sifting     47       74      130      360     1,200     3,026     6,156     13,198
bary        23       42      100      293     1,109     2,772     5,857     12,671

— a broad spectrum of mean bary/mce ratios:
  — u(50,40,1.05,1) at 0.47
  — c(2000,2100,25,4) at 0.55
  — u(100,20,1.05,5) at 0.68
  — u(25,80,1.25,1) at 0.81
  — c(2000,2100,50,8) at 1.04
  — c(2000,2500,100,8) at 1.26

Inasmuch as possible, the choices of values along any pair of dimensions are orthogonal to each other. The selection of classes is an attempt to approximate a Latin squares design.

There are many similarities between mce and sifting, and in some situations sifting is a viable alternative to mce. Results on sifting are therefore included here as well. The sifting implementation is identical to that reported in [Matuszewski et al. 1999] except that, instead of using a fixed number of failures as the only possible stopping criterion, it also allows the user to specify a fixed number of sifting iterations or an execution time limit. Because we are dealing with multiple layers, sifting sorts the nodes by decreasing or increasing degree as specified in the paper. This variation was reported to give the best results.
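For readers unfamiliar with sifting, here is a minimal two-layer sketch, not the implementation of [Matuszewski et al. 1999]: each node, taken in order of decreasing degree, is moved to the position on its layer that minimizes total crossings.

```python
def sift_layer(order_top, order_bottom, edges):
    """One sifting pass on the bottom layer of a 2-layer graph (sketch).
    edges: (u, w) pairs with u in order_top and w in order_bottom."""
    def total(order):
        # brute-force total crossings for a candidate bottom-layer order
        pos_t = {u: k for k, u in enumerate(order_top)}
        pos_b = {w: k for k, w in enumerate(order)}
        return sum((pos_t[u] - pos_t[x]) * (pos_b[w] - pos_b[y]) < 0
                   for i, (u, w) in enumerate(edges)
                   for (x, y) in edges[i + 1:])
    order = list(order_bottom)
    deg = {w: sum(w == e[1] for e in edges) for w in order}
    # process nodes by decreasing degree, as the text describes
    for v in sorted(order, key=deg.get, reverse=True):
        rest = [u for u in order if u != v]
        order = min((rest[:p] + [v] + rest[p:] for p in range(len(order))),
                    key=total)
    return order
```

The real implementation works layer by layer over the whole dag and counts crossings incrementally rather than by brute force; the sketch only shows the locally optimal placement step.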

Median and barycenter are also very similar to each other. However, the performance of the median heuristic, at least on these dag classes, is abysmal when compared to barycenter.

6.1. Longer Runtimes

Let us now consider the actual runtime of the three heuristics under study as the size of the dags increases. For this purpose we created a sequence of dags of classes c(n, 1.05n, √n, 2) for n ranging from 1,000 to 128,000. The √n in the class parameters is approximate, i.e., rounded up to the next multiple of 5 or 10. The results are shown in Table IV.

Recall that in our implementation, with m ∈ O(n), all three heuristics have an O(n²/ℓ²) time bound per iteration. Here ℓ = √n, so this translates to O(n). The constant for mce is significantly larger because of the extra overhead of searching for the maximum crossings edge and having to recompute local edge crossings during each swap – lines (3) to (6) of Figure 1(c).

Each heuristic was run for 100,000 iterations and the resulting total was scaled to obtain a per-iteration time in microseconds. The runtimes initially exhibit roughly linear behavior, as expected, but then increase dramatically between 4,000 and 32,000 nodes. This increase is probably a cache effect. Note that the increase levels off again after 32,000. The number of bytes per node in the graph data structure is close to 200 and there is a fair amount of memory allocation/deallocation going on when edge crossings are updated – about 5,000 allocations per node, each an average of 12 bytes (these are used for sorting to count crossings and could be avoided with an initial allocation of the largest size required).
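The sorting referred to here is the standard way to count the crossings in a channel: order the edges by their top endpoints and count inversions among the bottom endpoints. A self-contained sketch (our own illustration, not the paper’s code):

```python
def channel_crossings(edges, pos_top, pos_bottom):
    """Count crossings between two adjacent layers by counting inversions.

    edges: list of (u, w) with u on the top layer, w on the bottom layer.
    pos_top / pos_bottom: dicts mapping node -> position on its layer.
    Sort edges by top endpoint (ties broken by bottom endpoint); the
    crossings are exactly the inversions among the bottom positions.
    """
    seq = [pos_bottom[w] for u, w in
           sorted(edges, key=lambda e: (pos_top[e[0]], pos_bottom[e[1]]))]

    def count_inversions(a):
        # merge sort that returns (sorted list, number of inversions)
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, nl = count_inversions(a[:mid])
        right, nr = count_inversions(a[mid:])
        merged, inv, i, j = [], nl + nr, 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
                inv += len(left) - i      # right[j] jumps the rest of left
        merged.extend(left[i:]); merged.extend(right[j:])
        return merged, inv

    return count_inversions(seq)[1]
```

This runs in O(m log m) per channel rather than the brute-force O(m²), at the cost of the temporary lists alluded to in the allocation discussion above.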


u(50,40,1.05,1), bary/mce = 0.47
time       bary      mce       sifting   mce/bary
4 sec      850.6     1832.8    2955.6    2.18
10 sec     848.4     1359.0    2954.9    1.62
20 sec     848.3     1133.2    2954.9    1.35
40 sec     847.1     988.3     —         1.17
80 sec     844.9     874.1     —         1.04
160 sec    844.9     799.1     —         0.95

c(2000,2100,25,4), bary/mce = 0.55
time       bary      mce       sifting   mce/bary
4 sec      3193.2    5869.6    7522.8    1.85
10 sec     3142.5    3825.7    7461.1    1.23
20 sec     3091.9    2956.1    7461.1    0.97
40 sec     3048.3    2424.1    —         0.80

u(100,20,1.05,5), bary/mce = 0.66
time       bary      mce       sifting   mce/bary
4 sec      302.4     459.5     890.3     1.55
10 sec     302.4     397.5     890.3     1.35
20 sec     —         383.8     —         1.30
40 sec     —         380.7     —         1.29
80 sec     —         380.4     —         1.29
160 sec    —         380.3     —         1.29

u(25,80,1.25,1), bary/mce = 0.81
time       bary      mce       sifting   mce/bary
4 sec      13762.3   17109.4   14110.2   1.24
10 sec     13639.1   14516.0   13951.1   1.06
20 sec     13574.3   13204.2   13899.3   0.97
40 sec     13494.5   12305.3   13896.6   0.91
80 sec     13449.5   11723.5   13896.6   0.87
160 sec    13410.7   11292.2   —         0.84

c(2000,2100,50,8), bary/mce = 1.04
time       bary      mce       sifting   mce/bary
4 sec      1650.5    1597.8    2572.1    0.99
10 sec     1650.5    1190.6    2572.1    0.74

c(2000,2500,100,8), bary/mce = 1.26
time       bary      mce       sifting   mce/bary
4 sec      5211.7    4148.5    4583.9    0.80
10 sec     5211.7    3666.3    4582.3    0.71

[Line chart: mce/bary ratio (0.5 to 2) versus time in seconds (4 to 40) for classes u(100,20,1.05,5), u(50,40,1.05,1), and u(25,80,1.25,1).]

Fig. 7. Reduction in total crossings when mce is allotted more runtime. Each number is the mean number of total crossings after the heuristic is applied. The ratios are the average of the ratios for the 32 instances, not the ratio of the averages.

Longer runs for mce. Overall, the results reported so far are rather disappointing for mce with respect to total crossings. It is clear that mce was not designed for minimizing total crossings, so why not leave it at that? The first glimmer of hope was the following: running bary and mce for 4 seconds each on the Rome graphs yielded a larger bary/mce ratio than when each was run for 10,000 iterations. In other words, mce did better vis a vis bary when both were allowed 4 seconds than when mce was allowed roughly 1.0 seconds and bary was allowed roughly 0.2, the runtimes for 10,000 iterations. These results were aggregated over all 140 Rome graphs.

Figure 7 illustrates the impact of longer mce runs on crossing numbers for the selected classes. The numbers shown are average (mean) numbers of total crossings for


time (sec)     bary      mce
5              73,905    614,747
10             67,970    524,010
20             64,583    386,242
80             64,583    240,273
160            64,583    160,275
240            62,754    129,233
720            62,104     73,267
1040           60,523     60,600
1120           60,523     58,637
1520           60,523     50,992
1840           60,523     47,469
1920           60,523     46,667

[Line chart: number of crossings (40,000 to 100,000) versus time in seconds (0 to 2500) for bary and mce.]

Fig. 8. Decrease in crossings for barycenter and mce on an instance of class c(20000,21000,160,4).

each class – if the average fails to decrease with increasing time, it implies that there was no improvement in any instance of the class. The average mce/bary ratio is also shown in the rightmost column. We use the inverse of the bary/mce ratio of the previous section to emphasize the decrease in the number of crossings for mce.

Longer runs always help mce on average, although for class u(100,20,1.05,5) the decrease between 40 and 160 seconds represents an improvement for only 6 instances of the 32, all but one by only one or two crossings. Barycenter shows improvement in three of the four classes where it had the fewest crossings after 4 seconds. The class u(100,20,1.05,5) is, again, the exception. The tables also include results for 10-second runs on classes where mce already outperformed barycenter after 4 seconds. Mce continues to reduce total crossings in these cases as well, while barycenter fails to improve at all. Extra time does not appear to help sifting very much, if at all.

The chart at the bottom of Figure 7 shows how the mce/bary ratio for the three uniform classes decreases as mce spends more time. Except for one class, u(100,20,1.05,5), the longer runs eventually give mce significantly better results. As already noted, neither heuristic shows any real improvement in this case.

The degree of improvement when mce runs for a longer time is only partly correlated with the original bary/mce ratio or maximum degree. A more important factor appears to be the distinction between uniform (bad for mce) and connected (good for


mce) classes. Among the uniform classes, the most troublesome one is u(100,20,1.05,5) – not expected, given the larger mean max degree: 11.1 versus 6.9 and 8.6. Instead of having instances that are not as bad for barycenter, i.e., less likely to fall into the barycenter trap of the example in Section 3, the class has instances that are more likely to be worse for mce. The max discrepancy occurs in only a few channels and these have relatively few nodes. Thus, the maximum crossings edge identified by mce is likely to have few crossings and not be easily distinguished from other edges, nor likely to cause much improvement if its crossing number is reduced. Contrast this with instances of c(2000,2100,25,4), u(50,40,1.05,1), and u(25,80,1.25,1), all of which have large channel density – number of edges per channel – and therefore a wider range of c(e) values.

Long runs on a single instance. Experiments with really long runs – 2000 seconds – on an instance 10 times as large as those used in the other experiments amplify the results on smaller instances. They illustrate how mce keeps decreasing the total number of crossings; barycenter initially reduces total crossings rapidly, but improves very little after a certain point. The results in Figure 8 are for a randomly chosen and randomly ordered single instance of class c(20000,21000,160,4), i.e., a scaled-up version of instances in class c(2000,2100,50,4), a class that exhibits a small bary/mce ratio (0.68). This dag has max degree 6, nodes per layer ranging from 25 to 362 (median 108), and channel degrees that are mostly 4 or 5. The lines in the table show only the points at which barycenter improves, however slightly, except at the end, where the improvements in the mce crossing number become significant. The chart fills in other data points for mce. Sifting does very poorly in this situation: the number of crossings it yields never falls below 113,000.

One clear advantage that mce has over barycenter (and sifting) is that it is dynamic. Barycenter and sifting always consider the same layers/nodes in the same sequence regardless of changes that have taken place. On the other hand, the sequence of edges considered by mce will change from one pass to the next, often significantly.
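To make the edge-selection loop concrete, the following is a minimal single-channel sketch of an mce-style pass in Python. It is an illustrative reconstruction, not the author's implementation: the real heuristic works across all channels, sifts a node together with all its incident edges, and targets the bottleneck objective; here only the upper endpoint of the current maximum crossings edge is repositioned to reduce that edge's own crossing count.

```python
from itertools import combinations

def cross(e1, e2, pos):
    """True if edges e1 = (u1, w1) and e2 = (u2, w2) cross: their lower
    and upper endpoints appear in opposite relative order."""
    (u1, w1), (u2, w2) = e1, e2
    return (pos[u1] - pos[u2]) * (pos[w1] - pos[w2]) < 0

def c(e, edges, pos):
    """Crossing number c(e): how many other edges cross e."""
    return sum(cross(e, f, pos) for f in edges if f != e)

def shift(v, p, layer, pos):
    """Move node v to index p within its layer and renumber positions."""
    layer.remove(v)
    layer.insert(p, v)
    for i, x in enumerate(layer):
        pos[x] = i

def mce_pass(edges, upper, pos):
    """One pass of the simplified heuristic: repeatedly pick the edge with
    the most crossings and sift its upper endpoint to the position that
    minimizes that edge's crossing count."""
    for _ in range(len(edges)):
        e = max(edges, key=lambda f: c(f, edges, pos))
        _, w = e
        best_p, best_c = pos[w], c(e, edges, pos)
        for p in range(len(upper)):          # try every position for w
            shift(w, p, upper, pos)
            if c(e, edges, pos) < best_c:
                best_p, best_c = p, c(e, edges, pos)
        shift(w, best_p, upper, pos)         # settle on the best position

def total_crossings(edges, pos):
    """Total crossings in the channel."""
    return sum(cross(e, f, pos) for e, f in combinations(edges, 2))
```

On a fully crossed three-edge channel – edges (a,z), (b,y), (c,x) with both layers initially in alphabetical order – a single pass of this sketch removes all three crossings.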

In an mce run on one instance of class u(50,40,1.05,1), the node that was edge sifted first in the first pass was edge sifted at iterations 1404 and 1281 in the second and third passes, respectively. The node that was edge sifted last in the first pass was 1049 iterations from the end in the second pass and 907 iterations from the end in the third. This particular instance has 1988 non-isolated nodes, so each pass had 1988 iterations.

The moral of the story here is that the number of total crossings found by mce can usually be decreased as runtime is increased. Very long runs can be justified in some applications, e.g., circuit layout, where it makes sense to spend weeks solving a specific instance in order to reap millions of dollars in savings when the associated circuit is mass produced.

Random restarts. Given that the barycenter heuristic reaches a point of diminishing returns after relatively little runtime, it makes sense to wonder whether it will do better if it starts over. Would additional execution time be used more effectively if shorter runs were done from multiple random initial configurations? The answer is yes, up to a point. Table V shows the effect of restarting barycenter after each 4-second interval instead of running it continuously. The crossing numbers shown for the 4-second restarts represent the minimum achieved among t/4 runs, given a total of t seconds. Both the permutation of each layer and the order of the adjacency lists were randomized at the start of each run.
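The restart protocol can be sketched as follows. This is a toy version in which each "run" simply evaluates a freshly shuffled two-layer configuration; in the actual experiments each run is 4 seconds of barycenter, and the adjacency lists are reshuffled as well. All names are illustrative.

```python
import random
from itertools import combinations

def crossings(lower, upper, edges):
    """Total crossings in one channel, given the two layer orders."""
    pos_l = {v: i for i, v in enumerate(lower)}
    pos_u = {v: i for i, v in enumerate(upper)}
    return sum(1 for (u1, w1), (u2, w2) in combinations(edges, 2)
               if (pos_l[u1] - pos_l[u2]) * (pos_u[w1] - pos_u[w2]) < 0)

def restarts(lower, upper, edges, num_runs, seed=0):
    """Random-restart protocol: before every run, randomize the layer
    permutations; report the minimum crossing count over all runs."""
    rng = random.Random(seed)
    lower, upper = list(lower), list(upper)   # work on copies
    best = float("inf")
    for _ in range(num_runs):
        rng.shuffle(lower)
        rng.shuffle(upper)
        best = min(best, crossings(lower, upper, edges))
    return best
```

Because the minimum is taken over all runs, more restarts can never increase the reported crossing number; the question studied in the text is whether the same time spent on one continuous run does better.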

The first two instances in the table are from class u(50,40,1.05,1), a natural choice given the well-defined crossover between mce and barycenter after 80 seconds – see Figure 7. Instances were chosen based on their mce/bary ratio of crossings after 80 seconds of runtime. Random restarts at 10-second intervals fared no better – and usually worse – than ones at 4-second intervals.


Table V. Performance improvement with random restarts for barycenter on two instances of class u(50,40,1.05,1) and one of class u(100,20,1.05,5).

instance in top quartile at 80 sec. for u(50,40,1.05,1): mce/bary = 1.18

time                   4     40    80    160   320   640   1280
single run (bary)      616   616   616   616   616   616   616
single run (mce)       1887  842   727   680   624   591   591
4 sec restarts (bary)  616   603   603   585   579   579   577

median instance at 80 sec. for u(50,40,1.05,1): mce/bary = 1.01

time                   4     40    80    160   320   640   1280
single run (bary)      856   856   856   856   856   856   856
single run (mce)       1768  922   863   685   647   591   586
4 sec restarts (bary)  856   799   799   780   759   759   759

median instance for u(100,20,1.05,5): mce/bary = 1.27

time                   4     40    80    160   320   640   1280
4 sec restarts (bary)  305   301   277   277   260   260   249
20 sec restarts (mce)  450∗  388   388   371   330   330   330
40 sec restarts (mce)  450∗  387   387   387   361   330   330

∗A single 4-second run of mce.

In cases where the mce/bary ratio after 80-second single runs was small, the random restarts were generally unable to keep abreast of the improvements yielded by the longer mce runs. Conversely, if the ratio was large, barycenter reduced the crossing number to a value lower than what mce was able to achieve. In the first instance of Table V, mce appears to have reached its lower bound after 1280 seconds (the same appears to be the case for barycenter). In the two extreme cases, the smallest/largest ratio for the class, not shown in the table, barycenter, resp. mce, is not even able to reach the single-run 80-second crossing number of the other heuristic after 1280 seconds (with barycenter using 4-second restarts).

Random restarts make the most sense for the class u(100,20,1.05,5), where neither heuristic shows improvement after a certain point. The final instance in Table V shows the effect of random restarts for both heuristics. Experiments show significant improvements for both barycenter and mce, but the mce/bary ratio, i.e., the gap between the two, actually increases – from 1.27 to 1.33 when mce does either 20- or 40-second restarts.

In general, longer runs tend to favor mce and allow it to achieve results as good as or better than those of barycenter when this is not the case for 4-second runs. There are notable exceptions, with respect to runs on individual instances with random restarts, and with respect to the problematic class u(100,20,1.05,5). The following section examines ways to improve mce even more, as well as to improve upon the best results achieved by any single heuristic.

6.2. Combining the Two Heuristics

Recall the example in Section 3 that was used to illustrate how mce escapes a barycenter trap – Figures 3 and 4. Suppose that mce does this often, i.e., often improves upon a poor barycenter solution. Then running barycenter followed by mce gives the best of both worlds. If barycenter results in a smaller number of crossings than mce, following it with mce can do no worse – the starting configuration is always a candidate for the best one. On the other hand, if mce does better on the given instance, one can assume that it will also escape barycenter traps successfully. Indeed, the combination of barycenter followed by mce (bary+mce) yields the minimum number of crossings


(a) barycenter, 80 total, 7 bottleneck (b) mce + barycenter, 79 total, 5 bottleneck

(c) barycenter + mce, 60 total, 3 bottleneck (d) barycenter + sifting, 61 total, 5 bottleneck

Fig. 9. Four heuristics applied to Rome graph grafo10550. Single heuristics are run for 4 seconds, combined ones for 2 seconds per heuristic. The bottleneck crossings reported above occur at different iterations than the one pictured.

among barycenter, mce, and bary+mce in almost every instance, and has the minimum average number of crossings for all classes used in the experiments. If sifting is thrown into the mix, bary+sifting often performs as well as, or better than, bary+mce. It is fair to ask whether mce followed by barycenter offers similar advantages; it never does for any of the 30 original classes in the experiments.
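The "can do no worse" guarantee is easy to state in code: the second heuristic starts from the first heuristic's configuration, and the combination keeps whichever result is better. The sketch below is hypothetical scaffolding, not the experimental harness; each heuristic is assumed to return a (configuration, crossings) pair.

```python
def combine(first, second, instance):
    """Run `first`, then start `second` from its result; keep the starting
    configuration as a candidate so the combination never does worse than
    `first` alone."""
    config, cost = first(instance)
    new_config, new_cost = second(instance, config)
    return (new_config, new_cost) if new_cost < cost else (config, cost)
```

With stub heuristics one can see both branches: if the second heuristic ends up with more crossings than its starting point, the starting configuration is returned unchanged.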

Figure 9 illustrates the performance of heuristic combinations on a specific 100-node Rome graph (instance grafo10550). The red box in Figure 9(a) shows a local barycenter


Table VI. Number of times minimum bottleneck crossings are achieved by mce and by bary+mce. The bary/mce ratio for total crossings is given as a reference point.

class               bary/mce  mce only  bary+mce only  both
u(50,40,1.05,1)     0.47      0         32             0
c(2000,2100,25,4)   0.55      0         32             0
u(100,20,1.05,5)    0.68      0         28             4
u(25,80,1.25,1)     0.81      1         28             3
c(2000,2100,50,8)   1.04      7         22             3
c(2000,2500,100,8)  1.26      17        10             5

trap in which the two outermost edges from the node marked X each have lots of crossings.

Aside from having the fewest crossings, the solution provided by bary+mce also has some esthetic advantages: there are fewer long-distance edges than in the other embeddings, and the middle-distance ones would hardly be noticeable if the picture were stretched vertically.

Turning now to the aggregate behavior of the combined heuristics, consider Figure 10. It shows the ratio x/m for each heuristic or combination x, where m is the minimum number of crossings among all five. The values x and m are averages over all instances of the given class. In the case of combined heuristics, each was run for 2 seconds so that the total time matched the 4 seconds in the experiments of Section 5.

What is really surprising here is the behavior of sifting. In all but two cases it exhibits the worst performance when run by itself, and in the two others the differences among heuristics are much less pronounced. On the other hand, when preceded by barycenter, it almost always does best or very close to it. The same is true to a lesser degree for mce – bary+mce gives the best solutions in the two situations where bary+sifting does not. However, this is much less of a surprise: the two classes whose bary+mce performance is best are also the two classes with the best performance for mce by itself.

The difference in magnitude of the ratios x/m decreases as the average value – average total crossings for the combination that has the minimum – increases. This can be expected: if the absolute decrease in value is roughly the same for all classes, those with a smaller denominator will have a larger ratio. But it would seem that a larger number of (initial or single-heuristic) crossings would allow more room for improvement. This is not the case. The large number of crossings is related to having a large channel density, true of denser dags, like u(25,80,1.25,1) and c(2000,2500,100,8), or those that have fewer layers, like c(2000,2100,25,4); the class u(25,80,1.25,1), with the largest value, has both. In a dense channel, edges cannot easily be “untangled”. For these (locally or globally) dense classes, there is much less improvement even from the initial, random configuration to the minimum. For example, the initial (average) configuration for u(25,80,1.25,1) has 62708.5 crossings – a ratio of ≈ 6 vis-à-vis the minimum. For the class u(100,20,1.05,5), the initial value is 8799.6, for a ratio of ≈ 42.

The combination of two heuristics is also beneficial for reducing bottleneck crossings. When minimizing bottleneck crossings we started the second heuristic at the configuration where the first heuristic (barycenter) achieved the minimum number of bottleneck crossings (instead of where the total crossings were minimized). In every instance of the six selected classes, 192 instances in all, either mce or bary followed by mce achieved the minimum bottleneck crossings among the three single heuristics and two combinations. In the overwhelming majority of instances the bary+mce combination gave the minimum – see Table VI. Sifting, both by itself and when preceded


[Figure 10: six bar charts, one per class, each plotting the ratio x/m for bary, mce, sifting, bary+mce, and bary+sifting on a scale from 0.50 to 4.00. (a) u(50,40,1.05,1): bary/mce = 0.47, avg value = 706.1. (b) c(2000,2100,25,4): bary/mce = 0.55, avg value = 2364.0. (c) u(100,20,1.05,5): bary/mce = 0.68, avg value = 178.9. (d) u(25,80,1.25,1): bary/mce = 0.81, avg value = 10388.6. (e) c(2000,2100,50,8): bary/mce = 1.04, avg value = 896.6. (f) c(2000,2500,100,8): bary/mce = 1.26, avg value = 3678.7.]

Fig. 10. Relationship of each heuristic and combination to the minimum value among them. Avg value refers to the minimum average number of crossings – among the heuristics/combinations – over each class.


Table VII. Reduction in total crossing number after each of various heuristics and combinations, before and after post-processing using adjacent exchanges. The numbers represent median total crossings for the class.

u(50,40,1.05,1), bary/mce = 0.47
            bary   mce    sifting  bary+mce  bary+sifting
before      845    1825   2940     853       699
after       835    1725   2935     837       699
% decrease  0.94   4.36   0.00     0.94      0.00

c(2000,2100,25,4), bary/mce = 0.55
            bary   mce    sifting  bary+mce  bary+sifting
before      3126   5837   7645     2582      2375
after       3052   5641   7643     2467      2375
% decrease  1.66   2.88   0.00     3.42      0.00

u(100,20,1.05,5), bary/mce = 0.68
            bary   mce    sifting  bary+mce  bary+sifting
before      305    462    886      247       179
after       283    431    886      235       179
% decrease  1.69   1.85   0.00     2.44      0.00

u(25,80,1.25,1), bary/mce = 0.81
            bary   mce    sifting  bary+mce  bary+sifting
before      13582  16888  14078    12665     10164
after       13419  16558  14078    12382     10159
% decrease  1.06   1.67   0.00     1.83      0.00

c(2000,2100,50,8), bary/mce = 1.04
            bary   mce    sifting  bary+mce  bary+sifting
before      1622   1581   2605     915       1243
after       1606   1533   2602     895       1243
% decrease  0.00   1.44   0.00     0.66      0.00

c(2000,2500,100,8), bary/mce = 1.26
            bary   mce    sifting  bary+mce  bary+sifting
before      5236   4179   4621     3699      3969
after       5137   4082   4613     3572      3968
% decrease  1.31   1.67   0.00     1.11      0.00

by barycenter, did very poorly: the minimum bottleneck crossings it achieved in either case were at least twice the overall minimum – not surprising, given that sifting intentionally places nodes so as to minimize total crossings (at the likely expense of minimizing bottleneck crossings).

An interesting trend is the relationship between the bary/mce ratio for total crossings and the number of times mce by itself achieves the minimum (or bary+mce does not). It is tempting to speculate on whether the mechanism(s) leading to better performance by mce (versus barycenter) on total crossings are the same as those that cause mce by itself to perform better than bary+mce.

Post-processing. A simple post-processing step can almost always reduce the total number of crossings: apply the adjacent exchange method [Di Battista et al. 1999, Sec. 9.2.2] until improvement is no longer possible. Table VII shows the effect of post-processing after each of the five heuristics/combinations on the six selected classes. Post-processing clearly improves the performance of mce, individually or preceded by


barycenter, more so than for any other heuristic. That could be because the edge sifting that takes place in mce has absolutely no relationship with the swaps (adjacent exchanges) that locally optimize total crossings. The regular sifting heuristic gains little, if anything, from additional swaps. If such swaps were possible, sifting, in many cases, would have already performed them in the course of sifting one of the two nodes. Improvement is possible only when the order in which nodes were sifted leaves some loose ends.
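The adjacent-exchange post-processing can be sketched for one layer of a two-layer graph as follows; extending it to sweep every layer of a multi-layer dag is straightforward. This is a textbook-style reconstruction under stated assumptions, not the implementation used in the experiments.

```python
def swap_gain(u, v, adj, pos_up):
    """Crossings among the edges of u and v when u precedes v, minus the
    count when v precedes u; positive gain means swapping u and v reduces
    crossings.  adj maps each lower-layer node to its upper neighbors."""
    cuv = sum(1 for w in adj[u] for x in adj[v] if pos_up[w] > pos_up[x])
    cvu = sum(1 for w in adj[u] for x in adj[v] if pos_up[w] < pos_up[x])
    return cuv - cvu

def adjacent_exchange(layer, adj, pos_up):
    """Sweep the layer, swapping adjacent nodes whenever that reduces
    crossings; repeat until a full sweep performs no swap."""
    improved = True
    while improved:
        improved = False
        for i in range(len(layer) - 1):
            if swap_gain(layer[i], layer[i + 1], adj, pos_up) > 0:
                layer[i], layer[i + 1] = layer[i + 1], layer[i]
                improved = True
    return layer
```

Only the relative order of the two swapped nodes changes, so the gain computation involves just their own edges; this locality is what makes the method cheap enough to run after every heuristic.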

The last major set of experiments put all of the previous observations together, specifically the benefit of longer runs for mce, of running mce or sifting after barycenter, and of doing post-processing. Absent longer runs and post-processing, barycenter plus sifting outperformed barycenter plus mce in four of the six selected classes. Would the reverse be true if we allowed more time and did post-processing, both of which are now known to favor mce in other contexts? Figure 11 shows the results. In all but one case, extra time beyond 8 seconds does not help barycenter plus sifting. That case, u(25,80,1.25,1), has an interesting anomaly: results for 128 and 512 seconds are actually worse than those for 32 seconds. This is because post-processing does not improve upon the best configuration found after 128 seconds of sifting in any of the instances of the class. Meanwhile, the best configuration after 32 seconds for every instance of class u(25,80,1.25,1) is improved by post-processing. Such an anomaly occurs occasionally for individual instances of other classes and/or for mce, but not often enough to affect the averages.

The chart at the bottom of Figure 11 graphically illustrates the progress made by bary+mce after 2 seconds of barycenter. Since time was quadrupled for each successive run, the chart uses a log scale to illustrate that in two cases, u(50,40,1.05,1) and u(25,80,1.25,1), there is a steady linear decrease in the ratio of bary+mce to bary+sifting. Since the latter fails to decrease in the case of u(50,40,1.05,1) and does so only slightly for u(25,80,1.25,1), the chart really shows a decrease in the number of total crossings for mce as time is increased by a multiplicative factor. The corresponding parts of the table show this as well, although closer examination of the numbers indicates that the decrease diminishes slightly over time. It is likely, though not definite, that bary plus mce will eventually surpass bary plus sifting in the case of u(25,80,1.25,1).

In contrast, the decrease for u(100,20,1.05,5) bottoms out after 32 seconds, as does the crossing number for both combinations. The earlier experiments comparing longer runs of plain barycenter, mce, and sifting showed the same tendency. But here it is clear that the gap between bary plus mce and bary plus sifting is not due to a strength (or lack of weakness) in barycenter. The comparison is between mce and sifting, and the latter does better only when preceded by barycenter.

Depth-first search preprocessing. So far we have avoided a discussion of the impact of depth-first search as a preprocessing step. Depth-first search chooses an arbitrary node on layer 1, labels all nodes according to their preorder number in the dfs tree, and then sorts the nodes on each layer based on these preorder numbers. The resulting ordering significantly improves the placement of nodes on long paths, ones whose future placement is most likely to propagate to other layers.
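The dfs preprocessing just described can be sketched as follows. Two details are assumptions not spelled out in the text: the traversal follows edges in either direction (the adjacency map is undirected), and the search restarts at any node not reached from the starting node, so disconnected inputs are still fully numbered.

```python
def dfs_layout(layers, adj):
    """Number nodes by dfs preorder, starting from the first node on
    layer 1, then sort every layer by preorder number.  adj is an
    undirected adjacency map (edge directions are ignored)."""
    order = {}

    def dfs(v):
        order[v] = len(order)          # preorder number
        for w in adj[v]:
            if w not in order:
                dfs(w)

    # Start on layer 1; restart at any node not yet reached.
    for v in [layers[0][0]] + [u for layer in layers for u in layer]:
        if v not in order:
            dfs(v)
    return [sorted(layer, key=order.__getitem__) for layer in layers]
```

On a two-layer graph with edges a–d and b–c, the preorder numbering places d directly under a and c under b, removing the crossing before any heuristic runs – consistent with the observation that nodes on long paths end up well placed.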

In order to illustrate the effectiveness of dfs preprocessing, we experimented with a single instance of class u(100,20,1.05,5), the same instance used to illustrate the effectiveness of random restarts – see Table V. There was an additional agenda for these experiments: to determine whether the lack of further improvement we saw for the instances of u(100,20,1.05,5) was due to being close to optimum solutions – clearly not the case for runs of plain barycenter or mce, but possibly for the mce+sifting combination – or due to weakness in the heuristics/combinations attempted so far.


u(50,40,1.05,1)
time     bary+mce  bary+sifting  mce/sifting
2 sec    837.2     705.8         1.19
8 sec    787.5     705.7         1.12
32 sec   731.9     705.7         1.04
128 sec  692.2     —             0.98
512 sec  648.3     —             0.92

c(2000,2100,25,4)
time     bary+mce  bary+sifting  mce/sifting
2 sec    2478.6    2362.5        1.05
8 sec    2093.9    2347.2        0.90
32 sec   1793.2    2347.2        0.77
128 sec  1614.9    —             0.69

u(100,20,1.05,5)
time     bary+mce  bary+sifting  mce/sifting
2 sec    228.8     178.9         1.28
8 sec    212.2     178.9         1.19
32 sec   207.2     —             1.16
128 sec  207.2     —             1.16

u(25,80,1.25,1)
time     bary+mce  bary+sifting  mce/sifting
2 sec    12485.5   10384.2       1.20
8 sec    11627.9   10238.6       1.14
32 sec   11138.5   10229.7       1.09
128 sec  10794.4   10231.4       1.06
512 sec  10453.2   10229.8       1.02

c(2000,2100,50,8)
time     bary+mce  bary+sifting  mce/sifting
2 sec    863.0     1291.8        0.68
8 sec    757.1     1291.5        0.60
32 sec   739.8     1291.5        0.59
128 sec  737.4     —             0.58

c(2000,2500,100,8)
time     bary+mce  bary+sifting  mce/sifting
2 sec    3589.1    3969.5        0.90
8 sec    3211.2    3957.0        0.81
32 sec   3083.9    3957.0        0.78
128 sec  3047.0    —             0.77

[Figure 11 chart: the ratio bary+mce / bary+sifting plotted against time after bary (2, 32, and 512 sec, log scale; vertical axis 0.9–1.3) for the classes u(100,20,1.05,5), u(25,80,1.25,1), and u(50,40,1.05,1).]

Fig. 11. Reduction in total crossings when the combinations bary+mce and bary+sifting are given additional runtime and post-processing. Barycenter is executed for 2 seconds before mce or sifting. The ratios (bary+mce / bary+sifting) are the average of the ratios for the 32 instances, not the ratio of the averages.


Table VIII. Performance improvement with depth-first search preprocessing on an instance of class u(100,20,1.05,5).

column     (1)   (2)   (3)   (4)   (5)   (6)      (7)   (8)      (9)
preproc.   —     dfs   —     dfs   —     —        dfs   dfs      dfs
first      mce   mce   bary  bary  bary  bary     bary  bary     bary∗
second     —     —     —     —     mce   sifting  mce   sifting  mce
crossings  330   271   249   223   174   147      142   130      128

∗Here barycenter is followed by adjacent-exchange post-processing before mce is applied.

Table VIII illustrates the results – it shows total crossings for each heuristic/combination, with the single data row sorted by decreasing crossing number.

The bary and mce results refer to 1280-second runs with 4-second and 20-second restarts, respectively, as in Table V. With dfs preprocessing we did the same thing: used random restarts with dfs applied after each random reordering. Clearly the effect of dfs is also sensitive to the initial order. For 320 random starting configurations, the initial crossing number ranged from 8323 to 9253 with a median of 8885. After dfs, the crossing number ranged from 1895 to 2865 with a median of 2388.

For each combination involving barycenter, we saved the ordering for the best result and then applied mce or sifting to it, running each for 2048 seconds. The 2048-second runs turned out to be unnecessary: the best solution was found by mce in ≈ 6 seconds and by sifting in less than a second. In all cases the last heuristic applied was followed by post-processing. The last column, column (9), shows what happened when post-processing was applied to the (best) barycenter result before applying mce, a major improvement over the result obtained without the intermediate post-processing – column (7). As might be imagined, the use of post-processed barycenter as a starting point for sifting yielded worse results than the use of plain barycenter.

Even this simple example illustrates the dramatic effectiveness of dfs preprocessing, and that the effect is more pronounced for mce than it is for barycenter. Compare the 18% decrease from column (1) to column (2) with the 10.5% decrease from column (3) to column (4). It is also more pronounced for mce (preceded by barycenter) than it is for sifting. Compare the 18% decrease from column (5) to column (7) – if we use column (9), the post-processed barycenter, there is a 26% decrease – with the 12% decrease from column (6) to column (8).

The example also suggests that instances of the class u(100,20,1.05,5) have dramatically fewer crossings when a suitable combination of heuristics is applied. This may well be true of other instances that have many small layers.

Final remarks. The additional experiments reported in this section show mce, particularly when preceded by barycenter, to be a desirable alternative to other crossing minimization heuristics. The mce heuristic also benefits when depth-first search preprocessing is used, significantly more so than when depth-first search preprocessing is applied to barycenter plus sifting. The ability of the mce heuristic to outperform other heuristics in many situations arises even though – and possibly because – it was designed for a different objective. Its main strength is that it is dynamic: less likely to revisit previously considered configurations than other known heuristics. Another, similar advantage of mce is that its node insertion strategy is orthogonal to the sorting that takes place with depth-first search or barycenter and to the swapping of adjacent-exchange post-processing. In many cases the performance of mce, bary+mce, and dfs+bary+mce when given more time is also related, albeit not as directly, to the working hypothesis.


7. CONCLUSIONS AND FUTURE WORK

The mce heuristic was designed to solve the bottleneck crossing problem, a task at which it excels. The primary application for bottleneck crossing minimization is the crosstalk problem in integrated circuits.

The much more interesting and surprising results reported in this paper relate to the performance of mce when used to minimize the total number of crossings in a multi-layer dag. To wit:

(1) The relative performance of mce when compared with the barycenter heuristic can often be predicted based on the maximum degree in the dag. This is the working hypothesis proposed here. Other factors, such as degree density, number of layers, variance in layer cardinality, degree variance within individual layers, and adjacencies among high-degree nodes also play a role but are much harder to quantify.

(2) The working hypothesis can be partially explained using a simple two-layer example where the barycenter heuristic gets trapped and mce escapes from the trap. Deviations from the hypothesis have explanations related to the example and may be exceptions that prove the rule.

(3) The performance of mce is actually better than the initial experiments suggest. When run with a more generous time limit than the one originally used, it can almost always reduce the number of crossings to a greater extent than barycenter can. The latter shows no improvement after a certain point.

(4) Allowing barycenter to restart at new random configurations is clearly a better use of additional runtime than continuous execution and yields better results than long mce runs in some situations. Mce can also occasionally benefit from restarts after a long execution without improvement.

(5) Two heuristics are better than one. In particular, both of the node insertion heuristics, sifting and mce, obtain significantly better results when preceded by barycenter.

(6) Post-processing using an adjacent exchange heuristic reduces total crossings. This is true to the greatest extent for mce, less so for barycenter, and hardly at all for sifting (including when mce and sifting are preceded by barycenter).

(7) Depth-first search preprocessing can yield significant improvements for all heuristics and combinations thereof. It appears that the improvements are most dramatic when mce is the final heuristic applied.

(8) The additional experiments with longer runtimes and combinations of heuristics generally support the working hypothesis when applied to the performance of mce versus not only barycenter, but also sifting and the bary plus mce/sifting combinations. However, there appear to be complicating factors with respect to at least one class. The distinction between uniform and connected dags appears to play a major role as well.

The many experiments that were done raise a lot of questions. The working hypothesis turned out to be mostly a springboard for a more comprehensive understanding of the effectiveness of mce. Potential future work includes the following.

— Investigate other classes of dags, including modified versions of those arising in application areas. The search for suitable problem instances is a challenge for all experimental studies, including this one. Dags that are significantly different from the ones studied here are worth exploring, either to gather more evidence for the working hypothesis and observations reported here or to prompt a more detailed look at the various factors influencing relative performance in various contexts.

— Use more carefully controlled experiments and detailed examination of dag characteristics to gain additional insight that will guide the design of future heuristics.


The behavior of the heuristics and combinations on the class u(100,20,1.05,5) is especially interesting in this regard. Are there approaches that will reduce the number of crossings even further than what we observed? A more general idea applies to the design phase of heuristics and engineered algorithm implementations in many contexts: experiments can guide design as opposed to merely validating the relative performance of heuristics.

— Although Matuszewski et al. [1999] report that the best results for sifting were obtained when sorting nodes by degree, it is likely that a random reordering of nodes will be more profitable during long runs. Mce can also be randomized: instead of traversing edges in a fixed sequence in order to find the one with the most crossings, a random order will break ties randomly; this is especially useful in cases such as u(100,20,1.05,5), where there are likely to be lots of ties.

— Explore other ways to combine heuristics. One possibility is to interleave two or more, as was done with adaptive insertion and barycenter in [Stallmann et al. 2001].

— Modify mce so that its design is more intentional about minimizing total crossings. One possibility is using the edge with the maximum number of crossings to decide which nodes to sift, but designing the actual sifting to minimize total crossings, as is done in the original sifting heuristic.

— Characterize dags for which extra effort in the form of additional runtime and/or more sophisticated heuristics is likely to pay off. Experiments reported here do not give any definitive guidance, but one might conjecture that locally sparse dags – ones that have relatively few edges per channel – are the best candidates.

— Parallelize the more successful combinations. A Dagstuhl working group within the seminar on Graph Drawing and Algorithm Engineering (seminar 11191) initiated efforts on parallelization of barycenter [Ajwani et al. 2011]. The node insertion iterations of both sifting and mce can be done on individual layers or groups of consecutive layers, as was discussed in the case of barycenter (based, of course, solely on the configuration at the last synchronization point). The choice of the next node to sift or of the maximum crossings edge can be made either locally or at a synchronization point. The key question is: how much does node insertion within independent groups of layers degrade solution quality?
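The layer-group scheme described above might be organized around a synchronization point as in the sketch below. This is an assumption about structure only; `local_step` stands in for one round of sifting or mce node insertion within a group of consecutive layers.

```python
from concurrent.futures import ThreadPoolExecutor

def synchronized_pass(layer_groups, local_step):
    """Run one round of node insertion independently on each group of
    consecutive layers; returning from the pool is the synchronization
    point, where the per-group results are recombined into a single
    configuration before the next pass."""
    with ThreadPoolExecutor(max_workers=len(layer_groups)) as pool:
        return list(pool.map(local_step, layer_groups))
```

Because each `local_step` sees only its own group, any crossings between adjacent groups are ignored until the synchronization point, which is exactly the source of the solution-quality question raised above.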

ACKNOWLEDGMENT

I would like to thank Franc Brglez for our many years of collaboration on algorithm experimentation. The approach used in this paper was partially inspired by our joint work reported in [Stallmann and Brglez 2007a; 2007b]. Thanks to Saurabh Gupta, who played a major role in implementing the heuristics reported here (and others). Aaron Peeler and Josh Thompson of the NCSU Virtual Computing lab went out of their way to make the running of these experiments more convenient – I am grateful for their efforts. Additional thanks go to the reviewers for their multiple helpful suggestions, particularly one reviewer who pointed out the possibility that random restarts can improve the performance of the barycenter heuristic.

REFERENCES

AJWANI, D., DEMETRESCU, C., GUTWENGER, C., KRUG, R., MEYERHENKE, H., MUTZEL, P., NAHER, S., SANDER, G., AND STALLMANN, M. 2011. Report of working group: Parallel graph drawing. Tech. rep., Dagstuhl Workshop 11191. See www.dagstuhl.de/wiki/index.php/Image:ParallelGraphDrawing.pdf.

BACHMAIER, C., BRANDENBURG, F. J., BRUNNER, W., AND HUBNER, F. 2010. A global k-level crossing reduction algorithm. In WALCOM 2010, 4th International Workshop on Algorithms and Computation. Lecture Notes in Computer Science Series, vol. 5942. 70–81.

BARTH, W., JUNGER, M., AND MUTZEL, P. 2002. Simple and efficient bilayer cross counting. In 10th International Symposium on Graph Drawing. Lecture Notes in Computer Science Series, vol. 2528. 130–141.

⁷The current implementation visits layers in a fixed order, while the edges in each layer are traversed based on the current permutation of nodes.




BHATT, S. AND LEIGHTON, F. 1984. A framework for solving VLSI graph layout problems. JCSS 28, 300–343.

BUCHHEIM, C., EBNER, D., JUNGER, M., KLAU, G., MUTZEL, P., AND WEISKIRCHER, R. 2006. Exact crossing minimization. In 13th International Symposium on Graph Drawing. Lecture Notes in Computer Science Series, vol. 3843. 37–48.

CAKIROGLU, O. A., ERTEN, C., KARATAS, O., AND SOZDINLER, M. 2007. Crossing minimization in weighted bipartite graphs. In 6th International Workshop on Experimental Algorithms. Lecture Notes in Computer Science Series, vol. 4525. 122–135.

CHIMANI, M., GUTWENGER, C., AND MUTZEL, P. 2006. Experiments on exact crossing minimization using column generation. In 5th International Workshop on Experimental Algorithms. Lecture Notes in Computer Science Series, vol. 4007. 303–315.

CHIMANI, M., HUNGERLANDER, P., JUNGER, M., AND MUTZEL, P. 2011. An SDP approach to multi-level crossing minimization. In ALENEX 2011. 116–126.

CHIMANI, M., MUTZEL, P., AND BOMZE, I. 2008. A new approach to exact crossing minimization. In 16th Annual European Symposium on Algorithms. Lecture Notes in Computer Science Series, vol. 5193. 284–296.

DI BATTISTA, G., EADES, P., TAMASSIA, R., AND TOLLIS, I. G. 1999. Graph Drawing: Algorithms for the Visualization of Graphs. Prentice Hall.

DI BATTISTA, G., GARG, A., LIOTTA, G., TAMASSIA, R., TASSINARI, E., AND VARGIU, F. 1997. An experimental comparison of four graph drawing algorithms. Computational Geometry: Theory and Applications 7, 303–325.

DUJMOVIC, V., FELLOWS, M. R., KITCHING, M., LIOTTA, G., MCCARTIN, C., NISHIMURA, N., RAGDE, P., ROSAMOND, F., WHITESIDES, S., AND WOOD, D. R. 2008. On the parameterized complexity of layered graph drawing. Algorithmica 52, 267–292.

DUJMOVIC, V., FERNAU, H., AND KAUFMANN, M. 2003. Fixed parameter algorithms for one-sided crossing minimization revisited. In 11th International Symposium on Graph Drawing.

DUJMOVIC, V. AND WHITESIDES, S. 2004. An efficient fixed parameter tractable algorithm for 1-sided crossing minimization. Algorithmica 40, 115–128.

GANSNER, E., KOUTSOFIOS, E., NORTH, S., AND VO, K. 1993. A technique for drawing directed graphs. IEEE Trans. Software Engineering 19, 214–230.

GAREY, M. R. AND JOHNSON, D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman.

GAREY, M. R. AND JOHNSON, D. S. 1983. Crossing Number is NP-complete. SIAM J. Algebraic Discrete Methods 4, 312–316.

GUPTA, S. 2008. Crossing minimization in k-layer graphs. M.S. thesis, North Carolina State University.

GUPTA, S. AND STALLMANN, M. 2010. Bottleneck crossing minimization in layered graphs. Tech. Rep. 13, Dept. of Comp. Sci., North Carolina State University.

JUNGER, M. AND MUTZEL, P. 1997. 2-layer straightline crossing minimization: Performance of exact and heuristic algorithms. JGAA 1.

LAGUNA, M., MARTI, R., AND VALLS, V. 1997. Arc crossing minimization in hierarchical digraphs with tabu search. Computers and Operations Research 24, 1175–1186.

LI, X. Y. AND STALLMANN, M. F. 2002. New bounds on the barycenter heuristic for bipartite graph drawing. Information Processing Letters 82, 293–298.

MATUSZEWSKI, C., SCHONFELD, R., AND MOLITOR, P. 1999. Using sifting for k-layer straightline crossing minimization. In 8th International Symposium on Graph Drawing. Lecture Notes in Computer Science Series, vol. 1731. 217–224.

MUNOZ, X., UNGER, W., AND VRT'O, I. 2001. One sided crossing minimization is NP-hard for sparse graphs. In 9th International Symposium on Graph Drawing.

MUTZEL, P. 2001. An alternative method to crossing minimization on hierarchical graphs. SIAM J. Optimization 11, 1065–1080.

NAGAMOCHI, H. 2005. On the one-sided crossing minimization in a bipartite graph with large degrees. Theoretical Computer Science 332, 417–446.

SCHONFELD, R. 2000. k-layer straightline crossing minimization by speeding up sifting. In 8th International Symposium on Graph Drawing.

SHAHROKHI, F., SYKORA, O., SZEKELY, L., AND VRTO, I. 2001. On bipartite drawings and the linear arrangement problem. SIAM J. Computing 30, 1773–1789.




SHAHROKHI, F., SYKORA, O., SZEKELY, L. A., AND VRTO, I. 2000. A new lower bound for the bipartite crossing number with applications. Theoretical Computer Science 245, 281–294.

SRIVASTAVA, K. AND SHARMA, R. 2008. A hybrid simulated annealing algorithm for the bipartite crossing number minimization problem. In IEEE Congress on Evolutionary Computation. 2948–2954.

STALLMANN, M. AND BRGLEZ, F. 2007a. High-contrast algorithm behavior: Observation, conjecture, and experimental design. Tech. Rep. 14, Dept. of Comp. Sci., North Carolina State University.

STALLMANN, M. AND BRGLEZ, F. 2007b. High-contrast algorithm behavior: Observation, hypothesis, and experimental design. In Proceedings, First Annual Workshop on Experimental Computer Science.

STALLMANN, M., BRGLEZ, F., AND GHOSH, D. 2001. Heuristics, Experimental Subjects, and Treatment Evaluation in Bigraph Crossing Minimization. Journal on Experimental Algorithmics 6, 8.

SUGIYAMA, K., TAGAWA, S., AND TODA, M. 1981. Methods for visual understanding of hierarchical system structures. IEEE Transactions on Systems, Man, and Cybernetics 11, 109–125.

TAKAHASHI, H., KELLER, K., LE, K., SALUJA, K., AND TAKAMATSU, Y. 2005. A method for reducing the target fault list of crosstalk faults in synchronous sequential circuits. IEEE Transactions on CAD 24, 252–263.

VALLS, V., MARTI, R., AND LINO, P. 1996. A tabu thresholding algorithm for arc crossing minimization in bipartite graphs. Annals of Operations Research 63, 233–251.

WATSON, B., BRINK, D., STALLMANN, M., RHYNE, T.-M., DEVARAJAN, R., AND PATEL, H. 2008. Visualizing very large layered graphs with quilts. Tech. Rep. 17, North Carolina State University.
