Estimating Bounds for Quadratic Assignment Problems Associated with Hamming and Manhattan Distance Matrices based on Semidefinite Programming

Hans Mittelmann∗        Jiming Peng†

July 15, 2008

Abstract

Quadratic assignment problems (QAPs) with a Hamming distance matrix of a hypercube or a Manhattan distance matrix of rectangular grids arise frequently from communications and facility locations and are known to be among the hardest discrete optimization problems. In this paper we consider the issue of how to obtain lower bounds for those two classes of QAPs based on semidefinite programming (SDP). By exploiting the data structure of the involved distance matrix B, we first show that for any permutation matrix X, the matrix $Y = \alpha E - XBX^T$ is positive semidefinite, where $\alpha$ is a properly chosen parameter depending only on the associated graph in the underlying QAP and $E = ee^T$ is a rank one matrix whose elements have value 1. This results in a natural way to approximate the original QAPs via SDP relaxation based on the matrix splitting technique. Our new SDP relaxations have a smaller size compared with other SDP relaxations in the literature and can be solved efficiently by most open source SDP solvers. Experimental results show that for the underlying QAPs of size up to n = 200, strong bounds can be obtained effectively.

Key words. Quadratic Assignment Problem (QAP), Semidefinite Programming (SDP), Singular Value Decomposition (SVD), Relaxation, Lower Bound.

∗Department of Mathematics and Statistics, Arizona State University, Tempe, AZ 85287-1804, USA. Email:[email protected].

†Corresponding author. Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801.


1 Introduction

The standard quadratic assignment problem takes the following form

$$\min_{X\in\Pi} \ \mathrm{Tr}\left(AXBX^T\right) \qquad (1)$$

where $A, B \in \mathbb{R}^{n\times n}$, and $\Pi$ is the set of permutation matrices. This problem was first introduced by Koopmans and Beckmann [19] for facility location. The model covers many scenarios arising from various applications such as chip design [14], image processing [28], and communications [6, 24]. For more applications of QAPs, we refer to the survey paper [7], where many interesting QAPs from numerous fields are listed. Due to its broad range of applications, the study of QAPs has attracted a great amount of attention from experts in different fields [21]. It is known that QAPs are among the hardest discrete optimization problems. For example, a QAP with n = 30 is typically recognized as a great computational challenge [5].
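To make the objective in (1) concrete, the following small sketch (ours, not part of the paper) evaluates Tr(AXBX^T) for a permutation given as an index array; the instance data are arbitrary toy values.

```python
import numpy as np

def qap_objective(A, B, perm):
    """Evaluate Tr(A X B X^T), where X is the permutation matrix with X[i, perm[i]] = 1."""
    n = len(perm)
    X = np.zeros((n, n))
    X[np.arange(n), perm] = 1.0
    return np.trace(A @ X @ B @ X.T)

# toy instance: a random "flow" matrix A and a |i - j| "distance" matrix B
rng = np.random.default_rng(0)
n = 4
A = rng.integers(0, 5, size=(n, n)).astype(float)
B = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
print(qap_objective(A, B, np.array([2, 0, 3, 1])))
```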

In this work we focus on two special classes of QAPs where the matrix B is the Hamming distance matrix of a hypercube in the space $\mathbb{R}^d$ for a positive integer d, or the Manhattan distance matrix of rectangular grids. The first class of problems arises frequently from channel coding in communications, where the purpose is to minimize the total channel distortion caused by channel noise [6, 8, 24, 31]. For ease of reference, we first give a brief introduction to this class of problems. Vector Quantization (VQ) is a widely used technique in low bit rate coding of speech and image signals. An important step in a VQ-based communication system is index assignment, where the purpose is to minimize the distortion caused by channel noise. Mathematically, the index assignment problem in a VQ-based communication system can be defined as follows [6, 24]

$$\min_{X\in\Pi} \ \mathrm{Tr}\left(AX(HP + P^TH)X^T\right), \qquad (2)$$

where A is the channel probability matrix, P is a diagonal matrix based on the source statistics, and H is the Hamming distance matrix of a binary codebook $C = \{c_1, \cdots, c_n\}$ of length d ($n = 2^d$). The above problem has caught the attention of many researchers in the communication community and many results have been reported [24, 31]. For example, it was shown in [24] that the problem is NP-hard in general. However, under the conditions that the channel is Binary Symmetric (BSC) and the quantizer is a uniform scalar quantizer with a uniform source, i.e.,

$$A = [a_{ij}], \quad a_{ij} = t^{h_{ij}}(1-t)^{d-h_{ij}}, \qquad P = \frac{1}{n}I,$$

then the problem can be solved in polynomial time [31]. In [6], Ben-David and Malah used the projection bound to estimate the lower and upper bounds of the above problem.
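Under the stated BSC and uniform-source assumptions, the data of problem (2) are straightforward to generate. The sketch below is our illustration (the helper names are ours): it builds the Hamming matrix H of the codebook {0,1}^d, the channel matrix A, and P = I/n.

```python
import itertools
import numpy as np

def hamming_matrix(d):
    """Hamming distance matrix H of the binary codebook {0,1}^d, n = 2^d."""
    codes = np.array(list(itertools.product([0, 1], repeat=d)))
    return (codes[:, None, :] != codes[None, :, :]).sum(axis=2).astype(float)

def bsc_channel_matrix(H, t, d):
    """A = [a_ij] with a_ij = t^{h_ij} (1 - t)^{d - h_ij} (crossover probability t)."""
    return t ** H * (1.0 - t) ** (d - H)

d, t = 3, 0.05
n = 2 ** d
H = hamming_matrix(d)
A = bsc_channel_matrix(H, t, d)
P = np.eye(n) / n
# For a permutation matrix X, the objective of (2) is  Tr(A X (H P + P^T H) X^T).
```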

The problem (2) is usually of large size. Take a communication system with an 8-bit quantizer for example: the resulting QAP has a size of n = 512. This poses a great challenge in solving problem (2). As we will see in our later discussion, even obtaining a strong lower bound for problem (2) is nontrivial. The second class of problems that we shall discuss in the present work usually arises from facility location [7, 19, 21, 23], and is well known to be one of the hardest classes of QAPs in the literature [21]. For more background on this type of problem, we refer to [21, 23].

A popular technique for finding the exact solution of QAPs is the branch and bound (B&B) method. Crucial in a typical B&B approach is how a strong bound can be computed at a relatively cheap cost. Various relaxations and bounds for QAPs have been proposed in the literature. Roughly speaking, these bounds can be categorized into two groups. The first group includes several bounds that are not very strong but can be computed efficiently, such as the well-known Gilmore-Lawler bound (GLB) [11, 20], the bound based on projection [12] and the bound based on convex quadratic programming [4]. The second group contains strong bounds that require expensive computation, such as the bounds derived from lifted integer linear programming [1, 2, 13] and bounds based on SDP relaxation [25, 33].

In this paper, we are particularly interested in bounds based on SDP relaxations. As pointed out in [21, 25], SDP bounds are tighter than bounds based on other relaxations, but usually much more expensive to compute due to the large number, $O(n^4)$, of variables and constraints in the standard SDP relaxation. To alleviate this computational challenge, Ding and Wolkowicz [9] proposed a new SDP relaxation for QAPs with $O(n^2)$ variables and constraints. In [18], de Klerk and Sotirov exploited the symmetry in the underlying problem to compute the bounds based on the standard SDP relaxation.

Since the two classes of QAPs considered in this paper are associated with matrices of a specific structure, it is highly desirable to use this specific data structure to design algorithms that solve the problem or estimate lower bounds. The purpose of this work is to utilize the algebraic properties of the associated matrix to derive new SDP relaxations for the underlying QAPs. A distinction between our SDP relaxations and the existing SDP relaxations of QAPs lies in the lifting process. While most existing SDP relaxations of QAPs cast the underlying problem as a generic quadratic optimization model with discrete constraints to derive the SDP relaxation, we try to characterize the matrix $XBX^T$ for any permutation matrix X by using the properties of the matrix B.

Our approach is inspired by the following observation: the eigenvalues of the matrix $XBX^T$ are independent of the permutation matrix X. Moreover, as we shall see later, if B has some special structure, such as the Hamming distance matrix associated with a hypercube, then for any permutation matrix X, the matrix $XBX^T - \alpha E$ is negative semidefinite for a properly chosen parameter $\alpha$, where E is a matrix whose elements have value 1. This provides a new way to construct SDP relaxations for the underlying QAPs. Compared with the existing SDP relaxations for QAPs in the literature, our new model is more concise and yet still able to provide a strong bound for the underlying QAP.

Secondly, we consider the issue of how to further improve the efficiency of the SDP relaxation model without much sacrifice in the quality of the lower bound. For such a purpose, we suggest removing some constraints to derive a simplified variant of the new SDP model. The simplified SDP model can be solved efficiently by most open source SDP solvers. Experiments on large scale QAPs show that the bound from the new simplified SDP relaxation is comparable to the bound provided by the original SDP model.

The paper is organized as follows. In Section 2, we explore the properties of the matrices associated with the Hamming distance, such as the Hamming distance matrix of the hypercube in a finite dimensional space, and with the Manhattan distance of rectangular grids. In Section 3, we discuss how to use the theoretical properties of these matrices to construct new SDP relaxations for the underlying QAPs. We also discuss some simplified variants of the SDP models introduced in Section 3 and their enhancements. Promising numerical experiments are reported in Section 4, and finally we conclude our paper with some remarks in Section 5.

2 Properties of the matrices associated with the Hamming and Manhattan distances

We start with a simple description of the geometric structure of a hypercube in the space $\mathbb{R}^d$. First we note that corresponding to every vertex v of the hypercube there is a binary code $c_v$ of length d, i.e., $c_v \in \{0,1\}^d$. The Hamming distance between the binary codewords corresponding to two vertices $v_i, v_j$ of the hypercube is defined by

$$\delta(v_i, v_j) = \sum_{k=1}^{d} \left| c^k_{v_i} - c^k_{v_j} \right|. \qquad (3)$$

Moreover, if for every vertex v of the hypercube we define

$$S(v, l) = \{ v' \in V : \delta(v, v') = l \}, \qquad (4)$$

and $|S(\cdot, \cdot)|$ denotes the cardinality of the set, then we have

$$|S(v, l)| = C(d, l), \quad \forall\, l = 1, \cdots, d, \qquad (5)$$

where $C(d, l) = \binom{d}{l}$ is the binomial coefficient.

Definition 2.1. The matrix $H^d \in \mathbb{R}^{n\times n}$ with $n = 2^d$ is called a Hamming distance matrix of a hypercube in $\mathbb{R}^d$ if

$$H^d = [h^d_{ij}], \quad h^d_{ij} = \delta(v_i, v_j),$$

where $V = \{v_i : i = 1, \cdots, n\}$ is the vertex set of the hypercube in $\mathbb{R}^d$.

We now show the following key result.


Theorem 2.2. Let $H^d \in \mathbb{R}^{n\times n}$ ($n = 2^d$) be the Hamming distance matrix defined by Definition 2.1 whose eigenvalues $\lambda_1, \cdots, \lambda_n$ are listed in decreasing order. Then we have

$$\lambda_1 = \frac{dn}{2}, \quad \lambda_2 = \cdots = \lambda_{n-d} = 0, \quad \lambda_{n-d+1} = \cdots = \lambda_n = -\frac{n}{2}, \qquad (6)$$

$$H^d e = \lambda_1 e, \quad e = (1, \cdots, 1)^T \in \mathbb{R}^n. \qquad (7)$$

Proof. For any index $i \in \{1, \cdots, n\}$, it is easy to see that

$$\sum_{j=1}^{n} h^d_{ij} = \sum_{l=1}^{d} l\, C(d, l) = d 2^{d-1},$$

which further implies the second conclusion of the theorem. We next show

$$\lambda_2 < \lambda_1. \qquad (8)$$

The above inequality holds clearly when d = 1 and n = 2. We now discuss the case when d > 1 and $n = 2^d > 2$. Suppose inequality (8) does not hold; thus we can assume that $\lambda_2 = \lambda_1$. Then there must exist a vector $u \in \mathbb{R}^n$ such that

$$H^d u = d 2^{d-1} u, \quad u^T e = 0.$$

Let $i_0$ be the index corresponding to the maximal absolute value of $u_i$, i.e.,

$$|u_{i_0}| = \max_{i=1,\cdots,n} |u_i|.$$

We have

$$\left[ H^d - \frac{d}{2} e e^T \right] u = d 2^{d-1} u.$$

It follows that

$$\sum_{j=1}^{n} \left( h^d_{i_0 j} - \frac{d}{2} \right) u_j = d 2^{d-1} u_{i_0}. \qquad (9)$$

Since

$$\sum_{j=1}^{n} \left| h^d_{i_0 j} - \frac{d}{2} \right| < d 2^{d-1}, \quad \forall\, d > 1,$$

from the choice of $u_{i_0}$ one can conclude that relation (9) does not hold. This proves inequality (8).

Next we proceed to prove the first conclusion by mathematical induction. When d = 1, it is easy to see that for any permutation matrix $X \in \mathbb{R}^{2\times 2}$, we have

$$X H^1 X^T = H^1.$$

In such a case, the theorem holds trivially. Now let us assume the conclusions of the theorem are true for d = k ($n/2 = 2^k$). We consider the case where d = k + 1 and $n = 2^{k+1}$. Note that every matrix defined by Definition 2.1 can be represented as $X H^d X^T$ for some specific matrix $H^d$ and a permutation matrix X. Because the eigenvalues of the matrix $X H^d X^T$ are independent of X, it suffices to consider the eigenvalues of any specific matrix $H^d$ constructed from a hypercube. Now let us consider the matrix $H^k \in \mathbb{R}^{\frac{n}{2}\times\frac{n}{2}}$ defined by Definition 2.1, associated with a hypercube in $\mathbb{R}^k$ and a set of binary codes of length k, $\{c^1, \cdots, c^{n/2}\}$. We can construct a new set of n binary codes of length k + 1 as follows:

$$c^i_v = 0 \oplus c^i, \quad c^{i+n/2}_v = 1 \oplus c^i, \quad i = 1, \cdots, n/2.$$

Here the symbol $\oplus$ denotes the operation of concatenating the two binary strings to form a new binary string. Based on such a construction and Definition 2.1, we have

$$H^d = H^{k+1} = \begin{pmatrix} H^k & E_{2^k} + H^k \\ E_{2^k} + H^k & H^k \end{pmatrix}. \qquad (10)$$

Here $E_{2^k} \in \mathbb{R}^{2^k\times 2^k}$ is the all-1 matrix. Next, consider an eigenvector $(u_1^T, u_2^T)^T \in \mathbb{R}^{2^{k+1}}$ of the matrix $H^{k+1}$ corresponding to its eigenvalue $\lambda$. It follows immediately that

$$H^k u_1 + H^k u_2 + e_{2^k}^T u_2\, e_{2^k} = \lambda u_1; \qquad (11)$$

$$H^k u_1 + H^k u_2 + e_{2^k}^T u_1\, e_{2^k} = \lambda u_2. \qquad (12)$$

The above relations yield

$$(\lambda I - 2H^k)(u_1 + u_2) = e_{2^k}^T (u_1 + u_2)\, e_{2^k}.$$

Because $e_{2^k}$ is also an eigenvector of the matrix $\lambda I - 2H^k$, we have

$$e^T (\lambda I - 2H^k)(u_1 + u_2) = (\lambda - k 2^k)\, e_{2^k}^T (u_1 + u_2) = 2^{d-1} e_{2^k}^T (u_1 + u_2).$$

Therefore, we must have either $\lambda = (k+1)2^k = d 2^{d-1}$ or $e_{2^k}^T (u_1 + u_2) = 0$. In the first case, we know from the second conclusion of the theorem that $\lambda$ is the largest eigenvalue of the matrix $H^{k+1}$, corresponding to its eigenvector $e_{2^{k+1}}$. In the second case, it must hold that

$$(\lambda I - 2H^k)(u_1 + u_2) = 0,$$

which further implies that either $\lambda/2$ is an eigenvalue of $H^k$ with an eigenvector $u_1 + u_2$, or $u_1 + u_2 = 0$. If $u_1 + u_2 = 0$, from relations (11)-(12) we can conclude that $u_1$ is a multiple of $e_{2^k}$, and $(e_{2^k}^T, -e_{2^k}^T)^T$ is an eigenvector of $H^{k+1}$ with eigenvalue $\lambda = -2^k$. It remains to consider the case when $\lambda/2$ is an eigenvalue of $H^k$ with an eigenvector $u_1 + u_2$. By the induction hypothesis, we then have either $\lambda = 0$ or $\lambda = -2\times 2^{k-1} = -2^k = -2^{d-1}$. This shows that all the negative eigenvalues of $H^d$ equal $-2^{d-1}$ when d = k + 1. This proves the first conclusion of the theorem.

A direct consequence of Theorem 2.2 is:


Corollary 2.3. Suppose that H is the Hamming distance matrix of the hypercube in $\mathbb{R}^d$ defined in Definition 2.1. Then the projection of H onto the null space of e is negative semidefinite. Moreover, the matrix $H_1 = \frac{d}{n}E - \frac{2}{n}H$ is a projection matrix with rank d onto the null space of e.
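Both Theorem 2.2 and Corollary 2.3 are easy to confirm numerically for small d; the check below is our sketch, not part of the paper.

```python
import itertools
import numpy as np

def hamming_matrix(d):
    """Hamming distance matrix H^d of the hypercube {0,1}^d, n = 2^d."""
    codes = np.array(list(itertools.product([0, 1], repeat=d)))
    return (codes[:, None, :] != codes[None, :, :]).sum(axis=2).astype(float)

for d in range(1, 7):
    n = 2 ** d
    H = hamming_matrix(d)
    # Theorem 2.2: eigenvalues are d*n/2 (once), 0 (n - d - 1 times), -n/2 (d times)
    expected = np.sort(np.concatenate(([d * n / 2], np.zeros(n - d - 1), np.full(d, -n / 2))))
    assert np.allclose(np.sort(np.linalg.eigvalsh(H)), expected)
    # Corollary 2.3: H1 = (d/n) E - (2/n) H is a projection of rank d with H1 e = 0
    E, e = np.ones((n, n)), np.ones(n)
    H1 = (d / n) * E - (2.0 / n) * H
    assert np.allclose(H1 @ H1, H1) and np.isclose(np.trace(H1), d) and np.allclose(H1 @ e, 0)
print("Theorem 2.2 and Corollary 2.3 verified numerically for d = 1, ..., 6")
```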

For any positive integer k, we consider the following matrix:

$$M_k = [m_{ij}] \in \mathbb{R}^{k\times k}, \quad m_{ij} = |i - j|. \qquad (13)$$

Such a matrix can be viewed as a submatrix of a Hamming distance matrix of the hypercube in $\mathbb{R}^{k-1}$ corresponding to a subset of vertices. It can also be cast as the Manhattan distance matrix of the grid points on a straight line. From Theorem 2.2 we obtain the following corollary.

Corollary 2.4. For a given matrix $M_k$ defined by (13), the matrix $\bar{M}_k = \frac{k-1}{2}E_k - M_k$ is positive semidefinite.

Now let us consider the Manhattan distance of rectangular grids. Without loss of generality, we can assume the rectangle has k rows and l columns with a set $V = \{v_1, \cdots, v_n\}$ of $n = k\times l$ nodes. Note that the Manhattan distance can be represented as the sum of two distances: the row and the column distances. Because the set of eigenvalues of the matrix $XBX^T$ is invariant for any permutation matrix X, by performing permutations if necessary, we can assume that the subset of nodes $V^r_i = \{v_{(i-1)l+j} : j = 1, \cdots, l\}$ is from the i-th row of the rectangular grid, and the subset of nodes $V^c_j = \{v_{(i-1)l+j} : i = 1, \cdots, k\}$ is from the j-th column of the rectangular grid. In such a case, the row distance matrix $B_r$ can be written as the Kronecker product of two matrices,

$$B_r = M_k \otimes E_l,$$

where $M_k$ is the matrix defined by (13). Now, recalling Corollary 2.4, we know that the matrix $\frac{k-1}{2}E_k - M_k$ is positive semidefinite. Moreover, it is straightforward to verify the following relation:

$$\frac{k-1}{2}E_n - B_r = \left( \frac{k-1}{2}E_k - M_k \right) \otimes E_l.$$

Since the Kronecker product of two positive semidefinite matrices is also positive semidefinite (Corollary 4.2.13, pp. 245-246, [16]), it follows immediately that the matrix $\frac{k-1}{2}E_n - B_r$ is positive semidefinite. Similarly, we can also show that $\frac{l-1}{2}E_n - B_c$ is positive semidefinite, where $B_c$ is the column distance matrix defined by

$$B_c = E_k \otimes M_l.$$

We thus have the following result.

Theorem 2.5. Suppose M is a Manhattan distance matrix of a rectangular grid with k rows and l columns. Then the matrix $\bar{M} = \frac{k+l-2}{2}E_n - M$ is positive semidefinite.
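A quick numerical check of Theorem 2.5 (our sketch): assemble B_r and B_c as Kronecker products, exactly as in the derivation above, and test the smallest eigenvalue of the shifted matrix.

```python
import numpy as np

def linear_distance_matrix(m):
    """The matrix M_m of (13): m_ij = |i - j|."""
    return np.abs(np.subtract.outer(np.arange(m), np.arange(m))).astype(float)

def manhattan_grid_matrix(k, l):
    """Manhattan distance matrix of a k x l rectangular grid, nodes ordered row by row."""
    B_r = np.kron(linear_distance_matrix(k), np.ones((l, l)))   # row distances:    M_k x E_l
    B_c = np.kron(np.ones((k, k)), linear_distance_matrix(l))   # column distances: E_k x M_l
    return B_r + B_c

k, l = 4, 6
n = k * l
M = manhattan_grid_matrix(k, l)
shifted = (k + l - 2) / 2.0 * np.ones((n, n)) - M
# Theorem 2.5: the shifted matrix is positive semidefinite
assert np.linalg.eigvalsh(shifted).min() > -1e-9
```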


3 New SDP Relaxations for QAPs

In this section, we describe our new SDP relaxations for QAPs associated with a Hamming distance matrix or a Manhattan distance matrix of rectangular grids. The section consists of three parts. In the first subsection, we introduce the new SDP relaxations for QAPs with a Hamming distance matrix and compare them with other SDP relaxations for the underlying problems in the literature. In the second subsection, we discuss the case of QAPs with a Manhattan distance matrix. In the last subsection, we discuss how to further improve the relaxation models of the first two subsections and how to simplify them based on computational considerations. The bounds from the simplified models are analyzed as well.

3.1 New SDP relaxations for QAPs with a Hamming distance matrix

We first discuss the case when B is the Hamming distance matrix. Let $Y = \frac{d}{n}E_n - \frac{2}{n}XBX^T$. From Corollary 2.3, we have $Y^2 = Y$, which can be further relaxed to $I \succeq Y \succeq 0$. On the other hand, since the matrix $\frac{d}{n}E_n - \frac{2}{n}B$ is also a projection matrix and X is a permutation matrix, we have

$$Y = X\left( \frac{d}{n}E_n - \frac{2}{n}B \right)X^T = X\left( \frac{d}{n}E_n - \frac{2}{n}B \right)^2 X^T.$$

Let us define $Z = X\left( \frac{d}{n}E_n - \frac{2}{n}B \right)$; then we have

$$\begin{pmatrix} X \\ Z \end{pmatrix}\begin{pmatrix} X \\ Z \end{pmatrix}^T = \begin{pmatrix} XX^T & XZ^T \\ ZX^T & ZZ^T \end{pmatrix} = \begin{pmatrix} I & Y \\ Y & Y \end{pmatrix}. \qquad (14)$$

Following the matrix-lifting procedure in [9], we relax such a relation to

$$\begin{pmatrix} I & X^T & Z^T \\ X & I & Y \\ Z & Y & Y \end{pmatrix} \succeq 0. \qquad (15)$$

Because $XZ^T = X\left( \frac{d}{n}E_n - \frac{2}{n}B \right)X^T$, the above constraint can be further reduced to

$$\begin{pmatrix} I & Z^T \\ Z & Y \end{pmatrix} \succeq 0, \quad I - Y \succeq 0.$$

We next impose constraints on the elements of the matrix Y. Because all the elements on the diagonal of B are zeros, so are the elements on the diagonal of the matrix $XBX^T$. Moreover, since all the off-diagonal elements of B are positive integers, we thus have $y_{ij} \le \frac{d-2}{n}$. Therefore, we can relax the QAP with a Hamming distance matrix to the following SDP:

$$\begin{array}{rll}
\min & \frac{d}{2}\mathrm{Tr}(AE) - \frac{n}{2}\mathrm{Tr}(AY) & (16)\\[2pt]
\mathrm{s.t.} & y_{ii} = \frac{d}{n}, \quad \forall\, i \in \{1, \cdots, n\}; & (17)\\[2pt]
 & \sum_{j=1}^{n} y_{ij} = 0, \quad \forall\, i \in \{1, \cdots, n\}; & (18)\\[2pt]
 & y_{ij} \le y_{ii} - \frac{2}{n}, \quad \forall\, j \ne i \in \{1, \cdots, n\}; & (19)\\[2pt]
 & \begin{pmatrix} I & Z^T \\ Z & Y \end{pmatrix} \succeq 0, \quad I - Y \succeq 0; & (20)\\[2pt]
 & Z = X\left( \frac{d}{n}E_n - \frac{2}{n}B \right); & (21)\\[2pt]
 & X \ge 0, \quad X^T e = X e = e. & (22)
\end{array}$$
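For concreteness, here is a sketch of how model (16)-(22) could be assembled with CVXPY (our illustration; the paper's experiments use CVX and SDPT3 under Matlab, and the function name, solver choice and the assumption $d \ge 2$ are ours). The block LMI in (20) is encoded through an explicit PSD variable whose off-diagonal block is pinned to $Z^T$.

```python
import numpy as np
import cvxpy as cp

def hamming_sdp_bound(A, B, d):
    """Lower bound on Tr(A X B X^T) from relaxation (16)-(22); B is the Hamming distance matrix."""
    n = A.shape[0]
    E, I = np.ones((n, n)), np.eye(n)
    C = (d / n) * E - (2.0 / n) * B                 # the projection matrix of Corollary 2.3

    X = cp.Variable((n, n), nonneg=True)
    Y = cp.Variable((n, n), symmetric=True)
    blk = cp.Variable((2 * n, 2 * n), PSD=True)     # plays the role of [[I, Z^T], [Z, Y]] in (20)
    Z = X @ C                                       # constraint (21), eliminated by substitution

    off = 1.0 - I                                   # mask selecting the off-diagonal entries
    cons = [
        cp.diag(Y) == d / n,                        # (17)
        cp.sum(Y, axis=1) == 0,                     # (18)
        cp.multiply(off, Y) <= (d - 2) / n * off,   # (19), using y_ii = d/n from (17)
        blk[:n, :n] == I, blk[:n, n:] == Z.T, blk[n:, n:] == Y,   # (20), first LMI
        I - Y >> 0,                                 # (20), second LMI
        cp.sum(X, axis=0) == 1, cp.sum(X, axis=1) == 1,           # (22)
    ]
    obj = d / 2 * np.trace(A @ E) - n / 2 * cp.trace(A @ Y)       # (16)
    prob = cp.Problem(cp.Minimize(obj), cons)
    prob.solve(solver=cp.SCS)                       # any SDP-capable solver can be substituted
    return prob.value
```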

Because we employ the same matrix-lifting technique to derive the SDP relaxation for the underlying QAPs, it is interesting to compare our model with the strongest SDP relaxation model (MSDR3) in [9]. It is easy to see that from any feasible solution of our SDP model, we can construct a feasible solution to the MSDR3 model in [9]. We thus have the following result.

Theorem 3.1. Let $\mu^*_{SDP}$ be the lower bound derived by solving (16) with a Hamming distance matrix, and let $\mu^*_{MSDR3}$ be the bound derived from the MSDR3 model in [9]. Then we have

$$\mu^*_{SDP} \ge \mu^*_{MSDR3}.$$

Proof. To prove the theorem, we first observe that there are three major differences between model (16) and MSDR3. The first one is that the constraints (17)-(20) are replaced by the following constraints:

$$\begin{pmatrix} I & X^T & Z^T \\ X & I & Y \\ Z & Y & W \end{pmatrix} \succeq 0, \quad \mathrm{diag}(Y) = X\,\mathrm{diag}(B), \qquad (23)$$

$$Ye = XBe, \quad \mathrm{diag}(W) = X\,\mathrm{diag}(B^2), \quad We = XB^2 e. \qquad (24)$$

Secondly, MSDR3 uses the projection of X onto the null space of e. Thirdly, additional cuts based on the eigenvalues of the matrices A and B are added to strengthen the relaxation.

One can easily verify that the constraints in model (16) are tighter than the constraints (23)-(24) with respect to the matrix Y. Secondly, since we have already projected the matrix B onto the null space of e, the further projection of the permutation matrix X onto the null space of e won't change our model or improve its bound. Last, the cuts in MSDR3 are essentially based on the following inequality:

$$\mathrm{Tr}(AY) \ge \langle A, B \rangle_{-} = \sum_{i=1}^{n} \lambda_i(A)\lambda_{n-i+1}(B), \qquad (25)$$

where $\lambda_i(A)$, $i = 1, \cdots, n$, and $\lambda_i(B)$, $i = 1, \cdots, n$, are the eigenvalues of A and B sorted in decreasing order, respectively. It should be mentioned that in [9], the authors proposed to apply similar cuts to some carefully chosen sub-matrices of the whole product matrix $AY = AXBX^T$. However, as shown in Theorem 2.2, for any permutation matrix X, the projection of $XBX^T$ onto the null space of e has either zero or identical negative eigenvalues. This implies that relation (25) is satisfied automatically in model (16). Similarly, one can also show that all the cuts in MSDR3 are satisfied in model (16).

We mention that in our recent paper [22], we have shown (see Theorem 2.6 of [22]) that for QAPs associated with the hypercube (or Hamming distance matrices), the optimal solution can be attained at a permutation matrix $X^*$ with $x^*_{11} = 1$. Therefore, we can further add such a constraint to strengthen the bound.

3.2 New SDP relaxations for QAPs with a Manhattan distance matrix

In this subsection we describe our new SDP relaxations for QAPs with a Manhattan distance matrix of rectangular grids with k rows and l columns. As in Section 2, we first decompose the matrix B into two matrices, i.e., $B = B_r + B_c$, where $B_r$ and $B_c$ are the distance matrices based on the row and column positions of the nodes in the rectangular grid, respectively. Without loss of generality, we can further assume that

$$\bar{B}_r = \frac{k-1}{2}E_n - B_r = \left( \frac{k-1}{2}E_k - M_k \right) \otimes E_l,$$

$$\bar{B}_c = \frac{l-1}{2}E_n - B_c = E_k \otimes \left( \frac{l-1}{2}E_l - M_l \right).$$

It follows from Corollary 2.4 that both $\bar{B}_r$ and $\bar{B}_c$ are positive semidefinite. Let $\hat{B}_r$ and $\hat{B}_c$ denote the square roots of the matrices $\bar{B}_r$ and $\bar{B}_c$, respectively. We have

$$\mathrm{Tr}\left(AXBX^T\right) = \frac{k+l-2}{2}\mathrm{Tr}(AE_n) - \mathrm{Tr}\left(AX(\bar{B}_r + \bar{B}_c)X^T\right) = \frac{k+l-2}{2}\mathrm{Tr}(AE_n) - \mathrm{Tr}\left(AX(\hat{B}_r^2 + \hat{B}_c^2)X^T\right).$$

Let us define $\bar{Y}_r = X\bar{B}_r X^T$, $\bar{Y}_c = X\bar{B}_c X^T$. It follows immediately^1 that

$$\bar{Y}_r = X\hat{B}_r^2 X^T, \quad \bar{Y}_c = X\hat{B}_c^2 X^T. \qquad (26)$$

The above relations can be relaxed to

$$\bar{Y}_r - X\hat{B}_r^2 X^T \succeq 0, \quad \bar{Y}_c - X\hat{B}_c^2 X^T \succeq 0.$$

We next discuss how to compute the square roots $\hat{B}_r$ and $\hat{B}_c$. For this we first cite a classic result on Kronecker products (Lemma 4.2.10, [16]).

^1 We mention that it is also possible to use the Cholesky decomposition of $\bar{B}_r$ and $\bar{B}_c$.


Lemma 3.2. Let $A_1, A_2 \in \mathbb{R}^{m\times m}$ and $B_1, B_2 \in \mathbb{R}^{n\times n}$. Then we have

$$(A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1 A_2) \otimes (B_1 B_2).$$

Based on the above lemma, we have

$$\hat{B}_r = \hat{M}_k \otimes \left( \frac{1}{\sqrt{l}}E_l \right), \quad \hat{B}_c = \left( \frac{1}{\sqrt{k}}E_k \right) \otimes \hat{M}_l,$$

where $\hat{M}_k$, $\hat{M}_l$ are the square roots of the matrices $\frac{k-1}{2}E_k - M_k$ and $\frac{l-1}{2}E_l - M_l$, respectively.
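A sketch (ours) of this computation: $\hat{M}_k$ and $\hat{M}_l$ are obtained from an eigendecomposition of the shifted matrices, and the Kronecker forms above then give $\hat{B}_r$ and $\hat{B}_c$; the final assertion re-checks that $\hat{B}_r^2$ equals $\bar{B}_r$.

```python
import numpy as np

def psd_sqrt(S):
    """Symmetric PSD square root via eigendecomposition (tiny negative round-off is clipped)."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

def linear_distance_matrix(m):
    return np.abs(np.subtract.outer(np.arange(m), np.arange(m))).astype(float)

def manhattan_sqrt_factors(k, l):
    """Return (B_r_hat, B_c_hat) with B_r_hat^2 = (k-1)/2 E_n - B_r and B_c_hat^2 = (l-1)/2 E_n - B_c."""
    Mk_hat = psd_sqrt((k - 1) / 2.0 * np.ones((k, k)) - linear_distance_matrix(k))
    Ml_hat = psd_sqrt((l - 1) / 2.0 * np.ones((l, l)) - linear_distance_matrix(l))
    B_r_hat = np.kron(Mk_hat, np.ones((l, l)) / np.sqrt(l))
    B_c_hat = np.kron(np.ones((k, k)) / np.sqrt(k), Ml_hat)
    return B_r_hat, B_c_hat

k, l = 3, 5
B_r = np.kron(linear_distance_matrix(k), np.ones((l, l)))
B_r_hat, B_c_hat = manhattan_sqrt_factors(k, l)
assert np.allclose(B_r_hat @ B_r_hat, (k - 1) / 2.0 * np.ones((k * l, k * l)) - B_r)
```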

We next discuss how to impose some constraints on the elements of $\bar{Y}_r$ and $\bar{Y}_c$. It is easy to see that all the elements on the diagonal of $\bar{Y}_r$ have the same values as those of $\bar{B}_r$. We next consider the sum of all the elements in any row of $\bar{Y}_r$. Since $\bar{Y}_r = X\bar{B}_r X^T$, we have

$$\bar{Y}_r e = X\left( \frac{k-1}{2}E_n - B_r \right)X^T e = \frac{(k-1)n}{2}e - XB_r e.$$

Similar relations for $\bar{Y}_c$ can also be obtained. Because

$$(B_r + B_c)_{ij} \ge 1, \quad \forall\, i \ne j \in \{1, \cdots, n\},$$

it follows that

$$\bar{y}^r_{ij} + \bar{y}^c_{ij} \le \frac{k+l-4}{2}, \quad \forall\, j \ne i \in \{1, \cdots, n\}.$$

Based on the above discussion, we can relax the QAPs with a Manhattan distance matrix of a $k\times l$ rectangular grid to the following SDP:

$$\begin{array}{rl}
\min & \frac{k+l-2}{2}\mathrm{Tr}(AE) - \mathrm{Tr}\left(A(\bar{Y}_r + \bar{Y}_c)\right) \qquad (27)\\[2pt]
\mathrm{s.t.} & \mathrm{diag}(\bar{Y}_r) = \frac{k-1}{2}e, \quad \mathrm{diag}(\bar{Y}_c) = \frac{l-1}{2}e,\\[2pt]
 & \bar{Y}_r e = \frac{(k-1)n}{2}e - XB_r e;\\[2pt]
 & \bar{Y}_c e = \frac{(l-1)n}{2}e - XB_c e;\\[2pt]
 & \bar{y}^r_{ij} + \bar{y}^c_{ij} \le \frac{k+l-4}{2}, \quad \forall\, j \ne i \in \{1, \cdots, n\};\\[2pt]
 & \begin{pmatrix} I & Z_r^T \\ Z_r & \bar{Y}_r \end{pmatrix} \succeq 0, \quad \begin{pmatrix} I & Z_c^T \\ Z_c & \bar{Y}_c \end{pmatrix} \succeq 0,\\[2pt]
 & Z_r = X\hat{B}_r, \quad Z_c = X\hat{B}_c,\\[2pt]
 & X \ge 0, \quad X^T e = Xe = e.
\end{array}$$

Recall that in [9], the authors imposed the constraints (23)-(24) on the matrix $Y = XBX^T$. In addition to the constraints in model (27), we can further add more constraints, resulting in the following relaxation:

$$\begin{array}{rl}
\min & \frac{k+l-2}{2}\mathrm{Tr}(AE) - \mathrm{Tr}\left(A(\bar{Y}_r + \bar{Y}_c)\right) \qquad (28)\\[2pt]
\mathrm{s.t.} & \begin{pmatrix} I & \bar{Y}_r \\ \bar{Y}_r & W_r \end{pmatrix} \succeq 0, \quad \begin{pmatrix} I & \bar{Y}_c \\ \bar{Y}_c & W_c \end{pmatrix} \succeq 0,\\[2pt]
 & \mathrm{diag}(W_r) = X\,\mathrm{diag}(\bar{B}_r^2), \quad \mathrm{diag}(W_c) = X\,\mathrm{diag}(\bar{B}_c^2);\\[2pt]
 & W_r e = X\bar{B}_r^2 e, \quad W_c e = X\bar{B}_c^2 e;\\[2pt]
 & \text{constraints in model (27)}.
\end{array}$$

Suppose that $(\bar{Y}_r, \bar{Y}_c)$ is a feasible solution to model (27) or (28), and let us define $\bar{Y}^* = \frac{k+l-2}{2}E_n - \bar{Y}_r - \bar{Y}_c$. One can easily see that the matrix $\bar{Y}^*$ also satisfies constraints (23)-(24) with respect to the matrix argument Y. Therefore, we have

Theorem 3.3. Let $\mu^*_{SDP}$ be the lower bound derived by solving (27) or (28) with a Manhattan distance matrix, and let $\mu^*_{MSDR1}$ be the bound derived from the MSDR1 model in [9]. Then we have

$$\mu^*_{SDP} \ge \mu^*_{MSDR1}.$$

We mention that the above model can be further enhanced by using the geometric structure of the rectangular grid. For example, based on the symmetries of the rectangular grid, one can show that there exists at least one global solution that satisfies

$$x_{1j} = 1, \quad j \le \left\lceil \frac{k}{2} \right\rceil \left\lceil \frac{l}{2} \right\rceil.$$

We note that some QAPs (like nug16a, nug17, nug18) from the QAP library [7] were constructed from larger instances by deleting several nodes from some rows or columns. In order to apply model (27), we need to first compute the row (or column) distance matrix of the original rectangular grid. Then we remove the last few rows and columns corresponding to the construction of the underlying QAP instance. In such a case, we cannot represent the row (or column) distance matrix $B_r$ (or $B_c$) as a Kronecker product. Nevertheless, we can still show that the matrices $\bar{B}_r = \frac{k-1}{2}E_n - B_r$ and $\bar{B}_c = \frac{l-1}{2}E_n - B_c$ remain positive semidefinite.

3.3 Further enhancement and simplifications

In this subsection, we first discuss how to further enhance the SDP relaxation models proposed earlier in this section by incorporating the simple linear constraints in the GLB model into our model. To see this, let us recall the GLB model, which can be defined as the following linear assignment problem:

$$\begin{array}{rl}
\min & \mathrm{Tr}(WX) \qquad (29)\\[2pt]
\mathrm{s.t.} & X \ge 0, \quad Xe = X^T e = e. \qquad (30)
\end{array}$$

Here W is a matrix constructed from the matrices A and B by the following rule [11]. Let $a_{i,:}$ and $b_{:,i}$ be the i-th row and the i-th column of the matrices A and B, respectively. We have

$$w_{ij} = a_{ii}b_{jj} + \sum_{k=1}^{n-1} a^*_{ik}b^*_{kj},$$

where $a^*_{ik}$, $k = 1, \cdots, n-1$, are the elements of $a_{i,:}$ excluding $a_{ii}$, sorted in decreasing order, and $b^*_{kj}$, $k = 1, \cdots, n-1$, are the elements of $b_{:,j}$ excluding $b_{jj}$, sorted in increasing order. Let $Y = XBX^T$, where B is the original matrix in (1); then we have

$$a_{i,:}\, y_{:,i} \ge w_{i,:}\, x_{:,i}, \quad \forall\, i = 1, \cdots, n.$$

We can add the above constraints to the existing SDP relaxation models to improve their lower bounds.
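For reference, the construction of W and the resulting cuts can be written as follows (our sketch, with hypothetical function names). In models (16) or (27), $XBX^T$ is an affine function of the model variables, e.g. $XBX^T = \frac{d}{2}E - \frac{n}{2}Y$ in the Hamming case, so the cuts stay linear after substitution.

```python
import numpy as np

def glb_matrix(A, B):
    """Gilmore-Lawler matrix W: w_ij = a_ii*b_jj + <off-diag row i of A sorted desc, off-diag column j of B sorted asc>."""
    n = A.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        a_star = np.sort(np.delete(A[i, :], i))[::-1]     # row i of A without a_ii, decreasing
        for j in range(n):
            b_star = np.sort(np.delete(B[:, j], j))       # column j of B without b_jj, increasing
            W[i, j] = A[i, i] * B[j, j] + a_star @ b_star
    return W

# The extra linear cuts read, row by row, with Y standing for X B X^T:
#     A[i, :] @ Y[:, i] >= W[i, :] @ X[:, i],   i = 1, ..., n.
```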

On the other hand, we also note that although the SDP relaxation models proposed earlier in this section are much more concise and simpler than other SDP relaxations for QAPs in the literature, they might still involve intensive computation for large scale QAP instances where $n \ge 150$. We next discuss how to simplify these relaxation models to compute lower bounds for the underlying QAPs in the present work. To start, let us first examine the SDP model (16)-(22), where we impose the following weak constraints on the relation between the matrix Y and the assignment matrix X:

$$\begin{pmatrix} I & Z^T \\ Z & Y \end{pmatrix} \succeq 0, \quad Z = X\left( \frac{d}{n}E_n - \frac{2}{n}B \right).$$

If we replace the above constraint by $Y \succeq 0$, then we can simplify model (16)-(22) to the following:

$$\begin{array}{rl}
\min & \frac{d}{2}\mathrm{Tr}(AE) - \frac{n}{2}\mathrm{Tr}(AY) \qquad (31)\\[2pt]
\mathrm{s.t.} & y_{ii} = \frac{d}{n}, \quad \forall\, i \in \{1, \cdots, n\};\\[2pt]
 & \sum_{j=1}^{n} y_{ij} = 0, \quad \forall\, i \in \{1, \cdots, n\};\\[2pt]
 & y_{ij} \le y_{ii} - \frac{2}{n}, \quad \forall\, j \ne i \in \{1, \cdots, n\};\\[2pt]
 & Y \succeq 0, \quad I - Y \succeq 0.
\end{array}$$
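Model (31) is simple enough to state almost verbatim in CVXPY; the sketch below is ours (function name and solver are assumptions, and $d \ge 2$ is assumed so that $(d-2)/n \ge 0$). Note that only A and the dimension d enter: neither B nor X appears in the simplified model.

```python
import numpy as np
import cvxpy as cp

def simplified_hamming_bound(A, d):
    """Lower bound from the simplified model (31) for a QAP with the Hamming matrix of {0,1}^d."""
    n = 2 ** d
    E, I = np.ones((n, n)), np.eye(n)
    Y = cp.Variable((n, n), symmetric=True)
    off = 1.0 - I
    cons = [cp.diag(Y) == d / n,
            cp.sum(Y, axis=1) == 0,
            cp.multiply(off, Y) <= (d - 2) / n * off,
            Y >> 0, I - Y >> 0]
    obj = d / 2 * np.trace(A @ E) - n / 2 * cp.trace(A @ Y)
    prob = cp.Problem(cp.Minimize(obj), cons)
    prob.solve(solver=cp.SCS)
    return prob.value
```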

Due to the special structure of the Hamming distance matrix, we have the following interesting result.

Theorem 3.4. Let $\mu^2_{SDP}$ be the lower bound derived by solving (31) for QAPs with a Hamming distance matrix and $\mu^1_{SDP}$ be the bound derived from model (16). Then we have

$$\mu^1_{SDP} = \mu^2_{SDP}.$$

Proof. To prove the theorem, we note that due to the group symmetry in the Hamming distance matrix [18], one can show that there exist multiple optimal solutions $X_1, X_2, \cdots, X_n$ satisfying $\sum_{i=1}^{n} X_i = E$ (such an interesting fact was also discussed in our early work [22]). This implies that if we specialize the SDP constraints in model (16) to

$$\begin{pmatrix} I & Z^T \\ Z & Y \end{pmatrix} \succeq 0, \quad I - Y \succeq 0, \quad Z = \frac{E}{n}\left( \frac{d}{n}E_n - \frac{2}{n}B \right),$$

then the bound computed from the resulting SDP relaxation is tighter than the bound computed from model (16). Now, recalling Theorem 2.2, we have $Z = \frac{E}{n}\left( \frac{d}{n}E_n - \frac{2}{n}B \right) = 0$. Therefore, we can further reduce the above SDP constraints to

$$Y \succeq 0, \quad I - Y \succeq 0,$$

which are precisely the SDP constraints in the simplified model (31). Hence $\mu^2_{SDP} \ge \mu^1_{SDP}$; on the other hand, since model (31) is obtained from model (16) by dropping constraints, we also have $\mu^2_{SDP} \le \mu^1_{SDP}$, and the two bounds coincide. This finishes the proof of the theorem.

We also relax the SDP constraint in (27) to derive the following simple SDP:

$$\begin{array}{rl}
\min & \frac{k+l-2}{2}\mathrm{Tr}(AE) - \mathrm{Tr}\left(A(\bar{Y}_r + \bar{Y}_c)\right) \qquad (32)\\[2pt]
\mathrm{s.t.} & \mathrm{diag}(\bar{Y}_r) = \frac{k-1}{2}e, \quad \mathrm{diag}(\bar{Y}_c) = \frac{l-1}{2}e,\\[2pt]
 & \bar{Y}_r e = \frac{(k-1)n}{2}e - XB_r e;\\[2pt]
 & \bar{Y}_c e = \frac{(l-1)n}{2}e - XB_c e;\\[2pt]
 & \bar{y}^r_{ij} + \bar{y}^c_{ij} \le \frac{k+l-4}{2}, \quad \forall\, j \ne i \in \{1, \cdots, n\};\\[2pt]
 & \bar{Y}_r \succeq 0, \quad \bar{Y}_c \succeq 0,\\[2pt]
 & X \ge 0, \quad X^T e = Xe = e.
\end{array}$$
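Analogously, here is a sketch (ours) of the simplified model (32) for a full $k\times l$ grid; the grid matrices are built as in Section 2, and the function name and solver choice are assumptions.

```python
import numpy as np
import cvxpy as cp

def simplified_manhattan_bound(A, k, l):
    """Lower bound from the simplified model (32) for a QAP on a k x l rectangular grid."""
    n = k * l
    dist = lambda m: np.abs(np.subtract.outer(np.arange(m), np.arange(m))).astype(float)
    B_r = np.kron(dist(k), np.ones((l, l)))        # row distance matrix
    B_c = np.kron(np.ones((k, k)), dist(l))        # column distance matrix
    e, E, I = np.ones(n), np.ones((n, n)), np.eye(n)

    X = cp.Variable((n, n), nonneg=True)
    Yr = cp.Variable((n, n), symmetric=True)       # stands for \bar{Y}_r
    Yc = cp.Variable((n, n), symmetric=True)       # stands for \bar{Y}_c
    off = 1.0 - I
    cons = [cp.diag(Yr) == (k - 1) / 2, cp.diag(Yc) == (l - 1) / 2,
            cp.sum(Yr, axis=1) == (k - 1) * n / 2 - X @ B_r @ e,
            cp.sum(Yc, axis=1) == (l - 1) * n / 2 - X @ B_c @ e,
            cp.multiply(off, Yr + Yc) <= (k + l - 4) / 2 * off,
            Yr >> 0, Yc >> 0,
            cp.sum(X, axis=0) == 1, cp.sum(X, axis=1) == 1]
    obj = (k + l - 2) / 2 * np.trace(A @ E) - cp.trace(A @ (Yr + Yc))
    prob = cp.Problem(cp.Minimize(obj), cons)
    prob.solve(solver=cp.SCS)
    return prob.value
```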

Suppose that $(\bar{Y}_r, \bar{Y}_c)$ is a feasible solution to model (32) and let us define $\bar{Y}^* = \frac{k+l-2}{2}E_n - \bar{Y}_r - \bar{Y}_c$. One can easily see that the matrix $\bar{Y}^*$ also satisfies constraints (23)-(24) with respect to the matrix argument Y. Therefore, we have

Theorem 3.5. Let $\mu^*_{SDP}$ be the lower bound derived by solving (32) for QAPs with a Manhattan distance matrix. Let $\mu^*_{MSDR1}$ be the bound derived from the MSDR1 model in [9]. Then we have

$$\mu^*_{SDP} \ge \mu^*_{MSDR1}.$$

4 Numerical Experiments

In this section, we report some numerical results based on our models. Our experiments consist of two parts. In the first part, we report the bounds for QAPs with a Hamming distance matrix of a hypercube. In the second part, we present our numerical experiments for QAPs with a Manhattan distance matrix of rectangular grids.

For QAPs with a Hamming distance matrix, we test our model on four different choices of the matrix A. The first one is the matrix used in [30], defined by

$$a_{ij} = \frac{\Delta^3\sqrt{1-\rho^2}}{(2\pi\sigma^6)^{\frac{3}{2}}} \sum_{l=1}^{n} \exp\left( -\frac{\delta^2}{2\sigma^2}\left( (1-\rho^2)(i-n_1)^2 + (j-n_1-\rho(i-n_1))^2 + (l-n_1-\rho(i-n_1))^2 \right) \right),$$

where $n_1 = \frac{n+1}{2}$, $\Delta$ is the step size used to quantize the source, $\sigma$ is the variance of the zero-mean Gaussian Markov source, and $\rho$ is its correlation coefficient. In our experiment, we set the step size $\Delta = 0.4$, $\sigma = 1$ and $\rho = 0.1, 0.9$, respectively. These two different choices of $\rho$ represent the scenarios of a source with dense correlation and with non-dense correlation. The second test problem is the so-called Harper code [15] with

$$a_{ij} = |i - j|.$$

We also tested our model on a random matrix A and on the so-called vector quantization problem (denoted by VQ) [32] provided by our colleague Dr. Wu from McMaster University.

Our experiments are done on an AMD Opteron with a 2.4GHz CPU and 12 GB memory. We use the latest version of CVX [10] and SDPT3 [29] under Matlab R2007a to solve our problems. In the following tables, we list the lower bound (L-bound in the tables) computed from our model, and the upper bound (U-bound in the tables) obtained by using some heuristics from the QAP library. It should be pointed out that we use these heuristics to find an upper bound because, except for the case n = 16, no global solutions to the underlying problems have been reported in the literature. We also mention that in all the tables, the problem eng1 (eng9) refers to the engineering problem with $\rho = 0.1$ (0.9), respectively. Table 1 summarizes the results for the hypercube in $\mathbb{R}^d$ with d = 4, 5, 6, 7. Since the computational cost of solving model (16) is higher than that of the simplified model (31), while the lower bounds computed from these two models are essentially the same as shown in Theorem 3.4, we use only the latter model to compute the lower bounds; the CPU times are listed in seconds in the table.

From Table 1, we can see that except for the problems eng1 and eng9, the bounds provided by our model for the other test problems are very close to the upper bounds. This indicates our bound is very strong and close to the global optimal solution of the underlying problem. For the test problems eng1 and eng9, we observed that the relative gap between the lower and upper bounds increases as n increases. One possible explanation for this is that for large n, most elements of the matrix A usually have very small values. On the other hand, from its definition we know that the elements of Y will also have small absolute values. These two points might lead to some numerical instability when we solve the SDP relaxation.

Size     Bounds/CPU    Harper     eng1       eng9       rand       VQ
n=16     L-bound       2741.7     1.5452     .930857    733.5      244.11
         CPU(secs)     1          1          1          1          1
         U-bound       2752       1.58049    1.02017    745.2      245.27
n=32     L-bound       27328      1.24196    1.03724    6372       294.49
         CPU(secs)     3          4          3          3          3
         U-bound       27360      1.58528    1.40941    6486       297.29
n=64     L-bound       262160     .926658    .887776    60965      352.4
         CPU(secs)     56         56         68         54         45
         U-bound       262260     1.58297    1.43201    62035      353.5
n=128    L-bound       2446944    .881738    .846574    1163220    393.29
         CPU(secs)     1491       1688       2084       2937       2719
         U-bound       2479944    1.56962    1.43198    1177020    399.09

Table 1: Bounds for QAPs associated with hypercubes

We next report our experiments for the ESC problems from the QAP library [7] (we thank Kurt Anstreicher for reminding us that our model can be applied to the ESC instances, as pointed out in [3]). From Table 2 we can see that our lower bounds can be computed very efficiently.

We also mention that in all the tables, whenever all the elements of the associated matrices A and B have only integer values, we list the value obtained by rounding up the lower bound (obtained from the relaxation model) to the smallest integer above it. Comparing with Table 1 in [9], one can verify that our bound is usually stronger than that from MSDR3. In particular, for several instances such as ESC16a-c, ESC16h, ESC32b-d and ESC32h, our bound is very strong and comparable to the strongest bounds reported in the literature. On the other hand, we also would like to point out that in several cases our bound is not that strong. With a closer look at the data, we noted that for all the cases where our bound is not strong, the coefficient matrix A is very sparse and has only a few nonzero elements. Since we did not exploit the structure of the matrix A in our model, it is not surprising that we could not obtain strong bounds for these cases. For comparison, we also computed the bounds based on the enhanced model introduced in Section 3.3, which incorporates the linear constraints of the GLB model [11, 20] into the SDP model proposed in this work. The corresponding bounds and the CPU times needed to obtain them are listed in brackets in Table 2. As one can see from Table 2, the enhanced model is able to provide a better bound when A is very sparse, at the cost of more CPU time.

Prob.     L-Bound      CPU      U-Bound     Prob.     L-Bound      CPU           U-Bound
esc16a    57 (57)      1 (2)    68          esc32a    19 (35)      7 (7)         [103,130]
esc16b    284 (284)    1 (1)    292         esc32b    103 (103)    7 (8)         [132,168]
esc16c    135 (135)    1 (1)    160         esc32c    578 (578)    6 (7)         [616,642]
esc16d    4 (4)        4 (4)    16          esc32d    152 (152)    7 (7)         [191,200]
esc16e    18 (18)      1 (1)    28          esc32e    0 (0)        5 (6)         2
esc16g    19 (19)      1 (1)    26          esc32g    0 (0)        5 (6)         6
esc16h    927 (927)    1 (1)    996         esc32h    381 (381)    7 (8)         [424,438]
esc16i    0 (0)        1 (1)    14          esc64a    0 (47)       63 (79)       [98,116]
esc16j    1 (1)        1 (9)    8           esc128    0 (3)        2140 (2706)   [2,64]

Table 2: Bounds for the ESC instances

We now present our experiments on several large scale QAPs with a Manhattan distance matrix of rectangular grids. For these problems, the best known bounds were reported in [17], and we use the best known feasible solution from the QAP library as the upper bound. We list the best known lower and upper bounds in the last column of the table. Table 3 lists the results of both models (27) and (32) for QAPs of size from 42 to 100. As we can see from Table 3, the bound provided by the simple model (32) is quite strong and comparable with the bound from model (27). In a few cases, the bound obtained from the simple model (32) even improves over the best known bound for the underlying QAP in the literature.

            Model (27)            Model (32)
Problems    CPU (secs)   bound    bound     CPU (secs)    Best known bound
wil50       1357         46526    46467     22            [47098,48816]
sko42       611          14393    14377     12            [14934,15812]
sko49       1232         21449    21427     21            [22004,23386]
sko56       2631         31711    31641     34            [32610,34458]
sko64       4474         45013    44946     65            [45736,48498]
sko72       8866         61854    61764     135           [62691,66256]
sko81       15173        85194    85056     262           [86072,90998]
sko90       31213        108496   108312    489           [108493,115534]
sko100a     71419        143068   142843    760           [142668,152002]
sko100b     74268        144826   144571    754           [143872,153890]
sko100c     74271        139059   138870    939           [139402,147862]
sko100d     72979        140577   140385    865           [139898,149576]
sko100e     71658        140053   139823    884           [140105,149150]
sko100f     71638        140288   140077    921           [139452,149036]
wil100      73048        263710   263406    814           [263909,273038]

Table 3: Bounds for large scale QAPs: Part I

It should be pointed out that for QAP instances whose size is above 100, we could not apply model (27) due to memory restrictions. In the following table we list only the lower bound computed by solving the simple model (32). The first test problem is from [7], while the rest are from [27]. All the tested problems from [27] have a size of n = 200.

Problems    CPU (secs)    Bound       Best known bound
tho150      6801          7537980     [7620628,8133398]
stu1        34214         22425300    22642312
stu2        37795         14551200    14882423
stu3        37769         6807470     7095880
stu4        38657         1526710     1694716
stu5        43215         1828430     2000485
stu6        36910         9300150     9575346
stu7        35019         4289370     4488544
stu8        38659         952830      1089094
stu9        35078         12002100    12347759
stu10       39656         7534160     7865751
stu11       39537         3447140     3685347
stu12       37766         719307      845597
stu13       38707         6404970     6660709
stu14       36881         3990840     4251108
stu15       43202         1828430     2000485
stu16       44020         354328      437802

Table 4: Bounds for large scale QAPs: Part II

We also tested our models on QAPs with a Manhattan distance matrix of rectangular grids whose size is below 40. These QAP instances were solved on a 2.67GHz Intel Core 2 computer with 4GB memory. We compare our results with the method in [9]. We should mention that very strong bounds for the underlying problems have been obtained by using vector-lifting SDP relaxations [25, 33]. As reported in [25, 9], the bound from SDR3 (we use the same notation as in [9]) is the strongest compared with other SDP relaxations in the literature, but these vector-lifting SDP relaxations usually involve intensive computation and cannot be handled effectively by most open source SDP solvers based on interior-point methods. Therefore, we only compare our algorithm with the algorithm in [9].

As we can see from Table 5,^4 except for a few cases, the bounds provided by models (27) and (32) are tighter than the bound provided by MSDR3, and the bounds from models (27) and (32) are very close to each other, while the simple model (32) is the most efficient.

^4 We mention that the bounds we obtained from MSDR3 are slightly different from those reported in [9]. This is because in [9] the authors applied their model to two scenarios by swapping A and B, and selected the best bound from the two. However, since our paper elaborates more on the structure of B, we could not exploit such a possibility. Consequently, for consistency, we only list the bound obtained in our experiment.

            Model (27)           Model (32)          MSDR3
Problems    CPU     bound        Bound    CPU        CPU      Bound     Optimum
Nug12       2       509          509      1          6        501       578
Nug14       3       930          927      1          15       917       1014
Nug15       4       1044         1041     1          24       1016      1150
Nug16a      5       1439         1433     1          50       1460      1610
Nug16b      4       1102         1102     1          42       1085      1240
Nug17       5       1527         1520     1          55       1548      1732
Nug18       7       1719         1715     1          80       1725      1930
Nug20       9       2299         2298     1          174      2287      2570
Nug21       12      2163         2152     1          204      2095      2438
Nug22       15      3226         3185     1          258      3118      3596
Nug24       19      3114         3106     1          507      3055      3488
Nug25       21      3331         3326     1          668      3290      3744
Nug27       38      4727         4690     1          619      4602      5234
Nug28       40      4656         4639     1          1314     4553      5166
Nug30       59      5490         5478     2          2122     5418      6174
scr12       2       28110        28110    1          8        18766     31410
scr15       5       43823        43583    1          25       23735     51140
scr20       11      86181        85964    1          136      42114     110030
tho30       55      128815       127943   2          1538     118010    149936
Ste36a      129     7108         7010     4          4382     4423      9526
tho40       212     206622       205571   5          12103    183150    [214218,240516]

Table 5: Bounds for small scale QAPs

5 Conclusions

In this paper, we have proposed new SDP relaxations for QAPs associated with a Hamming distance matrix of the hypercube in a finite dimensional space, or a Manhattan distance matrix of rectangular grids. By exploiting the special structure of the underlying distance matrix (denoted by B), we show that for a properly chosen parameter $\alpha$ depending on the size of the hypercube or the rectangular grid, the matrix $\alpha E - B$ is positive semidefinite. This leads to a new way to relax the underlying QAP to an SDP. We show that, compared with other SDP relaxations of similar size in the literature, tighter bounds can be computed much more efficiently based on the proposed models. Promising experiments confirmed our theoretical conclusions.

There are several different ways to extend our results. First of all, it is of interest to investigate whether the approach proposed in the present work can be adapted to derive new SDP relaxations for general QAPs. Secondly, given the efficiency of the models, it is possible to develop an effective branch-and-bound type method for solving the underlying QAPs. Thirdly, we note that at the optimal solution of the original QAP, the matrix $XBX^T$ has low rank (rank d for the case of the Hamming distance, and k + l - 2 for the Manhattan distance case). Therefore, it will be interesting to investigate how to find a low rank solution to the relaxed SDP. We also observed in our experiments that when the matrix A is very sparse, the bound from our model might not be very strong. It will be interesting to explore how we can incorporate the sparse structure of the underlying matrix into our SDP relaxation model. Further study is needed to address these issues.

Acknowledgement: We would like to thank Yichuan Ding and Henry Wolkowicz for generously sharing their code and for their support in the preparation of this paper. We also thank Thomas Stuetzle for providing us with the randomly generated QAP instances.

References

[1] W.P. Adams and T.A. Johnson. Improved linear programming-based lower bounds for the quadratic assignment problem. In P.M. Pardalos and H. Wolkowicz, editors, Quadratic Assignment and Related Problems, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 25, pages 43-75, 1994.

[2] W.P. Adams and H.D. Sherali. A tight linearization and an algorithm for zero-one quadratic programming problems. Management Science, 32, 1274-1290, 1986.


[3] K.M. Anstreicher. Recent advances in the solution of quadratic assignment problems. Mathematical Programming (Series B), 97, 27-42, 2003.

[4] K.M. Anstreicher and N.W. Brixius. A new bound for the quadratic assignment problem based on convex quadratic programming. Mathematical Programming, 80, 341-357, 2001.

[5] K.M. Anstreicher, N.W. Brixius, J.-P. Goux, and J. Linderoth. Solving large quadratic assignment problems on computational grids. Mathematical Programming, 91(3), 563-588, 2002.

[6] G. Ben-David and D. Malah. Bounds on the performance of vector-quantizers under channel errors. IEEE Transactions on Information Theory, 51(6), 2227-2235, 2005.

[7] R.E. Burkard, E. Çela, S.E. Karisch and F. Rendl. QAPLIB - A quadratic assignment problem library. Journal of Global Optimization, 10, 391-403, 1997.

[8] N.T. Cheng and N.K. Kingsburg. Robust zero-redundancy vector quantization for noisy channels. In Proceedings of the International Conference on Communications, Boston, 1338-1342, 1989.

[9] Y. Ding and H. Wolkowicz. A matrix-lifting semidefinite relaxation for the quadratic assignment problem. Technical report CORR 06-22, University of Waterloo, Ontario, Canada, October 2006.

[10] M. Grant, S. Boyd and Y. Ye. CVX Users' Guide, 2007. http://www.stanford.edu/~boyd/cvx/cvx usrguide.pdf

[11] P.C. Gilmore. Optimal and suboptimal algorithms for the quadratic assignment problem. SIAM Journal on Applied Mathematics, 10, 305-313, 1962.

[12] S.W. Hadley, F. Rendl and H. Wolkowicz. A new lower bound via projection for the quadratic assignment problem. Mathematics of Operations Research, 17, 727-739, 1992.

[13] P. Hahn and T. Grant. Lower bounds for the quadratic assignment problem based upon a dual formulation. Operations Research, 46, 912-922, 1998.

[14] M. Hannan and J.M. Kurtzberg. Placement techniques. In Design Automation of Digital Systems: Theory and Techniques, M.A. Breuer, Ed., vol. 1, Prentice-Hall: Englewood Cliffs, 213-282, 1972.

[15] L.H. Harper. Optimal assignments of numbers to vertices. Journal of the Society for Industrial and Applied Mathematics, 12, 131-135, 1964.

[16] R.A. Horn and C.R. Johnson. Topics in Matrix Analysis. Cambridge University Press, 1991.


[17] S.E. Karisch and F. Rendl. Lower bounds for the quadratic assignment problem via triangle decompositions. Mathematical Programming, 71(2), 137-152, 1995.

[18] E. de Klerk and R. Sotirov. Exploiting group symmetry in semidefinite programming relaxations of the quadratic assignment problem. Technical Report, Department of Econometrics and OR, Tilburg University, Tilburg, The Netherlands, 2007.

[19] T. Koopmans and M. Beckmann. Assignment problems and the location of economic activities. Econometrica, 25, 53-76, 1957.

[20] E. Lawler. The quadratic assignment problem. Management Science, 9, 586-599, 1963.

[21] E.M. Loiola, N.M. Maia de Abreu, P.O. Boaventura-Netto, P. Hahn, and T. Querido. A survey for the quadratic assignment problem. European Journal of Operational Research, 176(2), 657-690, 2007.

[22] H. Mittelmann, J. Peng and X. Wu. Graph modeling for quadratic assignment problem associated with the hypercube. Technical Report, Department of IESE, UIUC, Illinois, 2007.

[23] C.E. Nugent, T.E. Vollman, and J. Ruml. An experimental comparison of techniques for the assignment of facilities to locations. Operations Research, 16, 150-173, 1968.

[24] L.C. Potter and D.M. Chiang. Minimax nonredundant channel coding. IEEE Transactions on Communications, 43(2/3/4), 804-811, 1995.

[25] F. Rendl and R. Sotirov. Bounds for the quadratic assignment problem using the bundle method. Mathematical Programming, Series B, 109(2-3), 505-524, 2007.

[26] M. Scriabin and R.C. Vergin. Comparison of computer algorithms and visual based methods for plant layout. Management Science, 22, 172-187, 1975.

[27] T. Stutzle and S. Fernandes. New benchmark instances for the QAP and the experimental analysis of algorithms. In J. Gottlieb and G. Raidl, editors, Evolutionary Computation in Combinatorial Optimization: 4th European Conference, EvoCOP 2004, volume 3004 of Lecture Notes in Computer Science, pages 199-209, Springer Verlag, Berlin, Germany, 2004.

[28] E.D. Taillard. Comparison of iterative searches for the quadratic assignment problem. Location Science, 3, 87-105, 1995.

[29] K.C. Toh, R.H. Tutuncu and M. Todd. On the implementation and usage of SDPT3. 2006. http://www.math.nus.edu.sg/~mattohkc/guide4-0-draft.pdf


[30] X. Wang, X. Wu and S. Dumitrescu. On optimal index assignment for MAP decoding of Markov sequences. Technical report, Department of Electrical and Computer Engineering, McMaster University, Ontario, Canada, 2006.

[31] K. Zeger and A. Gersho. Pseudo-Gray code. IEEE Transactions on Communications, 38(12), 2147-2158, 1990.

[32] X. Wu. Private communication. Department of Electrical and Computer Engineering, McMaster University, Ontario, Canada, 2007.

[33] Q. Zhao, S.E. Karisch, F. Rendl, and H. Wolkowicz. Semidefinite programming relaxations for the quadratic assignment problem. Journal of Combinatorial Optimization, 2, 71-109, 1998.
