
On Equivalence of Semidefinite Relaxations for Quadratic Matrix Programming ∗

Yichuan Ding † Dongdong Ge ‡ Henry Wolkowicz §

April 28, 2010

Contents

1 Introduction 2
  1.1 Preliminaries 2
  1.2 Outline 4

2 Quadratic Matrix Programming: Case I 4
  2.1 Lagrangian Relaxation 4
  2.2 Equivalence of Vector and Matrix Lifting for QMP1 5

3 Quadratic Matrix Programming: Case II 8
  3.1 Equivalence of Vector and Matrix Lifting for QMP2 9
    3.1.1 Proof of (Main) Theorem 3.1 12
    3.1.2 Unbalanced Orthogonal Procrustes Problem 17
  3.2 An Extension to QMP with Conic Constraints 18
    3.2.1 Graph Partition Problem 18

4 Conclusion 19

References 23

Index 23

Abstract

In this paper, we analyze two popular semidefinite programming (SDP) relaxations for quadratically constrained quadratic programs (QCQP) with matrix variables. These are based on vector-lifting and on matrix-lifting and are of different size and expense. We prove, under mild assumptions, that these two relaxations provide equivalent bounds. Thus, our results provide a theoretical guideline for how to choose an inexpensive SDP relaxation and still obtain a strong bound. Our results also shed important insights on how to simplify large-scale SDP constraints by exploiting the particular sparsity pattern.

∗ Technical report CORR 2010-02; URL: orion.math.uwaterloo.ca/˜hwolkowi/reports/ABSTRACTS.html
† Department of Management Science & Engineering, Stanford University. E-mail [email protected]
‡ Antai College of Economics and Management, Shanghai Jiao Tong University. E-mail [email protected]
§ Research supported by Natural Sciences Engineering Research Council Canada. E-mail [email protected]


Keywords: Semidefinite Programming Relaxations, Quadratic Matrix Programming, Quadratically Constrained Quadratic Programming, Hard Combinatorial Problems.

1 Introduction

In this paper, we aim to provide theoretical insights on how to compare different semidefinite programming (SDP) relaxations. In particular, we study a vector-lifting relaxation, compare it to a significantly smaller matrix-lifting relaxation, and show that the resulting two bounds are equal.

Many hard combinatorial problems can be formulated as a quadratically constrained quadratic program (QCQP) with matrix variables. If the resulting formulated problem is nonconvex, then SDP relaxations provide an efficient and successful approach for computing approximate solutions and strong bounds. Finding strong and inexpensive bounds is essential for branch and bound algorithms for solving large hard combinatorial problems. However, there can be many different SDP relaxations for the same problem, and it is usually not obvious which relaxation is overall optimal with regard to both computational efficiency and bound quality, e.g., [12].

For examples of using SDP relaxations for QCQP arising from hard problems, see e.g., quadraticassignment (QAP) [11, 12, 23, 26, 35], graph partitioning (GPP) [34], sensor network localization(SNL) [9, 10, 21], and the more general Euclidean distance matrix completions [1].

1.1 Preliminaries

The concept of quadratic matrix programming (QMP) was introduced by Beck in [6], where it refers to a special instance of QCQP with matrix variables. Because we include the study of more general problems, we denote the model discussed in [6] as the first case of quadratic matrix programming, denoted (QMP1),

(QMP1)
    µ*_{P1} := min  trace(X^T Q_0 X) + 2 trace(C_0^T X) + β_0
    s.t.  trace(X^T Q_j X) + 2 trace(C_j^T X) + β_j ≤ 0,  j = 1, 2, ..., m
          X ∈ M^{nr},

where M^{nr} denotes the set of n × r matrices, Q_j ∈ S^n, j = 0, 1, ..., m, S^n is the space of n × n symmetric matrices, and C_j ∈ M^{nr}. Throughout this paper, we use the trace inner-product (dot product) C · X := trace C^T X.

The applicability of QMP1 is limited when compared to the more general class QCQP.

However, many applications use QCQP models in the form of QMP1, e.g., robust optimization [7] and sensor network localization [1]. In addition, many combinatorial problems are formulated with orthogonality constraints in one of the two forms

    XX^T = I,    X^T X = I.    (1)

When X is square, the two constraints in (1) are equivalent to each other, in theory. However, relaxations that include both forms of the constraints, rather than just one, can be expected to obtain stronger bounds. For example, Anstreicher and Wolkowicz [4] proved that strong duality holds for a certain relaxation of QAP when both forms of the orthogonality constraints in (1) are included; however, there can be a duality gap if only one of the forms is used. Motivated by this result, we extend our scope of problems so that the objective and constraint functions can include


both forms of quadratic terms X^T Q_j X and X P_j X^T. We now define the second case of quadratic matrix programming (QMP2) problems as

(QMP2)
    min  trace(X^T Q_0 X) + trace(X P_0 X^T) + 2 trace(C_0^T X) + β_0
    s.t.  trace(X^T Q_j X) + trace(X P_j X^T) + 2 trace(C_j^T X) + β_j ≤ 0,  j = 1, ..., m
          X ∈ M^{nr},    (2)

where Q_j, P_j are symmetric matrices of appropriate sizes. Both QMP1 and QMP2 can be vectorized into the QCQP form using

    trace(X^T Q X) = vec(X)^T (I_r ⊗ Q) vec(X),    trace(X P X^T) = vec(X)^T (P ⊗ I_n) vec(X),    (3)

where ⊗ denotes the Kronecker product, and vec(X) vectorizes X by stacking the columns of X on top of each other. The difference in the Kronecker products (I_r ⊗ Q), (P ⊗ I_n) shows that there is a difference in the corresponding Lagrange multipliers and illustrates why the bounds from Lagrangian relaxation will be different for these two sets of constraints. The SDP relaxation for the vectorized QCQP is called the vector-lifting semidefinite relaxation (VSDR). Under a constraint qualification assumption, VSDR for QCQP is equivalent to the dual of the classical Lagrangian relaxation, see e.g., [24, 33, 5].

From (3), we get

    trace(X^T Q X) = trace((I_r ⊗ Q) Y),  if Y = vec(X) vec(X)^T
    trace(X P X^T) = trace((P ⊗ I_n) Y),  if Y = vec(X) vec(X)^T.    (4)

VSDR is derived using (4) with the relaxation Y ⪰ vec(X) vec(X)^T. A Schur complement argument, e.g., [25, 22], implies the equivalence of this relaxation to the large matrix variable constraint

    [ 1        vec(X)^T
      vec(X)   Y        ] ⪰ 0.

A similar result holds for trace(X P X^T) = vec(X)^T (P ⊗ I_n) vec(X).
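The vectorization identities (3)-(4) are easy to check numerically. The following numpy snippet (an illustrative sketch on random data, not part of the paper) compares both sides of each identity; note that vec(X) stacks columns, which in numpy is `reshape` with Fortran order.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 4, 3
X = rng.standard_normal((n, r))
Q = rng.standard_normal((n, n)); Q = (Q + Q.T) / 2   # Q in S^n
P = rng.standard_normal((r, r)); P = (P + P.T) / 2   # P in S^r

x = X.reshape(-1, order='F')          # vec(X): stack columns
Y = np.outer(x, x)                    # Y = vec(X) vec(X)^T

# identity (3): quadratic forms via Kronecker products
lhs1 = np.trace(X.T @ Q @ X)
rhs1 = x @ np.kron(np.eye(r), Q) @ x
lhs2 = np.trace(X @ P @ X.T)
rhs2 = x @ np.kron(P, np.eye(n)) @ x
assert np.isclose(lhs1, rhs1) and np.isclose(lhs2, rhs2)

# identity (4): the same values as trace inner products with Y
assert np.isclose(lhs1, np.trace(np.kron(np.eye(r), Q) @ Y))
assert np.isclose(lhs2, np.trace(np.kron(P, np.eye(n)) @ Y))
print("identities (3)-(4) verified")
```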

Alternatively, from (3), we get the smaller system

    trace(X^T Q X) = trace(Q Y),  if Y = X X^T
    trace(X P X^T) = trace(P Y),  if Y = X^T X.    (5)

The matrix-lifting semidefinite relaxation MSDR is derived using (5) with the relaxation Y ⪰ X X^T. A Schur complement argument now implies the equivalence of this relaxation to the smaller matrix variable constraint

    [ I   X^T
      X   Y   ] ⪰ 0.

Again, a similar result holds for trace(X P X^T).
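The matrix-lifting identities (5) admit the same kind of numerical check; this illustrative numpy sketch also confirms the Schur complement fact that Y ⪰ XX^T makes the blocked matrix positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 3
X = rng.standard_normal((n, r))
Q = rng.standard_normal((n, n)); Q = (Q + Q.T) / 2
P = rng.standard_normal((r, r)); P = (P + P.T) / 2

# identity (5): the liftings Y = X X^T and Y = X^T X
assert np.isclose(np.trace(X.T @ Q @ X), np.trace(Q @ (X @ X.T)))
assert np.isclose(np.trace(X @ P @ X.T), np.trace(P @ (X.T @ X)))

# Schur complement: Y - X X^T PSD  <=>  [[I_r, X^T], [X, Y]] PSD
Y = X @ X.T + np.eye(n)                   # Y - X X^T = I is PSD
Z = np.block([[np.eye(r), X.T], [X, Y]])
assert np.linalg.eigvalsh(Z).min() >= -1e-10
print("identities (5) and Schur complement check verified")
```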

Intuitively, one expects that VSDR should provide stronger bounds than MSDR. Beck [6] proved that VSDR is actually equivalent to MSDR for QMP1 if both SDP relaxations attain optimality and have a zero duality gap, e.g., when a constraint qualification, such as the Slater condition, holds for the dual program. In this paper we strengthen the above result by dropping the constraint qualification assumption. Then we present our main contribution, i.e., we show the equivalence between MSDR and VSDR for the more general problem QMP2 under a constraint qualification. This result is of more interest because QMP2 does not possess the same nice structure (chordal pattern) as QMP1. Moreover, QMP2 encompasses a much richer class of problems and therefore has more significant applications.


1.2 Outline

In Section 2 we present the equivalence of the corresponding VSDR and MSDR formulations for QMP1 and prove Beck's result without the constraint qualification assumption, see Theorem 2.1. Section 3 proves the main result that VSDR and MSDR generate equivalent lower bounds for QMP2, under a constraint qualification assumption, see Theorem 3.1. Numerical tests are included. Section 4 provides concluding remarks.

2 Quadratic Matrix Programming: Case I

We first discuss the two relaxations for QMP1. We denote the matrices in the relaxations obtained from vector and matrix lifting by

    M(q^V_j(·)) := [ β_j        vec(C_j)^T
                     vec(C_j)   I_r ⊗ Q_j  ],

    M(q^M_j(·)) := [ (β_j/r) I_r   C_j^T
                     C_j           Q_j   ].

We let

    y = ( x_0
          vec(X) ) ∈ R^{nr+1},    Y = ( X_0
                                        X   ) ∈ M^{(r+n)×r},

and denote the quadratic and homogenized quadratic functions

    q_j(X) := trace(X^T Q_j X) + 2 trace(C_j^T X) + β_j

    q^V_j(X, x_0) := trace(X^T Q_j X) + 2 trace(C_j^T X x_0) + β_j x_0^2
                   = y^T M(q^V_j(·)) y

    q^M_j(X, X_0) := trace(X^T Q_j X) + 2 trace(X_0^T C_j^T X) + trace((β_j/r) X_0^T I_r X_0)
                   = trace Y^T M(q^M_j(·)) Y.

2.1 Lagrangian Relaxation

As mentioned above, under a constraint qualification, VSDR for QCQP is equivalent with the dual of the classical Lagrangian relaxation. We include this result for completeness and to illustrate the role of a constraint qualification in the relaxation. We follow the approach in [24, Pg. 403] and use the strong duality of the trust region subproblem [31] to obtain the Lagrangian relaxation (or dual) for QMP1 as an SDP.

    µ*_L := max_{λ≥0} min_X  q_0(X) + Σ_{j=1}^m λ_j q_j(X)

          = max_{λ≥0} min_{X, x_0^2=1}  q^V_0(X, x_0) + Σ_{j=1}^m λ_j q^V_j(X, x_0)

          = max_{λ≥0, t} min_y  y^T ( M(q^V_0(·)) + Σ_{j=1}^m λ_j M(q^V_j(·)) ) y + t(1 − x_0^2)

          = max_{λ≥0, t} min_y  trace ( M(q^V_0(·)) + Σ_{j=1}^m λ_j M(q^V_j(·)) ) y y^T + t(1 − x_0^2)

          = (DVSDR1)
            max  t
            s.t. [ t  0
                   0  0 ] − Σ_{j=1}^m λ_j M(q^V_j(·)) ⪯ M(q^V_0(·))
                 λ ∈ R^m_+,  t ∈ R.    (6)

As illustrated in (6), the Lagrangian relaxation is the dual program (denoted by DVSDR1) of the vector-lifting relaxation VSDR1 given below. Hence, under a constraint qualification, the Lagrangian relaxation is equivalent with the vector-lifting semidefinite relaxation. The usual constraint qualification is the Slater condition, i.e.,

    ∃ λ ∈ R^m_+,  s.t.  M(q^V_0(·)) + Σ_{j=1}^m λ_j M(q^V_j(·)) ≻ 0.    (7)

2.2 Equivalence of Vector and Matrix Lifting for QMP1

Recall that the dot product refers to the trace inner-product, C · X = trace C^T X. The vector-lifting relaxation is

(VSDR1)
    µ*_{V1} := min  M(q^V_0(·)) · Z_V
    s.t.  M(q^V_j(·)) · Z_V ≤ 0,  j = 1, 2, ..., m
          (Z_V)_{1,1} = 1
          Z_V ⪰ 0.

Thus the constraint matrix is blocked as

    Z_V = [ 1        vec(X)^T
            vec(X)   Y_V      ].

The matrix-lifting relaxation is

(MSDR1)
    µ*_{M1} := min  M(q^M_0(·)) · Z_M
    s.t.  M(q^M_j(·)) · Z_M ≤ 0,  j = 1, 2, ..., m
          (Z_M)_{1:r,1:r} = I_r
          Z_M ⪰ 0.

Thus the constraint matrix is blocked as

    Z_M = [ I_r  X^T
            X    Y_M ].

VSDR1 is obtained by relaxing the quadratic equality constraint Y_V = vec(X) vec(X)^T to Y_V ⪰ vec(X) vec(X)^T, and then formulating this as Z_V = [ 1 vec(X)^T ; vec(X) Y_V ] ⪰ 0. MSDR1 is obtained by relaxing the quadratic equality constraint Y_M = X X^T to Y_M ⪰ X X^T and then reformulating this to the linear conic constraint Z_M = [ I_r X^T ; X Y_M ] ⪰ 0. VSDR1 involves O((nr)^2) variables and O(m) constraints, which is often at the complexity of O(nr), whereas the smaller problem MSDR1 has only O((n+r)^2) variables. The equivalence of the relaxations using vector and matrix-liftings is proved in [6, Theorem 4.3] by assuming a constraint qualification for the dual programs. We now present our first main result and prove the above mentioned equivalence without any constraint qualification assumptions. The proof is of interest in itself in that we use the chordal property and matrix completions to connect the two relaxations.

Theorem 2.1. As numbers in the extended real line [−∞, +∞], the optimal values of the two relaxations obtained using vector and matrix-liftings are equal, i.e.,

    µ*_{V1} = µ*_{M1}.

Proof. The proof follows by showing that both VSDR1 and MSDR1 generate the same optimal value as the following program.

(VSDR'1)
    µ*_{V1'} := min  Q_0 · Σ_{i=1}^r Y_ii + 2 C_0 · X + β_0
    s.t.  Q_j · Σ_{i=1}^r Y_ii + 2 C_j · X + β_j ≤ 0,  j = 1, 2, ..., m

          Z_jj = [ 1    x_j^T
                   x_j  Y_jj  ] ⪰ 0,  j = 1, 2, ..., r,

where x_j, j = 1, 2, ..., r, are the columns of the matrix X, and Y_jj, j = 1, 2, ..., r, represent the corresponding quadratic parts x_j x_j^T.

We first show that the optimal values of VSDR1 and VSDR'1 are equal, i.e., that

    µ*_{V1} = µ*_{V1'}.    (8)

The equivalence of the two optimal values can be established by showing, for each feasible solution of each program, that one can always construct a corresponding feasible solution to the other program with the same objective value.

First, suppose VSDR'1 has a feasible solution Z_jj = [ 1 x_j^T ; x_j Y_jj ], j = 1, 2, ..., r. Construct the partial symmetric matrix

    Z_V = [ 1    x_1^T  x_2^T  ...  x_r^T
            x_1  Y_11   ?      ...  ?
            x_2  ?      Y_22   ...  ?
            ...  ?      ?      ...  ?
            x_r  ?      ?      ...  Y_rr ],

where the entries denoted by '?' are unknown/unspecified. By observation, the unspecified entries of Z_V are not involved in the constraints or in the objective function of VSDR1. In other words, adding values to the unspecified positions will not change the constraint function values and the objective value. Therefore, any positive semidefinite completion of the partial matrix Z_V is feasible for VSDR1 and has the same objective value. The feasibility of Z_jj (j = 1, 2, ..., r) for VSDR'1 implies [ 1 x_j^T ; x_j Y_jj ] ⪰ 0 for each j = 1, 2, ..., r. So all the specified principal submatrices of Z_V are positive semidefinite, and hence Z_V is a partial positive semidefinite matrix (see [2, 15, 16, 18, 32] for the specific definitions of partial positive semidefinite matrix, chordal graph, and semidefinite completion). It is not difficult to verify the chordal graph property for the sparsity pattern of Z_V. Therefore Z_V has a positive semidefinite completion by the classical completion result [15, Theorem 7]. Thus we have constructed a feasible solution to VSDR1 with the same objective value as the feasible solution from VSDR'1, i.e., this shows that µ*_{V1} ≤ µ*_{V1'}.

Conversely, suppose VSDR1 has a feasible solution

    Z_V = [ 1    x_1^T  x_2^T  ...  x_r^T
            x_1  Y_11   Y_12   ...  Y_1r
            x_2  Y_21   Y_22   ...  Y_2r
            ...  ...    ...    ...  ...
            x_r  Y_r1   Y_r2   ...  Y_rr ] ⪰ 0.

Now we construct Z_jj := [ 1 x_j^T ; x_j Y_jj ], j = 1, 2, ..., r. Because each Z_jj is a principal submatrix of the positive semidefinite matrix Z_V, we have Z_jj ⪰ 0. The feasibility of Z_V for VSDR1 also implies

    M(q^V_i(·)) · Z_V ≤ 0,  i = 1, 2, ..., m.    (9)

It is easy to check that

    Q_i · Σ_{j=1}^r Y_jj + 2 C_i · X + β_i = M(q^V_i(·)) · Z_V ≤ 0,  i = 1, 2, ..., m,    (10)

where X = [ x_1 x_2 ... x_r ]. Therefore, Z_jj, j = 1, 2, ..., r, is feasible for VSDR'1, and also generates the same objective value for VSDR'1 as Z_V for VSDR1 by (10), i.e., this shows that µ*_{V1} ≥ µ*_{V1'}. This completes the proof of (8).

Next we prove that the optimal values of MSDR1 and VSDR'1 are equal, i.e., that

    µ*_{M1} = µ*_{V1'}.    (11)

The proof is similar to the one for (8). First suppose VSDR'1 has a feasible solution Z_jj = [ 1 x_j^T ; x_j Y_jj ], j = 1, 2, ..., r. Let X = [ x_1 x_2 ... x_r ], and Y_M = Σ_{j=1}^r Y_jj. Now we construct

    Z_M := [ I_r  X^T
             X    Y_M ].

Then by Z_jj ⪰ 0, j = 1, 2, ..., r, we have Y_M = Σ_{j=1}^r Y_jj ⪰ Σ_{j=1}^r x_j x_j^T = X X^T, which implies Z_M ⪰ 0 by the Schur complement [22, 25]. And, because

    M(q^M_j(·)) · Z_M = Q_j · Σ_{i=1}^r Y_ii + 2 C_j · X + β_j,  j = 1, ..., m,    (12)

we get that Z_M is feasible for MSDR1 and generates the same objective value as the one generated by Z_jj, j = 1, 2, ..., r, for VSDR'1, i.e., µ*_{M1} ≤ µ*_{V1'}.

Conversely, suppose Z_M = [ I_r X^T ; X Y_M ] ⪰ 0 is feasible for MSDR1, and X = [ x_1 x_2 ... x_r ]. Let Y_ii = x_i x_i^T for i = 1, 2, ..., r−1, and let Y_rr = x_r x_r^T + (Y_M − X X^T). As a result, Y_ii ⪰ x_i x_i^T for i = 1, 2, ..., r, and Σ_{i=1}^r Y_ii = Y_M. So, by constructing Z_jj = [ 1 x_j^T ; x_j Y_jj ], j = 1, 2, ..., r, it is easy to show that Z_jj is feasible for VSDR'1 and generates an objective value equal to the objective value of MSDR1 with Z_M, i.e., µ*_{M1} ≥ µ*_{V1'}. This completes the proof of (11). Combining this with (8) completes the proof of the Theorem.
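The converse construction in the proof, splitting Y_M into blocks Y_ii with the last block absorbing the slack Y_M − XX^T, can be checked numerically. The following is an illustrative numpy sketch under that construction, with a randomly generated feasible Z_M:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 4, 3
X = rng.standard_normal((n, r))
YM = X @ X.T + np.eye(n)              # feasible: YM - X X^T is PSD

# construction from the proof: Yii = xi xi^T for i < r,
# and Yrr = xr xr^T + (YM - X X^T)
Y = [np.outer(X[:, i], X[:, i]) for i in range(r - 1)]
Y.append(np.outer(X[:, r - 1], X[:, r - 1]) + (YM - X @ X.T))

# each Zjj = [[1, xj^T], [xj, Yjj]] is PSD, and the blocks sum to YM
for j in range(r):
    Zjj = np.block([[np.ones((1, 1)), X[:, [j]].T], [X[:, [j]], Y[j]]])
    assert np.linalg.eigvalsh(Zjj).min() >= -1e-10
assert np.allclose(sum(Y), YM)
print("block splitting from the proof of Theorem 2.1 verified")
```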


Though MSDR1 is significantly less expensive, Theorem 2.1 implies that the quality of the MSDR1 bound is no weaker than that from VSDR1. This tells us to always choose MSDR1 if the problem can be formulated as in QMP1.

Example 2.1 (Sensor Network Localization Problem). The Sensor Network Localization, SNL, problem is one of the most studied problems in Graph Realization, e.g., [20, 21, 30]. In this problem one is given a graph with m known points (anchors) a_k ∈ R^d, k = 1, 2, ..., m, and n unknown points (sensors) x_j ∈ R^d, j = 1, 2, ..., n, where d is the embedding dimension. A Euclidean distance d_kj between a_k and x_j, or a distance d_ij between x_i and x_j, is also given for some pairs of points. The goal is to seek estimates of the positions of all unknown points. One possible formulation of the problem is as follows.

    min  0
    s.t.  trace(X^T (E_ii + E_jj − 2E_ij) X) = d_ij,  ∀ (i,j) ∈ N_x
          trace(X^T E_ii X) − 2 trace([ a_j^T 0 ] X) + a_j^T a_j = d_ij,  ∀ (i,j) ∈ N_a
          X ∈ M^{nr},    (13)

where N_x, N_a refer to the sets of known distances. This formulation is a QMP1, so we can develop both its VSDR1 and MSDR1 relaxations.

    min  0
    s.t.  (I ⊗ (E_ii + E_jj − 2E_ij)) · Y = d_ij,  ∀ (i,j) ∈ N_x
          (I ⊗ E_ii) · Y − 2 a_j^T x_i + a_j^T a_j = d_ij,  ∀ (i,j) ∈ N_a

          [ 1  x^T
            x  Y   ] ⪰ 0    (14)

    min  0
    s.t.  (E_ii + E_jj − 2E_ij) · Y = d_ij,  ∀ (i,j) ∈ N_x
          E_ii · Y − 2 [ a_j^T 0 ] · X + a_j^T a_j = d_ij,  ∀ (i,j) ∈ N_a

          [ I  X^T
            X  Y   ] ⪰ 0    (15)

Theorem 2.1 implies that the MSDR1 relaxation always provides the same lower bound as the VSDR1 one, while the number of variables for MSDR1, O((n+d)^2), is significantly smaller than the number for VSDR1, O(n^2 d^2). The quality of the bounds combined with a lower computational complexity explains why MSDR1 is a favourite relaxation for researchers.
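The size gap is easy to quantify. A quick illustrative computation of the lifted matrix dimensions, for assumed sample values of n = 1000 sensors and embedding dimension d = 2 (not an instance from the paper):

```python
# dimension of the lifted PSD matrix in each relaxation
n, d = 1000, 2                       # sensors, embedding dimension
vsdr_dim = n * d + 1                 # Z_V is (nd+1) x (nd+1)
msdr_dim = n + d                     # Z_M is (n+d) x (n+d)
print(vsdr_dim ** 2, msdr_dim ** 2)  # O(n^2 d^2) vs O((n+d)^2) entries
```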

3 Quadratic Matrix Programming: Case II

In this section, we move to the main topic of our paper, i.e., the equivalence of the vector and matrix relaxations for the more general QMP2.


3.1 Equivalence of Vector and Matrix Lifting for QMP2

We first propose the vector-lifting semidefinite relaxation VSDR2 for QMP2. From applying both equations in (4), we get the following.

(VSDR2)
    µ*_{V2} := min  [ β_0        vec(C_0)^T
                      vec(C_0)   I_r ⊗ Q_0 + P_0 ⊗ I_n ] · Z_V

    s.t.  [ β_j        vec(C_j)^T
            vec(C_j)   I_r ⊗ Q_j + P_j ⊗ I_n ] · Z_V ≤ 0,  j = 1, 2, ..., m

          (Z_V)_{1,1} = 1

          Z_V ∈ S^{rn+1}_+        ( Z_V = [ 1        vec(X)^T
                                            vec(X)   Y_V      ] )

The matrix Y_V is nr × nr and can be partitioned into exactly r^2 block matrices Y^{ij}_V, i, j = 1, 2, ..., r, where each block is n × n. From applying both equations in (5), we get the smaller matrix-lifting semidefinite relaxation MSDR2 for QMP2. (We add the additional constraint trace Y_1 = trace Y_2 since trace X X^T = trace X^T X.)

(MSDR2)
    µ*_{M2} := min  Q_0 · Y_1 + P_0 · Y_2 + 2 C_0 · X + β_0
    s.t.  Q_j · Y_1 + P_j · Y_2 + 2 C_j · X + β_j ≤ 0,  j = 1, 2, ..., m

          Y_1 − X X^T ∈ S^n_+    ( Z_1 := [ I_r  X^T
                                            X    Y_1 ] ⪰ 0 )

          Y_2 − X^T X ∈ S^r_+    ( Z_2 := [ I_n  X
                                            X^T  Y_2 ] ⪰ 0 )

          trace Y_1 = trace Y_2.

VSDR2 has O((nr)^2) variables, whereas MSDR2 has only O((n+r)^2) variables. The computational advantage of using the smaller problem MSDR2 motivates the comparison of the corresponding bounds. The main result is interesting and surprising, i.e., that VSDR2 and MSDR2 actually generate the same bound under a constraint qualification assumption. In general, the bound from VSDR2 is at least as strong as the bound from MSDR2.

Define the block-diag and block-offdiag transformations, respectively,

    B0Diag(Q) : S^n → S^{rn+1},    O0Diag(P) : S^r → S^{rn+1},

    B0Diag(Q) := [ 0  0
                   0  I_r ⊗ Q ],    O0Diag(P) := [ 0  0
                                                   0  P ⊗ I_n ].

(See [35] for the r = n case.) It is clear that Q, P ⪰ 0 implies that both B0Diag(Q) ⪰ 0 and O0Diag(P) ⪰ 0. The adjoints b0diag, o0diag are, respectively,

    Y_1 = B0Diag*(Z_V) = b0diag(Z_V) := Σ_{j=1}^r Y^{jj}_V,
    Y_2 = O0Diag*(Z_V) = o0diag(Z_V) := (trace Y^{ij}_V)_{i,j=1,2,...,r}.    (16)

Lemma 3.1. Let X ∈ M^{nr} be given. Suppose that one of the following two conditions holds.

1. Let Y_V be given and Z_V defined as in VSDR2. Let the pair Z_1, Z_2 in MSDR2 be constructed using

    Y_1 = b0diag(Z_V),    Y_2 = o0diag(Z_V).    (17)

2. Let Y_1, Y_2 be given with trace Y_1 = trace Y_2, and let Z_1, Z_2 be defined as in MSDR2. Let Y_V, Z_V for VSDR2 be constructed from Y_1, Y_2 as follows:

    Y_V = [ V_1                  (1/n)(Y_2)_{12} I_n   ...   (1/n)(Y_2)_{1r} I_n
            (1/n)(Y_2)_{12} I_n  V_2                   ...   (1/n)(Y_2)_{2r} I_n
            ...                  ...                   ...   (1/n)(Y_2)_{(r−1)r} I_n
            ...                  ...                   ...   V_r                  ],    (18)

with

    Σ_{i=1}^r V_i = Y_1,    trace V_i = (Y_2)_{ii},  i = 1, ..., r.    (19)

Then Z_V satisfies the linear inequality constraints in VSDR2 if, and only if, Z_1, Z_2 satisfy the linear inequality constraints in MSDR2. Moreover, the values of the objective functions with the corresponding variables are equal.

Proof. 1. Note that

    [ β        vec(C)^T
      vec(C)   I_r ⊗ Q + P ⊗ I_n ] · Z_V
        = β + 2 C · X + ( B0Diag(Q) + O0Diag(P) ) · Z_V
        = β + 2 C · X + Q · b0diag(Z_V) + P · o0diag(Z_V)
        = β + 2 C · X + Q · Y_1 + P · Y_2,  by (17).    (20)

2. Conversely, we note that trace Y_1 = trace Y_2 is a constraint in MSDR2. And Z_V as constructed using (18) satisfies (17). In addition, the n + r assignment type constraints in (19) on the rn variables in the diagonals of the V_i, i = 1, ..., r, can always be solved. We can now apply the argument in (20) again.

Lemma 3.1 guarantees the equivalence of the feasible sets of the two relaxations with respect to the linear inequality constraints and the objective function. However, this ignores the semidefinite constraints. The following result partially addresses this deficiency.

Corollary 3.1. If the feasible set of VSDR2 is nonempty, then the feasible set of MSDR2 is also nonempty and

    µ*_{M2} ≤ µ*_{V2}.    (21)

Proof. Suppose Z_V = [ 1 vec(X)^T ; vec(X) Y_V ] is feasible for VSDR2. Recall that the matrix Y_V is nr × nr and can be partitioned into exactly r^2 block matrices Y^{ij}_V, i, j = 1, 2, ..., r. As above, we set Y_1, Y_2 following (17), and we set

    Z_1 = [ I_r  X^T
            X    Y_1 ],    Z_2 = [ I_n  X
                                   X^T  Y_2 ].

Denote the j-th column of X by X_{:j}, j = 1, 2, ..., r. Now Z_V ⪰ 0 implies Y^{jj}_V − X_{:j} X_{:j}^T ⪰ 0. Therefore, Σ_{j=1}^r Y^{jj}_V − Σ_{j=1}^r X_{:j} X_{:j}^T = Y_1 − X X^T ⪰ 0, i.e., Z_1 ⪰ 0. Similarly, denote the k-th row of X by X_{k:}, k = 1, 2, ..., n. Let (Y^{ij}_V)_{kk} denote the k-th diagonal entry of Y^{ij}_V, and define the r × r matrix Y^k := ((Y^{ij}_V)_{kk})_{i,j=1,2,...,r}. Then Z_V ⪰ 0 implies Y^k − X_{k:}^T X_{k:} ⪰ 0. Therefore, Σ_{k=1}^n Y^k − Σ_{k=1}^n X_{k:}^T X_{k:} = Y_2 − X^T X ⪰ 0, i.e., Z_2 ⪰ 0. The proof now follows from Lemma 3.1.
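The two semidefiniteness arguments in the proof can be tested numerically. This illustrative sketch builds a feasible Z_V from a random X (by adding PSD noise to vec(X)vec(X)^T), forms Y_1, Y_2 via (17), and confirms Y_1 − XX^T ⪰ 0 and Y_2 − X^TX ⪰ 0:

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 3, 2
X = rng.standard_normal((n, r))
x = X.reshape(-1, order='F')

# a feasible ZV: YV = vec(X) vec(X)^T + (PSD noise), so YV - xx^T is PSD
E = rng.standard_normal((n * r, n * r)); E = E @ E.T
YV = np.outer(x, x) + E

blk = lambda i, j: YV[i*n:(i+1)*n, j*n:(j+1)*n]    # block Y^{ij}_V
Y1 = sum(blk(j, j) for j in range(r))              # b0diag
Y2 = np.array([[np.trace(blk(i, j)) for j in range(r)]
               for i in range(r)])                 # o0diag

assert np.linalg.eigvalsh(Y1 - X @ X.T).min() >= -1e-8   # Z1 is PSD
assert np.linalg.eigvalsh(Y2 - X.T @ X).min() >= -1e-8   # Z2 is PSD
print("semidefiniteness transfer of Corollary 3.1 verified")
```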


Corollary 3.1 holds because MSDR2 only restricts the sums of certain principal submatrices of Z_V (i.e., b0diag(Z_V), o0diag(Z_V)) to be positive semidefinite, while VSDR2 restricts the whole matrix Z_V to be positive semidefinite. So the semidefinite constraints in MSDR2 are not as strong as those in VSDR2. Moreover, the entries of Y_V involved in b0diag(·), o0diag(·) form a partial semidefinite matrix which is not chordal and does not necessarily have a semidefinite completion. Therefore, the semidefinite completion technique we used to prove the equivalence between VSDR1 and MSDR1 is not applicable here. Instead, we will prove the equivalence of their dual programs. It is well known that the primal optimal value equals the dual optimal value when the generalized Slater condition holds [17, 28], and in this case we will then conclude that VSDR2 and MSDR2 generate the same bound.

Definition 3.1. For λ ∈ R^m, let

    β_λ := β_0 + Σ_{j=1}^m λ_j β_j;

and let C_λ, Q_λ, P_λ be defined similarly.

After substituting α ← β_λ − α, we see that the dual of VSDR2 is equivalent to

(DVSDR2)
    max  β_λ − α

    s.t.  [ α          vec(C_λ)^T
            vec(C_λ)   I_r ⊗ Q_λ + P_λ ⊗ I_n ] ⪰ 0

          α ∈ R,  λ ∈ R^m_+.

And the dual of MSDR2 is

(DMSDR2)
    max  β_λ − trace S_1 − trace S_2

    s.t.  [ S_1  R_1^T
            R_1  Q_λ − t I_n ] ⪰ 0

          [ S_2    R_2
            R_2^T  P_λ + t I_r ] ⪰ 0

          R_1 + R_2 = C_λ

          λ ∈ R^m_+,  S_1 ∈ S^r,  S_2 ∈ S^n,  R_1, R_2 ∈ M^{nr},  t ∈ R.

The Slater condition for DVSDR2 is equivalent to the following:

    ∃ λ ∈ R^m_+,  s.t.  I_r ⊗ Q_λ + P_λ ⊗ I_n ≻ 0.    (22)

The corresponding constraint qualification condition for DMSDR2 is

    ∃ t ∈ R, λ ∈ R^m_+,  s.t.  Q_λ − t I_n ≻ 0,  P_λ + t I_r ≻ 0.    (23)

These two conditions are equivalent due to the following lemma, which will also be used in our subsequent analysis.

Lemma 3.2. Let Q ∈ S^n, P ∈ S^r. Then

    I_r ⊗ Q + P ⊗ I_n ≻ 0  (resp. ⪰ 0)

if, and only if,

    ∃ t ∈ R,  s.t.  Q − t I_n ≻ 0,  P + t I_r ≻ 0  (resp. ⪰ 0).

11

Page 12: On Equivalence of Semidefinite Relaxations for Quadratic ...V SDR is derived using (4) with the relaxation Y vec(X)vec(X)T. A Schur complement argu- A Schur complement argu- ment,

Proof. Suppose the symmetric matrices Q, P have spectral decompositions

    Q = V Λ_Q V^T,    P = U Λ_P U^T,

where V, U are orthogonal, and the eigenvalues are in the diagonal matrices

    Λ_Q = Diag( λ_1(Q), λ_2(Q), ..., λ_n(Q) ),    Λ_P = Diag( λ_1(P), λ_2(P), ..., λ_r(P) ).

Therefore, we get the equivalences: I_r ⊗ Q + P ⊗ I_n ≻ 0 if, and only if, (U ⊗ V)(I_r ⊗ Λ_Q + Λ_P ⊗ I_n)(U ⊗ V)^T ≻ 0 if, and only if, I_r ⊗ Λ_Q + Λ_P ⊗ I_n ≻ 0 if, and only if, min_i λ_i(Q) + min_j λ_j(P) > 0 if, and only if,

    min_i λ_i(Q) − t > 0,  min_j λ_j(P) + t > 0,  for some t ∈ R.

The equivalences hold if the strict inequalities ≻ 0 and > are replaced by the inequalities ⪰ 0 and ≥, respectively.
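Lemma 3.2 can be illustrated numerically through the eigenvalue characterization used in the proof. In this sketch (assumptions: random symmetric Q, with P shifted so that the Kronecker sum is positive definite), a valid shift t is the midpoint between λ_min(Q) and −λ_min(P):

```python
import numpy as np

rng = np.random.default_rng(5)
n, r = 4, 3
Q = rng.standard_normal((n, n)); Q = (Q + Q.T) / 2
P = rng.standard_normal((r, r)); P = (P + P.T) / 2
# shift P so that min eig(Q) + min eig(P) = 1 > 0, i.e. Ir⊗Q + P⊗In ≻ 0
P = P + (1.0 - np.linalg.eigvalsh(Q).min()
             - np.linalg.eigvalsh(P).min()) * np.eye(r)

K = np.kron(np.eye(r), Q) + np.kron(P, np.eye(n))
assert np.linalg.eigvalsh(K).min() > 0                  # Ir⊗Q + P⊗In ≻ 0

# the proof's characterization: any t strictly between
# -min λ(P) and min λ(Q) works
t = (np.linalg.eigvalsh(Q).min() - np.linalg.eigvalsh(P).min()) / 2
assert np.linalg.eigvalsh(Q - t * np.eye(n)).min() > 0  # Q - t I ≻ 0
assert np.linalg.eigvalsh(P + t * np.eye(r)).min() > 0  # P + t I ≻ 0
print("Lemma 3.2 verified on a random instance")
```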

Now we state the main theorem of this paper on the equivalence of the two SDP relaxations for QMP2.

Theorem 3.1. Suppose that DVSDR2 is strictly feasible. As numbers in the extended real line (−∞, +∞], the optimal values of the two relaxations VSDR2, MSDR2, obtained using vector and matrix-liftings, are equal, i.e.,

    µ*_{V2} = µ*_{M2}.

3.1.1 Proof of (Main) Theorem 3.1

Since DVSDR2 is strictly feasible, Lemma 3.2 implies that both dual programs satisfy constraint qualifications. Therefore, both programs satisfy strong duality, see e.g., [28], and both have zero duality gaps, i.e., the optimal values of DVSDR2, DMSDR2 are µ*_{V2}, µ*_{M2}, respectively.

Now assume that

    λ is feasible for DVSDR2.    (24)

Lemma 3.2 implies that λ is also feasible for DMSDR2, i.e., there exists t ∈ R such that

    Q := Q_λ − t I_n ⪰ 0,    P := P_λ + t I_r ⪰ 0.    (25)

(To simplify notation, we use Q, P to denote these dual slack matrices.) The spectral decompositions of Q, P can be expressed as

    Q = V Λ_Q V^T = [V_1 V_2] [ Λ_Q+  0
                                0     0 ] [V_1 V_2]^T,    P = U Λ_P U^T = [U_1 U_2] [ Λ_P+  0
                                                                                      0     0 ] [U_1 U_2]^T,

where the columns of the submatrices U_1, V_1 form orthonormal bases that span the range spaces R(P) and R(Q), respectively; while the columns of U_2, V_2 span the orthogonal complements R(P)⊥ and R(Q)⊥, respectively. Λ_Q+ is a diagonal matrix whose diagonal entries are the nonzero eigenvalues of Q, and Λ_P+ is defined similarly. Let σ_i, θ_i denote the eigenvalues of P, Q, respectively.

We similarly simplify the notation

    C := C_λ,  c := vec(C),  β := β_λ.    (26)

Let A† denote the Moore-Penrose pseudoinverse of the matrix A. The following lemma allows us to express µ*_{V2} as a function of Q, P, c and β.


Lemma 3.3. Let λ, P, Q, c, β be defined as above in (24), (25), (26). Let

    α* := c^T (I_r ⊗ Q + P ⊗ I_n)† c.    (27)

Then α*, λ is a feasible pair for DVSDR2. And, for any pair α, λ feasible for DVSDR2, we have

    −α + β ≤ −α* + β.

Proof. A general quadratic function f(x) = x^T Q x + 2 c^T x + β is nonnegative for all x ∈ R^n if, and only if, the matrix [ β c^T ; c Q ] ⪰ 0, e.g., [8, Pg. 163]. Therefore,

    [ α  c^T
      c  I_r ⊗ Q_λ + P_λ ⊗ I_n ] = [ α  c^T
                                     c  I_r ⊗ Q + P ⊗ I_n ] ⪰ 0    (28)

if, and only if,

    x^T (I_r ⊗ Q + P ⊗ I_n) x + 2 c^T x + α ≥ 0,  ∀ x ∈ R^{nr}.

For a fixed α, this is further equivalent to

    −α ≤ min_x  x^T (I_r ⊗ Q + P ⊗ I_n) x + 2 c^T x = −c^T (I_r ⊗ Q + P ⊗ I_n)† c.

Therefore, we can choose α* as in (27).
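The last step uses the standard fact that, for A ⪰ 0 and c ∈ R(A), min_x x^T A x + 2c^T x = −c^T A† c, attained at x = −A† c. An illustrative numpy check with A = I_r ⊗ Q + P ⊗ I_n built from singular PSD factors, so the pseudoinverse matters:

```python
import numpy as np

rng = np.random.default_rng(6)
n, r = 3, 2
# PSD (rank-deficient) Q and P built from low-rank factors
B = rng.standard_normal((n, n - 1)); Q = B @ B.T
D = rng.standard_normal((r, r - 1)); P = D @ D.T
A = np.kron(np.eye(r), Q) + np.kron(P, np.eye(n))

# pick c in the range of A, as dual feasibility requires
c = A @ rng.standard_normal(n * r)

x_star = -np.linalg.pinv(A) @ c        # minimizer of x^T A x + 2 c^T x
val = x_star @ A @ x_star + 2 * c @ x_star
assert np.isclose(val, -c @ np.linalg.pinv(A) @ c)

# random perturbations cannot do better (convexity spot check)
for _ in range(20):
    y = x_star + 0.1 * rng.standard_normal(n * r)
    assert y @ A @ y + 2 * c @ y >= val - 1e-9
print("pseudoinverse formula behind (27) verified")
```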

To further explore the structure of (27), we note that c can be decomposed as

c = (U1 ⊗ V1)r11 + (U1 ⊗ V2)r12 + (U2 ⊗ V1)r21 + (U2 ⊗ V2)r22. (29)

The validity of such an expression follows from the fact that the columns of [U1 U2] ⊗ [V1 V2] form an orthonormal basis of Rnr. Furthermore, the dual feasibility of DVSDR2 includes the constraint

[α cT; c Ir ⊗ Q + P ⊗ In] ⪰ 0,

which implies c ∈ R(Ir ⊗ Q + P ⊗ In). This range space is spanned by the columns of the matrices U1 ⊗ V1, U2 ⊗ V1 and U1 ⊗ V2, which implies that c has no component in R(U2 ⊗ V2), i.e., r22 = 0 in (29).
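The claim that c has no component in R(U2 ⊗ V2) reflects the fact that every u ⊗ v with Pu = 0 and Qv = 0 lies in the null space of Ir ⊗ Q + P ⊗ In. A small numerical sketch (our own check, with randomly generated singular Q, P and illustrative dimensions):

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 4, 3
G = rng.standard_normal((n, n - 1)); Q = G @ G.T   # PSD, rank n-1
H = rng.standard_normal((r, r - 1)); P = H @ H.T   # PSD, rank r-1
M = np.kron(np.eye(r), Q) + np.kron(P, np.eye(n))

theta, V = np.linalg.eigh(Q)       # eigenpairs of Q
sigma, U = np.linalg.eigh(P)       # eigenpairs of P
V2 = V[:, np.isclose(theta, 0, atol=1e-10)]   # basis of R(Q) orthogonal complement
U2 = U[:, np.isclose(sigma, 0, atol=1e-10)]   # basis of R(P) orthogonal complement

# M (u x v) = u x Qv + Pu x v = 0 for u in ker(P), v in ker(Q),
# so R(M) is orthogonal to R(U2 x V2) and r22 = 0 for any c in R(M).
w = np.kron(U2[:, 0], V2[:, 0])
assert np.allclose(M @ w, 0, atol=1e-8)
c = M @ rng.standard_normal(n * r)   # any c in the range of M
assert abs(w @ c) < 1e-8             # its U2 x V2 component vanishes
```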

The following lemma provides a key observation for the connection between the two dual programs. It shows that if c ∈ R(U1 ⊗ V1), then the α component of the objective value of DVSDR2 in Lemma 3.3 has a specific representation.

Lemma 3.4. If c ∈ R(U1 ⊗ V1), then the α component of the objective value of DVSDR2 in Lemma 3.3 satisfies

−α = −cT ((Ir ⊗ Q + P ⊗ In)†) c
   = max { − vec(R1)T (Ir ⊗ Q)† vec(R1) − vec(R2)T (P ⊗ In)† vec(R2) : R1 + R2 = C, R1, R2 ∈ Mnr }. (30)



Proof. We can eliminate R2 and express the maximization problem on the right-hand side of the equality as max_{R1} φ(R1), where

φ(R1) := − vec(R1)T ((Ir ⊗ Q)† + (P ⊗ In)†) vec(R1) + 2cT (P ⊗ In)† vec(R1) − cT (P ⊗ In)† c. (31)

Since P, Q are both positive semidefinite, we know (Ir ⊗ Q)† + (P ⊗ In)† ⪰ 0. Hence φ is concave. It is not difficult to verify that (P ⊗ In)† c ∈ R((Ir ⊗ Q)† + (P ⊗ In)†). Therefore, the maximum of the concave quadratic function φ(R1) is finite and attained at R∗1, with

vec(R∗1) = ((Ir ⊗ Q)† + (P ⊗ In)†)† (P ⊗ In)† c = (P ⊗ Q† + PP† ⊗ In)† c; (32)

and this corresponds to the value

φ(R∗1) = cT ((P ⊗ Q† + PP† ⊗ In)† (P ⊗ In)† − (P ⊗ In)†) c = −cT (U ⊗ V) Λ̄ (U ⊗ V)T c,

where

Λ̄ := Λ†P ⊗ In − (Λ2P ⊗ Λ†Q + ΛP ⊗ In)†.

(We write Λ̄ to distinguish this matrix from the matrix Λ defined in (33) below.) The matrix Λ̄ is diagonal. Its diagonal entries can be calculated as

Λ̄(i, j) = 1/(σi + θj) if σi > 0, θj > 0;  Λ̄(i, j) = 0 if σi = 0 or θj = 0.

We now compare φ(R∗1) with −cT (Ir ⊗ Q + P ⊗ In)† c. Let

Λ := (Ir ⊗ Q + P ⊗ In)† = (U ⊗ V)(Ir ⊗ ΛQ + ΛP ⊗ In)† (U ⊗ V)T
   = (U ⊗ V)((Iσ+ ⊗ ΛQ + ΛP ⊗ Iθ+)† + Iσ0 ⊗ Λ†Q + Λ†P ⊗ Iθ0)(U ⊗ V)T , (33)

where the matrix Iσ+ (resp. Iσ0) is r × r, diagonal, and zero, except for the i-th diagonal entries, which equal one if σi > 0 (resp. σi = 0); the matrices Iθ+ and Iθ0 are defined in the same way. Hence the matrix Λ is also diagonal. Its diagonal entries can be calculated as

Λ(i, j) = 1/(σi + θj) if σi > 0, θj > 0;  1/σi if σi > 0, θj = 0;  1/θj if σi = 0, θj > 0;  0 if σi = 0, θj = 0. (34)
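The case analysis in (34) can be verified numerically. In the sketch below (our own check, with illustrative dimensions and ranks chosen so that both zero and nonzero σi, θj occur), the pseudoinverse (Ir ⊗ Q + P ⊗ In)† is diagonal in the U ⊗ V basis, and all four cases of (34) collapse to 1/(σi + θj) when σi + θj > 0 and 0 otherwise:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 4, 3
G = rng.standard_normal((n, n - 2)); Q = G @ G.T   # rank n-2: some theta_j = 0
H = rng.standard_normal((r, r - 1)); P = H @ H.T   # rank r-1: some sigma_i = 0

theta, V = np.linalg.eigh(Q)
sigma, U = np.linalg.eigh(P)
theta[np.isclose(theta, 0, atol=1e-10)] = 0.0
sigma[np.isclose(sigma, 0, atol=1e-10)] = 0.0

M = np.kron(np.eye(r), Q) + np.kron(P, np.eye(n))
Lam = np.kron(U, V).T @ np.linalg.pinv(M, rcond=1e-10) @ np.kron(U, V)

# Lam is diagonal in the U x V basis, with the entries of (34):
for i in range(r):
    for j in range(n):
        s, t = sigma[i], theta[j]
        expected = 0.0 if s + t == 0 else 1.0 / (s + t)
        assert np.isclose(Lam[i * n + j, i * n + j], expected, atol=1e-8)
offdiag = Lam - np.diag(np.diag(Lam))
assert np.allclose(offdiag, 0, atol=1e-7)
```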

By assumption, c = (U1 ⊗ V1) r11 for some r11 of appropriate size. Note that (U1 ⊗ V1) r11 is orthogonal to the columns of U2 ⊗ V1 and U1 ⊗ V2. Thus only the part (Iσ+ ⊗ ΛQ + ΛP ⊗ Iθ+)† of the diagonal matrix is involved in the computation, i.e.,

−cT (Ir ⊗ Q + P ⊗ In)† c = −r11T (U1 ⊗ V1)T (U ⊗ V) Λ (U ⊗ V)T (U1 ⊗ V1) r11
   = −r11T (U1 ⊗ V1)T (U ⊗ V)(Iσ+ ⊗ ΛQ + ΛP ⊗ Iθ+)† (U ⊗ V)T (U1 ⊗ V1) r11
   = −r11T (U1 ⊗ V1)T (U ⊗ V) Λ̄ (U ⊗ V)T (U1 ⊗ V1) r11
   = φ(R∗1),

where Λ̄ denotes the diagonal matrix in the expression for φ(R∗1) above; its entries agree with those of Λ on the indices with σi > 0, θj > 0.



Now, for the given feasible (α∗, λ) of Lemma 3.3, we construct a feasible solution for DMSDR2 that generates the same objective value. Using Lemma 3.2, we choose t ∈ R satisfying Q = Qλ − tIn ⪰ 0, P = Pλ + tIr ⪰ 0.

We can now find a lower bound for the optimal value of DMSDR2 .

Proposition 3.1. Let λ, t, P, Q, C, c, β be as above. Let R∗1 denote the maximizer of φ(R1) in the proof of Lemma 3.4 (applied with c replaced by its projection (U1 ⊗ V1)r11), and let R∗2 be the corresponding matrix from the elimination R2 = C − R1 there. Construct R1, R2 as follows:

vec(R1) = vec(R∗1) + (U2 ⊗ V1) r21,
vec(R2) = vec(R∗2) + (U1 ⊗ V2) r12. (35)

Then we obtain a lower bound for the optimal value of DMSDR2 :

µ∗M2 ≥ − vec(R1)T (Ir ⊗ Q)† vec(R1) − vec(R2)T (P ⊗ In)† vec(R2) + β. (36)

Proof. Consider the subproblem that maximizes the objective with λ, t, R1 and R2 fixed as above:

max  β − trace S1 − trace S2
s.t. [S1 R1T; R1 Q] ⪰ 0,
     [S2 R2; R2T P] ⪰ 0,
     S1 ∈ Sr, S2 ∈ Sn. (37)

Because λ, t, R1, and R2 are all feasible for DMSDR2 , this subproblem generates a lower bound for µ∗M2. We now invoke a result from [6]: there exists a symmetric matrix S such that trace S ≤ γ and

[S CT; C Q] ⪰ 0

if, and only if, f(X) = trace(XT QX + 2CT X) + γ ≥ 0 for all X ∈ Mnr. This is equivalent to

−γ ≤ min_{X ∈ Mnr} trace(XT QX + 2CT X) = − trace(CT Q† C).

Therefore, the subproblem (37) can be reformulated as

max  β − trace S1 − trace S2
s.t. − trace S1 ≤ − trace(R1T Q† R1),
     − trace S2 ≤ − trace(R2 P† R2T),
     S1 ∈ Sr, S2 ∈ Sn. (38)

Hence, the optimal value of (38) has an explicit expression and provides the lower bound for DMSDR2 :

µ∗M2 ≥ − vec(R1)T (Ir ⊗ Q)† vec(R1) − vec(R2)T (P ⊗ In)† vec(R2) + β.
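The passage from (36) to (38) rests on the identities vec(R1)T (Ir ⊗ Q)† vec(R1) = trace(R1T Q† R1) and vec(R2)T (P ⊗ In)† vec(R2) = trace(R2 P† R2T), with vec the column-stacking operator. A quick numerical sketch (random data, illustrative sizes of our choosing):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 5, 3
R1 = rng.standard_normal((n, r))
R2 = rng.standard_normal((n, r))
G = rng.standard_normal((n, n - 1)); Q = G @ G.T   # singular PSD
H = rng.standard_normal((r, r - 1)); P = H @ H.T   # singular PSD

vec = lambda X: X.reshape(-1, order="F")   # column-stacking vec, as in the paper
Qp = np.linalg.pinv(Q, rcond=1e-10)
Pp = np.linalg.pinv(P, rcond=1e-10)

lhs1 = vec(R1) @ np.linalg.pinv(np.kron(np.eye(r), Q), rcond=1e-10) @ vec(R1)
lhs2 = vec(R2) @ np.linalg.pinv(np.kron(P, np.eye(n)), rcond=1e-10) @ vec(R2)

# (Ir x Q)† = Ir x Q† and (P x In)† = P† x In, so the quadratic forms
# reduce to the traces appearing in the constraints of (38).
assert np.isclose(lhs1, np.trace(R1.T @ Qp @ R1))
assert np.isclose(lhs2, np.trace(R2 @ Pp @ R2.T))
```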

With all the above preparations, we now complete the proof of (main) Theorem 3.1.



Proof. (of Theorem 3.1) We now compare µ∗V2 from the expression (27) with µ∗M2 based on the lower bound expression in (36). By writing c in the form (29), we get

µ∗V2 = −[r11T (U1 ⊗ V1)T + r12T (U1 ⊗ V2)T + r21T (U2 ⊗ V1)T ] (Ir ⊗ Q + P ⊗ In)† [(U1 ⊗ V1)r11 + (U1 ⊗ V2)r12 + (U2 ⊗ V1)r21] + β. (39)

Consider a cross term such as r11T (U1 ⊗ V1)T (Ir ⊗ Q + P ⊗ In)† (U1 ⊗ V2) r12. Since (U1 ⊗ V1)r11 is orthogonal to (U1 ⊗ V2)r12, and (Ir ⊗ Q + P ⊗ In)† is diagonalized by [U1 U2] ⊗ [V1 V2], this term is zero. Similarly, the other cross terms equal zero. As a result, only the following sum of three quadratic terms remains, which we label C1, C2, C3, respectively:

µ∗V2 = − r11T (U1 ⊗ V1)T (Ir ⊗ Q + P ⊗ In)† (U1 ⊗ V1) r11
     − r21T (U2 ⊗ V1)T (Ir ⊗ Q + P ⊗ In)† (U2 ⊗ V1) r21
     − r12T (U1 ⊗ V2)T (Ir ⊗ Q + P ⊗ In)† (U1 ⊗ V2) r12 + β
    =: C1 + C2 + C3 + β. (40)

We can also write the lower bound for µ∗M2 from (36):

µ∗M2 ≥ − vec(R1)T (Ir ⊗ Q)† vec(R1) − vec(R2)T (P ⊗ In)† vec(R2) + β
     = −(vec(R∗1) + (U2 ⊗ V1)r21)T (Ir ⊗ Q)† (vec(R∗1) + (U2 ⊗ V1)r21)
       − (vec(R∗2) + (U1 ⊗ V2)r12)T (P ⊗ In)† (vec(R∗2) + (U1 ⊗ V2)r12) + β. (41)

Since vec(R∗1) and vec(R∗2) are both in R(U1 ⊗ V1), which is orthogonal to both (U1 ⊗ V2)r12 and (U2 ⊗ V1)r21, and both matrices (Ir ⊗ Q)† and (P ⊗ In)† are diagonalized by [U1 U2] ⊗ [V1 V2], we conclude that the cross terms, such as vec(R∗1)T (Ir ⊗ Q)† (U2 ⊗ V1) r21, all equal zero. Therefore, the lower bound for µ∗M2 can be reformulated as

µ∗M2 ≥ −(vec(R∗1)T (Ir ⊗ Q)† vec(R∗1) + vec(R∗2)T (P ⊗ In)† vec(R∗2))
     − r21T (U2 ⊗ V1)T (Ir ⊗ Q)† (U2 ⊗ V1) r21
     − r12T (U1 ⊗ V2)T (P ⊗ In)† (U1 ⊗ V2) r12 + β
    =: T1 + T2 + T3 + β, (42)

where, as above, we denote the three quadratic terms by T1, T2, T3, respectively.

We now show that C1, C2, and C3 equal T1, T2, and T3, respectively. The equality between C1 and T1 follows from Lemma 3.4. For the other terms, consider C2 first. Write (Ir ⊗ Q + P ⊗ In)† as the diagonal matrix Λ of (33). Note that (U2 ⊗ V1)r21 is orthogonal to the columns of U1 ⊗ V1 and U1 ⊗ V2. Thus only the part Iσ0 ⊗ Λ†Q of the diagonal matrix Λ is involved in computing the term C2, i.e.,

−r21T (U2 ⊗ V1)T (Ir ⊗ Q + P ⊗ In)† (U2 ⊗ V1) r21
   = −r21T (U2 ⊗ V1)T (U ⊗ V)(Iσ0 ⊗ Λ†Q)(U ⊗ V)T (U2 ⊗ V1) r21. (43)

Similarly, since (U2 ⊗ V1)r21 is orthogonal to the columns of U1 ⊗ V1, we have

−r21T (U2 ⊗ V1)T (Ir ⊗ Q)† (U2 ⊗ V1) r21
   = −r21T (U2 ⊗ V1)T (U ⊗ V)(Iσ+ ⊗ Λ†Q + Iσ0 ⊗ Λ†Q)(U ⊗ V)T (U2 ⊗ V1) r21
   = −r21T (U2 ⊗ V1)T (U ⊗ V)(Iσ0 ⊗ Λ†Q)(U ⊗ V)T (U2 ⊗ V1) r21. (44)



By (43) and (44), we conclude that the term C2 equals T2. The same argument proves the equality of C3 and T3. Therefore, we conclude that

−cT ((Ir ⊗ Q + P ⊗ In)†) c + β = − vec(R1)T ((Ir ⊗ Q)†) vec(R1) − vec(R2)T ((P ⊗ In)†) vec(R2) + β. (45)

Then, by (27) and (36), we have established µ∗V2 ≤ µ∗M2. The other direction was proved in Corollary 3.1. This completes the proof of the theorem.

3.1.2 Unbalanced Orthogonal Procrustes Problem

Example 3.1. In the unbalanced orthogonal Procrustes problem [14] one seeks to solve the following minimization problem:

min  ‖AX − B‖2F
s.t. XT X = Ir,  X ∈ Mnr, (46)

where A ∈ Mnn, B ∈ Mnr, and n ≥ r. The balanced case n = r can be solved efficiently [29], and this special case also admits a QMP1 relaxation [6]. Note that the unbalanced case is a typical QMP2 . Its VSDR2 can be written as:

min  trace((Ir ⊗ AT A)Y ) − trace(2BT AX)
s.t. trace((Eij ⊗ In)Y ) = δi,j ,  i, j = 1, 2, . . . , r,
     [1 x; xT Y ] ⪰ 0. (47)

It is easy to check that the SDP in (47) is feasible and its dual is strictly feasible, which implies the equivalence between MSDR2 and VSDR2 . Thus we can obtain a nontrivial lower bound from the MSDR2 relaxation:

min  trace(AT AY − 2BT AX)
s.t. [Ir XT; X Y ] ⪰ 0,
     [In X; XT Ir] ⪰ 0,
     trace Y = r. (48)

We also ran preliminary computational experiments to compare the efficiency of the two SDP relaxations; see Table 1. All matrices in the 5 instances are randomly generated, and the problems are solved by SeDuMi 1.1 on a notebook computer (CPU: Intel Core 2 Duo, 2.53GHz; memory: 3GB). Table 1 exhibits the computational advantage of MSDR2 over VSDR2 .

Table 1: The solution time (CPU seconds) of the two SDP relaxations on the orthogonal Procrustes problem.

Problem scale (n, r)   (15,5)  (20,10)  (30,10)  (30,15)  (40,20)
VSDR2 (CPU time)        2.14    23.03    65.01   196.70   954.70
MSDR2 (CPU time)        0.37     1.75     7.63    11.81    70.96
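The timing gap in Table 1 is consistent with the difference in block sizes: VSDR2 in (47) carries one semidefinite block of order nr + 1, while MSDR2 in (48) works with blocks of order n + r. A back-of-the-envelope comparison for the scales in Table 1 (our own tabulation, not from the paper):

```python
# Order of the largest semidefinite block in each relaxation,
# for the problem scales (n, r) used in Table 1.
for n, r in [(15, 5), (20, 10), (30, 10), (30, 15), (40, 20)]:
    vsdr_order = n * r + 1   # VSDR2: one (nr + 1) x (nr + 1) block, from (47)
    msdr_order = n + r       # MSDR2: blocks of order n + r, from (48)
    print(f"(n, r) = ({n}, {r}): VSDR2 block order {vsdr_order}, "
          f"MSDR2 block order {msdr_order}")
```

Since interior-point cost grows polynomially in the block order, even the modest (40, 20) instance pits an 801 × 801 block against 60 × 60 blocks.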



3.2 An Extension to QMP with Conic Constraints

Some QMP problems include conic constraints such as XT X ⪯ (⪰) S, where S is a given positive semidefinite matrix. We can prove that the corresponding MSDR2 and VSDR2 are still equivalent for such problems.

Consider the following general form of QMP2 with conic constraints:

(QMP3 )  min  trace(XT Q0X) + trace(XP0XT ) + 2 trace(C0T X) + trace(H0T Z)
         s.t. trace(XT QjX) + trace(XPjXT ) + 2 trace(CjT X) + trace(HjT Z) + βj ≤ 0,  j = 1, 2, . . . , m,
              X ∈ Mnr, Z ∈ K,

where K can be a direct sum of convex cones (e.g., second-order cones, semidefinite cones). Note that the constraint XT X ⪯ (⪰) S can be formulated as

trace(XT XEij) + (−) trace(ZEij) = Sij ,
Z ⪰ 0. (49)

The formulations of VSDR2 and MSDR2 for QMP3 are the same as for QMP2 , except for the additional term Hj · Z and the conic constraint Z ∈ K. Correspondingly, the dual programs DVSDR2 and DMSDR2 for QMP3 both have the additional constraint

H0 − Σ_{j=1}^{m} λjHj ∈ K∗. (50)

If a dual solution λ∗ is feasible for DVSDR2 , then it satisfies the constraint (50) in both DVSDR2 and DMSDR2 . Therefore, we can follow the proof of Theorem 3.1 and construct a feasible solution for DMSDR2 from λ∗ that generates the same objective value µ∗V2. This yields the following.

Corollary 3.2. Assume VSDR2 for QMP3 is strictly feasible and its dual DVSDR2 is feasible. Then DVSDR2 and DMSDR2 both attain their optimum at the same λ and generate the same optimal value µ∗V2 = µ∗M2.

3.2.1 Graph Partition Problem

Example 3.2 (Graph Partition Problem, (GPP)). The GPP is an important combinatorial optimization problem with broad applications in network design and floor planning [3, 27]. Given a graph with n vertices, the problem is to find an r-partition S1, S2, . . . , Sr of the vertex set, with |Si| = mi, where m := (mi)i=1,...,r gives the cardinalities of the subsets, such that the total number of edges across different subsets is minimized. Define the matrix X ∈ Mnr to be the assignment of vertices, i.e., Xij = 1 if vertex i is assigned to subset j, and Xij = 0 otherwise. With L the Laplacian matrix of the graph, the GPP can be formulated as the optimization problem:

µ∗GPP = min  (1/2) trace(XT LX)
        s.t. XT X = Diag(m),
             diag(XXT ) = en,
             X ≥ 0. (51)



This formulation involves quadratic matrix constraints of both types: trace(XT EiiX) = 1, i = 1, . . . , n, and trace(XEijXT ) = miδi,j , i, j = 1, . . . , r. Thus it can be formulated as a QMP2 but not a QMP1 . Anstreicher and Wolkowicz [5] proposed a semidefinite programming relaxation with O(n^4) variables and proved that its optimal value equals the so-called Donath-Hoffman lower bound [13]. This SDP formulation can be written more compactly, as Povh [27] suggested:

µ∗DH = min  (1/2) trace((Ir ⊗ L)V )
       s.t. Σ_{i=1}^{r} (1/mi) V^{ii} + W = In,
            trace(V^{ij}) = miδi,j ,  i, j = 1, . . . , r,
            trace((I ⊗ Eii)V ) = 1,  i = 1, . . . , n,
            V ∈ S+_{rn}, W ∈ S+_{n}, (52)

where V has been partitioned into r^2 square blocks, each of size n by n, and V^{ij} is the (i, j)-th block of V . Note that formulation (52) reduces the number of variables to O(n^2 r^2).

An interesting application is the graph equipartition problem, in which the mi (= m1) are all equal. Povh's SDP formulation is then a VSDR2 for a QMP3 :

min  (1/2) trace(XT LX)
s.t. trace((1/m1) XT EijX + EijW ) = δi,j ,  i, j = 1, . . . , n,
     trace(XEijXT ) = m1δi,j ,  i, j = 1, . . . , r,
     trace(XT EiiX) = 1,  i = 1, . . . , n,
     W ∈ S+_{n}. (53)

It is easy to check that (52) is feasible and its dual is strictly feasible. Hence, by Corollary 3.2, the equivalence between MSDR2 and VSDR2 for QMP3 implies that the Donath-Hoffman bound can be computed by solving a small MSDR2 :

µ∗DH = min  (1/2) L · Y1
       s.t. Y1 ⪯ m1In,
            Y2 = m1Ir,
            diag(Y1) = en,
            [Ir XT; X Y1] ⪰ 0,
            [In X; XT Y2] ⪰ 0. (54)

Because X and Y2 do not appear in the objective, formulation (54) can be reduced to a very simple form:

µ∗DH = min  (1/2) L · Y1
       s.t. diag(Y1) = en,
            0 ⪯ Y1 ⪯ m1In. (55)

This MSDR formulation has only O(n^2) variables, a significant reduction from O(n^2 r^2). This result coincides with Karisch and Rendl's result [19] for graph equipartition. Their proof derives from the particular problem structure, while ours follows from the general equivalence of VSDR2 and MSDR2 .

4 Conclusion

This paper proves the equivalence of two SDP bounds for the hard QCQP in QMP2 . Thus, a user should prefer the smaller, inexpensive MSDR bound from matrix lifting over the more expensive VSDR bound from vector lifting. In particular, our results show that the large VSDR2 relaxation for the unbalanced orthogonal Procrustes problem can be replaced by the smaller MSDR2 . And, with an extension of the main theorem, we proved the Karisch and Rendl result [19] that the Donath-Hoffman bound for graph equipartition can be computed with a small SDP .

The matrix completion and semidefinite inequality techniques used in our proofs are of independent interest.

Unfortunately, it is not clear how to formulate a general QCQP as an MSDR . In particular, the objective function of the QAP , trace(AXBXT ), does not immediately admit an MSDR representation. (Though Ding and Wolkowicz [12] provide a different matrix-lifting SDP relaxation that generally gives a strictly lower bound than the vectorized SDP relaxation proposed in [35].) The above motivates the need for finding efficient matrix-lifting representations of hard QCQPs that allow efficient semidefinite relaxations.

References

[1] A. ALFAKIH, M.F. ANJOS, V. PICCIALLI, and H. WOLKOWICZ. Euclidean distance matrices, semidefinite programming, and sensor network localization. Portugaliae Mathematica, to appear. CORR 2009-05.

[2] A. ALFAKIH and H. WOLKOWICZ. Matrix completion problems. In Handbook of semidefinite programming, volume 27 of Internat. Ser. Oper. Res. Management Sci., pages 533–545. Kluwer Acad. Publ., Boston, MA, 2000.

[3] C.J. ALPERT and A.B. KAHNG. Recent directions in netlist partitioning: a survey. Integr. VLSI J., 19(1-2):1–81, 1995.

[4] K.M. ANSTREICHER, X. CHEN, H. WOLKOWICZ, and Y. YUAN. Strong duality for a trust-region type relaxation of the quadratic assignment problem. Linear Algebra Appl., 301(1-3):121–136, 1999.

[5] K.M. ANSTREICHER and H. WOLKOWICZ. On Lagrangian relaxation of quadratic matrix constraints. SIAM J. Matrix Anal. Appl., 22(1):41–55, 2000.

[6] A. BECK. Quadratic matrix programming. SIAM J. Optim., 17(4):1224–1238, 2006.

[7] A. BEN-TAL, L. El GHAOUI, and A. NEMIROVSKI. Robust optimization. Princeton Series in Applied Mathematics. Princeton University Press, Princeton, NJ, 2009.

[8] A. BEN-TAL and A.S. NEMIROVSKI. Lectures on modern convex optimization. MPS/SIAM Series on Optimization. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2001. Analysis, algorithms, and engineering applications.

[9] P. BISWAS and Y. YE. Semidefinite programming for ad hoc wireless sensor network localization. In Proceedings of the third international symposium on Information processing in sensor networks, pages 46–54, Berkeley, Calif., 2004.

[10] M.W. CARTER, H.H. JIN, M.A. SAUNDERS, and Y. YE. SpaseLoc: an adaptive subproblem algorithm for scalable wireless sensor network localization. SIAM J. Optim., 17(4):1102–1128, 2006.

[11] E. de KLERK and R. SOTIROV. Exploiting group symmetry in semidefinite programming relaxations of the quadratic assignment problem. Math. Program., 122(2, Ser. A):225–246, 2010.

[12] Y. DING and H. WOLKOWICZ. A low-dimensional semidefinite relaxation for the quadratic assignment problem. Math. Oper. Res., 34(4):1008–1022, 2009.

[13] W.E. DONATH and A.J. HOFFMAN. Lower bounds for the partitioning of graphs. IBM J. Res. Develop., 17:420–425, 1973.

[14] L. ELDEN and H. PARK. A Procrustes problem on the Stiefel manifold. Numer. Math., 82(4):599–619, 1999.

[15] B. GRONE, C.R. JOHNSON, E. MARQUES de SA, and H. WOLKOWICZ. Positive definite completions of partial Hermitian matrices. Linear Algebra Appl., 58:109–124, 1984.

[16] L. HOGBEN. Graph theoretic methods for matrix completion problems. Linear Algebra Appl., 328(1-3):161–202, 2001.

[17] V. JEYAKUMAR and H. WOLKOWICZ. Generalizations of Slater's constraint qualification for infinite convex programs. Math. Programming, 57(1, Ser. B):85–101, 1992.

[18] C.R. JOHNSON. Matrix completion problems: a survey. Proceedings of Symposium in Applied Mathematics, 40:171–198, 1990.

[19] S.E. KARISCH and F. RENDL. Semidefinite programming and graph equipartition. In Topics in Semidefinite and Interior-Point Methods, volume 18 of The Fields Institute for Research in Mathematical Sciences, Communications Series, Providence, Rhode Island, 1998. American Mathematical Society.

[20] N. KRISLOCK. Efficient algorithms for large-scale Euclidean distance matrix problems with applications to wireless sensor network localization and molecular conformation. PhD thesis, University of Waterloo, 2010.

[21] N. KRISLOCK and H. WOLKOWICZ. Explicit sensor network localization using semidefinite representations and clique reductions. Technical Report CORR 2009-04, University of Waterloo, Waterloo, Ontario, 2009. Available at URL: www.optimization-online.org/DB_HTML/2009/05/2297.html.

[22] J. LIU. Eigenvalue and singular value inequalities of Schur complements. In F. ZHANG, editor, The Schur complement and its applications, volume 4 of Numerical Methods and Algorithms, pages xvi+295. Springer-Verlag, New York, 2005.

[23] H.D. MITTELMANN and J. PENG. Estimating bounds for quadratic assignment problems associated with the Hamming and Manhattan distance matrices based on semidefinite programming. SIAM J. Optim., to appear, 2008.

[24] Y.E. NESTEROV, H. WOLKOWICZ, and Y. YE. Semidefinite programming relaxations of nonconvex quadratic optimization. In Handbook of semidefinite programming, volume 27 of Internat. Ser. Oper. Res. Management Sci., pages 361–419. Kluwer Acad. Publ., Boston, MA, 2000.

[25] D. OUELLETTE. Schur complements and statistics. Linear Algebra Appl., 36:187–295, 1981.

[26] J. PENG and H.D. MITTELMANN. Estimating bounds for quadratic assignment problems associated with the Hamming and Manhattan distance matrices based on semidefinite programming. SIAM J. Optim., to appear, 2008.

[27] J. POVH. Semidefinite approximations for quadratic programs over orthogonal matrices. J. Global Optim., Springer Netherlands, 2009.

[28] R.T. ROCKAFELLAR. Convex analysis. Princeton Landmarks in Mathematics. Princeton University Press, Princeton, NJ, 1997. Reprint of the 1970 original, Princeton Paperbacks.

[29] P.H. SCHONEMANN. A generalized solution of the orthogonal Procrustes problem. Psychometrika, 31(1):1–10, March 1966.

[30] A.M-C. SO and Y. YE. Theory of semidefinite programming for sensor network localization. Math. Program., 109(2-3, Ser. B):367–384, 2007.

[31] R. STERN and H. WOLKOWICZ. Indefinite trust region subproblems and nonsymmetric eigenvalue perturbations. SIAM J. Optim., 5(2):286–313, 1995.

[32] Z. WANG, S. ZHENG, S. BOYD, and Y. YE. Further relaxations of the semidefinite programming approach to sensor network localization. SIAM J. Optim., 19(2):655–673, 2008.

[33] H. WOLKOWICZ. Semidefinite and Lagrangian relaxations for hard combinatorial problems. In M.J.D. Powell, editor, Proceedings of 19th IFIP TC7 Conference on System Modelling and Optimization, July, 1999, Cambridge, pages 269–309. Kluwer Academic Publishers, Boston, MA, 2000.

[34] H. WOLKOWICZ and Q. ZHAO. Semidefinite programming relaxations for the graph partitioning problem. Discrete Appl. Math., 96/97:461–479, 1999. Selected for the special Editors' Choice, Edition 1999.

[35] Q. ZHAO, S.E. KARISCH, F. RENDL, and H. WOLKOWICZ. Semidefinite programming relaxations for the quadratic assignment problem. J. Comb. Optim., 2(1):71–109, 1998. Semidefinite programming and interior-point approaches for combinatorial optimization problems (Fields Institute, Toronto, ON, 1996).


Index

A†, Moore-Penrose pseudoinverse, 12
Cλ, 11
Pλ, 11
Qλ, 11
Y^{ij}_V, blocks of YV, 9
B0Diag, Block-diag, 9
Mnr, set of n by r matrices, 2
O0Diag, Block-offdiag, 9
R(·), range space, 13
Sn, the space of n × n symmetric matrices, 2
βλ, 11
b0diag, block-diag adjoint, 9
vec(X), 3
o0diag, block-offdiag adjoint, 9
qMj(X), homogenized quadratic matrix function, 4
qVj(X), homogenized quadratic vector function, 4
qj(X), quadratic function, 4
MSDR, matrix-lifting semidefinite relaxation, 3
MSDR1, matrix-lifting relaxation, 5
MSDR2, matrix-lifting relaxation, 9
QMP, quadratic matrix programming, 2
QMP1, first case of quadratic matrix programming, 2
QMP2, second case of quadratic matrix programming, 3, 8
VSDR, vector-lifting semidefinite relaxation, 3
VSDR1, vector-lifting relaxation, 5
VSDR2, vector-lifting relaxation, 9

block-diag adjoint, b0diag, 9
Block-diag, B0Diag, 9
block-offdiag adjoint, o0diag, 9
Block-offdiag, O0Diag, 9
blocks of YV, Y^{ij}_V, 9
chordal graph, 7
first case of quadratic matrix programming, QMP1, 2
homogenized quadratic matrix function, qMj(X), 4
homogenized quadratic vector function, qVj(X), 4
Kronecker product, 3
lifted matrices for QMP1, 4
matrix-lifting relaxation, MSDR1, 5
matrix-lifting relaxation, MSDR2, 9
matrix-lifting semidefinite relaxation, MSDR, 3
Moore-Penrose pseudoinverse, A†, 12
partial symmetric matrix, 6
quadratic function, qj(X), 4
quadratic matrix programming, QMP, 2
quadratically constrained quadratic program, QCQP, 2
range space, R(·), 13
Schur complement, 3
second case of quadratic matrix programming, QMP2, 3, 8
sensor network localization problem, SNL, 8
set of n by r matrices, Mnr, 2
SNL, sensor network localization problem, 8
the space of n × n symmetric matrices, Sn, 2
trace dot product, 2, 5
trace inner-product, 2
vector-lifting relaxation, VSDR1, 5
vector-lifting relaxation, VSDR2, 9
vector-lifting semidefinite relaxation, VSDR, 3
