Nonconvex Quadratic Problems and Games with Separable Constraints

Javier Zazo Ruiz

Universidad Politécnica de Madrid

14 December 2017


Quadratically constrained quadratic problem (QP)

I Let’s consider a general QP:

min_x  x^T A_0 x + 2 b_0^T x + c_0

s.t.  x^T A_i x + 2 b_i^T x + c_i ≤ 0  ∀ i = 1, ..., N,

where A_0, A_i are symmetric matrices, b_0, b_i, x ∈ R^p, and c_0, c_i ∈ R.

I If A_0 ⪰ 0 and every A_i ⪰ 0, the problem is convex (≈ easy to solve).

I Otherwise, the problem is non-convex (local minima may exist).

I In general, non-convex QPs are NP-hard.

I QPs arise in a vast range of applications.

Polynomial minimization

I Minimize a polynomial subject to a set of polynomial inequalities:

min_x  p_0(x)

s.t.  p_i(x) ≤ 0,  i = 1, ..., m.

I Introduce new variables for products of existing ones and add their definitions as quadratic constraints.

I Example:

min  x^3 − 2xyz + y + 2

s.t.  x^2 + y^2 + z^2 − 1 = 0.

Introducing the change of variables u = x^2, v = yz:

min  ux − 2vx + y + 2

s.t.  x^2 + y^2 + z^2 − 1 = 0

u − x^2 = 0

v − yz = 0.
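The change of variables in this example can be checked numerically. The sketch below evaluates the cubic objective and its lifted quadratic form at random points where u = x^2 and v = yz hold. The function names `original`, `lifted` and `lift` are hypothetical helpers introduced only for this illustration.

```python
import random

def original(x, y, z):
    # Cubic objective of the example: x^3 - 2xyz + y + 2
    return x**3 - 2*x*y*z + y + 2

def lifted(x, y, z, u, v):
    # Quadratic objective after the substitution u = x^2, v = yz
    return u*x - 2*v*x + y + 2

def lift(x, y, z):
    # Values of the new variables implied by the added equality constraints
    return x*x, y*z

random.seed(0)
max_gap = 0.0
for _ in range(1000):
    x, y, z = (random.uniform(-1, 1) for _ in range(3))
    u, v = lift(x, y, z)
    max_gap = max(max_gap, abs(original(x, y, z) - lifted(x, y, z, u, v)))
print(max_gap)  # difference between both objectives, up to rounding
```

Both objectives agree whenever the two added equality constraints are satisfied, which is why the lifted problem is a QP with the same solutions.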

Polynomial minimization

Six-hump camel problem

[Figure: surface plot of the six-hump camel function f(x_1, x_2).]
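As an illustration, the sketch below evaluates the six-hump camel function on a grid and locates its smallest value. It assumes the standard benchmark definition of the function and the usual box [−2, 2] × [−1, 1]; the slide's own formula did not survive, so both are assumptions. A grid search is used here only for illustration and is not a method from the talk.

```python
import numpy as np

def six_hump_camel(x1, x2):
    # Standard benchmark form (assumed): a degree-6 polynomial in (x1, x2)
    return (4 - 2.1*x1**2 + x1**4/3)*x1**2 + x1*x2 + (-4 + 4*x2**2)*x2**2

# Grid over the usual search box (assumed): step 0.01 in both coordinates
x1 = np.linspace(-2.0, 2.0, 401)
x2 = np.linspace(-1.0, 1.0, 201)
X1, X2 = np.meshgrid(x1, x2)
F = six_hump_camel(X1, X2)

best = np.unravel_index(np.argmin(F), F.shape)
f_min = F[best]
x_min = (X1[best], X2[best])
print(f_min, x_min)
```

The function has several local minima, so a descent method started at an arbitrary point may stop at a non-global one. The known global minimum value is approximately −1.0316, attained at two symmetric points.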


Partitioning problems

Also called “Boolean optimization”

min_x  x^T A_0 x

s.t.  x_i ∈ {−1, 1}

I The problem is NP-hard (even if A_0 ⪰ 0).

I Binary constraints: x_i ∈ {−1, 1} ⇐⇒ x_i^2 = 1.

I MAXCUT → benchmark problem.
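A minimal brute-force sketch of the Boolean problem, assuming the objective x^T A_0 x and a tiny instance (a triangle graph, chosen only for illustration). Enumeration is exponential in the number of variables, which is exactly why relaxations are of interest; the helper name `boolean_qp_bruteforce` is hypothetical.

```python
from itertools import product

def boolean_qp_bruteforce(A0):
    """Minimize x^T A0 x over x in {-1, 1}^n by enumeration (small n only)."""
    n = len(A0)
    best_val, best_x = None, None
    for x in product((-1, 1), repeat=n):
        # Each binary constraint is the quadratic equality x_i^2 = 1
        assert all(xi * xi == 1 for xi in x)
        val = sum(A0[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
        if best_val is None or val < best_val:
            best_val, best_x = val, x
    return best_val, best_x

# Adjacency matrix of a triangle with unit weights
A0 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
best_val, best_x = boolean_qp_bruteforce(A0)

# MAXCUT link: cut weight = (total weight - x^T A0 x) / 4 for a symmetric A0
total_weight = sum(sum(row) for row in A0)
cut_value = (total_weight - best_val) / 4
print(best_val, best_x, cut_value)
```

Minimizing x^T A_0 x with A_0 the adjacency matrix is equivalent to maximizing the cut, since every edge with endpoints of opposite sign contributes negatively to the quadratic form.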

Transmit beamforming problem

I Determine optimal beams for downlink transmissions.

I The beamformers affect the system performance, causing interference.



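min_w  Σ_{i∈N} w_i^H w_i

s.t.  SINR_i(w_i, w_{−i}) ≥ Γ_i  ∀ i ∈ N

I The above problem can be relaxed by defining W_i = w_i w_i^H and dropping the rank-one constraint:

min_W  Σ_{i∈N} tr(W_i)

s.t.  SINR_i(W_i, W_{−i}) ≥ Γ_i

W_i = W_i^H

W_i ⪰ 0  ∀ i ∈ N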

Mats Bengtsson and Björn Ottersten, “Optimal and suboptimal transmit beamforming,” in Handbook of Antennas in Wireless Communications, CRC Press, 2001.

Semidefinite programs

Reminder

I Linear program (LP):

min_x  c^T x

s.t.  Gx ≤ 0

I Semidefinite program (SDP):

min_x  c^T x

s.t.  F_0 + x_1 F_1 + ... + x_N F_N ⪰ 0

I Reduces to an LP if the F_i are diagonal.

I Can be solved to global optimality using specialized solvers.
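The reduction to an LP can be illustrated numerically. The sketch below reads the matrix constraint as F_0 + Σ x_i F_i being positive semidefinite (an assumption about the garbled sign) and checks, for diagonal F_i, that it coincides with a set of scalar linear inequalities on the diagonal entries. The data are random and purely illustrative.

```python
import numpy as np

def lmi_holds(F, x, tol=1e-9):
    """F = [F_0, F_1, ..., F_N]; True if F_0 + sum_i x_i F_i is PSD."""
    M = F[0] + sum(xi * Fi for xi, Fi in zip(x, F[1:]))
    return np.linalg.eigvalsh(M).min() >= -tol

def linear_inequalities_hold(F, x, tol=1e-9):
    """Same test using only the diagonals: one linear inequality per entry."""
    d = np.diag(F[0]) + sum(xi * np.diag(Fi) for xi, Fi in zip(x, F[1:]))
    return d.min() >= -tol

rng = np.random.default_rng(0)
# Diagonal F_0, F_1, F_2 (F_0 = I so that x = 0 is feasible)
F = [np.eye(4)] + [np.diag(rng.normal(size=4)) for _ in range(2)]
points = rng.normal(size=(200, 2))
agree = all(lmi_holds(F, x) == linear_inequalities_hold(F, x) for x in points)
print(agree)
```

For diagonal matrices the eigenvalues are the diagonal entries, so the semidefinite constraint splits into independent linear inequalities in x.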

Semidefinite Relaxation of QPs

I Given a QP:

min_x  x^T A_0 x + 2 b_0^T x + c_0

s.t.  x^T A_i x + 2 b_i^T x + c_i ≤ 0  ∀ i = 1, ..., N.   (1)

I Define X = x x^T and transform x^T A_i x = tr(A_i X).

I Obtain an equivalent non-convex problem:

min_{x, X}  tr(A_0 X) + 2 b_0^T x + c_0

s.t.  tr(A_i X) + 2 b_i^T x + c_i ≤ 0  ∀ i = 1, ..., N

X = x x^T.

I Relax the rank constraint to X ⪰ x x^T → obtain an SDP:

min_{x, X}  tr(A_0 X) + 2 b_0^T x + c_0

s.t.  tr(A_i X) + 2 b_i^T x + c_i ≤ 0  ∀ i = 1, ..., N

X ⪰ x x^T.   (2)

I Strong duality: (1) and (2) attain the same optimal value.
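The two facts behind the relaxation can be verified on random data. The sketch below checks the lifting identity x^T A x = tr(A X) for X = x x^T, and uses the Schur complement to express the relaxed constraint as one block matrix being positive semidefinite. The block form is a standard reformulation that the slide does not state explicitly, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
A = rng.normal(size=(p, p))
A = (A + A.T) / 2                      # symmetric, possibly indefinite
x = rng.normal(size=p)

X = np.outer(x, x)                     # rank-one lifting X = x x^T
quad = x @ A @ x
lifted = np.trace(A @ X)               # same value, now linear in X

def schur_block(X, x):
    # [[X, x], [x^T, 1]] is PSD  <=>  X - x x^T is PSD (Schur complement)
    return np.block([[X, x[:, None]], [x[None, :], np.ones((1, 1))]])

def is_psd(M, tol=1e-9):
    return np.linalg.eigvalsh(M).min() >= -tol

X_relaxed = X + np.eye(p)              # satisfies X >= x x^T but has full rank
X_invalid = X - 0.5 * np.eye(p)        # violates X >= x x^T
print(quad, lifted)
```

Any X of the form x x^T plus a positive semidefinite matrix is feasible for the relaxed constraint, which is why the relaxation can only lower the optimal value unless a rank-one solution exists.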

Trust region methods

I Problem: unconstrained minimization

min_x  f(x)

I Non-convex surrogate:

min_d  d^T B_k d + 2 ∇f(x_k)^T d

s.t.  ‖d‖ ≤ Δ_k,

where B_k is the Hessian of f at x_k.

I QP with a SINGLE quadratic constraint → exhibits STRONG duality.

min_x  x^T A_0 x + 2 b_0^T x + c_0

s.t.  g_1(x) = x^T A_1 x + 2 b_1^T x + c_1 ≤ 0
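A minimal sketch of how the trust-region surrogate can be solved globally through its single multiplier, even when B_k is indefinite. It uses bisection on the multiplier, which is one common approach and not necessarily the one used in the talk. The so-called hard case (gradient orthogonal to the eigenvectors of the smallest eigenvalue) is deliberately ignored, and the function names are hypothetical.

```python
import numpy as np

def trust_region_step(B, g, delta, iters=200):
    """Minimize d^T B d + 2 g^T d subject to ||d|| <= delta (hard case ignored)."""
    w, V = np.linalg.eigh(B)          # eigenvalues in ascending order
    gt = V.T @ g                      # gradient expressed in the eigenbasis

    def step(lam):
        # Stationarity of the Lagrangian: (B + lam I) d = -g
        return -V @ (gt / (w + lam))

    if w[0] > 0 and np.linalg.norm(step(0.0)) <= delta:
        return step(0.0), 0.0         # unconstrained minimizer is feasible
    lo = max(0.0, -w[0])              # multiplier must make B + lam I PSD
    hi = lo + np.linalg.norm(g) / delta + 1.0   # here ||step(hi)|| < delta
    for _ in range(iters):            # bisection: ||step(lam)|| decreases in lam
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(step(mid)) > delta:
            lo = mid
        else:
            hi = mid
    return step(hi), hi

def model(B, g, d):
    return d @ B @ d + 2 * g @ d

B = np.array([[1.0, 0.0], [0.0, -2.0]])   # indefinite: non-convex model
g = np.array([1.0, 1.0])
d, lam = trust_region_step(B, g, delta=1.0)
print(d, lam, model(B, g, d))
```

The returned multiplier satisfies B + λI ⪰ 0 together with complementary slackness, which is the certificate of global optimality that strong duality provides for this single-constraint QP.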

QP with a single equality constraint


min_x  x^T A_0 x + 2 b_0^T x + c_0

s.t.  g_1(x) = x^T A_1 x + 2 b_1^T x + c_1 = 0

I Localization problems.

I Principal component analysis (PCA):

s.t.  ‖x‖^2 = 1.
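As a sketch of the PCA instance, assume the objective is to maximize x^T C x over the unit sphere, with C a sample covariance matrix (the symbol C and the synthetic data are assumptions, since the slide's objective is not recoverable). This non-convex QP with one equality constraint is solved globally by an eigenvalue decomposition.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic data with one dominant direction (illustrative only)
data = rng.normal(size=(500, 3)) @ np.diag([3.0, 1.0, 0.3])
C = np.cov(data, rowvar=False)

w, V = np.linalg.eigh(C)       # ascending eigenvalues, orthonormal eigenvectors
x_star = V[:, -1]              # unit-norm eigenvector of the largest eigenvalue
value = x_star @ C @ x_star    # equals the largest eigenvalue

# Sanity check against random feasible points on the unit sphere
samples = rng.normal(size=(2000, 3))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)
sample_values = np.einsum("ij,jk,ik->i", samples, C, samples)
print(value, sample_values.max())
```

No sampled unit vector exceeds the eigenvalue bound, in line with the global solvability of QPs with a single quadratic equality constraint.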

QP with separable constraints

I QP with separable constraints:

min_x  x^T A_0 x + 2 b_0^T x + c_0

s.t.  x^T A_i x + 2 b_i^T x + c_i ⊴ 0  ∀ i = 1, ..., N,  N ≤ p,

where all constraints are separable, x = [x_1, ..., x_N], and ⊴ ∈ {≤, =}.

I Roadmap to establish strong duality:

[Diagram: roadmap connecting the blocks “QP w/ separable constraints”, “Transformation QP ↔ SDP”, “Strong alternatives of SDPs”, “Strong alternatives of diagonalized SDP”, “Existence of rank 1 solution” and “S-property”.]

Review of dual methods

I Consider a general optimization problem:

min_x  f(x)

s.t.  g(x) ≤ 0.

I Lagrangian: L(x, λ) = f(x) + λ^T g(x).

I Dual function:

q(λ) = min_x L(x, λ)

I Dual problem:

max_{λ ≥ 0}  q(λ)

I Algorithm:

x_k = arg min_x L(x, λ_k)

Update: λ_{k+1} = [λ_k + α_k g(x_k)]_+
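The dual update can be sketched on a toy convex problem, min x^2 s.t. 1 − x ≤ 0, whose solution is x = 1 with multiplier λ = 2. The problem, the constant step size and the function name are assumptions chosen only to keep the example small.

```python
def dual_ascent(steps=200, alpha=0.5, lam=0.0):
    """Dual ascent for  min x^2  s.t.  1 - x <= 0  (toy convex problem)."""
    x = 0.0
    for _ in range(steps):
        # Inner step: argmin_x of L(x, lam) = x^2 + lam * (1 - x)
        x = lam / 2.0
        # Outer step: ascent along g(x) = 1 - x, projected onto lam >= 0
        lam = max(0.0, lam + alpha * (1.0 - x))
    return x, lam

x_opt, lam_opt = dual_ascent()
print(x_opt, lam_opt)
```

The constraint value g(x_k) is a (sub)gradient of the dual function at λ_k, so the update is a projected gradient ascent on q(λ).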








I Karush-Kuhn-Tucker conditions (necessary):

∇_x f(x) + ∇_x (λ^T g(x)) = 0

λ^T g(x) = 0

g(x) ≤ 0,  λ ≥ 0

Constraint Qualifications (CQs)

I CQs correspond to topological features of the feasible set.

I If satisfied, they guarantee the existence of dual variables for the KKT conditions.

I If not satisfied, dual variables that fulfill the KKT conditions may not exist.

I Examples:

    I Slater’s condition: for convex g_i(x), the CQ is satisfied if ∃ x | g_i(x) < 0 for all i.

    I Linear independence constraint qualification (LICQ): the gradients of the active inequality constraints are linearly independent.

    I S-property:

        I Defined as a system of equivalences.

        I Much stricter than Slater’s or LICQ → guarantees zero gap with the dual problem.

S-property & roadmap (recap)

Definition (S-property)

A QP satisfies the S-property if and only if the following statements are equivalent for every α ∈ R:

I f(x) ≥ α for every feasible x.

I ∃ λ ∈ Γ | L(x, λ) ≥ α for all x ∈ R^p.

I Roadmap to establish strong duality (recap):

[Diagram: roadmap connecting the blocks “QP w/ separable constraints”, “Transformation QP ↔ SDP”, “Strong alternatives of SDPs”, “Strong alternatives of diagonalized SDP”, “Existence of rank 1 solution” and “S-property”.]
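The link between the S-property and a zero duality gap can be written out as a short derivation. The symbols p^⋆ (primal optimal value), d^⋆ (dual optimal value) and q(λ) = inf_x L(x, λ) are notation assumed here for illustration; Γ is the multiplier set from the definition.

```latex
\begin{align*}
p^\star &= \sup\{\alpha : f(x) \ge \alpha \ \ \forall x \text{ feasible}\} \\
        &= \sup\{\alpha : \exists\, \lambda \in \Gamma,\ L(x,\lambda) \ge \alpha \ \ \forall x \in \mathbb{R}^p\}
           && \text{(S-property)} \\
        &= \sup\{\alpha : \exists\, \lambda \in \Gamma,\ q(\lambda) \ge \alpha\}
           && \text{since } q(\lambda) = \inf_x L(x,\lambda) \\
        &= \sup_{\lambda \in \Gamma} q(\lambda) \;=\; d^\star .
\end{align*}
```

The second equality is the only step that uses the S-property; the others are definitions.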

Results on strong duality

A set of matrices {A_1, A_2, ..., A_N} is said to be simultaneously diagonalizable via congruence if there exists a nonsingular matrix P such that P^T A_i P is diagonal for every matrix A_i.

I Introduction of variables (from the QP):

Ā_i = [ A_i  b_i ; b_i^T  c_i ],  P ∈ S^{p+1}.

I F_i = P^T Ā_i P for all i ∈ {1, ..., N} become diagonal.

I F_0 is not necessarily diagonal; only the constraint matrices are.

Theorem (Zazo et al.)

Given a QP with separable constraints, suppose Slater’s assumption is satisfied and that b_i ∈ range[A_i] for every i ∈ N. Furthermore, assume there exists a diagonal matrix D whose elements are ±1 such that D F_0 D is a Z-matrix. Then, the S-property holds.
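The sign condition of the theorem can be tested by enumeration on small matrices. The sketch below uses the standard definition of a Z-matrix (all off-diagonal entries nonpositive) and searches every diagonal D with ±1 entries. The example matrices and function names are hypothetical, and the exhaustive search is exponential in the dimension, so it is meant for illustration only.

```python
from itertools import product
import numpy as np

def is_z_matrix(M, tol=1e-12):
    """True if every off-diagonal entry of M is nonpositive."""
    off = M - np.diag(np.diag(M))
    return bool(np.all(off <= tol))

def find_sign_scaling(F0):
    """Return a diagonal D with +-1 entries such that D F0 D is a Z-matrix, or None."""
    n = F0.shape[0]
    for signs in product((1.0, -1.0), repeat=n):
        D = np.diag(signs)
        if is_z_matrix(D @ F0 @ D):
            return D
    return None

# Off-diagonal signs can be fixed by flipping the second coordinate
F0 = np.array([[ 2.0, 1.0, -3.0],
               [ 1.0, 0.0,  0.5],
               [-3.0, 0.5,  1.0]])
D = find_sign_scaling(F0)

# All off-diagonals positive on a 3-cycle: no sign pattern works
F_bad = np.ones((3, 3))
D_bad = find_sign_scaling(F_bad)
print(D, D_bad)
```

Since (D F_0 D)_ij = d_i d_j (F_0)_ij, the condition asks for a sign assignment that makes every nonzero off-diagonal product nonpositive, which can fail on odd cycles of positive entries.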

Robust least squares I

Application example

I Least squares problem:

min_x  ‖Ax − b‖^2

I Robust least squares (RLS):

min_x max_{ΔA, Δb}  ‖(A + ΔA)x − (b + Δb)‖^2

s.t.  ‖(ΔA, Δb)‖_F^2 ≤ ρ,

I Our proposed RLS formulation:

min_x max_{ΔA, Δb}  ‖(A + ΔA)x − (b + Δb)‖^2

s.t.  ‖(ΔA)_{:i}‖^2 ≤ ρ_i  ∀ i ∈ {1, ..., p},  ‖Δb‖^2 ≤ ρ_{p+1}

How to solve a min-max problem

min_x max_{y∈Y}  φ(x, y)

I Proposed method:

1. Use (sub)gradient descent on the minimization variable.

2. At each step, solve the maximization problem globally.

I We need to compute gradients of a maximization mapping. Define f(x) = max_{y∈Y} φ(x, y). Then

∇_x f(x) = ∇_x φ(x, y*),

where y* = arg max_{y∈Y} φ(x, y) (Danskin’s theorem).

I The inner maximization problem is non-convex in the RLS problem.

I Can we solve it optimally? → We can try semidefinite relaxation.
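The two-step method can be sketched on a scalar toy problem, φ(x, y) = (x − y)^2 with y ∈ [−1, 1]. This choice is an assumption made for illustration: the inner problem maximizes a convex function, so it is non-convex like the RLS case, yet its global solution is simply the farthest endpoint. The step-size rule and function names are also assumptions.

```python
def inner_max(x):
    """Globally solve max over y in [-1, 1] of (x - y)^2: the farthest endpoint."""
    return -1.0 if x >= 0 else 1.0

def f(x):
    # Value of the maximization mapping f(x) = max_y phi(x, y)
    y = inner_max(x)
    return (x - y) ** 2

def minimize(x=3.0, iters=200):
    for k in range(iters):
        y_star = inner_max(x)
        # Danskin: a (sub)gradient of f at x is d/dx phi(x, y) at y = y_star
        grad = 2.0 * (x - y_star)
        x -= 0.5 / (k + 1) * grad      # diminishing step size
    return x

x_min = minimize()
print(x_min, f(x_min))
```

At x = 0 the inner maximizer is not unique, so f is not differentiable there and the iteration is a subgradient method; the diminishing step size is what makes it settle near the minimizer.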

RLS: Strong duality result

Theorem (Zazo et al.)

Strong duality between the primal problem and its dual holds for any H ∈ R^{N×(p+1)} and x̄ = (x^T, −1)^T ∈ R^{p+1}.

Sketch of proof:

1. Reformulate the problem into standard form:

min_x  x^T A_0 x + 2 b_0^T x + c_0

s.t.  x^T A_i x + 2 b_i^T x + c_i ≤ 0  ∀ i = 1, ..., N,  N ≤ p

2. Show A_0 is a completely positive matrix.

3. Determine matrix P, and compute F_0 = P^T A_0 P.

4. Verify there exists a diagonal D such that D F_0 D is a completely positive matrix.

5. This satisfies the requirements of the previous theorem.

Sensor network localization problem

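I Problem formulation:

min_x  Σ_{i,j} (‖x_i − s_j‖^2 − d_ij^2)^2 + Σ_{i,j} (‖x_i − x_j‖^2 − d_ij^2)^2

where x_i are the unknown node positions, s_j the anchor positions and d_ij the measured distances.

I Network example:

[Figure: example network on a region spanning −50 to 50, with anchor nodes and unknown nodes marked.]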
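A small sketch of the squared-range cost may help fix the notation. The index sets of the two sums are not recoverable from the slide, so the code assumes they run over the node-anchor and node-node pairs for which a distance measurement is available; the dictionaries `d_anchor` and `d_node`, the function name and the tiny network are all hypothetical.

```python
import numpy as np

def snl_cost(X, S, d_anchor, d_node):
    """Squared-range localization cost.
    X: (n, 2) unknown node positions; S: (m, 2) anchor positions.
    d_anchor[(i, j)]: measured distance between node i and anchor j.
    d_node[(i, j)]:   measured distance between nodes i and j."""
    cost = 0.0
    for (i, j), d in d_anchor.items():
        cost += (np.sum((X[i] - S[j]) ** 2) - d ** 2) ** 2
    for (i, j), d in d_node.items():
        cost += (np.sum((X[i] - X[j]) ** 2) - d ** 2) ** 2
    return cost

# Tiny noiseless network: 3 anchors, 2 unknown nodes
S = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X_true = np.array([[3.0, 4.0], [6.0, 8.0]])
d_anchor = {(i, j): np.linalg.norm(X_true[i] - S[j])
            for i in range(2) for j in range(3)}
d_node = {(0, 1): np.linalg.norm(X_true[0] - X_true[1])}
print(snl_cost(X_true, S, d_anchor, d_node))
```

Each term is a quartic polynomial in the positions, so the problem is non-convex; with noiseless measurements the true positions give zero cost.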


Simulations

N = 17 nodes with unknown position. Noiseless and σ = 1. Low connectivity.

[Figure: two plots against the iteration number (0 to 200) comparing FLEXA poly 2, FLEXA poly 4 and dual ascent; the second plot also shows the optimal solution.]

Simulations

N = 11 nodes with unknown position. Noiseless and σ = 1. High connectivity.

[Figure: two plots against the iteration number (0 to 200) comparing FLEXA poly 2, FLEXA poly 4 and dual ascent; the second plot also shows the optimal solution.]

SDP Complexity

I SDP methods scale badly with the size of the problem → worst-case complexity O(p^5)!

I Descent techniques on the primal problem may converge to local optima!

I We explore parallel techniques with optimality guarantees.

1. Projected gradient ascent method.

2. Majorization-minimization methods.

Projected dual (sub)gradient method

I Necessary condition for optimality:

A_0 + Σ_i λ_i A_i ⪰ 0

I We define W = {λ ∈ Γ | A_0 + Σ_i λ_i A_i ⪰ 0}.

I The dual problem is:

max_{λ∈W}  q(λ)

I Algorithm:

x_k = arg min_x L(x, λ_k)

Update: λ_{k+1} = Π_W[λ_k + α_k g(x_k)]
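The projected update can be sketched on a scalar non-convex QP, min −x^2 + 2x s.t. x^2 − 1 ≤ 0, whose global solution is x = −1 with value −3. The problem, the step size and the small margin `eps` that keeps the Lagrangian strictly convex are assumptions made for this illustration.

```python
def projected_dual_ascent(lam=5.0, alpha=0.1, iters=300, eps=1e-3):
    """Toy non-convex QP:  min -x^2 + 2x  s.t.  x^2 - 1 <= 0.
    Here A_0 = -1 and A_1 = 1, so W = {lam >= 0 : -1 + lam >= 0} = [1, inf)."""
    for _ in range(iters):
        # argmin_x of L(x, lam) = (lam - 1) x^2 + 2 x - lam, valid for lam > 1
        x = -1.0 / (lam - 1.0)
        g = x * x - 1.0                           # constraint value = dual (sub)gradient
        lam = max(1.0 + eps, lam + alpha * g)     # projection onto W (with margin eps)
    x = -1.0 / (lam - 1.0)                        # primal point for the final multiplier
    q = (lam - 1.0) * x * x + 2.0 * x - lam       # dual value q(lam) = L(x, lam)
    return x, lam, q

x_opt, lam_opt, q_opt = projected_dual_ascent()
print(x_opt, lam_opt, q_opt)
```

Outside W the Lagrangian is unbounded below in x, so the projection is what keeps the inner minimization well defined; inside W the dual value matches the primal optimum in this single-constraint example.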








FLEXA decomposition I

Parallel updates

I FLEXA is a decomposition framework to solve non-convex problems in parallel.

I FLEXA solves problems of the following type:




s.t. g(x) ≤ 0

I The framework uses strongly convex surrogate functions:

G. Scutari, F. Facchinei, P. Song, D. P. Palomar, and J.-S. Pang, “Decomposition by partial linearization: Parallel optimization of multi-agent systems,” IEEE TSP, vol. 62, no. 3, pp. 641–656, Feb. 2014.

Distributed proposal using an extra constraint

I Alternative complementary slackness formulation:

A_0 + Σ_{i∈N} λ_i A_i ⪰ 0,  Z ⪰ 0,  Z ⊥ A_0 + Σ_{i∈N} λ_i A_i

I Dual problem:


tr[(A0 +




s.t.  Z ⪰ 0.

I Algorithm:

(x_{k+1}, λ_{k+1}) = arg min_{x,λ} L(x, λ, Z_k)

Update: Z_{k+1} = [Z_k + α_k (A_0 + Σ_i λ_i^{k+1} A_i)]_+
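The matrix update relies on the operator [·]_+. Assuming it denotes the Euclidean projection onto the positive semidefinite cone, it can be computed by clipping negative eigenvalues, as in the sketch below. The function names and the small test matrices are hypothetical.

```python
import numpy as np

def project_psd(M):
    """Frobenius-norm projection of a symmetric matrix onto the PSD cone."""
    M = (M + M.T) / 2                      # guard against rounding asymmetry
    w, V = np.linalg.eigh(M)
    return (V * np.maximum(w, 0.0)) @ V.T  # keep only nonnegative eigenvalues

def z_update(Z, alpha, A0, A_list, lam):
    """One update Z <- [Z + alpha * (A0 + sum_i lam_i A_i)]_+ ."""
    S = A0 + sum(l * A for l, A in zip(lam, A_list))
    return project_psd(Z + alpha * S)

M = np.array([[1.0, 2.0], [2.0, 1.0]])     # eigenvalues 3 and -1
print(project_psd(M))

Z_next = z_update(np.zeros((2, 2)), 1.0, np.diag([1.0, -1.0]), [np.eye(2)], [0.5])
print(Z_next)
```

Each update costs one eigenvalue decomposition of a matrix of the size of A_0, which is the dominant cost of this variant.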








Dynamic games

[Figure: diagram of several interacting agents.]

I Formulated as a Nash equilibrium (NE) problem.

I Two possible NE solution concepts:

1. Open loop NE.

2. Closed loop NE.


Demand-side management in smart grids

I Minimize energy costs.

I Acknowledge uncertainty in price planning.

I Min-max game formulation → distributed framework.

I Quadratic programs with separable constraints.

1. We study the conditions for QPs with separable constraints to exhibit the S-property.

2. Novel algorithmic framework → new research lines.

I Squared-range localization problems: TL and SNL problems

1. Primal methods (ADMM, NEXT, Diffusion).

2. Dual methods (centralized and distributed).

I Algorithmic framework

1. Projected subgradient method → efficient but slow.

2. Decomposition technique based on FLEXA → efficient and fast.

I Other works involving dynamic games and smart grid problems.

