
A Low-Complexity Parallelizable Numerical Algorithm for Sparse Semidefinite Programming

Ramtin Madani, Abdulrahman Kalbat and Javad Lavaei

Abstract—In the past two decades, the semidefinite programming technique has been proven to be extremely successful in the convexification of hard optimization problems appearing in graph theory, control theory, polynomial optimization theory, and many areas in engineering. In particular, major power optimization problems, such as optimal power flow, state estimation and unit commitment, can be formulated or well approximated as semidefinite programs (SDPs). However, the inability to efficiently solve large-scale SDPs is an impediment to the deployment of such formulations in practice. Motivated by the significant role of SDPs in revolutionizing the decision-making process for real-world systems, this paper designs a low-complexity numerical algorithm for solving sparse SDPs, using the alternating direction method of multipliers and the notion of tree decomposition in graph theory. The iterations of the designed algorithm are highly parallelizable and enjoy closed-form solutions, whose most expensive computation amounts to eigenvalue decompositions over certain submatrices of the SDP matrix. The proposed algorithm is a general-purpose parallelizable SDP solver for sparse SDPs, and its performance is demonstrated on the SDP relaxation of the optimal power flow problem for real-world benchmark systems with more than 13,600 nodes.

I. INTRODUCTION

Inspired by the seminal papers [1]–[3], there has been a growing interest in semidefinite programming (SDP), due in part to its applications in combinatorial optimization and a large set of real-world problems across engineering [4]–[7]. Semidefinite programming offers a convex formulation or relaxation framework that is applicable to a wide range of non-convex optimization problems, and has been proven to achieve nontrivial bounds and approximation ratios that are beyond the reach of conventional methods [3], [8]–[10]. While small- to medium-sized SDPs are efficiently solvable by second-order-based interior point methods in polynomial time up to any arbitrary precision [11], these methods are mostly impractical for large-scale SDPs due to computation time and memory issues. The primary obstacle is the requirement of calculating Schur complement matrices and their Cholesky factorizations. Several attempts have been made in order to parallelize this procedure, which have led to software packages such as SDPA and SMCP [12]–[14]. In the presence of sparsity, a graph-theoretic analysis of SDP problems is proven to be effective in reducing the number of variables and the order of conic constraints [15], [16]. This approach is remarkably helpful for structured SDP problems, particularly those arising in power system optimization [17].

Ramtin Madani is with the Electrical Engineering Department at the University of Texas at Arlington ([email protected]). Abdulrahman Kalbat is with the Electrical Engineering Department, United Arab Emirates University ([email protected]). Javad Lavaei is with the Department of Industrial Engineering and Operations Research, University of California, Berkeley, and also with the Tsinghua-Berkeley Shenzhen Institute ([email protected]). This work was supported by ONR YIP Award, DARPA Young Faculty Award, AFOSR YIP Award, NSF CAREER Award 1351279, and NSF EECS Award 1552096. Parts of this work have appeared in the conference paper “Ramtin Madani, Abdulrahman Kalbat and Javad Lavaei, ADMM for sparse semidefinite programming with applications to optimal power flow problem, IEEE Conference on Decision and Control, 2015”.

A promising numerical technique for solving large-scale SDP problems is the alternating direction method of multipliers (ADMM), which is a first-order optimization algorithm proposed in the mid-1970s [18], [19]. While second-order methods are capable of achieving a high accuracy via expensive iterations, a modest accuracy can be attained through tens of ADMM’s low-complexity iterations. In order to obtain a highly accurate solution in a reasonable number of iterations, great effort has been devoted to accelerating ADMM [20], [21]. Because of the sensitivity of gradient methods to the condition number of the problem’s data, diagonal rescaling is proposed in [22] to improve the performance of ADMM. Moreover, several accelerated variants of ADMM as well as parameter tuning methods have been proposed in the literature to significantly improve the speed of convergence for specific application domains [20]. The O(1/n) worst-case convergence rate of ADMM is proven in [23] and [24] under certain assumptions, and a systematic framework is introduced in [25] for the convergence analysis of ADMM by means of control-theoretic methods.

The main objective of this work is to design a general-purpose SDP solver for sparse large-scale SDPs, with guaranteed convergence and parallelization capabilities under mild assumptions. We start by defining a representative graph for the large-scale SDP problem, from which a decomposed SDP formulation is obtained using a tree/chordal/clique decomposition technique. This decomposition replaces the large-scale SDP matrix variable with certain submatrices of this matrix. In order to solve the decomposed SDP problem iteratively, a distributed ADMM-based algorithm is derived, whose iterations comprise entry-wise matrix multiplication/division and eigendecomposition on certain submatrices of the SDP matrix. By finding the optimal solution for the distributed SDP, one could recover the solution to the original SDP formulation using an explicit formula.

This work is related to and improves upon some recent papers in this area. The paper [26] applies ADMM to the dual SDP formulation, leading to a centralized algorithm that is not parallelizable and is computationally expensive for large-scale SDPs. The work [16] decomposes a sparse SDP into smaller-sized SDPs through a tree decomposition, which are then solved by interior point methods. However, this approach is limited by the large number of consistency constraints. Using a first-order splitting method, [27] solves the decomposed SDP problem created by [16], but the algorithm needs to solve an optimization subproblem at every iteration. Similar frameworks with the requirement of solving smaller SDPs have been applied to power optimization problems [28]–[32]. In contrast with the above papers, the algorithm proposed in this work is composed of low-complexity and parallelizable iterations, which run fast if the treewidth of the sparsity graph of the SDP problem is small. Since a wide range of real-world optimization problems, including those appearing in power systems, are sparse and benefit from a low treewidth, the proposed algorithm enables solving such problems at scale. This algorithm offers the following advantages compared to the method developed in our related work [33]: i) it can handle arbitrary constraints that are not necessarily local, and ii) it does not rely on the inversion of large matrices as an initial step. Both of these improvements are essential for solving general large-scale sparse SDP problems, including the power system optimization problems to be discussed later. This paper is also related to [34], which designs a distributed algorithm for second-order conic programs. In contrast to the existing methods, the algorithm to be proposed in this paper applies to higher-order conic problems and does not require solving any optimization sub-problem at any iteration.

The paper [35], as a conference version of this work, studies the potential of ADMM for solving semidefinite programming problems. However, it fails to solve large-scale SDPs and its examples are limited to matrices with at most 300 rows. The current paper improves upon [35] by means of an accelerated version of ADMM combined with a preconditioner, which enables solving real-world problems with over 13,600 rows in the SDP matrix.

A. Motivation: Power System Optimization

Real-world power optimization problems are concerned with the efficiency, robustness, reliability, security and resiliency of power systems, whose decision variables consist of various real-time parameters across a time horizon and under several failure scenarios. These parameters include voltages, currents, phase angles, power productions, line flows, transformer settings, and the on/off statuses of generators and lines. Several factors contribute to the high computational complexity of power optimization problems, such as the nonlinearities induced by laws of physics and discrete variables, the scale of modern grids, the wide range of failure scenarios, and the level of uncertainty for demand and renewable energy sources. The above-mentioned factors give rise to non-convex mixed-integer optimization problems with tens of thousands of decision parameters. While the expected level of efficiency and reliability in recent years necessitates the use of accurate models of power systems that are inevitably highly nonconvex, current state-of-the-art solvers such as CPLEX, Gurobi and MOSEK are incapable of handling the continuous non-convexity and large number of discrete parameters arising in real-world power system optimization problems.

Recent approaches to tackle computationally hard power optimization problems rely on convex algebraic and/or geometric methods, such as conic relaxations and Sherali-Adams hierarchies [17], [36]–[38]. These advanced techniques are based on solving SDPs with a considerably large number of variables and constraints. Hence, it is imperative to design efficient and fast algorithms for large-scale SDPs, which are applicable to fundamental power optimization problems such as optimal power flow (OPF). The OPF problem is at the heart of the operation of power systems; it finds an optimal operating point of a power system by minimizing a certain objective function (e.g., transmission loss or generation cost) subject to power flow equations and operational constraints [39], [40].

Several optimization techniques have been studied for the OPF problem in recent years [41]. Due to the non-convexity and NP-hardness of OPF, these algorithms are not robust, lack performance guarantees and may not find a global optimum. The paper [7] evaluates the potential of SDP relaxations for OPF and shows that a global minimum of the problem can be found using an appropriate SDP formulation if the duality gap is zero. SDP relaxation is shown to find global or near-globally optimal solutions (with global optimality guarantees of at least 99%) for IEEE and Polish systems, and is theoretically proven to work under different conditions [17], [36], [42]–[46]. Moreover, more advanced SDP relaxations based on the sum-of-squares hierarchy have been proven to be effective for solving hard instances of the OPF problem [37]. Note that the SDP relaxation is not unique, and therefore if one relaxation does not work for a particular power problem, it is always possible to find a more complex SDP formulation to solve the problem (to obtain different hierarchies of convex relaxation, please refer to the tutorial paper [47] and the references therein).

Due to the great success of SDP for the OPF problem, conic relaxations have been designed for other power optimization problems, including state estimation, unit commitment and charging of electric vehicles [38], [48]–[50]. However, the high dimension of these conic formulations for real-world systems is an impediment to their implementation. The main objective of this work is to design a low-complexity numerical algorithm, or a general-purpose SDP solver, for large-scale conic problems that can be used for a variety of problems, including those appearing in the operation of power systems.

This paper is organized as follows. Some preliminaries and definitions are provided in Section II. An arbitrary sparse SDP is converted into a decomposed SDP in Section III, for which a numerical algorithm is developed in Section IV. The application of this algorithm to OPF is investigated in Section V. Numerical examples are given in Section VI, followed by concluding remarks in Section VII.

Notations: R, C, and Hn denote the sets of real numbers, complex numbers, and n×n Hermitian matrices, respectively. The notation X1 ◦ X2 refers to the Hadamard (entrywise) multiplication of matrices X1 and X2. The symbols 〈·, ·〉 and ‖·‖F denote the Frobenius inner product and norm of matrices, respectively. The notation ‖v‖2 denotes the ℓ2-norm of a vector v. The m×n rectangular identity matrix, whose (i, j) entry is equal to the Kronecker delta δij, is denoted by Im×n. The notations Re{W}, Im{W}, rank{W}, and diag{W} denote the real part, imaginary part, rank, and diagonal of a Hermitian matrix W, respectively. Given a vector v, the notation diag{v} denotes a diagonal square matrix whose entries are given by v. The notation W ⪰ 0 means that W is Hermitian and positive semidefinite. The notation “i” is reserved for the imaginary unit. The superscripts (·)* and (·)T represent the conjugate transpose and transpose operators, respectively. Given a matrix W, its (l, m) entry is denoted as Wlm. The subscript (·)opt is used to refer to an optimal solution of an optimization problem. Given a matrix W, its Moore-Penrose pseudoinverse is denoted as pinv{W}. Given a simple graph H, its vertex and edge sets are denoted by VH and EH, respectively, and the graph H is shown as H = (VH, EH). Given two sets S1 and S2, the notation S1\S2 denotes the set of all elements of S1 that do not belong to S2. Given a Hermitian matrix W and two sets of positive integers S1 and S2, define W{S1, S2} as the submatrix of W obtained through two operations: (i) removing all rows of W whose indices do not belong to S1, and (ii) removing all columns of W whose indices do not belong to S2. For instance, W{{1, 2}, {2, 3}} is a 2×2 matrix with the entries W12, W13, W22, W23. The notation |D| shows the cardinality of a discrete set D or the number of vertices of a graph D.
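The submatrix notation W{S1, S2} corresponds directly to open-mesh indexing with NumPy's `np.ix_`; a minimal sketch (the matrix W below is an arbitrary illustrative example, not taken from the paper):

```python
import numpy as np

# an arbitrary 4x4 matrix, for illustration only: W[l-1, m-1] plays the role of W_lm
W = np.arange(1, 17).reshape(4, 4)

# W{S1, S2}: keep the rows indexed by S1 and the columns indexed by S2 (1-based sets)
S1, S2 = [1, 2], [2, 3]
sub = W[np.ix_([i - 1 for i in S1], [j - 1 for j in S2])]
# sub is the 2x2 matrix with entries W12, W13, W22, W23
```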

II. PRELIMINARIES

Consider the semidefinite program

minimize_{X∈Hn}   〈X, M0〉   (1a)
subject to   ls ≤ 〈X, Ms〉 ≤ us,   s = 1, . . . , p,   (1b)
             X ⪰ 0,   (1c)

where M0,M1, . . . ,Mp ∈ Hn, and

(ls, us) ∈ ({−∞} ∪ R)× (R ∪ {+∞})

for every s = 1, . . . , p. Notice that the constraint (1b) reduces to an equality constraint if ls = us. Problem (1) is computationally expensive for a large n due to the presence of the positive semidefinite constraint (1c). However, if M0, M1, . . . , Mp are sparse, this expensive constraint can be decomposed and expressed in terms of some principal submatrices of X with smaller dimensions. This will be explained next.

A. Representative Graph and Tree Decomposition

In order to leverage any possible sparsity of problem (1), a simple graph shall be defined to capture the zero-nonzero patterns of M0, M1, . . . , Mp.

Definition 1 (Representative graph [51]). Define G = (VG, EG) as the representative graph of the SDP problem (1), which is a simple graph with n vertices whose edges are specified by the nonzero off-diagonal entries of M0, M1, . . . , Mp. In other words, two arbitrary vertices i and j are connected if the (i, j) entry of at least one of the matrices M0, M1, . . . , Mp is nonzero.

To illustrate Definition 1, consider problem (1) with n = 4 and p = 1 such that

M0 = [1 0 1 0; 0 5 3 0; 1 3 10 2; 0 0 2 1],   M1 = [0 3 0 0; 3 0 1 2; 0 1 0 1; 0 2 1 1]   (2)

The representative graph of the SDP problem (1) in this case is a graph with the vertex set {1, 2, 3, 4} such that every two vertices are connected to one another except for the vertices 1 and 4. The reason is that the (1, 4) entries of M0 and M1 are both equal to 0.

Using a tree decomposition algorithm (also known as chordal or clique decomposition), we can obtain a decomposed formulation for problem (1), in which the positive semidefinite requirement is imposed on certain principal submatrices of X as opposed to X itself.
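The construction of the representative graph from the data matrices can be sketched as follows (`representative_graph` is a hypothetical helper of ours, using 1-based vertex labels to match the text):

```python
import numpy as np

def representative_graph(mats):
    """Edge {i, j} exists iff the (i, j) entry of at least one matrix is nonzero."""
    n = mats[0].shape[0]
    return {(i + 1, j + 1)
            for i in range(n) for j in range(i + 1, n)
            if any(M[i, j] != 0 for M in mats)}

# the matrices of the example in (2)
M0 = np.array([[1, 0, 1, 0], [0, 5, 3, 0], [1, 3, 10, 2], [0, 0, 2, 1]])
M1 = np.array([[0, 3, 0, 0], [3, 0, 1, 2], [0, 1, 0, 1], [0, 2, 1, 1]])
edges = representative_graph([M0, M1])
# every pair of vertices is connected except {1, 4},
# since the (1, 4) entries of M0 and M1 are both 0
```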

Definition 2 (Tree decomposition [52]). A tree graph T is called a tree decomposition of G if it satisfies the following properties:

1) Every node of T corresponds to and is identified by a subset of VG.
2) Every vertex of G is a member of at least one node of T.
3) Tk is a connected graph for every k ∈ VG, where Tk denotes the subgraph of T induced by all nodes of T containing the vertex k of G.
4) The subgraphs Ti and Tj have at least one node in common, for every (i, j) ∈ EG.

Each node of T is a bag (collection) of vertices of G and hence is referred to as a bag.

As an example, Figure 1, borrowed from [51], shows a tree decomposition of the graph corresponding to the physical structure of the IEEE 14-bus power network. Notice that the four properties are satisfied: 1) every bag of the tree decomposition is a collection of the vertices of the graph, 2) every vertex of the graph appears in at least one bag of the tree decomposition, 3) all bags containing each particular vertex of the graph form a connected subgraph in the tree decomposition (for example, there are three bags containing node 2 and they form a connected path), and 4) every two connected vertices of the graph appear in at least one common bag of the tree decomposition.

Let T = (VT, ET) be an arbitrary tree decomposition of G, with the set of bags VT = {C1, C2, . . . , Cq}. As will be discussed in the next section, it is possible to cast problem (1) in terms of those entries of X that appear in at least one of the submatrices X{C1, C1}, X{C2, C2}, . . . , X{Cq, Cq}. These entries of X are referred to as important entries. Once the optimal values of the important entries of X are found via an iterative algorithm, the remaining entries of X can be obtained through an explicit formula to be stated later.
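The properties of Definition 2 can be checked programmatically. The sketch below is a hypothetical verifier of ours, applied to the running 4-vertex example with the two bags {1, 2, 3} and {2, 3, 4} joined by a single tree edge:

```python
from collections import deque

def is_tree_decomposition(n, graph_edges, bags, tree_edges):
    """Check Definition 2 for vertices 1..n; bags is a list of sets and
    tree_edges are index pairs into bags describing the tree T."""
    # property 2: every vertex of G is a member of at least one bag
    if set().union(*bags) != set(range(1, n + 1)):
        return False
    # property 4: every edge of G is contained in at least one bag
    if any(not any({i, j} <= bag for bag in bags) for (i, j) in graph_edges):
        return False
    # property 3: the bags containing each vertex k induce a connected subtree T_k
    adj = {r: set() for r in range(len(bags))}
    for a, b in tree_edges:
        adj[a].add(b); adj[b].add(a)
    for k in range(1, n + 1):
        nodes = {r for r, bag in enumerate(bags) if k in bag}
        seen, queue = {min(nodes)}, deque([min(nodes)])
        while queue:                             # BFS restricted to `nodes`
            for s in adj[queue.popleft()] & nodes - seen:
                seen.add(s); queue.append(s)
        if seen != nodes:
            return False
    return True

# the running example: 4 vertices, all pairs connected except {1, 4}
ok = is_tree_decomposition(4, [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)],
                           bags=[{1, 2, 3}, {2, 3, 4}], tree_edges=[(0, 1)])
```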

Among the factors that may contribute to the computational complexity of the decomposed problem are: the size of the largest bag, the number of bags, and the total number of important entries. Finding a tree decomposition that leads to the minimum number of important entries (minimum fill-in problem) or possesses the minimum size for its largest bag (treewidth problem) is known to be NP-hard. The algorithm proposed in this paper reaches its maximum efficiency when a tree decomposition of the sparsity graph with max{|C1|, . . . , |Cq|} = O(1) is already available. This algorithm only requires a suboptimal tree decomposition with a low width. Good decompositions can be easily found using the nested dissection method. The work [53] proves that nested dissection is O(log(|G|)) suboptimal for bounded-degree graphs, and notes that: “we do not know a class of graphs for which nested dissection is suboptimal by more than a constant factor”. There are many other efficient algorithms in the literature that find near-optimal tree decompositions (especially for power networks, due to their near planarity) [54], [55]. In all of the experiments of this paper, a tree decomposition of the sparsity graph is obtained in less than 90 seconds, using the algorithm described in [54]. Moreover, optimization problems defined over physical infrastructures, such as power systems, often benefit from the fact that the topology of the system changes slowly, and therefore the tree decomposition may be performed offline and used for a class of SDP problems as opposed to a single instance of the problem.

Fig. 1: The graph of the IEEE 14-bus test case (left figure) and a tree decomposition of this graph (right figure). [Figure omitted; the bags of the tree decomposition are {1,2,5}, {2,4,5}, {2,3,4}, {4,5,9}, {4,7,9}, {7,8}, {5,6,9}, {6,9,13}, {6,12,13}, {9,13,14}, {6,9,11}, {9,10,11}.]
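For intuition, one simple way to obtain a (generally suboptimal) tree decomposition is a greedy minimum-degree elimination ordering; the sketch below is our own illustration of that heuristic and is neither the nested-dissection method nor the algorithm of [54]:

```python
def min_degree_bags(n, edges):
    """Eliminate a minimum-degree vertex at each step; the recorded sets
    {v} ∪ N(v) are the bags of a tree decomposition of the filled-in graph."""
    adj = {v: set() for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].add(j); adj[j].add(i)
    bags = []
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))   # minimum-degree vertex
        nbrs = adj.pop(v)
        bags.append({v} | nbrs)
        for a in nbrs:                            # fill-in: make N(v) a clique
            adj[a] |= nbrs - {a}
            adj[a].discard(v)
    # width of the decomposition = (maximum bag size) - 1
    return bags

# the running 4-vertex example yields bags of size at most 3, i.e. width 2
bags = min_degree_bags(4, [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)])
```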

B. Sparsity Pattern of Matrices

Let Fn denote the set of symmetric n×n matrices with entries belonging to the set {0, 1}. The distributed optimization scheme to be proposed in this work uses a group of sparse slack matrices. We identify the locations of nonzero entries of such matrix variables using descriptive matrices in Fn.

Definition 3. Given an arbitrary matrix X ∈ Hn, define its sparsity pattern as a matrix N ∈ Fn such that Nij = 1 if and only if Xij ≠ 0, for every i, j ∈ {1, . . . , n}. Let |N| denote the number of nonzero entries of N. Define the set

S(N) ≜ {X ∈ Hn | X ◦ N = X}.

Due to the Hermitian property of X, if d denotes the number of nonzero diagonal entries of N, then every X ∈ S(N) can be specified by (|N| + d)/2 real-valued scalars corresponding to Re{X} and (|N| − d)/2 real scalars corresponding to Im{X}. Therefore, S(N) is |N|-dimensional over R.
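The dimension count above can be verified on a small example (the pattern N below is an arbitrary choice of ours):

```python
import numpy as np

# an arbitrary symmetric 0/1 pattern with two nonzero diagonal entries
N = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0]])
size = int(N.sum())          # |N|, the number of nonzero entries
d = int(np.trace(N))         # number of nonzero diagonal entries
re_dof = (size + d) // 2     # real scalars parameterizing Re{X}
im_dof = (size - d) // 2     # real scalars parameterizing Im{X}
# the total real dimension of S(N) equals |N|
```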

Definition 4. Suppose that T = (VT, ET) is a tree decomposition of the representative graph G with the bags C1, C2, . . . , Cq.

• For r = 1, . . . , q, define Cr ∈ Fn as a sparsity pattern whose (i, j) entry is equal to 1 if {i, j} ⊆ Cr and is 0 otherwise, for every i, j ∈ {1, . . . , n}.
• Define C ∈ Fn as an aggregate sparsity pattern whose (i, j) entry is equal to 1 if and only if {i, j} ⊆ Cr for at least one index r ∈ {1, . . . , q}.
• For s = 0, 1, . . . , p, define Ns ∈ Fn as the sparsity pattern of Ms.

The sparsity pattern C, which can also be interpreted as the adjacency matrix of a chordal extension of G induced by T, captures the locations of the important entries of X. The matrix C will later be used to describe the domain of definition for the variable of the decomposed SDP problem.
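The patterns Cr and the aggregate pattern C of Definition 4 can be assembled directly from the bags; a sketch on the running example (the helper name is ours):

```python
import numpy as np

def bag_patterns(n, bags):
    """Return the sparsity patterns C_1, ..., C_q and the aggregate pattern C."""
    Cs = []
    for bag in bags:
        idx = [v - 1 for v in sorted(bag)]      # 1-based bags, 0-based arrays
        Cr = np.zeros((n, n), dtype=int)
        Cr[np.ix_(idx, idx)] = 1                # (i, j) entry is 1 iff {i, j} ⊆ bag
        Cs.append(Cr)
    C = (sum(Cs) > 0).astype(int)               # 1 iff {i, j} lies in some bag
    return Cs, C

Cs, C = bag_patterns(4, [{1, 2, 3}, {2, 3, 4}])
# the only zeros of C are at positions (1, 4) and (4, 1): the missing entries of X
```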

C. Indicator Functions

To streamline the formulation, we will replace any positivity or positive semidefiniteness constraints in the decomposed SDP problem by the indicator functions introduced below.

Definition 5. For every l ∈ {−∞} ∪ R and u ∈ R ∪ {+∞}, define the convex indicator function Il,u : R → {0, +∞} as

Il,u(x) ≜ 0 if l ≤ x ≤ u, and Il,u(x) ≜ +∞ otherwise.

Definition 6. For every r ∈ {1, 2, . . . , q}, define the convex indicator function Jr : Hn → {0, +∞} as

Jr(X) ≜ 0 if X{Cr, Cr} ⪰ 0, and Jr(X) ≜ +∞ otherwise.

III. DECOMPOSED SDP

Consider the problem

minimize_{X∈S(C)}   〈X, M0〉   (3a)
subject to   ls ≤ 〈X, Ms〉 ≤ us,   s = 1, . . . , p,   (3b)
             X{Cr, Cr} ⪰ 0,   r = 1, . . . , q,   (3c)

which is referred to as the decomposed SDP throughout this paper. Due to the chordal theorem [56], problems (1) and (3) lead to the same optimal objective value. Furthermore, if Xref ∈ S(C) denotes an arbitrary solution of the decomposed SDP problem (3), then there exists a solution Xopt to the SDP problem (1) such that Xopt ◦ C = Xref.
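One direction of this equivalence is elementary and easy to check numerically: any feasible X ⪰ 0 of problem (1) automatically satisfies the submatrix constraints (3c), since every principal submatrix of a PSD matrix is PSD (the matrix below is a random example of ours; the chordal theorem supplies the nontrivial converse):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T                                  # a positive semidefinite matrix
mins = []
for bag in [{1, 2, 3}, {2, 3, 4}]:           # bags of the running example
    idx = [v - 1 for v in sorted(bag)]
    mins.append(np.linalg.eigvalsh(X[np.ix_(idx, idx)]).min())
# every submatrix X{Cr, Cr} is PSD up to numerical round-off
```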

The matrix Xopt cannot be obtained from Xref by simply reading off entries: the entries of X corresponding to zeros of the sparsity pattern C are referred to as missing entries and cannot be found by solving the decomposed problem (3). Several matrix completion approaches can be adopted to find the missing entries, which enable the construction of a feasible point for the original SDP (1). An algorithm is proposed in [16] and [15] that obtains a completion for Xref within the set {X ∈ Hn | X ◦ C = Xref, X ⪰ 0} whose determinant is maximum. However, such a solution may not be favorable for applications that require a low-rank solution, such as an SDP relaxation. It is also known that there exists a polynomial-time algorithm to fill a partially-known real-valued matrix in such a way that the rank of the resulting matrix becomes equal to the highest rank among all bags [57], [58]. In [51], we extended this result to the complex domain by proposing a recursive algorithm that transforms Xref ∈ S(C) into a solution Xopt for the original SDP problem (1) whose rank is upper bounded by the maximum rank among the matrices Xref{C1, C1}, Xref{C2, C2}, . . . , Xref{Cq, Cq}. This algorithm is stated below for completeness.

Matrix completion algorithm:

1) Set T′ := T and X := Xref.
2) If T′ has a single node, then consider Xopt as X and terminate; otherwise continue to the next step.
3) Choose two bags Cx and Cy of T′ such that Cx is a leaf of T′ and Cy is its unique neighbor.
4) Define

K ≜ pinv{X{Cx ∩ Cy, Cx ∩ Cy}}   (4a)
Gx ≜ X{Cx \ Cy, Cx ∩ Cy}   (4b)
Gy ≜ X{Cy \ Cx, Cx ∩ Cy}   (4c)
Ex ≜ X{Cx \ Cy, Cx \ Cy}   (4d)
Ey ≜ X{Cy \ Cx, Cy \ Cx}   (4e)
Sx ≜ Ex − Gx K Gx* = Qx Dx Qx*   (4f)
Sy ≜ Ey − Gy K Gy* = Qy Dy Qy*   (4g)

where Qx Dx Qx* and Qy Dy Qy* denote the eigenvalue decompositions of Sx and Sy, with the diagonals of Dx and Dy arranged in descending order. Then, update a part of X as follows:

X{Cy \ Cx, Cx \ Cy} := Gy K Gx* + Qy √Dy I_{|Cy\Cx|×|Cx\Cy|} √Dx Qx*

and update X{Cx \ Cy, Cy \ Cx} accordingly to preserve the Hermitian property of X.
5) Update T′ by merging Cx into Cy, i.e., replace Cy with Cx ∪ Cy and then remove Cx from T′.
6) Go back to step 2.
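A single pass of step 4 can be sketched in NumPy as follows. This is our own illustrative implementation of Eqs. (4a)–(4g) and the update, for two bags sharing an overlap; small negative eigenvalues are clipped before the square root as a numerical safeguard:

```python
import numpy as np

def completion_step(X, Cx, Cy):
    """Fill the block X{Cy \\ Cx, Cx \\ Cy}; Cx, Cy are 0-based index lists."""
    I  = sorted(set(Cx) & set(Cy))
    Rx = sorted(set(Cx) - set(Cy))
    Ry = sorted(set(Cy) - set(Cx))
    K  = np.linalg.pinv(X[np.ix_(I, I)])                    # (4a)
    Gx = X[np.ix_(Rx, I)]                                   # (4b)
    Gy = X[np.ix_(Ry, I)]                                   # (4c)
    Sx = X[np.ix_(Rx, Rx)] - Gx @ K @ Gx.conj().T           # (4d), (4f)
    Sy = X[np.ix_(Ry, Ry)] - Gy @ K @ Gy.conj().T           # (4e), (4g)
    dx, Qx = np.linalg.eigh(Sx)                             # eigh is ascending,
    dy, Qy = np.linalg.eigh(Sy)                             # so reverse to descend
    dx, Qx = dx[::-1], Qx[:, ::-1]
    dy, Qy = dy[::-1], Qy[:, ::-1]
    rx = np.sqrt(np.clip(dx, 0, None))                      # sqrt(Dx), clipped
    ry = np.sqrt(np.clip(dy, 0, None))                      # sqrt(Dy), clipped
    blk = (Gy @ K @ Gx.conj().T
           + (Qy * ry) @ np.eye(len(Ry), len(Rx)) @ (rx[:, None] * Qx.conj().T))
    X[np.ix_(Ry, Rx)] = blk
    X[np.ix_(Rx, Ry)] = blk.conj().T        # preserve the Hermitian property
    return X

# bags {1, 2} and {2, 3} of a 3x3 matrix: only the (1, 3) entry is missing
X = np.ones((3, 3)); X[0, 2] = X[2, 0] = 0.0
X = completion_step(X, [0, 1], [1, 2])
# the completed matrix is the all-ones matrix: PSD with rank 1,
# matching the rank bound of Theorem 1
```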

Theorem 1. Consider an arbitrary solution Xref of the decomposed SDP problem (3). The output of the matrix completion algorithm, denoted as Xopt, is a solution of the original SDP problem (1). Moreover, the rank of Xopt is smaller than or equal to

max{ rank{Xref{Cr, Cr}} | r = 1, . . . , q }.

Proof. Please refer to [51] or [17] for the proof.
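For concreteness, one merge step (4a)-(4g) can be sketched in a few lines of numpy. This is an illustrative simplification of ours for real symmetric matrices (the paper's setting is Hermitian); the function name and index handling are not from [51]:

```python
import numpy as np

def merge_step(X, Cx, Cy):
    """One merge step (4a)-(4g) of the matrix completion algorithm,
    simplified to real symmetric matrices (the paper works with Hermitian
    ones). Cx is a leaf bag of the tree decomposition and Cy its unique
    neighbor, both given as integer index arrays. X is updated in place."""
    cap = np.intersect1d(Cx, Cy)                      # Cx intersect Cy
    rx = np.setdiff1d(Cx, Cy)                         # Cx \ Cy
    ry = np.setdiff1d(Cy, Cx)                         # Cy \ Cx
    K = np.linalg.pinv(X[np.ix_(cap, cap)])           # (4a)
    Gx = X[np.ix_(rx, cap)]                           # (4b)
    Gy = X[np.ix_(ry, cap)]                           # (4c)
    Sx = X[np.ix_(rx, rx)] - Gx @ K @ Gx.T            # (4d), (4f)
    Sy = X[np.ix_(ry, ry)] - Gy @ K @ Gy.T            # (4e), (4g)
    # eigendecompositions, eigenvalues sorted in descending order
    dx, Qx = np.linalg.eigh(Sx); dx, Qx = dx[::-1], Qx[:, ::-1]
    dy, Qy = np.linalg.eigh(Sy); dy, Qy = dy[::-1], Qy[:, ::-1]
    J = np.eye(len(ry), len(rx))                      # rectangular identity
    root = lambda d: np.diag(np.sqrt(np.clip(d, 0.0, None)))
    B = Gy @ K @ Gx.T + Qy @ root(dy) @ J @ root(dx) @ Qx.T
    X[np.ix_(ry, rx)] = B
    X[np.ix_(rx, ry)] = B.T                           # keep X symmetric
    return X

# Example: complete the missing (1,3) entry of a partial rank-1 matrix
v = np.array([1.0, 2.0, 3.0])
X = np.outer(v, v); X[0, 2] = X[2, 0] = 0.0           # (1,3) entry missing
merge_step(X, np.array([0, 1]), np.array([1, 2]))     # X becomes v vT again
```

On this toy instance with bags {1, 2} and {2, 3}, the single merge step recovers the rank-1 completion, matching the rank bound of Theorem 1.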

IV. ALTERNATING DIRECTION METHOD OF MULTIPLIERS

Consider the optimization problem

minimize_{x ∈ R^{n_x}, y ∈ R^{n_y}}   f(x) + g(y)    (5a)
subject to   Ax + By = c,    (5b)

where c ∈ R^{n_c}, A ∈ R^{n_c×n_x} and B ∈ R^{n_c×n_y} are constant matrices, and f : R^{n_x} → R ∪ {+∞} and g : R^{n_y} → R ∪ {+∞} are convex functions. Notice that the variables x and y are coupled through the linear constraint (5b), while the objective

function is separable. The augmented Lagrangian function for problem (5) is equal to

Lµ(x, y, λ) = f(x) + g(y) + λᵀ(Ax + By − c) + (µ/2)‖Ax + By − c‖²₂,    (6)

where λ ∈ R^{n_c} is the Lagrange multiplier associated with the constraint (5b), and µ ∈ R is a fixed parameter. ADMM is one approach for solving problem (5), which performs the following procedure at each iteration [59]:

x^{k+1} = argmin_{x ∈ R^{n_x}} Lµ(x, y^k, λ^k),    (7a)
y^{k+1} = argmin_{y ∈ R^{n_y}} Lµ(x^{k+1}, y, λ^k),    (7b)
λ^{k+1} = λ^k + µ(Ax^{k+1} + By^{k+1} − c),    (7c)

where k = 0, 1, 2, . . ., for an arbitrary initialization (x⁰, y⁰, λ⁰). In these equations, "argmin" means an arbitrary minimizer of a convex function and does not need any uniqueness assumption. Notice that each of the updates (7a) and (7b) is an optimization sub-problem with respect to either x or y, obtained by freezing the other variable at its latest value. Several acceleration techniques have been proposed in the literature, aiming to improve the convergence behavior of ADMM [20], [25]. One such approach, referred to as over-relaxed ADMM, adopts a sequence of intermediate Lagrange multipliers {λ̄^k}_{k=1}^∞ as follows:

x^{k+1} = argmin_{x ∈ R^{n_x}} Lµ(x, y^k, λ^k),    (8a)
λ̄^{k+1} = λ^k + µ(α − 1)(Ax^{k+1} + By^k − c),    (8b)
y^{k+1} = argmin_{y ∈ R^{n_y}} Lµ(x^{k+1}, y, λ̄^{k+1}),    (8c)
λ^{k+1} = λ̄^{k+1} + µ(Ax^{k+1} + By^{k+1} − c),    (8d)

where α ∈ [1, 2] is a fixed parameter. We employ the residue sequence {ε^k}_{k=1}^∞ proposed in [20] as a measure of convergence:

ε^{k+1} = (1/µ)‖λ^{k+1} − λ^k‖²₂ + µ‖B(y^{k+1} − y^k)‖²₂.    (9)

ADMM is particularly interesting for the cases where the sub-problems can be solved efficiently through an explicit formula. Under such circumstances, it is possible to execute a large number of iterations in a short amount of time. In this section, we first cast the decomposed SDP problem (3) in the form (5) and then regroup the variables into two blocks, P1 and P2, playing the roles of x and y in the ADMM algorithm.
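Before specializing to (3), the mechanics of the updates (7a)-(7c) can be illustrated on a toy instance of (5) whose sub-problems have closed-form solutions. The instance below (two quadratics coupled by x − y = 0) is our own hypothetical example, not one from the paper:

```python
import numpy as np

# Toy instance of (5): f(x) = 0.5*||x - a||^2, g(y) = 0.5*||y - b||^2,
# with A = I, B = -I, c = 0, i.e., the constraint x - y = 0. Both
# sub-problems have closed-form minimizers, and the optimum is (a + b)/2.
a, b, mu = np.array([1.0, 3.0]), np.array([5.0, -1.0]), 1.0
x = y = lam = np.zeros(2)
for k in range(200):
    x = (a - lam + mu * y) / (1 + mu)       # (7a): minimize over x
    y_old, lam_old = y, lam
    y = (b + lam + mu * x) / (1 + mu)       # (7b): minimize over y
    lam = lam + mu * (x - y)                # (7c): dual update
    # residue (9), specialized to B = -I
    eps = np.sum((lam - lam_old) ** 2) / mu + mu * np.sum((y - y_old) ** 2)
    if eps < 1e-12:
        break
print(x, y)    # both approach (a + b)/2 = [3., 1.]
```

Each iteration costs a handful of vector operations; this is the property the two-block partitioning below is designed to preserve at scale.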

A. Projection Onto Positive Semidefinite Cone

The algorithm to be proposed in this work requires the projection of q matrices belonging to H^{|C1|}, H^{|C2|}, . . . , H^{|Cq|} onto the positive semidefinite cone. This is probably the most computationally expensive part of each iteration.

Definition 7. For a given Hermitian matrix Z ∈ H^m, define the unique solution of the optimization problem

minimize_{Z̄ ∈ H^m}   ‖Z̄ − Z‖²_F    (10a)
subject to   Z̄ ⪰ 0    (10b)


Algorithm 1 Over-relaxed ADMM for decomposed SDP

1: Initialize X^0, {z_s^0}_{s=0}^p, {X_{C;r}^0}_{r=1}^q, {X_{N;s}^0}_{s=0}^p, {Λ_{C;r}^0}_{r=1}^q, {Λ_{N;s}^0}_{s=0}^p, {λ_{z;s}^0}_{s=0}^p
2: repeat
3:   X^{k+1} := [Σ_{r=1}^q Cr ◦ (X_{C;r}^k − Λ_{C;r}^k/µ) + Σ_{s=1}^p Ns ◦ (X_{N;s}^k − Λ_{N;s}^k/µ)] ⊘_C [Σ_{r=1}^q Cr + Σ_{s=1}^p Ns]
4:   z_0^{k+1} := ⟨M0, X_{N;0}^k⟩ − (λ_{z;0}^k + 1)/µ
5:   z_s^{k+1} := max{min{⟨Ms, X_{N;s}^k⟩ − λ_{z;s}^k/µ, u_s}, l_s} for s = 1, 2, . . . , p
6:   Λ̄_{C;r}^{k+1} := Λ_{C;r}^k + µ(α − 1)(X^{k+1} ◦ Cr − X_{C;r}^k) for r = 1, 2, . . . , q
7:   Λ̄_{N;s}^{k+1} := Λ_{N;s}^k + µ(α − 1)(X^{k+1} ◦ Ns − X_{N;s}^k) for s = 0, 1, . . . , p
8:   λ̄_{z;s}^{k+1} := λ_{z;s}^k + µ(α − 1)(z_s^{k+1} − ⟨Ms, X_{N;s}^k⟩) for s = 0, 1, . . . , p
9:   X_{C;r}^{k+1} := (X^{k+1} ◦ Cr + Λ̄_{C;r}^{k+1}/µ)+ for r = 1, 2, . . . , q
10:  y_s^{k+1} := (z_s^{k+1} + λ̄_{z;s}^{k+1}/µ − ⟨Ms, Ns ◦ X^{k+1} + Λ̄_{N;s}^{k+1}/µ⟩) / (1 + ‖Ms‖²_F) for s = 0, 1, . . . , p
11:  X_{N;s}^{k+1} := Ns ◦ X^{k+1} + Λ̄_{N;s}^{k+1}/µ + y_s^{k+1} Ms for s = 0, 1, . . . , p
12:  Λ_{C;r}^{k+1} := Λ̄_{C;r}^{k+1} + µ(X^{k+1} ◦ Cr − X_{C;r}^{k+1}) for r = 1, 2, . . . , q
13:  Λ_{N;s}^{k+1} := Λ̄_{N;s}^{k+1} + µ(X^{k+1} ◦ Ns − X_{N;s}^{k+1}) for s = 0, 1, . . . , p
14:  λ_{z;s}^{k+1} := λ̄_{z;s}^{k+1} + µ(z_s^{k+1} − ⟨Ms, X_{N;s}^{k+1}⟩) for s = 0, 1, . . . , p
15: until the stopping criterion is met

(The barred quantities are the intermediate multipliers of (8b), updated to their final values in lines 12-14.)

as the projection of Z onto the cone of positive semidefinite matrices, and denote it as Z+.

The next lemma reveals the interesting fact that problem (10) can be solved through an eigenvalue decomposition of Z.

Lemma 1. Let Z = Q × diag{(ν1, . . . , νm)} × Q* denote the eigenvalue decomposition of Z. The solution of the projection problem (10) is given by

Z+ = Q × diag{(max{ν1, 0}, . . . , max{νm, 0})} × Q*.

Proof. Please refer to [60] for the proof.
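As a sketch, Lemma 1 translates directly into numpy (the function name below is ours, not from the paper):

```python
import numpy as np

def psd_projection(Z):
    """Frobenius-norm projection onto the PSD cone (Lemma 1): keep the
    eigenvectors of Z and clip its negative eigenvalues at zero."""
    nu, Q = np.linalg.eigh(Z)                      # Z = Q diag(nu) Q*
    return (Q * np.maximum(nu, 0.0)) @ Q.conj().T  # Q diag(max{nu,0}) Q*

Z = np.array([[0.0, 2.0], [2.0, 0.0]])             # eigenvalues -2 and 2
print(psd_projection(Z))                           # [[1. 1.] [1. 1.]]
```

The cost is a single Hermitian eigendecomposition, which is what makes the q parallel projections in the algorithm affordable.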

B. ADMM for Decomposed SDP

We apply ADMM to the following reformulation of the decomposed SDP problem (3):

minimize_{X ∈ S(C), {X_{N;s} ∈ S(Ns)}_{s=0}^p, {X_{C;r} ∈ S(Cr)}_{r=1}^q, {zs ∈ R}_{s=0}^p}   z0 + Σ_{s=1}^p I_{ls,us}(zs) + Σ_{r=1}^q Jr(X_{C;r})

subject to   X ◦ Cr = X_{C;r},   r = 1, 2, . . . , q,    (11a)
             X ◦ Ns = X_{N;s},   s = 0, 1, . . . , p,    (11b)
             zs = ⟨Ms, X_{N;s}⟩,   s = 0, 1, . . . , p.    (11c)

If X is a feasible solution of (11) with a finite objective value, then

Jr(X) = Jr(X ◦ Cr) = Jr(X_{C;r}) = 0,

where the second equality follows from (11a), which concludes that X{Cr, Cr} ⪰ 0. Moreover,

I_{ls,us}(⟨X, Ms⟩) = I_{ls,us}(⟨X ◦ Ns, Ms⟩) = I_{ls,us}(⟨X_{N;s}, Ms⟩) = I_{ls,us}(zs) = 0,

where the second and third equalities follow from (11b) and (11c), respectively, which yields that ls ≤ ⟨X, Ms⟩ ≤ us. Therefore, X is a feasible point for problem (3) as well, with the same objective value. Define

1) Λ_{C;r} ∈ S(Cr) as the Lagrange multiplier associated with the constraint (11a) for r = 1, 2, . . . , q,
2) Λ_{N;s} ∈ S(Ns) as the Lagrange multiplier associated with the constraint (11b) for s = 0, 1, . . . , p,
3) λ_{z;s} ∈ R as the Lagrange multiplier associated with the constraint (11c) for s = 0, 1, . . . , p.

We regroup the primal and dual variables as

(Block 1)   P1 = (X, {zs}_{s=0}^p),
(Block 2)   P2 = ({X_{C;r}}_{r=1}^q, {X_{N;s}}_{s=0}^p),
(Dual)      D = ({Λ_{C;r}}_{r=1}^q, {Λ_{N;s}}_{s=0}^p, {λ_{z;s}}_{s=0}^p).

Note that "Block 1", "Block 2" and "D" play the roles of x, y and λ in the standard formulation of ADMM, respectively.


The augmented Lagrangian can be calculated as

(2/µ)Lµ(P1, P2, D) = L_D(D)/µ²
  + ‖z0 − ⟨M0, X_{N;0}⟩ + (1 + λ_{z;0})/µ‖²_F
  + Σ_{s=1}^p ( ‖zs − ⟨Ms, X_{N;s}⟩ + λ_{z;s}/µ‖²_F + I_{ls,us}(zs) )
  + Σ_{r=1}^q ( ‖X ◦ Cr − X_{C;r} + (1/µ)Λ_{C;r}‖²_F + Jr(X_{C;r}) )
  + Σ_{s=1}^p ‖X ◦ Ns − X_{N;s} + (1/µ)Λ_{N;s}‖²_F,    (13)

where

L_D(D) = −(1 + λ_{z;0})² − Σ_{s=1}^p λ²_{z;s} − Σ_{r=1}^q ‖Λ_{C;r}‖²_F − Σ_{s=1}^p ‖Λ_{N;s}‖²_F.    (14)

Using the blocks P1 and P2, the ADMM iterations for problem (11) can be expressed as follows:

1) The subproblem (7a) in terms of P1 consists of two parallel steps:
   (a) Minimization in terms of X: this step consists of |C| scalar quadratic and unconstrained programs. It possesses an explicit formula that involves |C| parallel multiplication operations.
   (b) Minimization in terms of {zs}_{s=0}^p: this step consists of p + 1 scalar quadratic programs, each with a box constraint. It possesses an explicit formula that involves p + 1 parallel multiplication operations.
2) The subproblem (7b) in terms of P2 also consists of two parallel steps:
   (a) Minimization in terms of {X_{C;r}}_{r=1}^q: this step consists of q projection problems of the form (10). According to Lemma 1, this reduces to q parallel eigenvalue decomposition operations on matrices of sizes |Cr| × |Cr| for r = 1, . . . , q.
   (b) Minimization in terms of {X_{N;s}}_{s=0}^p: this step consists of p + 1 unconstrained quadratic programs of sizes |Ns| for s = 0, 1, . . . , p. The quadratic programs are parallel and each of them possesses an explicit formula that involves 2|Ns| multiplications.
3) Computation of the dual variables at each iteration, in equation (7c), consists of three parallel steps:
   (a) Updating {Λ_{C;r}}_{r=1}^q: the computational cost of this step involves no multiplications and is negligible.
   (b) Updating {Λ_{N;s}}_{s=0}^p: the computational cost of this step involves no multiplications and is negligible.
   (c) Updating {λ_{z;s}}_{s=0}^p: this step is composed of p + 1 parallel inner product computations, each involving |Ns| multiplications for s = 0, 1, . . . , p.

Like any other first-order method, ADMM can be applied to the problem (3) or its various reformulations in many different ways. However, the two-block partitioning into (P1, P2) is meticulously performed in this work to achieve two goals:
• making the optimization problem for each block naturally decomposable into disjoint subproblems;
• making all of the subproblems have explicit closed-form solutions.
These two features distinguish the developed algorithm from a generic ADMM-based algorithm for SDPs, and enable solving large-scale sparse SDPs.

Notation 1. For every D, E ∈ Hn, the notation D ⊘_C E refers to the entrywise division of those entries of D and E that correspond to the ones of C, i.e.,

(D ⊘_C E)_{ij} ≜ D_{ij}/E_{ij} if C_{ij} = 1, and 0 if C_{ij} = 0.
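A minimal numpy sketch of the masked division of Notation 1 (the helper below is our own, assuming the 0/1 mask is stored as an integer array):

```python
import numpy as np

def masked_divide(D, E, C):
    """Entrywise division restricted to the support of the 0/1 mask C
    (the operation of Notation 1); entries outside the mask are zero."""
    out = np.zeros_like(D, dtype=float)
    mask = (C == 1)
    out[mask] = D[mask] / E[mask]
    return out

D = np.array([[2.0, 4.0], [6.0, 8.0]])
E = np.array([[1.0, 2.0], [3.0, 4.0]])
C = np.array([[1, 0], [1, 1]])
print(masked_divide(D, E, C))    # [[2. 0.] [2. 2.]]
```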

In what follows, we will elaborate on every step of the ADMM iterations.

Block 1: The first step of the algorithm, which corresponds to (7a), consists of the operation

P1^{k+1} := argmin_{P1} Lµ(P1, P2^k, D^k).

Notice that the minimization of Lµ(P1, P2^k, D^k) with respect to P1 is decomposable in terms of the real scalars

Re{Xij} for i = 1, . . . , n; j = i, . . . , n,    (16a)
Im{Xij} for i = 1, . . . , n; j = i + 1, . . . , n,    (16b)
zs for s = 1, . . . , p,    (16c)

which leads to an explicit formula.

Block 2: The second step of the algorithm, which corresponds to (7b), consists of the operation

P2^{k+1} := argmin_{P2} Lµ(P1^{k+1}, P2, D^k).

Notice that the minimization of Lµ(P1^{k+1}, P2, D^k) with respect to P2 is decomposable in terms of the matrix variables {X_{C;r}}_{r=1}^q and {X_{N;s}}_{s=0}^p. Hence, the update of X_{C;r} reduces to the problem (10) for Z = X_{C;r}{Cr, Cr}. As shown in Lemma 1, this can be performed via the eigenvalue decomposition of a |Cr| × |Cr| matrix. In addition, the updated value of X_{N;s} is a minimizer of the function

L_{N;s}(Z) = ‖zs − ⟨Ms, Z⟩ + λ_{z;s}/µ‖²_F + ‖X ◦ Ns − Z + (1/µ)Λ_{N;s}‖²_F.    (18)

By taking the derivatives of this function, it is possible to find an explicit formula for Zopt. Define L′_{N;s}(Z) ∈ S(Ns) as the gradient of L_{N;s}(Z) with the following structure:

L′_{N;s}(Z) ≜ [ ∂L_{N;s}/∂Re{Zij} + i ∂L_{N;s}/∂Im{Zij} ]_{i,j=1,...,n}.

Then, we have

L′_{N;s}(Z)/2 = Z − X ◦ Ns − (1/µ)Λ_{N;s} + (−zs + ⟨Ms, Z⟩ − λ_{z;s}/µ) Ms.

Therefore,

Zopt = X ◦ Ns + (1/µ)Λ_{N;s} + ys Ms,    (19)

where ys ≜ zs − ⟨Ms, Zopt⟩ + λ_{z;s}/µ. Hence, it only remains to derive the scalar ys, which can be done by inner multiplying


Algorithm 2 Preconditioning of problem (3)

Require: {Ms}_{s=0}^p, {ls}_{s=1}^p, {us}_{s=1}^p, γ
1: M0 ← γ × M0/‖M0‖_F
2: for s = 1, . . . , p do
3:   ls ← ls/‖Ms‖_F
4:   us ← us/‖Ms‖_F
5:   Ms ← Ms/‖Ms‖_F
6: end for
7: return {Ms}_{s=0}^p, {ls}_{s=1}^p, {us}_{s=1}^p

Ms to both sides of the equation (19). Closed-form solutions for each step of the over-relaxed ADMM can be derived similarly, which leads to Algorithm 1.
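The X_{N;s} update just derived can be sketched in numpy for real matrices; the helper below restricts everything to one bag and is our illustrative code, not the paper's implementation. Inner-multiplying Ms into (19) yields ys = (zs + λ_{z;s}/µ − ⟨Ms, X ◦ Ns + Λ_{N;s}/µ⟩)/(1 + ‖Ms‖²_F), which matches line 10 of Algorithm 1:

```python
import numpy as np

def update_XN(XN, Lam, M, z, lam_z, mu):
    """Closed-form minimizer (19) of the function (18), restricted to one
    bag and written for real matrices. XN stands for the entries of
    X ◦ Ns on the bag, Lam and M for the corresponding multiplier and
    data matrices, and z, lam_z, mu are scalars."""
    base = XN + Lam / mu
    # inner-multiplying M into (19) isolates the scalar ys
    y = (z + lam_z / mu - np.sum(M * base)) / (1.0 + np.sum(M * M))
    return base + y * M, y

# optimality check: the gradient (divided by 2) vanishes at the output
XN = np.array([[0.5, -1.0], [-1.0, 2.0]])
Lam = np.array([[0.2, 0.0], [0.0, -0.4]])
M = np.array([[1.0, 2.0], [2.0, 0.0]])
Z, y = update_XN(XN, Lam, M, 1.2, 0.3, 2.0)
grad_half = Z - XN - Lam / 2.0 + (-1.2 + np.sum(M * Z) - 0.3 / 2.0) * M
assert np.allclose(grad_half, 0.0)
```

The update costs only two Frobenius inner products with Ms, consistent with the 2|Ns| multiplications counted earlier.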

Theorem 2. Assume that Slater's conditions hold for the decomposable SDP problem (3). For α = 1, the sequence {X^k}_{k=0}^∞ generated by Algorithm 1 converges to an optimal solution of (3).

Proof. The convergence of both primal and dual variables is guaranteed for a standard ADMM problem if the matrix B in (5b) has full column rank [61]. After realizing that Algorithm 1 is obtained from a two-block ADMM procedure, the theorem can be concluded from the fact that the equivalent of B for Algorithm 1 is a mapping from the variables {X_{C;r}}_{r=1}^q and {X_{N;s}}_{s=0}^p to

{X_{C;r}}_{r=1}^q, {X_{N;s}}_{s=0}^p and {⟨Ms, X_{N;s}⟩}_{s=0}^p,

which is not singular, i.e., it has full column rank. The details are omitted for brevity.

C. Parameter Selection and Preconditioning

The performance of the proposed algorithm somewhat depends on the parameter µ. Since this algorithm is based on a two-block ADMM technique with the blocks P1 and P2, one may directly use the existing results for parameter selection in two-block ADMM, such as the techniques developed in [62] and [25]. However, given the complexity of finding an optimal value of µ, it may be more efficient to take more iterations with a sub-optimal value of µ, reducing the overall runtime by avoiding the preprocessing time needed for designing µ. Another issue is that first-order methods, including ADMM, are sensitive to the condition number of the problem data. One may optimally precondition the derived two-block ADMM by adopting existing methods, such as [22], although this is a daunting task in general. In this paper, we resort to a simple preconditioning technique, described in Algorithm 2, to normalize different parts of the data. More precisely, the constraints in (3b) are normalized and the objective function is rescaled according to a tuning factor γ.
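Algorithm 2 amounts to a few norm computations; a numpy sketch (the function signature and list-based data layout are our own assumptions) reads:

```python
import numpy as np

def precondition(Ms, ls, us, gamma):
    """Algorithm 2 sketch: rescale the objective matrix M0 by gamma and
    normalize every constraint matrix Ms, together with its bounds, by
    its Frobenius norm. Ms is a list of arrays (index 0 is the
    objective); ls, us hold the bounds for s = 1, ..., p."""
    Ms = [M.astype(float).copy() for M in Ms]
    ls, us = list(ls), list(us)
    Ms[0] *= gamma / np.linalg.norm(Ms[0], 'fro')
    for s in range(1, len(Ms)):
        nrm = np.linalg.norm(Ms[s], 'fro')
        ls[s - 1] /= nrm
        us[s - 1] /= nrm
        Ms[s] /= nrm
    return Ms, ls, us
```

After this rescaling, every constraint matrix has unit Frobenius norm and the objective matrix has norm γ, so the rows of the data enter the ADMM iterations at comparable scales.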

V. OPTIMAL POWER FLOW

Consider an n-bus electrical power network with the topology described by a simple graph H = (VH, EH), meaning that each vertex belonging to VH = {1, . . . , n} represents a node of the network and each edge belonging to EH = {1, . . . , m} represents a transmission line. Define Cf ∈ {0, 1}^{m×n} and Ct ∈ {0, 1}^{m×n} to be the "from" and "to" incidence matrices of the network graph, respectively, i.e., the (l, k) entry of Cf is equal to one if and only if k is the starting node of the branch l, whereas the (l, k) entry of Ct is equal to one if and only if k is the ending node of l. Define G = {1, . . . , u} to be the set of generating units, and let Cg ∈ {0, 1}^{u×n} denote the unit incidence matrix, i.e., the (g, k) entry of Cg is 1 if and only if unit g is located at bus k. Additionally, let Y ∈ C^{n×n} denote the admittance matrix of the network, and define Yf ∈ C^{m×n} and Yt ∈ C^{m×n} to be the "from" and "to" admittance matrices, respectively. Define V ∈ C^n as the unknown voltage phasor vector, i.e., Vk is the voltage phasor for node k ∈ VH. Let P + Qi represent the vector of complex power supplied by generating units, where P ∈ R^u and Q ∈ R^u are the vectors of active and reactive powers, respectively. Define Sd ∈ C^n to be the vector of nodal complex power demand. Finally, define sf = [s_{f;1}, s_{f;2}, . . . , s_{f;m}]ᵀ and st = [s_{t;1}, s_{t;2}, . . . , s_{t;m}]ᵀ to be the vectors of complex power entering the starting and ending nodes of the branches, respectively.

The classical OPF problem can be described as follows:

minimize_{V ∈ C^n, P, Q ∈ R^u, sf, st ∈ C^m}   Σ_{g∈G} c_{2;g} P_g² + c_{1;g} Pg + c_{0;g}    (20a)
subject to
  V_k^min ≤ |Vk| ≤ V_k^max,   k ∈ VH,    (20b)
  Q_g^min ≤ Qg ≤ Q_g^max,   g ∈ G,    (20c)
  P_g^min ≤ Pg ≤ P_g^max,   g ∈ G,    (20d)
  C_gᵀ(P + Qi) = diag{VV*Y*} + Sd,    (20e)
  |s_{f;l}| ≤ s_l^max,   l ∈ EH,    (20f)
  |s_{t;l}| ≤ s_l^max,   l ∈ EH,    (20g)
  sf = diag{Cf VV* Y_f*},    (20h)
  st = diag{Ct VV* Y_t*},    (20i)

where V_k^min, V_k^max, P_g^min, P_g^max, Q_g^min, Q_g^max and s_l^max are constant limits, and c_{2;g} ≥ 0, c_{1;g} and c_{0;g} are coefficients accounting for the cost of producing power by unit g. More details on a general formulation may be found in [7].

OPF is a highly non-convex problem, which is known to be difficult to solve in general. However, the constraints of problem (20) can all be expressed as linear functions of the entries of the quadratic matrix VV*. This implies that the constraints of OPF are linear in terms of a matrix variable W ≜ VV*. One can reformulate OPF by replacing each monomial Vi V*_j with the new variable Wij and represent the constraints in the form of problem (1), with a representative graph that is isomorphic to the network topology graph H. In order to preserve the equivalence of the two formulations, two additional constraints must be added to the problem: (i) W ⪰ 0, (ii) rank{W} = 1. If we drop the rank condition as the only non-convex constraint of the reformulated OPF


problem, we attain the SDP relaxation of OPF as follows:

minimize_{W ∈ H^n, P′, P, Q ∈ R^u, sf, st ∈ C^m}   Σ_{g∈G} c_{2;g} P′_g + c_{1;g} Pg + c_{0;g}    (21a)
subject to
  (V_k^min)² ≤ W_kk ≤ (V_k^max)²,   k ∈ VH,    (21b)
  Q_g^min ≤ Qg ≤ Q_g^max,   g ∈ G,    (21c)
  P_g^min ≤ Pg ≤ P_g^max,   g ∈ G,    (21d)
  C_gᵀ(P + Qi) = diag{WY*} + Sd,    (21e)
  sf = diag{Cf W Y_f*},    (21f)
  st = diag{Ct W Y_t*},    (21g)
  [ P′_g  Pg ; Pg  1 ] ⪰ 0,   g ∈ G,    (21h)
  [ s_l^max  s_{f;l} ; s*_{f;l}  s_l^max ] ⪰ 0,   l ∈ EH,    (21i)
  [ s_l^max  s_{t;l} ; s*_{t;l}  s_l^max ] ⪰ 0,   l ∈ EH,    (21j)
  W ⪰ 0,    (21k)

where the auxiliary variable P′g accounts for the square of Pg for each g ∈ G. Note that the above problem is an SDP because its objective function is linear and its constraints are linear scalar/matrix equalities and inequalities in the variables of the problem [11]. Therefore, it can be put into the canonical form (1) (please refer to [7] for details on such a reformulation). As stated in the introduction, several papers in the literature have shown great promise for finding global or near-global solutions of OPF using the above relaxation or a penalized version of the SDP relaxation. The major drawback of relaxing the OPF problem to an SDP is the requirement of defining a matrix variable, which makes the number of scalar variables of the problem quadratic in the number of network buses. However, we have shown in [17] that real-world grids have low treewidth, e.g., at most 26 for the Polish test system with over 3000 buses. This makes our proposed numerical algorithm scalable and highly parallelizable for the above SDP relaxation. As an example, the SDP relaxation of OPF for a large-scale European grid with 9241 buses amounts to simple operations over 857 matrices of size 31 by 31, as well as 14035 matrices of size 2 by 2.
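As a quick sanity check of the lifting used in (21h), the 2 × 2 linear matrix inequality encodes P′_g ≥ P²_g via a Schur complement; the snippet below (our own illustration) verifies this equivalence numerically:

```python
import numpy as np

def is_psd(M, tol=1e-9):
    """Numerical PSD test via the smallest eigenvalue."""
    return bool(np.min(np.linalg.eigvalsh(M)) >= -tol)

# [[P', P], [P, 1]] is PSD exactly when P' >= P^2 (Schur complement of
# the trailing 1), which is why (21h) linearizes the quadratic cost term
for Pp, P in [(4.0, 2.0), (4.0, 2.5), (10.0, -3.0)]:
    assert is_psd(np.array([[Pp, P], [P, 1.0]])) == (Pp >= P * P)
```

The same Schur-complement reasoning shows that (21i) and (21j) encode the flow limits |s_{f;l}|, |s_{t;l}| ≤ s_l^max as 2 × 2 conic constraints.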

VI. SIMULATION RESULTS

In this section, we evaluate the performance of the proposed algorithm for solving the SDP relaxation of OPF over Pan European Grid Advanced Simulation and State Estimation (PEGASE) test systems [63], [64]. All simulations are run in MATLAB on a system with an Intel 3.0 GHz, 12-core CPU and 256 GB RAM. The overall running time of 5000 iterations is between 8 and 50 minutes in a MATLAB implementation without parallelization. For larger cases, the running time diminishes using the MATLAB parallel computing toolbox and could be further reduced in C++.

In all of the experiments, a tree decomposition of the sparsity graph is obtained in less than 90 seconds, using the algorithm described in [54]. The experimental results are summarized in Table I. The second column indicates the total number of upper and lower bounds of the form (3b) (i.e., 2 × p). The number of positive-semidefinite submatrices of the form (3c) and their maximum size are shown in the third and fourth columns, respectively. Notice that since no thermal limits are imposed for the 13659-bus system, the numbers of constraints and bags for this case are smaller compared to its preceding system. The over-relaxation parameter α = 1.8 is used for all cases. The quality of the solutions obtained via 5000 ADMM iterations is given in columns 7 to 10. As shown in Figure 2, the residue function ε^k (as defined in (9)) is monotonically decreasing for all simulated cases. The convergence behavior of the ADMM coupling constraint (i.e., ‖Ax^{5000} + By^{5000} − c‖2) and the cost value for different cases are depicted in Figure 2 as well. In order to further assess the quality of the solutions after 5000 ADMM iterations, the maximum violation of the inequality constraints in (3b) and the largest absolute value among the negative eigenvalues of all submatrices X^{5000}{Cr, Cr} are given in the eighth and ninth columns. Small-sized bags, corresponding to submatrices of W in (21), are combined to obtain a modest number of bags. To elaborate on the algorithm, note that every iteration amounts to a basic matrix operation or an eigendecomposition over matrices of size at most 31 × 31 for the PEGASE 13659-bus system.

The effect of different choices of the tuning parameter µ on the convergence of the objective value for the 13659-bus system is illustrated in Figure 3a. Additionally, Figure 3b compares different choices of the over-relaxation parameter for the 2869-bus system. Further preconditioning efforts, in addition to the one suggested in Algorithm 2, may reduce the number of iterations (note that OPF is often very ill-conditioned due to high inductance-to-resistance ratios); this is left for future work.

VII. CONCLUSION

Motivated by the application of sparse semidefinite programming in many hard optimization problems across engineering, the objective of this work is to design a fast and parallelizable numerical algorithm for solving sparse semidefinite programs (SDPs). To this end, the underlying sparsity structure of a given SDP problem is captured using a tree decomposition technique, leading to a decomposed SDP problem. A highly distributed/parallelizable numerical algorithm is developed for solving the decomposed SDP, based on the alternating direction method of multipliers (ADMM). Each iteration of the designed algorithm has a closed-form solution, which involves multiplications and eigenvalue decompositions over certain submatrices induced by the tree decomposition of the sparsity graph. The proposed algorithm is applied to the classical optimal power flow problem, and also evaluated on large-scale PEGASE benchmark systems. The numerical technique developed in this paper enables solving complex models of various power optimization problems, such as optimal power flow, unit commitment and state estimation, through conic optimization.

Page 10: Ramtin Madani, Abdulrahman Kalbat and Javad Lavaeilavaei/Dis_com_2016.pdf · Ramtin Madani, Abdulrahman Kalbat and Javad Lavaei ... “Ramtin Madani, Abdulrahman Kalbat and Javad

10

Test case | Number of inequality constraints | Number of bags | Maximum size of bags | µ | γ | Linear coupling violation | Maximum inequality violation | Maximum PSD violation | ADMM cost value | Running time without parallelization | Running time with parallelization
1354-bus  | 17238 | 3398  | 13 | 1200 | 1e-6 | 1.0e-1 | 1.4e-2 | 6.2e-4 | 738.10  | 8 min  | 7 min
2869-bus  | 34694 | 6637  | 13 | 2500 | 1e-6 | 2.3e-1 | 7.9e-2 | 7.6e-4 | 1328.52 | 14 min | 10 min
9241-bus  | 96108 | 14892 | 31 | 3200 | 1e-7 | 4.2e-1 | 2.1e-1 | 4.5e-4 | 3140.25 | 39 min | 18 min
13659-bus | 90140 | 5185  | 31 | 3000 | 1e-6 | 5.8e-1 | 1.6e-1 | 1.5e-4 | 3850.71 | 50 min | 24 min

TABLE I: Performance of the proposed algorithm for solving the SDP relaxation of the OPF problem on PEGASE test cases via serial and parallel implementations.

[Figure 2 appears here: for each test system, three plots versus the iteration number show the residue function ε^k, the infeasibility ‖Ax^k + By^k − c‖2, and the cost value over 5000 iterations.]

Fig. 2: These plots show the convergence behavior of the residue functions, infeasibility sequences and cost values for PEGASE test systems. (a): 1354-bus system, (b): 2869-bus system, (c): 9241-bus system, (d): 13659-bus system.


[Figure 3 appears here: two plots of the cost value versus the iteration number under different parameter choices.]

Fig. 3: The effect of different choices for the tuning parameters on the convergence of the cost value, (a): 13659-bus system, (b): 2869-bus system.

REFERENCES

[1] F. Alizadeh-Dehkharghani, “Combinatorial optimization with interiorpoint methods and semi-definite matrices,” Ph.D. dissertation, Universityof Minnesota, 1992.

[2] Y. Nesterov and A. Nemirovskii, Interior-point polynomial algorithmsin convex programming. SIAM, 1994, vol. 13.

[3] M. X. Goemans and D. P. Williamson, “Improved approximation algo-rithms for maximum cut and satisfiability problems using semidefiniteprogramming,” Journal of the ACM (JACM), vol. 42, no. 6, pp. 1115–1145, 1995.

[4] S. P. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, Linear matrixinequalities in system and control theory. SIAM, 1994, vol. 15.

[5] Y. Nesterov, “Semidefinite relaxation and nonconvex quadratic optimiza-tion,” Optimization methods and software, vol. 9, no. 1-3, pp. 141–160,1998.

[6] Z.-Q. Luo, “Applications of convex optimization in signal processingand digital communication,” Mathematical Programming, vol. 97, no.1-2, pp. 177–207, 2003.

[7] J. Lavaei and S. Low, “Zero duality gap in optimal power flow problem,”IEEE Transactions on Power Systems, vol. 27, no. 1, pp. 92–107, 2012.

[8] T.-H. Chang, Z.-Q. Luo, and C.-Y. Chi, “Approximation bounds forsemidefinite relaxation of max-min-fair multicast transmit beamformingproblem,” IEEE Transactions on Signal Processing, vol. 56, no. 8, pp.3932–3943, 2008.

[9] H. D. Mittelmann and F. Vallentin, “High-accuracy semidefinite pro-gramming bounds for kissing numbers,” Experimental Mathematics,vol. 19, no. 2, pp. 175–179, 2010.

[10] E. de Klerk and R. Sotirov, “Improved semidefinite programmingbounds for quadratic assignment problems with suitable symmetry,”Mathematical Programming, vol. 133, no. 1-2, pp. 75–91, 2012.

[11] L. Vandenberghe and S. Boyd, “Semidefinite programming,” SIAMReview, vol. 38, no. 1, pp. 49–95, 1996.

[12] K. Fujisawa, K. Nakata, M. Yamashita, and M. Fukuda, “SDPA project:Solving large-scale semidefinite programs,” Journal of the OperationsResearch Society of Japan, vol. 50, no. 4, pp. 278–298, 2007.

[13] K. Fujisawa, H. Sato, S. Matsuoka, T. Endo, M. Yamashita, andM. Nakata, “High-performance general solver for extremely large-scale semidefinite programming problems,” in Proceedings of the In-ternational Conference on High Performance Computing, Networking,Storage and Analysis, 2012.

[14] M. S. Andersen, J. Dahl, and L. Vandenberghe, “Implementation ofnonsymmetric interior-point methods for linear optimization over sparsematrix cones,” Mathematical Programming Computation, vol. 2, no. 3-4,pp. 167–201, 2010.

[15] K. Nakata, K. Fujisawa, M. Fukuda, M. Kojima, and K. Murota,“Exploiting sparsity in semidefinite programming via matrix completionII: Implementation and numerical results,” Mathematical Programming,vol. 95, no. 2, pp. 303–327, 2003.

[16] M. Fukuda, M. Kojima, K. Murota, and K. Nakata, “Exploiting sparsityin semidefinite programming via matrix completion I: General frame-work,” SIAM Journal on Optimization, vol. 11, no. 3, pp. 647–674,2001.

[17] R. Madani, M. Ashraphijuo, and J. Lavaei, “Promises of conic relax-ation for contingency-constrained optimal power flow problem,” IEEETransactions on Power Systems, vol. 31, no. 2, pp. 1297–1307, 2016.

[18] D. Gabay and B. Mercier, “A dual algorithm for the solution of nonlinearvariational problems via finite element approximation,” Computers &Mathematics with Applications, vol. 2, no. 1, pp. 17 – 40, 1976.

[19] R. Glowinsk and A. Marroco, “Sur l’approximation, par lments finisd’ordre un, et la rsolution, par pnalisation-dualit d’une classe de prob-lmes de dirichlet non linaires,” ESAIM: Mathematical Modelling andNumerical Analysis, vol. 9, no. R2, pp. 41–76, 1975.

[20] T. Goldstein, B. O’Donoghue, S. Setzer, and R. Baraniuk, “Fast alternat-ing direction optimization methods,” SIAM Journal on Imaging Sciences,vol. 7, no. 3, pp. 1588–1623, 2014.

[21] Y. Nesterov, “A method of solving a convex programming problem withconvergence rate O(1/k2),” Soviet Mathematics Doklady, vol. 27, no. 2,pp. 372–376, 1983.

[22] P. Giselsson and S. Boyd, “Diagonal scaling in Douglas-Rachfordsplitting and ADMM,” in IEEE Conference on Decision and Control,2014.

[23] B. He and X. Yuan, “On the O(1/n) convergence rate of the Douglas–Rachford alternating direction method,” SIAM Journal on NumericalAnalysis, vol. 50, no. 2, pp. 700–709, 2012.

[24] R. D. C. Monteiro and B. F. Svaiter, “Iteration-complexity of block-decomposition algorithms and the alternating direction method of mul-tipliers,” SIAM Journal on Optimization, vol. 23, no. 1, pp. 475–507,2013.

[25] R. Nishihara, L. Lessard, B. Recht, A. Packard, and M. I. Jordan,“A general analysis of the convergence of ADMM,” in InternationalConference on Machine Learning, 2015.

[26] Z. Wen, D. Goldfarb, and W. Yin, “Alternating direction augmentedLagrangian methods for semidefinite programming,” Mathematical Pro-gramming, vol. 2, no. 3-4, pp. 203–230, 2010.

[27] Y. Sun, M. S. Andersen, and L. Vandenberghe, “Decomposition inconic optimization with partially separable structure,” SIAM Journal onOptimization, vol. 24, no. 2, pp. 873–897, 2014.

[28] R. A. Jabr, “Exploiting sparsity in SDP relaxations of the OPF problem,”IEEE Transactions on Power Systems, vol. 27, no. 2, pp. 1138–1139,2012.

[29] Y. Weng, Q. Li, R. Negi, and M. Ilic, “Distributed algorithm for SDPstate estimation,” in Innovative Smart Grid Technologies, 2013.

[30] A. Hauswirth, T. Summers, J. Warrington, J. Lygeros, A. Kettner, andA. Brenzikofer, “A modular AC optimal power flow implementation fordistribution grid planning,” in IEEE PowerTech, 2015.

[31] H. Zhu and G. B. Giannakis, “Power system nonlinear state estimationusing distributed semidefinite programming,” IEEE Journal of SelectedTopics in Signal Processing, vol. 8, no. 6, pp. 1039–1050, 2014.

[32] E. Dall’Anese, H. Zhu, and G. B. Giannakis, “Distributed optimal powerflow for smart microgrids,” IEEE Transactions on Smart Grid, vol. 4,no. 3, pp. 1464–1475, 2013.

[33] A. Kalbat and J. Lavaei, “A fast distributed algorithm for decomposablesemidefinite programs,” in IEEE Conference on Decision and Control,2015.

[34] Q. Peng and S. H. Low, “Distributed optimal power flow algorithmfor radial networks, I: Balanced single phase case,” to appear in IEEETransactions on Smart Grid, 2016.

[35] R. Madani, A. Kalbat, and J. Lavaei, “ADMM for sparse semidefiniteprogramming with applications to optimal power flow problem,” in IEEEConference on Decision and Control, 2015.

Page 12: Ramtin Madani, Abdulrahman Kalbat and Javad Lavaeilavaei/Dis_com_2016.pdf · Ramtin Madani, Abdulrahman Kalbat and Javad Lavaei ... “Ramtin Madani, Abdulrahman Kalbat and Javad

12

[36] R. Madani, S. Sojoudi, and J. Lavaei, “Convex relaxation for optimalpower flow problem: Mesh networks,” IEEE Transactions on PowerSystems, vol. 30, no. 1, pp. 199–211, 2015.

[37] C. Josz and D. K. Molzahn, “Moment/sum-of-squares hierarchy forcomplex polynomial optimization,” arXiv preprint arXiv:1508.02068,2015.

[38] S. Fattahi, M. Ashraphijuo, J. Lavaei, and A. Atamturk, “Conic relax-ations of the unit commitment problem,” Energy, vol. 134, pp. 1079–1095, 2017.

[39] J. Momoh, R. Adapa, and M. El-Hawary, “A review of selected optimalpower flow literature to 1993. I. nonlinear and quadratic programmingapproaches,” IEEE Transactions on Power Systems,, vol. 14, no. 1, pp.96–104, 1999.

[40] J. Carpentier, “Contribution a l etude du dispatching economique,”Bulletin Society Francaise Electricians, vol. 3, no. 8, pp. 431–447, 1962.

[41] M. B. Cain, R. P. O’Neill, and A. Castillo, “History of optimal powerflow and formulations,” Technical Report, 2012.

[42] S. Sojoudi and J. Lavaei, “Physics of power networks makes hardoptimization problems easy to solve,” in IEEE Power and Energy SocietyGeneral Meeting, 2012.

[43] S. H. Low, “Convex relaxation of optimal power flow–Part I: Formulations and equivalence,” IEEE Transactions on Control of Network Systems, vol. 1, no. 1, pp. 15–27, 2014.

[44] J. Lavaei, D. Tse, and B. Zhang, “Geometry of power flows and optimization in distribution networks,” IEEE Transactions on Power Systems, vol. 29, no. 2, pp. 572–583, 2014.

[45] S. Sojoudi and J. Lavaei, “Exactness of semidefinite relaxations for nonlinear optimization problems with underlying graph structure,” SIAM Journal on Optimization, vol. 24, no. 4, pp. 1746–1778, 2014.

[46] L. Gan, N. Li, U. Topcu, and S. H. Low, “Optimal power flow in distribution networks,” in IEEE Conference on Decision and Control, 2013.

[47] R. Y. Zhang, C. Josz, and S. Sojoudi, “Conic optimization theory: Convexification techniques and numerical algorithms,” arXiv preprint arXiv:1709.08841, 2017.

[48] R. Madani, J. Lavaei, and R. Baldick, “Convexification of power flow problem over arbitrary networks,” in IEEE Conference on Decision and Control, 2015.

[49] R. Madani, J. Lavaei, R. Baldick, and A. Atamturk, “Power system state estimation and bad data detection by means of conic relaxation,” in Hawaii International Conference on System Sciences, 2017.

[50] S. Sojoudi and S. H. Low, “Optimal charging of plug-in hybrid electric vehicles in smart grids,” in IEEE Power and Energy Society General Meeting, 2011.

[51] R. Madani, S. Sojoudi, G. Fazelnia, and J. Lavaei, “Finding low-rank solutions of sparse linear matrix inequalities using convex optimization,” SIAM Journal on Optimization, vol. 27, no. 2, pp. 725–758, 2017.

[52] R. Halin, “S-functions for graphs,” Journal of Geometry, vol. 8, no. 1-2, pp. 171–186, 1976.

[53] J. R. Gilbert, “Some nested dissection order is nearly optimal,” Information Processing Letters, vol. 26, no. 6, pp. 325–328, 1988.

[54] H. L. Bodlaender and A. M. Koster, “Treewidth computations I. Upper bounds,” Information and Computation, vol. 208, no. 3, pp. 259–275, 2010.

[55] ——, “Treewidth computations II. Lower bounds,” Information and Computation, vol. 209, no. 7, pp. 1103–1119, 2011.

[56] R. Grone, C. R. Johnson, E. M. Sa, and H. Wolkowicz, “Positive definite completions of partial Hermitian matrices,” Linear Algebra and its Applications, vol. 58, pp. 109–124, 1984.

[57] M. Laurent, “Polynomial instances of the positive semidefinite and Euclidean distance matrix completion problems,” SIAM Journal on Matrix Analysis and Applications, vol. 22, no. 3, pp. 874–894, 2001.

[58] M. Laurent and A. Varvitsiotis, “A new graph parameter related to bounded rank positive semidefinite matrix completions,” Mathematical Programming, vol. 145, no. 1-2, pp. 291–325, 2014.

[59] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.

[60] N. J. Higham, “Computing a nearest symmetric positive semidefinite matrix,” Linear Algebra and its Applications, vol. 103, pp. 103–118, 1988.

[61] B. He and X. Yuan, “On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers,” Numerische Mathematik, vol. 130, no. 3, pp. 567–577, 2014.

[62] E. Ghadimi, A. Teixeira, I. Shames, and M. Johansson, “Optimal parameter selection for the alternating direction method of multipliers (ADMM): quadratic problems,” IEEE Transactions on Automatic Control, vol. 60, no. 3, pp. 644–658, 2015.

[63] S. Fliscounakis, P. Panciatici, F. Capitanescu, and L. Wehenkel, “Contingency ranking with respect to overloads in very large power systems taking into account uncertainty, preventive, and corrective actions,” IEEE Transactions on Power Systems, vol. 28, no. 4, pp. 4909–4917, 2013.

[64] C. Josz, S. Fliscounakis, J. Maeght, and P. Panciatici, “AC Power Flow Data in Matpower and QCQP Format: iTesla, RTE Snapshots, and PEGASE,” arXiv preprint arXiv:1603.01533, 2016.

Ramtin Madani is an Assistant Professor at the Electrical Engineering Department of the University of Texas at Arlington. He received the Ph.D. degree in Electrical Engineering from Columbia University in 2015, and was a postdoctoral scholar in the Department of Industrial Engineering and Operations Research at the University of California, Berkeley, in 2016.

Abdulrahman Kalbat is an Assistant Professor in the Electrical Engineering Department at United Arab Emirates University. He received the Ph.D. degree in Electrical Engineering from Columbia University in 2015.

Javad Lavaei (SM’10) is an Assistant Professor in the Department of Industrial Engineering and Operations Research at the University of California, Berkeley. He obtained his Ph.D. degree in Control and Dynamical Systems from the California Institute of Technology in 2011. He has won multiple awards, including the NSF CAREER Award, Office of Naval Research Young Investigator Award, AFOSR Young Faculty Award, DARPA Young Faculty Award, Google Faculty Research Award, Donald P. Eckman Award, Resonate Award, INFORMS Optimization Society Prize for Young Researchers, INFORMS ENRE Energy Best Publication Award, and SIAM Control and Systems Theory Prize.