
The Pennsylvania State University

The Graduate School

Department of Industrial and Manufacturing Engineering

ON DISTRIBUTED OPTIMIZATION METHODS FOR SOLVING

OPTIMAL POWER FLOW PROBLEM OVER

ELECTRICITY GRIDS

A Thesis in

Industrial and Manufacturing Engineering

by

Jinwei Zhang

© 2016 Jinwei Zhang

Submitted in Partial Fulfillment of the Requirements

for the Degree of

Master of Science

August 2016


The thesis of Jinwei Zhang was reviewed and approved* by the following:

Necdet Serhat Aybat
Assistant Professor of Industrial and Manufacturing Engineering
Thesis Adviser

Constantino Lagoa
Professor of Electrical Engineering

Janis Terpenny
Professor of Industrial and Manufacturing Engineering
Head of the Department of Industrial and Manufacturing Engineering

*Signatures are on file in the Graduate School.


Abstract

The optimal power flow (OPF) problem seeks the settings of a given power system network that optimize a particular system objective, such as minimizing the generation cost or the power loss in the network. The OPF problem is nonconvex and generally hard to solve. Numerous mathematical optimization methods have been studied, and conditions on the power network that ensure the OPF can be solved in polynomial time have been analyzed. Owing to improvements in acquisition and storage technologies, distributed collection of huge amounts of data in power systems is now possible, and there is increasing awareness that statistical and computational algorithms should be designed to work in a decentralized manner so that they can be implemented on problems whose data are distributed, not stored at a central location, due to memory limits and/or privacy requirements. In this thesis, the OPF network is modeled as a graph G = (N, E) of buses N = {1, . . . , N} connected by the branches in E. The objective function is modeled as a sum of composite convex functions Fi = ξi + fi, i = 1, . . . , N, with non-smooth part ξi and smooth part fi, respectively. We show that the distributed first-order augmented Lagrangian (DFAL) method and the distributed linearized alternating direction method of multipliers with proximal gradient (PG-ADMM) can effectively solve the OPF problem under the simple assumption that only buses connected by a branch can exchange state information. These two methods are implemented in MATLAB to solve OPF consensus formulations. Numerical results are given to examine the convergence behavior of each algorithm.

Key Words: Optimal Power Flow, Multi-agent Consensus Model, Distributed Optimization, Composite Convex Function, Linearized ADMM, Augmented Lagrangian.


Table of Contents

List of Tables
List of Figures
Chapter 1. Introduction
Chapter 2. Optimal Power Flow Problem
  2.1 Formulation of OPF problem
    2.1.1 General Structure
    2.1.2 Objective function
    2.1.3 Variables
    2.1.4 Constraints
    2.1.5 Standard OPF Formulation
  2.2 Hardness of OPF problem
  2.3 Techniques to solve OPF
    2.3.1 Traditional Optimization Methods
    2.3.2 Convex Relaxations for OPF
Chapter 3. Multi-agent Consensus Problem
  3.1 Formulation of multi-agent consensus optimization problem
  3.2 Centralized vs Decentralized methods
  3.3 Techniques to solve multi-agent consensus optimization problem
    3.3.1 Related work
    3.3.2 DFAL algorithm for distributed optimization
    3.3.3 Proximal gradient ADMM for distributed optimization
Chapter 4. Implementation and Numerical Results
  4.1 Distributed OPF problem
    4.1.1 Data format
    4.1.2 Consensus Constraints
    4.1.3 Line Loss and Generation Cost
  4.2 Implementation Details
    4.2.1 Implementation of DFAL
    4.2.2 Implementation of DPGA-II
  4.3 Numerical results
Chapter 5. Conclusion
Appendix. MATPOWER DATA FORMAT
Bibliography


List of Tables

2.1 OPF typical variables
2.2 OPF typical constraints
4.1 Network characteristics of IEEE benchmark system cases
4.2 Comparison of DFAL and DPGA-II in Line Loss
4.3 Comparison of DFAL and DPGA-II in Generation Cost
A.1 Bus Data (mpc.bus)
A.2 Generator Data (mpc.gen)
A.3 Branch Data (mpc.branch)
A.4 Generator Cost Data (mpc.gencost)


List of Figures

3.1 First-order Augmented Lagrangian (DFAL) algorithm
3.2 Multi-Step Accelerated Prox. Gradient (MS-APG) algorithm
3.3 Distributed Proximal Gradient Algorithm I (DPGA-I)
3.4 Distributed Proximal Gradient Algorithm II (DPGA-II)
4.1 Implementation of the DFAL algorithm
4.2 Implementation of the DPGA-II algorithm
4.3 Convergence behavior for Line Loss with the DFAL algorithm
4.4 Convergence behavior for Line Loss with the DPGA-II algorithm
4.5 Convergence behavior for Generation Cost with the DFAL algorithm
4.6 Convergence behavior for Generation Cost with the DPGA-II algorithm


Chapter 1

Introduction

The optimal power flow (OPF) problem was first introduced by Carpentier in 1962 [1]. The goal of the OPF problem is to find the optimal settings of a given power network that optimize particular system-related objectives, such as total generation cost, power loss, bus voltage deviation, emission of generating units, and number of control actions, while satisfying power flow equations, system security, and equipment operating limits. Different control variables, such as generators' real power outputs and voltages, transformer tap-changing settings, and operating rules for phase shifters, switched capacitors, and reactors, are manipulated to achieve an optimal network setting based on the problem formulation [2]. Since then, the OPF problem has been extensively studied, and numerous algorithms have been proposed to solve this highly nonconvex problem, including linear programming, quadratic programming, and nonlinear programming techniques such as Newton's method and interior point methods, as well as artificial neural networks, fuzzy logic, genetic algorithms, evolutionary programming, and particle swarm optimization [3, 4, 5, 6, 7, 8, 9]. The nonconvexity of the OPF problem is partially due to the nonlinearity of the physical system governing real (active) power, reactive power, and voltage magnitude [10]. Many of these methods are based on the Karush-Kuhn-Tucker (KKT) necessary conditions; therefore they can only guarantee a local solution [9].

In order to compute a globally optimal solution, different conic and convex relaxations [11, 12] have been proposed to convexify the OPF problem. Among these convexification methods, Lavaei et al. [13] considered the OPF problem with a quadratic generation cost function and studied the Lagrangian dual of OPF, which can be formulated as a semidefinite program (SDP). They gave sufficient conditions guaranteeing that the duality gap between the OPF problem and its dual is zero; this is the main advantage of convexifying OPF through its Lagrangian dual, since a globally optimal solution of OPF can then be found (in polynomial time). Moreover, a second-order cone programming (SOCP) relaxation has been proposed to convexify and solve the OPF problem even more efficiently under certain conditions [14]. The SOCP relaxation will be applied in this thesis as the main modification to the OPF problem to approximate it with a convex model.

Suppose the generation and distribution system is given as a graph G = (N, E) of nodes N = {1, . . . , N} connected by the branches in E. Without loss of generality, assume that (i, j) ∈ E implies i < j, and define

N(i) = {j ∈ N : (i, j) ∈ E or (j, i) ∈ E}    (1.1)

as the set of nodes neighboring i ∈ N. We assume that each node i has a composite convex function

Fi(x) = ξi(x) + fi(x)    (1.2)

with a possibly non-smooth convex part ξi and a smooth convex part fi, respectively, and we are interested in minimizing an overall system objective subject to power flow constraints, i.e.,

min_{x ∈ X}  Σ_{i ∈ N} Fi(x),    (1.3)

where X ⊂ R^n denotes the feasible set of decision variables x associated with power flow and node-specific constraints. For details of how the variables x are defined and X is modeled based on the OPF problem, see Section 2.1. Traditionally, the OPF problem has been solved in a centralized fashion by communicating all the objective functions, such as generation cost functions, to a central node and solving the overall problem at this node. However, such an approach can be prohibitively expensive from both communication and computation perspectives. In such a case, all the local data needs to be transmitted to the central node, which may also violate privacy constraints of the grid nodes. Furthermore, it requires that the central node have enough memory to accommodate all the data [15]. Considering these disadvantages of the centralized method, we are motivated to seek consensus among all the nodes on an optimal decision using local decisions communicated by the neighboring nodes:

min_{xi ∈ Xi, i ∈ N}  { Σ_{i ∈ N} Fi(xi) : xi = xj, ∀(i, j) ∈ E },    (1.4)


where the goal is to collectively solve the OPF problem in order to optimize a global objective function of the system, and Xi is the portion of the feasible set X assigned to node i. We assume that only local information exchange is allowed, i.e., there is no central node where the data can be consolidated, and only neighboring nodes can communicate. We show that the OPF problem can be formulated as (1.4); hence, it can be regarded as a typical instance of the decentralized multi-agent consensus optimization problem. (1.4) is a generic model that can be used not only for power systems but also for various applications in signal processing [16] and machine learning [17]. Decentralized multi-agent consensus optimization is motivated by the emergence of large-scale networks such as mobile ad hoc networks [18] and wireless sensor networks [19], characterized by the lack of centralized access to information and by time-varying connectivity. Therefore, optimization algorithms in such networks should be completely distributed, relying only on local information.

Given the importance of decentralized multi-agent consensus optimization, a number of different distributed algorithms have been proposed to solve it. In particular, Aybat et al. [15] proposed a distributed first-order augmented Lagrangian (DFAL) algorithm to solve (1.4). Based on the tests and results reported, DFAL performs very well in practice. However, its implementation on a network of distributed agents requires a rather complex network protocol: checking the subgradient stopping criterion for the inner iterations requires evaluating a logical conjunction over G, which may cause trouble for large networks. To overcome this disadvantage, Aybat et al. [20] proposed a proximal gradient alternating direction method of multipliers (PG-ADMM) and its stochastic gradient variant (SPG-ADMM) to solve composite convex problems. They implemented PG-ADMM and SPG-ADMM on two different, but equivalent, consensus formulations, which gives rise to two different node-based distributed algorithms, DPGA-I and DPGA-II, and their stochastic gradient variants. Using only local communication, these node-based distributed algorithms require less communication and memory than edge-based distributed algorithms. Moreover, the proposed algorithms consist of only a single loop, i.e., there are no outer and inner iteration loops; therefore, they are easy and practical to implement over distributed networks. The contribution of this thesis is to implement the DFAL and DPGA-II algorithms to efficiently solve a convex relaxation of the OPF problem


in a distributed manner. The IEEE benchmark networks with 3, 9, 14, and 30 buses are used as test cases, and we compare DFAL and DPGA-II in terms of solution accuracy and convergence rate. Both DFAL and DPGA-II applied to the OPF problem are fully distributed: the buses are not required to know any global parameters that depend on the entire network topology.

The rest of this thesis is organized as follows. The OPF problem is formulated in Chapter 2. The multi-agent consensus problem and some algorithms to solve it are discussed in Chapter 3. The practical implementation of DFAL and DPGA-II to solve the SOCP relaxation of the OPF problem is described in Chapter 4, together with numerical results. Concluding remarks are given in Chapter 5. Finally, the data format used in the implementation is summarized in the Appendix. The following notation will be used throughout this thesis.

• i: Imaginary unit.

• R: Set of real numbers.

• H^{n×n}: Set of n × n Hermitian matrices.

• Re{·} and Im{·}: Real and imaginary parts of a complex matrix.

• *: Conjugate transpose operator.

• ⊤: Transpose operator.

• ⪰: Matrix inequality sign in the positive semidefinite sense (i.e., given two symmetric matrices A and B, A ⪰ B means that A − B is a positive semidefinite matrix, meaning that its eigenvalues are all nonnegative).

• Tr{·}: The matrix trace operator.

• | · |: The absolute value operator.

Given complex values a1 and a2, the inequality a1 ≥ a2 means Re{a1} ≥ Re{a2} and Im{a1} ≥ Im{a2}.


Chapter 2

Optimal Power Flow Problem

Power flow is also known as "load flow". This is the name given to a network solution that consists of the complex currents, complex voltages, real power, and reactive power at every bus in the system. Since we assume that the parameters of system elements, such as lines and transformers, are constant, the power system under consideration is a linear network. However, in the power flow problem, the relationship between voltage and current at each bus is nonlinear, and the same holds for the relationship between the real and reactive power consumption at a bus, or between the generated real power and the scheduled voltage magnitude at a generator bus. Thus, even computing a feasible power flow for a given amount of power injected at each generator bus involves the solution of nonlinear equations. Therefore, OPF problems are nonconvex and, in general, large-scale optimization problems that may contain both continuous and discrete control variables. Many different OPF formulations have been developed to address specific instances of the problem under various assumptions, each having different objective functions, controls, and constraints. Regardless of the different names given to different cases, any power system optimization problem that includes a set of power flow equations among its constraints may be classified as an OPF problem.

OPF is an important part of power system operation and planning. It describes the optimal electrical response of the transmission system to a particular set of loads and power injections. In traditional power systems, OPF is mainly used for planning purposes together with a forecast engine, e.g., to determine the system state in the day-ahead market given the current system information. In the smart grid paradigm, due to the highly intermittent nature of renewables, the later a prediction is made, the more reliable it is. Therefore, if OPF can be solved very efficiently in real time, some of the unpredictability in the system can be mitigated.


2.1 Formulation of OPF problem

2.1.1 General Structure

Generally, OPF requires solving a system of nonlinear equations and inequalities describing the optimal and secure operation of a power system:

min_{x,u}    f(x, u)
subject to   g(x, u) = 0    (2.1)
             h(x, u) ≤ 0,

where

i. f(x, u) is an appropriate cost to be minimized;

ii. x is the vector of state variables;

iii. u is the vector of control variables, which are usually the independent variables in an OPF;

iv. g(x, u) is the set of equality constraints resulting from the power flow equations;

v. h(x, u) is the set of inequality constraints resulting from the physical limits imposed on the vector arguments x and u.

Depending on the objective function and constraints, there are different mathematical formulations of the OPF problem. They can be broadly classified as follows (a generic solver sketch of (2.1) follows this list):

i. Linear problems, in which the objective and constraints are given in linear form with continuous variables.

ii. Nonlinear problems, where the objective, the constraints, or both are nonlinear with continuous control variables.

iii. Mixed-integer linear problems, where the control variables are both discrete and continuous within a linear setting.
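The generic form (2.1) maps directly onto off-the-shelf NLP solver interfaces. Below is a minimal MATLAB sketch, assuming the Optimization Toolbox is available; the toy cost f, constraints g and h, and all numbers are illustrative placeholders, not an actual OPF model.

    % Minimal sketch: casting the generic form (2.1) into MATLAB's fmincon
    % interface. fmincon expects nonlinear constraints as [c, ceq] with
    % c(z) <= 0 and ceq(z) = 0, matching h and g in (2.1).
    z0 = [1; 0.5];                    % stacked initial guess for (x, u)
    f  = @(z) z(1)^2 + 2*z(2)^2;      % f(x,u): toy cost to be minimized
    g  = @(z) z(1) + z(2) - 1;        % g(x,u) = 0: placeholder balance equation
    h  = @(z) 0.2 - z(2);             % h(x,u) <= 0: placeholder operating limit
    nonlcon = @(z) deal(h(z), g(z));  % returns [c, ceq] for fmincon
    zopt = fmincon(f, z0, [], [], [], [], [], [], nonlcon);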

Page 14: ON DISTRIBUTED OPTIMIZATION METHODS FOR SOLVING …

7

2.1.2 Objective function

The most common OPF objective is to minimize the total generation cost, sometimes considering system losses as well. Generation cost functions are often approximated with quadratic cost curves [21, 22] or with piecewise linear functions [23]. Besides minimization of generation cost, other commonly adopted objectives include maximization of power quality (often by minimizing voltage deviation) and minimization of capital costs during system planning. In nearly all cases, the objective is a function of the real and reactive power generated in the system. Other objectives considered in the literature and in practice include power transfer capability, number of controls rescheduled or shifted, cost of VAR investment, optimal voltage profile, load shedding, environmental impact, system loadability, etc. [24].

2.1.3 Variables

State variables x in OPF problems represent the electrical state of the system. These continuous state variables include bus voltage magnitudes, bus voltage angles, and real and reactive power injections at each bus, as well as MVar loads and line parameters. Control variables u typically include a subset of the state variables as well as variables representing control device settings, such as transformer tap ratios, phase-shifter angles, and values of switchable shunt capacitors and inductors. Control variables may be continuous or discrete. The choice of state variables is dictated by the form of the power flow equations used, while control variables differ widely among OPF formulations based on the nature of the particular problem. Table 2.1 summarizes typical variables previously used in the literature [24].

2.1.4 Constraints

Typical equality and inequality constraints used in OPF are summarized in Table 2.2 [24]. The equality constraints g(x, u) include the power flow conservation constraints. Alternating current (AC) power flow has been adopted for use in real-life transmission systems. OPF formulations incorporating the AC power flow equations are nonlinear and thus nonconvex. To simplify the model in practice, the flow of real power in the system may be "decoupled" from the flow of reactive power, which leads to decomposing the OPF problem into separate subproblems for the real and reactive power flows [25]. Many OPF algorithms take advantage of this decomposition because it provides significant algorithmic simplification while introducing only a "small" amount of error under certain conditions. However, the decoupled approach to OPF is typically not accurate when complex control devices are present in the system [26]. Sometimes, although the current type is AC, direct current (DC) power flow equations are used to approximate the nonlinear balance equations of AC power flow. However, this simplification, i.e., using DC to approximate AC power flow, both neglects network losses and prevents accurate cost accounting for reactive power. Neglecting network losses introduces unacceptable levels of error in large power system models. Several methods are available to enhance the DC power flow equations to provide an estimate of system losses [24].

Table 2.1: OPF typical variables

State variables                         Type
  Bus voltage magnitude                 Continuous
  Bus voltage phase angle               Continuous
  Bus voltage real & imaginary parts    Continuous
  Network power flow                    Continuous
  Branch currents                       Continuous
  Slack bus power                       Continuous
  Generator reactive power output       Continuous

Control variables                       Type
  Real/reactive power generation        Continuous
  Regulated bus voltage magnitude       Continuous
  Transformer tap settings              Discrete
  Transformer phase shifters            Continuous, Discrete
  Switched shunt reactive devices       Binary
  Load to shed                          Continuous, Discrete
  MW interchange transactions           Continuous
  HVDC link MW controls                 Continuous
  FACTS controls                        Continuous, Discrete
  Generator voltage control settings    Continuous
  Standby start-up units                Binary
  Line switching                        Binary

The inequality constraints h(x, u) include minimum and maximum limits on control and state variables, such as bus voltage and line current magnitudes. Many transient security constraints may also be incorporated analytically.

Table 2.2: OPF typical constraints

Equality constraints
  Full AC power flow
  Decoupled AC power flow
  DC power flow
  Net active power export
  Steady-state security
  Other balance constraints

Inequality constraints
  Active/reactive power generation limits
  Demand constraints
  Bus voltage limits
  Branch flow limits
  Control limits
  Transmission interface limits
  Active/reactive power reserve limits
  Spinning reserve requirements
  Active/reactive power flow in a corridor
  Transient security
  Transient stability
  Transient contingencies
  Environment constraints

2.1.5 Standard OPF Formulation

The bus injection model used throughout this thesis is the standard model for power flow analysis and optimization. It is built on nodal variables such as voltage, current, and power injection, while the branch flow model focuses on the currents and powers on the branches. The buses in a power system network are generally divided into three categories: generation buses, load buses, and the slack bus. The following quantities are specified at each bus: (1) |V|: magnitude of the complex voltage V; (2) θ: phase angle of the complex voltage; (3) P: active (real) power; (4) Q: reactive power. Typically, the bus voltage magnitude and phase angle are represented by one complex variable V instead of two separate real variables.

Consider a power system network with the set of buses N := {1, 2, . . . , N}, the set of generator buses I ⊂ N, and the set of flow lines E ⊂ N × N. Define:

• PDi + iQDi: Complex power of the load connected to bus i ∈ N.

• PGi + iQGi: Complex power output of the generator connected to bus i ∈ N.

• Vi: Complex voltage at bus i ∈ N.


• yij: Admittance of the transmission line (i, j) ∈ E, with yij = gij − i bij.

Define V, PG, QG, PD, and QD as the vectors {Vi}i∈N, {PGi}i∈N, {QGi}i∈N, {PDi}i∈N, and {QDi}i∈N, respectively. Suppose fi(PGi) is a convex function representing the cost associated with generator i ∈ I, and our objective is to minimize the total generation cost. This problem can be formulated as follows:

min_{PG,V}   f(PG) = Σ_{i∈I} fi(PGi)
subject to   PGi + iQGi = PDi + iQDi + Vi Σ_{j∈N(i)} yij* (Vi − Vj)*,   ∀i ∈ N    (2.2a)
             Pi^min ≤ PGi ≤ Pi^max,    ∀i ∈ I    (2.2b)
             Qi^min ≤ QGi ≤ Qi^max,    ∀i ∈ I    (2.2c)
             Vi^min ≤ |Vi| ≤ Vi^max,   ∀i ∈ N    (2.2d)
             |Vi − Vj| ≤ ΔVij^max,     ∀(i, j) ∈ E    (2.2e)

In this formulation, PG and QG are the controllable parameters of the power network, while PD and QD are fixed. Therefore, once PG and QG are set, the voltages V throughout the network G are determined by the power flow equation (2.2a). Given the known vectors PD and QD, OPF minimizes the objective function over the unknown parameters V, PG, and QG subject to the power flow equations at all buses as well as the physical constraints. We assume that PGi = QGi = 0 if i ∉ I, and that the limits Pi^min, Pi^max, Qi^min, Qi^max, Vi^min, Vi^max, and ΔVij^max are given.
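To make (2.2a) concrete, the following MATLAB sketch evaluates the complex power mismatch at each bus for a candidate operating point; the 3-bus network data below is made up for illustration and is not one of the IEEE cases used later.

    % Sketch: mismatch of the power flow equation (2.2a) at a candidate point.
    V  = [1.00; 0.98*exp(-1i*0.02); 1.01*exp(1i*0.01)];  % complex bus voltages
    SG = [0.9+0.3i; 0; 0.4+0.1i];                        % PGi + i*QGi per bus
    SD = [0.2+0.1i; 0.7+0.2i; 0.3+0.1i];                 % PDi + i*QDi per bus
    edges = [1 2; 2 3; 1 3];                             % flow lines E
    y     = [2-6i; 1-4i; 1.5-5i];                        % branch admittances yij
    N = numel(V); mismatch = zeros(N,1);
    for i = 1:N
        inj = 0;                                  % Vi * sum_j yij^* (Vi - Vj)^*
        for e = 1:size(edges,1)
            if edges(e,1) == i, j = edges(e,2);
            elseif edges(e,2) == i, j = edges(e,1);
            else, continue; end
            inj = inj + V(i) * conj(y(e)) * conj(V(i) - V(j));
        end
        mismatch(i) = SG(i) - SD(i) - inj;        % zero iff (2.2a) holds at bus i
    end
    disp(max(abs(mismatch)))                      % overall feasibility violation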

2.2 Hardness of OPF problem

There are three main difficulties that make the OPF problem hard to solve.

(1) Active Constraints: Given a feasible point, the inequality constraints satisfied at this point with strict inequality are called inactive, while those satisfied with equality are referred to as binding or active constraints. Finding the active inequality constraints has a combinatorial aspect, and it is a difficult part of solving OPF problems. No direct methods exist for solving OPF without using an intermediate optimization method to identify the active set at an optimal solution [2].

(2) Nonconvexity: Even though some OPF problems can be formulated with a linear objective function and constraints (e.g., the DC-OPF problem), in most cases the formulation has a nonlinear form due to the power flow equations, which represent the physical constraints on the electric grid and capture the highly nonlinear interplay among real power, reactive power, and complex voltage, as shown in equation (2.2a). Hence, being described by nonlinear equations, the nonconvex feasible region may even be disconnected. Consequently, the KKT conditions are not sufficient for a global optimum in general. In particular, for AC power systems, the OPF problem is inherently nonconvex, giving rise to many local optima. As a result, existing solution methods used extensively in practice rely on iterative optimization methods, which can only return locally optimal solutions at best. To summarize, the nonconvexity of the OPF problem in general prevents solving OPF in polynomial time; indeed, OPF is NP-hard.

(3) Network Complexity: Traditional mathematical optimization methods have been used to effectively solve conventional OPF problems where the constraints are represented in steady state, without considering system contingencies that can occur temporarily. Another difficulty in this respect is to accurately model and handle the binary and integer variables representing node-level switch controls in the power system, in addition to the conventional OPF problem. Moreover, due to the emergence of a deregulated electricity market and the consideration of dynamic system properties, the traditional concepts and practices of power systems are overruled by economic market management. Therefore, the difficulty of solving OPF problems increases significantly with increasing network size and complexity [13].


2.3 Techniques to solve OPF

2.3.1 Traditional Optimization Methods

The majority of traditional techniques discussed in the literature use one of the following methods: gradient descent, Newton's method, the simplex method, sequential linear programming (SLP), sequential quadratic programming (SQP), and interior point methods (IPM). The simplex method is suitable for LP-based OPF and can be directly applied to the DC-OPF formulation [27, 28]. SLP is an extension of LP introduced by Griffith and Stewart [29] that allows optimizing problems with nonlinear characteristics via a series of linear approximations. In certain cases, the original NLP formulation can be reduced to an LP using linear approximations of the objective function and constraints around an initial estimate of the optimal solution [23, 30, 31, 32]. SQP is a solution technique for NLP problems; similar to SLP, it solves the original problem via a series of QP problems whose solutions converge to the optimal solution of the original problem [33]. In most SQP implementations for the OPF problem, the conventional power flow equations are linearized at each iteration, which can increase the computational efficiency at the cost of an increase in the number of iterations. A significant amount of work has been reported in the literature on the implementation of SQP for solving OPF problems [25, 34, 35, 36]. The following methods focus on directly solving the NLP rather than solving LP or QP approximations.

(1) Gradient Methods: Gradient methods were among the first attempts, around the end of the 1960s, to solve practical OPF problems. They can be divided into three main streams of research: the Reduced Gradient (RG) method, the Conjugate Gradient (CG) method, and the Generalized Reduced Gradient (GRG) method. Gradient methods use the first-order derivative vector ∇f(x_k) of the objective function at the current iterate x_k to determine an improving direction. They are easy to implement and guaranteed to converge for well-behaved functions. However, gradient methods are slow, i.e., they require more iterations than higher-order methods. Moreover, because they do not evaluate the second-order derivative, they are only guaranteed to converge to a stationary point (which may not be a true local optimal solution). Global optimality can only be proven for convex problems, which excludes most OPF formulations [24].


RG was first applied to solve OPF problems by Dommel and Tinney [37]. It used penalty techniques to enforce the box constraints on state variables and the functional constraints. The CG method is an improvement of the RG method, and it is one of the most well-known iterative methods for solving NLP problems with sparse systems of linear equations. Instead of using the negative reduced gradient as the direction of descent, the CG method chooses the descent direction so that it is conjugate to previous search directions, by adding the current negative gradient vector to a scaled linear combination of previous search directions. There are several advantages to applying the CG method to OPF problems, particularly the improvement in search characteristics over the RG method [25]. The GRG method is an extension of the RG method that enables direct treatment of inequality and nonlinear constraints. Rather than using penalty functions, the GRG method modifies the constraints by introducing slack variables for all inequality constraints; then all constraints are linearized about the current operating point. Thus, the original problem is transformed into a series of subproblems with linear constraints that can be solved by the RG or CG method [38]. However, since linearization introduces error in the constraints, an additional step is required to modify the variables at the end of each iteration to recover feasibility. In OPF, this feasibility recovery is performed by solving a conventional power flow [39].

(2) Newton's Method: Newton's method is a second-order method for unconstrained optimization based on the Taylor series expansion. The search direction at the current point is set to d_k = −H(x_k)^{-1} ∇f(x_k), where H(x_k) denotes the Hessian of f at x_k. The method then computes a step size α_k > 0 in direction d_k satisfying certain step-size selection rules, such as inexact line search with backtracking. However, the method is not guaranteed to converge to a local minimum, as the Hessian H may not be positive semidefinite in a sufficiently large vicinity of the minimum point. Newton's method requires the use of a Lagrangian function when applied to OPF problems. Inequality constraints originating from the power system's physical limits must either be treated as equality constraints or omitted, depending on whether or not they are binding at the optimal solution. Since the active inequalities are not known prior to the solution, identifying the active inequality constraints is a major challenge for Newton-based OPF solutions [40], as mentioned in Section 2.2. After Sasson et al. [41] presented an early version of a Newton-based OPF method, and Sun et al. [42] presented a more efficient algorithm employing the Lagrangian, researchers made significant contributions focusing on identifying and enforcing the active inequality constraints [43, 44, 45, 46, 47].
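For concreteness, here is a minimal MATLAB sketch of one such damped Newton iteration with Armijo backtracking; the smooth quartic toy objective stands in for the Lagrangian whose gradient and Hessian a Newton-based OPF code would supply.

    % Sketch: damped Newton iteration d_k = -H(x_k)\grad f(x_k) with backtracking.
    f    = @(x) x(1)^4 + x(2)^2 + x(1)*x(2);     % toy smooth objective
    grad = @(x) [4*x(1)^3 + x(2); 2*x(2) + x(1)];
    hess = @(x) [12*x(1)^2, 1; 1, 2];
    x = [2; 1];
    for k = 1:20
        d = -hess(x) \ grad(x);                  % Newton direction
        a = 1;                                   % Armijo backtracking line search
        while f(x + a*d) > f(x) + 1e-4*a*grad(x)'*d
            a = a/2;
        end
        x = x + a*d;
        if norm(grad(x)) < 1e-8, break; end      % stationarity test
    end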

(3) Interior Point Methods (IPM): The difficulty of enforcing inequality constraints was a motivating factor for applying IPMs to solve OPF. IPMs are a family of projective scaling algorithms for solving linear and nonlinear optimization problems. They attempt to determine and follow a central path through the feasible region to the set of optimal solutions. Feasibility enforcement is achieved either by using barrier terms in the augmented objective function or by directly manipulating the required KKT conditions [48]. When applied to OPF, IPMs have seen several improvements over the years. Popular methods of this type include Primal-Dual Interior Point Methods (PDIPMs) [49, 50], Mehrotra's predictor-corrector techniques [51], Gondzio's multiple centrality corrections [52, 53], and trust region techniques [54].

Among the different IPMs, PDIPMs are perhaps the most popular deterministic algorithms studied in OPF research. Granville [49] was perhaps the first to apply PDIPMs to the OPF problem, extending earlier PDIPMs for LP and QP to the NLP case of reactive power dispatch. The key advantages over Newton's method include no requirement to identify the active constraint set or to have an initial feasible solution. Most of the research on IPMs has focused on improving the performance of PDIPMs by exploiting the structure of the power flow constraints. For example, Vanti et al. [55] proposed a modified PDIPM that uses a merit function to enhance the convergence properties for OPF. A simplified OPF formulation using rectangular coordinates and current mismatches, presented by Zhang et al. [56], leads to a simpler Hessian matrix and can reduce the computational effort.

2.3.2 Convex Relaxations for OPF

In Section 2.3.1, traditional methods for solving the nonlinear and nonconvex OPF problem were introduced. Some of these methods are based on the KKT necessary conditions, which can only guarantee convergence to a locally optimal solution due to the nonconvexity of OPF problems. Moreover, the nonlinear and nonconvex characteristics make the OPF problem NP-hard. Considering the OPF formulation given in Section 2.1.5, the nonconvexity comes from the power flow equality constraints (2.2a). For each transmission line (i, j) ∈ E, let yij = gij − i bij denote its admittance, and let Y ∈ C^{N×N} denote the admittance matrix of the power network, i.e.,

Yij :=  −yij              if i ≠ j and (i, j) ∈ E,
        Σ_{k∈N(i)} yik    if i = j,
        0                 otherwise,    (2.3)
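A short MATLAB sketch of assembling Y according to (2.3) from a branch list (the 3-bus data is a made-up example):

    % Sketch: assembling the bus admittance matrix Y of (2.3).
    N     = 3;
    edges = [1 2; 2 3; 1 3];              % flow lines E, with i < j
    y     = [2-6i; 1-4i; 1.5-5i];         % branch admittances yij = gij - i*bij
    Y = zeros(N);
    for e = 1:size(edges,1)
        i = edges(e,1); j = edges(e,2);
        Y(i,j) = -y(e);  Y(j,i) = -y(e);  % off-diagonal entries: -yij
        Y(i,i) = Y(i,i) + y(e);           % diagonal: sum of incident admittances
        Y(j,j) = Y(j,j) + y(e);
    end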

where N(i) is defined in (1.1). Let Ii denote the current injected at bus i, let I := [I1 I2 ... IN]^⊤ denote the current vector of all buses, which can be written as YV, and let Si = Pi + iQi denote the net power injection at bus i, i.e.,

Si = Pi + iQi = (PGi − PDi) + i(QGi − QDi).    (2.4)

The state of each bus can be written as:

• Current balance: Ii = Σ_{j∈N(i)} yij (Vi − Vj) for i ∈ N.

• Power balance: Si = Vi Ii* for i ∈ N.

Define e1, e2, ..., eN as the standard basis vectors of R^N, i.e., (ei)j = 0 if i ≠ j and (ei)j = 1 if i = j. Therefore,

Si = (PGi − PDi) + i(QGi − QDi) = Vi Ii* = (ei* V)(ei* Y V)* = ei* V V* Y* ei = Tr(V V* Y* ei ei*).    (2.5)
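The chain of identities in (2.5) can be verified numerically; the sketch below (reusing the toy Y assembled above, with an arbitrary voltage profile) checks that the power balance form Vi Ii* agrees with the trace form.

    % Sketch: numerical check of (2.5). In MATLAB, ' is the conjugate transpose.
    V = [1.00; 0.98*exp(-1i*0.02); 1.01*exp(1i*0.01)];
    I = Y * V;                                   % nodal currents, I = YV
    for i = 1:numel(V)
        ei = zeros(numel(V),1); ei(i) = 1;       % standard basis vector
        S_bal   = V(i) * conj(I(i));             % power balance form Vi*Ii^*
        S_trace = trace(V * V' * Y' * (ei * ei'));  % trace form of (2.5)
        assert(abs(S_bal - S_trace) < 1e-10)     % the two forms agree
    end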

(1) Semidefinite relaxation (SDP): The nonlinear term VV* in the equality constraint (2.5) can be replaced by a new matrix variable W ∈ H^{N×N} to make this constraint linear, where H^{N×N} denotes the set of N × N Hermitian matrices. Meanwhile, in order to make sure that the map from V to W is invertible, W must be constrained to be both positive semidefinite and rank one. The feasible set of the original OPF problem in Section 2.1.5 can be equivalently formulated as follows (for simplicity, we omitted the transmission line constraints):

Pi^min ≤ PGi ≤ Pi^max,                           ∀i ∈ I    (2.6a)
Qi^min ≤ QGi ≤ Qi^max,                           ∀i ∈ I    (2.6b)
(Vi^min)² ≤ Wii ≤ (Vi^max)²,                     ∀i ∈ N    (2.6c)
Tr(W Y* ei ei*) = (PGi − PDi) + i(QGi − QDi),    ∀i ∈ N    (2.6d)
W = W*,    (2.6e)
Rank(W) = 1.    (2.6f)

The constraint Rank(W) = 1 is what makes this formulation nonconvex. Removing this rank constraint from the optimization problem yields the SDP relaxation, which is a convex problem (provided the objective function is also convex). The SDP relaxation can be solved efficiently in polynomial time. However, the main difficulty is to analyze the quality of the relaxation's solution compared to the optimal solution of the original nonconvex problem. In certain scenarios the relaxation is tight.
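When the relaxation is tight and the computed W is (numerically) rank one, a voltage profile can be recovered from the leading eigenpair of W. A hedged MATLAB sketch (here W is built from a known voltage vector purely to illustrate the recovery step):

    % Sketch: recovering V from a rank-one W = V V' returned by the relaxation.
    Vtrue = [1.00; 0.98*exp(-1i*0.02); 1.01*exp(1i*0.01)];
    W = Vtrue * Vtrue';                     % rank-one Hermitian PSD matrix
    [U, D] = eig(W);
    [lam, k] = max(real(diag(D)));          % leading eigenpair of W
    V = sqrt(lam) * U(:,k);                 % recovered voltages, up to a phase
    V = V * exp(-1i*angle(V(1)));           % fix the reference-bus phase to 0
    disp(norm(V - Vtrue))                   % near zero when W is rank one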

Lavaei and Low [13] proved that the dual of the original OPF problem is the same as the dual of the SDP relaxation, and showed that under certain conditions strong duality holds between the SDP relaxation and the dual of the original OPF problem with quadratic generation cost. In general, the dual optimal value is only a lower bound on the optimal objective value of OPF, and this bound may not be tight in the presence of a nonzero duality gap. Lavaei and Low showed that a globally optimal solution to OPF can be recovered from the optimal solution to the convex dual problem whenever the duality gap between the dual and the original OPF problem is zero; in this case, the SDP relaxation is tight, and the OPF can be solved either via the SDP relaxation or via the dual relaxation. Lavaei suggests solving the dual of the OPF problem instead of the primal SDP relaxation, as the number of variables in the primal SDP relaxation increases quadratically with the number of nodes in the network, while the number of variables in the dual problem increases only linearly. This distinction is important when using primal and dual interior-point algorithms for solving the primal SDP and dual relaxations, respectively [13, 50].


In their work, a necessary and sufficient condition is provided to guarantee a zero duality gap for the OPF problem; this condition is satisfied for the standard IEEE benchmark systems with 14, 30, 57, 118, and 300 buses, as well as for several randomly generated systems. Considering the difficulty of verifying this condition, another sufficient condition is given to guarantee a zero duality gap for the IEEE systems. This condition holds when a small perturbation is added to the admittance matrix (a small resistance, 10^{-5} p.u., on every transformer that originally has zero resistance), which widely holds in practice. Moreover, Lavaei proved that the duality gap is expected to be zero for a large class of power systems due to the passivity of transmission lines and transformers: there exists an unbounded set of network topologies (admittance matrices) that make the duality gap zero for all possible values of the loads. Later, Lavaei [57] extended these results to the case where there are other sources of nonconvexity in OPF, such as variable transformer ratios, variable shunt elements, and contingency constraints. In addition to quadratic cost functions, Sojoudi and Lavaei [58] extended the previous work to OPF with arbitrary convex cost functions. They showed that, due to the physics of the power network, every OPF problem can be solved in polynomial time after applying the following approximations: (i) write the power balance equations as inequalities, and (ii) place virtual (fictitious) phase shifters in certain loops of the network.

(2) Second-order cone relaxation (SOCP): The SOCP relaxation is equivalent to the SDP relaxation for tree networks, and it has a much lower computational complexity. The challenge of solving the OPF problem comes from the nonlinear equality constraints (2.2a). To overcome this challenge, one can approximate the feasible set of the OPF problem with a convex set via a change of variables: for all i ∈ N, define

Wij = Vi Vj*,  j ∈ N(i) ∪ {i}.    (2.7)

For all (i, j) ∈ E, define

W{i,j} := [ Wii  Wij ]
          [ Wji  Wjj ]  ∈ H^{2×2}.    (2.8)


Using the new variables, the feasible set of the OPF problem can be equivalently written as

(PGi − PDi) + i(QGi − QDi) = Σ_{j∈N(i)} (Wii − Wij) yij*,  ∀i ∈ N    (2.9a)
Pi^min ≤ PGi ≤ Pi^max,           ∀i ∈ I    (2.9b)
Qi^min ≤ QGi ≤ Qi^max,           ∀i ∈ I    (2.9c)
(Vi^min)² ≤ Wii ≤ (Vi^max)²,     ∀i ∈ N    (2.9d)
W{i,j} = W{i,j}*,                ∀(i, j) ∈ E    (2.9e)
Rank(W{i,j}) = 1,                ∀(i, j) ∈ E.    (2.9f)

In this equivalent formulation, the rank constraint is nonconvex. A second-order cone programming relaxation is obtained by dropping the rank-one constraint and imposing a positive semidefinite constraint instead [58]:

W{i,j} ⪰ 0,  ∀(i, j) ∈ E.    (2.10)
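Since a nonzero 2×2 Hermitian PSD matrix has rank one exactly when its determinant vanishes, exactness of the relaxation can be checked edge by edge on the computed solution. A sketch, with a rank-one placeholder standing in for a solver's output:

    % Sketch: checking exactness of the SOCP relaxation edge by edge via (2.9f).
    V     = [1.00; 0.98*exp(-1i*0.02); 1.01*exp(1i*0.01)];
    Wfull = V * V';                          % placeholder for the solver output
    edges = [1 2; 2 3; 1 3];
    exact = true;
    for e = 1:size(edges,1)
        i = edges(e,1); j = edges(e,2);
        Wij = [Wfull(i,i) Wfull(i,j); Wfull(j,i) Wfull(j,j)];  % submatrix (2.8)
        if abs(det(Wij)) > 1e-8 * trace(real(Wij))  % nonzero det => rank 2
            exact = false;
        end
    end
    disp(exact)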

Comparing the SDP and SOCP relaxations of the OPF problem, the SDP has a matrix variable W with N(N + 1)/2 unknown entries; hence the number of scalar variables in the SDP is O(N²), and it may be hard to solve efficiently for large N. On the other hand, the SOCP relaxation is much more memory efficient than the SDP relaxation, since it imposes constraints only on submatrices of W, i.e., (2.6e) versus (2.9e). Moreover, the SOCP has a lower computational complexity than the SDP.

If the solution of the SOCP relaxation is also feasible for the original OPF problem, i.e., if it satisfies (2.9f), then the relaxation is said to be exact. Exactness of the SOCP relaxation guarantees that its solution is also a globally optimal solution of the OPF. That said, SOCP relaxations are in general not exact [59]. Several papers in the literature have discussed conditions that guarantee the exactness of the SOCP relaxation, and all of these conditions require removing some of the constraints of the OPF problem [60, 61]. Gan et al. [14] proposed a sufficient condition under which the SOCP relaxation is exact for radial networks. That said, the proposed condition cannot be verified prior to solving the relaxation. To overcome this shortcoming, they proposed a modified OPF with slightly altered voltage magnitude upper bounds in (2.2d), and they proved that the SOCP relaxation of the modified OPF problem is exact. There is no theoretical guarantee that the modified OPF has a feasible set close to that of the original OPF; therefore, although the proposed sufficient condition for the modified OPF can be checked prior to solving, the final solution of the SOCP relaxation of the modified OPF problem may not be close to the optimal solution of the original OPF problem. On the other hand, they reported promising numerical results for IEEE test networks showing that SOCP relaxations work well in practice.


Chapter 3

Multi-agent Consensus Problem

In Chapter 2, the bus injection model was used to formulate the OPF problem for a connected network G = (N, E) with the set of buses N := {1, . . . , N} and the set of flow lines E ⊂ N × N. Suppose I ⊂ N denotes the set of generator buses. The OPF problem can be studied as a special case of a cooperative optimization problem with multiple agents connected over the network G. Cooperative multi-agent optimization problems have found numerous applications in various domains, such as distributed consensus optimization [62], distributed and parallel machine learning [63], and distributed signal processing [18]. In this class of problems, control and optimization should be completely distributed, relying only on local observations and information, and the algorithms should be robust against unexpected changes in topology. This chapter reviews distributed algorithms for solving the multi-agent consensus optimization problem.

3.1 Formulation of multi-agent consensus optimization problem

Briefly, cooperative multi-agent network models consist of a distributed computation scheme, where each agent processes its local information and shares the processed information with neighboring agents over the connectivity network G = (N, E). The goal of the cooperative optimization problem is to collectively reach an optimal consensus decision that minimizes a global objective function F(x) = Σ_{i∈N} Fi(x), where Fi : R^n → R is the local objective function of agent i, i.e., it is known by agent i only. Two models are required to describe the system: (1) an information exchange model describing the evolution of the agents' information in time; and (2) an optimization algorithm specifying the details of the agents' actions that cooperatively minimize the overall system objective by individually minimizing their local objectives and exchanging information among themselves [64]. In general, the cooperative multi-agent problem can be formulated as

F* = min_x  F(x) := T(F1(x), F2(x), ..., FN(x))    (3.1)
subject to  x ∈ X := (∩_{i∈N} Xi) ∩ Xg,    (3.2)

where T(F1(x), F2(x), ..., FN(x)) denotes a given convex combination of the local objective functions Fi. The simplest structure, the sum of the local objective functions Σ_{i∈N} Fi(x), is considered in this thesis. X is defined as the intersection of all local constraint sets Xi with additional global constraints Xg that may be imposed by the network structure.

When the problem is separable, i.e., when the local objective functions and constraints decompose over the components of the decision vector, distributed optimization algorithms mainly rely on Lagrangian dual decomposition and dual methods for computing the solution. A typical example is the subgradient method used as dual ascent to solve the dual of a convex constrained optimization problem after relaxing some of the primal constraints. This decomposition leads to small subproblems that each agent can solve using its local information.

When the problem is not separable, the dual decomposition approach does not lead to a distributed method. An alternative technique is to use consensus optimization methods, in which all agents seek consensus while optimizing their local functions and exchanging information with neighbors. The main idea is to use consensus formation as a mechanism for distributing the computations among the agents. There are numerous applications of the multi-agent consensus optimization problem, including signal processing within a network of sensors, network motion planning and alignment, and distributed constrained multi-agent optimization.

3.2 Centralized vs Decentralized methods

Many problems in operations research, machine learning, and statistical analysis can be cast as optimization problems that potentially have millions of variables, with data stored in a distributed way due to memory limits and privacy requirements. Indeed, solving these problems in a centralized manner requires consolidating all the local data at a common node, which can be very expensive from both communication and computation perspectives, and it may also violate privacy constraints if node i ∈ N does not want to reveal its private data. Even if aggregating all the local data defining Fi and Xi at a common node is possible, i.e., privacy is not violated and the data can be transferred efficiently, it still requires the central node to have enough memory to accommodate all the data in the network. Therefore, at the expense of slower convergence, it may be more efficient from many perspectives to design decentralized optimization algorithms that seek consensus among all the nodes on an optimal decision. To briefly summarize, although the multi-agent consensus problem can be solved in a centralized way by communicating all the private functions Fi to a central node and solving the overall problem there to compute the optimal solution (x*, F*), this may not be feasible due to practical considerations such as memory, communication overhead, and privacy requirements.

In the rest of this chapter, we neglect the global constraints Xg for the sake of simplicity and consider the following multi-agent consensus problem:

F* := min_{x∈R^n}  { F(x) = Σ_{i∈N} Fi(x) : x ∈ X := ∩_{i∈N} Xi }.    (3.3)

Since each agent i can be seen as an independent computing node with its private data, one can define a local decision variable xi ∈ R^n for each node i ∈ N. Let the decision vector be x = [xi]_{i∈N} ∈ R^{n|N|}. Problem (3.3) can then be transformed into the following decentralized form:

F* := min_{xi∈R^n, i∈N}  { Σ_{i∈N} Fi(xi) : xi ∈ Xi,  xi = xj ∀(i, j) ∈ E }.    (3.4)

The consensus constraints, xi = xj for all (i, j) ∈ E, ensure that all local decisions are equal to each other. Given x = [xi]_{i∈N} ∈ X1 × . . . × XN, we call it ε-feasible if the consensus violation satisfies max_{(i,j)∈E} ‖xi − xj‖2 ≤ ε, and ε-optimal if |Σ_{i∈N} Fi(xi) − F*| ≤ ε.
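These two measures are easy to evaluate given candidate local decisions. A hedged MATLAB sketch on a toy 3-node path graph (the local objectives, decisions, and the value F* below are illustrative placeholders):

    % Sketch: measuring the consensus violation and suboptimality of x = [xi].
    edges = [1 2; 2 3];                       % path graph on 3 nodes
    X     = [0.98 1.01 1.00];                 % columns: local decisions xi (n = 1)
    Fi    = {@(x)(x-1)^2, @(x)2*(x-1)^2, @(x)(x-0.9)^2};  % private objectives
    Fstar = 0.005;                            % optimal value F* (placeholder)
    consViol = 0;
    for e = 1:size(edges,1)
        consViol = max(consViol, norm(X(:,edges(e,1)) - X(:,edges(e,2))));
    end
    subOpt = abs(sum(cellfun(@(f,x) f(x), Fi, num2cell(X))) - Fstar);
    fprintf('consensus violation %.3g, suboptimality %.3g\n', consViol, subOpt);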


3.3 Techniques to solve multi-agent consensus optimization problem

3.3.1 Related work

During the 1980s, Tsitsiklis et al. [65, 66] developed seminal work on algorithms for the distributed minimization of a smooth function. Following this line of work, numerous distributed algorithms have been designed for the distributed minimization of convex functions, i.e., F in problem (3.4) is convex but not necessarily smooth. Here, we review the following two types of algorithms: subgradient-based methods, and the alternating direction method of multipliers (ADMM).

(1) Subgradient Methods: Duchi et al. [67] proposed a distributed dual averaging subgradient algorithm for minimizing the sum of local convex functions Fi over a network G. This algorithm can compute an ε-optimal solution in O(1/ε²) iterations, where the constant also depends on the network size and topology; however, no guarantee on the consensus violation max_{(i,j)∈E} ‖xi − xj‖2 is provided in [67].

Nedic and Ozdaglar [64] proposed a distributed subgradient method for minimizing the sum of convex objective functions over a time-varying connectivity structure. The convergence results show that by setting the step size to a constant c = O(ε), a solution x can be computed within O(1) iterations whose consensus violation is bounded above by a constant proportional to c; moreover, the suboptimality error is bounded above as |Σ_{i∈N} Fi(xi) − F*| ≤ ε within O(1/ε²) iterations. However, due to the constant step-size rule used in the subgradient method, both the feasibility and suboptimality errors have a constant error term that is also proportional to the step size c. Therefore, the suboptimality and consensus errors will stall and will not decrease with further iterations.
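The flavor of such a scheme is captured by the following hedged MATLAB sketch of a constant-step consensus-subgradient iteration (mix local decisions with neighbors' decisions, then step along the local subgradient); the mixing weights, step size, and local objectives are made up and not taken from [64].

    % Sketch: constant-step distributed subgradient iteration on 3 nodes.
    W  = [1/2 1/2 0; 1/3 1/3 1/3; 0 1/2 1/2];   % row-stochastic mixing matrix
    sg = {@(x) 2*(x-1), @(x) 4*(x-1), @(x) 2*(x-0.9)};  % local (sub)gradients
    x  = [0; 2; -1];                            % initial local decisions
    c  = 0.05;                                  % constant step size, c = O(eps)
    for k = 1:200
        xmix = W * x;                           % averaging with neighbors
        for i = 1:3
            x(i) = xmix(i) - c * sg{i}(x(i));   % local subgradient step
        end
    end
    disp(x')   % near-consensus values; the O(c) error floor does not vanish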

It should be noted that subgradient-based algorithms are first-order algorithms with a slow convergence rate of O(1/√k), where k is the iteration counter, making them impractical for many large-scale applications. In addition, these algorithms are synchronous, meaning that the computations are performed simultaneously according to some global clock; this may go against the highly decentralized nature of the problem, which prevents such global information from being available to all nodes. These two disadvantages motivated the development of asynchronous decentralized algorithms based on ADMM [68].

(2) ADMM Methods: Traditional ADMM decomposes the original problem into two subproblems and, after solving them sequentially, updates the dual variables at each iteration. The drawback of the traditional ADMM method is that it partitions the problem into only two subproblems and thus cannot be implemented in a distributed way over a large network. Wei and Ozdaglar [68, 69] and Makhdoumi and Ozdaglar [70] proposed distributed ADMM algorithms that can compute an ε-optimal and ε-feasible solution in O(1/ε) proximal map evaluations for each Fi.

Although these algorithms have superior iteration complexity compared to subgradient methods, they need to evaluate the proximal map of Fi at each iteration. In many practical problems with composite convex local functions Fi = ξi + fi, one can compute the proximal map of ξi efficiently, while computing the proximal map of Fi = ξi + fi is not easy. One way to overcome this limitation is to locally split variables, i.e., to redefine Fi(xi, yi) = ξi(xi) + fi(yi) and add the constraint xi = yi to problem (3.4). However, this new formulation at least doubles the local memory requirement, and it still requires the proximal maps of both ξi and fi to be simple in order to keep ADMM efficient.
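The asymmetry between the two proximal maps is easy to see in the prototypical case ξi(x) = λ‖x‖1: prox_{tξi}(v) := argmin_x ξi(x) + (1/2t)‖x − v‖² has the closed-form soft-thresholding solution, while the composite ξi + fi generally has no closed-form prox. A minimal MATLAB sketch:

    % Sketch: proximal map of xi(x) = lambda*||x||_1 via soft-thresholding.
    lambda = 0.5; t = 1.0;
    soft = @(v) sign(v) .* max(abs(v) - t*lambda, 0);  % prox_{t*xi}(v)
    v = [1.2; -0.3; 0.8];
    disp(soft(v)')            % componentwise result: [0.7 0 0.3]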

When the local objective function Fi has the special composite convex structure Fi = ξ + fi, i.e., assuming that the non-smooth term ξ is the same at all nodes and that ∇fi is bounded for all i ∈ N, Chen and Ozdaglar [71] proposed an inexact proximal gradient method that can compute an ε-feasible and ε-optimal solution in O(1/√ε) iterations, requiring O(1/ε) communications per node in total over a connectivity network with dynamic topology. Considering that in many practical problems the nodes in the network have different non-smooth components, Shi et al. [72] and Aybat et al. [15] proposed proximal-gradient-based distributed algorithms that can solve (3.4) over a static connectivity network when Fi = ξi + fi. Both algorithms can handle node-specific non-smooth terms ξi, without assuming bounded ∇fi for any i ∈ N. The algorithm PG-EXTRA proposed in [72] is an extension of the algorithm EXTRA [73] that handles the non-smooth terms {ξi}i∈N. The distributed first-order augmented Lagrangian (DFAL) algorithm proposed in [15] will be discussed in Section 3.3.2. According to our numerical tests, the DFAL algorithm performs well in practice; however, its implementation on a network of distributed agents requires a complex network protocol. Specifically, DFAL is a double-loop algorithm, and checking the stopping criterion for the inner iterations requires evaluating a logical conjunction over G, which may not be easy for large-scale networks.

More recently, linearized ADMM algorithms have been developed to solve the multi-agent consensus problem. Suppose Fi(x) = ξi(x) + fi(Aix); rather than the smooth convexity assumption on {fi}i∈N, Chang et al. [16] showed the convergence of their distributed method under a far more stringent assumption: fi is strongly convex with a Lipschitz continuous gradient for all i ∈ N. Under this stronger assumption, they were able to show a linear convergence rate only when the non-smooth terms are absent and Ai has full column rank for all i ∈ N. The main idea of the algorithm is the adoption of an inexact step for each ADMM update, which enables the agents to perform cheap computations at each iteration and significantly reduces the computational burden. However, this distributed algorithm requires global knowledge of σmin(Ω + W) of the graph G, where Ω is the graph Laplacian and W is the adjacency matrix; this assumption is generally not attainable for very large scale, fully distributed networks. Bianchi et al. [74] proposed an asynchronous distributed algorithm based on linearized ADMM, and showed almost sure convergence without any rate result. In this method, a random set of agents become active, compute their proximal gradient steps, update their local variables, and exchange some data with their neighbors. This method runs on edge-based formulations of the decentralized problem; due to the nature of edge-based distributed algorithms, the proposed method requires each node to store a dual variable for each edge it is incident to, and to memorize all of its neighbors' local variables in addition to its own local variable.

Compared to the algorithms discussed above, the node-based algorithm PG-ADMM proposed by Aybat et al. [20] has certain advantages: it is far cheaper in information exchange, computational effort, and memory requirement, and it is fully distributed, i.e., the agents only need to know who their neighbors are, and they are not required to know any global parameters depending on the entire network topology. In this thesis, DPGA-II, a special implementation of PG-ADMM, will be used to solve the OPF problem; therefore, we will devote Section 3.3.3 to discussing PG-ADMM, more specifically DPGA-II.

In the rest of this chapter, we adopt the following notation. Let G = (N, E) be a connected graph, where N := {1, 2, ..., N} denotes the set of computing nodes, and E ⊂ N × N denotes the set of (undirected) edges such that (i, j) ∈ E implies i < j. Let N(i) denote the set of neighboring nodes of i ∈ N as in (1.1), and di := |N(i)| denote the degree of node i ∈ N. Let Ω ∈ R|N|×|N| denote the Laplacian of the graph G, and M ∈ R|E|×|N| denote the oriented edge-node incidence matrix, i.e., for e = (i, j) ∈ E and k ∈ N, Mek is equal to 1 if k = i, equal to −1 if k = j, and equal to 0 otherwise. Note that Ω = M⊤M. Let ψmax := ψ1 ≥ ψ2 ≥ ... ≥ ψN be the eigenvalues of Ω. Since G is connected, ψN−1 > ψN = 0, i.e., rank(M) = rank(Ω) = N − 1. Moreover, (M ⊗ In)⊤(M ⊗ In) = Ω ⊗ In. From the structure of Ω ⊗ In, it follows that {ψi}Ni=1 are also the eigenvalues of Ω ⊗ In, each with algebraic multiplicity n.
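As a small sanity check of this notation, the following MATLAB snippet builds M and Ω for a toy 4-node path graph (an example of ours, not one of the IEEE cases) and verifies that Ω = M⊤M and that the eigenvalues of Ω ⊗ In are those of Ω, each with multiplicity n.

```matlab
% Toy verification of the notation above: 4-node path graph, n = 2.
Edges = [1 2; 2 3; 3 4];              % edges (i,j) with i < j
N = 4; n = 2;
M = zeros(size(Edges,1), N);          % oriented edge-node incidence matrix
for e = 1:size(Edges,1)
    M(e, Edges(e,1)) =  1;            % +1 at node i
    M(e, Edges(e,2)) = -1;            % -1 at node j
end
Omega = M' * M;                       % graph Laplacian, Omega = M'M
A = kron(M, eye(n));
disp(norm(A'*A - kron(Omega, eye(n))))  % 0: (M ox In)'(M ox In) = Omega ox In
disp(sort(eig(kron(Omega, eye(n))))')   % eigenvalues of Omega, each repeated n times
```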

3.3.2 DFAL algorithm for distributed optimization

Suppose for each i ∈ N, Fi = ξi + fi is a composite convex function such that fi is convex, differentiable, and has a Lipschitz continuous gradient ∇fi with constant Lfi; ξi is convex, bounded below by some norm, i.e., ξi(·) ≥ ‖·‖, and has a uniformly bounded subdifferential. Let x = (x1⊤, ..., xN⊤)⊤ ∈ RnN denote the vector formed by concatenating {xi}i∈N ⊂ Rn as a long column vector. Let M ∈ R|E|×|N| be the oriented edge-node incidence matrix of G. The distributed optimization problem (3.4) with Xi = Rn for i ∈ N can be written as a special case of the following problem in (3.5) by setting A = M ⊗ In, where ⊗ denotes the Kronecker product:

F∗ := min x∈RnN { F(x) := f(x) + ξ(x) : Ax = b },   (3.5)

where f(x) := ∑i∈N fi(xi), ξ(x) := ∑i∈N ξi(xi), and A ∈ Rm×n|N|. Let {Ai}i∈N ⊂ Rm×n be such that A = [A1, A2, ..., AN]. Problem (3.5) is solved by inexactly solving a sequence of


subproblems:

x(k)∗ ∈ argmin x∈RnN P(k)(x) := λ(k)ξ(x) + h(k)(x),   (3.6)

h(k)(x) := λ(k)f(x) + (1/2)‖Ax − b − λ(k)θ(k)‖₂²,   (3.7)

for appropriately chosen sequences of penalty parameters {λ(k)} and dual variables {θ(k)} such that λ(k) ↘ 0. Given positive real sequences {α(k)}, {β(k)} satisfying 0 ≤ max{α(k), β(k)}/λ(k) < C for all k ≥ 1 for some C, the iterate sequence {x(k)} is constructed such that every x(k) satisfies one of the following conditions:

P(k)(x(k)) − P(k)(x(k)∗) ≤ α(k),   (3.8a)

∃ gi(k) ∈ ∂xi P(k)(x)|x=x(k)  s.t.  maxi∈N ‖gi(k)‖₂ ≤ β(k)/√N,   (3.8b)

where ∂xi P(k)(x)|x=x̄ := λ(k)∂ξi(xi)|xi=x̄i + ∇xi h(k)(x̄). Clearly, ∇h(k)(x) is Lipschitz continuous in x ∈ RnN with constant λ(k)L + σ²max(A), where L := maxi∈N Lfi. Given x(0), λ(0), α(0), β(0) and c ∈ (0, 1), the sequence {x(k), λ(k), α(k), β(k)} can be computed as shown in Fig. 3.1.

Fig. 3.1: First-order Augmented Lagrangian (DFAL) algorithm

Aybat et al. [15] showed that any limit point of the DFAL iterates is optimal, and that for any ε > 0, an ε-optimal (|F(xε) − F∗| ≤ ε) and ε-feasible (‖Axε − b‖₂ ≤ ε) solution can be computed within O(log(ε−1)) DFAL iterations. The overall computational complexity of the DFAL algorithm depends on the complexity of the oracle called within Step 1 of Fig. 3.1 to compute x(k) at each iteration k ≥ 1. MS-APG, displayed in Fig. 3.2, can compute an x(k) satisfying (3.8a) within O(1/λ(k)) gradient and proximal computations.

Fig. 3.2: Multi Step-Accelerated Prox. Gradient (MS-APG) algorithm

In particular, when DFAL is implemented to solve minx ∑i∈N Fi(x) := ξi(x) + fi(x) over the connectivity network G, the dual iterates θ(k) need not be explicitly computed, i.e., the computation in Step 2 of Fig. 3.1 can be carried out implicitly using only the primal iterate sequence. Therefore, the overall algorithm can be implemented in a distributed manner using only communications with neighboring nodes, since MS-APG, displayed in Fig. 3.2, can be computed distributedly. Moreover, it is shown that within O(σmax(Ω)^1.5 dmin^−1 ε^−1) proximal-gradient computations and communications per node in total, DFAL can compute an ε-optimal and ε-feasible solution xε, where Ω denotes the Laplacian of G, and dmin is the minimum node degree.
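Since the precise parameter rules of MS-APG are given in Fig. 3.2, we record here only a generic accelerated proximal gradient template of the same family (FISTA-type momentum), as a sketch of the inner oracle called in Step 1 of Fig. 3.1; the handles grad_h and prox_xi, and the exact step and momentum choices, are illustrative assumptions and may differ from those in Fig. 3.2.

```matlab
% A FISTA-type accelerated proximal gradient sketch of the inner oracle:
% minimizes h(x) + psi(x), where grad_h(x) returns the gradient of the
% smooth part (here h^(k)), Lh is its Lipschitz constant, and
% prox_xi(v, c) evaluates prox_{c*psi}(v) (here psi = lambda^(k)*xi).
function x = apg_sketch(grad_h, prox_xi, Lh, x0, maxit)
    x = x0; y = x0; t = 1;
    for k = 1:maxit
        xnew = prox_xi(y - grad_h(y)/Lh, 1/Lh);  % proximal gradient step
        tnew = (1 + sqrt(1 + 4*t^2))/2;          % momentum parameter update
        y = xnew + ((t - 1)/tnew)*(xnew - x);    % extrapolation
        x = xnew; t = tnew;
    end
end
```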

3.3.3 Proximal gradient ADMM for distributed optimization

Suppose for each i ∈ N, Fi = ξi + fi is a composite convex function such that fi is convex, differentiable, and has a Lipschitz continuous gradient ∇fi with constant Lfi; ξi is convex (possibly non-smooth). In this section, we present two different consensus formulations for minx ∑i∈N Fi(x) := ξi(x) + fi(x), i.e., for the problem in (3.3) with Xi = Rn for i ∈ N, and briefly discuss two different distributed consensus optimization algorithms, DPGA-I and DPGA-II. These two algorithms are nothing but customized distributed implementations of PG-ADMM, the linearized ADMM algorithm proposed in [20], on these two different but equivalent formulations. Note that since we allow non-smooth convex functions ξi in the formulation, the case Xi ≠ Rn can be handled using indicator functions.

1) DPGA-I algorithm: C ∈ R|N|×|N| is called a communication matrix if for all i ∈ N, Cij = 0 for all j ∉ N(i) ∪ {i}, Cij < 0 for all j ∈ N(i), and Cii = −∑j∈N(i) Cij. Note that the Laplacian Ω of the graph G is a communication matrix. It can be shown that for x = [xi]i∈N satisfying (C ⊗ In)x = 0, there exists x̄ ∈ Rn such that xi = x̄ for all i ∈ N. Given C with the properties above, the feasible set of (3.4) can be equivalently represented as {x = [xi]i∈N ∈ X1 × ... × XN : (C ⊗ In)x = 0}. Hence, problem (3.4) with Xi = Rn for i ∈ N can be equivalently written as

min x∈Rn|N| { F(x) := ∑i∈N Fi(xi) : (Ω ⊗ In)x = 0 },   (3.9)

where x = [x1⊤, ..., xN⊤]⊤ and Fi is defined as Fi = ξi + fi.

For each i ∈ N, define a new set of primal variables yij ∈ Rn for j ∈ N(i) ∪ {i}, and form yi = [yij]j∈N(i)∪{i}. Let Yi := {yi : ∑j∈N(i)∪{i} yij = 0} for i ∈ N, and define g(y) := ∑i∈N 1Yi(yi), where 1Yi denotes the indicator function of the set Yi for i ∈ N, i.e., 1Yi(yi) is equal to 0 if yi ∈ Yi, and to +∞ otherwise, where y = [y1⊤, ..., yN⊤]⊤. Consider the following formulation, which is equivalent to (3.9):

min y,x g(y) + ∑i∈N Fi(xi)  s.t.  Ωij xj − yij = 0 : λij,  ∀j ∈ N(i) ∪ {i}, ∀i ∈ N,   (3.10)

where λij ∈ Rn denotes the Lagrange multiplier corresponding to the primal constraint Ωij xj − yij = 0. Let λi = [λij]j∈N(i)∪{i} ∈ R(di+1)n, and λ = [λi]i∈N ∈ Rn(2|E|+|N|). Setting the step size ci := 1/(Li + γdi(di + 1)) for i ∈ N, the smooth part of the augmented Lagrangian φγ for the problem in (3.10) can be written as

φγ(x, y, λ) = ∑i∈N [ fi(xi) + ∑j∈N(i)∪{i} ( λij⊤(Ωij xj − yij) + (γ/2)‖Ωij xj − yij‖² ) ],   (3.11)


for a fixed penalty parameter γ > 0. Hence, ∇xjφγ can be computed as

∇xj φγ(x^k, y^k, λ^k) = ∇fj(xj^k) + ∑i∈N(j)∪{j} [ Ωij λij^k + γ Ωij (Ωij xj^k − yij^k) ],   (3.12)

and the steps of PG-ADMM [20], when implemented on (3.10), can be written in the following form:

xj^{k+1} = prox_{cj ξj}( xj^k − cj ∇xj φγ(x^k, y^k, λ^k) ),   j ∈ N,

yi^{k+1} = argmin yi { (γ/2) ∑j∈N(i)∪{i} ‖Ωij xj^{k+1} − yij + (1/γ)λij^k‖² : ∑j∈N(i)∪{i} yij = 0 },   i ∈ N,

λij^{k+1} = λij^k + γ(Ωij xj^{k+1} − yij^{k+1}),   j ∈ N(i) ∪ {i}, i ∈ N.

Compared to the algorithm in [70], the y-step and λ-step are exactly the same; however, the x-step is much simpler than the corresponding one in [70], where the x-iterates are computed by solving min xj∈Rn Fj(xj) + ∑i∈N(j)∪{j} (λij^k)⊤(Ωij xj − yij^k) + (γ/2) ∑i∈N(j)∪{j} ‖Ωij xj − yij^k‖², which is equivalent to computing prox_{ξj+fj}. Note that even if both ξj and fj have simple proximal maps, the proximal map of their sum is not necessarily simple, and it can be impractical to compute.
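To illustrate why having a simple proximal map for ξj alone is valuable, consider ξj = ‖·‖₁ (an illustrative choice of ours, not the indicator function used later for the OPF): its proximal map is a componentwise soft-threshold, whereas prox_{ξj+fj} generally has no closed form even when fj is a simple smooth function.

```matlab
% prox_{c*||.||_1}(v) = sign(v).*max(|v|-c, 0): cheap and componentwise,
% in contrast with prox_{xi_j + f_j}, which is generally not closed form.
prox_l1 = @(v, c) sign(v) .* max(abs(v) - c, 0);
disp(prox_l1([-2; 0.3; 1.5], 0.5))   % returns [-1.5; 0; 1.0]
```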

Let {xi^0}i∈N denote the set of initial primal iterates. For k ≥ 0, let pi^{k+1} be the optimal Lagrange multiplier corresponding to the constraint yi ∈ Yi. Therefore, yij^{k+1} = Ωij xj^{k+1} + (1/γ)(λij^k − pi^{k+1}). Combining this equality with the λ-step above, it can be concluded that λij^{k+1} = pi^{k+1} for all k ≥ 0. Since yi^{k+1} ∈ Yi, the optimal dual can be computed in closed form: pi^{k+1} = (1/(di+1)) ∑j∈N(i)∪{i} λij^k + (γ/(di+1)) ∑j∈N(i)∪{i} Ωij xj^{k+1} for k ≥ 0. Suppose we initialize λij^0 = pi^0 for all j ∈ N(i) ∪ {i} for some given pi^0 for all i ∈ N. Thus, pi^{k+1} = pi^k + (γ/(di+1)) ∑j∈N(i)∪{i} Ωij xj^{k+1} for all k ≥ 0. Finally, by defining si^0 = 0 and si^k := (1/(di+1)) ∑j∈N(i)∪{i} Ωij xj^k for all k ≥ 1, and initializing yij^0 := Ωij xj^0 for all j ∈ N(i) ∪ {i} and i ∈ N, the computation of ∇xj φγ in the x-step can be simplified for k ≥ 0 as follows:

∇xj φγ(x^k, y^k, λ^k) − ∇fj(xj^k) = ∑i∈N(j)∪{j} Ωij ( λij^k + γ(Ωij xj^k − yij^k) )
= ∑i∈N(j)∪{j} Ωij ( 2pi^k − pi^{k−1} )
= ∑i∈N(j)∪{j} Ωij ( pi^k + γ si^k ).   (3.13)

From the above derivation, the steps can be simplified as shown in Fig. 3.3.

Fig. 3.3: Distributed Proximal Gradient Algorithm I (DPGA-I)

When the nodes implement the DPGA-I algorithm, they perform the following steps in a distributed way: i) each node stores three variables in Rn: xi^k, si^k, pi^k; ii) each node sends pi^k + γ si^k to all its neighbors, and computes the proximal step; iii) after the computation of the proximal steps, each node broadcasts the updated variable xi^{k+1}, and then updates si^{k+1}; iv) each node updates pi^{k+1}, and then repeats.

2) DPGA-II algorithm: Let M ∈ R|E|×|N| be the oriented edge-node incidence matrix of G; then problem (3.4) with Xi = Rn for i ∈ N can be equivalently written as

min x∈Rn|N| { F(x) := ∑i∈N Fi(xi) : (M ⊗ In)x = 0 },   (3.14)

where x and Fi are defined as before at the beginning of Section 3.3.3. For each (i, j) ∈ E, define a new set of primal variables yij ∈ Rn, and let y = [yij](i,j)∈E ∈ Rn|E|. Consider the following formulation, equivalent to (3.14):

min y,{xi}i∈N { ∑i∈N Fi(xi) : xi − yij = 0 : αij,  xj − yij = 0 : βij,  ∀(i, j) ∈ E },   (3.15)

where αij ∈ Rn and βij ∈ Rn denote the Lagrange multiplier vectors corresponding to the primal constraints xi − yij = 0 and xj − yij = 0, respectively. Define α = [αij](i,j)∈E ∈ Rn|E| and β = [βij](i,j)∈E ∈ Rn|E|. For all i ∈ N, set the step size ci := 1/(Li + γdi). The smooth part of the augmented Lagrangian φγ corresponding to the formulation (3.15) can be written as

φγ(x, y, α, β) = ∑i∈N fi(xi) + ∑(i,j)∈E ( αij⊤(xi − yij) + βij⊤(xj − yij) ) + (γ/2) ∑(i,j)∈E ( ‖xi − yij‖² + ‖xj − yij‖² )   (3.16)

for a fixed parameter γ > 0. Therefore, ∇xiφγ can be computed as

∇xi φγ(x^k, y^k, α^k, β^k) = ∇fi(xi^k) + ∑j:(i,j)∈E ( αij^k + γ(xi^k − yij^k) ) + ∑j:(j,i)∈E ( βji^k + γ(xi^k − yji^k) ),

and the steps of PG-ADMM [20], when implemented on (3.15), can be written in the following form:

xj^{k+1} = prox_{cj ξj}( xj^k − cj ∇xj φγ(x^k, y^k, α^k, β^k) ),   j ∈ N,

yij^{k+1} = argmin yij { −(αij^k + βij^k)⊤ yij + (γ/2)( ‖xi^{k+1} − yij‖² + ‖xj^{k+1} − yij‖² ) },   (i, j) ∈ E,

αij^{k+1} = αij^k + γ(xi^{k+1} − yij^{k+1}),   (i, j) ∈ E,

βij^{k+1} = βij^k + γ(xj^{k+1} − yij^{k+1}),   (i, j) ∈ E.


Let {xi^0}i∈N denote the set of initial primal iterates. For k ≥ 0, the y-step can be solved in closed form:

yij^{k+1} = (αij^k + βij^k)/(2γ) + (xi^{k+1} + xj^{k+1})/2.   (3.17)

From the α-step and β-step, it follows that for k ≥ 0,

(αij^{k+1} + βij^{k+1})/(2γ) = (αij^k + βij^k)/(2γ) + (xi^{k+1} + xj^{k+1})/2 − yij^{k+1} = 0.   (3.18)

Hence, for each (i, j) ∈ E, it follows that αij^k + βij^k = 0 for k ≥ 1. Suppose we initialize α0 = β0 = 0, i.e., αij^0 = βij^0 = 0 for all (i, j) ∈ E. Then, using the closed form in (3.17), it can be concluded that yij^k = (xi^k + xj^k)/2 for all k ≥ 1. As a result, αij^k = (γ/2) ∑_{ℓ=1}^{k} (xi^ℓ − xj^ℓ) and βij^k = (γ/2) ∑_{ℓ=1}^{k} (xj^ℓ − xi^ℓ) for all k ≥ 1. Combining these results implies that ∇xi φγ can be computed as follows:

∇xi φγ(x^k, y^k, α^k, β^k) = ∇fi(xi^k) + (γ/2) ∑j∈N(i) (xi^k − xj^k) + ∑j:(i,j)∈E αij^k + ∑j:(j,i)∈E βji^k
= ∇fi(xi^k) + (γ/2) [ ∑j∈N(i) (xi^k − xj^k) + ∑_{ℓ=1}^{k} ∑j∈N(i) (xi^ℓ − xj^ℓ) ]
= ∇fi(xi^k) + (γ/2) [ ∑j∈N(i)∪{i} Ωij xj^k + ∑_{ℓ=1}^{k} ∑j∈N(i)∪{i} Ωij xj^ℓ ].   (3.19)

Finally, by initializing yij^0 = (xi^0 + xj^0)/2 for all (i, j) ∈ E, the computation of ∇xi φγ in the x-step can be simplified. Define si^k := ∑j∈N(i)∪{i} Ωij xj^k for k ≥ 0; and let pi^k := γ ∑_{ℓ=1}^{k} si^ℓ for k ≥ 1 and pi^0 = 0. Hence, for k ≥ 0 we have

∇xi φγ(x^k, y^k, α^k, β^k) = ∇fi(xi^k) + (1/2)(pi^k + γ si^k).   (3.20)

Therefore, the DPGA-II steps can be simplified as shown in Fig. 3.4.

Fig. 3.4: Distributed Proximal Gradient Algorithm II (DPGA-II)

When the nodes implement the DPGA-II algorithm, they perform the following steps in a distributed way: i) each node stores three variables in Rn: xi^k, si^k, pi^k; ii) each node computes the proximal step; iii) after the computation of the proximal steps for all nodes, each node broadcasts the updated variable xi^{k+1}, and then updates si^{k+1}; iv) each node updates pi^{k+1}, and then repeats. The major difference between the DPGA-I and DPGA-II algorithms is the number of communication steps required in each iteration. In particular, while DPGA-I requires communication with neighboring nodes for both the primal and dual variables, i.e., in both the x- and s-steps, DPGA-II requires communication only for the dual variables, i.e., in the s-step. In order to solve the OPF problem in a distributed way, we implemented only DPGA-II, as it requires only one round of communication per iteration.
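For concreteness, the following MATLAB sketch simulates the simplified DPGA-II iteration of Fig. 3.4 at a central node; the handles gradf and prox, applied row-wise/per node, are our illustrative interface, and this is a sketch of our understanding of the scheme, not the exact code used in the experiments of Chapter 4 (which use the OPF-specific fi and ξi = 1Xi).

```matlab
% Centrally-simulated sketch of the simplified DPGA-II iteration (Fig. 3.4)
% for min sum_i xi_i(x_i) + f_i(x_i). Row i of X stores x_i in R^n;
% gradf(X) returns the rows grad f_i(x_i); prox(v, c, i) evaluates
% prox_{c*xi_i}(v); Omega is the graph Laplacian; Lf holds L_i.
function X = dpga2(gradf, prox, Omega, Lf, gamma, X, maxit)
    N = size(X, 1);
    d = diag(Omega);                % node degrees d_i
    c = 1 ./ (Lf + gamma * d);      % step sizes c_i = 1/(L_i + gamma*d_i)
    S = Omega * X;                  % s_i^0 = sum_j Omega_ij x_j^0
    P = zeros(size(X));             % p_i^0 = 0
    for k = 1:maxit
        G = gradf(X) + 0.5 * (P + gamma * S);   % eq. (3.20)
        for i = 1:N
            X(i,:) = prox(X(i,:) - c(i) * G(i,:), c(i), i);  % x-step
        end
        S = Omega * X;              % s-step: the single communication round
        P = P + gamma * S;          % p-step: dual accumulation
    end
end
```

The line S = Omega * X is the only place where neighbor information is used, matching the single communication round per iteration noted above.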


Chapter 4

Implementation and Numerical Results

In the previous two chapters, we introduced the OPF problem and briefly discussed dis-

tributed consensus optimization for convex problems. It is important to note that the dis-

tributed optimization techniques we discussed in Chapter 3 cannot be immediately applied

to the OPF problem due to nonconvexity caused by power flow conservation constraints.

However, using the previously discussed SOCP relaxation, the OPF can be converted into

a relaxed convex problem which can be attacked using the distributed methods discussed

earlier. In this chapter, we focus on formulating the relaxation as a consensus optimization

problem, and then customize the distributed consensus optimization algorithms, DFAL and

DPGA-II, to solve the resulting problem. Finally, we numerically investigate the perfor-

mance of the proposed methods on IEEE test networks.

4.1 Distributed OPF problem

4.1.1 Data format

We used the AC power flow data for the IEEE 3-, 9-, 14-, and 30-bus systems provided in MATPOWER. The cases are all in the standard steady-state model typically used for power flow analysis. The magnitudes of all values are expressed in per unit, and angles of complex quantities are expressed in radians. Buses are numbered consecutively, beginning at 1, and generators

are reordered by bus number. The data files used are MATLAB M-files which define and

return a single MATLAB struct. The fields of the struct are baseMVA, bus, branch,

gen, and gencost, where baseMVA is a scalar and the rest are matrices. In the matrices,

each row corresponds to a single bus, branch, or generator. The columns are similar to

the columns in the standard IEEE CDF and PTI formats. The number of rows in bus,

branch, and gen are nb, nl and ng, respectively, while gencost has either ng or 2ng rows


depending on whether it includes costs for reactive power or just for real power. Full details of the MATPOWER case format are documented in the Appendix.

Instead of using the supplied data directly, we made some simplifications to the IEEE

test networks in our tests as follows:

(a) Branches: In MATPOWER¹, all transmission lines, transformers, and phase shifters are modeled with a common standard π transmission line model, with series impedance zij = rij + i·xij and total charging susceptance bc, in series with an ideal phase-shifting transformer. The transformer, with tap ratio magnitude τ and phase shift angle θshift, is located at the from end of the branch. The parameters rij, xij, bc, τ, θshift are specified in columns BR_R (3), BR_X (4), BR_B (5), TAP (9), and SHIFT (10), respectively, of the corresponding row of the branch matrix. Let yij := 1/zij denote the series admittance of branch (i, j) ∈ E in the π model. The complex current injections if and it at the from and to ends of each branch can be expressed in terms of the 2 × 2 branch admittance matrix Ybr and the respective terminal voltages vf and vt:

[ if ; it ] = Ybr [ vf ; vt ].   (4.1)

Given an arbitrary branch (i, j) ∈ E, let ys, bc, τ and θshift denote the admittance, charging susceptance, tap ratio, and phase shift angle of branch (i, j), respectively. Then the branch admittance matrix for (i, j) ∈ E can be written as

Ybr = [ (ys + i·bc/2)/τ²   −ys/(τ e^{−iθshift}) ; −ys/(τ e^{iθshift})   ys + i·bc/2 ].   (4.2)

However, for the sake of simplicity, we assume that each branch is purely resistive; under this assumption, the energy transfer is associated only with the electric field, and electromagnetic induction is not considered. By setting all the data in columns BR_B, TAP, and SHIFT to 0, i.e., setting bc = τ = θshift = 0 for every branch, the admittance matrix simplifies to

Ybr = [ ys   −ys ; −ys   ys ].   (4.3)

Thus the original 2 × 2 matrix of each line can be replaced by the single series admittance parameter ys of the line.

¹Source: MATPOWER 6.0b1 User's Manual
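This simplification is easy to apply to a MATPOWER case in MATLAB; a minimal sketch, assuming MATPOWER (e.g., the function case9) is on the path and using the branch columns listed in Table A.3:

```matlab
% Build the simplified per-branch admittance of (4.3) from a MATPOWER case.
mpc = case9;                                % any MATPOWER case on the path
mpc.branch(:, [5 9 10]) = 0;                % BR_B = TAP = SHIFT = 0
z  = mpc.branch(:,3) + 1i*mpc.branch(:,4);  % series impedance z = r + i*x
ys = 1 ./ z;                                % series admittance y_s = 1/z
Ybr = @(e) [ys(e) -ys(e); -ys(e) ys(e)];    % 2x2 matrix of branch e, as in (4.3)
```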

(b) Shunt Elements: In MATPOWER, a shunt element connected to bus i ∈ N, such as a capacitor or inductor, is modeled with a fixed impedance to ground at the bus, and the admittance of the shunt element at bus i is given as yi^sh = gi^sh + i·bi^sh. The parameters gi^sh and bi^sh are specified in columns GS (5) and BS (6), respectively, of row i of the bus matrix. In our work, we neglect the shunt elements at all buses by setting GS and BS to 0.

(c) In/Out Service: In our work, we assume that all generators and branches are in service. Therefore, GEN_STATUS (8) of the gen matrix and BR_STATUS (11) of the branch matrix are all set to 1.

4.1.2 Consensus Constraints

In Section 2.3.2, we briefly discussed the convex SDP and SOCP relaxations of the OPF problem. In this section, we adopt the SOCP formulation to develop distributed optimization algorithms for the relaxed OPF problem. First, we discuss how to formulate the OPF relaxation as a special case of the distributed consensus optimization problem. Clearly, buses i ∈ N and j ∈ N connected by branch (i, j) ∈ E have to agree on the potential difference across the branch; hence, we formulate the OPF relaxation by imposing consistency on the voltage values on either side of each branch.

Consider the change of variables defined in (2.7); instead of directly working with the complex voltage variables Vi, we adopt new variables Wij for (i, j) ∈ E and Wii for i ∈ N, and work with local copies of these variables. Recall that, in order to simplify the notation, we fixed an orientation for representing branches in the network G = (N, E), i.e., (i, j) ∈ E implies that i < j. Therefore, for each bus i ∈ N, we partition the set of neighboring buses, N(i), into two subsets: the set of forward buses, F(i), and the set of backward buses, B(i). More specifically, F(i) denotes the set of buses connected to bus i by a branch for which i is the from end, i.e., F(i) := {j ∈ N(i) : (i, j) ∈ E}; and B(i) is defined similarly, but this time i is the to end of the branch connecting i and j, i.e., B(i) := {j ∈ N(i) : (j, i) ∈ E}. Therefore, F(i) ∩ B(i) = ∅ and F(i) ∪ B(i) = N(i). Define

Wi = ( Wii^i, {Wij^i, Wjj^i}j∈F(i), {Wji^i, Wjj^i}j∈B(i) ),   i ∈ N,

where Wii^i, Wij^i, Wji^i and Wjj^i are the local copies of Wii, Wij, Wji and Wjj at bus i, representing the decision variables of bus i. Hence, each bus i ∈ N stores Wi, which consists of 1 + 2di complex variables related to the bus and its neighboring buses, where di is the degree of bus i, i.e., the number of branches connected to it. Similar to (2.8), we also define

W^i_{i,j} := [ Wii^i   Wij^i ; Wji^i   Wjj^i ] ∈ H2×2,   j ∈ F(i),   (4.4a)

W^i_{j,i} := [ Wii^i   Wji^i ; Wij^i   Wjj^i ] ∈ H2×2,   j ∈ B(i);   (4.4b)

hence, for each i ∈ N, Wij^i = (Wji^i)∗ for all j ∈ N(i). Using this notation, the OPF problem can be written as follows:

F∗ := min { ∑i∈N Fi(PGi, QGi, Wi) : (PGi, QGi, Wi) ∈ Xi ∀i ∈ N,  Wi ≡ Wj ∀(i, j) ∈ E },   (4.5)

where Wi ≡ Wj means that the related components in both vectors are equal to each

other, Xi denotes the set characterizing the AC power flow around bus i ∈ N ; and it can

be defined using similar constraints as in (2.9a)-(2.9e), and (2.10). Indeed,

Xi := { (PGi, QGi, Wi) :
(PGi − PDi) + i(QGi − QDi) = ∑j∈N(i) (Wii^i − Wij^i) yij∗,
(Vi^min)² ≤ Wii^i ≤ (Vi^max)²,
W^i_{i,j} ⪰ 0 ∀j ∈ F(i),  W^i_{j,i} ⪰ 0 ∀j ∈ B(i),
Pi^min ≤ PGi ≤ Pi^max,  Qi^min ≤ QGi ≤ Qi^max }.   (4.6)

Since Wi ∈ C1+2di, Xi is typically a small-dimensional set compared to the original set X.


Consider a branch (i, j) ∈ E; since Wi and Wj have different dimensions and have components that are not directly related, we cannot impose Wi = Wj. In fact, the consensus constraint Wi ≡ Wj in the relaxed formulation (4.5) means

Wii^i = Wii^j,  Wij^i = Wij^j,  Wji^i = Wji^j,  Wjj^i = Wjj^j.   (4.7)

4.1.3 Line Loss and Generation Cost

To present the SOCP relaxation of the OPF problem (4.5) in the distributed form considered in Sections 3.3.2 and 3.3.3, i.e., minx ∑i∈N Fi(x), we model the node-specific objective functions as composite convex functions. In particular, for each i ∈ N, Fi is set as Fi(PGi, QGi, Wi) = fi(PGi, Wi) + ξi(PGi, QGi, Wi), where fi(PGi, Wi) is a convex function representing the cost associated with generator i ∈ I, and ξi(PGi, QGi, Wi) = 1Xi(PGi, QGi, Wi) is the indicator function of Xi. Note that fi is the function representing the actual objective to be optimized for the power system network. In this thesis, we applied two different objective functions fi: Line Loss and Generation Cost.

(1) Line Loss: The power loss of the network is due to the resistance of the branches, and it can be written as

PowerLoss: ∑(i,j)∈E gij |Vi − Vj|²,   (4.8)

where gij = Re(yij). The objective function is a quadratic function of the bus voltages. We can represent this objective using only node variables as follows: ∑i∈I fi(PGi). As discussed in Chapter 2, the power flow equations can be stated as in (2.2a), i.e., for each i ∈ N, Si = Vi ∑j∈N(i) yij∗ (Vi∗ − Vj∗), where Si = (PGi − PDi) + i(QGi − QDi) for all i ∈ N as in (2.4). Recall that PGi = QGi = 0 for all i ∈ N \ I, and the demand amounts PDi, QDi are fixed for all i ∈ N. Next, notice that

∑i∈N Si = ∑(i,j)∈E yij∗ (ViVi∗ − ViVj∗ − VjVi∗ + VjVj∗) = ∑(i,j)∈E yij∗ |Vi − Vj|².   (4.9)

Therefore,

Re( ∑i∈N Si ) = ∑i∈N (PGi − PDi) = ∑(i,j)∈E gij |Vi − Vj|².   (4.10)


Since we assume that the power demand PDi is constant at each bus and PGi = QGi = 0 for all i ∈ N \ I, minimizing the line loss given in (4.8) can be achieved by minimizing the sum of real power generated by all generators, ∑i∈N PGi; therefore, we set fi(PGi) = PGi for i ∈ I and fi(PGi) = 0 for i ∈ N \ I.

On the other hand, in our numerical tests we adopted an alternative formulation of the objective. Note that we can represent the PowerLoss in (4.8) directly as a function of Wi. Indeed, recalling the variable transformation given in (2.7), the nonlinear objective function (4.8) can be equivalently reformulated as a linear function of Wi as follows:

PowerLoss = ∑i∈N fi(Wi), where

fi(Wi) := (1/2) ∑j∈F(i) gij ( Wii^i − Wij^i − (Wij^i)∗ + Wjj^i ) + (1/2) ∑j∈B(i) gji ( Wii^i − Wji^i − (Wji^i)∗ + Wjj^i ).

Therefore, for all i ∈ N,

∂fi/∂Wii^i = (1/2) ∑j∈N(i) gij,   ∂fi/∂Wjj^i = (1/2) gij,
∂fi/∂Wij^i = −gij ∀j ∈ F(i),   ∂fi/∂Wji^i = −gij ∀j ∈ B(i).   (4.11)
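The equivalence between (4.8) and the linear-in-Wi form above is easy to check numerically; the following sketch uses hypothetical voltages, edges, and conductances of our own choosing.

```matlab
% Line loss evaluated two equivalent ways on consensus-feasible W variables.
V = [1.02; 0.98 + 0.01i; 1.00 - 0.02i];    % hypothetical bus voltages
Edg = [1 2; 2 3; 1 3];                     % edge list, i < j
g = [5; 4; 6];                             % g_ij = Re(y_ij), hypothetical
loss_V = sum(g .* abs(V(Edg(:,1)) - V(Edg(:,2))).^2);  % eq. (4.8)
W = V * V';                                % W_ij = V_i * conj(V_j)
loss_W = 0;
for e = 1:size(Edg,1)
    i = Edg(e,1); j = Edg(e,2);            % linear form used in f_i(W^i)
    loss_W = loss_W + g(e) * real(W(i,i) - W(i,j) - W(j,i) + W(j,j));
end
disp([loss_V, loss_W])                     % the two values agree
```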

(2) Generation Cost: For this case, all the generation costs are given using a polynomial model, specifically a quadratic model. The gencost matrices present the cost model and the related coefficients. More specifically, the three coefficients of the second-order cost polynomial, (c2^i, c1^i, c0^i), are given for each bus i ∈ I. The objective function with quadratic cost is written as

GenerationCost: ∑i∈I c2^i (PGi)² + c1^i PGi + c0^i.   (4.12)

As in the line-loss case, we would like to define node-specific objectives directly as a function of Wi, instead of PGi. Therefore, using (2.2a), we can write PGi as a function of Wi for all i ∈ I:

PGi(Wi) = Re[ ∑j∈F(i) yij∗ (Wii^i − Wij^i) + ∑j∈B(i) yji∗ (Wii^i − (Wji^i)∗) ] + PDi.   (4.13)

Therefore, using (4.13), we can write the total generation cost in (4.12) as ∑i∈I fi(Wi), where fi is defined as follows:

fi(Wi) = c2^i (PGi(Wi))² + c1^i PGi(Wi) + c0^i,   ∀i ∈ I.   (4.14)

Therefore, by the chain rule, the partial gradients of fi can be computed as

∂fi/∂Wi = (∂fi/∂PGi)(∂PGi/∂Wi),   (4.15)

where ∂fi/∂PGi = 2c2^i PGi + c1^i from (4.14) is easy to compute, and the partial gradients ∂PGi/∂Wi can be computed from (4.13) as follows:

∂PGi/∂Wii^i = ∑j∈N(i) gij,   ∂PGi/∂Wjj^i = 0,
∂PGi/∂Wij^i = −yij ∀j ∈ F(i),   ∂PGi/∂Wji^i = −yij∗ ∀j ∈ B(i).   (4.16)
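A toy numerical instance of the outer factor in the chain rule (4.15), with hypothetical cost coefficients and generation level (the inner factors ∂PGi/∂Wi follow (4.16)):

```matlab
% Outer factor of the chain rule (4.15) on hypothetical data.
c2 = 0.11; c1 = 5.0;           % quadratic cost coefficients (from gencost)
PG = 85.3;                     % P_Gi(W^i) evaluated via (4.13)
dfi_dPG = 2*c2*PG + c1;        % = 23.766; multiplies the partials in (4.16)
```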

4.2 Implementation Details

In Chapter 3, we discussed the DFAL and DPGA-II algorithms. In this section, we customize these two algorithms to solve the OPF problem with the consensus constraints in (4.5). The IEEE benchmark system cases with 3, 9, 14, and 30 buses shown in Table 4.1 are used as test cases. Both algorithms are terminated when the relative suboptimality, |∑i∈N Fi(xi^(k)) − F∗|/|F∗|, is less than 10⁻³ and the consensus violation ∆(k) = max(i,j)∈E ‖xi^(k+1) − xj^(k+1)‖₂ is less than 10⁻³, where ε = 10⁻³ is the tolerance for optimality and feasibility, and F∗ is the optimal value computed using the efficient interior-point solver MOSEK. The implementation pseudocode of each algorithm and some computational details are provided below.
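The termination test can be written compactly as below (a sketch; Fvals, Fstar, X, and Edg are hypothetical names for the local objective values, the MOSEK optimum, the stacked iterates, and the edge list).

```matlab
% Joint epsilon-optimality / epsilon-feasibility test used for both methods.
function done = converged(Fvals, Fstar, X, Edg, tol)
    subopt = abs(sum(Fvals) - Fstar) / abs(Fstar);             % relative suboptimality
    viol = max(vecnorm(X(Edg(:,1),:) - X(Edg(:,2),:), 2, 2));  % Delta^(k)
    done = (subopt < tol) && (viol < tol);                     % tol = 1e-3 in our runs
end
```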


Types     Buses (nb)   Generators (ng)   Branches (nl)
case3     3            3                 3
case9     9            3                 9
case14    14           5                 20
case30    30           6                 41

Table 4.1: Network characteristics of IEEE benchmark system cases

4.2.1 Implementation of DFAL

We customized DFAL to solve problem (4.5), which can be written as a special case of the problem in (3.5) by setting A = M ⊗ In, where M ∈ R|E|×|N| denotes the oriented edge-node incidence matrix of G. As stated previously, we implemented the DFAL algorithm in such a way that the dual iterates θ(k) are not explicitly computed; how this can be done is explained in [15]. Indeed, one needs θ(k) to compute ∇h(k) in the MS-APG iterations, and also in Step 2 of DFAL to compute θ(k+1). That said, as shown in [15], these computations can be done without explicitly computing θ(k). Setting θ(1) = 0, from Step 2 in DFAL and (3.7), it can be concluded that θ(k+1) = −∑_{t=1}^{k} Ax(t)/λ(t), and ∇h(k)(x) = λ(k)∇f(x) + A⊤(Ax − λ(k)θ(k)) = λ(k)∇f(x) + A⊤A( x + λ(k) ∑_{t=1}^{k−1} x(t)/λ(t) ). Note that A⊤A = Ω ⊗ In, where Ω ∈ R|N|×|N| denotes the Laplacian of G. Therefore,

∇xi h(k)(x) = λ(k)∇fi(xi) + di( xi + x̄i^(k) ) − ∑j∈N(i) ( xj + x̄j^(k) ),   (4.17)

where x̄(k) := ∑_{t=1}^{k−1} (λ(k)/λ(t)) x(t), and N(i) denotes the set of nodes adjacent to node i ∈ N. Based on these relations, Step 1 of MS-APG can be computed in a distributed way, with each node in the network communicating only with its adjacent nodes, and without computing θ(k) in Step 2 of DFAL.
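In matrix form, (4.17) amounts to a single multiplication by the Laplacian, as in the sketch below (all names are illustrative).

```matlab
% Matrix-form evaluation of (4.17): rows of X hold x_i, rows of Xbar hold
% xbar_i^(k) = sum_{t<k} (lambda_k/lambda_t) x_i^(t), and gradfX holds the
% rows grad f_i(x_i); Omega*(X + Xbar) collects the degree and neighbor terms.
gradH = @(gradfX, X, Xbar, lam, Omega) lam*gradfX + Omega*(X + Xbar);
```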

Recall that the nodes move to a new outer iteration when either (3.8a) or (3.8b) holds. The seemingly unverifiable condition (3.8a) can indeed be verified by checking another sufficient condition. In particular, there is a way to compute a theoretical upper bound, ℓmax^(k), on the number of inner iterations that guarantees α(k)-optimality for the k-th subproblem in (3.6); i.e., if (3.8b) does not hold within ℓmax^(k) MS-APG iterations, then (3.8a) must be true. Hence, all the nodes will move to the next DFAL iteration after ℓmax^(k) MS-APG iterations. In the implementation of DFAL, every node can independently check (3.8b) without communicating its private information; however, in order to stop the inner iterations and move to the next DFAL iteration due to stopping condition (3.8b), all the nodes should satisfy their respective subgradient stopping criterion. Suppose each node broadcasts either True or False depending on whether it has satisfied its subgradient condition. In this case, after every inner iteration, DFAL requires evaluating a logical conjunction over G to check whether all the nodes report True; the conjunction evaluates to False if even one node reports False. Evaluating this conjunction in a distributed manner may not be easy for large-scale networks, and it may lead to a heavy burden on the network. For those networks where it is hard to implement this conjunction operation, we can avoid it by checking (3.8a) only. The implementable version of DFAL is shown in Fig. 4.1.

4.2.2 Implementation of DPGA-II

Compared to the DFAL algorithm, DPGA-II has only a single loop, which makes the iterations easier to compute for large networks. The computation of the proximal step is similar to that in DFAL. We adopted the constant step size ci = (Li + γdi)⁻¹ for node i ∈ N in the tests. Although the algorithm theoretically works for any penalty parameter γ > 0, the penalty parameter γ directly affects the step size, and hence the convergence behavior in practice; therefore, it should be carefully tuned. When the objective function is the line loss, we selected γ = 12nb based on the network topology. When the objective function is the generation cost, due to the presence of the scalar parameter baseMVA = 100, we selected γ = 1000, which ensures that the step sizes ci = 1/(Li + γdi) for i ∈ I are approximately the same as ci = 1/(γdi) for i ∈ N \ I. The implementable version is shown in Fig. 4.2.
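The resulting step-size setup amounts to a couple of lines (a sketch; nb, d, and Lf below are placeholders for the bus count, node degrees, and Lipschitz constants of the case at hand).

```matlab
% Step sizes used in the DPGA-II tests: gamma = 12*nb for line loss,
% gamma = 1000 for generation cost. The data below are hypothetical.
nb = 30; d = 2*ones(nb,1); Lf = zeros(nb,1);  % placeholder network data
gamma = 12*nb;                                % or 1000 for generation cost
c = 1 ./ (Lf + gamma*d);                      % c_i = 1/(L_i + gamma*d_i)
```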

4.3 Numerical results

We solved the distributed OPF optimization problem with consensus constraints (4.5), considering both Line Loss and Generation Cost as objectives. The numerical results for Line Loss are shown in Table 4.2, and the convergence behavior in Fig. 4.3 and Fig. 4.4. The numerical results for Generation Cost are shown in Table 4.3, and the convergence behavior in Fig. 4.5 and Fig. 4.6. The optimal value F∗ is computed by MOSEK, and it is used as a benchmark to measure the ε-optimality. The numerical results show that both algorithms converge to the optimal solution and work well in practice. It should be noted that the run-times reported do not include the effect of communication; in practical systems, transmitting information between neighbors also takes time. The number of communication rounds per iteration is 1 for both DFAL and DPGA-II.

Table 4.2 shows that, in the line loss case, DPGA-II performs better than DFAL when the network topology is larger and more complex. However, in the generation cost case shown in Table 4.3, DPGA-II takes more iterations and CPU time to converge to the optimal solution for case9 and case14. This may be due to the constant step-size rule adopted in our experiments. Because of the scalar parameter baseMVA, the Lipschitz constants Li of ∇fi may be too large for certain nodes, leading to very small steps since ci = O(Li⁻¹). As a result, the convergence rate may be relatively slow. Aybat et al. [20] have shown that by applying an adaptive step-size strategy to the DPGA methods, the convergence rate can be improved by a factor of at least 2 compared to the constant step-size strategy. Thus it is reasonable to expect that DPGA-II would also outperform DFAL in the generation cost case when the adaptive step-size strategy is adopted.

Types    Alg.      Opt. Value   Consensus Violation   CPU time (sec.)   # of iterations
case3    DFAL      312.8861     9E-4                  5                 8
         DPGA-II   312.8656     9E-4                  9                 16
         MOSEK     313.0794
case9    DFAL      310.7509     6E-4                  42                28
         DPGA-II   310.7733     7E-4                  43                28
         MOSEK     310.5845
case14   DFAL      254.5802     9E-4                  170               68
         DPGA-II   254.5384     7E-4                  120               49
         MOSEK     254.3359
case30   DFAL      189.2243     9E-4                  115               22
         DPGA-II   189.5279     9E-4                  78                15
         MOSEK     189.377

Table 4.2: Comparison of DFAL and DPGA-II in Line Loss


Types    Alg.      Opt. Value   Consensus Violation   CPU time (sec.)   # of iterations
case3    DFAL      5580.2354    6E-4                  4                 7
         DPGA-II   5578.7717    5E-4                  57                104
         MOSEK     5576.0736
case9    DFAL      5129.0475    1E-5                  1828              1235
         DPGA-II   5129.0638    2E-5                  2330              1417
         MOSEK     5134.1789
case14   DFAL      7854.3279    1E-4                  1158              474
         DPGA-II   7858.856     9E-4                  1480              591
         MOSEK     7861.6347
case30   DFAL      568.1058     1E-5                  4700              908
         DPGA-II   568.796      1E-3                  2085              388
         MOSEK     568.6684

Table 4.3: Comparison of DFAL and DPGA-II in Generation Cost

Fig. 4.3: The convergence behavior for Line Loss with DFAL alg.


Fig. 4.4: The convergence behavior for Line Loss with DPGA-II alg.


Fig. 4.5: The convergence behavior for Generation Cost with DFAL alg.


Fig. 4.6: The convergence behavior for Generation Cost with DPGA-II alg.


Chapter 5

Conclusion

In this thesis, we studied the distributed first-order augmented Lagrangian (DFAL) and distributed proximal gradient ADMM (DPGA-I and DPGA-II) methods for the optimal power flow (OPF) problem with composite convex node-specific objectives Fi = ξi + fi for each bus. We assume that each bus can only access its local data, and is able to exchange information with its neighboring nodes only. The advantage of solving the OPF problem in such a distributed way is that buses are not required to know any global parameters depending on the entire network topology, and by using local communication and consensus constraints, one can avoid a central planner, leading to a more robust operation of the system as a whole: if the computation is distributed over the network, rather than being done entirely at a central node, there is no single point of failure or attack. Moreover, by eliminating the central planner, the communication burden and memory storage are reduced, which leads to higher computational efficiency; at the same time, this mode of operation also respects possible node-level privacy requirements in the network. The DFAL and DPGA-II algorithms were implemented on the IEEE benchmark system with 3-, 9-, 14-, and 30-bus cases. Since the OPF problem is in general nonlinear and nonconvex, the SOCP relaxation was adopted in order to convexify the OPF problem. The numerical results show that both algorithms can compute ε-optimal and ε-feasible solutions, and for certain cases the SOCP relaxation was able to recover the optimal solution of the original nonconvex OPF (this was verified by checking the rank constraints).


Appendix

MATPOWER DATA FORMAT

For the sake of completeness, the details of the MATPOWER case format are given in the tables below, taken from the MATPOWER 6.0b1 User's Manual¹. First, the baseMVA field is a simple scalar value specifying the system MVA base used for converting power into per-unit quantities. The field descriptions are given in Tables A.1-A.4.

Table A.1: Bus Data (mpc.bus)

name        column   description
BUS_I       1        bus number (positive integer)
BUS_TYPE    2        bus type (1=PQ, 2=PV, 3=ref, 4=isolated)
PD          3        real power demand (MW)
QD          4        reactive power demand (MVAr)
GS          5        shunt conductance (MW demanded at V = 1.0 p.u.)
BS          6        shunt susceptance (MVAr injected at V = 1.0 p.u.)
BUS_AREA    7        area number (positive integer)
VM          8        voltage magnitude (p.u.)
VA          9        voltage angle (degrees)
BASE_KV     10       base voltage (kV)
ZONE        11       loss zone (positive integer)
VMAX        12       maximum voltage magnitude (p.u.)
VMIN        13       minimum voltage magnitude (p.u.)
LAM_P†      14       Lagrange multiplier on real power mismatch (u/MW)
LAM_Q†      15       Lagrange multiplier on reactive power mismatch (u/MVAr)
MU_VMAX†    16       Kuhn-Tucker multiplier on upper voltage limit (u/p.u.)
MU_VMIN†    17       Kuhn-Tucker multiplier on lower voltage limit (u/p.u.)

† Included in OPF output, typically not included (or ignored) in input matrix. Here we assume the objective function has units u.

1http://www.pserc.cornell.edu/matpower/MATPOWER-manual.pdf


Table A.2: Generator Data (mpc.gen)

name         column   description
GEN_BUS      1        bus number
PG           2        real power output (MW)
QG           3        reactive power output (MVAr)
QMAX         4        maximum reactive power output (MVAr)
QMIN         5        minimum reactive power output (MVAr)
VG           6        voltage magnitude setpoint (p.u.)
MBASE        7        total MVA base of machine, defaults to baseMVA
GEN_STATUS   8        machine status, > 0 = in-service, ≤ 0 = out-of-service
PMAX         9        maximum real power output (MW)
PMIN         10       minimum real power output (MW)
PC1*         11       lower real power output of PQ capability curve (MW)
PC2*         12       upper real power output of PQ capability curve (MW)
QC1MIN*      13       minimum reactive output at PC1 (MVAr)
QC1MAX*      14       maximum reactive output at PC1 (MVAr)
QC2MIN*      15       minimum reactive output at PC2 (MVAr)
QC2MAX*      16       maximum reactive output at PC2 (MVAr)
RAMP_AGC*    17       ramp rate for load following/AGC (MW/min)
RAMP_10*     18       ramp rate for 10 minute reserves (MW)
RAMP_30*     19       ramp rate for 30 minute reserves (MW)
RAMP_Q*      20       ramp rate for reactive power (2 sec timescale) (MVAr/min)
APF*         21       area participation factor
MU_PMAX†     22       Kuhn-Tucker multiplier on upper Pg limit (u/MW)
MU_PMIN†     23       Kuhn-Tucker multiplier on lower Pg limit (u/MW)
MU_QMAX†     24       Kuhn-Tucker multiplier on upper Qg limit (u/MVAr)
MU_QMIN†     25       Kuhn-Tucker multiplier on lower Qg limit (u/MVAr)

* Not included in version 1 case format.
† Included in OPF output, typically not included (or ignored) in input matrix. Here we assume the objective function has units u.


Table A.3: Branch Data (mpc.branch)

name         column   description
F_BUS        1        "from" bus number
T_BUS        2        "to" bus number
BR_R         3        resistance (p.u.)
BR_X         4        reactance (p.u.)
BR_B         5        total line charging susceptance (p.u.)
RATE_A       6        MVA rating A (long term rating)
RATE_B       7        MVA rating B (short term rating)
RATE_C       8        MVA rating C (emergency rating)
TAP          9        transformer off nominal turns ratio (taps at "from" bus, impedance at "to" bus, i.e., if r = x = 0, tap = |Vf|/|Vt|)
SHIFT        10       transformer phase shift angle (degrees), positive ⇒ delay
BR_STATUS    11       initial branch status, 1 = in-service, 0 = out-of-service
ANGMIN*      12       minimum angle difference, θf − θt (degrees)
ANGMAX*      13       maximum angle difference, θf − θt (degrees)
PF†          14       real power injected at "from" bus end (MW)
QF†          15       reactive power injected at "from" bus end (MVAr)
PT†          16       real power injected at "to" bus end (MW)
QT†          17       reactive power injected at "to" bus end (MVAr)
MU_SF‡       18       Kuhn-Tucker multiplier on MVA limit at "from" bus (u/MVA)
MU_ST‡       19       Kuhn-Tucker multiplier on MVA limit at "to" bus (u/MVA)
MU_ANGMIN‡   20       Kuhn-Tucker multiplier on lower angle difference limit (u/degree)
MU_ANGMAX‡   21       Kuhn-Tucker multiplier on upper angle difference limit (u/degree)

* Not included in version 1 case format. The voltage angle difference is taken to be unbounded below if ANGMIN < −360 and unbounded above if ANGMAX > 360. If both parameters are zero, the voltage angle difference is unconstrained.
† Included in power flow and OPF output, ignored on input.
‡ Included in OPF output, typically not included (or ignored) in input matrix. Here we assume the objective function has units u.


Table A.4: Generator Cost Data† (mpc.gencost)

name       column   description
MODEL      1        cost model, 1 = piecewise linear, 2 = polynomial
STARTUP    2        startup cost in US dollars*
SHUTDOWN   3        shutdown cost in US dollars*
NCOST      4        number of cost coefficients for polynomial cost function, or number of data points for piecewise linear
COST       5        parameters defining total cost function f(p) begin in this column; units of f and p are $/hr and MW (or MVAr), respectively
                    (MODEL = 1) ⇒ p0, f0, p1, f1, ..., pn, fn, where p0 < p1 < ... < pn and the cost f(p) is defined by the coordinates (p0, f0), (p1, f1), ..., (pn, fn) of the end/break-points of the piecewise linear cost
                    (MODEL = 2) ⇒ cn, ..., c1, c0: n + 1 coefficients of an n-th order polynomial cost, starting with the highest order, where the cost is f(p) = cn p^n + ... + c1 p + c0

† If gen has ng rows, then the first ng rows of gencost contain the costs for active power produced by the corresponding generators. If gencost has 2ng rows, then rows ng + 1 through 2ng contain the reactive power costs in the same format.
* Not currently used by any MATPOWER functions.


Bibliography

[1] J Carpentier. Contribution to the economic dispatch problem. Bulletin de la Société Française des Électriciens, 3(8):431–447, 1962.

[2] James A Momoh. Electric power system applications of optimization. CRC Press, 2008.

[3] James A Momoh, ME El-Hawary, and Ramababu Adapa. A review of selected optimal

power flow literature to 1993. part i: Nonlinear and quadratic programming approaches.

IEEE transactions on power systems, 14(1):96–104, 1999.

[4] James A Momoh, ME El-Hawary, and Ramababu Adapa. A review of selected optimal

power flow literature to 1993. part ii: Newton, linear programming and interior point

methods. IEEE Transactions on Power Systems, 14(1):105–111, 1999.

[5] M Huneault and F Galiana. A survey of the optimal power flow literature. IEEE Transactions on Power Systems, 6(2), 1991.

[6] Hongye Wang, Carlos E Murillo-Sanchez, Ray D Zimmerman, and Robert J Thomas.

On computational issues of market-based optimal power flow. Power Systems, IEEE

Transactions on, 22(3):1185–1193, 2007.

[7] KS Pandya and SK Joshi. A survey of optimal power flow methods. Journal of

Theoretical & Applied Information Technology, 4(5), 2008.

[8] Rabih A Jabr, Alun H Coonick, and Brian J Cory. A primal-dual interior point method

for optimal power flow dispatching. Power Systems, IEEE Transactions on, 17(3):654–

662, 2002.

[9] Hua Wei, Hiroshi Sasaki, Junji Kubokawa, and R Yokoyama. An interior point nonlin-

ear programming for optimal power flow problems with a novel data structure. Power

Systems, IEEE Transactions on, 13(3):870–877, 1998.

[10] Ian A Hiskens and Robert J Davy. Exploring the power flow solution space boundary.

Power Systems, IEEE Transactions on, 16(3):389–395, 2001.


[11] Rabih A Jabr. Optimal power flow using an extended conic quadratic formulation.

Power Systems, IEEE Transactions on, 23(3):1000–1008, 2008.

[12] Xiaoqing Bai, Hua Wei, Katsuki Fujisawa, and Yong Wang. Semidefinite programming

for optimal power flow problems. International Journal of Electrical Power & Energy

Systems, 30(6):383–392, 2008.

[13] Javad Lavaei and Steven H Low. Zero duality gap in optimal power flow problem.

Power Systems, IEEE Transactions on, 27(1):92–107, 2012.

[14] Lingwen Gan, Na Li, Steven Low, and Ufuk Topcu. Exact convex relaxation for optimal

power flow in distribution networks. In ACM SIGMETRICS Performance Evaluation

Review, volume 41, pages 351–352. ACM, 2013.

[15] Necdet Serhat Aybat, Garud Iyengar, and Zi Wang. An asynchronous distributed

proximal gradient method for composite convex optimization. arXiv preprint

arXiv:1409.8547, 2014.

[16] Ting-Hau Chang, Mingyi Hong, and Xiongfei Wang. Multi-agent distributed optimiza-

tion via inexact consensus admm. Signal Processing, IEEE Transactions on, 63(2):482–

497, 2015.

[17] Ryan McDonald, Keith Hall, and Gideon Mann. Distributed training strategies for the

structured perceptron. In Human Language Technologies: The 2010 Annual Conference

of the North American Chapter of the Association for Computational Linguistics, pages

456–464. Association for Computational Linguistics, 2010.

[18] Ioannis D Schizas, Alejandro Ribeiro, and Georgios B Giannakis. Consensus in ad hoc WSNs with noisy links, part I: Distributed estimation of deterministic signals. Signal Processing, IEEE Transactions on, 56(1):350–364, 2008.

[19] Qing Ling and Zhi Tian. Decentralized sparse signal recovery for compressive sleeping

wireless sensor networks. IEEE Transactions on Signal Processing, 58(7):3816–3827,

2010.


[20] Necdet Serhat Aybat, Zi Wang, Tianyi Lin, and Shiqian Ma. Distributed linearized al-

ternating direction method of multipliers for composite convex consensus optimization.

arXiv preprint arXiv:1512.08122, 2015.

[21] QY Jiang, H-D Chiang, Chuangxin Guo, and YJ Cao. Power-current hybrid rectan-

gular formulation for interior-point optimal power flow. Generation, Transmission &

Distribution, IET, 3(8):748–756, 2009.

[22] MS Osman, Mahmoud A Abo-Sinna, and AA Mousa. A solution to the optimal power

flow using genetic algorithm. Applied mathematics and computation, 155(2):391–405,

2004.

[23] O Alsac, J Bright, M Prais, and B Stott. Further developments in lp-based optimal

power flow. Power Systems, IEEE Transactions on, 5(3):697–711, 1990.

[24] Stephen Frank, Ingrida Steponavice, and Steffen Rebennack. Optimal power flow: a

bibliographic survey i. Energy Systems, 3(3):221–258, 2012.

[25] RC Burchett, Hf H Happ, DR Vierath, and KA Wirgau. Developments in optimal

power flow. Power Apparatus and Systems, IEEE Transactions on, (2):406–414, 1982.

[26] Xiao-Ping Zhang. Restructured electric power systems: analysis of electricity markets

with equilibrium models, volume 71. John Wiley & Sons, 2010.

[27] Brian Stott and Eric Hobson. Power system security control calculations using linear

programming, part i. Power Apparatus and Systems, IEEE Transactions on, (5):1713–

1720, 1978.

[28] Brian Stott and Eric Hobson. Power system security control calculations using linear

programming, part ii. Power Apparatus and Systems, IEEE Transactions on, (5):1721–

1731, 1978.

[29] Ro E Griffith and RA Stewart. A nonlinear programming technique for the optimization

of continuous processing systems. Management science, 7(4):379–392, 1961.

[30] Xiao-Ping Zhang, Christian Rehtanz, and Bikash Pal. Flexible AC transmission sys-

tems: modelling and control. Springer Science & Business Media, 2012.


[31] R Mota-Palomino and VH Quintana. Sparse reactive power scheduling by a penalty

function-linear programming technique. Power Systems, IEEE Transactions on,

1(3):31–39, 1986.

[32] Kenji Iba, Hiroshi Suzuki, Ken-ichi Suzuki, and Katsuhiko Suzuki. Practical reac-

tive power allocation/operation planning using successive linear programming. Power

Systems, IEEE Transactions on, 3(2):558–566, 1988.

[33] Paul T Boggs and Jon W Tolle. Sequential quadratic programming. Acta numerica,

4:1–51, 1995.

[34] Carsten Lehmkoster. Security constrained optimal power flow for an economical opera-

tion of facts-devices in liberalized energy markets. Power Delivery, IEEE Transactions

on, 17(2):603–608, 2002.

[35] X Lin, AK David, and CW Yu. Reactive power optimisation with voltage stability

consideration in power market systems. In Generation, Transmission and Distribution,

IEE Proceedings-, volume 150, pages 305–310. IET, 2003.

[36] GP Granelli and M Montagna. Security-constrained economic dispatch using dual

quadratic programming. Electric Power Systems Research, 56(1):71–80, 2000.

[37] Hermann W Dommel and William F Tinney. Optimal power flow solutions. power

apparatus and systems, IEEE transactions on, (10):1866–1876, 1968.

[38] Esdras Penedo De Carvalho, Anesio dos Santos, and To Fu Ma. Reduced gradient

method combined with augmented lagrangian and barrier for the optimal power flow

problem. Applied Mathematics and Computation, 200(2):529–536, 2008.

[39] John Peschon, Donald W Bree Jr, and Laslo P Hajdu. Optimal power-flow solutions

for power system planning. Proceedings of the IEEE, 60(1):64–70, 1972.

[40] HH Happ. Optimal power dispatch-a comprehensive survey. Power Apparatus and

Systems, IEEE Transactions on, 96(3):841–854, 1977.

[41] AM Sasson, F Viloria, and F Aboytes. Optimal load flow solution using the hessian

matrix. Power Apparatus and Systems, IEEE Transactions on, (1):31–41, 1973.


[42] David I Sun, Bruce Ashley, Brian Brewer, Art Hughes, and William F Tinney. Optimal

power flow by newton approach. power apparatus and systems, ieee transactions on,

(10):2864–2880, 1984.

[43] Gamal A Maria and JA Findlay. A newton optimal power flow program for ontario

hydro ems. Power Systems, IEEE Transactions on, 2(3):576–582, 1987.

[44] A Santos Jr, S Deckmann, and S Soares. A dual augmented lagrangian approach for

optimal power flow. Power Systems, IEEE Transactions on, 3(3):1020–1025, 1988.

[45] GRM Da Costa. Optimal reactive dispatch through primal-dual method. Power Sys-

tems, IEEE Transactions on, 12(2):669–674, 1997.

[46] O Crisan and MA Mohtadi. Efficient identification of binding inequality constraints in

optimal power flow newton approach. In Generation, Transmission and Distribution,

IEE Proceedings C, volume 139, pages 365–370. IET, 1992.

[47] GRM Da Costa, CEU Costa, and AM De Souza. Comparative studies of optimiza-

tion methods for the optimal power flow problem. Electric Power Systems Research,

56(3):249–254, 2000.

[48] Jorge Nocedal and Stephen Wright. Numerical optimization. Springer Science & Busi-

ness Media, 2006.

[49] S Granville, J Mello, and ACG Melo. Application of interior point methods to power

flow unsolvability. Power Systems, IEEE Transactions on, 11(2):1096–1103, 1996.

[50] Rabih A Jabr. A primal-dual interior-point method to solve the optimal power flow

dispatching problem. Optimization and Engineering, 4(4):309–336, 2003.

[51] Geraldo Leite Torres and Victor Hugo Quintana. An interior-point method for nonlinear optimal power flow using voltage rectangular coordinates. Power Systems, IEEE Transactions on, 13(4):1211–1218, 1998.

[52] Geraldo L Torres and Victor H Quintana. On a nonlinear multiple-centrality-

corrections interior-point method for optimal power flow. Power Systems, IEEE Trans-

actions on, 16(2):222–228, 2001.


[53] Florin Capitanescu, Mevludin Glavic, Damien Ernst, and Louis Wehenkel. Interior-

point based algorithms for the solution of optimal power flow problems. Electric Power

systems research, 77(5):508–517, 2007.

[54] Karim Karoui, Ludovic Platbrood, Horia Crisciu, and Richard A Waltz. New large-

scale security constrained optimal power flow program using a new interior point al-

gorithm. In Electricity Market, 2008. EEM 2008. 5th International Conference on

European, pages 1–6. IEEE, 2008.

[55] Marcia V Vanti and Clovis C Gonzaga. On the newton interior-point method for

nonlinear optimal power flow. In Power Tech Conference Proceedings, 2003 IEEE

Bologna, volume 4, pages 5–pp. IEEE, 2003.

[56] X-P Zhang, SG Petoussis, and KR Godfrey. Nonlinear interior-point optimal power

flow method based on a current mismatch formulation. In Generation, Transmission

and Distribution, IEE Proceedings-, volume 152, pages 795–805. IET, 2005.

[57] Javad Lavaei. Zero duality gap for classical opf problem convexifies fundamental nonlin-

ear power problems. In American Control Conference (ACC), 2011, pages 4566–4573.

IEEE, 2011.

[58] Samira Sojoudi and Javad Lavaei. Physics of power networks makes hard optimization

problems easy to solve. In Power and Energy Society General Meeting, 2012 IEEE,

pages 1–8. IEEE, 2012.

[59] Bernard C Lesieutre, Daniel K Molzahn, Alex R Borden, and Christopher L DeMarco.

Examining the limits of the application of semidefinite programming to power flow

problems. In Communication, Control, and Computing (Allerton), 2011 49th Annual

Allerton Conference on, pages 1492–1499. IEEE, 2011.

[60] Masoud Farivar, Christopher R Clarke, Steven H Low, and K Mani Chandy. Inverter

var control for distribution systems with renewables. In Smart Grid Communications

(SmartGridComm), 2011 IEEE International Conference on, pages 457–462. IEEE,

2011.


[61] Lingwen Gan, Na Li, Ufuk Topcu, and Steven Low. On the exactness of convex

relaxation for optimal power flow in tree networks. In Decision and Control (CDC),

2012 IEEE 51st Annual Conference on, pages 465–471. IEEE, 2012.

[62] Lin Xiao, Stephen Boyd, and Seung-Jean Kim. Distributed average consensus with

least-mean-square deviation. Journal of Parallel and Distributed Computing, 67(1):33–

46, 2007.

[63] Pedro A Forero, Alfonso Cano, and Georgios B Giannakis. Distributed clustering

using wireless sensor networks. Selected Topics in Signal Processing, IEEE Journal of,

5(4):707–724, 2011.

[64] Angelia Nedic and Asuman Ozdaglar. Distributed subgradient methods for multi-agent

optimization. Automatic Control, IEEE Transactions on, 54(1):48–61, 2009.

[65] John N Tsitsiklis, Dimitri P Bertsekas, and Michael Athans. Distributed asynchronous

deterministic and stochastic gradient optimization algorithms. In 1984 American Con-

trol Conference, pages 484–489, 1984.

[66] John Nikolas Tsitsiklis. Problems in decentralized decision making and computation.

Technical report, DTIC Document, 1984.

[67] John C Duchi, Alekh Agarwal, and Martin J Wainwright. Dual averaging for dis-

tributed optimization: convergence analysis and network scaling. Automatic control,

IEEE Transactions on, 57(3):592–606, 2012.

[68] Ermin Wei and Asuman Ozdaglar. On the O(1/k) convergence of asynchronous distributed alternating direction method of multipliers. In Global Conference on Signal and Information Processing (GlobalSIP), 2013 IEEE, pages 551–554. IEEE, 2013.

[69] Ermin Wei and Asuman Ozdaglar. Distributed alternating direction method of multi-

pliers. In Decision and Control (CDC), 2012 IEEE 51st Annual Conference on, pages

5445–5450. IEEE, 2012.


[70] Ali Makhdoumi and Asuman Ozdaglar. Broadcast-based distributed alternating direc-

tion method of multipliers. In Communication, Control, and Computing (Allerton),

2014 52nd Annual Allerton Conference on, pages 270–277. IEEE, 2014.

[71] Albert I Chen and Asuman Ozdaglar. A fast distributed proximal-gradient method.

In Communication, Control, and Computing (Allerton), 2012 50th Annual Allerton

Conference on, pages 601–608. IEEE, 2012.

[72] Wei Shi, Qing Ling, Gang Wu, and Wotao Yin. A proximal gradient algorithm

for decentralized composite optimization. Signal Processing, IEEE Transactions on,

63(22):6013–6023, 2015.

[73] Wei Shi, Qing Ling, Gang Wu, and Wotao Yin. Extra: An exact first-order algorithm

for decentralized consensus optimization. SIAM Journal on Optimization, 25(2):944–

966, 2015.

[74] Pascal Bianchi, Walid Hachem, and Franck Iutzeler. A stochastic primal-dual algorithm

for distributed asynchronous composite optimization. In GlobalSIP, pages 732–736,

2014.