Lagrange Relaxation
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
2 / 37
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
3 / 37
Lagrangian Function
Standard form problem (not necessarily convex):
min  f0(x)
s.t. fi(x) ≤ 0, i = 1, …, m
     hi(x) = 0, i = 1, …, p

x ∈ R^n, domain D, optimal value p⋆

Lagrangian function: L : R^n × R^m × R^p → R, dom L = D × R^m × R^p:

L(x, λ, ν) = f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{i=1}^p νi hi(x)

Weighted sum of the objective and constraint functions:
λi is the Lagrange multiplier associated with fi(x) ≤ 0
νi is the Lagrange multiplier associated with the equality constraint hi(x) = 0
4 / 37
Dual Function
Lagrange dual function: g : R^m × R^p → R,

g(λ, ν) = inf_{x∈D} L(x, λ, ν)
        = inf_{x∈D} ( f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{i=1}^p νi hi(x) )

g is concave; it can be −∞ for some λ, ν
5 / 37
Dual Function is Concave
Consider any (λ1, ν1), (λ2, ν2) with λ1, λ2 ≥ 0 and α ∈ [0, 1]:

g(αλ1 + (1−α)λ2, αν1 + (1−α)ν2)
  = inf_{x∈D} ( f0(x) + ∑_{i=1}^m (αλ1,i + (1−α)λ2,i) fi(x) + ∑_{i=1}^p (αν1,i + (1−α)ν2,i) hi(x) )
  ≥ α inf_{x∈D} ( f0(x) + ∑_{i=1}^m λ1,i fi(x) + ∑_{i=1}^p ν1,i hi(x) )
    + (1−α) inf_{x∈D} ( f0(x) + ∑_{i=1}^m λ2,i fi(x) + ∑_{i=1}^p ν2,i hi(x) )
  = α g(λ1, ν1) + (1−α) g(λ2, ν2)

The inequality follows by writing f0(x) = α f0(x) + (1−α) f0(x) and using inf_x (A(x) + B(x)) ≥ inf_x A(x) + inf_x B(x).
6 / 37
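The concavity argument above can be checked numerically. The following sketch uses an assumed toy problem (not from the slides): f0(x) = (x − 2)², one inequality f1(x) = 1 − x, with D = [0, 3] discretized on a grid. Since g is a pointwise infimum of functions affine in λ, midpoint concavity must hold at every pair of multipliers.

```python
import random

# Assumed toy problem: f0(x) = (x - 2)^2, f1(x) = 1 - x, D = [0, 3] on a grid.
X = [k / 100.0 for k in range(0, 301)]

def g(lam):
    # dual function: infimum over D of f0 + lam * f1,
    # i.e. an infimum of functions affine in lam -> concave in lam
    return min((x - 2.0) ** 2 + lam * (1.0 - x) for x in X)

# check midpoint concavity at random pairs of multipliers
random.seed(0)
for _ in range(100):
    a, b = random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0)
    assert g((a + b) / 2) >= (g(a) + g(b)) / 2 - 1e-12
```

The check passes for negative λ as well: concavity of g does not require λ ≥ 0 (only the lower-bound property does).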
Dual Function is a Lower Bound
If λ ≥ 0, then g(λ, ν) ≤ p⋆

Proof: If x is feasible and λ ≥ 0, then λi fi(x) ≤ 0 and νi hi(x) = 0, so

f0(x) ≥ L(x, λ, ν) ≥ inf_{x'∈D} L(x', λ, ν) = g(λ, ν).

Minimizing over all feasible x gives p⋆ ≥ g(λ, ν)
7 / 37
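A minimal numeric sketch of the lower-bound property, on an assumed example (not from the slides): min x² s.t. 1 − x ≤ 0, so p⋆ = 1 at x = 1. Here L(x, λ) = x² + λ(1 − x), whose infimum over x is attained at x = λ/2, giving g(λ) = λ − λ²/4 in closed form.

```python
# Assumed example: min x^2  s.t.  1 - x <= 0, so p* = 1 at x = 1.
# L(x, lam) = x^2 + lam*(1 - x); the infimum over x is at x = lam/2.

def g(lam):
    x = lam / 2.0                       # unconstrained minimizer of L
    return x ** 2 + lam * (1.0 - x)     # g(lam) = lam - lam^2 / 4

p_star = 1.0
# weak duality: g(lam) <= p* for every lam >= 0
assert all(g(0.1 * k) <= p_star + 1e-12 for k in range(101))
# strong duality holds for this convex problem: g(2) = p*
assert abs(g(2.0) - p_star) < 1e-12
```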
Lagrange Relaxation of Stochastic Programs
Consider a 2-stage stochastic program:

min  f1(x) + Eω[f2(y(ω), ω)]
s.t. h1i(x) ≤ 0, i = 1, …, m1
     h2i(x, y(ω), ω) ≤ 0, i = 1, …, m2

Introduce the non-anticipativity constraint x(ω) = x and reformulate the problem as

min  f1(x) + Eω[f2(y(ω), ω)]
s.t. h1i(x) ≤ 0, i = 1, …, m1
     h2i(x(ω), y(ω), ω) ≤ 0, i = 1, …, m2
     x(ω) = x
8 / 37
Dual Function of Stochastic Program
g(ν) = g1(ν) + Eω[g2(ν, ω)]

where

g1(ν) = inf  f1(x) + νᵀx
        s.t. h1i(x) ≤ 0, i = 1, …, m1

and

g2(ν, ω) = inf  f2(y(ω), ω) − νᵀx(ω)
           s.t. h2i(x(ω), y(ω), ω) ≤ 0, i = 1, …, m2
Can you think of another relaxation?
9 / 37
Agent Coordination
Consider a set of agents G, where agent g has private cost fg(xg) and private constraints h2g(xg) ≤ 0:

min  ∑_{g∈G} fg(xg)
s.t. ∑_{g∈G} h1g(xg) = 0
     h2g(xg) ≤ 0, g ∈ G

Relax the coordination constraint ∑_{g∈G} h1g(xg) = 0:

L(x, λ) = ∑_{g∈G} ( fg(xg) + λᵀh1g(xg) )

g(λ) = ∑_{g∈G} inf_{h2g(xg)≤0} ( fg(xg) + λᵀh1g(xg) )
10 / 37
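The separable structure of g(λ) can be sketched on an assumed toy instance (not from the slides): two agents with fg(x) = (x − ag)², a1 = 1, a2 = −1, private sets |xg| ≤ 2, and the coordination constraint x1 + x2 = 0 (so h1g(xg) = xg). Each agent then solves its own one-dimensional subproblem.

```python
# Assumed toy instance: two agents, f_g(x) = (x - a_g)^2, |x_g| <= 2,
# coordination constraint x_1 + x_2 = 0 relaxed with a scalar multiplier.
X = [k / 100.0 for k in range(-200, 201)]     # grid for the private set |x| <= 2

def agent_inf(a, lam):
    # each agent solves its own subproblem: inf f_g(x) + lam * x
    return min((x - a) ** 2 + lam * x for x in X)

def g(lam):
    # the dual function separates into a sum of per-agent infima
    return agent_inf(1.0, lam) + agent_inf(-1.0, lam)

# primal optimum: x_1 = 1, x_2 = -1 gives p* = 0; the dual bound matches here
d_star = max(g(l / 10.0) for l in range(-20, 21))
assert abs(d_star - 0.0) < 1e-9
```

The coordinator only has to adjust λ; the agents never see each other's costs, which is the point of the relaxation.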
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
11 / 37
The Dual Problem
Lagrange dual problem:
max  g(λ, ν)
s.t. λ ≥ 0

Finds the best lower bound on p⋆ from the Lagrange dual function
Convex optimization problem with optimal value d⋆
λ, ν are dual feasible if λ ≥ 0 and (λ, ν) ∈ dom g
12 / 37
Weak and Strong Duality
Weak duality: d⋆ ≤ p⋆
  Always holds (for convex and non-convex problems)
  Can be used for finding non-trivial bounds for difficult problems
Strong duality: p⋆ = d⋆
  Does not hold in general
  Usually holds for convex problems
  Conditions that guarantee strong duality in convex problems are called constraint qualifications
13 / 37
Linear Programming Duality Mnemonic Table
Primal (minimize)        Dual (maximize)
-----------------        ---------------
Constraint  ≥ bi         Variable   ≥ 0
Constraint  ≤ bi         Variable   ≤ 0
Constraint  = bi         Variable   free
Variable    ≥ 0          Constraint ≤ cj
Variable    ≤ 0          Constraint ≥ cj
Variable    free         Constraint = cj
Prove the mnemonic table using Lagrangian relaxation
14 / 37
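Two rows of the table can be checked on an assumed one-variable instance: a "≥ bi" constraint pairs with a nonnegative dual variable, and a free primal variable pairs with a dual equality constraint.

```python
# Assumed tiny instance:
#   primal (min): min 2x  s.t.  x >= 3,  x free
#   table rows:   ">= b" constraint -> dual variable y >= 0,
#                 free variable x   -> dual constraint "= c"
#   dual (max):   max 3y  s.t.  y = 2,  y >= 0
p_star = 2 * 3      # primal optimum at the active constraint x = 3
d_star = 3 * 2      # dual: y = 2 is the only feasible point
assert p_star == d_star == 6    # strong LP duality
```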
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
15 / 37
Complementary Slackness
If strong duality holds, x⋆ is primal optimal, and λ⋆, ν⋆ are dual optimal, then

f0(x⋆) = g(λ⋆, ν⋆)
       = inf_x ( f0(x) + ∑_{i=1}^m λ⋆i fi(x) + ∑_{i=1}^p ν⋆i hi(x) )
       ≤ f0(x⋆) + ∑_{i=1}^m λ⋆i fi(x⋆) + ∑_{i=1}^p ν⋆i hi(x⋆)
       ≤ f0(x⋆)

Therefore the two inequalities above hold with equality, and
  x⋆ minimizes L(x, λ⋆, ν⋆)
  λ⋆i fi(x⋆) = 0 for i = 1, …, m

This is known as complementary slackness:

λ⋆i > 0 ⇒ fi(x⋆) = 0,    fi(x⋆) < 0 ⇒ λ⋆i = 0
16 / 37
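Complementary slackness can be illustrated on an assumed family of problems: min x² s.t. a − x ≤ 0 (i.e. x ≥ a). For a = 1 the constraint binds and its multiplier is positive; for a = −1 the constraint is slack and the multiplier is zero.

```python
# Assumed family: min x^2  s.t.  a - x <= 0.
def solve(a):
    x_star = max(a, 0.0)       # project the unconstrained minimum 0 onto x >= a
    lam_star = 2 * x_star      # stationarity 2x - lam = 0 when the constraint binds
    return x_star, lam_star

x1, l1 = solve(1.0)            # a = 1: constraint active
assert l1 > 0 and 1.0 - x1 == 0.0      # lam* > 0  =>  f1(x*) = 0
x2, l2 = solve(-1.0)           # a = -1: constraint slack
assert -1.0 - x2 < 0 and l2 == 0.0     # f1(x*) < 0  =>  lam* = 0
```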
KKT Conditions
KKT conditions for a problem with differentiable fi, hi:
  Primal constraints: fi(x) ≤ 0, i = 1, …, m; hi(x) = 0, i = 1, …, p
  Dual constraints: λ ≥ 0
  Complementary slackness: λi fi(x) = 0, i = 1, …, m
  Gradient of the Lagrangian function with respect to x vanishes:

∇f0(x) + ∑_{i=1}^m λi ∇fi(x) + ∑_{i=1}^p νi ∇hi(x) = 0

From the previous slide: if strong duality holds and x, λ, ν are optimal, then they must satisfy the KKT conditions
17 / 37
KKT Conditions for Convex Problem
If x, λ, ν satisfy the KKT conditions for a convex problem, then they are optimal:
  From complementary slackness: f0(x) = L(x, λ, ν)
  From the 4th condition (and convexity): g(λ, ν) = L(x, λ, ν)
Hence f0(x) = g(λ, ν), and by weak duality x is primal optimal and (λ, ν) is dual optimal
18 / 37
KKT Conditions of Maximization with Linear Constraints
Consider a maximization problem with linear constraints:

max  f(x)
s.t. Cx = d   (µ)
     Ax ≤ b   (λ)
     x ≥ 0    (λ2)

Then the KKT conditions have the following form:

Cx − d = 0
0 ≤ λ ⊥ Ax − b ≤ 0
0 ≤ x ⊥ λᵀA + µᵀC − ∇f(x)ᵀ ≥ 0
19 / 37
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
20 / 37
Perturbed Problem
Unperturbed optimization problem and its dual:

min { f0(x) : fi(x) ≤ 0, i = 1, …, m, hi(x) = 0, i = 1, …, p }
max { g(λ, ν) : λ ≥ 0 }

Perturbed problem and its dual:

min { f0(x) : fi(x) ≤ ui, i = 1, …, m, hi(x) = vi, i = 1, …, p }
max { g(λ, ν) − uᵀλ − vᵀν : λ ≥ 0 }

x is the primal variable; u, v are parameters
p⋆(u, v) is the optimal value as a function of u, v
We are interested in information about p⋆(u, v) that we can obtain from the solution of the unperturbed problem and its dual.
21 / 37
Global Sensitivity Result
Assume strong duality holds for the unperturbed problem, and that λ⋆, ν⋆ are dual optimal for the unperturbed problem. Then

p⋆(u, v) ≥ g(λ⋆, ν⋆) − uᵀλ⋆ − vᵀν⋆
         = p⋆(0, 0) − uᵀλ⋆ − vᵀν⋆

Sensitivity interpretation:
  If λ⋆i is large, p⋆ increases greatly if we tighten constraint i (ui < 0)
  If λ⋆i is small, p⋆ does not decrease greatly if we loosen constraint i (ui > 0)
  If ν⋆i is large and positive, p⋆ increases greatly if vi < 0; if ν⋆i is large and negative, p⋆ increases greatly if vi > 0
  If ν⋆i is small and positive, p⋆ does not decrease much if vi > 0; if ν⋆i is small and negative, p⋆ does not decrease much if vi < 0
22 / 37
Local Sensitivity
If (in addition) p⋆(u, v) is differentiable at (0, 0), then

λ⋆i = −∂p⋆(0,0)/∂ui,    ν⋆i = −∂p⋆(0,0)/∂vi

Proof (for λ⋆i): from the global sensitivity result,

∂p⋆(0,0)/∂ui = lim_{t↓0} [p⋆(t·ei, 0) − p⋆(0,0)] / t ≥ −λ⋆i
∂p⋆(0,0)/∂ui = lim_{t↑0} [p⋆(t·ei, 0) − p⋆(0,0)] / t ≤ −λ⋆i

hence, equality
23 / 37
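The local sensitivity result can be checked by finite differences on an assumed example (not from the slides): min x² s.t. 1 − x ≤ u, i.e. x ≥ 1 − u, so p⋆(u) = (1 − u)² for u ≤ 1, and the optimal multiplier at u = 0 is λ⋆ = 2.

```python
# Assumed example: min x^2  s.t.  1 - x <= u  =>  p*(u) = (1 - u)^2 for u <= 1.

def p_star(u):
    # closed-form optimal value of the perturbed problem
    return (1.0 - u) ** 2

t = 1e-6
deriv = (p_star(t) - p_star(-t)) / (2 * t)   # central difference at u = 0
lam_star = 2.0
assert abs(deriv - (-lam_star)) < 1e-4       # lam* = -dp*(0,0)/du
```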
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
24 / 37
Duality and Problem Reformulations
Equivalent formulations of a problem can lead to very different duals
Reformulating the primal problem can be useful when the dual is difficult to derive, or uninteresting
Common reformulations:
  Introduce new variables and equality constraints (we have seen this already)
Rearrange constraints in subproblems
25 / 37
Rearranging Constraints
How would you relax the following:
min  Eω[f1(x(ω)) + f2(y(ω), ω)]
s.t. g1i(x(ω)) ≤ 0, i = 1, …, m1
     g2i(x(ω), y(ω), ω) ≤ 0, i = 1, …, m2
     x(ω) = x
Is this a good idea?
26 / 37
A Two-Stage Stochastic Integer Program [Sen, 2000]
Scenario | Constraints                    | Binary solutions
ω = 1    | 2x1 + y1 ≤ 2 and 2x1 − y1 ≥ 0  | D1 = {(0,0), (1,0)}
ω = 2    | x2 − y2 ≥ 0                    | D2 = {(0,0), (1,0), (1,1)}
ω = 3    | x3 + y3 ≤ 1                    | D3 = {(0,0), (0,1), (1,0)}

3 equally likely scenarios
Define x as the first-stage decision; xω is the first-stage decision for scenario ω
Non-anticipativity constraint: x1 = (1/3)(x1 + x2 + x3)
27 / 37
Formulation of Problem and Dual Function
max  (1/3)y1 + (1/3)y2 + (1/3)y3
s.t. 2x1 + y1 ≤ 2
     2x1 − y1 ≥ 0
     x2 − y2 ≥ 0
     x3 + y3 ≤ 1
     (2/3)x1 − (1/3)x2 − (1/3)x3 = 0   (λ)
     xω, yω ∈ {0, 1}, ω ∈ Ω = {1, 2, 3}

g(λ) = max_{(xω,yω)∈Dω} (2λ/3)x1 + (1/3)y1 − (λ/3)x2 + (1/3)y2 − (λ/3)x3 + (1/3)y3
     = max(0, 2λ/3) + max(0, (1 − λ)/3) + max(1/3, −λ/3)
28 / 37
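The closed form of g(λ) can be verified by enumerating the scenario sets Dω from the slide, using exact rational arithmetic:

```python
from fractions import Fraction as F

# Scenario sets from the slide
D1 = [(0, 0), (1, 0)]
D2 = [(0, 0), (1, 0), (1, 1)]
D3 = [(0, 0), (0, 1), (1, 0)]

def g(lam):
    # dual function by brute-force maximization over each D_omega
    lam = F(lam)
    t1 = max(F(2, 3) * lam * x + F(1, 3) * y for x, y in D1)
    t2 = max(-lam / 3 * x + F(1, 3) * y for x, y in D2)
    t3 = max(-lam / 3 * x + F(1, 3) * y for x, y in D3)
    return t1 + t2 + t3

# matches the closed form max(0, 2λ/3) + max(0, (1−λ)/3) + max(1/3, −λ/3)
for lam in [-2, -1, F(-1, 2), 0, F(1, 2), 1, 2]:
    closed = (max(0, F(2, 3) * F(lam)) + max(0, (1 - F(lam)) / 3)
              + max(F(1, 3), -F(lam) / 3))
    assert g(lam) == closed
```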
Duality Gap
The dual function can be expressed equivalently as

g(λ) = 1/3 − (2/3)λ   for λ ≤ −1
g(λ) = 2/3 − (1/3)λ   for −1 ≤ λ ≤ 0
g(λ) = 2/3 + (1/3)λ   for 0 ≤ λ ≤ 1
g(λ) = 1/3 + (2/3)λ   for 1 ≤ λ

Primal optimal value p⋆ = 1/3 with x⋆1 = x⋆2 = x⋆3 = 1, y⋆1 = 0, y⋆2 = 1, y⋆3 = 0
Dual optimal value d⋆ = 2/3 at λ⋆ = 0
Conclusion: we have a duality gap of 1/3
29 / 37
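Both optimal values can be confirmed by brute force: enumerate all scenario choices for the primal (keeping only those satisfying non-anticipativity), and minimize the piecewise-linear g(λ) over a grid for the dual.

```python
from fractions import Fraction as F
from itertools import product

# Scenario sets from the slide
D1 = [(0, 0), (1, 0)]
D2 = [(0, 0), (1, 0), (1, 1)]
D3 = [(0, 0), (0, 1), (1, 0)]

# primal: enumerate scenario choices satisfying non-anticipativity x1 = x2 = x3
p_star = max(F(y1 + y2 + y3, 3)
             for (x1, y1), (x2, y2), (x3, y3) in product(D1, D2, D3)
             if x1 == x2 == x3)
assert p_star == F(1, 3)

# dual: minimize the piecewise-linear g(lambda) over a grid around 0
def g(lam):
    return (max(0, F(2, 3) * lam) + max(0, (1 - lam) / 3)
            + max(F(1, 3), -lam / 3))

d_star = min(g(F(k, 10)) for k in range(-30, 31))
assert d_star == F(2, 3)     # attained at lambda = 0; gap = 1/3
```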
Alternative Relaxation
Add a new explicit first-stage decision variable x, with the following non-anticipativity constraints:

x1 = x   (λ1)
x2 = x   (λ2)
x3 = x   (λ3)

Dual function:

g(λ) = max_{(x1,y1)∈D1} ((1/3)y1 − λ1x1) + max_{(x2,y2)∈D2} ((1/3)y2 − λ2x2)
       + max_{(x3,y3)∈D3} ((1/3)y3 − λ3x3) + max_{x∈{0,1}} (λ1 + λ2 + λ3)x
     = max(0, −λ1) + max(0, 1/3 − λ2) + max(1/3, −λ3) + max(0, λ1 + λ2 + λ3)
30 / 37
Closing the Duality Gap
Since we know the primal optimal solution, we have a hint of what the λi need to be:

λ1 + λ2 + λ3 ≥ 0   (because we should have x⋆ = 1)
−λ1 ≥ 0            (because we should have x⋆1 = 1)
λ2 ≤ 1/3           (because we should have x⋆2 = 1)
−λ3 ≥ 1/3          (because we should have x⋆3 = 1)

Choosing λ1 = 0, λ2 = 1/3 and λ3 = −1/3, we satisfy these inequalities. Dual function:

g(0, 1/3, −1/3) = 0 + 0 + 1/3 + 0 = 1/3
31 / 37
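The alternative-relaxation dual can likewise be evaluated by enumeration, confirming that the chosen multipliers close the gap:

```python
from fractions import Fraction as F

# Scenario sets from the slide
D1 = [(0, 0), (1, 0)]
D2 = [(0, 0), (1, 0), (1, 1)]
D3 = [(0, 0), (0, 1), (1, 0)]

def g(l1, l2, l3):
    # dual of the relaxation x_omega = x, one multiplier per scenario
    t1 = max(F(1, 3) * y - l1 * x for x, y in D1)
    t2 = max(F(1, 3) * y - l2 * x for x, y in D2)
    t3 = max(F(1, 3) * y - l3 * x for x, y in D3)
    t0 = max((l1 + l2 + l3) * x for x in (0, 1))
    return t1 + t2 + t3 + t0

# with the multipliers chosen on the slide, the bound equals p* = 1/3
assert g(F(0), F(1, 3), F(-1, 3)) == F(1, 3)
# at lambda = 0 this relaxation still gives the weaker bound 2/3
assert g(F(0), F(0), F(0)) == F(2, 3)
```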
Conclusions of Example
Different relaxations can result in different duality gaps
Computational trade-off: introducing more Lagrange multipliers results in better bounds but a larger search space
32 / 37
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
33 / 37
Non-Uniqueness of KKT Conditions
1 The KKT conditions of a problem depend on how we define the Lagrangian function
2 The sign of dual multipliers depends on the KKT conditions (therefore, on how we define the Lagrangian function)
3 The sensitivity interpretation of dual multipliers depends on the KKT conditions (therefore, on how we define the Lagrangian function)
4 Different software interprets user syntax differently!
34 / 37
Dual Multipliers in AMPL
In order to anticipate the sign of the multipliers that AMPL will assign to constraints, note that:
  a constraint of the form f1(x) {≤, =, ≥} f2(x) is equivalently expressed as f1(x) − f2(x) {≤, =, ≥} 0,
  the constraints are relaxed by subtracting their product with their corresponding multiplier from the Lagrangian function,
  the sign of the dual multiplier is such that the Lagrangian function provides a bound to the optimization problem,
  the primal-dual optimal pair is such that the KKT conditions corresponding to this Lagrangian function are satisfied.
In this way, the dual multipliers reported by AMPL can always be interpreted as sensitivities.
35 / 37
Example
min x + 2y   s.t.   0 ≤ x (λ1),   x ≤ 2 (λ2),   y = 1 (µ)

Objective function f(x, y) = x + 2y; inequality constraints g1(x, y) = −x ≤ 0 (i.e., a ≤ constraint) and g2(x, y) = x − 2 ≤ 0; equality constraint h(x, y) = y − 1 = 0

AMPL Lagrangian:
L(x, y, λ, µ) = (x + 2y) − λ1(−x) − λ2(x − 2) − µ(y − 1)
36 / 37
AMPL KKT Conditions
KKT conditions:
Primal feasibility: g1(x, y) ≤ 0, g2(x, y) ≤ 0, h(x, y) = 0
Dual feasibility: λ1 ≤ 0, λ2 ≤ 0
Complementarity: λ1 ⊥ g1(x, y), λ2 ⊥ g2(x, y)
Stationarity:
∇f(x, y) − λ1∇g1(x, y) − λ2∇g2(x, y) − µ∇h(x, y) = 0
Solution: x = 0, y = 1, λ1 = −1, λ2 = 0, µ = 2
37 / 37
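The reported primal-dual pair can be checked directly against the AMPL-sign KKT conditions above:

```python
# Solution reported on the slide
x, y = 0.0, 1.0
lam1, lam2, mu = -1.0, 0.0, 2.0

g1 = -x          # -x <= 0      (from 0 <= x)
g2 = x - 2       # x - 2 <= 0   (from x <= 2)
h = y - 1        # y - 1 = 0    (from y = 1)

assert g1 <= 0 and g2 <= 0 and h == 0        # primal feasibility
assert lam1 <= 0 and lam2 <= 0               # AMPL sign convention
assert lam1 * g1 == 0 and lam2 * g2 == 0     # complementarity
# stationarity: grad f - lam1*grad g1 - lam2*grad g2 - mu*grad h = 0
grad = (1 - lam1 * (-1) - lam2 * 1 - mu * 0,
        2 - lam1 * 0 - lam2 * 0 - mu * 1)
assert grad == (0.0, 0.0)
```

Note λ1 = −1 is negative under AMPL's convention: loosening the binding bound 0 ≤ x decreases the optimal value, consistent with the sensitivity interpretation.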