Lagrange Relaxation
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
2 / 37
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
3 / 37
Lagrangian Function
Standard form problem (not necessarily convex):
min  f0(x)
s.t. fi(x) ≤ 0, i = 1, …, m
     hi(x) = 0, i = 1, …, p

x ∈ R^n, domain D, optimal value p⋆

Lagrangian function: L : R^n × R^m × R^p → R, dom L = D × R^m × R^p:

L(x, λ, ν) = f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{i=1}^p νi hi(x)

Weighted sum of the objective and constraint functions:
λi is the Lagrange multiplier associated with fi(x) ≤ 0
νi is the Lagrange multiplier associated with the equality constraint hi(x) = 0
4 / 37
Dual Function
Lagrange dual function: g : R^m × R^p → R,

g(λ, ν) = inf_{x∈D} L(x, λ, ν)
        = inf_{x∈D} ( f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{i=1}^p νi hi(x) )

g is concave; it can be −∞ for some λ, ν
5 / 37
Dual Function is Concave
Consider any (λ1, ν1), (λ2, ν2) with λ1, λ2 ≥ 0 and α ∈ [0, 1]:

g(αλ1 + (1−α)λ2, αν1 + (1−α)ν2)
  = inf_{x∈D} ( f0(x) + ∑_{i=1}^m (αλ1,i + (1−α)λ2,i) fi(x) + ∑_{i=1}^p (αν1,i + (1−α)ν2,i) hi(x) )
  ≥ α inf_{x∈D} ( f0(x) + ∑_{i=1}^m λ1,i fi(x) + ∑_{i=1}^p ν1,i hi(x) )
    + (1−α) inf_{x∈D} ( f0(x) + ∑_{i=1}^m λ2,i fi(x) + ∑_{i=1}^p ν2,i hi(x) )
  = α g(λ1, ν1) + (1−α) g(λ2, ν2)

The inequality follows by writing f0(x) = α f0(x) + (1−α) f0(x) and using inf_x (A(x) + B(x)) ≥ inf_x A(x) + inf_x B(x).
6 / 37
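The concavity argument above can be checked numerically. The following sketch uses an assumed toy problem (not from the slides): f0(x) = (x − 2)², one inequality f1(x) = 1 − x, with D = [0, 3] discretized on a grid. Since g is a pointwise infimum of functions affine in λ, midpoint concavity must hold at every pair of multipliers.

```python
import random

# Assumed toy problem: f0(x) = (x - 2)^2, f1(x) = 1 - x, D = [0, 3] on a grid.
X = [k / 100.0 for k in range(0, 301)]

def g(lam):
    # dual function: infimum over D of f0 + lam * f1,
    # i.e. an infimum of functions affine in lam -> concave in lam
    return min((x - 2.0) ** 2 + lam * (1.0 - x) for x in X)

# check midpoint concavity at random pairs of multipliers
random.seed(0)
for _ in range(100):
    a, b = random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0)
    assert g((a + b) / 2) >= (g(a) + g(b)) / 2 - 1e-12
```

The check passes for negative λ as well: concavity of g does not require λ ≥ 0 (only the lower-bound property does).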
Dual Function is a Lower Bound
If λ ≥ 0, then g(λ, ν) ≤ p⋆

Proof: If x is feasible and λ ≥ 0, then λi fi(x) ≤ 0 and νi hi(x) = 0, so

f0(x) ≥ L(x, λ, ν) ≥ inf_{x'∈D} L(x', λ, ν) = g(λ, ν).

Minimizing over all feasible x gives p⋆ ≥ g(λ, ν)
7 / 37
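A minimal numeric sketch of the lower-bound property, on an assumed example (not from the slides): min x² s.t. 1 − x ≤ 0, so p⋆ = 1 at x = 1. Here L(x, λ) = x² + λ(1 − x), whose infimum over x is attained at x = λ/2, giving g(λ) = λ − λ²/4 in closed form.

```python
# Assumed example: min x^2  s.t.  1 - x <= 0, so p* = 1 at x = 1.
# L(x, lam) = x^2 + lam*(1 - x); the infimum over x is at x = lam/2.

def g(lam):
    x = lam / 2.0                       # unconstrained minimizer of L
    return x ** 2 + lam * (1.0 - x)     # g(lam) = lam - lam^2 / 4

p_star = 1.0
# weak duality: g(lam) <= p* for every lam >= 0
assert all(g(0.1 * k) <= p_star + 1e-12 for k in range(101))
# strong duality holds for this convex problem: g(2) = p*
assert abs(g(2.0) - p_star) < 1e-12
```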
Lagrange Relaxation of Stochastic Programs
Consider a 2-stage stochastic program:

min  f1(x) + Eω[f2(y(ω), ω)]
s.t. h1i(x) ≤ 0, i = 1, …, m1
     h2i(x, y(ω), ω) ≤ 0, i = 1, …, m2

Introduce the non-anticipativity constraint x(ω) = x and reformulate the problem as

min  f1(x) + Eω[f2(y(ω), ω)]
s.t. h1i(x) ≤ 0, i = 1, …, m1
     h2i(x(ω), y(ω), ω) ≤ 0, i = 1, …, m2
     x(ω) = x
8 / 37
Dual Function of Stochastic Program
g(ν) = g1(ν) + Eω[g2(ν, ω)]

where

g1(ν) = inf  f1(x) + νᵀx
        s.t. h1i(x) ≤ 0, i = 1, …, m1

and

g2(ν, ω) = inf  f2(y(ω), ω) − νᵀx(ω)
           s.t. h2i(x(ω), y(ω), ω) ≤ 0, i = 1, …, m2
Can you think of another relaxation?
9 / 37
Agent Coordination
Consider a set of agents G, where agent g has private cost fg(xg) and private constraints h2g(xg) ≤ 0:

min  ∑_{g∈G} fg(xg)
s.t. ∑_{g∈G} h1g(xg) = 0
     h2g(xg) ≤ 0, g ∈ G

Relax the coordination constraint ∑_{g∈G} h1g(xg) = 0:

L(x, λ) = ∑_{g∈G} ( fg(xg) + λᵀh1g(xg) )

g(λ) = ∑_{g∈G} inf_{h2g(xg)≤0} ( fg(xg) + λᵀh1g(xg) )
10 / 37
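The separable structure of g(λ) can be sketched on an assumed toy instance (not from the slides): two agents with fg(x) = (x − ag)², a1 = 1, a2 = −1, private sets |xg| ≤ 2, and the coordination constraint x1 + x2 = 0 (so h1g(xg) = xg). Each agent then solves its own one-dimensional subproblem.

```python
# Assumed toy instance: two agents, f_g(x) = (x - a_g)^2, |x_g| <= 2,
# coordination constraint x_1 + x_2 = 0 relaxed with a scalar multiplier.
X = [k / 100.0 for k in range(-200, 201)]     # grid for the private set |x| <= 2

def agent_inf(a, lam):
    # each agent solves its own subproblem: inf f_g(x) + lam * x
    return min((x - a) ** 2 + lam * x for x in X)

def g(lam):
    # the dual function separates into a sum of per-agent infima
    return agent_inf(1.0, lam) + agent_inf(-1.0, lam)

# primal optimum: x_1 = 1, x_2 = -1 gives p* = 0; the dual bound matches here
d_star = max(g(l / 10.0) for l in range(-20, 21))
assert abs(d_star - 0.0) < 1e-9
```

The coordinator only has to adjust λ; the agents never see each other's costs, which is the point of the relaxation.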
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
11 / 37
The Dual Problem
Lagrange dual problem:
max  g(λ, ν)
s.t. λ ≥ 0

Finds the best lower bound on p⋆ from the Lagrange dual function
Convex optimization problem with optimal value d⋆
λ, ν are dual feasible if λ ≥ 0 and (λ, ν) ∈ dom g
12 / 37
Weak and Strong Duality
Weak duality: d⋆ ≤ p⋆
  Always holds (for convex and non-convex problems)
  Can be used for finding non-trivial bounds for difficult problems
Strong duality: p⋆ = d⋆
  Does not hold in general
  Usually holds for convex problems
  Conditions that guarantee strong duality in convex problems are called constraint qualifications
13 / 37
Linear Programming Duality Mnemonic Table
Primal (minimize)        Dual (maximize)
-----------------        ---------------
Constraint  ≥ bi         Variable   ≥ 0
Constraint  ≤ bi         Variable   ≤ 0
Constraint  = bi         Variable   free
Variable    ≥ 0          Constraint ≤ cj
Variable    ≤ 0          Constraint ≥ cj
Variable    free         Constraint = cj
Prove the mnemonic table using Lagrangian relaxation
14 / 37
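Two rows of the table can be checked on an assumed one-variable instance: a "≥ bi" constraint pairs with a nonnegative dual variable, and a free primal variable pairs with a dual equality constraint.

```python
# Assumed tiny instance:
#   primal (min): min 2x  s.t.  x >= 3,  x free
#   table rows:   ">= b" constraint -> dual variable y >= 0,
#                 free variable x   -> dual constraint "= c"
#   dual (max):   max 3y  s.t.  y = 2,  y >= 0
p_star = 2 * 3      # primal optimum at the active constraint x = 3
d_star = 3 * 2      # dual: y = 2 is the only feasible point
assert p_star == d_star == 6    # strong LP duality
```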
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
15 / 37
Complementary Slackness
If strong duality holds, x⋆ is primal optimal, and λ⋆, ν⋆ are dual optimal, then

f0(x⋆) = g(λ⋆, ν⋆)
       = inf_x ( f0(x) + ∑_{i=1}^m λ⋆i fi(x) + ∑_{i=1}^p ν⋆i hi(x) )
       ≤ f0(x⋆) + ∑_{i=1}^m λ⋆i fi(x⋆) + ∑_{i=1}^p ν⋆i hi(x⋆)
       ≤ f0(x⋆)

Therefore the two inequalities above hold with equality, and
  x⋆ minimizes L(x, λ⋆, ν⋆)
  λ⋆i fi(x⋆) = 0 for i = 1, …, m

This is known as complementary slackness:

λ⋆i > 0 ⇒ fi(x⋆) = 0,    fi(x⋆) < 0 ⇒ λ⋆i = 0
16 / 37
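Complementary slackness can be illustrated on an assumed family of problems: min x² s.t. a − x ≤ 0 (i.e. x ≥ a). For a = 1 the constraint binds and its multiplier is positive; for a = −1 the constraint is slack and the multiplier is zero.

```python
# Assumed family: min x^2  s.t.  a - x <= 0.
def solve(a):
    x_star = max(a, 0.0)       # project the unconstrained minimum 0 onto x >= a
    lam_star = 2 * x_star      # stationarity 2x - lam = 0 when the constraint binds
    return x_star, lam_star

x1, l1 = solve(1.0)            # a = 1: constraint active
assert l1 > 0 and 1.0 - x1 == 0.0      # lam* > 0  =>  f1(x*) = 0
x2, l2 = solve(-1.0)           # a = -1: constraint slack
assert -1.0 - x2 < 0 and l2 == 0.0     # f1(x*) < 0  =>  lam* = 0
```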
KKT Conditions
KKT conditions for a problem with differentiable fi, hi:
  Primal constraints: fi(x) ≤ 0, i = 1, …, m; hi(x) = 0, i = 1, …, p
  Dual constraints: λ ≥ 0
  Complementary slackness: λi fi(x) = 0, i = 1, …, m
  Gradient of the Lagrangian function with respect to x vanishes:

∇f0(x) + ∑_{i=1}^m λi ∇fi(x) + ∑_{i=1}^p νi ∇hi(x) = 0

From the previous slide: if strong duality holds and x, λ, ν are optimal, then they must satisfy the KKT conditions
17 / 37
KKT Conditions for Convex Problem
If x, λ, ν satisfy the KKT conditions for a convex problem, then they are optimal:
  From complementary slackness: f0(x) = L(x, λ, ν)
  From the 4th condition (and convexity): g(λ, ν) = L(x, λ, ν)
Hence f0(x) = g(λ, ν), and by weak duality x is primal optimal and (λ, ν) is dual optimal
18 / 37
KKT Conditions of Maximization with Linear Constraints
Consider a maximization problem with linear constraints:

max  f(x)
s.t. Cx = d   (µ)
     Ax ≤ b   (λ)
     x ≥ 0    (λ2)

Then the KKT conditions have the following form:

Cx − d = 0
0 ≤ λ ⊥ Ax − b ≤ 0
0 ≤ x ⊥ λᵀA + µᵀC − ∇f(x)ᵀ ≥ 0
19 / 37
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
20 / 37
Perturbed Problem
Unperturbed optimization problem and its dual:

min { f0(x) : fi(x) ≤ 0, i = 1, …, m, hi(x) = 0, i = 1, …, p }
max { g(λ, ν) : λ ≥ 0 }

Perturbed problem and its dual:

min { f0(x) : fi(x) ≤ ui, i = 1, …, m, hi(x) = vi, i = 1, …, p }
max { g(λ, ν) − uᵀλ − vᵀν : λ ≥ 0 }

x is the primal variable; u, v are parameters
p⋆(u, v) is the optimal value as a function of u, v
We are interested in information about p⋆(u, v) that we can obtain from the solution of the unperturbed problem and its dual.
21 / 37
Global Sensitivity Result
Assume strong duality holds for the unperturbed problem, and that λ⋆, ν⋆ are dual optimal for the unperturbed problem. Then

p⋆(u, v) ≥ g(λ⋆, ν⋆) − uᵀλ⋆ − vᵀν⋆
         = p⋆(0, 0) − uᵀλ⋆ − vᵀν⋆

Sensitivity interpretation:
  If λ⋆i is large, p⋆ increases greatly if we tighten constraint i (ui < 0)
  If λ⋆i is small, p⋆ does not decrease greatly if we loosen constraint i (ui > 0)
  If ν⋆i is large and positive, p⋆ increases greatly if vi < 0; if ν⋆i is large and negative, p⋆ increases greatly if vi > 0
  If ν⋆i is small and positive, p⋆ does not decrease much if vi > 0; if ν⋆i is small and negative, p⋆ does not decrease much if vi < 0
22 / 37
Local Sensitivity
If (in addition) p⋆(u, v) is differentiable at (0, 0), then

λ⋆i = −∂p⋆(0,0)/∂ui,    ν⋆i = −∂p⋆(0,0)/∂vi

Proof (for λ⋆i): from the global sensitivity result,

∂p⋆(0,0)/∂ui = lim_{t↓0} [p⋆(t·ei, 0) − p⋆(0,0)] / t ≥ −λ⋆i
∂p⋆(0,0)/∂ui = lim_{t↑0} [p⋆(t·ei, 0) − p⋆(0,0)] / t ≤ −λ⋆i

hence, equality
23 / 37
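The local sensitivity result can be checked by finite differences on an assumed example (not from the slides): min x² s.t. 1 − x ≤ u, i.e. x ≥ 1 − u, so p⋆(u) = (1 − u)² for u ≤ 1, and the optimal multiplier at u = 0 is λ⋆ = 2.

```python
# Assumed example: min x^2  s.t.  1 - x <= u  =>  p*(u) = (1 - u)^2 for u <= 1.

def p_star(u):
    # closed-form optimal value of the perturbed problem
    return (1.0 - u) ** 2

t = 1e-6
deriv = (p_star(t) - p_star(-t)) / (2 * t)   # central difference at u = 0
lam_star = 2.0
assert abs(deriv - (-lam_star)) < 1e-4       # lam* = -dp*(0,0)/du
```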
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
24 / 37
Duality and Problem Reformulations
Equivalent formulations of a problem can lead to very different duals
Reformulating the primal problem can be useful when the dual is difficult to derive, or uninteresting
Common reformulations:
  Introduce new variables and equality constraints (we have seen this already)
Rearrange constraints in subproblems
25 / 37
Rearranging Constraints
How would you relax the following:
min  Eω[f1(x(ω)) + f2(y(ω), ω)]
s.t. g1i(x(ω)) ≤ 0, i = 1, …, m1
     g2i(x(ω), y(ω), ω) ≤ 0, i = 1, …, m2
     x(ω) = x
Is this a good idea?
26 / 37
A Two-Stage Stochastic Integer Program [Sen, 2000]
Scenario | Constraints                    | Binary solutions
ω = 1    | 2x1 + y1 ≤ 2 and 2x1 − y1 ≥ 0  | D1 = {(0,0), (1,0)}
ω = 2    | x2 − y2 ≥ 0                    | D2 = {(0,0), (1,0), (1,1)}
ω = 3    | x3 + y3 ≤ 1                    | D3 = {(0,0), (0,1), (1,0)}

3 equally likely scenarios
Define x as the first-stage decision; xω is the first-stage decision for scenario ω
Non-anticipativity constraint: x1 = (1/3)(x1 + x2 + x3)
27 / 37
Formulation of Problem and Dual Function
max  (1/3)y1 + (1/3)y2 + (1/3)y3
s.t. 2x1 + y1 ≤ 2
     2x1 − y1 ≥ 0
     x2 − y2 ≥ 0
     x3 + y3 ≤ 1
     (2/3)x1 − (1/3)x2 − (1/3)x3 = 0   (λ)
     xω, yω ∈ {0, 1}, ω ∈ Ω = {1, 2, 3}

g(λ) = max_{(xω,yω)∈Dω} (2λ/3)x1 + (1/3)y1 − (λ/3)x2 + (1/3)y2 − (λ/3)x3 + (1/3)y3
     = max(0, 2λ/3) + max(0, (1 − λ)/3) + max(1/3, −λ/3)
28 / 37
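The closed form of g(λ) can be verified by enumerating the scenario sets Dω from the slide, using exact rational arithmetic:

```python
from fractions import Fraction as F

# Scenario sets from the slide
D1 = [(0, 0), (1, 0)]
D2 = [(0, 0), (1, 0), (1, 1)]
D3 = [(0, 0), (0, 1), (1, 0)]

def g(lam):
    # dual function by brute-force maximization over each D_omega
    lam = F(lam)
    t1 = max(F(2, 3) * lam * x + F(1, 3) * y for x, y in D1)
    t2 = max(-lam / 3 * x + F(1, 3) * y for x, y in D2)
    t3 = max(-lam / 3 * x + F(1, 3) * y for x, y in D3)
    return t1 + t2 + t3

# matches the closed form max(0, 2λ/3) + max(0, (1−λ)/3) + max(1/3, −λ/3)
for lam in [-2, -1, F(-1, 2), 0, F(1, 2), 1, 2]:
    closed = (max(0, F(2, 3) * F(lam)) + max(0, (1 - F(lam)) / 3)
              + max(F(1, 3), -F(lam) / 3))
    assert g(lam) == closed
```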
Duality Gap
The dual function can be expressed equivalently as

g(λ) = 1/3 − (2/3)λ   for λ ≤ −1
g(λ) = 2/3 − (1/3)λ   for −1 ≤ λ ≤ 0
g(λ) = 2/3 + (1/3)λ   for 0 ≤ λ ≤ 1
g(λ) = 1/3 + (2/3)λ   for 1 ≤ λ

Primal optimal value p⋆ = 1/3 with x⋆1 = x⋆2 = x⋆3 = 1, y⋆1 = 0, y⋆2 = 1, y⋆3 = 0
Dual optimal value d⋆ = 2/3 at λ⋆ = 0
Conclusion: we have a duality gap of 1/3
29 / 37
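Both optimal values can be confirmed by brute force: enumerate all scenario choices for the primal (keeping only those satisfying non-anticipativity), and minimize the piecewise-linear g(λ) over a grid for the dual.

```python
from fractions import Fraction as F
from itertools import product

# Scenario sets from the slide
D1 = [(0, 0), (1, 0)]
D2 = [(0, 0), (1, 0), (1, 1)]
D3 = [(0, 0), (0, 1), (1, 0)]

# primal: enumerate scenario choices satisfying non-anticipativity x1 = x2 = x3
p_star = max(F(y1 + y2 + y3, 3)
             for (x1, y1), (x2, y2), (x3, y3) in product(D1, D2, D3)
             if x1 == x2 == x3)
assert p_star == F(1, 3)

# dual: minimize the piecewise-linear g(lambda) over a grid around 0
def g(lam):
    return (max(0, F(2, 3) * lam) + max(0, (1 - lam) / 3)
            + max(F(1, 3), -lam / 3))

d_star = min(g(F(k, 10)) for k in range(-30, 31))
assert d_star == F(2, 3)     # attained at lambda = 0; gap = 1/3
```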
Alternative Relaxation
Add a new explicit first-stage decision variable x, with the following non-anticipativity constraints:

x1 = x   (λ1)
x2 = x   (λ2)
x3 = x   (λ3)

Dual function:

g(λ) = max_{(x1,y1)∈D1} ((1/3)y1 − λ1x1) + max_{(x2,y2)∈D2} ((1/3)y2 − λ2x2)
       + max_{(x3,y3)∈D3} ((1/3)y3 − λ3x3) + max_{x∈{0,1}} (λ1 + λ2 + λ3)x
     = max(0, −λ1) + max(0, 1/3 − λ2) + max(1/3, −λ3) + max(0, λ1 + λ2 + λ3)
30 / 37
Closing the Duality Gap
Since we know the primal optimal solution, we have a hint of what the λi need to be:

λ1 + λ2 + λ3 ≥ 0   (because we should have x⋆ = 1)
−λ1 ≥ 0            (because we should have x⋆1 = 1)
λ2 ≤ 1/3           (because we should have x⋆2 = 1)
−λ3 ≥ 1/3          (because we should have x⋆3 = 1)

Choosing λ1 = 0, λ2 = 1/3 and λ3 = −1/3, we satisfy these inequalities. Dual function:

g(0, 1/3, −1/3) = 0 + 0 + 1/3 + 0 = 1/3
31 / 37
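The alternative-relaxation dual can likewise be evaluated by enumeration, confirming that the chosen multipliers close the gap:

```python
from fractions import Fraction as F

# Scenario sets from the slide
D1 = [(0, 0), (1, 0)]
D2 = [(0, 0), (1, 0), (1, 1)]
D3 = [(0, 0), (0, 1), (1, 0)]

def g(l1, l2, l3):
    # dual of the relaxation x_omega = x, one multiplier per scenario
    t1 = max(F(1, 3) * y - l1 * x for x, y in D1)
    t2 = max(F(1, 3) * y - l2 * x for x, y in D2)
    t3 = max(F(1, 3) * y - l3 * x for x, y in D3)
    t0 = max((l1 + l2 + l3) * x for x in (0, 1))
    return t1 + t2 + t3 + t0

# with the multipliers chosen on the slide, the bound equals p* = 1/3
assert g(F(0), F(1, 3), F(-1, 3)) == F(1, 3)
# at lambda = 0 this relaxation still gives the weaker bound 2/3
assert g(F(0), F(0), F(0)) == F(2, 3)
```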
Conclusions of Example
Different relaxations can result in different duality gaps
Computational trade-off: introducing more Lagrange multipliers results in better bounds but a larger search space
32 / 37
Table of Contents
1 Lagrange Dual Problem
2 Weak and Strong Duality
3 Optimality Conditions
4 Perturbation and Sensitivity Analysis
5 Examples
6 Dual Multipliers in AMPL
33 / 37
Non-Uniqueness of KKT Conditions
1 The KKT conditions of a problem depend on how we define the Lagrangian function
2 The sign of dual multipliers depends on the KKT conditions (therefore, on how we define the Lagrangian function)
3 The sensitivity interpretation of dual multipliers depends on the KKT conditions (therefore, on how we define the Lagrangian function)
4 Different software interprets user syntax differently!
34 / 37
Dual Multipliers in AMPL
In order to anticipate the sign of the multipliers that AMPL will assign to constraints, note that:
  a constraint of the form f1(x) {≤, =, ≥} f2(x) is equivalently expressed as f1(x) − f2(x) {≤, =, ≥} 0,
  the constraints are relaxed by subtracting their product with their corresponding multiplier from the Lagrangian function,
  the sign of the dual multiplier is such that the Lagrangian function provides a bound to the optimization problem,
  the primal-dual optimal pair is such that the KKT conditions corresponding to this Lagrangian function are satisfied.
In this way, the dual multipliers reported by AMPL can always be interpreted as sensitivities.
35 / 37
Example
min x + 2y   s.t.   0 ≤ x (λ1),   x ≤ 2 (λ2),   y = 1 (µ)

Objective function f(x, y) = x + 2y; inequality constraints g1(x, y) = −x ≤ 0 (i.e., a ≤ constraint) and g2(x, y) = x − 2 ≤ 0; equality constraint h(x, y) = y − 1 = 0

AMPL Lagrangian:
L(x, y, λ, µ) = (x + 2y) − λ1(−x) − λ2(x − 2) − µ(y − 1)
36 / 37
AMPL KKT Conditions
KKT conditions:
Primal feasibility: g1(x, y) ≤ 0, g2(x, y) ≤ 0, h(x, y) = 0
Dual feasibility: λ1 ≤ 0, λ2 ≤ 0
Complementarity: λ1 ⊥ g1(x, y), λ2 ⊥ g2(x, y)
Stationarity:
∇f(x, y) − λ1∇g1(x, y) − λ2∇g2(x, y) − µ∇h(x, y) = 0
Solution: x = 0, y = 1, λ1 = −1, λ2 = 0, µ = 2
37 / 37
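The reported primal-dual pair can be checked directly against the AMPL-sign KKT conditions above:

```python
# Solution reported on the slide
x, y = 0.0, 1.0
lam1, lam2, mu = -1.0, 0.0, 2.0

g1 = -x          # -x <= 0      (from 0 <= x)
g2 = x - 2       # x - 2 <= 0   (from x <= 2)
h = y - 1        # y - 1 = 0    (from y = 1)

assert g1 <= 0 and g2 <= 0 and h == 0        # primal feasibility
assert lam1 <= 0 and lam2 <= 0               # AMPL sign convention
assert lam1 * g1 == 0 and lam2 * g2 == 0     # complementarity
# stationarity: grad f - lam1*grad g1 - lam2*grad g2 - mu*grad h = 0
grad = (1 - lam1 * (-1) - lam2 * 1 - mu * 0,
        2 - lam1 * 0 - lam2 * 0 - mu * 1)
assert grad == (0.0, 0.0)
```

Note λ1 = −1 is negative under AMPL's convention: loosening the binding bound 0 ≤ x decreases the optimal value, consistent with the sensitivity interpretation.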