Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford...
Transcript of Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford...
![Page 1: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/1.jpg)
Analytic Center Cutting-Plane Method
• analytic center cutting-plane method
• computing the analytic center
• pruning constraints
• lower bound and stopping criterion
Prof. S. Boyd, EE364b, Stanford University
![Page 2: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/2.jpg)
Analytic center cutting-plane method
analytic center of polyhedron P = {z | aTi z � bi, i = 1, . . . ,m} is
AC(P) = argminz
−m
∑
i=1
log(bi − aTi z)
ACCPM is localization method with next query point x(k+1) = AC(Pk)(found by Newton’s method)
Prof. S. Boyd, EE364b, Stanford University 1
![Page 3: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/3.jpg)
ACCPM algorithm
given an initial polyhedron P0 known to contain X .
k := 0.repeat
Compute x(k+1) = AC(Pk).Query cutting-plane oracle at x(k+1).If x(k+1) ∈ X , quit.Else, add returned cutting-plane inequality to P.Pk+1 := Pk ∩ {z | aTz ≤ b}
If Pk+1 = ∅, quit.k := k + 1.
Prof. S. Boyd, EE364b, Stanford University 2
![Page 4: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/4.jpg)
Constructing cutting-planes
minimize f0(x)subject to fi(x) ≤ 0, i = 1, . . . , m
f0, . . . , fm : Rn → R convex; X is set of optimal points; p⋆ is optimal value
• if x is not feasible, say fj(x) > 0, we have (deep) feasibility cut
fj(x) + gTj (z − x) ≤ 0, gj ∈ ∂fj(x)
• if x is feasible, we have (deep) objective cut
gT0 (z − x) + f0(x) − f
(k)best ≤ 0, g0 ∈ ∂f0(x)
Prof. S. Boyd, EE364b, Stanford University 3
![Page 5: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/5.jpg)
Computing the analytic center
we must solve the problem
minimize Φ(x) = −∑m
i=1 log(bi − aTi x)
where domΦ = {x | aTi x < bi, i = 1, . . . , m}
• challenge: we are not given a point in domΦ
• some options:
– use phase I method to find a point in domΦ (or determine thatdomΦ = ∅); then use standard Newton method to compute AC
– use infeasible start Newton method starting from a point outsidedomΦ
Prof. S. Boyd, EE364b, Stanford University 4
![Page 6: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/6.jpg)
Infeasible start Newton method
minimize −∑m
i=1 log yi
subject to y = b − Ax
with variables x and y
• can be started from any x and any y ≻ 0
• e.g.: take initial x as previous point xprev, and choose y as
yi =
{
bi − aTi x bi − aT
i x > 01 otherwise
Prof. S. Boyd, EE364b, Stanford University 5
![Page 7: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/7.jpg)
• define primal and dual residuals as
rp = y + Ax − b, rd =
[
ATνg + ν
]
where g = −diag(1/yi)1 is gradient of objective and r = (rd, rp)
• Newton step at (x, y, ν) is defined by
0 0 AT
0 H IA I 0
∆x∆y∆ν
= −
[
rd
rp
]
,
where H = diag(1/y2i ) is Hessian of the objective
Prof. S. Boyd, EE364b, Stanford University 6
![Page 8: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/8.jpg)
• solve this system by block elimination
∆x = −(ATHA)−1(ATg − ATHrp)
∆y = −A∆x − rp
∆ν = −H∆y − g − ν
• options for computing ∆x:
– form ATHA, then use dense or sparse Cholesky factorization– solve (diagonally scaled) least-squares problem
∆x = argminz
∥
∥
∥H1/2Az − H1/2rp + H−1/2g
∥
∥
∥
2
– use iterative method such as conjugate gradients to (approximately)solve for ∆x
Prof. S. Boyd, EE364b, Stanford University 7
![Page 9: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/9.jpg)
Infeasible start Newton method algorithm
given starting point x, y ≻ 0, tolerance ǫ > 0, α ∈ (0, 1/2), β ∈ (0, 1).
ν := 0.
repeat
1. Compute Newton step (∆x, ∆y, ∆ν) by block elimination.2. Backtracking line search on ‖r‖2.
t := 1.while y + t∆y 6≻ 0, t := βt.while ‖r(x + t∆x, y + t∆y, ν + t∆ν)‖2 > (1 − αt)‖r(x, y, ν)‖2,
t := βt.3. Update. x := x + t∆x, y := y + t∆y, ν := ν + t∆ν.
until y = b − Ax and ‖r(x, y, ν)‖2 ≤ ǫ.
Prof. S. Boyd, EE364b, Stanford University 8
![Page 10: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/10.jpg)
Properties
• once any equality constraint is satisfied, it remains satisfied for all futureiterates
• once a step size t = 1 is taken, all equality constraints are satisfied
• if domΦ 6= ∅, t = 1 occurs in finite number of steps
• if domΦ = ∅, algorithm never converges
Prof. S. Boyd, EE364b, Stanford University 9
![Page 11: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/11.jpg)
Pruning constraints
• let x∗ be analytic center of P = {z | aTi z � bi, i = 1, . . . ,m}
• let H∗ be Hessian of barrier at x∗,
H∗ = −∇2m
∑
i=1
log(bi − aTi z)
∣
∣
∣
∣
∣
z=x∗
=
m∑
i=1
aiaTi
(bi − aTi x∗)2
• then, P ⊆ E = {z | (z − x∗)TH∗(z − x∗) ≤ m2}
define (ir)relevance measure ηi =bi − aT
i x∗
√
aTi H∗−1ai
Prof. S. Boyd, EE364b, Stanford University 10
![Page 12: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/12.jpg)
• ηi/m is normalized distance from hyperplane aTi x = bi to outer ellipsoid
• if ηi ≥ m, then constraint aTi x ≤ bi is redundant
common ACCPM constraint dropping schemes:
• drop all constraints with ηi ≥ m (guaranteed to not change P)
• drop constraints in order of irrelevance, keeping constant number,usually 3n – 5n
Prof. S. Boyd, EE364b, Stanford University 11
![Page 13: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/13.jpg)
PWL lower bound on convex function
• suppose f is convex, and g(i) ∈ ∂f(x(i)), i = 1, . . . , m
• then we have
f̂(z) = maxi=1,...,m
(
f(x(i)) + g(i)T (z − x(i)))
≤ f(z)
• f̂ is PWL lower bound on f
Prof. S. Boyd, EE364b, Stanford University 12
![Page 14: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/14.jpg)
Lower bound in ACCPM
• in solving convex problem
minimize f0(x)subject to f1(x) ≤ 0,
Cx � d
(by taking max of constraint functions we can assume there is only one)
• we have evaluated f0 and subgradient g0 at x(1), . . . , x(q)
• we have evaluated f1 and subgradient g1 at x(q+1), . . . , x(k)
• form piecewise-linear approximations f̂0, f̂1
Prof. S. Boyd, EE364b, Stanford University 13
![Page 15: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/15.jpg)
• form PWL relaxed problem
minimize f̂0(x)
subject to f̂1(x) ≤ 0,Cx � d
(can be solved via LP)
• optimal value is a lower bound on p⋆
• can easily construct a lower bound on the PWL relaxed problem, as aby-product of the analytic centering computation
• this, in turn, gives a lower bound on the original problem
Prof. S. Boyd, EE364b, Stanford University 14
![Page 16: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/16.jpg)
• form dual of PWL relaxed problem
maximize∑q
i=1 λi(f0(x(i)) − g
(i)T0 x(i))
+∑k
i=q+1 λi(f1(x(i)) − g
(i)T1 x(i)) − dTµ
subject to∑q
i=1 λig(i)0 +
∑ki=q+1 λig
(i)1 + CTµ = 0
µ � 0, λ � 0,∑q
i=1 λi = 1,
• optimality condition for x(k+1)
q∑
i=1
g(i)0
f(i)best − f0(x(i)) − g
(i)T0 (x(k+1) − x(i))
+
k∑
i=q+1
g(i)1
−f1(x(i)) − g(i)T1 (x(k+1) − x(i))
+m
∑
i=1
ci
di − cTi x(k+1)
= 0.
Prof. S. Boyd, EE364b, Stanford University 15
![Page 17: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/17.jpg)
• take τi = 1/(f(i)best − f0(x
(i)) − g(i)T0 (x(k+1) − x(i))) for i = 1, . . . , q.
• construct a dual feasible point by taking
λi =
{
τi/1T τ for i = 1, . . . , q
1/(−f1(x(i)) − g
(i)T1 (x(k+1) − x(i)))(1T τ) for i = q + 1, . . . , k,
µi = 1/(di − cTi x(k+1))(1T τ) i = 1, . . . ,m.
• using these values of λ and µ, we conclude that
p⋆ ≥ l(k+1),
where l(k+1) =∑q
i=1 λi(f0(x(i)) − g
(i)T0 x(i)) +
∑ki=q+1 λi(f1(x
(i)) − g(i)T1 x(i)) − dTµ.
Prof. S. Boyd, EE364b, Stanford University 16
![Page 18: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/18.jpg)
Stopping criterion
since ACCPM isn’t a descent method, we keep track of best point found,and best lower bound
• best function value so far: f(k)best = min
i=1,...,kf0(x
(k))
• best lower bound so far: l(k)best = max
i=1,...,kl(x(k))
• can stop when f(k)best − l
(k)best ≤ ǫ
• guaranteed to be ǫ-suboptimal
Prof. S. Boyd, EE364b, Stanford University 17
![Page 19: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/19.jpg)
Example: Piecewise linear minimization
problem instance with n = 20 variables, m = 100 terms, f⋆ ≈ 1.1
0 50 100 150 20010
−4
10−3
10−2
10−1
100
101
k
f(x
(k) )−
f⋆
Prof. S. Boyd, EE364b, Stanford University 18
![Page 20: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/20.jpg)
0 50 100 150 20010
−4
10−3
10−2
10−1
100
101
k
f(k
)best−
f⋆
Prof. S. Boyd, EE364b, Stanford University 19
![Page 21: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/21.jpg)
0 500 1000 1500 2000 250010
−4
10−3
10−2
10−1
100
101
Newton step
f best−
f⋆
Prof. S. Boyd, EE364b, Stanford University 20
![Page 22: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/22.jpg)
0 50 100 150 20010
−4
10−2
100
102
104
k
f(k)best − f⋆
f(k)best − l
(k)best
Prof. S. Boyd, EE364b, Stanford University 21
![Page 23: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/23.jpg)
ACCPM with constraint dropping
PWL objective, n = 20 variables, m = 100 terms
0 50 100 150 20010
−4
10−3
10−2
10−1
100
101
k
f(k
)best−
f⋆
no droppingkeeping 3n
Prof. S. Boyd, EE364b, Stanford University 22
![Page 24: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/24.jpg)
number of inequalities in P:
0 50 100 150 200 2500
50
100
150
200
250
k
no droppingkeeping 3n
Prof. S. Boyd, EE364b, Stanford University 23
![Page 25: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/25.jpg)
accuracy versus approximate cumulative flop count
0 50 100 150 20010
−4
10−3
10−2
10−1
100
101
Megaflops
f(k
)best−
f⋆
no droppingkeeping 3n
Prof. S. Boyd, EE364b, Stanford University 24
![Page 26: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/26.jpg)
Epigraph ACCPM
PWL objective, n = 20 variables, m = 100 terms
0 10 20 30 40 5010
−4
10−3
10−2
10−1
100
101
k
f(k
)−
f⋆
Prof. S. Boyd, EE364b, Stanford University 25
![Page 27: Analytic Center Cutting-Plane MethodNewton step f best − f ⋆ Prof. S. Boyd, EE364b, Stanford University 20. 0 50 100 150 200 10−4 10−2 100 102 104 k f(k) best − f ...](https://reader034.fdocuments.net/reader034/viewer/2022050221/5f666fc3063d1c4ef97ed27a/html5/thumbnails/27.jpg)
0 200 400 600 800 1000 120010
−4
10−3
10−2
10−1
100
101
Newton iterations
f(k
)best−
f⋆
Prof. S. Boyd, EE364b, Stanford University 26