Optimization

Page 1

Optimization

Page 2

Issues

What is optimization?

What real-life situations give rise to optimization problems?

When is it easy to optimize?

What are we trying to optimize?

What can cause problems when we try to optimize?

What methods can we use to optimize?

Page 3

One-Dimensional Minimization

Golden section search

Brent’s method

Page 4

One-Dimensional Minimization

Golden section search: successively narrowing a bracket of lower and upper bounds around the minimum

Terminating condition: $|x_3 - x_1| < \varepsilon$

Start with $x_1, x_2, x_3$ where $f_2$ is smaller than $f_1$ and $f_3$.

Iteration: choose $x_4$ somewhere in the larger interval.

Two cases for $f_4$:

$f_{4a}$ (above $f_2$): new bracket $[x_1, x_2, x_4]$

$f_{4b}$ (below $f_2$): new bracket $[x_2, x_4, x_3]$

Initial bracketing…

$\min_{x \in \mathbb{R}} f(x)$
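A minimal runnable sketch of this iteration (the bracket variables follow the slide's x1…x4 naming; the tolerance argument is illustrative):

```python
import math

def golden_section_minimize(f, x1, x3, tol=1e-8):
    """Minimize a unimodal f on [x1, x3] by golden-section search."""
    invphi = (math.sqrt(5) - 1) / 2           # 1/phi ~ 0.618
    x2 = x3 - invphi * (x3 - x1)              # interior points placed at
    x4 = x1 + invphi * (x3 - x1)              # the golden-section ratio
    f2, f4 = f(x2), f(x4)
    while abs(x3 - x1) > tol:                 # terminating condition |x3 - x1| < eps
        if f2 < f4:                           # minimum bracketed in [x1, x2, x4]
            x3, x4, f4 = x4, x2, f2
            x2 = x3 - invphi * (x3 - x1)
            f2 = f(x2)
        else:                                 # minimum bracketed in [x2, x4, x3]
            x1, x2, f2 = x2, x4, f4
            x4 = x1 + invphi * (x3 - x1)
            f4 = f(x4)
    return (x1 + x3) / 2

# Example: the minimum of (x - 2)^2 is at x = 2
print(golden_section_minimize(lambda x: (x - 2)**2, 0.0, 5.0))
```

Because the interior points sit at the golden ratio, one of the old points can always be reused, so each iteration costs a single new function evaluation.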

Page 5

A lower bound a, an upper bound b, and an initial estimate x, with f(a) > f(x) < f(b). This condition guarantees that a minimum is contained somewhere within the interval.

On each iteration a new point x' is selected using one of the available algorithms. If the new point is a better estimate of the minimum, i.e. f(x') < f(x), then the current estimate of the minimum x is updated.

The new point also allows the size of the bounded interval to be reduced, by choosing the most compact set of points which satisfies the constraint f(a) > f(x) < f(b).

The interval is reduced until it encloses the true minimum to a desired tolerance. This provides a best estimate of the location of the minimum and a rigorous error estimate.

From GSL
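A sketch of just this bracket-update rule (the helper name update_bracket and the proposal point x_new are illustrative; a real implementation would cache the f values instead of re-evaluating them):

```python
def update_bracket(f, a, x, b, x_new):
    """One bracketing step: keep the most compact triplet
    (a, x, b) satisfying f(a) > f(x) < f(b)."""
    if f(x_new) < f(x):
        # Better estimate: x_new becomes the center, x becomes a bound.
        return (a, x_new, x) if x_new < x else (x, x_new, b)
    else:
        # x stays the center; x_new tightens the bound on its side.
        return (x_new, x, b) if x_new < x else (a, x, x_new)
```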

Page 6

Golden Section Search

Split the bracket into segments a and b (a > b, total length c = a + b). Requiring the bracket to shrink by the same proportion on every iteration gives
$$\frac{a}{b} = \frac{a+b}{a} \quad\Longrightarrow\quad \frac{a}{b} = \frac{1+\sqrt{5}}{2} \approx 1.618.$$

Guaranteed linear convergence: $[x_1, x_3] / [x_1, x_4] = 1.618$

[GSL] Choosing the golden section as the bisection ratio can be shown to provide the fastest convergence for this type of algorithm.

Page 7

Golden Section (reference)

Page 8

Fibonacci Search (ref)

Fibonacci numbers $F_i$: 0, 1, 1, 2, 3, 5, 8, 13, …

Related: the ratio of successive Fibonacci numbers converges to the golden section,
$$\lim_{k \to \infty} \frac{F_{k+1}}{F_k} = \frac{1+\sqrt{5}}{2}.$$
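A quick numerical check of that limit:

```python
a, b = 0, 1
for _ in range(20):
    a, b = b, a + b
print(b / a)   # -> 1.6180339985..., approaching (1 + 5**0.5) / 2
```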

Page 9

Parabolic Interpolation (Brent)

Page 10

Brent Details (From GSL)

A parabola is fitted through the three bracketing points; the minimum of the parabola is taken as a guess for the function's minimum.

If it lies within the bounds of the current interval then the interpolating point is accepted, and used to generate a smaller interval.

If the interpolating point is not accepted then the algorithm falls back to an ordinary golden section step.

The full details of Brent's method include some additional checks to improve convergence.

Page 11

Brent (details)

The abscissa x that is the minimum of a parabola through three points (a, f(a)), (b, f(b)), (c, f(c)):
$$x = b - \frac{1}{2}\,\frac{(b-a)^2\,[f(b)-f(c)] - (b-c)^2\,[f(b)-f(a)]}{(b-a)\,[f(b)-f(c)] - (b-c)\,[f(b)-f(a)]}$$
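As a sketch in code (this is the standard three-point formula, e.g. Numerical Recipes eq. 10.2.1):

```python
def parabola_min(a, fa, b, fb, c, fc):
    """Abscissa of the minimum of the parabola through
    (a, fa), (b, fb), (c, fc)."""
    p = (b - a)**2 * (fb - fc) - (b - c)**2 * (fb - fa)
    q = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    return b - 0.5 * p / q   # q == 0 when the three points are collinear

# Example: points on y = (x - 1)^2 recover the minimum at x = 1
print(parabola_min(0.0, 1.0, 0.5, 0.25, 2.0, 1.0))
```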

Page 12

Multi-Dimensional Minimization

Gradient Descent

Conjugate Gradient

Page 13

Gradient and Hessian

Objective function $f: \mathbb{R}^n \to \mathbb{R}$, assumed to be of class $C^2$.

Gradient of f: $\nabla f = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)^T$

Hessian of f: $(\nabla^2 f)_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}$
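Where analytic derivatives are unavailable, both can be approximated by finite differences; a sketch (the step sizes h are illustrative):

```python
import numpy as np

def gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hessian(f, x, h=1e-4):
    """Finite-difference approximation of the Hessian of f at x."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        H[:, i] = (gradient(f, x + e, h) - gradient(f, x - e, h)) / (2 * h)
    return H

print(gradient(lambda v: v[0]**2 + 3*v[1], np.array([1.0, 2.0])))  # -> [2, 3]
```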

Page 14

Optimality

For one-dimensional f(x): x* is a local minimum when f′(x*) = 0 and f″(x*) ≥ 0.

Taylor's expansion:
$$f(x^* + \delta) = f(x^*) + \nabla f(x^*)^T \delta + \tfrac{1}{2}\,\delta^T \nabla^2 f(x^*)\,\delta + O(\|\delta\|^3)$$

At a minimum the gradient vanishes and the Hessian is positive semi-definite, so the quadratic term cannot decrease f.

Page 15

Multi-Dimensional Optimization

$f: \mathbb{R}^n \to \mathbb{R}$

Critical points: $\nabla f(x) = 0$

Higher-dimensional root finding (solving $\nabla f = 0$) is no easier (if anything, more difficult) than minimization itself.

Page 16

Quasi-Newton Method

Taylor's series of f(x) around $x_k$:
$$f(x) \approx f(x_k) + \nabla f(x_k)^T (x - x_k) + \tfrac{1}{2}\,(x - x_k)^T B\,(x - x_k)$$

B: an approximation to the Hessian matrix.

The gradient of this approximation:
$$\nabla f(x) \approx \nabla f(x_k) + B\,(x - x_k)$$

Setting this gradient to zero provides the Newton step:
$$x_{k+1} = x_k - B^{-1} \nabla f(x_k)$$

The various quasi-Newton methods (DFP, BFGS, Broyden) differ in their choice of the update for B.
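A sketch of the resulting step together with the BFGS update for B (one standard choice; line-search safeguards are omitted):

```python
import numpy as np

def newton_step(B, x, grad):
    """Quasi-Newton step x' = x - B^{-1} grad(x), using approximation B."""
    return x - np.linalg.solve(B, grad(x))

def bfgs_update(B, s, y):
    """BFGS update of B, given step s = x' - x and y = grad(x') - grad(x)."""
    Bs = B @ s
    return B + np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)

# Usage: x_new = newton_step(B, x, grad)
#        B = bfgs_update(B, x_new - x, grad(x_new) - grad(x))
```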

Page 17

Gradient Descent

Are successive search directions always orthogonal? Yes: an exact line search stops where the directional derivative vanishes, i.e. where the new gradient is perpendicular to the current direction.
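A sketch demonstrating this on a quadratic $f(x) = \tfrac{1}{2}x^T A x - b^T x$, where the exact line-search step along $-g$ is $\alpha = (g^T g)/(g^T A g)$ (the matrix A below is illustrative):

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # symmetric positive definite
b = np.array([1.0, 1.0])
x = np.zeros(2)

g_prev = None
for _ in range(5):
    g = A @ x - b                    # gradient of f(x) = 0.5 x^T A x - b^T x
    alpha = (g @ g) / (g @ A @ g)    # exact line search along -g
    x = x - alpha * g
    if g_prev is not None:
        print("g_k . g_{k-1} =", g @ g_prev)   # ~0: successive directions orthogonal
    g_prev = g
```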

Page 18

Example: minimize

[Figure: path of iterates toward the minimum]

Page 19

Page 20

Gradient is perpendicular to level curves and surfaces

(Proof: along a level curve γ(t), f(γ(t)) is constant, so 0 = d/dt f(γ(t)) = ∇f · γ′(t); hence ∇f is orthogonal to every tangent direction.)

Page 21

Weakness of Gradient Descent

Narrow valley: steepest descent zig-zags across the valley floor, taking many short orthogonal steps, so convergence is slow.

Page 22

Any function f(x) can be locally approximated by a quadratic function,
$$f(x) \approx \tfrac{1}{2}\,x^T A x - b^T x + c,$$
where A is the (symmetric) Hessian at the expansion point.

The conjugate gradient method works well on problems of this kind.

Page 23

Conjugate Gradient

An iterative method for solving linear systems Ax=b, where A is symmetric and positive definite

Guaranteed to converge in n steps (in exact arithmetic), where n is the system size

Symmetric A is positive definite if it has (any of these):
1. All n eigenvalues are positive
2. All n upper-left determinants (leading principal minors) are positive
3. All n pivots are positive
4. xᵀAx is positive except at x = 0

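These tests are straightforward to check numerically; a sketch using tests 1 and 2, plus a Cholesky factorization, which succeeds exactly when A is positive definite (the matrix is illustrative):

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric test matrix

print(np.all(np.linalg.eigvalsh(A) > 0))           # test 1: eigenvalues positive
print(all(np.linalg.det(A[:k, :k]) > 0             # test 2: leading principal minors
          for k in range(1, len(A) + 1)))
try:
    np.linalg.cholesky(A)                           # succeeds iff A is positive definite
    print(True)
except np.linalg.LinAlgError:
    print(False)
```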

Page 24

Details (from Wikipedia)

Two nonzero vectors u and v are conjugate w.r.t. A if
$$u^T A v = 0.$$

{p_k} are n mutually conjugate directions, so {p_k} form a basis of $\mathbb{R}^n$.

x*, the solution to Ax = b, can be expressed in this basis:
$$x^* = \sum_{k=1}^{n} \alpha_k p_k \quad\Longrightarrow\quad A x^* = \sum_{k} \alpha_k A p_k = b.$$

Left-multiplying by $p_j^T$ kills every term but one, by conjugacy:
$$\alpha_j = \frac{p_j^T b}{p_j^T A p_j}.$$

Therefore: find the $p_k$'s, then solve for the $\alpha_k$'s.

Page 25

The Iterative Method

Equivalent problem: find the minimizer of the quadratic function
$$f(x) = \tfrac{1}{2}\,x^T A x - b^T x.$$

Take the first basis vector $p_1$ to be the negative gradient of f at $x = x_0$; the other vectors in the basis will be conjugate to the gradient.

$r_k$, the residual at the kth step:
$$r_k = b - A x_k.$$

Note that $r_k$ is the negative gradient of f at $x = x_k$.

Page 26

The Algorithm
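A sketch of the standard CG iteration for symmetric positive definite A:

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10):
    """Solve Ax = b for symmetric positive definite A."""
    x = np.zeros_like(b) if x0 is None else x0.astype(float)
    r = b - A @ x          # residual = negative gradient of f at x
    p = r.copy()           # first search direction: steepest descent
    rs = r @ r
    for _ in range(len(b)):              # <= n steps in exact arithmetic
        Ap = A @ p
        alpha = rs / (p @ Ap)            # exact minimization along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p        # next direction, conjugate to the previous ones
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))          # matches np.linalg.solve(A, b)
```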

Page 27

Example

Minimize the quadratic
$$f(x, y) = \tfrac{1}{2}\,\mathbf{x}^T A\,\mathbf{x} - b^T \mathbf{x}$$
for a 2×2 symmetric positive definite A and right-hand side b, setting $\partial f/\partial x = \partial f/\partial y = 0$.

Stationary point at [-1/26, -5/26]

Page 28

Solving Linear Equations

The optimality condition seems to suggest that CG can be used to solve linear equations

CG is only applicable to symmetric positive definite A. For an arbitrary linear system, solve the normal equations instead,
$$A^T A\,x = A^T b,$$
since $A^T A$ is symmetric and positive semi-definite for any A. But $\kappa(A^T A) = \kappa(A)^2$: slower convergence, worse accuracy.
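A sketch illustrating this, reusing the conjugate_gradient routine sketched earlier (the matrix M and vector c are illustrative):

```python
import numpy as np

M = np.array([[2.0, 0.0], [1.0, 3.0]])   # general (non-symmetric) matrix
c = np.array([2.0, 7.0])

# Normal equations: (M^T M) x = M^T c, with M^T M symmetric positive semi-definite
x = conjugate_gradient(M.T @ M, M.T @ c)
print(x)                                               # -> [1, 2]
print(np.linalg.cond(M.T @ M), np.linalg.cond(M)**2)   # kappa(M^T M) = kappa(M)^2
```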

BiCG (biconjugate gradient) is the approach to use for general A

Page 29

Multidimensional Minimizer [GSL]

Conjugate gradient: Fletcher-Reeves, Polak-Ribière

Quasi-Newton: Broyden-Fletcher-Goldfarb-Shanno (BFGS); utilizes a 2nd-order approximation

Steepest descent: inefficient (for demonstration purposes)

Simplex algorithm (Nelder and Mead): works without derivatives

Page 30

GSL Example

Objective function: paraboloid
$$f(x, y) = p_2 (x - p_0)^2 + p_3 (y - p_1)^2 + p_4$$
$$f(x, y) = 10\,(x - 1)^2 + 20\,(y - 2)^2 + 30$$

Starting from (5, 7)
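A sketch of this objective and a plain steepest-descent loop on it (fixed step size, so the iteration counts will not match GSL's line-search-based minimizers on the next slide):

```python
import numpy as np

p = (1.0, 2.0, 10.0, 20.0, 30.0)   # (p0, p1, p2, p3, p4) from the slide

def f(v):
    x, y = v
    return p[2] * (x - p[0])**2 + p[3] * (y - p[1])**2 + p[4]

def grad(v):
    x, y = v
    return np.array([2 * p[2] * (x - p[0]), 2 * p[3] * (y - p[1])])

v = np.array([5.0, 7.0])           # starting point from the slide
for i in range(200):
    g = grad(v)
    if np.linalg.norm(g) < 1e-6:
        break
    v -= 0.02 * g                   # fixed step size (illustrative)
print(i, v, f(v))                   # converges to (1, 2) with f = 30
```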

Page 31

Conjugate gradient: converges in 12 iterations

Steepest descent: converges in 158 iterations

Page 32

[Solutions in Numerical Recipes]

Sec. 2.7: linbcg (biconjugate gradient), for general A; references A implicitly through the routine atimes.

Sec. 10.6: frprmn (minimization). Model test problem: spacetime, …