Chapter 10 Minimization or Maximization of Functions.

Optimization Problems

• Solving equations can be formulated as an optimization problem, e.g., in density functional theory for electronic structure, protein conformation, etc.

• Minimization with constraints: operations research (linear programming, optimal conditions in management science, traveling salesman problem, etc.)

General Consideration

• Use function values only, or use function values and their derivatives

• Storage of O(N) or O(N^2)

• With constraints or no constraints

• Choice of methods

Local & Global Extremum

Bracketing and Search in 1D

Bracketing a minimum means finding a < b < c such that f(b) < f(a) and f(b) < f(c). For a continuous f, there is then a minimum in the interval (a, c).

[Figure: points a < b < c on the x-axis bracketing a minimum]
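One way to obtain such a triple is to walk downhill with growing steps until the function turns back up. The following is a minimal sketch under that assumption, not a routine from the slides; the name `bracket_minimum` and the parameters `grow` and `max_steps` are hypothetical.

```python
def bracket_minimum(f, a, b, grow=1.618, max_steps=50):
    """Return (a, b, c) with a < b < c, f(b) < f(a) and f(b) < f(c).

    Sketch only: walks downhill from a through b with geometrically
    growing steps until f starts to increase again.
    """
    fa, fb = f(a), f(b)
    if fb > fa:                      # make sure the step from a to b goes downhill
        a, b, fa, fb = b, a, fb, fa
    c = b + grow * (b - a)
    fc = f(c)
    for _ in range(max_steps):
        if fc > fb:                  # f went back up: the minimum is bracketed
            return (a, b, c) if a < c else (c, b, a)
        a, b, fa, fb = b, c, fb, fc  # keep walking downhill
        c = b + grow * (b - a)
        fc = f(c)
    raise RuntimeError("no bracket found; f may be unbounded below")


# Usage on a hypothetical test function with its minimum at x = 3.
a, b, c = bracket_minimum(lambda x: (x - 3.0) ** 2, 0.0, 1.0)
```

The returned triple can be handed to any 1D search that needs a bracket as its starting point.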

How accurately can we locate a minimum?

• Let b be a minimum of the function f(x). Taylor expanding around b, where f'(b) = 0, we have

$$f(x) \approx f(b) + \frac{1}{2} f''(b)\,(x - b)^2$$

• The best we can do is when the second-order correction term is of the order of machine epsilon ε relative to the function value, so

$$|x - b| \approx \sqrt{\epsilon}\,|b|\,\sqrt{\frac{2\,|f(b)|}{b^2 f''(b)}}$$
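The square-root-of-epsilon limit can be observed directly in double precision. The snippet below is a small illustration, assuming a hypothetical test function with f(b) = 1, b = 1 and f''(b) = 2, for which the last square-root factor in the estimate equals 1.

```python
import sys

def f(x):
    # Hypothetical test function: minimum f(b) = 1 at b = 1, with f''(b) = 2.
    return 1.0 + (x - 1.0) ** 2

eps = sys.float_info.epsilon     # about 2.2e-16 for IEEE doubles
b = 1.0
resolution = eps ** 0.5          # estimated |x - b|, about 1.5e-8 for this f

too_small = 1e-9                 # a shift below the estimated resolution
large_enough = 1e-7              # a shift above it

# Below the resolution the computed f cannot tell the two points apart.
print(f(b + too_small) == f(b))
# Above the resolution the difference is visible again.
print(f(b + large_enough) > f(b))
```

This is why asking a 1D minimizer for a relative tolerance much tighter than about 1e-8 in double precision is usually wasted effort.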

Golden Section Search

• Choose x such that the ratio of the lengths of the intervals [a,b] to [b,c] is the same as that of [a,x] to [x,b]. Remove [a,x] if f(x) > f(b), or remove [b,c] if f(x) < f(b).

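• The asymptotic limit of the ratio is the golden mean

$$\frac{\sqrt{5} - 1}{2} \approx 0.61803$$

[Figure: points a, x, b, c on the x-axis, with the trial point x between a and b]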
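The interval-shrinking rule above can be sketched as follows. This is a minimal version that keeps two interior points at the golden ratio, which is a common equivalent formulation rather than the exact bookkeeping of the slide; the function name, `tol` and `max_iter` are assumptions, and f is assumed to have a single minimum inside [a, c] with a < c.

```python
import math

def golden_section_search(f, a, c, tol=1e-8, max_iter=200):
    """Shrink the bracket [a, c] (a < c) around a minimum of a unimodal f."""
    w = (math.sqrt(5.0) - 1.0) / 2.0     # golden mean, about 0.61803
    x1 = c - w * (c - a)                 # two interior points, a < x1 < x2 < c
    x2 = a + w * (c - a)
    f1, f2 = f(x1), f(x2)
    for _ in range(max_iter):
        if c - a <= tol * (1.0 + abs(a) + abs(c)):
            break
        if f1 < f2:                      # minimum lies in [a, x2]: drop [x2, c]
            c, x2, f2 = x2, x1, f1       # old x1 is reused as the new x2
            x1 = c - w * (c - a)
            f1 = f(x1)
        else:                            # minimum lies in [x1, c]: drop [a, x1]
            a, x1, f1 = x1, x2, f2       # old x2 is reused as the new x1
            x2 = a + w * (c - a)
            f2 = f(x2)
    return 0.5 * (a + c)


# Usage on a hypothetical test function with its minimum at x = 2.
x_min = golden_section_search(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

Each iteration costs one new function evaluation and shrinks the bracket by the factor 0.61803, so convergence is linear.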

Parabolic Interpolation & Brent’s Method

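The minimum of the parabola through the three points (a, f(a)), (b, f(b)), (c, f(c)) is at

$$x = b - \frac{1}{2}\,\frac{(b-a)^2\,[f(b)-f(c)] - (b-c)^2\,[f(b)-f(a)]}{(b-a)\,[f(b)-f(c)] - (b-c)\,[f(b)-f(a)]}$$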

Brent’s method combines parabolic interpolation with golden section search, with some complicated bookkeeping. See NR (Numerical Recipes), pages 404-405, for details.
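A single parabolic interpolation step can be sketched as below. It uses the standard three-point formula, x = b - (1/2) [(b-a)^2 (f(b)-f(c)) - (b-c)^2 (f(b)-f(a))] / [(b-a)(f(b)-f(c)) - (b-c)(f(b)-f(a))], and is only the interpolation part, not Brent's full method; the function name is hypothetical.

```python
def parabolic_step(a, b, c, fa, fb, fc):
    """Abscissa of the extremum of the parabola through (a,fa), (b,fb), (c,fc)."""
    p = (b - a) * (fb - fc)
    q = (b - c) * (fb - fa)
    denom = p - q
    if denom == 0.0:                 # collinear points: the parabola has no extremum
        raise ZeroDivisionError("the three points are collinear")
    return b - 0.5 * ((b - a) * p - (b - c) * q) / denom


# Usage: for an exactly quadratic function one step lands on the minimum.
g = lambda x: x * x                  # hypothetical test function, minimum at 0
x_new = parabolic_step(-1.0, 0.5, 2.0, g(-1.0), g(0.5), g(2.0))
```

Note that the formula finds an extremum of the parabola, which may be a maximum, and can jump outside the bracket. That is the reason a safeguarded method falls back on golden section steps.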

Strategy in Higher Dimensions

1. Starting from a point P and a direction n, find the minimum on the line P + λn, i.e., do a 1D minimization of y(λ) = f(P + λn).

2. Replace P by P + λ_min n, choose another direction n’ and repeat step 1.

The algorithms differ in how the directions n are chosen.

Local Properties near Minimum

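• Let P be some point of interest, taken as the origin x = 0. Taylor expansion gives

$$f(x) = f(P) + \sum_i \frac{\partial f}{\partial x_i}\,x_i + \frac{1}{2}\sum_{i,j} \frac{\partial^2 f}{\partial x_i\,\partial x_j}\,x_i x_j + \cdots \approx c - b^T x + \frac{1}{2}\,x^T A\,x$$

where c = f(P), b = -∇f at P, A_ij = ∂²f/∂x_i∂x_j at P, and T denotes the transpose.

• Minimizing f is the same as solving the equation

$$A x = b$$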

Search along Coordinate Directions

Search for the minimum along the x direction, then along the y direction, and so on. Such a method can take a very large number of steps to converge, particularly when the contours form a long, narrow valley at an angle to the coordinate axes.

The curved loops represent f(x,y) = const.
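The slow convergence is easy to reproduce. The sketch below cycles through the coordinate directions on a hypothetical quadratic with a narrow valley along x = y; `line_min`, `coordinate_search`, the fixed search window [-10, 10] and the tolerances are all assumptions made for this illustration.

```python
import math

def line_min(g, lo=-10.0, hi=10.0, tol=1e-10):
    """Golden section search for the minimum of g on [lo, hi] (assumed unimodal)."""
    w = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - w * (hi - lo), lo + w * (hi - lo)
    g1, g2 = g(x1), g(x2)
    while hi - lo > tol:
        if g1 < g2:
            hi, x2, g2 = x2, x1, g1
            x1 = hi - w * (hi - lo)
            g1 = g(x1)
        else:
            lo, x1, g1 = x1, x2, g2
            x2 = lo + w * (hi - lo)
            g2 = g(x2)
    return 0.5 * (lo + hi)

def coordinate_search(f, p, max_sweeps=1000, tol=1e-6):
    """Minimize f by repeated 1D searches along each coordinate axis in turn."""
    p = list(p)
    for sweep in range(1, max_sweeps + 1):
        old = list(p)
        for k in range(len(p)):              # search along x, then y, and so on
            def along_axis(t, k=k):
                q = list(p)
                q[k] += t
                return f(q)
            p[k] += line_min(along_axis)
        if max(abs(u - v) for u, v in zip(p, old)) < tol:
            return p, sweep
    return p, max_sweeps

# Hypothetical test function: narrow valley along x = y, minimum at the origin.
valley = lambda p: 0.5 * (p[0] ** 2 + p[1] ** 2) - 0.95 * p[0] * p[1]
p_min, sweeps_used = coordinate_search(valley, [1.0, 2.0])
print(p_min, sweeps_used)
```

For this function each full sweep reduces the distance to the minimum by only about 10 percent, so many sweeps are needed even in two dimensions.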

Steepest Descent

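Search in the direction of the largest decrease, i.e., n = -∇f.

A contour line (surface) of constant f is perpendicular to n, because along the contour df = dx·∇f = 0.

The current search direction n and the next search direction n’ are orthogonal, because at the line minimum we have

$$y'(\lambda) = \frac{d f(P + \lambda n)}{d\lambda} = n^T\,\nabla f\,\big|_{P + \lambda n} = 0$$

and n’ = -∇f at the new point, so n^T n’ = 0.

[Figure: successive search directions n and n’ meeting at a right angle]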
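The orthogonality of successive directions can be checked numerically. The sketch below assumes a quadratic f(x) = x^T A x / 2 - b^T x with symmetric positive definite A, for which the 1D minimum along n = b - A x has the closed form λ = n^T n / (n^T A n); the matrix, the vector and the helper names are hypothetical.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def steepest_descent(A, b, x, max_steps=200, tol=1e-10):
    """Minimize f(x) = x^T A x / 2 - b^T x for symmetric positive definite A."""
    directions = []
    for _ in range(max_steps):
        n = [bi - ri for bi, ri in zip(b, matvec(A, x))]   # n = -grad f = b - A x
        if dot(n, n) ** 0.5 < tol:
            break
        lam = dot(n, n) / dot(n, matvec(A, n))             # exact 1D minimum along n
        x = [xi + lam * ni for xi, ni in zip(x, n)]
        directions.append(n)
    return x, directions

A = [[3.0, 1.0], [1.0, 2.0]]     # hypothetical symmetric positive definite matrix
b = [1.0, 1.0]
x_min, dirs = steepest_descent(A, b, [0.0, 0.0])
print(x_min)                      # approaches the solution of A x = b
print(dot(dirs[0], dirs[1]))      # consecutive directions: close to zero
```

Because every new direction is forced to be perpendicular to the previous one, the path zigzags toward the minimum instead of heading straight for it.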

Conjugate Condition

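Two search directions n_1 and n_2 are conjugate with respect to A if

$$n_1^T A\,n_2 = 0$$

Equivalently, make a linear coordinate transformation such that the contours are circular; the (search) vectors are then orthogonal in the new coordinates.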

Conjugate Gradient Method

1. Start with the steepest descent direction n_0 = g_0 = -∇f(x_0) and find the new minimum x_1.

2. Build the next search direction n_1 from g_0 and g_1 = -∇f(x_1), such that n_0^T A n_1 = 0.

3. Repeat step 2 iteratively to find n_j (a Gram-Schmidt orthogonalization process). The result is a set of N vectors (in N dimensions) satisfying n_i^T A n_j = 0 for i ≠ j.

Conjugate Gradient Algorithm

1. Initialize n_0 = g_0 = -∇f(x_0), i = 0.

2. Find the λ that minimizes f(x_i + λ n_i); let x_{i+1} = x_i + λ n_i.

3. Compute the new negative gradient g_{i+1} = -∇f(x_{i+1}).

4. Compute (Fletcher-Reeves)

$$\gamma_i = \frac{g_{i+1}^T\,g_{i+1}}{g_i^T\,g_i}$$

5. Update the search direction as n_{i+1} = g_{i+1} + γ_i n_i; ++i, go to 2.
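The five steps above can be sketched for the special case of a quadratic f(x) = x^T A x / 2 - b^T x with symmetric positive definite A. In that case the negative gradient is g = b - A x and the line minimization in step 2 has the closed form λ = n^T g / (n^T A n). This is an illustration under those assumptions, not the program from the slides; the symbol gamma for the Fletcher-Reeves ratio, the helper names and the test data are hypothetical.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(A, b, x, tol=1e-12):
    """Fletcher-Reeves conjugate gradient for f(x) = x^T A x / 2 - b^T x."""
    g = [bi - ri for bi, ri in zip(b, matvec(A, x))]      # step 1: g0 = -grad f(x0)
    n = list(g)                                           #         n0 = g0
    for _ in range(len(b)):                               # N steps suffice in exact arithmetic
        gg = dot(g, g)
        if gg ** 0.5 < tol:
            break
        lam = dot(n, g) / dot(n, matvec(A, n))            # step 2: exact minimum along n
        x = [xi + lam * ni for xi, ni in zip(x, n)]
        g = [bi - ri for bi, ri in zip(b, matvec(A, x))]  # step 3: new negative gradient
        gamma = dot(g, g) / gg                            # step 4: Fletcher-Reeves ratio
        n = [gi + gamma * ni for gi, ni in zip(g, n)]     # step 5: new search direction
    return x

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]            # hypothetical symmetric positive definite matrix
b = [1.0, 2.0, 3.0]
x_min = conjugate_gradient(A, b, [0.0, 0.0, 0.0])
residual = [bi - ri for bi, ri in zip(b, matvec(A, x_min))]
print(x_min, residual)
```

For a general, non-quadratic f the closed-form step is not available, and step 2 would instead call a 1D minimizer such as golden section search or Brent's method.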

The Conjugate Gradient Program

Simulated Annealing

• To minimize f(x), we make random changes to x by the following rule:

• Set T to a large value and decrease it as we go.

• Metropolis algorithm: make a local change from x to x’. If f decreases, accept the change; otherwise, accept it only with probability r = exp[-(f(x’)-f(x))/T]. This is done by comparing r with a uniform random number 0 < ξ < 1 and accepting the change if ξ < r.
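The acceptance rule can be sketched for a function of one variable as follows. The geometric cooling schedule, the uniform step proposal, all parameter values and the test function are assumptions chosen for illustration; the slides do not specify them.

```python
import math
import random

def simulated_annealing(f, x, T=10.0, cooling=0.999, steps=20000,
                        step_size=0.5, seed=0):
    """Minimize f by Metropolis moves while the temperature T is lowered."""
    rng = random.Random(seed)
    fx = f(x)
    best_x, best_f = x, fx
    for _ in range(steps):
        x_new = x + rng.uniform(-step_size, step_size)   # local change x -> x'
        f_new = f(x_new)
        # Metropolis rule: always accept a decrease; otherwise accept with
        # probability r = exp(-(f(x') - f(x)) / T), tested against xi in [0, 1).
        if f_new <= fx or rng.random() < math.exp(-(f_new - fx) / T):
            x, fx = x_new, f_new
            if fx < best_f:                              # remember the best point seen
                best_x, best_f = x, fx
        T *= cooling                                     # decrease T as we go
    return best_x, best_f

# Hypothetical test function with several local minima.
bumpy = lambda x: x * x + 10.0 * math.sin(3.0 * x)
best_x, best_f = simulated_annealing(bumpy, 4.0)
print(best_x, best_f)
```

At high T almost every move is accepted, so the walker can climb out of local minima; as T drops the rule turns into a plain downhill search. The result depends on the random seed and the cooling schedule, so it is a good candidate minimum rather than a guaranteed global one.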

Traveling Salesman Problem

[Figure: map of the cities Singapore, Kuala Lumpur, Hong Kong, Taipei, Shanghai, Beijing and Tokyo]

Find the shortest closed path that visits each city exactly once.
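Simulated annealing applies to this problem once a "local change" of a tour is defined. The sketch below assumes that a local change reverses a randomly chosen segment of the tour, and it uses seven made-up points on a unit circle rather than real city coordinates; the function names and all parameter values are hypothetical.

```python
import math
import random

def tour_length(cities, order):
    """Length of the closed path that visits the cities in the given order."""
    return sum(math.dist(cities[order[k]], cities[order[(k + 1) % len(order)]])
               for k in range(len(order)))

def anneal_tour(cities, T=1.0, cooling=0.995, steps=5000, seed=0):
    """Search for a short closed tour with Metropolis moves (segment reversal)."""
    rng = random.Random(seed)
    order = list(range(len(cities)))
    length = tour_length(cities, order)
    best, best_length = list(order), length
    for _ in range(steps):
        i, j = sorted(rng.sample(range(len(order)), 2))
        trial = order[:i] + order[i:j + 1][::-1] + order[j + 1:]   # reverse a segment
        trial_length = tour_length(cities, trial)
        d = trial_length - length
        if d <= 0 or rng.random() < math.exp(-d / T):              # Metropolis rule
            order, length = trial, trial_length
            if length < best_length:
                best, best_length = list(order), length
        T *= cooling                                               # cool down
    return best, best_length

# Made-up positions: seven points on the unit circle, listed in scrambled order.
angles = [2.0 * math.pi * k / 7 for k in (0, 3, 1, 5, 2, 6, 4)]
cities = [(math.cos(t), math.sin(t)) for t in angles]
best, best_length = anneal_tour(cities)
print(best, best_length)
```

For points on a circle the shortest tour follows the circle, with length 14 sin(π/7), which gives a simple reference value to compare the annealing result against.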

Problem set 7

1. Suppose that the function is given by the quadratic form f = (1/2) x^T A x, where A is a symmetric and positive definite matrix. Find a linear transformation of x so that in the new coordinate system the function becomes f = (1/2)|y|^2, y = Ux [i.e., the contour is exactly circular or spherical]. If two vectors in the new system are orthogonal, y_1^T y_2 = 0, what does it mean in the original system?

2. We’ll discuss the conjugate gradient method in some more detail following the paper: http://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf