Page 1: Introduction to unconstrained optimization - direct search methods

Jussi Hakanen
Post-doctoral researcher
[email protected]

spring 2014 TIES483 Nonlinear optimization

Page 2: Structure of optimization methods

Typically:
– Constraint handling converts the problem to (a series of) unconstrained problems
– In unconstrained optimization, a search direction is determined at each iteration
– The best solution in the search direction is found with a line search


[Diagram: Constraint handling method → Unconstrained optimization → Line search]

Page 3: Group discussion

1. What kinds of optimality conditions exist for unconstrained optimization ($x \in \mathbb{R}^n$)?
2. List methods for unconstrained optimization – what are their general ideas?

Discuss in small groups (3-4) for 15-20 minutes.
Each group has a secretary who writes down the answers of the group.
At the end, we summarize what each group found.


Page 4: Reminder: gradient and Hessian

Definition: If a function $f: \mathbb{R}^n \to \mathbb{R}$ is differentiable, then the gradient $\nabla f(x)$ consists of the partial derivatives $\partial f(x)/\partial x_i$, i.e.

$$\nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \dots, \frac{\partial f(x)}{\partial x_n} \right)^T$$

Definition: If $f$ is twice differentiable, then the matrix

$$H(x) = \begin{pmatrix} \frac{\partial^2 f(x)}{\partial x_1 \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f(x)}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_n \partial x_n} \end{pmatrix}$$

is called the Hessian (matrix) of $f$ at $x$.

Result: If $f$ is twice continuously differentiable, then $\frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \frac{\partial^2 f(x)}{\partial x_j \partial x_i}$, i.e. the Hessian is symmetric.

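As a numerical companion to these definitions, the following sketch approximates the gradient and Hessian by central differences and checks the symmetry result. The example function (the quadratic reused on the coordinate descent slide later) and the helper names are illustrative assumptions, not from the slides.

```python
import numpy as np

def f(x):
    # Illustrative function (assumption): f(x) = 2*x1^2 + 2*x1*x2 + x2^2 + x1 - x2
    return 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2 + x[0] - x[1]

def num_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def num_hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H

x = np.array([1.0, 1.0])
print(num_gradient(f, x))  # approx [7. 3.] = (4*x1 + 2*x2 + 1, 2*x1 + 2*x2 - 1)
print(num_hessian(f, x))   # approx [[4. 2.], [2. 2.]]; symmetric, as the result states
```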

Page 5: Reminder: definite matrices

Definition: A symmetric $n \times n$ matrix $H$ is positive semidefinite if $x^T H x \geq 0$ for all $x \in \mathbb{R}^n$.

Definition: A symmetric $n \times n$ matrix $H$ is positive definite if $x^T H x > 0$ for all $0 \neq x \in \mathbb{R}^n$.

Note: If $\geq$ is replaced by $\leq$ ($>$ by $<$), then $H$ is negative semidefinite (definite). If $H$ is neither positive nor negative semidefinite, then it is indefinite.

Result: Let $\emptyset \neq S \subset \mathbb{R}^n$ be an open convex set and let $f: S \to \mathbb{R}$ be twice differentiable in $S$. Then $f$ is convex if and only if $H(x)$ is positive semidefinite for all $x \in S$.
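In practice, definiteness is usually checked through the eigenvalues of $H$: all positive means positive definite, all non-negative means positive semidefinite, mixed signs mean indefinite. A minimal sketch (the tolerance and the example matrices are my assumptions):

```python
import numpy as np

def classify(H, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    w = np.linalg.eigvalsh(H)          # eigenvalues of a symmetric matrix
    if np.all(w > tol):
        return "positive definite"
    if np.all(w >= -tol):
        return "positive semidefinite"
    if np.all(w < -tol):
        return "negative definite"
    if np.all(w <= tol):
        return "negative semidefinite"
    return "indefinite"

print(classify(np.array([[4.0, 2.0], [2.0, 2.0]])))   # positive definite
print(classify(np.array([[1.0, 0.0], [0.0, -1.0]])))  # indefinite
```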

Page 6: Unconstrained problem

$$\min f(x), \quad \text{s.t. } x \in \mathbb{R}^n$$

Necessary conditions: Let $f$ be twice differentiable at $x^*$. If $x^*$ is a local minimizer, then
– $\nabla f(x^*) = 0$ (that is, $x^*$ is a critical point of $f$) and
– $H(x^*)$ is positive semidefinite.

Sufficient conditions: Let $f$ be twice differentiable at $x^*$. If
– $\nabla f(x^*) = 0$ and
– $H(x^*)$ is positive definite,
then $x^*$ is a strict local minimizer.

Result: Let $f: \mathbb{R}^n \to \mathbb{R}$ be twice differentiable at $x^*$. If $\nabla f(x^*) = 0$ and $H(x^*)$ is indefinite, then $x^*$ is a saddle point.
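As a worked instance of the sufficient conditions, take the quadratic $f(x) = 2x_1^2 + 2x_1x_2 + x_2^2 + x_1 - x_2$ from the coordinate descent slide below (using it here is my choice, not the slide's): its gradient is linear, so the critical point can be solved for directly, and its constant Hessian can be tested for definiteness.

```python
import numpy as np

# f(x) = 2*x1^2 + 2*x1*x2 + x2^2 + x1 - x2, so grad f(x) = A @ x + b with:
A = np.array([[4.0, 2.0], [2.0, 2.0]])  # Hessian (constant for a quadratic)
b = np.array([1.0, -1.0])               # linear part of the gradient

x_star = np.linalg.solve(A, -b)         # critical point: grad f(x*) = 0
print(x_star)                           # [-1.   1.5]

# Sufficient conditions: grad f(x*) = 0 and H(x*) positive definite
print(np.all(np.linalg.eigvalsh(A) > 0))  # True -> x* is a strict local minimizer
# (here also the global minimizer: the positive semidefinite Hessian makes f
#  convex, cf. the result on the previous page)
```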

Page 7: Unconstrained problem

[Figure: adopted from Prof. L.T. Biegler (Carnegie Mellon University)]

Page 8: Descent direction

Definition: Let $f: \mathbb{R}^n \to \mathbb{R}$. A vector $d \in \mathbb{R}^n$ is a descent direction for $f$ at $x^* \in \mathbb{R}^n$ if there exists $\delta > 0$ such that $f(x^* + \lambda d) < f(x^*)$ for all $\lambda \in (0, \delta]$.

Result: Let $f: \mathbb{R}^n \to \mathbb{R}$ be differentiable at $x^*$. If there exists $d \in \mathbb{R}^n$ such that $\nabla f(x^*)^T d < 0$, then $d$ is a descent direction for $f$ at $x^*$.
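The result gives the cheap test that line-search methods rely on; a small sketch (the gradient value is taken from the numerical example on page 4, an assumption):

```python
import numpy as np

def is_descent_direction(grad, d):
    """Sufficient condition: grad f(x*)^T d < 0 implies d is a descent direction."""
    return float(np.dot(grad, d)) < 0.0

g = np.array([7.0, 3.0])                 # grad f(1, 1) for the quadratic above
print(is_descent_direction(g, -g))       # True: -grad f is always a descent direction
print(is_descent_direction(g, np.array([1.0, 0.0])))  # False: f increases this way
```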

Page 9: Model algorithm for unconstrained minimization

Let $x^h$ be the current estimate for $x^*$.

1) [Test for convergence.] If the conditions are satisfied, stop. The solution is $x^h$.
2) [Compute a search direction.] Compute a non-zero vector $d^h \in \mathbb{R}^n$, the search direction.
3) [Compute a step length.] Compute $\alpha^h > 0$, the step length, for which $f(x^h + \alpha^h d^h) < f(x^h)$.
4) [Update the estimate for the minimum.] Set $x^{h+1} = x^h + \alpha^h d^h$, $h = h + 1$, and go to step 1.


From Gill et al., Practical Optimization, 1981, Academic Press
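A minimal sketch of this loop in Python, assuming the direction rule is supplied by the caller and the step length comes from simple backtracking (any step satisfying $f(x^h + \alpha^h d^h) < f(x^h)$ would do):

```python
import numpy as np

def model_minimize(f, grad, x0, direction, tol=1e-6, max_iter=1000):
    """Sketch of the model algorithm; `direction` maps (x, grad) to d^h."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:      # 1) test for convergence
            break
        d = direction(x, g)              # 2) compute a search direction
        a = 1.0                          # 3) step length by backtracking
        while f(x + a * d) >= f(x):
            a *= 0.5
            if a < 1e-16:                # no decrease found; give up
                return x
        x = x + a * d                    # 4) update the estimate
    return x

# Usage with steepest descent (d^h = -grad f) on the quadratic used earlier:
f = lambda x: 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2 + x[0] - x[1]
grad = lambda x: np.array([4*x[0] + 2*x[1] + 1, 2*x[0] + 2*x[1] - 1])
print(model_minimize(f, grad, [0.0, 0.0], lambda x, g: -g))  # approx [-1.  1.5]
```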

Page 10: On convergence

Iterative method: a sequence $\{x^h\}$ such that $x^h \to x^*$ when $h \to \infty$.

Definition: A method converges
– linearly if there exist $\alpha \in [0,1)$ and $M \geq 0$ such that for all $h \geq M$
$$\|x^{h+1} - x^*\| \leq \alpha \|x^h - x^*\|,$$
– superlinearly if there exist $M \geq 0$ and a sequence $\alpha^h \to 0$ such that for all $h \geq M$
$$\|x^{h+1} - x^*\| \leq \alpha^h \|x^h - x^*\|,$$
– with degree $p$ if there exist $\alpha \geq 0$, $p > 0$ and $M \geq 0$ such that for all $h \geq M$
$$\|x^{h+1} - x^*\| \leq \alpha \|x^h - x^*\|^p.$$

If $p = 2$ ($p = 3$), the convergence is quadratic (cubic).

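To see what quadratic convergence looks like numerically, here is a small illustration (my example, not from the slides): Newton's iteration for the root of $x^2 - 2$ satisfies $\|x^{h+1} - x^*\| \approx \alpha \|x^h - x^*\|^2$, so the printed ratio settles near a constant.

```python
import math

x, x_star = 2.0, math.sqrt(2.0)
for _ in range(4):
    e = abs(x - x_star)
    x = x - (x * x - 2.0) / (2.0 * x)   # Newton step for f(x) = x^2 - 2
    # error ratio e_(h+1) / e_h^2 tends to a constant (about 0.354 here)
    print(f"e_h = {e:.3e},  e_(h+1) / e_h^2 = {abs(x - x_star) / e**2:.3f}")
```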

Page 11: Summary of group discussion for methods

1. Newton's method – utilizes the tangent
2. Golden section method – for line search
3. Downhill simplex
4. Cyclic coordinate method – one coordinate at a time
5. Polytope search (Nelder-Mead) – idea based on geometry
6. Gradient descent (steepest descent) – based on gradient information


Page 12: Direct search methods

– Univariate search, coordinate descent, cyclic coordinate search
– Hooke and Jeeves
– Powell's method


Page 13: Coordinate descent

[Figure: coordinate descent iterations on the example below; from Miettinen: Nonlinear optimization, 2007 (in Finnish)]

$$f(x) = 2x_1^2 + 2x_1x_2 + x_2^2 + x_1 - x_2$$
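A minimal sketch of cyclic coordinate search on the slide's function: minimize along one coordinate axis at a time and repeat until a full cycle no longer moves the point. Using scipy's one-dimensional minimize_scalar as the line search is my assumption; golden section (page 11) would serve equally well.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def cyclic_coordinate_search(f, x0, tol=1e-8, max_cycles=100):
    """Optimize along the coordinate directions e_1, ..., e_n in turn."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_cycles):
        x_old = x.copy()
        for i in range(len(x)):
            e = np.zeros_like(x)
            e[i] = 1.0
            res = minimize_scalar(lambda a: f(x + a * e))  # 1-D line search
            x = x + res.x * e
        if np.linalg.norm(x - x_old) < tol:  # a full cycle moved (almost) nothing
            break
    return x

f = lambda x: 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2 + x[0] - x[1]
print(cyclic_coordinate_search(f, [0.0, 0.0]))  # approx [-1.   1.5]
```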

Page 14: Idea of pattern search

[Figure: from Miettinen: Nonlinear optimization, 2007 (in Finnish)]

Page 15: Hooke and Jeeves

[Figure: Hooke and Jeeves iterations; from Miettinen: Nonlinear optimization, 2007 (in Finnish)]

$$f(x) = (x_1 - 2)^4 + (x_1 - 2x_2)^2$$

Page 16: Hooke and Jeeves with fixed step length

[Figure: Hooke and Jeeves with fixed step length; from Miettinen: Nonlinear optimization, 2007 (in Finnish)]

$$f(x) = (x_1 - 2)^4 + (x_1 - 2x_2)^2$$

Page 17: Powell's method

The most efficient pattern search method. Differs from Hooke and Jeeves in that, at each pattern search step, one of the coordinate directions is replaced with the previous pattern search direction.
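A sketch of that idea, under the assumptions that an exact 1-D line search is available (scipy's minimize_scalar here) and that the oldest direction is the one dropped; variants of Powell's method differ in which direction they discard.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def powell(f, x0, tol=1e-8, max_cycles=100):
    """Powell's method sketch: line search along each stored direction, then
    replace the oldest direction with the cycle's overall pattern direction."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    D = np.eye(n)                        # start from the coordinate directions
    for _ in range(max_cycles):
        x_old = x.copy()
        for i in range(n):
            res = minimize_scalar(lambda a: f(x + a * D[i]))
            x = x + res.x * D[i]
        pattern = x - x_old              # overall movement of this cycle
        if np.linalg.norm(pattern) < tol:
            break
        D = np.vstack([D[1:], pattern / np.linalg.norm(pattern)])
        res = minimize_scalar(lambda a: f(x + a * D[-1]))  # search along it too
        x = x + res.x * D[-1]
    return x

f = lambda x: (x[0] - 2)**4 + (x[0] - 2*x[1])**2
print(powell(f, [0.0, 3.0]))   # approx [2. 1.]
```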