
Introduction to unconstrained optimization
- direct search methods

Jussi Hakanen
Post-doctoral researcher
jussi.hakanen@jyu.fi

spring 2014 TIES483 Nonlinear optimization

Structure of optimization methods

Typically:
– Constraint handling converts the problem to (a series of) unconstrained problems
– In unconstrained optimization, a search direction is determined at each iteration
– The best solution in the search direction is found with a line search


[Diagram: Constraint handling method → Unconstrained optimization → Line search]

Group discussion

1. What kinds of optimality conditions exist for unconstrained optimization (x ∈ Rⁿ)?
2. List methods for unconstrained optimization
   – what are their general ideas?

Discuss in small groups (3–4) for 15–20 minutes.
Each group has a secretary who writes down the answers of the group.
At the end, we summarize what each group found.


Reminder: gradient and Hessian

Definition: If a function f: Rⁿ → R is differentiable, then the gradient ∇f(x) consists of the partial derivatives ∂f(x)/∂xᵢ, i.e.

∇f(x) = (∂f(x)/∂x₁, …, ∂f(x)/∂xₙ)ᵀ

Definition: If f is twice differentiable, then the matrix

        ⎡ ∂²f(x)/∂x₁∂x₁  ⋯  ∂²f(x)/∂x₁∂xₙ ⎤
H(x) =  ⎢       ⋮        ⋱        ⋮       ⎥
        ⎣ ∂²f(x)/∂xₙ∂x₁  ⋯  ∂²f(x)/∂xₙ∂xₙ ⎦

is called the Hessian (matrix) of f at x.

Result: If f is twice continuously differentiable, then ∂²f(x)/∂xᵢ∂xⱼ = ∂²f(x)/∂xⱼ∂xᵢ.

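As a quick numerical illustration (a sketch, not part of the original slides), both objects can be approximated with central differences; the helper names gradient and hessian below are ad hoc, and NumPy is assumed:

```python
import numpy as np

def gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H

# f(x) = 2x₁² + 2x₁x₂ + x₂² + x₁ − x₂ (the example used later in these slides)
f = lambda x: 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2 + x[0] - x[1]
print(gradient(f, [0.0, 0.0]))  # ≈ [ 1. -1.]
print(hessian(f, [0.0, 0.0]))   # ≈ [[4. 2.] [2. 2.]]; symmetric, as the result above states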

Reminder: Definite Matrices

Definition: A symmetric n × n matrix H is positive semidefinite if xᵀHx ≥ 0 ∀ x ∈ Rⁿ.

Definition: A symmetric n × n matrix H is positive definite if xᵀHx > 0 ∀ 0 ≠ x ∈ Rⁿ.

Note: If ≥ is replaced by ≤ (> by <), then H is negative semidefinite (definite). If H is neither positive nor negative semidefinite, then it is indefinite.

Result: Let ∅ ≠ S ⊂ Rⁿ be an open convex set and f: S → R twice differentiable in S. The function f is convex if and only if H(x*) is positive semidefinite for all x* ∈ S.
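In practice, definiteness of a symmetric matrix is usually checked through its eigenvalues: all positive means positive definite, all non-negative means positive semidefinite, mixed signs mean indefinite. A minimal NumPy sketch (the classify helper is hypothetical):

```python
import numpy as np

def classify(H, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    w = np.linalg.eigvalsh(H)           # eigenvalues of a symmetric matrix
    if np.all(w > tol):
        return "positive definite"
    if np.all(w >= -tol):
        return "positive semidefinite"
    if np.all(w < -tol):
        return "negative definite"
    if np.all(w <= tol):
        return "negative semidefinite"
    return "indefinite"

print(classify(np.array([[4.0, 2.0], [2.0, 2.0]])))   # positive definite
print(classify(np.array([[1.0, 0.0], [0.0, -1.0]])))  # indefinite
```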

Unconstrained problem

min f(x), s.t. x ∈ Rⁿ

Necessary conditions: Let f be twice differentiable at x*. If x* is a local minimizer, then
– ∇f(x*) = 0 (that is, x* is a critical point of f) and
– H(x*) is positive semidefinite.

Sufficient conditions: Let f be twice differentiable at x*. If
– ∇f(x*) = 0 and
– H(x*) is positive definite,
then x* is a strict local minimizer.

Result: Let f: Rⁿ → R be twice differentiable at x*. If ∇f(x*) = 0 and H(x*) is indefinite, then x* is a saddle point.
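Putting the conditions to work (a sketch reusing the hypothetical gradient, hessian, and classify helpers from the earlier snippets):

```python
import numpy as np

def check_point(f, x, tol=1e-5):
    """Apply the first- and second-order conditions at a candidate point."""
    if np.linalg.norm(gradient(f, x)) > tol:
        return "not a critical point"
    kind = classify(hessian(f, x))
    if kind == "positive definite":
        return "strict local minimizer (sufficient conditions hold)"
    if kind == "indefinite":
        return "saddle point"
    return "critical point; the second-order test is inconclusive"

# The origin is a saddle point of f(x) = x₁² − x₂²:
print(check_point(lambda x: x[0]**2 - x[1]**2, [0.0, 0.0]))  # saddle point
```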

Unconstrained problem

[Figure adopted from Prof. L.T. Biegler (Carnegie Mellon University)]

Descent direction

Definition: Let f: Rⁿ → R. A vector d ∈ Rⁿ is a descent direction for f at x* ∈ Rⁿ if ∃ δ > 0 s.t.

f(x* + λd) < f(x*) ∀ λ ∈ (0, δ].

Result: Let f: Rⁿ → R be differentiable at x*. If ∃ d ∈ Rⁿ s.t. ∇f(x*)ᵀd < 0, then d is a descent direction for f at x*.

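A quick check of the result (again reusing the hypothetical gradient helper): for the quadratic used later in these slides, the negative gradient satisfies ∇f(x)ᵀd < 0:

```python
import numpy as np

# For differentiable f, d is a descent direction at x whenever ∇f(x)ᵀd < 0;
# in particular d = −∇f(x) always qualifies when ∇f(x) ≠ 0.
f = lambda x: 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2 + x[0] - x[1]
x = np.array([1.0, 1.0])
d = -gradient(f, x)        # reuses the finite-difference sketch from above
print(gradient(f, x) @ d)  # ≈ -58 < 0, so d is a descent direction at x
```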

Model algorithm for unconstrained minimization

Let xʰ be the current estimate for x*.

1) [Test for convergence.] If the conditions are satisfied, stop. The solution is xʰ.
2) [Compute a search direction.] Compute a non-zero vector dʰ ∈ Rⁿ, the search direction.
3) [Compute a step length.] Compute αʰ > 0, the step length, for which f(xʰ + αʰdʰ) < f(xʰ).
4) [Update the estimate for the minimum.] Set xʰ⁺¹ = xʰ + αʰdʰ, h = h + 1, and go to step 1.


From Gill et al., Practical Optimization, 1981, Academic Press
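The four steps translate almost directly into code. Below is a minimal sketch (not from the slides): the direction and step arguments are placeholders that a concrete method fills in, here steepest descent with a crude backtracking line search:

```python
import numpy as np

def descent(f, x0, direction, step, tol=1e-6, max_iter=1000):
    """Skeleton of the model algorithm; direction() and step() are
    supplied by a concrete method."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d = direction(f, x)            # step 2: search direction
        if np.linalg.norm(d) < tol:    # step 1: convergence test
            break
        x = x + step(f, x, d) * d      # steps 3-4: step length and update
    return x

# Steepest descent (d = −∇f) with crude backtracking as the line search:
def backtracking(f, x, d, a=1.0):
    while f(x + a * d) >= f(x) and a > 1e-12:
        a *= 0.5                       # halve until f decreases
    return a

f = lambda x: 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2 + x[0] - x[1]
print(descent(f, [0.0, 0.0], lambda f, x: -gradient(f, x), backtracking))
# ≈ [-1.  1.5], the unique minimizer of this quadratic
```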

On convergence

Iterative method: a sequence {xʰ} s.t. xʰ → x* when h → ∞.

Definition: A method converges
– linearly if ∃ α ∈ [0,1) and M ≥ 0 s.t. ∀ h ≥ M
  ‖xʰ⁺¹ − x*‖ ≤ α‖xʰ − x*‖,
– superlinearly if ∃ M ≥ 0 and a sequence αʰ → 0 s.t. ∀ h ≥ M
  ‖xʰ⁺¹ − x*‖ ≤ αʰ‖xʰ − x*‖,
– with degree p if ∃ α ≥ 0, p > 0 and M ≥ 0 s.t. ∀ h ≥ M
  ‖xʰ⁺¹ − x*‖ ≤ α‖xʰ − x*‖ᵖ.

If p = 2 (p = 3), the convergence is quadratic (cubic).

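Given recorded errors eₕ = ‖xʰ − x*‖, the degree p can be estimated by a log-log fit of consecutive errors; a small sketch with a hypothetical helper:

```python
import numpy as np

def estimate_degree(errors):
    """Fit log e(h+1) ≈ log α + p·log e(h) to estimate the degree p."""
    e = np.log(np.asarray(errors, dtype=float))
    p, _ = np.polyfit(e[:-1], e[1:], 1)
    return p

# Errors that square each iteration, i.e. quadratic convergence:
print(estimate_degree([1e-1, 1e-2, 1e-4, 1e-8]))  # ≈ 2.0
```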

Summary of group discussion for methods

1. Newton's method – utilizes the tangent
2. Golden section method – for line search
3. Downhill simplex
4. Cyclic coordinate method – one coordinate at a time
5. Polytope search (Nelder-Mead) – idea based on geometry
6. Gradient descent (steepest descent) – based on gradient information


Direct search methods

– Univariate search, coordinate descent, cyclic coordinate search
– Hooke and Jeeves
– Powell's method


Coordinate descent


[Figure from Miettinen: Nonlinear optimization, 2007 (in Finnish)]

f(x) = 2x₁² + 2x₁x₂ + x₂² + x₁ − x₂
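A minimal sketch of cyclic coordinate search on this function (not from the slides; SciPy's minimize_scalar stands in for the one-dimensional line search, e.g. golden section or Brent):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def coordinate_descent(f, x0, rounds=50):
    """Cyclic coordinate search: one exact line search per coordinate,
    cycling through the coordinate directions e₁, ..., eₙ."""
    x = np.asarray(x0, dtype=float)
    for _ in range(rounds):
        for i in range(x.size):
            phi = lambda t: f(np.concatenate([x[:i], [t], x[i+1:]]))
            x[i] = minimize_scalar(phi).x   # line search along coordinate i
    return x

f = lambda x: 2*x[0]**2 + 2*x[0]*x[1] + x[1]**2 + x[0] - x[1]
print(coordinate_descent(f, [0.0, 0.0]))  # ≈ [-1.  1.5], the minimizer
```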

Idea of pattern search


[Figure from Miettinen: Nonlinear optimization, 2007 (in Finnish)]

Hooke and Jeeves


[Figure from Miettinen: Nonlinear optimization, 2007 (in Finnish)]

f(x) = (x₁ − 2)⁴ + (x₁ − 2x₂)²
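A compact sketch of the Hooke and Jeeves idea on this function (an illustrative implementation, not the exact variant plotted in the slides): exploratory moves along the coordinate axes, then a pattern move through the improved point:

```python
import numpy as np

def hooke_jeeves(f, x0, step=0.5, shrink=0.5, tol=1e-6):
    """Exploratory moves along the axes; on success, a pattern move
    continues through the improved point; on failure, shrink the step."""
    def explore(base, s):
        y = base.copy()
        for i in range(y.size):
            for cand in (y[i] + s, y[i] - s):
                trial = y.copy()
                trial[i] = cand
                if f(trial) < f(y):
                    y = trial
                    break
        return y

    x = np.asarray(x0, dtype=float)
    while step > tol:
        y = explore(x, step)
        if f(y) < f(x):
            # pattern move from y along the direction x → y, then re-explore
            z = explore(y + (y - x), step)
            x = z if f(z) < f(y) else y
        else:
            step *= shrink
    return x

f = lambda x: (x[0] - 2)**4 + (x[0] - 2*x[1])**2
print(hooke_jeeves(f, [0.0, 3.0]))  # ≈ [2. 1.], where f attains its minimum 0
```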

Hooke and Jeeves with fixed step length


[Figure from Miettinen: Nonlinear optimization, 2007 (in Finnish)]

f(x) = (x₁ − 2)⁴ + (x₁ − 2x₂)²

Powell's method

The most efficient pattern search method. It differs from Hooke and Jeeves in that at each pattern search step one of the coordinate directions is replaced with the previous pattern search direction.

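SciPy provides a ready-made (modified) Powell implementation, convenient for experimenting with the method on the same test function used in the Hooke and Jeeves slides:

```python
import numpy as np
from scipy.optimize import minimize

# Derivative-free Powell method as packaged in SciPy (a modified variant
# of the conjugate-direction scheme described above).
f = lambda x: (x[0] - 2)**4 + (x[0] - 2*x[1])**2
res = minimize(f, x0=np.array([0.0, 3.0]), method="Powell")
print(res.x, res.fun)  # ≈ [2. 1.], f ≈ 0
```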