Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010


Description

Taking inspiration from approximate ranking, this paper investigates the use of a rank-based Support Vector Machine as a surrogate model within CMA-ES, enforcing the invariance of the approach with respect to monotone transformations of the fitness function. Whereas the choice of the SVM kernel is known to be a critical issue, the proposed approach uses the Covariance Matrix adapted by CMA-ES within a Gaussian kernel, ensuring the adaptation of the kernel to the currently explored region of the fitness landscape at almost no computational overhead. The empirical validation of the approach on standard benchmarks, against CMA-ES and recent surrogate-based variants of CMA-ES, demonstrates the efficiency and scalability of the proposed approach.

Transcript of "Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates", PPSN 2010

Page 1: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Comparison-Based Optimizers Need Comparison-Based Surrogates

Ilya Loshchilov (1,2), Marc Schoenauer (1,2), Michèle Sebag (2,1)

(1) TAO Project-team, INRIA Saclay - Île-de-France
(2) Laboratoire de Recherche en Informatique (UMR CNRS 8623), Université Paris-Sud, 91128 Orsay Cedex, France

Ilya Loshchilov, Marc Schoenauer, Michèle Sebag ACM-ES = CMA-ES + RankSVM 1/ 20

Page 2: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Content

1. Motivation: Why Comparison-Based Surrogates? / Previous Work
2. Background: Covariance Matrix Adaptation Evolution Strategy (CMA-ES) / Support Vector Machine (SVM)
3. ACM-ES: Algorithm / Experiments

Page 3: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Why Comparison-Based Surrogates?

Surrogate model (meta-model) assisted optimization:

Construct an approximation model M(x) of f(x).

Optimize the model M(x) in lieu of f(x) to reduce the number of costly evaluations of the function f(x).

Example

f(x) = x1^2 + (x1 + x2)^2.

An efficient Evolutionary Algorithm (EA) with surrogate models may be 4.3 times faster on f(x).

But the same EA is only 2.4 times faster on f′(x) = f(x)^{1/4}! (a)

(a) CMA-ES with quadratic meta-model (lmm-CMA-ES) on f_Schwefel in 2-D.
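The sensitivity above can be illustrated directly: a value-fitting surrogate changes under a monotone transformation of f, while the rankings of the sampled points do not. A minimal sketch, not from the slides; the quadratic test function mirrors the example, and the least-squares fit is an illustrative stand-in for a value-based meta-model:

```python
import numpy as np

def f(x):
    # 2-D test function from the slide's example
    return x[0]**2 + (x[0] + x[1])**2

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = np.array([f(x) for x in X])
y_transformed = y**0.25            # monotone transformation f' = f^(1/4)

# Rankings are identical under any monotone transformation...
assert (np.argsort(y) == np.argsort(y_transformed)).all()

# ...but a least-squares (value-fitting) quadratic surrogate changes.
A = np.column_stack([X[:, 0]**2, X[:, 1]**2, X[:, 0] * X[:, 1]])
coef_f  = np.linalg.lstsq(A, y, rcond=None)[0]
coef_ft = np.linalg.lstsq(A, y_transformed, rcond=None)[0]
print(coef_f, coef_ft)             # two different models for the "same" landscape
```

A comparison-based surrogate depends only on the ranking, so it is unaffected by the transformation; a value-based one has to refit.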


Page 4: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Ordinal Regression in Evolutionary Computation

Goal: find the function F(x) which preserves the ordering of the training points xi (xi has rank i):

xi ≻ xj ⇔ F(xi) > F(xj)

F(x) is invariant to any rank-preserving transformation.

CMA-ES with Rank Support Vector Machine on Rosenbrock (1):

[Figure: mean fitness vs. number of function evaluations for (μI, λ)-CMA-ES with polynomial (d = 2, d = 4) and RBF (γ = 1) surrogate kernels, for dimensions n = 2, 5, 10.]

(1) T. Runarsson (2006). "Ordinal Regression in Evolutionary Computation"


Page 5: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Exploit the local topography of the function

CMA-ES adapts the covariance matrix C, which describes the local structure of the function.

Mahalanobis (fully weighted Euclidean) distance:

d(xi, xj) = sqrt((xi − xj)^T C^{−1} (xi − xj))
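The distance above is straightforward to compute; a minimal numpy sketch (the identity covariance in the example is illustrative, standing in for the matrix C adapted by CMA-ES):

```python
import numpy as np

def mahalanobis(xi, xj, C_inv):
    """d(xi, xj) = sqrt((xi - xj)^T C^-1 (xi - xj))."""
    d = xi - xj
    return float(np.sqrt(d @ C_inv @ d))

# With C = I, the Mahalanobis distance reduces to the Euclidean distance.
xi, xj = np.array([1.0, 2.0]), np.array([4.0, 6.0])
print(mahalanobis(xi, xj, np.eye(2)))  # 5.0, same as np.linalg.norm(xi - xj)
```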

Results of CMA-ES with quadratic meta-models (2):

[Figure: lmm-CMA-ES, computational cost in FLOPs per saved function evaluation vs. problem dimension n.]

Speed-up: a factor of 2-4 for n ≥ 4.
Complexity: from O(n^4) to O(n^6); becomes intractable for n > 16.
Rank-preserving invariance: NO.

(2) S. Kern et al. (2006). "Local Meta-Models for Optimization Using Evolution Strategies"


Page 6: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Tractable or Efficient?

Answer: Tractable and Efficient and Invariant.

Ingredients: CMA-ES (Adaptive Encoding) and Rank SVM.


Page 7: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Covariance Matrix Adaptation Evolution Strategy: Decompose to understand

While CMA-ES is by definition CMA and ES, only recently has this algorithmic decomposition been presented (3).

Algorithm 1: CMA-ES = Adaptive Encoding + ES
1: xi ← m + σ Ni(0, I), for i = 1 … λ
2: fi ← f(B xi), for i = 1 … λ
3: if Evolution Strategy (ES) then
4:   σ ← σ exp∝(success rate / expected success rate − 1)
5: if Cumulative Step-Size Adaptation ES (CSA-ES) then
6:   σ ← σ exp∝(‖evolution path‖ / ‖expected evolution path‖ − 1)
7: B ← AECMA-Update(B x1, …, B xμ)

(3) N. Hansen (2008). "Adaptive Encoding: How to Render Search Coordinate System Invariant"
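The decomposition can be sketched as a minimal search loop. The sketch below is an illustrative assumption, not the authors' code: it keeps the ES part concrete (a success-rate step-size rule in the spirit of line 4) and stubs the encoding update with the identity, since the AECMA-Update rule itself is not spelled out on the slide:

```python
import numpy as np

def es_minimize(f, m, sigma=1.0, lam=8, mu=4, iters=200, seed=0):
    """Minimal (mu, lambda)-ES skeleton following the decomposition above.

    B is the adaptive encoding matrix; here it stays the identity, stubbing
    out AECMA-Update (line 7), so only the ES part (lines 1-4) is concrete.
    """
    rng = np.random.default_rng(seed)
    n = len(m)
    B = np.eye(n)                                       # encoding (stubbed)
    for _ in range(iters):
        f_parent = f(B @ m)
        X = m + sigma * rng.standard_normal((lam, n))   # line 1
        fs = np.array([f(B @ x) for x in X])            # line 2
        # line 4: a 1/5th-success-rule flavour of the step-size update
        success_rate = np.mean(fs < f_parent)
        sigma *= np.exp(success_rate - 0.2)
        m = X[np.argsort(fs)[:mu]].mean(axis=0)         # recombine the mu best
        # line 7: B <- AECMA-Update(B x_1, ..., B x_mu)  (omitted here)
    return m, f(B @ m)

m, fb = es_minimize(lambda x: float(np.sum(x**2)), np.ones(5) * 3.0)
print(fb)  # far below the initial f = 45
```

Replacing the identity B with the adaptive encoding is exactly what turns this skeleton into CMA-ES.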


Page 8: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Adaptive Encoding: Inspired by Principal Component Analysis (PCA)

[Figures: Principal Component Analysis (left); Adaptive Encoding Update (right).]


Page 9: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Support Vector Machine for Classification: Linear Classifier

[Figure: linear classifier with margin 2/‖w‖; support vectors lie on the hyperplanes ⟨w, x⟩ − b = ±1.]

Main Idea

Training data: D = {(xi, yi) | xi ∈ R^p, yi ∈ {−1, +1}}, i = 1 … n

⟨w, xi⟩ ≥ b + ε ⇒ yi = +1;
⟨w, xi⟩ ≤ b − ε ⇒ yi = −1.

Dividing by ε > 0:

⟨w, xi⟩ − b ≥ +1 ⇒ yi = +1;
⟨w, xi⟩ − b ≤ −1 ⇒ yi = −1.

Optimization Problem: Primal Form

Minimize over {w, ξ}:  (1/2)‖w‖² + C Σ_{i=1}^{n} ξi

subject to: yi(⟨w, xi⟩ − b) ≥ 1 − ξi,  ξi ≥ 0.


Page 10: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Support Vector Machine for Classification: Linear Classifier

[Figure: same linear-classifier illustration as on the previous slide.]

Optimization Problem: Dual Form

From the Lagrange theorem, instead of minimizing F: minimize over {α, G}: F − Σi αi Gi, subject to: αi ≥ 0, Gi ≥ 0.

Leaving out the details, the dual form:

Maximize over {α}:  Σ_{i=1}^{n} αi − (1/2) Σ_{i,j=1}^{n} αi αj yi yj ⟨xi, xj⟩

subject to: 0 ≤ αi ≤ C,  Σ_{i=1}^{n} αi yi = 0

Properties

Decision function: F(x) = sign(Σ_{i=1}^{n} αi yi ⟨xi, x⟩ − b)

The dual form may be solved using a standard quadratic programming solver.


Page 11: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Support Vector Machine for Classification: Non-Linear Classifier

[Figure: the mapping Φ sends the input space to a feature space where a linear separator exists, with margin 2/‖w‖ between ⟨w, Φ(x)⟩ − b = +1 and ⟨w, Φ(x)⟩ − b = −1.]

Non-linear classification with the "kernel trick":

Maximize over {α}:  Σ_{i=1}^{n} αi − (1/2) Σ_{i,j=1}^{n} αi αj yi yj K(xi, xj)

subject to: αi ≥ 0,  Σ_{i=1}^{n} αi yi = 0,

where K(x, x′) := ⟨Φ(x), Φ(x′)⟩ is the kernel function.

Decision function: F(x) = sign(Σ_{i=1}^{n} αi yi K(xi, x) − b)


Page 12: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Support Vector Machine for Classification: Non-Linear Classifier, Kernels

Polynomial: k(xi, xj) = (⟨xi, xj⟩ + 1)^d

Gaussian or Radial Basis Function (RBF): k(xi, xj) = exp(−‖xi − xj‖² / (2σ²))

Hyperbolic tangent: k(xi, xj) = tanh(κ⟨xi, xj⟩ + c)

[Figure: examples of decision boundaries for polynomial (left) and Gaussian (right) kernels.]
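A minimal numpy sketch of the three kernels above (the parameter names d, sigma, kappa and c mirror the formulas; the example vectors are illustrative):

```python
import numpy as np

def poly_kernel(xi, xj, d=2):
    """Polynomial kernel (<xi, xj> + 1)^d."""
    return (np.dot(xi, xj) + 1.0) ** d

def rbf_kernel(xi, xj, sigma=1.0):
    """Gaussian/RBF kernel exp(-||xi - xj||^2 / (2 sigma^2))."""
    diff = np.asarray(xi) - np.asarray(xj)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma**2))

def tanh_kernel(xi, xj, kappa=1.0, c=0.0):
    """Hyperbolic tangent kernel tanh(kappa <xi, xj> + c)."""
    return np.tanh(kappa * np.dot(xi, xj) + c)

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(poly_kernel(x, y))   # (0 + 1)^2 = 1.0
print(rbf_kernel(x, x))    # an RBF kernel is always 1 at zero distance
```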


Page 13: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Ranking Support Vector Machine

Find F(x) which preserves the ordering of the training points.

[Figure: weight vector w and two level sets L(r1), L(r2) of the ranking function over the training points.]


Page 14: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Ranking Support Vector Machine

Primal problem

Minimize over {w, ξ}:  (1/2)‖w‖² + Σ_{i=1}^{N−1} Ci ξi

subject to:
⟨w, Φ(xi) − Φ(xi+1)⟩ ≥ 1 − ξi  (i = 1 … N−1)
ξi ≥ 0  (i = 1 … N−1)

Dual problem

Maximize over {α}:  Σ_{i=1}^{N−1} αi − (1/2) Σ_{i,j=1}^{N−1} αi αj K(xi − xi+1, xj − xj+1)

subject to: 0 ≤ αi ≤ Ci  (i = 1 … N−1)

Rank surrogate function in the case "1 rank = 1 point":

F(x) = Σ_{i=1}^{N−1} αi (K(xi, x) − K(xi+1, x))
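The surrogate F(x) above can be exercised end to end. The sketch below is an illustrative assumption, not the authors' solver: it fits the dual with naive projected gradient ascent (a real implementation would use a dedicated QP or SMO-style solver) and uses a linear kernel on five 1-D training points ranked best-first:

```python
import numpy as np

def fit_rank_svm(X, kernel, C=10.0, eta=0.05, iters=500):
    """Naive projected gradient ascent on the Rank SVM dual.

    X holds N training points sorted best-first (x_i preferred to x_{i+1}).
    Returns the N-1 dual coefficients alpha, clipped to [0, C].
    """
    N = len(X)
    K = np.array([[kernel(a, b) for b in X] for a in X])
    # Kernel of rank-constraint differences:
    # <Phi(x_i) - Phi(x_{i+1}), Phi(x_j) - Phi(x_{j+1})>
    G = K[:-1, :-1] - K[:-1, 1:] - K[1:, :-1] + K[1:, 1:]
    alpha = np.zeros(N - 1)
    for _ in range(iters):
        alpha = np.clip(alpha + eta * (1.0 - G @ alpha), 0.0, C)
    return alpha

def surrogate(x, X, alpha, kernel):
    """F(x) = sum_i alpha_i (K(x_i, x) - K(x_{i+1}, x))."""
    return sum(a * (kernel(X[i], x) - kernel(X[i + 1], x))
               for i, a in enumerate(alpha))

linear = lambda a, b: float(np.dot(a, b))
X = np.array([0.0, 0.5, 1.0, 1.5, 2.0])        # already ranked: smaller is better
alpha = fit_rank_svm(X, linear)
F = [surrogate(x, X, alpha, linear) for x in X]
assert all(F[i] > F[i + 1] for i in range(4))  # ordering is preserved
```

Any kernel from the previous slide can be passed in place of the linear one; only the ordering of F matters, which is exactly why the surrogate is comparison-based.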


Page 15: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Model Learning: Non-separable Ellipsoid problem

K(xi, xj) = exp(−(xi − xj)^T (xi − xj) / (2σ²));   K_C(xi, xj) = exp(−(xi − xj)^T C^{−1} (xi − xj) / (2σ²))
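The covariance-adapted kernel K_C simply swaps the Euclidean inner product for the Mahalanobis one, so for C = I the two kernels coincide. A minimal numpy sketch (the example points are illustrative; in ACM-ES the matrix C comes from CMA-ES):

```python
import numpy as np

def k_rbf(xi, xj, sigma=1.0):
    """Plain Gaussian kernel K(xi, xj)."""
    d = xi - xj
    return np.exp(-(d @ d) / (2.0 * sigma**2))

def k_c(xi, xj, C, sigma=1.0):
    """Gaussian kernel K_C in the coordinate system defined by covariance C."""
    d = xi - xj
    return np.exp(-(d @ np.linalg.inv(C) @ d) / (2.0 * sigma**2))

xi, xj = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(k_c(xi, xj, np.eye(2)), k_rbf(xi, xj))  # equal when C = I
```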


Page 16: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Optimization or filtering? Don't be too greedy

Optimization: significant potential speed-up if the surrogate model is global and accurate enough.

Filtering: "guaranteed" speed-up with the local surrogate model.

[Figure: probability density of offspring rank. Procedure: prescreen λ̃ pre-children with the surrogate, evaluate λ′ of them with the true fitness, retain λ according to rank.]


Page 17: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

ACM-ES Optimization Loop

[Figure: one iteration of ACM-ES. A: select training points. B: build a surrogate model. C: generate pre-children. D: select the most promising children.]

1. Select the best training points.
2. Apply the change of coordinates, defined from the current covariance matrix and the current mean value.
3. Build a surrogate model using Rank SVM.
4. Generate λ̃ pre-children and rank them according to the surrogate fitness function.
5. Evaluate the λ′ most promising pre-children with the true fitness function.
6. Retain λ of them according to rank.
7. Add new training points and update the parameters of CMA-ES.
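The surrogate-based pre-screening (steps 4-6) can be sketched in isolation. This is an illustrative reduction, not the authors' code: the surrogate here is any callable ranking model, and the function and variable names are assumptions:

```python
import numpy as np

def prescreen(pre_children, surrogate, f_true, n_evaluate):
    """Rank pre-children by the (cheap) surrogate, evaluate only the best
    n_evaluate of them with the expensive true fitness, and return those
    points paired with their true fitness values."""
    order = np.argsort([surrogate(x) for x in pre_children])
    chosen = [pre_children[i] for i in order[:n_evaluate]]
    return [(x, f_true(x)) for x in chosen]

# Toy check: with a perfectly informative surrogate, only the truly best
# pre-children are sent to the expensive function.
f = lambda x: float(np.sum(np.square(x)))
pre = [np.array([v]) for v in [3.0, -1.0, 0.5, 2.0, -0.25]]
best = prescreen(pre, f, f, n_evaluate=2)
print([x[0][0] for x in best])  # the two points closest to the optimum
```

In ACM-ES the surrogate is the Rank SVM model of the previous slides, so only its ordering of the pre-children matters, never its values.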


Page 18: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Results: Speed-up

[Figure: speed-up of ACM-ES over CMA-ES vs. problem dimension (up to n = 40) for Schwefel, Schwefel^1/4, Rosenbrock, Noisy Sphere, Ackley, and Ellipsoid.]

Function       n   λ    λ′   e     ACM-ES              spu  CMA-ES
Schwefel      10  10    3          801 ± 36            3.3  2667 ± 87
Schwefel      20  12    4          3531 ± 179          2.0  7042 ± 172
Schwefel      40  15    5          13440 ± 281         1.7  22400 ± 289
Schwefel^1/4  10  10    3          1774 ± 37           4.1  7220 ± 206
Schwefel^1/4  20  12    4          6138 ± 82           2.5  15600 ± 294
Schwefel^1/4  40  15    5          22658 ± 390         1.8  41534 ± 466
Rosenbrock    10  10    3          2059 ± 143 (0.95)   3.7  7669 ± 691 (0.90)
Rosenbrock    20  12    4          11793 ± 574 (0.75)  1.8  21794 ± 1529
Rosenbrock    40  15    5          49750 ± 2412 (0.9)  1.6  82043 ± 3991
NoisySphere   10  10    3    0.15  766 ± 90 (0.95)     2.7  2058 ± 148
NoisySphere   20  12    4    0.11  1361 ± 212          2.8  3777 ± 127
NoisySphere   40  15    5    0.08  2409 ± 120          2.9  7023 ± 173
Ackley        10  10    3          892 ± 28            4.1  3641 ± 154
Ackley        20  12    4          1884 ± 50           3.5  6641 ± 108
Ackley        40  15    5          3690 ± 80           3.3  12084 ± 247
Ellipsoid     10  10    3          1628 ± 95           3.8  6211 ± 264
Ellipsoid     20  12    4          8250 ± 393          2.3  19060 ± 501
Ellipsoid     40  15    5          33602 ± 548         2.1  69642 ± 644
Rastrigin      5 140   70          23293 ± 1374 (0.3)  0.5  12310 ± 1098 (0.75)

(Mean ± standard deviation of the number of function evaluations; success rates in parentheses; spu = speed-up of ACM-ES over CMA-ES.)


Page 19: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Results: Learning Time

[Figure: learning time (ms) vs. problem dimension, log-log scale, slope = 1.13.]

The cost of model learning/testing increases quasi-linearly with d on the Sphere function.


Page 20: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010

Summary

ACM-ES

ACM-ES is from 2 to 4 times faster on uni-modal problems.

Invariant to rank-preserving transformations: Yes.

The computational complexity (the cost of the speed-up) is O(n), compared to O(n^6) for quadratic meta-models.

The source code is available online: http://www.lri.fr/~ilya/publications/ACMESppsn2010.zip

Open Questions

Extension to multi-modal optimization.

Adaptation of selection pressure and surrogate model complexity.


Page 21: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010


Thank you for your attention!

Questions?

Page 22: Fast Evolutionary Optimization: Comparison-Based Optimizers Need Comparison-Based Surrogates. PPSN 2010


Parameters

SVM Learning:

Number of training points: N_training = 30√d for all problems, except Rosenbrock and Rastrigin, where N_training = 70√d.

Number of iterations: N_iter = 50000√d.

Kernel function: RBF with σ equal to the average distance of the training points.

Cost of constraint violation: Ci = 10^6 (N_training − i)^2.0.

Offspring Selection Procedure:

Number of test points: N_test = 500.

Number of evaluated offspring: λ′ = λ/3.

Offspring selection pressure parameters: σ²_sel0 = 2σ²_sel1 = 0.8.
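These settings can be collected into a small helper. The sketch below is an illustrative reading of this slide, not reference code: λ′ = λ/3 is inferred from the experiment table, and rounding down to an integer is an assumption:

```python
import math

def acm_es_params(d, lam, hard_problem=False):
    """Hyperparameters of ACM-ES as listed on this slide.

    hard_problem selects the 70*sqrt(d) training-set size used for
    Rosenbrock and Rastrigin; rounding via int() is assumed.
    """
    base = 70 if hard_problem else 30
    return {
        "n_training": int(base * math.sqrt(d)),
        "n_iter": int(50000 * math.sqrt(d)),
        "n_test": 500,
        "lambda_prescreened": lam // 3,   # lambda' = lambda / 3 (inferred)
    }

print(acm_es_params(10, 10))
```

For d = 10, λ = 10 this reproduces the N_training ≈ 94 and λ′ = 3 implied by the table on the Results slide.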