Surrogate CMA-ESSurrogate models
Experimental results
Benchmarking Gaussian Processes and Random Forests on the BBOB Noiseless Testbed
Lukáš Bajer (1,2), Zbynek Pitra (3,4), Martin Holena (2)
(1) Faculty of Mathematics and Physics, Charles University
(2) Institute of Computer Science, Czech Academy of Sciences
(3) National Institute of Mental Health
(4) Faculty of Nuclear Sciences and Physical Engineering
Prague, Czech Republic
July 2015
Lukáš Bajer, Zbynek Pitra, Martin Holena Benchmarking GP and RF Surrogates for the CMA-ES 1
Contents
1 Surrogate CMA-ES
2 Surrogate models
    Gaussian Processes
    Random Forests
3 Experimental results
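The CMA-ES
Input: $m \in \mathbb{R}^n$, $\sigma \in \mathbb{R}_+$, $\lambda \in \mathbb{N}$
Initialize: $C = I$ (and several other parameters)
Set the weights $w_1, \dots, w_\lambda$ appropriately
while not terminate
  1 $x_i = m + \sigma y_i$, $y_i \sim \mathcal{N}(0, C)$, for $i = 1, \dots, \lambda$ {sampling}
  2 evaluate $x_i$ with the original fitness
  3 $m \leftarrow \sum_{i=1}^{\mu} w_i x_{i:\lambda} = m + \sigma y_w$, $y_w = \sum_{i=1}^{\mu} w_i y_{i:\lambda}$ {update mean}
  4 update step-size $\sigma$
  5 update $C$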
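Steps 1 to 3 of the loop can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes C = I and a fixed step-size (steps 4 and 5 are left out), and the names `log_weights`, `generation` and `sphere` as well as the logarithmic weights are choices made for this sketch only.

```python
import math
import random


def log_weights(mu):
    """Positive, decreasing recombination weights that sum to one (assumed choice)."""
    raw = [math.log(mu + 0.5) - math.log(i + 1) for i in range(mu)]
    total = sum(raw)
    return [w / total for w in raw]


def generation(m, sigma, lam, fitness, rng):
    """Steps 1-3 with C = I: sample, evaluate, recombine the mu best points."""
    n = len(m)
    mu = lam // 2
    weights = log_weights(mu)
    # Step 1: x_i = m + sigma * y_i with y_i ~ N(0, I)
    ys = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(lam)]
    xs = [[m[j] + sigma * y[j] for j in range(n)] for y in ys]
    # Step 2: evaluate with the original fitness and rank (minimisation)
    order = sorted(range(lam), key=lambda i: fitness(xs[i]))
    # Step 3: m <- sum_i w_i x_{i:lambda}; zip stops after the mu best indices
    return [sum(w * xs[i][j] for w, i in zip(weights, order)) for j in range(n)]


def sphere(x):
    """Simple test fitness with its minimum at the origin."""
    return sum(v * v for v in x)


rng = random.Random(0)
m = [3.0] * 5
start = sphere(m)
for _ in range(60):
    m = generation(m, sigma=0.3, lam=10, fitness=sphere, rng=rng)
final = sphere(m)
```

With a fixed step-size the mean only approaches the optimum up to a distance of roughly the order of sigma; adapting sigma and C (steps 4 and 5) is what removes that limit in the full algorithm.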
The Surrogate CMA-ES
Input: $m \in \mathbb{R}^n$, $\sigma \in \mathbb{R}_+$, $\lambda \in \mathbb{N}$
Initialize: $C = I$ (and several other parameters)
Set the weights $w_1, \dots, w_\lambda$ appropriately
while not terminate
  1 $x_i = m + \sigma y_i$, $y_i \sim \mathcal{N}(0, C)$, for $i = 1, \dots, \lambda$ {sampling}
  2 evaluate $x_i$ with the original fitness $f$ and build a model $f_M$, or evaluate $x_i$ with the model $f_M$
  3 $m \leftarrow \sum_{i=1}^{\mu} w_i x_{i:\lambda} = m + \sigma y_w$, $y_w = \sum_{i=1}^{\mu} w_i y_{i:\lambda}$ {update mean}
  4 update step-size $\sigma$
  5 update $C$
The Surrogate CMA-ES
Input: $g$ (generation), $f_M$ (model), $A$ (archive), $n_{REQ}$, $\sigma$, $\lambda$, $m$, $C$
 1: $x_k \sim \mathcal{N}(m, \sigma^2 C)$, $k = 1, \dots, \lambda$ {CMA-ES sampling}
 2: if $g$ is original-evaluated then
 3:   $y_k \leftarrow f(x_k)$, $k = 1, \dots, \lambda$ {fitness evaluation}
 4:   $A = A \cup \{(x_k, y_k)\}_{k=1}^{\lambda}$
 5:   if $|X| \ge n_{REQ}$ then
 6:     $X \leftarrow$ TransformToTheEigenvectorBasis($X$, $\sigma$, $C$)
 7:     $f_M \leftarrow$ trainModel($X$, $y$)
 8:   end if
 9: else
10:   $X \leftarrow$ TransformToTheEigenvectorBasis($X$, $\sigma$, $C$)
11:   $y_k \leftarrow f_M(x_k)$, $k = 1, \dots, \lambda$ {model evaluation}
12: end if
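The control flow of this procedure can be sketched as follows. Several parts are assumptions made for illustration: the rule that every `period`-th generation is original-evaluated, the 1-nearest-neighbour stand-in used in place of a GP or RF model, and the omission of the eigenvector-basis transformation. The names `train_model` and `evaluate_generation` are hypothetical.

```python
def train_model(archive):
    """Stand-in surrogate (1-nearest neighbour); the slides use a GP or an RF here."""
    points = list(archive)  # snapshot of the archive at training time

    def predict(x):
        nearest = min(points, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
        return nearest[1]

    return predict


def evaluate_generation(g, xs, f, model, archive, n_req, period=2):
    """Return fitness values for xs and the (possibly retrained) model."""
    original = model is None or g % period == 0  # hypothetical switching rule
    if original:
        ys = [f(x) for x in xs]              # fitness evaluation
        archive.extend(zip(xs, ys))          # A = A ∪ {(x_k, y_k)}
        if len(archive) >= n_req:
            model = train_model(archive)     # f_M <- trainModel(X, y)
    else:
        ys = [model(x) for x in xs]          # model evaluation
    return ys, model


calls = []


def f(x):
    """Original fitness; records every call so the saved evaluations are visible."""
    calls.append(x)
    return sum(v * v for v in x)


archive, model = [], None
population = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
for g in range(4):
    ys, model = evaluate_generation(g, population, f, model, archive, n_req=3)
```

In this toy run generations 0 and 2 use the original fitness and generations 1 and 3 use the model, so only half of the twelve evaluations call `f`.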
Gaussian Process
GP is a stochastic approximation method based on Gaussian distributions
GP can express uncertainty of the prediction in a new point x: it gives a probability distribution of the output value
Gaussian Process
given a set of $N$ training points $X_N = (x_1 \dots x_N)^\top$, $x_i \in \mathbb{R}^d$,
and measured values $y_N = (y_1, \dots, y_N)^\top$
of a function $f$ being approximated
$y_i = f(x_i)$, $i = 1, \dots, N$
GP considers the vector of these function values as a sample from an $N$-variate Gaussian distribution
$y_N \sim \mathcal{N}(0, C_N)$
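Gaussian Process prediction
Making predictions
Let $C_{N+1}$ be the extended covariance matrix, extended by entries belonging to an unseen point $(x, y^*)$. Because $y_N$ is known and the inverse $C_{N+1}^{-1}$ can be expressed using the inverse of the training covariance $C_N^{-1}$, the density at a new point marginalizes to a 1D Gaussian density
$p(y^* \mid X_{N+1}, y_N) \propto \exp\left( -\frac{1}{2} \frac{(y^* - \hat{y}_{N+1})^2}{s^2_{\hat{y}_{N+1}}} \right)$
with the mean and variance given by
$\hat{y}_{N+1} = k^\top C_N^{-1} y_N$,
$s^2_{\hat{y}_{N+1}} = \kappa - k^\top C_N^{-1} k$,
where $k$ is the vector of covariances between the new point $x$ and the $N$ training points, and $\kappa$ is the covariance of $x$ with itself.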
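The two prediction formulas can be checked numerically with a small dependency-free sketch. The squared-exponential covariance, its unit length scale and the small `noise` term on the diagonal are assumptions made here (the slides do not state the covariance function), and `sq_exp`, `solve` and `gp_predict` are hypothetical helper names.

```python
import math


def sq_exp(a, b, length=1.0):
    """Squared-exponential covariance (an assumed choice of kernel)."""
    return math.exp(-sum((u - v) ** 2 for u, v in zip(a, b)) / (2.0 * length ** 2))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= factor * M[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][j] * z[j] for j in range(r + 1, n))) / M[r][r]
    return z


def gp_predict(X, y, x_new, noise=1e-8):
    """Return the predictive mean and variance at x_new (zero prior mean)."""
    n = len(X)
    C = [[sq_exp(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    k = [sq_exp(xi, x_new) for xi in X]
    kappa = sq_exp(x_new, x_new) + noise
    mean = sum(ki * ai for ki, ai in zip(k, solve(C, y)))         # k^T C_N^{-1} y_N
    var = kappa - sum(ki * bi for ki, bi in zip(k, solve(C, k)))  # kappa - k^T C_N^{-1} k
    return mean, var


X = [[-1.0], [0.0], [1.0]]
y = [1.0, 0.0, 1.0]
mean_train, var_train = gp_predict(X, y, [0.0])  # at a training point
mean_far, var_far = gp_predict(X, y, [5.0])      # far from all training points
```

At a training point the prediction reproduces the measured value with almost no variance, while far from the data it falls back to the zero prior mean with a variance close to the prior variance. That behaviour is the uncertainty information a GP surrogate provides.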
Decision tree
A decision tree is a tree where each split node stores a test function to be applied to the incoming data and each leaf stores a predictor.
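As an illustration of this definition, the sketch below stores a test function in each split node and a predictor in each leaf. It assumes axis-aligned threshold tests and constant leaf predictors, which is a common choice but is not stated in the slides; `Node` and `predict` are names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A split node (feature, threshold, children) or a leaf (value only)."""
    value: Optional[float] = None   # constant predictor stored in a leaf
    feature: int = 0                # index of the tested coordinate
    threshold: float = 0.0          # test function: x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def predict(node, x):
    """Route x through the test functions until a leaf is reached."""
    while node.value is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


tree = Node(feature=0, threshold=0.0,
            left=Node(value=-1.0),
            right=Node(feature=1, threshold=2.0,
                       left=Node(value=0.5), right=Node(value=3.0)))
```

For example, `predict(tree, [1.0, 5.0])` passes the first test to the right and the second test to the right again, so it returns the leaf value 3.0.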
Decision tree
Advantages and disadvantages
Advantages:
  Relatively fast
  Easy to interpret
  Adaptive: structure and parameters learned from training data
Disadvantages:
  Sharp decision boundaries
  Not the best predictive accuracy
Random forests
A collection of randomly trained decision trees
Overall prediction determined by averaging
All advantages of decision trees
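A rough sketch of the idea: each tree is trained on a bootstrap sample with a randomly chosen feature, and the forest prediction is the average over the trees. The trees here are depth-1 regression stumps to keep the example short; this simplification, the bootstrap scheme and the names `fit_stump` and `fit_forest` are assumptions, not the configuration benchmarked in the slides.

```python
import random


def fit_stump(X, y, rng):
    """Depth-1 regression tree on one randomly chosen feature."""
    j = rng.randrange(len(X[0]))
    best = None
    for t in sorted({x[j] for x in X})[:-1]:   # candidate thresholds
        left = [yi for x, yi in zip(X, y) if x[j] <= t]
        right = [yi for x, yi in zip(X, y) if x[j] > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:                           # no split possible: constant leaf
        mean = sum(y) / len(y)
        return lambda x: mean
    _, t, ml, mr = best
    return lambda x: ml if x[j] <= t else mr


def fit_forest(X, y, n_trees=50, seed=0):
    """Train each stump on a bootstrap sample; predict by averaging."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], rng))
    return lambda x: sum(tree(x) for tree in trees) / len(trees)


X = [[float(i)] for i in range(10)]
y = [0.0] * 5 + [1.0] * 5          # a step function
forest = fit_forest(X, y)
low, mid, high = forest([0.0]), forest([4.5]), forest([9.0])
```

Because the bootstrap samples place the split at slightly different positions, the averaged prediction near the step is smoother than that of any single stump, which softens the sharp decision boundaries listed as a disadvantage of single trees.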
Experimental results on BBOB (5 D)
[Figure: f1-24, 5-D. x-axis: log10 of (# f-evals / dimension), 0 to 3; y-axis: proportion of function+target pairs, 0.0 to 1.0. Curves (legend order): RF5-CMAES, RF1-CMAES, GP5-CMAES, GP1-CMAES, CMA-ES, best 2009.]
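The vertical axis of such a plot, the proportion of function+target pairs reached within a given budget, can be computed as an empirical cumulative distribution. The sketch below is a simplified reading of that quantity (it ignores, for instance, any aggregation over repeated runs), the function name `ecdf` is hypothetical, and the numbers are made up for illustration; they are not results from the slides.

```python
import math


def ecdf(evals_to_target, dim, log_budgets):
    """Fraction of function+target pairs reached within each budget.

    evals_to_target maps (function, target) to the number of f-evaluations
    needed to reach the target, or math.inf if it was never reached.
    log_budgets are values of log10(# f-evals / dimension).
    """
    runtimes = [e / dim for e in evals_to_target.values()]
    return [sum(r <= 10 ** b for r in runtimes) / len(runtimes) for b in log_budgets]


# Made-up numbers for illustration only.
evals = {
    ("f1", 1e1): 20, ("f1", 1e-8): 400,
    ("f2", 1e1): 150, ("f2", 1e-8): math.inf,
}
curve = ecdf(evals, dim=5, log_budgets=[0, 1, 2, 3])
```

A curve that rises earlier and ends higher means that more targets were reached with fewer evaluations of the original fitness.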
Experimental results on BBOB (10 D)
[Figure: f1-24, 10-D. x-axis: log10 of (# f-evals / dimension), 0 to 3; y-axis: proportion of function+target pairs, 0.0 to 1.0. Curves (legend order): RF5-CMAES, GP5-CMAES, RF1-CMAES, CMA-ES, GP1-CMAES, best 2009.]
Experimental results on BBOB (20 D)
[Figure: f1-24, 20-D. x-axis: log10 of (# f-evals / dimension), 0 to 3; y-axis: proportion of function+target pairs, 0.0 to 1.0. Curves (legend order): RF5-CMAES, GP5-CMAES, RF1-CMAES, CMA-ES, GP1-CMAES, best 2009.]
ECDF results on the whole BBOB (5 D)
[Figure: five ECDF panels in 5-D. x-axis: log10 of (# f-evals / dimension), 0 to 3; y-axis: proportion of function+target pairs, 0.0 to 1.0. Curves in every panel: RF5-CMAES, RF1-CMAES, GP5-CMAES, GP1-CMAES, CMA-ES, best 2009. Panels:
  separable (f1-5)
  moderate (f6-9)
  ill-conditioned (f10-14)
  multi-modal (f15-19)
  weakly structured multi-modal (f20-24)]
ECDF results on the whole BBOB (20 D)
[Figure: five ECDF panels in 20-D. x-axis: log10 of (# f-evals / dimension), 0 to 3; y-axis: proportion of function+target pairs, 0.0 to 1.0. Curves in every panel: RF5-CMAES, RF1-CMAES, GP5-CMAES, GP1-CMAES, CMA-ES, best 2009. Panels:
  separable (f1-5)
  moderate (f6-9)
  ill-conditioned (f10-14)
  multi-modal (f15-19)
  weakly structured multi-modal (f20-24)]
Results on separable BBOB functions (1–5)
[Figure: five panels, one per function: 1 Sphere, 2 Ellipsoid separable, 3 Rastrigin separable, 4 Skew Rastrigin-Bueche separ, 5 Linear slope. Horizontal axis ticks: 2, 3, 5, 10, 20, 40. Panel annotation: target RL/dim: 10. Legend: CMA-ES, GP1-CMAES, GP5-CMAES, RF1-CMAES, RF5-CMAES.]
Results on ill-conditioned BBOB functions (10–14)
[Figure: five panels, one per function: 10 Ellipsoid, 11 Discus, 12 Bent cigar, 13 Sharp ridge, 14 Sum of different powers. Horizontal axis ticks: 2, 3, 5, 10, 20, 40. Panel annotation: target RL/dim: 10.]
Results on weakly structured multi-modal fcts (20–24)
[Figure: five panels, one per function: 20 Schwefel x*sin(x), 21 Gallagher 101 peaks, 22 Gallagher 21 peaks, 23 Katsuuras, 24 Lunacek bi-Rastrigin. Horizontal axis ticks: 2, 3, 5, 10, 20, 40. Panel annotation: target RL/dim: 10. Legend: CMA-ES, GP1-CMAES, GP5-CMAES, RF1-CMAES, RF5-CMAES.]
Conclusions
S-CMA-ES sped up the CMA-ES on several BBOB functions
Gaussian processes usually exhibit better performance than random forests
Random forests' performance is rather balanced in 20D, where Gaussian processes lose because of the high dimensionality
Further investigation:
  adaptivity of the number of model generations
  reduction of the model training phase by starting from old parameters
  random forest model precision
Thank you!
bajer at cs dot cas dot cz
pitra dot z at gmail dot com