Adaptive Operator Selection via Online Learning and ...

24
Adaptive Operator Selection via Online Learning and Fitness Landscape Metrics Pietro Consoli Leandro L. Minku Xin Yao CERCIA, School of Computer Science University of Birmingham, United Kingdom www.cs.bham.ac.uk/~pac265 [email protected] P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 1 / 23

Transcript of Adaptive Operator Selection via Online Learning and ...

Page 1: Adaptive Operator Selection via Online Learning and ...

Adaptive Operator Selection via Online Learningand Fitness Landscape Metrics

Pietro Consoli Leandro L. Minku Xin Yao

CERCIA, School of Computer Science

University of Birmingham, United Kingdom

www.cs.bham.ac.uk/~pac265

[email protected]

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 1 / 23

Page 2: Adaptive Operator Selection via Online Learning and ...

Outline

1 Motivation

2 Adaptive Crossover SelectionFitness Landscape MetricsOnline Learning

3 Case StudyCARP

4 Experimental StudiesResults

5 Future Work

6 Conclusions

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 2 / 23

Page 3: Adaptive Operator Selection via Online Learning and ...

Adaptive Crossover Selection

Different crossover operatorsmight lead to offspring withdifferent characteristics:

ExplorationFitnessGood traits transmission

We can expect differentsearch results on certaininstances

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 3 / 23

Page 4: Adaptive Operator Selection via Online Learning and ...

Adaptive Crossover Selection

Inst Op A Op B Op C Op D1 best x x x2 best x x x3 x x x best4 x best x x5 x x x best6 x x best x7 x x best x8 x x x best

Inst ACS(Op)1 A2 A3 D4 B5 D6 C7 C8 D

Adaptive Crossover SelectionAdaptively select the best crossover operator to use during the searchprocess

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 4 / 23

Page 5: Adaptive Operator Selection via Online Learning and ...

Adaptive Crossover Selection: Dynamic Scenario

Dynamic scenario:Different periods of thesearch might have differentbest crossover operators;Dynamic ACS potentiallybetter than static scenario

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 5 / 23

Page 6: Adaptive Operator Selection via Online Learning and ...

Research Questions

1 State-of-the art approaches for Credit Assignment consider theuse of just one measure (usually fitness). Enough to characterizethe current population distribution?

2 What Operator Selection Rule can handle a set of measures?

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 6 / 23

Page 7: Adaptive Operator Selection via Online Learning and ...

RQ1: Population Charachterization through FLA

Fitness Landscape Analysis (FLA): create a more “aware” snapshot ofthe current population distribution.

1 perform a set of 4 online FLA techniques during each generation;1 Average Escape Probability1 (Evolvability);2 Average ∆−Fitness of the neutral networks 2 (Neutrality);3 Average neutrality ratio 2 (Neutrality);4 Dispersion Metric3 (Population Distribution);

2 FLA not to predict hardness but to learn more the currentpopulation distribution.

1Lu, G., Li, J., Yao, X. - "Fitness-probability cloud and a measure of problem hardness forevolutionary algorithms" - 2011

2Vanneschi L., Pirola Y., Collard P. - "A Quantitative Study of Neutrality in GP BooleanLandscapes" - 2006

3Lunacek M., Whitley D., - "The Dispersion Metric and the CMA Evolution Strategy" - 2006P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 7 / 23

Page 8: Adaptive Operator Selection via Online Learning and ...

RQ2: Credit Assignment through Online Learning

Detection of changes analogous to Concept Drift tracking inOnline Learning;Concept Drift: change of the underlying distribution of the samplesduring the learning process;Online learning can be used to learn the relationship between FLAresults (input features) and the credit measure (output feature);Dynamic Weighted Majority (DWM) using Regression Trees asbase learners.

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 8 / 23

Page 9: Adaptive Operator Selection via Online Learning and ...

Dynamic Weighted Majority

DWM(p, β, θ,, τ );initialize a set of experts and assign an initial weight wj = 1 to each;create a window of the last training instances wTS(xi );forall the instances (x i , yi ) do

update wTS;forall the expert ej do

λi = predict(ej , x i );if |λi − yi | < τ and i mod p = 0 then

wj = β ∗ wj ;endif wj < θ and i mod p = 0 then

delete expert ej ;endnormalize weights (maximum weight equal to 1);calculate global prediction σi (weighted average prediction);if |σi − yi | < τ and i mod p = 0 then

create new expert ej and train with wTS;endtrain all experts with the new instance (x i , yi );return σi ;

endend

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 9 / 23

Page 10: Adaptive Operator Selection via Online Learning and ...

Capacitated Arc Routing Problem

Case Study: Crossover Operator Selection using the MAENSalgorithm for Capacitated Arc Routing Problem 4;Considers the use of a suite of four different crossoveroperators;Credit Assignment Mechanism: Proportional Reward (PR);we exploit the Local Search of MAENS* to perform the FLAtechniques without extra computational cost.

4K. Tang, Y. Mei, X. Yao - "Memetic Algorithm with Extended Neighborhood Search forCapacitated Arc Routing Problems" - 2009P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 10 / 23

Page 11: Adaptive Operator Selection via Online Learning and ...

Credit Mechanism

Credit Assignment Mechanism: percentage of offspring generated byeach operator surviving to the next generation.

Proportional Reward

PR(i)t =|x ∈ popt+1 : x generated by operator i|

|popt+1|

Indirect effect of crossover operator;We entrust the selection/ranking operator of the algorithm toevaluate the individuals.

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 11 / 23

Page 12: Adaptive Operator Selection via Online Learning and ...

CARP - instance

each arc (task) has a service cost and a demand;constraints: number of vehicles and capacity;objective function: minimize the total service cost;proved NP-Hard in 1981;many real-world applications (e.g. waste collection, road gritting).

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 12 / 23

Page 13: Adaptive Operator Selection via Online Learning and ...

CARP - instance

each arc (task) has a service cost and a demand;constraints: number of vehicles and capacity;objective function: minimize the total service cost;proved NP-Hard in 1981;many real-world applications (e.g. waste collection, road gritting).

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 12 / 23

Page 14: Adaptive Operator Selection via Online Learning and ...

MAENS*-II

FLA is performedduring each iteration;basic OperatorSelection Rule: largestinstantaneous rewardin order to reduce biasof previousperformances;Credit Assignmentthrough DWM.

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 13 / 23

Page 15: Adaptive Operator Selection via Online Learning and ...

Experimental studies

Experiments conducted on a set of 42 non-easy CARP instancesbelonging to egl, val and Beullen’s benchmark sets;Average fitness values calculated over 30 independent runs;In order to provide a lower bound and a term of comparison for theresults, an Oracle using only the Proportional Reward is built;

1 Tested optimizationresults against MAENS*,Oracle;

2 Tested prediction abilityagainst Oracle.

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 14 / 23

Page 16: Adaptive Operator Selection via Online Learning and ...

MAENS*-II vs MAENS*

MAENS* - uses MAB and Proportional Reward;MAENS*-II wins the comparisons with MAENS* on 20 instancesand loses on 18 out of 42 instances;Wilcoxon signed-rank test over the set of the instances suggeststhat there is no statistical difference between the results achievedby the two algorithms;6 instances show statistically different results using Wilcoxonrank-sum test on each couple of results.

Instance MAENS*-II MAENS*avg fitness std avg fitness std

D23 767.67 7.39 769.83 12.28E15 1604.33 5.59 1602.50 6.68E19 1442.00 4.58 1442.67 4.23F19 732.50 9.64 735.17 9.35

egl-s1-B 6397.59 12.70 6399.90 16.38egl-s2-B 13171.41 29.49 13179.07 26.11

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 15 / 23

Page 17: Adaptive Operator Selection via Online Learning and ...

MAENS*-II vs Oracle

Oracle achieves better results on 40 instances;On 2 instances MAENS*-II managed to achieve better results thanthe Oracle;If Oracle shows bound using only PR, then the use of FLA+PRcan enhance of the optimization ability of the algorithm.

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 16 / 23

Page 18: Adaptive Operator Selection via Online Learning and ...

Prediction Ability: Oracle

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

13971.10

13519.52

13410.23

13359.21

13329.27

13306.66

13284.98

13266.16

13259.01

13246.96

13238.19

13230.69

13224.19

13219.14

13210.87

13202.81

13198.70

13193.36

13185.43

13178.60

13173.01

13166.71

13162.42

13157.75

13138.98

gsbx grx pbx spbx

Figure: Oracle Selection Rates on instance egl-s2-BP. Consoli (University of Birmingham) The 43rd CREST Open Workshop 17 / 23

Page 19: Adaptive Operator Selection via Online Learning and ...

Prediction Ability: MAENS*

0

0.1

0.2

0.3

0.4

0.5

0.6

13977.76

13518.30

13426.18

13372.69

13340.94

13319.99

13303.27

13290.76

13279.41

13269.52

13261.32

13254.27

13247.09

13240.74

13234.68

13228.45

13222.88

13218.08

13213.19

13207.88

13202.49

13194.81

13186.17

13176.59

13162.09

gsbx grx pbx spbx

Figure: MAENS* Selection Rates on instance egl-s2-BP. Consoli (University of Birmingham) The 43rd CREST Open Workshop 18 / 23

Page 20: Adaptive Operator Selection via Online Learning and ...

Prediction Ability: MAENS*-II

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

13983.50

13517.70

13424.79

13371.06

13331.73

13301.41

13281.70

13269.87

13262.24

13254.12

13245.67

13238.68

13232.95

13228.56

13223.33

13217.34

13212.08

13207.40

13203.44

13198.58

13192.89

13186.89

13179.02

13169.77

13156.15

gsbx grx pbx spbx

Figure: MAENS*-II Selection Rates on instance egl-s2-BP. Consoli (University of Birmingham) The 43rd CREST Open Workshop 19 / 23

Page 21: Adaptive Operator Selection via Online Learning and ...

Future work

Integrated with a Reinforcement Learning mechanism withconcurrent use of the operators;Tested the use of a diversity-based reward measure;Improved results when using RL;outperformed state-of-the-art on Large Scale CARP instances.

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 20 / 23

Page 22: Adaptive Operator Selection via Online Learning and ...

Conclusions

Conclusions:Novel Adaptive Operator Selection strategy based on a set of FLAmeasures and online learning;Achieved comparable results w.r.t. MAB and outperformed theoracle in a few instances but still non optimal detection of changesin environment;

Future Directions:Improving the detection of changes in the environment;Test on Software Engineering Problems?

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 21 / 23

Page 23: Adaptive Operator Selection via Online Learning and ...

References

Consoli, P.; Minku, L. L; Yao, X.; "Dynamic Selection of EvolutionaryAlgorithm Operators Based on Online Learning and FitnessLandscape Metrics", Proceedings of the 10th International Conferenceon Simulated Evolution And Learning (SEAL’14), LNCS 8886, pp.359-370, December 2014Consoli P., Yao, X.; "Diversity-Driven Selection of Multiple CrossoverOperators for the Capacitated Arc Routing Problem", EvolutionaryComputation in Combinatorial Optimisation, Proceedings of the 14thEuropean Conference, EvoCOP 2014, Granada, Spain, April 23-25,2014, LNCS 8600, 2014, pp 97-108This work was supported by UK Engineering and Physical SciencesResearch Council (EPSRC) (Grant Nos. EP/I010297/1 andEP/J017515/1).

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 22 / 23

Page 24: Adaptive Operator Selection via Online Learning and ...

Thank You!

P. Consoli (University of Birmingham) The 43rd CREST Open Workshop 23 / 23