Generation 1 Generation 5 Generation 10deep-learning.compute.dtu.dk/wp-content/uploads/... ·...

1
C OMPETITIVE C OEVOLUTION IN P ROBLEM D ESIGN & M ETAHEURISTICAL P ARAMETER T UNING M.S C .T HESIS G UÐMUNDUR E INARSSON ,T HOMAS P HILIP R UNARSSON &G UNNAR S TEFÁNSSON School of Engineering and Natural Sciences, University of Iceland, Iceland. A BSTRACT The Marine Research Institute of Iceland has over the last 20 years developed and used the program gadget for modeling of the marine ecosystem around Iceland. The estimation of pa- rameters in this program requires constrained optimization in a continuous domain. In this thesis a coevolutionary algorithm approach is developed to tune the optimization parameters in gadget. The objective of the coevolutionary algorithm is to find optimization parameters that both make the optimization methods in gadget more robust against poorly chosen starting values and tries to reduce the computation time while maintain- ing convergence. This is important when bootstrapping is per- formed on the models in gadget to reduce computation time. This may also ease the tuning of optimization parameters for new users and may reveal other local optima in the likelihood, which may give hint of model misspecification. The algorithm is tested on functions that have similar characteristics as the log- likelihood functions in gadget and some results shown for the case of modeling haddock. H EURISTICS AND M ETAHEURISTICS Heuristics in optimization are methods that find approximate solutions to optimization problems. The purpose is to provide good seeding values for exact algorithms and sometime they are the only viable option when the search space is large, mul- timodality is present in the objective function or the objective function is illconditioned. Other problems also come up where heuristics are used e.g. combinatorial optimization. Heuristics trade various aspects of exact algorithms for speed, e.g. optimality and precision. Another thing about heuristics is that they are often based on a rule of thumb or ex- amples from nature, which often makes it hard to proof any- thing for their performance. These methods are sometimes ad hoc and don’t generalize well to other problems. Metaheuristics can be summed up in the following defini- tion by Glover (1986) A metaheuristic is a high-level problem- independent algorithmic framework that provides a set of guidelines or strategies to develop heuristic optimization algorithms. The term is also used to refer to a problem specific implementation of a heuristic optimization algorithm according to the guidelines expressed in such a framework. Note that a metaheuristic can refer to a framework, such as evo- lutionary algorithms and a specific algorithm, such as differential evolution. One of the reasons these algorithms have become popular lately is because they usually scale very well to larger computer architectures, they are easily parallelizable. The reason for this is that many of these algorithms are based on agents or individ- uals that cover the search space. These can be independant or have some interactions so the computation for each agent can be calculated on an individual CPU core. Although metaheuristics may not be directly applicable to deep learning, it has still potential to be used for feature selec- tion and other aspects of machine learning. C OMPETITIVE C OEVOLUTION Coevolution is a term from evolutionary biology. It symbolises the change in a biological object triggered by the change in an- other biological object. It can happen cooperatively and com- petitively, and it can happen at the microscopic-level in DNA as correlated mutations or macroscopic-level where covarying traits between different species change. Hillis (1990) is the first person to publish an article on the usage of competitive coevolution as a metaheuristic. His pa- per has at the time this thesis is written, over 1100 citations. He used the competitive nature of the algorithm to create ab- normal/pathological cases to deal with, which helped tune an- other algorithm for generating minimal sorting networks so it would not get stuck at local optima when dealing with such ex- tremities. This setup has inspired others to do the same, when either tuning optimization algorithms or training/tuning game- playing strategies. When the species are in some sense symmetrical, i.e. repre- sentations of opposing players in some games, then it can occur that no strategy dominates all others, i.e. we have non-transitive players. One of the players can always find a strategy to counter what the other player is doing. This can create cyclic behaviour, and has been studied (Samothrakis et al., 2013). Although our problem is transitive in nature, cycling can still occur (De Jong, 2004) and this further emphasizes the importance of the poste- rior assessment of the behaviour in the coevolution. Evolutionary algorithms are special types of heuristics and metaheuristics. They are what is called generic-population based metaheuristics. They are formed from the analogy of evo- lution and the main aspect borrowed is the survival of the fittest. The methods developed from this framework do not need rich domain knowledge, but it can be incorporated. Some techni- cal terms have special synonyms in the theory of evolution- ary algorithms, which are summarized in the following table. EA term Regular term Fitness Objective/Quality Fitness landscape Objective Function Fitness Assessment Computation of objective function value Population Set of candidate solutions Individual Candidate solution Generation Population produced during a specific iteration Mutation Alter candidate solution indepen- dently Crossover Alter candidate solution with other so- lutions Chromosome Solutions parameter vector Gene Specific parameter in the Chromosome vector Allele Specific value of a gene Phenotype Classification after fitness assessment The section Algorithms and Examples summarizes how the algorithm is built using tools from the evolutionary algorithm framework and describes the problems used to test the algo- rithm. A LGORITHM AND E XAMPLES The species in the Coevolution consist of predators and prey. The predators are represented by numerical vectors which are pa- rameters for optimization algorithms and the prey are repre- sented by numerical vectors which represent starting values for the optimization. In general the prey can be any parameter vec- tor which defines a family of function for an optmization algo- rithm to call. There is no static objective function, since in each iteration of the algorithm the score for an individual is dependent on the individuals in the competing species. Therefore relative fitness assessment (RFA) is used. The RFA is the mean of the score for a certain induvidual against some of the individuals in the com- peting species and the score is a weighted sum of the number of iterations for each optimization algorithm and the final value from the optimization. This score/fitness then represents how well an individual performs relative to the other individuals in the species. The fitness is calculated after a generation of new individ- uals with genetic operators. The best individuals are kept for the following generations, they must be significantly better than the individual they were generated from such that they are chosen. The plot below vaguely describes how the genetic operators work in a plane for a given individual. The new potential candidates for the next generation are generated with the genetic operators from differential evolution (Storn et al., 1995). It is a combination of mutation and crossover operators. It is very tedious to perform the RFA such that all the indi- viduals interact. Therfore a sampling strategy was developed to decrease the number of fitness evaluations. In the thesis it is shown empirically that the Examples, the examples used were made to portray the algo- rithm’s capabilities to find hard starting values for the preda- tors. One of the examples is the following function. f MultMod (x, y )= x 2 + y 2 100 - 10e -(x-3) 2 -(y -3) 2 - 9e -(x+3) 2 -(y -3) 2 - 3e -(x+3) 2 -(y +3) 2 - 8e -(x-3) 2 -(y +3) 2 This function is shown in the following plot with signs changed. The predator was considered as parameters for simulated an- nealing and BFGS. This example clearly portrayed the coevolu- tionary algorithms capabilities to find the hardest starting val- ues, around the highest minimum. Where multiple optima are not present it learns to favor the best optimization algorithm for the job. This poster only gives a brief overlook of the project. It also included tests on the gadget program, but there one relative fitness assessment can take from a few seconds to several min- utes on small models. The scalability of the algorithm made parallelizing it easy. All the code was written in R and gadget is written in C++. I want to thank the Marine Research Institute of Iceland for funding me while working on my thesis. R ESULTS The following plot shows the movement of the prey individuals towards the worst starting positions. -5.0 -2.5 0.0 2.5 5.0 -5.0 -2.5 0.0 2.5 5.0 X Y -7.5 -5.0 -2.5 0.0 z Generation 1 -5.0 -2.5 0.0 2.5 5.0 -5.0 -2.5 0.0 2.5 5.0 X Y -7.5 -5.0 -2.5 0.0 z Generation 5 -5.0 -2.5 0.0 2.5 5.0 -5.0 -2.5 0.0 2.5 5.0 X Y -7.5 -5.0 -2.5 0.0 z Generation 10 -5.0 -2.5 0.0 2.5 5.0 -5.0 -2.5 0.0 2.5 5.0 X Y -7.5 -5.0 -2.5 0.0 z Generation 15 -5.0 -2.5 0.0 2.5 5.0 -5.0 -2.5 0.0 2.5 5.0 X Y -7.5 -5.0 -2.5 0.0 z Generation 20 -5.0 -2.5 0.0 2.5 5.0 -5.0 -2.5 0.0 2.5 5.0 X Y -7.5 -5.0 -2.5 0.0 z Generation 200 R EFERENCES Glover, Fred. Future paths for integer programming and links to artificial intelligence. Computers Operations Research 13, no. 5 (1986): 533-549. Samothrakis, Spyridon, Simon Lucas, Thomas Philip Runarsson, and David Robles. Coevolving game-playing agents: Measuring performance and intransitivities. Evolutionary Computation, IEEE Transactions on 17, no. 2 (2013): 213-226. Hillis, W. Daniel. Co-evolving parasites improve simulated evolution as an optimization procedure. Physica D: Nonlinear Phenomena 42, no. 1 (1990): 228-234. Einarsson, Guðmundur. Competitive Coevolution in Problem Design and Metaheuristical Parameter Tuning. M.Sc. Thesis. University of Iceland, Iceland (2014). Storn, Rainer, and Kenneth Price. Differential evolution-a simple and efficient adaptive scheme for global optimization over continuous spaces. Berkeley: ICSI, 1995.

Transcript of Generation 1 Generation 5 Generation 10deep-learning.compute.dtu.dk/wp-content/uploads/... ·...

Page 1: Generation 1 Generation 5 Generation 10deep-learning.compute.dtu.dk/wp-content/uploads/... · COMPETITIVE COEVOLUTION IN PROBLEM DESIGN & METAHEURISTICAL PARAMETER TUNING M.SC.THESIS

COMPETITIVE COEVOLUTION IN PROBLEM DESIGN &METAHEURISTICAL PARAMETER TUNINGM.SC. THESISGUÐMUNDUR EINARSSON, THOMAS PHILIP RUNARSSON & GUNNAR STEFÁNSSONSchool of Engineering and Natural Sciences, University of Iceland, Iceland.

ABSTRACTThe Marine Research Institute of Iceland has over the last 20years developed and used the program gadget for modelingof the marine ecosystem around Iceland. The estimation of pa-rameters in this program requires constrained optimization ina continuous domain. In this thesis a coevolutionary algorithmapproach is developed to tune the optimization parameters ingadget. The objective of the coevolutionary algorithm is tofind optimization parameters that both make the optimizationmethods in gadget more robust against poorly chosen startingvalues and tries to reduce the computation time while maintain-ing convergence. This is important when bootstrapping is per-formed on the models in gadget to reduce computation time.This may also ease the tuning of optimization parameters fornew users and may reveal other local optima in the likelihood,which may give hint of model misspecification. The algorithmis tested on functions that have similar characteristics as the log-likelihood functions in gadget and some results shown for thecase of modeling haddock.

HEURISTICS AND METAHEURISTICSHeuristics in optimization are methods that find approximatesolutions to optimization problems. The purpose is to providegood seeding values for exact algorithms and sometime theyare the only viable option when the search space is large, mul-timodality is present in the objective function or the objectivefunction is illconditioned. Other problems also come up whereheuristics are used e.g. combinatorial optimization.

Heuristics trade various aspects of exact algorithms forspeed, e.g. optimality and precision. Another thing aboutheuristics is that they are often based on a rule of thumb or ex-amples from nature, which often makes it hard to proof any-thing for their performance. These methods are sometimes adhoc and don’t generalize well to other problems.

Metaheuristics can be summed up in the following defini-tion by Glover (1986)

A metaheuristic is a high-level problem-independent algorithmic framework that providesa set of guidelines or strategies to develop heuristicoptimization algorithms. The term is also usedto refer to a problem specific implementation of aheuristic optimization algorithm according to theguidelines expressed in such a framework.

Note that a metaheuristic can refer to a framework, such as evo-lutionary algorithms and a specific algorithm, such as differentialevolution.

One of the reasons these algorithms have become popularlately is because they usually scale very well to larger computerarchitectures, they are easily parallelizable. The reason for thisis that many of these algorithms are based on agents or individ-uals that cover the search space. These can be independant orhave some interactions so the computation for each agent canbe calculated on an individual CPU core.

Although metaheuristics may not be directly applicable todeep learning, it has still potential to be used for feature selec-tion and other aspects of machine learning.

COMPETITIVE COEVOLUTIONCoevolution is a term from evolutionary biology. It symbolisesthe change in a biological object triggered by the change in an-other biological object. It can happen cooperatively and com-petitively, and it can happen at the microscopic-level in DNAas correlated mutations or macroscopic-level where covaryingtraits between different species change.

Hillis (1990) is the first person to publish an article on theusage of competitive coevolution as a metaheuristic. His pa-per has at the time this thesis is written, over 1100 citations.He used the competitive nature of the algorithm to create ab-normal/pathological cases to deal with, which helped tune an-other algorithm for generating minimal sorting networks so itwould not get stuck at local optima when dealing with such ex-tremities. This setup has inspired others to do the same, wheneither tuning optimization algorithms or training/tuning game-playing strategies.

When the species are in some sense symmetrical, i.e. repre-sentations of opposing players in some games, then it can occurthat no strategy dominates all others, i.e. we have non-transitiveplayers. One of the players can always find a strategy to counterwhat the other player is doing. This can create cyclic behaviour,and has been studied (Samothrakis et al., 2013). Although ourproblem is transitive in nature, cycling can still occur (De Jong,2004) and this further emphasizes the importance of the poste-rior assessment of the behaviour in the coevolution.

Evolutionary algorithms are special types of heuristics andmetaheuristics. They are what is called generic-populationbased metaheuristics. They are formed from the analogy of evo-lution and the main aspect borrowed is the survival of the fittest.The methods developed from this framework do not need richdomain knowledge, but it can be incorporated. Some techni-cal terms have special synonyms in the theory of evolution-ary algorithms, which are summarized in the following table.

EA term Regular termFitness Objective/QualityFitness landscape Objective FunctionFitness Assessment Computation of objective function

valuePopulation Set of candidate solutionsIndividual Candidate solutionGeneration Population produced during a specific

iterationMutation Alter candidate solution indepen-

dentlyCrossover Alter candidate solution with other so-

lutionsChromosome Solutions parameter vectorGene Specific parameter in the Chromosome

vectorAllele Specific value of a genePhenotype Classification after fitness assessment

The section Algorithms and Examples summarizes how thealgorithm is built using tools from the evolutionary algorithmframework and describes the problems used to test the algo-rithm.

ALGORITHM AND EXAMPLESThe species in the Coevolution consist of predators and prey. Thepredators are represented by numerical vectors which are pa-rameters for optimization algorithms and the prey are repre-sented by numerical vectors which represent starting values forthe optimization. In general the prey can be any parameter vec-tor which defines a family of function for an optmization algo-rithm to call.

There is no static objective function, since in each iterationof the algorithm the score for an individual is dependent on theindividuals in the competing species. Therefore relative fitnessassessment (RFA) is used. The RFA is the mean of the score for acertain induvidual against some of the individuals in the com-peting species and the score is a weighted sum of the numberof iterations for each optimization algorithm and the final valuefrom the optimization. This score/fitness then represents howwell an individual performs relative to the other individuals inthe species.

The fitness is calculated after a generation of new individ-uals with genetic operators. The best individuals are kept forthe following generations, they must be significantly betterthan the individual they were generated from such that theyare chosen. The plot below vaguely describes how the geneticoperators work in a plane for a given individual.

The new potential candidates for the next generation aregenerated with the genetic operators from differential evolution(Storn et al., 1995). It is a combination of mutation and crossoveroperators.

It is very tedious to perform the RFA such that all the indi-viduals interact. Therfore a sampling strategy was developedto decrease the number of fitness evaluations. In the thesis it isshown empirically that theExamples, the examples used were made to portray the algo-rithm’s capabilities to find hard starting values for the preda-tors. One of the examples is the following function.

fMultMod(x, y) =x2 + y2

100− 10e−(x−3)2−(y−3)2 − 9e−(x+3)2−(y−3)2

− 3e−(x+3)2−(y+3)2 − 8e−(x−3)2−(y+3)2

This function is shown in the following plot with signs changed.

The predator was considered as parameters for simulated an-nealing and BFGS. This example clearly portrayed the coevolu-tionary algorithms capabilities to find the hardest starting val-ues, around the highest minimum. Where multiple optima arenot present it learns to favor the best optimization algorithm forthe job.

This poster only gives a brief overlook of the project. It alsoincluded tests on the gadget program, but there one relativefitness assessment can take from a few seconds to several min-utes on small models. The scalability of the algorithm madeparallelizing it easy.

All the code was written in R and gadget is written in C++.I want to thank the Marine Research Institute of Iceland forfunding me while working on my thesis.

RESULTSThe following plot shows the movement of the prey individuals towards the worst starting positions.

−5.0

−2.5

0.0

2.5

5.0

−5.0 −2.5 0.0 2.5 5.0

X

Y

−7.5

−5.0

−2.5

0.0

z

Generation 1

−5.0

−2.5

0.0

2.5

5.0

−5.0 −2.5 0.0 2.5 5.0

X

Y

−7.5

−5.0

−2.5

0.0

z

Generation 5

−5.0

−2.5

0.0

2.5

5.0

−5.0 −2.5 0.0 2.5 5.0

X

Y

−7.5

−5.0

−2.5

0.0

z

Generation 10

−5.0

−2.5

0.0

2.5

5.0

−5.0 −2.5 0.0 2.5 5.0

X

Y

−7.5

−5.0

−2.5

0.0

z

Generation 15

−5.0

−2.5

0.0

2.5

5.0

−5.0 −2.5 0.0 2.5 5.0

X

Y

−7.5

−5.0

−2.5

0.0

z

Generation 20

−5.0

−2.5

0.0

2.5

5.0

−5.0 −2.5 0.0 2.5 5.0

X

Y

−7.5

−5.0

−2.5

0.0

z

Generation 200

REFERENCES• Glover, Fred. Future paths for integer programming and links to artificial intelligence. Computers Operations Research 13, no. 5 (1986): 533-549.

• Samothrakis, Spyridon, Simon Lucas, Thomas Philip Runarsson, and David Robles. Coevolving game-playing agents: Measuring performance and intransitivities.Evolutionary Computation, IEEE Transactions on 17, no. 2 (2013): 213-226.

• Hillis, W. Daniel. Co-evolving parasites improve simulated evolution as an optimization procedure. Physica D: Nonlinear Phenomena 42, no. 1 (1990): 228-234.

• Einarsson, Guðmundur. Competitive Coevolution in Problem Design and Metaheuristical Parameter Tuning. M.Sc. Thesis. University of Iceland, Iceland (2014).

• Storn, Rainer, and Kenneth Price. Differential evolution-a simple and efficient adaptive scheme for global optimization over continuous spaces. Berkeley: ICSI, 1995.