opt-2010-v2


Page 1: opt-2010-v2

Sept, 2010

®Copyright of Shun-Feng Su

1

Optimization: Non-Derivative Approaches

非微分型最佳化 

Offered by 蘇順豐 (Shun-Feng Su)

E-mail: [email protected]

Department of Electrical Engineering, National Taiwan University of Science and Technology

 

Page 2: opt-2010-v2


Preface

Optimization is central to many situations that involve making decisions or finding good solutions to research problems.

In this talk, I shall provide some fundamental concepts and ideas about optimization.

This talk will also introduce one group of optimization techniques, non-derivative optimization, which includes genetic algorithms, ant systems, and particle swarm optimization.

Page 3: opt-2010-v2


Outline

• Preface
• Fundamentals of Optimization
• Traditional Optimization
• Non-derivative Approaches
  – Genetic Algorithms
  – Particle Swarm Optimization
  – Ant Colony Optimization
• Epilogue

Page 5: opt-2010-v2


Fundamentals of Optimization

Optimization means finding the best one among all possible alternatives.

Showing that a result is optimal is therefore always a good way to demonstrate your research results.

But the catch is: what do you mean by “better”?

Why is the optimal one better than the others? In other words, on which criterion is the evaluation based?

Page 6: opt-2010-v2


Fundamentals of Optimization

The goodness of an alternative is measured by a so-called objective function or performance index.

Thus, whenever you see the word “optimal”, you should first check which objective function is used.

Optimization then means maximizing or minimizing the objective function considered.

Other terms used are cost function (to be minimized), fitness function (to be maximized), etc.

Page 8: opt-2010-v2


Fundamentals of Optimization

In an intelligent system, an optimization methodology is usually required for two reasons:

• A better selection of the applicable knowledge or strategies can result in better performance.

• In the learning process, an optimal way of defining the updating rule is required.

Using optimization as a learning mechanism is the most popular approach currently employed in the literature.

Page 9: opt-2010-v2


Fundamentals of Optimization

In general, an optimization problem requires finding a setting of the variable vector (or parameters) of the system such that an objective function is optimized. Sometimes, the variable vector also has to satisfy some constraints.

• The alternatives are choices among values → a numerical approach.

• This is why optimization is considered a part of computational intelligence.

Page 11: opt-2010-v2


Traditional Optimization

A traditional optimization problem can be expressed as

$\min_x$ (or $\max_x$) $f(x)$ subject to $x \in \Omega$,

where $f(\cdot)$ is the objective function to be optimized and $\Omega$ denotes the set of admissible $x$.

If a constraint such as $x \in \Omega$ is specified, the problem is referred to as a constrained optimization problem; otherwise it is called an unconstrained optimization problem.

Page 12: opt-2010-v2


Traditional approaches for unconstrained optimization

If the objective function can be expressed explicitly as a function of the parameters, traditional mathematical approaches can be employed to solve the optimization problem.

Traditional optimization approaches can be classified into two categories: direct approaches and incremental approaches.

Page 13: opt-2010-v2


Traditional approaches for unconstrained optimization

Direct approaches find the solution mathematically, that is, they look for a solution with certain properties.

In a direct approach, the idea is to directly find $x$ such that $df(x)/dx = 0$ or, for a vector $x$, $\nabla f(x) = 0$.

Approaches of this kind are Newton-type approaches.

Newton’s method is a way of solving an equation of the form $g(x) = 0$; in optimization, the equation to be solved is $\nabla f(x) = 0$. The approach used can also be iterative.
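As a minimal sketch of the direct (Newton-type) idea, the following code iterates on the stationarity condition f'(x) = 0 for a one-dimensional objective. The function name `newton_stationary_point`, the test objective f(x) = exp(x) - 2x, and the tolerance values are assumptions chosen for illustration; they do not come from the slides.

```python
import math


def newton_stationary_point(grad, hess, x0, tol=1e-12, max_iter=50):
    """Solve grad(x) = 0 with the Newton iteration x <- x - grad(x) / hess(x)."""
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)
        x -= step
        if abs(step) < tol:  # stop once the update is negligible
            break
    return x


# Hypothetical objective f(x) = exp(x) - 2x, so f'(x) = exp(x) - 2 and f''(x) = exp(x).
# Its minimizer is x = ln 2.
x_star = newton_stationary_point(lambda x: math.exp(x) - 2.0, math.exp, x0=0.0)
```

Note that both the first and the second derivative must be available in explicit form, which is exactly the requirement that the non-derivative approaches later in the talk avoid.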

Page 14: opt-2010-v2


Traditional approaches for unconstrained optimization

An incremental approach finds the direction in which the current situation can be improved, based on the current error (a backward approach).

Usually, an incremental approach updates the parameter vector as $x(k+1) = x(k) + \Delta x$.

In fact, such an approach is usually realized as a gradient approach; that is, $\Delta x = -\eta\,\partial f(x)/\partial x$ for minimization, with a step size $\eta > 0$.

This requires a relationship between the current error and the change of the variable considered; that is why $\Delta x = -\eta\,\partial f(x)/\partial x$ is employed.
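The incremental update can be sketched in a few lines. This is only an illustration under stated assumptions: the update is taken to be x(k+1) = x(k) + Δx with Δx equal to minus a step size `eta` times the derivative (the minimization case), and the objective f(x) = (x - 3)^2, the step size, and the number of steps are hypothetical choices.

```python
def gradient_descent(grad, x0, eta=0.1, steps=200):
    """Repeat x(k+1) = x(k) + delta_x with delta_x = -eta * grad(x(k))."""
    x = x0
    for _ in range(steps):
        delta_x = -eta * grad(x)  # move against the gradient to reduce f
        x = x + delta_x
    return x


# Hypothetical objective f(x) = (x - 3)^2 with derivative 2 (x - 3); minimizer x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

With these values each step shrinks the distance to the minimizer by a factor of 0.8, so the iterate approaches x = 3 geometrically.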

Page 15: opt-2010-v2


Traditional approaches for constrained optimization

In general, the constraint can be written as h(x)=0.

When the constraint is not expressed as such an equality (e.g., $h(x) \le 0$), either the constraint is inactive (it has no effect at the minimizer) or the minimizer is located on the boundary (i.e., $h(x) = 0$).

Page 16: opt-2010-v2


Traditional approaches for constrained optimization

In general, the constraint can be written as h(x)=0.

A commonly used approach is the Lagrange theorem, which is to find $x$ and $\lambda$ such that

$\nabla f(x) + \lambda \nabla h(x) = 0$,

where $\lambda$ is called the Lagrange multiplier.

Then, traditional unconstrained optimization approaches can be employed.
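A small worked example may make the Lagrange condition concrete. The problem below (minimize the squared distance to the origin on a line) is a hypothetical illustration, not taken from the slides; the condition on the gradients is solved together with the constraint h(x) = 0.

```latex
\begin{aligned}
&\min_{x_1, x_2}\ f(x) = x_1^2 + x_2^2
  \quad \text{subject to} \quad h(x) = x_1 + x_2 - 1 = 0, \\
&\nabla f(x) + \lambda \nabla h(x)
  = \begin{pmatrix} 2x_1 + \lambda \\ 2x_2 + \lambda \end{pmatrix} = 0
  \ \Rightarrow\ x_1 = x_2 = -\tfrac{\lambda}{2}, \\
&h(x) = -\lambda - 1 = 0
  \ \Rightarrow\ \lambda = -1, \quad x_1 = x_2 = \tfrac{1}{2}, \quad f(x) = \tfrac{1}{2}.
\end{aligned}
```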

Page 17: opt-2010-v2


Traditional Optimization

Traditional optimization approaches develop a formal model (objective function and constraints) that resembles the original problem and then solve it by means of traditional mathematical methods.

In other words, in order to find $\nabla f(x) = 0$ or $\partial f(x)/\partial x$, the objective function $f(\cdot)$ must be expressed explicitly as a function of the parameter vector $x$.

Page 18: opt-2010-v2


Traditional Optimization

Even though traditional optimization techniques have some problems (such as being trapped in local optima), they have been shown to be very successful in many applications.

However, other drawbacks are also found when traditional optimization is applied.

Page 19: opt-2010-v2


Traditional Optimization

In real-world problems, the objective function and/or the constraints imposed on the variables may not be analytically tractable, or may not even be expressible in closed form.

Thus, either the problem cannot be represented in a form that can be differentiated, or the original problem formulation has to be simplified.

Page 20: opt-2010-v2


Traditional Optimization

When the problem cannot be represented in closed form, backward approaches cannot be implemented.

When simplification is conducted, more often than not the solutions found solve the simplified problem rather than the original one.

Thus, approaches that use only forward evaluation are needed. With such approaches, it is impossible to modify candidates based on the current output.

Page 21: opt-2010-v2


Traditional Optimization

If only forward evaluation is used, the approach is to find the objective function value for the current candidate and then try another candidate.

It is more like a search algorithm.

The issue may be how to define the next candidates.

Randomly select or select with some guidance?
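The simplest answer to the question above is to pick every next candidate at random. The sketch below does exactly that, using nothing but forward evaluations of the objective. The function name `random_search`, the interval, the number of trials, and the objective (x - 1)^2 are assumptions made for illustration only.

```python
import random


def random_search(objective, lower, upper, trials=2000, seed=0):
    """Minimize using forward evaluations only: try a candidate, keep the best so far."""
    rng = random.Random(seed)
    best_x = rng.uniform(lower, upper)
    best_f = objective(best_x)
    for _ in range(trials):
        x = rng.uniform(lower, upper)  # next candidate chosen without any guidance
        fx = objective(x)              # forward evaluation; no derivative is needed
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


# Hypothetical objective with its minimum at x = 1.
best_x, best_f = random_search(lambda x: (x - 1.0) ** 2, -5.0, 5.0)
```

The non-derivative approaches that follow replace the unguided choice of the next candidate with a choice guided by what the earlier evaluations revealed.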

Page 23: opt-2010-v2


Non-Derivative Optimization

An important property of such algorithms is that auxiliary forms of the objective function, such as derivatives, are not required in the process → non-derivative optimization.

Non-derivative optimization does not define a relationship between the current situation (error) and the variable considered. Thus, another way of finding the optimal solution needs to be employed.

Page 24: opt-2010-v2


Non-Derivative Optimization

A search algorithm finds the solution by trying possible candidates.

Since it is impossible to try all possibilities, how to define the next candidate is usually the key issue in search algorithms.

Non-derivative optimization approaches are also called evolutionary computation, nature-inspired algorithms, or meta-heuristic algorithms.

Page 29: opt-2010-v2


Non-Derivative Optimization

Non-derivative optimization approaches mimic various natural phenomena, such as the natural selection process or animal behaviors, so as to find the best candidate for the problem.

These search processes find the next candidates by using the experience obtained from the previous search together with some random search mechanisms.

That is, the next candidates are defined with some guidance and some randomness.

Page 30: opt-2010-v2


Non-Derivative Optimization

1. It works with a coding of the solution set, not the solutions themselves. → need to code solutions

2. It searches from a population of solutions, not a single solution. → parallel search

3. It uses payoff information (fitness function), not derivatives or other auxiliary knowledge. → non-derivative optimization

4. It uses probabilistic transition rules, not deterministic rules. → random search with guidance (stochastic search)

Page 31: opt-2010-v2


Genetic Algorithms

Genetic Algorithms (GAs) simulate the natural evolutionary process in searching for the best solution based on the mechanism of natural selection and natural genetic operation.

John Holland of the University of Michigan began his work on GAs in the early 1960s.

A first achievement was the publication of Adaptation in Natural and Artificial Systems in 1975.

Page 32: opt-2010-v2


Genetic Algorithms

GA encodes solutions to the problem in a structure that can be stored in the computer.

This object is a genome (or chromosome). GA creates a population of genomes then applies genetic operators (crossover and mutation) to the candidates in the population to generate new candidates.

It uses various selection criteria so that it picks the best candidates for mating (and subsequent crossover). The objective function determines how 'good' each individual is.

Page 33: opt-2010-v2


Genetic Algorithms

To represent solutions in terms of genes -- Representation of Candidate Solutions (CS):

– Binary encoding

– Real number encoding

– Integer or literal permutation encoding

– General data structure encoding: array, tree, matrix, etc.

Page 34: opt-2010-v2


Genetic Algorithms

A Genetic Algorithm (GA) emulates biological evolution to solve a complex problem.

GAs rely heavily on randomness. Instead of trying to solve the problem directly, they create random solutions and randomly mix them up until a good solution is found.

Page 35: opt-2010-v2


Genetic Algorithms

The evolution starts from a population of completely random candidates and searches for the best generation by generation.

In each generation, multiple candidates are stochastically selected from the current population and modified (mutated or recombined) to form a new population, which is used in the next generation (iteration).

Page 36: opt-2010-v2


Genetic Algorithms

Use an encoding of the parameters, not the parameters themselves.

Work on a population of points, not a single one.

Use only the values of the function to be optimized, not its derivatives or other auxiliary knowledge.

Use probabilistic transition functions, not deterministic ones.

Page 37: opt-2010-v2


Flow chart of a simple genetic algorithm

1. Initialize population P(t)

2. Evaluate P(t)

3. Stop criterion satisfied? If yes, stop.

4. Apply reproduction and crossover on P(t) to yield C(t)

5. Apply mutation on C(t) to yield D(t), and then evaluate D(t)

6. Select P(t+1) from P(t) and D(t) based on the fitness, then return to step 3

Page 38: opt-2010-v2


Genetic Algorithms

GA uses three basic operators to manipulate the genetic composition (chromosomes) of a population:

Reproduction is a process of selecting parents for generating offspring. The most highly rated chromosomes in the current generation are the most likely to be copied into the new generation.

Crossover provides a mechanism for chromosomes to mix and match attributes through random processes.

Mutation changes attributes (genes) in the new generation to bring in new possibilities. Mutation is a very important mechanism for avoiding local minima in the optimization search.

Page 39: opt-2010-v2


Reproduction

Reproduction can be divided into two kinds of processes: selecting parents from the population, and determining who will survive into the next generation.

Both processes need to select among all candidates. The selection methodology can be considered in terms of the following foundations:

– Sampling space

– Sampling mechanism

– Selection probability

Page 40: opt-2010-v2


Reproduction

To select parents from the population, GA researchers have used a number of parent selection methods. Some of the more popular methods are:

– Proportionate Selection

– Linear Rank Selection

– Tournament Selection

How many parents will be selected is also an issue in designing a GA.

Page 41: opt-2010-v2


Proportionate Selection

In Proportionate Selection, candidates are assigned a probability of being selected based on their fitness:

$p_i = f_i / \sum_j f_j$,

where $p_i$ is the probability that candidate $i$ will be selected and $f_i$ is the fitness of candidate $i$.

This type of selection is also referred to as roulette wheel selection.

This formula assumes a fitness maximization problem. If a minimization problem is considered, some modifications are needed.
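Roulette wheel selection can be sketched as spinning a pointer over a wheel whose slices are proportional to the fitness values. The code below is an illustration under the assumption that all fitness values are non-negative with a positive sum (a maximization problem); the function names and the sample fitness values are hypothetical.

```python
import random


def selection_probabilities(fitness):
    """Return p_i = f_i / sum_j f_j for every candidate."""
    total = sum(fitness)
    return [f / total for f in fitness]


def roulette_wheel_select(fitness, rng):
    """Return the index of one candidate, drawn with probability proportional to fitness."""
    spin = rng.uniform(0.0, sum(fitness))
    cumulative = 0.0
    for i, f in enumerate(fitness):
        cumulative += f
        if spin <= cumulative:
            return i
    return len(fitness) - 1  # guard against floating-point round-off


fitness = [1.0, 3.0, 6.0]
probs = selection_probabilities(fitness)
rng = random.Random(42)
picks = [roulette_wheel_select(fitness, rng) for _ in range(10000)]
```

Over many draws the fittest candidate (index 2) is picked most often, yet the weakest one still has a chance, which is the stochastic flavor the slides emphasize.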

Page 42: opt-2010-v2


Proportionate Selection

There are a number of disadvantages associated with using proportionate selection:

– Cannot be used directly on minimization problems

– Loss of selection pressure (search direction) as the population converges

– Susceptible to super individuals

– Scaling issue for fitness values

Local optima issue

Page 43: opt-2010-v2


Linear Rank Selection

In Linear Rank Selection, candidates are assigned a subjective fitness based on their rank within the population:

$sf_i = (P - r_i)(\mathit{max} - \mathit{min})/(P - 1) + \mathit{min}$,

where $r_i$ is the rank of individual $i$ (rank 1 is the best),

$P$ is the population size,

$\mathit{max}$ is the fitness assigned to the best candidate,

$\mathit{min}$ is the fitness assigned to the worst candidate.

Page 44: opt-2010-v2


Linear Rank Selection

Roulette wheel selection can then be performed using the subjective fitness values: $p_i = sf_i / \sum_j sf_j$.

One disadvantage associated with linear rank selection is that the population must be sorted on each cycle.
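The rank-based assignment can be sketched as follows. The code assumes a maximization problem, that rank 1 denotes the best candidate, and that the population has at least two members; the values max = 1.5 and min = 0.5 and the raw fitness values are hypothetical choices for illustration.

```python
def linear_rank_fitness(raw_fitness, max_sf=1.5, min_sf=0.5):
    """Assign sf_i = (P - r_i) * (max - min) / (P - 1) + min, where rank 1 is the best."""
    P = len(raw_fitness)
    # Sorting is needed on every cycle, which is the cost mentioned above.
    order = sorted(range(P), key=lambda i: raw_fitness[i], reverse=True)
    sf = [0.0] * P
    for rank, i in enumerate(order, start=1):
        sf[i] = (P - rank) * (max_sf - min_sf) / (P - 1) + min_sf
    return sf


# One candidate (500.0) dominates the raw fitness, but not the subjective fitness.
sf = linear_rank_fitness([10.0, 500.0, 20.0, 15.0])
probs = [s / sum(sf) for s in sf]
```

With raw fitness the dominant candidate would be chosen more than 90% of the time under proportionate selection; with the rank-based values its probability is 0.375, which illustrates how ranking limits the influence of a super individual.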

Page 45: opt-2010-v2


Tournament Selection

In Tournament Selection, q candidates are randomly selected from the population and the best of the q candidates is returned as a parent.

Selection pressure increases as q is increased and decreases as q is decreased.
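Tournament selection needs only a few lines. In this sketch the q contestants are drawn without replacement, which is an assumption (the slides only say they are randomly selected), and larger fitness is taken to be better; the fitness values are hypothetical.

```python
import random


def tournament_select(fitness, q, rng):
    """Draw q distinct candidates at random and return the index of the fittest one."""
    contestants = rng.sample(range(len(fitness)), q)
    return max(contestants, key=lambda i: fitness[i])


fitness = [4.0, 9.0, 1.0, 7.0, 3.0]
rng = random.Random(1)
winner = tournament_select(fitness, q=3, rng=rng)
# With q equal to the population size, the best candidate always wins.
always_best = tournament_select(fitness, q=len(fitness), rng=rng)
```

The two calls show the effect of q on selection pressure: with q = 3 out of 5 the winner is at least the third-best candidate, while with q = 5 the selection becomes fully deterministic.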

Page 46: opt-2010-v2


Selecting Who Survives

An Example Genetic Algorithm

Procedure GA {
    t = 0;
    Initialize P(t);
    Evaluate P(t);
    While (Not Done) {
        Parents(t) = Select_Parents(P(t));                  // parent selection
        Offspring(t) = Procreate(Parents(t));               // genetic operations: crossover and mutation
        Evaluate(Offspring(t));
        P(t+1) = Select_Survivors(P(t), Offspring(t));      // select who survives
        t = t + 1;
    }
}
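The procedure above can be turned into a small runnable sketch. Everything specific here is an assumption made for illustration: the objective is OneMax (count the 1 bits), parents are chosen by binary tournament, procreation is single-point crossover followed by bit-flip mutation, and survivors are the best individuals among parents and offspring.

```python
import random


def run_ga(n_bits=20, pop_size=30, generations=60, p_mut=0.02, seed=3):
    rng = random.Random(seed)
    fitness = sum  # OneMax: the number of 1 bits (a stand-in objective)

    def select_parent(pop):
        a, b = rng.sample(pop, 2)  # binary tournament
        return a if fitness(a) >= fitness(b) else b

    def procreate(p1, p2):
        cut = rng.randint(1, n_bits - 1)  # single-point crossover
        child = p1[:cut] + p2[cut:]
        # Bit-flip mutation applied independently to each gene.
        return [1 - g if rng.random() < p_mut else g for g in child]

    # Initialize P(t) with random bit strings.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        offspring = [procreate(select_parent(pop), select_parent(pop))
                     for _ in range(pop_size)]
        # Survivor selection: keep the best pop_size of parents plus offspring.
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:pop_size]
        history.append(fitness(pop[0]))
    return pop[0], history


best, history = run_ga()
```

Because survivors are taken from parents and offspring together, the best fitness recorded in `history` can never decrease from one generation to the next.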

Page 47: opt-2010-v2


Selecting Who Survives

Selection by itself picks the best: Darwinian survival of the fittest. Give more copies to better individuals.

Ways to do it:

– truncation

– roulette wheel

– tournament

Page 48: opt-2010-v2


Selecting Who Survives

Basically, there are two types of selection in GAs:

Let $\mu$ = # of parents and $\lambda$ = # of offspring.

($\mu$ + $\lambda$) selection: select the $\mu$ best out of the offspring and the old parents as parents of the next generation.

($\mu$, $\lambda$) selection: select the $\mu$ best offspring as parents of the next generation ($\mu < \lambda$).
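The two survivor selection schemes differ only in the pool they choose from, as the following sketch shows. Here `mu` is the number of parents to keep, the length of `offspring` plays the role of the number of offspring, and candidates are represented by plain numbers whose value is their fitness; these are assumptions made for illustration.

```python
def plus_selection(parents, offspring, mu, fitness):
    """Plus selection: keep the mu best among old parents and offspring together."""
    return sorted(parents + offspring, key=fitness, reverse=True)[:mu]


def comma_selection(offspring, mu, fitness):
    """Comma selection: keep the mu best offspring only; old parents are discarded."""
    if mu >= len(offspring):
        raise ValueError("comma selection needs fewer parents than offspring")
    return sorted(offspring, key=fitness, reverse=True)[:mu]


parents = [9, 2]
offspring = [5, 7, 1, 3]
survivors_plus = plus_selection(parents, offspring, mu=2, fitness=lambda c: c)
survivors_comma = comma_selection(offspring, mu=2, fitness=lambda c: c)
```

In this example plus selection keeps the strong old parent 9, whereas comma selection loses it and continues with 7 and 5, so only plus selection guarantees that the best solution found so far survives.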

Page 49: opt-2010-v2


Genetic Operators

Genetic Algorithms typically use two types of operators: Crossover and Mutation.

Crossover is usually the primary operator for inheriting properties from parents with mutation serving only as a mechanism to introduce diversity in the population.

However, when designing a GA, it is possible to develop unique crossover and mutation operators that take advantage of the structure of the problem.

Page 50: opt-2010-v2


Crossover Operator

There are a number of crossover operators that have been used on binary and real-coded GAs:

Single-point Crossover,

Two-point Crossover,

Uniform Crossover

How many offspring will be generated is also an issue in designing GA.

Page 51: opt-2010-v2


Crossover Operator

Given two parents, single-point crossover generates a cut-point and recombines the first part of the first parent with the second part of the second parent to create one offspring; the complementary parts create a second offspring.

Example:

Parent 1: X X | X X X X X

Parent 2: Y Y | Y Y Y Y Y

Offspring 1: X X Y Y Y Y Y

Offspring 2: Y Y X X X X X
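The example above can be reproduced with list slicing. In this sketch the cut-point is passed in explicitly so that the result matches the example; in a GA it would normally be drawn at random. The function name is hypothetical.

```python
def single_point_crossover(parent1, parent2, cut):
    """Swap the tails of two equal-length chromosomes after position cut."""
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2


c1, c2 = single_point_crossover(list("XXXXXXX"), list("YYYYYYY"), cut=2)
```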

Page 52: opt-2010-v2


Two-Point Crossover

Two-Point crossover is very similar to single-point crossover except that two cut-points are generated instead of one.

Example:

Parent 1: X X | X X X | X X

Parent 2: Y Y | Y Y Y | Y Y

Offspring 1: X X Y Y Y X X

Offspring 2: Y Y X X X Y Y
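The two-point example can be sketched the same way: only the segment between the two cut-points is exchanged. The cut-points are given explicitly here to match the example, and the function name is hypothetical.

```python
def two_point_crossover(parent1, parent2, cut1, cut2):
    """Exchange the middle segment [cut1, cut2) between two equal-length chromosomes."""
    child1 = parent1[:cut1] + parent2[cut1:cut2] + parent1[cut2:]
    child2 = parent2[:cut1] + parent1[cut1:cut2] + parent2[cut2:]
    return child1, child2


o1, o2 = two_point_crossover(list("XXXXXXX"), list("YYYYYYY"), cut1=2, cut2=5)
```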

Page 53: opt-2010-v2


Uniform Crossover

In Uniform Crossover, for each gene, with probability 0.5 the value of the first parent’s gene is assigned to the first offspring and the value of the second parent’s gene is assigned to the second offspring.

With probability 0.5 the value of the first parent’s gene is assigned to the second offspring and the value of the second parent’s gene is assigned to the first offspring.

Page 54: opt-2010-v2


Uniform Crossover

Example:

Parent 1: X X X X X X X

Parent 2: Y Y Y Y Y Y Y

Offspring 1: X Y X Y Y X Y

Offspring 2: Y X Y X X Y X
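Uniform crossover makes an independent coin flip for every gene. The sketch below is one possible reading of the description; the function name and the seed are hypothetical, and the particular pattern of X and Y it produces depends on the random numbers drawn.

```python
import random


def uniform_crossover(parent1, parent2, rng):
    """For each gene, flip a fair coin to decide which offspring receives which parent's value."""
    child1, child2 = [], []
    for g1, g2 in zip(parent1, parent2):
        if rng.random() < 0.5:
            child1.append(g1)
            child2.append(g2)
        else:
            child1.append(g2)
            child2.append(g1)
    return child1, child2


u1, u2 = uniform_crossover(list("XXXXXXX"), list("YYYYYYY"), random.Random(7))
```

Whatever the coin flips are, the two offspring are complementary: at every position one of them carries X and the other carries Y, as in the example above.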

Page 55: opt-2010-v2


Real-Coded Crossover Operators

For Real-Coded representations there exist a number of other crossover operators:

Mid-Point Crossover,

Flat Crossover (BLX-0.0),

BLX-0.5

Page 56: opt-2010-v2


Mid-Point Crossover

Given two parents where X and Y represent a floating point number:

Parent 1: X

Parent 2: Y

Offspring: (X+Y)/2

If a chromosome contains more than one gene, then this operator can be applied to each gene with a probability of Pmp.

Page 57: opt-2010-v2


Flat Crossover (BLX-0.0)

Flat crossover was developed by Radcliffe (1991)

Given two parents where X and Y represent a floating point number:

Parent 1: X

Parent 2: Y

Offspring: rnd(X,Y)

Of course, if a chromosome contains more than one gene then this operator can be applied to each gene with a probability of Pblx-0.0.

Page 58: opt-2010-v2


BLX-$\alpha$

Developed by Eshelman & Schaffer (1992)

Given two parents where X and Y represent a floating point number, and where X < Y:

Parent 1: X

Parent 2: Y

Let $\Delta = \alpha (Y - X)$, where $\alpha = 0.5$

Offspring: rnd($X - \Delta$, $Y + \Delta$)

Of course, if a chromosome contains more than one gene then this operator can be applied to each gene with a probability of Pblx-$\alpha$.
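Blend crossover can be sketched for a single real-valued gene as follows. The sketch assumes that rnd(a, b) means a uniformly distributed random number between a and b, and that the interval between the parents is widened on each side by alpha times its length; setting alpha to 0 gives flat crossover (BLX-0.0). The function name and the sample values are hypothetical.

```python
import random


def blx_alpha(x, y, alpha, rng):
    """Sample an offspring gene uniformly from the parents' interval widened by alpha."""
    low, high = min(x, y), max(x, y)
    spread = alpha * (high - low)  # extension added on each side of the interval
    return rng.uniform(low - spread, high + spread)


rng = random.Random(0)
flat_children = [blx_alpha(1.0, 3.0, 0.0, rng) for _ in range(1000)]
blx_children = [blx_alpha(1.0, 3.0, 0.5, rng) for _ in range(1000)]
```

With parents 1.0 and 3.0, flat crossover only produces values between the parents, whereas an alpha of 0.5 also allows values between 0.0 and 4.0, which lets the search leave the region spanned by the current population.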

Page 59: opt-2010-v2


Mutation (Binary-Coded)

In Binary-Coded GAs, each bit in the chromosome is mutated with probability pbm known as the mutation rate.

Parent1: 1 0 0 0 0 1 0
Parent2: 1 1 1 0 0 0 1
Child1:  1 0 0 1 0 0 1
Child2:  0 1 1 0 1 1 0

An Example of Single-point Crossover Between the Third and Fourth Genes with a Mutation Rate of 0.01 Applied to Binary Coded Chromosomes
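Bit-flip mutation can be sketched as an independent test for every bit. The function name, the seed, and the sample chromosome are hypothetical; the rates 0.0 and 1.0 are included only to show the two extreme cases.

```python
import random


def bit_flip_mutation(chromosome, p_bm, rng):
    """Flip each bit independently with probability p_bm (the mutation rate)."""
    return [1 - bit if rng.random() < p_bm else bit for bit in chromosome]


rng = random.Random(5)
child = [1, 0, 0, 0, 0, 0, 1]
unchanged = bit_flip_mutation(child, 0.0, rng)  # rate 0: no bit is flipped
flipped = bit_flip_mutation(child, 1.0, rng)    # rate 1: every bit is flipped
mutated = bit_flip_mutation(child, 0.01, rng)   # typical small rate
```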

Page 60: opt-2010-v2


Mutation (Real-Coded)

In real-coded GAs, Gaussian mutation can be used.

For example, BLX-0.0 Crossover with Gaussian mutation.

Given two parents where X and Y represent a floating point number:

Parent 1: X

Parent 2: Y

Offspring: rnd(X,Y) + N(0,1)

Page 61: opt-2010-v2


Advanced GA techniques

• Elitism – Carry over some portion of the best solutions to the next generation.

• Variable operators – Create multiple types of crossovers and mutations. Track the health of the offspring they produce, and adjust their usage accordingly.

• Tribes – Create separate populations that only occasionally mix. This may help avoid converging on local maxima.

Page 62: opt-2010-v2


Advanced GA techniques

General Structure of Hybrid Genetic Algorithms

Begin
    t ← 0;
    initialize P(t);
    evaluate P(t);
    while (not termination condition) do
        recombine P(t) to yield C(t);
        locally climb C(t);
        evaluate C(t);
        select P(t+1) from P(t) and C(t);
        t ← t + 1;
    end
end

Local search mechanism: the "locally climb" step provides a greedy improvement of the candidates.

There are various approaches and variants for this local search mechanism.

Page 63: opt-2010-v2


Genetic Algorithms

Genetic operations play a role of generating the new chromosomes for evolution. Hopefully, the best-fitted solution can be generated.

In the algorithm, randomness plays essential roles in all operations.

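One attractive property of GA is that the performance of the best solution found never gets worse from generation to generation, provided that the best candidates are kept (e.g., when P(t+1) is selected from both P(t) and the offspring).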

Page 64: opt-2010-v2


Genetic Algorithms

In fact, GA should be understood as a general adaptable concept for problem solving rather than a collection of related and ready-to-use algorithms.

However, due to the nature of adaptation to the problems, the operations of GAs must be designed by the users.

Moreover, if the optimization is constrained, the initial population and the generations of new chromosomes must be carefully selected.

Page 65: opt-2010-v2


Genetic Algorithms

An illegal chromosome cannot be decoded into a solution, so it cannot be evaluated.

Using a penalty function is usually a poor approach in this situation.

Some repairing techniques have been proposed to convert an illegal or infeasible chromosome to an acceptable one.

Page 66: opt-2010-v2


Genetic Algorithms -- websites

IlliGAL (http://www-illigal.ge.uiuc.edu/) - Illinois Genetic Algorithms Laboratory - Download technical reports and code

Golem Project (http://demo.cs.brandeis.edu/golem/) - Automatic Design and Manufacture of Robotic Lifeforms

Introduction to Genetic Algorithms Using RPL2 (http://www.epcc.ed.ac.uk/computing/training/document_archive/GAs-course/main.html)

Talk.Origins FAQ on the uses of genetic algorithms, by Adam Marczyk (http://www.talkorigins.org/faqs/genalg/genalg.html)

Genetic algorithm in search and optimization, by Richard Baker (http://www.fenews.com/fen5/ga.html)

Page 67: opt-2010-v2


Genetic Algorithms -- websites

Genetic Algorithm and Markov chain Monte Carlo: Differential Evolution Markov chain makes Bayesian Computing easy (http://www.biometris.nl/Markov%20Chain.pdf)

Differential Evolution using Genetic Algorithm (http://www.icsi.berkeley.edu/~storn/code.html#hist)

Introduction to Genetic Algorithms and Neural Networks (http://www.ai-junkie.com/) including an example windows program

Genetic Algorithm Solves the Toads and Frogs Puzzle (http://www.cut-the-knot.org/SimpleGames/evolutions.shtml) (requires Java)

Not-So-Mad Science: Genetic Algorithms and Web Page Design for Marketers (http://www.marketingprofs.com/4/syrett6.asp) by Matthew Syrett

Page 68: opt-2010-v2


Genetic Algorithms -- websites

http://www.aic.nrl.navy.mil/galist/ - The Genetic Algorithms Archives (maintained by Alan C Schultz at The Navy Center for Applied Research in Artificial Intelligence)

http://www.genetic-programming.org/ - A source of information about the field of genetic programming

http://www.genetic-programming.com/ - The home page of Genetic Programming Inc.

http://www.genetic-programming.com/johnkoza.html - Home page of Professor John R. Koza

http://www-illigal.ge.uiuc.edu:8080/ - International Society for Genetic and Evolutionary Computation

http://www-illigal.ge.uiuc.edu/index.php3 - Illinois Genetic Algorithms Laboratory (IlliGAL)

http://cs.felk.cvut.cz/~xobitko/ga/ - Introduction to Genetic Algorithms

http://ww.lalena.com/ai/tsp/ - Travelling Salesman Problem Using Genetic Algorithms

http://www4.ncsu.edu/eos/users/d/dhloughl/public/stable.htm - Genetic Algorithms Online

Page 69: opt-2010-v2


Genetic Algorithms -- websites

http://cs.gmn.edu/research/gag/ - George Mason University GA Group (GAG)

http://garage.cse.msu.edu/ - Michigan State University, Genetic Algorithms Research and Application Groups (GARAGe)

http://gaslab.cs.unr.edu/ - Genetic Adaptive Systems LAB (GASLAB)

Evolutionary Computation (Journal)

http://www.densis.fee.unicamp.br/~moscato - Memetic Algorithms, Prof Pablo Moscato

http://www.cs.newcastle.edu.au/~mendes - Memetic Algorithms, Softwares

http://groups.yahoo.com/group/MALL/ - Memetic Algorithms Discussion Group

http://www.cs.newcastle.edu.au/~nbi - Newcastle Bioinformatics Group

http://webhost.ua.ac.be/eume/welcome.htm?eume.sidebar.html&0 - European Chapter on Metaheuristics

Page 70: opt-2010-v2


Other Non-Derivative Optimization

Other often-mentioned approaches are Particle Swarm Optimization (PSO) and ant-based approaches (ACS, ACO, etc.).

The overall ideas are similar: they all use fitness values to guide the search, together with some random mechanisms in the search process.

These approaches are often reported to have better search performance than genetic algorithms, although this depends on the problem.

Page 71: opt-2010-v2


Particle Swarm Optimization

Particle Swarm Optimization (PSO) is an optimization technique that provides an evolution-based search. This search algorithm was introduced by R. Eberhart and J. Kennedy in Proc. 1995 IEEE Int'l Conf. on Neural Networks, vol. IV, pp. 1942-1948.

PSO shares many similarities with evolutionary computation techniques such as Genetic Algorithms (GA). The system is initialized with a population of random solutions and searches for optima by updating generations. However, unlike GA, PSO has no evolution operators such as crossover and mutation.

Page 72: opt-2010-v2


Particle Swarm Optimization

PSO algorithms are especially useful for parameter optimization in continuous, multi-dimensional search spaces.

PSO is mainly inspired by social behavior patterns of organisms that live and interact within large groups.

In PSO, the potential solutions, called particles, fly through the problem space by following the current optimum particles.

Page 73: opt-2010-v2


Particle Swarm Optimization

The connection to search and optimization problems is made by assigning direction vectors and velocities to each particle in a multi-dimensional search space.

Each particle then 'moves' or 'flies' through the search space following its velocity vector, which is influenced by the directions and velocities of other particles in its neighborhood.

These localized interactions with neighboring particles propagate through the entire 'swarm' of potential solutions.

Page 74: opt-2010-v2


Particle Swarm Optimization

How much influence a particular point has on other points is determined by its 'fitness'; that is, a measure assigned to a potential solution, which captures how good it is compared to all other solution points.

Hence, an evolutionary idea of 'survival of the fittest' comes into play, as well as a social behavior component through a 'follow the local leader' effect and emergent pattern formation.

Page 75: opt-2010-v2


Particle Swarm Optimization

Each particle keeps track of the coordinates in the problem space that are associated with the best solution (fitness) it has achieved so far. This value is called pbest.

Another "best" value is the best one obtained so far by any particle in the neighborhood of the particle. This location is called lbest.

When a particle takes all the population as its topological neighbors, the best value is a global best and is called gbest.

Page 76: opt-2010-v2


Particle Swarm Optimization

The particle swarm optimization concept consists of, at each time step, changing the velocity of (accelerating) each particle toward its pbest and gbest (global version of PSO) or lbest locations (local version of PSO).

Acceleration is weighted by a random term, with separate random numbers being generated for acceleration toward pbest and gbest (or lbest) locations.

Page 77: opt-2010-v2


PSO Algorithm

After finding the two best values, the particle updates its velocity and position with the following equations (a) and (b).

(a) v[ ] = v[ ] + c1 * rand() * (pbest[ ] - present[ ]) + c2 * rand() * (gbest[ ] - present[ ])

(b) present[ ] = present[ ] + v[ ]

Page 78: opt-2010-v2


PSO Algorithm

1) Initialize the population - locations and velocities

2) Evaluate the fitness of each individual particle and update its best so far (pBest)

3) Keep track of the highest fitness among all individuals (gBest)

4) Modify velocities based on the pBest and gBest positions

5) Update the particles' positions

6) Terminate if the condition is met

7) Go to Step 2
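The seven steps can be put together as a small global-best PSO. This is a sketch under stated assumptions rather than the lecture's implementation: fitness is maximized, the stop condition is a fixed number of steps, and the velocity clamp `v_max`, the initialization box `bound`, and all default values are illustrative choices.

```python
import random


def pso(fitness, dim, n_particles=20, n_steps=100, c1=2.0, c2=2.0,
        bound=5.0, v_max=1.0, seed=0):
    """Global-best PSO maximizing `fitness`; particles start in [-bound, bound]^dim."""
    rng = random.Random(seed)
    # 1) Initialize the population: locations and velocities.
    xs = [[rng.uniform(-bound, bound) for _ in range(dim)]
          for _ in range(n_particles)]
    vs = [[rng.uniform(-v_max, v_max) for _ in range(dim)]
          for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_fit = [float("-inf")] * n_particles
    gbest, gbest_fit = xs[0][:], float("-inf")

    for _ in range(n_steps):  # 6)-7) a fixed step budget is the stop condition
        # 2) Evaluate each particle and update its personal best.
        for i in range(n_particles):
            f = fitness(xs[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = xs[i][:], f
        # 3) Track the best personal best in the whole swarm.
        g = max(range(n_particles), key=lambda i: pbest_fit[i])
        gbest, gbest_fit = pbest[g][:], pbest_fit[g]
        # 4)-5) Update velocities, then positions.
        for i in range(n_particles):
            for d in range(dim):
                v = (vs[i][d]
                     + c1 * rng.random() * (pbest[i][d] - xs[i][d])
                     + c2 * rng.random() * (gbest[d] - xs[i][d]))
                vs[i][d] = max(-v_max, min(v_max, v))  # velocity clamp
                xs[i][d] += vs[i][d]
    return gbest, gbest_fit
```

For example, `pso(lambda x: -sum(v * v for v in x), dim=2)` searches for the maximum of a negated sphere function, whose optimum is at the origin. The result is not guaranteed to reach that optimum; it is only the best point any particle has visited.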

Page 79: opt-2010-v2

PSO Algorithm

Page 80: opt-2010-v2

Particle Swarm Optimization

PSO shares many common points with GA. However, PSO does not have genetic operators like crossover and mutation. Particles update themselves with the internal velocity. They also have memory, which is important to the algorithm.

Compared with genetic algorithms (GAs), the information sharing mechanism in PSO is significantly different.

Page 81: opt-2010-v2

Particle Swarm Optimization

In GAs, chromosomes share information with each other, so the whole population moves like one group toward an optimal area. In PSO, only gbest (or lbest) gives out information to the others. It is a one-way information-sharing mechanism.

The evolution only looks for the best solution. Compared with GA, all the particles tend to converge to the best solution quickly even in the local version in most cases.

Page 82: opt-2010-v2

Ant Colony Optimization

Ant colony optimization (ACO) is a population-based metaheuristic that can be used to find approximate solutions to difficult optimization problems.

An analogy with the way ant colonies function has suggested the definition of a new computational paradigm.

M. Dorigo, V. Maniezzo and A. Colorni, “Ant System: Optimization by a Colony of Cooperating Agents,” IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 26, no. 1, pp. 29-41, 1996.

Page 83: opt-2010-v2

Ant Colony Optimization

In ACO, a set of agents called artificial ants search for good solutions to a given optimization problem.

In ACO, the optimization problem is transformed into the problem of finding the best path on a weighted graph.

The artificial ants incrementally build solutions by moving on the graph.

The solution construction process is stochastic and is biased by a pheromone model.

Page 84: opt-2010-v2

What is “pheromone”?

A moving ant lays some pheromone on the paths it traverses, thus marking each path with a trail of this substance.

While an isolated ant moves essentially at random, an ant encountering a previously laid trail can detect it and follow it with a high probability.

Page 85: opt-2010-v2

An example with Real Ants

(a) Ants follow a path between A and E.

(b) An obstacle is interposed.

(c) On the shorter path more pheromone is laid down.

Page 86: opt-2010-v2

An Example with Artificial Ants

(a) The initial graph with distances.

(b) At time t = 0 there is no trail on the graph edges.

(c) At time t = 1 trail is stronger on shorter edges.

Page 87: opt-2010-v2

Ant System

Each ant is a simple agent with the following characteristics:

it chooses the next town to go to with a probability that is a function of a heuristic (distance) and the amount of trail (pheromone) present on the connecting edge.

to force an ant to make legal tours, transitions to already visited towns are disallowed until a tour is completed.

when it completes a tour, it lays a substance called trail (pheromone) on each edge visited.
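The three characteristics above can be sketched for a small symmetric tour problem. This is an illustrative sketch, not the code of the cited paper: the weight `trail^alpha * (1/distance)^beta`, the deposit `q / length`, the default parameter values, and the names `construct_tour` and `lay_trail` are assumptions chosen to match the description.

```python
import random


def construct_tour(dist, trail, start=0, alpha=1.0, beta=2.0, rng=random):
    """Build one legal tour: every town is visited exactly once."""
    n = len(dist)
    tour = [start]
    visited = {start}
    while len(tour) < n:
        i = tour[-1]
        # Already visited towns are disallowed until the tour is complete.
        candidates = [j for j in range(n) if j not in visited]
        # Closer towns and stronger trails get larger weights.
        weights = [trail[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                   for j in candidates]
        j = rng.choices(candidates, weights=weights, k=1)[0]
        tour.append(j)
        visited.add(j)
    return tour


def lay_trail(trail, tour, dist, q=1.0):
    """After a tour is complete, deposit trail on every edge it used."""
    edges = list(zip(tour, tour[1:] + tour[:1]))  # includes the closing edge
    length = sum(dist[a][b] for a, b in edges)
    for a, b in edges:
        trail[a][b] += q / length  # shorter tours deposit more per edge
        trail[b][a] += q / length  # symmetric problem assumed
    return length
```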

Page 88: opt-2010-v2

Pseudo Code of Ant System

The ACO metaheuristic is:

Set parameters, initialize pheromone trails
SCHEDULE_ACTIVITIES
    ConstructAntSolutions
    DaemonActions {optional}
    UpdatePheromones
END_SCHEDULE_ACTIVITIES

Page 89: opt-2010-v2

Schedule_Activities

The Schedule_Activities construct does not specify how the three algorithmic components are scheduled and synchronized.

In most applications of ACO to NP-hard problems, however, the three algorithmic components undergo a loop that consists of (i) the construction of solutions by all ants, (ii) the (optional) improvement of these solutions via the use of a local search algorithm, and (iii) the update of the pheromones.
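The loop (i) to (iii) can be sketched as a generic driver in which the problem-specific parts are passed in as functions. The function name `run_aco`, its parameters, and the convention that a lower cost is better are assumptions made for this illustration.

```python
def run_aco(construct, cost, update_pheromones, n_ants=10, n_iterations=50,
            local_search=None):
    """Generic ACO loop; the callables are supplied by the caller."""
    best, best_cost = None, float("inf")
    for _ in range(n_iterations):
        # (i) every ant constructs one solution
        solutions = [construct() for _ in range(n_ants)]
        # (ii) optional improvement of each solution by local search
        if local_search is not None:
            solutions = [local_search(s) for s in solutions]
        # (iii) pheromone update based on this iteration's solutions
        update_pheromones(solutions)
        # remember the best solution seen so far (lower cost is better)
        for s in solutions:
            c = cost(s)
            if c < best_cost:
                best, best_cost = s, c
    return best, best_cost
```

Keeping the three components as separate callables mirrors the point that the schedule itself is left open by the metaheuristic.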

Page 90: opt-2010-v2

ConstructAntSolutions

At each construction step, the current partial solution $s^p$ is extended by adding a feasible solution component from the set of feasible neighbors $N(s^p)$.

The process of constructing solutions can be regarded as a path on the construction graph $G_C = (V, E)$.

The allowed paths in $G_C$ are implicitly defined by the solution construction mechanism that defines the set $N(s^p)$ with respect to a partial solution $s^p$.

Page 91: opt-2010-v2

ConstructAntSolutions

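The choice of a solution component is made probabilistically. The rules for the probabilistic choice of solution components vary across ACO variants. The best-known rule is that of ant system (AS):

$$p(c_{ij} \mid s^p) = \frac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{c_{il} \in N(s^p)} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}}, \quad \forall c_{ij} \in N(s^p),$$

where $\tau_{ij}$ and $\eta_{ij}$ are the pheromone value and the heuristic value associated with the component $c_{ij}$, $N(s^p)$ is the set of feasible components given the partial solution $s^p$, and $\alpha$ and $\beta$ are parameters representing the relative importance of pheromone and heuristic information.

A component is then drawn according to these probabilities by roulette wheel selection.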
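Roulette wheel selection gives each feasible component a slice of the wheel proportional to its probability and then spins once. The sketch below first computes AS-style probabilities from pheromone and heuristic values and then draws a component; the dictionary-based data layout, the function names, and the sample numbers are assumptions made for illustration.

```python
import random


def as_probabilities(tau, eta, feasible, alpha=1.0, beta=2.0):
    """Probability of each feasible component, proportional to tau^alpha * eta^beta."""
    scores = {j: (tau[j] ** alpha) * (eta[j] ** beta) for j in feasible}
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}


def roulette_wheel(probabilities, rng=random):
    """Spin once: walk the cumulative distribution until it passes r.

    Assumes `probabilities` is non-empty and sums to 1.
    """
    r = rng.random()
    cumulative = 0.0
    for component, p in probabilities.items():
        cumulative += p
        if r < cumulative:
            return component
    return component  # guard against floating-point round-off


tau = {"a": 1.0, "b": 1.0, "c": 2.0}
eta = {"a": 1.0, "b": 2.0, "c": 1.0}
probs = as_probabilities(tau, eta, feasible=["a", "b", "c"], alpha=1.0, beta=1.0)
print(probs)
print(roulette_wheel(probs))
```

Here component "a" has the weakest pheromone and heuristic values, so it receives the smallest slice of the wheel, but it can still be chosen.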

Page 92: opt-2010-v2

DaemonActions

Once solutions have been constructed, and before updating the pheromone values, often some problem specific actions may be required. These are often called daemon actions, and can be used to implement problem specific and/or centralized actions, which cannot be performed by single ants.

The most used daemon action is the application of local search to the constructed solutions: the locally optimized solutions are then used to decide which pheromone values to update.

Page 93: opt-2010-v2

UpdatePheromones

The aim of the pheromone update is to increase the pheromone values associated with good solutions, and to decrease those that are associated with bad ones (not used).

Usually, this is achieved (i) by decreasing all the pheromone values through pheromone evaporation, and (ii) by increasing the pheromone levels associated with a chosen set of good solutions.

In short: evaporation applies to all edges, while pheromone is added only on the edges of good solutions.
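The two-part update can be sketched as follows. This is a minimal illustration, assuming pheromone is stored in a dictionary keyed by edge and that the caller supplies a `quality` function giving the deposit for a solution; the evaporation rate `rho`, the additive deposit form, and the function name are assumptions, and concrete ACO variants differ in these details.

```python
def update_pheromones(tau, good_solutions, quality, rho=0.1):
    """Evaporate on every edge, then reinforce edges used by good solutions.

    tau: dict mapping edge -> pheromone value (modified in place)
    good_solutions: list of solutions, each a list of edges
    quality: function returning the amount to deposit for a solution
    rho: evaporation rate in (0, 1]
    """
    for edge in tau:
        tau[edge] *= (1.0 - rho)      # (i) evaporation for all edges
    for solution in good_solutions:
        deposit = quality(solution)
        for edge in solution:
            tau[edge] += deposit      # (ii) add pheromone on good solutions' edges
    return tau
```

Edges that no good solution uses only ever evaporate, so their pheromone decays geometrically toward zero.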

Page 94: opt-2010-v2

Main ACO Algorithms

Several special cases of the ACO metaheuristic have been proposed in the literature.

Ant System (Dorigo 1992, Dorigo et al. 1991, 1996),

Ant Colony System (ACS) (Dorigo & Gambardella 1997), and

MAX-MIN Ant System (MMAS) (Stützle & Hoos 2000).

Page 95: opt-2010-v2

Ant Systems

Ant system (AS) was the first ACO algorithm proposed in the literature.

Its main characteristic is that the pheromone values are updated by all ants that have completed the tour.

When constructing solutions, ants in AS traverse the construction graph and make a probabilistic decision at each vertex. The transition probability is
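The formula itself did not survive in this transcript (it was most likely an image on the slide). As a hedged reconstruction, the commonly stated Ant System rules for ant $k$ are given below; the notation ($s_k^p$ for the partial solution of ant $k$, $N(s_k^p)$ for its feasible components, $\rho$ for the evaporation rate, $L_k$ for the tour length of ant $k$, $m$ for the number of ants, and $Q$ for a constant) is an assumption and may differ from the original slide.

```latex
p(c_{ij} \mid s_k^p) =
\begin{cases}
\dfrac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}
      {\sum_{c_{il} \in N(s_k^p)} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}}
  & \text{if } c_{ij} \in N(s_k^p), \\[2ex]
0 & \text{otherwise,}
\end{cases}
\qquad
\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \sum_{k=1}^{m} \Delta\tau_{ij}^{k},
\quad
\Delta\tau_{ij}^{k} =
\begin{cases}
Q / L_k & \text{if ant } k \text{ used edge } (i, j), \\
0 & \text{otherwise.}
\end{cases}
```

The second expression reflects the main characteristic of AS: every ant that has completed a tour contributes to the update, with shorter tours depositing more.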

Page 96: opt-2010-v2

Ant Colony Systems

The first major improvement over the original ant system was ant colony system (ACS), introduced by Dorigo and Gambardella (1997).

The main difference between ACS and AS is the decision rule used by the ants during the construction process.

Ants in ACS use the so-called pseudorandom proportional rule: the probability of selecting the next edge depends on a random variable q uniformly distributed over [0, 1] and a parameter q0. If q < q0, then, among the feasible components, the one that maximizes the product of pheromone and heuristic value is chosen; otherwise the same equation as in AS is used.
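The pseudorandom proportional rule can be sketched as a single decision function. The score `tau * eta^beta`, the default values `q0 = 0.9` and `beta = 2.0`, and the name `acs_choose` are assumptions made for this illustration.

```python
import random


def acs_choose(tau, eta, feasible, q0=0.9, beta=2.0, rng=random):
    """ACS decision rule over a non-empty list of feasible components."""
    scores = [tau[j] * eta[j] ** beta for j in feasible]
    if rng.random() < q0:
        # Exploitation: take the component with the largest score.
        return feasible[scores.index(max(scores))]
    # Exploration: probabilistic choice proportional to the score, as in AS.
    return rng.choices(feasible, weights=scores, k=1)[0]
```

A value of q0 close to 1 makes the ants mostly greedy, while a value close to 0 recovers the purely probabilistic behavior of AS.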

Page 97: opt-2010-v2

MAX-MIN Ant Systems

MAX-MIN ant system (MMAS) is another improvement, proposed by Stützle and Hoos (2000), over the original ant system idea.

MMAS differs from AS in that (i) only the best ant adds pheromone trails, and (ii) the minimum and maximum values of the pheromone are explicitly limited (in AS and ACS these values are limited implicitly, that is, the value of the limits is a result of the algorithm working rather than a value set explicitly by the algorithm designer).
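Both differences fit into one update function. This is a minimal sketch, assuming pheromone is stored in a dictionary keyed by edge and that the best ant deposits `1 / best_cost` on its edges; the limits `tau_min` and `tau_max`, the rate `rho`, and the function name are illustrative assumptions rather than recommended settings.

```python
def mmas_update(tau, best_solution, best_cost, rho=0.02, tau_min=0.01, tau_max=1.0):
    """Evaporate everywhere, deposit only along the best solution, then clamp."""
    best_edges = set(best_solution)
    for edge in tau:
        value = (1.0 - rho) * tau[edge]
        if edge in best_edges:
            value += 1.0 / best_cost  # (i) only the best ant adds pheromone
        tau[edge] = min(tau_max, max(tau_min, value))  # (ii) explicit limits
    return tau
```

The lower limit keeps every edge selectable with a small probability, which helps the search avoid stagnating on one solution.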

Page 98: opt-2010-v2

Other Non-derivative Optimization

This is because GAs and PSO perform solution-wise search, whereas swarm search algorithms perform component-wise search.

Also, solution-wise search algorithms are more easily trapped in a local minimum if the initial population has some local-optimum properties.

Component-wise search algorithms can more easily escape from such an initial local optimum.

Page 99: opt-2010-v2

Epilogue

Traditional optimization approaches work well, but only when the mathematical form of the problem holds and can be manipulated.

Non-derivative optimization is a useful family of optimization techniques, but you need to adapt the methodology to the problem you face.

An often-used idea is to map your problem onto one of the classical NP-hard problems, such as the Traveling Salesman Problem (TSP) or the Quadratic Assignment Problem (QAP).

Page 100: opt-2010-v2

Epilogue

Non-derivative optimization cannot guarantee the success of the search.

In most unsuccessful cases, either the search converges too slowly or it gets stuck in local optima.

The guidance is not strong enough.

Randomness is not sufficiently used.

Page 101: opt-2010-v2

Epilogue

How to strengthen and balance these two factors, guidance and randomness, is an important issue in the design of such search approaches.

Page 102: opt-2010-v2

Thank you for your attention!

Any questions?

Shun-Feng Su, Professor, Department of Electrical Engineering,

National Taiwan University of Science and Technology

E-mail: [email protected]