Genetic Algorithms A technique for those who do not know how to solve the problem!

Page 1:

Genetic Algorithms

A technique for those who do not know how to solve the problem!

Page 2:

Selection Methods

• Fitness-Proportionate Selection

– Roulette Wheel

– Stochastic Universal Sampling

• Rank Selection

• Tournament Selection

• Steady-State Selection

• Sigma Scaling

• Elitism

Page 3:

Fitness-Proportionate Selection (Roulette Wheel)

• Used by Holland’s original GA.

• Here the number of times an individual is expected to reproduce is equal to individual_Fitness/AveFitness = N * Pi, where Pi is the individual's share of the total population fitness.

Method

1. Sum the total expected values of individuals in Pop. Call this sum T.

2. Repeat N times, where N is the number of individuals in Pop:

– Choose a random number r between 0 and T.

– Loop through the individuals in Pop, summing the expected values, until the sum is greater than or equal to r. The individual whose expected value puts the sum over r is the one selected.

Faults: In small populations the actual number of offspring allocated to an individual is often far from its expected value.
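The two-step method above can be sketched in plain C++ without GAlib. This is only an illustration: the names spinWheel and rouletteSelect are made up for this example, r is drawn as a real number, and std::mt19937 is an assumed choice of random number generator.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Walk the wheel: return the index of the first individual whose running
// total of expected values reaches r (0 <= r <= T, T = sum of expVal).
std::size_t spinWheel(const std::vector<double>& expVal, double r) {
    assert(!expVal.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < expVal.size(); ++i) {
        sum += expVal[i];
        if (sum >= r) return i;
    }
    return expVal.size() - 1;  // guard against floating-point round-off
}

// Select n individuals with n independent spins (hypothetical helper).
std::vector<std::size_t> rouletteSelect(const std::vector<double>& expVal,
                                        std::size_t n, std::mt19937& rng) {
    const double total = std::accumulate(expVal.begin(), expVal.end(), 0.0);
    std::uniform_real_distribution<double> spin(0.0, total);
    std::vector<std::size_t> chosen;
    chosen.reserve(n);
    for (std::size_t k = 0; k < n; ++k)
        chosen.push_back(spinWheel(expVal, spin(rng)));
    return chosen;
}
```

Because every spin is independent, nothing stops a small population from receiving several copies of a weak individual, which is the fault noted above.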

Page 4:

Stochastic Universal Sampling

• In order to generate individuals that better follow their expected values we can use Stochastic Universal Sampling (SUS).

• Here we make one call to rand() and select N equally spaced individuals from the population.

• The Roulette Wheel method, by contrast, makes N random calls to select N individuals.

• Start with a random number between 0 and 1/N (measured as a fraction of the total fitness); each later pointer lies 1/N further along the wheel.
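A minimal sketch of the single-spin idea, assuming raw fitness values as input. The function name susSelect and the parameter u (the one random number, in [0, 1)) are hypothetical; the pointers are spaced total/N apart, which is the 1/N spacing once fitness is normalised to sum to 1.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Stochastic universal sampling: one random number u in [0, 1) places n
// equally spaced pointers on the wheel; each pointer selects one individual.
std::vector<std::size_t> susSelect(const std::vector<double>& fitness,
                                   std::size_t n, double u) {
    assert(!fitness.empty() && n > 0 && u >= 0.0 && u < 1.0);
    const double total = std::accumulate(fitness.begin(), fitness.end(), 0.0);
    const double step = total / static_cast<double>(n);  // pointer spacing
    std::vector<std::size_t> chosen;
    chosen.reserve(n);
    double sum = 0.0;
    for (std::size_t i = 0; i < fitness.size(); ++i) {
        sum += fitness[i];
        // Pointer number k sits at (u + k) * step; take every pointer that
        // falls inside individual i's slice of the wheel.
        while (chosen.size() < n &&
               sum > (u + static_cast<double>(chosen.size())) * step)
            chosen.push_back(i);
    }
    while (chosen.size() < n)  // round-off guard
        chosen.push_back(fitness.size() - 1);
    return chosen;
}
```

With fitness values 1, 3, 2, 2 (expected values 0.5, 1.5, 1, 1) every individual is selected either the floor or the ceiling of its expected value, whatever u is.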

Page 5:

Sigma Scaling

• Fitness-proportionate selection suffers from premature convergence: more emphasis is placed on exploitation as opposed to exploration. Sigma scaling addresses this problem.

• It keeps the selection pressure relatively constant over a run.

ExpVal(i,t) = 1 + (fitness(i) - mean(t)) / (2 * SD(t)) if SD(t) ≠ 0; otherwise ExpVal(i,t) = 1.0

This, for example, gives an individual whose fitness is one SD above the mean 1.5 expected offspring out of N.
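The formula can be checked with a short C++ sketch. Two things here are assumptions that the slide does not state: the population (divide-by-N) standard deviation is used, and expected values below minExpVal are raised to that floor so they never go negative.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sigma scaling: ExpVal = 1 + (f - mean) / (2 * SD), or 1.0 when SD == 0.
// minExpVal is an assumed floor for individuals far below the mean.
std::vector<double> sigmaScale(const std::vector<double>& fitness,
                               double minExpVal = 0.0) {
    assert(!fitness.empty());
    const double n = static_cast<double>(fitness.size());
    double mean = 0.0;
    for (double f : fitness) mean += f;
    mean /= n;
    double var = 0.0;
    for (double f : fitness) var += (f - mean) * (f - mean);
    const double sd = std::sqrt(var / n);  // population SD (assumption)

    std::vector<double> expVal;
    expVal.reserve(fitness.size());
    for (double f : fitness) {
        double e = (sd != 0.0) ? 1.0 + (f - mean) / (2.0 * sd) : 1.0;
        if (e < minExpVal) e = minExpVal;
        expVal.push_back(e);
    }
    return expVal;
}
```

For the fitness values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the SD is 2, so the individual with fitness 7 (one SD above the mean) gets an expected value of 1.5.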

Page 6:

Elitism

• First used by Kenneth De Jong (1975).

• Here the best individuals are carried over to the new population.

• This often significantly improves the GA’s performance.

In GAlib we set this with the command

ga.elitist(gaTrue);

The best individual is copied over.

Page 7:

Rank Selection

• This scheme is designed to prevent too-quick convergence.

• Here the rank of an individual is used, instead of its absolute fitness, in the selection process.

• The probability of selection is Pi = (N-i+1)/(1+2+..+N), where i is the individual's rank and i = 1 is the fittest.

Select by generating the array of partial sums and indexing into it with a random number.
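One way to read "array of partial sums" is sketched below. The names rankPartialSums and pickRank are hypothetical, ranks are 0-based here (0 is the fittest), and the population is assumed to be sorted by fitness already. Integer weights N, N-1, ..., 1 stand in for the probabilities, so the random number r is an integer below N(N+1)/2.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Partial sums of the rank weights N, N-1, ..., 1 (best rank gets weight N).
std::vector<long> rankPartialSums(std::size_t n) {
    std::vector<long> sums(n);
    long running = 0;
    for (std::size_t i = 0; i < n; ++i) {
        running += static_cast<long>(n - i);
        sums[i] = running;
    }
    return sums;
}

// Map a random integer r in [0, N(N+1)/2) to a 0-based rank (0 = best)
// by binary search in the partial sums.
std::size_t pickRank(const std::vector<long>& sums, long r) {
    assert(!sums.empty() && r >= 0 && r < sums.back());
    return static_cast<std::size_t>(
        std::upper_bound(sums.begin(), sums.end(), r) - sums.begin());
}
```

For N = 4 the partial sums are 4, 7, 9, 10, so of the ten possible values of r four select the best individual and only one selects the worst.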

Page 8:

Tournament Selection

• Fitness proportionate methods require two passes through the population

• Rank scaling requires sorting.

• Here k (often two) individuals are chosen at random from the population. The best of them is selected and inserted into the new population.

• Do this N times
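A sketch of the loop above in plain C++. The function names are invented for the example, and drawing the k competitors with replacement is an assumption; the slide does not say whether an individual may meet itself.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One tournament: draw k individuals at random (with replacement, an
// assumption) and return the index of the fittest one.
std::size_t tournament(const std::vector<double>& fitness, std::size_t k,
                       std::mt19937& rng) {
    assert(!fitness.empty() && k > 0);
    std::uniform_int_distribution<std::size_t> pick(0, fitness.size() - 1);
    std::size_t best = pick(rng);
    for (std::size_t j = 1; j < k; ++j) {
        const std::size_t challenger = pick(rng);
        if (fitness[challenger] > fitness[best]) best = challenger;
    }
    return best;
}

// Run N tournaments to fill a new population of the same size.
std::vector<std::size_t> tournamentSelect(const std::vector<double>& fitness,
                                          std::size_t k, std::mt19937& rng) {
    std::vector<std::size_t> chosen;
    chosen.reserve(fitness.size());
    for (std::size_t i = 0; i < fitness.size(); ++i)
        chosen.push_back(tournament(fitness, k, rng));
    return chosen;
}
```

No sorting and no pass to sum the fitness values is needed, which is the advantage over the rank and fitness-proportionate methods.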

Page 9:

Steady-State Selection

• This scheme is used when we would like successive populations to overlap.

• Here a percentage of the old population is first copied to the new population.

• The remainder of the population is then filled using crossover etc.

• The fraction of the new individuals at each generation is called the “generation gap”

Page 10:

GAlib’s Selection Scheme Constructors

• GARankSelector – The rank selector picks the best member of the population every time.

• GARouletteWheelSelector – This selection method picks an individual based on the magnitude of the fitness score relative to the rest of the population. The higher the score, the more likely an individual will be selected. Any individual has a probability p of being chosen where p is equal to the fitness of the individual divided by the sum of the fitnesses of each individual in the population.

Page 11:

More Selection Schemes

• GATournamentSelector – The tournament selector uses the roulette wheel method to select two individuals then picks the one with the higher score. The tournament selector typically chooses higher valued individuals more often than the RouletteWheelSelector.

• GADSSelector – The deterministic sampling selector (DS) uses a two-staged selection procedure. In the first stage, each individual's expected representation is calculated. A temporary population is filled using the individuals with the highest expected numbers. Any remaining positions are filled by first sorting the original individuals according to the decimal part of their expected representation, then selecting those highest in the list. The second stage of selection is uniform random selection from the temporary population.

Page 12:

More Selection Schemes

• GASRSSelector – The stochastic remainder sampling selector (SRS) uses a two-staged selection procedure. In the first stage, each individual's expected representation is calculated. A temporary population is filled using the individuals with the highest expected numbers. Any fractional expected representations are used to give the individual more likelihood of filling a space. For example, an individual with e of 1.4 will have 1 position then a 40% chance of a second position. The second stage of selection is uniform random selection from the temporary population.

• GAUniformSelector – The stochastic uniform sampling selector picks randomly from the population. Any individual in the population has a probability p of being chosen where p is equal to 1 divided by the population size.

Page 13:

GAlib Scaling Constructors

• GANoScaling() – The fitness scores are identical to the objective scores. No scaling takes place.

• GALinearScaling(float c = gaDefLinearScalingMultiplier) – The fitness scores are derived from the objective scores using the linear scaling method described in Goldberg's book. You can specify the scaling coefficient. Negative objective scores are not allowed with this method. Objective scores are converted to fitness scores using the relation f = a * obj + b where a and b are calculated based upon the objective scores of the individuals in the population as described in Goldberg's book.

[Figure: fitness f plotted against objective score obj, showing the line f = a*obj + b]

Page 14:

More Scalings

• GASigmaTruncationScaling(float c = gaDefSigmaTruncationMultiplier) – Use this scaling method if your objective scores will be negative. It scales based on the variation from the population average and truncates arbitrarily at 0. The mapping from objective to fitness score for each individual is given by

f = obj - (obj_ave - c * obj_dev)

GAlib Usage

GASigmaTruncationScaling sigmaTruncation; // Declare object

ga.scaling(sigmaTruncation);

Page 15:

More Scalings

• GAPowerLawScaling(int k = gaDefPowerScalingFactor) – Power law scaling maps objective scores to fitness scores using a power relationship defined as f = obj ^ k

• GASharing(GAGenomeComparator func = 0, float cutoff = gaDefSharingCutoff, float alpha = 1) – This scaling method is used to do speciation. The fitness score of an individual is derived from its objective score by comparing the individual against the other individuals in the population. If there are other similar individuals then the fitness is derated. The distance function is used to specify how similar to each other two individuals are. A distance function must return a value of 0 or higher, where 0 means that the two individuals are identical (no diversity).

Page 16:

Crossover Schemes available for GA1DArrayGenome<T>

There are many crossover methods built into GAlib. Generally they are genome specific.

1. Single point crossover

2. Two point crossover

3. Uniform crossover

4. EvenOdd crossover

5. Partial match crossover

6. Order crossover

7. Cycle crossover

Page 17:

Two Point Crossover

Chromosome 1: 11011*0010*0110110

Chromosome 2: 01011*1100*0011110

Offspring 1: 11011*1100*0110110

Offspring 2: 01011*0010*0011110
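The exchange shown above can be written as a short C++ sketch on strings. The function name twoPointCrossover is hypothetical, and the two cut points a and b are passed in so that the result is repeatable; a GA would draw them at random.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Two-point crossover: the children are copies of the parents with the
// genes in positions [a, b) exchanged.
std::pair<std::string, std::string> twoPointCrossover(const std::string& p1,
                                                      const std::string& p2,
                                                      std::size_t a,
                                                      std::size_t b) {
    assert(p1.size() == p2.size() && a <= b && b <= p1.size());
    std::string c1 = p1;
    std::string c2 = p2;
    for (std::size_t i = a; i < b; ++i) std::swap(c1[i], c2[i]);
    return {c1, c2};
}
```

Called with the two chromosomes above (without the * markers) and cut points 5 and 9, it returns the two offspring shown.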


Page 18:

Uniform Crossover

In this method each gene of the offspring is selected randomly from the corresponding genes of the parents.

One-point and two-point crossover produce two offspring, whilst uniform crossover produces only one.
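A minimal sketch of the single-child form described above. The name uniformCrossover is invented for the example, and a fair coin per gene (probability 0.5) is an assumption; other mixing ratios are possible.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Uniform crossover producing one child: each gene is copied from one
// parent or the other according to a coin flip.
std::string uniformCrossover(const std::string& p1, const std::string& p2,
                             std::mt19937& rng) {
    assert(p1.size() == p2.size());
    std::bernoulli_distribution coin(0.5);  // fair coin (assumption)
    std::string child = p1;
    for (std::size_t i = 0; i < child.size(); ++i)
        if (coin(rng)) child[i] = p2[i];
    return child;
}
```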

Page 19:

Creating your own Crossover

You can write your own crossover that is specific to your genome. In your GA you announce to the genome that you have done this by using the following command

Genome.crossover(MyCrossover);

You then must write the code for MyCrossover.

Page 20:

GAlib Sexual Crossover

Sexual crossover takes four arguments: two parents and two children. If one child is nil, the operator should be able to generate a single child. The genomes have already been allocated, so the crossover operator should simply modify the contents of the child genome as appropriate. The crossover function should return the number of crossovers that occurred.

Your crossover function should be able to operate on one or two children, so be sure to test the child pointers to see if the genetic algorithm is asking you to create one or two children.

Page 21:

Example Crossover

int MyCrossover(const GAGenome& p1, const GAGenome& p2, GAGenome* c1, GAGenome* c2){
  // Cast the generic genomes to the concrete genome type.
  GA1DBinaryStringGenome &mom=(GA1DBinaryStringGenome &)p1;
  GA1DBinaryStringGenome &dad=(GA1DBinaryStringGenome &)p2;
  int n=0;  // counts the children that were filled in
  // Pick one crossover site; len is the number of bits from the site to the end.
  unsigned int site = GARandomInt(0, mom.length());
  unsigned int len = mom.length() - site;
  if(c1){  // first child: head from mom, tail from dad
    GA1DBinaryStringGenome &sis=(GA1DBinaryStringGenome &)*c1;
    sis.copy(mom, 0, 0, site);
    sis.copy(dad, site, site, len);
    n++;
  }
  if(c2){  // second child: head from dad, tail from mom
    GA1DBinaryStringGenome &bro=(GA1DBinaryStringGenome &)*c2;
    bro.copy(dad, 0, 0, site);
    bro.copy(mom, site, site, len);
    n++;
  }
  return n;
}

Page 22:

Permutation Crossovers

• Required for TSP

• Required for Decoding messages etc

• Random crossover of two permutations seldom results in another permutation.

• A permutation space is N! in size.

• SO!

Page 23:

Categories of Perm. Crossovers

• Disqualification

– Just kill the bad chromosomes. Why is this bad?

• Repairing

– Invalid chromosomes are fixed.

• Inventing Specialized Operators

– Crossovers generate only legal permutations.

• Transformation

– Transform permutation space into a vector space and cross in vector space.

Page 24:

Permutation Operators

• Partially mapped crossover (PMX)

• Order crossover (OX)

• Uniform order crossover

• Edge recombination

• There are many others that we will not discuss.

Page 25:

Partially Mapped Crossover (Goldberg & Lingle, 1985)

Given two parents s and t, PMX randomly picks two crossover points. The child is constructed in the following way. Starting with a copy of s, the positions between the crossover points are, one by one, set to the values of t in these positions. This is performed by applying a swap to s. The swap is defined by the corresponding values in s and t within the selected region.

Page 26:

PMX example

Parent s: 6 2 3 4 1 7 5

Parent t: 5 2 4 1 3 7 6

Successive states of the child as the positions in the crossover region are set to the values of t:

6 2 3 4 1 7 5 (copy of s)

6 2 3 1 4 7 5 (4 and 1 swapped)

6 2 4 1 3 7 5 (3 and 4 swapped)

6 2 4 1 3 7 5 (no change)

First offspring: 6 2 4 1 3 7 5

For the second offspring just swap the parents and apply the same operation.
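The swap-based construction of PMX described earlier can be sketched as follows. The function name pmx is hypothetical, positions are 0-based with the region given as [lo, hi), and the region is processed left to right, so intermediate states may differ from a walk-through that visits the positions in another order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Swap-based PMX: start from a copy of s; for every position p in the
// region [lo, hi), make child[p] equal to t[p] by swapping the current
// occupant of p with the place where t[p] currently sits.
std::vector<int> pmx(const std::vector<int>& s, const std::vector<int>& t,
                     std::size_t lo, std::size_t hi) {
    assert(s.size() == t.size() && lo <= hi && hi <= s.size());
    std::vector<int> child = s;
    for (std::size_t p = lo; p < hi; ++p) {
        auto it = std::find(child.begin(), child.end(), t[p]);
        assert(it != child.end());  // both parents hold the same values
        std::iter_swap(child.begin() + static_cast<std::ptrdiff_t>(p), it);
    }
    return child;
}
```

With s = 6 2 3 4 1 7 5, t = 5 2 4 1 3 7 6 and the region covering positions 3 to 6 (lo = 2, hi = 6), the sketch returns 6 2 4 1 3 7 5. Because only swaps are applied, the child is always a legal permutation.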

Page 27:

Order Crossover (Davis, 1985)

• This crossover first determines two crossover points. It then copies the segment between them from one of the parents into the child. The remaining alleles are copied into the child (left to right) in the order in which they occur in the other parent.

• Switching the roles of the parents will generate the other child.

Page 28:

Order Crossover Example

Parent 1: 1 2 3 4 5 6 7 8 9

Parent 2: 3 4 7 2 8 9 1 6 5

Segment copied from parent 1: 4 5 6 7

The remaining alleles are 1 2 3 8 9. Their order in the other parent is 3 2 8 9 1.

Child: 3 2 8 4 5 6 7 9 1
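A sketch of the left-to-right variant described above, with 0-based positions and the copied segment given as [lo, hi). The function name orderCrossover is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Order crossover: copy p1[lo, hi) into the child, then fill the other
// positions left to right with the remaining values in p2's order.
std::vector<int> orderCrossover(const std::vector<int>& p1,
                                const std::vector<int>& p2,
                                std::size_t lo, std::size_t hi) {
    assert(p1.size() == p2.size() && lo <= hi && hi <= p1.size());
    std::vector<int> child(p1.size());
    std::unordered_set<int> inSegment;
    for (std::size_t i = lo; i < hi; ++i) {
        child[i] = p1[i];
        inSegment.insert(p1[i]);
    }
    std::size_t pos = 0;
    for (int v : p2) {
        if (inSegment.count(v)) continue;  // already placed by the segment
        if (pos == lo) pos = hi;           // jump over the copied segment
        child[pos++] = v;
    }
    return child;
}
```

With parents 1 2 3 4 5 6 7 8 9 and 3 4 7 2 8 9 1 6 5 and the segment 4 5 6 7 (lo = 3, hi = 7), the result is 3 2 8 4 5 6 7 9 1. Swapping the two parent arguments gives the other child.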

Page 29:

Uniform Order Crossover (Davis, 1991)

• Here a randomly generated binary mask defines the elements to be taken from the first parent; the other positions are filled with the remaining elements in the order in which they occur in the second parent.

• The only difference between this and order crossover is that in order crossover these elements are contiguous.

Parent 1: 1 2 3 4 5 6 7 8 9

Parent 2: 3 4 7 2 8 9 1 6 5

Mask: 1 1 0 1 0 0 0 1 0

Offspring: 1 2 3 4 7 9 6 8 5
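The mask-based variant can be sketched in the same way. The name uniformOrderCrossover is hypothetical, and the mask is passed in rather than generated so the example is repeatable.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Uniform order crossover: positions where mask is true keep p1's value;
// the other positions are filled left to right with the remaining values
// in the order they occur in p2.
std::vector<int> uniformOrderCrossover(const std::vector<int>& p1,
                                       const std::vector<int>& p2,
                                       const std::vector<bool>& mask) {
    const std::size_t n = p1.size();
    assert(p2.size() == n && mask.size() == n);
    std::vector<int> child(n);
    std::unordered_set<int> kept;
    for (std::size_t i = 0; i < n; ++i) {
        if (mask[i]) {
            child[i] = p1[i];
            kept.insert(p1[i]);
        }
    }
    std::size_t pos = 0;
    for (int v : p2) {
        if (kept.count(v)) continue;            // already taken from p1
        while (pos < n && mask[pos]) ++pos;     // next free position
        assert(pos < n);
        child[pos++] = v;
    }
    return child;
}
```

With parents 1 2 3 4 5 6 7 8 9 and 3 4 7 2 8 9 1 6 5 and the mask 1 1 0 1 0 0 0 1 0, the sketch yields 1 2 3 4 7 9 6 8 5.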

Page 30:

Edge Recombination (Whitley, Starkweather & Fuquay, 1989)

• This operator was specially designed for the TSP problem.

• This scheme ensures that every edge (link) in the child was shared with one or other of its parents.

• This has been shown to be very effective in TSP applications.

• It constructs an edge map, which for each city lists the edges available to it from the two parents. Edges that occur in both parents are marked with a +.

Page 31:

Example Edge Table

Parent 1: g d m h b j f i a k e c

Parent 2: c e k a g b h i j f m d

a: +k, g, i      g: a, b, c, d      b: +h, g, j

h: +b, i, m      c: +e, d, g        i: h, j, a, f

d: +m, g, c      j: +f, i, b        e: +k, +c

k: +e, +a        f: +j, m, i        m: +d, f, h
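Building such a table is mechanical, as this C++ sketch suggests. The type and function names are invented for the example, tours are treated as cyclic (the last city is adjacent to the first), and each tour is assumed to have at least three cities.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// For each city: neighbour -> true if the edge occurs in both parents
// (the "+" mark in the table), false if it occurs in only one.
using EdgeList = std::map<char, bool>;
using EdgeMap = std::map<char, EdgeList>;

void addEdge(EdgeMap& edges, char from, char to) {
    std::pair<EdgeList::iterator, bool> res = edges[from].insert({to, false});
    if (!res.second) res.first->second = true;  // second sighting: shared
}

void addTour(EdgeMap& edges, const std::string& tour) {
    const std::size_t n = tour.size();
    assert(n >= 3);
    for (std::size_t i = 0; i < n; ++i) {
        const char a = tour[i];
        const char b = tour[(i + 1) % n];  // wrap around: tours are cyclic
        addEdge(edges, a, b);
        addEdge(edges, b, a);
    }
}

EdgeMap buildEdgeMap(const std::string& p1, const std::string& p2) {
    EdgeMap edges;
    addTour(edges, p1);
    addTour(edges, p2);
    return edges;
}
```

For the parents gdmhbjfiakec and cekagbhijfmd this gives, for instance, a: +k, g, i and e: +k, +c.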

Page 32:

Edge Recombination Algorithm

• Pick a city at random.

• Set current_city to this city.

• Remove references to current_city from the table.

• Examine the list for current_city:

– If there is a common entry (+), pick that.

– Else pick the entry which has the shortest list.

– Split ties randomly.

– The entry picked becomes the next current_city.

• If stuck (list is empty), start from the other end, or else pick a new city at random.

Page 33:

Example Continued

• Randomly pick a, delete all a’s from table [a]

• Select k (common neighbor) [ak]

• Select e (only item in k’s list) [ake]

• Select c (only item in e’s list) [akec]

• d or g: pick d at random [akecd]

• Select m (common edge with d) [akecdm]

• f or h: pick h at random [akecdmh]

• Select b (common edge) [akecdmhb]

• Select g (shortest list: 0 entries) [akecdmhbg]

• g has empty list so reverse direction [gbhmdceka]

• Select i (only item in a’s list) [gbhmdcekai]

• Select f at random, then j [gbhmdcekaifj]

Page 34:

Inversion Transformations

• This scheme will allow normal crossover and mutation to operate as usual.

• In order to accomplish this we map the permutation space to a set of contiguous vectors.

• Given a permutation of the set {1,2,3,…,N}, let a_j denote the number of integers in the permutation which precede j but are greater than j. The sequence a_1,a_2,a_3,…,a_N is called the inversion sequence of the permutation.

• The inversion sequence of 6 2 3 4 1 7 5 is 4 1 1 1 2 0 0.

For example, a_1 = 4 because there are 4 integers greater than 1 that precede it.

Page 35:

Inversion of Permutations

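• The inversion sequence of a permutation is unique! Hence there is a 1-1 correspondence between permutations and their inversion sequences. Also, the rightmost inversion number is always 0, so it is dropped.

[Figure: the permutations of {1}, {1,2} and {1,2,3} with their inversion sequences. For N = 3 the six permutations map to lattice points (x y) with x in 0..2 and y in 0..1:]

1 2 3 -> (0 0)

1 3 2 -> (0 1)

3 1 2 -> (1 1)

3 2 1 -> (2 1)

2 3 1 -> (2 0)

2 1 3 -> (1 0)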

Page 36:

Inversions Continued

• What does a 4 digit permutation map to?

1234 -> (0 0 0)

2134 -> (1 0 0)

4321 -> (3 2 1)

2413 -> (2 0 1)

1423 -> (0 1 1)

etc.

Maps to a partial 3D lattice structure.

Page 37:

Converting Perm to Inv

Input: perm: array holding the permutation

Output: inv: array holding the inversion sequence

for (i=1; i<=N; i++) {
  inv[i] = 0;
  m = 1;
  while (perm[m] != i) {   // scan the entries that precede value i
    if (perm[m] > i) inv[i]++;
    m++;
  }
}

Page 38:

Convert inv to Perm

Input: inv[]

Output: perm[]

for (i=N; i>=1; i--) {   // place the largest value first
  for (m=i+1; m<=N; m++)
    if (pos[m] >= inv[i]+1) pos[m]++;   // shift larger values to the right
  pos[i] = inv[i]+1;
}
for (i=1; i<=N; i++) perm[pos[i]] = i;
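A runnable round trip may help in checking the two conversions. This sketch uses 0-based std::vector storage for values 1..N, keeps the trailing zero of the inversion sequence, and rebuilds the permutation by placing the values from N down to 1 and then writing value i at position pos[i]. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// inv[j-1] = number of entries that precede value j in perm and exceed j.
std::vector<int> permToInv(const std::vector<int>& perm) {
    const std::size_t n = perm.size();
    std::vector<int> inv(n, 0);
    for (std::size_t j = 1; j <= n; ++j) {
        const int value = static_cast<int>(j);
        for (std::size_t m = 0; m < n && perm[m] != value; ++m)
            if (perm[m] > value) ++inv[j - 1];
    }
    return inv;
}

// Inverse mapping: pos[j-1] ends up as the 1-based position of value j.
std::vector<int> invToPerm(const std::vector<int>& inv) {
    const std::size_t n = inv.size();
    std::vector<int> pos(n, 0);
    for (std::size_t i = n; i >= 1; --i) {      // largest value first
        for (std::size_t m = i + 1; m <= n; ++m)
            if (pos[m - 1] >= inv[i - 1] + 1) ++pos[m - 1];
        pos[i - 1] = inv[i - 1] + 1;
    }
    std::vector<int> perm(n, 0);
    for (std::size_t i = 1; i <= n; ++i)
        perm[static_cast<std::size_t>(pos[i - 1] - 1)] = static_cast<int>(i);
    return perm;
}
```

For the permutation 6 2 3 4 1 7 5 the first function gives 4 1 1 1 2 0 0, and the second function maps that sequence back to the same permutation.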

Page 39:

So what do we do?

• Our population is of course a set of permutations.

• These permutations are each mapped to their inversion sequence (inv) to create a population of inv’s.

• We do normal crossovers in this mapped population as well as normal mutations.

• In order to determine fitness we of course must apply Fitness(Inverse(inv))

• Is this all worth doing?

Page 40:

Mutations of Permutations

• Swap Mutation

• Scramble Mutation

• 2-Swap

• Insert

These will maintain legal permutations

Page 41:

Swap Mutation

• Select two positions at random and swap the allele values at those positions.

• Sometimes called the “order-based” mutation.

ABCDEFGJ => AECDBFGJ

Page 42:

Scramble Mutation

• Pick a subset of positions at random and reorder their contents randomly

• Some research has shown swap is best and others have shown scramble is best in certain apps. Who knows?

ABCDEFGH => AHFDECGB

Page 43:

Other Permutation Mutations

• 2-Swap (nice for TSP)

– Pick two points and invert the subtour between them.

– AB.CDEF.GH => AB.FEDC.GH

• Insert Mutation

– Pick a value at random (say E), insert it at another randomly chosen position (say that of B) and shift the rest over.

– ABCDEFG => AEBCDFG
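The four permutation mutations (swap, scramble, 2-swap and insert) are small enough to sketch together. The function names are invented for this example, positions are 0-based and passed in explicitly so the results can be checked, and the scramble positions are assumed to be distinct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Swap mutation: exchange the alleles at positions i and j.
std::string swapMutation(std::string s, std::size_t i, std::size_t j) {
    assert(i < s.size() && j < s.size());
    std::swap(s[i], s[j]);
    return s;
}

// Scramble mutation: randomly reorder the alleles at the given positions.
std::string scrambleMutation(std::string s,
                             const std::vector<std::size_t>& positions,
                             std::mt19937& rng) {
    std::string picked;
    for (std::size_t p : positions) {
        assert(p < s.size());
        picked.push_back(s[p]);
    }
    std::shuffle(picked.begin(), picked.end(), rng);
    for (std::size_t k = 0; k < positions.size(); ++k) s[positions[k]] = picked[k];
    return s;
}

// 2-swap: invert the subtour in positions [lo, hi).
std::string invertMutation(std::string s, std::size_t lo, std::size_t hi) {
    assert(lo <= hi && hi <= s.size());
    std::reverse(s.begin() + static_cast<std::ptrdiff_t>(lo),
                 s.begin() + static_cast<std::ptrdiff_t>(hi));
    return s;
}

// Insert mutation: move the allele at `from` to position `to`,
// shifting the alleles in between.
std::string insertMutation(std::string s, std::size_t from, std::size_t to) {
    assert(from < s.size() && to < s.size());
    const char c = s[from];
    s.erase(from, 1);
    s.insert(to, 1, c);
    return s;
}
```

Each function only rearranges existing alleles, so a legal permutation always stays legal. For example, insertMutation("ABCDEFG", 4, 1) gives AEBCDFG and invertMutation("ABCDEFGH", 2, 6) gives ABFEDCGH.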

Page 44:

How about code breaking

• Assume that we have the 26 letters of the alphabet permuted. This permutation is used to encode a normal message. How do we decode this using a GA?

• Is this even a good idea?

• What is the fitness function?

Page 45:

Encoding

“ABCDEFGHIJKLMNOPQRSTUVWXYZ “

“TUHNIXWAVBJQCDPZ_MOSYLRKEFG”

The encoding string is just a permutation of the original string: each character in the first string is replaced by the character below it in the second.

Hence “NOW IS THE TIME” encodes to

“DPRGV etc.”

Page 46:

Fitness?

• The textbook assumes that it knows the answer during the lookup phase. What if you don’t know the answer?

• One possibility involves the use of a dictionary.

• When you attempt a decoding and get something like AVE IS HEI TIME, you can use the spaces as separators and look up each “word” in the dictionary. In this case IS and TIME are found, and hence the fitness of this decoding is increased.
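A minimal sketch of the dictionary idea in C++. The two alphabet constants are the strings from the Encoding slide; the function names, the tiny dictionary and the choice of "one point per word found" are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

const std::string kPlain = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ";
const std::string kKey   = "TUHNIXWAVBJQCDPZ_MOSYLRKEFG";

// Replace every character found in `from` by the character at the same
// index in `to`. Encoding is (kPlain, kKey); decoding swaps the two.
std::string substitute(const std::string& text, const std::string& from,
                       const std::string& to) {
    assert(from.size() == to.size());
    std::string out = text;
    for (char& ch : out) {
        const std::size_t idx = from.find(ch);
        if (idx != std::string::npos) ch = to[idx];
    }
    return out;
}

// Fitness of a candidate decoding: the number of space-separated "words"
// that appear in the dictionary (scoring rule is an assumption).
int dictionaryFitness(const std::string& decoded,
                      const std::set<std::string>& dictionary) {
    std::istringstream words(decoded);
    std::string word;
    int hits = 0;
    while (words >> word)
        if (dictionary.count(word)) ++hits;
    return hits;
}
```

With a dictionary holding NOW, IS, THE and TIME, the candidate AVE IS HEI TIME scores 2 and the correct decoding scores 4. A GA would evolve the 27-character key, decode the message with each candidate key, and use this score as the fitness.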

Page 47:

Can we evolve the equation of gravity, F = m1m2/r^2? (kg is omitted.)

• The first question is how do we represent a function like F in a chromosome?

• By using trees of course.

• Expression trees to be precise.

Expressions such as (A+B)*C-D and m1m2/r^2 are easily represented as trees, although several different trees may work. Why?

Page 48:

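[Figure: two expression trees. (A+B)*C-D: root -, with children * and D; the * node has children + and C; the + node has children A and B. m1m2/r^2: root /, with children * (over m1 and m2) and * (over R and R).]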

Page 49:

Things to note

• Interior nodes are operators while leaf nodes are variables or constants.

• Some operators may be unary such as SQRT as well as the usual binary operators.

• If trees are used as a chromosome then specialized operators for mutation and crossover need to be developed.

• GAlib has support for trees: GATreeGenome<T>

Page 50:

Tree Mutation

• GAlib, like many other libraries, provides a swap node mutator as well as a swap tree mutator.

• Swap node swaps the contents of the two specified nodes. Sub-trees connected to either node are not affected; only the specified nodes are swapped.

• Swap tree swaps the contents of the two specified nodes as well as any sub-trees connected to the specified nodes.

• Of course you can define anything that you want.

Page 51:

An Example Tree Crossover

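[Figure: two parent expression trees built from the operators *, -, / and SQR over the variable A, and below them the two children produced by exchanging a subtree between the parents.]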

Page 52:

Fitness

• We can calculate the fitness of a function by running it on a set of “fitness cases”

• These are a set of inputs for which the correct output is known. For example, in the case of gravity we can build a set of 4-tuples (m1, m2, r, F) which represent the two masses, the distance between them and the resulting force between them.
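As an illustration, an expression tree can be evaluated on such fitness cases with a few lines of C++. Everything named here (Node, var, bin, eval, totalError) is hypothetical and independent of GAlib's GATreeGenome<T>. The sum of absolute errors is one possible fitness measure, lower being better, and returning 1 on division by zero ("protected division") is an assumed convention.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node {
    char op;           // '+', '-', '*', '/' or 'v' for a variable leaf
    std::string name;  // variable name, used only when op == 'v'
    std::unique_ptr<Node> left, right;
};
using Tree = std::unique_ptr<Node>;
using Env = std::map<std::string, double>;

Tree var(const std::string& name) {
    return Tree(new Node{'v', name, nullptr, nullptr});
}
Tree bin(char op, Tree l, Tree r) {
    return Tree(new Node{op, "", std::move(l), std::move(r)});
}

// Evaluate the tree for one assignment of the variables.
double eval(const Node& n, const Env& env) {
    if (n.op == 'v') return env.at(n.name);
    const double a = eval(*n.left, env);
    const double b = eval(*n.right, env);
    switch (n.op) {
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        default:  return b != 0.0 ? a / b : 1.0;  // protected division (assumption)
    }
}

struct FitnessCase { double m1, m2, r, F; };

// Sum of absolute errors over all fitness cases; 0 means a perfect fit.
double totalError(const Node& candidate, const std::vector<FitnessCase>& cases) {
    double err = 0.0;
    for (const FitnessCase& c : cases) {
        const Env env{{"m1", c.m1}, {"m2", c.m2}, {"r", c.r}};
        err += std::fabs(eval(candidate, env) - c.F);
    }
    return err;
}
```

The tree bin('/', bin('*', var("m1"), var("m2")), bin('*', var("r"), var("r"))) reproduces F exactly on cases generated from m1m2/r^2, while the incomplete candidate m1*m2 accumulates an error on every case where r is not 1.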

Page 53:

John Koza’s work (1992, 1994) (Genetic Programming, GP)

• Koza has used schemes such as those just discussed to evolve Lisp programs.

• A Lisp function is really a preorder listing of the expression tree.

• He used 10% population overlap.

• Koza typically does not use mutation; instead he builds large initial populations with the (he hopes) necessary diversity.

• Chunking has also been addressed by some of his later research. Chunking is a mechanism for automatically grouping parts of a chromosome so they will not be split up under crossover (i.e., subroutines?).

Page 54:

Questions about GP

• Will the technique scale up to more complex cases (bigger programs)?

• What if the function and variable set is large?

• GP often finds a function that satisfies the test cases but does not work when applied to the remaining data.