Optimal nfe

Tian-Li Yu & Kai-Chun Fan

nfe

• nfe (number of function evaluations) = Population Size × Convergence Time

• nfe is one of the commonly used metrics for measuring the performance of GAs.

• The optimal nfe is the minimal nfe obtained with a reliable population size (one large enough to get good solutions).

• These slides propose a new method, called Optimal nfe, to find the optimal nfe.
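
As a small illustration of the definition above, the following sketch computes nfe from a population size and a convergence time. It assumes convergence time is counted in generations; the function name and the sample numbers are hypothetical and are not taken from the experiments.

```python
def nfe(population_size: int, convergence_time: int) -> int:
    """Number of function evaluations: population size times convergence time.

    Assumption: convergence_time is measured in generations.
    """
    return population_size * convergence_time


# Hypothetical numbers: a larger population may converge in fewer
# generations and still cost more evaluations overall.
small = nfe(population_size=1000, convergence_time=30)
large = nfe(population_size=2000, convergence_time=20)
print(small, large)
```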

Agenda

• User guide
• Bisection
• Optimal nfe
• Conclusion

Installation

• Components of DSMGA
• Source code only
  – Modified files (they do not affect the original programs)
    • Makefile
    • global.cpp
    • global.h
  – Additional file
    • optnfe.cpp
• After compiling
  – optnfe

Usage

./optnfe ell lower upper

– optnfe
  • executable file
– ell
  • problem size
– lower
  • population size lower bound
  • -1 for no assignment
– upper
  • population size upper bound
  • -1 for no assignment

Algorithm

• 2 x 2 phases
  – Bisection (2 phases)
    • Lower bound of the reliable population size to solve the problem
  – Optimal nfe (2 phases)
    • Optimal (minimal) nfe to solve the problem

Agenda

• User guide
• Bisection
• Optimal nfe
• Conclusion

Bisection Method in Mathematics

• A root-finding algorithm that repeatedly bisects an interval and then selects the subinterval in which a root must lie for further processing.

• Intermediate value theorem
  – Let f: [a, b] → R be continuous. If f(a) < u < f(b) or f(b) < u < f(a), then there exists c in (a, b) where f(c) = u.
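
The root-finding idea above can be sketched as follows. This is a minimal illustration of the textbook bisection method, assuming a continuous function whose values at the two endpoints have opposite signs; the name `bisect_root` and the tolerance are illustrative choices.

```python
from typing import Callable


def bisect_root(f: Callable[[float], float], a: float, b: float,
                tol: float = 1e-9) -> float:
    """Find a root of a continuous f on [a, b], where a < b and
    f(a), f(b) have opposite signs (or one of them is zero)."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid
        if fa * fm < 0:
            b = mid            # sign change in [a, mid]: keep the left half
        else:
            a, fa = mid, fm    # otherwise the root lies in [mid, b]
    return (a + b) / 2


root = bisect_root(lambda x: x * x - 2, 0.0, 2.0)
print(root)  # close to the square root of 2
```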

Bisection Method in GAs

• Martin Pelikan, 2005
  – Given that we know the optimum (population size), the number of generations can be determined by the GA itself.
  – The problem is how to determine an adequate population size for each problem size.

• These slides use “Bisection” to refer to the bisection method proposed by Martin Pelikan.

Bisection (Martin Pelikan)

1. Start with some very small N (say, N = 10)
2. Double N until GA convergence is sufficiently reliable
3. (min, max) = (N/2, N)
4. repeat until (max - min) / min < 0.1 (use your own threshold here)
   {
     N = (min + max) / 2
     if N leads to sufficiently reliable convergence then
       max = N
     else
       min = N
   }
5. Compute all your statistics for this problem size using pop.size = max

Bisection in GAs

• Martin Pelikan: What I like on using the bisection method is that all parameter tweaking will be done automatically, because the remaining GA parameters do not influence time complexity as much (mostly by some constant factor and that's not so important when we talk about asymptotic upper bounds). To eliminate noise, the bisection can be run several times and the results of all bisection runs can be averaged (right now I use 10 rounds of bisection, each with 10 runs).

Bisection in GAs

• Martin Pelikan: Of course, the above thinking is correct only if larger populations lead to better solutions (given enough generations we should all be able to believe this fact) and if smaller populations do not significantly increase the number of generations (in all cases I studied, any sufficiently large population size lead to about the same number of generations, as can be expected for many variants of evolutionary algorithms based primarily on recombination).

Population Size vs. Convergence Time

nfe = Population Size × Convergence Time

– Order
  • Population Size > Convergence Time
– Constant?

[Figure: nfe versus population size; the population size lower bound (optimum) was found by Bisection with 10×100 runs in the experiments for DMC.]

Agenda

• User guide
• Bisection
• Optimal nfe
• Conclusion

Assumption

• Empirical results
  – According to the abilities of GAs?
    • DMC (left), perfect model for trap (right)
    • Maybe the scale of the lower bound is also a key point.

[Figure: nfe versus population size, with the population size lower bound marked.]

Prerequisite

• Population size lower bound from Bisection
  – Efficiency
  – An accurate search for nfe may take thousands of runs for each population size.
  – Empirical results show that meaningful search ranges are not so large. Exploring impractical population sizes (too small or too large) may waste a lot of time.

Optimal nfe

• 2 phases
  – Phase 1
    • Calculate the nfe of the population size lower bound (from Bisection).
    • Search for the range (upper bound) of population size according to nfe.
  – Phase 2
    • Divide the range into four pieces, and then calculate the slope for each one.
    • Reduce the range recursively by observing the variations in slopes.
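
To make the Phase 2 idea concrete, the sketch below splits a range into four equal pieces and computes the slope of nfe over each piece. The helper names and the nfe values are hypothetical; real values would come from GA runs.

```python
from typing import List, Sequence


def five_points(lower: float, upper: float) -> List[float]:
    """End points of the four equal pieces of [lower, upper]."""
    return [((4 - k) * lower + k * upper) / 4 for k in range(5)]


def piece_slopes(pops: Sequence[float], nfes: Sequence[float]) -> List[float]:
    """Slope of nfe over each piece between consecutive population sizes."""
    return [(nfes[k + 1] - nfes[k]) / (pops[k + 1] - pops[k])
            for k in range(len(pops) - 1)]


pops = five_points(1000, 1200)
# Hypothetical nfe measurements at the five population sizes.
nfes = [30000, 29000, 28500, 28800, 29500]
slopes = piece_slopes(pops, nfes)
print(pops)
print(slopes)
```

With these made-up numbers the slope changes from negative to positive around the third point, so the next, narrower range would be taken around it.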

Optimal nfe Phase 1

• Case 1
  – nfe(1) < nfe(2)

[Figure: nfe versus population size, with nfe(1) and nfe(2) marked; the points are separated by some ratio (10% of the population size lower bound by default).]

Optimal nfe Phase 1

• Case 2
  – nfe(1) > nfe(2) AND nfe(3) > nfe(2)

[Figure: nfe versus population size, with nfe(1), nfe(2), and nfe(3) marked; annotation: "Which one?"]

Optimal nfe Phase 1

• Case 3 (extension of case 2)
  – nfe(1) > nfe(2) > … > nfe(i-1)
  – nfe(i-1) > nfe(i) AND nfe(i+1) > nfe(i)

[Figure: nfe versus population size, with nfe(1) through nfe(6) marked; annotation: "like case 2"]
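
The three Phase 1 cases can be covered by one loop, sketched below. It assumes the population size grows by a factor of 1.1 per step and that the bracket is formed by the sampled points on either side of the last point before nfe increased. `nfe_of` is a hypothetical callback that returns the measured nfe for a population size, `toy_nfe` is a made-up curve used only for the demo, and `max_steps` is an added safety limit.

```python
from typing import Callable, Tuple


def phase1_bracket(nfe_of: Callable[[int], float], pop0: int,
                   ratio: float = 1.1, max_steps: int = 1000) -> Tuple[int, int]:
    """Grow the population size until nfe increases, then return
    (lower, upper) population sizes that bracket the minimum."""
    pops = [pop0]
    values = [nfe_of(pop0)]
    for _ in range(max_steps):
        # Next sample; max(...) guarantees progress for tiny populations.
        nxt = max(pops[-1] + 1, round(pops[-1] * ratio))
        pops.append(nxt)
        values.append(nfe_of(nxt))
        if values[-1] > values[-2]:
            i = len(pops) - 2                    # last point before the rise
            return pops[max(i - 1, 0)], pops[i + 1]
    raise RuntimeError("nfe never increased within max_steps")


def toy_nfe(pop: int) -> float:
    """Made-up curve with its minimum at population size 1300."""
    return (pop - 1300) ** 2 + 50000


print(phase1_bracket(toy_nfe, 1000))           # like case 3: several decreasing steps
print(phase1_bracket(lambda p: 30 * p, 1000))  # like case 1: nfe rises immediately
```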

Optimal nfe Phase 2

• 7 (or 9) cases
  – Case 1 (and 8)
    • nfe(2) - nfe(1) ≥ 0

[Figure: two sketches of the five sample points, labeled 1 to 5.]

Optimal nfe Phase 2

• 7 cases
  – Case 2 & 3
    • nfe(2) - nfe(1) < 0, nfe(3) - nfe(2) > 0
    • nfe(3) - nfe(2) = 0

[Figure: two sketches of the five sample points, labeled 1 to 5.]

Optimal nfe Phase 2

• 7 cases
  – Case 4 & 5
    • nfe(3) - nfe(2) < 0, nfe(4) - nfe(3) > 0
    • nfe(4) - nfe(3) = 0

[Figure: two sketches of the five sample points, labeled 1 to 5.]

Optimal nfe Phase 2

• 7 cases
  – Case 6 & 7 (and 9)
    • nfe(4) - nfe(3) < 0, nfe(5) - nfe(4) > 0
    • nfe(5) - nfe(4) ≤ 0

[Figure: three sketches of the five sample points, labeled 1 to 5.]

Optimal nfe Algorithm

1. Start with the optimum population size pop(0) from Bisection and then get nfe(0).
2. Calculate nfe(i+1), where pop(i+1) = pop(i) * 1.1; repeat until nfe(i+1) > nfe(i).
3. Keep the information of i-1 as lower and i+1 as upper.
4. Repeat until pop(upper) - pop(lower) < pop(0) * 0.01 (set the precision here)
   {
     for (k = 1 to 4) {
       calculate nfe of pop(k), where
         pop(k) = ((4 - k) * pop(lower) + k * pop(upper)) / 4
       if nfe(k) ≥ nfe(k-1) {
         lower ← k - 2, or 0 (if k = 1)
         upper ← k
       }
     }
   }
5. The optimal nfe is the minimum nfe, among those already calculated in the runs above, whose population size lies between pop(lower) and pop(upper).
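
The whole procedure can be sketched in Python as below. This is one reading of the steps above, not the optnfe.cpp implementation, and it makes several assumptions: population sizes are integers, the inner loop stops at the first k with nfe(k) ≥ nfe(k-1), the last piece is kept when nfe decreases over all four pieces, narrowing also stops once the range is smaller than 4, and every measured nfe is stored in a table so that it is never recomputed. `nfe_of` is a hypothetical callback that measures nfe for a population size, and `toy_nfe` is a made-up noise-free curve.

```python
from typing import Callable, Dict, Tuple


def optimal_nfe(nfe_of: Callable[[int], float], pop0: int,
                ratio: float = 1.1, precision: float = 0.01,
                max_steps: int = 1000) -> Tuple[int, float]:
    """Return (population size, nfe) of the smallest nfe found."""
    seen: Dict[int, float] = {}          # table of nfe values measured so far

    def evaluate(pop: int) -> float:
        if pop not in seen:
            seen[pop] = nfe_of(pop)
        return seen[pop]

    # Steps 1-3: grow the population by `ratio` until nfe increases.
    pops = [pop0]
    for _ in range(max_steps):
        pops.append(max(pops[-1] + 1, round(pops[-1] * ratio)))
        if evaluate(pops[-1]) > evaluate(pops[-2]):
            break
    else:
        raise RuntimeError("nfe never increased within max_steps")
    i = len(pops) - 2
    lower, upper = pops[max(i - 1, 0)], pops[i + 1]

    # Step 4: split the range into four pieces and keep the part where the
    # slope first becomes non-negative.
    while upper - lower >= pop0 * precision and upper - lower >= 4:
        pts = [round(((4 - k) * lower + k * upper) / 4) for k in range(5)]
        new_lower, new_upper = pts[3], pts[4]   # assumption: nfe fell everywhere
        for k in range(1, 5):
            if evaluate(pts[k]) >= evaluate(pts[k - 1]):
                new_lower, new_upper = pts[max(k - 2, 0)], pts[k]
                break
        lower, upper = new_lower, new_upper

    # Step 5: best value already measured inside the final range.
    best = min((p for p in seen if lower <= p <= upper), key=seen.__getitem__)
    return best, seen[best]


def toy_nfe(pop: int) -> float:
    """Made-up curve with its minimum at population size 1300."""
    return (pop - 1300) ** 2 + 50000


best_pop, best_nfe = optimal_nfe(toy_nfe, pop0=1000)
print(best_pop, best_nfe)
```

Measured nfe values from real GA runs are noisy, so each call to `nfe_of` would normally average many runs.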

Speed Up

• Unnecessary calculation skipping
  – nfe(i+1) - nfe(i) should be considered if and only if nfe(i) - nfe(i-1) < 0.

• Table look-up

In fact

Empirical results show that the population size difference between Bisection and Optimal nfe is almost always lower than 20%.

– For example, if the Bisection optimum is 1000, the population size for the optimal nfe is smaller than 1200.

– If the precision is 1%, the worst case of the sweep method is 20 runs, and the average case is less than 10 runs.

– However, if higher precision is needed, Optimal nfe clearly outperforms the sweep method (sweep method vs. Optimal nfe):
  • 1% (100 for 10000): 10 runs vs. 11 runs
  • 0.1% (10 for 10000): 100 runs vs. 17 runs
  • 0.01% (1 for 10000): 1000 runs vs. 25 runs
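
The sweep-method figures above are consistent with a simple count, sketched below under two assumptions: the sweep scans at most 20% above the Bisection optimum in fixed steps, and the average case is half of the worst case. The function name is hypothetical, and the run counts for Optimal nfe are not derived here.

```python
from typing import Tuple


def sweep_runs(bisection_pop: int, step: int, max_gap: float = 0.2) -> Tuple[int, int]:
    """(worst case, assumed average) number of population sizes a fixed-step
    sweep visits within max_gap above the Bisection optimum."""
    worst = round(bisection_pop * max_gap / step)
    return worst, worst // 2


for step in (100, 10, 1):
    print(step, sweep_runs(10000, step))
```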

Agenda

• User guide
• Bisection
• Optimal nfe
• Conclusion

Problems and Future Work

• Some empirical results show that nfe as a function of population size may be a concave-up or an increasing curve.
  – A proof or rigorous experiments are needed.
  – Due to the abilities of GAs?

• Maybe the more powerful the GA is, the closer the results of Optimal nfe are to those of Bisection.

• Maybe the scale of the lower bound is a key point.

• Does there exist a problem for which Optimal nfe and Bisection give contrary results?

Conclusion

• nfe is one of the commonly used metrics to measure the performance of GAs.

• Bisection is a method to find the minimum reliable population size efficiently.

• According to some empirical results, the optima from Bisection are not necessarily the population sizes that give the optimal nfe.

• These slides propose a new method, Optimal nfe, which can automatically calculate the optimal (minimal) nfe for GAs.

• The difference between Bisection and Optimal nfe still has to be discussed in the future.