Motivation Basic Probability Theory Evolutionary Algorithms Tail Inequalities Artificial Fitness Levels Drift Analysis Typical Run Investigations Conclusions
A Gentle Introduction to the Time Complexity Analysis of Evolutionary Algorithms
Pietro S. Oliveto and Xin Yao
CERCIA, School of Computer Science, University of Birmingham, UK
WCCI 2012, Brisbane, Australia, 10 June 2012
P. S. Oliveto & X. Yao (University of Birmingham) Runtime Analysis of EAs WCCI 2012 1 / 92
Aims and Goals of this Tutorial
This tutorial will provide an overview of
  the goals of time complexity analysis of Evolutionary Algorithms (EAs)
  the most common and effective techniques
You should attend if you wish to
  theoretically understand the behaviour and performance of the search algorithms you design
  familiarise yourself with the techniques used in the time complexity analysis of EAs
  pursue research in the area
It will enable you, or enhance your ability, to
1 understand theoretically the behaviour of EAs on different problems
2 perform time complexity analysis of simple EAs on common toy problems
3 read and understand research papers on the computational complexity of EAs
4 have the basic skills to start independent research in the area
5 follow the tutorial on the time complexity of EAs for combinatorial optimization later on today
Outline I
1 Motivation
  Introduction to the theory of EAs
  Convergence analysis of EAs
  Computational complexity of EAs
2 Basic Probability Theory
  Probability space
  Union bound
  Random variables and expectations
  Law of total probability
3 Evolutionary Algorithms
  General EAs
  (1+1)-EA and RLS
  General properties
  General upper bound
4 Tail Inequalities
  Markov’s inequality
  Chernoff bounds
5 Artificial Fitness Levels
  Coupon collector’s problem
  AFL method for upper bounds
  AFL method for parent populations
  AFL for non-elitist EAs
Outline II
AFL for lower bounds
6 Drift Analysis
  Additive Drift Theorem
  Multiplicative Drift Theorem
  Simplified Negative Drift Theorem
7 Typical Run Investigations
8 Conclusions
  Overview
  State-of-the-art
  Further reading
Introduction to the theory of EAs
Evolutionary Algorithms and Computer Science
Goals of design and analysis of algorithms
1 correctness: “does the algorithm always output the correct solution?”
2 computational complexity: “how many computational resources are required?”
For Evolutionary Algorithms (General purpose)
1 convergence: “does the EA find the solution in finite time?”
2 time complexity: “how long does it take to find the optimum?” (time = number of fitness function evaluations)
Brief history
Theoretical studies of Evolutionary Algorithms (EAs), albeit few, have existed since the seventies [Goldberg, 1989];
Early studies were concerned with explaining the behaviour of EAs rather than analysing their performance.
Schema Theory was considered fundamental;
  First proposed to understand the behaviour of the simple GA [Holland, 1992];
  It cannot explain the performance or limit behaviour of EAs;
  Building Block Hypothesis was wrong [Reeves and Rowe, 2002];
Convergence results appeared in the nineties [Rudolph, 1998];
  Related to the time limit behaviour of EAs.
Convergence analysis of EAs
Convergence
Definition
Ideally the EA should find the solution in a finite number of steps with probability 1 (visit the global optimum in finite time);
If the solution is held forever after, then the algorithm converges to the optimum!
Conditions for Convergence ([Rudolph, 1998])
1 There is a positive probability to reach any point in the search space from any other point
2 The best found solution is never removed from the population (elitism)
Canonical GAs using mutation, crossover and proportional selection do not converge!
Elitist variants do converge!
In practice, is it interesting that an algorithm converges to the optimum?
Most EAs visit the global optimum in finite time (RLS does not!)
How much time?
Computational complexity of EAs
Computational Complexity of EAs
P. K. Lehre, 2011
Generally means predicting the resources the algorithm requires:
Usually the computational time: the number of primitive steps;
Usually grows with size of the input;
Usually expressed in asymptotic notation;
Exponential runtime: inefficient algorithm
Polynomial runtime: “efficient” algorithm
However (EAs):
1 In practice the time for a fitness function evaluation is much higher than the rest;
2 EAs are randomised algorithms
  They do not perform the same operations even if the input is the same!
  They do not output the same result if run twice!
Hence, the runtime of an EA is a random variable $T_f$.
We are interested in:
1 Estimating $E(T_f)$, the expected runtime of the EA for $f$;
2 Estimating $\Pr(T_f \le t)$, the success probability of the EA in $t$ steps for $f$.
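The two quantities above can also be estimated empirically. The sketch below is an illustration only: it assumes the standard (1+1)-EA with mutation rate 1/n on the OneMax function (count of one-bits), neither of which has been defined yet at this point of the tutorial, and the function names are hypothetical. Time is measured in fitness evaluations.

```python
import random


def one_max(bits):
    """Assumed toy fitness function: number of one-bits."""
    return sum(bits)


def one_plus_one_ea_runtime(n, rng):
    """Run an assumed standard (1+1)-EA on OneMax.

    Returns the number of fitness evaluations until the optimum is found,
    i.e. one sample of the random variable T_f.
    """
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = one_max(x)
    evaluations = 1
    while fx < n:
        # Flip each bit independently with probability 1/n.
        y = [1 - b if rng.random() < 1.0 / n else b for b in x]
        fy = one_max(y)
        evaluations += 1
        if fy >= fx:  # elitist selection: never accept a worse offspring
            x, fx = y, fy
    return evaluations


def estimate(n, runs, budget, seed=0):
    """Return (sample mean of T_f, fraction of runs with T_f <= budget)."""
    rng = random.Random(seed)
    times = [one_plus_one_ea_runtime(n, rng) for _ in range(runs)]
    mean_runtime = sum(times) / runs
    success_rate = sum(t <= budget for t in times) / runs
    return mean_runtime, success_rate
```

Calling, for example, `estimate(20, 100, 500)` gives a sample mean as an estimate of $E(T_f)$ and the fraction of successful runs as an estimate of $\Pr(T_f \le 500)$. Such experiments only suggest a trend; they do not replace the analysis.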
Asymptotic notation
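$f(n) \in O(g(n)) \iff \exists$ constants $c, n_0 > 0$ s.t. $0 \le f(n) \le c\,g(n) \;\; \forall n \ge n_0$
$f(n) \in \Omega(g(n)) \iff \exists$ constants $c, n_0 > 0$ s.t. $0 \le c\,g(n) \le f(n) \;\; \forall n \ge n_0$
$f(n) \in \Theta(g(n)) \iff f(n) \in O(g(n))$ and $f(n) \in \Omega(g(n))$
$f(n) \in o(g(n)) \iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$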
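A short worked example of these definitions, using the function $f(n) = 3n^2 + 5n$, which is chosen here purely for illustration and does not come from the slides:

```latex
% Upper bound: for n >= 1 we have 5n <= 5n^2, hence
3n^2 + 5n \le 3n^2 + 5n^2 = 8n^2 \quad \forall n \ge 1
\;\Rightarrow\; f(n) \in O(n^2) \text{ with } c = 8,\ n_0 = 1.

% Lower bound:
3n^2 \le 3n^2 + 5n \quad \forall n \ge 1
\;\Rightarrow\; f(n) \in \Omega(n^2) \text{ with } c = 3,\ n_0 = 1.

% Both bounds together:
f(n) \in \Theta(n^2).

% Little-o: n \ln n grows strictly slower than n^2, since
\lim_{n \to \infty} \frac{n \ln n}{n^2} = \lim_{n \to \infty} \frac{\ln n}{n} = 0
\;\Rightarrow\; n \ln n \in o(n^2).
```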
Exercise 1: Asymptotic Notation
[Lehre, Tutorial]
Motivation Overview
Overview
Goal: Analyze the correctness and performance of EAs;
Difficulties: General purpose, randomised;
EAs find the solution in finite time; (convergence analysis)
How much time? → Derive the expected runtime and the success probability;
Next
Basic Probability Theory: probability space, random variables, expectations (expected runtime)
Randomised Algorithm Tools: Tail inequalities (success probabilities)
Along the way
Understand that the analysis cannot be done over all functions
Understand why the success probability is important (expected runtime not always sufficient)
Probability space
Probability Axioms
Probability space
Sample space $\Omega$ (e.g. 1 die: $\Omega = \{1, 2, 3, 4, 5, 6\}$)
Allowable events: a family $\mathcal{F}$ of subsets of $\Omega$ (e.g. $E \in \mathcal{F}$, where $E = \{2, 3\}$ is the event that 2 or 3 is the outcome of 1 die)
Probability function: $\Pr : \mathcal{F} \to \mathbb{R}$ (e.g. $\Pr(E) = 2/6 = 1/3$)
Probability Axioms
For any event $E$, $0 \le \Pr(E) \le 1$
$\Pr(\Omega) = 1$
For any finite or countably infinite sequence of pairwise mutually disjoint events $E_1, E_2, E_3, \ldots$
$\Pr\Big(\bigcup_{i \ge 1} E_i\Big) = \sum_{i \ge 1} \Pr(E_i)$
1 Die ($E_i$ the event that $i$ shows up)
$\Pr(E_1) = \Pr(E_2) = \cdots = \Pr(E_6) = 1/6$
$\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2) = 2/6 = 1/3$
$\Pr(E_1 \cup E_2 \cup E_3) = \Pr(E_1) + \Pr(E_2) + \Pr(E_3) = 3/6 = 1/2$
$\Pr(E_1 \cup E_2 \cup \cdots \cup E_6) = \Pr(E_1) + \cdots + \Pr(E_6) = 6/6 = 1 = \Pr(\Omega)$
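The single-die calculations can be checked with exact arithmetic. This is a minimal sketch, assuming a fair die; the names `omega`, `pr_outcome` and `pr` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

# Sample space of one fair die and the probability of each elementary outcome.
omega = range(1, 7)
pr_outcome = {i: Fraction(1, 6) for i in omega}


def pr(event):
    """Probability of an event, given as a set of outcomes.

    Elementary outcomes are pairwise disjoint, so by the additivity axiom
    the probability of the event is the sum over its outcomes.
    """
    return sum((pr_outcome[i] for i in event), Fraction(0))


pr_1_or_2 = pr({1, 2})          # Pr(E1 ∪ E2)
pr_1_2_or_3 = pr({1, 2, 3})     # Pr(E1 ∪ E2 ∪ E3)
pr_omega = pr(set(omega))       # Pr(Ω)
```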
Independent Events and Conditional Probabilities
Two dice
$E_1$: Event that the first die is a 1;
$E_2$: Event that the second die is a 1;
What is the probability that the first die is 1? $\Pr(E_1) = 6/36 = 1/6$
What is the probability that the second die is 1? $\Pr(E_2) = 6/36 = 1/6$
What is the probability that both dice are 1? $\Pr(E_3) = \Pr(E_1 \cap E_2) = 1/36$ (independent events)
Definition
Two events $E$ and $F$ are independent if and only if $\Pr(E \cap F) = \Pr(E) \cdot \Pr(F)$
What is the probability that the second die is 1 given that the first die is 1?
$\Pr(E_2 \mid E_1) = \frac{\Pr(E_2 \cap E_1)}{\Pr(E_1)} = \frac{1/36}{1/6} = 1/6$ (conditional probabilities)
What is the probability that at least one of the two dice is a 1? $\Pr(E_1 \cup E_2) = 11/36$
$E_1$ and $E_2$ are not mutually disjoint events!
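The two-dice answers above can be reproduced by enumerating all 36 outcomes. The sketch assumes two fair dice; the helper names are hypothetical and used only for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
omega = list(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for s in omega if event(s)), len(omega))


def e1(s):
    return s[0] == 1  # first die shows 1


def e2(s):
    return s[1] == 1  # second die shows 1


pr_both = pr(lambda s: e1(s) and e2(s))      # Pr(E1 ∩ E2)
pr_e2_given_e1 = pr_both / pr(e1)            # Pr(E2 | E1)
pr_either = pr(lambda s: e1(s) or e2(s))     # Pr(E1 ∪ E2)
independent = pr_both == pr(e1) * pr(e2)     # definition of independence
```

The union is 11/36 rather than 12/36 because the outcome (1, 1) belongs to both events and must not be counted twice.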
Union bound
Union bound
Ω (two dice; each outcome i,j has probability 1/36):
1,1 1,2 1,3 1,4 1,5 1,6
2,1 2,2 2,3 2,4 2,5 2,6
3,1 3,2 3,3 3,4 3,5 3,6
4,1 4,2 4,3 4,4 4,5 4,6
5,1 5,2 5,3 5,4 5,5 5,6
6,1 6,2 6,3 6,4 6,5 6,6
$\Pr(E_1) = 6/36 = 1/6$
$\Pr(E_2) = 6/36 = 1/6$
$\Pr(E_2 \mid E_1) = 1/6$
$\Pr(E_1 \cup E_2) = 6/36 + 5/36 = 11/36 \le 6/36 + 6/36 = \Pr(E_1) + \Pr(E_2)$
$E_1$ and $E_2$ are not mutually disjoint events!
Union Bound
Theorem (Union bound)
For any finite or countably infinite sequence of events $E_1, E_2, \ldots$
$\Pr\Big(\bigcup_{i \ge 1} E_i\Big) \le \sum_{i \ge 1} \Pr(E_i)$
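A small numerical check of the union bound on the two-dice sample space. This is a sketch under stated assumptions: the dice are fair, and the third event (the sum is 7) is added here only for illustration and does not appear in the slides.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice.
omega = list(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for s in omega if event(s)), len(omega))


def union_and_bound(events):
    """Return (Pr of the union of the events, sum of their probabilities)."""
    pr_union = pr(lambda s: any(e(s) for e in events))
    sum_of_prs = sum((pr(e) for e in events), Fraction(0))
    return pr_union, sum_of_prs


events = [
    lambda s: s[0] == 1,         # E1: first die shows 1
    lambda s: s[1] == 1,         # E2: second die shows 1
    lambda s: s[0] + s[1] == 7,  # illustrative extra event: the sum is 7
]
pr_union, bound = union_and_bound(events)
```

The bound is loose exactly because the events overlap; it is tight only for pairwise disjoint events.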
Random variables and expectations
Random Variables
Definition (Random Variable)
A random variable $X$ on a sample space $\Omega$ is a real-valued function $X : \Omega \to \mathbb{R}$. A discrete random variable takes only a finite or countably infinite number of values.
Probability:
$\Pr(X = a) = \sum_{s \in \Omega : X(s) = a} \Pr(s)$
Expectation:
$E(X) = \sum_i i \cdot \Pr(X = i)$
Example 1 (one die): Let $X$ be the value of one die;
$\Pr(X = 6) = \frac{1}{6}$
$E(X) = \frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 2 + \frac{1}{6} \cdot 3 + \cdots + \frac{1}{6} \cdot 6 = \frac{7}{2}$
Example 1 (two dice): Let $X$ be the value of the sum of two dice;
$\Pr(X = 4) = \Pr(1, 3) + \Pr(3, 1) + \Pr(2, 2) = \frac{3}{36} = \frac{1}{12}$
$E(X) = \frac{1}{36} \cdot 2 + \frac{2}{36} \cdot 3 + \frac{3}{36} \cdot 4 + \cdots + \frac{1}{36} \cdot 12 = 7$
Linearity of Expectation
Definition
For any collection of discrete random variables $X_1, X_2, \ldots, X_n$ with finite expectations,
$E\Big[\sum_{i=1}^{n} X_i\Big] = \sum_{i=1}^{n} E(X_i)$
Example 1 (two dice): Let $X$ be the value of the sum of two dice, $X_1$ the value of die 1 and $X_2$ the value of die 2
$E(X_i) = \frac{7}{2}$
$E(X) = E(X_1) + E(X_2) = 7$
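Both routes to the expected sum of two dice can be compared directly: summing over the distribution of the sum, and applying linearity of expectation. The sketch assumes fair, independent dice; `die`, `sum_pmf` and `expectation` are hypothetical names used only here.

```python
from fractions import Fraction
from itertools import product


def expectation(pmf):
    """E(X) = sum over values i of i * Pr(X = i), for a dict value -> probability."""
    return sum(value * prob for value, prob in pmf.items())


# Distribution of one fair die.
die = {i: Fraction(1, 6) for i in range(1, 7)}

# Distribution of the sum of two independent dice, built by enumeration.
sum_pmf = {}
for a, b in product(die, repeat=2):
    sum_pmf[a + b] = sum_pmf.get(a + b, Fraction(0)) + die[a] * die[b]

direct = expectation(sum_pmf)                          # long way: 11 terms
by_linearity = expectation(die) + expectation(die)     # E(X1) + E(X2)
```

Linearity needs no independence assumption; independence is used above only to build the distribution of the sum.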
Binomial Random Variables
We run an experiment that succeeds with probability $p$ and fails with probability $1 - p$ (e.g. coin flips).
Consider $n$ trials.
Definition
A binomial random variable $X \sim B(n, p)$ with parameters $n$ and $p$ represents the number of successes in $n$ independent experiments, each of which succeeds with probability $p$.
Probability:
$\Pr(X = j) = \binom{n}{j} p^j (1 - p)^{n - j}$
Expectation: Let $X_i = 1$ if the $i$th trial is successful and $X_i = 0$ otherwise; (linearity of expectation)
$E(X) = E\Big[\sum_{i=1}^{n} X_i\Big] = \sum_{i=1}^{n} E(X_i) = np$
Example 1 (Initialisation of EA): Let $X$ be the number of ones in the initial bit-string ($p = 1/2$, bit-string length $= n$)
$E(X) = np = n/2$
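The formula $E(X) = np$ can be checked against the probability mass function for a concrete case. This sketch uses exact fractions and an illustrative bit-string length of 10, which is an assumption of this example; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p, j):
    """Pr(X = j) for X ~ B(n, p): choose j successes among n trials."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)


def binomial_expectation(n, p):
    """E(X) computed directly from the definition of expectation."""
    return sum(j * binomial_pmf(n, p, j) for j in range(n + 1))


# Initial bit-string of an EA: each of n bits is 1 with probability 1/2.
n = 10                      # illustrative length
p = Fraction(1, 2)
expected_ones = binomial_expectation(n, p)
```

The direct sum agrees with $np = 5$, but the linearity argument gets there without evaluating any binomial coefficient.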
Geometric Random Variables
We run an experiment that succeeds with probability $p$ and fails with probability $1-p$ (e.g. coin flips). What is the number of trials until we get a success (e.g. heads)?
Definition
A geometric random variable $X$ with parameter $p$ represents the number of trials until the first success.
Probability:
$\Pr(X = k) = (1-p)^{k-1} p$
Expectation: $E(X) = 1/p$ (waiting time argument)
Example (expected time for bit $i$ to flip): Let $X$ be the number of steps until bit $i$ flips with mutation rate $p_m = 1/n$.
$E(X) = 1/p = n$
$\Pr(X = n\sqrt{n} + 1) = (1-p)^{n\sqrt{n}+1-1}\, p = \left(1 - \frac{1}{n}\right)^{n\sqrt{n}} \frac{1}{n} \le \left(\frac{1}{e}\right)^{\sqrt{n}} \frac{1}{n} \le e^{-\sqrt{n}}$
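The waiting time argument and the bound $e^{-\sqrt{n}}$ can be checked for a concrete bit-string length. This is a minimal sketch under assumptions made here: the helper names, the truncation of the series, and the choice $n = 100$ (a perfect square, so that $n\sqrt{n}$ is an integer) are illustrative.

```python
from math import exp, isqrt, sqrt


def geometric_pmf(k: int, p: float) -> float:
    """Pr(X = k): k - 1 failures followed by one success."""
    return (1 - p) ** (k - 1) * p


def geometric_mean(p: float, terms: int = 100_000) -> float:
    """Truncated series sum_k k * Pr(X = k); approaches 1/p as terms grows."""
    return sum(k * geometric_pmf(k, p) for k in range(1, terms + 1))


n = 100                     # bit-string length, mutation rate 1/n
p = 1 / n
k = n * isqrt(n) + 1        # n * sqrt(n) + 1, an integer because n is a perfect square
print(geometric_mean(p))    # close to 1/p = n
# Probability of waiting exactly k steps, next to the bound e^{-sqrt(n)}.
print(geometric_pmf(k, p), exp(-sqrt(n)))
```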
Law of total probability
Law of Total Probability
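Let $F$ and its complement $\bar{F}$ be mutually disjoint events with $F \cup \bar{F} = \Omega$, and let $E$ be any event.
Theorem
$\Pr(E) = \Pr(E \mid F) \cdot \Pr(F) + \Pr(E \mid \bar{F}) \cdot \Pr(\bar{F})$
$E(X) = E(X \mid F) \cdot \Pr(F) + E(X \mid \bar{F}) \cdot \Pr(\bar{F})$
Immediate Consequence:
$\Pr(E) \ge \Pr(E \mid F) \cdot \Pr(F)$
$E(X) \ge E(X \mid F) \cdot \Pr(F)$ (for non-negative $X$, such as a runtime)
Often used to derive lower bounds on the expected time!
We will use this to show that expected values may not be sufficient → success probabilities!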
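A toy check of the theorem and of its lower-bound consequence, using a fair six-sided die. The events, the helper names and the use of exact fractions are assumptions chosen here purely for illustration.

```python
from fractions import Fraction

omega = range(1, 7)                          # outcomes of a fair six-sided die
prob = {w: Fraction(1, 6) for w in omega}


def pr(event) -> Fraction:
    """Probability of an event given as a predicate on outcomes."""
    return sum((prob[w] for w in omega if event(w)), Fraction(0))


def pr_given(event, cond) -> Fraction:
    """Conditional probability Pr(event | cond)."""
    return pr(lambda w: event(w) and cond(w)) / pr(cond)


E = lambda w: w % 2 == 0                     # the roll is even
F = lambda w: w <= 3                         # the roll is at most 3
not_F = lambda w: not F(w)                   # complement of F

total = pr_given(E, F) * pr(F) + pr_given(E, not_F) * pr(not_F)
print(pr(E), total)                          # both sides of the theorem
print(pr_given(E, F) * pr(F))                # a lower bound on Pr(E)
```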
General EAs
Evolutionary Algorithms
Algorithm ((µ+λ)-EA)
1 Let $t = 0$;
2 Initialize $P_0$ with µ individuals chosen uniformly at random;
Repeat
3 Create λ new individuals:
   1 choose $x \in P_t$ uniformly at random;
   2 flip each bit in $x$ with probability $p$;
4 Create the new population $P_{t+1}$ by choosing the best µ individuals out of µ + λ;
5 Let $t = t + 1$.
Until a stopping condition is fulfilled.
If µ = λ = 1 we get a (1+1)-EA;
$p = 1/n$ is generally considered the best choice [Back, 1993, Droste et al., 1998];
By introducing stochastic selection and crossover we obtain a Genetic Algorithm (GA).
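The algorithm above can be sketched in a few lines of Python. This is a hedged illustration rather than a reference implementation: the function name, the generation budget used as stopping condition, the tie-breaking in favour of offspring, and the OneMax test function (number of one-bits) are all assumptions made here.

```python
import random
from typing import Callable, List

Bits = List[int]


def mu_plus_lambda_ea(f: Callable[[Bits], float], n: int, mu: int, lam: int,
                      max_generations: int, seed: int = 0) -> Bits:
    """Maximise f over {0,1}^n; returns the best individual of the final population."""
    rng = random.Random(seed)
    p = 1.0 / n                                   # mutation rate p = 1/n
    # Step 2: mu individuals chosen uniformly at random.
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(mu)]
    # Stopping condition (assumption): a fixed budget of generations.
    for _ in range(max_generations):
        offspring = []
        for _ in range(lam):
            parent = rng.choice(population)       # step 3.1
            # step 3.2: flip each bit independently with probability p
            offspring.append([b ^ (rng.random() < p) for b in parent])
        # Step 4: keep the best mu of the mu + lambda individuals.
        # Offspring come first, so the stable sort lets them win ties (assumption).
        population = sorted(offspring + population, key=f, reverse=True)[:mu]
    return population[0]


onemax = sum                                      # hypothetical test function
best = mu_plus_lambda_ea(onemax, n=20, mu=2, lam=4, max_generations=2000)
print(onemax(best))
```

Because selection keeps the best µ of parents and offspring together, the best fitness in the population never decreases from one generation to the next.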
(1+1)-EA and RLS
1+1-EA
Algorithm ((1+1)-EA)
Initialise $P_0$ with $x \in \{0, 1\}^n$ by flipping each bit with $p = 1/2$;
Repeat
   Create $x'$ by flipping each bit in $x$ with $p = 1/n$;
   If $f(x') \ge f(x)$ Then $x' \in P_{t+1}$ Else $x \in P_{t+1}$;
   Let $t = t + 1$;
Until stopping condition.
If only one bit is flipped per iteration: Random Local Search (RLS).
How does it work?
Given $x$, how many bits will flip in expectation?
Let $X_i = 1$ if bit $i$ flips and $X_i = 0$ otherwise, so that $E[X_i] = 1 \cdot 1/n + 0 \cdot (1 - 1/n) = 1/n$. Then
$E[X] = E[X_1 + X_2 + \cdots + X_n] = E[X_1] + E[X_2] + \cdots + E[X_n] = \sum_{i=1}^{n} 1 \cdot 1/n = n/n = 1$
(this is $E(X) = np$ with $p = 1/n$).
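A minimal sketch of the (1+1)-EA and of RLS follows. Several details are assumptions made for illustration: the function signature, stopping as soon as a known optimal fitness value `f_opt` is reached (or an iteration budget is exhausted), and the OneMax test function.

```python
import random
from typing import Callable, List


def one_plus_one_ea(f: Callable[[List[int]], float], n: int, f_opt: float,
                    max_iters: int, rls: bool = False, seed: int = 0) -> int:
    """Return the number of iterations until f(x) == f_opt, or max_iters if not reached."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]     # each bit is one with p = 1/2
    for t in range(max_iters):
        if f(x) == f_opt:                         # stopping condition (assumption)
            return t
        if rls:
            y = x[:]                              # RLS: flip exactly one random bit
            y[rng.randrange(n)] ^= 1
        else:
            # (1+1)-EA: flip each bit independently with p = 1/n
            y = [b ^ (rng.random() < 1 / n) for b in x]
        if f(y) >= f(x):                          # accept if not worse
            x = y
    return max_iters


onemax = sum                                      # hypothetical test function
print(one_plus_one_ea(onemax, 30, 30, 100_000))
print(one_plus_one_ea(onemax, 30, 30, 100_000, rls=True))
```

The only difference between the two variants is the mutation operator: one expected bit flip for the (1+1)-EA against exactly one bit flip for RLS.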
General properties
1+1-EA: 2
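How likely is it that exactly one bit flips?
Recall $\Pr(X = j) = \binom{n}{j} p^j (1-p)^{n-j}$.
$\Pr(X = 1) = \binom{n}{1} \cdot \frac{1}{n} \cdot \left(1 - \frac{1}{n}\right)^{n-1} = \left(1 - \frac{1}{n}\right)^{n-1} \ge 1/e \approx 0.37$
Is it more likely that 2 bits flip or none?
$\Pr(X = 2) = \binom{n}{2} \cdot \frac{1}{n^2} \cdot \left(1 - \frac{1}{n}\right)^{n-2} = \frac{n(n-1)}{2} \cdot \frac{1}{n^2} \cdot \left(1 - \frac{1}{n}\right)^{n-2} = \frac{1}{2} \left(1 - \frac{1}{n}\right)^{n-1} \approx 1/(2e)$
While
$\Pr(X = 0) = \binom{n}{0} \left(\frac{1}{n}\right)^0 \left(1 - \frac{1}{n}\right)^{n} \approx 1/e$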
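These three probabilities are easy to evaluate for a concrete bit-string length. The sketch below assumes $n = 100$ and a helper name chosen here for illustration.

```python
from math import comb, e


def pr_flips(n: int, j: int) -> float:
    """Probability that exactly j of n bits flip when each flips with probability 1/n."""
    p = 1 / n
    return comb(n, j) * p**j * (1 - p) ** (n - j)


n = 100
print(pr_flips(n, 0), 1 / e)          # no bit flips: slightly below 1/e
print(pr_flips(n, 1), 1 / e)          # one bit flips: at least 1/e
print(pr_flips(n, 2), 1 / (2 * e))    # two bits flip: close to 1/(2e)
```

So flipping no bit at all is roughly twice as likely as flipping exactly two bits.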
General upper bound
1+1-EA: General Upper bound
Theorem ([Droste et al., 2002])
The expected runtime of the (1+1)-EA for an arbitrary function defined in $\{0, 1\}^n$ is $O(n^n)$.
Proof
1 Let $i$ be the number of bit positions in which the current solution $x$ and the global optimum $x^*$ differ;
2 Each bit flips with probability $1/n$, hence does not flip with probability $(1 - 1/n)$;
3 In order to reach the global optimum the algorithm has to mutate the $i$ bits and leave the $n - i$ bits unchanged;
4 Then:
$p(x^* \mid x) = \left(\frac{1}{n}\right)^{i} \left(1 - \frac{1}{n}\right)^{n-i} \ge \left(\frac{1}{n}\right)^{n} = n^{-n}$   (so $p = n^{-n}$ is a lower bound);
5 This implies an upper bound on the expected runtime of $O(n^n)$ ($E(X) = 1/p = n^n$, waiting time argument).
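Step 4 of the proof can be verified numerically for a small bit-string length. The sketch assumes $n \ge 2$ (so that $1 - 1/n \ge 1/n$) and uses a helper name introduced here for illustration.

```python
def hit_probability(n: int, i: int) -> float:
    """Probability that one mutation flips exactly the i differing bits and no others."""
    return (1 / n) ** i * (1 - 1 / n) ** (n - i)


n = 6
# The smallest probability over all Hamming distances i is attained at i = n.
worst = min(hit_probability(n, i) for i in range(n + 1))
print(worst, n ** -n)
# Waiting time bound 1 / p, approximately n^n.
print(1 / worst, n ** n)
```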
General upper bound
General Upper bound Exercises
Theorem
The expected runtime of the (1+1)-EA with mutation probability $p = 1/2$ for an arbitrary function defined in $\{0, 1\}^n$ is $O(2^n)$.
Proof Left as Exercise.
Theorem
The expected runtime of the (1+1)-EA with mutation probability $p = \chi/n$ for an arbitrary function defined in $\{0, 1\}^n$ is $O((n/\chi)^n)$.
Proof Left as Exercise.
Theorem
The expected runtime of RLS for an arbitrary function defined in $\{0, 1\}^n$ is infinite.
Proof Left as Exercise.
General upper bound
1+1-EA: Conclusions & Exercises
In general:
$P(i\text{-bitflip}) = \binom{n}{i} \frac{1}{n^i} \left(1 - \frac{1}{n}\right)^{n-i} \le \frac{1}{i!} \left(1 - \frac{1}{n}\right)^{n-i} \approx \frac{1}{i!\,e}$
What about RLS?
Expectation: $E[X] = 1$
$P(1\text{-bitflip}) = 1$
What about initialisation?
How many one-bits in expectation after initialisation?
$E[X] = n \cdot 1/2 = n/2$
How likely is it that we get exactly $n/2$ one-bits?
$\Pr(X = n/2) = \binom{n}{n/2} \left(\frac{1}{2}\right)^{n/2} \left(1 - \frac{1}{2}\right)^{n/2} = \binom{n}{n/2}\, 2^{-n}$   ($n = 100$: $\Pr(X = 50) \approx 0.0796$)
Tail Inequalities help us to deal with these kinds of problems.
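The value quoted for $n = 100$ can be reproduced directly, assuming each bit is set to one with probability 1/2 at initialisation. The helper names below are illustrative assumptions; the second function also checks the bound $1/i!$ on the probability of an $i$-bit flip.

```python
from math import comb, e, factorial


def pr_half_ones(n: int) -> float:
    """Pr(X = n/2) for X ~ B(n, 1/2), n even."""
    return comb(n, n // 2) / 2**n


def pr_i_bit_flip(n: int, i: int) -> float:
    """Probability that exactly i bits flip under mutation rate 1/n."""
    return comb(n, i) / n**i * (1 - 1 / n) ** (n - i)


print(round(pr_half_ones(100), 4))    # 0.0796
for i in range(4):
    # exact value, the bound 1/i!, and the approximation 1/(i! e)
    print(i, pr_i_bit_flip(100, i), 1 / factorial(i), 1 / (factorial(i) * e))
```

Even the single most likely outcome, exactly $n/2$ one-bits, has probability below 8% for $n = 100$, which is why statements about ranges of values are needed.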
Tail Inequalities
Given a random variable $X$, it may assume values that are considerably larger or lower than its expectation;
Tail inequalities:
Estimate the probability that $X$ deviates from the expectation by a defined amount $\delta$;
For many intermediate results, expected values are useless;
May turn expected runtimes into bounds that hold with overwhelming probability.
Markov’s inequality
Markov Inequality
The fundamental inequality from which many others are derived.
Definition (Markov’s Inequality)
Let $X$ be a random variable assuming only non-negative values, and $E[X]$ its expectation. Then for all $t \in \mathbb{R}^+$,
$\Pr[X \ge t] \le \frac{E[X]}{t}$.
$E[X] = 1$; then: $\Pr[X \ge 2] \le \frac{E[X]}{2} \le \frac{1}{2}$ (Number of bits that flip)
$E[X] = n/2$; then $\Pr[X \ge (2/3)n] \le \frac{E[X]}{(2/3)n} = \frac{n/2}{(2/3)n} \le \frac{3}{4}$ (Number of one-bits after initialisation)
Usually Markov’s inequality is used iteratively to obtain stronger bounds!
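To see how loose the second example is, the Markov bound can be compared with the exact binomial tail. This is a sketch under assumptions made here: the helper names and the choice $n = 99$, taken divisible by 3 so that $(2/3)n$ is an integer.

```python
from math import comb


def exact_tail(n: int, t: int) -> float:
    """Pr[X >= t] for X ~ B(n, 1/2)."""
    return sum(comb(n, j) for j in range(t, n + 1)) / 2**n


def markov_bound(mean: float, t: float) -> float:
    """Markov's inequality: Pr[X >= t] <= E[X] / t."""
    return mean / t


n = 99                                # divisible by 3, so (2/3) n is an integer
t = 2 * n // 3
print(markov_bound(n / 2, t))         # 0.75, independent of n
print(exact_tail(n, t))               # far smaller than the Markov bound
```

The Markov bound does not improve as $n$ grows, whereas the exact tail shrinks quickly.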
Chernoff bounds
Let $X_1, X_2, \dots, X_n$ be independent Poisson trials, each with success probability $p_i$. For $X = \sum_{i=1}^{n} X_i$ the expectation is $E[X] = \sum_{i=1}^{n} p_i$.
Definition (Chernoff Bounds)
1 for $0 \le \delta \le 1$, $\Pr(X \le (1-\delta)E[X]) \le e^{-E[X]\delta^2/2}$.
2 for $\delta > 0$, $\Pr(X > (1+\delta)E[X]) \le \left[\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right]^{E[X]}$.
What is the probability that we have more than (2/3)n one-bits at initialisation?
$p_i = 1/2$, $E[X] = n \cdot 1/2 = n/2$ (we fix $\delta = 1/3 \to (1+\delta)E[X] = (2/3)n$); then:
$\Pr[X > (2/3)n] \le \left(\frac{e^{1/3}}{(4/3)^{4/3}}\right)^{n/2} = c^{-n/2}$, with the constant $c = (4/3)^{4/3} / e^{1/3} > 1$.
Chernoff Bound Simple Application
Bitstring of length n = 100
$\Pr(X_i = 1) = 1/2$ and $E[X] = np = 100/2 = 50$.
What is the probability to have at least 75 1-bits?
Markov: $\Pr(X \ge 75) \le \frac{50}{75} = \frac{2}{3}$
Chernoff: $\Pr(X \ge (1 + 1/2) \cdot 50) \le \left(\frac{\sqrt{e}}{(3/2)^{3/2}}\right)^{50} < 0.0045$
Truth: $\Pr(X \ge 75) = \sum_{i=75}^{100} \binom{100}{i} 2^{-100} < 0.000000282$
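The three numbers in this comparison can be recomputed directly. The following is a minimal sketch; the function name `chernoff_upper` is a hypothetical helper that simply evaluates the second Chernoff bound stated earlier.

```python
import math

def chernoff_upper(mu, delta):
    """Evaluate (e^delta / (1 + delta)^(1 + delta))^mu, the upper-tail Chernoff bound."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

n, threshold = 100, 75
mu = n / 2

markov = mu / threshold                      # 50/75
chernoff = chernoff_upper(mu, threshold / mu - 1)  # delta = 1/2
# Exact tail of Bin(100, 1/2), computed with integer binomial coefficients.
exact = sum(math.comb(n, i) for i in range(threshold, n + 1)) / 2**n

print(f"Markov   {markov:.4f}")
print(f"Chernoff {chernoff:.6f}")
print(f"Exact    {exact:.3e}")
```

The gap between the Chernoff value and the exact tail shows that even the exponential bound is far from tight for this instance, although it is orders of magnitude better than Markov.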
OneMax
$\mathrm{OneMax}(x) = \sum_{i=1}^{n} x[i]$
[Figure: plot of f(x) against ones(x), both axes running from 1 to n.]
RLS for OneMax ($\mathrm{OneMax}(x) = \sum_{i=1}^{n} x[i]$), example with n = 6
Bit positions are numbered 0 to 5. Each row shows the string after the improving step from level i (i one-bits), the probability p_i of an improvement and the expected waiting time E(T_i) = 1/p_i.
i   string         p_i    E(T_i)
0   0 0 0 0 0 1    6/6    6/6
1   0 0 1 0 0 1    5/6    6/5
2   1 0 1 0 0 1    4/6    6/4
3   1 0 1 0 1 1    3/6    6/3
4   1 1 1 0 1 1    2/6    6/2
5   1 1 1 1 1 1    1/6    6/1
$E(T) = E(T_0) + E(T_1) + \dots + E(T_5) = 1/p_0 + 1/p_1 + \dots + 1/p_5 = \sum_{i=0}^{5} \frac{1}{p_i} = \sum_{i=0}^{5} \frac{6}{6-i} = 6 \sum_{i=1}^{6} \frac{1}{i} = 6 \cdot 2.45 = 14.7$
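The value 14.7 can be checked empirically. The sketch below assumes that RLS starts from the all-zeros string (as in the example), flips exactly one uniformly chosen bit per iteration and rejects any offspring with lower fitness; the function names and the seed are illustrative choices, not part of the tutorial.

```python
import random

def rls_onemax_time(n, rng):
    """Iterations RLS needs on OneMax when started from the all-zeros string."""
    x = [0] * n
    steps = 0
    while sum(x) < n:
        i = rng.randrange(n)   # choose one bit position uniformly at random
        if x[i] == 0:
            x[i] = 1           # flipping a zero improves fitness: accept
        # flipping a one would lower fitness, so that offspring is rejected
        steps += 1
    return steps

def expected_rls_time(n):
    """Sum of the waiting times 1/p_i with p_i = (n - i)/n."""
    return sum(n / (n - i) for i in range(n))

rng = random.Random(42)  # fixed seed, only for reproducibility
n, runs = 6, 20000
mean_steps = sum(rls_onemax_time(n, rng) for _ in range(runs)) / runs

print(f"theory {expected_rls_time(n):.2f}, simulation {mean_steps:.2f}")
```

With enough runs the simulated mean should settle close to the theoretical sum of the geometric waiting times.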
RLS for OneMax ($\mathrm{OneMax}(x) = \sum_{i=1}^{n} x[i]$): Generalisation
i      string                 p_i        E(T_i)
0      0 0 0 0 0 1 0 0 0 0    n/n        n/n
1      0 0 1 0 0 1 0 0 0 0    (n-1)/n    n/(n-1)
2      0 0 1 0 0 1 0 0 0 1    (n-2)/n    n/(n-2)
...
n-1    1 1 1 1 1 1 1 1 1 1    1/n        n/1
$E(T) = E(T_0) + E(T_1) + \dots + E(T_{n-1}) = 1/p_0 + 1/p_1 + \dots + 1/p_{n-1} = \sum_{i=0}^{n-1} \frac{1}{p_i} = \sum_{i=1}^{n} \frac{n}{i} = n \sum_{i=1}^{n} \frac{1}{i} = n \cdot H(n) = n \log n + \Theta(n) = O(n \log n)$
Coupon collector’s problem
The coupon collector’s problem: There are n types of coupons and at each trial one coupon is chosen at random. Each coupon has the same probability of being extracted. The goal is to find the exact number of trials before the collector has obtained all the n coupons.
Theorem (The coupon collector’s Theorem)
Let T be the time for all the n coupons to be collected. Then
$E(T) = \sum_{i=0}^{n-1} \frac{1}{p_{i+1}} = \sum_{i=0}^{n-1} \frac{n}{n-i} = n \sum_{i=1}^{n} \frac{1}{i} = n(\log n + \Theta(1)) = n \log n + O(n)$.
Coupon collector’s problem: Upper bound on time
What is the probability that the time to collect n coupons is greater than n ln n + O(n)?
Theorem (Coupon collector upper bound on time)
Let T be the time for all the n coupons to be collected. Then
$\Pr(T \ge (1+\varepsilon) n \ln n) \le n^{-\varepsilon}$
Proof
$\frac{1}{n}$: probability of choosing a given coupon
$1 - \frac{1}{n}$: probability of not choosing a given coupon
$\left(1 - \frac{1}{n}\right)^{t}$: probability of not choosing a given coupon for t rounds
The probability that one of the n coupons is not chosen in t rounds is at most $n \cdot \left(1 - \frac{1}{n}\right)^{t}$ (Union Bound).
Hence, for $t = c n \ln n$:
$\Pr(T \ge c n \ln n) \le n (1 - 1/n)^{c n \ln n} \le n \cdot e^{-c \ln n} = n \cdot n^{-c} = n^{-c+1}$
Setting $c = 1 + \varepsilon$ gives the claim.
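The tail bound can be compared against a simulation of the collection process. This is a sketch under illustrative assumptions (n = 50, ε = 0.5, 2000 runs and the seed are arbitrary choices, and `coupon_collector_time` is a hypothetical helper name); it also reports the empirical mean next to n · H(n).

```python
import math
import random

def coupon_collector_time(n, rng):
    """Number of uniform draws until all n coupon types have been seen."""
    seen = set()
    t = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        t += 1
    return t

rng = random.Random(0)          # fixed seed, only for reproducibility
n, eps, runs = 50, 0.5, 2000    # assumptions: illustrative parameters

threshold = (1 + eps) * n * math.log(n)
times = [coupon_collector_time(n, rng) for _ in range(runs)]

empirical_tail = sum(t >= threshold for t in times) / runs
union_bound = n ** (-eps)
mean_time = sum(times) / runs
harmonic_mean = n * sum(1 / i for i in range(1, n + 1))

print(f"Pr(T >= (1+eps) n ln n): empirical {empirical_tail:.4f}, bound {union_bound:.4f}")
print(f"E(T): empirical {mean_time:.1f}, n*H(n) {harmonic_mean:.1f}")
```

For small tail probabilities the union bound is close to the true value, so the empirical frequency may sit only slightly below the bound and can fluctuate around it for a small number of runs.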
Coupon collector’s problem: lower bound on time
What is the probability that the time to collect n coupons is less than n ln n + O(n)?
Theorem (Coupon collector lower bound on time (Doerr, 2011))
Let T be the time for all the n coupons to be collected. Then for all ε > 0
$\Pr(T < (1-\varepsilon)(n-1)\ln n) \le \exp(-n^{\varepsilon})$
Corollary
The expected time for RLS to optimise OneMax is Θ(n ln n). Furthermore,
$\Pr(T \ge (1+\varepsilon) n \ln n) \le n^{-\varepsilon}$
and
$\Pr(T < (1-\varepsilon)(n-1)\ln n) \le \exp(-n^{\varepsilon})$
What about the (1+1)-EA?
AFL method for upper bounds
Artificial Fitness Levels [Droste et al., 2002]
Observation: Due to elitism, fitness is monotonically non-decreasing.
Idea: Divide the search space, of size $|S| = 2^n$, into $m < 2^n$ sets $A_1, \dots, A_m$ such that:
1 $\forall i \ne j: A_i \cap A_j = \emptyset$
2 $\bigcup_{i=1}^{m} A_i = \{0,1\}^n$
3 for all points $a \in A_i$ and $b \in A_j$ it happens that $f(a) < f(b)$ if $i < j$.
Requirement: $A_m$ contains only optimal search points.
Then: $p_i$ is the probability that a point in $A_i$ is mutated to a point in $A_j$ with $j > i$.
Expected time: $E(T) \le \sum_{i=1}^{m-1} \frac{1}{p_i}$
Very simple, yet often powerful method for upper bounds
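The bound itself is just a sum of reciprocals, which a few lines of code can evaluate for any list of level-leaving probabilities. The sketch below uses a hypothetical helper `fitness_level_bound` and, as an illustrative instance, the RLS levels on OneMax with n = 6, where level i holds the strings with i one-bits and the improvement probability is (n − i)/n.

```python
def fitness_level_bound(leave_probs):
    """Sum of 1/p_i over the non-optimal levels: the fitness-level upper bound on E(T)."""
    if any(p <= 0 or p > 1 for p in leave_probs):
        raise ValueError("each probability must lie in (0, 1]")
    return sum(1.0 / p for p in leave_probs)

n = 6
# Assumption for illustration: RLS on OneMax, level i = strings with i one-bits.
rls_levels = [(n - i) / n for i in range(n)]
rls_bound = fitness_level_bound(rls_levels)

print(f"fitness-level bound for RLS on OneMax, n = {n}: {rls_bound:.2f}")
```

For RLS on OneMax the levels are visited one after another, so the bound coincides with the exact expectation from the all-zeros string; in general it is only an upper bound, and lower bounds on the p_i still yield a valid (weaker) result.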
Artificial Fitness Levels
D. Sudholt, Tutorial 2011
Let:
$p(A_i)$ be the probability that a randomly chosen point belongs to level $A_i$
$s_i$ be the probability to leave level $A_i$ for $A_j$ with $j > i$
Then:
$E(T) \le \sum_{1 \le i \le m-1} p(A_i) \cdot \left(\frac{1}{s_i} + \dots + \frac{1}{s_{m-1}}\right) \le \left(\frac{1}{s_1} + \dots + \frac{1}{s_{m-1}}\right) = \sum_{i=1}^{m-1} \frac{1}{s_i}$
Inequality 1: Law of total probability ($E(T) = \sum_i \Pr(F_i) \cdot E(T \mid F_i)$)
Inequality 2: Trivial!
(1+1)-EA for OneMax
Theorem
The expected runtime of the (1+1)-EA for OneMax is O(n ln n).
Proof
The current solution is in level $A_i$ if it has i zeroes (hence n − i ones).
To reach a higher fitness level it is sufficient to flip a zero into a one and leave the other bits unchanged.
The probability is $s_i \ge i \cdot \frac{1}{n} \left(1 - \frac{1}{n}\right)^{n-1} \ge \frac{i}{en}$.
Hence, $\frac{1}{s_i} \le \frac{en}{i}$.
Then (Artificial Fitness Levels), summing over the n non-optimal levels $A_1, \dots, A_n$:
$E(T) \le \sum_{i=1}^{n} s_i^{-1} \le \sum_{i=1}^{n} \frac{en}{i} = e \cdot n \sum_{i=1}^{n} \frac{1}{i} \le e \cdot n \cdot (\ln n + 1) = O(n \ln n)$
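The upper bound e · n · (ln n + 1) can be compared with simulated runtimes. This is a minimal sketch, assuming the usual formulation of the (1+1)-EA: uniform random initialisation, standard bit mutation with rate 1/n, and acceptance of the offspring if it is not worse. The parameters n = 20, 200 runs and the seed are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math
import random

def one_plus_one_ea_onemax(n, rng):
    """Iterations of the (1+1)-EA until the all-ones string is found on OneMax."""
    x = [rng.randint(0, 1) for _ in range(n)]   # uniform random initialisation
    steps = 0
    while sum(x) < n:
        # standard bit mutation: flip each bit independently with probability 1/n
        y = [b ^ 1 if rng.random() < 1.0 / n else b for b in x]
        if sum(y) >= sum(x):                    # elitist selection: keep if not worse
            x = y
        steps += 1
    return steps

def afl_upper_bound(n):
    """Fitness-level bound e * n * (ln n + 1) from the proof."""
    return math.e * n * (math.log(n) + 1)

rng = random.Random(1)   # fixed seed, only for reproducibility
n, runs = 20, 200
mean_steps = sum(one_plus_one_ea_onemax(n, rng) for _ in range(runs)) / runs

print(f"simulated mean {mean_steps:.1f}, fitness-level bound {afl_upper_bound(n):.1f}")
```

The simulated mean should fall below the bound, since the proof only counts single-bit improvements and ignores the head start given by random initialisation.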
Is the (1+1)-EA quicker than n ln n?
P. S. Oliveto & X. Yao (University of Birmingham) Runtime Analysis of EAs WCCI 2012 39 / 92
Motivation Basic Probability Theory Evolutionary Algorithms Tail Inequalities Artificial Fitness Levels Drift Analysis Typical Run Investigations Conclusions
AFL method for upper bounds
(1+1)-EA for OneMax
Theorem
The expected runtime of the (1+1)-EA for OneMax is O(n lnn).
Proof
The current solution is in level Ai if it has i zeroes (hence n− i ones)
To reach a higher fitness level it is sufficient to flip a zero into a one and leavethe other bits unchanged
The probability is si ≥ i · 1n
„1− 1
n
«n−1
≥ ien
Hence, 1si≤ en
i
Then`
Artificial Fitness Levels´:
E(T ) ≤m−1Xi=1
s−1i ≤
nXi=1
en
i≤ e · n
m−1Xi=1
1
i≤ e · n · (lnn+ 1) = O(n lnn)
Is the (1+1)-EA quicker than n lnn?
P. S. Oliveto & X. Yao (University of Birmingham) Runtime Analysis of EAs WCCI 2012 39 / 92
Motivation Basic Probability Theory Evolutionary Algorithms Tail Inequalities Artificial Fitness Levels Drift Analysis Typical Run Investigations Conclusions
AFL method for upper bounds
(1+1)-EA for OneMax
Theorem
The expected runtime of the (1+1)-EA for OneMax is O(n lnn).
Proof
The current solution is in level Ai if it has i zeroes (hence n− i ones)
To reach a higher fitness level it is sufficient to flip a zero into a one and leavethe other bits unchanged
The probability is si ≥ i · 1n
„1− 1
n
«n−1
≥ ien
Hence, 1si≤ en
i
Then`
Artificial Fitness Levels´:
E(T ) ≤m−1Xi=1
s−1i ≤
nXi=1
en
i≤ e · n
m−1Xi=1
1
i≤ e · n · (lnn+ 1) = O(n lnn)
Is the (1+1)-EA quicker than n lnn?
P. S. Oliveto & X. Yao (University of Birmingham) Runtime Analysis of EAs WCCI 2012 39 / 92
Motivation Basic Probability Theory Evolutionary Algorithms Tail Inequalities Artificial Fitness Levels Drift Analysis Typical Run Investigations Conclusions
AFL method for upper bounds
(1+1)-EA for OneMax
Theorem
The expected runtime of the (1+1)-EA for OneMax is O(n lnn).
Proof
The current solution is in level Ai if it has i zeroes (hence n− i ones)
To reach a higher fitness level it is sufficient to flip a zero into a one and leavethe other bits unchanged
The probability is si ≥ i · 1n
„1− 1
n
«n−1
≥ ien
Hence, 1si≤ en
i
Then`
Artificial Fitness Levels´:
E(T ) ≤m−1Xi=1
s−1i ≤
nXi=1
en
i≤ e · n
m−1Xi=1
1
i≤ e · n · (lnn+ 1) = O(n lnn)
Is the (1+1)-EA quicker than n lnn?
P. S. Oliveto & X. Yao (University of Birmingham) Runtime Analysis of EAs WCCI 2012 39 / 92
AFL method for upper bounds
(1+1)-EA lower bound for OneMax
Theorem (Droste, Jansen, Wegener, 2002)
The expected runtime of the (1+1)-EA for OneMax is $\Omega(n \ln n)$.
Proof Idea
1 At most $n/2$ one-bits are created during initialisation with probability at least 1/2 (by symmetry of the binomial distribution).
2 There is a constant probability that in $cn \ln n$ steps one of the $n/2$ remaining zero-bits does not flip.
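Step 1 of the proof idea can be checked exactly for small n. This is a small sketch, assuming uniform random initialisation so that the number of one-bits is Binomial(n, 1/2); the helper name is hypothetical.

```python
from math import comb


def prob_at_most_half_ones(n):
    """Exact P(X <= n/2) for X ~ Binomial(n, 1/2), the one-bits of a random string."""
    favourable = sum(comb(n, k) for k in range(n // 2 + 1))
    return favourable / 2 ** n


if __name__ == "__main__":
    for n in (5, 10, 50, 100):
        print(n, prob_at_most_half_ones(n))
```

For odd n the probability is exactly 1/2 by symmetry; for even n the central term pushes it above 1/2.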
AFL method for upper bounds
Lower bound for OneMax
Theorem (Droste, Jansen, Wegener, 2002)
The expected runtime of the (1+1)-EA for OneMax is $\Omega(n \log n)$.
Proof of 2. (here $\log$ denotes the natural logarithm)
$1 - 1/n$: a given bit does not flip
$(1 - 1/n)^t$: a given bit does not flip in $t$ steps
$1 - (1 - 1/n)^t$: it flips at least once in $t$ steps
$(1 - (1 - 1/n)^t)^{n/2}$: all $n/2$ bits flip at least once in $t$ steps
$1 - [1 - (1 - 1/n)^t]^{n/2}$: at least one of the $n/2$ bits does not flip in $t$ steps
Set $t = (n-1) \log n$. Then, using $(1 - 1/n)^{n-1} \ge 1/e$ and $(1 - 1/n)^{n} \le 1/e$:
\[ 1 - [1 - (1 - 1/n)^t]^{n/2} = 1 - [1 - (1 - 1/n)^{(n-1)\log n}]^{n/2} \ge 1 - [1 - (1/e)^{\log n}]^{n/2} = 1 - [1 - 1/n]^{n/2} = 1 - [1 - 1/n]^{n \cdot 1/2} \ge 1 - e^{-1/2} = c \]
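The chain of inequalities can be sanity-checked numerically. The sketch below evaluates the probability that at least one of n/2 fixed bits never flips in t = (n - 1) ln n steps and compares it with 1 - e^(-1/2) (about 0.39), the limiting value of 1 - (1 - 1/n)^(n/2). The function name is an assumption made for illustration.

```python
import math


def prob_some_bit_never_flips(n, t):
    """1 - [1 - (1 - 1/n)^t]^(n/2) for n/2 fixed bits and t mutation steps."""
    never_flips = (1.0 - 1.0 / n) ** t          # one given bit is untouched for t steps
    all_flip = (1.0 - never_flips) ** (n / 2)   # every one of the n/2 bits flips at least once
    return 1.0 - all_flip


if __name__ == "__main__":
    limit = 1.0 - math.exp(-0.5)
    for n in (10, 100, 1000, 10000):
        t = (n - 1) * math.log(n)
        print(n, round(prob_some_bit_never_flips(n, t), 4), ">=", round(limit, 4))
```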
AFL method for upper bounds
Lower bound for OneMax (2)
Theorem (Droste, Jansen, Wegener, 2002)
The expected runtime of the (1+1)-EA for OneMax is $\Omega(n \log n)$.
Proof
1 At most $n/2$ one-bits are created during initialisation with probability at least 1/2 (by symmetry of the binomial distribution).
2 There is a constant probability that in $cn \log n$ steps one of the $n/2$ remaining zero-bits does not flip.
The expected runtime is:
\[ E[T] = \sum_{t=1}^{\infty} t \cdot p(t) \ge [(n-1)\log n] \cdot p[T \ge (n-1)\log n] \ge [(n-1)\log n] \cdot \left[\frac{1}{2} \cdot \left(1 - e^{-1/2}\right)\right] = \Omega(n \log n) \]
First inequality: law of total probability
The upper bound given by artificial fitness levels is indeed tight!
AFL method for upper bounds
Artificial Fitness Levels Exercises:
\[ \text{LeadingOnes}(x) = \sum_{i=1}^{n} \prod_{j=1}^{i} x[j] \]
Theorem
The expected runtime of RLS for LeadingOnes is $O(n^2)$.
Proof
Let partition $A_i$ contain search points with exactly $i$ leading ones.
To leave level $A_i$ it suffices to flip the zero at position $i+1$.
$s_i = \frac{1}{n}$ and $s_i^{-1} = n$.
\[ E(T) \le \sum_{i=0}^{n-1} s_i^{-1} = \sum_{i=0}^{n-1} n = n^2 = O(n^2) \]
Theorem
The expected runtime of the (1+1)-EA for LeadingOnes is $O(n^2)$.
Proof Left as Exercise.
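The LeadingOnes definition and the RLS bound can be illustrated with a short simulation. This is a sketch under stated assumptions: RLS flips exactly one uniformly chosen bit per step and accepts offspring that are not worse, and initialisation is uniform at random. The names are illustrative.

```python
import random


def leading_ones(bits):
    """Number of consecutive one-bits at the start of the string."""
    count = 0
    for b in bits:
        if b != 1:
            break
        count += 1
    return count


def rls_leading_ones(n, rng):
    """Run RLS on LeadingOnes and return the number of iterations used."""
    parent = [rng.randint(0, 1) for _ in range(n)]
    steps = 0
    while leading_ones(parent) < n:
        child = parent[:]
        child[rng.randrange(n)] ^= 1  # flip exactly one uniformly chosen bit
        if leading_ones(child) >= leading_ones(parent):
            parent = child
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(3)
    n, runs = 20, 50
    mean = sum(rls_leading_ones(n, rng) for _ in range(runs)) / runs
    print(f"empirical mean: {mean:.1f}, fitness-level bound n^2 = {n * n}")
```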
AFL method for upper bounds
Fitness Levels Advanced Exercises (Populations)
Theorem
The expected runtime of the (1+λ)-EA for LeadingOnes is $O(\lambda n + n^2)$ [Jansen et al., 2005].
Proof
Let partition $A_i$ contain search points with exactly $i$ leading ones.
To leave level $A_i$ it suffices to flip the zero at position $i+1$ in at least one of the $\lambda$ offspring.
\[ s_i \ge 1 - \left(1 - \frac{1}{en}\right)^{\lambda} \ge 1 - e^{-\lambda/(en)} \]
1 $s_i \ge 1 - \frac{1}{e}$  Case 1: $\lambda \ge en$
2 $s_i \ge \frac{\lambda}{2en}$  Case 2: $\lambda < en$
\[ E(T) \le \lambda \cdot \sum_{i=0}^{n-1} s_i^{-1} \le \lambda \left( \left( \sum_{i=0}^{n-1} \frac{1}{c} \right) + \left( \sum_{i=0}^{n-1} \frac{2en}{\lambda} \right) \right) = O\left( \lambda \cdot \left( n + \frac{n^2}{\lambda} \right) \right) = O(\lambda \cdot n + n^2) \]
where $c = 1 - 1/e$.
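The two-case simplification of the success probability can be checked numerically. The sketch below, with hypothetical helper names, evaluates 1 - (1 - 1/(en))^λ and the case-wise lower bound used in the proof, then plugs the latter into the fitness-level sum over n levels.

```python
import math


def offspring_success_lower_bound(n, lam):
    """1 - (1 - 1/(e*n))^lam: at least one of lam offspring leaves the level."""
    return 1.0 - (1.0 - 1.0 / (math.e * n)) ** lam


def case_bound(n, lam):
    """Case-wise simplification: 1 - 1/e if lam >= e*n, else lam/(2*e*n)."""
    if lam >= math.e * n:
        return 1.0 - 1.0 / math.e
    return lam / (2.0 * math.e * n)


def evaluations_bound(n, lam):
    """lam * sum over n levels of 1/s_i, counting fitness evaluations."""
    return lam * n / case_bound(n, lam)


if __name__ == "__main__":
    n = 50
    for lam in (1, 10, 100, 200):
        print(lam, offspring_success_lower_bound(n, lam), case_bound(n, lam),
              round(evaluations_bound(n, lam)))
```

For small λ the bound stays near 2en², so offspring populations cost nothing asymptotically until λ exceeds roughly n.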
AFL method for upper bounds
Fitness Levels Advanced Exercises (Populations)
Theorem
The expected runtime of the (µ+1)-EA for LeadingOnes is $O(\mu \cdot n^2)$.
Proof Left as Exercise.
Theorem
The expected runtime of the (µ+1)-EA for OneMax is $O(\mu \cdot n \log n)$.
Proof Left as Exercise.
AFL method for parent populations
Artificial Fitness Levels for Populations
D. Sudholt, Tutorial 2011
Let:
$T_o$ be the expected time for a fraction $\chi(i)$ of the population to be in level $A_i$
$s_i$ be the probability to leave level $A_i$ for $A_j$ with $j > i$, given $\chi(i)$ in level $A_i$
Then:
\[ E(T) \le \sum_{i=1}^{m-1} \left( \frac{1}{s_i} + T_o \right) \]
AFL method for parent populations
Applications to (µ+1)-EA
Theorem
The expected runtime of the (µ+1)-EA for LeadingOnes is $O(\mu n \log n + n^2)$ [Witt, 2006].
Proof
Let partition $A_i$ contain search points with exactly $i$ leading ones.
To leave level $A_i$ it suffices to flip the zero at position $i+1$ of the best individual.
We set $\chi(i) = n / \ln n$.
Given $j$ copies of the best individual, another replica is created with probability
\[ \frac{j}{\mu} \left(1 - \frac{1}{n}\right)^{n} \ge \frac{j}{2e\mu} \]
\[ T_o \le \sum_{j=1}^{n/\ln n} \frac{2e\mu}{j} \le 2e\mu \ln n \]
1 $s_i \ge \frac{n/\ln n}{\mu} \cdot \frac{1}{en} = \frac{1}{e\mu \ln n}$  Case 1: $\mu > \frac{n}{\ln n}$
2 $s_i \ge \frac{n/\ln n}{\mu} \cdot \frac{1}{en} \ge \frac{1}{en}$  Case 2: $\mu \le \frac{n}{\ln n}$
\[ E(T) \le \sum_{i=0}^{n-1} \left(T_o + s_i^{-1}\right) \le \sum_{i=0}^{n-1} \left( 2e\mu \ln n + \left(en + e\mu \ln n\right) \right) = n \cdot \left( 2e\mu \ln n + \left(en + e\mu \ln n\right) \right) = O(n\mu \ln n + n^2) \]
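The takeover-time sum and the resulting runtime bound can be evaluated directly. This is a sketch with hypothetical function names; it assumes the upper summation limit n / ln n is rounded down to an integer and that n is at least 16, so that the harmonic sum stays below ln n.

```python
import math


def takeover_time_bound(n, mu):
    """sum_{j=1}^{n/ln n} 2*e*mu/j, with the upper limit rounded down."""
    copies = int(n / math.log(n))
    return sum(2.0 * math.e * mu / j for j in range(1, copies + 1))


def mu_plus_one_bound(n, mu):
    """n * (2*e*mu*ln n + e*n + e*mu*ln n), the final expression of the proof."""
    takeover = 2.0 * math.e * mu * math.log(n)
    return n * (takeover + math.e * n + math.e * mu * math.log(n))


if __name__ == "__main__":
    for n, mu in ((50, 1), (50, 20), (1000, 10)):
        print(n, mu,
              round(takeover_time_bound(n, mu), 1),
              round(2 * math.e * mu * math.log(n), 1),
              round(mu_plus_one_bound(n, mu)))
```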
AFL method for parent populations
Populations Fitness Levels: Exercise
Theorem
The expected runtime of the (µ+1)-EA for OneMax is $O(\mu n + n \log n)$.
Proof Left as Exercise.
AFL for non-elitist EAs
Advanced: Fitness Levels for non-Elitist Populations [Lehre, 2011]
AFL for lower bounds
Advanced: Fitness Levels for Lower Bounds [Sudholt, 2010]
AFL for lower bounds
Artificial Fitness Levels: Conclusions
It is a powerful general method to obtain (often) tight upper bounds on the runtime of simple EAs;
For offspring populations tight bounds can often be achieved with the general method;
For parent populations takeover times have to be introduced;
Recent methods have been presented to deal with non-elitism and for lower bounds.
Drift Analysis: Example 1
Friday night dinner at the restaurant.
Peter walks back from the restaurant to the hotel.
The restaurant is n metres away from the hotel;
Peter moves 1 metre towards the hotel in each step.
Question
How many steps does Peter need to reach his hotel?
n steps
Drift Analysis: Formalisation
Define a distance function $d(x)$ to measure the distance from the hotel;
\[ d(x) = x, \quad x \in \{0, \ldots, n\} \]
(In our case the distance is simply the number of metres from the hotel).
Estimate the expected “speed” (drift), the expected decrease in distance from the goal;
\[ d(X_t) - d(X_{t+1}) = \begin{cases} 0, & \text{if } X_t = 0, \\ 1, & \text{if } X_t \in \{1, \ldots, n\} \end{cases} \]
Time
Then the expected time to reach the hotel (goal) is:
\[ E(T) = \frac{\text{maximum distance}}{\text{drift}} = \frac{n}{1} = n \]
Drift Analysis: Example 2
Friday night dinner at the restaurant.
Peter walks back from the restaurant to the hotel but had a few drinks.
The restaurant is n metres away from the hotel;
Peter moves 1 metre towards the hotel in each step with probability 0.6.
Peter moves 1 metre away from the hotel in each step with probability 0.4.
Question
How many steps does Peter need to reach his hotel?
5n steps
Let us calculate this through drift analysis.
Drift Analysis (2): Formalisation
Define the same distance function $d(x)$ as before to measure the distance from the hotel;
\[ d(x) = x, \quad x \in \{0, \ldots, n\} \]
(simply the number of metres from the hotel).
Estimate the expected “speed” (drift), the expected decrease in distance from the goal;
\[ d(X_t) - d(X_{t+1}) = \begin{cases} 0, & \text{if } X_t = 0, \\ 1, & \text{if } X_t \in \{1, \ldots, n\} \text{ with probability } 0.6, \\ -1, & \text{if } X_t \in \{1, \ldots, n\} \text{ with probability } 0.4 \end{cases} \]
The expected decrease in distance (drift) is:
\[ E[d(X_t) - d(X_{t+1})] = 0.6 \cdot 1 + 0.4 \cdot (-1) = 0.6 - 0.4 = 0.2 \]
Time
Then the expected time to reach the hotel (goal) is:
\[ E(T) = \frac{\text{maximum distance}}{\text{drift}} = \frac{n}{0.2} = 5n \]
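The 5n figure can be reproduced by simulating the walk. The sketch below makes one simplifying assumption that is not on the slide: Peter may wander further than n metres from the hotel (no barrier at distance n). The function name and the seed are illustrative.

```python
import random


def drunk_walk_steps(n, rng, p_towards=0.6):
    """Steps until the distance reaches 0, starting n metres from the hotel."""
    distance, steps = n, 0
    while distance > 0:
        # One metre closer with probability p_towards, one metre further otherwise.
        distance += -1 if rng.random() < p_towards else 1
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(7)
    n, trials = 20, 2000
    mean = sum(drunk_walk_steps(n, rng) for _ in range(trials)) / trials
    print(f"empirical mean: {mean:.1f}, drift estimate n / 0.2 = {n / 0.2:.0f}")
```

Under this assumption the expected hitting time from distance k satisfies E_k = 1 + 0.6 E_{k-1} + 0.4 E_{k+1}, which E_k = 5k solves.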
Additive Drift Theorem
Theorem (Additive Drift Theorem for Upper Bounds [He and Yao, 2001])
Let $\{X_t\}_{t \ge 0}$ be a Markov process over a set of states $S$, and $d : S \to \mathbb{R}_0^+$ a function that assigns a non-negative real number to every state. Let the time to reach the optimum be $T := \min\{t \ge 0 : d(X_t) = 0\}$. If there exists $\delta > 0$ such that at any time step $t \ge 0$ and at any state $X_t$ with $d(X_t) > 0$ the following condition holds:
\[ E(\Delta(t) \mid d(X_t) > 0) = E(d(X_t) - d(X_{t+1}) \mid d(X_t) > 0) \ge \delta \quad (1) \]
then
\[ E(T \mid d(X_0) > 0) \le \frac{d(X_0)}{\delta} \quad (2) \]
and
\[ E(T) \le \frac{E(d(X_0))}{\delta}. \quad (3) \]
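Applying the theorem amounts to two small computations: an expected one-step decrease in distance and a division. The sketch below uses hypothetical helper names and represents a one-step distribution as a list of (decrease, probability) pairs, which is an illustrative encoding rather than anything defined in the tutorial.

```python
def expected_drift(outcomes):
    """Expected decrease in distance for a list of (decrease, probability) pairs."""
    total_probability = sum(p for _, p in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(decrease * p for decrease, p in outcomes)


def additive_drift_upper_bound(initial_distance, delta):
    """Upper bound d(X_0)/delta on the expected hitting time, as in Eq. (2)."""
    if delta <= 0:
        raise ValueError("the theorem requires a drift bound delta > 0")
    return initial_distance / delta


if __name__ == "__main__":
    n = 10
    sober = expected_drift([(1, 1.0)])
    tipsy = expected_drift([(1, 0.6), (-1, 0.4)])
    print(additive_drift_upper_bound(n, sober))  # Example 1
    print(additive_drift_upper_bound(n, tipsy))  # Example 2
```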
Additive Drift Theorem
Drift Analysis for Leading Ones
Theorem
The expected time for the (1+1)-EA to optimise LeadingOnes is $O(n^2)$.
Proof
1 Let $g(X_t) = i$ where $i$ is the number of missing leading ones;
2 The negative drift is 0, since if a leading one is removed from the current solution the new point will not be accepted;
3 A positive drift (i.e. of length at least 1) is achieved as long as the first 0 is flipped and the leading ones remain unchanged:
\[ E(\Delta^+(t)) = \sum_{k=1}^{i} k \cdot p(\Delta^+(t) = k) \ge 1 \cdot \frac{1}{n} \cdot \left(1 - \frac{1}{n}\right)^{n-1} \ge \frac{1}{en} \]
4 Hence, $E[\Delta(t)] = E(\Delta^+(t)) - E(\Delta^-(t)) \ge 1/(en) = \delta$
5 The expected runtime is (i.e. Eq. (2)):
\[ E(T \mid g(X_0) > 0) \le \frac{g(X_0)}{\delta} \le \frac{n}{1/(en)} = e \cdot n^2 = O(n^2) \]
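The drift bound e·n² can be compared with simulated runs of the (1+1)-EA on LeadingOnes. This is a minimal sketch, assuming standard bit mutation with rate 1/n, uniform random initialisation and acceptance of equally fit offspring; all names are illustrative.

```python
import math
import random


def leading_ones(bits):
    """Number of consecutive one-bits at the start of the string."""
    count = 0
    for b in bits:
        if b == 0:
            break
        count += 1
    return count


def one_plus_one_ea_leading_ones(n, rng):
    """Run the (1+1)-EA on LeadingOnes and return the number of iterations used."""
    parent = [rng.randint(0, 1) for _ in range(n)]
    steps = 0
    while leading_ones(parent) < n:
        # Standard bit mutation: flip each bit independently with probability 1/n.
        child = [b ^ 1 if rng.random() < 1.0 / n else b for b in parent]
        if leading_ones(child) >= leading_ones(parent):
            parent = child
        steps += 1
    return steps


def drift_bound(n):
    """g(X_0)/delta with g(X_0) <= n and delta = 1/(e*n), i.e. e*n^2."""
    return n / (1.0 / (math.e * n))


if __name__ == "__main__":
    rng = random.Random(5)
    n, runs = 15, 40
    mean = sum(one_plus_one_ea_leading_ones(n, rng) for _ in range(runs)) / runs
    print(f"empirical mean: {mean:.1f}, drift bound e*n^2 = {drift_bound(n):.1f}")
```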
Motivation Basic Probability Theory Evolutionary Algorithms Tail Inequalities Artificial Fitness Levels Drift Analysis Typical Run Investigations Conclusions
Additive Drift Theorem
Drift Analysis for Leading Ones
Theorem
The expected time for the (1+1)-EA to optimise LeadingOnes is O(n2)
Proof
1 Let g(Xt) = i where i is the number of missing leading ones;
2 The negative drift is 0 since if a leading one is removed from the current solutionthe new point will not be accepted;
3 A positive drift (i.e. of length 1) is achieved as long as the first 0 is flipped andthe leading ones are remained unchanged:
E(∆+(t)) =
n−iXk=1
k · (p(∆+(t)) = k) ≥ 1 · 1/n ·`1− 1/n)n−1 ≥ 1/(en)
4 Hence, E[∆(t)] = E(∆+(t))− E(∆−(t)) ≥ 1/(en) = δ
5 The expected runtime is (i.e. Eq. (6)):
E(T | d(X0) > 0) ≤g(X0)
δ≤
n
1/(en)= e · n2 = O(n2)
P. S. Oliveto & X. Yao (University of Birmingham) Runtime Analysis of EAs WCCI 2012 57 / 92
Motivation Basic Probability Theory Evolutionary Algorithms Tail Inequalities Artificial Fitness Levels Drift Analysis Typical Run Investigations Conclusions
Additive Drift Theorem
Drift Analysis for Leading Ones
Theorem
The expected time for the (1+1)-EA to optimise LeadingOnes is O(n2)
Proof
1 Let g(Xt) = i where i is the number of missing leading ones;
2 The negative drift is 0 since if a leading one is removed from the current solutionthe new point will not be accepted;
3 A positive drift (i.e. of length 1) is achieved as long as the first 0 is flipped andthe leading ones are remained unchanged:
E(∆+(t)) =
n−iXk=1
k · (p(∆+(t)) = k) ≥ 1 · 1/n ·`1− 1/n)n−1 ≥ 1/(en)
4 Hence, E[∆(t)] = E(∆+(t))− E(∆−(t)) ≥ 1/(en) = δ
5 The expected runtime is (i.e. Eq. (6)):
E(T | d(X0) > 0) ≤g(X0)
δ≤
n
1/(en)= e · n2 = O(n2)
P. S. Oliveto & X. Yao (University of Birmingham) Runtime Analysis of EAs WCCI 2012 57 / 92
Motivation Basic Probability Theory Evolutionary Algorithms Tail Inequalities Artificial Fitness Levels Drift Analysis Typical Run Investigations Conclusions
Additive Drift Theorem
Exercises
Theorem
The expected time for RLS to optimise LeadingOnes is $O(n^2)$.
Proof Left as exercise.
Theorem
Let $\lambda \ge en$. Then the expected time for the (1+λ)-EA to optimise LeadingOnes is $O(\lambda n)$.
Proof Left as exercise.
Theorem
Let $\lambda < en$. Then the expected time for the (1+λ)-EA to optimise LeadingOnes is $O(n^2)$.
Proof Left as exercise.
(1,λ)-EA Analysis for LeadingOnes
Theorem
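Let $\lambda = n$. Then the expected time for the (1,λ)-EA to optimise LeadingOnes is $O(n^2)$ (Jansen and Neumann, Tutorial).
Proof
Distance: let $d(x) = n - i$, where $i$ is the number of leading ones;
Drift:
\[ E[d(X_t) - d(X_{t+1})] \ge 1 \cdot \left(1 - \left(1 - \frac{1}{en}\right)^{n}\right) - n \cdot \left(1 - \left(1 - \frac{1}{n}\right)^{n}\right)^{n} = c_1 - n \cdot c_2^{\,n} = \Omega(1) \]
Hence,
\[ E(\text{generations}) \le \frac{\text{max distance}}{\text{drift}} = \frac{n}{\Omega(1)} = O(n) \]
and $E(T) \le n \cdot E(\text{generations}) = O(n^2)$.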
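To see why the drift expression is $\Omega(1)$, it can help to evaluate it numerically. The sketch below simply plugs values of n into the two terms of the bound; the name `comma_drift_lower_bound` is an assumption made for illustration. The first term tends to $1 - e^{-1/e}$, while the penalty term $n \cdot c_2^{\,n}$ vanishes as n grows.

```python
import math


def comma_drift_lower_bound(n):
    """Evaluate c1 - n * c2**n from the drift bound for the (1,lambda)-EA with lambda = n."""
    # c1: first term of the bound, 1 - (1 - 1/(en))^n.
    c1 = 1.0 - (1.0 - 1.0 / (math.e * n)) ** n
    # c2: base of the penalty term, 1 - (1 - 1/n)^n.
    c2 = 1.0 - (1.0 - 1.0 / n) ** n
    return c1 - n * c2 ** n
```

For very small n the expression can be negative, which is consistent with the bound being asymptotic: the claim is only that it is bounded below by a positive constant for sufficiently large n.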
Additive Drift Theorem
Theorem (Additive Drift Theorem for Lower Bounds [He and Yao, 2004])
Let $\{X_t\}_{t \ge 0}$ be a Markov process over a set of states $S$, and $d : S \to \mathbb{R}_0^+$ a function that assigns a non-negative real number to every state. Let the time to reach the optimum be $T := \min\{t \ge 0 : d(X_t) = 0\}$. If there exists $\delta > 0$ such that at any time step $t \ge 0$ and at any state with $d(X_t) > 0$ the following condition holds:
\[ E(\Delta(t) \mid d(X_t) > 0) = E(d(X_t) - d(X_{t+1}) \mid d(X_t) > 0) \le \delta \tag{4} \]
then
\[ E(T \mid d(X_0) > 0) \ge \frac{d(X_0)}{\delta} \tag{5} \]
and
\[ E(T) \ge \frac{E(d(X_0))}{\delta}. \tag{6} \]
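A toy process makes the theorem concrete. In the sketch below (a hypothetical example, not from the slides) the distance decreases by 1 with probability p and otherwise stays put, so the drift is exactly $\delta = p$ in every non-optimal state and the expected hitting time from $d_0$ is exactly $d_0 / p$. The lower bound (5) is therefore met with equality.

```python
import random


def countdown_hitting_time(d0, p, rng):
    """Steps until the distance reaches 0 when each step decreases it by 1 with probability p."""
    distance, steps = d0, 0
    while distance > 0:
        if rng.random() < p:
            distance -= 1
        steps += 1
    return steps


def mean_hitting_time(d0, p, runs, seed=0):
    """Empirical mean of the hitting time over independent seeded runs."""
    rng = random.Random(seed)
    return sum(countdown_hitting_time(d0, p, rng) for _ in range(runs)) / runs
```

For instance, `mean_hitting_time(20, 0.25, 2000)` should land close to 20 / 0.25 = 80, the value given by the drift theorem.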
Theorem
The expected time for the (1+1)-EA to optimise LeadingOnes is $\Omega(n^2)$.
Sources of progress
1 Flipping the leftmost zero-bit;
2 Bits to the right of the leftmost zero-bit that are one-bits (free riders).
Proof
1 Let the current solution have $n - i$ leading ones (i.e. $1^{n-i}0\ast$).
2 We define the distance function as the number of missing leading ones, i.e. $g(X) = i$.
3 The bit at position $n - i + 1$ is a zero;
4 Let $E[Y]$ be the expected number of consecutive one-bits immediately after the first zero (i.e. the free riders).
5 The $i - 1$ bits after the first zero were uniformly distributed at initialisation and still are!
Drift Theorem for LeadingOnes (lower bound)
Theorem
The expected time for the (1+1)-EA to optimise LeadingOnes is $\Omega(n^2)$.
The expected number of free riders is:
\[ E[Y] = \sum_{k=1}^{i-1} k \cdot \Pr(Y = k) = \sum_{k=1}^{i-1} \Pr(Y \ge k) = \sum_{k=1}^{i-1} (1/2)^k \le 1 \]
The negative drift is 0;
Let $p(A)$ be the probability that the first zero-bit flips into a one-bit.
The positive drift (i.e. the decrease in distance) is bounded as follows:
\[ E(\Delta^+(t)) \le p(A) \cdot E[\Delta^+(t) \mid A] = \frac{1}{n} \cdot (1 + E[Y]) \le \frac{2}{n} = \delta \]
Since at initialisation the expected number of free riders is also less than 2, it follows that $E[d(X_0)] \ge n - 2$.
By applying the Drift Theorem we get
\[ E(T) \ge \frac{E(d(X_0))}{\delta} \ge \frac{n - 2}{2/n} = \Omega(n^2) \]
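The two quantities used in this proof are easy to compute exactly. The sketch below, with illustrative function names, evaluates the geometric sum for $E[Y]$ with rational arithmetic and the resulting lower bound $(n-2)/(2/n) = n(n-2)/2$.

```python
from fractions import Fraction


def expected_free_riders(i):
    """Sum_{k=1}^{i-1} (1/2)^k, the expression for E[Y] with i missing leading ones."""
    return sum(Fraction(1, 2) ** k for k in range(1, i))


def leading_ones_lower_bound(n):
    """The drift lower bound (n - 2) / (2 / n) on the expected runtime."""
    return Fraction(n - 2) / Fraction(2, n)
```

The sum equals $1 - 2^{-(i-1)}$, so it stays strictly below 1 for every i, which is what the bound $1 + E[Y] \le 2$ relies on.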
Multiplicative Drift Theorem
Drift Analysis for OneMax
Let's calculate the runtime of the (1+1)-EA using the additive Drift Theorem.
1 Let $g(X_t) = i$, where $i$ is the number of zeroes in the bitstring;
2 The negative drift is 0, since a solution with fewer one-bits will not be accepted;
3 A positive drift is achieved as long as a 0 is flipped and the ones remain unchanged:
\[ E(\Delta^+(t)) = E[d(X_t) - d(X_{t+1})] \ge 1 \cdot \frac{i}{n} \left(1 - \frac{1}{n}\right)^{n-1} \ge \frac{i}{en} \ge \frac{1}{en} \]
4 Hence, $E[\Delta(t)] = E(\Delta^+(t)) - E(\Delta^-(t)) \ge 1/(en) = \delta$;
5 The expected initial distance is $E(d(X_0)) = n/2$.
The expected runtime is (i.e. Eq. (6)):
\[ E(T \mid d(X_0) > 0) \le \frac{E[d(X_0)]}{\delta} \le \frac{n/2}{1/(en)} = \frac{e}{2} \cdot n^2 = O(n^2) \]
We need a different distance function!
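A quick experiment suggests how loose the quadratic bound is. The sketch below assumes the standard (1+1)-EA (mutation probability 1/n per bit, offspring accepted if not worse); the function names are illustrative. The empirical mean can be compared against both $\frac{e}{2} n^2$ and $e \, n \ln n$.

```python
import math
import random


def one_plus_one_ea_onemax(n, rng):
    """Run the (1+1)-EA on OneMax; return the number of iterations until all bits are one."""
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = sum(x)
    steps = 0
    while fx < n:
        # Standard bit mutation: flip each bit independently with probability 1/n.
        y = [1 - b if rng.random() < 1.0 / n else b for b in x]
        fy = sum(y)
        if fy >= fx:  # accept offspring with at least as many one-bits
            x, fx = y, fy
        steps += 1
    return steps


def mean_onemax_runtime(n, runs, seed=0):
    """Average number of iterations over independent seeded runs."""
    rng = random.Random(seed)
    return sum(one_plus_one_ea_onemax(n, rng) for _ in range(runs)) / runs


def additive_bound(n):
    """The bound (e / 2) * n^2 obtained with the linear distance function."""
    return 0.5 * math.e * n * n
```

For moderate n the measured mean should sit far below `additive_bound(n)`, which is the motivation for changing the distance function.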
Drift Analysis for OneMax
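1 Let $g(X_t) = \ln(i + 1)$, where $i$ is the number of zeroes in the bitstring;
2 The negative drift is 0, since a solution with fewer one-bits will not be accepted;
3 A positive drift is achieved as long as a 0 is flipped and the ones remain unchanged; such a step reduces the distance from $\ln(i + 1)$ to $\ln(i)$, i.e. by $\ln(1 + 1/i)$:
\[ E(\Delta^+(t)) = E[d(X_t) - d(X_{t+1})] \ge \ln\left(1 + \frac{1}{i}\right) \cdot \frac{i}{n} \left(1 - \frac{1}{n}\right)^{n-1} \ge \frac{i \ln(1 + 1/i)}{en} \ge \frac{\ln(2)}{en} \]
where the last step uses $i \ln(1 + 1/i) \ge \ln 2$ for all $i \ge 1$;
4 Hence, $E[\Delta(t)] = E(\Delta^+(t)) - E(\Delta^-(t)) \ge \ln(2)/(en) = \delta$;
5 The distance is $d(X_0) \le \ln(n + 1)$.
The expected runtime is (i.e. Eq. (6)):
\[ E(T \mid d(X_0) > 0) \le \frac{d(X_0)}{\delta} \le \frac{\ln(n + 1)}{\ln(2)/(en)} = O(n \ln n) \]
If the amount of progress depends on the distance from the optimum, we need to use a logarithmic distance!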
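The constant drift $\delta = \ln(2)/(en)$ under the logarithmic distance can be sanity-checked numerically. The sketch below assumes that a step flipping exactly one zero moves the distance from $\ln(i+1)$ to $\ln(i)$, and it lower-bounds the drift by that single event; the function names are illustrative.

```python
import math


def log_potential_drift_bound(i, n):
    """Lower bound on the expected decrease of g = ln(i + 1) in one (1+1)-EA step on OneMax."""
    # Probability of the event used in the bound: flip one given zero, keep the other n - 1 bits,
    # summed over the i zeroes.
    p_single_flip = (i / n) * (1.0 - 1.0 / n) ** (n - 1)
    # Decrease of the logarithmic distance when the number of zeroes drops from i to i - 1.
    gain = math.log(i + 1) - math.log(i)
    return p_single_flip * gain


def log_potential_runtime_bound(n):
    """ln(n + 1) / delta with delta = ln(2) / (e n)."""
    delta = math.log(2) / (math.e * n)
    return math.log(n + 1) / delta
```

Evaluating `log_potential_drift_bound(i, n)` for every i from 1 to n should never fall below $\ln(2)/(en)$, so the same $\delta$ works in every state.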
Multiplicative Drift Theorem
Theorem (Multiplicative Drift, [Doerr et al., 2010])
Let $\{X_t\}_{t \in \mathbb{N}_0}$ be random variables describing a Markov process over a finite state space $S \subseteq \mathbb{R}$. Let $T$ be the random variable that denotes the earliest point in time $t \in \mathbb{N}_0$ such that $X_t = 0$. If there exist $\delta, c_{\min}, c_{\max} > 0$ such that
1 $E[X_t - X_{t+1} \mid X_t] \ge \delta X_t$ and
2 $c_{\min} \le X_t \le c_{\max}$,
for all $t < T$, then
\[ E[T] \le \frac{2}{\delta} \cdot \ln\left(1 + \frac{c_{\max}}{c_{\min}}\right) \]
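As a small illustration of how the theorem is applied, the sketch below evaluates the bound and tests it on a toy process invented for this purpose (not from the slides): in each step the state is halved, rounding down, with probability 1/2, and otherwise left unchanged. The expected decrease is then at least $X_t / 4$, so $\delta = 1/4$ with $c_{\min} = 1$.

```python
import math
import random


def multiplicative_drift_bound(delta, c_min, c_max):
    """Evaluate (2 / delta) * ln(1 + c_max / c_min)."""
    if delta <= 0 or c_min <= 0 or c_max < c_min:
        raise ValueError("need delta > 0 and 0 < c_min <= c_max")
    return (2.0 / delta) * math.log(1.0 + c_max / c_min)


def halving_hitting_time(x0, rng):
    """Toy process: with probability 1/2 replace x by x // 2; count steps until x == 0."""
    x, steps = x0, 0
    while x > 0:
        if rng.random() < 0.5:
            x //= 2
        steps += 1
    return steps


def mean_halving_time(x0, runs, seed=0):
    """Empirical mean hitting time of the toy process over seeded runs."""
    rng = random.Random(seed)
    return sum(halving_hitting_time(x0, rng) for _ in range(runs)) / runs
```

Starting from 64, the process needs seven successful halvings, each taking two steps in expectation, so the mean is about 14, comfortably below `multiplicative_drift_bound(0.25, 1, 64)`.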
(1+1)-EA Analysis for OneMax
Theorem
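The expected time for the (1+1)-EA to optimise OneMax is $O(n \ln n)$.
Proof
Distance: let $X_t = i$ be the number of zeroes;
\[ E[X_{t+1} \mid X_t = i] \le i - 1 \cdot \frac{i}{en} = i \cdot \left(1 - \frac{1}{en}\right) = X_t \cdot \left(1 - \frac{1}{en}\right) \]
\[ E[X_t - X_{t+1} \mid X_t = i] \ge X_t - X_t \cdot \left(1 - \frac{1}{en}\right) = \frac{1}{en} X_t \quad \left(\delta = \frac{1}{en}\right) \]
$1 = c_{\min} \le X_t \le c_{\max} = n$
Hence,
\[ E[T] \le \frac{2}{\delta} \cdot \ln\left(1 + \frac{c_{\max}}{c_{\min}}\right) = 2en \ln(1 + n) = O(n \ln n) \]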
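The multiplicative drift condition can be verified exactly for small n. The sketch below computes the true expected decrease in the number of zeroes for one step of the (1+1)-EA on OneMax, assuming mutation probability 1/n per bit and acceptance of offspring that are not worse, and compares it with $i/(en)$. The function names are illustrative.

```python
import math


def binom_pmf(k, m, p):
    """Probability of exactly k successes in m independent trials with success probability p."""
    return math.comb(m, k) * p ** k * (1.0 - p) ** (m - k)


def exact_onemax_drift(i, n):
    """Exact E[X_t - X_{t+1} | X_t = i] for the (1+1)-EA on OneMax with i zeroes."""
    p = 1.0 / n
    drift = 0.0
    for a in range(i + 1):  # a = number of zeroes flipped to one
        pa = binom_pmf(a, i, p)
        # b = number of ones flipped to zero; the offspring is accepted only if a >= b.
        for b in range(min(a, n - i) + 1):
            drift += (a - b) * pa * binom_pmf(b, n - i, p)
    return drift
```

Checking `exact_onemax_drift(i, n) >= i / (math.e * n)` for all i from 1 to n confirms that $\delta = 1/(en)$ is a valid choice for the theorem.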
Exercises
Theorem
The expected time for RLS to optimise OneMax is $O(n \log n)$.
Proof Left as exercise.
Theorem
Let $\lambda \ge en$. Then the expected time for the (1+λ)-EA to optimise OneMax is $O(\lambda n)$.
Proof Left as exercise.
Theorem
Let $\lambda < en$. Then the expected time for the (1+λ)-EA to optimise OneMax is $O(n \log n)$.
Proof Left as exercise.
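Before attempting the first exercise, it may help to look at the behaviour empirically. The sketch below assumes the usual definition of RLS (flip exactly one uniformly chosen bit per step and keep the offspring if it is not worse); it does not replace the proof, and the function names are illustrative.

```python
import random


def rls_onemax(n, rng):
    """Run RLS on OneMax from a uniform random start; return the number of iterations."""
    x = [rng.randint(0, 1) for _ in range(n)]
    ones = sum(x)
    steps = 0
    while ones < n:
        j = rng.randrange(n)  # RLS flips exactly one uniformly chosen bit
        if x[j] == 0:
            # Turning a zero into a one improves the fitness, so the offspring is accepted.
            x[j] = 1
            ones += 1
        # Turning a one into a zero would lower the fitness, so that offspring is rejected.
        steps += 1
    return steps


def mean_rls_runtime(n, runs, seed=0):
    """Average number of iterations over independent seeded runs."""
    rng = random.Random(seed)
    return sum(rls_onemax(n, rng) for _ in range(runs)) / runs
```

Plotting `mean_rls_runtime(n, runs)` against `n * math.log(n)` for growing n (after importing `math`) gives a feel for the growth rate the exercise asks you to prove.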
Simplified Negative Drift Theorem
Drift Analysis: Example 3
Friday night dinner at the restaurant. Peter walks back from the restaurant to the hotel, but he has had too many drinks.
The restaurant is n metres away from the hotel;
In each step, Peter moves 1 metre towards the hotel with probability 0.4.
In each step, Peter moves 1 metre away from the hotel with probability 0.6.
Question
How many steps does Peter need to reach his hotel?
At least $2^{cn}$ steps with overwhelming probability (exponential time), for some constant c > 0.
We need Negative-Drift Analysis.
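The walk can be simulated to see how rarely Peter arrives within a polynomial step budget. This is a sketch under stated assumptions: the slide does not say what happens at distance n, so the code assumes Peter cannot move further away than the restaurant; `peter_walk` and its parameters are hypothetical names.

```python
import random


def peter_walk(n, max_steps, rng, p_towards=0.4):
    """Simulate Peter's walk; return the hitting step, or None if the budget runs out."""
    distance = n                       # Peter starts at the restaurant
    for step in range(1, max_steps + 1):
        if rng.random() < p_towards:
            distance -= 1              # one metre towards the hotel
        elif distance < n:
            distance += 1              # one metre away (assumed capped at n)
        if distance == 0:
            return step
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (5, 10, 20, 40):
        hits = sum(peter_walk(n, 2000, rng) is not None for _ in range(200))
        print(n, "runs reaching the hotel within 2000 steps:", hits, "of 200")
```

The fraction of successful runs should collapse quickly as n grows, which is the behaviour the negative-drift analysis makes rigorous.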
Simplified Drift Theorem
$X_t$, $t \ge 0$ (Markov process over $S \subseteq [0, N]$)
$\Delta_t(i) := (X_{t+1} - X_t \mid X_t = i)$ for $i \in S$ and $t \ge 0$ (Drift)
$[a, b]$ (Interval in state space $S$)
$T^* := \min\{t \ge 0 : X_t \le a \mid X_0 \ge b\}$ (Time to reach $a$)
Theorem (Simplified Negative-Drift Theorem, [Oliveto and Witt, 2011])
Suppose there exist three constants δ, ε, r such that for all t ≥ 0:
1 $E(\Delta_t(i)) \ge \varepsilon$ for $a < i < b$,
2 $\mathrm{Prob}(\Delta_t(i) = -j) \le \frac{1}{(1+\delta)^{j-r}}$ for $i > a$ and $j \ge 1$.
Then there is a constant $c^* > 0$ such that $\mathrm{Prob}(T^* \le 2^{c^*(b-a)}) = 2^{-\Omega(b-a)}$.
Negative-Drift Analysis: Example (3)
Define the same distance function d(x) = x, x ∈ {0, . . . , n} (metres from the hotel) (b = n − 1, a = 1).
Estimate the increase in distance from the goal (negative drift):
$$d(X_{t+1}) - d(X_t) = \begin{cases} 0, & \text{if } X_t = 0,\\ 1, & \text{if } X_t \in \{1, \dots, n\} \text{ with probability } 0.6,\\ -1, & \text{if } X_t \in \{1, \dots, n\} \text{ with probability } 0.4. \end{cases}$$
The expected increase in distance (negative drift) is (Condition 1):
$$E[d(X_{t+1}) - d(X_t)] = 0.6 \cdot 1 + 0.4 \cdot (-1) = 0.6 - 0.4 = 0.2$$
Probability of jumps, i.e. $\mathrm{Prob}(\Delta_t(i) = -j) \le \frac{1}{(1+\delta)^{j-r}}$ (set δ = r = 1) (Condition 2):
$$\Pr(\Delta_t(i) = -j) = \begin{cases} 0 < (1/2)^{j-1}, & \text{if } j > 1,\\ 0.4 < (1/2)^0 = 1, & \text{if } j = 1. \end{cases}$$
Then the probability that Peter reaches the hotel (goal) within exponentially many steps is exponentially small:
$$\Pr(T \le 2^{c(b-a)}) = \Pr(T \le 2^{c(n-2)}) = 2^{-\Omega(n)}$$
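The two conditions for Peter's walk can also be checked mechanically with exact arithmetic. The sketch below assumes the one-step change in distance is +1 with probability 0.6 and −1 with probability 0.4, as in the problem statement; `STEP_DISTRIBUTION`, `expected_drift` and `condition_2_holds` are hypothetical names introduced for illustration.

```python
from fractions import Fraction

# One-step change in distance: +1 (away) w.p. 0.6, -1 (towards) w.p. 0.4.
STEP_DISTRIBUTION = {1: Fraction(3, 5), -1: Fraction(2, 5)}


def expected_drift(dist):
    """Condition 1: the expected one-step change E(Delta) as an exact fraction."""
    return sum(change * prob for change, prob in dist.items())


def condition_2_holds(dist, delta, r, max_jump):
    """Condition 2: Prob(Delta = -j) <= 1 / (1 + delta)^(j - r) for 1 <= j <= max_jump."""
    for j in range(1, max_jump + 1):
        prob = dist.get(-j, Fraction(0))
        bound = Fraction(1) / (Fraction(1) + delta) ** (j - r)
        if prob > bound:
            return False
    return True


if __name__ == "__main__":
    print("E(Delta) =", expected_drift(STEP_DISTRIBUTION))
    print("Condition 2 with delta = r = 1:",
          condition_2_holds(STEP_DISTRIBUTION, delta=1, r=1, max_jump=10))
```

Using `Fraction` avoids floating-point rounding, so the comparison against ε and against the jump bound is exact.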
Needle in a Haystack
Theorem
Let η > 0 be constant. Then there is a constant c > 0 such that with probability $1 - 2^{-\Omega(n)}$ the (1+1)-EA on Needle creates only search points with at most n/2 + ηn ones in $2^{cn}$ steps.
Proof Idea
1 By Chernoff bounds the probability that the initial bit string has less than n/2 − γn zeroes is $e^{-\Omega(n)}$;
2 We set b := n/2 − γn and a := n/2 − 2γn where γ := η/2;
3 Let $X_t$ denote the number of zeroes in the bit string at time step t;
4 Let ∆(i) denote the random increase of the number of zeroes (i.e. the drift);
5 Now we have to check that the two conditions of the Simplified Drift Theorem hold.
Needle in a Haystack (2)
Proof of Condition 1
Condition 1 holds if E(∆(i)) ≥ ε for some constant ε > 0.
1 The (1+1)-EA flips 0-bits and 1-bits independently with probability 1/n.
2 Since there are i zero-bits and n − i one-bits in the current string, the expected number of zero-bits flipped into one-bits is i/n and vice versa is (n − i)/n.
3 Hence, for i < b = n/2 − γn,
$$E(\Delta(i)) = \frac{n-i}{n} - \frac{i}{n} = \frac{n-2i}{n} \ge 2\gamma = \varepsilon$$
and Condition 1 holds.
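The drift computation can be verified exactly for concrete values of n and γ. The sketch below assumes, as the proof does, that every offspring is accepted on the Needle plateau, so the drift is determined by the mutation operator alone; `needle_drift` and `condition_1_holds` are hypothetical helper names.

```python
from fractions import Fraction


def needle_drift(n, i):
    """Exact expected change in the number of zeroes from a string with i zeroes."""
    zeros_lost = Fraction(i, n)         # expected number of 0-bits flipped to 1
    zeros_gained = Fraction(n - i, n)   # expected number of 1-bits flipped to 0
    return zeros_gained - zeros_lost    # equals (n - 2i) / n


def condition_1_holds(n, gamma):
    """Check E(Delta(i)) >= 2 * gamma for every integer i with a < i < b."""
    gamma = Fraction(gamma)
    a = Fraction(n, 2) - 2 * gamma * n
    b = Fraction(n, 2) - gamma * n
    epsilon = 2 * gamma
    return all(needle_drift(n, i) >= epsilon
               for i in range(n + 1) if a < i < b)


if __name__ == "__main__":
    print(needle_drift(100, 40))
    print(condition_1_holds(100, Fraction(1, 10)))
```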
Needle in a Haystack (3)
Proof of Condition 2
Condition 2 is: $\mathrm{Prob}(\Delta(i) = -j) \le 1/(1+\delta)^{j-r}$ for $j \in \mathbb{N}_0$.
1 Since to reach state i − j or less from state i at least j bits have to flip, the probability of a drift of length j can be bounded as follows:
$$\mathrm{Prob}(\Delta(i) \le -j) \le \binom{n}{j}\left(\frac{1}{n}\right)^{j} \le \frac{1}{j!} \le \left(\frac{1}{2}\right)^{j-1}$$
This proves Condition 2 by setting δ = r = 1.
Applying the Simplified Drift Theorem
For a constant $c^* > 0$ the global optimum is found in $2^{c^*(b-a)} = 2^{cn}$ steps with probability at most $2^{-\Omega(b-a)} = 2^{-\Omega(n)}$.
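The chain of inequalities used for Condition 2 can be checked numerically for any concrete n. This is an illustrative sketch using exact fractions; `jump_bounds` and `chain_holds` are hypothetical names, and `math.comb` requires Python 3.8 or later.

```python
import math
from fractions import Fraction


def jump_bounds(n, j):
    """Return the three terms C(n, j) / n^j, 1 / j! and (1/2)^(j - 1) as exact fractions."""
    union = Fraction(math.comb(n, j), n ** j)
    inv_factorial = Fraction(1, math.factorial(j))
    geometric = Fraction(1, 2) ** (j - 1)
    return union, inv_factorial, geometric


def chain_holds(n):
    """Check C(n, j) / n^j <= 1 / j! <= (1/2)^(j - 1) for every jump length 1 <= j <= n."""
    return all(u <= f <= g
               for u, f, g in (jump_bounds(n, j) for j in range(1, n + 1)))


if __name__ == "__main__":
    print(jump_bounds(10, 2))
    print(chain_holds(50))
```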
Exercise: Trap Functions
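$$\mathrm{Trap}(x) = \begin{cases} n + 1 & \text{if } x = 0^n,\\ \mathrm{OneMax}(x) & \text{otherwise.} \end{cases}$$
[Figure: fitness f(x) plotted against ones(x); f(x) equals ones(x) for 1, 2, 3, . . . , n, with an isolated peak of n + 1 at ones(x) = 0.]
Theorem
With overwhelming probability at least $1 - 2^{-\Omega(n)}$ the (1+1)-EA requires $2^{\Omega(n)}$ steps to optimise Trap.
Proof
Left as exercise.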
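To get a feel for why Trap is deceptive, the function and a basic (1+1)-EA can be written in a few lines. This is a sketch, assuming standard bit mutation with rate 1/n and acceptance of offspring that are not worse; the names `trap` and `one_plus_one_ea` are hypothetical.

```python
import random


def trap(x):
    """Trap(x) = n + 1 for the all-zeros string, OneMax(x) otherwise."""
    ones = sum(x)
    return len(x) + 1 if ones == 0 else ones


def one_plus_one_ea(fitness, n, max_steps, rng):
    """(1+1)-EA with standard bit mutation (rate 1/n); returns the final search point."""
    x = [rng.randint(0, 1) for _ in range(n)]
    for _ in range(max_steps):
        # Flip each bit independently with probability 1/n.
        y = [bit ^ 1 if rng.random() < 1.0 / n else bit for bit in x]
        if fitness(y) >= fitness(x):   # accept the offspring if it is not worse
            x = y
    return x


if __name__ == "__main__":
    final = one_plus_one_ea(trap, 30, 5000, random.Random(0))
    print("ones in final point:", sum(final), "fitness:", trap(final))
```

In typical runs the algorithm follows the OneMax slope to the all-ones string, from which the optimum can only be reached by flipping every bit in a single step.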
Drift Analysis Conclusion
Overview
Additive Drift Analysis (upper and lower bounds);
Multiplicative Drift Analysis;
Simplified Negative-Drift Theorem;
Advanced Lower bound Drift Techniques
Drift Analysis for Stochastic Populations (mutation) [Lehre, 2010];
Simplified Drift Theorem combined with bandwidth analysis (mutation + crossover stochastic populations = GAs) [Oliveto and Witt, 2012];
Typical run investigations
Typical Runs have been introduced by considering that
the “global behaviour of a process” is predictable with high probability;
the “local behaviour” is quite unpredictable;
Method
1 Divide the process into k phases which are long enough to ensure that some event happens in phase i with probability $p_i$;
2 Hence, the event does not happen with failure probability $1 - p_i$;
3 The last phase should lead the EA to the global optimum;
4 By the union bound, the overall failure probability is at most $\sum_{i=1}^{k} (1 - p_i)$;
5 The runtime is at most the sum of the phase lengths.
Generally Chernoff bounds are used to obtain the failure probabilities!
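One common way to combine the phases is the union bound: the probability that at least one phase fails is at most the sum of the individual failure probabilities $1 - p_i$. A small sketch of the bookkeeping follows; the function names and the example numbers are hypothetical.

```python
def union_bound_failure(phase_success_probs):
    """Upper bound on the overall failure probability via the union bound."""
    total = sum(1.0 - p for p in phase_success_probs)
    return min(1.0, total)             # a probability bound above 1 is vacuous


def total_runtime_bound(phase_lengths):
    """If every phase succeeds, the runtime is at most the sum of the phase lengths."""
    return sum(phase_lengths)


if __name__ == "__main__":
    # Hypothetical example: three phases with illustrative success probabilities.
    print(union_bound_failure([0.99, 0.98, 0.97]))
    print(total_runtime_bound([10, 200, 3000]))
```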
(1+1)-EA for Trap functions
Theorem
With overwhelming probability at least $1 - e^{-\Omega(n)}$ the (1+1)-EA requires $n^{\Omega(n)}$ steps to optimise Trap.
The deceptive trap function is the following:
$$\mathrm{Trap}(x) = \begin{cases} n + 1 & \text{if } x = 0^n,\\ \mathrm{OneMax}(x) & \text{otherwise.} \end{cases}$$
Proof Idea
We use the typical run investigations method and split the analysis into the following three phases:
1 This phase ends when at least n/3 one-bits are in the current solution.
2 This phase ends when the $1^n$ bitstring is the current solution.
3 This phase ends when the global optimum is found.
(1+1)-EA for Trap functions (2)
We need to show that the runtime required to complete the three phases is $n^{\Omega(n)}$ and that the failure probability of each phase is at most $e^{-\Omega(n)}$.
Phase 1
This phase ends when at least n/3 one-bits are in the current solution.
Proof
1 By using Chernoff bounds we calculate the probability that the algorithm is initialised with less than n/3 one-bits.
2 n/2 expected one-bits after initialisation (i.e. E(X) = n · p = n/2).
3 Setting δ = 1/3 we get
$$P(X \le n/3) \le e^{-E(X)\delta^2/2} = e^{-(n/2)\cdot(1/9)\cdot(1/2)} = e^{-n/36} = e^{-\Omega(n)}$$
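The Chernoff estimate can be compared with the exact binomial tail for moderate n. The sketch below assumes uniform random initialisation, so the number of one-bits X is Binomial(n, 1/2); `exact_tail` and `chernoff_bound` are hypothetical names.

```python
import math
from fractions import Fraction


def exact_tail(n):
    """Exact P(X <= n/3) for X ~ Binomial(n, 1/2)."""
    threshold = n // 3                 # X is an integer, so X <= n/3 iff X <= floor(n/3)
    favourable = sum(math.comb(n, k) for k in range(threshold + 1))
    return Fraction(favourable, 2 ** n)


def chernoff_bound(n):
    """The bound e^{-n/36} obtained with delta = 1/3."""
    return math.exp(-n / 36)


if __name__ == "__main__":
    for n in (30, 90, 300):
        print(n, float(exact_tail(n)), chernoff_bound(n))
```

The exact tail is considerably smaller than the bound, which is expected: the Chernoff bound is convenient rather than tight.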
(1+1)-EA for Trap functions (3)
Phase 2
This phase ends when the $1^n$ bitstring is the current solution.
Proof
1 The probability of reaching the $0^n$ bitstring in one step is less than $n^{-n/3}$;
2 In expected time c · n log n the algorithm climbs up OneMax;
3 By Markov's inequality,
$$P(X \ge e \cdot c \cdot n \log n) \le \frac{c \cdot n \log n}{e \cdot c \cdot n \log n} = 1/e$$
4 By applying it iteratively: $P(X \ge e \cdot c \cdot n^2 \log n) \le e^{-n}$
5 The probability that in $c \cdot n^2 \log n$ steps n/3 precise bits flip is at most $c \cdot n^2 \log n \cdot n^{-n/3} = n^{-\Omega(n)}$ (union bound).
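Both failure probabilities of Phase 2 are far too small to represent as floating-point numbers for realistic n, so it is convenient to evaluate them in log-space. This is a sketch; the constant `c` is left unspecified in the proof, so `c=1.0` below is an arbitrary placeholder, and the function names are hypothetical.

```python
import math


def log_markov_iterated(n):
    """Natural log of (1/e)^n: Markov's inequality applied to n consecutive blocks."""
    return -float(n)


def log_union_bound(n, c=1.0):
    """Natural log of c * n^2 * log(n) * n^(-n/3); requires n >= 2."""
    return (math.log(c) + 2 * math.log(n) + math.log(math.log(n))
            - (n / 3) * math.log(n))


if __name__ == "__main__":
    for n in (30, 100, 300):
        print(n, log_markov_iterated(n), round(log_union_bound(n), 1))
```

The polynomial factor $c \cdot n^2 \log n$ contributes only O(log n) to the logarithm, so the term $-(n/3)\ln n$ dominates.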
(1+1)-EA for Trap functions (4)
Phase 3
This phase ends when the global optimum is found.
Proof
1 The expected time to reach the global optimum is $n^n$ because all the bits have to be flipped at the same time;
2 By Chernoff bounds the probability that the time is less than $(1/2)n^n$ is (i.e. δ = 1/2):
$$P(T \le (1/2)n^n) \le e^{-n^n \cdot 1/4 \cdot 1/2} = e^{-n^n \cdot 1/8} = e^{-\Omega(n)}$$
Summing up
1 The algorithm requires exponential runtime $1 + c \cdot n^2 \log n + (1/2)n^n = n^{\Omega(n)}$
2 with probability at least $1 - e^{-\Omega(n)}$.
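As a cross-check, the waiting time can be modelled directly. The sketch below assumes that, starting from $1^n$, each step succeeds independently with probability $n^{-n}$, so that T is geometric; this modelling choice and the function names are assumptions of the sketch, not taken from the slides. Under this model the union bound gives $P(T \le t) \le t \cdot n^{-n}$, so the threshold $t = n^{n/2}$, which is still $n^{\Omega(n)}$, yields a failure probability of at most $n^{-n/2} = e^{-\Omega(n)}$. With $t = (1/2)n^n$ this particular bound only gives 1/2, which is why the sketch uses the smaller threshold.

```python
import math


def log_union_bound_hit(n, log_t):
    """ln of the union bound t * n^(-n) on P(T <= t), given ln(t)."""
    return log_t - n * math.log(n)


def log_threshold(n):
    """ln of the threshold t = n^(n/2), which is still n^{Omega(n)}."""
    return (n / 2) * math.log(n)


if __name__ == "__main__":
    for n in (10, 20, 40):
        # ln of the failure probability bound for t = n^(n/2)
        print(n, round(log_union_bound_hit(n, log_threshold(n)), 2))
```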
ExpectationTrap
$$\mathrm{ExpectationTrap}(x) = \begin{cases} n - 1/2 & \text{if } x = 0^n,\\ \mathrm{OneMax}(x) & \text{otherwise.} \end{cases}$$
Theorem
The expected time for the (1+1)-EA to optimise ExpectationTrap is $\Omega((n/2)^n)$.
[Jansen and Neumann, Tutorial]
Observation
$\mathrm{Prob}(\text{Local}) = 2^{-n}$ (probability the algorithm is initialised with the $0^n$ bitstring)
$E(T \mid X_0 = 0^n) = n^n$ (all bits need to be flipped)
Proof
$E(T) \ge E(T \mid X_0 = 0^n) \cdot \mathrm{Prob}(X_0 = 0^n) = n^n \cdot 2^{-n} = (n/2)^n$ (law of total probability)
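The lower bound is a product of two exact quantities, so it can be evaluated without rounding for small n. This is a minimal sketch; `expectation_trap_lower_bound` is a hypothetical name.

```python
from fractions import Fraction


def expectation_trap_lower_bound(n):
    """E(T | X0 = 0^n) * Prob(X0 = 0^n) = n^n * 2^(-n), computed exactly."""
    conditional_time = n ** n              # all n bits must flip in the same step
    prob_all_zeros = Fraction(1, 2 ** n)   # uniform random initialisation
    return conditional_time * prob_all_zeros


if __name__ == "__main__":
    for n in (4, 5, 6, 10):
        print(n, expectation_trap_lower_bound(n))
```

Although the bad initialisation is exponentially unlikely, its conditional cost $n^n$ is large enough to dominate the expectation, which is the point of this example.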
Overview
Final Overview
Overview
Basic Probability Theory
Tail Inequalities
Artificial Fitness Levels
Drift Analysis
Typical Runs
Other Techniques (Not covered)
Family Trees [Witt, 2006]
Gambler’s Ruin & Martingales [Jansen and Wegener, 2001]
State-of-the-art
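State of the Art in Computational Complexity of RSHs

| Problem | Algorithm | Result |
|---|---|---|
| OneMax | (1+1) EA | $O(n \log n)$ |
| | (1+λ) EA | $O(\lambda n + n \log n)$ |
| | (µ+1) EA | $O(\mu n + n \log n)$ |
| | 1-ANT | $O(n^2)$ w.h.p. |
| | (µ+1) IA | $O(\mu n + n \log n)$ |
| Linear Functions | (1+1) EA | $\Theta(n \log n)$ |
| | cGA | $\Theta(n^{2+\varepsilon})$, $\varepsilon > 0$ const. |
| Max. Matching | (1+1) EA | $e^{\Omega(n)}$, PRAS |
| Sorting | (1+1) EA | $\Theta(n^2 \log n)$ |
| SS Shortest Path | (1+1) EA | $O(n^3 \log(n w_{\max}))$ |
| | MO (1+1) EA | $O(n^3)$ |
| MST | (1+1) EA | $\Theta(m^2 \log(n w_{\max}))$ |
| | (1+λ) EA | $O(n \log(n w_{\max}))$ |
| | 1-ANT | $O(mn \log(n w_{\max}))$ |
| Max. Clique (rand. planar) | (1+1) EA | $\Theta(n^5)$ |
| | (16n+1) RLS | $\Theta(n^{5/3})$ |
| Eulerian Cycle | (1+1) EA | $\Theta(m^2 \log m)$ |
| Partition | (1+1) EA | 4/3 approx., competitive avg. |
| Vertex Cover | (1+1) EA | $e^{\Omega(n)}$, arb. bad approx. |
| Set Cover | (1+1) EA | $e^{\Omega(n)}$, arb. bad approx. |
| | SEMO | Pol. $O(\log n)$-approx. |
| Intersection of $p \ge 3$ matroids | (1+1) EA | $1/p$-approximation in $O(\lvert E \rvert^{p+2} \log(\lvert E \rvert w_{\max}))$ |
| UIO/FSM conf. | (1+1) EA | $e^{\Omega(n)}$ |

See [Oliveto et al., 2007] for an overview.
P. K. Lehre, 2008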
Further Reading
[Auger and Doerr, 2011, Neumann and Witt, 2010]
Upcoming Conference
References I
Auger, A. and Doerr, B. (2011).
Theory of Randomized Search Heuristics: Foundations and Recent Developments. World Scientific Publishing Co., Inc., River Edge, NJ, USA.
Back, T. (1993).
Optimal mutation rates in genetic search. In Proceedings of the Fifth International Conference on Genetic Algorithms (ICGA), pages 2–8.
Doerr, B., Johannsen, D., and Winzen, C. (2010).
Multiplicative drift analysis. In Proceedings of the 12th annual conference on Genetic and evolutionary computation, GECCO ’10, pages 1449–1456. ACM.
Droste, S., Jansen, T., and Wegener, I. (1998).
A rigorous complexity analysis of the (1+1) evolutionary algorithm for separable functions with Boolean inputs. Evolutionary Computation, 6(2):185–196.
Droste, S., Jansen, T., and Wegener, I. (2002).
On the analysis of the (1+1) evolutionary algorithm. Theoretical Computer Science, 276(1-2):51–81.
Goldberg, D. E. (1989).
Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley.
He, J. and Yao, X. (2001).
Drift analysis and average time complexity of evolutionary algorithms. Artificial Intelligence, 127(1):57–85.
References II
He, J. and Yao, X. (2004).
A study of drift analysis for estimating computation time of evolutionary algorithms. Natural Computing: an international journal, 3(1):21–35.
Holland, J. H. (1992).
Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. The MIT Press.
Jansen, T., De Jong, K. A., and Wegener, I. (2005).
On the choice of the offspring population size in evolutionary algorithms. Evolutionary Computation, 13(4):413–440.
Jansen, T. and Wegener, I. (2001).
Evolutionary algorithms - how to cope with plateaus of constant fitness and when to reject strings of the same fitness. IEEE Trans. Evolutionary Computation, 5(6):589–599.
Lehre, P. K. (2010).
Negative drift in populations. In PPSN (1), pages 244–253.
Lehre, P. K. (2011).
Fitness-levels for non-elitist populations. In Proceedings of the 13th annual conference on Genetic and evolutionary computation, GECCO ’11, pages 2075–2082. ACM.
Neumann, F. and Witt, C. (2010).
Bioinspired Computation in Combinatorial Optimization: Algorithms and Their Computational Complexity. Springer-Verlag New York, Inc., New York, NY, USA, 1st edition.
References III
Oliveto, P. and Witt, C. (2012).
On the analysis of the simple genetic algorithm (to appear). In Proceedings of the 14th annual conference on Genetic and evolutionary computation, GECCO ’12, pages –. ACM.
Oliveto, P. S., He, J., and Yao, X. (2007).
Time complexity of evolutionary algorithms for combinatorial optimization: A decade of results. International Journal of Automation and Computing, 4(3):281–293.
Oliveto, P. S. and Witt, C. (2011).
Simplified drift analysis for proving lower bounds in evolutionary computation. Algorithmica, 59(3):369–386.
Reeves, C. R. and Rowe, J. E. (2002).
Genetic Algorithms: Principles and Perspectives: A Guide to GA Theory. Kluwer Academic Publishers, Norwell, MA, USA.
Rudolph, G. (1998).
Finite Markov chain results in evolutionary computation: A tour d’horizon. Fundamenta Informaticae, 35(1–4):67–89.
Sudholt, D. (2010).
General lower bounds for the running time of evolutionary algorithms. In PPSN (1), pages 124–133.
Witt, C. (2006).
Runtime analysis of the (µ+1) EA on simple pseudo-Boolean functions. In GECCO ’06: Proceedings of the 8th annual conference on Genetic and evolutionary computation, pages 651–658, New York, NY, USA. ACM Press.