Pairwise and Problem-Specific Distance Metrics in the Linkage Tree Genetic Algorithm
Martin Pelikan (1), Mark W. Hauschild (1), Dirk Thierens (2)
(1) Missouri Estimation of Distribution Algorithms Laboratory (MEDAL), University of Missouri, St. Louis, MO
(2) Utrecht University, Utrecht, The Netherlands
Download MEDAL Report No. 2011001
http://medal.cs.umsl.edu/files/2011001.pdf
Martin Pelikan, Mark W. Hauschild, Dirk Thierens Pairwise and Problem-Specific Metrics in LTGA
Motivation
Linkage learning
- Standard crossover often ineffective in presence of epistasis.
- Linkage learning aims to learn interactions between problem variables to ensure that crossover does not disrupt important partial solutions and that it combines them effectively.
- Various evolutionary algorithms capable of linkage learning exist.
This study
- Focuses on linkage tree genetic algorithm (LTGA).
- Proposes and analyzes two distance metrics in LTGA.
- Analyzes LTGA scalability on a large number of problems.
Outline
1. Linkage tree genetic algorithm (LTGA).
2. Distance metrics in LTGA.
3. Experiments.
4. Summary and conclusions.
Linkage Tree
Linkage tree
- Leaves are individual variables (string positions).
- Each internal node has two subtrees.
- Each node represents a subset of variables (descendants).
- Descendants of any node form a linkage group.
- Linkage groups used as masks in LTGA crossover.
Linkage Tree Genetic Algorithm
LTGA procedure
- Starts with a random population.
- Initial population may undergo local search.
- Each generation performs two rounds of crossover to generate a new population of the same size.
LTGA crossover
- Start with pair (X, Y) of parents.
- For each linkage group [π_1, π_2, ..., π_k] in T (bottom to top):
  - Create X′ and Y′ by exchanging bits in positions {π_1, ..., π_k} between X and Y.
  - If best(X′, Y′) is better than best(X, Y), then replace (X, Y) with (X′, Y′).
- The best of the two parents after applying each linkage group survives to the next population.
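The crossover steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `ltga_crossover`, the representation of solutions as lists of bits, and the assumption that fitness is maximized and that `linkage_groups` is already ordered bottom to top are all choices made for this sketch.

```python
from typing import Callable, Iterable, List, Sequence


def ltga_crossover(
    x: Sequence[int],
    y: Sequence[int],
    linkage_groups: Iterable[Iterable[int]],
    fitness: Callable[[Sequence[int]], float],
) -> List[int]:
    """Use each linkage group as a crossover mask; keep a swap only if it improves.

    Assumptions of this sketch: fitness is maximized and `linkage_groups`
    is already ordered bottom to top.
    """
    x, y = list(x), list(y)  # work on copies so the parents are not mutated
    best = max(fitness(x), fitness(y))
    for group in linkage_groups:
        x_new, y_new = list(x), list(y)
        for pos in group:
            # Exchange the bits at the positions covered by this linkage group.
            x_new[pos], y_new[pos] = y[pos], x[pos]
        new_best = max(fitness(x_new), fitness(y_new))
        if new_best > best:
            # best(X', Y') is better than best(X, Y): replace the pair.
            x, y, best = x_new, y_new, new_best
    # The better of the two resulting solutions survives.
    return x if fitness(x) >= fitness(y) else y
```

With a simple bit-counting fitness, `ltga_crossover([1, 1, 0, 0], [0, 0, 1, 1], [[0], [1], [2], [3], [0, 1], [2, 3]], sum)` accepts the swaps at positions 0 and 1 and rejects the rest, since no later swap improves on the best solution found.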
Learning Linkage Tree
Learning linkage tree
- Start with each variable being a separate linkage group.
- Each step merges two closest groups.
- Distance of linkage groups based on variation of information.
- Each iteration should merge most strongly interacting groups.
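The merging procedure above is a form of agglomerative clustering and could be sketched as follows. The function name `learn_linkage_tree` and the `distance` callback are hypothetical; any cluster distance can be plugged in. The sketch rescans all pairs at every step for clarity, which a practical implementation would likely avoid.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List

Group = FrozenSet[int]


def learn_linkage_tree(
    n: int, distance: Callable[[Group, Group], float]
) -> List[Group]:
    """Return all linkage groups in order of creation (leaves first, root last)."""
    active = [frozenset([i]) for i in range(n)]  # each variable starts alone
    groups = list(active)
    while len(active) > 1:
        # Find the two closest groups among those not yet merged.
        a, b = min(combinations(active, 2), key=lambda p: distance(p[0], p[1]))
        merged = a | b
        active = [g for g in active if g not in (a, b)]
        active.append(merged)
        groups.append(merged)
    return groups
```

For n variables the result contains 2n - 1 groups: n leaves and n - 1 internal nodes. Because groups are returned in order of creation, iterating over the list visits the tree bottom to top.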
Measuring Cluster Distances in LTGA
Distance metric based on variation of information
- Distance of clusters C_i and C_j:
$$D(C_i, C_j) = 2 - \frac{H(C_i) + H(C_j)}{H(C_i, C_j)}$$
  where
  - H(C_i, C_j) is the entropy of C_i ∪ C_j
  - H(C_i) is the entropy of C_i
  - H(C_j) is the entropy of C_j
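As a rough illustration, the distance above can be estimated from a population by counting how often each joint configuration occurs. The names `entropy` and `vi_distance` are hypothetical, entropies are empirical estimates in bits, and the value returned when the joint entropy is zero (all individuals identical at those positions, where the formula is undefined) is an assumption of this sketch.

```python
from collections import Counter
from math import log2
from typing import Iterable, Sequence


def entropy(population: Sequence[Sequence[int]], positions: Iterable[int]) -> float:
    """Empirical entropy (in bits) of the joint values at `positions`."""
    pos = sorted(positions)
    counts = Counter(tuple(ind[p] for p in pos) for ind in population)
    total = len(population)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def vi_distance(
    population: Sequence[Sequence[int]], ci: Iterable[int], cj: Iterable[int]
) -> float:
    """D(Ci, Cj) = 2 - (H(Ci) + H(Cj)) / H(Ci, Cj)."""
    ci, cj = set(ci), set(cj)
    h_joint = entropy(population, ci | cj)
    if h_joint == 0.0:
        # Assumption: the formula is undefined here; treat as unrelated.
        return 1.0
    return 2.0 - (entropy(population, ci) + entropy(population, cj)) / h_joint
```

Every call scans the whole population and tabulates joint configurations of all variables in both clusters, which gives a sense of why distance computation dominates the cost of learning the tree.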
Bottleneck in learning linkage tree
- Most time is spent measuring cluster distances.
- Can we alleviate this bottleneck?
- We discuss two distance metrics that address this issue.
Pairwise Metric
Pairwise metric
- Start by measuring distances between pairs of variables.
- Cluster distance computed as average distance between pairs of variables:
$$D'(C_i, C_j) = \frac{1}{|C_i| \times |C_j|} \sum_{c_i \in C_i} \sum_{c_j \in C_j} D(c_i, c_j)$$
Good news
- We only need pairwise statistics.
- This results in much faster distance computation.
- Surprisingly, this also helps scalability.
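A minimal sketch of the averaging rule for D′, assuming the pairwise distances D(c_i, c_j) have already been computed once and stored in a matrix `d` (a list of lists). The function names are hypothetical. The second function uses an identity that follows directly from the definition of the average for disjoint clusters; it is included only to illustrate why no further population statistics are needed after a merge, not as a description of the authors' code.

```python
from typing import Iterable, Sequence


def pairwise_cluster_distance(
    d: Sequence[Sequence[float]], ci: Iterable[int], cj: Iterable[int]
) -> float:
    """Average of the pairwise distances d[a][b] over a in Ci and b in Cj."""
    ci, cj = list(ci), list(cj)
    total = sum(d[a][b] for a in ci for b in cj)
    return total / (len(ci) * len(cj))


def merged_distance(d_ik: float, d_jk: float, size_i: int, size_j: int) -> float:
    """Distance from the merged cluster Ci ∪ Cj to a third cluster Ck.

    Size-weighted mean of the two previous distances; valid when Ci and Cj
    are disjoint, which always holds for groups being merged in the tree.
    """
    return (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
```

Once the matrix `d` is filled in, every later cluster distance is simple arithmetic on stored numbers, with no further pass over the population.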
Problem-Specific Metrics
Basic idea
- If we could estimate distance of clusters without computing statistics from current population, we could possibly
  - save a lot of time in learning the tree, and
  - reduce the population sizes and number of generations.
Where to get distances from?
- Problem-specific information.
- Learning from optimization runs on similar problems.
Additively Decomposable Functions (ADFs)
Additively decomposable function
- Additively decomposable function:
$$f(X_1, \ldots, X_n) = \sum_{i=1}^{m} f_i(S_i)$$
  - f_i is the ith subfunction
  - S_i is a subset of variables from {X_1, ..., X_n}
- Variables located in the same subproblem are expected to interact more strongly.
- Can we use this fact to create a distance metric for LTGA?
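To make the decomposition concrete, here is a small sketch in which an additively decomposable function is represented as a list of (S_i, f_i) pairs, with S_i given as a tuple of variable indices. This representation and the name `evaluate_adf` are assumptions made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

# One subproblem: the indices in S_i and the subfunction f_i applied to them.
Subproblem = Tuple[Sequence[int], Callable[[List[int]], float]]


def evaluate_adf(x: Sequence[int], subproblems: Sequence[Subproblem]) -> float:
    """f(X_1, ..., X_n) = sum over i of f_i applied to the variables in S_i."""
    return sum(f_i([x[p] for p in s_i]) for s_i, f_i in subproblems)
```

The subsets S_i are known before any optimization run starts, which is what makes them usable as a source of distance information.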
Problem-Specific Metric for ADFs
Distance metric for ADFs
- Create graph G = (V, E).
  - V = {X_1, X_2, ..., X_n}.
  - E = {(i, j) : X_i, X_j ∈ S_k for some k}.
- Define weight of each edge from E as d(i, j) = 1.
- Define l_{i,j} as the length of the shortest path between i and j.
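- Use G to compute distances between variables:
$$D''(X_i, X_j) = \begin{cases} l_{i,j} & \text{if a path between } X_i \text{ and } X_j \text{ exists} \\ n & \text{otherwise} \end{cases}$$
- Cluster distance is defined as an average of pairwise distances:
$$D''(C_i, C_j) = \frac{1}{|C_i| \times |C_j|} \sum_{c_i \in C_i} \sum_{c_j \in C_j} D''(c_i, c_j)$$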
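Because every edge has weight 1, the shortest-path lengths can be found with a breadth-first search from each variable. The sketch below is one possible way to build the matrix of D′′ values; the function name `adf_distance_matrix` and the 0-based variable indices are assumptions, and the diagonal is set to 0, which the definition above leaves unspecified.

```python
from collections import deque
from itertools import combinations
from typing import Iterable, List, Sequence


def adf_distance_matrix(n: int, subsets: Iterable[Sequence[int]]) -> List[List[int]]:
    """Shortest-path distances in the graph linking variables that share a subset.

    Entries are n where no path exists. The value n also serves as the
    "not yet visited" marker, since a real shortest path has at most n - 1 edges.
    """
    adj = [set() for _ in range(n)]
    for s in subsets:
        for i, j in combinations(s, 2):
            adj[i].add(j)
            adj[j].add(i)
    dist = [[n] * n for _ in range(n)]
    for src in range(n):
        dist[src][src] = 0  # assumption: distance of a variable to itself is 0
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if dist[src][v] == n:  # first visit gives the shortest distance
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist
```

The resulting matrix depends only on the problem definition, so it can be computed once before the run and averaged over clusters exactly as in the pairwise metric.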
Experiments: Test Problems
Problems
- Concatenated traps of order k.
- Nearest-neighbor NK landscapes with wrap-around neighborhoods.
- 2D Ising spin glass.
Why these test problems?
- All test problems require linkage learning.
- All test problems are nontrivial.
- Yet all test problems are solvable in polynomial time.
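The slides do not spell out the trap function. The sketch below uses the commonly used fully deceptive trap of order k (fitness k when all k bits are 1, and k - 1 - u otherwise, where u is the number of ones), applied to consecutive non-overlapping blocks; both the definition and the block layout are assumptions here rather than something stated in the deck.

```python
from typing import Sequence


def trap(block: Sequence[int]) -> int:
    """Deceptive trap of order k = len(block) (assumed standard definition)."""
    k, u = len(block), sum(block)
    return k if u == k else k - 1 - u


def concatenated_traps(x: Sequence[int], k: int = 5) -> int:
    """Sum of traps over consecutive, non-overlapping blocks of k bits."""
    if len(x) % k != 0:
        raise ValueError("string length must be a multiple of k")
    return sum(trap(x[i:i + k]) for i in range(0, len(x), k))
```

Within a block, every additional 1 lowers the fitness until the block is complete, so the k bits of a block have to be treated together. This is an instance of an additively decomposable function whose subsets are the blocks.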
Experiments: Setup
Test problem parameters, instances
- Traps of order k ∈ {5, 6, 7, 8} were tested.
- NK landscapes with k = 5 were tested.
- For all problems, n was varied.
- For NK landscapes and spin glasses, for each n, 1,000 instances were generated and tested.
LTGA setup
- Bisection was used to find minimum population size for convergence to the optimum in 10 out of 10 independent runs.
- For traps, bisection is repeated 10 times for each n.
- Max. number of generations is set to a sufficiently large value.
- Bit-flip local search run on initial population.
- Use standard, pairwise, and problem-specific metric.
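The bisection step might look roughly like the sketch below. The slides do not give its details, so the doubling phase, the starting size, and the exact stopping rule are assumptions. The callback `succeeds(N)` is hypothetical: it stands for running the algorithm with population size N and reporting whether the optimum was found in all 10 independent runs.

```python
from typing import Callable


def bisection_population_size(
    succeeds: Callable[[int], bool], start: int = 16, max_size: int = 1 << 20
) -> int:
    """Smallest population size for which `succeeds` holds.

    Assumes success is monotone in population size; with stochastic runs
    this holds only approximately.
    """
    high = start
    while not succeeds(high):  # phase 1: double until the runs succeed
        if high >= max_size:
            raise RuntimeError("no successful population size found")
        high *= 2
    # The previous size failed, unless the starting size already succeeded.
    low = high // 2 if high > start else 0
    while high - low > 1:  # phase 2: bisect between failure and success
        mid = (low + high) // 2
        if succeeds(mid):
            high = mid
        else:
            low = mid
    return high
```

Because individual runs are random, repeating the whole bisection several times, as done for traps, reduces the noise in the resulting population size.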
Results: Pairwise Metric on Trap-5
[Figure: number of evaluations vs. problem size, n. LTGA (original): O(n^1.27); LTGA (pairwise): O(n^1.25).]
- Pairwise metric allows us to solve much larger problems.
- Scalability is slightly improved (surprising).
- Results for trap-6 and trap-7 similar.
Results: Pairwise Metric on NK
[Figure: number of evaluations vs. problem size, n. LTGA (original): O(n^5.14); LTGA (pairwise): O(n^3.23).]
- Pairwise metric allows us to solve much larger problems.
- Scalability is significantly improved (surprising).
Results: Pairwise Metric on 2D Spin Glass
[Figure: number of evaluations vs. problem size, n. LTGA (original): O(n^5.38); LTGA (pairwise): O(n^3.50).]
- Pairwise metric allows us to solve much larger problems.
- Scalability is significantly improved (surprising).
Results: Problem-Specific Metric on Trap-5
[Figure: number of evaluations vs. problem size, n. LTGA (pairwise): O(n^1.25); LTGA (problem): O(n^1.26).]
- Problem-specific metric performs similarly to pairwise metric.
- CPU time is slightly lower with the problem-specific metric, though.
Results: Problem-Specific Metric on NK
[Figure: number of evaluations vs. problem size, n. LTGA (pairwise): O(n^3.23); LTGA (problem): O(n^2.87).]
- Problem-specific metric slightly better than pairwise one.
- So problem-specific metric pays off.
Results: Problem-Specific Metric on 2D Spin Glass
[Figure: number of evaluations vs. problem size, n. LTGA (problem): O(n^4.05); LTGA (pairwise): O(n^3.50).]
- Problem-specific metric scales worse than pairwise one!
- Problem-specific metric is not that great for 2D spin glass.
Conclusions and Future Work
Conclusions
- LTGA provides opportunities for efficiency enhancements.
- LTGA also provides a promising tool for using problem-specific knowledge and learning from experience when solving many instances of similar problems.
- Pairwise metric provides important improvement.
- Problem-specific metric demonstrates the ability of LTGA to exploit problem-specific knowledge on additively decomposable functions.
- But the results based on problem-specific information are mixed.
Future work
- Design more robust and effective problem-specific metrics.
- Design methods to learn distance metrics for specific problem classes.
- Improve performance of LTGA on problems of complex structure.
- Adapt efficiency enhancement techniques from other evolutionary algorithms to LTGA, including model-directed local search, fitness modeling, parallelization, and others.
Acknowledgments
Acknowledgments
- NSF; NSF CAREER grant ECS-0547013.
- University of Missouri; High Performance Computing Collaboratory sponsored by Information Technology Services; Research Award; Research Board.