A CONNECTIONIST MACHINE FOR GENETIC HILLCLIMBING
THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE
KNOWLEDGE REPRESENTATION, LEARNING AND EXPERT SYSTEMS
Consulting Editor
Tom M. Mitchell
Other books in the series:
Universal Subgoaling and Chunking of Goal Hierarchies. J. Laird, P. Rosenbloom, A. Newell. ISBN 0-89838-213-0.
Machine Learning: A Guide to Current Research. T. Mitchell, J. Carbonell, R. Michalski. ISBN 0-89838-214-9.
Machine Learning of Inductive Bias. P. Utgoff. ISBN 0-89838-223-8.
A CONNECTIONIST MACHINE FOR
GENETIC HILLCLIMBING
by
David H. Ackley Carnegie Mellon University
KLUWER ACADEMIC PUBLISHERS
Boston/Dordrecht/Lancaster
Distributors for North America: Kluwer Academic Publishers 101 Philip Drive Assinippi Park Norwell, Massachusetts 02061, USA
Distributors for the UK and Ireland: Kluwer Academic Publishers MTP Press Limited Falcon House, Queen Square Lancaster LA1 1RN, UNITED KINGDOM
Distributors for all other countries: Kluwer Academic Publishers Group Distribution Centre Post Office Box 322 3300 AH Dordrecht, THE NETHERLANDS
Library of Congress Cataloging-in-Publication Data
Ackley, David H. A connectionist machine for genetic hillclimbing.
(The Kluwer international series in engineering and computer science; SECS 28)
Originally presented as the author's thesis (Ph. D.)-Carnegie Mellon University, Pittsburgh, 1987.
Bibliography: p. Includes index. 1. Artificial intelligence-Data processing.
I. Title. II. Series. Q336.A25 1987 006.3 87-13536
ISBN-13: 978-1-4612-9192-3 e-ISBN-13: 978-1-4613-1997-9 DOI: 10.1007/978-1-4613-1997-9
Copyright © 1987 by Kluwer Academic Publishers Softcover reprint of the hardcover 1st edition 1987 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, Massachusetts 02061.
To Helen and Sheldon
Contents
1. Introduction
  1.1. Satisfying hidden strong constraints
  1.2. Function optimization
    1.2.1. The methodology of heuristic search
    1.2.2. The shape of function spaces
  1.3. High-dimensional binary vector spaces
    1.3.1. Graph partitioning
  1.4. Dissertation overview
  1.5. Summary
2. The model
  2.1. Design goal: Learning while searching
    2.1.1. Knowledge representation
    2.1.2. Point-based search strategies
    2.1.3. Population-based search strategies
    2.1.4. Combination rules
    2.1.5. Election rules
    2.1.6. Summary: Learning while searching
  2.2. Design goal: Sustained exploration
    2.2.1. Searching broadly
    2.2.2. Convergence and divergence
    2.2.3. Mode transitions
    2.2.4. Resource allocation via taxation
    2.2.5. Summary: Sustained exploration
  2.3. Connectionist computation
    2.3.1. Units and links
    2.3.2. A three-state stochastic unit
    2.3.3. Receptive fields
  2.4. Stochastic iterated genetic hillclimbing
    2.4.1. Knowledge representation in SIGH
    2.4.2. The SIGH control algorithm
    2.4.3. Formal definition
  2.5. Summary
3. Empirical demonstrations
  3.1. Methodology
    3.1.1. Notation
    3.1.2. Parameter tuning
    3.1.3. Non-termination
  3.2. Seven algorithms
    3.2.1. Iterated hillclimbing-steepest ascent (IHC-SA)
    3.2.2. Iterated hillclimbing-next ascent (IHC-NA)
    3.2.3. Stochastic hillclimbing (SHC)
    3.2.4. Iterated simulated annealing (ISA)
    3.2.5. Iterated genetic search-Uniform combination (IGS-U)
    3.2.6. Iterated genetic search-Ordered combination (IGS-O)
    3.2.7. Stochastic iterated genetic hillclimbing (SIGH)
  3.3. Six functions
    3.3.1. A linear space-"One Max"
    3.3.2. A local maximum-"Two Max"
    3.3.3. A large local maximum-"Trap"
    3.3.4. Fine-grained local maxima-"Porcupine"
    3.3.5. Flat areas-"Plateaus"
    3.3.6. A combination space-"Mix"
4. Analytic properties
  4.1. Problem definition
  4.2. Energy functions
  4.3. Basic properties of the learning algorithm
    4.3.1. Motivating the approach
    4.3.2. Defining reinforcement signals
    4.3.3. Defining similarity measures
    4.3.4. The equilibrium distribution
  4.4. Convergence
  4.5. Divergence
5. Graph partitioning
  5.1. Methodology
    5.1.1. Problems
    5.1.2. Algorithms
    5.1.3. Data collection
    5.1.4. Parameter tuning
  5.2. Adding a linear component
  5.3. Experiments on random graphs
  5.4. Experiments on multilevel graphs
6. Related work
  6.1. The problem space formulation
  6.2. Search and learning
    6.2.1. Learning while searching
    6.2.2. Symbolic learning
    6.2.3. Hillclimbing
    6.2.4. Stochastic hillclimbing and simulated annealing
    6.2.5. Genetic algorithms
  6.3. Connectionist modelling
    6.3.1. Competitive learning
    6.3.2. Back propagation
    6.3.3. Boltzmann machines
    6.3.4. Stochastic iterated genetic hillclimbing
    6.3.5. Harmony theory
    6.3.6. Reinforcement models
7. Limitations and variations
  7.1. Current limitations
    7.1.1. The problem
    7.1.2. The SIGH model
  7.2. Possible variations
    7.2.1. Exchanging parameters
    7.2.2. Beyond symmetric connections
    7.2.3. Simultaneous optimization
    7.2.4. Widening the bottleneck
    7.2.5. Temporal credit assignment
    7.2.6. Learning a function
8. Discussion and conclusions
  8.1. Stability and change
  8.2. Architectural goals
    8.2.1. High potential parallelism
    8.2.2. Highly incremental
    8.2.3. "Generalized Hebbian" learning
    8.2.4. Unsupervised learning
    8.2.5. "Closed loop" interactions
    8.2.6. Emergent properties
  8.3. Discussion
    8.3.1. The processor/memory distinction
    8.3.2. Physical computation systems
    8.3.3. Between mind and brain
  8.4. Conclusions
    8.4.1. Recapitulation
    8.4.2. Contributions
References
Index
Preface
In the "black box function optimization" problem, a search strategy is required to find an extremal point of a function without knowing the structure of the function or the range of possible function values. Solving such problems efficiently requires two abilities. On the one hand, a strategy must be capable of learning while searching: It must gather global information about the space and concentrate the search in the most promising regions. On the other hand, a strategy must be capable of sustained exploration: If a search of the most promising region does not uncover a satisfactory point, the strategy must redirect its efforts into other regions of the space.
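As a concrete picture of the black-box setting, the following minimal sketch shows a restart-based hillclimber that sees the function only through its return values. It is an illustrative baseline, not the machine described in this book: the objective `one_max` (number of 1 bits), the helper names, and all parameter values are assumptions chosen for the example. Restarting from fresh random points is one simple form of sustained exploration; note that this baseline carries no information from one restart to the next, so it does not learn while searching.

```python
import random

def one_max(bits):
    """Illustrative black-box objective (an assumption): the number of 1 bits."""
    return sum(bits)

def hillclimb(f, bits):
    """Next-ascent hillclimbing: keep any single-bit flip that improves f."""
    best = f(bits)
    improved = True
    while improved:
        improved = False
        for i in range(len(bits)):
            bits[i] ^= 1          # try flipping bit i
            score = f(bits)
            if score > best:
                best = score      # keep the improving flip
                improved = True
            else:
                bits[i] ^= 1      # undo the flip
    return bits, best

def iterated_hillclimb(f, n_bits, restarts, rng):
    """Restart from random points and remember only the best point seen."""
    best_bits, best_score = None, float("-inf")
    for _ in range(restarts):
        start = [rng.randint(0, 1) for _ in range(n_bits)]
        bits, score = hillclimb(f, start)
        if score > best_score:
            best_bits, best_score = bits[:], score
    return best_bits, best_score

best_bits, best_score = iterated_hillclimb(
    one_max, n_bits=16, restarts=5, rng=random.Random(0)
)
```

Because `one_max` has no local maxima, every climb ends at the all-ones vector; on functions with misleading uphill directions, a strategy of this kind has to rely on its restarts alone.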
This dissertation describes a connectionist learning machine that produces a search strategy called stochastic iterated genetic hillclimbing (SIGH). Viewed over a short period of time, SIGH displays a coarse-to-fine searching strategy, like simulated annealing and genetic algorithms. However, in SIGH the convergence process is reversible. The connectionist implementation makes it possible to diverge the search after it has converged, and to recover coarse-grained information about the space that was suppressed during convergence. The successful optimization of a complex function by SIGH usually involves a series of such converge/diverge cycles.
SIGH can be viewed as a generalization of a genetic algorithm and a stochastic hillclimbing algorithm, in which genetic search discovers starting points for subsequent hillclimbing, and hillclimbing biases the population for subsequent genetic search. Several search strategies (including SIGH, hillclimbers, genetic algorithms, and simulated annealing) are tested on a set of illustrative functions and on a series of graph partitioning problems. SIGH is competitive with genetic algorithms and simulated annealing in most cases, and markedly superior in a function where the uphill directions usually lead away from the global maximum. In that case, SIGH's ability to pass information from one coarse-to-fine search to the next is crucial. Combinations of genetic and hillclimbing techniques can offer dramatic performance improvements over either technique alone.
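The interplay between genetic search and hillclimbing can be sketched in a few lines. The code below is a conventional population-based hybrid, not SIGH itself (SIGH is a connectionist machine defined later in the book). The objective `two_peaks`, the use of uniform crossover with a single-bit mutation, and every parameter value are assumptions made for illustration. Crossover proposes starting points, hillclimbing refines them, and the refined points re-enter the population that the next round of crossover draws from.

```python
import random

def two_peaks(bits):
    """Hypothetical objective: global peak at all ones (10 per 1 bit),
    lower local peak at all zeros (8 per 0 bit)."""
    ones = sum(bits)
    return max(10 * ones, 8 * (len(bits) - ones))

def hillclimb(f, bits):
    """Next-ascent hillclimbing on a copy of bits; returns a local maximum."""
    bits = bits[:]
    best = f(bits)
    improved = True
    while improved:
        improved = False
        for i in range(len(bits)):
            bits[i] ^= 1
            score = f(bits)
            if score > best:
                best, improved = score, True
            else:
                bits[i] ^= 1      # undo a non-improving flip
    return bits

def uniform_crossover(a, b, rng):
    """Take each bit from either parent with equal probability."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def genetic_hillclimb(f, n_bits, pop_size, generations, rng):
    # Every population member is a hillclimbed point.
    pop = [hillclimb(f, [rng.randint(0, 1) for _ in range(n_bits)])
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=f, reverse=True)
        parents = pop[: pop_size // 2]           # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = uniform_crossover(a, b, rng)
            child[rng.randrange(n_bits)] ^= 1    # single-bit mutation (assumption)
            children.append(hillclimb(f, child)) # genetic search seeds the climb
        pop = parents + children                 # climbed points bias the next round
    best = max(pop, key=f)
    return best, f(best)

best, score = genetic_hillclimb(
    two_peaks, n_bits=20, pop_size=8, generations=5, rng=random.Random(1)
)
```

With these settings every population member sits on one of the two peaks, so the returned score is either that of the local peak or that of the global one; which one is found depends on the random seed.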
Acknowledgments
This research was supported by the System Development Foundation and by National Science Foundation grant IST-8520359 to Geoffrey E. Hinton.
I thank Geoff Hinton for his scholarship, clarity, enthusiasm, and friendship. His leadership and patient support over the last four years helped me to redirect an idealistic stubbornness into scientifically productive channels, and to bring this research to a successful conclusion. I thank Hans Berliner for guidance and inspiration in the early years of my graduate career. I also thank the other members of my committee, Jaime Carbonell and David Rumelhart, for valuable feedback on this dissertation.
The meetings of the Boltzmann group at Carnegie Mellon provided an invaluable forum for discussion and learning about all matters connectionist. The members and visitors of the group taught me much about the way of the scientist in thought, speech, and behavior.
The support for research at the Computer Science Department at Carnegie Mellon is unsurpassed in human, organizational, and computational resources. I thank Allen Newell, in particular, for useful comments on several aspects of this work, and for setting, by example, a high scientific standard. Sharon Burks and the other members of the administrative and operations staff provided excellent support. This dissertation was composed primarily on an aging terminal that Bob McDivett managed to keep functioning far beyond its natural span.
My family and my circle of friends, new and old, near and far, gave me warmth, support, and identity. Each in a unique way, they are all indispensable to me. I cannot possibly thank them properly; thank goodness I don't need to.
Finally, my years in Pittsburgh would have meant little without Gail Kaiser and Peter Neuss. Pete understood and encouraged my wildest thoughts as no other, and gave me the confidence to pursue them wherever they led. Gail pushed, prodded, and ultimately dragged me by main force into growing up. These debts will take a lifetime to repay.