
A CONNECTIONIST MACHINE FOR GENETIC HILLCLIMBING

THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE

KNOWLEDGE REPRESENTATION, LEARNING AND EXPERT SYSTEMS

Consulting Editor

Tom M. Mitchell

Other books in the series:

Universal Subgoaling and Chunking of Goal Hierarchies. J. Laird, P. Rosenbloom, A. Newell. ISBN 0-89838-213-0.

Machine Learning: A Guide to Current Research. T. Mitchell, J. Carbonell, R. Michalski. ISBN 0-89838-214-9.

Machine Learning of Inductive Bias. P. Utgoff. ISBN 0-89838-223-8.

A CONNECTIONIST MACHINE FOR

GENETIC HILLCLIMBING

by

David H. Ackley Carnegie Mellon University

KLUWER ACADEMIC PUBLISHERS

Boston/Dordrecht/Lancaster

Distributors for North America: Kluwer Academic Publishers 101 Philip Drive Assinippi Park Norwell, Massachusetts 02061, USA

Distributors for the UK and Ireland: Kluwer Academic Publishers MTP Press Limited Falcon House, Queen Square Lancaster LA1 1RN, UNITED KINGDOM

Distributors for all other countries: Kluwer Academic Publishers Group Distribution Centre Post Office Box 322 3300 AH Dordrecht, THE NETHERLANDS

Library of Congress Cataloging-in-Publication Data

Ackley, David H. A connectionist machine for genetic hillclimbing.

(The Kluwer international series in engineering and computer science; SECS 28)

Originally presented as the author's thesis (Ph.D.)--Carnegie Mellon University, Pittsburgh, 1987.

Bibliography: p. Includes index. 1. Artificial intelligence--Data processing.

I. Title. II. Series. Q336.A25 1987 006.3 87-13536

ISBN-13: 978-1-4612-9192-3 e-ISBN-13: 978-1-4613-1997-9 DOI: 10.1007/978-1-4613-1997-9

Copyright © 1987 by Kluwer Academic Publishers

Softcover reprint of the hardcover 1st edition 1987

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, Massachusetts 02061.

To Helen and Sheldon

Contents

1. Introduction
1.1. Satisfying hidden strong constraints
1.2. Function optimization
1.2.1. The methodology of heuristic search
1.2.2. The shape of function spaces
1.3. High-dimensional binary vector spaces
1.3.1. Graph partitioning
1.4. Dissertation overview
1.5. Summary

2. The model
2.1. Design goal: Learning while searching
2.1.1. Knowledge representation
2.1.2. Point-based search strategies
2.1.3. Population-based search strategies
2.1.4. Combination rules
2.1.5. Election rules
2.1.6. Summary: Learning while searching
2.2. Design goal: Sustained exploration
2.2.1. Searching broadly
2.2.2. Convergence and divergence
2.2.3. Mode transitions
2.2.4. Resource allocation via taxation
2.2.5. Summary: Sustained exploration
2.3. Connectionist computation
2.3.1. Units and links
2.3.2. A three-state stochastic unit
2.3.3. Receptive fields
2.4. Stochastic iterated genetic hillclimbing
2.4.1. Knowledge representation in SIGH
2.4.2. The SIGH control algorithm
2.4.3. Formal definition
2.5. Summary

3. Empirical demonstrations
3.1. Methodology
3.1.1. Notation
3.1.2. Parameter tuning
3.1.3. Non-termination
3.2. Seven algorithms
3.2.1. Iterated hillclimbing - steepest ascent (IHC-SA)
3.2.2. Iterated hillclimbing - next ascent (IHC-NA)
3.2.3. Stochastic hillclimbing (SHC)
3.2.4. Iterated simulated annealing (ISA)
3.2.5. Iterated genetic search - Uniform combination (IGS-U)
3.2.6. Iterated genetic search - Ordered combination (IGS-O)
3.2.7. Stochastic iterated genetic hillclimbing (SIGH)
3.3. Six functions
3.3.1. A linear space - "One Max"
3.3.2. A local maximum - "Two Max"
3.3.3. A large local maximum - "Trap"
3.3.4. Fine-grained local maxima - "Porcupine"
3.3.5. Flat areas - "Plateaus"
3.3.6. A combination space - "Mix"

4. Analytic properties
4.1. Problem definition
4.2. Energy functions
4.3. Basic properties of the learning algorithm
4.3.1. Motivating the approach
4.3.2. Defining reinforcement signals
4.3.3. Defining similarity measures
4.3.4. The equilibrium distribution
4.4. Convergence
4.5. Divergence

5. Graph partitioning
5.1. Methodology
5.1.1. Problems
5.1.2. Algorithms
5.1.3. Data collection
5.1.4. Parameter tuning
5.2. Adding a linear component
5.3. Experiments on random graphs
5.4. Experiments on multilevel graphs

6. Related work
6.1. The problem space formulation
6.2. Search and learning
6.2.1. Learning while searching
6.2.2. Symbolic learning
6.2.3. Hillclimbing
6.2.4. Stochastic hillclimbing and simulated annealing
6.2.5. Genetic algorithms
6.3. Connectionist modelling
6.3.1. Competitive learning
6.3.2. Back propagation
6.3.3. Boltzmann machines
6.3.4. Stochastic iterated genetic hillclimbing
6.3.5. Harmony theory
6.3.6. Reinforcement models

7. Limitations and variations
7.1. Current limitations
7.1.1. The problem
7.1.2. The SIGH model
7.2. Possible variations
7.2.1. Exchanging parameters
7.2.2. Beyond symmetric connections
7.2.3. Simultaneous optimization
7.2.4. Widening the bottleneck
7.2.5. Temporal credit assignment
7.2.6. Learning a function

8. Discussion and conclusions
8.1. Stability and change
8.2. Architectural goals
8.2.1. High potential parallelism
8.2.2. Highly incremental
8.2.3. "Generalized Hebbian" learning
8.2.4. Unsupervised learning
8.2.5. "Closed loop" interactions
8.2.6. Emergent properties
8.3. Discussion
8.3.1. The processor/memory distinction
8.3.2. Physical computation systems
8.3.3. Between mind and brain
8.4. Conclusions
8.4.1. Recapitulation
8.4.2. Contributions

References

Index

Preface

In the "black box function optimization" problem, a search strategy is required to find an extremal point of a function without knowing the structure of the function or the range of possible function values. Solving such problems efficiently requires two abilities. On the one hand, a strategy must be capable of learning while searching: It must gather global information about the space and concentrate the search in the most promising regions. On the other hand, a strategy must be capable of sustained exploration: If a search of the most promising region does not uncover a satisfactory point, the strategy must redirect its efforts into other regions of the space.
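To make the black box setting concrete, here is a minimal Python sketch. It assumes the search space is a fixed-length bit vector and that the strategy may do nothing with the objective except call it. The objective `count_ones` and the helpers `hillclimb` and `iterated_hillclimb` are hypothetical names chosen for illustration. The strategy shown is plain restart hillclimbing, not SIGH: it redirects the search by restarting from random points, but it carries no learned information from one climb to the next.

```python
import random


def count_ones(bits):
    """Hypothetical objective: the number of 1 bits.

    The search code below never inspects this definition; it only calls it.
    """
    return sum(bits)


def hillclimb(f, x, rng):
    """Next-ascent hillclimbing on a bit list, modified in place.

    Bits are tried in random order; a flip is kept only if it strictly
    improves f. The loop ends when a full pass finds no improving flip.
    """
    fx = f(x)
    improved = True
    while improved:
        improved = False
        for i in rng.sample(range(len(x)), len(x)):
            x[i] ^= 1              # tentatively flip bit i
            fy = f(x)
            if fy > fx:
                fx = fy            # keep the improving flip
                improved = True
            else:
                x[i] ^= 1          # undo the flip
    return x, fx


def iterated_hillclimb(f, n_bits, restarts, seed=0):
    """Restart hillclimbing from random points and keep the best result."""
    rng = random.Random(seed)
    best_x, best_f = None, float("-inf")
    for _ in range(restarts):
        x = [rng.randint(0, 1) for _ in range(n_bits)]
        x, fx = hillclimb(f, x, rng)
        if fx > best_f:
            best_x, best_f = x[:], fx
    return best_x, best_f


if __name__ == "__main__":
    point, value = iterated_hillclimb(count_ones, n_bits=20, restarts=3)
    print(value, point)
```

Because every climb starts from scratch, nothing found in one region helps the search of the next; that missing ability is what "learning while searching" refers to.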

This dissertation describes a connectionist learning machine that produces a search strategy called stochastic iterated genetic hillclimbing (SIGH). Viewed over a short period of time, SIGH displays a coarse-to-fine searching strategy, like simulated annealing and genetic algorithms. However, in SIGH the convergence process is reversible. The connectionist implementation makes it possible to diverge the search after it has converged, and to recover coarse-grained information about the space that was suppressed during convergence. The successful optimization of a complex function by SIGH usually involves a series of such converge/diverge cycles.

SIGH can be viewed as a generalization of a genetic algorithm and a stochastic hillclimbing algorithm, in which genetic search discovers starting points for subsequent hillclimbing, and hillclimbing biases the population for subsequent genetic search. Several search strategies - including SIGH, hillclimbers, genetic algorithms, and simulated annealing - are tested on a set of illustrative functions and on a series of graph partitioning problems. SIGH is competitive with genetic algorithms and simulated annealing in most cases, and markedly superior in a function where the uphill directions usually lead away from the global maximum. In that case, SIGH's ability to pass information from one coarse-to-fine search to the next is crucial. Combinations of genetic and hillclimbing techniques can offer dramatic performance improvements over either technique alone.
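The interplay between genetic search and hillclimbing described above can be sketched with a conventional, non-connectionist stand-in. The Python code below is an illustrative assumption, not the SIGH algorithm: SIGH is realized as a connectionist network, whereas this sketch keeps an explicit population of bit lists. Uniform crossover proposes a starting point, a hillclimber refines it, and the refined point re-enters the population, where it can influence later crossovers. All names (`two_peaks`, `climb`, `uniform_crossover`, `genetic_hillclimb`) and all parameter values are hypothetical.

```python
import random


def two_peaks(bits):
    """Hypothetical objective with a local maximum at all zeros and a
    higher, global maximum at all ones."""
    ones = sum(bits)
    return max(10 * ones, 9 * (len(bits) - ones))


def climb(f, x):
    """Next-ascent hillclimbing on a copy of x; returns (point, value)."""
    x = list(x)
    fx = f(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            x[i] ^= 1              # tentatively flip bit i
            fy = f(x)
            if fy > fx:
                fx, improved = fy, True
            else:
                x[i] ^= 1          # undo the flip
    return x, fx


def uniform_crossover(a, b, rng):
    """Take each bit from parent a or parent b with equal probability."""
    return [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]


def genetic_hillclimb(f, n_bits, pop_size=8, generations=10, seed=0):
    """Alternate crossover (proposes start points) with hillclimbing
    (refines them and feeds the results back into the population)."""
    rng = random.Random(seed)
    pop = [climb(f, [rng.randint(0, 1) for _ in range(n_bits)])
           for _ in range(pop_size)]
    for _ in range(generations):
        (a, _fa), (b, _fb) = rng.sample(pop, 2)
        child = climb(f, uniform_crossover(a, b, rng))
        # Replace the worst member if the climbed child is at least as good.
        worst = min(range(pop_size), key=lambda i: pop[i][1])
        if child[1] >= pop[worst][1]:
            pop[worst] = child
    return max(pop, key=lambda p: p[1])


if __name__ == "__main__":
    point, value = genetic_hillclimb(two_peaks, n_bits=20)
    print(value, point)
```

In this toy version the population is the only memory shared between climbs; it illustrates the division of labor between the two techniques and nothing more.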


Acknowledgments

This research was supported by the System Development Foundation and by National Science Foundation grant IST-8520359 to Geoffrey E. Hinton.

I thank Geoff Hinton for his scholarship, clarity, enthusiasm, and friendship. His leadership and patient support over the last four years helped me to redirect an idealistic stubbornness into scientifically productive channels, and to bring this research to a successful conclusion. I thank Hans Berliner for guidance and inspiration in the early years of my graduate career. I also thank the other members of my committee, Jaime Carbonell and David Rumelhart, for valuable feedback on this dissertation.

The meetings of the Boltzmann group at Carnegie Mellon provided an invaluable forum for discussion and learning about all matters connectionist. The members and visitors of the group taught me much about the way of the scientist in thought, speech, and behavior.

The support for research at the Computer Science Department at Carnegie Mellon is unsurpassed in human, organizational, and computational resources. I thank Allen Newell, in particular, for useful comments on several aspects of this work, and for setting, by example, a high scientific standard. Sharon Burks and the other members of the administrative and operations staff provided excellent support. This dissertation was composed primarily on an aging terminal that Bob McDivett managed to keep functioning far beyond its natural span.

My family and my circle of friends, new and old, near and far, gave me warmth, support, and identity. Each in a unique way, they are all indispensable to me. I cannot possibly thank them properly; thank goodness I don't need to.

Finally, my years in Pittsburgh would have meant little without Gail Kaiser and Peter Neuss. Pete understood and encouraged my wildest thoughts as no other, and gave me the confidence to pursue them wherever they led. Gail pushed, prodded, and ultimately dragged me by main force into growing up. These debts will take a lifetime to repay.
