On learning through competition



SAND88-0738A

ON LEARNING THROUGH COMPETITION

David Heath¹

School of OR & IE, Upson Hall, Cornell University, Ithaca NY 14853-7501

Carl Diegert

Sandia National Laboratories, Albuquerque NM 87185-5800

A new model for nondeterministic, adaptive behavior of a system constructed from many simple parts captures some of the flavor of learning. The paper identifies learning with a mathematical convergence property of the model, and formally proves this convergence. The convergence suggests problem-solving algorithms, as illustrated by the new algorithm and computational results for the travelling salesman problem that conclude the paper.

The theorem may aid in establishing an emergent behavior of some parallel distributed computations, where the simple parts are identified with neurons. The new, stochastic model, however, is more intuitively explained by associating each simple part with one bettor in a crowd at a racetrack. Money is redistributed as the system (the crowd) evolves (races are run). This redistribution is the mechanism by which the system collectively learns.
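The racetrack picture can be made concrete with a small simulation. The abstract does not state the betting or payoff rules, so everything in this sketch is an assumption made for illustration: each bettor always backs one fixed horse, stakes a fixed fraction of their current fortune (so a bettor who does badly automatically bets less), and the pooled stakes are split among the backers of the winning horse in proportion to their stakes. The names `run_races`, `picks`, `win_probs`, and `fraction` are hypothetical.

```python
import random


def run_races(fortunes, picks, win_probs, fraction=0.1, races=1000, seed=0):
    """Simulate a crowd of bettors; return their fortunes after the last race.

    Assumed rules (not taken from the paper): bettor i always backs horse
    picks[i], stakes `fraction` of their current fortune, and the pool of all
    stakes is shared by the winner's backers in proportion to their stakes.
    """
    rng = random.Random(seed)
    fortunes = list(fortunes)
    horses = list(range(len(win_probs)))
    for _ in range(races):
        stakes = [fraction * f for f in fortunes]
        winner = rng.choices(horses, weights=win_probs)[0]
        pool = sum(stakes)
        winning_stake = sum(s for s, h in zip(stakes, picks) if h == winner)
        if winning_stake == 0:
            continue  # nobody backed the winner: all stakes are returned
        for i, (s, h) in enumerate(zip(stakes, picks)):
            fortunes[i] -= s  # every bettor pays their stake into the pool
            if h == winner:
                fortunes[i] += pool * s / winning_stake  # winners split the pool
    return fortunes


if __name__ == "__main__":
    # Four bettors with equal starting fortunes; two of them back horse 0.
    final = run_races([1.0] * 4, picks=[0, 0, 1, 2], win_probs=[0.6, 0.3, 0.1])
    print(final, sum(final))
```

Under these assumed rules money is only moved between bettors, never created or destroyed. If every horse has at least one backer, the expected one-race change in the money backing horse h is `fraction * (win_probs[h] * T - W_h)`, where `T` is the total money and `W_h` the money currently backing horse h, so money drifts toward a split that mirrors the true win probabilities.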

Examples in the paper apply the main result:

Theorem 1. Let $(\Omega, \mathcal{U}, P)$ be a probability space, and let $(\mathcal{F}_n)_{n=0,1,\ldots}$ be an increasing sequence of sigma fields. Let $X_n$ be any $\mathcal{L}^2$-bounded sequence of random variables adapted to $(\mathcal{F}_n)$. Then:

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} E(X_{k+1} - X_k \mid \mathcal{F}_k) = 0.$$
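One standard way to see why a statement of this form holds is sketched below. The abstract does not say which argument the authors use, so this is only an illustration. It writes $\mathcal{F}_k$ for the sigma fields of Theorem 1, and the symbols $D_{k+1}$, $M_n$, and the bound $C$ are introduced here and are not taken from the paper.

```latex
% Assumption: \sup_n E(X_n^2) \le C < \infty (the L^2-boundedness hypothesis).
\begin{align*}
D_{k+1} &:= X_{k+1} - E(X_{k+1} \mid \mathcal{F}_k),
  \qquad M_n := \sum_{k=0}^{n-1} D_{k+1}, \\
\frac{1}{n} \sum_{k=0}^{n-1} E(X_{k+1} - X_k \mid \mathcal{F}_k)
  &= \frac{X_n - X_0}{n} - \frac{M_n}{n}, \\
E\!\left[\left(\frac{X_n - X_0}{n}\right)^{2}\right] &\le \frac{4C}{n^{2}},
  \qquad
E\!\left[\left(\frac{M_n}{n}\right)^{2}\right]
  = \frac{1}{n^{2}} \sum_{k=0}^{n-1} E\!\left(D_{k+1}^{2}\right) \le \frac{C}{n}.
\end{align*}
```

The second line uses only that $X_k$ is $\mathcal{F}_k$-measurable. The last line uses that the martingale differences $D_{k+1}$ are orthogonal and satisfy $E(D_{k+1}^2) \le E(X_{k+1}^2) \le C$. Both terms on the right of the second line therefore tend to zero in $\mathcal{L}^2$.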

Explanations identify the sequence $X_n$ with the (random) sequence of fortunes of bettor $k$, and give specific fixed rules to define how each bettor places bets and how bets are settled. The rules, of course, fall within the theorem's general conditions: the payoff scheme eventually causes bettors that do badly to reduce the sizes of their bets.

The concluding travelling salesman application is explained in terms of bettors, but no results are proved. As in the approach of Hopfield and Tank, $n$-city problems are addressed by systems of $n^2$ simple elements. The new algorithm, however, injects randomness into each step of the search, not just into the starting point. Numerical experiments give some preliminary insight into the nature of the new search.
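The abstract does not give the rules of the travelling salesman algorithm, so the following is a hypothetical sketch of what a search with $n^2$ bettor-like elements and randomness in every step could look like. It is not the authors' method. The assumptions are: one element per (city, position) pair, tours sampled position by position with probability proportional to the elements' fortunes, and a payoff that rewards the elements of a tour shorter than a running average length and penalizes the others. All names (`sample_tour`, `search`, `stake`) are invented for this illustration.

```python
import math
import random


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def sample_tour(fortune, rng):
    """Pick a city for each position at random, weighted by fortune[city][pos]."""
    unused = list(range(len(fortune)))
    tour = []
    for pos in range(len(fortune)):
        # A small floor keeps every unused city selectable (assumed detail).
        weights = [fortune[c][pos] + 1e-12 for c in unused]
        city = rng.choices(unused, weights=weights)[0]
        unused.remove(city)
        tour.append(city)
    return tour


def search(pts, steps=2000, stake=0.2, seed=0):
    """Randomized search; returns (best_tour, best_length) seen so far."""
    rng = random.Random(seed)
    n = len(pts)
    fortune = [[1.0] * n for _ in range(n)]  # n * n elements, equal start
    best_tour, best_len = None, math.inf
    avg_len = None
    for _ in range(steps):
        tour = sample_tour(fortune, rng)
        length = tour_length(tour, pts)
        if length < best_len:
            best_tour, best_len = tour, length
        if avg_len is None:
            avg_len = length
        # Elements on a better-than-average tour win; the others lose.
        factor = 1 + stake if length < avg_len else 1 - stake
        for pos, city in enumerate(tour):
            fortune[city][pos] *= factor
        avg_len = 0.95 * avg_len + 0.05 * length
        # Rescale so the total money stays at n * n (pure redistribution).
        scale = n * n / sum(map(sum, fortune))
        fortune = [[f * scale for f in row] for row in fortune]
    return best_tour, best_len


if __name__ == "__main__":
    cities = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 1.5)]
    print(search(cities))
```

Because a fresh tour is sampled at every step, randomness enters throughout the search rather than only through the starting point, which is the feature the abstract contrasts with the Hopfield and Tank approach.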

¹Research supported by the U.S. Army Research Office through the Mathematical Sciences Institute at Cornell University.
