Analyzing cultural evolution by iterated learning

51
Analyzing cultural evolution by iterated learning Tom Griffiths Department of Psychology Cognitive Science Program UC Berkeley QuickTime™ and a TIFF (Uncompressed) decomp are needed to see this p

description

Analyzing cultural evolution by iterated learning. Tom Griffiths Department of Psychology Cognitive Science Program UC Berkeley. Learning languages from utterances. blicket toma dax wug blicket wug. S  X Y X  {blicket,dax} Y  {toma, wug}. Learning functions from ( x , y ) pairs. - PowerPoint PPT Presentation

Transcript of Analyzing cultural evolution by iterated learning

Page 1: Analyzing cultural evolution by iterated learning

Analyzing cultural evolution by iterated learning

Tom GriffithsDepartment of Psychology

Cognitive Science Program

UC Berkeley QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

Page 2: Analyzing cultural evolution by iterated learning

Inductive problems

blicket toma

dax wug

blicket wug

S X Y

X {blicket,dax}

Y {toma, wug}

Learning languages from utterances

Learning functions from (x,y) pairs

Learning categories from instances of their members

Page 3: Analyzing cultural evolution by iterated learning

Learning

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.data

Page 4: Analyzing cultural evolution by iterated learning

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

Page 5: Analyzing cultural evolution by iterated learning

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

Iterated learning(Kirby, 2001)

What are the consequences of learners learning from other learners?

Page 6: Analyzing cultural evolution by iterated learning

Outline

Part I: Formal analysis of iterated learning

Part II: Iterated learning in the lab

Page 7: Analyzing cultural evolution by iterated learning

Outline

Part I: Formal analysis of iterated learning

Part II: Iterated learning in the lab

Page 8: Analyzing cultural evolution by iterated learning

Objects of iterated learning

How do constraints on learning (inductive biases) influence cultural universals?

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.QuickTime™ and a

TIFF (Uncompressed) decompressorare needed to see this picture.

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

Page 9: Analyzing cultural evolution by iterated learning

Language

• The languages spoken by humans are typically viewed as the result of two factors– individual learning – innate constraints (biological evolution)

• This limits the possible explanations for different kinds of linguistic phenomena

Page 10: Analyzing cultural evolution by iterated learning

Linguistic universals

• Human languages possess universal properties– e.g. compositionality

(Comrie, 1981; Greenberg, 1963; Hawkins, 1988)

• Traditional explanation:– linguistic universals reflect strong innate constraints

specific to a system for acquiring language(e.g., Chomsky, 1965)

Page 11: Analyzing cultural evolution by iterated learning

Cultural evolution

• Languages are also subject to change via cultural evolution (through iterated learning)

• Alternative explanation:– linguistic universals emerge as the result of the fact that

language is learned anew by each generation (using general-purpose learning mechanisms, expressing weak constraints on languages)

(e.g., Briscoe, 1998; Kirby, 2001)

Page 12: Analyzing cultural evolution by iterated learning

Analyzing iterated learning

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

PL(h|d): probability of inferring hypothesis h from data d

PP(d|h): probability of generating data d from hypothesis h

PL(h|d)

PP(d|h)

PL(h|d)

PP(d|h)

Page 13: Analyzing cultural evolution by iterated learning

• Variables x(t+1) independent of history given x(t)

• Converges to a stationary distribution under easily checked conditions (i.e., if it is ergodic)

x x x x x x x x

Transition matrixP(x(t+1)|x(t))

Markov chains

Page 14: Analyzing cultural evolution by iterated learning

Analyzing iterated learning

d0 h1 d1 h2PL(h|d) PP(d|h) PL(h|d)

d2 h3PP(d|h) PL(h|d)

d PP(d|h)PL(h|d)h1 h2d PP(d|h)PL(h|d)

h3

A Markov chain on hypotheses

d0 d1h PL(h|d) PP(d|h)d2h PL(h|d) PP(d|h) h PL(h|d) PP(d|h)

A Markov chain on data

Page 15: Analyzing cultural evolution by iterated learning

Bayesian inference

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

Reverend Thomas Bayes

Page 16: Analyzing cultural evolution by iterated learning

Bayes’ theorem

P(h | d) P(d | h)P(h)

P(d | h )P( h )h H

Posteriorprobability

Likelihood Priorprobability

Sum over space of hypothesesh: hypothesis

d: data

Page 17: Analyzing cultural evolution by iterated learning

A note on hypotheses and priors

• No commitment to the nature of hypotheses– neural networks (Rumelhart & McClelland, 1986)

– discrete parameters (Gibson & Wexler, 1994)

• Priors do not necessarily represent innate constraints specific to language acquisition– not innate: can reflect independent sources of data– not specific: general-purpose learning algorithms also

have inductive biases expressible as priors

Page 18: Analyzing cultural evolution by iterated learning

Iterated Bayesian learning

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

PL(h|d)

PP(d|h)

PL(h|d)

PP(d|h)

PL (h | d) PP (d | h)P(h)

PP (d | h )P( h )h H

Assume learners sample from their posterior distribution:

Page 19: Analyzing cultural evolution by iterated learning

Stationary distributions

• Markov chain on h converges to the prior, P(h)

• Markov chain on d converges to the “prior predictive distribution”

P(d) P(d | h)h

P(h)

(Griffiths & Kalish, 2005)

Page 20: Analyzing cultural evolution by iterated learning

Explaining convergence to the prior

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

PL(h|d)

PP(d|h)

PL(h|d)

PP(d|h)

• Intuitively: data acts once, prior many times

• Formally: iterated learning with Bayesian agents is a Gibbs sampler on P(d,h)

(Griffiths & Kalish, 2007)

Page 21: Analyzing cultural evolution by iterated learning

Gibbs sampling

For variables x = x1, x2, …, xn

Draw xi(t+1) from P(xi|x-i)

x-i = x1(t+1), x2

(t+1),…, xi-1(t+1)

, xi+1(t)

, …, xn(t)

Converges to P(x1, x2, …, xn)

(a.k.a. the heat bath algorithm in statistical physics)

(Geman & Geman, 1984)

Page 22: Analyzing cultural evolution by iterated learning

Gibbs sampling

(MacKay, 2003)

Page 23: Analyzing cultural evolution by iterated learning

Explaining convergence to the prior

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

PL(h|d)

PP(d|h)

PL(h|d)

PP(d|h)

When target distribution is P(d,h) = PP(d|h)P(h), conditional distributions are PL(h|d) and PP(d|h)

Page 24: Analyzing cultural evolution by iterated learning

Implications for linguistic universals

• When learners sample from P(h|d), the distribution over languages converges to the prior– identifies a one-to-one correspondence between

inductive biases and linguistic universals

Page 25: Analyzing cultural evolution by iterated learning

Iterated Bayesian learning

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

PL(h|d)

PP(d|h)

PL(h|d)

PP(d|h)

PL (h | d) PP (d | h)P(h)

PP (d | h )P( h )h H

Assume learners sample from their posterior distribution:

Page 26: Analyzing cultural evolution by iterated learning

PL (h | d) PP (d | h)P(h)

PP (d | h )P( h )h H

r

r = 1 r = 2 r =

From sampling to maximizing

Page 27: Analyzing cultural evolution by iterated learning

• General analytic results are hard to obtain– (r = is Monte Carlo EM with a single sample)

• For certain classes of languages, it is possible to show that the stationary distribution gives each hypothesis h probability proportional to P(h)r

– the ordering identified by the prior is preserved, but not the corresponding probabilities

(Kirby, Dowman, & Griffiths, 2007)

From sampling to maximizing

Page 28: Analyzing cultural evolution by iterated learning

Implications for linguistic universals

• When learners sample from P(h|d), the distribution over languages converges to the prior– identifies a one-to-one correspondence between inductive

biases and linguistic universals

• As learners move towards maximizing, the influence of the prior is exaggerated– weak biases can produce strong universals– cultural evolution is a viable alternative to traditional

explanations for linguistic universals

Page 29: Analyzing cultural evolution by iterated learning

Infinite populations in continuous time

• “Language dynamical equation”

• “Neutral model” (fj(x) constant)

• Stable equilibrium at first eigenvector of Q, which is our stationary distribution

dx i

dt qij f j (x)

j

x j (x)x i

(Nowak, Komarova, & Niyogi, 2001)

dx i

dt qij

j

x j x i

(Komarova & Nowak, 2003)

dx

dt(Q I)x

Page 30: Analyzing cultural evolution by iterated learning

Analyzing iterated learning

• The outcome of iterated learning is strongly affected by the inductive biases of the learners– hypotheses with high prior probability ultimately

appear with high probability in the population

• Clarifies the connection between constraints on language learning and linguistic universals…

• …and provides formal justification for the idea that culture reflects the structure of the mind

Page 31: Analyzing cultural evolution by iterated learning

Outline

Part I: Formal analysis of iterated learning

Part II: Iterated learning in the lab

Page 32: Analyzing cultural evolution by iterated learning

Inductive problems

blicket toma

dax wug

blicket wug

S X Y

X {blicket,dax}

Y {toma, wug}

Learning languages from utterances

Learning functions from (x,y) pairs

Learning categories from instances of their members

Page 33: Analyzing cultural evolution by iterated learning

Revealing inductive biases

• Many problems in cognitive science can be formulated as problems of induction– learning languages, concepts, and causal relations

• Such problems are not solvable without bias(e.g., Goodman, 1955; Kearns & Vazirani, 1994; Vapnik, 1995)

• What biases guide human inductive inferences?

If iterated learning converges to the prior, then it may provide a method for investigating biases

Page 34: Analyzing cultural evolution by iterated learning

Serial reproduction(Bartlett, 1932)

Page 35: Analyzing cultural evolution by iterated learning

General strategy

• Step 1: use well-studied and simple tasks for which people’s inductive biases are known– function learning– concept learning

• Step 2: explore learning problems where effects of inductive biases are controversial– frequency distributions– systems of color terms

Page 36: Analyzing cultural evolution by iterated learning

Iterated function learning

• Each learner sees a set of (x,y) pairs

• Makes predictions of y for new x values

• Predictions are data for the next learner

data hypotheses

(Kalish, Griffiths, & Lewandowsky, 2007)

Page 37: Analyzing cultural evolution by iterated learning

Function learning experiments

Stimulus

Response

Slider

Feedback

Examine iterated learning with different initial data

Page 38: Analyzing cultural evolution by iterated learning

1 2 3 4 5 6 7 8 9

IterationInitialdata

Page 39: Analyzing cultural evolution by iterated learning

Iterated concept learning

• Each learner sees examples from a species

• Identifies species of four amoebae

• Species correspond to boolean concepts

data hypotheses

(Griffiths, Christian, & Kalish, 2006)

Page 40: Analyzing cultural evolution by iterated learning

Types of concepts(Shepard, Hovland, & Jenkins, 1961)

Type I

Type II

Type III

Type IV

Type V

Type VI

shape

size

color

Page 41: Analyzing cultural evolution by iterated learning

Results of iterated learningPr

obab

ilit

y

Iteration

Prob

abil

ity

Iteration

Human learners Bayesian model

Page 42: Analyzing cultural evolution by iterated learning

Frequency distributions

• Each learner sees objects receiving two labels

• Produces labels for those objects at test

• First learner: one label {0,1,2,3,4,5}/10 times

data hypotheses

(Reali & Griffiths, submitted)

5 x “DUP”

5 x “NEK”P(“DUP”| ) =

(Vouloumanos, 2008)

Page 43: Analyzing cultural evolution by iterated learning

Results after one generation

Fre

quen

cy o

f ta

rget

labe

l

Condition

Page 44: Analyzing cultural evolution by iterated learning

Results after five generations

Frequency of target label

Page 45: Analyzing cultural evolution by iterated learning

The Wright-Fisher model

• Basic model: x copies of gene A in population of N

• With mutation…

x(t1) ~ Binomial(N,) =x(t )

N

x(t1) ~ Binomial(N,) =x(t )(1 u) (N x(t ))v

N

Page 46: Analyzing cultural evolution by iterated learning

Iterated learning and Wright-Fisher

• Basic model is MAP with uniform prior, mutation model is MAP with Beta prior

• Extends to other models of genetic drift…– connection between drift models and inference

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

x x x

Nv

1 u v,

Nu

1 u v

with Florencia Reali

Page 47: Analyzing cultural evolution by iterated learning

Cultural evolution of color terms

with Mike Dowman and Jing Xu

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

Page 48: Analyzing cultural evolution by iterated learning

Identifying inductive biases

• Formal analysis suggests that iterated learning provides a way to determine inductive biases

• Experiments with human learners support this idea– when stimuli for which biases are well understood are used,

those biases are revealed by iterated learning

• What do inductive biases look like in other cases?– continuous categories– causal structure– word learning– language learning

Page 49: Analyzing cultural evolution by iterated learning

Conclusions

• Iterated learning provides a lens for magnifying the inductive biases of learners– small effects for individuals are big effects for groups

• When cognition affects culture, studying groups can give us better insight into individuals

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.data

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

Page 50: Analyzing cultural evolution by iterated learning

CreditsJoint work with…

Brian ChristianMike Dowman

Mike KalishSimon Kirby

Steve LewandowskyFlorencia Reali

Jing Xu

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

QuickTime™ and aTIFF (LZW) decompressor

are needed to see this picture.

Computational Cognitive Science Labhttp://cocosci.berkeley.edu/

Page 51: Analyzing cultural evolution by iterated learning