EASy 7 December 2009Artificial Life Lecture 181 Artificial Life lecture 18 Evolution and Learning...

7 December 2009

Artificial Life Lecture 18

1

EASy

Artificial Life lecture 18Artificial Life lecture 18Artificial Life lecture 18Artificial Life lecture 18

Evolution and Learning

(A) Evolution of Learning(B) The Baldwin Effect (this week’s seminars)(C) A teaser …

7 December 2009


2

EASy

Evolution and LearningEvolution and LearningEvolution and LearningEvolution and Learning

Exploring Adaptive Agency II: Simulating the Evolution of Associative LearningPeter Todd and Geoffrey Miller, pp. 306-315 inFrom Animals to Animats, J-A Meyer & S. Wilson (eds)MIT Press 1991 (Proc of SAB90)

Looking at some aspects of the relationship betweenevolution and learning.

ALSO: look at how the model is simplified as far as is reasonably possible – lessons for your Alife project

7 December 2009


3

EASy

Evolution and LearningEvolution and LearningEvolution and LearningEvolution and Learning

Darwinian Evolution is done by a population (which may, or may not, be composed of learning individuals).

As shorthand, one might talk of a population ‘learning’ to adapt to changed environmental conditions (eg climate change) through evolution.

But usually it avoids confusion if one talks of a population evolving, an individual learning.

7 December 2009


4

EASy

Why bother to Learn - 1?Why bother to Learn - 1?Why bother to Learn - 1?Why bother to Learn - 1?

Todd & Miller suggest 2 reasons:

(1) Learning is a cheap way of getting complex behaviours, which would be rather expensive if 'hard-wired' by evolution.Eg: parental imprinting in birds –

'the first large moving thing you see is mum, learn to recognise her'.

7 December 2009


5

EASy

Why bother to Learn – 2?Why bother to Learn – 2?Why bother to Learn – 2?Why bother to Learn – 2?

(2) Learning can track environmental changes faster than evolution.

Eg: it might take millennia for humans to evolve so as to speak English at birth, and would require the English language not to change too much.

But we have evolved to learn whatever language we are exposed to as children -- hence have no problems keeping up with language changes.

7 December 2009


6

EASy

Learning needs FeedbackLearning needs FeedbackLearning needs FeedbackLearning needs Feedback

Several kinds of feedback:Supervised -- teacher tells you exactly what you should have doneUnsupervised -- you just get told good/bad, but not what was wrong or how to improve.(..cold...warm...warmer...)

Evolution roughly equates to unsupervised learning -- if a creature dies early then this is negative feedback as far as its chances of 'passing on its genes' are concerned -- but evolution doesn’t 'suggest what it should have done'.

7 December 2009


7

EASy

Evolving your own feedback?Evolving your own feedback?Evolving your own feedback?Evolving your own feedback?

However, it may be possible, under some circumstances, for evolution to create, within 'one part of an organism', some subsystem that can act as a 'supervisor' for another subsystem.

Cf.DH Ackley & ML Littman. Interactions between learning and evolution. In Artificial Life II, Langton et al (eds), Addison Wesley 1991.

Later work by same authors on Evolutionary Reinforcement Learning

7 December 2009


8

EASy

The Todd & Miller modelThe Todd & Miller modelThe Todd & Miller modelThe Todd & Miller model

Creatures come across food (+10 pts) and poison (-10)

Food and poison always have different smells, sweet and sour. BUT sometimes smells drift around, and cannot be reliably distinguished. In different worlds, the reliability of smell is x% where 50 <= x <= 100.

Food and poison always have different colours, red and green. BUT in some worlds it is food-red poison-green, in other worlds it is food-green poison-red. The creatures’vision is always perfect, but 'they dont know whetherred is safe or dangerous'.

7 December 2009


9

EASy

The model – ctd.The model – ctd.The model – ctd.The model – ctd.

Maybe they can learn, using their unreliable smell?Todd & Miller claim that simplest associative learning needs:(A) an input that (unreliably) senses what is known to be good or bad (smell)(B) another sensor such as that for colour, above(C) output such that behaviour alters fitness.(D) an evolved, fixed connection (A)->(C) with the appropriate weighting (+ve for good, -ve for bad)(E) a learnable, plastic connection (B)->(C) which can be built up by association with the activation of (C)

7 December 2009


10

EASy

Their ‘Brains’Their ‘Brains’Their ‘Brains’Their ‘Brains’

Creatures brains are genetically specified, with exactly3 neurons connected thus:

7 December 2009


11

EASy

Genetic specificationGenetic specificationGenetic specificationGenetic specification

The genotype specifies for each neuron whether it isinput sweet-sensorinput sour-sensorinput red-sensorinput green-sensorhidden unit or interneuronoutput or decision unit: eat/dont-eat

and for each neuron the bias (0, 1, 2, 3) (+/-)and for each link the weight (0, 1, 2, 3) (+/-)and whether weight is fixed or plastic

7 December 2009


12

EASy

Plastic linksPlastic linksPlastic linksPlastic links

Some of the links between neurons are fixed, some are plastic.

For plastic weights on a link from P to Q, Hebb rule:Change in WEIGHTPQ = k * AP * AQ where AP is the current activation of neuron P.

I.e.:- if the before and after activations are the same sign (tend to be correlated), increase strength of linkIf opposite sign (anti-correlated), decrease strength

7 December 2009


13

EASy

Within each neuronWithin each neuronWithin each neuronWithin each neuron

The sigmoid is a 'squashing function' such as

xe

xf 1

1

7 December 2009


14

EASy

Possible brainsPossible brainsPossible brainsPossible brains

So one possible genetically designed brain would be this:colour-blind eater – this is not a learner.

Whatever neuron 2 is, the links are 0, hence it can be ignored. This depends purely on smell, and has theconnection wired up with the right +ve sign, so that it eats things that smell sweet.

7 December 2009


15

EASy

Is a colour-blind eater any good?Is a colour-blind eater any good?Is a colour-blind eater any good?Is a colour-blind eater any good?

This creature depends purely on smell, and has theconnection wired up with the right +ve sign, so that it eats things that smell sweet.

Though it may occasionally ignore food that (noisily)smells sour, and eat poison that (noisily) smells sweet,on average it will do better than random.

So evolving within a population, this (non-learning) design will do better than random, and increase in the population.

7 December 2009


16

EASy

A learning brainA learning brainA learning brainA learning brain

This different possible design is a colour-learner

If this one is born in a world where smell is 75% accurate, and food is red, then (more often than not) seeing red is associated with the positive output.

7 December 2009


17

EASy

Is a colour-learner any good?Is a colour-learner any good?Is a colour-learner any good?Is a colour-learner any good?

… seeing red is associated with the positive output.So, with Hebbs rule, the RH connection gets built up +ve more strongly, until it fires the output on its own -- it can even over-rule the smell input on the 25% of occasions when it is mistaken.

Contrariwise, if it is born in a world where food is green, then Hebbs rule will build up a strong –ve connection -- with the same results.

So over a period, this will do better than the previous one

7 December 2009


18

EASy

Evolutionary runsEvolutionary runsEvolutionary runsEvolutionary runs

Populations are initialised randomly.

Of course many have no input neurons, or no output neurons, or stupid links

- these will behave stupidly or not at all.

The noise level on smell is set at some fixed value between 50% accurate (chance) and 100% accurate.

Then individuals are tested in a number of worlds where (50/50) red is food or red is poison.

7 December 2009


19

EASy

Keeping statisticsKeeping statisticsKeeping statisticsKeeping statistics

Statistics are kept for how long it takes for colour-blind eater to turn up and take over the population; and for colour-learner likewise.[Statistics: multiple runs to reduce chance elements, keep record of standard deviations]

When smell noise-level is 50%, then there is no available information, no improvement seen.When smell is 100% accurate, then colour-learning is unnecessary.What happens between 50% and 100% ?

7 December 2009


20

EASy

Results on a graphResults on a graphResults on a graphResults on a graph

Colour-learner= top line colour-blind eater=bottom line

7 December 2009


21

EASy

U-shaped curveU-shaped curveU-shaped curveU-shaped curve

As smell-sense accuracy goes from 50% (left) to 100% (right), average number of generations needed to evolve colour-blind eater decreases steadily (lower line)

But average number of generations needed to evolvecolour-learner goes through the U-shaped curve (upper line)

7 December 2009


22

EASy

Lessons for doing (Alife) projectsLessons for doing (Alife) projectsLessons for doing (Alife) projectsLessons for doing (Alife) projects

Make your model as simple as possible whilst still showing the phenomenon of interest.

More complicated models, with lots of bells and whistles,

(a) Are more difficult to get to work AND

(b) If they do work, they prove less than the simple model! Because it is not clear (without further investigation) which bit of your model actually was crucial, and which bit is unnecessary.

7 December 2009


23

EASy

More lessonsMore lessonsMore lessonsMore lessons

Do comparisons: your preferred model versus a simple variant (appropriate for the task) – e.g. compare with random search (if appr), or with one part disabled (if appr), or with a standard textbook method (if appr), or …

Graphs can be very informative -- provided the axes and scales are clearly labelled, and the caption or text explains them fully.

Usually it is appropriate to report mean (average) of several runs, and also to report standard deviations.

7 December 2009


24

EASy

Baldwin Effect – more detail in seminarBaldwin Effect – more detail in seminarBaldwin Effect – more detail in seminarBaldwin Effect – more detail in seminar

A much more subtle and interesting interaction between Learning and Evolution is the Baldwin effect -- named after Baldwin (1896).

Roughly speaking, this is an effect such that, under some circumstances, the ability of creatures to learn something guides evolution, such that in later generations their descendants can do the same job innately, without the need to learn.

7 December 2009


25

EASy

Speaking English at birth?Speaking English at birth?Speaking English at birth?Speaking English at birth?

Human children are born with the capacity to learn a language within their first few years.

Pick up the local language, English, Japanese, Inuit etc …

But suppose that within 1000 years everybody speaks English – then it won’t be necessary to work out “what happens to be the local language”

7 December 2009


26

EASy

Saving energy …Saving energy …Saving energy …Saving energy …

If any child has a mutation that predisposes it to saying “mummy, daddy” earlier than the other children

Then on average it will do slightly better at kindergarten --- at school --- in life --- have more children – this mutation could be expected to spread.

That child would have less learning to do, which would save energy and increase its fitness

7 December 2009


27

EASy

In the long termIn the long termIn the long termIn the long term

All human children will be born speaking good English at birth!

OK, where are the problems, what is in the small print?

7 December 2009


28

EASy

Ostrich bumsOstrich bumsOstrich bumsOstrich bums

Development and learning are in some sense much the same .

Often development is on a longer time-scale than learning, and development is constrained to just one (or a few) pathways, whereas learning may have many possible pathways (eg many different languages).

In the ‘everyone speaks English’ scenario, what was originally learnt becomes a standard developmental pathway – like ostrich bums

7 December 2009


29

EASy

Ostriches, long grass, sharp rocksOstriches, long grass, sharp rocksOstriches, long grass, sharp rocksOstriches, long grass, sharp rocks

7 December 2009


30

EASy

CallositiesCallositiesCallositiesCallosities

Figure 4 | The callosities on the ventralsurface of the ostrich. The callosities aredepicted by arrows. How did they becomeassimilated into the genome?

7 December 2009


31

EASy

Is it Lamarckian?Is it Lamarckian?Is it Lamarckian?Is it Lamarckian?

This sounds like Lamarckism -- "giraffes stretched their necks to reach higher trees, and the increased neck-length in the adults was directly inherited by their children".

Lamarckism of this type is almost universally considered impossible -- why ??

The Baldwin effect gives the impression of Lamarckism, without the flaws.

To be pursued in seminar

7 December 2009


32

EASy

Public Service AnnouncementPublic Service AnnouncementPublic Service AnnouncementPublic Service Announcement

On all your courses, including this one, please go to your Study Direct page and give FEEDBACK – this week !

7 December 2009


33

EASy

Feedback on Alife course?Feedback on Alife course?Feedback on Alife course?Feedback on Alife course?

Any verbal feedback now? And in seminars.

What was missing that you expected?

And a teaser (cf ‘Circular causation’…)

7 December 2009


34

EASy

Circular Causation poser …Circular Causation poser …Circular Causation poser …Circular Causation poser …

A cart with its wheels linked through gears to a propellor.

Can it run directly downwind faster than the wind?

EASy 7 December 2009Artificial Life Lecture 181 Artificial Life lecture 18 Evolution and Learning...

Documents

Transcript of EASy 7 December 2009Artificial Life Lecture 181 Artificial Life lecture 18 Evolution and Learning...