Simple learning in connectionist networks


Transcript of Simple learning in connectionist networks

Page 1: Simple learning in connectionist networks

Simple learning in connectionist networks

Page 2: Simple learning in connectionist networks

Learning associations in connectionist networks

Associationism (James, 1890)

Association by contiguity

Generalization by similarity

Connectionist implementation

Represent items as patterns of activity, where similarity is measured in terms of overlap/correlation of patterns

Represent contiguity of items as simultaneous presence of their patterns over two groups of units (A and B)

Adjust weights on connections between units in A and units in B so that the pattern on A tends to cause the corresponding pattern on B

As a result, when we next present the same or similar pattern on A, it tends to produce the corresponding pattern on B (perhaps somewhat weakened or distorted)
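
To make this concrete, here is a minimal sketch (not from the slides; the items and unit counts are made up for illustration) of items represented as ±1 activity patterns, with similarity measured as the overlap/correlation between patterns:

```python
import numpy as np

# Three hypothetical items coded as patterns of activity over 6 units (+1 = active, -1 = inactive)
item_x = np.array([+1, +1, -1, -1, +1, -1])
item_y = np.array([+1, +1, -1, -1, -1, +1])   # shares 4 of 6 unit states with item_x
item_z = np.array([-1, -1, +1, +1, -1, +1])   # the opposite of item_x on every unit

def similarity(p, q):
    """Overlap between two patterns; for balanced +/-1 codes this equals their correlation."""
    return np.dot(p, q) / len(p)

print(similarity(item_x, item_y))   # ~0.33: similar items have overlapping patterns
print(similarity(item_x, item_z))   # -1.0: maximally dissimilar patterns
```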

Page 3: Simple learning in connectionist networks

Correlational learning: Hebb rule

What Hebb actually said:

When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.

The minimal version of the Hebb rule:

When there is a synapse between cell A and cell B, increment the strength of the synapse whenever A and B fire together (or in close succession).

The minimal Hebb rule as implemented in a network:
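
The network itself appears only as an image in the original slides, so the following is a minimal sketch of how the rule could be implemented for two groups of units, A (sending) and B (receiving); the learning rate ε and the ±1 patterns are assumptions for illustration:

```python
import numpy as np

def hebb_update(W, a_pattern, b_pattern, epsilon=1.0):
    """Minimal Hebb rule: increment W[j, i] whenever sending unit i (group A)
    and receiving unit j (group B) are active together.
    With +/-1 activations, delta_W[j, i] = epsilon * b[j] * a[i]."""
    return W + epsilon * np.outer(b_pattern, a_pattern)

# Associate one pattern on A with one pattern on B
a = np.array([+1, -1, +1])            # pattern on sending group A
b = np.array([+1, +1, -1])            # pattern on receiving group B
W = np.zeros((len(b), len(a)))        # no associations to start with
W = hebb_update(W, a, b)

# Presenting the same pattern on A now tends to reproduce the stored pattern on B
net = W @ a                           # net input to each B unit
print(np.where(net > 0, +1, -1))      # [ 1  1 -1], i.e. the stored B pattern
```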

Page 4: Simple learning in connectionist networks

Why?

Page 5: Simple learning in connectionist networks
Page 6: Simple learning in connectionist networks

So the weight from a sending unit to a receiving unit ends up being proportional to their correlation over patterns…
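
The intervening "Why?" slides are images, but the argument can be reconstructed from the rule itself (a sketch in the slides' notation, with sending unit i, receiving unit j, learning rate ε, and training patterns p = 1…P): each pairing adds ∆wij = ε aj ai, so after training wij = ε Σp aj(p) ai(p). If each unit's activations have zero mean and unit variance across patterns (as with balanced ±1 codes), (1/P) Σp aj(p) ai(p) is exactly the correlation of the two units' activities, so the accumulated weight is proportional to that correlation.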

Page 7: Simple learning in connectionist networks
Page 8: Simple learning in connectionist networks

Final weights: +4, 0

Page 9: Simple learning in connectionist networks
Page 10: Simple learning in connectionist networks
Page 11: Simple learning in connectionist networks
Page 12: Simple learning in connectionist networks
Page 13: Simple learning in connectionist networks
Page 14: Simple learning in connectionist networks

Brad Pitt, Brad Postle, Jennifer Aniston, Jennifer Lawrence

Face perception units

Page 15: Simple learning in connectionist networks

Face perception neurons (“Sending units”)

Name neurons (“receiving units”)

“Synapses”

Page 16: Simple learning in connectionist networks

Simple learning model

• Imagine neurons have a threshold set at 0. If input is above threshold (positive number), the neuron fires strongly. If it is below threshold (negative number), the neuron fires weakly.

• When a face is directly observed, the features it possesses get positive input, all others get negative input.

• When a name is presented, it gets positive input, all others get negative input.
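
A minimal sketch of this coding scheme (the feature names are hypothetical; the +1/−1 inputs and the threshold at 0 are as described above):

```python
import numpy as np

# Hypothetical face features; observing a face gives +1 input to features it has, -1 to the rest
features = ["dark_hair", "beard", "blue_eyes", "square_jaw"]
observed_face = {"dark_hair", "square_jaw"}
face_input = np.array([+1 if f in observed_face else -1 for f in features])

# Threshold at 0: above-threshold input -> fire strongly (+1), below -> fire weakly (-1)
face_states = np.where(face_input > 0, +1, -1)
print(face_states)   # [ 1 -1 -1  1]
```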

Page 17: Simple learning in connectionist networks

Face perception neurons

Name neurons

Brad Pitt

Page 18: Simple learning in connectionist networks

Face perception neurons

Name neurons

Brad Postle

Page 19: Simple learning in connectionist networks

Learning to name faces

• Goal: Encode “memory” of which name goes with which face…

• …so that, when given a face, the correct name units activate.

• Face units are “sending units”, name units are “receiving units”

• Each synapse has a “weight” that describes the effect of the sending unit on the receiving unit

– Negative weight = inhibitory influence, positive weight = excitatory influence

• Activation rule for name units:

– Take the state of each sending unit * the value of the synapse (weight), summed over all sending units, to get the net input.

– Activate strongly if net input > 0, otherwise activate weakly.
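
A sketch of that activation rule in code (the weight values and the two name units are illustrative, not taken from the slides):

```python
import numpy as np

def name_unit_states(face_states, W):
    """Net input to each name unit = state of each sending (face) unit * synaptic weight,
    summed over all sending units; activate strongly (+1) if net input > 0, else weakly (-1)."""
    net_input = W @ face_states            # W[j, i]: weight from face unit i to name unit j
    return np.where(net_input > 0, +1, -1)

W = np.array([[ 0.5, -0.5,  0.5, -0.5],    # synapses onto a "Pitt" name unit
              [-0.5,  0.5, -0.5,  0.5]])   # synapses onto a "Postle" name unit
face = np.array([+1, -1, +1, -1])          # states of the four face perception units
print(name_unit_states(face, W))           # [ 1 -1]: the "Pitt" unit activates strongly
```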

Page 20: Simple learning in connectionist networks

What happens if the synapses (weights) are all zero?

Page 21: Simple learning in connectionist networks

Face perception neurons

Name neurons

Page 22: Simple learning in connectionist networks

Hebbian learning

Page 23: Simple learning in connectionist networks

Face perception neurons

Name neurons

Brad Pitt

Page 24: Simple learning in connectionist networks

Face perception neurons

Name neurons

Page 25: Simple learning in connectionist networks

Face perception neurons

Name neurons

Brad Postle

Page 26: Simple learning in connectionist networks

Face perception neurons

Name neurons

Brad Postle

Page 27: Simple learning in connectionist networks

Face perception neurons

Name neurons

Brad Pitt

Page 28: Simple learning in connectionist networks

What about new inputs?

Page 29: Simple learning in connectionist networks

Face perception neurons

Name neurons

Not Brad Pitt or Postle

Page 30: Simple learning in connectionist networks

What about more memories?

Page 31: Simple learning in connectionist networks

Face perception neurons

Name neurons

Jennifer Aniston

Page 32: Simple learning in connectionist networks

Face perception neurons

Name neurons

Jennifer Lawrence

Page 33: Simple learning in connectionist networks

Face perception neurons

Name neurons

Page 34: Simple learning in connectionist networks

• With Hebbian learning, many different patterns can be stored in a single configuration of weights.

• What would happen if we kept training this model with the same four faces?

• Same weight configuration, just with larger weights.
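
A quick check of that claim, using the same minimal Hebb rule as in the earlier sketch (the two face/name patterns here are illustrative):

```python
import numpy as np

faces = np.array([[+1, -1, +1, -1],
                  [+1, -1, -1, +1]])   # two illustrative face patterns
names = np.array([[+1, -1],
                  [-1, +1]])           # their corresponding name patterns

def train(passes, epsilon=1.0):
    W = np.zeros((names.shape[1], faces.shape[1]))
    for _ in range(passes):
        for face, name in zip(faces, names):
            W += epsilon * np.outer(name, face)   # minimal Hebb rule
    return W

print(np.array_equal(train(2), 2 * train(1)))   # True: same configuration, just larger weights
```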

Page 35: Simple learning in connectionist networks

Hebbian learning

• Captures “contiguity” of learning (things that occur together should become associated in memory)

• Learning rule (∆wij = ε aj ai): accumulated over training patterns, each weight becomes proportional to the correlation between the two units’ activations.

• Captures effects of “similarity” in learning (a learned response should generalize to things with similar representations)

• With linear units (ai = neti), the output is a weighted sum of the unit’s activations on the previously stored patterns…

• …where the “weight” in the sum is proportional to the similarity (dot product) between the current input pattern and the previously stored pattern

• So, a test pattern that is completely dissimilar to a stored pattern (dot product of zero) will not be affected by the stored pattern at all.
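
Written out for the linear case (a sketch): if the weights were built by Hebbian learning from stored input/output pairs a(p), b(p), then W = ε Σp b(p) a(p)ᵀ, and the response to a test input a is W a = ε Σp (a(p) · a) b(p). The output is a sum of the stored output patterns, each weighted by the dot-product similarity between the test input and the corresponding stored input, so a stored pattern whose input is orthogonal to the test pattern (dot product of zero) contributes nothing.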

Page 36: Simple learning in connectionist networks

Limitations of Hebbian learning

• Many association problems it cannot solve

• Especially where similar input patterns must produce quite different outputs.

• Without further constraints, weights grow without bound

• Each weight is learned independently of all others

• Weight changes are exactly the same from pass to pass.

Page 37: Simple learning in connectionist networks

Error-correcting learning (delta rule)

Page 38: Simple learning in connectionist networks

Face perception neurons

Name neurons

Brad Pitt

Target states

Error!

Squared error!

Page 39: Simple learning in connectionist networks
Page 40: Simple learning in connectionist networks
Page 41: Simple learning in connectionist networks

Input 1   Input 2   Output
   1         0         1
   0         1         0
   0         0         0
   1         1         1

(Network diagram: input units 1 and 2 connect to output unit 3 through weights w1 and w2)
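
A sketch of delta-rule training on this example, assuming a linear output unit (unit 3), a small learning rate, and the update described on the following slides (∆w proportional to error × input):

```python
import numpy as np

# Training set from the slide: the output should follow Input 1 and ignore Input 2
inputs  = np.array([[1, 0], [0, 1], [0, 0], [1, 1]], dtype=float)
targets = np.array([ 1,      0,      0,      1    ], dtype=float)

w = np.zeros(2)          # w1, w2
epsilon = 0.1            # assumed learning rate

for _ in range(200):     # multiple passes; the changes shrink as the error shrinks
    for x, t in zip(inputs, targets):
        a = np.dot(w, x)             # linear output unit
        w += epsilon * (t - a) * x   # delta rule: proportional to error * input

print(np.round(w, 2))    # ~[1. 0.]: weight 1 from Input 1, weight 0 from Input 2
```

For comparison, a purely Hebbian weight on this 0/1 coding would be proportional to (2, 1) (the sum of target × input over the four patterns), which drives the output above threshold for input (0, 1) even though its target is 0; this is an instance of the limitation noted on the earlier slide about similar inputs that must produce different outputs.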

Page 42: Simple learning in connectionist networks
Page 43: Simple learning in connectionist networks

Delta-rule learning with linear units

• Captures intuition that learning should reduce discrepancy between what you predict and what actually happens (error).

• Basic approach: Measure error for current input/output pair; compute derivative of this error with respect to each weight; change weight by small amount to reduce error.

• For squared error, the change in each weight is proportional to (tj – aj) ai (derived in the sketch after this list)

• …that is, the correlation between the input activation and the current error.

• Weight changes can be viewed as performing “gradient descent” in error (each change will reduce total error).

• Algorithm is inherently multi-pass: changes to a given weight will vary from pass to pass, because the error depends on the current state of the weights.
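
Filling in the derivative step (a sketch for a single input/output pair, with linear units aj = netj = Σi wij ai and squared error E = ½ Σj (tj – aj)², where the ½ is just for convenience): ∂E/∂wij = -(tj – aj) ai, so taking a small step down the gradient gives ∆wij = ε (tj – aj) ai, the rule stated above.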

Page 44: Simple learning in connectionist networks

Delta rule will learn any association for which a solution exists!

Page 45: Simple learning in connectionist networks
Page 46: Simple learning in connectionist networks
Page 47: Simple learning in connectionist networks