Activations, attractors, and associators
Jaap Murre, Universiteit van Amsterdam and Universiteit …
[email protected]
Quiz
• In what ways does a neural network node abstract from a biological neuron?
• Name at least 5 characteristics
Overview
• Interactive activation model
• Hopfield networks
• Constraint satisfaction
• Attractors
• Traveling salesman problem
• Hebb rule and Hopfield networks
• Bidirectional associative networks
• Linear associative networks
Much of perception is dealing with ambiguity
[Slide example: an ambiguous display that can be read as LAB]
Many interpretations are processed in parallel
[Slide example: the same display can also be read as CAB]
The final interpretation must satisfy many constraints. In the recognition of letters and words:
i. Only one word can occur at a given position
ii. Only one letter can occur at a given position
iii. A letter-on-a-position activates a word
iv. A feature-on-a-position activates a letter
i. Only one word can occur at a given position
[Figure: word nodes LAP, CAP, CAB and letter-on-position nodes L.., C.., .A., ..P, ..B, illustrating this constraint]
ii. Only one letter can occur at a given position
iii. A letter-on-a-position activates a word
iv. A feature-on-a-position activates a letter
Recognition of a letter is a process of constraint satisfaction
[Figure sequence: activation flows through the letter/word network over several time steps until the constraints settle on a single interpretation]
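A minimal sketch of this constraint-satisfaction process, with the word and letter nodes from the figures above. The weights, step size, and iteration count are illustrative choices of mine, not the lecture's:

```python
import numpy as np

# Word nodes LAP, CAP, CAB receive excitation from letter-on-position
# nodes and inhibit each other ("only one word per position").
words = ["LAP", "CAP", "CAB"]

# letter -> word excitation: 1 if the word contains that letter-on-position
W_lw = np.array([
    [1, 0, 0],   # L.. supports LAP
    [0, 1, 1],   # C.. supports CAP and CAB
    [1, 1, 1],   # .A. supports all three words
    [1, 1, 0],   # ..P supports LAP and CAP
    [0, 0, 1],   # ..B supports CAB
], dtype=float)
W_ww = -2.0 * (np.ones((3, 3)) - np.eye(3))    # words inhibit each other

letters = np.array([0.0, 1.0, 1.0, 0.0, 1.0])  # features suggest C, A, B
acts = np.zeros(3)
for _ in range(50):                            # let the network settle
    net = W_lw.T @ letters + W_ww @ acts
    acts = np.clip(acts + 0.1 * net, 0.0, 1.0)
print(dict(zip(words, acts.round(2))))         # CAB ends up most active
```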
Hopfield (1982)
• Bipolar activations: $-1$ or $+1$
• Symmetric weights, no self-weights: $w_{ij} = w_{ji}$
• Asynchronous update rule: select one neuron at random and update it
• Simple threshold rule for updating
Energy of a Hopfield network

Energy: $E = -\tfrac{1}{2}\sum_{i,j} w_{ji} a_i a_j$

The part of the energy that depends on node $j$ is
$E_j = -\tfrac{1}{2}\sum_i (w_{ji} a_i + w_{ij} a_i)\,a_j = -\sum_i w_{ji} a_i\,a_j$ (using $w_{ij} = w_{ji}$)

The net input to node $j$ is $\mathrm{net}_j = \sum_i w_{ji} a_i$

Thus, we can write $E_j = -\mathrm{net}_j\,a_j$

Given a net input $\mathrm{net}_j$, find $a_j$ so that $-\mathrm{net}_j\,a_j$ is minimized:
• If $\mathrm{net}_j$ is positive, set $a_j$ to $1$
• If $\mathrm{net}_j$ is negative, set $a_j$ to $-1$
• If $\mathrm{net}_j$ is zero, don't care (leave $a_j$ as is)
• This activation rule ensures that the energy never increases
• Hence, eventually the energy will reach a minimum value
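A sketch of the energy function and the asynchronous threshold update in Python. The random network, its size, and the step count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(W, a):
    # E = -1/2 * sum_{i,j} w_ji * a_i * a_j
    return -0.5 * a @ W @ a

def update_async(W, a, steps=200):
    # Pick one node at random and apply the threshold rule;
    # each update can only lower -net_j * a_j, so E never increases.
    for _ in range(steps):
        j = rng.integers(len(a))
        net_j = W[j] @ a
        if net_j > 0:
            a[j] = 1.0
        elif net_j < 0:
            a[j] = -1.0
        # if net_j == 0: leave a_j as it is
    return a

# Illustrative example: a random symmetric network, zero self-weights
n = 8
W = rng.normal(size=(n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)
a = rng.choice([-1.0, 1.0], size=n)
print("E before:", energy(W, a))
a = update_async(W, a)
print("E after: ", energy(W, a))   # never higher: the state settles
```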
Attractor
• An attractor is a stationary network state (a configuration of activation values)
• This is a state where it is not possible to lower the energy any further by flipping a single activation value
• It may be possible to reach a deeper attractor by flipping many nodes at once
• Conclusion: the Hopfield rule does not guarantee that an absolute energy minimum will be reached
[Figure: energy landscape with attractors at a local minimum and at the global minimum]
Example: 8-queens problem
• Place 8 queens on a chess board such that no queen can take another
• This implies the following three constraints:
– 1 queen per column
– 1 queen per row
– at most 1 queen on any diagonal
• This encoding of the constraints ensures that the attractors of the network correspond to valid solutions
The constraints are satisfied by inhibitory connections
[Figure: each board node inhibits all nodes in its column, its row, and both of its diagonals]
Problem: how do we ensure that exactly 8 nodes are 1?
• A term may be added to the activation rule to control for this
• Binary nodes may be used with a bias
• It is also possible to use continuous-valued nodes with Hopfield networks (e.g., between 0 and 1)
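A rough sketch of such an 8-queens network with binary nodes and a bias. The inhibitory weight of -2 and the bias of +1 are illustrative choices, and, as noted above, the network may settle into a local minimum with fewer than 8 queens:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8

# One binary node per square. Any two squares in the same row, column,
# or diagonal inhibit each other; a positive bias pushes nodes toward 1.
squares = [(r, c) for r in range(N) for c in range(N)]
W = np.zeros((N * N, N * N))
for i, (r1, c1) in enumerate(squares):
    for j, (r2, c2) in enumerate(squares):
        if i != j and (r1 == r2 or c1 == c2 or abs(r1 - r2) == abs(c1 - c2)):
            W[i, j] = -2.0

a = rng.choice([0.0, 1.0], size=N * N)
for _ in range(20000):             # asynchronous threshold updates
    j = rng.integers(N * N)
    a[j] = 1.0 if W[j] @ a + 1.0 > 0 else 0.0

board = a.reshape(N, N).astype(int)
print(board)
print(board.sum(), "queens")       # a local minimum may hold fewer than 8
```

A node stays on only when it conflicts with no other active queen, so the stable states are exactly the boards of mutually non-attacking queens.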
Traveling Salesman Problem
The energy minimization question can also be turned around
• Given $a_i$ and $a_j$, how should we set the weight $w_{ji} = w_{ij}$ so that the energy is minimized?
• $E = -\tfrac{1}{2} w_{ji} a_i a_j$, so that:
– when $a_i a_j = 1$, $w_{ji}$ must be positive
– when $a_i a_j = -1$, $w_{ji}$ must be negative
• For example, $w_{ji} = \varepsilon\, a_i a_j$, where $\varepsilon$ is a learning constant
Hebb and Hopfield
• When used with Hopfield type activation rules, the Hebb learning rule places patterns at attractors
• If a network has n nodes, 0.15n random patterns can be reliably stored by such a system
• For complete retrieval it is typically necessary to present the network with over 90% of the original pattern
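A sketch of Hebbian storage and cued retrieval in a Hopfield network. The network size, the number of patterns, and the amount of corruption in the cue are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
patterns = rng.choice([-1.0, 1.0], size=(10, n))   # 10 < 0.15n = 15

# Hebbian storage: w_ji = sum over patterns of a_i * a_j, no self-weights
W = patterns.T @ patterns
np.fill_diagonal(W, 0)

# Cue with a slightly corrupted version of pattern 0 (95% intact)
a = patterns[0].copy()
flipped = rng.choice(n, size=5, replace=False)
a[flipped] *= -1

for _ in range(3000):              # asynchronous threshold updates
    j = rng.integers(n)
    net_j = W[j] @ a
    if net_j != 0:
        a[j] = np.sign(net_j)

print("recovered:", bool(np.array_equal(a, patterns[0])))  # usually True
```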
Bidirectional Associative Memories (BAM; Kosko, 1988)
• Uses binary nodes (0 or 1)
• Symmetric weights
• Input and output layer
• The layers are updated in turn, using a threshold activation rule
• Nodes within a layer are updated synchronously
BAM
• BAM is in fact a Hopfield network with two layers of nodes
• Within a layer, all weights are 0
• These neurons do not depend on each other (no mutual inputs)
• If they are updated synchronously, there is therefore no danger of increasing the network energy
• BAM is similar to the core of Grossberg's Adaptive Resonance Theory (Lecture 4)
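A minimal BAM sketch with a single stored pair. The outer product of the bipolar (±1) versions of the binary patterns is a common textbook storage rule; the lecture does not specify one:

```python
import numpy as np

x = np.array([1, 0, 1, 0, 1])        # input layer pattern (binary)
y = np.array([1, 1, 0])              # output layer pattern (binary)

# Store via the outer product of the bipolar versions of the patterns
W = np.outer(2 * y - 1, 2 * x - 1)

def step(v, M):
    return (M @ v > 0).astype(int)   # threshold activation rule

# Recall: update the layers in turn until the pair is stable
x_t = np.array([1, 0, 1, 0, 0])      # noisy cue
for _ in range(5):
    y_t = step(x_t, W)               # input layer -> output layer
    x_t = step(y_t, W.T)             # output layer -> input layer
print(x_t, y_t)                      # the stored pair (x, y)
```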
Linear Associative Networks
• Invented by Kohonen (1972), Nakano (1972), and Anderson (1972)
• Two layers
• Linear activation rule: activation is equal to net input
• Can store patterns
• Their behavior is mathematically tractable using matrix algebra
Associating an input vector p with an output vector q

Storage: $W = c\,qp^T$, with $c = (p^Tp)^{-1}$

Recall: $Wp = c\,qp^Tp = c\,(p^Tp)\,q = q$

The inner product $p^Tp$ gives a scalar.
For example, with $p = (3, 0, 1, 4, 0, 1)^T$:

$p^Tp = 9 + 0 + 1 + 16 + 0 + 1 = 27$, so $c = (p^Tp)^{-1} = 1/27$
The outer product $qp^T$ gives a matrix. With output vector $q = (1, 2, 0, 2, 4, 1)^T$ and input vector $p^T = (3, 0, 1, 4, 0, 1)$:

$$qp^T = \begin{pmatrix} 3 & 0 & 1 & 4 & 0 & 1 \\ 6 & 0 & 2 & 8 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 6 & 0 & 2 & 8 & 0 & 2 \\ 12 & 0 & 4 & 16 & 0 & 4 \\ 3 & 0 & 1 & 4 & 0 & 1 \end{pmatrix}$$

Final weight matrix $W = c\,qp^T$:

$$W \approx \begin{pmatrix} 0.11 & 0 & 0.04 & 0.15 & 0 & 0.04 \\ 0.22 & 0 & 0.07 & 0.30 & 0 & 0.07 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0.22 & 0 & 0.07 & 0.30 & 0 & 0.07 \\ 0.44 & 0 & 0.15 & 0.59 & 0 & 0.15 \\ 0.11 & 0 & 0.04 & 0.15 & 0 & 0.04 \end{pmatrix}$$
Recall: $Wp = q$

$$Wp \approx \begin{pmatrix} 0.11 & 0 & 0.04 & 0.15 & 0 & 0.04 \\ 0.22 & 0 & 0.07 & 0.30 & 0 & 0.07 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0.22 & 0 & 0.07 & 0.30 & 0 & 0.07 \\ 0.44 & 0 & 0.15 & 0.59 & 0 & 0.15 \\ 0.11 & 0 & 0.04 & 0.15 & 0 & 0.04 \end{pmatrix} \begin{pmatrix} 3 \\ 0 \\ 1 \\ 4 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 0 \\ 2 \\ 4 \\ 1 \end{pmatrix}$$

For example, for the first two rows:
$0.11 \cdot 3 + 0 \cdot 0 + 0.04 \cdot 1 + 0.15 \cdot 4 + 0 \cdot 0 + 0.04 \cdot 1 = 1$
$0.22 \cdot 3 + 0 \cdot 0 + 0.07 \cdot 1 + 0.30 \cdot 4 + 0 \cdot 0 + 0.07 \cdot 1 = 2$
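The worked example, reproduced in NumPy with the vectors from the slides:

```python
import numpy as np

p = np.array([3.0, 0, 1, 4, 0, 1])   # input vector from the slides
q = np.array([1.0, 2, 0, 2, 4, 1])   # output vector from the slides

c = 1.0 / (p @ p)                    # c = (p^T p)^-1 = 1/27
W = c * np.outer(q, p)               # storage: W = c q p^T

print(np.round(W, 2))                # the weight matrix shown above
print(W @ p)                         # recall: Wp = q -> [1. 2. 0. 2. 4. 1.]
```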
Storing n patterns

Storage: $W_k = c_k\,q_k p_k^T$, with $c_k = (p_k^T p_k)^{-1}$

$W = W_1 + W_2 + \dots + W_k + \dots + W_n$

Recall: $Wp_k = c_k\,q_k p_k^T p_k + \mathrm{Error} = q_k + \mathrm{Error}$

$\mathrm{Error} = W_1 p_k + \dots + W_h p_k + \dots + W_n p_k$ (all terms with $h \neq k$), which is 0 only if $p_h^T p_k = 0$ for all $h \neq k$
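A small demonstration of this crosstalk. The vectors are illustrative, chosen so that one stored pair is orthogonal to the cue and one overlaps it:

```python
import numpy as np

# p1 and p2 are orthogonal; p3 overlaps p1.
p1 = np.array([1.0, 0, 0, 1])
p2 = np.array([0.0, 1, 1, 0])        # p1 @ p2 == 0
p3 = np.array([1.0, 1, 0, 0])        # p1 @ p3 == 1
q1 = np.array([1.0, 0, 0])
q2 = np.array([0.0, 1, 0])
q3 = np.array([0.0, 0, 1])

def store(pairs):
    # W = sum_k c_k q_k p_k^T with c_k = (p_k^T p_k)^-1
    return sum(np.outer(q, p) / (p @ p) for p, q in pairs)

W = store([(p1, q1), (p2, q2)])
print(W @ p1, W @ p2)                # exact recall: inputs are orthogonal

W = store([(p1, q1), (p3, q3)])
print(W @ p1)                        # q1 contaminated by 0.5 * q3
```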
Conclusion
• LANs only work well if the input patterns are (nearly) orthogonal
• If an input pattern overlaps with others, then recall will be contaminated with the output patterns of those overlapping patterns
• It is, therefore, important that input patterns are orthogonal (i.e., have little overlap)
LANs have limited representational power
• For each three-layer LAN, there exists an equivalent two-layer LAN
• Proof: suppose that $q = Wp$ and $r = Vq$; then we have $r = Vq = VWp = Xp$, with $X = VW$
[Figure: a three-layer LAN p → q → r with weight matrices W and V, and the equivalent two-layer LAN p → r with weight matrix X]
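A quick numerical check of this equivalence, with random matrices of illustrative sizes:

```python
import numpy as np

# Composing two linear layers gives one equivalent layer: X = V W.
rng = np.random.default_rng(3)
W = rng.normal(size=(4, 6))   # p -> q
V = rng.normal(size=(3, 4))   # q -> r
X = V @ W                     # p -> r directly

p = rng.normal(size=6)
print(np.allclose(V @ (W @ p), X @ p))   # True
```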
Summing up
• There is a wide variety of ways to store and retrieve patterns in neural networks based on the Hebb rule:
– Willshaw network (associator)
– BAM
– LAN
– Hopfield network
• In Hopfield networks, stored patterns can be viewed as attractors
• Finding an attractor is a process of constraint satisfaction. It can be used as:
– a recognition model
– a memory retrieval model
– a way of solving the traveling salesman problem and other difficult problems