Self-organizing map
Speech and Image Processing Unit, Department of Computer Science, University of Joensuu, FINLAND
Pasi Fränti
Clustering Methods: Part 9
SOM main principles
• Self-organizing map (SOM) is a clustering method especially suitable for visualization.
• The clustering is represented by centroids organized in a 1-d or 2-d network.
• Dimensionality reduction and visualization are obtained as a side product.
• Clustering is performed by the competitive learning principle.
Self-organizing map: Initial configuration
• M nodes, one for each cluster.
• Initial locations not important.
• Nodes connected by network topology (1-d or 2-d).
Self-organizing map: Final configuration
• Node locations adapt during the learning stage.
• Network keeps neighbor vectors close to each other.
• Network limits the movement of vectors during learning.
SOM pseudo code (1/2)
SOM-algorithm(X, C):
    RandomizeNeurons(C)
    D ← Dmax
    t ← 0
    REPEAT
        REPEAT T times            -- learning stage
            LearnVectors(X, C, D, t)
            t ← t + 1
        END-REPEAT
        D ← D − 1
    UNTIL D < 0
END-SOM
SOM pseudo code
LearnVectors( X, C, D, t ): FOR each xi X DO
j FindNearestNeuron(xi, C). FOR d = -D to D DO
p j+d cp cp + (t,d)(xi -cp)
END-FOR END-FOR
END
Update
centroids
(2/2)
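As a concrete illustration, the two-part pseudocode can be sketched in Python for a 1-d network. This is a minimal sketch, not the slides' own implementation: the squared Euclidean distance, the exponential weighting γ(t,d), and the clamping of neighbor indices at the ends of the network are all assumptions.

```python
import random
import math

def som_train(X, M, Dmax, T, A=0.5):
    """Minimal 1-d SOM sketch. X: list of data vectors, M: number of
    neurons, Dmax: initial neighborhood size, T: iterations per
    neighborhood size, A: maximum learning rate (assumed default)."""
    # RandomizeNeurons(C): initialize neurons to random data vectors
    C = [list(random.choice(X)) for _ in range(M)]
    t, D = 0, Dmax
    t_total = T * (Dmax + 1)          # total number of learning passes
    while D >= 0:                     # UNTIL D < 0
        for _ in range(T):            # REPEAT T times
            learn_vectors(X, C, D, t, t_total, A)
            t += 1
        D -= 1                        # shrink the neighborhood
    return C

def learn_vectors(X, C, D, t, t_total, A):
    for x in X:
        # j <- FindNearestNeuron(x, C): closest centroid by squared distance
        j = min(range(len(C)),
                key=lambda p: sum((a - b) ** 2 for a, b in zip(x, C[p])))
        for d in range(-D, D + 1):
            p = j + d
            if 0 <= p < len(C):       # stay inside the 1-d network
                # assumed exponentially decaying weight gamma(t, d);
                # D + 1 avoids division by zero when D = 0
                g = A * math.exp(-t / t_total) * math.exp(-abs(d) / (D + 1))
                C[p] = [c + g * (a - c) for c, a in zip(C[p], x)]
```

For example, training two neurons on four one-dimensional points clustered near 0 and 1 keeps both centroids inside the data range while they drift toward the clusters.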
Competitive learning
• Each data vector is processed once.
• Find nearest centroid:
  j = arg min_{1 ≤ p ≤ M} d(xi, cp)
• The centroid is updated by moving it towards the data vector by:
  cj ← cj + γ(t,d)·(xi − cj)
• Learning stage is similar to k-means, but the centroid update follows a different principle.
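A one-step numeric illustration of the winner search and the update rule; the fixed weight γ = 0.5 is an illustrative value, not from the slides:

```python
# One competitive-learning step with an illustrative fixed weight.
xi = [2.0, 0.0]                        # data vector
C = [[0.0, 0.0], [5.0, 0.0]]           # two centroids
# winner = centroid with smallest squared Euclidean distance to xi
j = min(range(len(C)),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(xi, C[p])))
gamma = 0.5
# move the winner halfway towards the data vector
C[j] = [c + gamma * (x - c) for c, x in zip(C[j], xi)]
print(j, C[j])  # → 0 [1.0, 0.0]
```

The first centroid wins (distance 4 vs. 9) and moves from [0, 0] halfway to [2, 0], ending at [1, 0]; with γ = 1 the step would reduce to the k-means assignment of the vector to its nearest centroid.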
Learning rate (γ)
• Decreases with time → movement is large in the beginning but eventually stabilizes.
• Linear decrease of weighting:
  γ(t,d) = A·((T−t)/T)·e^(−|d|/D)
• Exponential decrease of weighting:
  γ(t,d) = A·e^(−t/T)·e^(−|d|/D)
Neighborhood (d)
• Neighboring centroids are also updated:
  γ(t,d) = A·e^(−t/T)·e^(−|d|/D)
• Effect is stronger for nearby centroids: weighting of the neighborhood decreases exponentially with the distance d.
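The two weighting schedules can be sketched as follows: a linear and an exponential decrease in time, each combined with an exponential decay over the network distance d. The concrete constants (A = 0.5, T = 100, D = 5) are assumptions for illustration only.

```python
import math

A, T, D = 0.5, 100, 5   # assumed max learning rate, iterations, neighborhood size

def gamma_linear(t, d):
    # linear decrease in time, exponential decay over network distance
    return A * (T - t) / T * math.exp(-abs(d) / D)

def gamma_exp(t, d):
    # exponential decrease in both time and network distance
    return A * math.exp(-t / T) * math.exp(-abs(d) / D)

# both start at A for the winner (d = 0, t = 0) and shrink over time
print(gamma_linear(0, 0), gamma_exp(0, 0))  # → 0.5 0.5
```

Both schedules give the full weight A to the winning neuron at the start; the weight shrinks as training progresses and as the network distance |d| from the winner grows.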
Parameter setup
• Number of iterations T
  – Convergence of SOM is rather slow → should be set as high as possible.
  – Roughly 100–1000 iterations at minimum.
• Size of the initial neighborhood Dmax
  – Small enough to allow local adaption.
  – Value D=0 indicates no neighbor structure.
• Maximum learning rate A
  – Higher values have a mostly random effect.
  – Most critical are the final stages (D ≤ 2).
  – Optimal choices of A and Dmax are highly correlated.
Difficulty of parameter setup
[Plot: MSE as a function of the initial neighborhood size (1–40), for parameter combinations such as Dmax=2, T=10; Dmax=5, T=4; Dmax=10, T=2.]
Fixing the total number of iterations (T·Dmax) to 20, 40 and 80.
Optimal parameter combination is non-trivial.
Adaptive number of iterations
• To reduce the effect of parameter set-up, T should be as high as possible.
• Enough time to adapt, at the cost of high time complexity.
• Adaptive number of iterations:
  T(D) = Tmax / 2^D
• For Dmax=10 and Tmax=100:
  Ti = {1, 1, 1, 1, 2, 3, 6, 13, 25, 50, 100}
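The listed schedule halves the iteration count for each additional neighborhood size. A short helper reproduces it; the half-up rounding and the minimum of one iteration per level are assumptions made to match the listed values:

```python
def adaptive_iterations(Dmax, Tmax):
    # Halve the iteration count for each extra neighborhood size D,
    # listed from D = Dmax down to D = 0.
    # Assumed conventions: round half up, at least 1 iteration per level.
    return [max(1, int(Tmax / 2 ** D + 0.5)) for D in range(Dmax, -1, -1)]

print(adaptive_iterations(10, 100))
# → [1, 1, 1, 1, 2, 3, 6, 13, 25, 50, 100]
```

Most of the work is thus spent on the final stages with a small neighborhood, which the parameter-setup discussion above identifies as the most critical ones.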
Example of SOM (1-d)
• One cluster too many.
• One cluster missing.
Example of SOM (2-d)
(to appear sometime in future)
Literature
1. T. Kohonen, Self-Organization and Associative Memory. Springer-Verlag, New York, 1988.
2. N.M. Nasrabadi and Y. Feng, "Vector quantization of images based upon the Kohonen self-organization feature maps", Neural Networks, 1 (1), 518, 1988.
3. P. Fränti, "On the usefulness of self-organizing maps for the clustering problem in vector quantization", 11th Scandinavian Conf. on Image Analysis (SCIA'99), Kangerlussuaq, Greenland, vol. 1, 415-422, 1999.