Self-organizing map
Speech and Image Processing Unit, Department of Computer Science, University of Joensuu, FINLAND
Pasi Fränti
Clustering Methods: Part 9
SOM main principles
• Self-organizing map (SOM) is a clustering method especially suitable for visualization.
• The clustering is represented by centroids organized in a 1-d or 2-d network.
• Dimensionality reduction and visualization are obtained as a side product.
• Clustering is performed by the competitive learning principle.
Self-organizing map: Initial configuration
• M nodes, one for each cluster.
• Initial locations not important.
• Nodes connected by network topology (1-d or 2-d).
Self-organizing map: Final configuration
• Node locations adapt during the learning stage.
• Network keeps neighbor vectors close to each other.
• Network limits the movement of vectors during learning.
SOM pseudo code (1/2)
SOM-algorithm(X, C):
    RandomizeNeurons(C)
    D ← Dmax
    t ← 0
    REPEAT
        REPEAT T times            -- learning stage
            LearnVectors(X, C, D, t)
            t ← t + 1
        END-REPEAT
        D ← D − 1
    UNTIL D < 0
END-SOM
SOM pseudo code
LearnVectors( X, C, D, t ): FOR each xi X DO
j FindNearestNeuron(xi, C). FOR d = -D to D DO
p j+d cp cp + (t,d)(xi -cp)
END-FOR END-FOR
END
Update
centroids
(2/2)
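As a concrete illustration, the two-part pseudocode can be sketched in Python for a 1-d network. This is a minimal sketch, not the slides' own implementation: the squared Euclidean distance, the exponential weighting γ(t,d), and the clamping of neighbor indices at the ends of the network are all assumptions.

```python
import random
import math

def som_train(X, M, Dmax, T, A=0.5):
    """Minimal 1-d SOM sketch. X: list of data vectors, M: number of
    neurons, Dmax: initial neighborhood size, T: iterations per
    neighborhood size, A: maximum learning rate (assumed default)."""
    # RandomizeNeurons(C): initialize neurons to random data vectors
    C = [list(random.choice(X)) for _ in range(M)]
    t, D = 0, Dmax
    t_total = T * (Dmax + 1)          # total number of learning passes
    while D >= 0:                     # UNTIL D < 0
        for _ in range(T):            # REPEAT T times
            learn_vectors(X, C, D, t, t_total, A)
            t += 1
        D -= 1                        # shrink the neighborhood
    return C

def learn_vectors(X, C, D, t, t_total, A):
    for x in X:
        # j <- FindNearestNeuron(x, C): closest centroid by squared distance
        j = min(range(len(C)),
                key=lambda p: sum((a - b) ** 2 for a, b in zip(x, C[p])))
        for d in range(-D, D + 1):
            p = j + d
            if 0 <= p < len(C):       # stay inside the 1-d network
                # assumed exponentially decaying weight gamma(t, d);
                # D + 1 avoids division by zero when D = 0
                g = A * math.exp(-t / t_total) * math.exp(-abs(d) / (D + 1))
                C[p] = [c + g * (a - c) for c, a in zip(C[p], x)]
```

For example, training two neurons on four one-dimensional points clustered near 0 and 1 keeps both centroids inside the data range while they drift toward the clusters.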
Competitive learning
• Each data vector is processed once.
• Find nearest centroid:
  j = arg min_{1 ≤ p ≤ M} d(xi, cp)
• The centroid is updated by moving it towards the data vector by:
  cj ← cj + γ(t,d)·(xi − cj)
• Learning stage is similar to k-means, but the centroid update follows a different principle.
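A one-step numeric illustration of the winner search and the update rule; the fixed weight γ = 0.5 is an illustrative value, not from the slides:

```python
# One competitive-learning step with an illustrative fixed weight.
xi = [2.0, 0.0]                        # data vector
C = [[0.0, 0.0], [5.0, 0.0]]           # two centroids
# winner = centroid with smallest squared Euclidean distance to xi
j = min(range(len(C)),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(xi, C[p])))
gamma = 0.5
# move the winner halfway towards the data vector
C[j] = [c + gamma * (x - c) for c, x in zip(C[j], xi)]
print(j, C[j])  # → 0 [1.0, 0.0]
```

The first centroid wins (distance 4 vs. 9) and moves from [0, 0] halfway to [2, 0], ending at [1, 0]; with γ = 1 the step would reduce to the k-means assignment of the vector to its nearest centroid.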
Learning rate (γ)
• Decreases with time → movement is large in the beginning but eventually stabilizes.
• Linear decrease of weighting:
  γ(t,d) = A·((T−t)/T)·e^(−|d|/D)
• Exponential decrease of weighting:
  γ(t,d) = A·e^(−t/T)·e^(−|d|/D)
Neighborhood (d)
• Neighboring centroids are also updated:
  γ(t,d) = A·e^(−t/T)·e^(−|d|/D)
• Effect is stronger for nearby centroids: weighting of the neighborhood decreases exponentially with the distance d.
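The two weighting schedules can be sketched as follows: a linear and an exponential decrease in time, each combined with an exponential decay over the network distance d. The concrete constants (A = 0.5, T = 100, D = 5) are assumptions for illustration only.

```python
import math

A, T, D = 0.5, 100, 5   # assumed max learning rate, iterations, neighborhood size

def gamma_linear(t, d):
    # linear decrease in time, exponential decay over network distance
    return A * (T - t) / T * math.exp(-abs(d) / D)

def gamma_exp(t, d):
    # exponential decrease in both time and network distance
    return A * math.exp(-t / T) * math.exp(-abs(d) / D)

# both start at A for the winner (d = 0, t = 0) and shrink over time
print(gamma_linear(0, 0), gamma_exp(0, 0))  # → 0.5 0.5
```

Both schedules give the full weight A to the winning neuron at the start; the weight shrinks as training progresses and as the network distance |d| from the winner grows.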
Parameter setup
• Number of iterations T
  – Convergence of SOM is rather slow → should be set as high as possible.
  – Roughly 100–1000 iterations at minimum.
• Size of the initial neighborhood Dmax
  – Small enough to allow local adaption.
  – Value D=0 indicates no neighbor structure.
• Maximum learning rate A
  – Higher values have a mostly random effect.
  – Most critical are the final stages (D ≤ 2).
  – Optimal choices of A and Dmax are highly correlated.
Difficulty of parameter setup
[Plot: MSE as a function of the initial neighborhood size (1–40), for parameter combinations such as Dmax=2, T=10; Dmax=5, T=4; Dmax=10, T=2.]
Fixing the total number of iterations (T·Dmax) to 20, 40 and 80.
Optimal parameter combination is non-trivial.
Adaptive number of iterations
• To reduce the effect of parameter set-up, T should be as high as possible.
• Enough time to adapt, at the cost of high time complexity.
• Adaptive number of iterations:
  T(D) = Tmax / 2^D
• For Dmax=10 and Tmax=100:
  Ti = {1, 1, 1, 1, 2, 3, 6, 13, 25, 50, 100}
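The listed schedule halves the iteration count for each additional neighborhood size. A short helper reproduces it; the half-up rounding and the minimum of one iteration per level are assumptions made to match the listed values:

```python
def adaptive_iterations(Dmax, Tmax):
    # Halve the iteration count for each extra neighborhood size D,
    # listed from D = Dmax down to D = 0.
    # Assumed conventions: round half up, at least 1 iteration per level.
    return [max(1, int(Tmax / 2 ** D + 0.5)) for D in range(Dmax, -1, -1)]

print(adaptive_iterations(10, 100))
# → [1, 1, 1, 1, 2, 3, 6, 13, 25, 50, 100]
```

Most of the work is thus spent on the final stages with a small neighborhood, which the parameter-setup discussion above identifies as the most critical ones.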
Example of SOM (1-d)
• One cluster too many.
• One cluster missing.
Example of SOM (2-d)
(to appear sometime in future)
Literature
1. T. Kohonen, Self-Organization and Associative Memory. Springer-Verlag, New York, 1988.
2. N.M. Nasrabadi and Y. Feng, "Vector quantization of images based upon the Kohonen self-organization feature maps", Neural Networks, 1 (1), 518, 1988.
3. P. Fränti, "On the usefulness of self-organizing maps for the clustering problem in vector quantization", 11th Scandinavian Conf. on Image Analysis (SCIA'99), Kangerlussuaq, Greenland, vol. 1, 415-422, 1999.