
Clustering Methods: Part 2d

Pasi Fränti

31.3.2014

Speech & Image Processing Unit
School of Computing
University of Eastern Finland
Joensuu, FINLAND

Swap-based algorithms

Part I:

Random Swap algorithm

P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.

Pseudo code of Random Swap

RandomSwap(X) → C, P

  C ← SelectRandomRepresentatives(X);
  P ← OptimalPartition(X, C);

  REPEAT T times
    (Cnew, j) ← RandomSwap(X, C);
    Pnew ← LocalRepartition(X, Cnew, P, j);
    (Cnew, Pnew) ← Kmeans(X, Cnew, Pnew);
    IF f(Cnew, Pnew) < f(C, P) THEN
      (C, P) ← (Cnew, Pnew);

  RETURN (C, P);
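The pseudocode above can be turned into a small, self-contained Python sketch. Everything here is illustrative: the helper names are invented, the local repartition is simplified to a full repartition for brevity, and two K-means iterations fine-tune each trial solution.

```python
import random

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mse(X, C, P):
    return sum(sq_dist(x, C[P[i]]) for i, x in enumerate(X)) / len(X)

def nearest(x, C):
    return min(range(len(C)), key=lambda j: sq_dist(x, C[j]))

def kmeans_iteration(X, C, P):
    # Update each centroid to the mean of its members, then repartition.
    for j in range(len(C)):
        members = [X[i] for i in range(len(X)) if P[i] == j]
        if members:
            C[j] = tuple(sum(col) / len(members) for col in zip(*members))
    for i, x in enumerate(X):
        P[i] = nearest(x, C)

def random_swap(X, M, T=200, seed=0):
    rng = random.Random(seed)
    C = [X[i] for i in rng.sample(range(len(X)), M)]
    P = [nearest(x, C) for x in X]
    best = mse(X, C, P)
    for _ in range(T):
        Cn = list(C)
        Cn[rng.randrange(M)] = X[rng.randrange(len(X))]  # random swap
        Pn = [nearest(x, Cn) for x in X]                 # full repartition
        for _ in range(2):                               # fine-tune by K-means
            kmeans_iteration(X, Cn, Pn)
        f = mse(X, Cn, Pn)
        if f < best:                                     # keep only improvements
            C, P, best = Cn, Pn, f
    return C, P, best

# Three well-separated Gaussian blobs; Random Swap should locate all three.
rng = random.Random(1)
X = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
     for cx, cy in [(0, 0), (5, 0), (0, 5)] for _ in range(30)]
C, P, err = random_swap(X, M=3)
```

Note the trial-and-error structure: the swap is always attempted, but the new solution is kept only if the objective function improves.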

Demonstration of the algorithm

Two centroids, but only one cluster.

One centroid, but two clusters.

Centroid swap

Swap is made from a centroid-rich area to a centroid-poor area.

Local repartition

Fine-tuning by K-means, 1st iteration

Fine-tuning by K-means, 2nd iteration

Fine-tuning by K-means, 3rd iteration

Fine-tuning by K-means, 16th iteration

Fine-tuning by K-means, 17th iteration

Fine-tuning by K-means, 18th iteration

Fine-tuning by K-means, 19th iteration

Fine-tuning by K-means, final result after 25 iterations

Implementation of the swap

1. Random swap:

   c_j ← x_i,   j = random(1, M),   i = random(1, N)

2. Re-partition vectors from the old cluster:

   p_i ← arg min_{1≤k≤M} d²(x_i, c_k),   ∀i : p_i = j

3. Create the new cluster:

   p_i ← arg min_{k ∈ {p_i, j}} d²(x_i, c_k),   ∀i = 1, …, N

Random swap as local search

Study neighbor solutions

Select one and move

Random swap as local search

Fine-tune solution by hill-climbing technique!

Role of K-means

Consider only local optima!

Role of K-means

Effective search space

Role of swap: reduce search space

Chain reaction by K-means after swap

[Figure: MSE on Bridge. K-means alone: 176.53. With RS: Random + RS 163.93, K-means + RS 163.63, Split + RS 163.51, Ward + RS 163.08.]

Independence of initialization: results for T = 5000 iterations

[Figure: worst and best initial solutions versus final results; the final quality is essentially independent of the initial solution.]

Part II:

Efficiency of Random Swap

Probability of good swap

• Select a proper centroid for removal:
  – There are M clusters in total: p_removal = 1/M.

• Select a proper new location:
  – There are N choices: p_add = 1/N
  – Only M are significantly different: p_add = 1/M

• In total:
  – M² significantly different swaps.
  – Probability of each different swap is p_swap = 1/M²
  – Open question: how many of these are good?
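Since only M removal choices and, in effect, M significantly different addition locations matter, one specific swap is hit with probability 1/M². A quick Monte Carlo sanity check; the target pair is an arbitrary stand-in for one "good" swap.

```python
import random

# With M clusters there are M significantly different removal choices and
# M significantly different addition locations, so a uniformly random swap
# hits one specific (remove, add) pair with probability 1/M^2.
M = 15                      # as in data sets S1-S4
target = (3, 7)             # arbitrary stand-in for one "good" swap
rng = random.Random(0)
trials = 200_000
hits = sum(1 for _ in range(trials)
           if (rng.randrange(M), rng.randrange(M)) == target)
print(hits / trials, 1 / M ** 2)  # both close to 0.0044
```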

Number of neighbors

Open question: what is the size of the neighborhood (α)?

[Figure: a centroid and its numbered neighbors; Voronoi neighbors vs. neighbors by distance.]

Observed number of neighbors (data set S2)

[Figure: histogram of the number of neighbours per cluster (1-9); average = 3.9.]

Average number of neighbors
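The neighborhood size α can be estimated empirically. The criterion below is an assumption chosen for illustration, a simple stand-in for Voronoi neighborhood: two clusters count as neighbors when some data point has them as its two nearest centroids.

```python
import random

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def average_neighbors(X, C):
    """Average number of neighbors per cluster, where clusters a and b
    count as neighbors when some point has them as its two nearest
    centroids (an empirical stand-in for Voronoi neighborhood)."""
    M = len(C)
    neighbors = [set() for _ in range(M)]
    for x in X:
        a, b = sorted(range(M), key=lambda k: sq_dist(x, C[k]))[:2]
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sum(len(s) for s in neighbors) / M

# Four centroids on a line: the two ends have one neighbor each, the two
# middle ones have two, so the average is 1.5.
rng = random.Random(0)
C = [(0.0,), (1.0,), (2.0,), (3.0,)]
X = [(rng.uniform(-0.5, 3.5),) for _ in range(1000)]
print(average_neighbors(X, C))  # 1.5
```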

• Probability of not finding a good swap in T iterations:

  q = (1 − α²/M²)^T

• Expected number of iterations for a given failure probability q:

  T = log q / log(1 − α²/M²)

• Estimated number of iterations:

             Observed q-values              Estimated iterations (T)
             S1      S2      S3      S4     S1     S2     S3     S4
  q=10%      19%     14%     22%     22%    53     47     39     37
  q=1%       3.1%    1.2%    1.0%    3.6%   106    93     78     74
  q=0.1%     0.1%    0.1%    0.2%    1.1%   159    140    117    111
  Expected   72      56      55      48     23     21     17     16
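The expected-iterations formula is easy to evaluate directly. Here M = 15 matches the S1-S4 data sets, while α = 4 is an assumed neighborhood size close to the observed average of 3.9.

```python
import math

def iterations_needed(q, M, alpha):
    """T (as a real number) with (1 - alpha^2/M^2)^T = q."""
    p_good = alpha ** 2 / M ** 2        # probability of one good swap
    return math.log(q) / math.log(1 - p_good)

# M = 15 as in S1-S4; alpha = 4 is an assumed neighborhood size.
for q in (0.10, 0.01, 0.001):
    print(f"q = {q}: T = {iterations_needed(q, M=15, alpha=4):.0f}")
```

The logarithmic dependency on q and the quadratic dependency on M are both visible in the formula: halving q only adds a constant number of iterations, while doubling M quadruples them.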

Estimated number of iterations depending on T

[Figure: observed vs. estimated iterations for S1-S4.]

Observed = number of iterations needed in practice.
Estimated = estimate of the number of iterations needed for given q.

Probability of success (p) depending on T

[Figure: p (0-100%) vs. iterations (0-300).]

Probability of failure (q) depending on T

[Figure: q on a log scale (10⁻⁹ to 1) vs. iterations (0-300).]

Observed probabilities depending on dimensionality

[Figure: observed q vs. dimensionality (16-1024) for target q = 0.1%, 1% and 10%.]

Bounds for the number of iterations

Upper limit:

  T = log q / log(1 − α²/M²) ≤ ln(1/q) · M²/α²

Lower limit similarly; resulting in:

  T = Θ( ln(1/q) · M²/α² )

Multiple swaps (w)

Probability for performing less than w swaps:

  q = sum_{i=0}^{w−1} C(T, i) · (α²/M²)^i · (1 − α²/M²)^(T−i)

Expected number of iterations:

  E[T] = w · M²/α²
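The probability of completing fewer than w swaps in T iterations is a binomial tail and can be computed directly (M = 15 as in S1-S4, α = 4 again an assumed neighborhood size):

```python
import math

def prob_fewer_than_w_swaps(T, M, alpha, w):
    """q = sum_{i=0}^{w-1} C(T, i) * p^i * (1-p)^(T-i), with p = alpha^2/M^2."""
    p = alpha ** 2 / M ** 2
    return sum(math.comb(T, i) * p ** i * (1 - p) ** (T - i)
               for i in range(w))

# With M = 15 and alpha = 4, a run of T = 200 iterations almost surely
# performs at least three good swaps:
for w in (1, 2, 3):
    print(w, prob_fewer_than_w_swaps(200, 15, 4, w))
```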

K-means clustering result (3 swaps needed)

Final clustering result

Number of swaps needed: example from image quantization

Efficiency of the random swap

Total time to find correct clustering:
– Time per iteration × number of iterations

Time complexity of a single step:
– Swap: O(1)
– Remove cluster: 2MN/M = O(N)
– Add cluster: 2N = O(N)
– Centroids: 2(2N/M) + 2 + 2 = O(N/M)
– (Fast) K-means iteration: 4N = O(N)*

*See Fast K-means for analysis.

Observed number of steps at iterations 50, 100 and 500:

Step                Time complexity   50        100       500
Centroid swap       2                 2         2         2
Cluster removal     2N                7,526     8,448     10,137
Cluster addition    2N                8,192     8,192     8,192
Update centroids    4N/M + 2 + 1      53        61        60
K-means iterations  4N                300,901   285,555   197,327
Total               O(N)              316,674   302,258   215,718

Time complexity and the observed number of steps

Time spent by K-means iterations (Bridge)

[Figure: number of processing steps vs. iteration (0-500), split into local repartition and the 1st and 2nd K-means iterations.]

Effect of K-means iterations

[Figure: error (MSE) vs. time (s) on Bridge for 1-5 K-means iterations per swap.]

The version with one iteration seems to be the weakest all the time. Versions with other amounts of iterations are pretty even.

Total time complexity

Time complexity of a single step (t):

  t = O(αN)

Number of iterations needed (T):

  T(M, q) = O( ln(1/q) · M²/α² )

Total time:

  T(N, M, q) = O( t · T ) = O( N · M² · ln(1/q) / α )

Time complexity: conclusions

1. Logarithmic dependency on q

2. Linear dependency on N

3. Quadratic dependency on M (with a large number of clusters, the method can be too slow)

4. Inverse dependency on α (worst case α = 2): the higher the dimensionality and the cluster overlap, the faster the method

  Total: T(N, M, q) = O( N · M² · ln(1/q) / α )

Time-distortion performance (Bridge)

[Figure: MSE vs. time; Random Swap vs. Repeated k-means.]

Time-distortion performance (Missa1)

[Figure: MSE vs. time; Random Swap vs. Repeated k-means.]

Time-distortion performance (Birch1)

[Figure: MSE (millions) vs. time; Random Swap vs. Repeated k-means.]

Time-distortion performance (Birch2)

[Figure: MSE (millions) vs. time; Random Swap vs. Repeated k-means.]

Time-distortion performance (Europe)

[Figure: MSE (millions) vs. time; Random Swap vs. Repeated k-means.]

Time-distortion performance (KDD-Cup04 Bio)

[Figure: MSE (millions) vs. time; Random Swap vs. Repeated k-means.]

References

Random swap algorithm:
• P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.
• P. Fränti, J. Kivijärvi and O. Nevalainen, "Tabu search algorithm for codebook generation in VQ", Pattern Recognition, 31 (8), 1139-1148, August 1998.

Pseudo code:
• http://cs.joensuu.fi/sipu/soft/

Efficiency of Random Swap algorithm:
• P. Fränti, O. Virmajoki and V. Hautamäki, "Efficiency of random swap based clustering", IAPR Int. Conf. on Pattern Recognition (ICPR'08), Tampa, FL, Dec 2008.

Part III:

Example when 4 swaps needed

1st swap: MSE = 4.2 × 10⁹ → 3.4 × 10⁹

2nd swap: MSE = 3.1 × 10⁹ → 3.0 × 10⁹

3rd swap: MSE = 2.3 × 10⁹ → 2.1 × 10⁹

4th swap: MSE = 1.9 × 10⁹ → 1.7 × 10⁹

Final result: MSE = 1.3 × 10⁹

Part IV:

Deterministic Swap

[Figure: example data set with 15 numbered clusters. Two centroids, but only one cluster; one centroid, but two clusters.]

Deterministic swap

Costs for the swap:

Cluster   Removal   Addition
1         0.80      0.39
2         1.04      0.64
3         5.48      1.09
4         5.66      0.92
5         6.50      0.76
6         7.67      1.01
7         8.47      0.45
8         9.10      0.75
9         9.90      1.42
10        11.09     1.26
11        11.47     0.61
12        12.17     4.70
13        14.61     0.94
14        16.41     0.93
15        16.68     1.41

From where to where?

• Merge two existing clusters [Frigui 1997, Kaukoranta 1998] following the spirit of agglomerative clustering.

• Local optimization: remove the prototype that increases the cost function value least [Fritzke 1997, Likas 2003, Fränti 2006].

• Smart swap: find two nearest prototypes, and remove one of them randomly [Chen, 2010].

• Pairwise swap: locate a pair of inconsistent prototypes in two solutions [Zhao, 2012].
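The local-optimization strategy above ("remove the prototype that increases the cost function value least") can be sketched as follows; the helper names are invented and the example data is synthetic:

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def removal_cost(X, C, P, j):
    """Increase in total squared error if prototype j is removed and its
    points move to their nearest remaining prototype."""
    cost = 0.0
    for i, x in enumerate(X):
        if P[i] == j:
            old = sq_dist(x, C[j])
            new = min(sq_dist(x, C[k]) for k in range(len(C)) if k != j)
            cost += new - old
    return cost

def best_removal(X, C, P):
    # Local optimization: remove the prototype whose loss is cheapest.
    return min(range(len(C)), key=lambda j: removal_cost(X, C, P, j))

# Two nearly coincident prototypes and one isolated: removing one of the
# overlapping pair is far cheaper than removing the isolated one.
C = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
X = [(0.05, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.1), (4.9, 5.0)]
P = [min(range(3), key=lambda k: sq_dist(x, C[k])) for x in X]
print(best_removal(X, C, P))
```

This is the O(N) per-candidate test mentioned on the next slide: evaluating all M candidates therefore costs O(MN).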

Cluster removal

1. Select an existing cluster
   – Depending on strategy: 1..M choices.
   – Each choice takes O(N) time to test.

2. Select a location within this cluster
   – Add a new prototype
   – Consider only existing points

Cluster addition

Select the cluster

• Cluster with the biggest MSE
  – Intuitive heuristic [Fritzke 1997, Chen 2010]
  – Computationally demanding

• Local optimization
  – Try all clusters for the addition [Likas et al, 2003]
  – Computationally demanding: O(NM)-O(N²)

Select the location

1. Current prototype + ε [Fritzke 1997]

2. Furthest vector [Fränti et al 1997]

3. Any other split heuristic [Fränti et al, 1997]

4. Random location

5. Every possible location [Likas et al, 2003]

Complexity of swaps

[Figure: deterministic swap example; the removed prototype is relocated to the furthest point of the cluster where it is added.]

• Initialization: O(MN)

• Swap iteration:
  – Finding nearest pair: O(M²)
  – Calculating distortion: O(N)
  – Sorting clusters: O(M·log M)
  – Evaluation of result: O(N)
  – Repartition and fine-tuning: O(N)
  Total: O(MN + M² + I·N)

• Expected number of iterations: < 2·M

• Estimated total time: O(2M²N)

Smart swap

[Figure: the two nearest prototypes and the cluster with the largest distortion.]

SmartSwap(X, M) → C, P

  C ← InitializeCentroids(X);
  P ← PartitionDataset(X, C);
  MaxOrder ← log₂ M;
  order ← 1;
  WHILE order < MaxOrder
    (ci, cj) ← FindNearestPair(C);
    S ← SortClustersByDistortion(P, C);
    cswap ← RandomSelect(ci, cj);
    clocation ← S[order];
    Cnew ← Swap(cswap, clocation);
    Pnew ← LocalRepartition(P, Cnew);
    KmeansIteration(Pnew, Cnew);
    IF f(Cnew) < f(C) THEN
      order ← 1; C ← Cnew;
    ELSE
      order ← order + 1;
      KmeansIteration(P, C);

Smart swap pseudo code
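A loose Python rendering of the pseudocode above. Local repartition and the K-means iterations are simplified to a full repartition, and the new location is a random point of the target cluster, so this is a sketch of the control flow rather than a faithful implementation:

```python
import math
import random

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest_pair(C):
    """Indices of the two closest prototypes."""
    M = len(C)
    return min(((i, j) for i in range(M) for j in range(i + 1, M)),
               key=lambda ij: sq_dist(C[ij[0]], C[ij[1]]))

def clusters_by_distortion(X, C, P):
    """Cluster indices sorted by total squared error, largest first."""
    dist = [0.0] * len(C)
    for i, x in enumerate(X):
        dist[P[i]] += sq_dist(x, C[P[i]])
    return sorted(range(len(C)), key=lambda j: -dist[j])

def smart_swap(X, M, seed=0):
    rng = random.Random(seed)
    C = [X[i] for i in rng.sample(range(len(X)), M)]
    P = [min(range(M), key=lambda k: sq_dist(x, C[k])) for x in X]

    def f(Cc, Pp):
        return sum(sq_dist(x, Cc[Pp[i]]) for i, x in enumerate(X))

    order, max_order = 1, max(2, int(math.log2(M)))
    while order < max_order:
        i, j = nearest_pair(C)
        S = clusters_by_distortion(X, C, P)
        c_swap = rng.choice((i, j))       # remove one of the nearest pair
        target = S[order - 1]             # order-th most distorted cluster
        members = [X[k] for k in range(len(X)) if P[k] == target]
        if not members or target == c_swap:
            order += 1
            continue
        Cn = list(C)
        Cn[c_swap] = rng.choice(members)  # relocate into that cluster
        Pn = [min(range(M), key=lambda k: sq_dist(x, Cn[k])) for x in X]
        if f(Cn, Pn) < f(C, P):
            C, P, order = Cn, Pn, 1       # success: restart from order 1
        else:
            order += 1                    # failure: try the next cluster
    return C, P, f(C, P)

# Four Gaussian blobs as a smoke test:
rng = random.Random(1)
X_demo = [(rng.gauss(cx, 0.2), rng.gauss(cy, 0.2))
          for cx, cy in [(0, 0), (4, 0), (0, 4), (4, 4)] for _ in range(25)]
C2, P2, err = smart_swap(X_demo, 4)
```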

Pairwise swap

[Figure: pairing the prototypes of two solutions. Prototypes that are nearest neighbors of each other form pairs; a prototype whose nearest neighbor in the other set is further than in its own set remains unpaired → subject to swap.]

Combinations of random and deterministic swap

Variant   Removal                        Addition
RR        Random                         Random
RD        Random                         Deterministic
DR        Deterministic                  Random
DD        Deterministic                  Deterministic
D2R       Deterministic + data update    Random
D2D       Deterministic + data update    Deterministic

Summary of the time complexities

              RR       RD       DR       DD       D2R      D2D
Removal       O(1)     O(1)     O(MN)    O(MN)    O(αN)    O(αN)
Addition      O(1)     O(N)     O(1)     O(N)     O(1)     O(N)
Repartition   O(N)     O(N)     O(N)     O(N)     O(N)     O(N)
K-means       O(αN)    O(αN)    O(αN)    O(αN)    O(αN)    O(αN)
Total         O(αN)    O(αN)    O(MN)    O(MN)    O(αN)    O(αN)

(RR and RD use random removal; DR, DD, D2R and D2D use deterministic removal.)

Profiles of the processing time

[Figure: processing time (s) per iteration on Bridge and Birch2 for RR, RD, DR, DD, D2R and D2D, broken down into K-means, swap, repartition and other steps.]

Test data sets

Data set         Type of data set           Vectors (N)   Clusters (M)   Dimension (d)
Bridge           Gray-scale image           4086          256            16
House*           RGB image                  34112         256            3
Miss America     Residual vectors           6480          256            16
Europe           Differential coordinates   169673                       2
BIRCH1-BIRCH3    Synthetically generated    100000        100            2
S1-S4            Synthetically generated    5000          15             2
Dim32-1024       Synthetically generated    1000          256            32-1024

[Figure: data sets S1, S2, S3, S4.]

Birch data sets

[Figure: Birch1, Birch2, Birch3.]

Experiments: Bridge

[Figure: error (MSE) vs. time (s) on Bridge for Random Swap and the RR, RD, DR and DD variants.]

[Figure: MSE vs. time on Bridge for DR and D2R compared with Random Swap and Repeated k-means.]

Experiments: Birch2

[Figure: error (MSE, ×10⁶) vs. time (s) on Birch2 for Random Swap and the RR, RD, DR and DD variants.]

Experiments: Miss America

[Figure: MSE vs. time on Missa1 for DR and D2R compared with Random Swap and Repeated k-means.]

Quality comparisons (MSE) with a 10-second time constraint

                         Bridge    House    Miss America   Europe ×10⁷   Birch1 ×10⁸   Birch2 ×10⁶
Repeated Random          251.32    12.12    8.34           2.37          13.10         22.35
Repeated K-means         177.66    6.58     5.92           1.52          5.49          4.10
Random Swap              174.08    6.41     5.85           1.26          5.70          4.43
RD-variant               171.20    6.10     5.58           1.02          5.11          2.78
Speed-up from RR to RD   2:1       4:1      5:1            6:1           4:1           18:1

Literature

1. P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.

2. P. Fränti, J. Kivijärvi and O. Nevalainen, "Tabu search algorithm for codebook generation in VQ", Pattern Recognition, 31 (8), 1139-1148, August 1998.

3. P. Fränti, O. Virmajoki and V. Hautamäki, "Efficiency of random swap based clustering", IAPR Int. Conf. on Pattern Recognition (ICPR'08), Tampa, FL, Dec 2008.

4. P. Fränti, M. Tuononen and O. Virmajoki, "Deterministic and randomized local search algorithms for clustering", IEEE Int. Conf. on Multimedia and Expo (ICME'08), Hannover, Germany, 837-840, June 2008.

5. P. Fränti and O. Virmajoki, "On the efficiency of swap-based clustering", Int. Conf. on Adaptive and Natural Computing Algorithms (ICANNGA'09), Kuopio, Finland, LNCS 5495, 303-312, April 2009.

6. J. Chen, Q. Zhao and P. Fränti, "Smart swap for more efficient clustering", Int. Conf. on Green Circuits and Systems (ICGCS'10), Shanghai, China, 446-450, June 2010.

7. B. Fritzke, "The LBG-U method for vector quantization: an improvement over LBG inspired from neural networks", Neural Processing Letters, 5 (1), 35-45, 1997.

8. P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering problems", Pattern Recognition, 39 (5), 761-765, May 2006.

9. T. Kaukoranta, P. Fränti and O. Nevalainen, "Iterative split-and-merge algorithm for VQ codebook generation", Optical Engineering, 37 (10), 2726-2732, October 1998.

10. H. Frigui and R. Krishnapuram, "Clustering by competitive agglomeration", Pattern Recognition, 30 (7), 1109-1119, July 1997.

11. A. Likas, N. Vlassis and J.J. Verbeek, "The global k-means clustering algorithm", Pattern Recognition, 36, 451-461, 2003.

12. PAM (Kaufman and Rousseeuw, 1987)

13. CLARA (Kaufman and Rousseeuw, 1990)

14. CLARANS: A Clustering Algorithm based on Randomized Search (Ng and Han, 1994)

15. R.T. Ng and J. Han, "CLARANS: A method for clustering objects for spatial data mining", IEEE Transactions on Knowledge and Data Engineering, 14 (5), September/October 2002.