Abstract

This paper presents a novel modification to classical Competitive Learning (CL): a dynamic branching mechanism is added to the neural network so that the number of neurons can increase over time until the network reaches a good estimate of the number of clusters in a data set. The algorithm, called Branching Competitive Learning (BCL), shows fast convergence of the synaptic vectors to the cluster centroids and, more importantly, the ability to automatically detect the number of clusters in a data distribution. We illustrate the formulation of the branching criteria and demonstrate the efficiency of BCL for data clustering through a set of experiments.
Date posted: 20-Dec-2015


2

Introduction

Some applications of data clustering:
• Pattern recognition
• Vector quantization image coding
• Image database indexing

Key problems of most clustering algorithms:
• The number of clusters must be appropriately preselected, e.g., K-means and classical CL
• Results are sensitive to the preselected cluster number and to the initialization of the synaptic vectors, e.g., RPCL

3

Contributions

1 Propose a neuron branching mechanism to estimate the cluster number and to cluster the data
2 Present the Branching Criteria
3 Present a new way of hierarchical data clustering, i.e., multiresolution clustering

The advantages of BCL for clustering:
• The ability to automatically detect the cluster number
• Fast convergence of synaptic vectors
• Convenient implementation of multiresolution data clustering

4

The Branching Criterion

There are two conditions for a neuron to spawn a new one:

• The Angle Criterion: based on the angle between the current moving direction and the previous moving direction of a synaptic vector:

$$\operatorname{ang}\big(\vec{x}(t) - \vec{\omega}_c,\ \vec{x}(t-1) - \vec{\omega}_c\big) \ge \varphi_0 \qquad (1)$$

• The Distance Criterion: based on the distance between the input sample and the winner:

$$\min\big(\|\vec{x}(t) - \vec{\omega}_c\|,\ \|\vec{x}(t-1) - \vec{\omega}_c\|\big) \ge d_0 \qquad (2)$$

where:

$\vec{x}(t)$: A randomly selected sample at $t$, the current step
$\vec{\omega}_c$: The winner in the current competition
$\varphi_0$, $d_0$: Angle and distance thresholds
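The two criteria can be checked together in a few lines. The following is a minimal sketch in Python, not the authors' implementation. The function names `angle_between` and `should_branch` are hypothetical, vectors are plain lists of floats, the angle is measured in radians, and the sketch assumes that both conditions must hold for a neuron to branch, as the branching-point illustration in Figure 1(b) suggests.

```python
import math


def norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))


def angle_between(u, v):
    """Angle in radians between u and v (0.0 if either has zero length)."""
    nu, nv = norm(u), norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, cos)))


def should_branch(x_t, x_prev, w_c, phi0, d0):
    """Return True when both criterion (1) and criterion (2) are met.

    x_t    -- current sample x(t)
    x_prev -- previous sample x(t-1)
    w_c    -- synaptic vector of the winner
    phi0   -- angle threshold in radians (assumed unit)
    d0     -- distance threshold
    """
    cur = [a - b for a, b in zip(x_t, w_c)]      # x(t) - w_c
    prev = [a - b for a, b in zip(x_prev, w_c)]  # x(t-1) - w_c
    angle_ok = angle_between(cur, prev) >= phi0           # criterion (1)
    distance_ok = min(norm(cur), norm(prev)) >= d0        # criterion (2)
    return angle_ok and distance_ok
```

For example, a winner at the origin that is pulled by samples at (2, 0) and (-2, 0) sees an angle of π between the two directions and a minimum distance of 2, so it would branch for any `phi0` up to π and any `d0` up to 2.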

5

The Algorithm of BCL for Clustering

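1 Initialize the first synaptic vector.

2 Randomly take a sample $\vec{x}$ from the data set and find the winner $\vec{\omega}_c$ of the current competition in the synaptic vector set $\{\vec{\omega}_i\}\ (i = 1, 2, \ldots, n)$, i.e.,

$$c = \arg\min_j \gamma_j \|\vec{x} - \vec{\omega}_j\|,$$

where $\gamma_j$ is the frequency with which $\vec{\omega}_j$ has won the competition up to now.

3 If $\vec{\omega}_c$ satisfies the branching criterion above, a new neuron $\vec{\omega}_{n+1}$ is spawned off from $\vec{\omega}_c$:

$$\vec{\omega}_{n+1} = \vec{\omega}_c + \alpha_c (\vec{x} - \vec{\omega}_c);$$

otherwise, update $\vec{\omega}_c$ by

$$\vec{\omega}_c + \alpha_c (\vec{x} - \vec{\omega}_c).$$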
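The three steps above can be sketched as a short training loop. This is an illustrative Python sketch under several assumptions that the slides do not state. A constant learning rate `alpha` stands in for $\alpha_c$. The win frequency $\gamma_j$ is taken as the win count of neuron $j$ divided by the total count. A newly spawned neuron starts with a count of 1. The previous sample $\vec{x}(t-1)$ is the sample drawn in the preceding step. The winner stays in place when it branches. The function name `bcl` and all parameter names are hypothetical.

```python
import math
import random


def _norm(v):
    return math.sqrt(sum(a * a for a in v))


def _sub(u, v):
    return [a - b for a, b in zip(u, v)]


def _angle(u, v):
    nu, nv = _norm(u), _norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))


def bcl(data, first_vector, phi0, d0, alpha=0.05, steps=5000, seed=0):
    """Return the list of synaptic vectors after `steps` competitions."""
    rng = random.Random(seed)
    weights = [list(first_vector)]  # step 1: the first synaptic vector
    wins = [1]                      # win counts (assumed to start at 1)
    x_prev = None
    for _ in range(steps):
        # Step 2: draw a sample and find the frequency-weighted winner.
        x = rng.choice(data)
        total = sum(wins)
        c = min(range(len(weights)),
                key=lambda j: wins[j] / total * _norm(_sub(x, weights[j])))
        wins[c] += 1
        w_c = weights[c]
        # w_c + alpha * (x - w_c): used for the new neuron or for the update.
        moved = [w + alpha * (xi - w) for xi, w in zip(x, w_c)]
        # Step 3: test both branching criteria against the winner.
        branch = False
        if x_prev is not None:
            cur, prev = _sub(x, w_c), _sub(x_prev, w_c)
            branch = (_angle(cur, prev) >= phi0
                      and min(_norm(cur), _norm(prev)) >= d0)
        if branch:
            weights.append(moved)   # spawn neuron n+1; winner is unchanged
            wins.append(1)
        else:
            weights[c] = moved      # ordinary competitive-learning update
        x_prev = x
    return weights
```

With a very large `d0` the distance criterion never fires, so the loop reduces to frequency-sensitive competitive learning with a single neuron. Lowering `d0` lets more neurons branch.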

6

Illustrations of BCL

Figure 1: (a) An illustration of the procedure of the BCL algorithm, where:

(1) Initialization of the first synaptic vector

(2) Branching points of synaptic vectors

(3) Final convergence of synaptic vectors

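(b) An illustration of a branching point, where $\varphi \ge \varphi_0$ and $\min(d_1, d_2) \ge d_0$. Labels in the panel: $\vec{\omega}_c(t-1)$, $\vec{\omega}_c(t)$, $\vec{\omega}_{n+1}(t)$, $\vec{x}(t)$, $\varphi$, $d_1$, $d_2$.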

7

Experiments

We conduct three sets of experiments:
I The first set of experiments examines the ability of BCL to detect the cluster number automatically
II The second set of experiments shows multiresolution clustering in the BCL scheme
III The third set of experiments compares the performance of BCL and RPCL

The experimental environment:
• Pentium II PC with 128 MB of memory running Windows 98
• BCL and RPCL algorithms implemented in Visual C++ 6.0

8

Experiment I (1)
Cluster Number Detection

Figure 2: The learning and branching traces on a data set that contains four Gaussian clusters (1,000 samples) with $\sigma = 0.5$, centered at (-2, 0), (2, 0), (0, -2), and (0, 2), respectively. The first synaptic vector is initialized at point (4, 4) in panel (a) and at point (0, 0) in panel (b).
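A data set like the one in Figure 2 can be generated as follows. This is a hedged sketch rather than the authors' generator. It assumes that the value 0.5 is the standard deviation of each coordinate, that the clusters are isotropic, and that the 1,000 samples are split evenly across the four clusters. The function name `four_gaussian_clusters` is hypothetical.

```python
import random


def four_gaussian_clusters(n_total=1000, spread=0.5, seed=0):
    """Return n_total 2-D points drawn around the four Figure 2 centers."""
    rng = random.Random(seed)
    centers = [(-2.0, 0.0), (2.0, 0.0), (0.0, -2.0), (0.0, 2.0)]
    per_cluster = n_total // len(centers)  # assumed even split
    return [[rng.gauss(cx, spread), rng.gauss(cy, spread)]
            for cx, cy in centers
            for _ in range(per_cluster)]


points = four_gaussian_clusters()
```

Moving the centers to (-1.5, 0), (1.5, 0), (0, -1.5), and (0, 1.5) would give an overlapping variant of the same layout.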

9

Experiment I (2)

Figure 3: The learning and branching traces on a data set that contains four overlapping Gaussian clusters (1,000 samples) with $\sigma = 0.5$, centered at (-1.5, 0), (1.5, 0), (0, -1.5), and (0, 1.5), respectively. The first synaptic vector is initialized at point (3, 3) in panel (c) and at point (0, 0) in panel (d).

10

Experiment II (1)
Multiresolution Data Clustering

Figure 4: A multiresolution data set, where:
(a) A view of the data set at a small resolution
(b) A view of the data set at a large resolution

Data clustering and cluster number detection are resolution-dependent.

11

Experiment II (2)

Figure 5: The learning and branching traces on the multiresolution data set, where:
(c) Presents the learning trace in level 1
(d) Presents the learning trace in 2-level BCL

12

Experiment II (3)

Figure 6: The error between the centroids of data clusters and the estimated cluster centroids.
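The slides do not say how the error in Figure 6 is computed. One plausible measure, shown here purely as an assumption, is the mean Euclidean distance from each true cluster centroid to its nearest estimated centroid. The function name `mean_centroid_error` is hypothetical.

```python
import math


def mean_centroid_error(true_centroids, estimated_centroids):
    """Average distance from each true centroid to its nearest estimate."""
    errors = []
    for t in true_centroids:
        nearest = min(
            math.sqrt(sum((a - b) ** 2 for a, b in zip(t, e)))
            for e in estimated_centroids
        )
        errors.append(nearest)
    return sum(errors) / len(errors)


# One estimate is off by 1 unit and the other is exact: the mean error is 0.5.
example = mean_centroid_error([(0.0, 0.0), (2.0, 0.0)],
                              [(0.0, 1.0), (2.0, 0.0)])
```

This measure does not penalize surplus neurons, so it would need to be read alongside the detected cluster number.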

13

Experiment III (1)
Comparison of BCL and RPCL

Two measures used for comparing BCL and RPCL:
• The average accuracy of data clustering
• The average speed, i.e., the average running time of the algorithms

Table I. 5-Dimensional data sets

Data Set No.   Distribution   Number of Samples   Number of Clusters   Overlapping
1              Uniform        2,000               10                   none
2              Uniform        2,000               10                   small
3              Gaussian       2,000               10                   small
4              Gaussian       2,000               10                   larger

14

Experiment III (2)
Experimental results (over 20 trials)

Table II. Average Accuracy

             RPCL      BCL
Data Set 1   94.74%    100.00%
Data Set 2   94.00%    100.00%
Data Set 3   97.92%    97.16%
Data Set 4   94.88%    96.27%

Table III. Average Running Time (s)

             RPCL      BCL
Data Set 1   68.65     49.96
Data Set 2   97.29     31.02
Data Set 3   71.91     50.61
Data Set 4   82.89     62.71
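To make the timing comparison easier to read, the ratio of RPCL to BCL running time can be computed per data set from the average running times reported above. The snippet below is plain arithmetic on those figures. The variable names are hypothetical, and a ratio above 1 means BCL finished faster on average.

```python
# Average running times in seconds, copied from the results table above.
rpcl_time = {1: 68.65, 2: 97.29, 3: 71.91, 4: 82.89}
bcl_time = {1: 49.96, 2: 31.02, 3: 50.61, 4: 62.71}

# Ratio of RPCL time to BCL time for each data set.
speedup = {k: rpcl_time[k] / bcl_time[k] for k in rpcl_time}

for k in sorted(speedup):
    print(f"Data Set {k}: RPCL/BCL time ratio = {speedup[k]:.2f}")
```

On these figures BCL is faster on all four data sets, with the largest gap on Data Set 2. In accuracy, RPCL is slightly ahead only on Data Set 3.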

15

Discussion & Conclusion

Discussion:
• Robust to various initial conditions
• The threshold $d_0$ can be seen as a resolution control
• How to choose $d_0$?

The advantages of BCL for clustering:
• The ability to automatically detect the cluster number
• Fast convergence of synaptic vectors
• Convenient implementation of multiresolution data clustering