On Tolerant Fuzzy c-Means

Yukihiro HAMASUNA, Yasunori ENDO, Sadaaki MIYAMOTO

Dept. of Risk Eng., University of Tsukuba, Ibaraki, 305-8573, Japan

Abstract—This paper presents a new type of clustering algorithm that uses tolerance vectors. The tolerance vector is considered from a new viewpoint: in the proposed algorithm, it expresses a correlation between each datum and the cluster centers. First, a new concept of tolerance is introduced into an optimization problem based on the conventional fuzzy c-means (FCM) of Bezdek. Second, the optimization problem with tolerance is solved using the Karush-Kuhn-Tucker conditions. Next, a new clustering algorithm is constructed from the unique and explicit optimal solutions of the optimization problem. Finally, the effectiveness of the proposed algorithm is verified through numerical examples.

Index Terms—clustering, optimization, tolerance vector

I. INTRODUCTION

Hard c-means (HCM) is one of the best-known clustering algorithms. Many clustering algorithms based on HCM have been proposed, for example, standard fuzzy c-means (sFCM) [1] and entropy regularized fuzzy c-means (eFCM) [2].

Data sets usually contain inherent uncertainty, caused for example by errors, ranges, or lost attributes. In such cases, each datum is represented by an interval or a set instead of a point. Several significant methods for handling data with uncertainty have been proposed [3], [4]. These methods not only handle data with uncertainty but also obtain high-quality results by taking the uncertainty into account. Handling data with uncertainty is thus a very important problem in the field of data mining.

Therefore, from the viewpoint of handling data with uncertainty, we have proposed a concept of tolerance which shows the correlation between each data point and the cluster centers [5].

The concept of tolerance has already been proposed to handle data with uncertainty such as errors, ranges, or lost attributes, and several clustering algorithms have been constructed on it [5], [6], [7]. In these algorithms, tolerance is defined as a hyper-sphere [5], [7] or a hyper-rectangle [6].

In this paper, we propose a new concept of tolerance and consider a new type of clustering algorithm that uses tolerance vectors. In the proposed algorithm, the tolerance vector expresses a correlation between each datum and the cluster centers. Each datum has as many tolerance vectors as there are clusters. First, we consider a new type of optimization problem by introducing the new concept of tolerance vector. Second, the optimal solutions are derived from the Karush-Kuhn-Tucker (KKT) conditions. From the derivation of these optimal solutions, we construct a new clustering algorithm that uses tolerance vectors. Moreover, the effectiveness of the proposed algorithm is verified through numerical examples. The algorithm differs from conventional algorithms in the meaning given to the tolerance vector.

II. PREPARATION

Let the data set, the clusters, and the cluster center of $C_i$ be $X = \{x_k \mid x_k = (x_k^1, \dots, x_k^p)^T \in \mathbb{R}^p,\ k = 1, \dots, n\}$, $C_i$ $(i = 1, \dots, c)$, and $v_i = (v_{i1}, \dots, v_{ip})^T \in V$, respectively. We assume that $\mu_{ki}$ denotes the degree of relevance between $x_k$ and $C_i$. We call $U = [\mu_{ki}]$ a partition matrix.

A. Fuzzy c-Means

Fuzzy c-means is based on the optimization of the following objective function [1]:

$$J(U, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (\mu_{ki})^m \|x_k - v_i\|^2 \to \min.$$

The constraint on $\mu_{ki}$ is

$$\sum_{i=1}^{c} \mu_{ki} = 1, \quad \mu_{ki} \in [0, 1], \quad \forall k.$$

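The optimal solutions for $v_i$ and $\mu_{ki}$ are expressed as follows. For $v_i$,

$$v_i = \frac{\sum_{k=1}^{n} (\mu_{ki})^m x_k}{\sum_{k=1}^{n} (\mu_{ki})^m}.$$

For $\mu_{ki}$,

$$\mu_{ki} = \left( \sum_{j=1}^{c} \left( \frac{\|x_k - v_i\|^2}{\|x_k - v_j\|^2} \right)^{\frac{1}{m-1}} \right)^{-1}.$$

If $x_k = v_i$, we set $\mu_{ki} = 1$.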
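The two FCM update formulas above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names `fcm_memberships` and `fcm_centers` are hypothetical, data are assumed to be lists of coordinate lists, and a point that coincides with more than one center is assigned to the first such center (an assumption, since the text only covers a single coincident center).

```python
def fcm_memberships(points, centers, m=2.0):
    """Closed-form FCM membership update; returns U with U[k][i] = mu_ki."""
    c = len(centers)
    exponent = 1.0 / (m - 1.0)
    U = []
    for x in points:
        # Squared Euclidean distances from x to every center.
        d = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in centers]
        zeros = [i for i in range(c) if d[i] == 0.0]
        if zeros:
            # x_k = v_i: set mu_ki = 1 (first coincident center; assumption).
            row = [0.0] * c
            row[zeros[0]] = 1.0
        else:
            row = [1.0 / sum((d[i] / d[j]) ** exponent for j in range(c))
                   for i in range(c)]
        U.append(row)
    return U


def fcm_centers(points, U, m=2.0):
    """Closed-form FCM center update: weighted mean with weights (mu_ki)^m."""
    n, c, p = len(points), len(U[0]), len(points[0])
    centers = []
    for i in range(c):
        w = [U[k][i] ** m for k in range(n)]
        total = sum(w)
        centers.append([sum(w[k] * points[k][q] for k in range(n)) / total
                        for q in range(p)])
    return centers
```

Alternating the two functions until the centers stop moving gives the usual FCM iteration; the sketch does not guard against a cluster whose weights are all zero.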

TH-A5-1 SCIS & ISIS 2008

B. Original Concept of Tolerance

The original concept of tolerance was proposed to handle data with uncertainties such as errors, ranges, or lost attributes. In this concept, tolerance is defined as the admissible range of the data, and tolerance vectors are defined as the vectors within that range. Several clustering algorithms have been proposed based on this concept [5], [6], [7]. Clustering methods with tolerance defined as a hyper-rectangle can classify data sets with lost attributes, for example, the heart disease data set [8]. The proposed method is expected to obtain higher-quality results on the heart disease data set, which has some lost attributes.

C. New Concept of Tolerance

The new concept of tolerance differs from the original one. Tolerance is defined as the upper bound of the tolerance vector, and the tolerance vector is defined as the vector that expresses a correlation between each datum and the cluster centers. Each datum has as many tolerance vectors as there are clusters.

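We define the upper bound of the tolerance vectors $\kappa = (\kappa_{11}, \dots, \kappa_{nc}) \ge 0$ and the set of tolerance vectors $E = \{\varepsilon_{ki} \mid \varepsilon_{ki} = (\varepsilon_{ki}^1, \dots, \varepsilon_{ki}^p)^T \in \mathbb{R}^p,\ \|\varepsilon_{ki}\|^2 \le \kappa_{ki}^2,\ k = 1, \dots, n,\ i = 1, \dots, c\}$, which represent the admissible range of the data and the vectors within the range of tolerance, respectively. The constraint is expressed as follows:

$$\|\varepsilon_{ki}\|^2 \le \kappa_{ki}^2 \quad (\kappa_{ki} \ge 0), \quad \forall k, \forall i.$$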

Figure 1 is an illustrative example of the new concept of tolerance and tolerance vectors in $\mathbb{R}^2$.

Fig. 1. New concept of tolerance in $\mathbb{R}^2$

III. TOLERANT FUZZY c-MEANS

In this section, we discuss a new optimization problem for clustering. We formulate tolerant fuzzy c-means (T-FCM) by introducing the notion of tolerance into the optimization problem, and we minimize the resulting objective function under the constraints on the membership degrees and the tolerance.

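The dissimilarity for clustering is the squared Euclidean distance between a datum shifted by its tolerance vector and the cluster center:

$$d_{ki} = \|x_k + \varepsilon_{ki} - v_i\|^2.$$

The optimization problem is expressed as follows,

$$J(U, E, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (\mu_{ki})^m d_{ki} \tag{1}$$

under the following constraints,

$$\sum_{i=1}^{c} \mu_{ki} = 1, \quad \mu_{ki} \in [0, 1], \quad \forall k, \tag{2}$$

$$\|\varepsilon_{ki}\|^2 \le \kappa_{ki}^2 \quad (\kappa_{ki} \ge 0), \quad \forall k, \forall i. \tag{3}$$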

The goal is to find the solutions that minimize the objective function (1) under the constraints (2) and (3).
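To make the problem statement concrete, the following sketch evaluates the objective function (1) and checks the constraints (2) and (3) for a given candidate solution. It is an illustration under stated assumptions: the names `tfcm_objective` and `is_feasible`, the nested-list layout (`eps[k][i]` is the tolerance vector of datum k for cluster i, `kappa[k][i]` its upper bound), and the numerical slack `tol` are all hypothetical choices, not part of the paper.

```python
def tfcm_objective(points, eps, centers, U, m=2.0):
    """Evaluate J(U, E, V) = sum_k sum_i (mu_ki)^m * ||x_k + eps_ki - v_i||^2."""
    total = 0.0
    for k, x in enumerate(points):
        for i, v in enumerate(centers):
            d_ki = sum((a + e - b) ** 2 for a, e, b in zip(x, eps[k][i], v))
            total += U[k][i] ** m * d_ki
    return total


def is_feasible(U, eps, kappa, tol=1e-9):
    """Check constraints (2) and (3) up to a small numerical slack tol."""
    for k, row in enumerate(U):
        # Constraint (2): memberships lie in [0, 1] and sum to 1 for each k.
        if abs(sum(row) - 1.0) > tol or any(u < 0.0 or u > 1.0 for u in row):
            return False
        # Constraint (3): ||eps_ki||^2 <= kappa_ki^2 for every cluster i.
        for i in range(len(row)):
            if sum(t * t for t in eps[k][i]) > kappa[k][i] ** 2 + tol:
                return False
    return True
```

Tracking `tfcm_objective` across iterations is one way to monitor convergence of an alternating scheme for this problem.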

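From the convexity of (1), we introduce the following Lagrangian function $L_1$ to solve this optimization problem:

$$L_1 = J(U, E, V) + \sum_{k=1}^{n} \nu_k \left( \sum_{i=1}^{c} \mu_{ki} - 1 \right) + \sum_{k=1}^{n} \sum_{i=1}^{c} \delta_{ki} \left( \|\varepsilon_{ki}\|^2 - \kappa_{ki}^2 \right).$$

The Karush-Kuhn-Tucker (KKT) conditions are as follows:

$$\begin{cases}
\dfrac{\partial L_1}{\partial v_i} = 0, \quad \dfrac{\partial L_1}{\partial \varepsilon_{ki}} = 0, \quad \dfrac{\partial L_1}{\partial \mu_{ki}} = 0, \quad \dfrac{\partial L_1}{\partial \nu_k} = 0, \\[2ex]
\dfrac{\partial L_1}{\partial \delta_{ki}} \le 0, \quad \delta_{ki} \dfrac{\partial L_1}{\partial \delta_{ki}} = 0, \quad \delta_{ki} \ge 0.
\end{cases} \tag{4}$$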

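First, we consider $v_i$. From the KKT conditions, we get

$$\frac{\partial L_1}{\partial v_i} = -2 \sum_{k=1}^{n} (\mu_{ki})^m (x_k + \varepsilon_{ki} - v_i) = 0.$$

From the above, we get

$$v_i = \frac{\sum_{k=1}^{n} (\mu_{ki})^m (x_k + \varepsilon_{ki})}{\sum_{k=1}^{n} (\mu_{ki})^m}. \tag{5}$$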

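For $\varepsilon_{ki}$, from the KKT conditions, we get

$$\frac{\partial L_1}{\partial \varepsilon_{ki}} = 2 (\mu_{ki})^m (x_k + \varepsilon_{ki} - v_i) + 2 \delta_{ki} \varepsilon_{ki} = 0,$$

$$\varepsilon_{ki} = -\frac{(\mu_{ki})^m (x_k - v_i)}{(\mu_{ki})^m + \delta_{ki}}. \tag{6}$$

From $\delta_{ki} \dfrac{\partial L_1}{\partial \delta_{ki}} = 0$,

$$\delta_{ki} \left( \|\varepsilon_{ki}\|^2 - \kappa_{ki}^2 \right) = 0. \tag{7}$$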

From (7), we should consider two cases, namely $\delta_{ki} = 0$ and $\|\varepsilon_{ki}\|^2 = \kappa_{ki}^2$. First, we consider the case $\delta_{ki} = 0$, in which the constraint (3) is not active. From (6), we get

$$\varepsilon_{ki} = -(x_k - v_i).$$

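On the other hand, in the case $\|\varepsilon_{ki}\|^2 = \kappa_{ki}^2$, from (6),

$$\|\varepsilon_{ki}\|^2 = \left\| -\frac{(\mu_{ki})^m (x_k - v_i)}{(\mu_{ki})^m + \delta_{ki}} \right\|^2 = \kappa_{ki}^2.$$

From $(\mu_{ki})^m + \delta_{ki} > 0$,

$$(\mu_{ki})^m + \delta_{ki} = \frac{(\mu_{ki})^m \|x_k - v_i\|}{\kappa_{ki}}. \tag{8}$$

From (6) and (8),

$$\varepsilon_{ki} = -\frac{\kappa_{ki} (x_k - v_i)}{\|x_k - v_i\|}.$$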

From the above, we get the optimal solution for $\varepsilon_{ki}$ as follows:

$$\varepsilon_{ki} = -\alpha_{ki} (x_k - v_i). \tag{9}$$

Here,

$$\alpha_{ki} = \min \left\{ \frac{\kappa_{ki}}{\|x_k - v_i\|},\ 1 \right\}.$$
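The closed form (9) says that the tolerance vector moves a datum straight toward the cluster center, by at most $\kappa_{ki}$. A small sketch of this rule follows; the function names `optimal_tolerance` and `tolerant_dissimilarity` are hypothetical, and returning the zero vector when the datum already coincides with the center is an assumption made to avoid dividing by zero (in that case $x_k - v_i = 0$, so (9) gives zero for any $\alpha_{ki}$).

```python
import math


def optimal_tolerance(x, v, kappa):
    """Return eps = -alpha * (x - v) with alpha = min(kappa / ||x - v||, 1)."""
    diff = [a - b for a, b in zip(x, v)]
    norm = math.sqrt(sum(t * t for t in diff))
    if norm == 0.0:
        # x already equals v, so x - v = 0 and eps is the zero vector.
        return [0.0] * len(x)
    alpha = min(kappa / norm, 1.0)
    return [-alpha * t for t in diff]


def tolerant_dissimilarity(x, eps, v):
    """Return d = ||x + eps - v||^2."""
    return sum((a + e - b) ** 2 for a, e, b in zip(x, eps, v))
```

For example, with x = (3, 4), v = (0, 0) and kappa = 2.5, the distance is 5, so alpha = 0.5 and eps = (-1.5, -2); the shifted datum (1.5, 2) has dissimilarity 6.25 instead of 25. With kappa at least 5 the datum reaches the center and the dissimilarity is 0.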

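For $\mu_{ki}$, from $\dfrac{\partial L_1}{\partial \mu_{ki}} = 0$,

$$\frac{\partial L_1}{\partial \mu_{ki}} = m (\mu_{ki})^{m-1} d_{ki} + \nu_k = 0, \tag{10}$$

we have

$$\mu_{ki} = \left[ -\frac{\nu_k}{m d_{ki}} \right]^{\frac{1}{m-1}}. \tag{11}$$

In addition, from the constraint (2),

$$\sum_{i=1}^{c} \left[ -\frac{\nu_k}{m d_{ki}} \right]^{\frac{1}{m-1}} = 1. \tag{12}$$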

From (11) and (12), we have

$$\mu_{ki} = \left( \sum_{j=1}^{c} \left( \frac{d_{ki}}{d_{kj}} \right)^{\frac{1}{m-1}} \right)^{-1}. \tag{13}$$
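The step from (11) and (12) to (13) can be spelled out as follows. This is a sketch of the intermediate algebra, valid when all $d_{kj} > 0$; the symbol $t_k$ is introduced here only for illustration and collects the factor of (11) that does not depend on $i$.

```latex
\begin{align*}
\mu_{ki} &= t_k \, d_{ki}^{-\frac{1}{m-1}}
  && \text{(form of (11))} \\
1 = \sum_{j=1}^{c} \mu_{kj} &= t_k \sum_{j=1}^{c} d_{kj}^{-\frac{1}{m-1}}
  && \text{(constraint (12))} \\
t_k &= \Bigl( \sum_{j=1}^{c} d_{kj}^{-\frac{1}{m-1}} \Bigr)^{-1} \\
\mu_{ki} &= \frac{d_{ki}^{-\frac{1}{m-1}}}{\sum_{j=1}^{c} d_{kj}^{-\frac{1}{m-1}}}
  = \Bigl( \sum_{j=1}^{c} \Bigl( \frac{d_{ki}}{d_{kj}} \Bigr)^{\frac{1}{m-1}} \Bigr)^{-1}
\end{align*}
```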

If $x_k + \varepsilon_{ki} = v_i$ for some $i$, we set $\mu_{ki} = 1/|C'|$ for those clusters, where $|C'|$ is the number of cluster centers that satisfy $x_k + \varepsilon_{ki} = v_i$.
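The membership rule, including the special case where some dissimilarities are zero, can be sketched for a single datum as below. The function name `tfcm_membership_row` is hypothetical; it takes the list of dissimilarities of one datum to all clusters. Assigning zero membership to the remaining clusters in the special case is not stated in the text but follows from the constraint (2), since the coincident clusters already sum to one.

```python
def tfcm_membership_row(d_row, m=2.0):
    """Memberships of one datum from its dissimilarities d_row[i] = d_ki."""
    zeros = [i for i, d in enumerate(d_row) if d == 0.0]
    if zeros:
        # Datum sits exactly on |C'| centers: share membership equally,
        # which leaves zero for every other cluster because of (2).
        share = 1.0 / len(zeros)
        return [share if i in zeros else 0.0 for i in range(len(d_row))]
    exponent = 1.0 / (m - 1.0)
    # Closed form (13).
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in d_row)
            for d_i in d_row]
```

Because a datum within distance $\kappa_{ki}$ of a center has zero dissimilarity to it, the special case is expected to occur routinely in T-FCM rather than only by coincidence.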

IV. ALGORITHMS

We describe the T-FCM algorithm that corresponds to the above discussion. The algorithm is constructed as an alternating optimization over $v_i$, $\varepsilon_{ki}$, and $\mu_{ki}$.

Algorithm 1 (T-FCM):
Step 1 Give the values $\kappa_{ki}$. Set the initial values of $\varepsilon_{ki} \in E$ and $v_i \in V$.
Step 2 Calculate $v_i \in V$ from (5).
Step 3 Calculate $\varepsilon_{ki} \in E$ from (9).
Step 4 Calculate $\mu_{ki} \in U$ from (13). If the criterion is not satisfied, go back to Step 2.

In this algorithm, the criterion is convergence of the cluster centers $v_i$ or of the objective function value, or otherwise a maximum number of iterations.
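The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under several assumptions that are not in the paper: the name `tfcm` and the helper names are hypothetical; the initial tolerance vectors are zero; initial memberships are computed from the initial centers because Step 2 needs them; a cluster whose weights are all zero keeps its previous center; and the stopping criterion is the largest center shift falling below `tol`, with `max_iter` as a cap.

```python
import math


def _sqdist(a, b):
    return sum((s - t) ** 2 for s, t in zip(a, b))


def _memberships(points, eps, centers, m):
    """Step 4: memberships from (13), with equal sharing when some d_ki = 0."""
    U = []
    for k, x in enumerate(points):
        d = [_sqdist([a + e for a, e in zip(x, eps[k][i])], v)
             for i, v in enumerate(centers)]
        zeros = [i for i, val in enumerate(d) if val == 0.0]
        if zeros:
            U.append([1.0 / len(zeros) if i in zeros else 0.0
                      for i in range(len(d))])
        else:
            ex = 1.0 / (m - 1.0)
            U.append([1.0 / sum((di / dj) ** ex for dj in d) for di in d])
    return U


def tfcm(points, centers, kappa, m=2.0, tol=1e-9, max_iter=200):
    """Alternate (5), (9), (13); kappa[k][i] bounds ||eps[k][i]||."""
    n, c, p = len(points), len(centers), len(points[0])
    # Step 1 (assumption): zero tolerance vectors, memberships from the start centers.
    eps = [[[0.0] * p for _ in range(c)] for _ in range(n)]
    U = _memberships(points, eps, centers, m)
    for _ in range(max_iter):
        # Step 2: centers from (5).
        new_centers = []
        for i in range(c):
            w = [U[k][i] ** m for k in range(n)]
            total = sum(w)
            if total == 0.0:
                new_centers.append(list(centers[i]))  # keep old center (assumption)
                continue
            new_centers.append(
                [sum(w[k] * (points[k][q] + eps[k][i][q]) for k in range(n)) / total
                 for q in range(p)])
        # Step 3: tolerance vectors from (9).
        for k in range(n):
            for i in range(c):
                diff = [a - b for a, b in zip(points[k], new_centers[i])]
                norm = math.sqrt(sum(t * t for t in diff))
                alpha = 1.0 if norm == 0.0 else min(kappa[k][i] / norm, 1.0)
                eps[k][i] = [-alpha * t for t in diff]
        # Step 4: memberships from (13).
        U = _memberships(points, eps, new_centers, m)
        shift = max(math.sqrt(_sqdist(a, b)) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, eps, U
```

With every `kappa[k][i]` equal to zero, all tolerance vectors stay zero and the loop reduces to the conventional FCM updates.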

V. ILLUSTRATIVE EXAMPLES

In this section, we show some illustrative examples of classification by the algorithm described above. Fig. 2 shows the sample data, which is mapped into a two-dimensional pattern space and consists of 312 points. This sample data should be divided into two clusters. In the figures, ‘+’ denotes one cluster, ‘×’ denotes the other cluster, and ‘∗’ denotes the cluster centers. Values in each data set are normalized between 0 and 10.

Fig. 3 shows the result for $\kappa_{k1} = \kappa_{k2} = 0.0$, which is equivalent to conventional FCM. Fig. 4 shows the result for $\kappa_{k1} = 0.0$, $\kappa_{k2} = 2.5$. Furthermore, Figs. 5 and 6 show the fuzzy classification functions.

These results show that the proposed algorithm can classify this sample data, which is difficult for conventional FCM.

Fig. 2. Sample data (both axes range from 0 to 10)

Fig. 3. $\kappa_{k1} = \kappa_{k2} = 0.0$ (both axes range from 0 to 10)

Fig. 4. $\kappa_{k1} = 0.0$, $\kappa_{k2} = 2.5$ (both axes range from 0 to 10)

Fig. 5. Fuzzy Classification Function 1 (horizontal axes from 0 to 10; vertical axis: membership grade, from 0 to 1)

VI. CONCLUSIONS

In this paper, we have formulated an optimization problem based on a new concept of tolerance and derived the optimal solutions for tolerant fuzzy c-means. From these results, we have constructed a new clustering algorithm. In the discussion, we have defined a new concept of tolerance which shows a correlation between each datum and the cluster centers. Moreover, the effectiveness of the proposed algorithm has been verified through numerical examples.

The proposed technique is essentially different from previous ones in how tolerance is handled in the optimization framework.

Future studies include application to other types of data sets and the introduction of other shapes of tolerance. Moreover, calculating the tolerance vectors $\varepsilon_{ki}$ by a different method that fuzzifies $\varepsilon_{ki}$ is important future work. In particular, the way the tolerance vectors $\varepsilon_{ki}$ are defined and determined plays a very important role in the proposed approach.

Fig. 6. Fuzzy Classification Function 2 (horizontal axes from 0 to 10; vertical axis: membership grade, from 0 to 1)

ACKNOWLEDGMENTS

This study is partly supported by the Grant-in-Aid for Scientific Research (C) and (B) (Project No. 18500170 and No. 19300074) from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

REFERENCES

[1] J. C. Bezdek, ‘Pattern Recognition with Fuzzy Objective Function Algorithms’, Plenum Press, New York (1981).

[2] S. Miyamoto, M. Mukaido, ‘Fuzzy c-means as a regularization and maximum entropy approach’, Proc. of the 7th International Fuzzy Systems Association World Congress (IFSA’97), June 25-30, 1997, Prague, Czech, vol. 2, pp. 86-92 (1997).

[3] W. K. Ngai, B. Kao, C. K. Chui, R. Cheng, M. Chau, K. Y. Yip, ‘Efficient Clustering of Uncertain Data’, Proc. of the Sixth International Conference on Data Mining (ICDM’06), December 18-22, 2006, Hong Kong, China, pp. 436-445 (2006).

[4] H. Hamdan, G. Govaert, ‘Mixture model clustering of uncertain data’, Proc. of the 14th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2005), May 22-25, 2005, Reno, Nevada, pp. 879-884 (2005).

[5] R. Murata, Y. Endo, H. Haruyama, S. Miyamoto, ‘On Fuzzy c-Means for data with Tolerance’, Journal of Advanced Computational Intelligence and Intelligent Informatics 10(5), pp. 673-681 (2006).

[6] Y. Hamasuna, Y. Endo, Y. Hasegawa, S. Miyamoto, ‘Two clustering algorithms for data with tolerance based on Hard c-Means’, IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2007), pp. 688-691 (2007).

[7] Y. Hamasuna, Y. Endo, S. Miyamoto, Y. Hasegawa, ‘On Hard Clustering for Data with Tolerance’, Journal of Japan Society for Fuzzy Theory and Intelligent Informatics, Vol. 20, No. 3, pp. 388-398 (2008) (in Japanese).

[8] UCI Machine Learning Repository Content Summary, http://archive.ics.uci.edu/ml/datasets/Heart+Disease
