Fuzzy dm

Fuzzy Clustering By: Akshay Chaudhari

Page 2: Fuzzy dm

Agenda

Introduction

Fuzzy C-means clustering

Algorithm

Complexity analysis

Pros and cons

References

Page 3: Fuzzy dm

Introduction

Fuzzy clustering is a method of clustering that allows one piece of data to belong to two or more clusters.

In other words, each data point is a member of every cluster, but with a certain degree known as its membership value.

This method (developed by Dunn in 1973 and improved by Bezdek in 1981) is frequently used in pattern recognition.
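To make the idea of a membership value concrete, here is a minimal sketch. The matrix `U` below is hypothetical (it is not taken from the slides); it assumes the usual fuzzy c-means convention that the memberships of each data point sum to 1 across clusters, and it shows how a hard clustering can be recovered by taking the largest membership.

```python
import numpy as np

# Hypothetical membership matrix: rows are clusters, columns are data points.
# Each column holds one point's degree of membership in every cluster.
U = np.array([[0.9, 0.6, 0.2],
              [0.1, 0.4, 0.8]])

# Assumed convention: memberships of a single point sum to 1 over the clusters.
column_sums = U.sum(axis=0)

# A hard (crisp) assignment keeps only the cluster with the largest membership.
hard_labels = U.argmax(axis=0)

print(column_sums)
print(hard_labels)
```

The second point (memberships 0.6 and 0.4) is the kind of borderline case that a hard clustering hides and a fuzzy clustering keeps visible.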

Page 4: Fuzzy dm

Applications

Image segmentation

Medical imaging: X-ray Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET)

Image and speech enhancement

Edge detection

Video shot change detection

Page 5: Fuzzy dm

Fuzzy C-means Clustering

Page 12: Fuzzy dm

Fuzzy C-Means Algorithm

1. Select an initial fuzzy pseudo-partition, i.e., assign values to all u_ij.

2. repeat

3. Compute the centroid of each cluster using the fuzzy pseudo-partition.

4. Recompute the fuzzy pseudo-partition, i.e., the u_ij.

5. until the centroids don't change.
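The steps above can be sketched in Python with NumPy. This is only an illustrative sketch, not the presenter's implementation: the function name `fuzzy_c_means` and its parameters (`tol`, `max_iter`, `seed`) are hypothetical, "the centroids don't change" is approximated by a small tolerance on the centroid movement, and the update rules are the standard fuzzy c-means ones with Euclidean distance and fuzzifier m.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """Illustrative fuzzy c-means; returns (centers, U) with U of shape (C, N)."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                      # treat 1-D input as N points in 1 dimension
    rng = np.random.default_rng(seed)

    # Step 1: random fuzzy pseudo-partition, each column (point) sums to 1.
    U = rng.random((n_clusters, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)

    centers = None
    for _ in range(max_iter):               # Step 2: repeat
        # Step 3: centroids are means of the points weighted by u_ij ** m.
        W = U ** m
        new_centers = (W @ X) / W.sum(axis=1, keepdims=True)

        # Step 4: recompute memberships from the distances to the centroids.
        dist = np.linalg.norm(X[None, :, :] - new_centers[:, None, :], axis=2)
        dist = np.maximum(dist, 1e-12)      # guard against a point sitting on a centroid
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)

        # Step 5: stop once the centroids have (almost) stopped moving.
        if centers is not None and np.linalg.norm(new_centers - centers) < tol:
            centers = new_centers
            break
        centers = new_centers
    return centers, U

centers, U = fuzzy_c_means([3, 7, 10, 17, 18, 20], n_clusters=2)
print(np.sort(centers.ravel()))
print(np.round(U, 2))
```

Normalizing `inv` column by column is algebraically the same as the membership formula with the ratio of distances; it just avoids forming every pairwise ratio explicitly.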

Page 13: Fuzzy dm

Algorithm

Page 14: Fuzzy dm

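An example: X = [3 7 10 17 18 20], and assume C = 2.

Initially, set U randomly:

$$U = \begin{bmatrix} 0.1 & 0.2 & 0.6 & 0.3 & 0.1 & 0.5 \\ 0.9 & 0.8 & 0.4 & 0.7 & 0.9 & 0.5 \end{bmatrix}$$

Compute the centroids c_j, assuming m = 2, using

$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}$$

c1 = 13.16; c2 = 11.81

Compute new membership values u_ij using

$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{\|x_i - c_j\|}{\|x_i - c_k\|} \right)^{\frac{2}{m-1}}}$$

New U:

$$U = \begin{bmatrix} 0.43 & 0.38 & 0.24 & 0.65 & 0.62 & 0.59 \\ 0.57 & 0.62 & 0.76 & 0.35 & 0.38 & 0.41 \end{bmatrix}$$

Repeat the centroid and membership computations until the changes in membership values are smaller than, say, 0.01.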
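The two update formulas of this example can be checked with a short NumPy sketch. The helper names `centroids` and `memberships` are hypothetical, and the initial U is taken as the rows [0.1 0.2 0.6 0.3 0.1 0.5] and [0.9 0.8 0.4 0.7 0.9 0.5], which is an assumed reading of the matrix on the slide. With that reading the centroid formula gives roughly 13.97 and 11.53 rather than the stated 13.16 and 11.81, so the membership step below starts from the stated centroids; the result then agrees with the stated new U to within about 0.01.

```python
import numpy as np

x = np.array([3, 7, 10, 17, 18, 20], dtype=float)
m = 2.0

def centroids(U, x, m):
    # c_j = sum_i u_ij^m x_i / sum_i u_ij^m, with row j of U belonging to cluster j
    W = U ** m
    return (W @ x) / W.sum(axis=1)

def memberships(x, c, m):
    # u_ij = 1 / sum_k (|x_i - c_j| / |x_i - c_k|)^(2/(m-1))
    # Assumes no point coincides with a centroid (true for this example).
    d = np.abs(x[None, :] - c[:, None])        # d[j, i] = |x_i - c_j|
    ratio = d[:, None, :] / d[None, :, :]      # ratio[j, k, i] = d[j, i] / d[k, i]
    return 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=1)

# Assumed reading of the initial membership matrix from the slide.
U0 = np.array([[0.1, 0.2, 0.6, 0.3, 0.1, 0.5],
               [0.9, 0.8, 0.4, 0.7, 0.9, 0.5]])
c_from_U0 = centroids(U0, x, m)

# Membership update starting from the centroids stated on the slide.
c_slide = np.array([13.16, 11.81])
U_new = memberships(x, c_slide, m)

print(np.round(c_from_U0, 2))
print(np.round(U_new, 2))
```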

Page 15: Fuzzy dm

Complexity analysis

The time complexity of the fuzzy c-means algorithm is O(n d c^2 i), where:

i is the number of FCM passes (iterations) over the entire dataset

n is the number of data points

c is the number of clusters

d is the number of dimensions

In practice, i grows very slowly with n, c and d.
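One way to see where the c^2 factor comes from is to write the membership update with explicit loops: for each of the n points and each of the c clusters, the formula sums over all c clusters again, and every distance costs O(d). The sketch below is a deliberately naive version written for counting, with a hypothetical function name; it assumes distances are recomputed inside the inner sum and that no point coincides with a centroid.

```python
import numpy as np

def naive_membership_update(X, centers, m=2.0):
    """Naive membership update that also counts inner-loop distance evaluations."""
    n, c = X.shape[0], centers.shape[0]
    U = np.zeros((c, n))
    inner_distance_evals = 0
    for i in range(n):                       # n data points
        for j in range(c):                   # c clusters
            d_ij = np.linalg.norm(X[i] - centers[j])
            total = 0.0
            for k in range(c):               # inner sum over c clusters
                d_ik = np.linalg.norm(X[i] - centers[k])   # O(d) work each time
                inner_distance_evals += 1
                total += (d_ij / d_ik) ** (2.0 / (m - 1.0))
            U[j, i] = 1.0 / total
    return U, inner_distance_evals

rng = np.random.default_rng(0)
X = rng.random((5, 3))          # n = 5 points, d = 3 dimensions
centers = rng.random((2, 3))    # c = 2 clusters
U, evals = naive_membership_update(X, centers)
print(evals)                    # n * c * c inner distance evaluations
```

Multiplying the n * c * c distance evaluations by the O(d) cost of each one, and by the i passes over the data, gives the O(n d c^2 i) bound.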

Page 16: Fuzzy dm

Pros and Cons

Pros:

Allows a data point to be in multiple clusters

A more natural representation of the behavior of genes: genes are usually involved in multiple functions

Cons:

Need to define c, the number of clusters

Need to determine the membership cutoff value

Clusters are sensitive to the initial assignment of centroids

Fuzzy c-means is not a deterministic algorithm

Page 17: Fuzzy dm

References

http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/cmeans.html

http://en.wikipedia.org/wiki/Fuzzy_clustering

Section 9.2 from Introduction to Data Mining by Tan, Kumar, Steinbach

Page 18: Fuzzy dm

Thank you.