Fuzzy dm
Fuzzy Clustering
By: Akshay Chaudhari
Agenda
  Introduction
  Fuzzy C-means clustering
  Algorithm
  Complexity analysis
  Pros and cons
  References
Introduction
Fuzzy clustering is a method of clustering that allows one data point to belong to two or more clusters.
In other words, each data point is a member of every cluster, but only to a certain degree, known as its membership value.
Fuzzy C-means, the method covered here (developed by Dunn in 1973 and improved by Bezdek in 1981), is frequently used in pattern recognition.
Applications
  Image segmentation
  Medical imaging
    X-ray Computed Tomography (CT)
    Magnetic Resonance Imaging (MRI)
    Positron Emission Tomography (PET)
  Image and speech enhancement
  Edge detection
  Video shot change detection
Fuzzy C-means Clustering
Fuzzy C-means Algorithm
1. Select an initial fuzzy pseudo-partition, i.e., assign values to all $u_{ij}$.
2. repeat
3.   Compute the centroid of each cluster using the fuzzy pseudo-partition.
4.   Recompute the fuzzy pseudo-partition, i.e., the $u_{ij}$.
5. until the centroids don't change.
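The five steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation: the function name `fuzzy_c_means` and the parameters `tol`, `max_iter` and `seed` are assumptions introduced here, and "until the centroids don't change" is approximated by stopping once no centroid moves by more than `tol`.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, tol=1e-4, max_iter=100, seed=0):
    """Sketch of fuzzy c-means for a list of equal-length numeric tuples.

    Returns (centroids, u), where u[i][j] is the membership of point i
    in cluster j. All parameter names are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])

    # Step 1: random initial fuzzy pseudo-partition; each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centroids = None
    for _ in range(max_iter):
        # Step 3: each centroid is a mean of the points weighted by u_ij ** m.
        new_centroids = []
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            new_centroids.append(tuple(
                sum(w * p[k] for w, p in zip(weights, points)) / total
                for k in range(d)
            ))

        # Step 4: recompute memberships from distances to the new centroids.
        for i, p in enumerate(points):
            dists = [math.dist(p, cj) for cj in new_centroids]
            if any(dist == 0.0 for dist in dists):
                # Point lies exactly on a centroid: give it full membership there.
                hits = [1.0 if dist == 0.0 else 0.0 for dist in dists]
                u[i] = [h / sum(hits) for h in hits]
            else:
                inv = [dist ** (-2.0 / (m - 1.0)) for dist in dists]
                total = sum(inv)
                u[i] = [v / total for v in inv]

        # Step 5: stop once no centroid has moved by more than tol.
        if centroids is not None and all(
            math.dist(a, b) <= tol for a, b in zip(centroids, new_centroids)
        ):
            centroids = new_centroids
            break
        centroids = new_centroids

    return centroids, u
```

A possible call on one-dimensional data is `fuzzy_c_means([(3,), (7,), (10,), (17,), (18,), (20,)], c=2)`. Fixing `seed` makes a run repeatable; a different seed may label the clusters in a different order.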
Algorithm
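An example: X = [3 7 10 17 18 20], and assume C = 2.
Initially, set U randomly (one row per cluster, one column per data point):
\[ U = \begin{bmatrix} 0.1 & 0.2 & 0.6 & 0.3 & 0.1 & 0.5 \\ 0.9 & 0.8 & 0.4 & 0.7 & 0.9 & 0.5 \end{bmatrix} \]
Compute the centroids $c_j$, assuming m = 2, using
\[ c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}} \]
c1 = 13.16; c2 = 11.81
Compute the new membership values $u_{ij}$ using
\[ u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{\| x_i - c_j \|}{\| x_i - c_k \|} \right)^{\frac{2}{m-1}}} \]
New U:
\[ U = \begin{bmatrix} 0.43 & 0.38 & 0.24 & 0.65 & 0.62 & 0.59 \\ 0.57 & 0.62 & 0.76 & 0.35 & 0.38 & 0.41 \end{bmatrix} \]
Repeat the centroid and membership computations until the changes in membership values are smaller than, say, 0.01.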
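As a check on the membership formula, the sketch below applies it to the example data X with the centroids quoted above (13.16 and 11.81) and m = 2. The function name `update_memberships` is illustrative. The code assumes one-dimensional data and that no data point coincides exactly with a centroid (otherwise a distance ratio would divide by zero).

```python
def update_memberships(xs, centroids, m=2.0):
    """Return U with one row per cluster and one column per data point (1-D data)."""
    power = 2.0 / (m - 1.0)
    u = []
    for cj in centroids:
        row = []
        for x in xs:
            # u_ij = 1 / sum_k (|x_i - c_j| / |x_i - c_k|) ** (2 / (m - 1))
            denom = sum((abs(x - cj) / abs(x - ck)) ** power for ck in centroids)
            row.append(1.0 / denom)
        u.append(row)
    return u


xs = [3, 7, 10, 17, 18, 20]
new_u = update_memberships(xs, centroids=[13.16, 11.81])
for row in new_u:
    print(" ".join(f"{v:.2f}" for v in row))
```

The printed rows should agree with the new U of the example to within about 0.01. The third column comes out nearer 0.25 and 0.75, which suggests the slide values were truncated or computed from unrounded centroids.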
Complexity analysis
The time complexity of the fuzzy c-means algorithm is $O(n d c^2 i)$, where
  i is the number of FCM iterations (passes over the entire dataset),
  n is the number of data points,
  c is the number of clusters,
  d is the number of dimensions.
Here, i grows very slowly with n, c and d.
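One way to see where this bound comes from is to count the work in a single iteration. The sketch assumes the membership formula is evaluated directly, so that each of the n·c memberships sums c distance ratios and each distance costs on the order of d operations. An implementation that caches the n·c distances would do less work per iteration, so this is best read as an upper bound.

```latex
\begin{align*}
\text{centroid update, one iteration:}\quad   & c \cdot n \cdot d \\
\text{membership update, one iteration:}\quad & n \cdot c \cdot c \cdot d = n\,d\,c^{2} \\
\text{all } i \text{ iterations:}\quad        & i\,\bigl(n\,d\,c + n\,d\,c^{2}\bigr) = O(n\,d\,c^{2}\,i)
\end{align*}
```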
Pros and Cons
Pros:
  Allows a data point to be in multiple clusters
  A more natural representation of the behavior of genes, since genes are usually involved in multiple functions
Cons:
  Need to define c, the number of clusters
  Need to determine a membership cutoff value
  Clusters are sensitive to the initial assignment of centroids
  Fuzzy c-means is not a deterministic algorithm: different random initializations can give different results
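The membership cutoff mentioned under Cons can be illustrated with a short sketch: given a membership matrix, each data point is reported as belonging to every cluster whose membership reaches the cutoff. The function name `clusters_above_cutoff` and the cutoff of 0.4 are arbitrary choices made for this illustration. The matrix holds the updated membership values from the worked example, with one row per cluster.

```python
def clusters_above_cutoff(u, cutoff):
    """For each data point (column of u), list the clusters with membership >= cutoff."""
    n_points = len(u[0])
    return [
        [j for j, row in enumerate(u) if row[i] >= cutoff]
        for i in range(n_points)
    ]


# Updated memberships from the worked example: rows are clusters, columns are points.
u = [
    [0.43, 0.38, 0.24, 0.65, 0.62, 0.59],
    [0.57, 0.62, 0.76, 0.35, 0.38, 0.41],
]
assignments = clusters_above_cutoff(u, cutoff=0.4)
print(assignments)  # [[0, 1], [1], [1], [0], [0], [0, 1]]
```

With this cutoff the first and last points (3 and 20) are reported in both clusters. Raising the cutoff to 0.5 would leave every point in exactly one cluster, which shows how much the final grouping depends on this choice.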
References
  http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/cmeans.html
  http://en.wikipedia.org/wiki/Fuzzy_clustering
  Tan, Steinbach, Kumar, Introduction to Data Mining, Section 9.2
Thank You …..