MMG991 Session 7 - Michigan State University
MMG991 Session 7
• Non-hierarchical cluster analysis
  – Review fundamental concepts
  – As implemented in S-Plus
    • Microarray data
    • Other applications
  – Open discussion on implementation
• Selection of projects
Cluster analysis revisited
• Hierarchical methods
  – Goals
  – Agglomerative
  – Divisive
  – Unsupervised
  – Output
• Partitioning methods
  – Goals
  – k-means
  – pam, clara and fanny
  – Supervised
• Selecting the number of groups
  – cutree()
  – Output
Cutting the tree
• cutree()
  – Returns a vector of group numbers for the objects clustered
  – Input: tree (output of hclust())
  – Height of cut (h) or number of groups (k)
• Visualizing the cuts
  – Currently no default plotting routine
  – So, what can we do?
    • Table of groupings
    • "Decorate" the tree
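To make the idea of cutting a tree concrete, here is a minimal sketch in Python rather than S-Plus, so it runs without hclust() and cutree(). It assumes average linkage on Euclidean distances and relies on the fact that stopping the merging at k clusters gives the same grouping as building the full tree and cutting it into k groups. The function name `average_linkage_groups` and the sample points are made up for illustration.

```python
from collections import Counter
from itertools import combinations
import math


def average_linkage_groups(points, k):
    """Merge the two closest clusters (average linkage) until k remain.

    Returns one group number per object, in the spirit of the vector
    that cutree(tree, k=k) returns.
    """
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Find the pair of clusters with the smallest average distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: avg_dist(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so position a is unaffected

    groups = [0] * len(points)
    for number, members in enumerate(clusters, start=1):
        for i in members:
            groups[i] = number
    return groups


# Hypothetical example data: two tight groups and one outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
groups = average_linkage_groups(points, k=3)
table = Counter(groups)  # the "table of groupings"
print(groups)            # [1, 1, 1, 2, 2, 3]
print(dict(table))       # {1: 3, 2: 2, 3: 1}
```

The group vector can be tabulated, as here, or used to colour the leaves of the dendrogram, which is how the later slides "decorate" the tree.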
Setting up the example
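# Read the tab-delimited data; the first column holds the gene names
set1a <- read.table("set1a.txt", header=T, sep="\t")
# Tag each name with the data set label and use the names as row names
set1a[,1] <- paste(as.character(set1a[,1]), "1a", sep=".")
row.names(set1a) <- set1a[,1]
# Sort the rows by name and drop the name column
set1a <- set1a[sort(dimnames(set1a)[[1]]), -1]
# Standardize each row (gene): subtract the row mean, divide by the row standard deviation
set1a.norm <- (set1a - apply(set1a,1,mean)) / apply(set1a,1,stdev)
# Rename the columns exp-1, exp-2, ...
for(i in 1:ncol(set1a.norm))
  dimnames(set1a.norm)[[2]][i] <- paste("exp-", as.character(i), sep="")
# Open a graphics window with two stacked panels
graphsheet()
par(mfcol=c(2,1))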
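The row-wise standardization in the code above can be sketched on its own. The example is in Python rather than S-Plus so it can be run anywhere, and it assumes that stdev() is the sample standard deviation (divisor n - 1), which is what `statistics.stdev` computes. The function name `standardize_rows` and the small matrix are hypothetical.

```python
from statistics import mean, stdev


def standardize_rows(matrix):
    """Center each row on its mean and scale it by its sample standard deviation."""
    result = []
    for row in matrix:
        m, s = mean(row), stdev(row)  # a constant row (s == 0) would raise ZeroDivisionError
        result.append([(value - m) / s for value in row])
    return result


# Hypothetical expression values: two genes measured in three experiments.
expression = [[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]]
normalized = standardize_rows(expression)
print(normalized[0])  # [-1.0, 0.0, 1.0]
```

After this step every gene has mean 0 and standard deviation 1 across experiments, so the distance calculations compare expression patterns rather than absolute levels.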
[Figure: Before]
Drawing the figures
# Cluster the genes (rows): Euclidean distance, average linkage
set1a.clust<-hclust(dist(set1a.norm, met="euc"), met="aver")
# Reorder the tree by row means
set1a.clust<-clorder(set1a.clust, apply(set1a.norm, 1, mean))
# Plot the tree without labels and keep the returned coordinates
set1a.plclust<-plclust(set1a.clust, labels=FALSE)
# Cut the tree into 14 groups and mark each leaf with its group colour
set1a.cutree<-cutree(set1a.clust, k=14)
temp<-cbind(set1a.plclust$x, set1a.plclust$y, col=as.vector(set1a.cutree))
for(i in 1:14)
  points(temp[temp[,3]==i,1], temp[temp[,3]==i,2], col=i, pch=16)
# Reorder the rows to match the tree and draw the data matrix as an image
set1a.norm<-set1a.norm[set1a.clust$order,]
image(list(x=1:dim(set1a.norm)[1], y=1:dim(set1a.norm)[2], z=as.matrix(set1a.norm)))
image.legend(as.matrix(set1a.norm), x=nrow(set1a.norm)*1.066, y=ncol(set1a.norm)*1.05, size=c(.1, 2.55), hor=F, cex=0.66, mgp=c(0,0.25,0))
Second dimension
# Cluster the experiments (columns) by transposing the matrix
set1a.tclust<-hclust(dist(t(set1a.norm), met="euc"), met="aver")
set1a.tclust<-clorder(set1a.tclust, apply(t(set1a.norm), 1, mean))
set1a.tplclust<-plclust(set1a.tclust, labels=FALSE)
# Cut the tree into 6 groups and mark each leaf with its group colour
set1a.tcutree<-cutree(set1a.tclust, k=6)
temp<-cbind(set1a.tplclust$x, set1a.tplclust$y, col=as.vector(set1a.tcutree))
for(i in 1:6)
  points(temp[temp[,3]==i,1], temp[temp[,3]==i,2], col=i, pch=16)
# Reorder the columns to match the tree and redraw the image
set1a.norm<-set1a.norm[,set1a.tclust$order]
image(list(x=1:dim(set1a.norm)[1], y=1:dim(set1a.norm)[2], z=as.matrix(set1a.norm)))
image.legend(as.matrix(set1a.norm), x=nrow(set1a.norm)*1.066, y=ncol(set1a.norm)*1.05, size=c(.1, 2.55), hor=F, cex=0.66, mgp=c(0,0.25,0))
[Figure: Gene clusters identified]
[Figure: Experiment clusters identified]
k-means
• Objective
  – Partition observations into groups that minimize the within-group sum of squared distances (withinss)
  – Centroids
  – Requires a defined number of groups
  – Determining optimum number of groups
  – No graphical output
  – The classic example
    • Ruspini's data set
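The objective above can be illustrated with a minimal Python sketch of Lloyd's algorithm, one common way to fit k-means. It is not the S-Plus kmeans() implementation, whose internals the slides do not describe. The starting centers are supplied by the caller, cluster labels are 0-based, and the sample points are made up; the returned names mirror the "cluster", "centers" and "withinss" components of the S-Plus output.

```python
import math


def kmeans(points, centers, iterations=100):
    """Lloyd's algorithm from given starting centers.

    Returns (cluster, centers, withinss); cluster labels are 0-based.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        cluster = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, c in zip(points, cluster) if c == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    # Within-group sum of squared distances, one value per cluster.
    withinss = [
        sum(math.dist(p, centers[j]) ** 2 for p, c in zip(points, cluster) if c == j)
        for j in range(len(centers))
    ]
    return cluster, centers, withinss


# Hypothetical data: two groups of three points each.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
cluster, centers, withinss = kmeans(points, [(0, 0), (10, 10)])
print(cluster)        # [0, 0, 0, 1, 1, 1]
print(sum(withinss))  # the quantity k-means tries to minimize
```

Because the result depends on the starting centers, different starts can give different partitions, which is one reason the number of groups and the quality of the fit need a separate check.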
kmeans(ruspini,4)
Centers:
            x        y
[1,] 98.17647 114.8824
[2,] 20.15000  64.9500
[3,] 43.91304 146.0435
[4,] 68.93333  19.4000
Clustering vector:
 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[40] 3 3 3 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
Within cluster sum of squares:
[1] 4558.235 3689.500 3176.783 1456.533
Cluster sizes:
[1] 17 20 23 15
Available arguments:
[1] "cluster" "centers" "withinss" "size"
[Figure: Ruspini dataset (x axis: ruspini$x, y axis: ruspini$y)]
[Figure: Ruspini dataset, k=4 (x axis: ruspini[, 1], y axis: ruspini[, 2])]
[Figure: Ruspini dataset, k=5 (x axis: ruspini[, 1], y axis: ruspini[, 2])]
So, how many clusters are there?
• Hartigan's recommendation
  • if (sum(k$withinss)/sum(kplus1$withinss)-1)*(nrow(x)-k-1) > 10
  • addition of a group is justified
  • here k$withinss and kplus1$withinss come from the fits with k and k+1 groups
• Setting up a test…
# Store withinss for k = 2 to 20 groups
kscore.ruspini<-as.list(2:21)
for(i in 2:20){
  k<-kmeans(ruspini, i)
  kscore.ruspini[[i]]<-k$withinss
}
# Print Hartigan's index for each step from i to i+1 groups
for(i in 2:19){
  print((sum(kscore.ruspini[[i]])/sum(kscore.ruspini[[i+1]])-1)*(nrow(ruspini)-i-1))
}
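Hartigan's rule above is a one-line formula, so a small Python sketch may help to check the arithmetic by hand. The function names `hartigan_index` and `accept_extra_group` are made up here, and the withinss values are invented for illustration; they are not results from the Ruspini data.

```python
def hartigan_index(withinss_k, withinss_kplus1, n_obs, k):
    """(sum(W_k) / sum(W_k+1) - 1) * (n - k - 1), the expression on the slide."""
    return (sum(withinss_k) / sum(withinss_kplus1) - 1) * (n_obs - k - 1)


def accept_extra_group(withinss_k, withinss_kplus1, n_obs, k, threshold=10):
    """Apply the rule of thumb: adding a group is justified if the index exceeds 10."""
    return hartigan_index(withinss_k, withinss_kplus1, n_obs, k) > threshold


# Invented withinss vectors for 2, 3 and 4 groups on 75 observations.
w2 = [60.0, 40.0]
w3 = [20.0, 15.0, 15.0]
w4 = [18.0, 14.0, 10.0, 6.0]

print(hartigan_index(w2, w3, n_obs=75, k=2))      # 72.0: a third group is justified
print(accept_extra_group(w3, w4, n_obs=75, k=3))  # False: index is about 2.96
```

The index is large when the extra group removes a big share of the within-group sum of squares, and it shrinks toward zero once additional groups only split existing clusters.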
[3] 53.74084
[4] 210.9672
[5] 8.920013
[6] 21.9182
[7] 13.646
[8] 10.26787
[9] 12.0679
[10] 6.488705
[11] 13.35045
[12] 9.935521
[13] 4.754963
[14] 7.834783
[15] 6.378573
[16] 3.886348
[17] 2.072307
[18] 4.197096
[19] 5.176718
[20] 4.354051
k-means with array data
# Standardize each row (gene) and sort the rows by name
set1a.norm<-(set1a - apply(set1a,1,mean))/apply(set1a,1,stdev)
set1a.norm<-set1a.norm[sort(dimnames(set1a.norm)[[1]]),]
# Cluster the genes into 14 groups and order the gene names by group
set1a.kmeans<-kmeans(set1a.norm, 14)
gene.order<-cbind(dimnames(set1a)[[1]], set1a.kmeans$cluster)
gene.order<-gene.order[order(gene.order[,2]),1]
# Cluster the experiments into 6 groups and order the experiment names by group
set1a.kmeans<-kmeans(t(set1a.norm), 6)
exp.order<-cbind(dimnames(set1a)[[2]], set1a.kmeans$cluster)
exp.order<-exp.order[order(exp.order[,2]),1]
# To visualize the output of the two analyses
temp<-set1a.norm[gene.order, exp.order]
image(list(x=1:dim(temp)[1], y=1:dim(temp)[2], z=as.matrix(temp)))
[Figure: Kmeans, genes=14, exp=6]
Optimum number of clusters
        Genes       Experiments
[3]     52.11781    82.84932
[4]     61.86213    111.1703
[5]     76.23372    52.30111
[6]     57.22758    46.1943
[7]     134.5601    46.47232
[8]     115.662     49.47469
[9]     111.2952    23.58832
[10]    89.475      39.69845
[11]    182.0612    28.4224
[12]    187.1214    28.27107
[13]    153.94      43.54086
[14]    316.5357    22.23373
[15]    8.525307    35.81718
[16]    4.961622    23.46791
[17]    6.483269    29.98056
[18]    6.932642    24.97635
[19]    3.641691    31.67716
[20]    3.420847    30.31866
[Figure: Optimized k-means clustering]
Six steps of cluster analysis
• Obtaining the data matrix
  – Test data set
• Standardizing the data matrix
  – Normalization
• Computing the resemblance matrix
  – Similarity
  – Dissimilarity
  – Distance
  – Other measures
• Clustering the data
• Rearranging the data matrix
• Goodness of fit
Partitioning around medoids (pam())
• Similar to k-means
  – Utilizes medoids rather than centroids
  – More robust
    • Minimizes sum of dissimilarities rather than sum of squared Euclidean distances
  – Provides graphical output to evaluate clustering
    • Silhouette plots
      – Denotes number of clusters, cluster width and quality
      – Ranked in decreasing order
      – Overall average silhouette width
        » Heuristics
  – pam(x, k, diss=F, metric="euclidean", stand=F, save.x=T, save.diss=T)
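The silhouette width reported by the plots can be computed by hand. The Python sketch below uses the standard definition s(i) = (b - a) / max(a, b), where a is the mean dissimilarity of object i to the other members of its own cluster and b is the smallest mean dissimilarity to any other cluster. It assumes Euclidean dissimilarities and at least two clusters, and it is not the S-Plus pam() code; the function name and the data are hypothetical.

```python
import math


def silhouette_widths(points, cluster):
    """Silhouette width s(i) = (b - a) / max(a, b) for every object."""
    labels = sorted(set(cluster))
    widths = []
    for i, p in enumerate(points):
        # Mean distance from object i to the (other) members of each cluster.
        mean_dist = {}
        for label in labels:
            others = [q for j, q in enumerate(points) if cluster[j] == label and j != i]
            if others:
                mean_dist[label] = sum(math.dist(p, q) for q in others) / len(others)
        own = cluster[i]
        if own not in mean_dist:
            widths.append(0.0)  # singleton cluster: width is defined as 0
            continue
        a = mean_dist[own]
        b = min(d for label, d in mean_dist.items() if label != own)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths


# Hypothetical data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_widths(points, [1, 1, 2, 2])
bad = silhouette_widths(points, [1, 2, 1, 2])
print(sum(good) / len(good))  # close to 1: objects sit well inside their clusters
print(sum(bad) / len(bad))    # negative: objects are closer to the other cluster
```

Widths near 1 indicate well-clustered objects, widths near 0 indicate objects between two clusters, and negative widths suggest the object is probably in the wrong cluster. The overall average is the figure quoted under each silhouette plot.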
pam() with array data
# Partition the genes into 14 groups and the experiments into 4 groups
set1a.pam<-pam(set1a.norm,14)
set1a.tpam<-pam(t(set1a.norm),4)
# Order the gene and experiment names by cluster membership
gene.order<-cbind(dimnames(set1a.norm)[[1]], set1a.pam$clustering)
gene.order<-gene.order[order(gene.order[,2]),1]
exp.order<-cbind(dimnames(set1a.norm)[[2]], set1a.tpam$clustering)
exp.order<-exp.order[order(exp.order[,2]),1]
# Draw the rearranged data matrix with a legend
temp<-set1a.norm[gene.order, exp.order]
image(list(x=1:dim(temp)[1], y=1:dim(temp)[2], z=as.matrix(temp)))
image.legend(as.matrix(temp), x=nrow(temp)*1.075, y=ncol(temp)*1.05, size=c(.125, 6.1), hor=F, cex=0.66, tck=-0.01, mgp=c(0,0.5,0))
[Figure: Silhouette plot, grouped by gene, k=14. Average silhouette width: 0.83]
[Figure: Silhouette plot, grouped by expt, k=4. Average silhouette width: 0.29]
[Figure: DNA array data by pam()]
Clustering large applications
• Clara
  – Optimized version of pam()
  – Limitations of k-means and pam()
    • Memory requirements are quadratic
  – Algorithm works with subsets
    • Divides each sampled subset into k clusters
    • Remaining objects assigned to the nearest medoid
    • Subsequent iterations forced to contain currently best medoids
  – clara(x, k, metric="euclidean", stand=F, samples=5, sampsize=40 + 2 * k, save.x=T, save.diss=T)
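The sampling scheme described above can be sketched in Python. This is an illustration of the idea only, not the S-Plus clara() code: an exhaustive medoid search stands in for pam() on each sample (feasible only for small samples and small k), Euclidean distance is assumed, and the helper names and test data are made up. The defaults `samples=5` and `sampsize=40 + 2 * k` are taken from the signature on the slide.

```python
import math
import random
from itertools import combinations


def total_dissimilarity(points, medoids):
    """Sum over all objects of the distance to the nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)


def best_medoids(sample, k):
    """Exhaustive medoid search on a small sample (a stand-in for pam())."""
    return min(combinations(sample, k), key=lambda ms: total_dissimilarity(sample, ms))


def clara(points, k, samples=5, sampsize=None, seed=0):
    rng = random.Random(seed)
    sampsize = min(len(points), sampsize or 40 + 2 * k)
    best = None
    for _ in range(samples):
        sample = rng.sample(points, sampsize)
        if best is not None:
            # Later samples are forced to contain the best medoids found so far.
            sample = list(best) + [p for p in sample if p not in best][:sampsize - k]
        medoids = best_medoids(sample, k)
        # Judge each candidate on the full data set, not only on its sample.
        if best is None or total_dissimilarity(points, medoids) < total_dissimilarity(points, best):
            best = medoids
    # Assign every object, sampled or not, to its nearest medoid (0-based labels).
    clustering = [min(range(k), key=lambda j: math.dist(p, best[j])) for p in points]
    return best, clustering


# Hypothetical data: two 5 x 4 grids of points, far apart.
blob_a = [(x, y) for x in range(5) for y in range(4)]
blob_b = [(x + 50, y + 50) for x in range(5) for y in range(4)]
medoids, clustering = clara(blob_a + blob_b, k=2, samples=3, sampsize=21)
print(medoids)
print(clustering)
```

Only the sample needs a full dissimilarity matrix, which is how the approach avoids the quadratic memory cost mentioned above; the price is that the result depends on which objects happen to be sampled.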
[Figure: Silhouette plots. Average silhouette width: 0.84 (first panel), 0.31 (second panel)]
[Figure: DNA array data by clara()]
[Figure: Silhouette plots. Average silhouette width: 0.77 (first panel), 0.22 (second panel)]
[Figure: DNA array data by fanny()]
Summing up
• Cluster analysis provides a means of organizing the data based on common features
• Different algorithms may arrive at different solutions
• Homework for next week
  – Comparing the output of hierarchical and partition methods
    • Use Eisen's test data
      – Which genes consistently group together?
      – Which experiments consistently group together?
  – Projects