A Cumulative Voting Consensus Method for Partitions with a Variable Number of Clusters
Hanan G. Ayad, Mohamed S. Kamel, ECE Department University of Waterloo, Ontario, Canada
Hamdi JENZRI
MRL Seminar
Outline
Introduction
Consensus Clustering
Review of consensus methods
Contribution of the authors
Theoretical formulation
Algorithms
Experimental results
Conclusion
Introduction
Cluster analysis: discovery of a meaningful grouping for a set of data objects by finding a partition that optimizes an objective function.
The number of ways of partitioning a set of n objects into k non-empty clusters is a Stirling number of the second kind, which is of the order of $k^n/k!$.
A well-known dilemma in data clustering is the multitude of models for the same set of data objects: different algorithms, different distance measures, different features for characterizing the objects, and different scales (the number of clusters).
This issue has motivated a large body of research on the comparison and consensus of data clusterings.
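To give a feel for how quickly the number of partitions grows, here is a minimal sketch that computes the Stirling number of the second kind with the standard recurrence S(n, k) = k·S(n-1, k) + S(n-1, k-1) and compares it with the approximation k^n/k!. The function name `stirling2` is a hypothetical helper, not something from the paper.

```python
from math import factorial

def stirling2(n: int, k: int) -> int:
    """Number of ways to partition n objects into k non-empty clusters."""
    if k < 0 or k > n:
        return 0
    row = [1] + [0] * k              # row for 0 objects: S(0, 0) = 1
    for i in range(1, n + 1):
        new = [0] * (k + 1)
        for j in range(1, min(i, k) + 1):
            # object i either joins one of j clusters or opens a new one
            new[j] = j * row[j] + row[j - 1]
        row = new
    return row[k]

exact = stirling2(7, 3)              # 301 partitions of 7 objects into 3 clusters
approx = 3 ** 7 / factorial(3)       # 364.5, the k^n / k! order of magnitude
```

The approximation overshoots for small n but captures the exponential growth in n that makes exhaustive search over partitions infeasible.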
Consensus Clustering
Consensus clustering (cluster ensembles): finding a consensus partition that summarizes an ensemble of b partitions in some meaningful sense.
[Diagram: Data → Method 1, Method 2, …, Method b → Partition 1, Partition 2, …, Partition b → Consensus Partition]
Objectives of Consensus Clustering
Improving clustering accuracy over a single data clustering
Allowing the discovery of arbitrarily shaped cluster structures
Reducing the instability of a clustering algorithm due to noise, outliers, or randomized algorithms
Reusing preexisting clusterings (knowledge reuse)
Exploring random feature subspaces or random projections for high-dimensional data
Exploiting weak clusterings, such as splitting the data with random hyperplanes
Estimating confidence in cluster assignments for individual observations
Clustering in distributed environments, including feature-distributed or object-distributed clustering
Consensus Approaches
Axiomatic: concerned with deriving possibility/impossibility theorems on the existence and uniqueness of consensus partitions satisfying certain conditions.
Constructive: specifies rules for constructing a consensus, such as the Pareto rule, also known as the strict consensus rule, whereby two objects occur together in a consensus if and only if they occur together in all the individual partitions.
Combinatorial optimization: considers an objective function J, measuring the remoteness of a partition to the ensemble of partitions, and searches for a partition in the set of all possible partitions of the data objects that minimizes J. The approach is related to the notion of central value in statistics.
Challenge in Consensus Clustering
The labels returned by clustering algorithms are purely symbolic.
Take the example of partitioning a 7-object data set into k = 3 clusters, where the first 3 objects belong to a first cluster, the following 2 objects to a second cluster, and the remaining 2 objects to a third cluster.
The vector representation of the clustering result can be [1 1 1 2 2 3 3], [2 2 2 1 1 3 3], [3 3 3 2 2 1 1], or any of the k! = 3! = 6 possible label permutations.
The matrix representation of the same partition can be

$$
\begin{bmatrix} 1&1&1&0&0&0&0\\ 0&0&0&1&1&0&0\\ 0&0&0&0&0&1&1 \end{bmatrix},\quad
\begin{bmatrix} 0&0&0&1&1&0&0\\ 1&1&1&0&0&0&0\\ 0&0&0&0&0&1&1 \end{bmatrix},\quad
\begin{bmatrix} 0&0&0&0&0&1&1\\ 0&0&0&1&1&0&0\\ 1&1&1&0&0&0&0 \end{bmatrix}
$$

or any of the k! = 3! = 6 possible permutations of the rows.
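The label-permutation issue can be checked mechanically. The sketch below uses two hypothetical helpers: `canonical_labels` renames clusters in order of first appearance, and `to_matrix` builds the binary membership matrix with rows ordered by label value. The three label vectors from the slide differ as written but describe one and the same partition.

```python
def canonical_labels(labels):
    """Rename clusters 1, 2, ... in order of first appearance."""
    mapping = {}
    out = []
    for lab in labels:
        if lab not in mapping:
            mapping[lab] = len(mapping) + 1
        out.append(mapping[lab])
    return out

def to_matrix(labels):
    """k x n binary membership matrix, one row per label in sorted order."""
    return [[1 if lab == c else 0 for lab in labels]
            for c in sorted(set(labels))]

a = [1, 1, 1, 2, 2, 3, 3]
b = [2, 2, 2, 1, 1, 3, 3]
c = [3, 3, 3, 2, 2, 1, 1]

# Different symbolic labels, same underlying partition.
same_partition = canonical_labels(a) == canonical_labels(b) == canonical_labels(c)
# The matrices differ only by a permutation of their rows.
rows_permuted = to_matrix(a) != to_matrix(b) and sorted(to_matrix(a)) == sorted(to_matrix(b))
```

Canonical relabeling only works when the partitions are identical; aligning partitions that genuinely differ, or that have different numbers of clusters, is the harder problem the paper addresses.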
Review of Consensus Methods
Strehl and Ghosh: the optimal consensus is the partition that shares the most information with the ensemble of partitions, as measured by the Average Normalized Mutual Information (ANMI).

| Algorithm | Computational complexity |
| --- | --- |
| Cluster-based Similarity Partitioning Algorithm (CSPA) | Quadratic in n |
| HyperGraph Partitioning Algorithm (HGPA) | Linear in n |
| Meta-CLustering Algorithm (MCLA) | Linear in n |

Disadvantage: they seek balanced-size clusters, making them unsuitable for data with highly unbalanced clusters.
Review of Consensus Methods
Fred and Jain, Evidence Accumulation Clustering (EAC): the cluster ensemble is mapped to a co-association matrix, where entries can be interpreted as votes (or vote ratios) on the pairwise co-occurrences of objects and are computed as the number of times each pair of objects co-occurs in the same cluster of a base clustering (relative to the total number of base clusterings). The final consensus clustering is extracted by applying linkage-based clustering algorithms on the co-association matrix.
Computational complexity: quadratic in n.
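As an illustration of the co-association idea described above, the following sketch counts, for every pair of objects, the fraction of base clusterings that put them in the same cluster. The function name and the toy ensemble are assumptions for illustration; the linkage-based extraction step of EAC is not shown.

```python
def co_association(ensemble):
    """n x n matrix of vote ratios: share of partitions where i and j co-occur."""
    b = len(ensemble)
    n = len(ensemble[0])
    counts = [[0] * n for _ in range(n)]
    for labels in ensemble:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # divide once at the end so the ratios are exact for small ensembles
    return [[c / b for c in row] for row in counts]

# Two base clusterings of four objects, with different numbers of clusters.
M = co_association([[1, 1, 2, 2],
                    [1, 2, 2, 2]])
```

The double loop over object pairs is what makes this family of methods quadratic in n, and labels never need to be aligned because only co-membership is counted.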
Contributions of the Authors
They introduce a new solution for the problem of aligning the cluster labels of a given clustering with $k^i$ clusters with respect to a reference clustering with $k^o$ clusters.
This is done through what they call “cumulative voting”.
A plurality voting scheme (winner takes all) allows each voter to vote for one option, and the option that receives the most votes is the winner.
Cumulative voting is a rated voting scheme, where each voter gives a numeric value (called a rating) to each option such that the voter’s ratings add up to a certain total (for example, a number of points).
Cumulative voting is sometimes referred to as weighted voting.
As proposed in this paper, cumulative voting maps an input $k^i$-partition into a probabilistic representation as a $k^o$-partition with cluster labels corresponding to the labels of the reference $k^o$ clusters.
They formulate the selection criterion for the reference clustering based on the maximum information content, as measured by the entropy.
Contributions of the Authors
They explore different cumulative voting models.
With a fixed reference partition: un-normalized weighting and normalized weighting.
With an adaptive reference partition: the reference partition is incrementally updated so as to relax the dependence on the selected reference. Furthermore, these updates are performed in decreasing order of entropy so as to smooth the updates of the reference partition.
Based on the proposed cumulative vote mapping, they define the criterion for obtaining a first summary of the ensemble as the minimum average squared distance between the mapped partitions and the optimal representation of the ensemble.
Contributions of the Authors
Finally, they formulate the problem of extracting the optimal consensus partition as that of finding a compressed summary of the estimated distribution that preserves the maximum relevant information.
They relate the problem to the Information Bottleneck (IB) method of Tishby et al. and propose an efficient solution using an agglomerative algorithm similar to the agglomerative IB algorithm of Slonim and Tishby, which minimizes the Jensen-Shannon (JS) divergence within the clusters.
They demonstrate the effectiveness of the proposed consensus algorithms using ensembles of K-Means clusterings with a randomly selected number of clusters, where the goal is to enable the discovery of arbitrary cluster structures.
Architecture of the proposed Method
[Diagram: Selection of the Reference Partition (maximum entropy) → Cumulative Vote Mapping (un-normalized, normalized, or adaptive vote weighting scheme) → First Summary of the partitions (minimum average squared distance) → Finding the optimal consensus (minimizing the Jensen-Shannon (JS) divergence)]
Theoretical Formulation
Let X denote a set of n data objects $x_j$, where $x_j \in \mathbb{R}^d$.
A partition of X into k clusters is represented as an n-dimensional cluster labeling vector $y \in \mathcal{C}^n$, where $\mathcal{C} = \{c_1, \ldots, c_k\}$.
Alternatively, the partition can be represented as a $k \times n$ matrix denoted as U, with a row for each cluster and a column for each object $x_j$.
Hard partition: U is referred to as a binary stochastic matrix, where each entry $u_{lj} \in \{0, 1\}$ and $\sum_{l=1}^{k} u_{lj} = 1$ for every j.
Soft partition: let C denote a categorical random variable defined over the set of cluster labels $\mathcal{C}$. A stochastic partition corresponds to a probabilistic clustering and is defined as a partition where each observation is assigned to a cluster with an estimated posterior probability, $u_{lj} = \hat{p}(c_l \mid x_j)$.
Theoretical Formulation
Consider as input an ensemble of b partitions with a variable number of clusters, $\mathcal{U} = \{U^1, \ldots, U^b\}$, such that each partition $U^i$ is a $k^i \times n$ binary stochastic matrix (hard partition).
They use a center-based clustering algorithm, namely the K-Means for text data, to generate the cluster ensembles.
The number of clusters for individual partitions is randomly selected within some range, $k^i \in [k_{min}, k_{max}]$.
They address the problem of estimating a consensus partition for the set of data objects X that optimally represents the ensemble $\mathcal{U}$.
Selection of the reference partition
Consider the random variable $C^i$ defined over the cluster labels of the ith clustering, with probability distribution

$$\hat{p}(c_l^i) = \frac{n_l^i}{n}$$

where $n_l^i$ is the number of objects assigned to cluster $c_l^i$.
The Shannon entropy, defined as a measure of the average information content (uncertainty) associated with a random outcome, is a function of its distribution:

$$H(C^i) = -\sum_{l=1}^{k^i} \hat{p}(c_l^i) \log \hat{p}(c_l^i)$$

The higher the entropy, the more informative the distribution. Thus, the partition $U^i \in \mathcal{U}$ with the highest entropy represents the cluster label distribution with the maximum information content:

$$U^o = \arg\max_{U^i \in \mathcal{U}} H(C^i)$$
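A minimal sketch of the maximum-entropy selection rule follows. It assumes each partition is given as a label vector, estimates the cluster label distribution from cluster sizes, and uses base-2 logarithms (the base is an assumption; it does not change which partition wins). The function names are hypothetical.

```python
from collections import Counter
from math import log2

def label_entropy(labels):
    """Shannon entropy of the cluster label distribution p(c_l) = n_l / n."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def select_reference(ensemble):
    """Index of the partition with the highest label entropy."""
    return max(range(len(ensemble)), key=lambda i: label_entropy(ensemble[i]))

ensemble = [
    [1, 1, 1, 1, 2, 2],   # two unbalanced clusters
    [1, 1, 2, 2, 3, 3],   # three balanced clusters
    [1, 1, 1, 2, 2, 2],   # two balanced clusters
]
ref_index = select_reference(ensemble)   # the three-cluster partition
```

Because entropy grows with both the number of clusters and their balance, this rule tends to pick a fine-grained, evenly sized partition as the reference.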
Cumulative Vote Mapping
Consider some reference partition $U^o$ and a partition $U^i \in \mathcal{U}$ with $k^o$ and $k^i$ clusters and associated random variables $C^o$ and $C^i$, with estimated probability distributions denoted as $\hat{p}(c^o)$ and $\hat{p}(c^i)$.
$U^i$ is designated as the voting partition with respect to $U^o$.
Each cluster $c_l^i$ is viewed as a voter that votes for each of the clusters $c_q^o$, with a weight denoted as $w_{ql}^i$.
The weights are represented in a $k^o \times k^i$ cumulative vote weight matrix, denoted as $W^i$.
Two weighting schemes are considered: un-normalized and normalized.
Un-normalized Weighting Scheme
It is connected to the co-association matrix commonly used for summarizing a set of partitions.
Let $u_q^o$ and $u_l^i$ be the qth and lth row vectors of $U^o$ and $U^i$, respectively. The weight

$$w_{ql}^i = u_q^o \, (u_l^i)^T$$

represents the number of objects belonging to both clusters $c_q^o$ and $c_l^i$.
The binary $k^i$-dimensional vectors of $U^i$ are transformed into $k^o$-dimensional frequency vectors, represented in the mapped matrix $U^{o,i}$.
Members of cluster $c_l^i$ are scaled by $w_{ql}^i$ when mapped as members of cluster $c_q^o$.
Un-normalized Weighting Scheme
Each entry of the $k^o \times n$ matrix $U^{o,i}$ is taken as an assignment frequency of object $x_j$ to cluster $c_q^o$, where

$$U^{o,i} = W^i U^i$$

[Example: figure not available]
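Since the slide's own example is not available, here is a small illustrative sketch of the un-normalized mapping. It assumes the weight matrix is the overlap count matrix between reference and voting clusters (the product of the reference membership matrix with the transposed voting membership matrix) and that the mapped partition is the product of that weight matrix with the voting membership matrix. The helper names and the toy labels are hypothetical.

```python
def membership_matrix(labels, k):
    """k x n binary matrix for labels in 0..k-1."""
    return [[1 if lab == c else 0 for lab in labels] for c in range(k)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def unnormalized_map(U_o, U_i):
    """Return (W, mapped): overlap counts and the k_o x n frequency matrix."""
    W = matmul(U_o, transpose(U_i))   # W[q][l] = objects shared by c_q^o and c_l^i
    return W, matmul(W, U_i)          # column j is the W column of x_j's voting cluster

U_o = membership_matrix([0, 0, 0, 1, 1, 2, 2], 3)   # reference, 3 clusters
U_i = membership_matrix([0, 0, 1, 1, 1, 1, 1], 2)   # voting partition, 2 clusters
W, mapped = unnormalized_map(U_o, U_i)
# W      == [[2, 1], [0, 2], [0, 2]]
# mapped == [[2, 2, 1, 1, 1, 1, 1],
#            [0, 0, 2, 2, 2, 2, 2],
#            [0, 0, 2, 2, 2, 2, 2]]
```

Objects in the large voting cluster receive votes spread over all three reference clusters, while objects in the small voting cluster vote only for the reference cluster they overlap.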
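Normalized Weighting Scheme
The weights are normalized to sum to 1.
The weight $w_{ql}^i$ is computed as the average of the conditional probabilities of cluster $c_q^o$, given each of the data objects assigned to cluster $c_l^i$, which is taken as an estimate of the conditional probability $\hat{p}(c_q^o \mid c_l^i)$:

$$w_{ql}^i = \frac{1}{n_l^i} \sum_{x_j \in c_l^i} \hat{p}(c_q^o \mid x_j)$$

When the reference partition is represented as a binary stochastic matrix, this reduces to $w_{ql}^i = u_q^o \, (u_l^i)^T / n_l^i$.
Each of the $k^i$ columns of $W^i$ is a probability vector.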
Normalized Weighting Scheme
Each partition $U^i$ is mapped using

$$U^{o,i} = W^i U^i$$

Each entry $u_{qj}^{o,i}$ of $U^{o,i}$ is considered as an estimate of $\hat{p}(c_q^o \mid x_j)$, so $U^{o,i}$ is a stochastic partition.
Consider the estimated priors of the mapped clusters:

$$\hat{p}(c_q^{o,i}) = \frac{1}{n} \sum_{j=1}^{n} u_{qj}^{o,i}$$

which gives

$$\hat{p}(c_q^{o,i}) = \frac{1}{n} \sum_{l=1}^{k^i} w_{ql}^i \, n_l^i = \hat{p}(c_q^o)$$

This ensures the entropy-preserving property: the mapped partition has the same cluster label distribution, and hence the same entropy, as the reference partition.
The normalization scheme reflects the intuition that objects that are members of a large cluster are considered less strongly associated with each other than objects belonging to a small cluster.
Normalized Weighting Scheme
[Example: figure not available]
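As a stand-in illustration of the normalized scheme, the sketch below assumes a hard reference partition, so that each weight is the fraction of a voting cluster's objects that fall in a given reference cluster. It also assumes labels are integers starting at 0 and that every voting cluster is non-empty. The function name and toy labels are hypothetical.

```python
def normalized_map(ref_labels, k_o, vote_labels, k_i):
    """Return (W, mapped) for the normalized cumulative vote weighting."""
    counts = [[0] * k_i for _ in range(k_o)]
    sizes = [0] * k_i
    for q, l in zip(ref_labels, vote_labels):
        counts[q][l] += 1          # overlap of reference cluster q and voter l
        sizes[l] += 1              # size of voting cluster l
    # W[q][l] estimates p(c_q^o | c_l^i); every column sums to 1
    W = [[counts[q][l] / sizes[l] for l in range(k_i)] for q in range(k_o)]
    # each object inherits the weight column of its voting cluster
    mapped = [[W[q][l] for l in vote_labels] for q in range(k_o)]
    return W, mapped

ref = [0, 0, 0, 1, 1, 2, 2]
vote = [0, 0, 1, 1, 1, 1, 1]
W, mapped = normalized_map(ref, 3, vote, 2)

n = len(ref)
mapped_priors = [sum(row) / n for row in mapped]
ref_priors = [ref.count(q) / n for q in range(3)]
```

In this toy case the priors of the mapped partition agree with those of the reference (3/7, 2/7, 2/7, up to floating-point rounding), which is the entropy-preserving behavior the slides refer to.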
Average-Squared-Distance Criterion for Mapped Partitions
The criterion for finding a stochastic partition Û of X summarizing a set of b partitions is the minimum average distance between the mapped ensemble partitions and the optimal consensus, defined as follows:

$$\hat{U} = \arg\min_{\bar{U}} \frac{1}{b} \sum_{i=1}^{b} h\big(\bar{U}, U^{o,i}\big)$$

where $U^{o,i}$ is the mapping of partition $U^i$ into a stochastic partition, defined with respect to the reference partition $U^o$.
The dissimilarity function h() is defined as the average squared distance between the probability (column) vectors $\bar{u}_j$ and $u_j^{o,i}$, given as

$$h\big(\bar{U}, U^{o,i}\big) = \frac{1}{n} \sum_{j=1}^{n} \lVert \bar{u}_j - u_j^{o,i} \rVert^2$$
Average-Squared-Distance Criterion for Mapped Partitions
This minimization problem can be solved directly by calculating each column $\hat{u}_j$ of Û as the average of the probability vectors $u_j^{o,i}$:

$$\hat{u}_j = \frac{1}{b} \sum_{i=1}^{b} u_j^{o,i}$$

Note that, with the cumulative vote mapping of the partitions, the number of clusters of Û is preset through the selected reference partition, regardless of the number of clusters $k^i$ of each partition.
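One way to see why the average solves the criterion, assuming the dissimilarity is the squared Euclidean distance between column vectors: write $\bar{u}_j$ for the jth column of a candidate summary and $u_j^{o,i}$ for the jth column of the ith mapped partition (this notation is an assumption where the slide's symbols were not recoverable). The objective separates over objects, so each column can be minimized on its own:

```latex
\begin{aligned}
J(\bar{u}_j) &= \frac{1}{b}\sum_{i=1}^{b} \lVert \bar{u}_j - u_j^{o,i} \rVert^2, \\
\nabla J(\bar{u}_j) &= \frac{2}{b}\sum_{i=1}^{b} \bigl(\bar{u}_j - u_j^{o,i}\bigr) = 0
\;\Longrightarrow\;
\hat{u}_j = \frac{1}{b}\sum_{i=1}^{b} u_j^{o,i}.
\end{aligned}
```

Under the normalized scheme every $u_j^{o,i}$ is a probability vector, so their average is one too, and the unconstrained minimizer already satisfies the stochastic-partition constraint.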
Algorithms
Un-normalized Reference-based Cumulative Voting (URCV)
Reference-based Cumulative Voting (RCV)
Adaptive Cumulative Voting (ACV)
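The algorithm listings themselves are not reproduced in this transcript. The following is a rough sketch, not the authors' pseudocode, of how the normalized (RCV) and adaptive (ACV) variants could be put together from the descriptions in the earlier slides. Assumptions: partitions are label vectors with integer labels starting at 0, the reference is the maximum-entropy partition, and ACV updates the reference as a running average of the mapped partitions processed in decreasing order of entropy (the exact update rule is an assumption). All function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def one_hot(labels):
    """Binary k x n membership matrix of a hard partition."""
    k = max(labels) + 1
    return [[1.0 if lab == q else 0.0 for lab in labels] for q in range(k)]

def vote_map(ref, labels):
    """Map a hard partition onto a hard or soft k_o x n reference (normalized weights)."""
    k_i = max(labels) + 1
    sizes = Counter(labels)
    mapped = []
    for row in ref:
        w = [0.0] * k_i
        for lab, p in zip(labels, row):
            w[lab] += p / sizes[lab]      # average reference posterior within voter lab
        mapped.append([w[lab] for lab in labels])
    return mapped

def rcv(ensemble):
    """Fixed reference: average of all partitions mapped onto the max-entropy one."""
    ref = one_hot(max(ensemble, key=entropy))
    b, n = len(ensemble), len(ensemble[0])
    summary = [[0.0] * n for _ in ref]
    for labels in ensemble:
        mapped = vote_map(ref, labels)
        for q in range(len(ref)):
            for j in range(n):
                summary[q][j] += mapped[q][j] / b
    return summary

def acv(ensemble):
    """Adaptive reference: running average, partitions taken by decreasing entropy."""
    ordered = sorted(ensemble, key=entropy, reverse=True)
    ref = one_hot(ordered[0])
    for t, labels in enumerate(ordered[1:], start=2):
        mapped = vote_map(ref, labels)
        ref = [[((t - 1) * r + v) / t for r, v in zip(r_row, m_row)]
               for r_row, m_row in zip(ref, mapped)]
    return ref
```

Both functions return a stochastic partition whose number of rows is fixed by the reference, whatever the number of clusters in the individual partitions.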
Finding the optimal consensus
The above summarization of the ensemble does not always capture the most “meaningful” or “relevant” information, given the arbitrary number of clusters $k^i$.
The problem of extracting a compressed representation of stochastic data that captures only the relevant or meaningful information was addressed by the Information Bottleneck (IB) method of Tishby et al.
It addresses a trade-off between compressing the representation and preserving meaningful information.
In this paper, they formulate a subsequent problem as that of finding an efficient representation of the random variable C, described by a random variable Z, that preserves the maximum amount of relevant information about X, based on the estimated distribution.
Finding the optimal consensus
They propose a solution that is approximately equivalent to the AIB algorithm but requires less computational time.
They map the $k^o$ clusters to a $k^o \times k^o$ JS divergence matrix.
They apply a distance-based clustering algorithm.
They use the agglomerative group-average algorithm because it minimizes the average pairwise distance between members of the merged clusters, as given by its objective function

$$d(S_1, S_2) = \frac{1}{|S_1|\,|S_2|} \sum_{p \in S_1} \sum_{p' \in S_2} JS(p, p')$$

where $S_1$ and $S_2$ denote a pair of clusters of distributions, whose cardinalities are $|S_1|$ and $|S_2|$, respectively.
Finding the optimal consensus
$$JS_{\pi}(p_1, p_2) = H(\pi_1 p_1 + \pi_2 p_2) - \pi_1 H(p_1) - \pi_2 H(p_2)$$

The JS divergence is the entropy of the weighted average distribution minus the weighted average of the entropies of the individual distributions $p_1$ and $p_2$. It is symmetric, bounded, nonnegative, and equal to zero when $p_1 = p_2$.
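The verbal definition above translates directly into code. This sketch assumes discrete distributions given as lists of probabilities, base-2 logarithms, and equal weights by default (in the paper the weights may instead come from the cluster priors; the default here is only an assumption). The function names are hypothetical.

```python
from math import log2

def shannon_entropy(p):
    """Entropy of a discrete distribution; zero-probability terms contribute 0."""
    return -sum(x * log2(x) for x in p if x > 0)

def js_divergence(p1, p2, w1=0.5, w2=0.5):
    """Entropy of the weighted mixture minus the weighted average of the entropies."""
    mixture = [w1 * a + w2 * b for a, b in zip(p1, p2)]
    return shannon_entropy(mixture) - w1 * shannon_entropy(p1) - w2 * shannon_entropy(p2)

identical = js_divergence([0.7, 0.3], [0.7, 0.3])   # zero up to rounding
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])    # 1 bit with equal weights
```

With base-2 logarithms and equal weights the value stays between 0 and 1, which makes it convenient as a pairwise distance for the agglomerative step.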
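Finding the optimal consensus
When a k-partition $\{S_1, \ldots, S_k\}$ is obtained, the consensus clustering $Z_k$, described by estimates of the prior probabilities of the consensus clusters and the posterior probabilities, is computed using

$$\hat{p}(z_l) = \sum_{c_q \in S_l} \hat{p}(c_q), \qquad \hat{p}(z_l \mid x_j) = \sum_{c_q \in S_l} \hat{p}(c_q \mid x_j)$$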
Case of identical partitions
An essential property for consensus clustering algorithms is that when all individual partitions represent a perfect consensus, that is, they are identical up to cluster label permutations, the consensus solution should be the same partition.
The algorithms presented in this paper satisfy this property.
In the case of the normalized weighting scheme, $W^i$ becomes the identity matrix I.
In the case of the un-normalized weighting scheme, $W^i$ is a diagonal matrix whose qth diagonal element equals $n_q^o$, the size of cluster $c_q^o$. Each nonzero entry of $U^{o,i}$ is equal to the size of the corresponding cluster. After averaging, we get exactly the partition we started with.
Experimental Setup
They compare the performance of the URCV, RCV, and ACV algorithms with several recent consensus algorithms and with the average quality of the ensemble partitions.
External validation: Adjusted Rand Index with respect to the true clustering.
They report the distribution of the Adjusted Rand Index using boxplots for r = 20 runs for small data sets and for r = 5 runs for large data sets (n ≥ 1000).
Internal validation: they measure the Normalized Mutual Information (NMI) between every pair of consensus clusterings over multiple runs of the proposed consensus algorithms (interconsensus NMI), to assess the stability of the consensus clustering over multiple runs.
Variations in the consensus clustering across multiple runs are due to the ensemble partitions being generated with a random number of clusters and with different random seeds for the K-Means algorithm.
Default value of b: 25.
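For reference, the Adjusted Rand Index used for external validation can be computed from the contingency counts of two label vectors. This is a sketch of the standard pair-counting formula, not code from the paper; it assumes at least two objects, and the function name is hypothetical.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(a, b):
    """Chance-corrected pair-counting agreement between two labelings."""
    n = len(a)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)      # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:                  # degenerate case, e.g. one cluster each
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

# Permuted labels of the same partition score 1.0: the index ignores label names.
perfect = adjusted_rand_index([1, 1, 1, 2, 2, 3, 3], [2, 2, 2, 1, 1, 3, 3])
# Two partitions that cut the data in orthogonal ways score below zero.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index depends only on which pairs of objects are grouped together, it can compare a consensus clustering with the true clustering even when they use different labels or different numbers of clusters.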
Algorithms used for comparison
The binary one-to-one voting algorithm of Dimitriadou et al. (BVA)
Data sets
Computational Complexity
| Algorithm | Complexity |
| --- | --- |
| Co-association based algorithms | $O(n^2 b)$ |
| CSPA | $O(n^2 k b)$ |
| MCLA | $O(n k^2 b^2)$ |
| HGPA | $O(n k b)$ |
| QMI | $O(n k b)$ |
| URCV, RCV, ACV | $O(n (k^o)^2 b)$ |
Conclusion
Cumulative voting maps an input $k^i$-partition to a reference $k^o$-partition with probabilistic assignments, using un-normalized or normalized weighting.
The reference partition is chosen as the one having the maximum entropy.
A minimum average distance criterion is used between the mapped ensemble partitions and the summarizing stochastic partition Û.
The optimal consensus partition is extracted from Û by minimizing the JS divergence within the clusters.
Overall, the proposed methods performed better than the existing consensus methods, with lower computational complexity.