cclustr - Consensus Clustering Methods for Multiple Imputed Data
Provides tools for performing consensus clustering on
multiply imputed datasets. The package supports a range of
clustering algorithms applied to each imputed dataset, including
hierarchical methods (e.g., Ward, single, complete, average)
and partition-based approaches such as k-means, k-medoids
(PAM), fuzzy clustering, model-based clustering ('mclust'), and
methods for mixed or categorical data (k-modes and
k-prototypes). A co-assignment matrix is constructed to
quantify agreement between partitions, and consensus solutions
are derived via hierarchical clustering applied to the
resulting dissimilarity matrix. Additional functions are
provided for validation and visualization of clustering
results, facilitating robust analysis in the presence of
missing data. The consensus clustering framework is based on Monti
et al. (2003) <doi:10.1023/A:1023949509487>, rank aggregation
methods follow Pihur et al. (2007)
<doi:10.1093/bioinformatics/btm158>, and the PAC (Proportion of
Ambiguous Clustering) metric is based on Senbabaoglu et al.
(2014) <doi:10.1038/srep06207>.
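
The co-assignment and consensus steps described above can be illustrated with a small language-neutral sketch. It is written in Python using only the standard library and is not the package's API: the function names (`co_assignment`, `average_linkage`, `pac`), the toy partitions, the choice of average linkage, and the PAC thresholds of 0.1 and 0.9 are all assumptions made for illustration.

```python
from itertools import combinations


def co_assignment(partitions):
    """Fraction of partitions in which each pair of observations shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def average_linkage(dissim, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(dissim))]

    def dist(a, b):
        # Mean pairwise dissimilarity between the members of two clusters.
        return sum(dissim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: dist(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: combinations guarantees a < b

    labels = [0] * len(dissim)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            labels[i] = cluster_id
    return labels


def pac(matrix, lower=0.1, upper=0.9):
    """Share of distinct pairs whose co-assignment lies strictly between the thresholds."""
    pairs = list(combinations(range(len(matrix)), 2))
    ambiguous = sum(1 for i, j in pairs if lower < matrix[i][j] < upper)
    return ambiguous / len(pairs)


# Toy input: cluster labels for six observations from three imputed datasets.
# The second partition uses swapped label names; the third moves observation 2.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]

matrix = co_assignment(partitions)
dissimilarity = [[1.0 - v for v in row] for row in matrix]
consensus = average_linkage(dissimilarity, k=2)
pac_value = pac(matrix)

print(consensus)
print(round(pac_value, 3))
```

Because the co-assignment matrix only records whether two observations share a cluster, it is unaffected by label switching between imputations, which is why the second toy partition agrees fully with the first.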