How to apply an ensemble clustering method?

I need to apply an ensemble clustering method to my dataset using Python. I have already applied k-means clustering with the scikit-learn library. I have also applied different classification methods and found ensemble classification methods in scikit-learn. Now I am confused: is there any module in scikit-learn for ensemble clustering, and if not, how can I apply an ensemble clustering method to my dataset?

Topic ensemble-learning scikit-learn python clustering data-mining

Category Data Science


As far as I know, scikit-learn has no built-in module for ensemble clustering. However, you can apply the method to your dataset with the third-party ClusterEnsembles package, passing it the label arrays produced by each base clustering:

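import numpy as np
import ClusterEnsembles as CE  # third-party package, not part of scikit-learn

# Each array holds the cluster labels that one base clustering assigned to the same 7 samples.
# Label values need not agree across runs; only the grouping matters.
kmeans1 = np.array([1, 1, 1, 2, 2, 3, 3])
kmeans2 = np.array([2, 2, 2, 3, 3, 1, 1])
kmeans3 = np.array([4, 4, 2, 2, 3, 3, 3])
kmeans4 = np.array([1, 2, np.nan, 1, 2, np.nan, np.nan]) # `np.nan`: missing value

# Stack the base clusterings into one (n_clusterings, n_samples) array and combine them
ret = CE.cluster_ensembles(np.array([kmeans1, kmeans2, kmeans3, kmeans4]))

print(ret) # output: [1 1 1 2 2 0 0]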
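
The label arrays above are typed in by hand. In practice they would come from your own clustering runs. Here is a minimal sketch of how you might build that input with scikit-learn's KMeans. The toy data from `make_blobs`, the choice of `k` values, and the name `base_labels` are all assumptions for illustration, not part of the original answer:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy data standing in for your own feature matrix (assumption for illustration)
X, _ = make_blobs(n_samples=60, centers=3, random_state=0)

# Run k-means several times with different seeds and cluster counts;
# each run contributes one row of labels (one label per sample)
base_labels = np.array([
    KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    for seed, k in enumerate([2, 3, 3, 4])
])

print(base_labels.shape)  # (4, 60): 4 base clusterings, 60 samples
```

An array shaped like `base_labels` has the same layout as the stacked array in the example above, so it should be usable as the input to the ensemble step.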

The problem is to combine several runs of different clustering algorithms into a common partition of the original dataset, consolidating the results from a portfolio of individual clusterings. There is no single correct way to do this. One approach formally defines the categorical data clustering (CDC) problem as an optimization problem from the viewpoint of cluster ensembles (CE) and applies a CE approach to clustering categorical data. Experimental results on real datasets show that the CE-based clustering method is competitive with existing CDC algorithms with respect to clustering accuracy.
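
To make "combining several runs into a common partition" concrete, here is a minimal sketch of one well-known consensus scheme, the co-association (evidence accumulation) approach. It uses only NumPy and SciPy. This is a swapped-in technique for illustration: it is not necessarily what the ClusterEnsembles package does, nor the method the paragraph above describes. The label arrays are the three complete ones from the earlier example, and `n_consensus_clusters = 3` is an assumption:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Three base clusterings of the same 7 samples (rows = runs, columns = samples)
labels = np.array([
    [1, 1, 1, 2, 2, 3, 3],
    [2, 2, 2, 3, 3, 1, 1],
    [4, 4, 2, 2, 3, 3, 3],
])

# Co-association matrix: fraction of runs in which samples i and j share a cluster
coassoc = (labels[:, :, None] == labels[:, None, :]).mean(axis=0)

# Turn agreement into a distance and cluster it hierarchically (average linkage)
distance = 1.0 - coassoc
Z = linkage(squareform(distance), method="average")

# Cut the tree into the desired number of consensus clusters (assumed to be 3 here)
n_consensus_clusters = 3
consensus = fcluster(Z, t=n_consensus_clusters, criterion="maxclust")

print(consensus)  # groups samples {0, 1, 2}, {3, 4}, {5, 6}; label numbers are arbitrary
```

Because the co-association matrix only records whether two samples were grouped together, the base clusterings may use different label values and even different numbers of clusters.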


You might find the following libraries helpful:
