k-means on the graph Laplacian to cluster nodes based on their distance

I have a connected weighted graph and I want to use k-means to cluster its nodes based on their pairwise distances (a smaller distance means two nodes are more likely to be in the same cluster).

I computed the Laplacian of the graph and selected the eigenvectors corresponding to nonzero eigenvalues. I then ran k-means in the embedding space spanned by these eigenvectors.

In the following example I want to create 2 clusters, and I chose nodes $1$ and $3$ as the initial centroids. My problem is that the resulting clustering puts nodes $2, 3, 4$ in one cluster and node $1$ in the other. The desired clustering is $1, 2, 4$ in one cluster and $3$ in the other.
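For reference, the pipeline described above can be sketched as follows. The adjacency matrix `W`, its weights, and the `1e-10` eigenvalue cutoff are all assumptions, since the question does not include the actual graph:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical symmetric weighted adjacency matrix for 4 nodes
# (the question does not give the actual weights, so these are made up:
# nodes 1, 2, 4 are strongly connected, node 3 only weakly).
W = np.array([
    [0.0, 2.0, 0.1, 2.0],
    [2.0, 0.0, 0.1, 2.0],
    [0.1, 0.1, 0.0, 0.1],
    [2.0, 2.0, 0.1, 0.0],
])

# Unnormalized graph Laplacian: L = D - W
L = np.diag(W.sum(axis=1)) - W

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors
eigvals, eigvecs = np.linalg.eigh(L)

# Keep the eigenvectors whose eigenvalues are (numerically) nonzero;
# for a connected graph this drops only the constant eigenvector.
embedding = eigvecs[:, eigvals > 1e-10]

# k-means in the embedding space, seeding the centroids with the
# rows of nodes 1 and 3 as in the question (0-based indices 0 and 2)
km = KMeans(n_clusters=2, init=embedding[[0, 2]], n_init=1)
labels = km.fit_predict(embedding)
print(labels)
```

Note that for a connected graph only one eigenvalue is zero, so this embedding keeps $n-1$ of the $n$ dimensions.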

Does it make sense to use such an approach to cluster the nodes of a graph? And how can it be that the resulting clustering does not minimize the within-cluster distances?

I am using sklearn.cluster.KMeans.

Topic spectral-clustering k-means

Category Data Science
