Is there any advantage in using Particle Swarm Optimization for clustering than K-Means?

I have read some paper about using particle swarm optimization. It doesn't look give much different result than K-Means. I tried to use PSO for clustering but the result is pretty much the same with K-Means with some drawbacks like longer execution time and have a lot of different result caused by the random factor.

Topic evolutionary-algorithms optimization clustering

Category Data Science


K-means makes locally the optimal decision. Most of the time (in particular when this objective works reasonably well at all) this works quite well to find a good local optimum. I doubt that using an approach such as PSO gives you any advantage here - the problem is just too simple, and k-means has the speed advantage, while PSO is unlikely to find other optima. And in particular the better k-means algorithms such as Hamerly, Elkan, ... are so fast, they'll be able to run with hundreds of random restarts in the time needed for just one PSO.


You cannot acutally compare the two, because PSO is an optimization algorithm, and K-means is a clustering algorithm. An optimization algorithm tries to find the best solution to a given problem, such as a clustering problem. If the clustering problem is stated like it is done for K-means, then you should find the same result using PSO, as applying K-means directly.

The problem statement for K-means defines k clusters as k points in space ("cluster centroids"), and all examples that are closest to each of the k points are assigned to that cluster. If the clusters you are trying to find with PSO are defined the same way, you should find the same results. However, if you define the problem differently, like "find k volumes in space with the maximum density and arbitrary shape", then trying to find the solution to this problem with PSO will give you different results.

It's still going to run more slowly than K-means though, because that's the strong point of K-means. It scales really well, because it doesn't involve any pair-wise comparisons between examples.

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.