Theoretical work on the validity of restricting the movement of k-means centroids

I recently received a manuscript for review in which the author added ~1000 "fake" data points so that the final k-means centroid stays within a required range. Neither the author nor I seem to have a background in data science, and the paper is more of an application to our research area.

I have tried to find published work on this method of restricting k-means centers, but failed to do so. However, by simple logic it seems like a valid approach, so maybe the author used the wrong terminology.

Hence I would like to ask: is this a valid way to restrict a k-means center, and is there any published work on it?

Topic k-means

Category Data Science


A more general solution would be constrained optimization: change the loss function so that only solutions within a certain region are allowed.

Adding fake data points to nudge the solution into a valid region has several limitations: it requires manual adjustment for every model run, and it offers no guarantees. Constrained optimization would be automated and would provide strong guarantees.
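To make the constrained idea concrete, here is a minimal sketch of one possible way to do it, assuming the "valid region" is an axis-aligned box. It is not the manuscript's method or any library's API. The function name `box_constrained_kmeans` and the parameters `lower` and `upper` are made up for illustration. The only change from plain Lloyd-style k-means is that each recomputed mean is clipped back into the box, which for a box is the exact minimizer of the within-cluster squared error under that constraint.

```python
import random


def clip(value, low, high):
    """Clamp a scalar to the interval [low, high]."""
    return max(low, min(high, value))


def box_constrained_kmeans(points, init_centroids, lower, upper, n_iter=100):
    """Lloyd-style k-means with every centroid kept inside a box.

    lower/upper are hypothetical per-coordinate bounds of the allowed region.
    """
    dim = len(lower)
    # Start from centroids that already satisfy the constraint.
    centroids = [[clip(c[d], lower[d], upper[d]) for d in range(dim)]
                 for c in init_centroids]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda j: sum((p[d] - centroids[j][d]) ** 2 for d in range(dim)),
            )
            clusters[nearest].append(p)
        # Update step: cluster mean, then projection (clipping) onto the box.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if not members:
                new_centroids.append(c)  # keep an empty cluster's centroid as is
                continue
            mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
            new_centroids.append([clip(mean[d], lower[d], upper[d]) for d in range(dim)])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids


# Synthetic data: one blob near (0, 0), one near (5, 5), which lies outside the box.
random.seed(0)
points = (
    [(random.gauss(0, 0.5), random.gauss(0, 0.5)) for _ in range(100)]
    + [(random.gauss(5, 0.5), random.gauss(5, 0.5)) for _ in range(100)]
)
lower, upper = (0.0, 0.0), (4.0, 4.0)
centroids = box_constrained_kmeans(points, [(0.0, 0.0), (3.0, 3.0)], lower, upper)
print(centroids)
```

In this toy setup the second blob sits outside the box, so its centroid ends up on the box boundary instead of at the blob's mean. No fake points or per-run tuning are needed, and the constraint holds by construction.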


I highly recommend finding a source that explains how k-means works and understanding it well. K-means is well known, so it is hard to find a reference that discusses it as an algorithm or explains how it works.

I noticed you wrote that the "author used ~1000 "fake" data points, so that the final centroid of K-mean stays within the required range", which is always going to be true. K-means computes the mean (average) of the data points assigned to each cluster, which guarantees that every centroid ends up within the range of the data used.

The power of the algorithm lies in recalculating the means iteratively until the means (centroids) stabilize. In other words, in each iteration the means shift toward the centers of the dense regions. It follows that if you are looking for k = 1 (a single centroid), you will find it in one iteration.

I personally suggest starting with some videos and going on from there. Here is the first result on YouTube about k-means: https://youtu.be/_aWzGGNrcic.
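A small one-dimensional sketch may help to see what "within the range of the data used" means once fake points are added. The numbers below are made up for illustration and are not taken from the manuscript.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


real = [10.0, 12.0, 14.0]   # hypothetical real observations
fake = [0.0, 0.0, 0.0]      # hypothetical fake points added by hand

centroid_real = mean(real)          # centroid of the real data alone
centroid_mixed = mean(real + fake)  # centroid once the fake points are included

print(centroid_real)   # 12.0
print(centroid_mixed)  # 6.0
```

Each centroid lies between the minimum and maximum of the data it was computed from. The fake points pull the centroid toward themselves, so the range that matters is that of all the data used, fake points included.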
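The single-centroid case can be checked with a short sketch. The helper `lloyd_step` below is a hypothetical name for one assign-then-average iteration, and the data are made up for illustration.

```python
def lloyd_step(points, centroids):
    """One k-means iteration: assign points to the nearest centroid, then average."""
    dim = len(points[0])
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(
            range(len(centroids)),
            key=lambda j: sum((p[d] - centroids[j][d]) ** 2 for d in range(dim)),
        )
        clusters[nearest].append(p)
    # A centroid with no assigned points is left where it was.
    return [
        tuple(sum(p[d] for p in members) / len(members) for d in range(dim))
        if members else c
        for c, members in zip(centroids, clusters)
    ]


data = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0), (7.0, 6.0)]

# With k = 1 every point is assigned to the only centroid, wherever it starts.
after_one = lloyd_step(data, [(100.0, -100.0)])
after_two = lloyd_step(data, after_one)

print(after_one)               # [(4.0, 3.0)]
print(after_one == after_two)  # True
```

With one centroid the assignment step cannot change anything, so the first update already gives the mean of all the data and further iterations leave it in place.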
