Clustering with custom criterion (minimum cluster weight)

Edit: following a comment from @anony-mousse, I'm changing the question to ask for a general clustering approach that satisfies this criterion (minimum weight per cluster).

I want to apply a clustering method to a set of $n$ weighted points:

---------------------------------------------
| id  | weight | feature_1| feature_2 | ... |
---------------------------------------------
| 1   | 4      | 0.2345   | -0.2345   | ... |
| 2   | 2      | 0.675    | 0.7433    | ... |
| 3   | 15     | -0.45    | 0.123     | ... |
| ... | ...    | ...      | ...       | ... |
---------------------------------------------

I have a custom criterion. Some algorithms guarantee a minimum number of points $n_{min}$ per cluster; here I would like to guarantee that each cluster has a minimum weight (sum of its point weights): $\sum w_i \geq s_{min}$.
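To make the criterion concrete, here is a minimal sketch of the check that any candidate clustering would have to pass. The names `cluster_weights`, `satisfies_min_weight`, `labels` and `s_min` are only illustrative, and the label assignment is made up; the weights are the first three rows of the table above.

```python
from collections import defaultdict

def cluster_weights(labels, weights):
    """Sum the point weights per cluster label."""
    totals = defaultdict(float)
    for label, w in zip(labels, weights):
        totals[label] += w
    return dict(totals)

def satisfies_min_weight(labels, weights, s_min):
    """True if every cluster's total weight is at least s_min."""
    return all(t >= s_min for t in cluster_weights(labels, weights).values())

weights = [4, 2, 15]   # weights of points 1, 2, 3 from the table
labels = [0, 0, 1]     # hypothetical assignment: points 1 and 2 together

print(cluster_weights(labels, weights))                  # {0: 6.0, 1: 15.0}
print(satisfies_min_weight(labels, weights, s_min=5))    # True
print(satisfies_min_weight(labels, weights, s_min=10))   # False
```

This only validates a result after the fact; it does not by itself produce clusters that respect the constraint.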

Is there such a clustering method already implemented in Python?

Topic unsupervised-learning weighted-data python clustering

Category Data Science


Stopping hierarchical clustering once a cluster reaches a minimum size does not work, and it's not how hierarchical clustering works.

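If you stop merging a cluster once it reaches $n_\min$ points, no cluster will be larger than $2n_\min-2$ (a merge can only join two clusters of at most $n_\min-1$ points each), but you will be left with either plenty of badly clustered points or unclustered points.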

Consider the data set 0, 2, 3, 5 with $n_\min=2$. The first merge is (2,3), which fulfills the stopping criterion. So you cluster this either as (0), (2,3), (5) or as (0,5), (2,3), neither of which is convincing: either $n_\min$ is not a minimum size, or the clusters can be arbitrarily bad (and may still be below the minimum size).
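The example can be reproduced with a small sketch. It assumes single linkage on 1-D points and a hypothetical `merge_leftovers` flag to switch between the two outcomes: with `False`, merging stops as soon as the closest pair involves a cluster that has already reached $n_\min$; with `True`, such "frozen" clusters are skipped and the remaining points are merged with each other. This is an illustration of the stopping rule, not a standard library routine.

```python
from itertools import combinations

def single_link(a, b):
    """Single-linkage distance between two clusters of 1-D points."""
    return min(abs(x - y) for x in a for y in b)

def cluster_with_size_stop(points, n_min, merge_leftovers):
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > 1:
        # All candidate merges, closest pair first.
        pairs = sorted(combinations(clusters, 2),
                       key=lambda ab: single_link(*ab))
        if merge_leftovers:
            # Ignore clusters that already reached n_min (frozen).
            pairs = [(a, b) for a, b in pairs
                     if len(a) < n_min and len(b) < n_min]
        if not pairs:
            break
        a, b = pairs[0]
        if len(a) >= n_min or len(b) >= n_min:
            break  # closest pair touches a frozen cluster: stop
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)

print(cluster_with_size_stop([0, 2, 3, 5], 2, merge_leftovers=False))
# [[0], [2, 3], [5]]   -> two clusters below the minimum size
print(cluster_with_size_stop([0, 2, 3, 5], 2, merge_leftovers=True))
# [[0, 5], [2, 3]]     -> 0 and 5 grouped although 2 and 3 lie between them
```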

The same concern applies to a weighted version.
