Conditional clustering
I have a dataset of addresses (points) with several attributes, including one that distinguishes the sort of address and one that contains a numerical value.
I want to cluster these points based on:
- their distance from each other
- the sort of address
However, the sum of the numerical attribute within a cluster must not exceed a certain threshold value.
In other words, the system should keep adding addresses to a cluster, but stop growing that cluster as soon as the summed numerical value of its addresses reaches the threshold.
How do I even go about it? I have R, Python, and other geo applications at my disposal.
It seems that none of the existing clustering algorithms work. For k-means, for example, I need to know the number of clusters beforehand, which I don't.
It seems rather simple, but I can't find a basic methodology to follow.
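To make the requirement concrete, here is a naive greedy sketch in Python of the behaviour described above. It is only an illustration, not an established algorithm: the keys `x`, `y`, `kind`, and `value`, the function name `greedy_capped_clusters`, the sample data, and the threshold of 10 are all made up, and it assumes planar (projected) coordinates so that straight-line distance is meaningful.

```python
import math

def greedy_capped_clusters(points, max_total):
    """Group points by 'kind', then grow each cluster by repeatedly adding
    the unassigned point nearest to the cluster centroid, stopping before
    the summed 'value' would exceed max_total.
    Returns a list of clusters, each a list of indices into points.
    Note: a single point whose value already exceeds max_total still ends
    up alone in its own cluster."""
    clusters = []
    for kind in sorted({p["kind"] for p in points}):
        unassigned = [i for i, p in enumerate(points) if p["kind"] == kind]
        while unassigned:
            seed = unassigned.pop(0)          # arbitrary seed: first one left
            cluster = [seed]
            total = points[seed]["value"]
            while unassigned:
                # Centroid of the cluster built so far.
                cx = sum(points[i]["x"] for i in cluster) / len(cluster)
                cy = sum(points[i]["y"] for i in cluster) / len(cluster)
                nearest = min(
                    unassigned,
                    key=lambda i: math.hypot(points[i]["x"] - cx,
                                             points[i]["y"] - cy),
                )
                if total + points[nearest]["value"] > max_total:
                    break                     # threshold reached: close cluster
                cluster.append(nearest)
                unassigned.remove(nearest)
                total += points[nearest]["value"]
            clusters.append(cluster)
    return clusters

# Hypothetical sample data, purely for illustration.
addresses = [
    {"x": 0, "y": 0, "kind": "residential", "value": 4},
    {"x": 1, "y": 0, "kind": "residential", "value": 5},
    {"x": 0, "y": 1, "kind": "residential", "value": 3},
    {"x": 10, "y": 10, "kind": "commercial", "value": 6},
    {"x": 11, "y": 10, "kind": "commercial", "value": 6},
]
clusters = greedy_capped_clusters(addresses, max_total=10)
print(clusters)
```

The number of clusters is not fixed in advance; it falls out of the threshold. The result depends on the seed order, though, so this is a starting point rather than an optimal solution.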
Topic geospatial clustering
Category Data Science