Clustering of multi-label data
The dataset consists of
1) a set of objects and
2) a set of labels, which are used to describe the objects.
For the moment, for simplicity's sake, each label is either true or false (in a more complex setup, each label will take a value from 1 to 10).
However, not all labels are actually applied to all objects (in principle, every label can and should be applied to every object, but in practice this is not the case). Also, when a label isn't applied to an object, one cannot simply assume that the label's value for that particular object is false. Therefore, missing labels will be ignored in the model.
I need to cluster the objects based on their labels.
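To make the setup concrete, here is a minimal sketch of how I picture the data, together with a Hamming-style distance that skips missing labels. The object names, the toy values, the use of `None` for a label that was never applied, and the `masked_distance` helper are all illustrative assumptions, not my real dataset or a settled method.

```python
from itertools import combinations

# Hypothetical toy data: one row per object, one column per label.
# True/False = the label was applied with that value.
# None = the label was never applied (NOT the same as False).
objects = {
    "obj_a": [True, False, None, True],
    "obj_b": [True, None, None, True],
    "obj_c": [False, True, True, None],
}


def masked_distance(x, y):
    """Fraction of disagreeing labels among those observed for both objects.

    Returns None when the two objects share no observed label,
    because no comparison is possible in that case.
    """
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return None
    return sum(a != b for a, b in shared) / len(shared)


# Pairwise distances over all object pairs, ignoring missing labels.
distances = {
    (p, q): masked_distance(objects[p], objects[q])
    for p, q in combinations(objects, 2)
}
```

A distance table like this could presumably be handed to a clustering algorithm that accepts precomputed distances, but whether that is a sound way to treat the missing labels is part of what I am asking.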
Any tips on how to approach this and which algorithms to use would be appreciated.
Topic labels multilabel-classification classification clustering
Category Data Science