Multi-label classification with nested features
I need to perform a multi-label classification. I have three features, and they are nested: each value of feature1 groups several values of feature2, which in turn group values of feature3. I am unsure how to combine them or which kind of classification algorithm would be best. A multi-level neural network as shown here seems promising, but it does not seem to take the nesting of the features into account.
I present the nested features (X) and labels (Y) in the two datasets below. One subject ID can have one or more features and one or more classes, and each feature or class can be shared by one or more subjects.
Note: I have about 100k subjects, 1k features (at the third level) and 200 classes.
data_features
subject_id feature1 feature2 feature3
1 a aa aaa
2 a aa aab
3 a ab aba
1 a ab abb
2 b ba baa
3 b ba bac
1 b ba bad
2 b ba bad
3 c ca caa
4 c ca caa
5 c cb cba
6 c cb cbb
data_labels
subject_id label1 label2 label3 label4
1 0 1 0 0
2 0 1 1 1
3 0 1 1 0
4 1 1 0 1
5 1 0 0 0
6 0 1 1 1
7 0 0 0 1
8 1 1 1 1
9 0 0 1 1
10 1 0 1 0
11 0 1 0 1
12 1 0 0 1
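To make the structure concrete, here is a minimal sketch of one way the long-format feature rows could be flattened into one multi-hot vector per subject, with a column for every value at all three levels so that the nesting is at least visible to a model. It is only an illustration under my own assumptions: the names `multi_hot`, `feature_rows`, `X` and `Y` are made up for this example, only the first six label rows are copied, and subjects without feature rows (7 to 12 in the sample) are simply dropped.

```python
from collections import defaultdict

# Long-format rows copied from data_features:
# (subject_id, feature1, feature2, feature3)
feature_rows = [
    (1, "a", "aa", "aaa"), (2, "a", "aa", "aab"), (3, "a", "ab", "aba"),
    (1, "a", "ab", "abb"), (2, "b", "ba", "baa"), (3, "b", "ba", "bac"),
    (1, "b", "ba", "bad"), (2, "b", "ba", "bad"), (3, "c", "ca", "caa"),
    (4, "c", "ca", "caa"), (5, "c", "cb", "cba"), (6, "c", "cb", "cbb"),
]

# First six rows of data_labels: subject_id -> [label1, label2, label3, label4]
labels = {
    1: [0, 1, 0, 0], 2: [0, 1, 1, 1], 3: [0, 1, 1, 0],
    4: [1, 1, 0, 1], 5: [1, 0, 0, 0], 6: [0, 1, 1, 1],
}


def multi_hot(rows):
    """Return (column names, {subject_id: 0/1 vector}) covering all three levels."""
    present = defaultdict(set)
    for subject_id, *levels in rows:
        # Prefix each value with its level so columns stay distinct across levels.
        present[subject_id].update(
            f"f{i}={value}" for i, value in enumerate(levels, start=1)
        )
    columns = sorted(set().union(*present.values()))
    vectors = {
        subject: [int(column in feats) for column in columns]
        for subject, feats in present.items()
    }
    return columns, vectors


columns, vectors = multi_hot(feature_rows)

# Keep only subjects that appear in both tables, in a fixed order,
# so that row i of X lines up with row i of Y.
subjects = sorted(set(vectors) & set(labels))
X = [vectors[s] for s in subjects]
Y = [labels[s] for s in subjects]
```

With X and Y aligned like this, any multi-label learner that accepts a binary indicator matrix could in principle be tried. At my real scale (about 100k subjects and 1k third-level features) a sparse representation would presumably be needed instead of plain lists.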
I am quite unsure which algorithm would combine these best. (I am skilled in R and SAS and decent in Python, but I will learn any other language if needed.)
Topic multilabel-classification neural-network
Category Data Science