SMOTE for multi-instance learning, i.e. num_rows(x_train) > num_rows(y_train)
I have an imbalanced dataset and I wish to predict a binary class (0 or 1).
Sample x_train:
id date c1 c2 . . . . . . c20
101 13-02-2015 2 7 . . . . . . 14
101 14-02-2015 24 7 . . . . . . 8
.
.
.
105 13-02-2015 12 5 . . . . . . 4
.
.
Sample y_train:
id class
101 1
105 1
107 0
.
.
.
Now I wish to oversample class 0 in the dataset, but the problem is that for each id I have just one row in y_train, whereas I have 50 rows for the same id in x_train.
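To make the setup concrete, here is a minimal sketch of one possible way to frame it: treat each id as a single sample by flattening its rows into one feature vector, so that x and y have the same number of rows, and then oversample at the id level. It assumes every id has the same number of rows and that rows can be aligned across ids by sorting on date. The helper names (bags_to_matrix, smote_like_oversample) and the toy data are hypothetical, and the interpolation step is a hand-rolled SMOTE-style approximation written with NumPy, not the imbalanced-learn implementation.

```python
import numpy as np
import pandas as pd


def bags_to_matrix(x_train, y_train, feature_cols):
    """Flatten each id's rows (sorted by date) into one vector per id.

    Assumes every id has the same number of rows, so the vectors stack.
    """
    ordered = x_train.sort_values(["id", "date"])
    ids = y_train["id"].to_numpy()
    X = np.stack([
        ordered.loc[ordered["id"] == i, feature_cols].to_numpy(dtype=float).ravel()
        for i in ids
    ])
    return X, y_train["class"].to_numpy()


def smote_like_oversample(X, y, minority=0, k=3, seed=0):
    """Add synthetic minority samples until both classes have equal counts.

    Each synthetic sample lies on the segment between a minority sample
    and one of its k nearest minority neighbours (SMOTE-style).
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_new = int((y != minority).sum() - len(X_min))
    if n_new <= 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)
    # Pairwise Euclidean distances among minority samples.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    nb = neighbours[base, rng.integers(0, k, size=n_new)]
    lam = rng.random((n_new, 1))  # interpolation weight in [0, 1)
    X_new = X_min[base] + lam * (X_min[nb] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


# Toy data (made up): 9 ids, 5 rows per id, 3 feature columns.
rng = np.random.default_rng(42)
rows_per_id, feature_cols = 5, ["c1", "c2", "c3"]
labels = {101: 1, 102: 1, 103: 1, 104: 1, 105: 1, 106: 1, 107: 0, 108: 0, 109: 0}
dates = pd.date_range("2015-02-13", periods=rows_per_id)
x_train = pd.DataFrame([
    {"id": i, "date": d, **dict(zip(feature_cols, rng.integers(0, 25, size=3)))}
    for i in labels for d in dates
])
y_train = pd.DataFrame({"id": list(labels), "class": list(labels.values())})

X, y = bags_to_matrix(x_train, y_train, feature_cols)
X_res, y_res = smote_like_oversample(X, y, minority=0)
print(X.shape, X_res.shape, np.bincount(y_res))
```

Each synthetic vector can be reshaped back to (rows_per_id, n_features) if a per-row layout is needed downstream. Whether interpolating whole ids like this is meaningful depends on the rows really being comparable position by position, which is an assumption of this sketch.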
Topic multi-instance-learning data-science-model smote class-imbalance dataset
Category Data Science