GridSearchCV: Suitable scoring metrics for imbalanced datasets
I am new to machine learning. This is my $1^{st}$ machine learning project and I am working on classification with an imbalanced dataset. The target variable also has multiple classes.
I would like to know which metric is most suitable for scoring performance in GridSearchCV.
I think:
- roc_auc is sometimes used for imbalanced datasets, but there are several variants:
  - 'roc_auc'
  - 'roc_auc_ovo'
  - 'roc_auc_ovr'

  Which should I use?
- Alternatively, precision-recall AUC is also used. But I can't seem to find this scoring metric among GridSearchCV's built-in options. How do I use it with GridSearchCV?
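To show what I mean by the variants, here is a sketch on synthetic imbalanced 3-class data (the dataset and class weights are made up for illustration); I believe the multi-class variants plug in as scoring strings:

```python
# Sketch: comparing the multi-class ROC-AUC scoring strings
# (synthetic data; weights chosen to simulate class imbalance)
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(
    n_samples=600, n_classes=3, n_informative=6,
    weights=[0.7, 0.2, 0.1],  # imbalanced classes
    random_state=0,
)

# 'roc_auc' alone works only for binary targets; for multi-class,
# the one-vs-rest (ovr) / one-vs-one (ovo) variants are needed,
# optionally with a *_weighted suffix that accounts for imbalance
for metric in ["roc_auc_ovr", "roc_auc_ovo", "roc_auc_ovr_weighted"]:
    scores = cross_val_score(
        RandomForestClassifier(random_state=0), X, y,
        scoring=metric, cv=3,
    )
    print(metric, scores.mean())
```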
Thank you
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, GroupKFold, GridSearchCV

X_train, X_test, y_train, y_test = train_test_split(X_total, Y_total, random_state=0, test_size=0.25)
kfold = GroupKFold(n_splits=3)
grid_search = GridSearchCV(RandomForestClassifier(random_state=0), hyperF,
                           cv=kfold, scoring=..., verbose=1, n_jobs=-1)
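For the precision-recall AUC part, I understand GridSearchCV also accepts a callable with signature (estimator, X, y) as scoring, so one could wrap average_precision_score for the multi-class case. This is only a sketch on synthetic data; the parameter grid is a made-up placeholder standing in for my hyperF:

```python
# Sketch: custom macro-averaged PR-AUC scorer passed to GridSearchCV
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.preprocessing import label_binarize

def macro_pr_auc(estimator, X, y):
    """Macro-averaged average precision (PR-AUC) over all classes."""
    proba = estimator.predict_proba(X)
    y_bin = label_binarize(y, classes=estimator.classes_)  # one-hot true labels
    return average_precision_score(y_bin, proba, average="macro")

X, y = make_classification(n_samples=600, n_classes=3, n_informative=6,
                           weights=[0.7, 0.2, 0.1], random_state=0)

# tiny illustrative grid (my real hyperF would go here)
param_grid = {"n_estimators": [50, 100]}

grid_search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=StratifiedKFold(n_splits=3),
    scoring=macro_pr_auc,  # custom callable instead of a built-in string
    n_jobs=-1,
)
grid_search.fit(X, y)
print(grid_search.best_params_, grid_search.best_score_)
```

Would this be a reasonable way to score with PR-AUC, or is there a built-in option I am missing?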
Topic grid-search class-imbalance
Category Data Science