Stratified K-Fold Cross-Validation in Orange: Python Script

I am using Orange to predict customer churn and compare different learners based on accuracy, F1, etc.

As my problem is imbalanced (10% churn, 90% not churn), I want to oversample. However, in Orange it is not possible to do the oversampling within the cross-validation (the Test and Score widget).

Therefore, based on my input data, I want to first generate 10 stratified folds, so that the 10% churn / 90% not churn distribution is preserved in each fold. Then I want to oversample within each fold to get a 50/50 distribution, and add the fold number to each instance as a feature. Lastly, in the Test and Score widget, I want to do cross-validation by feature, using the fold number. I think I have to implement this myself in a Python Script widget. Could anyone help me do this?
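To make the steps above concrete, here is a minimal sketch in plain Python of the fold assignment and the per-fold oversampling. It is not Orange's own API: the function names `stratified_folds` and `oversample_within_folds`, the toy `labels` list, and the choice of random oversampling with replacement are all assumptions for illustration. It only works on class labels and row indices, so it should be adaptable to whatever table object the Python Script widget receives.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign each row a fold number in 0..k-1, keeping class ratios per fold.

    Hypothetical helper: rows of each class are shuffled, then dealt
    round-robin across the k folds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [None] * len(labels)
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[i] = pos % k
    return folds


def oversample_within_folds(labels, folds, seed=0):
    """Return row indices (with repeats) so every fold has balanced classes.

    Hypothetical helper: within each fold, smaller classes are topped up
    by sampling their own rows with replacement until they match the
    largest class in that fold.
    """
    rng = random.Random(seed)
    per_fold = defaultdict(lambda: defaultdict(list))
    for i, (y, f) in enumerate(zip(labels, folds)):
        per_fold[f][y].append(i)
    selected = []
    for f in sorted(per_fold):
        groups = per_fold[f]
        target = max(len(idxs) for idxs in groups.values())
        for idxs in groups.values():
            selected.extend(idxs)  # keep every original row
            selected.extend(rng.choices(idxs, k=target - len(idxs)))  # extras
    return selected


# Toy data (assumption): 100 churners (1) and 900 non-churners (0).
labels = [1] * 100 + [0] * 900
folds = stratified_folds(labels, k=10, seed=42)
selected = oversample_within_folds(labels, folds, seed=42)

# Each selected row keeps the fold number of the original row it came from.
fold_column = [folds[i] for i in selected]
```

In a Python Script widget, the labels would presumably come from the class column of `in_data`, and `selected` and `fold_column` would be used to build the table assigned to `out_data`, with the fold number stored as a categorical column. Whether the Test and Score widget expects that column as a regular feature or as a meta attribute for its "Cross validation by feature" option is worth checking in the widget documentation.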

Thank you! Emma

