When should a dataset be considered imbalanced?
I'm working on a classification problem. The target variable is binary (two classes, 0 and 1), and the training set has 8,161 samples. The class distribution is:
- class 0: 6,008 samples (73.6% of the total)
- class 1: 2,153 samples (26.4% of the total)
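For reference, the percentages above and the majority-to-minority ratio can be reproduced with a few lines of plain Python. This is only a sketch of the arithmetic: the counts come from the figures above, and the variable names are illustrative.

```python
# Class counts as stated in the question.
counts = {0: 6008, 1: 2153}

total = sum(counts.values())

# Share of each class in the training set.
proportions = {label: n / total for label, n in counts.items()}

# How many majority-class samples there are per minority-class sample.
imbalance_ratio = max(counts.values()) / min(counts.values())

for label, share in proportions.items():
    print(f"class {label}: {counts[label]} samples ({share:.1%})")
print(f"majority:minority ratio = {imbalance_ratio:.2f}:1")
```

The ratio works out to a little under 3:1, which is the number the questions below are really about.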
My questions are:
In this case, should I consider my dataset imbalanced?
If it is, should I preprocess the data before using RandomForest to make predictions?
If it is not, could somebody tell me at what point (for example, at what class ratio) a dataset should be considered imbalanced?
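To make the second question concrete, one kind of processing I have in mind is reweighting the classes by inverse frequency. The sketch below computes such weights in plain Python with the formula n_samples / (n_classes * class_count), which is the heuristic scikit-learn documents for `class_weight="balanced"`. Whether this is needed at all for a roughly 74/26 split is exactly what I'm unsure about, so please read it as an illustration, not as a recommended fix.

```python
# Class counts as stated in the question.
counts = {0: 6008, 1: 2153}

n_samples = sum(counts.values())
n_classes = len(counts)

# Inverse-frequency ("balanced") weights: rarer classes get larger weights.
class_weight = {
    label: n_samples / (n_classes * n) for label, n in counts.items()
}

for label, weight in class_weight.items():
    print(f"class {label}: weight {weight:.3f}")

# With these weights, each class contributes the same total weight.
weighted_totals = {label: counts[label] * class_weight[label] for label in counts}
print(weighted_totals)
```

A dictionary like this could be passed as the `class_weight` argument of scikit-learn's `RandomForestClassifier`, or the string `"balanced"` could be passed to let the library compute the same weights from the labels.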
Topic class-imbalance random-forest classification machine-learning
Category Data Science