How to increase accuracy (not precision) on an imbalanced dataset?
I'm working on a Kaggle competition with an imbalanced dataset. The target variable is binary and skewed towards 0: about 70% of the rows are 0 and 30% are 1. I tried several machine learning algorithms, such as Logistic Regression, Random Forest, and Decision Trees, but all of them give an accuracy of around 70%. It seems that the models almost always predict 0. So I tried several methods to balance the dataset, like the following.
- Upsampling the dataset using SMOTE and other techniques.
- Undersampling the dataset.
- Changing the class weights of the model.
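For reference, the kind of rebalancing listed above can be sketched in plain Python. This is only an illustration under stated assumptions: it uses random over- and undersampling rather than SMOTE (which synthesizes new interpolated minority points instead of duplicating rows), the labels are synthetic and merely mirror the 70/30 split, and the helper names `random_oversample`, `random_undersample`, and `balanced_class_weights` are hypothetical. The weight formula, `n_samples / (n_classes * class_count)`, is the common "balanced" heuristic.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        out_rows.extend(rng.sample(pool, target))
        out_labels.extend([cls] * target)
    return out_rows, out_labels


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


# Synthetic data mirroring the 70/30 split (an assumption, not the competition data).
labels = [0] * 70 + [1] * 30
rows = [[i] for i in range(len(labels))]

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
weights = balanced_class_weights(labels)

print(Counter(over_labels), Counter(under_labels), weights)
```

Resampling like this would normally be applied to the training split only, so that the validation accuracy is still measured on the original 70/30 distribution.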
But all of these steps reduced the accuracy instead of increasing it. The area under the curve and the precision improved, but unfortunately I have to increase the accuracy somehow to win the competition.
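To make the trade-off concrete, here is a minimal sketch of how accuracy and precision can move in opposite directions. The first set of confusion-matrix counts is what a model that always predicts 0 would produce on 100 rows with a 70/30 split; the second set is purely hypothetical and only stands in for a rebalanced model. Neither comes from the competition data, and `metrics` is an illustrative helper.

```python
def metrics(tn, fp, fn, tp):
    """Compute accuracy, precision and recall for class 1 from confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Precision and recall are undefined when the denominator is 0; report 0.0 here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# A model that always predicts 0 on 70 negatives and 30 positives.
always_zero = metrics(tn=70, fp=0, fn=30, tp=0)

# Hypothetical counts for a rebalanced model that now predicts some 1s.
rebalanced = metrics(tn=52, fp=18, fn=14, tp=16)

print(always_zero)
print(rebalanced)
```

With these illustrative counts, the always-0 model scores 70% accuracy with zero precision and recall on class 1, while the rebalanced model finds more than half of the positives but lands at 68% accuracy because of the extra false positives.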
So I would really appreciate it if you could tell me about techniques to improve the accuracy on an imbalanced dataset.