Validating classification results

I created a model with only 2 classes, and the classification report was:

Although the accuracy looks good, I don't think this model is good. The original data has 522 records of class 1 and 123 of class 2, so I suspect the model is simply guessing the most common class (class 1). When I applied the model to the original data, it predicted 585 records as class 1 and 60 as class 2.

When I balanced the classes, the results were:

Applying the model to the original data produced 396 predictions for class 1 and 249 for class 2. Since I'm going to use this model for prediction, it still doesn't look good to me.

My evaluation, in this case, was to multiply the number of class-2 predictions by the class-2 precision: 0.65 × 249 ≈ 162, without counting the records predicted as class 1 that are actually class 2. That estimate is already much bigger than the original count (123 records).
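
For concreteness, a minimal sketch of that calculation (using the 0.65 precision and 249 predictions quoted above, and assuming the reported precision holds on the full dataset):

```python
# What precision x predicted-count estimates: the number of CORRECT class-2
# predictions (true positives), not the total class-2 population. Records
# that are truly class 2 but predicted as class 1 (false negatives) are
# invisible to this estimate.
precision_2 = 0.65   # class-2 precision from the balanced model's report
predicted_2 = 249    # records the model labelled class 2 on the original data

true_pos_2 = precision_2 * predicted_2        # ~162 correct class-2 predictions
false_pos_2 = predicted_2 - true_pos_2        # ~87 class-1 records mislabelled as 2
print(round(true_pos_2), round(false_pos_2))  # 162 87
```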

Is this evaluation correct? Are there other ways to evaluate this model?



Let's start with your data: you have 522 records of class 1 and 123 of class 2, roughly a 4:1 ratio. So there is a skew, which is what we call a class imbalance problem.

There are some precautions to take when working with a class-imbalanced dataset, including which model to use, which loss function to optimize, and which metric to evaluate.
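
For instance, a minimal sketch of the model/loss side, assuming scikit-learn (the choice of estimator here is just an example):

```python
from sklearn.linear_model import LogisticRegression

# class_weight="balanced" reweights the loss inversely to class frequency,
# so mistakes on the rare class 2 cost more during training than mistakes
# on the common class 1.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
# model.fit(X_train, y_train)  # fit on your own training split
```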

Here we focus on the metric (since no model/loss information is available). Looking at the first report, the recall of class 1 is much higher than that of class 2 (0.98 vs. 0.32). This is a clear sign that your model is biased towards predicting class 1, which is a problem.
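
As a quick check, a minimal sketch of reading per-class recall out of a report, assuming scikit-learn (the labels here are hypothetical; substitute your own held-out y_true/y_pred):

```python
from sklearn.metrics import classification_report, recall_score

# Hypothetical true labels and predictions, just to illustrate the calls.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 2]

print(classification_report(y_true, y_pred))
# Per-class recall directly: the fraction of each true class the model recovers.
print(recall_score(y_true, y_pred, average=None, labels=[1, 2]))  # [1.0, 0.5]
```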

There are a few things you can do:

  1. Accuracy is well known to be unsuitable for imbalanced data; pick a metric that can cope with class imbalance, e.g. the F1-score (see the sketch after this list).

  2. Go back to the business objective: how much are you willing to trade off between the precision/recall of classes 1 and 2? This will give you a clue about which metric to choose, and it can also guide the design of a cost matrix for training, if applicable.
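
On point 1, a minimal sketch of computing the F1-score, assuming scikit-learn and the same hypothetical labels as above:

```python
from sklearn.metrics import f1_score

# Hypothetical labels standing in for a real held-out evaluation set.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 2]

# Per-class F1 exposes the weakness on the rare class that accuracy hides.
print(f1_score(y_true, y_pred, average=None, labels=[1, 2]))
# Macro-F1 weights both classes equally, so class 2 counts as much as class 1.
print(f1_score(y_true, y_pred, average="macro"))
```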

On top of that, do not forget to follow best practices for model training/validation/testing, e.g. split your dataset beforehand and never use the same data for both training and evaluation. You can find a lot of great tutorials on model evaluation, and I strongly recommend doing some research on the topic.
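
For example, a minimal sketch of a stratified split, assuming scikit-learn (X and y are placeholders shaped like the data in the question):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data shaped like the question: 522 class-1 and 123 class-2 records.
X = np.random.rand(645, 5)           # placeholder features
y = np.array([1] * 522 + [2] * 123)  # labels with the 4:1 skew

# stratify=y keeps the 4:1 class ratio in both splits; the held-out
# test set must never be used for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(np.bincount(y_train)[1:], np.bincount(y_test)[1:])  # ratio preserved
```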
