Calibrating probability thresholds for multiclass classification
I have built a network for classifying three classes. The network consists of a CNN followed by two fully-connected layers. The CNN consists of convolutional layers, each followed by batch normalization, a ReLU activation, max pooling, and dropout. The three classes are imbalanced (as can be seen in the confusion matrix below). I have optimized the parameters of the network to maximize the AUC.
I'm calculating the AUC using macro- and micro-averaging. As the ROC plot shows, the AUC is not bad. On the other hand, the confusion matrix looks pretty poor: the first (low) class in particular is predicted badly, and the network tends to predict the majority class. The network outputs a probability for each class, and I simply take the class with the maximum probability when building the confusion matrix.
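For reference, this is roughly how I compute the confusion matrix and the averaged AUC (a minimal sketch; `model`, `x_val`, and `y_val` are placeholders for my trained network and validation data):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.preprocessing import label_binarize

# model, x_val, y_val stand in for my compiled network and validation
# data (y_val holds integer labels 0, 1, 2).
y_prob = model.predict(x_val)        # softmax output, shape (n_samples, 3)
y_pred = np.argmax(y_prob, axis=1)   # argmax decision rule

cm = confusion_matrix(y_val, y_pred)

# Macro-/micro-averaged AUC over the three one-vs-rest problems
y_bin = label_binarize(y_val, classes=[0, 1, 2])
auc_macro = roc_auc_score(y_bin, y_prob, average='macro')
auc_micro = roc_auc_score(y_bin, y_prob, average='micro')
```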
I have tried using balanced class weights while training the network (via the `class_weight` argument of Keras' `fit` method). This helped the network predict the minority class(es) more often, but on the other hand the AUC decreased.
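Concretely, I compute the weights with scikit-learn and pass them to `fit` (again a sketch; `x_train`, `y_train`, and `y_train_onehot` are placeholders, and the hyperparameters are illustrative):

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# y_train: integer training labels (placeholder for my data)
classes = np.unique(y_train)
weights = compute_class_weight(class_weight='balanced',
                               classes=classes, y=y_train)
class_weight = {int(c): w for c, w in zip(classes, weights)}

# model is my compiled Keras network
model.fit(x_train, y_train_onehot,
          epochs=50, batch_size=32,
          class_weight=class_weight)
```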
Is there a way to infer probability thresholds from the ROC plot? For two classes, I think the optimal probability threshold can be inferred from the ROC curve by taking `max(TPR - FPR)` (Youden's J statistic), but here I have three classes... Or is there another method?
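To make the binary case concrete, this is what I mean, plus a naive one-vs-rest extension I could imagine for my three classes (my own assumption, not an established recipe; I don't know whether it is sound or how three per-class thresholds would combine into a single decision rule):

```python
import numpy as np
from sklearn.metrics import roc_curve

# Binary case: threshold maximizing Youden's J = TPR - FPR
# (y_true_binary, y_score are placeholders for binary labels and scores)
fpr, tpr, thr = roc_curve(y_true_binary, y_score)
best_thr = thr[np.argmax(tpr - fpr)]

# Naive one-vs-rest extension (my assumption): one Youden threshold
# per class on that class's own ROC curve, using y_val/y_prob as above.
per_class_thr = {}
for k in range(3):
    fpr_k, tpr_k, thr_k = roc_curve((y_val == k).astype(int), y_prob[:, k])
    per_class_thr[k] = thr_k[np.argmax(tpr_k - fpr_k)]
```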