What is the appropriate statistical test to compare the MAUC scores from two machine learning classifiers?

I would like to compare the scores of two multi-class classifiers. I have calculated the MAUC score for each of the algorithms, and now I want to see whether there is a statistically significant difference between the results.

From what I have read so far, the McNemar test seems to be a good option; however, I am not sure exactly how to use it. In this article, there is an example of how to use McNemar's test to compare the accuracy of two algorithms.

The scores I would like to compare are 0.809 and 0.812. Following the tutorial, I came up with this table, to which I want to apply the McNemar test implemented here.

                    | model 1 (correct) | model 1 (wrong)
model 2 (correct)   | 0.809             | 0.003
model 2 (wrong)     | 0.000             | 0.191
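
For comparison, McNemar's test is defined on counts of test samples, not on scores, so I suspect the table above is not what the test expects. Below is a minimal sketch of how such a table is usually built from per-sample predictions. The labels and predictions are made up for illustration, and the helper names mcnemar_table and mcnemar_exact_p are hypothetical, not taken from the tutorial or from any library. Note that this compares per-sample correctness (accuracy), not the MAUC itself.

```python
from math import comb


def mcnemar_table(y_true, pred_1, pred_2):
    """Count samples by whether each model classified them correctly.

    Returns (both_correct, only_1_correct, only_2_correct, both_wrong).
    """
    both = only_1 = only_2 = neither = 0
    for t, p1, p2 in zip(y_true, pred_1, pred_2):
        c1, c2 = (p1 == t), (p2 == t)
        if c1 and c2:
            both += 1
        elif c1:
            only_1 += 1
        elif c2:
            only_2 += 1
        else:
            neither += 1
    return both, only_1, only_2, neither


def mcnemar_exact_p(only_1, only_2):
    """Two-sided exact (binomial) McNemar p-value.

    Only the discordant counts matter: samples that exactly one
    of the two models classified correctly.
    """
    n = only_1 + only_2
    if n == 0:
        return 1.0  # no disagreements, so no evidence of a difference
    k = min(only_1, only_2)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up 3-class labels and predictions, for illustration only.
y_true = [0, 1, 2, 2, 1, 0, 1, 2, 0, 1]
pred_1 = [0, 1, 2, 1, 1, 0, 0, 2, 0, 2]
pred_2 = [0, 1, 1, 2, 1, 0, 1, 2, 0, 2]

both, only_1, only_2, neither = mcnemar_table(y_true, pred_1, pred_2)
p_value = mcnemar_exact_p(only_1, only_2)
print(both, only_1, only_2, neither, p_value)
```

The four counts sum to the number of test samples, which is the main difference from my table, where the cells are fractions derived from the two MAUC scores.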

Could someone please help me out here? I'm very confused. Thank you!

