It is sometimes said that accuracy has no relationship to the loss, but from a theoretical perspective, there IS a relationship.
Accuracy is $1 - \text{(error rate)}$ and the error rate can be seen as the expectation of the 0-1 loss:
\begin{equation}
l_{01}(f(x), y) :=
\begin{cases}
0 & (f(x) = y) \\
1 & (f(x) \neq y)
\end{cases}
\end{equation}
\begin{equation}
\text{error rate} = \mathbb{E}_{x, y} \left[ l_{01}(f(x), y) \right]
\end{equation}
where $f$ is the model, $x$ is its input and $y$ is the ground truth label for $x$.
In order to maximize the accuracy, we want to minimize the error rate.
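As a concrete illustration, here is a minimal sketch in NumPy (the arrays and their values are made up for the example) showing that accuracy is exactly one minus the empirical mean of the 0-1 loss:

```python
import numpy as np

# Hypothetical predictions f(x) and ground-truth labels y for 6 examples.
y_pred = np.array([0, 1, 1, 0, 1, 0])
y_true = np.array([0, 1, 0, 0, 1, 1])

# 0-1 loss: 0 where the prediction matches the label, 1 otherwise.
zero_one_loss = (y_pred != y_true).astype(float)

error_rate = zero_one_loss.mean()  # empirical expectation of the 0-1 loss
accuracy = 1.0 - error_rate

print(error_rate)  # 0.333...
print(accuracy)    # 0.666...
```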
However, because the 0-1 loss is discontinuous (and non-convex), minimizing it directly is practically impossible. Instead, a variety of "surrogate losses" are used. A surrogate loss function $l$ is required to have some properties:
- $l$ is continuous.
- $l$ is convex.
- $l$ bounds $l_{01}$ from above.
Surrogate losses with these properties can be minimized via the well-known gradient descent algorithm, and because they bound $l_{01}$ from above, driving the surrogate down also drives the error rate down.
Popular surrogate losses include the hinge loss, used in support vector machines (SVMs), and the logistic loss, used in logistic regression and standard neural networks.
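To see the upper-bound property concretely, here is a small sketch of my own comparing both surrogates against the 0-1 loss as functions of the margin $m = y f(x)$, with labels $y \in \{-1, +1\}$. One subtlety: the logistic loss needs the base-2 logarithm for the bound to hold exactly (at $m = 0$ the natural-log version gives $\ln 2 < 1$):

```python
import numpy as np

# Margins m = y * f(x); m > 0 means a correct classification.
m = np.linspace(-3.0, 3.0, 601)

zero_one = (m <= 0).astype(float)      # 0-1 loss as a function of the margin
hinge = np.maximum(0.0, 1.0 - m)       # hinge loss (SVM)
logistic = np.log2(1.0 + np.exp(-m))   # logistic loss, base-2 so l(0) = 1

# Both surrogates are continuous, convex, and bound the 0-1 loss from above.
assert np.all(hinge >= zero_one)
assert np.all(logistic >= zero_one)
```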
So, from a theoretical viewpoint, the accuracy and the loss displayed at every epoch of your training are indeed related. That is,
- Accuracy has a direct connection to the error rate, which we want to minimize during training.
- Loss (usually the cross-entropy loss, which is equivalent to the logistic loss in a sense; see the sketch below) is a surrogate loss that bounds the error rate from above.
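To make "equivalent in a sense" concrete: for binary labels $t \in \{0, 1\}$, writing $y = 2t - 1 \in \{-1, +1\}$ and $p = \sigma(f(x))$ for the sigmoid output, the cross-entropy loss $-t \log p - (1 - t) \log(1 - p)$ is algebraically identical to the logistic loss $\log(1 + e^{-y f(x)})$. A quick numerical check (my own sketch, not tied to any framework):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
f_x = rng.normal(size=1000)        # raw model scores f(x)
t = rng.integers(0, 2, size=1000)  # labels in {0, 1}
y = 2 * t - 1                      # the same labels mapped to {-1, +1}

p = sigmoid(f_x)
cross_entropy = -t * np.log(p) - (1 - t) * np.log(1 - p)
logistic = np.log(1.0 + np.exp(-y * f_x))

# The two losses agree element-wise (up to floating-point error).
assert np.allclose(cross_entropy, logistic)
```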