How is the error calculated with multiple output neurons in a neural network?
Machine Learning books generally explain that the error for a given sample $i$ is calculated as:
$e_i = y_i - \hat{y_i}$
where $\hat{y_i}$ is the target output and $y_i$ is the actual output produced by the network. From these errors, a loss function $L$ is computed:
$L = \frac{1}{2N}\sum^{N}_{i=1}(e_i)^2$
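For concreteness, here is how I understand the single-output case (the concrete target/output values are just made-up numbers for illustration):

```python
import numpy as np

# Single output neuron, N = 4 samples.
# Following the notation above: y_hat holds the targets,
# y holds the actual outputs produced by the network.
y_hat = np.array([1.0, 0.0, 1.0, 1.0])  # targets
y     = np.array([0.9, 0.2, 0.7, 0.4])  # actual network outputs

e = y - y_hat                     # per-sample error e_i
L = np.sum(e**2) / (2 * len(e))   # L = (1/2N) * sum(e_i^2)
print(L)                          # 0.0625
```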
The above scenario is explained for a binary classification/regression problem. Now, let's assume an MLP with $m$ neurons in the output layer for a multiclass classification problem (typically one neuron per class).
What changes in the equations above? Since we now have multiple outputs, should both $e_i$ and $y_i$ now be vectors?
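To make my question concrete, here is my current guess at the generalisation, with each $e_i$ as a vector and the loss summing the squared errors over all $m$ outputs as well (the one-hot targets and output values below are invented for illustration). Is this the correct way to extend the formula?

```python
import numpy as np

# My guess: with m output neurons, each sample's error e_i becomes a
# vector of length m, and the loss sums squared errors over both
# samples and outputs.
N, m = 4, 3
y_hat = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]], dtype=float)  # one-hot targets, shape (N, m)
y = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6],
              [0.7, 0.2, 0.1]])             # network outputs, shape (N, m)

E = y - y_hat                # error vectors e_i stacked as rows, shape (N, m)
L = np.sum(E**2) / (2 * N)   # (1/2N) * sum over i of ||e_i||^2
print(L)                     # 0.0875
```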