The error can have different forms depending on the application. For example, for a simple regression problem we often use the sum of squared deviations between the actual output $y_n$ for the input $x_n$ and the predicted output $\hat{y}(x_n)$. The total loss $J_\text{Gauss}$, also known as the Gaussian loss, is then given as the sum of the squared errors over all observations.
$$J_\text{Gauss}= \sum_{n=1}^N\left[y_n-\hat{y}(x_n)\right]^2$$
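To make this concrete, here is a minimal NumPy sketch of the Gaussian loss; the function name `gaussian_loss` and the example values are illustrative choices, not part of any particular library.

```python
import numpy as np

def gaussian_loss(y, y_hat):
    """Sum of squared deviations between observed and predicted outputs."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return np.sum((y - y_hat) ** 2)

# Example: three observations and their predictions
print(gaussian_loss([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))  # 0.01 + 0.01 + 0.04 ≈ 0.06
```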
If we use absolute values instead of squares, we obtain the Laplacian loss function $J_\text{Laplace}$, which is given by
$$J_\text{Laplace}=\sum_{n=1}^N\left|y_n-\hat{y}(x_n)\right|$$
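The corresponding sketch only swaps the square for an absolute value (again, `laplacian_loss` is just an illustrative name):

```python
import numpy as np

def laplacian_loss(y, y_hat):
    """Sum of absolute deviations between observed and predicted outputs."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return np.sum(np.abs(y - y_hat))

print(laplacian_loss([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))  # 0.1 + 0.1 + 0.2 ≈ 0.4
```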
If we instead want to compare two probability distributions $p(x)$ and $q(x)$, we can use an asymmetric distance measure called the Kullback-Leibler divergence
$$
D_\text{KL}(P\parallel Q)=\int_{-\infty}^{\infty}p(x)\ln\frac{p(x)}{q(x)}\,dx.
$$
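As a rough numerical illustration, the integral can be approximated by a Riemann sum on a grid. The sketch below assumes SciPy's `norm` for the example densities; the function name `kl_divergence_grid` and the grid settings are arbitrary choices for the example.

```python
import numpy as np
from scipy.stats import norm

def kl_divergence_grid(p_pdf, q_pdf, grid):
    """Approximate D_KL(P || Q) = integral of p(x) ln(p(x)/q(x)) dx by a Riemann sum."""
    p, q = p_pdf(grid), q_pdf(grid)
    dx = grid[1] - grid[0]
    return np.sum(p * np.log(p / q)) * dx

# Example: two unit-variance Gaussians with means 0 and 1
x = np.linspace(-10.0, 10.0, 10001)
print(kl_divergence_grid(norm(0, 1).pdf, norm(1, 1).pdf, x))  # analytic value is 0.5
```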
For binary classification, we can use the hinge loss
$$J_\text{hinge}=\sum_{n=1}^N\max \{0, 1- t_n \hat{y}(x_n)\},$$
in which $t_n=+1$ if observation $x_n$ is from the positive class and $t_n=-1$ if it is from the negative class.
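A small sketch of the hinge loss, where `t` holds the labels $\pm 1$ and `y_hat` the raw model scores (the function name `hinge_loss` and the example numbers are illustrative):

```python
import numpy as np

def hinge_loss(t, y_hat):
    """Sum of hinge losses; t contains labels +1/-1, y_hat the raw model outputs."""
    t, y_hat = np.asarray(t), np.asarray(y_hat)
    return np.sum(np.maximum(0.0, 1.0 - t * y_hat))

# A correctly classified point with margin >= 1 contributes 0; the rest are penalised linearly
print(hinge_loss([+1, -1, +1], [2.0, -0.5, -0.3]))  # 0 + 0.5 + 1.3 = 1.8
```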
For support vector regression, the $\varepsilon$-insensitive loss $J_\varepsilon$ is used. Summed over all observations, as with the losses above, it is defined by the following equation.
$$J_\varepsilon=\sum_{n=1}^N\max\{0,|y_n-\hat{y}(x_n)|-\varepsilon\}$$
This loss acts like a threshold: a deviation is only counted as an error if it is larger than $\varepsilon$.
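The thresholding behaviour is easy to see in a sketch (the function name `epsilon_insensitive_loss` and the chosen $\varepsilon$ are illustrative):

```python
import numpy as np

def epsilon_insensitive_loss(y, y_hat, eps=0.1):
    """Sum of epsilon-insensitive losses: residuals smaller than eps are ignored."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return np.sum(np.maximum(0.0, np.abs(y - y_hat) - eps))

# The first residual (0.05) lies inside the epsilon-tube and is not penalised
print(epsilon_insensitive_loss([1.0, 2.0], [1.05, 2.3], eps=0.1))  # 0 + 0.2 ≈ 0.2
```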
As you can see, there are several measures of error for comparing the predicted output with the observed output (see the Wikipedia article on loss functions for classification for further examples).