How to interpret Sum of Squared Error in a classification task

I am working on an ANN. I have 2497 training examples, each a vector of length 128, so the input size is 128. The hidden layer has 64 neurons and the output layer has 6 neurons (one per class, since there are six classes).

My target vector looks something like this: [0 1 0 0 0 0]. This means that the example belongs to class 2.

I use sigmoid as the activation in all layers and the sum of squared errors (SSE) as the loss. The SSE is accumulated over one epoch, and I train for 10k epochs.

My loss starts at around 700 and decreases to 450. Should I say that the loss is about 0.18 (18%) per example, since 450 is the total loss over all 2497 examples?

How do I interpret this? Is my model good enough? I know that I should test it on unseen data to be sure of its accuracy, but does this number tell me anything about the performance at all?

PS: I am implementing it in C.

Topic multiclass-classification c scoring classification machine-learning

Category Data Science


SSE in classification is proportional to the Brier score:

$$ SSE = \sum_n \sum_k (\hat y_{nk} - y_{nk})^2 $$

$$ Brier = \frac{SSE}{N} $$

Here $n$ indexes the $N$ observations and $k$ indexes the classes.

In your case, the Brier score is $450 / 2497 \approx 0.18$ per example. Lower is better, and 0 corresponds to perfect predictions.
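Since you are implementing this in C, here is a minimal sketch of how the two quantities could be computed. The function names `sum_squared_error` and `brier_score` and the row-major layout of the arrays are assumptions made for illustration, not taken from your code.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of squared errors over n examples and k output units.
 * Assumption: pred and target are row-major arrays of length n * k. */
double sum_squared_error(const double *pred, const double *target,
                         size_t n, size_t k)
{
    double sse = 0.0;
    for (size_t i = 0; i < n * k; ++i) {
        double diff = pred[i] - target[i];
        sse += diff * diff;
    }
    return sse;
}

/* Brier score: the SSE divided by the number of examples n
 * (not by the number of individual outputs n * k). */
double brier_score(const double *pred, const double *target,
                   size_t n, size_t k)
{
    assert(n > 0); /* the average is undefined for an empty data set */
    return sum_squared_error(pred, target, n, k) / (double)n;
}
```

With your numbers, `brier_score` would be called with `n = 2497` and `k = 6` on the network outputs collected over one epoch.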

The Brier score can be decomposed into

$$ Brier = Reliability - Resolution + Uncertainty $$
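
For reference, here is a sketch of the three terms in their usual form for a single class treated as a binary event (the full score above would sum these over the classes $k$). The grouping of predictions into bins $b$ and the symbols $n_b$, $\bar p_b$, $\bar o_b$ and $\bar o$ are notation assumed here, not part of the formulas above: $n_b$ is the number of observations in bin $b$, $\bar p_b$ the predicted value in that bin, $\bar o_b$ the observed frequency of the class in that bin, and $\bar o$ its overall frequency. The identity is exact only when all predictions within a bin share the same value.

$$ Reliability = \frac{1}{N} \sum_b n_b (\bar p_b - \bar o_b)^2 $$

$$ Resolution = \frac{1}{N} \sum_b n_b (\bar o_b - \bar o)^2 $$

$$ Uncertainty = \bar o (1 - \bar o) $$

Reliability is small when the predicted values match the observed frequencies, Resolution is large when the bins separate the outcomes well, and Uncertainty depends only on the data, not on the model.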
