Uncertainties in non-convex optimization problems (neural networks)
How do you treat statistical uncertainties coming from non-convex optimization problems?
More specifically, suppose you have a neural network. It is well known that the loss is not convex; the optimization procedure, with any approximate stochastic optimizer together with random weight initialization, introduces randomness into the training process, which translates into different optimal regions being reached at the end of training. Now, even supposing that any minimum of the loss is an acceptable solution, there is no guarantee that those minima correspond to the same model performance (i.e., the same scores).
Ideally, one would repeat the same optimization N times and look at the distribution of the results, but practically speaking, with a large neural network you cannot afford a reasonably large number of replicas, so a purely statistical treatment is out of reach. Moreover, even looking at frequency histograms, it would not be trivial to model such a distribution and quote expected values and variances (of course one can select some percentiles, but that is not a formally correct approach).
Note: I am not changing the data; I am talking about the performance variance introduced by the non-convex optimization problem. Clearly, changing the data (for instance, with some cross-validation) would change the loss and consequently introduce another source of variance into the picture, so I am not interested in that here.
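For concreteness, here is a minimal sketch of the replica approach I have in mind (purely illustrative: sklearn's MLPClassifier stands in for the actual network, the synthetic dataset is a placeholder, and names like N_REPLICAS are my own). The data stays fixed; only the seed controlling weight initialization and minibatch shuffling changes between runs, and the resulting score distribution is summarized with percentiles and a bootstrap standard error:

```python
# Illustrative sketch only: MLPClassifier stands in for "a neural network",
# make_classification for a fixed dataset; N_REPLICAS etc. are hypothetical names.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Fixed data: the only thing that changes across replicas is the seed that
# controls weight initialization and minibatch shuffling.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

N_REPLICAS = 30  # as many as the compute budget allows
scores = []
for seed in range(N_REPLICAS):
    model = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300,
                          random_state=seed)  # different seed -> different minimum
    model.fit(X_tr, y_tr)
    scores.append(model.score(X_te, y_te))
scores = np.array(scores)

# Distribution-free summary: quote the median and a percentile interval,
# plus a bootstrap standard error of the mean score.
lo, med, hi = np.percentile(scores, [2.5, 50.0, 97.5])
boot_means = np.random.default_rng(0).choice(
    scores, size=(10_000, len(scores))).mean(axis=1)
print(f"median={med:.4f}  95% percentile interval=[{lo:.4f}, {hi:.4f}]")
print(f"mean={scores.mean():.4f} +/- {boot_means.std():.4f} (bootstrap SE)")
```

This is exactly the "repeat N times and quote percentiles" workaround described above, which is affordable here only because the toy model is small; my question is how to treat this uncertainty rigorously when N must stay small.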
Topic uncertainty loss-function neural-network optimization statistics
Category Data Science