Uncertainties in non-convex optimization problems (neural networks)

How do you treat statistical uncertainties arising from non-convex optimization problems? More specifically, suppose you have a neural network. It is well known that the loss is not convex; the optimization procedure with any approximate stochastic optimizer, together with the random weight initialization, introduces some randomness into the training process, which translates into different "optimal" regions being reached at the end of training. Now, supposing that any minimum of the loss is an acceptable solution, there are no guarantees that those minima …
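One pragmatic way to measure this optimization/seed uncertainty is simply to retrain the same architecture several times and report the spread of the final metric (a deep-ensembles-style estimate). A minimal sketch, where the tiny model and toy data are hypothetical stand-ins:

    import torch
    import torch.nn as nn

    def train_once(seed, X, y, epochs=200):
        # Re-seed so each run differs only in initialization and optimizer
        # noise, the two randomness sources discussed above.
        torch.manual_seed(seed)
        model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
        opt = torch.optim.SGD(model.parameters(), lr=0.1)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(model(X), y)
            loss.backward()
            opt.step()
        return (model(X).argmax(dim=1) == y).float().mean().item()

    # Toy data standing in for a real dataset.
    X = torch.randn(120, 4)
    y = torch.randint(0, 3, (120,))

    accs = torch.tensor([train_once(seed, X, y) for seed in range(10)])
    # Report mean +/- std across seeds: the std captures the
    # optimization/initialization uncertainty, not the data noise.
    print(f"accuracy = {accs.mean():.3f} +/- {accs.std():.3f}")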
Category: Data Science

How can I store sources, effective dates, and confidence for every property in a knowledge graph?

What I want to do is ensure that every property in a knowledge base comes from at least one source. I would like every edge to be spawned (or at least explained) by some event, such as a "claim", "measurement", or "birth". I'd also like to rate, on a scale, the confidence that a property is correct, which could be inherited from the source's confidence rating. Finally, I want to ensure that effective date(s) are known or …
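A common pattern is to reify each edge as a first-class object that carries its own provenance fields. A minimal sketch with illustrative (non-standard) field names; in RDF this role is played by reification or RDF-star, and in property graphs such as Neo4j by edge properties:

    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class Source:
        name: str
        confidence: float  # trust in the source overall, in [0, 1]

    @dataclass
    class Claim:
        subject: str
        predicate: str
        obj: str
        sources: list              # at least one Source per property
        confidence: float          # may be inherited from the sources
        effective_from: date
        effective_to: date = None  # None = still in effect

    census = Source("national_census_2020", confidence=0.95)
    claim = Claim(
        subject="alice", predicate="residence", obj="Berlin",
        sources=[census],
        # Inherit confidence from the strongest source by default.
        confidence=max(s.confidence for s in [census]),
        effective_from=date(2020, 1, 1),
    )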
Category: Data Science

How to forecast time series with negative trend in test set and big uncertainty? (uncertainty due to Covid and Ukraine crisis)

Recently I started building a machine learning model for a European customer covering around 800 product time series. The goal is to produce a monthly forecast for the 6 months ahead. Since this customer is a grocery wholesaler, many of the products experience supply-chain difficulties due to Covid restrictions, and there may now be large effects due to the Ukraine crisis. In the attached picture you can already see the downward trend over the last 6 …
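One way to get both a point forecast and an honest uncertainty band is a damped-trend exponential smoothing model with simulation-based intervals; damping keeps the recent negative trend from extrapolating indefinitely. A minimal sketch on toy data (statsmodels' simulate is one way to produce the band):

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    # Toy monthly series with a downward trend, standing in for one product.
    idx = pd.date_range("2019-01-01", periods=36, freq="MS")
    y = pd.Series(100 - 0.8 * np.arange(36) + np.random.normal(0, 3, 36),
                  index=idx)

    # Damped additive trend: the decline flattens out over the horizon.
    fit = ExponentialSmoothing(y, trend="add", damped_trend=True).fit()

    # Simulate many future paths; their spread is the forecast uncertainty.
    sims = fit.simulate(6, repetitions=500, error="add", anchor="end")
    point = fit.forecast(6)
    lower = sims.quantile(0.05, axis=1)
    upper = sims.quantile(0.95, axis=1)
    print(pd.DataFrame({"forecast": point, "lo5": lower, "hi95": upper}))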
Category: Data Science

Conventional way of representing uncertainty

I am calculating metrics such as F1 score, recall, precision, and accuracy in a multilabel classification setting. With randomly initialized weights, the softmax output (i.e. the prediction) might look like this with a batch size of 8 (I am using pytorch):

    import torch
    logits = torch.tensor([[ 0.0334, -0.0896, -0.0832, -0.0682, -0.0707],
                           [ 0.0322, -0.0897, -0.0829, -0.0683, -0.0708],
                           [ 0.0324, -0.0894, -0.0829, -0.0682, -0.0705],
                           [ 0.0322, -0.0897, -0.0828, -0.0683, -0.0708],
                           [ 0.0333, -0.0895, -0.0832, -0.0682, -0.0708],
                           [ 0.0341, -0.0871, -0.0829, -0.0681, -0.0650],
                           [ …
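One conventional summary of this kind of output is the entropy of the softmax distribution, which is near its maximum for an untrained network. A short sketch reusing the first two rows of the logits above:

    import math
    import torch

    logits = torch.tensor([[ 0.0334, -0.0896, -0.0832, -0.0682, -0.0707],
                           [ 0.0322, -0.0897, -0.0829, -0.0683, -0.0708]])

    probs = torch.softmax(logits, dim=1)
    # Predictive entropy per sample: log(K) for a uniform distribution
    # over K classes, 0 for a one-hot prediction.
    entropy = -(probs * probs.log()).sum(dim=1)
    max_entropy = math.log(logits.shape[1])
    # ~1.0 here: the randomly initialized net is maximally unsure.
    print(entropy / max_entropy)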
Category: Data Science

What are some state of art computer vision models for anomaly detection that can learn continuously and build classes for detected anomalies?

I'm looking to build a model that:
- detects anomalies
- improves from user feedback
- builds classes for the anomalies based on user feedback
Since a schema is worth a thousand words: do you know some state-of-the-art models that have this behavior (at least partially) that I could use or benchmark?
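No single off-the-shelf model covers all three requirements, but the pieces compose: an unsupervised detector for the first step, plus a feedback-driven class store for the other two. A rough sketch, where the embeddings, detector choice, and distance threshold are all placeholders:

    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)
    features = rng.normal(size=(500, 128))  # stand-in for CNN embeddings

    # Step 1: unsupervised anomaly detection on the embeddings.
    detector = IsolationForest(contamination=0.05, random_state=0).fit(features)
    anomalies = features[detector.predict(features) == -1]

    # Steps 2-3: user feedback assigns anomalies to named classes; new
    # anomalies are matched to the nearest class centroid, and a new class
    # is opened when nothing is close enough.
    class_centroids = {}  # name -> centroid vector, updated from feedback

    def assign(vec, threshold=10.0):
        if class_centroids:
            name, c = min(class_centroids.items(),
                          key=lambda kv: np.linalg.norm(vec - kv[1]))
            if np.linalg.norm(vec - c) < threshold:
                return name
        return None  # ask the user: is this a new anomaly class?

    # Simulated feedback on the first anomaly creates a class.
    class_centroids["scratch"] = anomalies[0]
    print(assign(anomalies[1]))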
Category: Data Science

How can I distribute samples optimally to fit a model?

I'm trying to fit a model to a small number (~5-10) of data points. I might be able to suggest the optimal distribution of the data points beforehand, knowing a bit about the data and the model I created. Is there an established method, or do you have any ideas, for choosing the best sampling intervals? Maybe based on the gradient of the model, or similar? The model consists of several differential equations describing a biological system and …
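The gradient-based idea mentioned in the question corresponds to classical optimal experimental design: place samples where the model output is most sensitive to the parameters. A minimal sketch of a greedy D-optimal design, using an analytic exponential decay as a stand-in for the actual ODE system:

    import numpy as np

    # Toy stand-in for the ODE model: y(t) = a * exp(-k t), with
    # sensitivities dy/da and dy/dk computed analytically.
    a, k = 2.0, 0.5
    def sensitivities(t):
        return np.stack([np.exp(-k * t), -a * t * np.exp(-k * t)], axis=1)

    candidates = np.linspace(0.1, 10, 100)
    chosen = [0.1]  # seed with the earliest feasible time point
    # Greedy D-optimal design: repeatedly add the candidate time point
    # that most increases det(J^T J), i.e. the Fisher information.
    for _ in range(5):
        def score(t):
            J = sensitivities(np.array(chosen + [t]))
            return np.linalg.det(J.T @ J)
        chosen.append(max(candidates, key=score))
    print(sorted(chosen))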
Category: Data Science

Confidence intervals for evaluation on test set

I'm wondering what the "best practice" approach is for finding confidence intervals when evaluating the performance of a classifier on the test set. As far as I can see, there are two different ways of putting error bars on a metric like, say, accuracy: compute the interval using the formula interval = z * sqrt( (error * (1 - error)) / n ), where n is the sample size, error is the classification error (i.e. 1 - accuracy), and z is a number representing multiples …
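For comparison, the two common approaches side by side: the normal-approximation formula quoted above, and a bootstrap over the per-example test-set results. A minimal sketch on simulated correctness indicators:

    import numpy as np

    rng = np.random.default_rng(0)
    # Stand-in for per-example correctness on a test set of n = 500.
    correct = rng.random(500) < 0.85

    n = correct.size
    acc = correct.mean()
    error = 1 - acc

    # Method 1: normal approximation (the formula from the question),
    # z = 1.96 for a 95% interval.
    half = 1.96 * np.sqrt(error * (1 - error) / n)
    print(f"normal approx: {acc:.3f} +/- {half:.3f}")

    # Method 2: bootstrap -- resample the test set with replacement and
    # take percentiles of the resampled accuracies.
    boot = [rng.choice(correct, size=n, replace=True).mean()
            for _ in range(2000)]
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"bootstrap:     [{lo:.3f}, {hi:.3f}]")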
Category: Data Science

How should systematic uncertainties (up and down) in training data be handled in classification neural networks?

I have a classification neural network and nominal input data on which it is trained; however, for each feature the input data carry a systematic (up and down) uncertainty. How should the accuracy of the classifier be qualified and visualised given these input-data uncertainties? I have a simple MWE composed using the iris dataset; the intention is that it should be easily copy-pastable into a Jupyter notebook. Lotsa imports:

    import numpy as np
    import datetime
    from IPython.display import …
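One approach (familiar from high-energy-physics practice) is to propagate the systematic shifts through evaluation: score the trained classifier on the nominal, all-up, and all-down variations of the test set and quote the spread of accuracies as the systematic envelope. A minimal sketch on iris with made-up shift values:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

    # Hypothetical per-feature systematic shifts (one value per feature).
    sys_up = np.array([0.10, 0.05, 0.10, 0.02])
    sys_down = np.array([0.08, 0.05, 0.12, 0.02])

    # Evaluate on nominal, up-shifted, and down-shifted test sets; the
    # spread of the three accuracies is the systematic envelope.
    for name, X_var in [("nominal", X_te),
                        ("up", X_te + sys_up),
                        ("down", X_te - sys_down)]:
        print(name, clf.score(X_var, y_te))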
Category: Data Science

Does a deployed model have epistemic or aleatoric uncertainty?

Aleatoric uncertainty refers to the randomness in the outcome of an experiment that is due to inherently random effects. Epistemic uncertainty refers to the ignorance of the decision-maker, due for example to a lack of data. Aleatoric uncertainty is irreducible, while epistemic uncertainty can be mitigated (e.g. by adding more data). When we deploy an ML model in production, can we distinguish between epistemic and aleatoric uncertainty?
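In principle yes, provided the deployed model yields a distribution over predictions rather than a single point estimate. One common sketch is MC dropout: keep dropout active at inference, run many stochastic forward passes, and decompose the total predictive entropy into an aleatoric part (the average per-pass entropy) and an epistemic part (the remainder, a mutual-information term). The model and numbers below are purely illustrative:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(4, 32), nn.ReLU(),
                          nn.Dropout(p=0.3), nn.Linear(32, 3))
    model.train()  # keep dropout active at "deployment" time (MC dropout)

    x = torch.randn(1, 4)  # one production input
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1)
                             for _ in range(100)]).squeeze(1)

    mean_p = probs.mean(dim=0)
    total = -(mean_p * mean_p.log()).sum()                # predictive entropy
    aleatoric = -(probs * probs.log()).sum(dim=1).mean()  # expected entropy
    epistemic = total - aleatoric                         # mutual information
    print(f"total={total:.3f} aleatoric={aleatoric:.3f} "
          f"epistemic={epistemic:.3f}")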
Category: Data Science
