Ethical consequences of non-deterministic learning processes?
Most advanced supervised learning techniques are non-deterministic by construction: the final model usually depends on random elements of the learning process (random weight initialization for Neural Networks, or variable selection and subsampling for Gradient Boosted Trees). The phenomenon is easy to observe by plotting the predictions obtained with one random seed against those obtained with another seed: the predictions are usually highly correlated, but they don't coincide exactly.
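Here is a minimal sketch of what I mean, using scikit-learn on synthetic data (the dataset, model, and hyperparameters are arbitrary, chosen only to make the stochasticity visible):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def fit_predict(seed):
    # subsample < 1.0 makes the boosting process stochastic, so the seed matters
    model = GradientBoostingClassifier(subsample=0.8, random_state=seed)
    model.fit(X_train, y_train)
    return model.predict_proba(X_test)[:, 1]

p1, p2 = fit_predict(seed=1), fit_predict(seed=2)

print("correlation:", np.corrcoef(p1, p2)[0, 1])        # typically high
print("max abs difference:", np.abs(p1 - p2).max())     # but not zero
# individuals whose decision at the 0.5 threshold flips between the two seeds
print("flipped decisions:", ((p1 >= 0.5) != (p2 >= 0.5)).sum())
```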
Generally speaking this is not a problem. When trying to separate green tomatoes from red ones, only the overall performance of the classifier matters; individual predictions don't, since no tomato will be upset or sue you. However, for problems concerning people (education, work, loan applications...), variance in individual scores due to the non-deterministic learning process can become a problem: some people may receive a life-impacting decision under one seed and the opposite decision had you used another seed. That doesn't seem very fair or ethical to me; choosing a seed starts to feel like a trolley problem.
Setting aside techniques that can reduce this 'seed dependence' (regularization, ensembling over seeds, and so on; see the sketch below for what I mean by the latter), I am interested in the ethical aspects of this seed-related variance in outputs. But I can't find any resources on the ethical question, and barely any on seed dependence itself; I suspect the phenomenon is not widely disclosed because it might deter people from wanting to use 'Artificial Intelligence'.
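For completeness, here is the kind of mitigation I mean by ensembling over seeds, continuing the sketch above: averaging predictions across several seeds so that no single random draw decides an individual's outcome. It reduces the variance but doesn't eliminate it, which is why the ethical question remains.

```python
# Continuing the previous sketch: average predicted probabilities over several
# seeds so that no single seed determines an individual's score.
seeds = range(10)
p_mean = np.mean([fit_predict(seed=s) for s in seeds], axis=0)

# Decisions based on the averaged score are far less seed-dependent, though
# borderline individuals can still flip if we change the set of seeds.
print("flips vs. seed 1 alone:", ((p_mean >= 0.5) != (p1 >= 0.5)).sum())
```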
In the context of models impacting people's lives, have the ethical consequences of non-deterministic learning processes been formalized or evaluated?
Topic ethical-ai methodology model-selection
Category Data Science