Bounded regression problem: sigmoid, hard sigmoid or…?
I have been training a neural network for a bounded regression problem, and I am still unsure which activation function to use on the output layer.
At first I was convinced that a sigmoid would be the best option, since I need my outputs to lie between 0 and 1, but values near 0 and 1 were never predicted. In hindsight this seems plausible: the sigmoid only approaches 0 and 1 asymptotically, so the pre-activations would have to be very large in magnitude for predictions to reach the extremes.
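For reference, here is a minimal sketch of the kind of setup I mean, assuming a Keras model (the layer sizes and input shape are placeholders, not my actual network):

```python
import tensorflow as tf

# Hypothetical stand-in for my network: sizes and input shape are placeholders.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),                     # 10 input features (placeholder)
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # output constrained to (0, 1)
])
model.compile(optimizer='adam', loss='mse')
```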
So I tried a hard sigmoid instead, but now I face (almost) the opposite problem: many data points are wrongly predicted to be exactly 0 or exactly 1, then there is a gap, and the remaining predictions cluster in the middle (around 0.5).
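The shape difference between the two activations seems to explain both behaviours. A quick NumPy sketch (using the Keras 2 convention for the hard sigmoid; other libraries use slightly different slopes and cut-offs):

```python
import numpy as np

def sigmoid(x):
    # smooth: only approaches 0 and 1 asymptotically, never reaches them
    return 1.0 / (1.0 + np.exp(-x))

def hard_sigmoid(x):
    # piecewise linear (Keras 2 convention): saturates at exactly 0 and 1
    # once |x| >= 2.5, which is where my pile-up at the boundaries comes from
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

x = np.linspace(-6.0, 6.0, 7)
print(np.round(sigmoid(x), 3))       # [0.002 0.018 0.119 0.5 0.881 0.982 0.998]
print(np.round(hard_sigmoid(x), 3))  # [0.    0.    0.1   0.5 0.9   1.    1.  ]
```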
Could this be solved by using another activation function on the output layer? If so, which function would suit this bounded regression problem?
My suspicion is that this is mainly due to my data: the labels near the boundaries may simply be very noisy.
Tags: sigmoid, activation-function, regression, neural-network
Category: Data Science