How do I find the optimal dropout rate for Monte Carlo Dropout?
I have a text classifier with 3 dropout layers. I tried using the Monte Carlo Dropout (MCD) technique to improve its performance, but the performance did not improve. MCD did improve performance when I used it to classify handwritten digits from the MNIST dataset.
Now I wonder whether there is simply no room for improvement in my text classifier, or whether I have selected an incorrect dropout rate.
How do I find the optimal dropout rate for Monte Carlo Dropout?
In particular:
- Should I use the same dropout rate during both training and prediction?
- Should I use the same dropout rate for all dropout layers?
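To make the setup concrete, here is a minimal NumPy sketch of MCD at prediction time and of a simple sweep over candidate rates on a validation set. It is only an illustration under stated assumptions: the model is a toy MLP with three hidden layers and random placeholder weights (not the actual text classifier), the data is random, and `mc_dropout_predict`, `weights`, `x_val`, `y_val` and the candidate rates are hypothetical names and values. Scoring by validation negative log-likelihood is also just one possible choice.

```python
import numpy as np

def mc_dropout_predict(x, weights, rates, n_samples=50, rng=None):
    """Average softmax outputs over n_samples stochastic forward passes.

    weights: list of (W, b) pairs for a small MLP (hypothetical toy model).
    rates:   one dropout rate per hidden layer, kept active at prediction time.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = []
    for _ in range(n_samples):
        h = x
        for (W, b), p in zip(weights[:-1], rates):
            h = np.maximum(h @ W + b, 0.0)           # ReLU hidden layer
            mask = rng.random(h.shape) >= p          # keep each unit with prob. 1 - p
            h = h * mask / (1.0 - p)                 # inverted-dropout scaling
        W, b = weights[-1]
        logits = h @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        e = np.exp(logits)
        probs.append(e / e.sum(axis=1, keepdims=True))
    probs = np.stack(probs)                          # (n_samples, batch, classes)
    return probs.mean(axis=0), probs.std(axis=0)

# Placeholder model and data: random values, NOT a trained classifier.
rng = np.random.default_rng(0)
weights = [
    (rng.normal(size=(8, 16)), np.zeros(16)),
    (rng.normal(size=(16, 16)), np.zeros(16)),
    (rng.normal(size=(16, 16)), np.zeros(16)),
    (rng.normal(size=(16, 3)), np.zeros(3)),
]
x_val = rng.normal(size=(32, 8))
y_val = rng.integers(0, 3, size=32)

# Sweep one shared prediction-time rate and score each by validation NLL.
results = {}
for p in (0.0, 0.1, 0.3, 0.5):
    mean_probs, std_probs = mc_dropout_predict(
        x_val, weights, rates=[p, p, p], n_samples=50, rng=rng
    )
    picked = mean_probs[np.arange(len(y_val)), y_val]
    results[p] = -np.log(picked + 1e-12).mean()

best_rate = min(results, key=results.get)
print(results, best_rate)
```

Because `rates` is a list, the same function could also be called with a different rate per layer (for example `rates=[0.1, 0.3, 0.5]`), which is the situation the second question asks about. With placeholder weights the printed numbers carry no meaning; only the procedure is of interest.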