When should one use L1 or L2 regularization instead of a dropout layer, given that both serve the same purpose of reducing overfitting?

In Keras, there are two common methods to reduce overfitting: L1/L2 regularization and dropout layers.

What are some situations where L1/L2 regularization is preferable to a dropout layer? What are some situations where a dropout layer is better?

Topic: overfitting, keras, dropout, regularization

Category: Data Science


I am not sure there is a formal way to show which is best in which situation; simply trying out different combinations is likely your best bet!

It is worth noting that dropout actually does a little more than just provide a form of regularisation: it adds robustness to the network by effectively training many different sub-networks. Because the randomly deactivated neurons are removed for that forward/backward pass, each pass trains what is essentially a different network. Have a look at this post for a few more pointers regarding the beauty of dropout layers.
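
As a minimal sketch of how this looks in Keras (the layer sizes, the 0.5 rate, and the 20-feature input are illustrative choices, not from the question):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative binary classifier with dropout after each hidden layer.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),  # each training step, ~50% of these activations are zeroed
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),  # dropout is active only during training, not at inference
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Each training step thus samples a different "thinned" network, which is where the robustness comes from.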

$L_1$ versus $L_2$ is easier to explain: the $L_1$ penalty adds $\lambda \sum_i |w_i|$ to the loss, while $L_2$ adds $\lambda \sum_i w_i^2$. Because the $L_2$ term is quadratic, it treats outliers a little more thoroughly, returning a larger error for those points, whereas the linear $L_1$ penalty tends to drive weights to exactly zero, giving sparser models. Have a look here for more detailed comparisons.
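
In Keras, these penalties attach to individual layers. As a minimal sketch, reusing the illustrative architecture above (the 1e-4 strengths are placeholders, not tuned values):

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# Same illustrative architecture, regularised with weight penalties instead of dropout.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(128, activation="relu",
                 kernel_regularizer=regularizers.l1(1e-4)),  # L1: adds lambda*sum(|w|), encourages sparse weights
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),  # L2: adds lambda*sum(w^2), shrinks large weights hardest
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Nothing stops you from combining the two approaches, which is why trying different combinations, as suggested above, is a sensible strategy.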
