Is it wrong to use Glorot Initialization with ReLU Activation?

I'm reading that keras' default initialization is glorot_uniform.

However, all of the tutorials I see use ReLU activation as the go-to for hidden layers, yet I never see them specify He initialization for those layers.

Would it be better for these ReLU layers to use He initialization instead of Glorot?

As seen in O'Reilly's Hands-On Machine Learning with Scikit-Learn and TensorFlow:

+----------------+-------------------------------+
| Initialization | Activation                    |
+----------------+-------------------------------+
| Glorot         | none, tanh, logistic, softmax |
| He             | ReLU and variants             |
| LeCun          | SELU                          |
+----------------+-------------------------------+
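
For concreteness, here is roughly what I mean (the layer size is just a placeholder): the tutorials use the first form, and I never see the second.

    from tensorflow import keras

    # The pattern I see in tutorials: ReLU with Keras' default initializer.
    keras.layers.Dense(64, activation="relu")

    # What the table seems to recommend: ReLU with He initialization.
    keras.layers.Dense(64, activation="relu", kernel_initializer="he_normal")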

Topic: weight-initialization, activation-function, keras, deep-learning, neural-network

Category: Data Science


As a general answer for hyperparameter tuning, you have to try both and see what works better for your problem. I suspect that some (if not most) general tuning rules were observed on a particular problem and with a particular architecture (for example, the He paper is about vision, including convolutional layers).
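
As a minimal sketch of "try both and compare", assuming a generic binary-classification setup (the synthetic data, layer sizes, and epoch count below are placeholders, not recommendations):

    import numpy as np
    from tensorflow import keras

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 20)).astype("float32")  # stand-in features
    y = (X[:, 0] > 0).astype("int32")                  # stand-in labels

    def build_model(init):
        # Same architecture for both runs; only the kernel initializer changes.
        model = keras.Sequential([
            keras.Input(shape=(20,)),
            keras.layers.Dense(64, activation="relu", kernel_initializer=init),
            keras.layers.Dense(64, activation="relu", kernel_initializer=init),
            keras.layers.Dense(1, activation="sigmoid"),
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["accuracy"])
        return model

    for init in ("glorot_uniform", "he_normal"):
        history = build_model(init).fit(X, y, validation_split=0.2,
                                        epochs=10, verbose=0)
        print(init, "val accuracy:", history.history["val_accuracy"][-1])

On a shallow network like this the difference is often negligible; initialization tends to matter more as networks get deeper.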

As for Keras' choice: sometimes, for practical reasons, it is easier to ship a single default option than to adapt the default to each activation. Given your Hands-On Machine Learning citation, it's not hard to see why they would pick Glorot as the default.
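
To see what that default is, you can inspect a freshly constructed layer (the exact class name printed may vary slightly across Keras versions):

    from tensorflow import keras

    # No initializer is passed, so the layer falls back to the Keras default.
    layer = keras.layers.Dense(64, activation="relu")
    print(type(layer.kernel_initializer).__name__)  # e.g. GlorotUniform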
