Why does using tanh worsen accuracy so much?
I was testing how different hyperparameters change the performance of my multilayer perceptron on a regression problem:
import keras
from keras.models import Sequential
from keras.layers import Dense

# Save the best model seen during training
checkpoint = keras.callbacks.ModelCheckpoint('best_model.h5', save_best_only=True)
# Initialising the ANN
model = Sequential()
# Adding the input layer and the first hidden layer
model.add(Dense(32, activation = 'relu', input_dim = X_train.shape[1]))
# Adding the second hidden layer
model.add(Dense(units = 8, activation = 'relu'))
# Adding the output layer
model.add(Dense(units = 1))
optimizer = keras.optimizers.Adam(learning_rate=0.01)
model.compile(optimizer=optimizer, loss='mean_squared_error')
# Fitting the ANN to the Training set
history = model.fit(X_train, y_train, batch_size = 100, epochs = 20, verbose=1, validation_split = 0.1, callbacks=[checkpoint])
This model produced around 68% accuracy.
But when the activation functions for the hidden layers were changed to 'tanh', the accuracy fell off a cliff to 0.07%!
I'm guessing it has something to do with tanh not being suited to regression?
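For reference, the tanh run differed only in the hidden-layer activations (a sketch of the change, assuming everything else above stays the same):

# Same network, with tanh in place of relu for the two hidden layers
model = Sequential()
model.add(Dense(32, activation = 'tanh', input_dim = X_train.shape[1]))
model.add(Dense(units = 8, activation = 'tanh'))
model.add(Dense(units = 1))  # linear output layer, unchanged
model.compile(optimizer = keras.optimizers.Adam(learning_rate=0.01), loss = 'mean_squared_error')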
Topic activation-function keras neural-network
Category Data Science