Why does using the hyperbolic tangent or the sigmoid as the activation function on the last layer give the same accuracy?
The problem
I'm building a simple multilayer perceptron (MLP) in Keras that has to perform binary classification on float data. Each sample is a group of three float values (e.g. 32.01, -10.23, -1.01) and is labelled 0 or 1. Every time I run the training process, the validation accuracy and validation loss stop changing after a few epochs, around 5 or 6.
The problem is that the validation accuracy doesn't increase and always stays at 0.0000.
What I have tried
I have tried using a different activation function for the last layer, such as softmax, and while I got a different result I don't think this is correct, because I want the MLP to produce a single output: one value that tells me whether a sample belongs to one class or the other.
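As an aside, part of why a single-output softmax head can't work: softmax normalizes each value by the sum over all units, so with one unit the output is always exactly 1.0. A small standalone NumPy sketch (not part of my model code) illustrating this:

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability, then normalize by the sum.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# With a single logit, the value is normalized by itself,
# so the output is 1.0 no matter what the input is.
print(softmax(np.array([3.7])))       # [1.]

# With two logits, softmax produces a proper two-class distribution.
print(softmax(np.array([2.0, -1.0])))
```

This is why softmax only makes sense with two or more output units, whereas a single-output binary classifier needs an activation like sigmoid instead.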
The source code
Here is the Python code I use to build the model.
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import binary_crossentropy
from tensorflow.keras.metrics import binary_accuracy

x_train = ...  # training set data
y_train = ...  # training set labels

# Build the model
model = Sequential()
model.add(Input(shape=(3,)))
model.add(Dense(8, activation='relu'))
model.add(Dense(2, activation='tanh'))
model.compile(optimizer=Adam(learning_rate=0.1),
              loss=binary_crossentropy,
              metrics=[binary_accuracy])
model.fit(x=x_train,
          y=y_train,
          batch_size=32,
          epochs=100,
          validation_split=0.1,
          shuffle=True,
          verbose=1)
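For context on the output ranges involved: tanh maps to (-1, 1) while sigmoid maps to (0, 1), and binary cross-entropy is only well defined for predictions in (0, 1). A standalone NumPy sketch (independent of the Keras code above; the helper functions here are plain reimplementations, not Keras APIs) showing the difference:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Standard binary cross-entropy; only valid for y_pred in (0, 1),
    # so out-of-range values must be clipped first.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

z = np.linspace(-3, 3, 7)   # raw pre-activation outputs
print(np.tanh(z))           # spans roughly (-1, 1): not valid probabilities
print(sigmoid(z))           # spans (0, 1): valid probabilities

# Cross-entropy is well defined on sigmoid outputs...
y = np.array([0., 1., 0., 1., 0., 1., 1.])
print(binary_crossentropy(y, sigmoid(z)))

# ...but tanh can emit negative "probabilities" that only survive via clipping.
print(np.tanh(z).min() < 0)  # True
```

This is the heart of my confusion: tanh outputs fall outside the range binary cross-entropy expects, yet swapping it for sigmoid gives me the same accuracy.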
What I want
I don't know what I am doing wrong; I simply want the model to achieve better validation accuracy and validation loss.
Topic activation-function keras tensorflow python machine-learning
Category Data Science