Autoencoder train and test accuracy shooting to 99% in a few epochs
I am trying to train an autoencoder for dimensionality reduction and, hopefully, for anomaly detection. My data specifications are as follows:
- Unlabeled
- 1 million data points
- 9 features
I am trying to reduce the data to 2 compressed features so that I can visualize it better for clustering.
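For context, the scaled_df used below is just the scaled version of the 9 features; the preprocessing is along these lines (StandardScaler and the name raw_df are shown only for illustration, the exact scaler is not the point of the question):

import pandas as pd
from sklearn.preprocessing import StandardScaler

# COLUNMS_FOR_AUTOENCODER is the list of the 9 feature columns; raw_df stands for the unscaled frame
scaler = StandardScaler()
scaled_df = pd.DataFrame(scaler.fit_transform(raw_df[COLUNMS_FOR_AUTOENCODER]),
                         columns=COLUNMS_FOR_AUTOENCODER)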
My autoencoder is as follows, where latent_dim = 2 and input_dim = 9:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout

class Autoencoder(tf.keras.Model):
    def __init__(self, latent_dim, input_dim):
        super(Autoencoder, self).__init__()
        self.latent_dim = latent_dim
        self.input_dim = input_dim
        self.dropout_factor = 0.5
        # Encoder: 9 -> 8 -> 4 -> 2
        self.encoder = Sequential([
            # Dense(16, activation='relu', input_shape=(self.input_dim,)),
            # Dropout(self.dropout_factor),
            Dense(8, activation='relu'),
            Dropout(self.dropout_factor),
            Dense(4, activation='relu'),
            Dropout(self.dropout_factor),
            Dense(self.latent_dim, activation='relu')
        ])
        # Decoder: 2 -> 4 -> 8 -> 9
        self.decoder = Sequential([
            Dense(4, activation='relu', input_shape=(self.latent_dim,)),
            Dropout(self.dropout_factor),
            Dense(8, activation='relu'),
            Dropout(self.dropout_factor),
            # Dense(16, activation='relu'),
            # Dropout(self.dropout_factor),
            Dense(self.input_dim, activation=None)
        ])

    def call(self, inputs):
        encoder_out = self.encoder(inputs)
        return self.decoder(encoder_out)
Model compilation

from sklearn.model_selection import train_test_split

# The input and the target are the same columns: the autoencoder reconstructs its input
ae_train_x, ae_test_x, ae_train_y, ae_test_y = train_test_split(
    scaled_df[COLUNMS_FOR_AUTOENCODER], scaled_df[COLUNMS_FOR_AUTOENCODER], test_size=0.33)

autoencoder = Autoencoder(latent_dim=2, input_dim=9)
autoencoder.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
Finally, training

ae_history = autoencoder.fit(ae_train_x, ae_train_y,
                             validation_data=(ae_test_x, ae_test_y), epochs=50)
Output of training
Epoch 1/50
22255/22255 [==============================] - 38s 2ms/step - loss: 0.3330 - accuracy: 0.9646 - val_loss: 0.2816 - val_accuracy: 0.9999
Epoch 2/50
22255/22255 [==============================] - 38s 2ms/step - loss: 0.2664 - accuracy: 0.9999 - val_loss: 0.2818 - val_accuracy: 0.9999
Epoch 3/50
22255/22255 [==============================] - 38s 2ms/step - loss: 0.2649 - accuracy: 0.9999 - val_loss: 0.2845 - val_accuracy: 0.9999
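Once the training behaves sensibly, my plan is to take the 2-D codes from the encoder sub-model and cluster/plot them, roughly like this (a sketch; the KMeans settings and the plot are just placeholders):

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# 2-D latent features from the trained encoder
codes = autoencoder.encoder.predict(ae_test_x)
# Placeholder clustering on the latent space
clusters = KMeans(n_clusters=5, n_init=10).fit_predict(codes)
plt.scatter(codes[:, 0], codes[:, 1], c=clusters, s=1)
plt.xlabel('latent dim 1')
plt.ylabel('latent dim 2')
plt.show()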
What could be the problem? I think the network is learning to simply pass the values through, but that should not be possible with the bottleneck and the dropout layers. I have also tried reducing the number of layers, but the result stays the same. How can I fix this?
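In case it is useful, this is roughly how I would inspect the reconstructions directly to test the "just passing the values through" suspicion (a sketch, not part of the code above):

import numpy as np

# Reconstruct a small sample and compare it to the inputs, feature by feature
sample = ae_test_x.to_numpy()[:1000]
recon = autoencoder.predict(sample)
per_feature_mse = np.mean((recon - sample) ** 2, axis=0)
print('Per-feature reconstruction MSE:', per_feature_mse)
# If the features are standardized to unit variance, always predicting the feature
# means would already give an MSE of about 1.0 per feature, which is a useful baseline
print('Overall MSE:', per_feature_mse.mean())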
Topic: sparsity, keras, autoencoder, deep-learning, accuracy
Category: Data Science