Is it good to have 100% accuracy on validation?

I'm still new to machine learning. I'm currently building an anomaly detector for flight data. The data is a multivariate time series that includes the timestamp, latitude, longitude, velocity and altitude of the aircraft. I split the data into train and test sets, using 80% for training, and I use a Keras LSTM autoencoder for the anomaly detection. Here's my code:

import numpy as np
import tensorflow as tf

def create_sequence(data, time_step):
    # Slide a window of length time_step over the rows of data
    Xs = []
    for i in range(len(data) - time_step):
        Xs.append(data[i:(i + time_step)])

    return np.array(Xs)

# pre-process to split the data

dfXscaled, scalerX = scaledf(df, normaltype=normalization)
num_train = int(df.shape[0]*ratio)

values_dataset = dfXscaled.values

train = values_dataset[:num_train, :]
test = values_dataset[num_train:, :]

# sequence input data [sample, time step, features]
train_input = create_sequence(train, time_step = time_step) 
test_input = create_sequence(test, time_step = time_step) 

train_time = index_time.index[:num_train]
test_time = index_time.index[num_train:]

# model 
model_arch = []

last_layer = num_layers - 1
for x in range(num_layers):
    if x == last_layer:
        model_arch.append(tf.keras.layers.LSTM(num_nodes, activation='relu', return_sequences=True, dropout = dropout))
        model_arch.append(tf.keras.layers.LSTM(num_nodes, activation='relu', input_shape=(time_step, 4), dropout = dropout))  
model = tf.keras.models.Sequential(model_arch)
opt= tf.keras.optimizers.SGD(learning_rate=learning_rate)
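# the start of the compile call is missing from the post; loss='mse' is an assumption
model.compile(optimizer=opt, loss='mse',
              metrics=[tf.keras.metrics.MeanAbsolutePercentageError(name='mape'), tf.keras.metrics.RootMeanSquaredError(name='rmse'), 'mae', 'accuracy'])
# an autoencoder is trained to reconstruct its own input, so the input is also the target
history = model.fit(train_input, train_input, epochs=epochs, batch_size=num_batch, validation_data=(test_input, test_input), verbose=2, shuffle=False)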

When I evaluate the model, it reports 100% accuracy.

Is it good to have 100% accuracy, or is my model overfitting the data?

Topic lstm keras anomaly-detection

Category Data Science

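100% accuracy usually indicates that something is wrong.

In your case, a few things do not seem right:

  1. It is easy to get ~100% accuracy in anomaly detection: anomalies are rare, so a model that always predicts the majority (normal) class is almost always correct while detecting nothing.
  2. Is this model really doing anomaly detection? Anomaly detection is a classification problem, but your metrics (MAPE, RootMeanSquaredError etc.) are regression metrics, and 'accuracy' is not meaningful when the target is a continuous reconstruction of the input.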
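
To illustrate the first point, here is a small sketch with made-up labels. The 1% anomaly rate and the variable names are assumptions chosen for the example, not values from the flight data.

```python
import numpy as np

# Hypothetical ground truth: 1 = anomaly, 0 = normal, with 1% anomalies.
n_samples = 10_000
y_true = np.zeros(n_samples, dtype=int)
y_true[:100] = 1

# A "model" that always predicts the majority class (normal).
y_pred = np.zeros(n_samples, dtype=int)

# Fraction of predictions that match the labels.
accuracy = np.mean(y_pred == y_true)
# Fraction of true anomalies that were actually flagged.
recall = np.sum((y_pred == 1) & (y_true == 1)) / np.sum(y_true == 1)

print(f"accuracy = {accuracy:.2%}")      # 99.00%
print(f"anomaly recall = {recall:.2%}")  # 0.00%
```

The accuracy looks excellent even though not a single anomaly is found, which is why precision, recall or F1 on the anomaly class are usually more informative than accuracy here.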

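On the second point, one common way to turn an autoencoder into an anomaly detector is to threshold its reconstruction error. The sketch below is only an illustration under stated assumptions: the reconstructions are random stand-ins for what `model.predict` would return, the helper names are hypothetical, and the 99th-percentile threshold is an arbitrary choice.

```python
import numpy as np

def reconstruction_error(x, x_hat):
    # Mean absolute error per window; arrays have shape (samples, time_step, features).
    return np.mean(np.abs(x - x_hat), axis=(1, 2))

def fit_threshold(train_errors, quantile=0.99):
    # Assumption: errors above this quantile of the training errors count as anomalous.
    return np.quantile(train_errors, quantile)

rng = np.random.default_rng(0)

# Stand-ins for the windowed inputs and for model.predict(...) on them.
train_input = rng.normal(size=(500, 10, 4))
train_recon = train_input + rng.normal(scale=0.05, size=train_input.shape)
test_input = rng.normal(size=(100, 10, 4))
test_recon = test_input + rng.normal(scale=0.05, size=test_input.shape)
test_recon[:5] += 1.0  # simulate five badly reconstructed windows

threshold = fit_threshold(reconstruction_error(train_input, train_recon))
test_errors = reconstruction_error(test_input, test_recon)
is_anomaly = test_errors > threshold  # boolean flag per test window

print("threshold:", threshold)
print("flagged windows:", np.flatnonzero(is_anomaly))
```

With flags like these, plus labels for at least some known anomalies, classification metrics such as precision and recall become meaningful, while MAPE and RMSE only describe how well the input is reconstructed.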
