How to calculate MAE and threshold in a multivariate time series
I'm trying to work out how to calculate the MAE for my time series, and then the threshold that decides which points in my test set are anomalies. I'm following this tutorial, which is based on a univariate time series and calculates the MAE as follows:
# Get train MAE loss.
x_train_pred = model.predict(x_train)
train_mae_loss = np.mean(np.abs(x_train_pred - x_train), axis=1)
My dataset is structured as follows:
device1  device2  device3  ....  device30
0.20     0.35     0.12     ....  0.56
1.20     2.10     5.75     ....  0.16
3.20     9.21     1.94     ....  5.12
5.20     4.32     0.42     ....  9.56
....     ....     ....     ....  ....
7.20     6.21     0.20     ....  -9.56
This means my threshold needs to account for 30 devices instead of one. I reshaped my training set to (3000, 10, 30), where 3000 is the number of samples, 10 is the number of time steps (I did this to prepare the data for the Conv1D layer), and 30 is the number of features (the devices). I tried to calculate the MAE in the same way as the tutorial, but it didn't work, since it gave me a very small threshold:
# Get train MAE loss.
x_train_pred = model.predict(X_train)
# Average the absolute error over time steps (axis 1) and devices (axis 2).
train_mae_loss = np.mean(np.abs(x_train_pred - X_train), axis=(1, 2))
# Get reconstruction loss threshold.
threshold = np.max(train_mae_loss)
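To make the shapes concrete, here is a minimal sketch on synthetic data. The arrays are random stand-ins (not my real data or the output of my model), and the per-device variant is only one possible alternative, shown for comparison. Averaging over both the time-step and device axes, as in the snippet above, gives one MAE value per sample. Averaging over the time-step axis only keeps one MAE value per sample and per device, which would allow one threshold per device.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions for illustration only).
X = rng.normal(size=(3000, 10, 30))               # plays the role of X_train
X_pred = X + rng.normal(scale=0.1, size=X.shape)  # plays the role of model.predict(X_train)

abs_err = np.abs(X_pred - X)                      # shape (3000, 10, 30)

# One MAE per sample: average over time steps and devices.
mae_per_sample = abs_err.mean(axis=(1, 2))        # shape (3000,)
threshold_global = mae_per_sample.max()           # single scalar threshold

# One MAE per sample and per device: average over time steps only.
mae_per_device = abs_err.mean(axis=1)             # shape (3000, 30)
threshold_per_device = mae_per_device.max(axis=0) # shape (30,), one threshold per device

print(mae_per_sample.shape, mae_per_device.shape, threshold_per_device.shape)
```

Because the per-sample MAE is an average across the 30 devices, its maximum can never exceed the largest per-device threshold, which may be part of why the single threshold looks small.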
Any clue on how I can calculate it?