Forecasting vs non-forecasting prediction for time series anomaly detection
My objective is to implement a uni-/multivariate online anomaly detection system.
After several days of research, I have collected many ways to achieve this (e.g. moving-average models such as ARIMA, state-space approaches such as Kalman filters, Holt-Winters double/triple exponential smoothing, CUSUM, one-class SVM, deep learning sliding-window autoencoder approaches, deep learning with autoregressive neural networks, etc.).
In general, anomaly detection on time series works by thresholding the deviation between an observed point (or group of points) of the original time series and its predicted counterpart.
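To illustrate what I mean by thresholding, here is a minimal sketch in Python (the Gaussian-residual assumption and the sensitivity parameter `k` are my own illustrative choices, not a standard):

```python
import numpy as np

def flag_anomalies(observed, predicted, k=3.0):
    # Flag points whose prediction error is more than k standard
    # deviations from the mean error. Assumes roughly Gaussian
    # residuals; k is an illustrative sensitivity parameter.
    residuals = np.asarray(observed) - np.asarray(predicted)
    return np.abs(residuals - residuals.mean()) > k * residuals.std()
```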
Regarding how the prediction itself is produced, it can happen:
in a forecasting way (as ARIMA does, or as you could also achieve with an LSTM deep learning model),
or in a non-forecasting way (e.g. denoising with an autoencoder, or analyzing fragments with the STL+ESD approach used by Twitter); minimal sketches of both styles follow below.
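To check that I understand the distinction, here is a rough sketch of each style. First, forecasting-style detection: predict the next point, then flag large one-step-ahead errors. I use an exponentially weighted moving average as the forecaster only to keep the sketch self-contained; `alpha`, `k`, and `warmup` are illustrative parameters, not standard names.

```python
import numpy as np

def online_ewma_detector(stream, alpha=0.3, k=3.0, warmup=30):
    # Forecast the next point with an exponentially weighted moving
    # average, then flag points whose one-step-ahead error exceeds
    # k standard deviations of the errors seen so far.
    level = None
    errors = []
    for t, x in enumerate(stream):
        if level is None:
            level = x                     # initialize the smoothed level
            continue
        forecast = level                  # one-step-ahead prediction
        err = x - forecast
        errors.append(err)
        if t > warmup and abs(err) > k * np.std(errors):
            yield t, x, forecast          # error beyond k sigma -> anomaly
        level = alpha * x + (1 - alpha) * level  # online update of the level
```

And second, non-forecasting (reconstruction-style) detection: embed the series into overlapping windows, reconstruct them from a low-rank basis, and score each window by its reconstruction error. PCA here is a linear stand-in for an autoencoder; `w` and `rank` are illustrative choices.

```python
import numpy as np

def window_reconstruction_errors(series, w=32, rank=3):
    # Build overlapping windows, project them onto the top principal
    # directions, and measure how badly each window is reconstructed.
    windows = np.lib.stride_tricks.sliding_window_view(series, w)
    centered = windows - windows.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank]                                # top principal directions
    recon = centered @ basis.T @ basis               # project and reconstruct
    return np.linalg.norm(centered - recon, axis=1)  # error per window
```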
What are the (dis)advantages of each approach, given the objective I mentioned?
Topic anomaly-detection online-learning time-series
Category Data Science