Anomaly Detection System

I need a sanity check. I want to create an anomaly detection system.

The logic which I am planning to use is the following:

  1. Find anomalies in the historical data using the Seasonal Hybrid Extreme Studentized Deviate (S-H-ESD) test.
  2. Binarise the result (1 for anomalies, 0 for normal points).
  3. Train several models (autoencoders, SVM, logistic regression, naive Bayes, lasso regression, etc.) on variables that are correlated with the labels, validate the models, and put them to use.
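To make steps 1 and 2 concrete, here is a minimal sketch of the labelling stage. It is not S-H-ESD itself: as a simplified stand-in it removes a per-phase seasonal median and flags residuals whose robust z-score (based on the median absolute deviation) exceeds a cutoff. The function name `label_anomalies`, the `period` and `threshold` parameters, and the toy series are all assumptions made for illustration.

```python
from statistics import median

def label_anomalies(series, period, threshold=3.0):
    """Return one 0/1 label per point: 1 = anomaly, 0 = normal.

    Simplified stand-in for S-H-ESD (hypothetical helper, not a library API).
    """
    # Seasonal component: median of all points that share the same phase.
    seasonal = [median(series[phase::period]) for phase in range(period)]
    residuals = [x - seasonal[i % period] for i, x in enumerate(series)]

    # Robust centre and scale of the residuals (MAD scaled to match a
    # standard deviation under normality).
    centre = median(residuals)
    mad = median(abs(r - centre) for r in residuals)
    if mad == 0:
        raise ValueError("residuals have zero spread; cannot score them")
    scale = 1.4826 * mad

    # Step 2: binarise the robust z-scores.
    return [1 if abs(r - centre) / scale > threshold else 0 for r in residuals]

# Toy series: a repeating pattern of length 4 plus small deterministic noise,
# with one injected spike at index 13.
pattern = [10, 20, 30, 20]
noise = [-2, 0, 2, -1, 1]
series = [pattern[i % 4] + noise[i % 5] for i in range(24)]
series[13] += 40

labels = label_anomalies(series, period=4)
anomaly_positions = [i for i, flag in enumerate(labels) if flag == 1]
```

The resulting `labels` list is what step 3 would use as the target variable.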

Does the binarisation process make sense?

Topic anomaly machine-learning-model anomaly-detection binary machine-learning

Category Data Science


Yes, your logic is sound.

There is only one flaw in your thinking: the variables you feed to the model need not be "correlated" with the output in the linear sense of the word. Don't discard a variable just because its linear correlation is low, because it could still explain your binary output through a non-linear relationship.
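A small illustration of that point, on made-up data: the label below is fully determined by `x`, yet the Pearson correlation between `x` and the label is essentially zero because the relationship is symmetric rather than linear. The `pearson` helper and the cutoff of 2 are assumptions chosen for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Feature values symmetric around zero; a point is "anomalous" when it is
# far from zero in either direction.
xs = [i / 10 for i in range(-30, 31)]
labels = [1 if abs(x) > 2 else 0 for x in xs]

r_linear = pearson(xs, labels)                  # close to zero
r_abs = pearson([abs(x) for x in xs], labels)   # clearly positive
```

Filtering features on `r_linear` would throw `x` away, even though a non-linear model (or the transformed feature `abs(x)`) separates the two classes perfectly.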

Binarising the output is a common way to detect anomalies, but you lose the ability to say "how much" of an outlier a point is. Make sure you won't need that information later.
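One way to hedge against that loss, sketched here under assumptions, is to keep the continuous score next to the binary label instead of discarding it. The scoring rule (a robust z-score based on the median absolute deviation), the cutoff of 3.0, and the sample values are all illustrative choices, not part of your method.

```python
from statistics import median

def robust_scores(values):
    """Distance of each value from the median, in MAD-based units."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        raise ValueError("values have zero spread; cannot score them")
    scale = 1.4826 * mad
    return [abs(v - centre) / scale for v in values]

values = [10, 11, 9, 10, 12, 10, 30, 11, 9, 80]
scores = robust_scores(values)

# The binary label says only *whether* a point is an outlier...
labels = [1 if s > 3.0 else 0 for s in scores]

# ...while the stored score still says *how much* of an outlier it is.
records = list(zip(values, labels, scores))
```

Here 30 and 80 both get the label 1, but only the scores show that 80 is far more extreme than 30.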
