Identify the parameter causing the anomaly in a multivariate dataset

I have a payment transaction dataset with a large number of predictor variables. I am trying to build an anomaly detection model and have evaluated several algorithms for it, including Isolation Forest, kNN, autoencoders, and One-class SVM.

I can identify whether a payment record is an anomaly, but I cannot pinpoint which predictor variable is causing the anomaly.

e.g.:

Account || Currency || Beneficiary || Amount || isAnomaly(target)

For an anomalous record, I want to identify whether the Currency variable or the Amount variable is causing the anomaly.

I have gone through the sources below, among many others, but couldn't find anything helpful.

Anomaly Detection in Database

Anomaly Detection in multiple parameters

I have recently started my journey in data science and would be glad if someone could help me with this issue.

Topic isolation-forest k-nn autoencoder anomaly-detection svm

Category Data Science


I understand you are looking for some interpretability.

But recall feature engineering: we usually remove the features that add little value, which means that all the remaining features contribute to the model's decision.

What you can do is trade some accuracy for interpretability:
Logistic Regression and Decision Tree models give you a clear picture of how the model arrives at its decision.
You may try those.
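
As a minimal sketch of the decision-tree suggestion, assuming scikit-learn is available and that `isAnomaly` labels exist (for instance, the flags produced by your detector): the toy data, the integer-encoded `CurrencyCode` column, and the helper `rules_for_record` are all hypothetical. The idea is to fit a shallow tree and read off the tests a single record passes on its way to a leaf.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: columns are [Amount, CurrencyCode]; y holds isAnomaly flags.
rng = np.random.default_rng(0)
feature_names = ["Amount", "CurrencyCode"]
normal = np.column_stack([rng.normal(100, 10, 200), rng.integers(0, 2, 200)])
odd_amount = np.column_stack([rng.normal(5000, 100, 10), rng.integers(0, 2, 10)])
odd_currency = np.column_stack([rng.normal(100, 10, 10), np.full(10, 7)])
X = np.vstack([normal, odd_amount, odd_currency])
y = np.r_[np.zeros(200), np.ones(20)].astype(int)

# A shallow tree keeps the learned rules short enough to read.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

def rules_for_record(model, record):
    """List the (feature, operator, threshold) tests on the record's path through the tree."""
    node_ids = model.decision_path(record.reshape(1, -1)).indices
    rules = []
    for node in node_ids:
        f = model.tree_.feature[node]
        if f < 0:  # a negative feature index marks a leaf node
            continue
        thr = model.tree_.threshold[node]
        op = "<=" if record[f] <= thr else ">"
        rules.append((feature_names[f], op, float(thr)))
    return rules

amount_case = np.array([5000.0, 0.0])   # unusual Amount, ordinary currency
currency_case = np.array([100.0, 7.0])  # ordinary Amount, unusual currency
print(rules_for_record(tree, amount_case))
print(rules_for_record(tree, currency_case))
```

The path can contain tests on more than one feature; the test whose outcome sends the record towards the anomalous leaf (here a `>` test on a high threshold) is the one to look at. This explains the surrogate tree, not necessarily the original detector.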


Usually it is not a single feature value that is responsible for the decision of an ML model. Neural networks, random forests, SVMs, etc. transform the input into a more useful feature space in which decisions are easier to make.

The drawback is that this makes the model harder for humans to interpret. Explainability of ML methods is a whole research field.

You could check out some explainability approaches. For autoencoders, for example, you could use Layer-wise Relevance Propagation (LRP): https://arxiv.org/pdf/1708.08296.pdf
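
LRP itself needs a trained network and is not shown here. A simpler and cruder alternative for autoencoder-style detectors is to look at the reconstruction error per feature instead of only its sum. The sketch below is an illustration under stated assumptions: a one-component PCA stands in for a trained autoencoder, and the numeric features (`Amount`, `Fee`, `Tax`, `Total`), the toy data, and the helper `per_feature_error` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
feature_names = ["Amount", "Fee", "Tax", "Total"]  # hypothetical numeric features

# Toy "normal" payments: every column follows one hidden transaction size, plus ~1% noise.
size = rng.normal(100.0, 20.0, size=(500, 1))
rates = np.array([1.0, 0.02, 0.10, 1.12])
X = size * rates * (1.0 + 0.01 * rng.normal(size=(500, 4)))

# Standardise so that errors are comparable across features.
mean, std = X.mean(axis=0), X.std(axis=0)
Z = (X - mean) / std

# PCA via SVD; keeping one component plays the role of a one-unit bottleneck.
_, _, vt = np.linalg.svd(Z, full_matrices=False)
components = vt[:1]  # shape (1, 4)

def per_feature_error(record):
    """Squared reconstruction error of one record, split by feature."""
    z = (record - mean) / std
    z_hat = (z @ components.T) @ components  # encode, then decode
    return (z - z_hat) ** 2

record = X[0].copy()
record[1] *= 10  # corrupt only the Fee value
errors = per_feature_error(record)
culprit = feature_names[int(np.argmax(errors))]
print(dict(zip(feature_names, errors.round(2))), culprit)
```

Part of the error also spreads to features correlated with the corrupted one, so the ranking is better treated as a hint than as proof of which variable caused the anomaly.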
