How to find an anomalous matrix among many?

Let's say we have a bunch of matrices that we know are non-anomalous. We now receive a new matrix and want to know whether it belongs to the group or is way off. Is there a way to do that?

I'm thinking of something similar to MAD (median absolute deviation) but for matrices.

Topic anomaly anomaly-detection statistics

Category Data Science


Off the top of my head:

If your matrix is purely numeric and has no NULL values, try normalization, standardization and PCA. Pairwise X-Y plots of the first one to three principal components will probably look similar for "normal" matrices and different for anomalous ones. The plots would give you a simple visual clue. This does not work for matrices with a small number of rows.
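As a rough illustration of the PCA idea, here is a minimal sketch using NumPy only. The function name `pca_scores`, the random data and the "nearly duplicated column" anomaly are all made up for the example; they are assumptions, not part of the question.

```python
import numpy as np

def pca_scores(matrix, n_components=2):
    """Standardize columns, then project rows onto the leading principal components."""
    x = np.asarray(matrix, dtype=float)
    std = x.std(axis=0)
    std[std == 0] = 1.0  # constant columns stay at zero instead of dividing by zero
    z = (x - x.mean(axis=0)) / std
    # SVD of the standardized data: the rows of vt are the principal directions
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = z @ vt[:n_components].T
    explained = s[:n_components] ** 2 / (s ** 2).sum()
    return scores, explained

rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 4))  # hypothetical "normal" matrix

# Hypothetical anomaly: column 1 is almost a copy of column 0
odd = normal.copy()
odd[:, 1] = odd[:, 0] + 0.01 * rng.normal(size=200)

scores_normal, ev_normal = pca_scores(normal)
scores_odd, ev_odd = pca_scores(odd)
print("explained variance (normal):", ev_normal)
print("explained variance (odd):   ", ev_odd)
```

The two columns of `scores_normal` and `scores_odd` are what you would feed into an X-Y scatter plot. In this toy case the explained-variance share of the first component already differs clearly between the two matrices, but real anomalies may not be this obvious.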

Also, maybe you can design your own feature through feature engineering?

Let's say you have a vague idea of what an anomaly comprises. You could add a new binary column such as rowsum > 100 (a very simple feature). Rowsum <= 100 means "normal matrix", rowsum > 100 means "anomalous matrix". Because you say so.

Optionally (with a rule this simple), you could stack all x of your m × n matrices into a single (x·m) × n matrix.
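The stacking and the rule-based column could be sketched like this. The matrix sizes, the random values and the names `stacked`, `label` and `matrix_id` are hypothetical; the threshold of 100 is just the example rule from above.

```python
import numpy as np

# Hypothetical data: x = 3 matrices, each with m = 4 rows and n = 2 columns
rng = np.random.default_rng(1)
matrices = [rng.uniform(0, 80, size=(4, 2)) for _ in range(3)]

# Stack all matrices vertically into one (x*m, n) matrix
stacked = np.vstack(matrices)

# Rule-based binary feature: 1 if the row sum exceeds 100, else 0
row_sums = stacked.sum(axis=1)
label = (row_sums > 100).astype(int)

# Append the label as the last column
table = np.column_stack([stacked, label])

# Keep track of which source matrix each row came from
matrix_id = np.repeat(np.arange(len(matrices)), [m.shape[0] for m in matrices])

print(table.shape)  # (12, 3)
```

Keeping `matrix_id` around is a design choice of this sketch: it lets you map row-level results back to the matrix they belong to.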

You now have a classification variable 0/1 as the last column. Then you could apply the machinery of classic supervised machine-learning algorithms to your matrices, and tune the accuracy of your model.
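To make that concrete, here is a small sketch with a nearest-centroid classifier written in plain NumPy. It is only a stand-in for whichever supervised algorithm you prefer, and the data, the train/test split and the `predict` helper are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stacked rows (two numeric columns) with the rule-based 0/1 label
features = rng.uniform(0, 100, size=(600, 2))
labels = (features.sum(axis=1) > 100).astype(int)

# Simple hold-out split
train_x, test_x = features[:400], features[400:]
train_y, test_y = labels[:400], labels[400:]

# Nearest-centroid classifier: one mean vector per class
classes = np.unique(train_y)
centroids = np.array([train_x[train_y == c].mean(axis=0) for c in classes])

def predict(rows):
    # Distance from every row to every centroid; pick the closest class
    dists = np.linalg.norm(rows[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

accuracy = (predict(test_x) == test_y).mean()
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the label here is derived directly from the row sum, the classifier mostly relearns the rule. With a real, independently assigned label the held-out accuracy becomes a meaningful number to tune against.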

It does not have to be a binary classification variable. Classification into "low", "medium" and "high" may also make sense.
