Forecasting anomalies in products

I have a question about the forecasting of anomalies. I would be very grateful if you could refer me to some papers that deal with this kind of problem or give me some hints to start with this problem.

I have some products that go to a bigger machine and some forces act on these products for about 5 minutes. After that, some of these products are not normal and they are anomalies. I want to predict an anomaly before it occurs.

The data is as follows: I have more than 100000 products and for each product, I have 120 features that are measured 1000 times (during 5 minutes).

Some of the data are labeled, either as OK or as anomaly.

Note: Most of the papers I have seen are about predicting anomalies in the machine itself. For example, sensors in a large machine are monitored until they show strange behavior, which means the machine is close to breaking and something needs to be changed to make it work normally again. All these examples are about machines, not about the products they process.

Topic prediction unsupervised-learning anomaly-detection semi-supervised-learning time-series

Category Data Science


Each of your observations is associated with a 120 features x 1000 measurements matrix. I would start with something simple and reduce that matrix to a 1-dimensional vector so that you can apply standard ML. For example, you could use the mean of feature 1, the standard deviation of feature 1, the last value observed for feature 1, and any other statistics that summarize each feature.

If you used just the three summary statistics I mentioned above, your feature set would be [mean_of_feature_1, st_dev_of_feature_1, last_obs_of_feature_1, mean_of_feature_2, ...], which is a 360x1 vector (3 statistics x 120 features). You would probably want to remove some (many?) of these features.
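
The flattening step could be sketched roughly as follows. This is only an illustration using the Python standard library; the function name summarize_product and the toy data are made up for the example, and the choice of sample standard deviation is an assumption (in practice a vectorized library would be faster on 100000 products).

```python
import statistics

def summarize_product(series_by_feature):
    """Flatten one product's measurements into a 1-D feature vector.

    series_by_feature: one list of readings per feature
    (for the data in the question: 120 lists of 1000 floats).
    Returns [mean_1, std_1, last_1, mean_2, std_2, last_2, ...].
    """
    vector = []
    for series in series_by_feature:
        vector.append(statistics.fmean(series))   # mean of the feature
        vector.append(statistics.stdev(series))   # sample std, needs >= 2 readings
        vector.append(series[-1])                 # last value observed
    return vector

# Toy product (hypothetical): 2 features x 4 readings each
toy_product = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 10.0, 10.0, 10.0],
]
features = summarize_product(toy_product)
print(len(features))  # 3 statistics per feature
```

With 120 features per product the same function returns a vector of length 360, which can be stacked across products into an ordinary table for a standard classifier.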

As for the data set being unbalanced: 5% positive is not that low. I would first see what results I get with the data as it is and then, depending on those results, apply oversampling or undersampling if the imbalance turns out to be a problem.
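
If the imbalance does turn out to hurt, random oversampling of the anomaly class is one simple option. The sketch below is a minimal standard-library illustration; the function name random_oversample, the label strings "OK" and "anomaly", and the toy rows are assumptions made for the example. It should be applied to the training split only, so that duplicated rows never leak into the evaluation data.

```python
import random

def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until both classes are the same size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Nothing to do if the minority class is empty or already large enough
    if not minority or len(minority) >= len(majority):
        return list(rows), list(labels)
    # Sample minority indices with replacement to close the gap
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    indices = list(range(len(labels))) + extra
    rng.shuffle(indices)
    return [rows[i] for i in indices], [labels[i] for i in indices]

# Toy training set (hypothetical): 19 OK products and 1 anomaly, i.e. 5% positive
train_rows = [[float(i)] for i in range(20)]
train_labels = ["OK"] * 19 + ["anomaly"]
balanced_rows, balanced_labels = random_oversample(train_rows, train_labels, "anomaly")
print(balanced_labels.count("OK"), balanced_labels.count("anomaly"))
```

Undersampling is the mirror image: drop randomly chosen OK rows instead of duplicating anomalies, which is cheaper but throws data away.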
