Determining batch size, sending time, and memory when sending data from Scala to the ML section

I have a time series (sampling time: 66.66 microseconds; number of samples per sampling time = 151) in which I would like to detect anomalies. The inputs are produced by a Scala customer message bus. How can I determine the batch size, the sending time, and the memory, either in the Scala customer or in the ML/AI part?

Topic pipelines memory scala time-series machine-learning

Category Data Science


It seems very difficult to give you a correct batch size, sending time, and memory figure, because in very high-frequency problems the answer depends on every small operation in the algorithm you use to detect anomalies.
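What can be bounded before choosing an algorithm is the raw data rate. The sketch below is only back-of-the-envelope arithmetic on the numbers in the question, and it rests on two assumptions that are not stated there: that 151 values arrive at each sampling instant, and that each value is stored as an 8-byte float. The names `batch_budget` and `sustained_rate_mb_per_s` are made up for illustration.

```python
SAMPLING_PERIOD_S = 66.66e-6  # sampling time given in the question
VALUES_PER_TICK = 151         # assumption: 151 values per sampling instant
BYTES_PER_VALUE = 8           # assumption: values are 64-bit floats


def batch_budget(ticks_per_batch: int) -> dict:
    """Return how long one batch takes to fill and how large its payload is."""
    fill_time_s = ticks_per_batch * SAMPLING_PERIOD_S
    payload_bytes = ticks_per_batch * VALUES_PER_TICK * BYTES_PER_VALUE
    return {
        "ticks": ticks_per_batch,
        "fill_time_ms": fill_time_s * 1e3,
        "payload_kib": payload_bytes / 1024,
    }


def sustained_rate_mb_per_s() -> float:
    """Raw data rate the pipeline must sustain, in MB/s (10^6 bytes)."""
    return VALUES_PER_TICK * BYTES_PER_VALUE / SAMPLING_PERIOD_S / 1e6


if __name__ == "__main__":
    for ticks in (150, 1500, 15000):
        print(batch_budget(ticks))
    print("sustained MB/s:", round(sustained_rate_mb_per_s(), 2))
```

Under these assumptions the stream is roughly 18 MB/s, and a batch of 1500 sampling instants takes about 100 ms to fill. A larger batch means more memory per message and more delay before an anomaly can be reported, so the batch size is bounded from above by the latency you can tolerate and from below by the number of values the detector needs.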

In addition, I don't know whether an anomaly can be detected from a single value, from a set of several values, or both, nor what the minimum range of values needed to detect an anomaly is. But you can easily define those limits by starting with quite simple anomaly detection algorithms that don't require a lot of computation, such as autoregressive models, ideally autoregressive integrated moving average (ARIMA) models.

Those algorithms would help you define the right parameters and the best batch size based on their prediction quality.
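As a rough illustration of that idea, here is a minimal sketch that uses a plain AR(1) model as a simpler stand-in for ARIMA. It fits the model on one batch and flags points in the next batch whose one-step-ahead prediction error is large. The function names, the 4-sigma threshold, and the synthetic test signal are all assumptions made for this example, not something taken from your data.

```python
import random
import statistics


def fit_ar1(window):
    """Least-squares fit of x[t] = c + phi * x[t-1] on one window."""
    prev, nxt = window[:-1], window[1:]
    mean_prev = statistics.fmean(prev)
    mean_next = statistics.fmean(nxt)
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        return mean_next, 0.0  # constant window: predict the mean
    cov = sum((p - mean_prev) * (n - mean_next) for p, n in zip(prev, nxt))
    phi = cov / var_prev
    return mean_next - phi * mean_prev, phi


def detect_anomalies(series, batch_size, threshold=4.0):
    """Train on each batch, then flag indices in the following batch whose
    one-step-ahead error exceeds threshold * residual std of the training batch."""
    flagged = []
    for start in range(0, len(series) - 2 * batch_size + 1, batch_size):
        train = series[start:start + batch_size]
        c, phi = fit_ar1(train)
        residuals = [n - (c + phi * p) for p, n in zip(train[:-1], train[1:])]
        sigma = statistics.pstdev(residuals) or 1e-12
        for t in range(start + batch_size, start + 2 * batch_size):
            error = series[t] - (c + phi * series[t - 1])
            if abs(error) > threshold * sigma:
                flagged.append(t)
    return flagged


def synthetic_series(seed, n, spike_at, spike_size):
    """Hypothetical test signal: an AR(1) process with one injected spike."""
    rng = random.Random(seed)
    x, series = 0.0, []
    for _ in range(n):
        x = 0.8 * x + rng.gauss(0.0, 1.0)
        series.append(x)
    series[spike_at] += spike_size
    return series


if __name__ == "__main__":
    data = synthetic_series(seed=0, n=2000, spike_at=1234, spike_size=25.0)
    for size in (50, 200, 500):
        print(size, detect_anomalies(data, batch_size=size))
```

Running the detector with several candidate batch sizes on a recording where you already know the anomalies, and comparing what each size catches or misses, is one way to let prediction quality drive the choice of batch size. Note that a single spike usually also disturbs the prediction for the point right after it, so neighbouring indices may be flagged together.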
