Need help on Time Series ARIMA Model

I'm working on forecasting daily volumes and have used a time series model to check the data for stationarity. However, I'm struggling to forecast with 90% accuracy. Right now the variation is extremely high and I can't bring it down.

I've applied a log transform to my data. The folder linked below contains the ipynb and csv files:

https://drive.google.com/drive/folders/1QUJkTucLPIf2vjo2mRmoBU6be083dYpQ?usp=sharing

Any help would be highly appreciated.

Thanks, Rahul

Topic machine-learning-model time-series pandas predictive-modeling

Category Data Science


Before anything else, plot the data.

You'll notice at least three processes: one around 20-25, one around 60-90, and one around 250-450. A closer look shows that the first corresponds to Saturdays, the second to Sundays, and the last to the remaining (working) days.
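A quick way to check this split is to group the series by day type and compare the summary statistics. This is only a sketch: the data below is synthetic (the real CSV is not reproduced here), and the series name `volume` and the helper `day_type` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real CSV; values and the name "volume" are assumptions.
rng = np.random.default_rng(0)
dates = pd.date_range("2016-01-04", periods=28, freq="D")  # four full weeks, starting on a Monday
base = np.where(dates.dayofweek == 5, 22, np.where(dates.dayofweek == 6, 75, 350))
volume = pd.Series(base + rng.normal(0, 5, len(dates)), index=dates, name="volume")

def day_type(ts):
    """Label a timestamp as Saturday, Sunday or Weekday (hypothetical helper)."""
    if ts.dayofweek == 5:
        return "Saturday"
    if ts.dayofweek == 6:
        return "Sunday"
    return "Weekday"

# Passing a function to groupby applies it to each index value.
summary = volume.groupby(day_type).agg(["count", "min", "median", "max"])
print(summary)
```

If the three groups barely overlap in their min/max ranges, treating them as separate processes is probably reasonable.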

Notice also that the Sunday series behaves differently from 2016-12-26 onwards. Take a close look at Christmas and bank holidays to decide whether they behave like Saturdays, Sundays or ordinary days.

Also notice a set of outliers between 2017-12-21 and 2018-01-02. Remove them before fitting. There is also overproduction at the end of November 2016; remove those points as well.
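Dropping those periods could look roughly like this. The first date range is the one given above; the answer gives no exact dates for "end of November 2016", so the second range is a guess you would need to adjust after looking at the plot. The constant series is a placeholder for the real data.

```python
import numpy as np
import pandas as pd

# Placeholder series; replace with the real daily volumes.
dates = pd.date_range("2016-11-01", "2018-01-31", freq="D")
volume = pd.Series(300.0, index=dates, name="volume")

# Periods to exclude (inclusive bounds).
exclude = [
    ("2017-12-21", "2018-01-02"),  # outliers named in the answer
    ("2016-11-24", "2016-11-30"),  # ASSUMED bounds for "end of November 2016"
]

mask = np.zeros(len(volume), dtype=bool)
for start, end in exclude:
    mask |= (volume.index >= pd.Timestamp(start)) & (volume.index <= pd.Timestamp(end))

clean = volume[~mask]
print(len(volume) - len(clean), "days removed")
```

Note that this leaves gaps in the daily index; if a later step needs a regular daily grid, setting the values to NaN instead of dropping the rows may be more convenient.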

Then compute a 7-day moving average. As you will see, it is roughly linear, except for an underproduction in January 2017, for which you may have an external explanation (it follows the change at the end of 2016).
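A 7-day window averages over exactly one Saturday, one Sunday and five working days, so the weekly pattern cancels out and only the trend remains. A minimal sketch with pandas, on synthetic data with an assumed linear trend:

```python
import numpy as np
import pandas as pd

# Synthetic data: a fixed weekly pattern plus a linear trend (all values are assumptions).
dates = pd.date_range("2016-01-04", periods=70, freq="D")
weekly = np.where(dates.dayofweek == 5, 22.0, np.where(dates.dayofweek == 6, 75.0, 350.0))
trend = 0.5 * np.arange(len(dates))
volume = pd.Series(weekly + trend, index=dates, name="volume")

# Centered 7-day moving average; the first and last three days have no full window.
ma7 = volume.rolling(window=7, center=True).mean()
print(ma7.dropna().head())
```

With `center=True` the average is aligned to the middle of each week, which keeps the smoothed curve from lagging behind the raw data.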

On top of this linear fit, you can model the weekly seasonality and reach 70-80% accuracy.
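One possible way to combine the two pieces is sketched below: fit a straight line to the 7-day moving average, estimate one multiplicative factor per day of the week, and multiply them to forecast. The data is synthetic and much cleaner than the real series, so the score it prints says nothing about what is achievable on the actual data. "Accuracy" is taken here to mean 1 minus the mean absolute percentage error, which is an assumption, since the thread does not define it.

```python
import numpy as np
import pandas as pd

# Synthetic data: linear trend x weekly factor x noise (all numbers are assumptions).
rng = np.random.default_rng(1)
dates = pd.date_range("2016-01-04", periods=140, freq="D")
t = np.arange(len(dates))
factor = np.where(dates.dayofweek == 5, 0.08, np.where(dates.dayofweek == 6, 0.25, 1.3))
volume = pd.Series((250 + 0.3 * t) * factor * rng.normal(1.0, 0.05, len(t)), index=dates)

# Hold out the last four weeks for evaluation.
train, test = volume.iloc[:-28], volume.iloc[-28:]
t_train, t_test = t[:-28], t[-28:]

# 1. Linear trend fitted to the centered 7-day moving average.
ma7 = train.rolling(7, center=True).mean()
ok = ma7.notna().to_numpy()
slope, intercept = np.polyfit(t_train[ok], ma7.to_numpy()[ok], 1)

# 2. Weekly seasonality: mean ratio of actual value to trend, per day of week.
ratio = train / (intercept + slope * t_train)
season = ratio.groupby(train.index.dayofweek).mean()

# 3. Forecast = extrapolated trend x seasonal factor of each forecast day.
forecast = (intercept + slope * t_test) * season.loc[test.index.dayofweek].to_numpy()

# Accuracy defined as 1 - MAPE (an assumption, see above).
mape = np.mean(np.abs((test.to_numpy() - forecast) / test.to_numpy()))
accuracy = 1 - mape
print(f"accuracy: {accuracy:.1%}")
```

Because the Saturday and Sunday levels are so far from the working-day level, a multiplicative factor per weekday tends to be easier to fit than an additive offset, though either choice would need checking against the real data.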

But forget about 90% accuracy. There is too much fluctuation in your data for that.
