Fitting an arimax model on out of sample dataset

I have built an ARIMAX model in which sales over time is the response variable and price is one of the external regressors. I used the code below to build a simple ARIMAX model. I have 24 data points in total and kept only points 1 to 20 in the training dataset.

library(stats)
fit=arima(window(tssales, end=20), order = c(0,1,1), xreg = window(tsprice, end=20))
summary(fit)
fcast=forecast(fit, h=5, xreg = window(tsprice, end=20))
plot(fcast)

Now, when I try to apply the model fitted on the training dataset to the out-of-sample dataset (the last 4 data points), I use the code below:

library(stats)
out_of_sample=arima(window(tssales, start=21), xreg = window(tsprice, start=21), model=fit)

I am getting the following error

Error in arima(window(tssales, start = 21), xreg = window(tsprice, start = 21), : unused argument (model = fit)



Below is an example of how you could fit an ARIMAX model to your data in R using the auto.arima() function (the workflow is the same if you want to use arima()).

If you use the forecast package, the auto.arima() function will fit the "best ARIMA model according to either AIC, AICc or BIC value" to your data.

Assume your data (my_data) has 300 observations.

Let's see how accurate the model is by holding out the last 50 observations:

library(forecast)

train <- window(my_data,end = 250)

test <- window(my_data,start = 251)

Since this is only an example to show an ARIMAX model, I will generate monthly dummy variables to use as covariates. To generate them, we can use the nnfor package:

library(nnfor)

dta <- seasdummy(350,12)

colnames(dta) <- c("Jan", "Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov")

We generate dummies for 350 months: 300 to match the data, plus 50 more that we will use later to forecast the next 50 months (the "out of sample dataset").

train_xr <- window(dta,end=250)

train_new_xr <- window(dta,start=251,end=300)

Let's fit the model on the training data:

h1=nrow(train_new_xr)

fit <- auto.arima(train,xreg = train_xr)

fc <- forecast(fit,h=h1,xreg= train_new_xr) 

autoplot(fc$mean)+autolayer(test) # to see how good the forecast was, or use the accuracy() function

Now the out-of-sample forecast:

xreg <- window(dta, end=300)
new_xreg <- window(dta,start=301)

h= nrow(new_xreg)

fit1 <- auto.arima(my_data,xreg=xreg)

fc1 <- forecast(fit1,h=h,xreg=new_xreg)

autoplot(fc1)

When I use the stepwise = FALSE, approximation = FALSE arguments, the forecast gets more accurate, but the auto.arima() function gets very slow. You could use it as: auto.arima(train,stepwise = FALSE, approximation = FALSE,xreg = train_xr).


The arima function from the stats package does not take an argument called model, which is why you are receiving an error. Here is the function signature:

arima(x, order = c(0L, 0L, 0L),
      seasonal = list(order = c(0L, 0L, 0L), period = NA),
      xreg = NULL, include.mean = TRUE,
      transform.pars = TRUE,
      fixed = NULL, init = NULL,
      method = c("CSS-ML", "ML", "CSS"), n.cond,
      SSinit = c("Gardner1980", "Rossignol2011"),
      optim.method = "BFGS",
      optim.control = list(), kappa = 1e6)

The object returned by arima does contain a component called model, but that is part of the output, not an input argument. Read the related documentation for more details.
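If the goal is to apply the already-estimated coefficients to new data, one option is the Arima() function from the forecast package (capital A). Unlike stats::arima(), it accepts a model argument and fits the same model to new data without re-estimating any parameters. Here is a minimal sketch, assuming simulated sales and price series; the data, the series lengths and all variable names are made up for illustration:

```r
library(forecast)

# Simulated stand-ins for the sales and price series (assumption, not real data)
set.seed(1)
n <- 120
price <- ts(10 + cumsum(rnorm(n, sd = 0.2)))
sales <- ts(200 - 5 * price + cumsum(rnorm(n)))

# Split both series at the same time index
train_y <- window(sales, end = 100)
train_x <- window(price, end = 100)
test_y  <- window(sales, start = 101)
test_x  <- window(price, start = 101)

# Estimate the coefficients on the training data only
fit <- Arima(train_y, order = c(0, 1, 1), xreg = train_x)

# Apply the same model to the hold-out data; the coefficients are kept fixed
refit <- Arima(test_y, model = fit, xreg = test_x)

# One-step-ahead error measures on the hold-out data
accuracy(refit)
```

The coefficients of refit should match those of fit, because nothing is re-estimated; only the one-step-ahead fitted values and residuals are recomputed for the new data.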


I think your workflow is perhaps a little confused. You fit a model first on 20 data points, which is fine (more data would be nice!). You make some forecasts and plot it, which is also good - you can see if the model learned much and if there is perhaps some systematic error, e.g. simply predicting the previous time-step and not a more intelligent trend.

The final step, however, should be to make predictions once again, this time for your hold-out data: the last 4 data points. You should not fit another model to the hold-out data! Just predict what your model would say for that data, which it has never seen before, and compare the predictions with the actual values.

The reason we work like this is so that we can assess the model's performance independently of the data that was used to train it. We want to know what will happen in the future, when you get your 25th data point.
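As a rough illustration of this workflow using only the stats package, the sketch below fits on points 1 to 20 and then predicts points 21 to 24 by passing the hold-out prices as newxreg. The series are simulated and the variable names are assumptions made for the example:

```r
# Simulated stand-ins for the 24 sales and price observations (assumption)
set.seed(42)
n <- 24
price <- ts(10 + rnorm(n))
sales <- ts(100 - 3 * price + cumsum(rnorm(n)))

# Training data: points 1 to 20
train_y <- window(sales, end = 20)
train_x <- window(price, end = 20)

# Hold-out data: points 21 to 24
test_y <- window(sales, start = 21)
test_x <- window(price, start = 21)

# Fit once, on the training data only
fit <- arima(train_y, order = c(0, 1, 1), xreg = train_x)

# Predict the hold-out period; newxreg must hold the regressor values
# for the periods being predicted, not the training values
fc <- predict(fit, n.ahead = 4, newxreg = test_x)

# Compare predictions with the actual hold-out values
errors <- test_y - fc$pred
rmse <- sqrt(mean(errors^2))
rmse
```

Note that n.ahead matches the number of rows supplied in newxreg, and that no model is ever fitted to the 4 hold-out points.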


Have a look at this nice tutorial, which explains the main concepts of ARIMA and has a working example. Here is a very similar tutorial, but it is a video.
