Which dataset for multivariate time series forecasting

I'm trying to forecast real estate prices. It's not a prediction but a forecast, like the price of an apartment in 2023 or 2024. I'm asking what my dataset should look like. Can I use a dataset from 2018 to 2021 with 13 columns? You can find the dataset here: https://www.kaggle.com/datasets/mrdaniilak/russia-real-estate-20182021 Columns: Date, area, kitchen_are, nb_rooms. Please note that every row is a new house, independent of the others. I obtained this dataset by scraping an ads website …
Category: Data Science

Time series forecast for every day into the distant future

I have time-series data for every single day of the last 5 years, with seasonal variation and a generally increasing trend. I am trying to predict every single day for 4-5 years into the future. Approaches I have used so far: LSTM, GRU (but they are extremely prone to overfitting, and I am trying successive predictions, which results in massive error accumulation and often goes flat over time for smaller lookbacks and …
Category: Data Science

Python: SARIMAX model fits too slowly

I have time series data with the date and temperature records of a city. These are my observations from the time series analysis: Plotting date vs. temperature shows seasonality. The adfuller test indicates that the data is already stationary, so d=0. Partial autocorrelation and autocorrelation with the first seasonal difference gave p=2 and q=10 respectively. Code to train the model: model = sm.tsa.statespace.SARIMAX(df['temperature'], order=(1, 1, 1), seasonal_order=(2, 0, 10, 12)); results = model.fit() This fit function runs indefinitely and does not reach …
Category: Data Science

Anomaly detection and root cause analysis

ARIMA is widely used for anomaly detection on time-series data, e.g. stock price prediction. ARIMA assumes that the future value of a variable (stock price in our case) depends on its previous values. When we do root cause analysis of a detected anomaly, there can be numerous reasons, e.g. the Russia-Ukraine war. I have 2 questions: Isn't the assumption of ARIMA invalidated because the stock price also depends on other factors such as war? Which models can I use to do …
Category: Data Science

How to revert np.log(data) and data.diff()?

I have used np.log(data) and then applied data.diff() to transform my data for a time series model. I have the predictions. How do I convert them back to the original scale? Here is an example for your reference:

--------------------------------------------------------------------
| sales      | np.log(sales) | (np.log(sales)).diff() | predictions |
--------------------------------------------------------------------
| 166.594019 | 5.115560      | -0.045918              | -0.045918   |
--------------------------------------------------------------------

Note: I have provided only one example, which is from index 2, as the first value after data.diff() will be null. Hence the prediction at …
Category: Data Science
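
One way to undo the two transforms in the question above is to reverse them in the opposite order: add the predicted log-differences cumulatively onto the last known log value, then exponentiate. The following is a minimal sketch in plain Python, assuming the predictions are consecutive one-step values of the differenced log series. The function name invert_log_diff and the sample values are hypothetical and not taken from the original post.

```python
import math


def invert_log_diff(last_observed, predicted_log_diffs):
    """Map predicted log-differences back to the original scale.

    last_observed: the last known value on the ORIGINAL scale (must be > 0).
    predicted_log_diffs: consecutive predictions of log(y_t) - log(y_{t-1}).
    """
    log_level = math.log(last_observed)
    restored = []
    for diff in predicted_log_diffs:
        log_level += diff                      # undo .diff() by cumulative summation
        restored.append(math.exp(log_level))   # undo np.log() with the exponential
    return restored


# Hypothetical sales values, used only to check the round trip.
sales = [174.42, 166.594019, 170.0]
log_diffs = [math.log(b) - math.log(a) for a, b in zip(sales, sales[1:])]

# Starting from the first observation, the later values are recovered.
restored = invert_log_diff(sales[0], log_diffs)
```

With NumPy and pandas, the same idea corresponds roughly to np.exp(np.log(last_value) + predictions.cumsum()), where last_value is the last observation before the first prediction.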

Multiple seasonality with ARIMA?

I know that ARIMA can't detect multiple seasonality, but it is possible to use Fourier terms to add a second seasonality. I need to forecast gas consumption that has daily, weekly (weekdays vs. weekend), and yearly seasonality. Does it make sense to apply the STL decomposition by LOESS three times? The reason is that I applied the Fourier method and got bad results, but I don't know if it is only because I applied it wrong. I'm interested in the …
Category: Data Science
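
For the Fourier approach mentioned in the question above, each seasonal period contributes pairs of sine and cosine columns that can be supplied to the model as external regressors. The following is a minimal sketch in plain Python. It assumes hourly observations (so a day is 24 steps and a week is 168 steps), and the function name fourier_terms and the numbers of harmonics are hypothetical choices, not values from the original post.

```python
import math


def fourier_terms(n_obs, period, n_harmonics):
    """Build sin/cos regressors for one seasonal period.

    Returns n_obs rows, each holding [sin_1, cos_1, ..., sin_K, cos_K],
    where harmonic k uses the angle 2 * pi * k * t / period.
    """
    rows = []
    for t in range(n_obs):
        row = []
        for k in range(1, n_harmonics + 1):
            angle = 2.0 * math.pi * k * t / period
            row.extend([math.sin(angle), math.cos(angle)])
        rows.append(row)
    return rows


# Assumption: hourly data covering two weeks.
n_obs = 24 * 14
daily = fourier_terms(n_obs, period=24, n_harmonics=2)    # 4 columns
weekly = fourier_terms(n_obs, period=168, n_harmonics=3)  # 6 columns

# Concatenate the columns row by row into one regressor matrix.
exog = [d + w for d, w in zip(daily, weekly)]
```

A matrix like exog could then be passed as external regressors, for example through the exog argument of statsmodels' SARIMAX, with matching rows generated for the forecast horizon. How many harmonics to use per period is a tuning choice, and too few or too many can both give poor results.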

ML model to forecast time series data

This question has three sub-parts, each of which probably doesn't require a long answer. I hope that is okay. I'm trying to understand time series prediction using ML. I have the target variable $y_t$, and suppose two other variables $x_t,z_t$ (e.g. if $y_t$ were the demand for an item, $x_t$ could be the type of item or the price of the item, etc.). Also, let's say I'm using a random forest model because I've read it generally does okay out of the box. …
Category: Data Science

Why ARMA models need stationarity

I am trying to find out why ARMA models need stationarity to work. I have simulated some nonstationary processes, and the estimated parameters (point estimates) seem to be very similar to the actual ones. So, what are the main problems with fitting an ARMA model to a nonstationary time series? Is it poor forecasts, biased estimates, or prediction intervals? I guess inference is a problem, but I am mainly concerned with the forecasting part. Thanks in advance
Category: Data Science

Regression AR(p) models and stationarity

I am starting to learn time series models besides the exponential smoothing ones, and I have a few questions that I am struggling with. If I have a stationary time series which follows an AR(1) process, should I get the same results using either an AR(1) model or a linear regression with an explanatory variable equal to a lagged version of the time series (lag 1 in this case)? Regarding p-values, would they likely be similar (assuming AR model might …
Category: Data Science
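
The comparison asked about above can be explored by simulation. The following is a minimal sketch in plain Python that simulates an AR(1) process and estimates its coefficient by ordinary least squares on the lagged series (conditional least squares). The function names, the coefficient 0.6, the sample size, and the seed are all hypothetical choices. A dedicated AR routine that uses exact maximum likelihood may give slightly different numbers in finite samples, since it treats the first observation differently.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal noise."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


def ols_on_lag(y):
    """Regress y_t on y_{t-1}; return (intercept, slope)."""
    x = y[:-1]       # lagged values
    target = y[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_t = sum(target) / n
    cov = sum((a - mean_x) * (b - mean_t) for a, b in zip(x, target))
    var = sum((a - mean_x) ** 2 for a in x)
    slope = cov / var
    intercept = mean_t - slope * mean_x
    return intercept, slope


series = simulate_ar1(phi=0.6, n=5000)
intercept, phi_hat = ols_on_lag(series)
# phi_hat should land near the true coefficient 0.6 for a sample this long.
```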

Applicability of ARIMA model on non stationary data

I have a time series dataset that does not have the stationary property. The dataset is monotonically increasing or sometimes showing no change over periods of time. Can I apply the ARIMA model to such datasets which do not have stationary properties? And if yes, what are the methods to process the data before feeding it into the model?
Category: Data Science
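
The usual preprocessing step for a trending series like the one described above is differencing, which is what the d term in ARIMA(p, d, q) applies internally. The following is a minimal sketch in plain Python. The function names and the sample values are hypothetical, and whether a single difference is enough depends on the data.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def undifference(first_values, diffs, lag=1):
    """Rebuild the original levels from the first `lag` values and the differences."""
    restored = list(first_values)
    for d in diffs:
        restored.append(restored[-lag] + d)
    return restored


# Hypothetical series that only increases or stays flat.
levels = [10, 10, 12, 15, 15, 15, 19, 24]

diffs = difference(levels)                    # the trend in levels is removed
rebuilt = undifference(levels[:1], diffs)     # forecasts of diffs map back the same way
```

Forecasts made on the differenced scale are converted back to levels in the same way as rebuilt, starting from the last observed value.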

How to handle multiple time series data for 10K+ items

There are 50 shops and each shop has 30000 items. The goal is to forecast the sales of each item per shop. The value to forecast is item_cnt_day; I don't see this as a multivariate time series. Only the shop and item ID are needed to forecast the next month's data. The question is: do we need to treat this as a multiple time series problem and build 30000 ARMA, ARIMA, SARIMA etc. models for each of the shops and items? So for 50 …
Category: Data Science

How to train ARIMA model on multiple similar time series?

I have 'business potential' values for 4000 cities (with generic names to ensure anonymity) over 72 months. The data for an individual city covers just 72 months, so I clustered the entire dataset with KMedoids so that the cities are grouped into localities with similar time series patterns. Now I would like to train a single ARIMA model on all cities of a cluster and forecast the next 15 months for all the cities in that cluster. How do I …
Category: Data Science

Does this ARIMA model take seasonality into account?

I'm writing a tutorial on traditional time series forecasting models. One key issue with ARIMA models is that they cannot model seasonal data. So, I wanted to get some seasonal data and show that the model cannot handle it. However, it seems to model the seasonality quite easily: it peaks every 4 quarters, as in the original data. What is going on? Code to reproduce the plot: from statsmodels.datasets import get_rdataset; from statsmodels.tsa.arima.model import ARIMA; import matplotlib.pyplot as plt …
Category: Data Science

How can we make forecasts from stationary data

I'm confused about the concept of stationarity. Most definitions require the mean and variance to be constant 'over any interval'. This statement confuses me: if any interval should have the same mean and variance, then I'll select a time strip as narrow as possible, say 1 day where the graph is at a high and then another 1 day where the graph is at a low, and the mean is obviously different. Say I take means over the green and …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.