Store's unseen items sales forecasting

I am working on a sales forecasting problem. I can give the algorithm data about which items were sold and which were not. How can I also give it information about items that are not present in the store? Is there a way to encode this information in the data, or is there an algorithm that accepts this kind of information? Currently, I am using neural networks and random forests to forecast sales.
Category: Data Science

Which dataset for multivariate time series forecasting

I'm trying to forecast real estate prices. This is not a prediction but a forecast, such as the price of an apartment in 2023 or 2024. How should my dataset be structured? Can I use a dataset covering 2018 to 2021 with 13 columns, such as Date, area, kitchen_are, nb_rooms? You can find the dataset here: https://www.kaggle.com/datasets/mrdaniilak/russia-real-estate-20182021 Please note that every row is a new house, independent of the others. I built this dataset by scraping an ads website …
Category: Data Science

Fitting an arimax model on out of sample dataset

I have built an ARIMAX model with sales over time as the response variable and price as one of the external variables. I used the code below to build a simple ARIMAX model. I had data points 1 to 24 and kept only points 1 to 20 in the training dataset.

    library(stats)
    fit=arima(window(tssales, end=20), order = c(0,1,1), xreg = window(tsprice, end=20))
    summary(fit)
    fcast=forecast(fit, h=5, xreg = window(tsprice, end=20))
    plot(fcast)

Now when I try to …
Category: Data Science

Forecasting out of sample with Fourier terms as regressors

I'm trying to create a multivariate multi-step-ahead forecast using machine learning (weekly and yearly seasonality). I use some exogenous variables, including Fourier terms. I'm happy with the results of testing the model on in-sample data, but now I want to go to production and make real forecasts on completely unseen data. While I can update the other regressors (variables), since they are dummy variables related to time, I don't know how I will generate new Fourier terms for …
Topic: forecast
Category: Data Science
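
Fourier terms are deterministic functions of the time index, so they can be computed for any future date without observing new data. The sketch below is only an illustration under assumptions not stated in the question: daily data, periods of 7 and 365.25 days, and hypothetical names (`fourier_terms`, `last_train_day`, the chosen orders). The same function and the same time origin should be used for training and future rows.

```python
import math
from datetime import date, timedelta


def fourier_terms(days, period, order):
    """Return one row [sin_1, cos_1, ..., sin_K, cos_K] per date."""
    rows = []
    for d in days:
        # Absolute day index, so training and future rows share one time axis.
        t = d.toordinal()
        row = []
        for k in range(1, order + 1):
            angle = 2.0 * math.pi * k * t / period
            row.extend([math.sin(angle), math.cos(angle)])
        rows.append(row)
    return rows


# Hypothetical horizon: the 14 days after the last training date.
last_train_day = date(2021, 12, 31)
future_days = [last_train_day + timedelta(days=i) for i in range(1, 15)]

weekly = fourier_terms(future_days, period=7.0, order=2)      # assumed weekly order
yearly = fourier_terms(future_days, period=365.25, order=3)   # assumed yearly order

# One feature row per future day: weekly terms followed by yearly terms.
future_features = [w + y for w, y in zip(weekly, yearly)]
```

The resulting rows can be joined with the dummy variables for the same dates before calling the trained model.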

Fully endogenous models for predicting multivariate time series

I have a formal social science background but I am new to data science. My interest is in building predictive models for applications in the social sciences, mostly (but not only) in economics. I am interested in the following kind of setups: I have data that describe the evolution of a number of variables $j \in J$ for a number of "individuals" $i\in N$ across time periods $t \in \{1, \dots, T\}$. For example, "individuals" $i\in N$ could be countries, with …
Category: Data Science

Multi-Source Time Series Data Prediction

I was wondering if anyone has experience with time series prediction for data from multiple sources. So for instance, time series $a, b, \dots, z$ each have their own shape, some may be correlated with others. The ultimate goal is to have a model trained such that the value at time $t+1$ for any given data source can be predicted. I personally have two solutions that in theory could work, but was wondering if anyone knew of other frequently used methods. Multi-task learning …
Category: Data Science

How can I go about building a model for large number of outputs?

I have previously worked on small-scale feedforward neural network problems. But I have started working on a new project where the goal is to predict air quality in 25 locations throughout the country a day ahead. Now, I am quite well-versed in the air quality side of things. My question: In a problem like this, would I develop 25 independent models (which share the same input structure) or one model with 25 outputs? I guess what I want to do …
Topic: rnn forecast
Category: Data Science

Monthly trend with fb prophet: interpreting the graph

I have monthly data with month/year in one column and price in another. I would like to get a yearly trend with the fbprophet library in Python (how to use monthly data with the library is explained at the end of this page). This is my code:

    import fbprophet
    import pandas as pd
    import matplotlib.pyplot as plt

    data = pd.read_csv('data.csv', sep=';')
    data_prophet = fbprophet.Prophet(interval_width=0.95, changepoint_range=0.9, changepoint_prior_scale=0.15, seasonality_mode='multiplicative', n_changepoints=100)
    data_prophet.fit(data)
    # Make a future dataframe for 5 years
    data_forecast = …
Category: Data Science

Quantifying 'growth friction' when projecting target goals

As part of my DS work I spend some fraction of my time helping the team make growth projections, either for setting growth targets or when forecasting actual data. There is obviously a range of ways to go about doing this but the one thing I don't have a good solution for at the moment is being able to fold in or at least quantify how much harder it is to grow in a market this year as opposed to …
Topic: forecast
Category: Data Science

Interpretation of VAR model: about impulse function and lag of p

For example, I have three time series: Y, X1 and X2. Using time series cross-validation and BIC/AIC to choose the lag order p of the VAR model, I got p = 1 and estimated the model with it. I know that the impulse response function can be used to interpret the model, and that variance decomposition can be used to explain the variance of the forecast errors. I am confused about how p relates to the interpretation of the impulse response function. Based …
Category: Data Science

How can one generate future forecasts from probabilistic events?

I have an event "whether an item sold will be returned or not" which I can predict with a certain probability based on information gathered at the time that the purchase occurs (product features, customer information, time and place, etc). So: P(Return | transaction information) = x% for a specific unit sold. I also have a historical time series of total units sold for that item, and a future forecast of sales of that item over the next few weeks. …
Category: Data Science
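
One simple way to combine the two pieces is to multiply the forecast unit sales by an average return probability. The sketch below is a hedged illustration, not a full answer: it assumes future transactions are not yet known, so a single average probability (for example the historical mean of the per-unit predictions) stands in for them; it assumes returns are independent across units; and it ignores the delay between a sale and its return. All names and numbers are hypothetical.

```python
import math


def expected_returns(forecast_units, return_prob):
    """Return (mean, std) of returned units for each forecast period.

    Treats each sold unit as an independent Bernoulli trial with
    probability return_prob, so the count of returns is binomial.
    """
    results = []
    for n in forecast_units:
        mean = n * return_prob
        std = math.sqrt(n * return_prob * (1.0 - return_prob))
        results.append((mean, std))
    return results


weekly_sales_forecast = [120, 150, 90]  # hypothetical forecast, units per week
avg_return_prob = 0.08                  # hypothetical average of P(Return | ...)

weekly_returns = expected_returns(weekly_sales_forecast, avg_return_prob)
```

The standard deviation only reflects the randomness of individual returns; uncertainty in the sales forecast itself would have to be added separately.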

Sliding window approach using SVR & LightGBM

I'm working on a multivariate time series forecast using a couple of ML algorithms (Neural Networks, Support Vector Machines & Gradient boosting algorithms). I need to measure the performance of each model. I've implemented the 1st model using TensorFlow 2.0. Training & testing data was created using the tf.data.Dataset API. The data format is (window_data, forecast), where window_data represents a set of 24 timesteps and forecast represents the next timestep. Now I need to train the 2nd & 3rd models using SVR …
Category: Data Science
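
The (window_data, forecast) format described above can be rebuilt as plain arrays, which is the 2D input shape that estimators such as scikit-learn's SVR and LightGBM's scikit-learn interface expect. The following is a minimal sketch under assumptions: `make_windows`, `rows` and `target_col` are hypothetical names, and each 24-step window is flattened into one feature vector.

```python
def make_windows(rows, target_col, window=24):
    """Build (X, y) pairs from a multivariate series.

    rows: list of per-timestep feature lists.
    X[i]: the `window` consecutive timesteps starting at i, flattened.
    y[i]: the target column at the timestep right after that window.
    """
    X, y = [], []
    for start in range(len(rows) - window):
        block = rows[start:start + window]
        X.append([value for row in block for value in row])
        y.append(rows[start + window][target_col])
    return X, y


# Hypothetical toy series: 30 timesteps, 2 features per timestep.
rows = [[float(t), float(t) * 10.0] for t in range(30)]
X, y = make_windows(rows, target_col=0, window=24)
```

Using the same windows for all three models keeps the performance comparison on identical training and test samples.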

Forecasting non-negative sparse time-series data

I have a time-series dataset (daily frequency) representing the sales of a product to a customer over time. The sales are represented as follows: $$[0, 0, 0, 0, 24, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 17, 0, 0, 0, 0, 9, 0, ...]$$ in which each number represents the sales of the product in a day. The problem is that time-series forecast methods (ARMA, HoltWinters) work well for "continuous" and "smooth" data, but …
Category: Data Science

Kalman filter for time series prediction

I have information about the behaviour of 400 users over a period of 1 month (30 days). Across those 30 days I measure 4 different quantities (let's call them A, B, C and D), hence I have a total of 4 time series. My goal is to predict, starting from day 5, the future values of A from the past values of A, B, C and D. So: A(5) = f(A(1,2,3,4), B(1,2,3,4), C(1,2,3,4), D(1,2,3,4)) Ideally, I'd like to estimate the whole …
Category: Data Science

Forecasting profit based on allocation of labor and time-series data

Situation: a store sells services A & B, and we have historical data for daily sales/revenue/profit of each service. The store is interested in whether they should staff for more of service A or service B in the future. They have a fixed employee base (can't hire or fire) but employees can swap between performing service A and service B. Furthermore, historically the total number of employees working on each service doesn't change much, e.g. a store may have 3 …
Category: Data Science

Additive vs Multiplicative model in Time Series Data

The above time series plot shows the daily closing stock index of a company. I want to know which model, additive or multiplicative, best suits this data. I know what the two models are, but I haven't been able to figure out the correct one for this data. Also, is there any way other than simple visualisation that can help me decide on the correct model?
Category: Data Science
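
For reference, the two decompositions being compared can be written as follows. The symbols are assumptions introduced here for illustration: $y_t$ is the observed series, $T_t$ the trend, $S_t$ the seasonal component and $R_t$ the remainder.

$$y_t = T_t + S_t + R_t \quad \text{(additive)}$$

$$y_t = T_t \times S_t \times R_t \quad \text{(multiplicative)}$$

$$\log y_t = \log T_t + \log S_t + \log R_t \quad \text{(multiplicative, after taking logs)}$$

A common rule of thumb follows from the last line: if the size of the fluctuations grows with the level of the series but looks roughly constant after a log transform, the multiplicative form is usually the better fit.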

Best forecast model for insurance policies volumes

I am new to forecasting and I am studying a dataset from an insurance company that contains the monthly volume of new policies, renewals & cancellations. New policies of a given month are renewed at intervals (3 months, 6 months, 12 months) but could also be canceled at any time. For instance, new policies of January with a 3-month duration are renewed after 3 months, in April. I would like some help with which direction to study …
Category: Data Science

Demand forecasting with marketing budget data

I'm trying to build a demand forecasting model to predict future daily orders of an online food takeout service (similar to UberEats or DoorDash). My first model uses a univariate approach, which is basically an ensemble of statistical models such as Auto ARIMA, ETS, BSTS, etc. Now, I want to build upon this by adding features related to marketing budget/spend because that's always been a big push for sales. However, the data that I have is only on a monthly …
Category: Data Science
