Temporal Fusion Transformer from PyTorch-Forecasting with Multiple Targets - 'list' error

New to PyTorch and the PyTorch Forecasting library and trying to predict multiple targets using the Temporal Fusion Transformer model. I have 7 targets in a list as my targets variable. I'm using MultiLoss as my loss function with a list of 7 CrossEntropy loss functions (1 per target variable) -- In the problem I'm trying to model, there are 7 possible outcomes per time step and I'm trying to find which option is most likely. I'm looking for a …
Category: Data Science

Using survival analysis models with uncensored data for time-to-event prediction

Are there any advantages of using survival analysis models like Cox's proportional hazards model with uncensored data over simple linear regression or other classic ML models? I have data with recurrent events and I am trying to predict the time of the next event. The data contains about 2000 different subjects and about 60 events per subject. The percentage of censored data (the last event of each subject) is small, and I don't think it plays a big role in the prediction.
Category: Data Science

Clustering time series data using dynamic time warping

I would like to cluster/group the curves in the attached picture with Python. The data is already normalized and my approach would be to use dtw (dynamic time warping) to calculate the distance and with that feature use a clustering algorithm (like kmeans or DBSCAN) to classify them. Do I pick out one trajectory as a starting curve to compare the other curves to, or do I calculate an 'average' curve of all curves and use that as the starting …
Category: Data Science
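
One way to read the approach described above is to skip the reference curve entirely and compute a DTW distance for every pair of curves, then hand the resulting matrix to a clustering algorithm that accepts precomputed distances. The following is a minimal sketch of that idea in plain Python, not the asker's code: the function names `dtw_distance` and `pairwise_dtw` and the three sample curves are made up for illustration, and the cost is the plain absolute difference with no warping window.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    Uses the absolute difference as the local cost and the classic
    O(len(a) * len(b)) dynamic programme without a warping window.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def pairwise_dtw(curves):
    """Symmetric matrix of DTW distances between all pairs of curves."""
    k = len(curves)
    dist = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        dist[i][j] = dist[j][i] = dtw_distance(curves[i], curves[j])
    return dist


# Hypothetical normalized curves: the second is the first shifted in time,
# the third has a different shape.
curves = [
    [0.0, 0.1, 0.5, 1.0, 0.5, 0.1],
    [0.0, 0.0, 0.1, 0.5, 1.0, 0.5],
    [1.0, 0.8, 0.6, 0.4, 0.2, 0.0],
]
dist = pairwise_dtw(curves)
```

A matrix like `dist` can then be passed to a method that works on precomputed distances, for example scikit-learn's `DBSCAN(metric="precomputed")` or hierarchical clustering. Plain k-means is a less natural fit because it needs to average curves, which is not well defined under DTW without extra machinery.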

Unable to remove Seasonality

I have sales data which is seasonal and has no trend. The frequency of this series is 15 mins. I don't know how to compute the exact period of seasonality - whether it is daily or weekly or monthly or yearly. But, from plotting it, I think there is a yearly pattern. I tried removing seasonality before forecasting by lagging the series by a year and differencing the two but even the result has a yearly pattern. Code with what …
Category: Data Science
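
The differencing step described above can be sketched as follows. This is an illustration under stated assumptions, not the asker's code: the series is synthetic, the helper `seasonal_difference` is a made-up name, and the daily cycle of 96 samples simply follows from the 15-minute frequency. It shows that differencing only removes a pattern when the lag matches that pattern's period, so with 15-minute data it may be worth checking daily and weekly lags as well as the yearly one.

```python
import math

SAMPLES_PER_DAY = 24 * 4  # 15-minute data gives 96 observations per day


def seasonal_difference(series, period):
    """Return series[t] - series[t - period] for every t >= period."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


# Synthetic example: one week of data with a purely daily cycle (assumption).
series = [
    10 + 3 * math.sin(2 * math.pi * t / SAMPLES_PER_DAY)
    for t in range(SAMPLES_PER_DAY * 7)
]

# Lag equal to the true period: the cycle cancels out.
residual_daily = seasonal_difference(series, SAMPLES_PER_DAY)

# Lag that does not match the period: the cycle is still visible.
residual_half_day = seasonal_difference(series, SAMPLES_PER_DAY // 2)

print(max(abs(x) for x in residual_daily))     # close to zero
print(max(abs(x) for x in residual_half_day))  # clearly non-zero
```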

Problems understanding how to create the input data for time series forecasting with a recurrent neural network in Keras

I just started to use recurrent neural networks (RNN) with Keras for time-series forecasting and I found this tutorial Forecasting with RNN. I have difficulties understanding how to build the training data both regarding the syntax and the format of the input data. Here is the code: import pandas as pd import numpy as np import tensorflow as tf from tensorflow import keras from matplotlib import pyplot as plt # Read the data for the parameters from a csv file …
Category: Data Science

Perform DTW simultaneously on multiple trajectories

Good day, I have ~50 sample trajectories (timeseries) showing reactor temperature over time for a given process. In addition, I have a reference signal of the ideal trajectory for this process. I would like to synchronize all the sample trajectories with the reference trajectory. Performing DTW with 1 sample signal and reference produces new signals along a common axis (as it should). My question is how can I perform this synchronization of all sample trajectories with the reference simultaneously? Such …
Category: Data Science

In a Time Series Problem, is it possible to forecast quantities by learning the patterns of other items? What are my options?

Suppose I own a store that sells a variety of apples and I have the following stats each month: Report Date, Type of Apple (TA), Quantity Available (QA), Quantity Sold in the Past 30 Days (QS30), Quantity Shipping In (QSI), Quantity Needed to Order (QN). Let's make the following assumptions/givens: There are three types of apples: red apples, green apples and yellow apples. T(1) denotes the first month and T(60) denotes the 60th month. QA@T(i + 1) = QA@T(i) + QSI@T(i) …
Category: Data Science

Help with Time Series prediction

I'm a complete n00b to both this stackexchange and ML so please don't flame me too bad. I am trying to make a prediction from Time Series data. I have about 10 years' worth of 1-minute resolution price data for the S&P500. What I'd like to do is treat each DAY in the data as its own series to predict what the price movement will be for the last 15 minutes of market hours. I've looked through several books, some …
Category: Data Science

How to make XGBOOST capture trend in time series forecasting?

I am trying to forecast some sales data with monthly values. I have been trying some classical models as well as ML models like XGBOOST. My data with a feature set looks like this, with a length of 110 months, and I am trying to forecast the next 12 months. When it comes to XGBOOST, I've been spending time mostly on hyperparameter optimization with Gridsearch and also state-of-the-art packages like optuna. My currently best set of parameters looks like this: parameters …
Category: Data Science

Which statistical parameters are more useful to detect anomalies and outliers? Mean, max, min, var?

This time series contains time frames, each of which is 8K (frequencies) * 151 (time samples) in 0.5 sec (overall 1.2288 million samples per half a second). I need to find anomalies across the different rows (frequencies) and report which rows (frequencies) are anomalous (an unsupervised learning method). Do you have an idea which statistical parameter is most useful for this? Mean, max, min, median, var, or any other parameter of these 151 samples? Which parameter should I use? (I …
Category: Data Science

How to get vector representations (or embeddings) of time series?

Even if a time series is made up of numbers only, finding an abstract fixed-dim vector representation would be interesting for classification/clustering purposes. As we can learn & find abstract representations/embeddings of text/images, can we do something similar for time series? Finding such ways would result in better clustering & related tasks compared with traditional approaches based on statistical measures like Pearson correlation etc. All thoughts are welcome.
Category: Data Science

One-sided time series trend-seasonal decomposition

TL;DR: Are there one-sided decomposition alternatives to the naive seasonal_decompose from statsmodels? Are there approaches to adapt intrinsically two-sided algorithms (like STL from statsmodels) to forecasting applications? I'm attempting to perform time-series forecasting. For this I want to decompose a time-series into trend and seasonal parts. I picked STL implementation from statsmodels to handle this. I gravitated towards STL instead of seasonal_decompose, since even the docs down the bottom encourage more sophisticated approaches: I noticed, however, that the decomposition is …
Category: Data Science

Cross-validation for anomaly detection on time series data

I want to perform k-fold cross-validation for the setting where I have a training dataset consisting of a sequential time series that is fully benign and a test dataset (also a sequential time series) which contains labeled anomalies. I already took a look at this post, but as my data is sequential, the answer doesn't work out. I am especially stuck with the fact that for K-fold cross-validation, you use (k-1)/k parts of your data for training and 1/k parts …
Category: Data Science

Extract all data of a month from different years

Ok I had a typo in this question before which I have now corrected: my database (df_e) looks like this:
0,Country,Latitude,Longitude,Altitude,Date,H2,Year,month,dates,a_diffH,H2a
1,IN,28.58,77.2,212,1964-09-15,-57.6,1964,9,1964-09-15,-3.18,-54.42
2,IN,28.58,77.2,212,1963-09-15,-120.0,1963,9,1963-09-15,-3.18,-116.82
3,IN,28.58,77.2,212,1964-05-15,28.2,1964,5,1964-05-15,-3.18,31.38
...
and I would like to save the data from the 9th month from the years 1963 and 1964 into a new df. For this I use the command:
df.loc[df_e['H2a'].isin(['1963-09-15', '1964-09-15'])]
But the result is
Empty DataFrame
Columns: [Country, Latitude, Longitude, Altitude, Date, H2, Year, month, dates, a_diffH, H2a]
Index: []
Where is my mistake?
Category: Data Science
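
In the command quoted above, the `isin` test is applied to `H2a`, which holds numbers rather than dates, and the mask built from `df_e` is used to index `df`. Either is a plausible reason for the empty result. The sketch below rebuilds the three sample rows and selects September 1963 and 1964 in two ways. It assumes pandas is available and that the leading unnamed column in the sample is just the row index, so it is dropped here. The variable names `september` and `september_dt` are illustrative only.

```python
import io

import pandas as pd

# The three sample rows from the question, without the leading index column.
csv_text = """Country,Latitude,Longitude,Altitude,Date,H2,Year,month,dates,a_diffH,H2a
IN,28.58,77.2,212,1964-09-15,-57.6,1964,9,1964-09-15,-3.18,-54.42
IN,28.58,77.2,212,1963-09-15,-120.0,1963,9,1963-09-15,-3.18,-116.82
IN,28.58,77.2,212,1964-05-15,28.2,1964,5,1964-05-15,-3.18,31.38
"""
df_e = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Option 1: filter on the existing Year and month columns.
mask = (df_e["month"] == 9) & df_e["Year"].isin([1963, 1964])
september = df_e.loc[mask]

# Option 2: derive month and year from the parsed Date column instead.
mask_dt = (df_e["Date"].dt.month == 9) & df_e["Date"].dt.year.isin([1963, 1964])
september_dt = df_e.loc[mask_dt]

print(september[["Date", "H2", "H2a"]])
```

Filtering on month and year rather than on exact date strings also keeps working if other September days appear in the data.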

How to cluster time series of ordered data?

There are a few hundred time series of a large set of different locations (irregularly distributed) with the following properties: ordered factor (5 levels); between 5 and 25 observations per series; lots of missing values within each series; temporal and spatial autocorrelation; (unknown) temporal frequency. The objective is to spatially cluster the time series based on their similarity (of observed value per point in time). What would be adequate methods? The analysis will be carried out in R.
Category: Data Science

How to combine data having similar distribution?

I have a collection of time series data with data points of around 2 years of daily data. I am thinking of a way to increase the number of data points in it so that the neural network gets a better understanding of the fluctuations in the data. I am suggesting a hypothesis where I try to cluster similar time-series data following similar distribution, in order to increase the number of data points fed into the neural network. Is this …
Category: Data Science

Binary Classification Comparing two time series of variable length

Is there a machine learning model (something like LSTM or 1D-CNN) that takes two time series of variable length as input and outputs a binary classification (True/False whether time series are of same label)? So the data would look something like the following date value label 2020-01-01 2 0 # first input time series 2020-01-02 1 0 # first input time series 2020-01-03 1 0 # first input time series 2020-01-01 3 1 # second input time series 2020-01-03 1 …
Category: Data Science

Need help on Time Series ARIMA Model

I'm working on forecasting daily volumes and have used a time series model to check for data stationarity. However, I'm struggling to forecast the data with 90% accuracy. Right now the variation is extremely high and I'm just unable to bring it down. I've used a log transform on my data. Please find the link to the folder below, which contains the ipynb and csv files: https://drive.google.com/drive/folders/1QUJkTucLPIf2vjo2mRmoBU6be083dYpQ?usp=sharing Any help will be highly appreciated. Thanks, Rahul
Category: Data Science

When to tune hyperparameters in deep learning

I am currently playing around with different CNN and LSTM model architectures for my multivariate time series classification problem. I can achieve validation accuracy of better than 50 %. I would like to lock down an exact architecture at some stage instead of experimenting endlessly. In order to decide this, I want to also tune my hyperparameters. Question: How do I balance the need to experiment with different models, such as standalone CNN and CNN with LSTM against hyperparameter tuning? …
Category: Data Science

Relating changes of a value in time to known events

I work with two datasets. The first dataset contains fluor values measured every minute. The second dataset contains certain events and their time. We know that these events cause peaks in fluor values shortly before and shortly after the event time. A simplified reproducible example in R: Here I provide a simplified version of the R code I use to relate the fluor values to events. I have a series of fluor values measured every minute. Next I have a …
Topic: time-series r
Category: Data Science
