Temporal Fusion Transformer from PyTorch-Forecasting with Multiple Targets - 'list' error

New to PyTorch and the PyTorch Forecasting library and trying to predict multiple targets using the Temporal Fusion Transformer model. I have 7 targets in a list as my targets variable. I'm using MultiLoss as my loss function with a list of 7 CrossEntropy loss functions (1 per target variable) -- In the problem I'm trying to model, there are 7 possible outcomes per time step and I'm trying to find which option is most likely. I'm looking for a …
Category: Data Science

Neural network / machine learning approach to model specific sequencing-classification problem in industry

I am working on a project which involves developing a machine learning/deep learning model for an application in the roll-to-roll industry. For a long time, I have been looking for similar problems as a way to get some guidance, but I was never able to find anything related. Basically, the problem can be seen as follows: An industrial machine is producing a roll of some material, which tends to have visible defects throughout the roll. I have already available a machine …
Category: Data Science

Forecasting on multivariate time series containing quaternions

I have a multivariate time series containing 3D position data $(x, y, z)$ and orientation data (as quaternions) obtained from motion sensors. My goal is to forecast the future position/orientation, and for this I'm looking into using sequence models, esp. LSTMs. A quaternion has 4 elements, one of them denoting the real/scalar part (say $q_w$) and the other three denoting the imaginary/vector part (say $q_x, q_y, q_z$). So my time series has 7 columns in total. My question: Considering that quaternion elements …
Category: Data Science
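
One practical detail for the quaternion columns above: an orientation quaternion has unit norm, and $q$ and $-q$ describe the same rotation, so raw regression outputs usually need renormalizing and a consistent sign. The following is a minimal sketch in plain Python, not the asker's method. The helper names `normalize_quaternion` and `canonicalize_sign` are hypothetical, and the $(q_w, q_x, q_y, q_z)$ ordering is assumed from the question.

```python
import math

def normalize_quaternion(q):
    """Scale a quaternion (q_w, q_x, q_y, q_z) to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("a zero quaternion does not encode an orientation")
    return tuple(c / norm for c in q)

def canonicalize_sign(q):
    """Flip the sign so that q_w >= 0; q and -q describe the same rotation."""
    return tuple(-c for c in q) if q[0] < 0 else tuple(q)

# Hypothetical un-normalized model output, for illustration only.
raw_prediction = (-2.0, 0.0, 0.0, 0.0)
unit = canonicalize_sign(normalize_quaternion(raw_prediction))
```

Applying the same sign convention to the training targets keeps the series from jumping between two equivalent representations of one orientation.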

LSTM evaluation metric MAE explanation

I have a hard time understanding my LSTM model's performance. I summarize my model as follows: X_train.shape (120, 7, 11) y_train.shape (120,) X_test.shape (16, 7, 11) y_test.shape (16,) model = keras.Sequential() model.add(keras.layers.LSTM(100, input_shape=(X_train.shape[1], X_train.shape[2]), return_sequences = True)) model.add(keras.layers.Dropout(rate = 0.2)) model.add(keras.layers.LSTM(20)) model.add(keras.layers.Dropout(rate = 0.2)) model.add(keras.layers.Dense(1)) model.compile(loss='mean_squared_error', optimizer=keras.optimizers.Adam(0.001), metrics = ['mae']) history = model.fit( X_train, y_train, epochs=60, batch_size=5, verbose= 0, validation_split = 0.1, shuffle=False ) Based on the below plots, both MSE and MAE decrease in the training process and …
Category: Data Science
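
The model in this question is compiled with mean squared error as the loss and mean absolute error as a metric. As a small illustration of how the two numbers differ, here is a plain-Python sketch with made-up values; the function names are illustrative and this is not Keras's internal implementation.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |error|; expressed in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average of error squared; large errors are weighted more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0]
y_pred = [2.0, 2.0, 5.0]

mae = mean_absolute_error(y_true, y_pred)  # (1 + 0 + 2) / 3
mse = mean_squared_error(y_true, y_pred)   # (1 + 0 + 4) / 3
```

Because MAE stays in the target's units, it is usually easiest to judge against the typical magnitude of `y_train`.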

Help with Time Series prediction

I'm a complete n00b to both this stackexchange and ML so please don't flame me too bad. I am trying to make a prediction from Time Series data. I have about 10 years' worth of 1-minute resolution price data for the S&P500. What I'd like to do is treat each DAY in the data as its own series to predict what the price movement will be for the last 15 minutes of market hours. I've looked through several books, some …
Category: Data Science

Traffic prediction using lstm

I am using an LSTM model to predict the data traffic in every second of a base station. The dataset is as follows: The test and train predictions look as follows: And the RMSE values for train score and test score are 32.54 and 30.03 respectively. To reduce the RMSE values I have changed the lookback value to 15, 20 and 30 but it's not reducing. Can somebody tell me the reason behind this huge prediction error and some advice on how …
Category: Data Science

How to fit a model on validation_data?

Can you help me understand this better? I need to detect anomalies, so I am trying to fit an LSTM model using validation_data, but the losses do not converge. Do they really need to converge? Should the validation data resemble the train data, the test data, or something in between? Also, which value should be lower, loss or val_loss? Thank you!
Category: Data Science

Working Behavior of BERT vs Transformers vs Self-Attention+LSTM vs Attention+LSTM on the scientific STEM data classification task?

So I just used BERT pre-trained with Focal Loss to classify Physics, Chemistry, Biology and Mathematics and got a good macro F1 of 0.91. It is good given it only had to look for tokens like triangle, reaction, mitochondria and newton etc. in a broader way. Now I want to classify the Chapter Name also. It is a somewhat difficult task because when I trained it on BERT for 208 classes, my score was almost 0. Why? I …
Category: Data Science

Regression sequence output loss function

I am fairly new to deep learning, and I have the following task. Based on an audio sequence of shape (200, 1024), I have to predict two sequences of shape (200, 1) of continuous values (e.g. 0.5687) that represent the emotion at each timestep (valence "v" and arousal "a"). So I've created the following LSTM: inputs_audio = Input(shape=(200, 1024)) audio_net = LSTM(256, return_sequences=True)(inputs_audio) audio_net = LSTM(256, return_sequences=True)(audio_net) audio_net = LSTM(256, return_sequences=False)(audio_net) audio_net = Dropout(0.3)(audio_net) final_model = audio_net target_names = …
Category: Data Science

Keras data structure for LSTM-networks

After reading for a while, I am now confused about my LSTM data structure. Assume that I have a supervised learning problem with 1000 samples and 40 features as input. Now I want to create 10 timesteps of x. My resulting dimension of the Keras data structure is x: (1000, 10, 40), with every two-dimensional matrix (1000, 40) shifted row-wise ten times. The question is now: Which dimension must my target y have? My dimension is y=[1000,1] for one resulting …
Topic: lstm keras
Category: Data Science
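
To make the shapes in this question concrete, here is a minimal sketch of building `(samples, timesteps, features)` windows in plain Python. The helper `make_windows` is hypothetical, the data is synthetic, and pairing each window with the target of its last row is an assumption about the task. Without padding, a plain sliding window over 1000 rows yields 991 windows rather than 1000.

```python
def make_windows(rows, targets, timesteps):
    """Return (X, y) where X[i] holds `timesteps` consecutive rows and
    y[i] is the target belonging to the last row of that window."""
    X, y = [], []
    for end in range(timesteps, len(rows) + 1):
        X.append(rows[end - timesteps:end])
        y.append(targets[end - 1])
    return X, y

# Synthetic data: 1000 rows with 40 features each, one target per row.
rows = [[float(i)] * 40 for i in range(1000)]
targets = [float(i) for i in range(1000)]

X, y = make_windows(rows, targets, timesteps=10)
shape_X = (len(X), len(X[0]), len(X[0][0]))  # (n_windows, timesteps, features)
shape_y = (len(y),)                          # one target per window
```

With one target per window, `y` has one entry per sample, which matches an LSTM without `return_sequences=True` followed by a `Dense(1)` layer.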

RNN to model DNA sequencing classification

I have a DNA sequence dataset in which each sequence is mapped to a certain class, e.g. TCAGCCGAGAGCTCATCGATCGTACGT 2 ATGCAGTGCATCGATCGATCGTAGAAC 3 where the number after the sequence specifies the type of protein this sequence belongs to. So my question is: can I use k-mers and one-hot encoding to classify these sequences with a biLSTM, or is this not a viable concept? I would appreciate your feedback and suggestions on this task, as I am new to Deep Learning. Thank you.
Category: Data Science
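
The two encodings named in this question can be sketched in a few lines of plain Python. The function names `kmers` and `one_hot` are illustrative, `k=3` is an arbitrary choice, and the alphabet is assumed to contain only A, C, G and T.

```python
BASES = "ACGT"  # assumed alphabet; ambiguous bases such as N are not handled

def kmers(sequence, k):
    """Split a sequence into overlapping substrings of length k."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def one_hot(sequence):
    """Encode each base as a 4-element indicator vector in A, C, G, T order."""
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in sequence:
        row = [0] * len(BASES)
        row[index[base]] = 1
        encoded.append(row)
    return encoded

tokens = kmers("TCAGCCGAGAGCTCATCGATCGTACGT", k=3)  # first sequence from the question
encoded = one_hot("TCAG")
```

Either representation gives a sequence of vectors or tokens per sample, which is the input form a recurrent layer expects.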

Error on custom RNN/LSTM with multiple inputs

I want to implement a custom RNN/LSTM model similar to this. The model should take two separate vectors as input and process them. I was following keras tutorial to implement a custom keras layer and inputting two vectors a and b as a list [a,b] to the layer as shown below. import keras from keras.layers.recurrent import RNN import keras.backend as K class MinimalRNNCell(keras.layers.Layer): def __init__(self, units, **kwargs): self.units = units self.state_size = units super(MinimalRNNCell, self).__init__(**kwargs) def build(self, input_shape): print(type(input_shape)) self.kernel …
Category: Data Science

Why are predictions from my LSTM Neural Network lagging behind true values?

I am running an LSTM neural network in R using the keras package, in an attempt to do time series prediction of Bitcoin. The issue I'm running into is that while my predicted values seem to be reasonable, for some reason, they are "lagging" or "behind" the true values. Right below is some of my code, and farther down I have some graphs to show you what I mean. My model code: batch_size = 2 model <- keras_model_sequential() model%>% layer_lstm(units=22, …
Category: Data Science

LSTM with input of actual time step

I'm working on an implementation of an LSTM neural network to forecast energy consumption. I have a dataset with load, a series of weather parameters and an indicator of whether it is a bank holiday or not. I first built a network with an input of 24 lags (using a function from this tutorial). So I have a dataset like this, but with 18 variables and from ($t_{-24}$) var1(t-1) var2(t-1) var1(t) var2(t) 1 0.0 50.0 1 51 2 1.0 51.0 2 52 3 2.0 52.0 3 53 4 …
Category: Data Science

Train an LSTM on separate sequences of different lengths

My case is the following: I want to train a sequential classifier to recognize what action is being performed given sensor observations. My data consists of 10 executions of an assembly task for 10 different people. So, basically each person performed the same task and I have the sensor measurements for each millisecond. That means that for each person I have a really big data set with the corresponding measurements and the labels (which action is being performed) for each millisecond. …
Category: Data Science

How to represent the number of neurons in an LSTM for architecture schematic?

I'm trying to visualise a neural network schematic and found a great tool for building schematics here http://alexlenail.me/NN-SVG/index.html. I've edited the SVG file to change one of the dense layers into an LSTM layer, and the input to time series instead of singular neurons. At the bottom of the image there is some set notation detailing how many neurons are in each layer. I'm not too familiar with set notation. I'm not quite sure how to represent the LSTM layers …
Category: Data Science

Is it good to have 100% accuracy on validation?

I'm still new to machine learning. Currently I'm creating an anomaly detector for flight data. It is multivariate time series data that includes the timestamp, latitude, longitude, velocity and altitude of the aircraft. I'm splitting the data into train and test sets with an 80% ratio. I used the Keras LSTM autoencoder to do anomaly detection. So here's my code: def create_sequence(data, time_step = None): Xs = [] for i in range (len(data) - time_step): Xs.append(data[i:(i + time_step)]) return np.array(Xs) # …
Category: Data Science

Timeseries LSTM: does test data need to come after training data?

I have one single, very long time series. I want to train an LSTM to distinguish between two behaviours (A or B) at every timestep (sequence-to-sequence). Because the time series is very long, I plan to extract shorter, partially-overlapping subsequences and use each of them as one training input for the LSTM. In my train/validation/test split, do I have to use older subsequences for training and newer for validation and test? Or can I treat them as if they were …
Category: Data Science
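
One common convention for the split asked about here is to cut the series by time and keep every window entirely inside one split, so that overlapping subsequences never share timesteps across train, validation and test. The sketch below illustrates that idea in plain Python under assumed values; the function `chronological_split` and the boundaries 700 and 850 are hypothetical, and this is not the only valid scheme.

```python
def chronological_split(n_steps, window, stride, train_end, val_end):
    """Return window start indices for train, validation and test.

    Each window covers [start, start + window) and must lie entirely
    inside its own time range, so no timestep is shared across splits.
    """
    def starts(lo, hi):
        return list(range(lo, hi - window + 1, stride))

    return starts(0, train_end), starts(train_end, val_end), starts(val_end, n_steps)

# Assumed sizes, for illustration only: 1000 timesteps, windows of 50, stride 10.
train, val, test = chronological_split(
    n_steps=1000, window=50, stride=10, train_end=700, val_end=850
)
```

Windows within one split still overlap each other because the stride is smaller than the window, but the oldest data is used for training and the newest for testing.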

Advantages of CNN vs. LSTM for sequence data like text or log-files

When do you tend to use a CNN rather than an LSTM (or the other way round) in classification or generation tasks on sequential data like text or log data? What are the reasons for the decision and what does it depend on? Are there any papers or statistics that confirm this? I'm thinking of data like Linux log entries or short sentences of fewer than 20 words/tokens. Personally I would almost always use an LSTM but I'm curious if CNN wouldn't …
Category: Data Science

Predicting next number in a sequence - data analysis

I am a machine learning newbie and I am working on a project where I'm given a sequence of integers all of which are in the range 0 to 70. My goal is to predict the next integer in the sequence given the previous 5 integers in the same sequence. There isn't much more information on the sequence of integers itself (for example, how was the sequence obtained, etc). The following are the things I tried. The first thing that …
Category: Data Science
