How to prepare data for LSTM time series prediction

Question

How to prepare data for LSTM time series prediction

Kaggle

March 11, 2022, 02:00

I have a binary classification task on time series data. Every 14 rows in my CSV belong to one time slot. How should I prepare this data for use in an LSTM? In other words, how do I feed this data to the model?

Topic learning python

Category Data Science

Adeetya · Accepted Answer · June 13, 2019, 13:00

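Here is a sketch of the code for this:

import numpy as np
import pandas as pd

data = pd.read_csv(filename)  # filename: path to your CSV file
lag = 14  # number of past rows in each input window

# Assuming the target column is the last one
X = []
Y = []
for i in range(lag, len(data)):
    X.append(data.iloc[i - lag:i, :].to_numpy())  # the `lag` rows before row i
    Y.append(data.iloc[i, -1])  # target value of row i
X = np.array(X)  # shape: (samples, lag, n_columns)
Y = np.array(Y)  # shape: (samples,)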
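
The loop above builds overlapping sliding windows. If instead each block of 14 consecutive rows is one self-contained time slot, which is one possible reading of the question, the rows can be reshaped into non-overlapping blocks. The following is only a sketch under assumptions that are not stated in the question: the last column holds the label, the label of a slot is the one found on its last row, and the function name `blocks_to_lstm_input` and the toy column names are made up for illustration.

```python
import numpy as np
import pandas as pd


def blocks_to_lstm_input(data: pd.DataFrame, rows_per_slot: int = 14):
    """Group consecutive rows into non-overlapping blocks of rows_per_slot rows."""
    n_slots = len(data) // rows_per_slot  # an incomplete trailing block is dropped
    values = data.to_numpy()[: n_slots * rows_per_slot]
    features = values[:, :-1]  # assumption: every column except the last is a feature
    labels = values[:, -1]     # assumption: the last column is the label
    # LSTM layers typically expect input shaped (samples, timesteps, features)
    X = features.reshape(n_slots, rows_per_slot, features.shape[1])
    # assumption: one label per slot, read from the last row of the slot
    Y = labels.reshape(n_slots, rows_per_slot)[:, -1]
    return X, Y


# Toy data: 3 slots of 14 rows, 3 feature columns plus a binary label
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.normal(size=(42, 3)), columns=["f1", "f2", "f3"])
toy["label"] = np.repeat([0, 1, 0], 14)

X, Y = blocks_to_lstm_input(toy)
print(X.shape, Y.shape)  # (3, 14, 3) (3,)
```

With 12 feature columns, as mentioned in the comments, X would have shape (n_slots, 14, 12).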

vipin bansal · Accepted Answer · June 13, 2019, 09:20

I'm not sure what you mean by the statement "Every 14 rows in my CSV is relevant to one time slot."; it isn't clear to me.

But going by your comment "How should I load this data to LSTM? So the number of column is 12", I believe you are asking how to load multiple features (12 in your case) into a time series model.

If my understanding is correct, it's a problem of the type "Multiple Parallel Time Series". I have created a similar model in TensorFlow and pushed it to GitHub: Github Source Code for Multiple Parallel TimeSeries

Note: I used 3 features there instead of 12.
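
For readers without access to the linked repository, here is a minimal NumPy-only sketch of how several parallel series can be windowed into LSTM input. It is not the code from that repository: the helper name `split_sequences`, the window length of 4 and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def split_sequences(series: np.ndarray, n_steps: int):
    """Slide a window of n_steps rows over a (time, features) array.

    Each window is one input sample; its target is the row that follows it,
    so all parallel series are predicted at once.
    """
    X, y = [], []
    for end in range(n_steps, len(series)):
        X.append(series[end - n_steps:end])  # rows end-n_steps .. end-1
        y.append(series[end])                # the next row, all features
    return np.array(X), np.array(y)


# Toy data: 3 parallel series observed at 10 time steps
t = np.arange(10, dtype=float)
series = np.stack([t, t * 10, t * 100], axis=1)  # shape (10, 3)

X, y = split_sequences(series, n_steps=4)
print(X.shape, y.shape)  # (6, 4, 3) (6, 3)
```

With 12 features the only change is the last dimension: X becomes (samples, n_steps, 12).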

yunus · Accepted Answer · January 14, 2019, 06:10

I hope the dataset also contains metadata, which means you need a one-to-one mapping for those tuples, e.g. dog > good, cat > bad, kittens > bad, puppies > good, etc.

Separate the data into X (training data) and Y (labels). Then use a vectorizer and train using X and Y. If you're able to do the above steps, then use methods such as a train/test split, cross-validation folds, etc.
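
For time-ordered data, a shuffled split can leak future information into training, so a chronological split is a common alternative. The sketch below assumes X and Y are already ordered by time. The helper names `chronological_split` and `expanding_folds`, the 20% test fraction and the array shapes are illustrative assumptions, not part of the answer.

```python
import numpy as np


def chronological_split(X, Y, test_fraction=0.2):
    """Hold out the most recent samples as the test set (no shuffling)."""
    n_test = int(round(len(X) * test_fraction))
    cut = len(X) - n_test
    return X[:cut], X[cut:], Y[:cut], Y[cut:]


def expanding_folds(n_samples, n_folds=3):
    """Yield (train_idx, val_idx) pairs; validation always follows training in time."""
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        yield np.arange(0, train_end), np.arange(train_end, train_end + fold_size)


# Toy data: 100 samples, 14 time steps, 12 features, binary labels
X = np.zeros((100, 14, 12))
Y = np.arange(100) % 2

X_train, X_test, Y_train, Y_test = chronological_split(X, Y)
print(X_train.shape, X_test.shape)  # (80, 14, 12) (20, 14, 12)

folds = list(expanding_folds(len(X_train), n_folds=3))
for train_idx, val_idx in folds:
    print(len(train_idx), len(val_idx))
```

Each fold trains on a longer prefix of the data and validates on the block that immediately follows it.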

Friendly suggestion: Try seq2seq layers before LSTM (they require more resources).
