Keras data structure for LSTM-networks

After reading for a while, I am confused about the data structure an LSTM expects. Assume I have a supervised learning problem with 1000 samples and 40 features as input, and I want to create 10 timesteps of x. The resulting shape of my Keras input is

x: (1000, 10, 40)

that is, the two-dimensional (1000, 40) matrix is shifted row-wise ten times, once per timestep.

The question is now:

What shape does my target y need to have?

My shape is y=[1000,1], one target per sample, but I have also read about y=[1000,10,1]. I suppose the second dimension of size 10 would be the row-wise shifted target vector. But do I really need this? Shouldn't my version be correct?


It depends on the problem. A shape of [1000,1] suggests that you are trying to predict a single label for each sequence in the batch, with each sequence having up to 10 tokens. This is the case if, say, you want to classify sentences of 10 words each as positive or negative sentiment, or more generally whenever you have one label to apply to each sentence.

A shape of [1000,10,1] suggests that you are trying to predict a single label for each of the 10 tokens in each sequence in the batch. This is typically done for POS tagging or NER tagging, or more generally whenever you have a label to apply to every token in the sentence.
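As a rough sketch of how the two target shapes map onto Keras models: the layer sizes here (32 LSTM units, a single-unit Dense head) are arbitrary choices for illustration and not taken from the question. With the default return_sequences=False the LSTM emits only its last output, which matches y of shape [1000,1]; with return_sequences=True it emits one output per timestep, which matches [1000,10,1].

```python
from tensorflow import keras

timesteps, n_features = 10, 40  # from the question: x has shape (1000, 10, 40)

# One label per sequence: the LSTM returns only its final output.
last_only = keras.Sequential([
    keras.Input(shape=(timesteps, n_features)),
    keras.layers.LSTM(32),                  # 32 units is an arbitrary choice
    keras.layers.Dense(1),
])

# One label per timestep: the LSTM returns its output at every step,
# and Dense is applied along the last axis of each step.
per_step = keras.Sequential([
    keras.Input(shape=(timesteps, n_features)),
    keras.layers.LSTM(32, return_sequences=True),
    keras.layers.Dense(1),
])

print(last_only.output_shape)  # (None, 1)      -> fits y of shape (1000, 1)
print(per_step.output_shape)   # (None, 10, 1)  -> fits y of shape (1000, 10, 1)
```

The leading None is the batch dimension, so either model accepts all 1000 samples.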


Your target y can be whatever you need.

If you want to do sequence-to-sequence prediction, for example, y will have the same number of timesteps as the input (you predict something for each timestep).

But you can also define an output with only one timestep, for text classification for example. Your input data has 10 timesteps, but the output is a single prediction made at the last timestep.

It depends on the problem you’re trying to solve.
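To make the two options concrete for row-wise shifted data, here is a minimal NumPy sketch. It assumes a hypothetical raw series of 1009 rows (so that exactly 1000 windows of length 10 fit) filled with random placeholder values; raw_x and raw_y are made-up names, not anything from the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

timesteps, n_features = 10, 40
n_rows = 1009  # assumed raw length: 1009 - 10 + 1 = 1000 windows

rng = np.random.default_rng(0)
raw_x = rng.normal(size=(n_rows, n_features))  # placeholder features
raw_y = rng.normal(size=n_rows)                # placeholder target, one per row

# sliding_window_view appends the window axis last: (1000, 40, 10).
# Move it to the middle to get (samples, timesteps, features).
x = sliding_window_view(raw_x, timesteps, axis=0).transpose(0, 2, 1)

# Option 1: one target per window, taken at the window's last timestep.
y_last = raw_y[timesteps - 1:].reshape(-1, 1)

# Option 2: one target per timestep, shifted the same way as x.
y_seq = sliding_window_view(raw_y, timesteps)[..., np.newaxis]

print(x.shape, y_last.shape, y_seq.shape)
# (1000, 10, 40) (1000, 1) (1000, 10, 1)
```

In this construction the last timestep of y_seq is exactly y_last, so the [1000,1] target is just the final slice of the [1000,10,1] one.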
