Connect a dense layer to an LSTM architecture
I am trying to implement an LSTM structure in plain numpy for didactic reasons. I understand how to feed the data in, but not how to read the output out. Suppose I give as input a tensor of dimension (n, b, d), where:

• n is the length of the sequence
• b is the batch size (timestamps in my case)
• d is the number of features for each example

Each example (row) in the dataset is labelled 0 or 1. However, when I feed the data to the LSTM, what I obtain is the hidden state h_out, whose dimension equals the hidden size of the network. How can I obtain a single number that can be compared to my labels and properly backpropagated? I have read that some people put another dense layer on top of the LSTM, but it is not clear to me what dimensions that layer and its weight matrix should have.
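Concretely, is something like the following numpy sketch the right idea? (The names `hidden_size`, `W_out` and `b_out` are my own placeholders, and `h_out` here is just a stand-in for the final hidden state my LSTM produces.)

```python
import numpy as np

# Hypothetical shapes for illustration
n, b, d = 10, 32, 8                       # sequence length, batch size, features
hidden_size = 16

# Stand-in for the LSTM's final hidden state: one vector per example in the batch
h_out = np.random.randn(b, hidden_size)

# Dense layer mapping the hidden state to a single logit per example
W_out = np.random.randn(hidden_size, 1) * 0.01  # weights: (hidden_size, 1)
b_out = np.zeros(1)                             # bias:    (1,)

logits = h_out @ W_out + b_out            # shape: (b, 1)
probs = 1.0 / (1.0 + np.exp(-logits))     # sigmoid -> probability of label 1
```

If this is right, then W_out of shape (hidden_size, 1) would collapse the hidden state to one number per example, which a sigmoid turns into a probability comparable with the 0/1 labels. Is that how the dimensions of the extra dense layer should be chosen?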
Topic: stacked-lstm, lstm, neural-network, classification, python
Category: Data Science