Intuition behind the RNN/LSTM hidden state?

What's the intuition behind the hidden states of RNN/LSTM? Are they similar to the hidden states of HMM (Hidden Markov Model)?

Topic markov-hidden-model lstm rnn deep-learning

Category Data Science


Just to add: the hidden state can be described as the working memory of the recurrent network, carrying information forward from the immediately preceding timesteps/events. This working memory overwrites itself at every step, largely uncontrollably, and it is present in both RNNs and LSTMs.
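To make the "overwritten working memory" idea concrete, here is a minimal sketch of a vanilla RNN step in numpy (toy sizes, random weights; all names here are illustrative, not from any particular library):

```python
import numpy as np

# Minimal vanilla RNN step (toy sizes, random weights).
# The hidden state h is the "working memory": at every step it is
# rewritten from the current input and the previous h, with no gating
# to protect what was stored before.

hidden_size, input_size = 4, 3
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(hidden_size, input_size))   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden weights
b_h = np.zeros(hidden_size)

h = np.zeros(hidden_size)                      # initial working memory
for x_t in rng.normal(size=(5, input_size)):   # a toy sequence of 5 steps
    # The entire hidden state is overwritten at each step.
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
    print(h.round(2))
```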

Given the latter, I appreciate the analogy to the Markovian framework, in a broader sense. Feel free to check my answer to a similar question for more information on the hidden-state and cell-state architectures in sequence models:

Difference between LSTM cell state and hidden state
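To illustrate that distinction, here is a minimal sketch using PyTorch's nn.LSTM (toy sizes, assuming PyTorch is available): the LSTM maintains two states, the hidden state h_n (the per-step output, the "working memory" above) and the cell state c_n (the gated longer-term memory), whereas a vanilla RNN has only the hidden state.

```python
import torch
import torch.nn as nn

# An LSTM returns both states; a vanilla RNN would return only h_n.
lstm = nn.LSTM(input_size=3, hidden_size=4, batch_first=True)
x = torch.randn(1, 5, 3)            # (batch, sequence length, features)
output, (h_n, c_n) = lstm(x)

print(output.shape)  # torch.Size([1, 5, 4]) -- hidden state at every step
print(h_n.shape)     # torch.Size([1, 1, 4]) -- final hidden state
print(c_n.shape)     # torch.Size([1, 1, 4]) -- final cell state
```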


I personally don't think they are comparable to the hidden states of a Markov model. One key difference is that in an HMM you can explain to someone what a given state means, whereas in an RNN/LSTM you cannot interpret a given state.
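A toy contrast of what the two kinds of state even look like (illustrative values only): an HMM hidden state is a discrete label you can name and explain, while an RNN/LSTM hidden state is a dense real-valued vector with no assigned meaning per dimension.

```python
import numpy as np

# HMM: the hidden state is one of a few nameable, discrete states.
hmm_states = ["rainy", "sunny"]     # each state has a human-readable meaning
hmm_state = hmm_states[0]

# RNN/LSTM: the hidden state is a real-valued vector; no single
# dimension has an interpretable meaning on its own.
rnn_hidden = np.random.default_rng(2).normal(size=8)

print(hmm_state)             # "rainy" -- explainable
print(rnn_hidden.round(2))   # opaque learned representation
```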

The closest analogy is to think of the hidden state of an RNN/LSTM as the output of an intermediate layer in a fully connected neural network, but for time-series data.
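A sketch of that analogy (toy sizes, random weights; names are illustrative): unrolled over time, an RNN is a deep fully connected network whose "layers" all share one set of weights, and the hidden state after step t is exactly the activation of the t-th intermediate layer.

```python
import numpy as np

rng = np.random.default_rng(1)
input_size, hidden_size, T = 3, 5, 4
W_xh = rng.normal(size=(hidden_size, input_size))
W_hh = rng.normal(size=(hidden_size, hidden_size))
xs = rng.normal(size=(T, input_size))   # one toy time series

def layer(h, x):
    """One shared 'layer' of the unrolled network (= one RNN time step)."""
    return np.tanh(W_xh @ x + W_hh @ h)

h = np.zeros(hidden_size)
for t in range(T):
    h = layer(h, xs[t])                 # activations of intermediate layer t
    print(f"layer/step {t}: {h.round(2)}")
```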

And the larger the hidden state, the more memory of the past it can retain.
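A quick sketch of that capacity knob (toy sizes, assuming PyTorch): increasing hidden_size enlarges the state vector and the weight matrices, so the network can carry more information about the past, at the cost of more parameters and computation.

```python
import torch.nn as nn

for hidden_size in (8, 64, 256):
    lstm = nn.LSTM(input_size=10, hidden_size=hidden_size)
    n_params = sum(p.numel() for p in lstm.parameters())
    print(f"hidden_size={hidden_size:4d}: state vector of {hidden_size} units, "
          f"{n_params} trainable parameters")
```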
