N-grams for RNNs
Given a word $w_{n}$, a statistical model such as a Markov chain over n-grams predicts the subsequent word $w_{n+1}$. The prediction is by no means random.
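For reference, a tiny sketch of that baseline (with a made-up toy corpus) looks like this: a bigram Markov chain counts which words follow which, and always returns the most frequent follower, so the prediction is deterministic:

from collections import Counter, defaultdict

# Made-up toy corpus purely for illustration.
corpus = "hello my name is alice hello my name is bob".split()

# Count how often each word follows each other word (bigram counts).
followers = defaultdict(Counter)
for w_n, w_next in zip(corpus, corpus[1:]):
    followers[w_n][w_next] += 1

def predict_next(w_n):
    # Deterministic prediction: the most frequent observed follower of w_n.
    return followers[w_n].most_common(1)[0][0]

print(predict_next("my"))    # -> "name"
print(predict_next("name"))  # -> "is"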
How does this translate into a neural model? I have tried tokenizing and sequencing my sentences; below is how they are prepared to be passed to the model:
import numpy as np

train_x = np.zeros([len(sequences), max_seq_len], dtype=np.int32)
for i, sequence in enumerate(sequences[:-1]):  # all sequences except the last
    for t, word in enumerate(sequence.split()):
        train_x[i, t] = word2idx(word)  # store the integer index of each word
The sequences look like this:
Given the sentence "Hello my name is":
Hello
Hello my
Hello my name
Hello my name is
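One common way to turn these n-gram sequences into training pairs (a sketch, not necessarily how your pipeline does it) is to use everything before the last token as the input and the last token as the target, pre-padding the prefix to a fixed length. The small vocabulary below is a hypothetical stand-in for word2idx:

import numpy as np

# Hypothetical word-to-index mapping standing in for word2idx.
vocab = {"<pad>": 0, "Hello": 1, "my": 2, "name": 3, "is": 4}

# N-gram sequences of length >= 2, so each one has a prefix and a target.
sequences = ["Hello my", "Hello my name", "Hello my name is"]
max_seq_len = 4

train_x = np.zeros([len(sequences), max_seq_len - 1], dtype=np.int32)
train_y = np.zeros([len(sequences)], dtype=np.int32)

for i, sequence in enumerate(sequences):
    idxs = [vocab[w] for w in sequence.split()]
    prefix, target = idxs[:-1], idxs[-1]
    # Pre-pad so the most recent word sits right before the prediction step.
    train_x[i, max_seq_len - 1 - len(prefix):] = prefix
    train_y[i] = target

print(train_x)
# [[0 0 1]
#  [0 1 2]
#  [1 2 3]]
print(train_y)
# [2 3 4]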
When I pass these sequences as input to an RNN with an LSTM layer, the next-word predictions I get (given a word) are essentially random.
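For concreteness, here is a minimal sketch of the Embedding -> LSTM -> softmax stack usually used for this kind of next-word prediction, wired to integer prefixes and integer targets like the arrays above. The vocabulary size, embedding size, LSTM units, and number of epochs are illustrative guesses, not an exact configuration:

import numpy as np
import tensorflow as tf

vocab_size = 5        # hypothetical vocabulary size (matches the toy vocab above)
embedding_dim = 8     # illustrative guess
units = 16            # illustrative guess

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, mask_zero=True),
    tf.keras.layers.LSTM(units),
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Toy data with the same shapes as the arrays built above.
train_x = np.array([[0, 0, 1], [0, 1, 2], [1, 2, 3]], dtype=np.int32)
train_y = np.array([2, 3, 4], dtype=np.int32)
model.fit(train_x, train_y, epochs=300, verbose=0)

# Predict the word after "Hello my": argmax over the softmax distribution.
probs = model.predict(np.array([[0, 1, 2]], dtype=np.int32), verbose=0)
print(probs.argmax(axis=-1))  # should converge towards index 3 ("name")

With inputs and targets aligned this way, the LSTM has a well-defined next-word objective, which is usually the first thing to check when predictions look random.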