Clarification on "predict the next character given the previous 100 characters"
I am studying Justin Johnson's lecture on RNNs.
Lecture recording: https://www.youtube.com/watch?v=dUzLD91Sj-o&list=PL5-TkQAfAZFbzxjBHtzdVCWE0Zbhomg7r&index=12&t=3177s
One of the examples is character-level language modeling: predicting the next character given the previous characters. At 33:03 in the video linked above, Justin discusses training an RNN that processes the works of William Shakespeare and tries to predict the next character given the previous 100 characters. What does "given the previous 100 characters" mean?
Slides link: https://web.eecs.umich.edu/~justincj/slides/eecs498/498_FA2019_lecture12.pdf
The slides contain figures illustrating the character-level language model.
It is my understanding that the language model is an example of the many-to-many architecture, where the inputs x are one-hot character vectors, so at every time step the model predicts the next character given all previous characters. How does one encode "given the previous 100 characters" in this picture?
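To make my question concrete, here is a minimal sketch of how I understand the many-to-many setup (my own illustration, not code from the lecture): the inputs are one-hot character vectors and the targets are the same sequence shifted by one, so there is no explicit "100" anywhere.

```python
import numpy as np

text = "to be or not to be"
chars = sorted(set(text))                     # character vocabulary
char_to_idx = {c: i for i, c in enumerate(chars)}

# Inputs are all characters except the last; targets are the same
# sequence shifted by one, so y[t] is the "next character" for x[t].
input_ids = [char_to_idx[c] for c in text[:-1]]
target_ids = [char_to_idx[c] for c in text[1:]]

# One-hot encode the inputs: shape (sequence_length, vocab_size).
x = np.zeros((len(input_ids), len(chars)))
x[np.arange(len(input_ids)), input_ids] = 1.0

print(x.shape)          # (17, vocab_size)
print(target_ids[:5])   # next-character indices for the first 5 steps
```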
A second figure in the slides illustrates truncated backpropagation through time.
Does "given the previous 100 characters" actually mean that the chunk size used for truncated backpropagation through time is 100?
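Under this interpretation, I imagine training would look something like the sketch below (again my own illustration, with a hypothetical `chunks` helper): the encoded corpus is cut into consecutive chunks of 100 characters, and gradients only flow within a chunk.

```python
CHUNK_SIZE = 100  # the "100 characters" under this interpretation

def chunks(ids, size):
    """Split an encoded corpus into consecutive (input, target) chunks."""
    # Drop the final incomplete chunk for simplicity.
    for start in range(0, len(ids) - size, size):
        x = ids[start : start + size]            # 100 input characters
        y = ids[start + 1 : start + size + 1]    # the same, shifted by one
        yield x, y

corpus_ids = list(range(250))  # toy stand-in for the encoded corpus
for x, y in chunks(corpus_ids, CHUNK_SIZE):
    print(len(x), len(y))      # every chunk is exactly 100 steps long

# During training, the hidden state at the end of one chunk would be fed
# into the next chunk, but gradients would not propagate across the
# chunk boundary; as I understand it, that is what "truncated" means.
```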
Lastly, a many-to-one RNN architecture is also discussed, and a figure of it is provided in the slides.
Does "given the previous 100 characters" actually mean that the model is many-to-one, where the "many" are 100 input characters and the "one" is the next character?
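In that case, I would expect the training data to be built as sliding windows, along the lines of this sketch (the `windows` helper and the toy corpus are mine, for illustration only):

```python
WINDOW = 100  # the "previous 100 characters"

def windows(text, window):
    """Yield (previous `window` characters, next character) pairs."""
    for i in range(len(text) - window):
        yield text[i : i + window], text[i + window]

sample = "x" * 120  # toy stand-in for the Shakespeare corpus
pairs = list(windows(sample, WINDOW))
print(len(pairs))          # 20 training examples
print(len(pairs[0][0]))    # each input is exactly 100 characters
```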
Topic: rnn, deep-learning, language-model
Category: Data Science