When training on a paragraph containing a large number of words, does a GRU end up predicting repeated outputs?
Is it correct that if we train a GRU on paragraphs containing a large number of words (say, 10,000), it will end up predicting repeated outputs or, in the worst case, outputs with very little variance?
Apart from the points mentioned above, what other problems might a GRU suffer from when trained on documents containing a large number of words?
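To make "repeated outputs" and "not much variance" concrete, here is a rough sketch of how I would measure them on a predicted token sequence. The function name `repetition_stats` and both metrics are my own illustrative choices, not part of any GRU library, and the token lists are made-up examples rather than real model output.

```python
from collections import Counter


def repetition_stats(tokens, n=2):
    """Return (distinct_ratio, repeated_ngram_ratio) for a list of predicted tokens.

    distinct_ratio: unique tokens / total tokens (low means little variety).
    repeated_ngram_ratio: share of n-grams that duplicate an earlier n-gram.
    Both metrics are illustrative assumptions, not a standard definition.
    """
    if not tokens:
        return 0.0, 0.0
    distinct_ratio = len(set(tokens)) / len(tokens)

    # Build all contiguous n-grams of the predicted sequence.
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return distinct_ratio, 0.0

    # Every occurrence of an n-gram beyond its first counts as a repeat.
    counts = Counter(ngrams)
    repeated = sum(c - 1 for c in counts.values())
    return distinct_ratio, repeated / len(ngrams)


# Hypothetical predictions: one looping, one varied.
looping = ["the", "cat", "the", "cat", "the", "cat"]
varied = ["a", "b", "c", "d"]
loop_distinct, loop_repeat = repetition_stats(looping)
var_distinct, var_repeat = repetition_stats(varied)
```

A low distinct ratio together with a high repeated n-gram ratio on the model's predictions would be the kind of behaviour I am asking about.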
Topic gru
Category Data Science