What is the training phase in an N-gram model?
Following is my understanding of the N-gram model used in the text prediction case:
Given a sentence, say "I love my ___" (say a bigram model, i.e. N = 2), and say 4 possible candidates (country, family, wife, school), I can estimate the conditional probability of each candidate given the context and take the one with the highest probability as the next word.
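For concreteness, here is a rough sketch (in Python) of the probability part as I understand it. The toy corpus, the candidate list, and the simple count-based estimate are just illustrative assumptions on my part, not taken from any particular article:

```python
from collections import Counter

# Made-up toy corpus, just for illustration
corpus = "i love my family . i love my country . he loves my family . i miss my school".split()

# Count bigrams and unigrams seen in the corpus
bigram_counts = Counter(zip(corpus, corpus[1:]))
unigram_counts = Counter(corpus)

context = "my"
candidates = ["country", "family", "wife", "school"]

# P(candidate | context) ~= count(context, candidate) / count(context)
probs = {w: bigram_counts[(context, w)] / unigram_counts[context] for w in candidates}
prediction = max(probs, key=probs.get)

print(probs)       # {'country': 0.25, 'family': 0.5, 'wife': 0.0, 'school': 0.25}
print(prediction)  # 'family'
```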
Question:
I understand the probability part of the model, but to even get to the probabilities we need the possible candidates (the next words, in this case family, wife, school, country). How does the model choose the candidates?
Most of the articles online talk about the probability part but don't mention anything about the training phase. What exactly happens in the training phase of this model?