For an n-gram model with n > 2, do we need more context at the end of each sentence?
Jurafsky's book says we need to add context to the left and right of a sentence.
Does this mean, for example, that if we have a corpus of three sentences, "John read Moby Dick", "Mary read a different book", and "She read a book by Cher", then after training our trigram model on this corpus we should evaluate the probability of the sentence "John read a book", i.e. find $P(\text{John read a book})$, as below?
$P(\text{John read a book})$
$= P(\langle s\rangle\ \langle s\rangle\ \text{John read a book}\ \langle /s\rangle\ \langle /s\rangle)$
$= P(\text{John} \mid \langle s\rangle\ \langle s\rangle)\; P(\text{read} \mid \langle s\rangle\ \text{John})\; P(\text{a} \mid \text{John read})\; P(\text{book} \mid \text{read a})\; P(\langle /s\rangle \mid \text{a book})\; P(\langle /s\rangle \mid \text{book}\ \langle /s\rangle)$
$= \frac{1}{3} \cdot \frac{1}{1} \cdot \frac{1}{1} \cdot \frac{1}{2} \cdot \frac{0}{1} \cdot \frac{1}{1}$ (without smoothing)
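To make the counting concrete, here is a minimal Python sketch of how I understand it, assuming every sentence is padded with two `<s>` symbols on the left and two `</s>` symbols on the right (the padding choice is exactly what I am unsure about); the corpus and test sentence are the ones above:

```python
from collections import Counter

corpus = [
    "John read Moby Dick",
    "Mary read a different book",
    "She read a book by Cher",
]

def pad(sentence, n=3):
    # n-1 start symbols on the left and (assumed) n-1 end symbols on the right
    return ["<s>"] * (n - 1) + sentence.split() + ["</s>"] * (n - 1)

trigram_counts, context_counts = Counter(), Counter()
for sent in corpus:
    tokens = pad(sent)
    for i in range(len(tokens) - 2):
        trigram_counts[tuple(tokens[i:i + 3])] += 1
        context_counts[tuple(tokens[i:i + 2])] += 1

def trigram_probability(sentence):
    # Unsmoothed MLE: product of count(w1 w2 w3) / count(w1 w2) over the padded sentence
    tokens = pad(sentence)
    p = 1.0
    for i in range(len(tokens) - 2):
        context = tuple(tokens[i:i + 2])
        trigram = tuple(tokens[i:i + 3])
        p *= trigram_counts[trigram] / context_counts[context] if context_counts[context] else 0.0
    return p

print(trigram_probability("John read a book"))
# prints 0.0 here, since at least one factor has a zero count (no smoothing)
```

The last factor the loop produces corresponds to $P(\langle /s\rangle \mid \text{book}\ \langle /s\rangle)$, which only exists if two end symbols are appended, and that is the part I want to confirm.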
It would be great if you could let me know whether the above understanding is correct.
Topic stanford-nlp ngrams language-model nlp
Category Data Science