For an n-gram model with n > 2, do we need more context at the end of each sentence?
Jurafsky's book says we need to add context to the left and right of a sentence.
Does this mean, for example, that if we have a corpus of three sentences, "John read Moby Dick", "Mary read a different book", and "She read a book by Cher", then after training our trigram model on this corpus we should evaluate the probability of the sentence "John read a book", i.e. find $P(\text{John read a book})$, as below?
$P(\text{John read a book})$
$= P(\langle s\rangle\ \langle s\rangle\ \text{John read a book}\ \langle /s\rangle\ \langle /s\rangle)$
$= P(\text{John} \mid \langle s\rangle\ \langle s\rangle)\; P(\text{read} \mid \langle s\rangle\ \text{John})\; P(\text{a} \mid \text{John read})\; P(\text{book} \mid \text{read a})\; P(\langle /s\rangle \mid \text{a book})\; P(\langle /s\rangle \mid \text{book}\ \langle /s\rangle)$
$= \frac{1}{3} \cdot \frac{1}{1} \cdot \frac{1}{1} \cdot \frac{1}{2} \cdot \frac{0}{1} \cdot \frac{1}{1}$ (without smoothing)
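To make the counting concrete, here is a minimal Python sketch of how I understand it, assuming every sentence is padded with two `<s>` symbols on the left and two `</s>` symbols on the right (the padding choice is exactly what I am unsure about); the corpus and test sentence are the ones above:

```python
from collections import Counter

corpus = [
    "John read Moby Dick",
    "Mary read a different book",
    "She read a book by Cher",
]

def pad(sentence, n=3):
    # n-1 start symbols on the left and (assumed) n-1 end symbols on the right
    return ["<s>"] * (n - 1) + sentence.split() + ["</s>"] * (n - 1)

trigram_counts, context_counts = Counter(), Counter()
for sent in corpus:
    tokens = pad(sent)
    for i in range(len(tokens) - 2):
        trigram_counts[tuple(tokens[i:i + 3])] += 1
        context_counts[tuple(tokens[i:i + 2])] += 1

def trigram_probability(sentence):
    # Unsmoothed MLE: product of count(w1 w2 w3) / count(w1 w2) over the padded sentence
    tokens = pad(sentence)
    p = 1.0
    for i in range(len(tokens) - 2):
        context = tuple(tokens[i:i + 2])
        trigram = tuple(tokens[i:i + 3])
        p *= trigram_counts[trigram] / context_counts[context] if context_counts[context] else 0.0
    return p

print(trigram_probability("John read a book"))
# prints 0.0 here, since at least one factor has a zero count (no smoothing)
```

The last factor the loop produces corresponds to $P(\langle /s\rangle \mid \text{book}\ \langle /s\rangle)$, which only exists if two end symbols are appended, and that is the part I want to confirm.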
It would be great if you could let me know whether the above understanding is correct.
Topic stanford-nlp ngrams language-model nlp
Category Data Science