How does the Naive Bayes algorithm function effectively as a classifier, despite its conditional independence and bag-of-words assumptions?
The Naive Bayes algorithm used for text classification relies on two assumptions to keep it computationally fast (a small sketch illustrating both follows below):
Bag of words: the position of words in the document is ignored
Conditional independence: words are assumed to be independent of one another given the class
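For concreteness, here is a minimal multinomial Naive Bayes sketch (the toy data and names are my own illustration, not from any particular library) showing where each assumption enters the computation:

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set: (class label, document text)
train = [
    ("spam", "win cash now"),
    ("spam", "cash prize win"),
    ("ham",  "meeting at noon"),
    ("ham",  "lunch meeting today"),
]

class_counts = Counter(label for label, _ in train)
word_counts = defaultdict(Counter)   # per-class word frequencies
vocab = set()
for label, text in train:
    for word in text.split():        # bag of words: order/position is discarded
        word_counts[label][word] += 1
        vocab.add(word)

def predict(text):
    scores = {}
    for label in class_counts:
        # log prior P(class)
        score = math.log(class_counts[label] / len(train))
        total = sum(word_counts[label].values())
        for word in text.split():
            # conditional independence: per-word likelihoods P(word | class)
            # simply multiply (add in log space); Laplace smoothing handles
            # words unseen in training
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("win a cash prize"))   # expected: spam
```

Because of these two assumptions, training and prediction reduce to simple counting and a product of per-word probabilities, which is what makes the method so fast.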
In reality, neither assumption usually holds, yet Naive Bayes is still quite effective in practice. Why is that?
Topic naive-bayes-algorithm naive-bayes-classifier
Category Data Science