Stop words list to use for CountVectorization

Question

Juan

August 3, 2021 05:35

By default, scikit-learn's CountVectorizer offers two options: use no stop words
at all, or set stop_words='english' to apply a predefined list of English stop words.
I am using Naive Bayes for SMS spam detection. Is there any other stop word list
I can experiment with?

Topic naive-bayes-algorithim scikit-learn

Category Data Science

SrJ · Accepted Answer · August 3, 2021 05:35

import nltk
from nltk.corpus import stopwords

# The stop word corpus has to be downloaded once before first use.
nltk.download('stopwords')
print(stopwords.words('english'))
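
Since the stop_words parameter of CountVectorizer also accepts a plain list of strings, any list can be swapped in, including the one returned by stopwords.words('english'). Here is a minimal sketch of that idea. To stay runnable without the NLTK download, it extends scikit-learn's built-in ENGLISH_STOP_WORDS instead; the extra SMS-style words and the two sample messages are made up for illustration and are not from any real dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS

# Hypothetical SMS-specific additions, chosen only for illustration.
extra_words = {"ur", "txt", "pls"}

# stop_words expects a list of strings, so convert the combined set to a list.
custom_stop_words = sorted(ENGLISH_STOP_WORDS.union(extra_words))

# Made-up sample messages.
messages = [
    "Free prize, txt WIN to claim ur reward",
    "Are you coming to the meeting pls",
]

vectorizer = CountVectorizer(stop_words=custom_stop_words)
counts = vectorizer.fit_transform(messages)

# Tokens that survived stop word removal.
vocabulary = sorted(vectorizer.vocabulary_)
print(vocabulary)
```

Whether a larger or smaller list helps spam detection depends on the data, so it is worth comparing the classifier's validation score for each list rather than assuming one is better.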
