When should tokenization be done, and does my output still need tokenization after stemming?
I am working on a sentiment analysis project that involves various customer reviews, and I am trying to clean them.
First, I removed special characters, whitespace, and numbers from the text. Next, I removed stop words (this, that, have, etc.). After that, I applied stemming (stripping suffixes such as -ing, -ed, -y, etc.).
Below is my output. What I want to know is: is tokenization still needed here?
I ask because my output after stemming looks as if tokenization has already been done.
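For reference, here is a minimal sketch of the cleaning steps described above. It is not my exact code: the stop-word list, the suffix list, and the function names (`clean_text`, `naive_stem`, `preprocess`) are all hypothetical and kept tiny for illustration, and the suffix stripping is a crude stand-in for a real stemmer. It suggests why the output may already look tokenized: stop-word removal and stemming work word by word, so the text has to be split into words somewhere along the way.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only)
STOP_WORDS = {"this", "that", "have", "the", "is", "a", "i", "and", "it"}

# Naive suffixes to strip; a real stemmer applies many more rules
SUFFIXES = ("ing", "ed", "y")


def clean_text(text):
    """Replace every run of non-letter characters with one space, then lowercase."""
    return re.sub(r"[^a-zA-Z]+", " ", text).lower().strip()


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(review):
    # Splitting on whitespace here is already a simple form of tokenization
    words = clean_text(review).split()
    words = [w for w in words if w not in STOP_WORDS]
    return [naive_stem(w) for w in words]


tokens = preprocess("I have 2 complaints: this phone kept crashing!!")
print(tokens)  # ['complaints', 'phone', 'kept', 'crash']
```

In this sketch the result is already a list of word tokens, so a separate tokenization pass afterwards would have nothing left to split.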
Topic tokenization preprocessing sentiment-analysis nlp data-cleaning
Category Data Science