Why (or how) does a Keras model skip Stemming or Lemmatization steps?

This Keras article / tutorial here does perform text standardization, i.e. removing HTML elements, punctuation, etc. from the text dataset; however, there is a distinct lack of any stemming or lemmatization before the vectorization step.
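For reference, here is a minimal sketch of the kind of standardization the tutorial applies, written with plain Python string operations (the tutorial itself does this inside a custom function using TensorFlow string ops, so this is only an illustration, not the tutorial's code):

```python
import re
import string

def standardize(text: str) -> str:
    """Lowercase, strip HTML tags, and remove punctuation --
    roughly the standardization applied before vectorization."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)  # drop HTML tags like <br />
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())  # collapse extra whitespace

print(standardize("Great movie!<br />Loved it."))  # great movie loved it
```

Note that nothing here merges inflected forms: "loved" and "loving" would still end up as separate vocabulary entries.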

I have some experience in deep learning but I am very new to NLP. I recently learned (from a different tutorial on Udemy, which was using Bag of Words) that using either a stemmer or a lemmatizer helps bring down the vocabulary size and hence improves performance. I am a bit baffled by the absence of this step in the Keras way of doing things.
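To make the vocabulary-size argument concrete, here is a toy illustration. `toy_stem` below is a crude hypothetical suffix-stripper I wrote for this example (it is not NLTK's Porter stemmer, and it produces rough stems like "runn"), but it shows how mapping inflected forms to a shared stem shrinks the set of distinct tokens:

```python
def toy_stem(word: str) -> str:
    """Crude suffix-stripping stemmer (illustration only)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

tokens = ["run", "runs", "running", "walked", "walking", "walk"]
raw_vocab = set(tokens)                         # 6 distinct tokens
stemmed_vocab = {toy_stem(t) for t in tokens}   # {"run", "runn", "walk"}

print(len(raw_vocab), len(stemmed_vocab))  # 6 3
```

With Bag of Words, each distinct token is a feature, so halving the vocabulary like this directly shrinks the feature space.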

Here is one guess of mine: is the step omitted because a neural network model is capable of handling a larger vocabulary size? I cannot think of any other reason why that might be the case.

Topic keras sentiment-analysis nlp

Category Data Science
