Should I create single feature for each specific word which i find in text or one for all them?
I am doing feature engineering right now for my classification task. In my dataframe I have a column with text messages. I decided to create a binary feature which depends on whether or not in this text were words call, phone, mobile, @gmail, mail facebook. But now I wonder should I create separate binary features for each word (or group of words) or one for all of them. How to check which solution is better. Is there any metric and how usually people do in practice. Thanks)
Topic dummy-variables feature-engineering nlp machine-learning
Category Data Science