Spam/ham classification

I am exploring the use of LIME for spam/ham classification. Specifically, I have a data frame containing a list of messages, and I need to identify which messages are spam and which are ham using a set of 100 words. I also need to test the accuracy of the model. I found some articles on Towards Data Science and Medium that helped a bit, but I would like a really small example of what I need. Do I need already labelled messages? If so, should I label them manually first, or use some algorithm to label them as spam/ham? And how would I predict a new message?

Any explanation and comment would be greatly appreciated!
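
A minimal sketch of one possible workflow, assuming scikit-learn is installed and that a labelled sample of messages is available (a supervised classifier needs labels; a few hundred hand-labelled messages or a public spam corpus are common starting points). The toy messages, the 12-word `vocabulary` standing in for the 100-word list, and the choice of Naive Bayes are all illustrative assumptions, not part of the original question. LIME does not do the classification itself; it explains the predictions of a model that has already been trained, so the LIME step is optional and only runs if the `lime` package is available.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy, hand-labelled data (hypothetical). With a real data frame this
# would be something like df["message"].tolist() and df["label"].tolist().
messages = [
    "Win a free prize now, click here",
    "Free cash offer, click to claim",
    "You win cash and a free prize",
    "Click for a free offer, win cash today",
    "Exclusive offer: click to win a prize",
    "Cash prize waiting, click for free entry",
    "Are we still on for lunch tomorrow?",
    "Thanks for sending the report",
    "The meeting is moved to tomorrow",
    "I will be home late, thanks",
    "Please review the report before the meeting",
    "Lunch at home tomorrow?",
]
labels = ["spam"] * 6 + ["ham"] * 6

# Hypothetical stand-in for the fixed list of 100 words.
vocabulary = [
    "free", "win", "prize", "cash", "click", "offer",
    "meeting", "lunch", "tomorrow", "thanks", "home", "report",
]

# Hold out part of the labelled data to estimate accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.25, stratify=labels, random_state=0
)

# Count only the chosen words, then fit a Naive Bayes classifier.
model = make_pipeline(CountVectorizer(vocabulary=vocabulary), MultinomialNB())
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print("held-out accuracy:", accuracy)

# Predicting a new, unseen message.
new_message = "Click now to win a free cash prize"
prediction = model.predict([new_message])[0]
print("prediction:", prediction)

# Optional: explain the prediction with LIME if the package is installed.
try:
    from lime.lime_text import LimeTextExplainer
except ImportError:
    LimeTextExplainer = None

if LimeTextExplainer is not None:
    # class_names must follow the column order of predict_proba.
    explainer = LimeTextExplainer(class_names=list(model.classes_))
    explanation = explainer.explain_instance(
        new_message, model.predict_proba, num_features=5, num_samples=500
    )
    # Pairs of (word, weight) for the "spam" class (column index 1).
    print(explanation.as_list())
```

With only a dozen messages the accuracy figure means very little; on real data, cross-validation (for example `cross_val_score`) gives a more stable estimate than a single split.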

Topic lime text-classification python machine-learning

Category Data Science
