Keyword extraction for business-rule text classification
I would like to classify texts without using any ML model. My idea is to find a list of keywords that I would assign to each class. Then when I need to classify a new text, I can compare it with my list of keywords and count how many keywords for each class are in the text; the class with the most corresponding keywords would be my final prediction.
Example of classification for this list of keywords:
green : A
red : B
apple : A
car : C
The sentence "A green apple in a car" is classified as A.
(Points = A : 2, B : 0, C : 1)
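To make the scoring rule concrete, here is a minimal Python sketch of it. The function names `score` and `classify`, the regex tokenizer, and the choice to count each distinct keyword once per text are my own assumptions for illustration; only the keyword list comes from the example above.

```python
import re
from collections import Counter

# Keyword-to-class mapping taken from the example above.
KEYWORDS = {"green": "A", "red": "B", "apple": "A", "car": "C"}


def score(text, keywords=KEYWORDS):
    """Return points per class: one point per distinct keyword found in the text."""
    # Assumption: lowercase the text and split on runs of ASCII letters.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    points = Counter()
    for word, label in keywords.items():
        if word in tokens:
            points[label] += 1
    return points


def classify(text, keywords=KEYWORDS):
    """Return the class with the most keyword hits, or None if nothing matches."""
    points = score(text, keywords)
    if not points:
        return None
    # Ties are resolved in favor of the class whose keyword appears first in the mapping.
    return points.most_common(1)[0][0]


print(dict(score("A green apple in a car")))
print(classify("A green apple in a car"))
```

Classes with no matching keyword are simply absent from the counter (looking them up gives 0), and a text with no keyword at all returns `None` rather than an arbitrary class.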
My question: what techniques should I explore to build this keyword list from thousands of text samples and about 5 classes? Most keyword extraction algorithms I have found (RAKE, etc.) focus on extracting keywords from a single text, which is not my goal.
This would serve as a good baseline algorithm to compare against more advanced ML classification techniques in my study.
Topic word text classification
Category Data Science