How can I cluster sentences by company name, using a similarity metric, when each post mentions several companies?
My corpus consists of posts, and each post contains text about several different companies.
I want to group the sentences by a few company names that I specify, so that each group holds the information about one company. The grouping should be based on a similarity measure such as Euclidean distance or cosine similarity.
Which algorithm is suitable for clustering around company names that I specify, and which similarity measure should I use?
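To make the goal concrete, here is a minimal sketch of what I have in mind, using only the Python standard library. It is an illustration under assumptions, not a settled approach: the names `split_sentences`, `bag_of_words`, `cosine`, and `cluster_by_company`, the naive sentence splitter, the raw word-count vectors, the `threshold` value, and the sample post are all made up for this example. Sentences that mention a company by name seed that company's group, and each remaining sentence is attached to the group whose summed word counts are most similar under cosine similarity.

```python
import math
import re
from collections import Counter, defaultdict


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(text):
    # Lowercased word counts; a real pipeline would likely use TF-IDF instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_by_company(posts, companies, threshold=0.1):
    sentences = [s for post in posts for s in split_sentences(post)]
    clusters = defaultdict(list)
    unassigned = []

    # Pass 1: a sentence that names a company seeds that company's group.
    for sentence in sentences:
        hits = [
            c for c in companies
            if re.search(r"\b" + re.escape(c) + r"\b", sentence, re.IGNORECASE)
        ]
        if hits:
            for company in hits:
                clusters[company].append(sentence)
        else:
            unassigned.append(sentence)

    # Pass 2: attach each remaining sentence to the most similar seed group,
    # comparing against the summed word counts of the seed sentences only.
    centroids = {
        c: sum((bag_of_words(s) for s in seeds), Counter())
        for c, seeds in clusters.items()
    }
    for sentence in unassigned:
        vector = bag_of_words(sentence)
        best, best_score = None, threshold
        for company, centroid in centroids.items():
            score = cosine(vector, centroid)
            if score > best_score:
                best, best_score = company, score
        if best is not None:
            clusters[best].append(sentence)

    return dict(clusters)


# Hypothetical sample data, for illustration only.
posts = [
    "Acme opened a new factory in Ohio. "
    "The factory will build electric engines. "
    "Globex reported record cloud revenue."
]
companies = ["Acme", "Globex"]
clusters = cluster_by_company(posts, companies)
```

In this sample, the second sentence names no company but shares the word "factory" with the Acme seed sentence, so it ends up in the Acme group. Sentences whose best similarity stays at or below `threshold` are left out of every group.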
Topic text-mining nlp python
Category Data Science