Dataset and guidance for an occupation/role classification problem
I am working on a project where I need to find similar roles. For example, "Software Engineer", "Soft. Engineer", and "Software Eng" should all be marked as similar.
So far, I have tried the Standard Occupational Classification (SOC) dataset with LSA, Levenshtein distance, and unsupervised fastText embeddings with Word Mover's Distance. The last option works, but the results aren't great.
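To make the edit-distance baseline concrete, here is a minimal pure-Python sketch of a Levenshtein-based title similarity. It is an illustration only, not necessarily the exact code used: the helper names `normalize` and `similarity`, the punctuation stripping, and the choice to divide by the longer string's length are all assumptions.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def normalize(title: str) -> str:
    """Hypothetical cleanup: lowercase, drop punctuation, collapse spaces."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", title.lower()).split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1 minus distance over the longer length (an assumed normalization)."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    reference = "Software Engineer"
    for variant in ["Soft. Engineer", "Software Eng", "Data Analyst"]:
        print(f"{variant!r}: {similarity(reference, variant):.2f}")
```

A pure character-level score like this handles truncations such as "Software Eng" reasonably well, but it cannot relate titles that mean the same thing and are spelled differently, which is why the embedding-based approach was also tried.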
Are there more comprehensive datasets, or better ways to solve this problem? Any leads would be helpful!
Topic fasttext word2vec dataset nlp machine-learning
Category Data Science