NLP: Compare tags semantically with machine learning? (finding synonyms)
Let's say I have multiple tags that I need to compare semantically. For example:
tags = ['Python', 'Football', 'Programming', 'Handball', 'Chess', 'Cheese', 'board game']
I would like to compare these tags (and many more) semantically to get a similarity value between 0 and 1. For example, I want to get values like these:
f('Chess', 'Cheese') = 0.0 # tags look similar, but mean very different things
f('Chess', 'board game') = 0.9 # because chess is a board game
f('Football', 'Handball') = 0.3 # because both are sports with a ball
f('Python', 'Programming') = 0.9 # because Python is a programming language
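To make the goal concrete, here is a minimal sketch of how a candidate f could be checked against the values above. The names `EXPECTED`, `string_similarity` and `mean_abs_error` are my own, made up for illustration. The character-level baseline uses `difflib` from the Python standard library and is only there to show what I do not want: it rates 'Chess' and 'Cheese' as quite similar because it only looks at spelling.

```python
from difflib import SequenceMatcher

# Target values copied from the examples above.
EXPECTED = {
    ('Chess', 'Cheese'): 0.0,
    ('Chess', 'board game'): 0.9,
    ('Football', 'Handball'): 0.3,
    ('Python', 'Programming'): 0.9,
}


def string_similarity(a, b):
    """Character-level similarity in [0, 1]; it knows nothing about meaning."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def mean_abs_error(f, expected=EXPECTED):
    """Average absolute gap between f's scores and the target values."""
    gaps = [abs(f(a, b) - target) for (a, b), target in expected.items()]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    for (a, b), target in EXPECTED.items():
        score = string_similarity(a, b)
        print(f"{a!r} vs {b!r}: got {score:.2f}, want {target}")
    print("mean absolute error:", round(mean_abs_error(string_similarity), 2))
```

Any semantic approach could be plugged in as `f` in `mean_abs_error` and should score much closer to 0 than this spelling-based baseline.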
So what is the state-of-the-art approach to get a function f like this? I know that machine learning might be able to do this, but the area is huge and overwhelming for me (at first glance, NLP seems to focus on other problems).
What would be the best approach for this specific problem?
Topic semantic-similarity nlp machine-learning
Category Data Science