Word similarity considering special characteristics
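I'm looking for an algorithm that computes the similarity between two strings, like the Levenshtein distance.
However, I want it to take a few extra characteristics into account. An edit distance that counts a swap of two adjacent letters as a single edit (the Damerau-Levenshtein variant; plain Levenshtein would score the second case as 2)
gives me the same value for these cases: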
distance(apple, appli) #1
distance(apple, appel) #1
distance(apple, applr) #1
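To make the numbers above reproducible, here is a minimal sketch of the distance I am assuming: the optimal-string-alignment variant, where insertions, deletions, substitutions and swaps of two adjacent letters each cost 1. The function name `osa_distance` is only illustrative. Note that plain Levenshtein without the transposition step would give 2 for the second pair.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance where insert, delete, substitute and adjacent swap each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "le" <-> "el"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


for other in ("appli", "appel", "applr"):
    print(other, osa_distance("apple", other))
```

All three pairs come out at the same distance, which is exactly the behaviour I want to refine.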
However, I want the second and third examples to have a smaller distance, for the following reasons:
- second example: all the correct letters are used in the second word, just in a different order
- third example: r is much more likely to be a typo of the letter e because of the keyboard placement.
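To make the behaviour I am after concrete, here is a rough sketch rather than a known algorithm. Everything in it is an assumption for illustration: the cost values (0.5 for an adjacent swap, 0.5 for substituting a neighbouring key), the QWERTY layout, the crude row/column adjacency test that ignores the physical stagger of the keys, and the names `keys_adjacent`, `substitution_cost` and `weighted_distance`. It only handles lowercase letters.

```python
# Hypothetical costs, chosen only for illustration.
NEAR_KEY_COST = 0.5   # substituting a key that sits next to the intended one
TRANSPOSE_COST = 0.5  # swapping two adjacent letters

QWERTY_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")
KEY_POS = {ch: (r, c) for r, row in enumerate(QWERTY_ROWS) for c, ch in enumerate(row)}


def keys_adjacent(x: str, y: str) -> bool:
    """Crude adjacency: at most one row and one column apart (ignores key stagger)."""
    if x == y or x not in KEY_POS or y not in KEY_POS:
        return False
    (r1, c1), (r2, c2) = KEY_POS[x], KEY_POS[y]
    return abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1


def substitution_cost(x: str, y: str) -> float:
    if x == y:
        return 0.0
    return NEAR_KEY_COST if keys_adjacent(x, y) else 1.0


def weighted_distance(a: str, b: str) -> float:
    """Edit distance with cheaper adjacent swaps and neighbouring-key substitutions."""
    d = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        d[i][0] = float(i)
    for j in range(1, len(b) + 1):
        d[0][j] = float(j)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # deletion
                d[i][j - 1] + 1.0,  # insertion
                d[i - 1][j - 1] + substitution_cost(a[i - 1], b[j - 1]),
            )
            # Adjacent transposition at a reduced cost
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + TRANSPOSE_COST)
    return d[-1][-1]


for other in ("appli", "appel", "applr"):
    print(other, weighted_distance("apple", other))
```

With these made-up weights, appli stays at 1.0 while appel and applr drop to 0.5, which is the ordering I would like an established method to produce.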
Are you familiar with any algorithm that weights such characteristics?