Similarity Matching Algorithm

Question

Similarity Matching Algorithm

Keith

2022年5月8日 11:06

I am looking for help on identifying a class of algorithm. If I have a tabular training and test set I want to know the similarity of rows based on some numeric features. The training data would be labelled such that rows would be paired (or even grouped). The output for each row in the test/prediction set would be the most similar row and the probability that it would have been paired with that row.

In theory there could be a matrix with a score for the pair by pair comparison but I only need the highest for my use case.

This feels somewhat like collaborative filtering but I am not quite sure what to call this class of algorithm. Bonus points if you can point me to a python library.

Topic similarity machine-learning

Category Data Science

Keith · Accepted Answer · 2021年1月27日 20:51

1

Keith answered at 2021年1月27日 20:51

It seems there does exist a python library for supervised ML to solve this

https://pypi.org/project/dedupe/

Similarity Matching Algorithm

About