Similarity Matching Algorithm

I am looking for help identifying a class of algorithm. Given a tabular training set and test set, I want to measure the similarity between rows based on some numeric features. The training data would be labelled so that rows are paired (or even grouped). For each row in the test/prediction set, the output would be the most similar row and the probability that the two rows would have been paired.

In theory, there could be a matrix holding a score for every pairwise comparison, but I only need the highest score per row for my use case.
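To make the "score matrix, keep only the best" idea concrete, here is a minimal sketch. It assumes a plain distance-based score (Euclidean distance mapped into (0, 1]), which is a similarity and not yet a pairing probability. The function names and the toy rows are made up for illustration.

```python
import math

def similarity(a, b):
    """Map Euclidean distance to a score in (0, 1]; identical rows score 1.0."""
    return 1.0 / (1.0 + math.dist(a, b))

def best_matches(queries, candidates):
    """For each query row, return (index of the best candidate, its score)."""
    results = []
    for q in queries:
        # One row of the conceptual pairwise score matrix.
        scores = [similarity(q, c) for c in candidates]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((best, scores[best]))
    return results

# Toy data (assumed): three candidate rows and two query rows.
candidates = [[1.0, 2.0], [5.0, 5.0], [9.0, 1.0]]
queries = [[1.1, 2.1], [8.0, 1.5]]
matches = best_matches(queries, candidates)
```

Only one row of scores is held at a time, so the full matrix never needs to be stored.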

This feels somewhat like collaborative filtering, but I am not sure what to call this class of algorithm. Bonus points if you can point me to a Python library.

Topic similarity machine-learning

Category Data Science


There is a Python library that uses supervised ML to solve this kind of problem, which is usually called record linkage or entity resolution:

https://pypi.org/project/dedupe/
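For intuition, the supervised framing can be sketched without any library: describe each pair of rows by its per-feature absolute differences, train a binary classifier on labelled pairs, then score a new row against every candidate and keep the most probable one. The sketch below is not dedupe's API. It assumes a hand-rolled logistic regression, and all names and toy data are hypothetical.

```python
import math

def pair_features(a, b):
    """Describe a pair of rows by the absolute difference of each feature."""
    return [abs(x - y) for x, y in zip(a, b)]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(pairs, labels, epochs=2000, lr=0.1):
    """Fit logistic regression with stochastic gradient descent."""
    w, b = [0.0] * len(pairs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(pairs, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def most_likely_match(row, candidates, w, b):
    """Return (index, probability) of the candidate most likely paired with row."""
    probs = [
        sigmoid(sum(wi * xi for wi, xi in zip(w, pair_features(row, c))) + b)
        for c in candidates
    ]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

# Toy labelled pairs (assumed): 1 = rows were paired, 0 = they were not.
labelled = [
    ([1.0, 2.0], [1.1, 2.1], 1),
    ([5.0, 5.0], [5.2, 4.9], 1),
    ([9.0, 1.0], [8.8, 1.2], 1),
    ([1.0, 2.0], [5.2, 4.9], 0),
    ([5.0, 5.0], [8.8, 1.2], 0),
    ([9.0, 1.0], [1.1, 2.1], 0),
]
X = [pair_features(a, b) for a, b, _ in labelled]
y = [label for _, _, label in labelled]
w, b = train(X, y)

candidates = [[1.0, 2.0], [5.0, 5.0], [9.0, 1.0]]
index, prob = most_likely_match([1.05, 2.0], candidates, w, b)
```

Here `prob` is the model's estimate that the query row and its best candidate belong together. With real data, comparing every row against every candidate gets expensive, so some way of limiting the candidate set is usually needed.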
