Fuzzy Address Matching using RapidFuzz
I am using RapidFuzz to match US addresses from two separate datasets.
I got the results I was hoping for with the code below:
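    from rapidfuzz import process, fuzz
    matches1 = []
    for address in EB_RATING_LIST:
        # extractOne returns (best match, score, index of the match in CLAIMS_LIST)
        matches1.append(process.extractOne(address, CLAIMS_LIST, scorer=fuzz.ratio))
    DAVE_EB_NO_DUPLICATES_ADDRESS['MATCHED_ADDRESS'] = matches1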
However, I am not fully confident in the results I received. For example:
"10 Washington Street" has an 86% match ratio with "102 Washington Street".
My question is: how can I do the fuzzy matching at a more granular level? Should I also include the ZIP code and state in the matching?
EDIT 09/14/21: I am concatenating the address with the city and state and then trying to match. I will share the results as soon as I get them.
EDIT 09/15/21: I concatenated the address with the city and state names and then ran fuzzy matching on the combined string.
EXAMPLE: 5805thAveStes323416NewYorkNY
(3505thAveNewYorkNY, 72.34042553191489, 9315)
Each result has the form (best-matching address, match percentage, index of that address in the table used for matching).
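The combined strings in the example have no spaces, so the house number and street name run together. A possible variation, shown here only as a sketch, is to build the combined key with single spaces kept between tokens. The helper name `build_match_key` and the sample values are hypothetical and not taken from my data.

```python
import re


def build_match_key(address, city, state):
    """Join address parts into one lowercase string with single spaces."""
    parts = [address, city, state]
    # Collapse runs of whitespace inside each part, trim it, and lowercase it.
    cleaned = [re.sub(r"\s+", " ", str(p)).strip().lower() for p in parts]
    # Skip empty parts so missing fields do not leave stray spaces.
    return " ".join(p for p in cleaned if p)


print(build_match_key("350 5th Ave", "New York", "NY"))
# prints: 350 5th ave new york ny
```

Keeping the token boundaries also makes it possible to try token-based scorers such as `fuzz.token_sort_ratio` on the keys. Whether that improves the scores on real data is something I would still need to check.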