Analysing misreads of 5-digit numbers
This has nothing to do with number recognition in the classical 'hand-written' sense.
I mention that up front so this isn't counted as a repeat question.
I have a set of 96 serial numbers, and a separate set of more than 220 serial numbers. The larger set usually (though not always) contains the numbers from the smaller set, but it also contains about 120 incorrect numbers.
Here is an example. I have matched things up as best I can: the correct number comes first, and the 'possibles' follow in parentheses:
21490 (21490, 21400, 21498, 21499, 21480, 21488)
21491 (21401, 21481, 1401)
21492 (21492, 21402)
This set is a good example of the kind of thing I'm seeing:
Digits being misread the same way each time (0 read as 9 or 8)
Sometimes a digit is missed entirely (e.g. 1401)
Sometimes the right number isn't read at all (e.g. 21491 never appears among its possibles)
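One way to put numbers on these patterns is to count which digit was read as which, position by position. The sketch below uses only the three example rows above; the names `reads` and `tally_confusions` are made up for illustration, and reads with a different length are simply counted separately rather than aligned.

```python
from collections import Counter

# Example rows from this post: correct serial -> reads matched to it.
reads = {
    "21490": ["21490", "21400", "21498", "21499", "21480", "21488"],
    "21491": ["21401", "21481", "1401"],
    "21492": ["21492", "21402"],
}

def tally_confusions(reads):
    """Count (true_digit, read_digit) substitutions over same-length reads."""
    subs = Counter()      # (true digit, read digit) -> number of substitutions
    seen = Counter()      # true digit -> number of times it was compared
    length_errors = 0     # reads with a dropped or extra digit
    for truth, candidates in reads.items():
        for read in candidates:
            if len(read) != len(truth):
                length_errors += 1
                continue
            for t, r in zip(truth, read):
                seen[t] += 1
                if t != r:
                    subs[(t, r)] += 1
    return subs, seen, length_errors

subs, seen, length_errors = tally_confusions(reads)

# Per-digit "score": fraction of appearances in which the digit was misread.
error_rate = {
    d: sum(c for (t, _), c in subs.items() if t == d) / n
    for d, n in seen.items()
}

for (t, r), c in subs.most_common():
    print(f"{t} read as {r}: {c}")
print("reads with wrong length:", length_errors)
```

On these three rows the tally shows 9 being misread at least as often as 0, so it is worth running on the full matched set before deciding which digits are really the worst.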
It's not limited to 0s, 8s and 9s, but these are the worst. I'd like to understand which digits are problematic (and give each one a score), then build a model that takes a number as read, knows the list of numbers it CAN be, and tells me which number it should be, ideally with a confidence metric.
Has anyone done this before, or does anyone have any ideas?
Topic jupyter numerical python
Category Data Science