Data Entry Automation with ML
I am working on a data entry task with approximately 6000 entries to go over.
The source comes in the form of a string and can look something like this:
Air Canada B737 FFS
From this I can extract the following information:
Company: Air Canada
Model: B737
Technology: FFS
For my initial plan of attack, I iterated over the source strings with regular expressions to extract as many keywords as possible. The problem is that there are so many different companies, models and technologies in my source, and not every source string is as clean as the one above, so it would take forever to write out all the possible regular expressions. Also, there is usually text within the source that isn't important, a red herring if you will.
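To make the regex approach concrete, here is a minimal sketch of the kind of keyword matching I mean. The keyword lists, the `build_pattern` helper and the `extract` function are illustrative names only, not my actual code, and the real lists would need far more entries than the single example shown here.

```python
import re

# Hypothetical keyword lists; the real data contains many more variants.
COMPANIES = ["Air Canada"]
MODELS = ["B737"]
TECHNOLOGIES = ["FFS"]


def build_pattern(keywords):
    """Compile one case-insensitive alternation over a list of keywords."""
    # Try longer keywords first so they win over shorter ones that share a prefix.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


PATTERNS = {
    "Company": build_pattern(COMPANIES),
    "Model": build_pattern(MODELS),
    "Technology": build_pattern(TECHNOLOGIES),
}


def extract(source):
    """Return the first match for each field, or None when nothing matches."""
    result = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(source)
        result[field] = match.group(1) if match else None
    return result


print(extract("Air Canada B737 FFS"))
```

This works for the clean example above, but every new company, model or technology means another entry in one of the lists, and any string that uses a spelling I have not listed simply comes back as None.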
As a result I have manually filled out 2000 of the entries myself.
My question is: do you think I could train an ML model on my manually filled dataset to do the rest of this task for me? I would consider myself to have intermediate skills in Python, but what type of ML algorithm should I use for this task? Any help pointing me in the right direction would be very much appreciated!
Topic text-mining machine-learning
Category Data Science