Identify Resume Structure

I am trying to build a resume parser (from PDF to JSON). After extracting the text from a PDF as one long string, how would you split the string into different sections, as the red lines show? Resumes have different formats, and people use different labels for these sections. Is there any machine learning technique that I could look into? Thanks!

Topic document-understanding machine-learning

Category Data Science


This is one of the well-known implementations for your task. It works well in most cases. If you just need such a tool, you can use it as is. However, if you want to develop your own tool, you might want to study how it is structured.

It can also look for specific skills in resumes, as mentioned here.

As you require, it accepts PDF (and also DOC) files and returns JSON.
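If you do end up writing your own splitter, a rough starting point might look like the sketch below. It is not how the tool above works. It is a simple heuristic that treats a line as a section heading when, after normalization, it matches a known label. The alias table `SECTION_ALIASES` and the helper names `normalize` and `split_sections` are made up for illustration, and the sketch assumes the extracted text still contains line breaks.

```python
import json
import re

# Hypothetical label synonyms; extend this table for the resumes you actually see.
SECTION_ALIASES = {
    "experience": ["experience", "work experience", "employment", "work history"],
    "education": ["education", "academic background", "qualifications"],
    "skills": ["skills", "technical skills", "core competencies"],
    "projects": ["projects", "personal projects"],
}

# Map every alias to its canonical section name for constant-time lookup.
ALIAS_TO_SECTION = {
    alias: name for name, aliases in SECTION_ALIASES.items() for alias in aliases
}


def normalize(line):
    """Lowercase a line and drop punctuation so 'EDUCATION:' matches 'education'."""
    letters_only = re.sub(r"[^a-z ]", "", line.lower())
    return re.sub(r" +", " ", letters_only).strip()


def split_sections(text):
    """Split resume text into sections keyed by canonical heading name.

    Lines before the first recognized heading are collected under 'header'.
    """
    sections = {"header": []}
    current = "header"
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        key = ALIAS_TO_SECTION.get(normalize(line))
        if key is not None:
            # The whole line is a known label: start (or resume) that section.
            current = key
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    return {name: "\n".join(lines) for name, lines in sections.items()}


if __name__ == "__main__":
    sample = (
        "Jane Doe\njane@example.com\n\n"
        "WORK EXPERIENCE:\nAcme Corp, Engineer\n\n"
        "Education\nBSc Physics\n\n"
        "Technical Skills\nPython, SQL\n"
    )
    print(json.dumps(split_sections(sample), indent=2))
```

A lookup table like this misses any heading that is not listed, which is where a learned line classifier (heading vs. body) could take over, but the dictionary-of-sections output shape would stay the same.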
