Parse PDF into JSON or XML
I want to build a neural net that extracts specific words from a PDF document and outputs them as JSON or XML. For example, assume I have a PDF containing information about countries, and I want to extract each country's name and population to obtain something like this:
    <countries>
      <country>
        <name>France</name>
        <population>70m</population>
      </country>
      ...
    </countries>
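To make the target format concrete, here is a minimal Python sketch of how the output above could be serialized once the (name, population) pairs are available. It covers only the output side: the `records` list is hard-coded as a stand-in for whatever the extraction step returns, and the function names `to_xml` and `to_json` are just illustrative. Getting those pairs out of the PDF is the part I am asking about.

```python
import json
import xml.etree.ElementTree as ET


def to_xml(records):
    """Serialize (name, population) pairs into the <countries> layout."""
    root = ET.Element("countries")
    for name, population in records:
        country = ET.SubElement(root, "country")
        ET.SubElement(country, "name").text = name
        ET.SubElement(country, "population").text = population
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(root, encoding="unicode")


def to_json(records):
    """Serialize the same pairs into an equivalent JSON document."""
    countries = [{"name": n, "population": p} for n, p in records]
    return json.dumps({"countries": countries})


# Placeholder input: stands in for the (not yet written) PDF extraction step.
records = [("France", "70m")]

print(to_xml(records))
print(to_json(records))
```

Either function would work for me; I only need one of the two formats.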
Should I build a neural net and train it myself? If so, can you recommend a good tutorial to follow? Or is there an already trained one that I can use?
Topic neural-network parsing
Category Data Science