PDF to JSON libraries
I am looking for a library that converts PDF to JSON. In that JSON, each paragraph heading should be the key and the content of the paragraph should be the value. Is there a Python library for that? I am already using pdfminer, but it only converts to plain text and does not preserve the structure/organisation of the document. For now it is fine to skip images and tables, although a library that handles them as well would be great.
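To make the target format concrete, here is a minimal sketch of the output I have in mind. It assumes the text blocks and their font sizes have already been extracted somehow; the `blocks` input, the `blocks_to_sections` helper, and the `heading_size` threshold are all hypothetical and are not part of pdfminer.

```python
import json


def blocks_to_sections(blocks, heading_size=14.0):
    """Group (text, font_size) blocks into a {heading: paragraph text} dict.

    Assumption: any block whose font size is at least `heading_size`
    is treated as a heading. Real PDFs may need a better heuristic.
    """
    sections = {}
    current = None
    for text, size in blocks:
        text = text.strip()
        if not text:
            continue  # skip empty blocks
        if size >= heading_size:
            # Start a new section keyed by the heading text.
            current = text
            sections.setdefault(current, "")
        elif current is not None:
            # Append body text to the most recent heading.
            sections[current] = (sections[current] + " " + text).strip()
        # Body text that appears before the first heading is dropped.
    return sections


# Hypothetical input: (text, font size) pairs in reading order.
blocks = [
    ("Introduction", 16.0),
    ("This document describes the method.", 11.0),
    ("It has two parts.", 11.0),
    ("Results", 16.0),
    ("The method works.", 11.0),
]

sections = blocks_to_sections(blocks)
print(json.dumps(sections, indent=2))
```

The part I am missing is a library that gives me this kind of heading and paragraph information from the PDF in the first place, rather than one flat string.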
Topic nlp
Category Data Science