Which is the best algorithm for entity extraction from unstructured documents?

Question


Rajesh das

February 2, 2022 06:01

I have unstructured documents from which I need to extract information such as buyer name, seller name, expiry date, buying date, etc. I planned to use spaCy custom entity recognition (following this blog post: https://medium.com/@manivannan_data/how-to-train-ner-with-custom-training-data-using-spacy-188e0e508c6). However, the buyer name is sometimes predicted as the seller name and vice versa, and when I pass the whole document content, a single entity sometimes contains several wrongly predicted values. FYI, these documents are approximately 2-20 pages long, so they contain a lot of text.

Can someone suggest other packages that might give higher accuracy? If not, how should I train the model to improve accuracy? Thanks in advance.

Topic scipy python machine-learning

Category Data Science

score 1 · Accepted Answer · December 26, 2019 20:28


Try cleaning your documents and using the flair library. It is a user-friendly library from Zalando Research that lets you do all sorts of NLP tasks very quickly, especially NER.
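As a rough illustration of that suggestion, here is a minimal sketch that tags a long document with flair's pretrained "ner" model, working chunk by chunk instead of passing all 2-20 pages at once. The helper names split_into_chunks and extract_entities and the max_chars limit are hypothetical, chosen only for this example. Note that the pretrained model only knows generic labels such as PER, ORG and LOC; telling a buyer from a seller would still need custom training or some post-processing on the surrounding text.

```python
import re


def split_into_chunks(text, max_chars=1000):
    """Group paragraphs into chunks of roughly max_chars characters.

    Hypothetical helper: a single paragraph longer than max_chars
    is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        # Start a new chunk when adding this paragraph would exceed the limit.
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current} {para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def extract_entities(text, tagger=None):
    """Run flair NER over each chunk and collect (text, label, score) tuples."""
    # flair is imported lazily so the chunking helper works without it installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Assumption: the generic pretrained English model, downloaded on first use.
    tagger = tagger or SequenceTagger.load("ner")
    results = []
    for chunk in split_into_chunks(text):
        sentence = Sentence(chunk)
        tagger.predict(sentence)
        for span in sentence.get_spans("ner"):
            label = span.get_label("ner")
            results.append((span.text, label.value, label.score))
    return results
```

Calling extract_entities(document_text) on the cleaned text of one document returns a list of tuples that can then be filtered by label or by nearby keywords such as "Buyer" or "Seller".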
