Calculating confidence score in NER

Question

Saikat Bhattacharya

March 19, 2022 03:05

I am working on a Named Entity Recognition problem. Given a text, my model detects the named entities and extracts them for the end user. The end user now needs a confidence score along with each extracted entity. For example, given the text "XYZ Bank India Limited is a good place to invest your money", our model detects "XYZ Bank" as an Org but "India" as a Location, which is wrong: the whole "XYZ Bank India Limited" is the name of the organization. Our model also gives a probability score for each token it classifies. But the end user wants to know how confident the model is that it did not fail to include the subsequent tokens as parts of the organization name.

The question is: how can we efficiently measure whether, in a given sequence, our model is detecting a certain sub-sequence as an Organization name (or a Location, or something else) correctly? How can we say that it did not miss any preceding or subsequent token that is actually part of the named entity (as it missed "India Limited" in the above example)?

Topic sequence-to-sequence named-entity-recognition deep-learning nlp python

Category Data Science

Vivek · Accepted Answer · May 11, 2021 12:13

Named Entity Recognition is traditionally evaluated using precision, recall, and F1 score. This Medium article gives a lowdown on how to compute them: [https://towardsdatascience.com/entity-level-evaluation-for-ner-task-c21fb3a8edf]. I also recently read an article on a new approach to the same problem; please see the details here: [https://towardsdatascience.com/a-pathbreaking-evaluation-technique-for-named-entity-recognition-ner-93da4406930c]. I haven't tried it out yet, though.
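As a rough illustration of entity-level precision, recall, and F1, here is a minimal sketch under the strict exact-match convention, where a predicted entity counts as correct only if its type and both boundaries match the gold annotation. It assumes BIO-style tags and whitespace tokenization, and the gold tags for the example sentence are my own labelling of the question's example. The helper names `extract_spans` and `entity_prf` are made up for illustration and do not come from the linked articles or any library.

```python
from typing import List, Set, Tuple

# (entity type, start token index, end token index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        # Close the open span on "O", on a new "B-", or on an "I-" of another type.
        if label is not None and (
            tag == "O" or tag.startswith("B-") or tag[2:] != label
        ):
            spans.add((label, start, i))
            start, label = None, None
        # Open a span on "B-", or on a stray "I-" when no span is open.
        if tag != "O" and label is None:
            start, label = i, tag[2:]
    return spans


def entity_prf(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Strict entity-level precision, recall, F1: type and boundaries must match."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# The example from the question: 12 whitespace tokens.
tokens = "XYZ Bank India Limited is a good place to invest your money".split()
gold = ["B-ORG", "I-ORG", "I-ORG", "I-ORG"] + ["O"] * 8  # assumed gold labels
pred = ["B-ORG", "I-ORG", "B-LOC", "O"] + ["O"] * 8      # model output as described

print(entity_prf(gold, pred))  # (0.0, 0.0, 0.0)
```

Under strict matching, the truncated "XYZ Bank" span earns no credit even though two of its four tokens are tagged correctly, which is exactly the kind of boundary error the question describes. Note that these scores need labelled data, so they describe how often the model gets boundaries right on a test set rather than giving a confidence for one individual prediction.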
