Inter-Annotator Agreement score for NLP?

I have several annotators who annotated strings of text for me in order to train an NER model. The annotations are in JSON format: each record consists of a string followed by the start and end indices of the named entities, along with their respective entity types. What is the best way to calculate the IAA score in this case? Is there a tool or Python library available?
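For reference, a single record looks roughly like this (the field names here are simplified, not my exact schema):

```python
# Simplified illustration of one annotated record (field names are illustrative only)
record = {
    "text": "Barack Obama visited Paris.",
    "entities": [
        {"start": 0, "end": 12, "label": "PERSON"},
        {"start": 21, "end": 26, "label": "LOC"},
    ],
}
```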

Tags: annotation, named-entity-recognition

Category: Data Science


I think the Kappa coefficient is the most commonly used measure of inter-annotator agreement, although there are other options as well, such as Fleiss' kappa for more than two annotators or Krippendorff's alpha.

sklearn provides an implementation of Cohen's kappa (sklearn.metrics.cohen_kappa_score), which can be used to compare two annotators. Note that kappa expects one label per item, so for NER you first have to project the entity spans onto a common set of units (characters or tokens) and then compare the two annotators position by position.
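A minimal sketch of that approach, assuming each annotation is a (start, end, entity_type) character span over the same text for both annotators (the example text and labels below are made up):

```python
# Rough sketch, not a polished implementation: project character spans onto
# per-character labels, then compute Cohen's kappa between two annotators.
from sklearn.metrics import cohen_kappa_score

def spans_to_char_labels(text, spans):
    """Turn (start, end, entity_type) spans into one label per character."""
    labels = ["O"] * len(text)  # "O" marks characters outside any entity
    for start, end, entity_type in spans:
        for i in range(start, end):
            labels[i] = entity_type
    return labels

text = "Barack Obama visited Paris."
annotator_a = [(0, 12, "PERSON"), (21, 26, "LOC")]
annotator_b = [(0, 12, "PERSON"), (21, 26, "GPE")]  # annotator B chose a different type

labels_a = spans_to_char_labels(text, annotator_a)
labels_b = spans_to_char_labels(text, annotator_b)

print(cohen_kappa_score(labels_a, labels_b))
```

If you have more than two annotators, statsmodels ships a Fleiss' kappa implementation (statsmodels.stats.inter_rater.fleiss_kappa), which takes a table of per-item category counts rather than raw label sequences. Also keep in mind that character- or token-level kappa rewards agreement on the many "O" positions, so some people additionally report pairwise F1 between annotators as a span-level agreement measure.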
