What are helpful annotation tools (if any)

I'm looking for tools that would help me and my team annotate training sets. I work in an environment with large sets of data, some of which are unstructured or semi-structured. In many cases there are registrations that help in finding a ground truth. In many other cases, however, a curated set is needed, even if only for evaluation. A complicating factor is that some of the data cannot leave the premises.

We are looking to annotate an object detection task, but I anticipate an image segmentation task, a text classification task and a sentiment detection task in the near future.

What I'm looking for is a system that can help a group make annotations, preferably in a way that motivates the annotators by showing group progress, relative individual progress, and perhaps personal inter-annotator agreement.
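To make "inter-annotator agreement" concrete, here is a minimal sketch of Cohen's kappa for two annotators who labeled the same items. The function name and the sample labels are made up for illustration. A real tool may report a different agreement measure, for example one that handles more than two annotators.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical sentiment labels from two annotators.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(annotator_1, annotator_2))
```

A value near 1 means the annotators agree far more often than chance would predict, and a value near 0 means agreement is about what chance alone would give.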

Topic annotation classification tools

Category Data Science


Label Studio is a powerful open-source tool with a web interface for annotating different data types: audio, text, images, video, time series, and mixes of them. Conditional and nested annotations are supported too. To configure the system, you write your own labeling config that fits your needs.

Check it here: https://labelstud.io/playground
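As a rough illustration of what such a labeling config can look like for the object detection task in the question, the sketch below builds one in Python. The tags View, Image, RectangleLabels and Label are Label Studio config tags. The class names and the `image` data key are assumptions made up for this example, so adapt them to your own data.

```python
import xml.etree.ElementTree as ET


def build_bbox_config(class_names):
    """Build a Label Studio style labeling config for bounding boxes."""
    view = ET.Element("View")
    # "$image" assumes each task stores the image URL under the key "image".
    ET.SubElement(view, "Image", name="image", value="$image")
    labels = ET.SubElement(view, "RectangleLabels", name="label", toName="image")
    for cls in class_names:
        ET.SubElement(labels, "Label", value=cls)
    return ET.tostring(view, encoding="unicode")


# Hypothetical classes, for illustration only.
config = build_bbox_config(["car", "pedestrian"])
print(config)
```

The printed XML is what you would paste into the project's labeling settings. Generating it from a list of class names is just a convenience when you have many labels.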



You could try the UBIAI annotation tool. It is pretty easy to use, offers multiple annotation export formats, and is currently free.


I have just created a Python library (GitHub, blog post) to quickly create training data for spaCy NER models using ipywidgets.
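For context, NER training data for spaCy is commonly written as text paired with character-offset entity spans. The sketch below shows that shape in plain Python. It is not the API of the library mentioned above, and the helper name, sentence and labels are made up for illustration.

```python
def make_ner_example(text, phrases_with_labels):
    """Return (text, {"entities": [(start, end, label), ...]}) for one text."""
    entities = []
    for phrase, label in phrases_with_labels:
        # Only the first occurrence is marked; phrases not found are skipped.
        start = text.find(phrase)
        if start == -1:
            continue
        entities.append((start, start + len(phrase), label))
    return (text, {"entities": sorted(entities)})


# Hypothetical sentence and labels.
example = make_ner_example(
    "Apple opened an office in Berlin.",
    [("Apple", "ORG"), ("Berlin", "GPE")],
)
print(example)
```

An annotation widget essentially produces these offsets for you, so that you do not have to count characters by hand.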



I have been working with the spaCy extension for INCEpTION from Technische Universität Darmstadt. It seems pretty good so far.


Doccano is a simpler, open-source alternative to Prodigy. It is written in Python, using Django. I found it suitable for simple implementations.


You can try Prodigy by explosion.ai, the creators of spaCy, or brat, an open-source alternative to it. You may also refer to this post on Quora.
