Training NER with NLTK on a custom (non-English) corpus: must I use StanfordNER?
I have searched for how to train an NER model on a custom corpus using Python's NLTK library, but every answer points to chapter 7 of the NLTK book, and honestly I am still confused about the correct workflow for training on a corpus whose data set is structured like this:
Eddy N B-PER
Bonte N I-PER
is V O
woordvoerder N O
van Prep O
diezelfde Pron O
Hogeschool N B-ORG
. Punc O
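To make the layout concrete: each line holds a token, its POS tag, and its IOB entity label, separated by whitespace. A minimal sketch of reading it in plain Python follows. The `read_iob` helper and the `SAMPLE` string are hypothetical names used only for illustration, and the blank line between sentences is an assumption borrowed from the usual CoNLL convention, since the sample above contains a single sentence.

```python
from typing import List, Tuple

# One token per line: (word, POS tag, IOB entity label)
Token = Tuple[str, str, str]

SAMPLE = """\
Eddy N B-PER
Bonte N I-PER
is V O
woordvoerder N O
van Prep O
diezelfde Pron O
Hogeschool N B-ORG
. Punc O
"""


def read_iob(text: str) -> List[List[Token]]:
    """Split three-column text into sentences of (word, pos, iob) triples.

    Assumes whitespace-separated columns and a blank line between
    sentences (an assumption; the sample holds only one sentence).
    """
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        word, pos, iob = line.split()
        current.append((word, pos, iob))
    # Keep the last sentence when the text does not end with a blank line.
    if current:
        sentences.append(current)
    return sentences


sentences = read_iob(SAMPLE)
print(sentences[0][:2])
```

Each sentence comes out as a list of (word, POS, IOB) triples, which is the shape that `nltk.chunk.conlltags2tree` accepts if a chunk tree is needed later.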
I have some questions:
- Many articles I found use the StanfordNER library alongside NLTK when training on a custom corpus. Is that required, or can this be done with NLTK alone?
- Does a grammar pattern need to be included when applying this to other languages? What does the workflow look like?
If you have one, please also share example code that trains on a custom corpus and outputs both the POS tag and the NER label, using data structured like the sample above. Thank you.
Topic nltk named-entity-recognition nlp
Category Data Science