How to classify named entities of the same type?
I am doing a project where I am extracting date/time entities from text. I'm using a rule-based system to extract the temporal expressions and ground them to an actual date/time.
The second part of the problem I hope to solve is to label the role of each entity discovered. For example, consider the following text: "Leaving at 2pm and back at 4pm". I correctly identify 2pm and 4pm as date/time entities. However, I'm unable to say whether each entity is a "start-time", an "end-time", or neither.
My question is: how do I do this?
I'm new to NLP and ML. Here is an idea I have; please tell me if I'm going in the right direction:
The plan is to train a logistic regression (or naive bayes?) classifier using the following features:
- The average of the word embeddings of the words within a window around the date/time phrase.
- The POS tags of the words within a window around the date/time phrase (I'm not sure how to pass these in to a logistic regression classifier, but it's just a thought).
- The word shapes of the words in the temporal expression (not sure about this one either).
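To make the idea concrete, here is a minimal sketch of how I imagine the window and word-shape features could be built. The function names `word_shape` and `context_features`, the whitespace tokenization, and the window size of 2 are all just assumptions for illustration. I have not trained anything with this yet.

```python
import re

def word_shape(token):
    """Map a token to a coarse shape: uppercase -> X, lowercase -> x, digit -> d."""
    shape = re.sub(r"[A-Z]", "X", token)
    shape = re.sub(r"[a-z]", "x", shape)
    shape = re.sub(r"[0-9]", "d", shape)
    return shape

def context_features(tokens, start, end, window=2):
    """Build a sparse feature dict for the entity spanning tokens[start:end].

    Hypothetical helper: each feature is a "name=value" key set to 1,
    which amounts to a one-hot encoding once the keys are vectorized.
    """
    features = {}
    # Word shapes of the tokens inside the temporal expression.
    for i, tok in enumerate(tokens[start:end]):
        features[f"shape_{i}={word_shape(tok)}"] = 1
    # Lowercased neighbouring words, keyed by side and distance from the entity.
    for dist in range(1, window + 1):
        left = start - dist
        if left >= 0:
            features[f"left_{dist}={tokens[left].lower()}"] = 1
        right = end - 1 + dist
        if right < len(tokens):
            features[f"right_{dist}={tokens[right].lower()}"] = 1
    return features

# Naive whitespace tokenization, assumed here only to keep the example short.
tokens = "Leaving at 2pm and back at 4pm".split()
first = context_features(tokens, 2, 3)   # features for "2pm"
second = context_features(tokens, 6, 7)  # features for "4pm"
```

Here "2pm" gets `left_2=leaving` while "4pm" gets `left_2=back`, which is the kind of signal I hope a classifier could pick up. My guess is that POS tags could be added the same way, as extra `name=value` keys, and that the dicts could then be turned into vectors with something like scikit-learn's `DictVectorizer` before being passed to `LogisticRegression`, with the averaged embedding concatenated as dense columns. I'm not sure whether that is the right approach.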
I'm a little confused as to where to start and would really appreciate some pointers on how to select my features and what classifier would be appropriate.
I'm also open to suggestions on learning resources. There are a lot of NER resources online, but not many on how to "role classify" the entities once they are found.
Topic named-entity-recognition nlp machine-learning
Category Data Science