Best strategy for extracting specific structured data from unstructured sentences
Given a list of sentences like this:
- 4 to 5 hours over a period of 16 weeks
- 1st session: 2.0-2.5 hours 2nd session: 1.5-2.0 hours
- Approximately 5-6 visits over the course of 5 months. Visit 1, 3, 5: about 1.5 hours. Visit 2, 4: short
- 15 visits over a period of approximately 74 weeks.
- You will come to the organization about 12 times, over a period of a little more than three years. Each visit will take from 3-6 hours.
What tools or strategy should I use to get a model to produce the following data for the sentences above?
| Number of sessions | Total duration (h) | Total timespan (weeks) |
|---|---|---|
| Unknown | 4-5 | 16 |
| 2 | 3.5-4.5 | Unknown |
| 5-6 | 4.5 | 20 |
| 15 | Unknown | 74 |
| 12 | 36-72 | 156 |
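To make the target concrete, here is a minimal sketch of one output row as a record, together with the arithmetic behind the derived values in the last example (12 visits of 3-6 hours each, over three years). The names `VisitSummary`, `scale_hours`, and `years_to_weeks` are hypothetical, and the 52-weeks-per-year conversion is an assumption. It only illustrates the desired output shape, not an extraction method.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# A (low, high) pair; a single value x is stored as (x, x).
Range = Tuple[float, float]


@dataclass
class VisitSummary:
    """One output row; None stands for "Unknown"."""
    sessions: Optional[Range]
    total_hours: Optional[Range]
    timespan_weeks: Optional[float]


def scale_hours(sessions: float, hours_per_session: Range) -> Range:
    """Multiply a per-session hour range by the number of sessions."""
    low, high = hours_per_session
    return (sessions * low, sessions * high)


def years_to_weeks(years: float) -> float:
    """Convert years to weeks, assuming 52 weeks per year."""
    return years * 52


# Last example sentence: about 12 visits, 3-6 hours each, roughly three years.
row = VisitSummary(
    sessions=(12, 12),
    total_hours=scale_hours(12, (3, 6)),
    timespan_weeks=years_to_weeks(3),
)
print(row)
```

Whatever approach is used would need to both pull out the raw numbers and perform this kind of normalization (per-visit hours to total hours, months or years to weeks).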
I'm an ML beginner and wonder whether this is achievable with TensorFlow or GPT. For further learning on my own: what specific terminology should I search for? Is this NER, text extraction, or more like text classification?
Topic openai-gpt tensorflow
Category Data Science