Transform NL Text to DSL using NN/ML approach
I have a corpus of many system requirements written in natural language (NL). An example requirement looks like this:
When the Gear shifter is put into Drive, the Car should start moving forward!
My task was to develop a DSL that models these NL requirements. I have defined a set of keywords for the DSL, and the requirement above, translated into my DSL, looks like this:
The Component: Gear shifter
if: put into Drive
then: Car should start moving forward
The keywords in this case are The Component, if, and then. (There are more, of course, but these suffice for this example.)
My goal is to transform an NL requirement into the corresponding DSL requirement using ML/NN approaches. Honestly, I am not sure how to achieve this. My first instinct was to extract templates automatically, so that each template already contains the keywords and only the blanks have to be filled in. Something like this:
The Component: ____
if: ____
then: ____
The problem with this approach is that not every requirement fits the same template. Some requirements don't need an if-then statement and use other keywords instead, so the complexity varies from requirement to requirement.
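To make the template idea concrete, here is a minimal sketch of how filled slots could be rendered when the set of keywords varies per requirement. The names `KEYWORD_ORDER` and `render_dsl` are hypothetical, and only the three keywords from the example are included. How the slot values get extracted from the NL text is the open part of the question and is not covered here.

```python
# Hypothetical keyword order for the DSL. Only the three keywords from the
# example are listed; the real DSL has more.
KEYWORD_ORDER = ["The Component", "if", "then"]


def render_dsl(slots):
    """Render a DSL requirement from a {keyword: text} mapping.

    Keywords that are missing from `slots` are skipped, so a requirement
    without an if/then part still renders.
    """
    unknown = set(slots) - set(KEYWORD_ORDER)
    if unknown:
        raise ValueError(f"unknown keywords: {sorted(unknown)}")
    return "\n".join(
        f"{key}: {slots[key]}" for key in KEYWORD_ORDER if key in slots
    )


# The example requirement from above, with the blanks already filled in.
example = render_dsl({
    "The Component": "Gear shifter",
    "if": "put into Drive",
    "then": "Car should start moving forward",
})
print(example)
```

With this shape, a requirement that has no if-then part would simply pass fewer keys, so a single fixed template is not needed.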
Is there an easier way to achieve what I want? I have also been looking at CodeBERT, which allows some kind of program synthesis from an NL query. I am not sure whether it is possible to fine-tune CodeBERT so that it outputs my own DSL.
Ideally, I would take a pre-trained model and fine-tune it on a training set that pairs every NL requirement with its corresponding DSL requirement, so that the model learns the transformation. But I worry that the results would be very bad on a test set of new requirements that the model has not seen before.
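One way I could at least measure that worry is to hold out part of the pairs and score the model's output on them. Below is a rough sketch under my own assumptions: `split_pairs` and `exact_match` are hypothetical helper names, the pairs are placeholders rather than real data, and exact string match is only one possible (and strict) metric. The fine-tuning step itself is left out.

```python
import random


def split_pairs(pairs, test_fraction=0.2, seed=0):
    """Shuffle (NL, DSL) pairs and split them into train and held-out test."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_test = max(1, int(len(pairs) * test_fraction))
    return pairs[n_test:], pairs[:n_test]


def exact_match(predictions, references):
    """Fraction of predictions identical to the reference DSL text."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Placeholder pairs only; the real corpus would go here.
pairs = [(f"NL requirement {i}", f"DSL requirement {i}") for i in range(10)]
train, test = split_pairs(pairs)

# After fine-tuning on `train`, the model's outputs for the NL side of `test`
# would be compared against the DSL side of `test` with exact_match(...).
print(len(train), len(test))
```

A low score on the held-out part compared to the training part would confirm that the model memorizes rather than generalizes.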
Topic text-generation neural-network nlp
Category Data Science