Best way to suggest answers given historical question-answer pairs
Many question-answering implementations focus on extracting information from large documents/corpora of text such as Wikipedia.
I have access to the full chat log from the customer service department of a large electronics company, and I'm wondering whether I can use the large number of historical question-answer pairs to provide useful answer suggestions to the customer care agents. These suggestions could come from matching or classification over the historical answers, or from generative methods.
The question is basically the same as this one: Question and Answer Chatbot for Customer Support. Given how much has changed and evolved in NLP-land since then, I figured I'd ask it again.
What I've experimented with so far is using state-of-the-art language models like BERT to embed incoming questions, then using cosine similarity to find the most similar historical question and returning the answer that was given to it. This works reasonably well for generic questions that are answered with generic answers, but performs poorly on more specific questions. Unfortunately, generic questions make up only a small portion of the dataset.
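For reference, here is a minimal sketch of that embed-and-retrieve baseline, assuming the sentence-transformers library; the model name and the example question-answer pairs are placeholders, not my actual setup:

```python
# Sketch of the embed-and-retrieve baseline: embed historical questions once,
# then return the answer of the most similar historical question.
import numpy as np
from sentence_transformers import SentenceTransformer

# Placeholder model choice, not necessarily what I used.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Toy historical question-answer pairs.
historical_questions = [
    "How do I reset my router?",
    "What is the warranty period for the TV?",
]
historical_answers = [
    "Hold the reset button for 10 seconds until the lights blink.",
    "The standard warranty is two years from the date of purchase.",
]

# Embed all historical questions up front.
question_embeddings = model.encode(historical_questions, normalize_embeddings=True)

def suggest_answer(incoming_question: str) -> str:
    """Return the answer paired with the most similar historical question."""
    query_embedding = model.encode([incoming_question], normalize_embeddings=True)
    # With normalized embeddings, the dot product equals cosine similarity.
    similarities = question_embeddings @ query_embedding.T
    best_match = int(np.argmax(similarities))
    return historical_answers[best_match]

print(suggest_answer("How can I reboot my router?"))
```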
I'm looking for suggestions to implement an approach that takes full advantage of the available historical question-answer pairs.
Tags: question-answering, text-generation, nlp
Category: Data Science