Can Transformer Models be used for Training Chatbots?

Note - I am talking about the Transformer model Google introduced in the paper 'Attention Is All You Need'.

Topic transformer chatbot deep-learning nlp machine-learning

Category Data Science


The Transformer is a sequence-to-sequence model: it is meant to address problems where the input is a sequence of discrete tokens (i.e. text) and the output is also a sequence of discrete tokens.

Therefore, a Transformer is well suited to being trained on a dataset of dialogs where the input is a statement or question and the output is the answer. This is usually called a "chit-chat" chatbot, because it is not backed by a knowledge base; it can only make "small talk". A sketch of how such dialog pairs can be prepared is shown below.
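As a minimal sketch (the word-level vocabulary and special tokens below are illustrative assumptions, not a specific library's API), each conversational turn becomes an (input, target) pair of token IDs:

```python
# Minimal sketch: turning dialog pairs into token-ID sequences for a
# sequence-to-sequence Transformer. The vocabulary and special tokens
# here are illustrative assumptions, not taken from a specific library.
dialog_pairs = [
    ("hi , how are you ?", "i am fine , thanks !"),
    ("what is your name ?", "i am a chatbot ."),
]

# Build a word-level vocabulary with special tokens.
specials = ["<pad>", "<sos>", "<eos>"]
words = {w for q, a in dialog_pairs for w in (q + " " + a).split()}
vocab = {w: i for i, w in enumerate(specials + sorted(words))}

def encode(sentence):
    """Map a sentence to a list of token IDs, wrapped in <sos>/<eos>."""
    return [vocab["<sos>"]] + [vocab[w] for w in sentence.split()] + [vocab["<eos>"]]

# Each training example is (source tokens, target tokens).
dataset = [(encode(q), encode(a)) for q, a in dialog_pairs]
print(dataset[0])
```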


A Transformer is just a neural network. Sure, it is far more complex than a feed-forward one, but it is still a neural network. As long as you provide the right dataset (in the supervised case, correct input-target pairs), your model should be able to learn the hidden representations that allow it to answer new questions.
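To make that concrete, here is a minimal sketch of one supervised training step with PyTorch's torch.nn.Transformer, assuming batches of padded token IDs prepared as above (the vocabulary size, model dimensions, and padding ID are illustrative choices, not a prescription; positional encodings are omitted for brevity):

```python
import torch
import torch.nn as nn

# Minimal sketch of one teacher-forced training step for a chit-chat chatbot.
# src/tgt are batches of padded token IDs (shape: batch x seq_len).
# A real model would also add positional encodings, omitted here for brevity.
vocab_size, d_model, pad_id = 1000, 128, 0

embed = nn.Embedding(vocab_size, d_model, padding_idx=pad_id)
transformer = nn.Transformer(d_model=d_model, nhead=8,
                             num_encoder_layers=2, num_decoder_layers=2,
                             batch_first=True)
out_proj = nn.Linear(d_model, vocab_size)          # map back to the vocabulary
criterion = nn.CrossEntropyLoss(ignore_index=pad_id)
optimizer = torch.optim.Adam(
    list(embed.parameters()) + list(transformer.parameters()) + list(out_proj.parameters()),
    lr=1e-4)

def train_step(src, tgt):
    """One supervised step: predict tgt[t] from src and the previous target tokens."""
    tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]
    tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt_in.size(1))
    hidden = transformer(embed(src), embed(tgt_in), tgt_mask=tgt_mask)
    logits = out_proj(hidden)                      # batch x seq x vocab
    loss = criterion(logits.reshape(-1, vocab_size), tgt_out.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example call with random token IDs standing in for a real batch of dialog pairs.
src = torch.randint(1, vocab_size, (4, 10))
tgt = torch.randint(1, vocab_size, (4, 12))
print(train_step(src, tgt))
```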

I found this interesting tutorial; check it out.
