Chatbot encoder/decoder: why do we need to use the chatbot answer as the decoder input?

I am looking into the chatbot tutorial at:

https://medium.com/predict/creating-a-chatbot-from-scratch-using-keras-and-tensorflow-59e8fc76be79

It uses a sequence-to-sequence model with an encoder/decoder to solve the problem:

import tensorflow as tf

# VOCAB_SIZE is computed earlier in the tutorial from the tokenized corpus.

# Encoder: embeds the question and keeps only the final LSTM states.
encoder_inputs = tf.keras.layers.Input(shape=(None,))
encoder_embedding = tf.keras.layers.Embedding(VOCAB_SIZE, 200, mask_zero=True)(encoder_inputs)
encoder_outputs, state_h, state_c = tf.keras.layers.LSTM(200, return_state=True)(encoder_embedding)
encoder_states = [state_h, state_c]

# Decoder: embeds the answer and starts from the encoder's final states.
decoder_inputs = tf.keras.layers.Input(shape=(None,))
decoder_embedding = tf.keras.layers.Embedding(VOCAB_SIZE, 200, mask_zero=True)(decoder_inputs)
decoder_lstm = tf.keras.layers.LSTM(200, return_state=True, return_sequences=True)
decoder_outputs, _, _ = decoder_lstm(decoder_embedding, initial_state=encoder_states)
decoder_dense = tf.keras.layers.Dense(VOCAB_SIZE, activation=tf.keras.activations.softmax)
output = decoder_dense(decoder_outputs)

# The trained model takes both the question and the answer as inputs.
model = tf.keras.models.Model([encoder_inputs, decoder_inputs], output)
model.compile(optimizer=tf.keras.optimizers.RMSprop(), loss='categorical_crossentropy')
model.summary()
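For context, here is my sketch of how I believe the training pairs are prepared. The toy questions/answers lists, the <start>/<end> markers, and the tokenizer settings below are my assumptions, not verbatim tutorial code:

# Hypothetical toy corpus; the tutorial loads real question/answer pairs.
questions = ['how are you']
answers = ['i am fine']

# filters='' keeps the <start>/<end> markers intact during tokenization.
# (In the tutorial, VOCAB_SIZE is computed like this before the model is built.)
tokenizer = tf.keras.preprocessing.text.Tokenizer(filters='')
tokenizer.fit_on_texts(questions + ['<start> ' + a + ' <end>' for a in answers])
VOCAB_SIZE = len(tokenizer.word_index) + 1

pad = tf.keras.preprocessing.sequence.pad_sequences

# The decoder input is the answer shifted right by one step:
#   decoder input : <start> i am fine
#   decoder target: i am fine <end>
encoder_input_data = pad(tokenizer.texts_to_sequences(questions), padding='post')
decoder_input_data = pad(tokenizer.texts_to_sequences(['<start> ' + a for a in answers]), padding='post')
decoder_target_data = pad(tokenizer.texts_to_sequences([a + ' <end>' for a in answers]), padding='post')

model.fit([encoder_input_data, decoder_input_data],
          tf.keras.utils.to_categorical(decoder_target_data, VOCAB_SIZE),
          batch_size=64, epochs=100)

So at training time the answer itself is passed in alongside the question.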

I understand that the "chatbot question" needs to be the input of the encoder, and that the "chatbot answer" needs to be the output of the decoder. However, I do not understand why the "chatbot answer" (decoder_inputs) also has to be an input to the decoder, and hence to the entire model:

model = tf.keras.models.Model([encoder_inputs, decoder_inputs], output)

Could anyone please share their thoughts? Also, are there any papers on this approach? What is the intuition behind feeding the "chatbot answer" into the decoder as input? Thanks!
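For contrast, here is my simplified reading of what happens at chat time, based on the standard Keras seq2seq inference recipe (I am not certain the tutorial does exactly this): the real answer is obviously not available then, so the encoder and decoder are rebuilt as separate models and the decoder runs one word at a time, consuming its own previous prediction.

import numpy as np

# Encoder model: question in, final LSTM states out.
encoder_model = tf.keras.models.Model(encoder_inputs, encoder_states)

# Decoder model: one token plus the previous states in, next-word
# probabilities plus the updated states out.
state_h_in = tf.keras.layers.Input(shape=(200,))
state_c_in = tf.keras.layers.Input(shape=(200,))
dec_outputs, h, c = decoder_lstm(decoder_embedding,
                                 initial_state=[state_h_in, state_c_in])
decoder_model = tf.keras.models.Model(
    [decoder_inputs, state_h_in, state_c_in],
    [decoder_dense(dec_outputs), h, c])

def reply(question_seq, max_len=20):
    # question_seq: a padded 2D array of token ids, shape (1, timesteps).
    h, c = encoder_model.predict(question_seq)
    token = np.array([[tokenizer.word_index['<start>']]])
    words = []
    for _ in range(max_len):
        probs, h, c = decoder_model.predict([token, h, c])
        next_id = int(np.argmax(probs[0, -1, :]))    # greedy decoding
        word = tokenizer.index_word.get(next_id, '')
        if word == '<end>':
            break
        words.append(word)
        token = np.array([[next_id]])                # feed the prediction back in
    return ' '.join(words)

If this reading is right, then during training the ground-truth answer is fed in where, during inference, the model's own predictions are fed in, and that difference is exactly what I would like to understand.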

Topic: chatbot, lstm, autoencoder

Category: Data Science
