In a Keras seq2seq model, what is the difference between `model.predict()` and the inference model?

I am looking into seq2seq models in Keras, for example this blog post from Keras or this one. All the examples I have seen have some inference model that depicts the original model. That inference model is then used to make the predictions.

My question is: why can't we just call model.predict()? I mean, we can, because I have used it and it works, but what is the difference between these two approaches? Is it wrong to use model.predict() and apply the reverse word tokenizer to the argmax?
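For concreteness, here is a minimal sketch of the approach I mean, assuming a trained character-level model like the one in the Keras blog post; the variable names (model, encoder_input_data, decoder_input_data, target_token_index) follow that post's conventions and are assumptions here:

```python
import numpy as np

# One forward pass with teacher forcing: the decoder input is the ground-truth
# target sequence shifted by one step, not the model's own predictions.
probs = model.predict([encoder_input_data[:1], decoder_input_data[:1]])

# Reverse the token index and take the argmax at each timestep.
reverse_target_index = {i: ch for ch, i in target_token_index.items()}
decoded = "".join(reverse_target_index[i] for i in probs[0].argmax(axis=-1))
print(decoded)
```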

Topic sequence-to-sequence inference keras tensorflow

Category Data Science


I understand that by "All the examples I have seen have some inference model that depicts the original model" you mean that there is a function that performs more complex operations with the model instead of just invoking model.predict(). Such a function is called decode_sequence in the linked examples.

Note that you can't just invoke model.predict() once, because you don't have the decoder's inputs in advance: at inference time they are the tokens the model has already generated.

The thing with this type of seq2seq model is that it is autoregressive. This means that it predicts the next token based on its own previous predictions. Therefore, you need to predict one token at a time: first you predict the first token, then you invoke the model again with that prediction to get the next token, and so on. This is precisely what the decode_sequence function does: it repeatedly invokes model.predict() to get the next token, until the stop condition is met, that is, either predicting the \n token or having predicted the maximum number of tokens. A sketch of this loop is below.
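A rough sketch of such a decoding loop, assuming the encoder_model/decoder_model split from the Keras blog post; names like num_decoder_tokens, target_token_index, reverse_target_index and max_decoder_seq_length are taken from (or modelled on) that post and are assumptions here:

```python
import numpy as np

def decode_sequence_sketch(input_seq):
    # Encode the input once to get the initial decoder states.
    states = encoder_model.predict(input_seq)

    # Start with the sequence-start token ('\t' in the blog post).
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, target_token_index['\t']] = 1.0

    decoded = ""
    while True:
        # One predict() call per generated token.
        output_tokens, h, c = decoder_model.predict([target_seq] + states)

        # Greedy choice: take the most probable next token.
        token_index = int(np.argmax(output_tokens[0, -1, :]))
        char = reverse_target_index[token_index]

        # Stop on the end-of-sequence token or when the output is long enough.
        if char == '\n' or len(decoded) >= max_decoder_seq_length:
            break
        decoded += char

        # Feed the prediction back in as the next decoder input.
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, token_index] = 1.0
        states = [h, c]

    return decoded
```

Calling model.predict() once, as in the question, only "works" because the decoder is fed the ground-truth target sequence (teacher forcing); with an unseen input you have no such sequence, which is why the loop above is needed.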
