This is the loss function that you aim to minimize by tuning the parameters theta given the data (x, y). The loss is the negative conditional log-likelihood of the output sequence y given the input sequence x, i.e. L(theta) = -log P(y | x; theta). What you want to find is a distribution P(y | x), parametrized by theta, that assigns high probability to the correct output sequence y for a given input sequence x. Minimizing the loss means shaping the distribution based on the examples in your training data, so that for every input sequence x in the training data the most probable predicted output y_predict agrees as closely as possible with the actually observed output y. You do this in the hope that the model generalizes well to unseen data: when you feed in a new sequence x that the model hasn't seen before, it should give you an accurate estimate of the output sequence y most likely to be associated with x.
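To make the idea concrete, here is a minimal sketch of computing this loss for one training pair. The function name `sequence_nll` and the toy per-step probabilities are my own illustration, not from the original; it assumes the model factorizes P(y | x) into one distribution per output step, as sequence models typically do.

```python
import math

def sequence_nll(step_probs, target):
    """Negative conditional log-likelihood of an observed output sequence.

    step_probs: list of dicts, one per output step, each mapping candidate
                tokens to the model's probability P(y_t | x, y_<t).
    target:     the actually observed output sequence y.
    """
    # Sum -log P(y_t | ...) over the steps; minimizing this total pushes
    # probability mass toward the observed sequence.
    return -sum(math.log(probs[y_t]) for probs, y_t in zip(step_probs, target))

# Toy example: a two-step output over a three-token vocabulary.
step_probs = [
    {"a": 0.7, "b": 0.2, "c": 0.1},
    {"a": 0.1, "b": 0.8, "c": 0.1},
]
loss = sequence_nll(step_probs, ["a", "b"])  # -(log 0.7 + log 0.8)
```

Note that the loss shrinks as the model assigns more probability to the observed tokens, which is exactly what "agreeing best with the observed output" means here.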
