Should I rescale losses before combining them for multitask learning?
I have a multitask network that takes a single input and performs two tasks, with several shared layers followed by separate task-specific layers.
One task is multiclass classification using the cross-entropy (CE) loss; the other is sequence recognition using the CTC loss.
I want to use a combination of the two losses as the criterion, something like Loss = λ·CE + (1−λ)·CTC. The problem is that my CE loss starts around 2, while the CTC loss starts in the 400s.
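For concreteness, here is roughly how I combine them now. This is just a minimal PyTorch sketch: the random tensors stand in for the outputs of my two heads, and lam is a placeholder for λ.

```python
import torch
import torch.nn as nn

ce_criterion = nn.CrossEntropyLoss()
ctc_criterion = nn.CTCLoss(blank=0)

lam = 0.5  # placeholder for λ

# Dummy shapes standing in for my network's two heads:
# classification head: (batch, num_classes); CTC head: (time, batch, num_classes)
batch, num_classes, time_steps, target_len = 8, 20, 50, 10
class_logits = torch.randn(batch, num_classes, requires_grad=True)
class_targets = torch.randint(0, num_classes, (batch,))

seq_log_probs = torch.randn(time_steps, batch, num_classes, requires_grad=True).log_softmax(2)
seq_targets = torch.randint(1, num_classes, (batch, target_len))  # labels exclude the blank index
input_lengths = torch.full((batch,), time_steps, dtype=torch.long)
target_lengths = torch.full((batch,), target_len, dtype=torch.long)

loss_ce = ce_criterion(class_logits, class_targets)    # starts around 2 in my runs
loss_ctc = ctc_criterion(seq_log_probs, seq_targets, input_lengths, target_lengths)  # starts in the 400s

# Weighted combination: the CTC term dominates because of its raw magnitude.
loss = lam * loss_ce + (1 - lam) * loss_ctc
loss.backward()
```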
Should I rescale the losses with a Max(L₁)/L₁ factor, where Max(L₁) is the larger of the two sub-losses at epoch 1 and L₁ is the value of each sub-loss at epoch 1? In other words, the losses would be scaled so that they have the same magnitude at the first epoch, and those same factors would then be applied at every subsequent epoch.
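This is the rescaling scheme I have in mind, reusing lam, loss_ce, and loss_ctc from the sketch above; the epoch-1 values here are just the rough magnitudes I observe.

```python
# Measured once after epoch 1 (rough magnitudes from my runs), then frozen:
epoch1_ce, epoch1_ctc = 2.0, 400.0

max_l1 = max(epoch1_ce, epoch1_ctc)
scale_ce = max_l1 / epoch1_ce    # ≈ 200
scale_ctc = max_l1 / epoch1_ctc  # = 1

# Applied at every subsequent epoch, so both terms start at the same magnitude:
loss = lam * (scale_ce * loss_ce) + (1 - lam) * (scale_ctc * loss_ctc)
```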
Is there a better approach? How do I ensure that the two losses influence backpropagation according to λ, rather than according to their raw magnitudes?