ignoring instances or masking by zero in a multitask learning model
In multitask learning models, the usual approach I've seen is to mask outputs that have no label with zeros. As an example, have a look here: How to Multi-task learning with missing labels in Keras
I have another idea: instead of masking the missing output with zeros, why don't we exclude it from the loss function entirely? The CrossEntropyLoss implementation in PyTorch allows specifying a label value to be ignored via its ignore_index argument: CrossEntropyLoss.
Would this approach be OK?
Topic cross-entropy pytorch loss-function multitask-learning
Category Data Science