Why is the margin-based ranking loss reversed in these two papers?
For knowledge graph completion, it is very common to use a margin-based ranking loss.
In the first paper, the margin-based ranking loss is defined as
$$ \min \sum_{(h,l,t)\in S} \sum_{(h',l,t')\in S'}[\gamma + d(h,l,t) - d(h',l,t')]_+$$
Here $d(\cdot)$ is the model's scoring (distance) function, $(h,l,t)$ denotes a positive training triple, and $(h',l,t')$ denotes a negative (corrupted) triple corresponding to $(h,l,t)$.
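For concreteness, here is a minimal NumPy sketch of how I read this first definition (the names `margin_loss_v1`, `d_pos`, `d_neg` and the toy distances are my own, not from the paper; I'm assuming $d$ is a distance, so smaller is better for positives):

```python
import numpy as np

def margin_loss_v1(d_pos, d_neg, gamma=1.0):
    """Sum of [gamma + d(h,l,t) - d(h',l,t')]_+ over positive/negative pairs."""
    return np.sum(np.maximum(0.0, gamma + d_pos - d_neg))

# toy distances for two positive/negative pairs
d_pos = np.array([0.5, 1.2])   # d(h,l,t) for positive triples
d_neg = np.array([2.0, 1.0])   # d(h',l,t') for the corresponding corrupted triples
print(margin_loss_v1(d_pos, d_neg))  # only pairs with d_pos + gamma > d_neg contribute
```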
However, Andrew's paper defines it as
$$ \min \sum_{(h,l,t)\in S} \sum_{(h',l,t')\in S'}[\gamma + d(h',l,t') - d(h,l,t)]_+$$
It seems that they switch the terms $d(h',l,t')$ and $d(h,l,t)$.
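To make the comparison concrete, here is the same sketch with the two terms switched, as in Andrew's paper (again, the variable names and toy values are mine):

```python
import numpy as np

def margin_loss_v2(d_pos, d_neg, gamma=1.0):
    """Sum of [gamma + d(h',l,t') - d(h,l,t)]_+ : terms swapped relative to the first definition."""
    return np.sum(np.maximum(0.0, gamma + d_neg - d_pos))

# same toy distances as above
d_pos = np.array([0.5, 1.2])
d_neg = np.array([2.0, 1.0])
print(margin_loss_v2(d_pos, d_neg))  # gives a different value than the first definition on the same inputs
```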
My question is: does it matter whether $d(h',l,t')$ and $d(h,l,t)$ are switched? It seems like a really strange definition. Thanks.
Topic hinge-loss loss-function deep-learning
Category Data Science