Hinge loss question
Hinge loss is usually defined as $$L(y,\hat{y}) = \max(0,\, 1-y\hat{y})$$ where $y \in \{-1, +1\}$ is the true label and $\hat{y}$ is the raw classifier score.
What I don't understand is why we compare zero with $1-y\hat{y}$ rather than with an expression built on some other constant. Why not use $2-y\hat{y}$ or $\sqrt{2}-y\hat{y}$, or just look at $y\hat{y}$ itself to check whether the observation is on the right side of the hyperplane? Is there any reason for choosing 1 as the constant?
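To make the question concrete, here is a minimal Python sketch of the variants I have in mind. The function name `hinge` and its `margin` parameter are my own illustration, not taken from any library; `margin=1` is the usual definition, and `margin=0` corresponds to only checking the sign of $y\hat{y}$.

```python
import math


def hinge(y, y_hat, margin=1.0):
    """Hinge-style loss max(0, margin - y * y_hat).

    `margin` is a hypothetical parameter standing in for the constant
    asked about; margin=1.0 reproduces the usual hinge loss.
    """
    return max(0.0, margin - y * y_hat)


if __name__ == "__main__":
    y = 1  # true label, assumed to be in {-1, +1}
    scores = [-1.0, 0.0, 0.5, 1.0, 2.0, 3.0]  # example raw scores y_hat

    # Compare the loss on the same scores for several choices of constant.
    for c in (0.0, 1.0, math.sqrt(2), 2.0):
        losses = [round(hinge(y, s, margin=c), 3) for s in scores]
        print(f"margin={c:.3f}: {losses}")
```

One thing the sketch makes visible is that, for any constant $c > 0$, `hinge(y, s, margin=c)` equals `c * hinge(y, s / c)`, so changing the constant amounts to rescaling the score and the loss by the same factor.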
Thanks
Topic hinge-loss
Category Data Science