How is hinge loss related to the primal and dual forms of SVM?

I'm learning SVMs, and many classic tutorials formulate the SVM as a convex optimization problem: an objective function with slack variables, subject to constraints. Most tutorials then go from this primal formulation to the classic dual form using Lagrange multipliers. After some study, I can follow these steps and they make sense.

But then an important concept for SVM is the hinge loss. If I'm not mistaken, the hinge loss formula is completely separate from all the steps I described above. I can't find where the hinge loss comes into play when going through the tutorials that derive the SVM problem formulation.

Right now, I only know the SVM as a classic convex optimization (quadratic programming) problem, with an objective function and slack variables subject to constraints. How is that related to hinge loss?



Hinge loss for sample point $i$: $$l( y_i, z_i) = \max(0, 1-y_iz_i)$$

Let $z_i=w^Tx_i+b$.
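As a small illustration of the definition above, here is a minimal Python sketch of the hinge loss for a single sample. The function name `hinge_loss` and the example label/score pairs are made up for illustration; labels are assumed to be in $\{-1, +1\}$.

```python
def hinge_loss(y, z):
    """Hinge loss max(0, 1 - y*z) for a label y in {-1, +1} and a real score z."""
    return max(0.0, 1.0 - y * z)


# Hypothetical (label, score) pairs, chosen only to show the different cases.
examples = [
    (+1, 2.0),   # correct side, margin >= 1: no loss
    (+1, 0.5),   # correct side, but inside the margin: small loss
    (-1, -3.0),  # correct side, margin >= 1: no loss
    (-1, 0.25),  # wrong side of the boundary: loss larger than 1
]

losses = [hinge_loss(y, z) for y, z in examples]
print(losses)  # values are 0, 0.5, 0 and 1.25
```

The loss is zero exactly when $y_i z_i \ge 1$, that is, when the point is classified correctly with a margin of at least 1.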

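We want to solve the regularized problem

$$\min_{w,b} \frac1n \sum_{i=1}^nl(y_i, w^Tx_i+b)+\|w\|^2$$

which, after substituting the definition of the hinge loss, is

$$\min_{w,b} \frac1n \sum_{i=1}^n\max(0,1-y_i (w^Tx_i+b))+\|w\|^2$$

Introducing one slack variable $\zeta_i$ per sample, this can be written as

$$\min_{w,b,\zeta} \frac1n \sum_{i=1}^n \zeta_i + \|w\|^2$$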

subject to $$\zeta_i \ge 0$$

$$\zeta_i \ge 1-y_i (w^Tx_i+b)$$

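The constraints come from the hinge loss. The two problems are equivalent because the objective only gets smaller as each $\zeta_i$ shrinks, so at the optimum every $\zeta_i$ sits at the smallest value the constraints allow, which is $\max(0, 1-y_i (w^Tx_i+b))$, i.e. the hinge loss of sample $i$. This is the standard way to rewrite the minimization of a max term as a constrained problem.

The result is exactly the soft-margin primal from the tutorials. Multiplying the objective by $\frac12$ does not change the minimizer and gives $\frac12\|w\|^2 + C\sum_{i=1}^n \zeta_i$ with $C = \frac{1}{2n}$; other values of $C$ correspond to weighting the $\|w\|^2$ term differently. The dual form is then derived from this primal with Lagrange multipliers as usual.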
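To make the link between the hinge-loss objective and the slack-variable objective concrete, the following sketch evaluates both at a fixed $(w, b)$ on a tiny data set. All names (`hinge_objective`, `smallest_feasible_slacks`, `slack_objective`, `is_feasible`) and the toy data are assumptions made for illustration; this is a numerical sanity check, not a solver.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def hinge_objective(w, b, X, y):
    """(1/n) * sum_i max(0, 1 - y_i (w^T x_i + b)) + ||w||^2."""
    n = len(X)
    data_term = sum(max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)) / n
    return data_term + dot(w, w)


def is_feasible(w, b, X, y, zeta):
    """Check zeta_i >= 0 and zeta_i >= 1 - y_i (w^T x_i + b) for every sample."""
    return all(
        z >= 0.0 and z >= 1.0 - yi * (dot(w, xi) + b)
        for xi, yi, z in zip(X, y, zeta)
    )


def smallest_feasible_slacks(w, b, X, y):
    """Smallest zeta_i satisfying both constraints: the larger of the two lower bounds."""
    return [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]


def slack_objective(w, zeta):
    """(1/n) * sum_i zeta_i + ||w||^2."""
    return sum(zeta) / len(zeta) + dot(w, w)


# Toy data (hypothetical): four 2-D points with labels in {-1, +1}.
X = [[2.0, 0.0], [0.5, 0.0], [-1.0, 1.0], [0.0, -0.5]]
y = [1, 1, -1, -1]
w, b = [1.0, 0.0], 0.0

zeta = smallest_feasible_slacks(w, b, X, y)
loose = [z + 0.3 for z in zeta]  # still feasible, but not tight

print(hinge_objective(w, b, X, y))  # hinge-loss form
print(slack_objective(w, zeta))     # slack form with tight slacks: same value
print(slack_objective(w, loose))    # slack form with loose slacks: strictly larger
```

For any fixed $(w, b)$, the tight slacks reproduce the hinge-loss objective, and any other feasible slacks can only increase it, which is why minimizing over $\zeta$ as well recovers the original problem.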
