Understanding SVM Kernels

In Andrew Ng's machine learning course, he explains SVM kernels by manually selecting 3 landmarks and defining 3 Gaussian functions based on them. He then says that we are actually defining 3 new features, $f_1$, $f_2$ and $f_3$.


By applying these 3 Gaussian functions to every input point: $$x=(x_1,x_2)\to \hat{x}=(f_1(x), f_2(x), f_3(x))$$

it seems that we are mapping our data from $\mathbb R^2$ to $\mathbb R^3$. Our goal is then to find a hyperplane in this 3-dimensional space in which the transformed data is linearly separable. Is my understanding correct? If not, how should these 3 new features be interpreted?
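For concreteness, here is a small sketch of the mapping I have in mind (the landmark coordinates and $\sigma$ are made-up values, just for illustration):

```python
import numpy as np

# Three manually chosen landmarks in R^2 (made-up coordinates for illustration)
landmarks = np.array([[3.0, 5.0],
                      [1.0, 2.0],
                      [5.0, 1.0]])
sigma = 1.0  # bandwidth of the Gaussian similarity

def gaussian_features(x, landmarks, sigma):
    """Map a point x in R^2 to (f1, f2, f3): one Gaussian similarity per landmark."""
    sq_dists = np.sum((landmarks - x) ** 2, axis=1)  # squared distance to each landmark
    return np.exp(-sq_dists / (2 * sigma ** 2))

x = np.array([2.0, 4.0])
print(gaussian_features(x, landmarks, sigma))  # a 3-vector: the point mapped into R^3
```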


In some blog posts, I have read that by using a Gaussian kernel we are mapping our data to an infinite-dimensional space (where the Gaussian kernel computes the dot product of the transformed input data), which seems to contradict my understanding above.
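As far as I can tell, the argument in those posts goes roughly like this: expanding the exponential writes the Gaussian kernel as an infinite sum of polynomial kernels,

$$k(x,z)=\exp\!\left(-\frac{\|x-z\|^2}{2\sigma^2}\right)=\exp\!\left(-\frac{\|x\|^2}{2\sigma^2}\right)\exp\!\left(-\frac{\|z\|^2}{2\sigma^2}\right)\sum_{n=0}^{\infty}\frac{1}{n!}\left(\frac{x\cdot z}{\sigma^2}\right)^{n},$$

so the corresponding feature map has infinitely many components.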

Tags: kernel, self-study, svm, machine-learning



Yep, this is the correct interpretation. Kernels turn a difficult classification problem into a much simpler one: by transforming the data into a higher-dimensional space, they make it linearly separable. I think this image does a good job of illustrating that. [image: a 2D dataset that is not linearly separable becomes separable after being mapped to a higher dimension]
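As a rough sketch of the same idea (a toy scikit-learn example on a concentric-circles dataset, which may not match the figure above): a linear kernel cannot separate the classes, while the Gaussian (RBF) kernel, which implicitly works in that higher-dimensional feature space, separates them easily.

```python
# Assumes scikit-learn is installed; the dataset and gamma value are illustrative choices.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf", gamma=1.0).fit(X, y)

print("linear kernel accuracy:", linear_svm.score(X, y))  # near chance level
print("RBF kernel accuracy:   ", rbf_svm.score(X, y))     # near 1.0
```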
