Understanding SVM Kernels
In Andrew Ng's machine learning course, he explains SVM kernels by manually selecting 3 landmarks and defining 3 Gaussian functions based on them. He then says that we are actually defining 3 new features, $f_1$, $f_2$ and $f_3$.
By applying these 3 Gaussian functions to each input point: $$x=(x_1,x_2)\to \hat{x}=(f_1(x), f_2(x), f_3(x))$$
it seems that we are mapping our data from $\mathbb{R}^2$ to $\mathbb{R}^3$. Our goal is then to find a hyperplane in this 3-dimensional space in which the transformed data is linearly separable. Is my understanding correct? If not, how should these 3 new features be interpreted?
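To make my picture concrete, here is a small sketch of the mapping I have in mind (the landmark coordinates and $\sigma$ below are made-up values for illustration, not the ones from the lecture):

```python
import numpy as np

# Hypothetical landmarks chosen by hand, as in the lecture's example
landmarks = np.array([[3.0, 2.0],
                      [0.0, 4.0],
                      [1.0, 1.0]])
sigma = 1.0

def gaussian_features(x, landmarks, sigma):
    """Map a 2-D point x to (f1(x), f2(x), f3(x)),
    where f_i(x) = exp(-||x - l_i||^2 / (2 * sigma^2))."""
    sq_dists = np.sum((landmarks - x) ** 2, axis=1)
    return np.exp(-sq_dists / (2 * sigma ** 2))

x = np.array([2.0, 3.0])
x_hat = gaussian_features(x, landmarks, sigma)  # a point in R^3
print(x_hat)  # three similarity values in (0, 1]
```

As I understand it, applying this to every training point gives a 3-column dataset, and the SVM would then look for a separating hyperplane in that space.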
In some blog posts, I have read that by using a Gaussian kernel we are mapping our data to an infinite-dimensional space (in which the Gaussian kernel computes the dot product of the transformed inputs), which contradicts my understanding above.
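My current (possibly wrong) reading of that claim is that the Gaussian kernel equals a dot product of feature vectors with infinitely many coordinates. In the 1-D case with $\sigma = 1$ I can at least check this numerically with a truncated expansion of $\exp(xz)$:

```python
import numpy as np
from math import factorial

def truncated_rbf_features(x, n_terms=20):
    """Truncated explicit feature map for the 1-D Gaussian kernel with sigma = 1:
    phi(x) = exp(-x^2/2) * (1, x/sqrt(1!), x^2/sqrt(2!), ...).
    The exact map has infinitely many such coordinates."""
    coeffs = np.array([x ** n / np.sqrt(factorial(n)) for n in range(n_terms)])
    return np.exp(-x ** 2 / 2) * coeffs

x, z = 0.7, -1.3
exact = np.exp(-(x - z) ** 2 / 2)                            # Gaussian kernel value
approx = truncated_rbf_features(x) @ truncated_rbf_features(z)  # finite-dim dot product
print(exact, approx)  # the two values agree to many decimal places
```

So the kernel value does look like a dot product in a space with arbitrarily many dimensions, which seems different from the 3-landmark picture above, and I don't see how to reconcile the two views.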
Topic kernel self-study svm machine-learning
Category Data Science