How to choose a kernel function and a feature mapping function?
After extensive reading, I understand the concepts behind support vector machines fairly well by now, but I have trouble applying the kernel function $K$ and the feature mapping function $\phi$ to a simple example such as the following.
My example data $x \in \mathbb{R}^2$: $(1,0), (4,0)$ are from one class, $(2,0), (3,0)$ are from another.
So here are my two questions:
Would $\phi((x_1,x_2))=(x_1,x_2,(x_1-2.5)^2)$ be a wise choice for the mapping function $\phi:\mathbb{R}^2 \to \mathbb{R}^3$? If not, what $\phi$ would be a wiser choice?
What would be the corresponding choice for the kernel function $K$?
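To make the example concrete, here is a minimal Python sketch of what I have in mind. It rests on two assumptions of mine that are not part of the data itself: the class labels are encoded as $+1$ and $-1$, and the candidate separating plane $z_3 = 1.25$ (that is, $w = (0,0,1)$, $b = -1.25$) is picked by hand rather than learned. It also assumes the kernel is the one induced by the mapping through the usual definition $K(x,y) = \langle \phi(x), \phi(y) \rangle$, which for this $\phi$ would expand to $x_1 y_1 + x_2 y_2 + (x_1-2.5)^2 (y_1-2.5)^2$.

```python
# Feature map proposed in the question: phi(x1, x2) = (x1, x2, (x1 - 2.5)^2).
def phi(x):
    x1, x2 = x
    return (x1, x2, (x1 - 2.5) ** 2)


# Kernel induced by phi, written without building the mapped vectors:
# K(x, y) = x1*y1 + x2*y2 + (x1 - 2.5)^2 * (y1 - 2.5)^2
def kernel(x, y):
    return x[0] * y[0] + x[1] * y[1] + ((x[0] - 2.5) ** 2) * ((y[0] - 2.5) ** 2)


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


# Example data from the question; the +1 / -1 labels are an assumed encoding.
points = [(1, 0), (4, 0), (2, 0), (3, 0)]
labels = [+1, +1, -1, -1]

mapped = [phi(p) for p in points]

# Hand-picked (not learned) plane in feature space: z3 = 1.25.
w, b = (0.0, 0.0, 1.0), -1.25

# The data are linearly separable in feature space if every point
# lies strictly on the side of the plane that matches its label.
separable = all(y * (dot(w, z) + b) > 0 for z, y in zip(mapped, labels))

# Check that the closed-form kernel agrees with the explicit inner product.
kernel_matches = all(
    abs(kernel(p, q) - dot(phi(p), phi(q))) < 1e-12
    for p in points
    for q in points
)

print("mapped points:", mapped)
print("separable by z3 = 1.25:", separable)
print("kernel equals <phi(x), phi(y)>:", kernel_matches)
```

If this reasoning is right, the third coordinate is the only one doing any work for this data set, so part of my question is whether keeping $x_1$ and $x_2$ in $\phi$ is useful at all.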
Topic kernel classification svm
Category Data Science