Kernel selections in SVM

I want to understand the kernel selection rationale in SVM.

My basic understanding is: if the data is linearly separable, use a linear kernel; if it is non-linear, use one of the other kernels.

But the question is: how can I tell whether the given data is linearly separable or not, especially when it has many features?

I know that I can try different kernels with cross-validation and select whichever performs best, but I'm looking for some early indication before going that route.
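For reference, the cross-validation approach the question mentions is straightforward to set up. A minimal sketch with scikit-learn, using a synthetic dataset as a stand-in for the real data:

```python
# Sketch: kernel selection by cross-validation (dataset is illustrative).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Try each candidate kernel; GridSearchCV keeps the one with the best
# mean cross-validated accuracy.
search = GridSearchCV(SVC(), {"kernel": ["linear", "rbf", "poly"]}, cv=5)
search.fit(X, y)
print(search.best_params_["kernel"])
```

In practice you would also tune `C` (and `gamma` for RBF) in the same grid, since a kernel's ranking can change with its hyperparameters.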

Topic linearly-separable svm predictive-modeling machine-learning

Category Data Science


This paper by Chih-Wei Hsu et al. is a good starting point for kernel selection. On page 3, the authors suggest using the RBF kernel and fine-tuning its parameters.

Andrew Ng also gives a very high-level rule of thumb in his video on SVMs: if the number of observations is larger than the number of features, use the Gaussian (RBF) kernel; otherwise, use the linear kernel.
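That rule of thumb is simple enough to write down directly. A hedged sketch (the threshold is the one stated above, not a universal law):

```python
def suggest_kernel(n_samples: int, n_features: int) -> str:
    """Rule-of-thumb kernel choice (illustrative, per Andrew Ng's heuristic):
    use a linear kernel when features outnumber observations, RBF otherwise."""
    return "linear" if n_features > n_samples else "rbf"

# Example: a text dataset with 500 features but only 100 documents.
print(suggest_kernel(n_samples=100, n_features=500))  # -> "linear"
```

The intuition: with more features than samples, the data is likely already separable in the original space, and a richer kernel would mostly add overfitting risk.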


Start with a linear kernel and see whether your data is linearly separable or not. Trying that is simpler than looking for early indications.
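One quick way to run that check: fit a linear SVM and look at its training accuracy. If the model can fit the training set (near-)perfectly, the data is (near-)linearly separable. A sketch on a synthetic, well-separated dataset:

```python
# Sketch: probe linear separability with a linear SVM (data is synthetic).
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Two tight, well-separated clusters -> should be linearly separable.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=0.5, random_state=0)

clf = LinearSVC(max_iter=10000).fit(X, y)
train_acc = clf.score(X, y)
print(train_acc)  # near 1.0 suggests the classes are linearly separable
```

Note this probes the *training* fit deliberately: the question here is separability of the data at hand, not generalization.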

Linear kernels are suggested when the number of features is larger than the number of observations in the dataset; otherwise, RBF is usually the better choice.

However, once you conclude that your data is non-linear, you can try mapping it into a higher-dimensional space and visualizing whether it becomes linearly separable there. That is essentially what a kernel does, so the visualization will give you insight into which type of kernel to use.
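The classic demonstration of this idea uses concentric circles: not linearly separable in 2D, but adding the feature z = x₁² + x₂² (which an RBF or polynomial kernel computes implicitly) makes them separable by a plane in 3D. A sketch, assuming synthetic circle data:

```python
# Sketch: lifting non-linear data into a higher dimension (synthetic data).
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import LinearSVC

# Concentric circles: the two classes cannot be split by a line in 2D.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Add z = x1^2 + x2^2 as a third feature -> separable by a plane in 3D.
Z = np.c_[X, (X ** 2).sum(axis=1)]

flat = LinearSVC(max_iter=10000).fit(X, y).score(X, y)    # poor in 2D
lifted = LinearSVC(max_iter=10000).fit(Z, y).score(Z, y)  # near-perfect in 3D
print(flat, lifted)
```

Plotting `Z` in 3D (e.g. with matplotlib's `Axes3D`) makes the separation visually obvious, which is exactly the kind of insight the answer above describes.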


