Why does liblinear (LinearSVC) perform drastically better than libsvm's linear kernel?

I have a dataset of dim=(200, 2000): 200 examples and 2000 features, with 10 classes.

I used sklearn for both cases:

svm.SVC(kernel='linear')
LinearSVC()

However, LinearSVC() performs drastically better than SVC with a linear kernel: 60% accuracy against 23%. I expected the same or comparable results, since both are fed the same parameters and data.
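A minimal sketch of the comparison on synthetic data (the `make_classification` parameters below are placeholders matching the stated shape, not the asker's actual dataset):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC, LinearSVC

# Synthetic stand-in: 200 examples, 2000 features, 10 classes
X, y = make_classification(n_samples=200, n_features=2000,
                           n_informative=50, n_classes=10,
                           random_state=0)

# libsvm-based SVC with a linear kernel (one-vs-one multiclass)
svc_score = cross_val_score(SVC(kernel='linear'), X, y, cv=5).mean()

# liblinear-based LinearSVC (one-vs-rest multiclass, squared hinge loss)
lin_score = cross_val_score(LinearSVC(max_iter=5000), X, y, cv=5).mean()

print(f"SVC(kernel='linear'): {svc_score:.3f}")
print(f"LinearSVC():          {lin_score:.3f}")
```

Note that the two estimators are not identical even with matching `C`: they differ in multiclass strategy, loss function, and intercept handling, so some gap is expected.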

What's wrong?

Thank you.

Topic scikit-learn svm libsvm machine-learning

Category Data Science


This is just a guess, but in my view LinearSVC might perform better than SVC with a linear kernel because of regularization.

Because LinearSVC is based on liblinear rather than libsvm, it has more flexibility: it lets you choose the penalty and loss applied to your SVM (the default is an L2/ridge penalty with squared hinge loss).

Because you have more features than observations, there exist multiple solutions to your classification problem, some more "robust" than others. L2 regularization reduces the amplitude of your model coefficients and may lead to a more stable solution.
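To see the amplitude-shrinking effect, here is a sketch on the same kind of synthetic data: in LinearSVC, a smaller `C` means a stronger L2 penalty, so the norm of the coefficient matrix shrinks (the data-generation parameters are again placeholders, not the asker's dataset):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=2000,
                           n_informative=50, n_classes=10,
                           random_state=0)

# Smaller C -> the L2 penalty dominates the objective -> smaller ||w||
for C in (10.0, 1.0, 0.01):
    clf = LinearSVC(C=C, max_iter=5000).fit(X, y)
    print(f"C={C:>5}: ||coef|| = {np.linalg.norm(clf.coef_):.3f}")
```

The printed norms decrease as `C` decreases, which is the "reduced amplitude" mentioned above.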

More on LinearSVC in the documentation:

http://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html

More on the benefits of ridge regression when working with many features (and thus multicollinearity):

https://stats.stackexchange.com/questions/118712/why-does-ridge-estimate-become-better-than-ols-by-adding-a-constant-to-the-diago
