Why does liblinear (LinearSVC) perform drastically better than libsvm's linear kernel?

I have a dataset of dim=(200, 2000): 200 examples and 2000 features, with 10 classes.

I used sklearn for both cases:

svm.SVC(kernel='linear')
LinearSVC()

However, LinearSVC() performs drastically better than SVC with a linear kernel: 60% accuracy against 23%. I expected the same or comparable results, since both are fed the same parameters and data.
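A minimal sketch of the comparison on synthetic data (the `make_classification` parameters below are placeholders matching the stated shape, not the asker's actual dataset):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC, LinearSVC

# Synthetic stand-in: 200 examples, 2000 features, 10 classes
X, y = make_classification(n_samples=200, n_features=2000,
                           n_informative=50, n_classes=10,
                           random_state=0)

# libsvm-based SVC with a linear kernel (one-vs-one multiclass)
svc_score = cross_val_score(SVC(kernel='linear'), X, y, cv=5).mean()

# liblinear-based LinearSVC (one-vs-rest multiclass, squared hinge loss)
lin_score = cross_val_score(LinearSVC(max_iter=5000), X, y, cv=5).mean()

print(f"SVC(kernel='linear'): {svc_score:.3f}")
print(f"LinearSVC():          {lin_score:.3f}")
```

Note that the two estimators are not identical even with matching `C`: they differ in multiclass strategy, loss function, and intercept handling, so some gap is expected.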

What's wrong?

Thank you.

Topic scikit-learn svm libsvm machine-learning

Category Data Science


This is just a guess, but in my view LinearSVC might perform better than SVC with a linear kernel because of regularization.

Because LinearSVC is based on liblinear rather than libsvm, it has more flexibility: it lets you choose the penalty and loss applied to your SVM (the default is an L2/ridge penalty with squared hinge loss).

Because you have more features than observations, there exist multiple solutions to your classification problem, some more "robust" than others. L2 regularization reduces the amplitude of your model coefficients and may lead to a more stable solution.
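To see the amplitude-shrinking effect, here is a sketch on the same kind of synthetic data: in LinearSVC, a smaller `C` means a stronger L2 penalty, so the norm of the coefficient matrix shrinks (the data-generation parameters are again placeholders, not the asker's dataset):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=2000,
                           n_informative=50, n_classes=10,
                           random_state=0)

# Smaller C -> the L2 penalty dominates the objective -> smaller ||w||
for C in (10.0, 1.0, 0.01):
    clf = LinearSVC(C=C, max_iter=5000).fit(X, y)
    print(f"C={C:>5}: ||coef|| = {np.linalg.norm(clf.coef_):.3f}")
```

The printed norms decrease as `C` decreases, which is the "reduced amplitude" mentioned above.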

More on LinearSVC in the documentation:

http://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html

More on the benefits of ridge regression when working with many features (and thus multicollinearity):

https://stats.stackexchange.com/questions/118712/why-does-ridge-estimate-become-better-than-ols-by-adding-a-constant-to-the-diago
