Is standardization needed before using scikit-learn SVM?

Question


user297850

May 10, 2017, 13:56

I am using the SVM function provided by scikit-learn. I would like to know whether I need to standardize the data before fitting the model. As far as I know, LibSVM tends to require pre-processing the data. I am not sure whether scikit-learn normalizes the data automatically or expects us to handle it ourselves.

Topic scikit-learn svm libsvm machine-learning

Category Data Science

phypho · Accepted Answer · May 10, 2017, 13:56

By default, for any method that uses gradient descent or feature combination (e.g. PCA), I scale my data if the features differ from each other by orders of magnitude. The optimization is easier when the fitted parameters are not too far from zero in any direction. Scaling your data doesn't matter for tree-based methods.

Mohammad Athar · Accepted Answer · February 2, 2017, 15:39

scikit-learn does not standardize data for you, but it does offer utilities to standardize your input data yourself: http://scikit-learn.org/stable/modules/preprocessing.html
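As a minimal sketch of what using those utilities can look like, the snippet below chains `StandardScaler` and `SVC` with `make_pipeline`, so the scaler is fit on the training split only and then reused on the test split. The wine dataset, the 70/30 split, and the default `SVC` settings are assumptions picked for illustration; they are not part of the original question.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative dataset (an assumption): its features span very different ranges.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# SVC fitted directly on the raw features: no scaling happens internally.
raw_svm = SVC().fit(X_train, y_train)

# Same classifier, but with standardization as an explicit pipeline step.
scaled_svm = make_pipeline(StandardScaler(), SVC()).fit(X_train, y_train)

raw_acc = raw_svm.score(X_test, y_test)
scaled_acc = scaled_svm.score(X_test, y_test)
print(f"accuracy without scaling: {raw_acc:.3f}")
print(f"accuracy with scaling:    {scaled_acc:.3f}")
```

Putting the scaler inside the pipeline also keeps cross-validation honest, because the mean and standard deviation are recomputed on each training fold rather than on the full dataset.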

The rule of thumb is to standardize if your data aren't related. That is, if channel X is not a function of channel Y, you should standardize.

Qualitatively, think about it this way: an SVM 'creates a hyperplane' to separate the data into categories. If the data are skewed too far along one axis, it becomes harder to draw a plane that separates them.
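One way to make that intuition concrete: an SVM's margin and its RBF kernel both depend on distances between samples, so a feature with a much larger spread dominates those distances. The sketch below uses synthetic data (the means, spreads, and sample count are made-up assumptions) and measures how much each feature contributes to the squared distances between samples, before and after standardization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): feature 0 is in the thousands,
# feature 1 lies roughly between 0 and 1.
X = np.column_stack([
    rng.normal(1000.0, 200.0, size=500),
    rng.normal(0.5, 0.2, size=500),
])

def distance_share(data):
    """Fraction of the total squared distance between consecutive samples
    that comes from each feature."""
    sq_diff = (data[1:] - data[:-1]) ** 2
    return sq_diff.sum(axis=0) / sq_diff.sum()

# Standardize by hand: zero mean and unit variance per feature.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

raw_share = distance_share(X)
scaled_share = distance_share(X_std)
print("share per feature, raw:         ", raw_share)
print("share per feature, standardized:", scaled_share)
```

On the raw data, nearly all of the distance comes from the large-scale feature, so the second feature has almost no say in where the boundary goes. After standardization, both features contribute on a comparable footing.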
