How to compute the constant c for PCA features before SGDClassifier, as advised in the scikit-learn documentation?

In the scikit-learn documentation for SGDClassifier, it is stated:

If you apply SGD to features extracted using PCA we found that it is often wise to scale the feature values by some constant c such that the average L2 norm of the training data equals one.

  1. Given a dummy training dataset such as:
import numpy as np
data = np.random.rand(3,3)

How can I compute c and scale the feature values? (My attempt is sketched after this list.)

  2. I am using IncrementalPCA before SGDClassifier (loss='log'). Should I compute c after every batch's partial_fit and transform, and scale the transformed data by c before feeding it into the SGDClassifier? (The batched pipeline I have in mind is sketched below.)
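Here is my attempt at the first question, assuming the tip means choosing c = 1 / (average L2 norm of the training rows), so that multiplying by c makes the average norm exactly one:

import numpy as np

np.random.seed(0)
data = np.random.rand(3, 3)

# One L2 norm per training sample (row), then their average
avg_norm = np.linalg.norm(data, axis=1).mean()

# Scale by c = 1 / avg_norm so the average L2 norm becomes one
c = 1.0 / avg_norm
scaled = c * data

# Sanity check: should print 1.0
print(np.linalg.norm(scaled, axis=1).mean())

Is this the intended reading of the documentation?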
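For the second question, this is the batched pipeline I currently have in mind. I am assuming a single global c estimated once over the transformed training data (via a running sum across batches), rather than a fresh c per batch, since a per-batch c would scale different batches inconsistently. The data, batch sizes, and n_components below are made up for illustration, and loss='log_loss' is the current name for the old loss='log':

import numpy as np
from sklearn.decomposition import IncrementalPCA
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = rng.random((1000, 20))
y = rng.integers(0, 2, size=1000)
batches = np.array_split(np.arange(1000), 10)

ipca = IncrementalPCA(n_components=5)

# Pass 1: fit the PCA incrementally
for idx in batches:
    ipca.partial_fit(X[idx])

# Pass 2: accumulate the L2 norms of the transformed samples to get
# one global c such that the average L2 norm equals one
norm_sum, n_seen = 0.0, 0
for idx in batches:
    Z = ipca.transform(X[idx])
    norm_sum += np.linalg.norm(Z, axis=1).sum()
    n_seen += len(idx)
c = n_seen / norm_sum

# Pass 3: train the classifier on consistently scaled features
clf = SGDClassifier(loss="log_loss")
classes = np.unique(y)
for idx in batches:
    clf.partial_fit(c * ipca.transform(X[idx]), y[idx], classes=classes)

# Prediction must reuse the same c
print(clf.predict(c * ipca.transform(X[:5])))

Does it make sense to fix c like this, or should it really be recomputed after every batch?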

On a side note, there is a similar question on this forum here, but it has no answer. I have also asked this in the scikit-learn GitHub discussions here, but there is no answer there either.

Thank you for your kind help.

Topic sgd pca scikit-learn

Category Data Science
