Python sklearn PCA transform output does not match manual projection onto components_
I am fitting PCA on some data with 10 components and then projecting onto the first 3 of those 10:
import numpy
from sklearn.decomposition import PCA
transformer = PCA(n_components=10)
trained = transformer.fit(train)
one = numpy.matmul(train, numpy.transpose(trained.components_[:3, :]))
Here trained.components_[:3,:] are:
array([[-1.43311999e-03, 1.65635865e-01, 5.49189565e-01,
5.26069645e-02, 2.42638594e-01, 1.20957807e-02,
1.30595572e-01, 1.09279646e-02, 7.21299808e-03,
-2.79057934e-02, -1.14834589e-02, 5.06289160e-01,
5.42890317e-01, 8.50422194e-02, 1.80935205e-01,
2.98473275e-05, -8.04537378e-04],
[-1.05419313e-02, 3.09442577e-01, -8.15534934e-02,
4.28621520e-03, 2.93323569e-01, 3.85849115e-02,
-1.16193185e-01, 4.14964652e-01, 4.16279154e-01,
2.95264788e-01, 3.28620106e-01, -2.60916490e-01,
-2.37459426e-02, 1.57567265e-01, 4.02873342e-01,
5.28389303e-05, -2.07920000e-03],
[ 8.63072772e-03, -3.26129082e-01, 8.59869400e-02,
3.04770780e-03, -3.14966419e-01, -2.47151330e-02,
1.05987767e-01, 3.74235953e-01, 3.75747065e-01,
2.76035253e-01, 3.18273743e-01, 3.02423861e-01,
2.76535177e-02, -1.51485057e-01, -4.48558170e-01,
-8.83328996e-05, -2.25542180e-03]])
and then fitting with only 3 components and calling transform:
transformer = PCA(n_components=3)
trained=transformer.fit(train)
two=trained.transform(train)
Here the components are:
array([[-1.43311999e-03, 1.65635865e-01, 5.49189565e-01,
5.26069645e-02, 2.42638594e-01, 1.20957807e-02,
1.30595572e-01, 1.09279646e-02, 7.21299808e-03,
-2.79057934e-02, -1.14834589e-02, 5.06289160e-01,
5.42890317e-01, 8.50422194e-02, 1.80935205e-01,
2.98473275e-05, -8.04537377e-04],
[-1.05419314e-02, 3.09442577e-01, -8.15534934e-02,
4.28621520e-03, 2.93323569e-01, 3.85849115e-02,
-1.16193185e-01, 4.14964652e-01, 4.16279154e-01,
2.95264788e-01, 3.28620106e-01, -2.60916490e-01,
-2.37459426e-02, 1.57567265e-01, 4.02873342e-01,
5.28389307e-05, -2.07919994e-03],
[ 8.63072765e-03, -3.26129082e-01, 8.59869400e-02,
3.04770780e-03, -3.14966419e-01, -2.47151331e-02,
1.05987767e-01, 3.74235953e-01, 3.75747065e-01,
2.76035253e-01, 3.18273743e-01, 3.02423861e-01,
2.76535177e-02, -1.51485057e-01, -4.48558170e-01,
-8.83328994e-05, -2.25542175e-03]])
But one is not equal to two, even though the components are the same in both cases (up to floating-point rounding). As far as I can tell, they differ because the transform function first subtracts the mean vector from the data and only then multiplies by the components. But why should the mean be subtracted here, given that it was already subtracted in the first step, when PCA computed the basis?
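To make the discrepancy concrete, here is a minimal sketch on synthetic data (the random train array, its 200 x 17 shape, and svd_solver="full" are assumptions for illustration, not my real data). It checks that transform agrees with projecting the centered data, and that the uncentered product differs from it by the same constant vector, mean_ projected onto the components, in every row:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the real data (assumption): 200 samples, 17 features,
# each feature with a non-zero mean so that centering actually matters.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 17
loc = np.linspace(1.0, 10.0, n_features)
scale = np.linspace(1.0, 5.0, n_features)
train = rng.normal(loc=loc, scale=scale, size=(n_samples, n_features))

# svd_solver="full" is fixed here so both fits use the same decomposition.
pca10 = PCA(n_components=10, svd_solver="full").fit(train)
pca3 = PCA(n_components=3, svd_solver="full").fit(train)

# Manual projection of the raw (uncentered) data, as in the question.
one = train @ pca10.components_[:3].T

# Library transform, which centers with the fitted mean_ before projecting.
two = pca3.transform(train)

# Manual projection of the centered data.
manual = (train - pca3.mean_) @ pca3.components_.T

# The constant offset between the two results: the mean projected onto the basis.
offset = pca3.mean_ @ pca3.components_.T

print("components equal:      ", np.allclose(pca10.components_[:3], pca3.components_))
print("one == two:            ", np.allclose(one, two))
print("centered == two:       ", np.allclose(manual, two))
print("one - two == offset:   ", np.allclose(one - two, offset))
```

So in this sketch the two results are related by one = two + mean_ @ components_.T. The basis is fitted on centered data, but the array passed to transform is still the raw, uncentered one, so the same mean_ has to be removed again for the scores to be expressed relative to the training mean.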
Topic pca scikit-learn python
Category Data Science