I am getting very small MSE values and am not sure whether they are correct

Below is the linear regression model I fitted. I am not sure whether I am doing this the right way, as I am getting close to 99% accuracy.

Fitting Simple Linear Regression to the Training set

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

ln_regressor = LinearRegression()
# 5-fold cross-validated negative MSE on the training set
mse = cross_val_score(ln_regressor, X_train, Y_train, scoring='neg_mean_squared_error', cv=5)
mean_mse = np.mean(mse)
print(mean_mse)

# Fit on the full training set
ln_regressor.fit(X_train, Y_train)

**MSE score = -6.612466691367042e-06**

Predicting the Test set results

y_pred = ln_regressor.predict(X_test)

Evaluating accuracy of test data

mse2 = cross_val_score(ln_regressor, X_test, y_pred, scoring='neg_mean_squared_error', cv=5)
mean_mse2 = np.mean(mse2)
print(mean_mse2)

**MSE score = -4.645751512870382e-31**

Please note: my data is log-transformed and then standardized.

R2 = cross_val_score(ln_regressor, X_test, y_pred, cv=10)

R2.mean()

R2 mean is '0.9999030728571852'



First things first: accuracy is a classification concept. You can't say you have 99% accuracy for a regression problem; for regression you report an error metric such as MSE, or a goodness-of-fit measure such as $R^2$.
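As a rough illustration of what to report instead of accuracy, here is a minimal sketch that computes MSE, MAE and $R^2$ on a held-out test set. It assumes synthetic data in place of the question's `X_train`/`Y_train` (the data, the coefficients and the name `Y_test` are made up for the example). Note that it scores the predictions against the true test targets: scoring against `y_pred` itself, as the question's test step does, refits a linear model on its own predictions and so gives a near-zero error by construction.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the question's data (an assumption, not the real dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

ln_regressor = LinearRegression().fit(X_train, Y_train)
y_pred = ln_regressor.predict(X_test)

# Compare the predictions with the TRUE test targets, not with y_pred itself.
test_mse = mean_squared_error(Y_test, y_pred)
test_mae = mean_absolute_error(Y_test, y_pred)
test_r2 = r2_score(Y_test, y_pred)

print("test MSE:", test_mse)
print("test MAE:", test_mae)
print("test R^2:", test_r2)
```

With your own data you would drop the synthetic part and keep only the last few lines, passing your real test targets as the first argument to each metric.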

The training part of your code seems OK. Cross-validation is not strictly necessary here, since you are not doing any hyper-parameter tuning or model selection. The MSE is indeed low, so I would suggest you go back and check how your data is normalized: MSE depends on the scale of the target, so if your target $y$ has a very small spread, i.e. a low $\sigma$ in the Gaussian case, you are guaranteed a low MSE that says little about the quality of the fit.
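The scale dependence of MSE can be seen in a small sketch. It assumes synthetic, deliberately noisy data and a hypothetical helper `fit_and_score`; none of it comes from the question's dataset. The same problem is fitted twice, once with the original target and once with the target multiplied by `1e-3`. The MSE shrinks by the square of that factor while $R^2$ stays the same, which is why comparing the MSE with the variance of the target is more informative than reading the raw number.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic, deliberately noisy data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=1.0, size=200)


def fit_and_score(X, y):
    """Hypothetical helper: fit a linear model and return in-sample (MSE, R^2)."""
    pred = LinearRegression().fit(X, y).predict(X)
    return mean_squared_error(y, pred), r2_score(y, pred)


y_small = y * 1e-3  # same data, target rescaled to a tiny spread

mse_big, r2_big = fit_and_score(X, y)
mse_small, r2_small = fit_and_score(X, y_small)

print("MSE, original target:", mse_big)
print("MSE, rescaled target:", mse_small)  # smaller by a factor of 1e-6
print("R^2, original target:", r2_big)
print("R^2, rescaled target:", r2_small)   # unchanged by the rescaling

# MSE relative to the target variance is scale-free (in-sample it equals 1 - R^2).
relative_mse = mse_small / np.var(y_small)
print("MSE / Var(y):", relative_mse)
```

If the relative MSE on held-out data is also close to zero, the fit is genuinely good; if only the raw MSE is small, the number mostly reflects the scale of the target.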
