How to interpret the Mean squared error value in a regression model?

I'm working on a simple linear regression model to predict 'Label' based on 'feature'. The two variables seem to be highly correlated (corr = 0.99). After splitting the data sample into training and testing sets, I make predictions and evaluate the model.

metrics.mean_squared_error(Label_test,Label_Predicted) = 99.17777494521019
metrics.r2_score(Label_test,Label_Predicted) = 0.9909449021176512

Based on the r2_score, my model seems to be performing very well, 1 being the highest possible value. But when it comes to the mean squared error, I don't know whether it shows that my model is performing well or not.

  1. How can I interpret the MSE here?

  2. If I had multiple algorithms and the same data sets, after computing MSE or RMSE for all models, how can I tell which one describes the data better?

  3. The R2 score is 0.99. Is this suspicious, or is it expected since the label and feature are highly correlated?

           Feature       Label
    0      56171.757812  56180.234375
    1      56352.500000  56363.476562
    2      56312.539062  56310.859375
    3      56432.539062  56437.460938
    4      56190.859375  56199.882812
    ...    ...           ...
    24897  56476.484375  56470.742188
    24898  56432.148438  56432.968750
    24899  56410.312500  56428.437500
    24900  56541.093750  56541.015625
    24901  56491.289062  56499.843750
    

Topic rmse regression python predictive-modeling machine-learning

Category Data Science


Whether your model is performing well or not depends on your business case. You might have a tiny RMSE or a great-looking score on whatever metric you are using, but if that is still not enough to solve the business problem, then the model is not performing well.

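  1. MSE is just that: the Mean Squared Error, i.e. the average of the squared differences between the predicted and actual values. It is expressed in squared units of the label, which makes the raw number hard to read on its own.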

  2. Both MSE and RMSE measure how much the predicted results deviate from the actual values. Because of the squared term, larger errors are given more weight, and because of the square root, RMSE is in the same units as the dependent variable. When the models are evaluated on the same test set, the one with the lower MSE or RMSE fits that data better. MAE (Mean Absolute Error) is another useful metric to look at when you are evaluating a regression model; it is also easier to interpret.

  3. Given your data, R-squared seems fine to me.
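To make the three points above a bit more concrete, here is a minimal sketch in plain Python. The helper functions are hand-written stand-ins for what `sklearn.metrics` computes, and `y_true`, `pred_a` and `pred_b` are made-up toy values for two hypothetical models, not your data. The only number taken from the question is the reported MSE; the "typical label" of about 56,000 is an assumption read off the sample rows you posted.

```python
import math

def mse(y_true, y_pred):
    """Mean of squared differences (in squared units of the label)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of MSE (same units as the label)."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean of absolute differences (same units as the label)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Point 1: put the reported MSE back into label units.
reported_mse = 99.17777494521019       # value from the question
reported_rmse = math.sqrt(reported_mse)
typical_label = 56000.0                # assumption, read off the sample rows
print(f"RMSE: {reported_rmse:.2f}")    # about 9.96
print(f"RMSE relative to a typical label: {reported_rmse / typical_label:.4%}")

# Point 2: compare two hypothetical models on the same toy test set.
y_true = [3.0, 5.0, 7.0, 9.0]
pred_a = [2.5, 5.0, 7.5, 9.0]          # two small errors
pred_b = [3.0, 5.0, 7.0, 11.0]         # one large error

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(name, "MSE:", mse(y_true, pred), "RMSE:", rmse(y_true, pred),
          "MAE:", mae(y_true, pred), "R2:", r2(y_true, pred))
```

On the toy data, model A has the lower RMSE and MAE, so it would be preferred on that test set. Model B's single large miss is penalised much more by MSE than by MAE, which is the effect of the squared term. For your own numbers, an RMSE of roughly 10 on labels around 56,000 looks small, but whether it is small enough still depends on the business case.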
