Why is linear regression not doing much worse with a low-weighted attribute?
I've been able to build a few linear regression models that predict a material strength quite well: a minimum RMSE of 17.95 using 11 attributes selected from the 159 original attributes. The target is distributed with mean = 234.4 and stdev = 19.9. I am working in Orange3.
When using only the highest-weighted attribute (weight 8.013), the model achieves an RMSE of 18.767. If I use only the lowest-weighted attribute (weight 0.051), the RMSE is 20.007. The difference is 1.24, or roughly 7% of the best RMSE. Why isn't the difference bigger? I would have expected that using only the attribute with almost no weight would cause the model to predict completely incorrect values for the target variable.
The input data is 3700 instances (cleaned and correct), and I am using 10-fold cross-validation. The weak-attribute RMSE is only slightly above the standard deviation of the data -- is it just luck, or what explains the small difference in RMSE?
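To make the comparison concrete outside of Orange3, here is a minimal sketch of the same experiment using scikit-learn on synthetic data. The data, coefficients, and noise level are all hypothetical (chosen so the target's mean and spread roughly resemble the numbers above); the point is only to show the setup: 10-fold CV RMSE of a one-attribute linear model, for a strong attribute versus a nearly irrelevant one, compared against the target's standard deviation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 3700  # same number of instances as in the question

# Hypothetical data: the target depends strongly on one attribute
# and barely at all on the other.
strong = rng.normal(size=n)
weak = rng.normal(size=n)
y = 234.4 + 18.0 * strong + 0.5 * weak + rng.normal(scale=8.0, size=n)

def cv_rmse(x, y):
    # 10-fold cross-validated RMSE of a single-attribute linear model
    scores = cross_val_score(
        LinearRegression(), x.reshape(-1, 1), y,
        scoring="neg_root_mean_squared_error", cv=10,
    )
    return -scores.mean()

rmse_strong = cv_rmse(strong, y)
rmse_weak = cv_rmse(weak, y)
print(f"strong-attribute RMSE: {rmse_strong:.3f}")
print(f"weak-attribute RMSE:   {rmse_weak:.3f}")
print(f"target stdev:          {y.std():.3f}")
```

In this sketch the weak-attribute model's RMSE lands close to the target's standard deviation rather than blowing up, which mirrors the pattern described above: the model can still fit an intercept, so its predictions degrade toward the mean instead of toward "completely incorrect" values.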
Topic rmse machine-learning-model linear-regression machine-learning
Category Data Science