Layman's comparison of RMSE

I don't have a maths / stats / data science background and need to work out which of the two evaluations below (numerical regression on Amazon Machine Learning) predicts more accurately. Both models use the same data set, but they look at different time frames for both the independent and dependent variables.

How can I evaluate which one of the two models is more accurate? And is there a way to tell how accurate these two models are in general (e.g. 75%)?

Model 1

Model 2

Topic amazon-ml regression

Category Data Science


Typically you want a smaller RMSE, and without getting into detail it should be sufficient to just pick the model with the smaller one. However, I am concerned because you state that the models were run on the same dataset but over different time frames. Since the scale of RMSE depends on the scale of the dependent variable (RMSE is expressed in the same units as the value being predicted), it's entirely possible that the two time frames are scaled differently. A somewhat contrived example would be energy consumption: I would expect a model trained on daytime consumption to have a higher RMSE than one trained on consumption between 1 am and 3 am, simply because the daytime values are larger. In that case, comparing the raw RMSE values may be meaningless. You can try to normalize your data and RMSE to help with this, but I'm unsure if AWS provides this ability.
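To illustrate the scale problem, here is a minimal sketch in plain Python. The numbers are made up, and dividing by the range of the actual values is only one common way to normalize RMSE (dividing by the mean or the standard deviation are others). The function names `rmse` and `normalized_rmse` are my own and not part of any AWS API.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def normalized_rmse(actual, predicted):
    """RMSE divided by the range of the actual values (one possible choice)."""
    spread = max(actual) - min(actual)
    if spread == 0:
        raise ValueError("actual values have zero range")
    return rmse(actual, predicted) / spread

# Hypothetical data: night-time values are 10x smaller than daytime values,
# and both models miss by the same relative amount.
day_actual = [100.0, 120.0, 140.0, 160.0]
day_pred = [110.0, 110.0, 150.0, 150.0]
night_actual = [10.0, 12.0, 14.0, 16.0]
night_pred = [11.0, 11.0, 15.0, 15.0]

print("raw RMSE:", rmse(day_actual, day_pred), rmse(night_actual, night_pred))
print("normalized:", normalized_rmse(day_actual, day_pred),
      normalized_rmse(night_actual, night_pred))
```

The raw RMSE makes the night-time model look ten times better, while the normalized values come out equal, which is why comparing raw RMSE across differently scaled targets can mislead.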

As for your second question, you really won't get a single "75% accurate" number for regression. You can look at the deviation of the residuals, or do cross-validation, and see how well the model performs.

Again this may not be possible in AWS.
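If you can export the data and work outside AWS, cross-validation could look roughly like the sketch below. Everything here is an assumption for illustration: the data is synthetic, the model is a one-variable least-squares line rather than whatever Amazon ML fits, and `fit_line` and `k_fold_rmse` are hypothetical helper names.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for a single input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_rmse(xs, ys, k=5):
    """Hold out each of k folds in turn and return the RMSE on each held-out fold."""
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(xs), k))  # every k-th point goes to this fold
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        sq_errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
        scores.append(math.sqrt(sum(sq_errors) / len(sq_errors)))
    return scores

# Synthetic data: a straight line plus a small alternating error.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

fold_scores = k_fold_rmse(xs, ys, k=5)
mean_score = sum(fold_scores) / len(fold_scores)
print("RMSE per fold:", fold_scores)
print("mean RMSE:", mean_score)
```

If the per-fold RMSE values are close to each other, the model performs about as well on data it has not seen as on data it was trained on. Large swings between folds suggest the single RMSE you were given may not be representative.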

Edit: I just realized that the histograms were residual plots. Do three things: increase the bin size, check whether the residuals are centered around 0, and then check whether they are skewed. If the residuals are centered around 0 and symmetric, then you can say the model error is basically random and does not favor over- or under-predicting. If the residuals are not centered around 0 and there is skewness, then the errors may be systematic, and in that case consider adding more variables.
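The centering and skewness checks can also be done numerically if you have the actual and predicted values. This is a rough sketch on made-up numbers. It assumes a residual is defined as actual minus predicted, and `residual_summary` is a hypothetical helper, not an AWS function.

```python
import math

def residual_summary(actual, predicted):
    """Mean, standard deviation and skewness of the residuals (actual - predicted)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n  # second central moment
    m3 = sum((r - mean) ** 3 for r in residuals) / n  # third central moment
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"mean": mean, "std": math.sqrt(m2), "skewness": skewness}

actual = [10.0, 12.0, 14.0, 16.0, 18.0]

# Errors scattered evenly above and below the true values.
balanced = residual_summary(actual, [11.0, 11.0, 14.0, 17.0, 17.0])

# The model predicts too low every time, with one large miss.
biased = residual_summary(actual, [8.0, 10.0, 12.0, 14.0, 10.0])

print("balanced:", balanced)
print("biased:", biased)
```

A mean near 0 with skewness near 0 matches the "basically random" case described above. A mean clearly away from 0, or a large skewness in either direction, points to systematic over- or under-prediction.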
