Is R2 score a reasonable regression measure on huge datasets?
I'm running a regression model on a pretty large data set and getting a fairly woeful $R^2$ score of ~0.2 (see plot below), despite the plot looking like the model is generally pointing in the right direction.
My question is, when you have over a million data points, how high can you realistically expect the $R^2$ to go in real world data with a decent amount of noise?
What prompts by scepticism of such traditional measures are articles such as this that discuss how quantity of data can degrade statistical tests.
Let me know what you think and any regression examples using $R^2$ score as a quality metric.
Topic r-squared metric regression performance
Category Data Science