R-squared for a real-valued label under a nonlinear regression learner

Below are my questions about R-squared for a real-valued label when the learner is a nonlinear regression model. This may be a broad problem; if there is no easy answer, could you point me to some references?

First, for a real-valued label, is there any good measure other than R-squared for evaluating the quality of a fit? I know that a small MSE, MAE, etc. usually indicates a good fit. However, these are not as intuitive as a ratio like R-squared, because they depend on the scale of the label (how small is small enough?).
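To make the scale issue concrete, here is a minimal sketch using only NumPy. The helper names `mse` and `mae` and the toy data are my own assumptions for illustration, not taken from any particular library. It shows that the same fit gives very different MSE and MAE values once the label is rescaled, which is why "how small is good?" has no fixed answer for them.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error (in squared units of the label)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

def mae(y_true, y_pred):
    """Mean absolute error (in the units of the label)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

# Toy data (hypothetical): every prediction is off by 0.5.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
pred = y + 0.5

# The same fit after changing the unit of the label (e.g. km to m).
scale = 1000.0
y_scaled = y * scale
pred_scaled = pred * scale

print("original units:", mse(y, pred), mae(y, pred))
print("rescaled units:", mse(y_scaled, pred_scaled), mae(y_scaled, pred_scaled))
```

MAE grows with the scale factor and MSE with its square, while the quality of the fit relative to the spread of the label has not changed.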

Second, does R-squared really make sense for a nonlinear regression learner? For linear regression fitted by least squares with an intercept, the total sum of squares decomposes into an explained part and a residual part, so R-squared always lies between 0 and 1, and the closer it is to 1, the better the fit. Other learners cannot guarantee this decomposition, and R-squared can even become a large negative number. What we really want is a small residual error, but does it make sense to compare it with the total error? One way to understand it is that the total error (the sample variance of the label) does not depend on the model, so can we use it as a benchmark?
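A small numerical sketch of what I mean, again with NumPy only. The function `r_squared`, the toy data, and the constant "bad" predictor are assumptions made up for illustration. I am using the definition R² = 1 − SS_res / SS_tot, where SS_tot is computed around the sample mean of the label.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with the sample mean as the baseline."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy data (hypothetical).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# Ordinary least squares with an intercept (degree-1 polynomial fit).
slope, intercept = np.polyfit(x, y, 1)
fitted = slope * x + intercept

ss_tot = np.sum((y - y.mean()) ** 2)
ss_reg = np.sum((fitted - y.mean()) ** 2)
ss_res = np.sum((y - fitted) ** 2)

# For OLS with an intercept the decomposition SS_tot = SS_reg + SS_res holds.
print("OLS decomposition holds:", np.isclose(ss_tot, ss_reg + ss_res))
print("OLS R^2:", r_squared(y, fitted))            # about 0.64

# Stand-in for an arbitrary learner: a constant prediction far from the data.
bad_pred = np.full_like(y, 10.0)
bad_ss_reg = np.sum((bad_pred - y.mean()) ** 2)
bad_ss_res = np.sum((y - bad_pred) ** 2)

# Here the decomposition fails and R^2 is strongly negative.
print("bad decomposition holds:", np.isclose(ss_tot, bad_ss_reg + bad_ss_res))
print("bad R^2:", r_squared(y, bad_pred))          # about -24.5

# Predicting the sample mean gives exactly the benchmark value R^2 = 0.
print("mean-predictor R^2:", r_squared(y, np.full_like(y, y.mean())))
```

Under this definition, a negative value just says the learner does worse than always predicting the sample mean, which is the "benchmark" reading I am asking about.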

Topic r-squared machine-learning

Category Data Science
