Batch size influences R2 score a lot, but MSE not much
When I train a model found by random search (and in general for the problem I am working on), a large batch size seems to control the R2 score: a batch size of roughly 200 or more gives R2 scores of 0.95 or above and an MSE of about 0.012.
If I lower the batch size, the MSE may decrease a little faster (I think), but the R2 score blows up, to around -5692.7026 for example. E.g.:
97256/100664 [===========================..] - ETA: 6s - loss: 0.0184 - coeff_determination: -5692.7026 - mean_squared_error: 0.0184 - mae: 0.0162 - mse: 0.0184
Given that MSE and R2 are so closely linked, I would have thought this couldn't happen. Are there any guesses as to how to interpret this, or what causes it?
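To make the question concrete, here is a minimal NumPy sketch of one possible explanation. It rests on assumptions not confirmed above: that `coeff_determination` is a custom metric of the form 1 - SS_res/SS_tot evaluated on each batch separately, and that the progress bar reports the running average over batches. The data and the helper names (`r2`, `mean_batch_metrics`) are made up for illustration and are not from my actual model.

```python
import numpy as np

def r2(y_true, y_pred):
    # Coefficient of determination computed on whatever slice it is given.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def mean_batch_metrics(y_true, y_pred, batch_size):
    # Assumption: metrics are computed per batch and then averaged.
    r2s, mses = [], []
    for start in range(0, len(y_true), batch_size):
        yt = y_true[start:start + batch_size]
        yp = y_pred[start:start + batch_size]
        r2s.append(r2(yt, yp))
        mses.append(np.mean((yt - yp) ** 2))
    return np.mean(r2s), np.mean(mses)

# Synthetic stand-in data: targets with unit variance, predictions off by
# small noise, so the R2 over the full data set is close to 0.99.
rng = np.random.default_rng(0)
y_true = rng.normal(0.0, 1.0, size=10_000)
y_pred = y_true + rng.normal(0.0, 0.1, size=10_000)

r2_large, mse_large = mean_batch_metrics(y_true, y_pred, batch_size=200)
r2_small, mse_small = mean_batch_metrics(y_true, y_pred, batch_size=2)

print("batch_size=200: mean R2 =", r2_large, " mean MSE =", mse_large)
print("batch_size=2:   mean R2 =", r2_small, " mean MSE =", mse_small)
```

In this sketch the averaged MSE is the same for both batch sizes, because it only sums squared errors. The per-batch R2 divides by the variance of the targets inside each batch, so a few batches whose targets happen to be nearly identical give enormous negative values and dominate the average. Whether this is what happens in my training run depends on how the metric is actually defined.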
Thanks!
Topic mse r-squared deep-learning neural-network machine-learning
Category Data Science