Why is the variance of my model predictions much smaller than the training data?

I trained a GRU model on some data and then created a bunch of predictions on a test set.

The predictions are really bad, as indicated by a near-zero R² score.

I notice that the variance of the model predictions is much smaller than that of the actual training data, i.e. it seems like the model has collapsed towards the mean.
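For reference, this is roughly how I am comparing them (preds is the model output on my test set; X_test and y_test are my test arrays):

import numpy as np
from sklearn.metrics import r2_score

preds = model.predict(X_test).ravel()  # flatten the (n, 1) output to shape (n,)

print("variance of y_test:", np.var(y_test))
print("variance of preds:", np.var(preds))
print("R2 score:", r2_score(y_test, preds))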

But why is this? I made sure to stop training and to pick hyperparameters at a point where the model was not overfitting, so why are the model predictions centred around the mean and less dispersed than the actual data?

My model, if it is relevant:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense

# GRU over sequences of 3 timesteps with 280 features each, one regression output
model = Sequential()
model.add(GRU(100, activation='relu', input_shape=(3, 280), recurrent_dropout=0.2, dropout=0.2))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.summary()

# fit model
history = model.fit(X_train, y_train, epochs=40, verbose=1, validation_split=0.33)

The first thing you can do here is to scale your output data before training the model, for instance by standardising each target with the training-set mean and standard deviation:

$$ \dfrac{y_i - \bar{y}_{train}}{\sigma_{y_{train}}} $$
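A minimal sketch of this, assuming y_train and y_test are 1-D NumPy arrays; the key points are that the statistics come from the training set only and that predictions are transformed back before scoring:

import numpy as np

# standardise the targets using training-set statistics only
y_mean = y_train.mean()
y_std = y_train.std()
y_train_scaled = (y_train - y_mean) / y_std

# train on the scaled targets
history = model.fit(X_train, y_train_scaled, epochs=40, verbose=1, validation_split=0.33)

# predictions come out on the scaled scale; invert the transform before scoring
preds = model.predict(X_test).ravel() * y_std + y_mean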

If that does not work, then have a look at the model and/or training parameters; in your case the relu activation in the GRU, the fairly heavy dropout, and the fixed 40-epoch budget are all worth experimenting with.
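A sketch of those experiments, keeping your architecture but switching to the GRU's default tanh activation, lightening the dropout, and replacing the fixed epoch count with early stopping (these are things to try, not guaranteed fixes):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense
from tensorflow.keras.callbacks import EarlyStopping

# same architecture, with tanh (the GRU default) and lighter dropout
model = Sequential()
model.add(GRU(100, activation='tanh', input_shape=(3, 280), recurrent_dropout=0.1, dropout=0.1))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# train longer, but stop once validation loss stops improving
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

# y_train_scaled is from the standardisation sketch above (use y_train if you skipped scaling)
history = model.fit(X_train, y_train_scaled, epochs=200, verbose=1,
                    validation_split=0.33, callbacks=[early_stop])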
