Why is the variance of my model predictions much smaller than the training data?
I trained a GRU model on some data and then created a bunch of predictions on a test set.
The predictions are really bad, as indicated by a near-zero R² score.
I notice that the variance of the model predictions is much smaller than the variance of the actual training data, i.e. it seems like the model has collapsed towards the mean.
But why is this? I made sure to stop training and choose hyperparameters at a point where there was no sign of overfitting, so why are the model's predictions centred around the mean and less dispersed than the actual data?
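To illustrate what I mean by "collapsed towards the mean": if the inputs carry little usable signal, the MSE-optimal prediction is the target mean itself. A toy sketch with made-up numbers (not my real data):

```python
import numpy as np

# Among all constant predictions c, MSE is minimised at c = mean(y).
# So a model trained on MSE that can't find signal drifts towards the mean.
y = np.array([1.0, 2.0, 3.0, 10.0])
cs = np.linspace(0.0, 10.0, 1001)          # candidate constant predictions
mse = ((y[None, :] - cs[:, None]) ** 2).mean(axis=1)
best_c = cs[np.argmin(mse)]
print(best_c, y.mean())                     # best constant ≈ mean of y
```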
My model, if it is relevant:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense

# GRU over sequences of 3 timesteps x 280 features, single regression output
model = Sequential()
model.add(GRU(100, activation='relu', input_shape=(3, 280), recurrent_dropout=0.2, dropout=0.2))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.summary()

# fit model
history = model.fit(X_train, y_train, epochs=40, verbose=1, validation_split=0.33)
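For reference, this is roughly how I'm comparing the spreads, sketched here with synthetic stand-ins for my real targets and predictions (the shrink factor and noise level are invented to reproduce the symptom):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in target with unit variance, and "predictions" shrunk towards the mean
y_test = rng.normal(loc=0.0, scale=1.0, size=1000)
preds = 0.1 * (y_test - y_test.mean()) + y_test.mean() + rng.normal(scale=0.05, size=1000)

print("variance of targets:    ", y_test.var())
print("variance of predictions:", preds.var())

# R² = 1 - SS_res / SS_tot, so low-variance predictions cap how high R² can go
ss_res = ((y_test - preds) ** 2).sum()
ss_tot = ((y_test - y_test.mean()) ** 2).sum()
print("R2:", 1 - ss_res / ss_tot)
```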