What is the difference between a keras-tuner 'best model' and a manually defined Sequential model with the same hyperparameters?
I have a dataset that I divided into 10 splits of training, validation and test sets for a regression problem.
I used the first split with RandomSearch in keras-tuner to arrive at the best hyperparameters for an MLP model with two hidden layers. The hyperparameters I tuned are the number of neurons in the first hidden layer, the number of neurons in the second hidden layer, and the learning rate. I loaded the 'best model', applied it to each split to find the $R^2$ for that split, and averaged the $R^2$ values.
When I manually define the MLP model with the same hyperparameters, my results are poorer. I don't understand why this is the case.
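One thing worth checking: two freshly defined models with identical hyperparameters still start from different random initial weights, so a manual rebuild is not the same object as the tuner's saved 'best model' (which carries trained weights). A minimal sketch, with hypothetical tuned values:

```python
# Two models built with the same hyperparameters do not share weights:
# each gets its own random initialization. units1/units2 are hypothetical.
import numpy as np
import keras

def make_mlp(units1=64, units2=32):
    return keras.Sequential([
        keras.Input(shape=(8,)),  # assumed 8 input features, for illustration
        keras.layers.Dense(units1, activation="relu"),
        keras.layers.Dense(units2, activation="relu"),
        keras.layers.Dense(1),
    ])

a, b = make_mlp(), make_mlp()
same = all(np.allclose(wa, wb)
           for wa, wb in zip(a.get_weights(), b.get_weights()))
print(same)  # almost surely False: the kernel initializations differ
```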
If I tuned with one split, is it OK to use the 'best model' when fitting the other 9 splits?
Finally, my question relates more generally to the reproducibility of results. Sometimes, if I start from scratch and tune again, fit the model, and test, the results are not that great. I then load the old 'best model' from keras-tuner, which gave me good results. How do I deal with this randomness?
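For what it's worth, one way I understand people reduce this run-to-run randomness is by fixing the seeds before building and training. A minimal sketch:

```python
# Fix the seeds so that weight initialization (and, on CPU, training)
# is repeatable across runs. set_random_seed covers Python's random,
# NumPy, and the Keras backend in one call.
import keras

keras.utils.set_random_seed(42)

# Optional, TensorFlow-specific: make GPU ops deterministic as well
# (slower, and not every op supports it):
# import tensorflow as tf
# tf.config.experimental.enable_op_determinism()
```

After seeding, two identically defined models get identical initial weights, which removes one source of the discrepancy.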
Summary of the steps I am following:
1. tune with one split
2. load the best model
3. fit each split and save the weights
4. load the weights
5. predict on the test data
6. compute $R^2$
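The evaluation at the end of these steps boils down to computing $R^2$ on each test set and averaging. A self-contained sketch of that metric computation, with synthetic stand-ins for the per-split predictions:

```python
# R^2 per split, then averaged over splits. The (y_test, y_pred) pairs
# here are synthetic; in the real workflow they come from model.predict
# on each of the 10 test sets.
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
scores = []
for _ in range(10):                                    # 10 splits
    y_test = rng.normal(size=50)
    y_pred = y_test + rng.normal(scale=0.1, size=50)   # near-perfect predictions
    scores.append(r2_score(y_test, y_pred))
mean_r2 = float(np.mean(scores))
print(round(mean_r2, 3))  # close to 1, since the noise is small
```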
I would like to know if I am doing something wrong conceptually.
Topic: mlp, hyperparameter-tuning, regression
Category: Data Science