Odd ElasticNet convergence behavior

I am optimizing a model using ElasticNet, but I am getting some odd behavior. When I set the tolerance hyperparameter to a small value, I get

ConvergenceWarning: Objective did not converge

warnings. So I tried a larger tolerance value, and the convergence warning goes away, but now the test data consistently gives a higher root mean squared error value. This seems backwards to me: if the model does not converge, what can cause it to give a better RMSE score, or even give consistent scores? I am using this inside a GridSearchCV call:

No error, bad score:

GridSearchCV(cv=KFold(n_splits=4, random_state=None, shuffle=True),
             estimator=ElasticNet(),
             param_grid={'alpha': [0.01], 'fit_intercept': [True],
                         'l1_ratio': [0.75, 0.8, 0.85, 0.9],
                         'max_iter': range(350, 450, 50), 'normalize': [False],
                         'random_state': [3], 'selection': ['random'],
                         'tol': [0.1]},
             refit='nmse', return_train_score=True,
             scoring={'mae': 'neg_mean_absolute_error',
                      'nmse': 'neg_mean_squared_error',
                      'nmsle': 'neg_mean_squared_log_error', 'r2': 'r2'})

Running GridSearchCV:
C:\miniconda3\envs\tf\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:529: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1253.5072980364057, tolerance: 848.4484859258542
  positive)

Error, good score:

GridSearchCV(cv=KFold(n_splits=4, random_state=None, shuffle=True),
             estimator=ElasticNet(),
             param_grid={'alpha': [0.01], 'fit_intercept': [True],
                         'l1_ratio': [0.75, 0.8, 0.85, 0.9],
                         'max_iter': range(350, 450, 50), 'normalize': [False],
                         'random_state': [3], 'selection': ['random'],
                         'tol': [0.01]},
             refit='nmse', return_train_score=True,
             scoring={'mae': 'neg_mean_absolute_error',
                      'nmse': 'neg_mean_squared_error',
                      'nmsle': 'neg_mean_squared_log_error', 'r2': 'r2'})

Topic: elastic-net convergence cross-validation predictive-modeling

Category: Data Science


Increasing the tolerance will result in a higher root mean squared error most of the time. A larger tol tells the coordinate descent solver it is okay to stop earlier, accepting a less optimized solution rather than continuing the search with smaller updates. The ConvergenceWarning with the smaller tol means the opposite happened: the solver used up its entire iteration budget (max_iter here is only 350 or 400) before the duality gap fell below the threshold. Even though that fit did not formally converge, it is further along the optimization path than the fit accepted by the loose tolerance, so it scores better on the test data; and because random_state is fixed and the iteration budget is the same on every run, its scores are also consistent. If you want the warning to go away without sacrificing accuracy, increase max_iter (or standardize the features, which helps coordinate descent converge) rather than loosening tol.
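Here is a minimal sketch of the effect. It is not your setup: the synthetic dataset, the train/test split, and the fixed alpha, l1_ratio, and max_iter values are all assumptions chosen just to make the contrast visible. It fits the same ElasticNet twice, once with the loose tolerance and once with the tight one, and prints the iteration count, test RMSE, and whether a ConvergenceWarning was raised.

import warnings

import numpy as np
from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression problem, illustrative only.
X, y = make_regression(n_samples=500, n_features=100, noise=10.0, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

for tol in (0.1, 0.01):
    model = ElasticNet(alpha=0.01, l1_ratio=0.8, max_iter=400,
                       selection='random', random_state=3, tol=tol)
    # Capture warnings so we can report whether this fit converged.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model.fit(X_train, y_train)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
    print(f"tol={tol}: n_iter_={model.n_iter_}, test RMSE={rmse:.3f}, "
          f"ConvergenceWarning={warned}")

With the loose tolerance the solver typically stops after a handful of passes and raises no warning; with the tight tolerance it runs up to the max_iter cap, may warn, and usually ends closer to the optimum, which is exactly the "error, good score" pattern in the question.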
