Odd ElasticNet convergence behavior

I am optimizing a model using ElasticNet, but I am getting some odd behavior. When I set the tolerance hyperparameter to a small value, I get

ConvergenceWarning: Objective did not converge

warnings. So I tried a larger tolerance value, and the convergence warning goes away, but now the test data consistently gives a higher root mean squared error value. This seems backwards to me: if the model does not converge, what can cause it to give a better RMSE score, or even give consistent scores? I am using this inside a GridSearchCV call:

No error, bad score:

GridSearchCV(cv=KFold(n_splits=4, random_state=None, shuffle=True),
             estimator=ElasticNet(),
             param_grid={'alpha': [0.01], 'fit_intercept': [True],
                         'l1_ratio': [0.75, 0.8, 0.85, 0.9],
                         'max_iter': range(350, 450, 50), 'normalize': [False],
                         'random_state': [3], 'selection': ['random'],
                         'tol': [0.1]},
             refit='nmse', return_train_score=True,
             scoring={'mae': 'neg_mean_absolute_error',
                      'nmse': 'neg_mean_squared_error',
                      'nmsle': 'neg_mean_squared_log_error', 'r2': 'r2'})

Running GridSearchCV:
C:\miniconda3\envs\tf\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:529: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1253.5072980364057, tolerance: 848.4484859258542
  positive)

Error, good score:

GridSearchCV(cv=KFold(n_splits=4, random_state=None, shuffle=True),
             estimator=ElasticNet(),
             param_grid={'alpha': [0.01], 'fit_intercept': [True],
                         'l1_ratio': [0.75, 0.8, 0.85, 0.9],
                         'max_iter': range(350, 450, 50), 'normalize': [False],
                         'random_state': [3], 'selection': ['random'],
                         'tol': [0.01]},
             refit='nmse', return_train_score=True,
             scoring={'mae': 'neg_mean_absolute_error',
                      'nmse': 'neg_mean_squared_error',
                      'nmsle': 'neg_mean_squared_log_error', 'r2': 'r2'})

Topic: elastic-net convergence cross-validation predictive-modeling

Category: Data Science


Increasing the tolerance will result in a higher root mean squared error most of the time. A larger tol tells the coordinate descent solver it is okay to stop earlier, accepting a less optimized solution rather than continuing the search with smaller updates. The ConvergenceWarning with the smaller tol means the opposite happened: the solver used up its entire iteration budget (max_iter here is only 350 or 400) before the duality gap fell below the threshold. Even though that fit did not formally converge, it is further along the optimization path than the fit accepted by the loose tolerance, so it scores better on the test data; and because random_state is fixed and the iteration budget is the same on every run, its scores are also consistent. If you want the warning to go away without sacrificing accuracy, increase max_iter (or standardize the features, which helps coordinate descent converge) rather than loosening tol.
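Here is a minimal sketch of the effect. It is not your setup: the synthetic dataset, the train/test split, and the fixed alpha, l1_ratio, and max_iter values are all assumptions chosen just to make the contrast visible. It fits the same ElasticNet twice, once with the loose tolerance and once with the tight one, and prints the iteration count, test RMSE, and whether a ConvergenceWarning was raised.

import warnings

import numpy as np
from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression problem, illustrative only.
X, y = make_regression(n_samples=500, n_features=100, noise=10.0, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

for tol in (0.1, 0.01):
    model = ElasticNet(alpha=0.01, l1_ratio=0.8, max_iter=400,
                       selection='random', random_state=3, tol=tol)
    # Capture warnings so we can report whether this fit converged.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model.fit(X_train, y_train)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
    print(f"tol={tol}: n_iter_={model.n_iter_}, test RMSE={rmse:.3f}, "
          f"ConvergenceWarning={warned}")

With the loose tolerance the solver typically stops after a handful of passes and raises no warning; with the tight tolerance it runs up to the max_iter cap, may warn, and usually ends closer to the optimum, which is exactly the "error, good score" pattern in the question.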
