Elegant way to plot the L2 regularization path of logistic regression in python?

I'm trying to plot the L2 regularization path of logistic regression with the following code (an example of a regularization path can be found on page 65 of the ML textbook Elements of Statistical Learning, https://web.stanford.edu/~hastie/Papers/ESLII.pdf). I have a feeling that I'm doing it the dumb way - I think there is a simpler and more elegant way to code it. Suggestions would be much appreciated, thanks.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

counter = 0
for c in np.arange(-10, 2, dtype=float):  # np.float was removed from NumPy; plain float works
    lr = LogisticRegression(C=10**c,
                            fit_intercept=True,
                            solver='liblinear',
                            penalty='l2',
                            tol=0.0001,
                            verbose=0,  # verbose must be non-negative; n_jobs has no effect with liblinear
                            random_state=0
                           )
    model = lr.fit(X_train_z, y_train)

    coeff_list = model.coef_.ravel()

    if counter == 0:
        coeff_table = pd.DataFrame(pd.Series(coeff_list, index=X_train.columns), columns=[10**c])
    else:
        temp_table = pd.DataFrame(pd.Series(coeff_list, index=X_train.columns), columns=[10**c])
        coeff_table = coeff_table.join(temp_table, how='left')
    counter += 1

plt.rcParams['figure.figsize'] = (20, 10)
coeff_table.transpose().iloc[:,:10].plot()
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.legend(loc='right')
plt.xscale('log')
plt.show()

Topic lasso matplotlib regularization python

Category Data Science


sklearn already has this functionality for regression problems, in enet_path and lasso_path. There's an example notebook in the scikit-learn documentation.
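For illustration, here is a minimal sketch of what lasso_path gives you for the regression analogue of this problem (it covers linear models, not logistic regression; X_train_z, y_train, and the feature names X_train.columns are assumed from the question):

from sklearn.linear_model import lasso_path
import matplotlib.pyplot as plt

# One call computes the coefficients along a whole grid of alphas.
alphas, coefs, _ = lasso_path(X_train_z, y_train, n_alphas=100)

# coefs has shape (n_features, n_alphas): one coefficient profile per feature.
for name, profile in zip(X_train.columns, coefs):
    plt.plot(alphas, profile, label=name)

plt.xscale('log')
plt.xlabel('alpha')
plt.ylabel('weight coefficient')
plt.legend(loc='right')
plt.show()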

Those functions are implemented partly in Cython, so they are probably substantially faster than your version. One other improvement you can include in your implementation without adding Cython is to use "warm starts": nearby values of C should have similar coefficients, so each fit can start from the previous solution instead of from scratch. So try

# This needs to be instantiated outside the loop so we don't start from scratch each time.
# Note: warm_start has no effect with the liblinear solver, so switch to one that
# supports it, e.g. lbfgs (which also handles the l2 penalty).
lr = LogisticRegression(C=1,  # we'll override this in the loop
                        warm_start=True,
                        fit_intercept=True,
                        solver='lbfgs',
                        penalty='l2',
                        tol=0.0001,
                        random_state=0
                       )
for c in np.arange(-10, 2, dtype=float):  # np.float was removed from NumPy; plain float works
    lr.set_params(C=10**c)
    model = lr.fit(X_train_z, y_train)
    ...
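Putting the pieces together, here is a sketch of a tidier version of your whole loop: collect one row of coefficients per C and build the table with a single DataFrame constructor instead of joining column by column (X_train_z, y_train, and X_train.columns are assumed from your question):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

Cs = 10.0 ** np.arange(-10, 2)

# warm_start=True lets each fit start from the previous coefficients.
lr = LogisticRegression(warm_start=True, solver='lbfgs', penalty='l2',
                        tol=0.0001, random_state=0)

# One row of coefficients per value of C, assembled in a single constructor.
coeff_table = pd.DataFrame(
    [lr.set_params(C=C).fit(X_train_z, y_train).coef_.ravel() for C in Cs],
    index=Cs,
    columns=X_train.columns,
)

coeff_table.iloc[:, :10].plot(figsize=(20, 10))
plt.xscale('log')
plt.xlabel('C')
plt.ylabel('weight coefficient')
plt.legend(loc='right')
plt.show()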
