Elegant way to plot the L2 regularization path of logistic regression in python?

I'm trying to plot the L2 regularization path of logistic regression with the following code (an example of a regularization path can be found on page 65 of the ML textbook Elements of Statistical Learning, https://web.stanford.edu/~hastie/Papers/ESLII.pdf). I have a feeling that I'm doing it the dumb way - I think there is a simpler and more elegant way to code it. Suggestions would be much appreciated, thanks.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

counter = 0
for c in np.arange(-10, 2, dtype=float):  # np.float was removed from NumPy; plain float works
    lr = LogisticRegression(C=10**c,
                            fit_intercept=True,
                            solver='liblinear',
                            penalty='l2',
                            tol=0.0001,
                            verbose=0,  # verbose must be non-negative; n_jobs has no effect with liblinear
                            random_state=0
                           )
    model = lr.fit(X_train_z, y_train)

    coeff_list = model.coef_.ravel()

    if counter == 0:
        coeff_table = pd.DataFrame(pd.Series(coeff_list, index=X_train.columns), columns=[10**c])
    else:
        temp_table = pd.DataFrame(pd.Series(coeff_list, index=X_train.columns), columns=[10**c])
        coeff_table = coeff_table.join(temp_table, how='left')
    counter += 1

plt.rcParams['figure.figsize'] = (20, 10)
coeff_table.transpose().iloc[:,:10].plot()
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.legend(loc='right')
plt.xscale('log')
plt.show()

Topic lasso matplotlib regularization python

Category Data Science


sklearn already has this functionality for regression problems, in enet_path and lasso_path. There's an example notebook in the scikit-learn documentation.
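For illustration, here is a minimal sketch of what lasso_path gives you for the regression analogue of this problem (it covers linear models, not logistic regression; X_train_z, y_train, and the feature names X_train.columns are assumed from the question):

from sklearn.linear_model import lasso_path
import matplotlib.pyplot as plt

# One call computes the coefficients along a whole grid of alphas.
alphas, coefs, _ = lasso_path(X_train_z, y_train, n_alphas=100)

# coefs has shape (n_features, n_alphas): one coefficient profile per feature.
for name, profile in zip(X_train.columns, coefs):
    plt.plot(alphas, profile, label=name)

plt.xscale('log')
plt.xlabel('alpha')
plt.ylabel('weight coefficient')
plt.legend(loc='right')
plt.show()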

Those functions are implemented partly in Cython, so they are probably substantially faster than your version. One other improvement you can include in your implementation without adding Cython is to use "warm starts": nearby values of C should have similar coefficients, so each fit can start from the previous solution instead of from scratch. So try

# This needs to be instantiated outside the loop so we don't start from scratch each time.
# Note: warm_start has no effect with the liblinear solver, so switch to one that
# supports it, e.g. lbfgs (which also handles the l2 penalty).
lr = LogisticRegression(C=1,  # we'll override this in the loop
                        warm_start=True,
                        fit_intercept=True,
                        solver='lbfgs',
                        penalty='l2',
                        tol=0.0001,
                        random_state=0
                       )
for c in np.arange(-10, 2, dtype=float):  # np.float was removed from NumPy; plain float works
    lr.set_params(C=10**c)
    model = lr.fit(X_train_z, y_train)
    ...
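Putting the pieces together, here is a sketch of a tidier version of your whole loop: collect one row of coefficients per C and build the table with a single DataFrame constructor instead of joining column by column (X_train_z, y_train, and X_train.columns are assumed from your question):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

Cs = 10.0 ** np.arange(-10, 2)

# warm_start=True lets each fit start from the previous coefficients.
lr = LogisticRegression(warm_start=True, solver='lbfgs', penalty='l2',
                        tol=0.0001, random_state=0)

# One row of coefficients per value of C, assembled in a single constructor.
coeff_table = pd.DataFrame(
    [lr.set_params(C=C).fit(X_train_z, y_train).coef_.ravel() for C in Cs],
    index=Cs,
    columns=X_train.columns,
)

coeff_table.iloc[:, :10].plot(figsize=(20, 10))
plt.xscale('log')
plt.xlabel('C')
plt.ylabel('weight coefficient')
plt.legend(loc='right')
plt.show()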
