Can elastic net l1 ratio be greater than 1?

I trained ElasticNetCV (sklearn) on multiple datasets and noticed that many of them selected l1_ratio = 1 as the best value, which is the maximum value tried by the CV.
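
For context, the CV setup was roughly the following (the data and the l1_ratio grid here are illustrative, not my exact ones):

    import numpy as np
    from sklearn.linear_model import ElasticNetCV

    # Illustrative data: a noisy linear target, standing in for my real datasets.
    X = np.random.rand(200, 5)
    y = np.random.rand(200) + X.sum(axis=1) * 5

    # The grid is capped at 1, matching the documented range of l1_ratio.
    cv = ElasticNetCV(l1_ratio=[0.1, 0.3, 0.5, 0.7, 0.9, 0.95, 1.0], cv=5)
    cv.fit(X, y)
    print(cv.l1_ratio_)  # on many of my datasets this comes out as 1.0, the grid maximum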

So, as a test, I wondered whether values greater than 1 would produce a better result, and surprisingly the answer is yes. In fact, you can reproduce this phenomenon with the following code:

    import numpy as np
    from sklearn.linear_model import ElasticNet
    from sklearn.model_selection import train_test_split

    n = 200
    features = np.random.rand(n, 5)
    # target is a noisy linear function of the features
    target = np.random.rand(n) + features.sum(axis=1) * 5

    train_feat, test_feat, train_target, test_target = train_test_split(features, target)

    # l1_ratio = 1 (pure lasso), the upper end of the documented range
    cls = ElasticNet(random_state=42, l1_ratio=1, alpha=0.1)
    cls.fit(train_feat, train_target)
    print(cls.score(test_feat, test_target), cls.score(train_feat, train_target))

    # l1_ratio = 1.1, outside the documented [0, 1] range
    cls = ElasticNet(random_state=42, l1_ratio=1.1, alpha=0.1)
    cls.fit(train_feat, train_target)
    print(cls.score(test_feat, test_target), cls.score(train_feat, train_target))

You will find that the l1_ratio=1.1 regressor scores better on both the train and the test set.

According to the documentation, l1_ratio should lie in [0, 1], so you shouldn't use l1_ratio > 1, but it does technically work. However, it doesn't make much sense, because it means the L2 part of the loss function becomes negative: larger L2 norms of the coefficients are no longer penalized but actually rewarded (!).
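
For reference, the objective that sklearn's ElasticNet minimizes is (per the scikit-learn documentation):

$$\min_w \; \frac{1}{2 n_{\text{samples}}} \lVert y - Xw \rVert_2^2 \;+\; \alpha \cdot \text{l1\_ratio} \cdot \lVert w \rVert_1 \;+\; \frac{\alpha}{2} \left(1 - \text{l1\_ratio}\right) \lVert w \rVert_2^2$$

With l1_ratio = 1.1 and alpha = 0.1 as above, the coefficient of the L2 term is $\frac{0.1}{2}(1 - 1.1) = -0.005$, so the squared L2 norm of the coefficients is subtracted from, rather than added to, the objective.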

Is there any theoretical logic behind this? Is there any reason not to expand the l1_ratio search range to $[0,2]$ instead of $[0,1]$?
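
For concreteness, here is a minimal check of the penalty term in isolation (the data-fit term is omitted, and `w` is just an arbitrary coefficient vector), showing how the L2 contribution flips sign once l1_ratio passes 1:

    import numpy as np

    def enet_penalty(w, alpha, l1_ratio):
        # Penalty part of sklearn's ElasticNet objective (data-fit term omitted).
        l1 = alpha * l1_ratio * np.abs(w).sum()
        l2 = 0.5 * alpha * (1 - l1_ratio) * (w ** 2).sum()
        return l1 + l2

    w = np.array([1.0, -2.0, 3.0])
    print(enet_penalty(w, alpha=0.1, l1_ratio=1.0))      # 0.60: pure L1 penalty
    print(enet_penalty(w, alpha=0.1, l1_ratio=1.1))      # 0.59: the L2 part contributes -0.07
    print(enet_penalty(2 * w, alpha=0.1, l1_ratio=1.1))  # 1.04: the negative L2 part quadrupled to -0.28

Since the negative L2 term grows quadratically in the coefficients while the L1 term grows only linearly, large enough coefficients end up being rewarded rather than penalized.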

Topic elastic-net regularization linear-regression scikit-learn
