Why is the L2 penalty squared but the L1 penalty isn't in elastic-net regression?

I was working with a data set on which I wanted to solve a non-negative least squares (NNLS) problem, and I also wanted a sparse model. After a bit of experimenting I found that the following loss function worked best for me:

$$\min_{x \geq 0} ||Ax-b||_2^2 + \lambda_1||x||_2^2+\lambda_2||x||_1^2$$

Here the squared L2 penalty was implemented by adding white noise with standard deviation $\sqrt{\lambda_1}$ to the entries of $A$ (which can be shown to be equivalent to ridge regression in expectation), and the squared L1 penalty was implemented by appending a row of constant value $\sqrt{\lambda_2}$ to $A$ and a 0 to the end of $b$. Since $x \geq 0$ in NNLS, that extra row contributes $\left(\sqrt{\lambda_2}\sum_i x_i\right)^2 = \lambda_2||x||_1^2$ to the squared residual, so it is equivalent to the squared L1 penalty.
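
To make the augmentation concrete, here is a minimal sketch of how the whole objective can be handed to a plain NNLS solver via `scipy.optimize.nnls`. Instead of the noise-injection trick it uses the standard deterministic way to add a ridge term (appending $\sqrt{\lambda_1} I$ rows and zeros), which contributes exactly $\lambda_1||x||_2^2$ to the squared residual; the function name `nnls_squared_elastic_net` and the toy data are only for illustration.

```python
import numpy as np
from scipy.optimize import nnls

def nnls_squared_elastic_net(A, b, lam1, lam2):
    """Solve min_{x>=0} ||Ax-b||_2^2 + lam1*||x||_2^2 + lam2*||x||_1^2
    by augmenting the design matrix and calling a plain NNLS solver.
    (Illustrative helper; lam1/lam2 correspond to lambda_1/lambda_2.)
    """
    m, n = A.shape
    # Ridge part: appending sqrt(lam1)*I with zero targets adds
    # lam1*||x||_2^2 to the squared residual.
    ridge_block = np.sqrt(lam1) * np.eye(n)
    # Squared-L1 part: one extra row of constant sqrt(lam2) adds
    # (sqrt(lam2)*sum(x))^2 = lam2*||x||_1^2, since x >= 0 makes
    # sum(x) equal to ||x||_1.
    l1_block = np.sqrt(lam2) * np.ones((1, n))
    A_aug = np.vstack([A, ridge_block, l1_block])
    b_aug = np.concatenate([b, np.zeros(n + 1)])
    x, _ = nnls(A_aug, b_aug)
    return x

# Toy usage with a sparse non-negative ground truth
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:3] = [1.0, 2.0, 0.5]
b = A @ x_true + 0.05 * rng.standard_normal(50)
print(nnls_squared_elastic_net(A, b, lam1=0.1, lam2=0.5))
```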

That worked well for my purposes, but I know that sparse regression models such as the lasso and elastic net usually do not square the L1 penalty, which made me wonder whether there is a problem I'm missing with using a squared L1 penalty. Is there a specific reason the L1 penalty is not squared, while the L2 penalty is, in elastic-net regression?

Topic: elastic-net, sparsity, lasso, regression
