Confusion with L2 Regularization in Back-propagation

In very simple language, this is L2 regularization:

$\hspace{3cm} Loss_R = Loss_N + \lambda \sum_i w_i^2$
$Loss_N$ - loss without regularization
$Loss_R$ - loss with regularization
$\lambda$ - regularization constant
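
For concreteness, here is a minimal sketch of the two quantities, assuming a mean-squared-error base loss on a linear model (the function and variable names such as `base_loss` and `l2_lambda` are my own, just for illustration):

```python
import numpy as np

def base_loss(w, X, y):
    """Loss_N: plain mean-squared-error loss, no regularization."""
    pred = X @ w
    return np.mean((pred - y) ** 2)

def regularized_loss(w, X, y, l2_lambda=0.01):
    """Loss_R = Loss_N + lambda * sum(w_i^2)."""
    return base_loss(w, X, y) + l2_lambda * np.sum(w ** 2)
```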

When implementing [Ref], we simply add the derivative of the new penalty term to the current weight delta:
$\hspace{3cm} dw = dw_N + 2\lambda w$
$dw_N$ - weight delta without regularization
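
Continuing the same sketch, one way the corresponding update step could look (the `2 * l2_lambda * w` term is just the derivative of the penalty; in practice the factor 2 is often folded into the constant):

```python
def gradient_step(w, X, y, lr=0.1, l2_lambda=0.01):
    """One gradient-descent step with L2 regularization.

    dw_N is the gradient of the plain MSE loss; the penalty only adds
    the extra 2 * l2_lambda * w term on top of it.
    """
    pred = X @ w
    dw_N = 2 * X.T @ (pred - y) / len(y)   # d(Loss_N)/dw
    dw = dw_N + 2 * l2_lambda * w          # d(Loss_R)/dw
    return w - lr * dw

# Hypothetical usage:
# X, y = np.random.randn(100, 3), np.random.randn(100)
# w = np.zeros(3)
# for _ in range(200):
#     w = gradient_step(w, X, y)
```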


What I think: L2 regularization is achieved by this last step alone, i.e. the weight itself is penalized.


My question is:
Why do we then also add the penalty to the total loss, as in the first equation? Won't that put an additional penalty on each weight during back-propagation (on the $dw_N$ component) because of the increased loss? I could understand it if the total loss were only used for console printing, but I believe that is not the case.
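
To spell out the derivative I am referring to: differentiating the first equation with respect to a single weight gives

$\hspace{3cm} \frac{\partial Loss_R}{\partial w} = \frac{\partial Loss_N}{\partial w} + 2\lambda w = dw_N + 2\lambda w$

and this is the decomposition I am trying to reconcile with the idea that the increased scalar loss also changes the $dw_N$ part.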

I know I am missing something very simple.

Topic: mathematics, regularization, backpropagation
