If two functions are close, can I prove that the difference of their empirical losses is also small?
I am trying to understand the proof of Theorem 3 in the paper *A Universal Law of Robustness via Isoperimetry* by Bubeck and Sellke.
Basically, there exists at least one $w_{L,\epsilon}$ in $\mathcal{W}_{L,\epsilon}$ for which there is another $w_{L}$ in $\mathcal{W}_{L}$ at most $\frac{\epsilon}{6J}$ apart.
Using Assumption 1,
$$\boxed{ \left\|f_{\boldsymbol{w}_{1}}-f_{\boldsymbol{w}_{2}}\right\|_{\infty} \leq J\left\|\boldsymbol{w}_{1}-\boldsymbol{w}_{2}\right\| },$$
this makes it clear that
$$\left\|f_{w_{L,\epsilon}} - f_{w_{L}} \right\|_{\infty} \leq J \cdot \frac{\epsilon}{6J} = \frac{\epsilon}{6}. \tag{a}$$
Equation (a) says that these two functions are uniformly close.
How can I prove that the empirical losses of $f_{w_{L,\epsilon}}$ and $f_{w_{L}}$ are also close? Here the labels are the $y_i$.
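For concreteness, the empirical loss I have in mind is the mean squared error over the $n$ samples (this is my assumption about the paper's setup):
$$EL(f) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_{i} - f(x_{i})\bigr)^{2}.$$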
My thoughts: let
$$p_{1} = y_{i} - f_{w_{L,\epsilon}}(x_{i}) \qquad \text{and} \qquad p_{2} = y_{i} - f_{w_{L}}(x_{i}).$$
I have to prove that $p_{2}^{2} \leq p_{1}^{2} + \text{(some constant)}$; then I can conclude from there that $EL(f_{w_{L}}) \leq EL(f_{w_{L,\epsilon}}) + \text{(some constant)}$.
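(Writing $p_{1,i}, p_{2,i}$ for the residuals at sample $i$: if the pointwise bound holds for every $i$ with the same constant $C$, then averaging over the samples under the mean-squared-error definition above should give
$$\frac{1}{n}\sum_{i=1}^{n} p_{2,i}^{2} \;\leq\; \frac{1}{n}\sum_{i=1}^{n} p_{1,i}^{2} + C,$$
which is the statement I want.)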
If this holds, I can say that $EL(f_{w_{L,\epsilon}})$ being small implies that $EL(f_{w_{L}})$ can't be too big.
But how can I prove
$$\boxed{p_{2}^{2} \leq p_{1}^{2} + C}\,?$$
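What I have tried so far (my own attempt, so it may be missing conditions from the paper): expanding the difference of squares,
$$p_{2}^{2} - p_{1}^{2} = (p_{2} - p_{1})(p_{2} + p_{1}), \qquad |p_{2} - p_{1}| = \bigl|f_{w_{L,\epsilon}}(x_{i}) - f_{w_{L}}(x_{i})\bigr| \leq \frac{\epsilon}{6} \ \text{by (a)},$$
so $p_{2}^{2} \leq p_{1}^{2} + \frac{\epsilon}{6}\,|p_{1} + p_{2}|$. This only gives a uniform constant $C$ if $|p_{1} + p_{2}|$ stays bounded, e.g. if the labels $y_{i}$ and the function values are bounded, which I believe the paper assumes but I am not sure about.

As a quick numerical sanity check of that per-sample bound (a toy setup I made up, not the paper's), with two prediction vectors that differ pointwise by at most $\epsilon/6$:

```python
import numpy as np

# Toy check (my own construction, not the paper's setup):
# f1, f2 play the roles of f_{w_{L,eps}} and f_{w_L}, forced to
# differ by at most eps/6 at every sample point.
rng = np.random.default_rng(0)
n, eps = 10_000, 0.3
y  = rng.uniform(-1.0, 1.0, size=n)               # labels (assumed bounded)
f1 = rng.uniform(-1.0, 1.0, size=n)               # f_{w_{L,eps}}(x_i)
f2 = f1 + rng.uniform(-eps / 6, eps / 6, size=n)  # f_{w_L}(x_i), within eps/6 of f1

p1, p2 = y - f1, y - f2
lhs = p2**2 - p1**2                     # per-sample difference of squared errors
rhs = (eps / 6) * np.abs(p1 + p2)       # per-sample bound implied by (a)
print(np.all(lhs <= rhs + 1e-12))       # True: the bound holds sample-wise
print(np.mean(p2**2) - np.mean(p1**2))  # the empirical-loss gap is correspondingly small
```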
Topic loss-function deep-learning neural-network
Category Data Science