When is the sum of models the model of the sum?
The response variable in a regression problem, $Y$, is modeled using a data matrix $X$.
In notation, this means:
$$Y \sim X$$
However, $Y$ can be separated out into different components that can be modeled independently.
$$Y = Y_1 + Y_2 + Y_3$$
Under what conditions would $M$, a single model trained on the overall $Y$, perform better or worse than $M_1 + M_2 + M_3$, the sum of models trained on the individual components?
To provide more background, the model used is a GBM. I was surprised to find that training a separate model for a specific $Y_i$ gave roughly the same performance as using the overall model $M$ to predict that $Y_i$. The $Y_i$'s are highly correlated, so in hindsight this is not surprising: a model trained on a target that is highly correlated with $Y_i$ will produce predictions that are also correlated with $Y_i$.
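For concreteness, here is a minimal sketch of that comparison on synthetic data. It assumes scikit-learn's `GradientBoostingRegressor` as the GBM and artificially constructed correlated components; the data-generating setup is purely illustrative, not the actual problem.

```python
# Sketch only: synthetic correlated components Y_1, Y_2, Y_3 and a generic GBM.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p = 2000, 10
X = rng.normal(size=(n, p))

# Three correlated components: each is a shared signal plus its own noise.
shared = X @ rng.normal(size=p)
Y_parts = [shared + 0.3 * rng.normal(size=n) for _ in range(3)]
Y = sum(Y_parts)

# Use the same split for every target so the comparison is apples to apples.
idx_train, idx_test = train_test_split(np.arange(n), test_size=0.3, random_state=0)

# M: one model trained on the overall Y.
M = GradientBoostingRegressor(random_state=0).fit(X[idx_train], Y[idx_train])
pred_overall = M.predict(X[idx_test])

# M_1 + M_2 + M_3: one model per component, predictions summed.
pred_sum = np.zeros(len(idx_test))
for Y_i in Y_parts:
    M_i = GradientBoostingRegressor(random_state=0).fit(X[idx_train], Y_i[idx_train])
    pred_sum += M_i.predict(X[idx_test])

print("MSE, overall model M:        ", mean_squared_error(Y[idx_test], pred_overall))
print("MSE, sum of component models:", mean_squared_error(Y[idx_test], pred_sum))
```

Comparing the two printed errors is the question above in miniature.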
As an analogy, take the case of a linear model with independent response components.
The overall model is
$$Y = X\beta$$
It is trivial to see that the sum of the models is the model of the sum:
$$Y = Y_1 + Y_2 + Y_3 = X\beta_1 + X\beta_2 + X\beta_3 = X(\beta_1 + \beta_2 + \beta_3) = X\beta,$$
where $\beta = \beta_1 + \beta_2 + \beta_3$.
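This identity is exact for least squares because the OLS estimator is linear in the response: $\hat\beta = (X^\top X)^{-1} X^\top Y$, so fitting the summed response gives the sum of the component fits. A quick numerical check on made-up data (names and setup are illustrative only):

```python
# Check that OLS fits add: beta_hat(Y1 + Y2 + Y3) == beta_hat(Y1) + beta_hat(Y2) + beta_hat(Y3).
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 4
X = rng.normal(size=(n, p))
Y_parts = [X @ rng.normal(size=p) + rng.normal(size=n) for _ in range(3)]
Y = sum(Y_parts)

def ols(X, y):
    # Least-squares coefficients, i.e. (X^T X)^{-1} X^T y, via a stable solver.
    return np.linalg.lstsq(X, y, rcond=None)[0]

beta_overall = ols(X, Y)
beta_summed = sum(ols(X, Y_i) for Y_i in Y_parts)

print(np.allclose(beta_overall, beta_summed))  # True: the sum of the models is the model of the sum
```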
If the $Y_i$'s are independent then the $\beta_i$'s will be as well, which implies that each component's coefficients are unchanged whether the components are modeled separately or together. Take for example a two-dimensional case (where $X$ has two columns).
For $i \neq j$, $Y_i = X(\beta_i + 0) = X\beta_i$: the coefficients belonging to the other component $j$ contribute nothing to $Y_i$.
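To make the two-column case concrete (my own illustrative setup, not from the original problem), suppose each component loads on a different column of $X$:

$$X = \begin{bmatrix} x_{(1)} & x_{(2)} \end{bmatrix}, \qquad \beta_1 = \begin{pmatrix} b_1 \\ 0 \end{pmatrix}, \qquad \beta_2 = \begin{pmatrix} 0 \\ b_2 \end{pmatrix},$$

so that $Y_1 = x_{(1)} b_1$ and $Y_2 = x_{(2)} b_2$. The overall fit gives $\beta = \beta_1 + \beta_2 = (b_1, b_2)^\top$, while regressing $Y_1$ alone gives $(b_1, 0)^\top$: the coefficient on the other column is zero, so each component's coefficients are the same whether it is modeled separately or as part of the sum.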
Tags: mathematics, supervised-learning, regression, statistics
Category: Data Science