Ensembling expressions
I have two models, $m_1$ and $m_2$, and I want to ensemble them into a final model. I want to be able to weight one more heavily than the other according to a grid search. Two main ideas come to mind (a grid-search sketch follows the list):
- Define a family of models $m_1 \cdot a + m_2 \cdot (1 - a)$, where $0 \le a \le 1$, and find the $a$ that gives the best score.
- Define a family of models $m_1^a \cdot m_2^{1 - a}$, where $0 \le a \le 1$, and find the $a$ that gives the best score.
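For concreteness, here is a minimal sketch of the grid search I have in mind for these two families. The data and variable names (`m1_pred`, `m2_pred`, `y_val`) are made up for illustration; I assume the models output probabilities on a validation set and that log loss is the score to minimize:

```python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical validation labels and per-model predicted probabilities.
rng = np.random.default_rng(0)
y_val = rng.integers(0, 2, size=1000)
m1_pred = np.clip(y_val * 0.7 + rng.normal(0.15, 0.1, 1000), 1e-6, 1 - 1e-6)
m2_pred = np.clip(y_val * 0.6 + rng.normal(0.20, 0.1, 1000), 1e-6, 1 - 1e-6)

def arithmetic_blend(a):
    # Weighted arithmetic mean: stays a valid probability.
    return a * m1_pred + (1 - a) * m2_pred

def geometric_blend(a):
    # Weighted geometric mean: no longer normalized across classes,
    # so it may need rescaling if calibrated probabilities matter.
    return m1_pred ** a * m2_pred ** (1 - a)

for blend in (arithmetic_blend, geometric_blend):
    grid = np.linspace(0, 1, 101)
    scores = [log_loss(y_val, blend(a)) for a in grid]
    best = grid[int(np.argmin(scores))]
    print(f"{blend.__name__}: best a = {best:.2f}, log loss = {min(scores):.4f}")
```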
However, I've seen top solutions in Kaggle competitions do fairly different things, such as using a final model of the form $m_1^a + m_2^b$ (sketched below).
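The third form can be searched the same way over a two-dimensional grid. A sketch, reusing `y_val`, `m1_pred`, and `m2_pred` from above; note my assumption that, since $m_1^a + m_2^b$ can exceed 1, it is evaluated with a rank-based metric such as AUC rather than log loss:

```python
import numpy as np
from itertools import product
from sklearn.metrics import roc_auc_score

# Two-parameter power ensemble: the output is unnormalized,
# so score it with a ranking metric.
grid = np.linspace(0.1, 2.0, 20)
best_auc, best_ab = -np.inf, None
for a, b in product(grid, grid):
    blended = m1_pred ** a + m2_pred ** b
    auc = roc_auc_score(y_val, blended)
    if auc > best_auc:
        best_auc, best_ab = auc, (a, b)
print(f"best (a, b) = ({best_ab[0]:.2f}, {best_ab[1]:.2f}), AUC = {best_auc:.4f}")
```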
My question is: what are the advantages and disadvantages of each approach? When does each work better or worse? When is the third kind of ensemble suitable, and is there any heuristic for tuning $a$ and $b$?