Base model in ensemble learning

I've been doing some research on ensemble learning and read that base models with high variance are often recommended (I can't remember exactly which book I read this in).

But this seems counter-intuitive: wouldn't base models with low variance (ones that do well on the test set) be better than multiple weak base models?

Topic: bagging, ensemble-modeling, machine-learning

Category: Data Science


Intuitively speaking, ensembles benefit most from diversity.

Imagine being in a room of people making a decision together. If everyone more or less agrees, you don't benefit from having more people at the table. But if people tend to have different opinions, when they DO agree, it is a stronger message that the decision must be correct.

The same applies to ensembles. High-variance models are more likely to make different errors on different inputs, and as long as each model is right more often than wrong, averaging their diverse predictions cancels much of the individual noise: combining n roughly uncorrelated predictions reduces the variance of the average by about a factor of n while leaving the bias unchanged. That diversity also reduces the risk that all the models are wrong at the same time, which is why a single low-variance model can be outperformed by an ensemble of individually noisier ones.
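
Here is a minimal sketch of that effect using scikit-learn (the dataset and hyperparameters are arbitrary choices for illustration, not from the original discussion). A single unpruned decision tree is a classic high-variance, low-bias model; bagging 100 of them, each trained on a different bootstrap sample, typically scores noticeably better on held-out data than any one tree:

```python
# Compare one high-variance model against a bagged ensemble of the same model.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data; the exact shape doesn't matter here.
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single unpruned tree: low bias, but high variance (overfits its sample).
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Bagging 100 such trees (a decision tree is BaggingClassifier's default
# base estimator): each tree is individually noisy, but averaging their
# diverse predictions cancels much of that variance.
bagged = BaggingClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

print("single tree :", tree.score(X_test, y_test))
print("bagged trees:", bagged.score(X_test, y_test))
```

If you repeat the experiment with a deliberately low-variance base model (say, a depth-1 decision stump), the bagged ensemble improves far less, because all the stumps tend to agree and there is little diversity left to average away.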
