What if the votes for 2 classes are equal in an ensemble learning technique?

Suppose that, in an ensemble learning technique, the number of models that predict class 1 equals the number of models that predict class 0. Which class is then chosen as the output?

Topic ensemble-learning ensemble-modeling random-forest machine-learning

Category Data Science


I would assign the class at random, weighted by the sizes (or prior probabilities, if available) of the classes that are tied on votes.

That is, if $c_1$ to $c_r$ are the classes with equal votes and their class sizes are $s_1$ to $s_r$, respectively, then assign the observation to class $i$ with probability $$p_i = \frac{s_i} {\sum_{j = 1}^r s_j}.$$

I suspect most classifiers will provide an estimate of the $p_i$, so you may not need to calculate them as given above.
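As a minimal sketch of the weighted random assignment above (the function name `weighted_tie_break` and the `class_sizes` dictionary are my own illustrative choices, not part of any library), something like this would do it in plain Python:

```python
import random
from collections import Counter


def weighted_tie_break(votes, class_sizes, rng=None):
    """Return the majority class; on a tie, draw among the tied classes
    with probability proportional to their sizes (p_i = s_i / sum_j s_j).

    votes       -- list of predicted labels, one per model
    class_sizes -- hypothetical dict mapping label -> class size (or prior)
    rng         -- optional random.Random instance, for reproducibility
    """
    rng = rng or random.Random()
    counts = Counter(votes)
    top = max(counts.values())
    tied = [label for label, n in counts.items() if n == top]
    if len(tied) == 1:
        return tied[0]  # clear winner, no tie to break
    weights = [class_sizes[label] for label in tied]
    return rng.choices(tied, weights=weights, k=1)[0]


# Two models vote 0 and two vote 1; class 0 is twice as large as class 1,
# so it should win the draw about two times out of three.
rng = random.Random(0)
print(weighted_tie_break([0, 1, 0, 1], {0: 200, 1: 100}, rng))
```

Passing a seeded `random.Random` keeps the tie-break reproducible, which matters if you want predictions to be stable between runs.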


Depending on the implementation, this problem may never occur. Many implementations build an odd number of trees or models, which guarantees a strict majority for one class when there are only two classes.

However, some implementations allow an even number of trees or models. In that case, some algorithms simply throw an error reporting that a tie has occurred.

In R especially, some libraries resolve the tie by picking one of the tied classes at random.

The last and (I think) best solution is to choose the more frequent class. In other words, the algorithm breaks the tie in favor of the class that appears more often in the training data.
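A rough sketch of that last option, assuming you have the model votes and the training labels at hand (the function name `vote_with_frequency_tie_break` is hypothetical and not taken from any particular library):

```python
from collections import Counter


def vote_with_frequency_tie_break(votes, train_labels):
    """Majority vote; ties go to the class that is more frequent
    in the training data.

    votes        -- list of predicted labels, one per model
    train_labels -- labels of the training set, used only to break ties
    """
    vote_counts = Counter(votes)
    train_counts = Counter(train_labels)
    # Compare by vote count first, then by training frequency.
    return max(vote_counts, key=lambda c: (vote_counts[c], train_counts[c]))


train_labels = [0] * 7 + [1] * 3  # class 0 dominates the training data
print(vote_with_frequency_tie_break([0, 1, 1, 0], train_labels))  # tie, class 0 wins
print(vote_with_frequency_tie_break([1, 1, 0], train_labels))     # no tie, class 1 wins
```

If the tied classes are also equally frequent in the training data, this sketch falls back to whichever class appears first in `votes`, so you would still need a random or fixed rule for that case.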
