Understanding Weighted Learning in Ensemble Classifiers

I'm currently studying boosting techniques in machine learning, and I understand that in algorithms like AdaBoost, each training sample is given a weight depending on whether it was misclassified by the previous model in the boosting sequence.

Although I intuitively understand that by weighting examples we are letting the model pay more attention to examples that were previously misclassified, I do not understand how the weights are actually taken into account by a machine learning algorithm. Are the weights treated as a separate feature? I want to know how a simple machine learning model, let's say a decision tree, understands that a particular example carries more weight and therefore needs more attention / to be classified correctly.

Can someone please explain to me in simple words/equations how weighted training samples are handled by machine learning algorithms? How are they pushed to classify samples with higher weights correctly? Do existing ML algorithms need modifications to account for weights?

Topic: boosting, training, weighted-data, ensemble-modeling, machine-learning

Category: Data Science


There are two flavors of AdaBoost with respect to the weak learner: one passes the sample weights directly to the weak learner (which must support them), and the other does not require the weak learner to accept sample weights at all; it just resamples the data according to the weights and fits the weak learner on the resampled set.
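A minimal sketch of that resampling variant, assuming `X`, `y` are NumPy arrays and `w` is the current weight vector summing to 1 (the helper name and toy data are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def resample_by_weight(X, y, w):
    """Draw a bootstrap sample in which row i is picked with probability w[i],
    so high-weight (previously misclassified) samples appear more often."""
    idx = rng.choice(len(X), size=len(X), replace=True, p=w)
    return X[idx], y[idx]

# Toy usage: the third sample carries half of the total weight, so it shows up
# much more often in the set that the next (unweighted) weak learner is fit on.
X = np.arange(8).reshape(4, 2)
y = np.array([0, 0, 1, 1])
w = np.array([0.1, 0.2, 0.5, 0.2])
X_res, y_res = resample_by_weight(X, y, w)
```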

As for how trees make use of sample weights: the impurity calculation is done using the weights, and the leaf scores become weighted averages. See e.g. this SO post and this DS.SE post.
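To make the weighted impurity concrete, here is a small sketch (function name and numbers are mine) of a Gini calculation where each sample counts by its weight rather than by 1:

```python
import numpy as np

def weighted_gini(y, w):
    """Gini impurity in which each sample contributes its weight, not a count of 1."""
    total = w.sum()
    impurity = 1.0
    for cls in np.unique(y):
        p = w[y == cls].sum() / total   # weighted class proportion
        impurity -= p ** 2
    return impurity

# The two class-1 samples carry heavy weights, so the node looks much "purer"
# in class 1 than the raw counts would suggest, which steers the split choice.
y = np.array([0, 0, 0, 1, 1])
w = np.array([1.0, 1.0, 1.0, 5.0, 5.0])
print(weighted_gini(y, w))           # ~0.36
print(weighted_gini(y, np.ones(5)))  # unweighted: 0.48
```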

Finally, many other algorithms operate by minimizing a loss function over some hypothesis space; those can incorporate weights into their loss functions directly. (Examples include GLMs, SVMs, and NNs, though none of these are particularly weak learners, so they are probably not desirable for an adaptive boosting ensemble.)
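As an illustrative sketch (function and numbers are mine), a weighted loss simply multiplies each per-sample loss by its weight before averaging, so the optimizer gains more by fixing the high-weight samples:

```python
import numpy as np

def weighted_log_loss(y, p, w):
    """Log loss where each sample's loss is scaled by its weight, so a
    misclassified high-weight sample contributes a larger penalty/gradient."""
    eps = 1e-12
    per_sample = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return np.sum(w * per_sample) / np.sum(w)

y = np.array([1, 0, 1])
p = np.array([0.9, 0.2, 0.4])   # model's predicted P(y = 1)
w = np.array([1.0, 1.0, 5.0])   # the poorly predicted third sample is upweighted
print(weighted_log_loss(y, p, w))
```

In practice, many libraries expose this as a `sample_weight` argument to their fitting routines (scikit-learn does, for example), so no change to the algorithm itself is needed on the user's side.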


I think the links below give a better explanation and may help you understand the logic behind boosting algorithms.

For boosting algorithms:

https://youtu.be/LsK-xG1cLYA

To understand random sampling with replacement based on the weight of each sample:

https://stackoverflow.com/questions/3679694/a-weighted-version-of-random-choice
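For a quick feel of what that post describes, the Python standard library already supports weighted sampling with replacement (the weights below are made up):

```python
import random

samples = ["a", "b", "c", "d"]
weights = [0.1, 0.2, 0.5, 0.2]   # e.g. current boosting weights

# random.choices draws with replacement, picking each element with
# probability proportional to its weight.
resampled = random.choices(samples, weights=weights, k=10)
print(resampled)   # "c" will typically dominate the draw
```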


AdaBoost is an ensemble classifier, meaning that it combines several weak classifiers trained in sequence. The weak learners it uses are able to take the weights on the training set into account, and these weights inform the training of each weak classifier. For example, decision trees can be grown so that the split criterion pays more attention to the samples with higher weights.
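A sketch of how this plays out end to end, using depth-1 trees (stumps) from scikit-learn as weak learners (their `fit` accepts `sample_weight`) and the classic AdaBoost weight update; the data here is synthetic:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)           # labels in {-1, +1}

w = np.full(len(X), 1 / len(X))                      # start with uniform weights
stumps, alphas = [], []
for _ in range(5):
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=w)                 # weights steer the split choice
    pred = stump.predict(X)
    err = w[pred != y].sum()                         # weighted training error
    alpha = 0.5 * np.log((1 - err) / (err + 1e-12))  # this stump's vote in the ensemble
    w *= np.exp(-alpha * y * pred)                   # misclassified samples get heavier
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

# Final prediction: sign of the weighted vote of all stumps.
scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("training accuracy:", np.mean(np.sign(scores) == y))
```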
