How does bagging help reduce the variance?
I learned that bagging helps reduce variance by averaging but I couldn't understand this. Can someone explain this intuitively?
Topic bagging ensemble-modeling decision-trees
Category Data Science
High Variance - The model's predictions change a lot in response to small changes in the data
High Bias - The model doesn't vary much, but its predictions are quite far from the truth
Let's check a decision tree built on 5 values -
\begin{array} {|r|r|r|r|r|}
\hline
1 &5 &10 &15 &20\\
\hline
\end{array}
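In this tree, the prediction for a point is the midpoint of the two neighbouring training values, so
Value of 9.9 will be 7.5
Value of 10.1 will be 12.5.
A change of only 0.2 in the input moves the prediction by 5, showing a very high variance.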
Let's create 4 random trees of 3 elements each -
\begin{array} {|l|r|r|r|}
\hline
Tree-1 &5 &10 &15\\
\hline
Tree-2 &1 &15 &20\\
\hline
Tree-3 &1 &5 &20\\
\hline
Tree-4 &5 &15 &20\\
\hline
\end{array}
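Averaging the four trees (Tree-2 gives 8, the midpoint of 1 and 15, for both points):
Value of 9.9 = (7.5 + 8 + 12.5 + 10)/4 = 9.5
Value of 10.1 = (12.5 + 8 + 12.5 + 10)/4 = 10.75
The gap between the two predictions shrinks from 5 to 1.25, so the variance is reduced a lot.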
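The arithmetic in this toy example can be checked with a short script. This is only a sketch under one assumption: each "tree" predicts the midpoint of the two training values that bracket the query point, which is the rule that reproduces 7.5 and 12.5 for the full data set. Under that rule Tree-2 (1, 15, 20) returns 8, the midpoint of 1 and 15, for both queries. The function names `midpoint_predict` and `bagged_predict` are made up for illustration, and the four subsets are the hand-picked ones from the table, not real bootstrap samples.

```python
from bisect import bisect_right
from statistics import mean


def midpoint_predict(train_values, x):
    """Toy 'tree' (an assumption for this example): midpoint of the two
    training values that bracket x."""
    values = sorted(train_values)
    i = bisect_right(values, x)
    # Outside the training range, fall back to the nearest training value.
    if i == 0:
        return float(values[0])
    if i == len(values):
        return float(values[-1])
    return (values[i - 1] + values[i]) / 2


def bagged_predict(subsets, x):
    """Average the toy-tree predictions over several training subsets."""
    return mean(midpoint_predict(s, x) for s in subsets)


full = [1, 5, 10, 15, 20]
subsets = [[5, 10, 15], [1, 15, 20], [1, 5, 20], [5, 15, 20]]

# How far the prediction moves when the input moves from 9.9 to 10.1.
single_gap = midpoint_predict(full, 10.1) - midpoint_predict(full, 9.9)
bagged_gap = bagged_predict(subsets, 10.1) - bagged_predict(subsets, 9.9)

print(single_gap, bagged_gap)  # prints: 5.0 1.25
```

The single tree jumps by 5 across the split at 10, while the average of the four trees moves by only 1.25, because only Tree-1 has a split between the two query points.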
In bagging, we build hundreds of trees (other high-variance models can be used too), each on a bootstrap sample of the training data, and average their predictions, which results in a large variance reduction.
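The effect of averaging can also be written as a formula. As a sketch, assume the B trees give predictions at a point x that each have the same variance and the same pairwise correlation; the symbols B, \sigma^2 and \rho are notation introduced here, not part of the example above. Then the variance of the averaged prediction is

\begin{equation}
\operatorname{Var}\left(\frac{1}{B}\sum_{b=1}^{B}\hat{f}_b(x)\right) = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
\end{equation}

The second term shrinks as more trees are added, so the less correlated the trees are, the more the average helps. The first term does not shrink with B, which is why bagging reduces variance a lot but not all the way to zero.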