Proof of the GOSS algorithm in the LightGBM paper
In the LightGBM paper, the authors introduce a new sampling method, GOSS (Gradient-based One-Side Sampling),
to reduce the number of data instances needed to find the best split for a given feature at a tree node.
They give a bound on the approximation error incurred by sampling instead of using the entire data set (Theorem 3.2 in https://www.microsoft.com/en-us/research/wp-content/uploads/2017/11/lightgbm.pdf).
I am interested in the proof of this theorem, for which the paper refers to the supplementary materials.
Where can I find them?
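
For context, here is a minimal sketch of the sampling step as I understand it from the paper's description of GOSS: keep the fraction a of instances with the largest absolute gradients, draw a fraction b of all instances at random from the remainder, and scale the sampled small-gradient instances by (1 - a) / b. The function name goss_sample and its parameters are my own for illustration; this is not LightGBM's actual implementation.

```python
import random


def goss_sample(gradients, a=0.2, b=0.1, seed=None):
    """Return (indices, weights) for a GOSS-style subsample.

    Hypothetical helper for illustration only, not LightGBM's API.
    a: fraction of instances kept for having the largest |gradient|.
    b: fraction of instances drawn at random from the rest.
    """
    if not (0 < a < 1 and 0 < b <= 1 - a):
        raise ValueError("need 0 < a < 1 and 0 < b <= 1 - a")

    rng = random.Random(seed)
    n = len(gradients)
    top_n = int(a * n)
    rand_n = int(b * n)

    # Sort instance indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)

    # Keep every large-gradient instance.
    top_idx = order[:top_n]
    # Sample uniformly, without replacement, from the small-gradient rest.
    rand_idx = rng.sample(order[top_n:], rand_n)

    # Up-weight the sampled small-gradient instances by (1 - a) / b
    # to compensate for the instances that were dropped.
    factor = (1.0 - a) / b
    indices = top_idx + rand_idx
    weights = [1.0] * top_n + [factor] * rand_n
    return indices, weights
```

With 100 instances, a = 0.2 and b = 0.1, this keeps the 20 largest-gradient instances with weight 1 and 10 randomly chosen others with weight 8. Theorem 3.2 concerns the error of the variance gain estimated on such a weighted subsample compared with the gain computed on all instances.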