Can we set minimum samples per leaf in XGBoost (like in other GBM algos)?

I'm curious why XGBoost doesn't support a min_samples_leaf parameter like the classic GradientBoostingClassifier in sklearn. And if I do want to control the minimum number of samples in a single leaf, is there a workaround in XGBoost?



Solution 1:[1]

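XGBoost has min_child_weight, but outside of ordinary (squared-error) regression that is indeed different from a minimum sample count. The parameter is a lower bound on the sum of the hessians of the rows in a leaf: for squared error each row's hessian is 1, so with unit sample weights it coincides with a row count, while for other objectives it does not. I couldn't say why the additional parameter isn't included. Note though that in binary classification, the logloss hessian is p(1-p), which lies between 0 and 1/4 and is near zero for very confident predictions. So in effect, setting min_child_weight requires many currently-uncertain rows in each leaf, which may be close enough to (or better than!) setting a minimum number of rows. Because each row contributes at most 1/4, min_child_weight = w also guarantees at least 4w rows per leaf in that setting (again with unit sample weights).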
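To make the hessian argument concrete, here is a minimal sketch in plain Python (no XGBoost calls). The helper names `logloss_hessian`, `leaf_cover` and `min_rows_needed` are made up for illustration and are not part of any library. It assumes binary logloss and unit sample weights unless weights are passed in.

```python
import math


def logloss_hessian(p):
    """Second derivative of binary logloss w.r.t. the raw score: p(1-p)."""
    return p * (1.0 - p)


def leaf_cover(probs, sample_weights=None):
    """Sum of (weighted) hessians over the rows in a candidate leaf.

    This is the quantity that min_child_weight is compared against.
    """
    if sample_weights is None:
        sample_weights = [1.0] * len(probs)
    return sum(w * logloss_hessian(p) for p, w in zip(probs, sample_weights))


def min_rows_needed(min_child_weight, p):
    """Rows needed to reach min_child_weight if every row is predicted p."""
    return math.ceil(min_child_weight / logloss_hessian(p))


uncertain = [0.5] * 20    # each row contributes 0.25
confident = [0.99] * 20   # each row contributes about 0.0099

print(leaf_cover(uncertain))       # 5.0
print(leaf_cover(confident))       # well below 1
print(min_rows_needed(5, 0.5))     # 20
print(min_rows_needed(5, 0.99))    # 506
```

So with min_child_weight set to 5, a leaf of 20 maximally uncertain rows just passes, while a leaf of confidently predicted rows would need several hundred rows to pass the same threshold.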

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ben Reiniger