How does training of the sklearn Stacking metaclassifier work?

The docs say that the metaclassifier is trained through cross_val_predict. As I understand it, this means the data is split into folds, and each base estimator is trained on all the other folds and then predicts values for the held-out fold. This procedure is repeated for every fold. The metaclassifier is then trained on the base estimators' predictions for these folds. Is this correct? If so, doesn't it contradict

Note that estimators_ are fitted on the full X

in that the base estimators are trained on several folds, not on the full X?
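
To make the question concrete, here is a minimal sketch of the procedure as I read it from the docs. It is not sklearn's internal code: the dataset, the two base estimators, and the helper name `stacked_predict` are my own assumptions for illustration. The sketch suggests the two statements need not conflict: the fold-wise fits are only used to build the training set for the metaclassifier, and separate copies of the base estimators are then fitted on the full X for use at prediction time.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.tree import DecisionTreeClassifier

# Toy binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hypothetical choice of base estimators.
base_estimators = [
    LogisticRegression(max_iter=1000),
    DecisionTreeClassifier(max_depth=3, random_state=0),
]
cv = StratifiedKFold(n_splits=5)

# Step 1: out-of-fold predictions. For each fold, a clone of the estimator
# is trained on the other folds and predicts the held-out fold, so every
# sample gets a prediction from a model that never saw it.
meta_features = np.column_stack([
    cross_val_predict(clone(est), X, y, cv=cv, method="predict_proba")[:, 1]
    for est in base_estimators
])

# Step 2: the metaclassifier is trained on those out-of-fold predictions.
final_estimator = LogisticRegression().fit(meta_features, y)

# Step 3: separate copies of the base estimators are fitted on the full X.
# These are the ones used when predicting on new data.
fitted_base = [clone(est).fit(X, y) for est in base_estimators]


def stacked_predict(X_new):
    """Predict with the full-X base estimators, then the metaclassifier."""
    feats = np.column_stack(
        [est.predict_proba(X_new)[:, 1] for est in fitted_base]
    )
    return final_estimator.predict(feats)
```

Under this reading, the models fitted per fold in Step 1 are discarded after producing `meta_features`, and only the full-X fits from Step 3 correspond to `estimators_`.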



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
