Dividing a dataset into 2 stratified parts
I need to divide a dataset into 2 parts, stratified by the values of one categorical column. The sklearn.model_selection tools do not seem suitable, as they create 4 parts. Can I do it with pandas or something else?
Solution 1:[1]
import sklearn.model_selection
X_train, X_test = sklearn.model_selection.train_test_split(X, stratify=X['column_name'])
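To illustrate the idea, here is a minimal sketch on made-up data. The DataFrame `df` and the column name `group` are assumptions for the example, not part of the question. It relies on `train_test_split` returning two parts for every array it receives, so passing a single DataFrame yields exactly two parts. A pandas-only variant using `groupby(...).sample` (available in pandas 1.1 and later) is included as one possible alternative.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data; "group" stands in for the categorical column.
df = pd.DataFrame({
    "value": range(20),
    "group": ["a"] * 10 + ["b"] * 6 + ["c"] * 4,
})

# One input array -> exactly two output parts.
# stratify takes the label values themselves, not a column name.
part_1, part_2 = train_test_split(
    df, test_size=0.5, stratify=df["group"], random_state=0
)

# pandas-only alternative: sample half of each group,
# then keep the remaining rows as the second part.
half_1 = df.groupby("group").sample(frac=0.5, random_state=0)
half_2 = df.drop(half_1.index)

print(part_1["group"].value_counts())
print(part_2["group"].value_counts())
```

Four parts only appear when both `X` and `y` are passed, because each input is split in two. Adjust `test_size` or `frac` if the two parts should not be equal in size.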
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Wai Ha Lee |
