Standard Scaler Producing Unexpected Result for Just One Column

I'm doing some preprocessing on my training data before fitting it to a model. When I check the results, one column returns a standard deviation of 0 rather than 1 (all columns return a mean of 0, as expected). My code is below:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PowerTransformer, StandardScaler

y = ml_df['target']
x = ml_df[['Feature1', 'Feature2', 'Feature3', 'Feature4', 'Feature5', 'Feature6',
           'Feature7', 'Feature8', 'Feature9', 'Feature10', 'Feature11', 'Feature12']]

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.55, random_state=3)

# Yeo-Johnson power transform (also standardizes its output by default)
pt_hp = PowerTransformer()
x_train_gaussian = pt_hp.fit_transform(x_train)
x_test_gaussian = pt_hp.transform(x_test)

# Standard scaling on top of the already-standardized output
ss_hp = StandardScaler()
std_x_train = ss_hp.fit_transform(x_train_gaussian)
std_x_test = ss_hp.transform(x_test_gaussian)

After running the above, this line produces the following output:

print(std_x_train.std(axis = 0))

Out: [1. 1. 1. 1. 1. 1. 1. 1. 0. 1. 1. 1.]

This particular feature is not materially different from the others; it contains only positive values (no zeros that would affect PowerTransformer) and its variance is nowhere near zero. I realize the PowerTransformer also standardizes the data, so the final two lines are currently unnecessary, but both 'x_train_gaussian' and 'std_x_train' produce this output, so I don't think that's the issue here. Does anyone have any idea why this one particular column returns such a different standard deviation from the rest? Thanks in advance for any suggestions.
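
One way to narrow this down (a sketch, not from the original post) is to inspect the suspect column at each step. The column index 8 below is assumed from the position of the 0 in the printed output; pt_hp.lambdas_ is the per-column Yeo-Johnson exponent that PowerTransformer exposes after fitting:

import numpy as np

# Raw feature: should show a healthy spread
print(np.asarray(x_train)[:, 8].std())
# After PowerTransformer: this is where the 0 appears
print(x_train_gaussian[:, 8].std())
# Fitted lambda for that column, and whether the transformed column is constant
print(pt_hp.lambdas_[8])
print(np.unique(x_train_gaussian[:, 8])[:5])

If the standard deviation already collapses in x_train_gaussian, the problem lies in the power transform itself rather than in StandardScaler.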



Solution 1:[1]

The issue was apparently caused by a long-standing scikit-learn bug: https://github.com/scikit-learn/scikit-learn/issues/14959 (the power transform collapses the affected column to a constant, which is why its standard deviation comes out as 0).
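
A pragmatic way to keep the pipeline moving is to leave the affected column out of the power transform and only standardize it. This is a minimal sketch, not a fix from the linked issue; 'Feature9' is a hypothetical stand-in for whichever column is collapsing (it sits in the ninth slot, where the 0 appeared):

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import PowerTransformer, StandardScaler

power_cols = ['Feature1', 'Feature2', 'Feature3', 'Feature4', 'Feature5', 'Feature6',
              'Feature7', 'Feature8', 'Feature10', 'Feature11', 'Feature12']
plain_cols = ['Feature9']  # hypothetical: the column whose variance collapses

preprocess = ColumnTransformer([
    ('power', PowerTransformer(), power_cols),   # Yeo-Johnson + standardization
    ('scale', StandardScaler(), plain_cols),     # standardization only
])

std_x_train = preprocess.fit_transform(x_train)
std_x_test = preprocess.transform(x_test)

print(std_x_train.std(axis=0))  # every column should now be close to 1

Note that the output columns are reordered: the power_cols come first, followed by plain_cols.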

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 AMJ