Get feature names after sklearn pipeline
I want to match the columns of the output NumPy array with their feature names so that I can build a new pandas DataFrame.
Here is my pipeline:
import numpy as np
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Categorical pipeline: impute the most frequent value, then ordinal-encode
categorical_preprocessing = Pipeline(
    [
        ('Imputation', SimpleImputer(missing_values=np.nan, strategy='most_frequent')),
        ('Ordinal encoding', OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1)),
    ]
)
# Continuous pipeline: impute the mean, then standardize
continuous_preprocessing = Pipeline(
    [
        ('Imputation', SimpleImputer(missing_values=np.nan, strategy='mean')),
        ('Scaling', StandardScaler()),
    ]
)
# Creating preprocessing pipeline
preprocessing = make_column_transformer(
    (continuous_preprocessing, continuous_cols),
    (categorical_preprocessing, categorical_cols),
)
# Final pipeline
pipeline = Pipeline(
    [('Preprocessing', preprocessing)]
)
Here is how I call it:
X_train = pipeline.fit_transform(X_train)
X_val = pipeline.transform(X_val)
X_test = pipeline.transform(X_test)
Here is what I get when trying to get the feature names:
pipeline['Preprocessing'].transformers_[1][1]['Ordinal encoding'].get_feature_names()
OUT:
AttributeError: 'OrdinalEncoder' object has no attribute 'get_feature_names'
Here is a similar SO question: Sklearn Pipeline: Get feature names after OneHotEncode In ColumnTransformer
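For reference, here is a minimal sketch of one way the array could be matched back to names. It relies on the fact that every step in the pipeline above (imputation, scaling, ordinal encoding) maps one input column to one output column, and that a column transformer concatenates its outputs in the order the transformers are listed, so the output order should be continuous_cols followed by categorical_cols. The toy data and the column names age, income and city are made up for illustration. The second part assumes a scikit-learn version in which every step implements get_feature_names_out (as far as I know, 1.1 or later for SimpleImputer); on older versions it is simply skipped.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Hypothetical toy data standing in for the real X_train.
X_train = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "income": [50.0, 62.0, np.nan, 48.0],
    "city": ["Paris", "Rome", np.nan, "Paris"],
})
continuous_cols = ["age", "income"]
categorical_cols = ["city"]

categorical_preprocessing = Pipeline([
    ("Imputation", SimpleImputer(missing_values=np.nan, strategy="most_frequent")),
    ("Ordinal encoding", OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)),
])
continuous_preprocessing = Pipeline([
    ("Imputation", SimpleImputer(missing_values=np.nan, strategy="mean")),
    ("Scaling", StandardScaler()),
])
preprocessing = make_column_transformer(
    (continuous_preprocessing, continuous_cols),
    (categorical_preprocessing, categorical_cols),
)
pipeline = Pipeline([("Preprocessing", preprocessing)])

# Keep the array under a new name so the original DataFrame (and its index) survives.
X_train_arr = pipeline.fit_transform(X_train)

# Option 1: build the names by hand. All steps are one-to-one, and the
# column transformer outputs its blocks in the order they were listed.
feature_names = list(continuous_cols) + list(categorical_cols)
X_train_df = pd.DataFrame(X_train_arr, columns=feature_names, index=X_train.index)

# Option 2: ask scikit-learn for the names, if this version supports it.
# Older versions raise AttributeError because some step lacks the method.
try:
    sklearn_names = list(pipeline.get_feature_names_out())
except AttributeError:
    sklearn_names = None

print(X_train_df.columns.tolist())
print(sklearn_names)
```

With make_column_transformer the names returned by option 2 carry a prefix for each sub-pipeline (something like pipeline-1__age); passing verbose_feature_names_out=False to make_column_transformer should drop the prefix in versions that support that parameter. Note that the original call fails because OrdinalEncoder does not have a get_feature_names method, unlike the OneHotEncoder used in the linked question.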
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
