Getting selected features with SelectKBest() with Multioutput Ridge Regression
I am trying to get the names of the top-k features selected by SelectKBest() for multi-output Ridge regression. I am aware that SelectKBest() cannot do this by default. I found a workaround based on an sklearn Pipeline in this question. However, it does not seem to report the number of selected features; instead it reports the initial number of features passed to SelectKBest.
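from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline
regression_pipeline = Pipeline([('kbst', SelectKBest(f_regression, k=55)), ('regr', Ridge())])
pipe2 = Pipeline([('pipe', MultiOutputRegressor(regression_pipeline))])
pipe2.fit(X_df,y)
print(X.shape, y.shape)
print(pipe2.n_features_in_)  # number of features seen during fit, not the number kept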
output:
(9887, 90) (9887, 48)
90
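For context, n_features_in_ records how many columns were seen during fit, so 90 is the expected value here. A minimal sketch of where the selection can be read instead, assuming synthetic stand-in data (X_demo, y_demo, k=4 and the f0..f9 column names are made up for illustration): MultiOutputRegressor fits one clone of the inner pipeline per target and stores them in estimators_, and each clone's SelectKBest step exposes get_support().

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption): 200 rows, 10 features, 3 targets.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(200, 10)),
                      columns=[f"f{i}" for i in range(10)])
y_demo = rng.normal(size=(200, 3))

inner = Pipeline([("kbst", SelectKBest(f_regression, k=4)), ("regr", Ridge())])
multi = MultiOutputRegressor(inner).fit(X_demo, y_demo)

# Width of the input, not the number of selected features.
print(multi.n_features_in_)

# One fitted pipeline per target column; read each selector's boolean mask.
selected_per_target = []
for est in multi.estimators_:
    mask = est.named_steps["kbst"].get_support()
    selected_per_target.append(list(X_demo.columns[mask]))

for i, names in enumerate(selected_per_target):
    print(f"target {i}: {names}")
```

Note that the unfitted regression_pipeline object passed to MultiOutputRegressor is only a template: it is cloned for each target and never fitted itself, which would explain why its attributes are not available after fitting.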
Neither the original solution from that answer (shown below) nor regression_pipeline.n_features_in_ worked.
regression_pipeline = Pipeline([('kbst', SelectKBest(f_regression, k=55)), ('regr', Ridge())])
pipe2 = Pipeline([('poly', PolynomialFeatures(2, include_bias=False)), ('pipe', MultiOutputRegressor(regression_pipeline))])
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
