'Invalid parameter n_estimators for estimator Pipeline
I'm trying to create a pipline and to fine tune hyperparameters but I try to use fit, I get the error
ValueError: Invalid parameter n_esitmators for estimator Pipeline(steps=[('rfc', RandomForestClassifier())]). Check the list of available parameters with `estimator.get_params().keys()`.
I'd love to get some help with this please.
This is the code I'm using:
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import RandomizedSearchCV
#Classifier Pipeline
pipeline = Pipeline([
('rfc', RandomForestClassifier())
])
# Params for classifier
params = {'n_estimators': [5,20,50,100,150],
"max_depth": [1, 3,5,10,20,30,50],
"max_features": [1, 3,5,10,20,30,45],
"min_samples_split": [1, 3,5,10],
"min_samples_leaf": [1, 3,5,10]}
# Grid Search Execute
rf_rnd = RandomizedSearchCV(estimator=pipeline, param_distributions = params, cv=5, verbose=2, random_state=0, n_jobs=-1)
rf_rnd.fit(training_data, target)
Made some changes:
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import RandomizedSearchCV
#Classifier Pipeline
pipeline = Pipeline([
('rfc', RandomForestClassifier())
])
# Params for classifier
params = {'estimator__rfc__n_estimators': [5,20,50,100,150],
"estimator__rfc__max_depth": [1, 3,5,10,20,30,50],
"estimator__rfc__max_features": [1, 3,5,10,20,30,45],
"estimator__rfc__min_samples_split": [1, 3,5,10],
"estimator__rfc__min_samples_leaf": [1, 3,5,10]}
# Grid Search Execute
rf_rnd = RandomizedSearchCV(estimator=pipeline, param_distributions = params, cv=5, random_state=0, n_jobs=-1, return_train_score=True)
Now the error is:
ValueError: Invalid parameter estimator for estimator Pipeline(steps=[('rfc', RandomForestClassifier())]). Check the list of available parameters with
estimator.get_params().keys().
And this is what estimator.get_params().keys() outputs:
dict_keys(['cv', 'error_score', 'estimator__memory', 'estimator__steps', 'estimator__verbose', 'estimator__rfc', 'estimator__rfc__bootstrap', 'estimator__rfc__ccp_alpha', 'estimator__rfc__class_weight', 'estimator__rfc__criterion', 'estimator__rfc__max_depth', 'estimator__rfc__max_features', 'estimator__rfc__max_leaf_nodes', 'estimator__rfc__max_samples', 'estimator__rfc__min_impurity_decrease', 'estimator__rfc__min_impurity_split', 'estimator__rfc__min_samples_leaf', 'estimator__rfc__min_samples_split', 'estimator__rfc__min_weight_fraction_leaf', 'estimator__rfc__n_estimators', 'estimator__rfc__n_jobs', 'estimator__rfc__oob_score', 'estimator__rfc__random_state', 'estimator__rfc__verbose', 'estimator__rfc__warm_start', 'estimator', 'iid', 'n_iter', 'n_jobs', 'param_distributions', 'pre_dispatch', 'random_state', 'refit', 'return_train_score', 'scoring', 'verbose'])
Solution 1:[1]
When you use a pipeline with any search (randomized or grid) the param_grid keys have to follow this syntax (step name)__(param name). So you should use rfc__n_estimators and so on to the other parameters.
Summarizing just remove the estimator__ part that you've added in your last modification
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Eduardo Pacheco |
