'Get all prediction values for each CV in GridSearchCV
I have a time-dependent data set, where I (as an example) am trying to do some hyperparameter tuning on a Lasso regression.
For that I use sklearn's TimeSeriesSplit instead of regular Kfold CV, i.e. something like this:
tscv = TimeSeriesSplit(n_splits=5)
model = GridSearchCV(
estimator=pipeline,
param_distributions= {"estimator__alpha": np.linspace(0.05, 1, 50)},
scoring="neg_mean_absolute_percentage_error",
n_jobs=-1,
cv=tscv,
return_train_score=True,
max_iters=10,
early_stopping=True,
)
model.fit(X_train, y_train)
With this I get a model, which I can then use for predictions etc. The idea behind that cross validation is based on this:
However, my issue is that I would actually like to have the predictions from all the test sets from all cv's. And I have no idea how to get that out of the model ?
If I try the cv_results_ I get the score (from the scoring parameter) for each split and each hyperparameter. But I don't seem to be able to find the prediction values for each value in each test split. And I actually need that for some backtesting. I don't think it would be "fair" to use the final model to predict the previous values. I would imagine there would be some kind of overfitting in that case.
So yeah, is there any way for me to extract the predicted values for each split ?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|

