'Get all prediction values for each CV in GridSearchCV

I have a time-dependent data set, where I (as an example) am trying to do some hyperparameter tuning on a Lasso regression.

For that I use sklearn's TimeSeriesSplit instead of regular Kfold CV, i.e. something like this:

tscv = TimeSeriesSplit(n_splits=5)

model = GridSearchCV(
    estimator=pipeline,
    param_distributions= {"estimator__alpha": np.linspace(0.05, 1, 50)},
    scoring="neg_mean_absolute_percentage_error",
    n_jobs=-1,
    cv=tscv,
    return_train_score=True,
    max_iters=10,
    early_stopping=True,
)

model.fit(X_train, y_train)

With this I get a model, which I can then use for predictions etc. The idea behind that cross validation is based on this:

enter image description here

However, my issue is that I would actually like to have the predictions from all the test sets from all cv's. And I have no idea how to get that out of the model ?

If I try the cv_results_ I get the score (from the scoring parameter) for each split and each hyperparameter. But I don't seem to be able to find the prediction values for each value in each test split. And I actually need that for some backtesting. I don't think it would be "fair" to use the final model to predict the previous values. I would imagine there would be some kind of overfitting in that case.

So yeah, is there any way for me to extract the predicted values for each split ?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source