How to speed up XGBoost Classification for time series
I'm using the XGBoost classifier for time series prediction, with out-of-time cross-validation (for example, training on 10 weeks and predicting/testing on the 11th week). This makes something like early stopping rounds fairly tough to use for speeding things up, because I would have to carve out a validation set. I'm already using the gpu_hist tree method, num_cores = os.cpu_count(), and running the XGBoost model within a Spark parallelization process to distribute the computation. I would like to speed it up even more if possible; any advice? I'm OK with a method that stops training short. It doesn't need to be the best fit, as long as it's still a good fit.
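For concreteness, here is a minimal sketch of the kind of setup I mean. The synthetic data and column names are illustrative, not my real pipeline, and `gpu_hist` assumes a CUDA-capable GPU is available:

```python
import os
import numpy as np
import pandas as pd
import xgboost as xgb

# Illustrative synthetic data: 11 weeks of observations with two features.
rng = np.random.default_rng(0)
n = 11_000
df = pd.DataFrame({
    "week": rng.integers(1, 12, n),   # weeks 1..11
    "f1": rng.normal(size=n),
    "f2": rng.normal(size=n),
})
df["target"] = (df["f1"] + rng.normal(size=n) > 0).astype(int)
features = ["f1", "f2"]

# Out-of-time split: train on weeks 1-10, test on week 11.
train, test = df[df["week"] <= 10], df[df["week"] == 11]
dtrain = xgb.DMatrix(train[features], label=train["target"])
dtest = xgb.DMatrix(test[features], label=test["target"])

params = {
    "objective": "binary:logistic",
    "tree_method": "gpu_hist",   # GPU histogram algorithm (needs a CUDA GPU)
    "nthread": os.cpu_count(),   # use all available cores
}

# No validation set, so no early stopping: every boosting round runs.
model = xgb.train(params, dtrain, num_boost_round=500)
preds = model.predict(dtest)
```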
Also, is there a reasonable way to use early stopping rounds with time-series data? My concern is that I don't want to sacrifice training on some of the most recent data (especially in production). My concern about not using a validation set for time series is summarized quite well here.
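One pattern I've seen suggested, and would like feedback on, is to hold out only the last week of the training window to let early stopping pick a round count, then refit on the full window with that round count so the final model doesn't lose the most recent data. A hedged sketch, continuing from the code above (same `df`, `features`, `params`):

```python
# Carve the last training week off as a validation set purely to find a
# good number of boosting rounds via early stopping.
fit = df[df["week"] <= 10]
inner_train = fit[fit["week"] <= 9]
inner_valid = fit[fit["week"] == 10]

d_inner = xgb.DMatrix(inner_train[features], label=inner_train["target"])
d_valid = xgb.DMatrix(inner_valid[features], label=inner_valid["target"])

probe = xgb.train(
    params, d_inner, num_boost_round=500,
    evals=[(d_valid, "valid")],
    early_stopping_rounds=20,
    verbose_eval=False,
)

# Refit on all 10 weeks, including the most recent one, using the round
# count that early stopping found (best_iteration is 0-indexed).
d_full = xgb.DMatrix(fit[features], label=fit["target"])
final = xgb.train(params, d_full, num_boost_round=probe.best_iteration + 1)
```

Would this be a reasonable compromise, or does refitting with a round count tuned on a slightly smaller window introduce its own problems?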