Error in xgboost::xgb.DMatrix(as.matrix(mydat %>% dplyr::select(date, : 'data' has class 'character' and length 176 in R
This is my dataset, as produced by dput():
mydat=structure(list(date = c("22.06.2021", "22.06.2021", "22.06.2021",
"22.06.2021", "22.06.2021", "22.06.2021", "22.06.2021", "22.06.2021",
"22.06.2021", "22.06.2021", "22.06.2021", "22.06.2021", "22.06.2021",
"22.06.2021", "22.06.2021", "22.06.2021", "22.06.2021", "22.06.2021",
"23.06.2021", "23.06.2021", "23.06.2021", "23.06.2021", "23.06.2021",
"23.06.2021", "23.06.2021", "23.06.2021", "23.06.2021", "23.06.2021",
"23.06.2021", "23.06.2021", "23.06.2021", "23.06.2021", "23.06.2021",
"23.06.2021", "23.06.2021", "23.06.2021", "23.06.2021", "23.06.2021",
"23.06.2021", "23.06.2021", "23.06.2021", "23.06.2021", "24.06.2021",
"24.06.2021"), hour = c(6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 0L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L,
18L, 19L, 20L, 21L, 22L, 23L, 0L, 1L), weekday = c(2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L), base_price = c(3250.87,
3261.89, 3272.91, 3283.93, 3294.95, 3305.97, 3316.98, 3328, 3339.02,
3350.04, 3361.06, 3372.08, 3383.1, 3394.12, 3405.14, 3416.16,
3427.17, 3438.19, 3449.21, 3460.23, 3471.25, 3482.27, 3493.29,
3504.31, 3515.33, 3526.35, 3537.36, 3548.38, 3559.4, 3570.42,
3581.44, 3592.46, 3603.48, 3614.5, 3625.52, 3636.54, 3647.55,
3658.57, 3669.59, 3680.61, 3691.63, 3702.65, 3713.67, 3724.69
)), class = "data.frame", row.names = c(NA, -44L))
I'm trying to learn how to use boosting for time series analysis, but I'm having some difficulty. This is the example I am trying to run:
library(xgboost)
library(dplyr)
library(lubridate)
extended_data_mod <- mydat %>%
dplyr::mutate(.,
index_date = as.Date(paste0(lubridate::year(date), "-", lubridate::month(date), "-01")),
months = lubridate::month(index_date),
years = lubridate::year(index_date))
mydat <- extended_data_mod[1:length(ts), ] # initial data
pred <- extended_data_mod[(length(ts) + 1):nrow(extended_data), ] # extended time index
x_train <- xgboost::xgb.DMatrix(as.matrix(mydat %>%
dplyr::select(date, hour, weekday,
base_price)))
x_pred <- xgboost::xgb.DMatrix(as.matrix(pred %>%
dplyr::select(date, hour, weekday,
base_price)))
y_train <- mydat$base_price
#learn the model
xgb_trcontrol <- caret::trainControl(
method = "cv",
number = 5,
allowParallel = TRUE,
verboseIter = FALSE,
returnData = FALSE
)
Instead of the desired result, I get this error:
Error in xgboost::xgb.DMatrix(as.matrix(mydat %>% dplyr::select(date, :
'data' has class 'character' and length 176.
What did I do wrong, and how can I make this code work correctly? Thank you in advance.
Solution 1:[1]
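as.matrix() on a data frame that contains a character column returns a character matrix, and xgb.DMatrix() only accepts numeric data. That is why the error reports class 'character' and length 176 (44 rows x 4 columns). Here the date column is the character one, so convert it to a number before building the matrix. Calling as.numeric() directly on strings such as "22.06.2021" returns NA, so parse them as dates first. Try this:
x_train <- xgboost::xgb.DMatrix(
  as.matrix(mydat %>%
    # parse day.month.year strings, then store them as days since 1970-01-01
    dplyr::mutate(date = as.numeric(as.Date(date, format = "%d.%m.%Y"))) %>%
    dplyr::select(date, hour, weekday, base_price))
)
Do the same for x_pred.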
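To see the cause without involving xgboost at all, here is a minimal base R sketch. It uses a three-row stand-in for the question's data (the rows are picked from the dput() output, and the object names m_chr, mydat_num and m_num are only illustrative) and compares the matrix type before and after converting the date column:

```r
# Three-row stand-in for the question's data, same column names and types.
mydat <- data.frame(
  date = c("22.06.2021", "23.06.2021", "24.06.2021"),
  hour = c(6L, 0L, 1L),
  weekday = c(2L, 3L, 4L),
  base_price = c(3250.87, 3449.21, 3713.67),
  stringsAsFactors = FALSE
)

# A single character column forces the whole matrix to character.
m_chr <- as.matrix(mydat)
print(typeof(m_chr))
print(length(m_chr))  # rows * columns, which is what the error message reports

# Parse the day.month.year strings, then keep them as days since 1970-01-01.
mydat_num <- mydat
mydat_num$date <- as.numeric(as.Date(mydat_num$date, format = "%d.%m.%Y"))

# With only numeric columns left, the matrix is numeric.
m_num <- as.matrix(mydat_num)
print(typeof(m_num))
print(anyNA(m_num))
```

The first matrix is the kind of object that xgb.DMatrix() rejects; the second is the kind it is meant to receive. Checking typeof() on the matrix before passing it on is a quick way to catch this class of problem.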
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | langtang |
