Data partitioning with createDataPartition: cross-validation problem

I am trying to get predictions from a multiple-variable model. The dataset, eplt, is made up of 7 scores and one final exam score, moy_exam2, and I want to predict the latter from the 7 scores. I have 29441 observations, like this:

'data.frame':   19643 obs. of  8 variables:
 $ HG       : num  11.5 14 7.5 10.5 9.5 9.5 10 14 11.5 14 ...
 $ Math     : num  8 7.25 9.25 13.25 4.25 ...
 $ Ar       : num  11.2 12.8 8.5 11.5 9.5 ...
 $ Fr       : num  4 4.25 6.5 6.75 5.5 ...
 $ EI       : num  8 10.5 2.5 4 7 9.5 8.5 9.5 12 14 ...
 $ SVT      : num  5.25 9.25 7 11.5 12.5 ...
 $ PC       : num  11.5 16.75 4.25 13.75 10 ...
 $ moy_exam2: num  8.15 9.48 7.23 10.33 7.44 ...

I decided to use 85% of the data for training and 15% for testing the model, so to partition the data with createDataPartition I tried this:

# Load the data
data("neplt")
# Inspect the data
library(tidyverse)
sample_n(neplt, 3)
# Split the data into training and test set
set.seed(1,sample.kind = "Rounding")
#remember the last sample 
training.samples=neplt$moy_exam2
library(Rcpp)
training.samples <- neplt$moy_exam2 %>%
createDataPartition(neplt,p = 0.85, list = FALSE,times = 1)
train.data  <- neplt[training.samples, ]
test.data <- neplt[-training.samples, ]
# Build the model
model <- lm(moy_exam2 ~., data = train.data, na.action=na.omit)
# Make predictions and compute the R2, RMSE and MAE
predictions <- model %>% predict(test.data)
data.frame( R2 = R2(predictions, test.data$moy_exam2),
            RMSE = RMSE(predictions, test.data$moy_exam2),
            MAE = MAE(predictions, test.data$moy_exam2))

I get the error

Error in split_indices(as.integer(splitv), attr(splitv, "n")) : 
function 'Rcpp_precious_remove' not provided by package 'Rcpp'

I don't call any split_indices function here, and Rcpp is already loaded. So I continued executing, but the program gets stuck on the createDataPartition line. I cleaned the eplt data using na.omit and also na.exclude to rule out any problem with missing (NA) values. Then I tried adding the sample.kind = "Rounding" argument to set.seed to get it to work. Still, RStudio keeps loading indefinitely, and the console shows a + sign.

Could this be related to memory capacity? Or is the number of possible samples so large that it could not finish in 100 years? It has been running for hours with no results!
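
For reference, here is a minimal sketch of how the split is usually written with caret. It assumes caret is installed, and it uses a small synthetic data frame (the hypothetical `scores`, with only three of the predictors) as a stand-in for neplt. Only the outcome vector is passed to createDataPartition. In the pipe version above, the extra `neplt` argument ends up matched to a different parameter, which is probably not what was intended. Note also that a `+` in the R console normally means R is waiting for the rest of an incomplete expression, not that it is still computing.

```r
library(caret)  # provides createDataPartition(), R2(), RMSE(), MAE()

# Synthetic stand-in for neplt (assumption: the real data is not available here)
set.seed(1)
n <- 1000
scores <- data.frame(
  HG   = runif(n, 0, 20),
  Math = runif(n, 0, 20),
  Fr   = runif(n, 0, 20)
)
scores$moy_exam2 <- with(scores, 0.3 * HG + 0.4 * Math + 0.2 * Fr + rnorm(n))

# Pass only the outcome vector as y; the data frame itself is not an argument
train_idx <- createDataPartition(scores$moy_exam2, p = 0.85, list = FALSE, times = 1)

train_data <- scores[train_idx, ]
test_data  <- scores[-train_idx, ]

# Fit on the training rows, then evaluate on the held-out rows
model <- lm(moy_exam2 ~ ., data = train_data)
predictions <- predict(model, newdata = test_data)

data.frame(
  R2   = R2(predictions, test_data$moy_exam2),
  RMSE = RMSE(predictions, test_data$moy_exam2),
  MAE  = MAE(predictions, test_data$moy_exam2)
)
```

On data of this size the call should return almost immediately, so a session that hangs for hours points to something other than the partitioning itself.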



Solution 1:[1]

I had a similar problem and the same error message when running summarySE. It seems others have had this issue too; see the question "Rcpp package doesn't include Rcpp_precious_remove".

I installed and loaded Rcpp again and it worked thereafter!
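
A minimal sketch of that reinstall step, assuming a working CRAN mirror and a fresh R session. As far as I know, this error usually appears when another package was built against a newer Rcpp than the one installed, so updating Rcpp before loading anything that depends on it is the part that matters. The `library(caret)` line is only there because the question uses caret.

```r
# Run in a fresh R session, before loading any package that depends on Rcpp
install.packages("Rcpp")

library(Rcpp)
packageVersion("Rcpp")  # check which version is now installed

# Then load the package that raised the error (caret in the question)
library(caret)
```

If the error persists, restarting R (in RStudio: Session > Restart R) before reinstalling makes sure the old Rcpp is not still loaded.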

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 TylerH