Split Train & Test Sets but Indexed Input Differs from Subscript by 1: why?

I've split my data into training and testing sets, but I keep receiving this error:

! Must subset rows with a valid subscript vector.
ℹ Logical subscripts must match the size of the indexed input.
x Input has size 4067 but subscript split_data_table == 0 has size 4066.

My data is named "JFK_weather_clean2". To execute the split, I did:

set.seed(1234)
split_data_table <- sample(c(rep(0, 0.8 * nrow(JFK_weather_clean2)), rep(1, 0.2 * nrow(JFK_weather_clean2))))

table(split_data_table) returns:

   0    1
3253  813

From there I tried to create the training set:

training_set <- JFK_weather_clean2[split_data_table == 0, ]

As you have probably noticed, my input data has 4,067 rows (a count that I believe includes the header row), whereas the subscript has size 4,066. I am assuming the issue involves the header row, but I don't know what correction to make in my sample() code. Thanks for any help!
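
For reference, here is a minimal sketch of where the missing element may come from, and one possible way to build a split vector that matches the row count. It assumes nrow(JFK_weather_clean2) is 4067; the data frame df and the names n_train and split_flag are stand-ins invented for illustration, not part of the original code. nrow() counts data rows only, so the header is probably not the cause. The more likely explanation is that rep() truncates a fractional count: 0.8 * 4067 = 3253.6 becomes 3253 and 0.2 * 4067 = 813.4 becomes 813, which together give 4066.

```r
# Stand-in for JFK_weather_clean2 (hypothetical data; only the row count matters).
df <- data.frame(x = seq_len(4067))
n <- nrow(df)  # nrow() counts data rows only, not the header

# The original approach: rep() drops the fractional part of its count.
n_train_orig <- length(rep(0, 0.8 * n))  # 0.8 * 4067 = 3253.6, so 3253 values
n_test_orig  <- length(rep(1, 0.2 * n))  # 0.2 * 4067 = 813.4, so 813 values
n_train_orig + n_test_orig               # one short of n

# One possible fix: fix the training count, then derive the test count from n
# so that the two always sum to the number of rows.
set.seed(1234)
n_train <- floor(0.8 * n)
split_flag <- sample(rep(c(0, 1), times = c(n_train, n - n_train)))

training_set <- df[split_flag == 0, , drop = FALSE]
testing_set  <- df[split_flag == 1, , drop = FALSE]
```

Deriving the second count as n - n_train, rather than computing 0.2 * n separately, keeps the flag vector the same length as the data regardless of how the 80/20 split rounds.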



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
