KNN function in R producing NA/NaN/Inf in foreign function call (arg 6) error

I'm working on a project where I need to construct a knn model using R. The professor provided an article with step-by-step instructions (link to article) and some datasets to choose from (link to the data I'm using). I'm getting stuck on step 3 (creating the model from the training data).

Here's my code:

library(class)  # provides knn()
data <- read.delim("data.txt", header = TRUE, sep = "\t", dec = ".")
set.seed(2)
part <- sample(2, nrow(data), replace = TRUE, prob = c(0.65, 0.35))
training_data <- data[part==1,]
testing_data <- data[part==2,]
outcome <- training_data[,2]
model <- knn(train = training_data, test = testing_data, cl = outcome, k=10)

Here's the error message I'm getting:

NA/NaN/Inf in foreign function call (arg 6)

I checked training_data, testing_data, and outcome, and they all look correct; the issue seems to be only with the knn call.



Solution 1:[1]

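The issue is with your data and the knn function you are using: it can't handle character or factor variables. knn computes distances between rows, so every column has to be numeric. Text values turn into NA when they are coerced to numbers, which is most likely what the error is reporting.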
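To check which columns are the problem, something like the following may help. It is only a sketch on a made-up data frame (`toy` and its column names are hypothetical, not the asker's data); it shows that a data frame with a text column becomes a character matrix, and that coercing the text to numeric gives NA.

```r
# Hypothetical toy data frame; column names are illustrative only.
toy <- data.frame(
  Rainfall = c(1.2, 3.4, 5.6),
  Season   = c("Winter", "Summer", "Winter"),
  stringsAsFactors = FALSE
)

# Report the class of every column; anything that is not
# numeric or integer needs converting before calling knn().
col_classes <- sapply(toy, class)
print(col_classes)

# as.matrix() on a mixed data frame yields a character matrix ...
m <- as.matrix(toy)
is_char <- is.character(m)

# ... and coercing the text entries to numeric yields NA (with a warning,
# suppressed here to keep the output clean).
season_as_num <- suppressWarnings(as.numeric(m[, "Season"]))
print(season_as_num)
```

Running `sapply(data, class)` on your own data should point to the columns that need converting.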

We can force this to work by doing something like this first:

library(tidyverse)

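data <- data %>%
  # two-level variable: recode to 0/1
  mutate(Seeded = as.numeric(as.factor(Seeded)) - 1) %>%
  # multi-level variable: recode to integer level codes 1, 2, 3, ...
  mutate(Season = as.numeric(as.factor(Season)))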

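But this is a bad idea in general, since Season has no natural order: the integer codes make some seasons look closer to each other than others purely because of how the levels happened to be numbered. A better approach is to treat it as a set of dummy (0/1 indicator) variables.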

See this link for examples:

R - convert from categorical to numeric for KNN
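As a rough illustration of the dummy-variable approach, here is a minimal sketch using base R's model.matrix() and knn() from the class package. The data frame `toy`, its columns, and the choice of k = 3 are all made up for the example; adapt the formula to your own column names.

```r
library(class)  # provides knn()

# Hypothetical toy data; column names and values are illustrative only.
set.seed(2)
toy <- data.frame(
  Outcome  = factor(rep(c("yes", "no"), times = 10)),
  Rainfall = runif(20, 0, 10),
  Season   = factor(rep(c("Autumn", "Spring", "Summer", "Winter"), each = 5))
)

# model.matrix() expands Season into 0/1 indicator columns;
# "- 1" drops the intercept so every level gets its own column.
predictors <- model.matrix(~ Rainfall + Season - 1, data = toy)

# Split the rows; the outcome is kept out of the predictor matrices.
train_idx <- sample(nrow(toy), 14)
train_x <- predictors[train_idx, , drop = FALSE]
test_x  <- predictors[-train_idx, , drop = FALSE]

# All predictor columns are numeric now, so knn() has nothing to coerce.
pred <- knn(train = train_x, test = test_x,
            cl = toy$Outcome[train_idx], k = 3)
print(pred)
```

Note that the outcome column is passed only through cl and is left out of train and test, unlike in the question's code, where it is still part of both.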

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jonathan Graves