How should I fix "error in knn(): 'train' and 'class' have different lengths"?
I am attempting to use the knn() function in the class package to solve a problem. I have split the iris dataset into 50% training data and 50% test data. I am attempting to predict the variety variable using sepal width and petal width. My knn() call is as follows:
> predictions <- knn(iris.train[, c(1:2)], iris.test[, c(1:2)], iris.train[, 3], k = 10)
In this instance, columns 1 and 2 of iris.train and iris.test are sepal width and petal width. Column 3 of both datasets is the variety variable, stored as a factor. I keep getting the error that 'train' and 'class' have different lengths. When I check the dimensions of what I pass into the function, this is what I get:
> dim(iris.train[, c(1:2)])
[1] 75 2
> dim(iris.test[, c(1:2)])
[1] 75 2
> dim(iris.train[, 3])
[1] 75 1
So I would assume that I'm missing something. How can I resolve the issue of 'train' and 'class' being different lengths? Thank you to anyone who can help!
Solution 1:[1]
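The cl argument should be a factor (or vector) whose length equals the number of rows in train. If you check length(iris.train[, 3]), you'll see that it is 1, not 75. The dim() output of 75 1 shows that single-bracket indexing returned a one-column data frame (as it does for a tibble, for example), and the length of a data frame is its number of columns, not its number of rows. Double brackets extract the column itself as a factor of length 75.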
Try this:
predictions <- knn(iris.train[, c(1:2)], iris.test[, c(1:2)], iris.train[[3]], k = 10)
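To see the difference for yourself, here is a minimal sketch using the built-in iris data. The split, the seed, and the names train, test, feats, cl_frame and cl_vector are assumptions made for illustration, not the asker's actual objects. Because a plain data.frame drops to a vector by default, drop = FALSE is used to mimic the one-column-frame behaviour seen in the question.

```r
library(class)  # provides knn()

set.seed(1)                               # arbitrary seed, for reproducibility only
idx   <- sample(nrow(iris), 75)           # 50/50 split, as described in the question
feats <- c("Sepal.Width", "Petal.Width")  # assumed feature columns
train <- iris[idx, ]
test  <- iris[-idx, ]

# drop = FALSE keeps the result as a one-column data frame,
# which is what single-bracket indexing on a tibble does
cl_frame  <- train[, "Species", drop = FALSE]
# double brackets extract the column itself as a factor
cl_vector <- train[["Species"]]

dim(cl_frame)      # 75 rows, 1 column
length(cl_frame)   # number of columns, so 1
length(cl_vector)  # number of elements, so 75

# Passing the one-column frame should reproduce the error from the question;
# tryCatch captures the message instead of stopping the script
err <- tryCatch(
  knn(train[, feats], test[, feats], cl_frame, k = 10),
  error = function(e) conditionMessage(e)
)
print(err)

# Passing the factor works
predictions <- knn(train[, feats], test[, feats], cl_vector, k = 10)
table(predicted = predictions, actual = test$Species)
```

If you would rather keep single brackets, iris.train[, 3, drop = TRUE] or dplyr::pull(iris.train, 3) should also return the column as a vector when iris.train is a tibble.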
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | langtang |
