KNN From Scratch in Python: How Do I Create and Go to the Next Test Instance?

I have the following KNN function, which produces erroneous predictions. I believe this is because the code is only using a single test instance. How do I adjust this code, as per the comments in the function, to pick the next test instance and repeat the process until the loop terminates?

import numpy as np
from sklearn import metrics

def knn_predict(X_train, y_train, X_test, k=5):
    
    y_pred=[]
        
    for i in range(0, len(X_test)):

        #Grab a test instance from X_test
        test_instance = np.array([X_test.iloc[i]])

        #find distances between the test instance and all training instances
        d = metrics.euclidean_distances(X_train, test_instance)

        #stack the distances with the y_train to get a matrix
        stacked = np.stack((d.flatten(), y_train.Time.values), axis=1)

        #sort the matrix by the distance column pick k number of y_train values where k
        #is much less than the length of the training set
        y_train_nearest_k = stacked[np.argsort(stacked[:,-1])][0:k,0]

        #make a predicted value for the test instance and append it to y_pred
        y_pred.append(np.mean(y_train_nearest_k))

        #pick the next instance repeat the process until the loop terminates  
    
    return y_pred

Calling the function as is gives these results:

y_delivery_test_pred = knn_predict(X_delivery_train, y_delivery_train, X_delivery_test, k=5)
y_delivery_test_pred[0:5]

[6.603648852515093,
 19.02562968007764,
 34.00249960949702,
 24.003332407921455,
 24.330669863436253]

The correct results (obtained with sklearn's KNeighborsRegressor) should look more like this:

array([[5.14],
       [6.5 ],
       [6.32],
       [6.2 ],
       [9.16]])

Data Sample:

X_train:
    Miles  Deliveries
0     100           4
1      50           3
2     100           4
3     100           2
4      50           2
5      80           2
6      75           3
7      65           4
8      90           3
9      90           2
10     50           5

y_train:
    Time
0   9.3
1   4.8
2   8.9
3   6.5
4   4.2
5   6.2
6   7.4
7   6.0
8   7.6
9   6.1
10  7.0

X_test:
    Miles  Deliveries
0      50           3
1      65           2
2      80           1
3      70           1
4      70           5
5      95           6
6      50           6
7      90           3
8      60           3
9      80           1
10     95           6

Thanks!
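
A possible fix, offered as a sketch rather than a definitive answer: the `for` loop already advances to the next test instance on each pass, so the iteration itself looks fine. The suspicious line is the sort. `stacked[:,-1]` is the `Time` column, so the rows get ordered by target value, and `[0:k,0]` then returns the distance column, which would explain why the predictions look like averaged distances rather than times. Sorting by distance and averaging the `Time` values instead gives something like the version below. It uses plain NumPy (`np.linalg.norm`) in place of `metrics.euclidean_distances`, and the arrays are only the data sample from the question, so the numbers will not match the sklearn output computed on the full dataset.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=5):
    # np.asarray accepts pandas DataFrames as well as plain arrays
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train, dtype=float).ravel()

    y_pred = []
    for test_instance in X_test:
        # Euclidean distance from this test instance to every training row
        d = np.linalg.norm(X_train - test_instance, axis=1)
        # Indices of the k training rows with the smallest distances
        nearest_idx = np.argsort(d)[:k]
        # Average the targets (not the distances) of those neighbours
        y_pred.append(y_train[nearest_idx].mean())
    return np.array(y_pred)

# Data sample from the question (first three test rows only)
X_train = np.array([[100, 4], [50, 3], [100, 4], [100, 2], [50, 2], [80, 2],
                    [75, 3], [65, 4], [90, 3], [90, 2], [50, 5]])
y_train = np.array([9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6.0, 7.6, 6.1, 7.0])
X_test = np.array([[50, 3], [65, 2], [80, 1]])

preds = knn_predict(X_train, y_train, X_test, k=5)
print(preds)
```

If you would rather keep the original structure, the equivalent one-line change would be to sort on column 0 and select column 1: `stacked[np.argsort(stacked[:, 0])][0:k, 1]`.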



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow