Why does sklearn.neighbors.KDTree.query_radius not return indices as integers?
I'm doing a spherical neighbor search using a KDTree structure. I use the query_radius method from sklearn.neighbors.KDTree.
The documentation says that the indices are returned as "object" type. What does that mean exactly? Can I convert them to integers? Why is this not the case for the query method, which returns indices as integers?
Here is the important part of my code:
import numpy as np
from sklearn.neighbors import KDTree
def kdtree_spherical(queries, supports, radius, leaf_size=40):
    supports_tree = KDTree(supports, leaf_size=leaf_size)
    ind = supports_tree.query_radius(queries, r=radius, return_distance=False)
    return supports[ind]
# Define the search parameters
points = ... # array of size N*3, N is very big basically
neighbors_num = 100
radius = 0.2
num_queries = 1000
random_indices = np.random.choice(points.shape[0], num_queries, replace=False)
queries = points[random_indices, :]
# Search spherical
neighborhoods = kdtree_spherical(queries, points, radius)
This gives the following error:
Traceback (most recent call last):
  File "neighborhoods.py", line 178, in <module>
    neighborhoods = kdtree_spherical(queries, points, radius, leaf_size)
  File "neighborhoods.py", line 79, in kdtree_spherical
    return supports[ind]
IndexError: arrays used as indices must be of integer (or boolean) type
Solution 1:[1]
With this method, you are looking for all neighbors within the given radius around each point in queries. Since queries contains several points, the result is an array of index arrays (each of dtype int64), one per query point. These index arrays differ in length because the density of points inside the radius differs from one query point to the next. The outer array that holds them must therefore be of dtype object (an object array can contain arrays of different sizes or shapes; arrays of this type need further processing for vectorization and …).
The easiest way to solve this issue is to loop over the index arrays in the outer array and index supports with each one separately. The function can be modified as follows:
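To make the difference between the two methods concrete, here is a minimal sketch on made-up random data (the array sizes, the seed and the radius are assumptions chosen only for illustration). It inspects the dtypes returned by query_radius and by query: the radius search yields one integer index array per query point inside an object container, while the k-nearest search always returns exactly k indices per query and so fits in a regular 2-D integer array.

```python
import numpy as np
from sklearn.neighbors import KDTree

# Toy data, purely illustrative (not the asker's point cloud)
rng = np.random.default_rng(0)
points = rng.random((200, 3))
queries = points[:5]

tree = KDTree(points, leaf_size=40)

# Radius search: one index array per query, lengths generally differ
ind_radius = tree.query_radius(queries, r=0.2, return_distance=False)
print(ind_radius.dtype)              # dtype of the outer container
print(ind_radius[0].dtype)           # dtype of one inner index array
print([len(i) for i in ind_radius])  # number of neighbors per query

# k-nearest search: exactly k indices per query, so the result is rectangular
ind_knn = tree.query(queries, k=3, return_distance=False)
print(ind_knn.dtype, ind_knn.shape)
```

So the indices themselves are already integers; only the container is of object type, which is why supports[ind] fails while supports[ind[0]] works.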
def kdtree_spherical(queries, supports, radius, leaf_size=40):
    supports_tree = KDTree(supports, leaf_size=leaf_size)
    print(type(supports_tree))
    ind = supports_tree.query_radius(queries, r=radius, return_distance=False)
    # The changed section: index supports once per query point
    resulted_array = []
    for i in range(len(queries)):
        resulted_array.append(supports[ind[i]])
    return np.array(resulted_array, dtype=object)
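A caveat about the last line of the function: if every query happens to have the same number of neighbors, np.array(..., dtype=object) stacks the sub-arrays into one multi-dimensional object array instead of a 1-D array of arrays. The sketch below shows the effect with NumPy alone and one possible way around it; the helper name as_object_array is hypothetical and not part of NumPy or scikit-learn.

```python
import numpy as np

def as_object_array(arrays):
    """Hypothetical helper: always build a 1-D object array of arrays."""
    out = np.empty(len(arrays), dtype=object)
    for i, a in enumerate(arrays):
        out[i] = a  # store each sub-array as a single element
    return out

# Two "neighborhoods" that happen to have the same shape
same_size = [np.zeros((2, 3)), np.ones((2, 3))]

stacked = np.array(same_size, dtype=object)
ragged_safe = as_object_array(same_size)

print(stacked.shape)      # (2, 2, 3)
print(ragged_safe.shape)  # (2,)
```

Whether this matters depends on how the result is consumed; returning the plain Python list resulted_array would avoid the question entirely.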
The indexing can also be handled in vectorized ways, but I think looping is the best choice here with regard to possible memory issues or other such problems.
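For reference, one possible vectorized variant is sketched below. It is only an illustration under assumptions, not the method used above: the function name kdtree_spherical_flat and the flat-array-plus-offsets layout are made up for this example. All index arrays are concatenated into a single integer array, supports is indexed once, and an offsets array records where each query's neighbors start and end.

```python
import numpy as np
from sklearn.neighbors import KDTree

def kdtree_spherical_flat(queries, supports, radius, leaf_size=40):
    """Hypothetical variant: neighbors in one flat array plus offsets."""
    tree = KDTree(supports, leaf_size=leaf_size)
    ind = tree.query_radius(queries, r=radius, return_distance=False)
    # Number of neighbors found for each query point
    counts = np.array([len(i) for i in ind], dtype=np.intp)
    # One flat integer index array, valid for ordinary fancy indexing
    flat_ind = np.concatenate(list(ind))
    # Neighbors of query i are rows offsets[i]:offsets[i + 1]
    offsets = np.concatenate(([0], np.cumsum(counts)))
    return supports[flat_ind], offsets

# Toy data, purely illustrative
rng = np.random.default_rng(0)
supports = rng.random((500, 3))
queries = supports[:10]

flat_points, offsets = kdtree_spherical_flat(queries, supports, radius=0.2)
first_neighborhood = flat_points[offsets[0]:offsets[1]]
print(flat_points.shape, first_neighborhood.shape)
```

This copies every neighbor into one large array, so with a very large N and a generous radius the memory concern mentioned above applies here as well.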
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
