'Calculate the minimum radius around a point within which there are X other points

I have a dataset with 500k coordinates (latitude and longitude). I'd like to calculate the minimum radius for each point within which there are X neighbouring points.

I've used scipy.spatial.KDTree to count the number of points within a defined radius. Is there a way to use this package to do the inverse? Or another package or feasible brute force method that can achieve this?

Solution 1:^[1]

have used cities as sample data. This data set is far smaller than one that you have stated.
extended on approach you started. Start by creating KDTree from points in geodataframe
take result from sparse_distance_matrix() into a pandas dataframe
pick nth() point where v will be distance you want
have used UTM CRS so that distances are in meters for points. This could have accuracy issues with points that span many UTM zones.
pulled it back together and also to validate generated a list of points that are within this distance

import geopandas as gpd
from scipy.spatial import KDTree
import pandas as pd

# get some points...
gdf = (
    gpd.read_file(gpd.datasets.get_path("naturalearth_cities"))
    .sjoin(
        gpd.read_file(gpd.datasets.get_path("naturalearth_lowres")).loc[
            lambda d: d["continent"].eq("Europe")
        ]
    )
    .reset_index(drop=True)
).loc[:, ["name_left", "geometry"]]

# 2d array of points
# use UTM so distances will be in meters
points = (
    gdf.to_crs(gdf.estimate_utm_crs())["geometry"]
    # gdf["geometry"]
    .apply(lambda p: {"x": p.x, "y": p.y})
    .apply(pd.Series)
    .values
)
kdtree = KDTree(points)

# how many points to consider for distance
N = 5
# create a dataframe from sparse distance matrix.
# nth(N+1) as self is included, so want N other points
df = (
    pd.DataFrame(kdtree.sparse_distance_matrix(kdtree, 10 ** 7, output_type="ndarray"))
    .sort_values(["i", "v"])
    .groupby("i")
    .nth(N + 1)
)

# join back to geodataframe
gdf = gdf.join(df)

# no need to do this,  but helps validate results...
gdf["nearest"] = gdf.apply(
    lambda r: kdtree.query_ball_point(points[r.name], r.v), axis=1
)

gdf

sample output

	name_left	geometry	j	v	nearest
0	Vatican City	POINT (12.45338654497177 41.90328217996012)	25	532056	[6, 1, 0, 2, 9, 25, 19]
1	San Marino	POINT (12.44177015780014 43.936095834768)	3	422411	[6, 3, 1, 0, 2, 9, 19]
2	Rome	POINT (12.481312562874 41.89790148509894)	25	530240	[6, 1, 0, 2, 9, 25, 19]
3	Vaduz	POINT (9.516669472907267 47.13372377429357)	32	490694	[6, 12, 3, 1, 9, 5, 32]
4	Vienna	POINT (16.36469309674374 48.20196113681686)	21	493170	[9, 32, 21, 19, 26, 10, 4]
5	Luxembourg	POINT (6.130002806227083 49.61166037912108)	42	490936	[8, 12, 3, 5, 35, 39]
6	Monaco	POINT (7.406913173465057 43.73964568785249)	7	499408	[7, 6, 12, 3, 1, 0, 2]
7	Andorra	POINT (1.51648596050552 42.5000014435459)	5	868547	[8, 7, 37, 6, 12, 3]
8	Paris	POINT (2.33138946713035 48.86863878981461)	3	569852	[8, 12, 5, 35, 42, 39]
9	Ljubljana	POINT (14.51496903347413 46.0552883087945)	25	392927	[1, 9, 25, 19, 26, 10, 4]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Rob Raymond

'Calculate the minimum radius around a point within which there are X other points

Solution 1:[1]

sample output

Sources

Related Questions

Solution 1:^[1]