What does the Python cKDTree distance signify when used with latitude and longitude data frames?
I have two data frames with latitude and longitude coordinates.
gdf1
identifier                            geometry
0600a7c7-0c0f-411e-8077-bab184eec25e  POINT(93.90000 24.30000)
750df075-33f1-491a-b33c-94fa37567925  POINT(90.80000 25.50000)
fec7e5fa-5eb8-429a-b052-757a85ca878c  POINT(76.79330 30.73430)
5339b76b-96be-4669-8772-7cea8588c9ee  POINT(92.60000 24.20000)
daab354b-160e-4995-9a09-ab6c288bc8af  POINT(88.40000 27.20000)
73220777-0343-410e-b0f0-4e90ebd77b14  POINT(93.90000 24.70000)
216dd627-20f6-480f-a0e3-9b7a1dfb8788  POINT(76.78890 30.73390)
c329d75e-0a4f-43f0-af92-333a7da37c25  POINT(94.09181 25.71356)
7da0545d-ddd3-4cc1-a81c-e51b35ec1fdf  POINT(91.80000 25.50000)
8c323607-bd9d-42e8-9ee4-dacc72eda5ce  POINT(73.04561 8.27274)
gdf2
id     geometry
12113  POINT(72.82573 18.93909)
12114  POINT(72.82475 18.93252)
12115  POINT(73.02667 19.04234)
12116  POINT(73.70651 18.67469)
I am using SciPy's cKDTree to find the nearest id from gdf2 for each row in gdf1.
I am using the following function:
import itertools
from operator import itemgetter

import numpy as np
import pandas as pd  # needed for pd.concat and pd.Series below
from scipy.spatial import cKDTree
from shapely.geometry import Point, LineString

def ckdnearest(gdfA, gdfB, gdfB_cols=['id']):
    gdfA = gdfA.reset_index(drop=True)
    gdfB = gdfB.reset_index(drop=True)
    # Stack the coordinates of every geometry into (n, 2) arrays of (x, y) pairs
    A = np.concatenate([np.array(geom.coords) for geom in gdfA.geometry.to_list()])
    B = [np.array(geom.coords) for geom in gdfB.geometry.to_list()]
    # Map each coordinate row in B back to the gdfB row it came from
    B_ix = tuple(itertools.chain.from_iterable([itertools.repeat(i, x) for i, x in enumerate(list(map(len, B)))]))
    B = np.concatenate(B)
    ckd_tree = cKDTree(B)
    # Nearest neighbour in B for every point in A
    dist, idx = ckd_tree.query(A, k=1)
    idx = itemgetter(*idx)(B_ix)
    gdf = pd.concat([gdfA, gdfB.loc[idx, gdfB_cols].reset_index(drop=True), pd.Series(dist, name='dist')], axis=1)
    return gdf

nn = ckdnearest(gdf1, gdf2)
print(nn)
Printing the data frame nn gives the following:
Output df:
identifier                            geometry                  id     dist
0600a7c7-0c0f-411e-8077-bab184eec25e  POINT(93.90000 24.30000)  12116  20.962381
750df075-33f1-491a-b33c-94fa37567925  POINT(90.80000 25.50000)  12116  18.405773
fec7e5fa-5eb8-429a-b052-757a85ca878c  POINT(76.79330 30.73430)  12115  12.283708
5339b76b-96be-4669-8772-7cea8588c9ee  POINT(92.60000 24.20000)  12116  19.684848
daab354b-160e-4995-9a09-ab6c288bc8af  POINT(88.40000 27.20000)  12116  16.987636
73220777-0343-410e-b0f0-4e90ebd77b14  POINT(93.90000 24.70000)  12116  21.073245
216dd627-20f6-480f-a0e3-9b7a1dfb8788  POINT(76.78890 30.73390)  12115  12.281979
c329d75e-0a4f-43f0-af92-333a7da37c25  POINT(94.09181 25.71356)  12116  21.566323
7da0545d-ddd3-4cc1-a81c-e51b35ec1fdf  POINT(91.80000 25.50000)  12116  19.338032
8c323607-bd9d-42e8-9ee4-dacc72eda5ce  POINT(73.04561 8.27274)   12116  10.42292
The question is: what unit does the distance in the output data frame signify? Is it in meters, kilometres, or some other unit? Can I convert it into a distance in meters?
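For what it's worth, the numbers in the dist column appear to be plain Euclidean distances in the units of the input coordinates, which here are degrees of longitude and latitude, not meters or kilometres. The sketch below checks this for the first row and its match (id 12116), then computes a great-circle distance in meters with the haversine formula. The helper names `euclidean_degrees` and `haversine_m` and the spherical Earth radius of 6,371,000 m are assumptions made for illustration; they are not part of the original code or of SciPy.

```python
import math

# Assumption: spherical Earth with the commonly used mean radius, in meters.
EARTH_RADIUS_M = 6_371_000.0

def euclidean_degrees(lon1, lat1, lon2, lat2):
    """Straight-line distance in coordinate units (degrees here)."""
    return math.hypot(lon2 - lon1, lat2 - lat1)

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# First row of gdf1 and the gdf2 point it was matched to (id 12116).
a = (93.90000, 24.30000)
b = (73.70651, 18.67469)

deg = euclidean_degrees(*a, *b)   # should be close to the reported 20.962381
meters = haversine_m(*a, *b)      # the same pair as a distance on the sphere

print(f"Euclidean distance in degrees: {deg:.4f}")
print(f"Haversine distance in km:      {meters / 1000:.1f}")
```

A degree of longitude covers less ground as latitude increases, so multiplying dist by a single constant will not give reliable meters, and the nearest neighbour found in degree space may not always be the nearest one on the ground. One possible alternative, if the data are in a GeoDataFrame, is to reproject both frames to a metric projected CRS with `to_crs` before building the tree, so that the tree's distances come out in that CRS's units.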
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
