'Find the closest and the farthest points (2D) between a given point and a list of points in python

So I am trying to write a python function that can take any given point say (1,2) we can call this a point of interest and then take a list of points say [(6,2), (4,5), (3,4),...] the function should return the points (from the list) that are closest and farthest to our point of interest.

The function should accept two parameters:

    1. Point of interest i.e. pos=(1,2)
    2. List of points i.e. ptlist=[(1,3),(6,2), (4,5), (3,4)]

Also the function should return answers in a form of ((1,3),(4,5)) i.e (closest,farthest)

Here is my code, so far this works but it returns the numpy arrays, while I want my answers to be tuples. I have tried different ways and whatever I try I can't seem to get it right. Maybe we could use something else apart from numpy just so that we get answers in a form of ((1,3),(4,5)).

Here is the code I've written so far:

pos=(1,2)
ptlist=[(2, 6), (8,6), (5,8), (2,4)]
def get_nearest_farthest(pt, ptlist):
    pt=np.array(pt)
    ptlist=np.array(ptlist)
    dist=np.linalg.norm(ptlist-pt, axis=1)
    min_index = np.argmin(dist)
    #tuple(map(tuple,ptlist[min_index])) #Error
    max_index = np.argmax(dist)
    #tuple(map(tuple,ptlist[max_index])) #Error
    return ptlist[min_index],ptlist[max_index]

get_nearest_farthest(pt, ptlist)


Solution 1:[1]

You don't need numpy for this. All you need is min() / max() with a key parameter:

distance_metric = lambda x: (x[0] - pos[0])**2 + (x[1] - pos[1])**2
closest = min(ptlist, key=distance_metric)
farthest = max(ptlist, key=distance_metric)

print(closest, farthest)

This outputs:

(2, 4) (8, 6)

Solution 2:[2]

This will get you there without using numpy and is pretty efficient with only a single distance calculation per point in your list.

def get_nearest_farthest(pos, pt_list):
    nearest_pt, farthest_pt = None, None
    nearest_d, farthest_d = None, None

    for pt in pt_list:
        d = (pos[0]-pt[0])**2 + (pos[1]-pt[1])**2

        if nearest_d is None or d < nearest_d:
            nearest_d = d
            nearest_pt = pt

        if farthest_d is None or d > farthest_d:
            farthest_d = d
            farthest_pt = pt

    return nearest_pt, farthest_pt

pos = (1,2)
pt_list = [(2, 6), (8,6), (5,8), (2,4)]
get_nearest_farthest(pos, pt_list)

Solution 3:[3]

There are a number of approaches to solve this problem. Yours is one of them, and to obtain the result in the form of tuples it is sufficient to convert the resulting array back to list, with a combination of tuple() and np.ndarray.tolist().

As far as the fastest execution is concerned, it all boils down to the size of the input.

Here I provide some implementations (some similar to the other answers) as well as some timing:

import numpy as np


def p_dist_np(x, y, p=2):
    if p % 2:
        return np.sum(np.abs(x - y) ** p, axis=-1)
    else:
        return np.sum((x - y) ** p, axis=-1)


def minmax_dist_np(x0, xs, d_func=p_dist_np):
    x0 = np.array(x0)
    xs = np.array(xs)
    dists = d_func(x0, xs)
    min_i = np.argmin(dists)
    max_i = np.argmax(dists)
    return tuple(xs[min_i].tolist()), tuple(xs[max_i].tolist())
import numpy as np
import numba as nb


@nb.njit
def p_dist_nb(x, y, p=2):
    if p % 2:
        return np.sum(np.abs(x - y) ** p)
    else:
        return np.sum((x - y) ** p)


@nb.njit
def _minmax_dist_nb(x0, xs, d_func=p_dist_nb):
    min_i = max_i = 0
    min_d = max_d = d_func(x0, xs[min_i])
    for i, x in enumerate(xs[1:], 1):
        d = d_func(x0, x)
        if d < min_d:
            min_d = d
            min_i = i
        elif d > max_d:
            max_d = d
            max_i = i
    return min_i, max_i


def minmax_dist_nb(x0, xs, d_func=p_dist_nb):
    x0 = np.array(x0)
    xs = np.array(xs)
    min_i, max_i = _minmax_dist_nb(x0, xs, d_func=d_func)
    return tuple(xs[min_i].tolist()), tuple(xs[max_i].tolist())
def p_dist(x, y, p=2):
    if p % 2:
        return sum(abs(x_i - y_i) ** p for x_i, y_i in zip(x, y))
    else:
        return sum((x_i - y_i) ** p for x_i, y_i in zip(x, y))


def minmax_dist_loop(x0, xs, d_func=p_dist):
    i_xs = iter(xs)
    min_i = max_i = 0
    min_d = max_d = d_func(x0, next(i_xs))
    for i, x in enumerate(i_xs, 1):
        d = d_func(x0, x)
        if d < min_d:
            min_d = d
            min_i = i
        elif d > max_d:
            max_d = d
            max_i = i
    return xs[min_i], xs[max_i]
def minmax_dist_key(x0, xs, d_func=p_dist):
    return (
        min(xs, key=lambda x: d_func(x, x0)),
        max(xs, key=lambda x: d_func(x, x0)))

And the code to obtain the timings:

x0 = (1, 2)
xs = [(2, 6), (8, 6), (5, 8), (2, 4)]


funcs = minmax_dist_np, minmax_dist_loop, minmax_dist_key, minmax_dist_nb
for n in (1, 10, 100, 1000):
    nxs = xs * n
    print(f"N = {n * len(xs)}")
    base = funcs[0](x0, nxs)
    for func in funcs:
        res = func(x0, nxs)
        print(f"{func.__name__:>24}  {base == res}", end='  ')
        %timeit -n 50 func(x0, nxs)
# N = 4
#           minmax_dist_np  True  50 loops, best of 5: 16.2 µs per loop
#         minmax_dist_loop  True  50 loops, best of 5: 6.11 µs per loop
#          minmax_dist_key  True  50 loops, best of 5: 11.6 µs per loop
#           minmax_dist_nb  True  50 loops, best of 5: 18.1 µs per loop
# N = 40
#           minmax_dist_np  True  50 loops, best of 5: 39.6 µs per loop
#         minmax_dist_loop  True  50 loops, best of 5: 54.3 µs per loop
#          minmax_dist_key  True  50 loops, best of 5: 107 µs per loop
#           minmax_dist_nb  True  50 loops, best of 5: 43.2 µs per loop
# N = 400
#           minmax_dist_np  True  50 loops, best of 5: 232 µs per loop
#         minmax_dist_loop  True  50 loops, best of 5: 550 µs per loop
#          minmax_dist_key  True  50 loops, best of 5: 1.07 ms per loop
#           minmax_dist_nb  True  50 loops, best of 5: 281 µs per loop
# N = 4000
#           minmax_dist_np  True  50 loops, best of 5: 2.28 ms per loop
#         minmax_dist_loop  True  50 loops, best of 5: 5.7 ms per loop
#          minmax_dist_key  True  50 loops, best of 5: 11.3 ms per loop
#           minmax_dist_nb  True  50 loops, best of 5: 2.78 ms per loop

indicating that for sufficiently small inputs, minmax_dist_loop() is the fastest, but as soon as the inputs get larger, minmax_dist_np() and minmax_dist_nb() take over by a fair margin. Conversely, minmax_dist_key() gets progressively to be the slowest, essentially because it needs to compute the expensive distance function twice as much as minmax_dist_loop(). While the looping is computed twice also for minmax_dist_np(), the computation of the distance is done only once (at the cost of creating a potentially large temporary object).

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Woodford
Solution 3