'How to compare two poly lines for equality?
I have two poly lines (paths), each represented by an array of 2D points. I'd like to compute a distance or similarity score between the two. There may be a different number of points in each array. If you were to plot the poly lines and they were directly on top of each other, the distance should be zero.
p1 = np.array([[0,0], [5,5], [9,9]])
p2 = np.array([[0,0], [3,3], [6,6], [9,9]])
p3 = np.array([[0,0], [3,4], [6,7], [9,9]])
p4 = np.array([[0,0], [0,9]])
polyline_dist(p1, p2) # Should be 0 since the plots are identical
polyline_dist(p1, p3) # Should be small since the plots are close
polyline_dist(p1, p4) # Should be larger since the plots are much different
I tried one approach where I calculated the distance from each point in array 1 to the line segments from array 2 and took the minimum distance, then took the average over all the points. This worked, but got very slow for longer arrays with hundreds of points.
Any suggestions would be welcome!
Solution 1:[1]
You could try finding the area each plot casts onto the x and y axis, then comparing the intersection of that area. Similar to with calculus where you have two curves f(x) and g(x), you can find the area between the curves using the integral from the lower bound to the upper bound of (f(x) - g(x)) dx. If your lines don't have overlapping domains/codomains, you may need to add some penalty and start at the maximum of p1[0] and p2[0] and end at the minimum of p1[-1] and p2[-1]. Polylines that are the same would have 0 distance between each other and thus the area between each polyline would be 0.You would check both the x and y axis for cases where polylines area heavily vertical. I wrote the following code minus the sample_at_point part due to it getting late, all that part needs to do is find the lower and upper bounding points of p in p1 for the given axis, then define a line equation and find where the passed point p lies on that line equation and return the value. I'll edit this answer tomorrow but figured I would post what I have right now. This solution would work for the polylines you show above however it would fail if the polylines are not similar to basic curves (ex: a circle/any curve where f(x) could have two different values).
def polydist(p1, p2):
x_area = area_between_squared(p1, p2, axis=0, value=1)
y_area = area_between_squared(p1, p2, axis=1, value=0)
return (x_area + y_area)**0.5
def sample_at_point(p1, p, axis, value):
"""
# There should be a fast way to do this,
# Finnd the bounding points that p is between,
# the bounding points are the max point < p and the min point > p
# these points form a line,
# find the value of p applied to that line formula.
"""
def area_between_squared(p1, p2, axis, value):
start = max(p1[0][axis], p2[0][axis])
start_penalty = start - min(p1[0][axis], p2[0][axis])
end = min(p1[-1][axis], p2[-1][axis])
end_penalty = max(p1[-1][axis], p2[-1][axis]) - end
# Getting late, may edit this later tomorrow with the code for this to work but feel free to implement the following idea below:
axis_points = [x[axis] for x in p1]
axis_points.extend([x[axis] for x in p2])
axis_points.sort()
prev_p1_v = sample_at_point(p1, axis_points[0], axis, value)
prev_p2_v = sample_at_point(p2, axis_points[0], axis, value)
prev_axis_point = axis_points[0]
axis_points.pop(0) # remove the first item
sum = 0
for e in axis_points:
p1_v = sample_at_point(p1, e, axis, value)
p2_v = sample_at_point(p2, e, axis, value)
point_c = p1_v - p2_v
point_b = prev_p1_v - prev_p2_v
area = 0.5 * (point_c + point_b) * (e - prev_axis_point)
sum += area
prev_p1_v = p1_v
prev_p2_v = p2_v
prev_axis_point = e
sum *= sum # square the result value, add the penalties since this difference should be positive
sum += start_penalty + end_penalty
return sum
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Michael |
