'Efficient function to detect equal elements at distance k
I am looking for an efficient function to find equal elements at distance k inside a list.
Input example:
v=['a','c','d','e','a','c','e','e','e','d']
k=4
Desired output:
['a', 'c', 'e']
why: 'a' is both in position 0 and 4 [4-0=k], 'c' is both in position 1 and 5 [5-1=k], 'e' is both in position 3 and 7 [7-3=k]
If it's possible, I am looking for something more efficient than looping element by element with a for loop. That is, I'm looking for something better than the following:
def dist_k(k, v):
v_len = len(v)
out = []
for i in range(0,v_len,1):
if i+k < v_len:
if v[i] == v[i+k]:
out.append(v[i])
print(out)
Solution 1:[1]
EDIT
I found a clever way to do that even faster, without the need of np.roll, and that will not consider the array as circular :
>>> import numpy as np
>>> v = ['a','c','d','e','a','c','e','e','e','d']
>>> v_np = np.array(v)
>>> k = 4
>>> v_init = v_np[:-k]
>>> v_init[v_init == v_np[k:]]
array(['a', 'c', 'e'])
The idea is I store the begining of the array without the last k elements (I will not need then as there are less than k elements after them). I store the array as I will need it twice.
>>> v_init = v_np[:-k]
>>> v_init
array(['a', 'c', 'd', 'e', 'a', 'c'])
Then I use the second part of the array without the first k elements:
>>> v_np[k:]
array(['a', 'c', 'e', 'e', 'e', 'd'])
Then I can compare the two arrays:
>>> v_init == v_np[k:]
array([ True, True, False, True, False, False])
And finally I can get the element corresponding to the index from v_init (not from v because the length of the arrays do not match):
>>> v_init[v_init == v_np[k:]]
array(['a', 'c', 'e'])
Performances As it uses numpy (which uses C for the computations), this method will be way faster with big arrays. However, due to the conversion to a numpy array, it might be slower for small lists (not even sure, try it with your problem):
def list_comprehension(v, k):
return [v[i] for i in range(len(v)-k) if v[i] == v[i+k]]
def using_numpy(v, k):
v_np = np.array(v)
v_init = v[:-k]
return v_init[v_init == v[k:]]
>>> from timeit import timeit
>>> v = np.random.rand(100000).tolist() # I consider that the input is a simple list
>>> timeit(lambda: list_comprehension(v,k), number=1000)
7.875344538999343
>>> timeit(lambda: using_numpy(v,k), number=1000)
3.646597085000394
Original answer, for circular arrays
import numpy as np
v = np.array(['a','c','d','e','a','c','e','e','e','d'])
k = 4
v[v == np.roll(v,-k)]
Basically np.roll shifts the array by k elements, then I just compare the initial array with the shifted array, and I keep elements for which they are equal :
>>> np.roll(v, -k)
array(['a', 'c', 'e', 'e', 'e', 'd', 'a', 'c', 'd', 'e'])
>>> v == np.roll(v,-k)
array([ True, True, False, True, False, False, False, False, False,
False])
>>> v[v == np.roll(v,-k)]
array(['a', 'c', 'e'])
Note that this method will consider the array is circular, i.e. it considers the first and last values of the array as adjacent :
>>> v=np.array(['a','c','d','e','a','c','e','e','e','d', 'a'])
>>> v[v == np.roll(v,-k)]
array(['a', 'c', 'e', 'd'])
Solution 2:[2]
You can try the following code:
v=['a','c','d','e','a','c','e','e','e','d']
k=4
result = [v[i] for i in range(len(v)-k) if v[i] == v[i+k] ]
Solution 3:[3]
I don't know if this is any more or less efficient than the other suggestions but it's another option:
v = ['a', 'c', 'd', 'e', 'a', 'c', 'e', 'e', 'e', 'd']
k = 4
print([e for e, f in zip(v, v[k:]) if e == f])
Output:
['a', 'c', 'e']
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Keziya |
| Solution 3 | Albert Winestein |
