Numpy sum of 2D array along axis=1, floating range
I would like to sum a 2D array over the second axis, but over a range that varies from row to row. The non-vectorised version is:
import numpy as np
nx = 3
ny = 5
a = np.ones((nx, ny))
left_bnd = np.array([0, 1, 0])
right_bnd = np.array([2, 2, 4])
b = np.zeros(nx)
for jx in range(nx):
    b[jx] = np.sum(a[jx, left_bnd[jx]: right_bnd[jx]])
print(b)
The output, b, is [2. 1. 4.]. I'd love to vectorise the loop, with something like
b = np.sum(a[:, left_bnd[:]: right_bnd[:]], axis=1)
to speed up the calculation, because my "n" is typically a few 1e6. Unfortunately, this is not valid syntax and I cannot find a form that works.
Solution 1:[1]
A jitted numba implementation with manual summation in a for loop is roughly 100x faster than NumPy inside a Python loop (297 µs versus 29.2 ms in the timings below). Using np.sum with slicing inside the numba function was only half as fast as the manual summation. This solution assumes that all slices are within valid bounds.
Generation of sufficiently large sample data for benchmarking
import numpy as np
import numba as nb
np.random.seed(42) # just for reproducibility
n, m = 5000, 100
a = np.random.rand(n,m)
bnd_l, bnd_r = np.sort(np.random.randint(m+1, size=(n,2))).T
Jitted with numba. The first call includes JIT compilation, so run the function at least twice and benchmark only the already compiled (hot) code.
@nb.njit
def slice_sum(a, bnd_l, bnd_r):
    b = np.zeros(a.shape[0])
    for j in range(a.shape[0]):
        for i in range(bnd_l[j], bnd_r[j]):
            b[j] += a[j,i]
    return b
slice_sum(a, bnd_l, bnd_r)
Output
# %timeit 1000 loops, best of 5: 297 µs per loop
array([ 4.31060848, 35.90684722, 38.03820523, ..., 37.9578962 ,
3.61011028, 6.53631388])
With NumPy inside a Python loop (a nice, simple implementation, used here as the baseline)
b = np.zeros(n)
for j in range(n):
    b[j] = np.sum(a[ j, bnd_l[j] : bnd_r[j] ])
b
Output
# %timeit 10 loops, best of 5: 29.2 ms per loop
array([ 4.31060848, 35.90684722, 38.03820523, ..., 37.9578962 ,
3.61011028, 6.53631388])
To verify the results are equal
np.testing.assert_allclose(slice_sum(a, bnd_l, bnd_r), b)
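The variant mentioned at the start of this answer, which calls np.sum on a slice inside the jitted function, is not shown above. A minimal sketch of what it might look like follows. The name `slice_sum_npsum` is hypothetical, the small arrays are the ones from the question, and the fallback decorator is an assumption added only so the snippet also runs where numba is not installed.

```python
import numpy as np

try:
    import numba as nb
    njit = nb.njit
except ImportError:
    # Assumption: without numba, fall back to plain Python so the sketch still runs
    def njit(func):
        return func


@njit
def slice_sum_npsum(a, bnd_l, bnd_r):
    # Hypothetical name; loops over rows but lets np.sum handle each slice
    b = np.zeros(a.shape[0])
    for j in range(a.shape[0]):
        b[j] = np.sum(a[j, bnd_l[j]:bnd_r[j]])
    return b


# Small arrays from the question
a = np.ones((3, 5))
left_bnd = np.array([0, 1, 0])
right_bnd = np.array([2, 2, 4])
result = slice_sum_npsum(a, left_bnd, right_bnd)
print(result)  # [2. 1. 4.]
```

As with slice_sum, this sketch assumes every slice lies within valid bounds.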
Solution 2:[2]
Here's a pure numpy solution that gets close to the speed of the posted numba solution. It leverages np.add.reduceat on the flattened array, but the setup is quite convoluted.
def slice_sum_np(a, left_bnd, right_bnd):
    nx, ny = a.shape
    linear_indices = np.c_[left_bnd, right_bnd] + ny * np.arange(nx)[:,None]
    sums = np.add.reduceat(a.ravel(), linear_indices.ravel())[::2]
    # account for reduceat special case
    sums[left_bnd >= right_bnd] = 0
    return sums
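To make the setup less opaque, here is a hedged walk-through of the intermediate values, using the small arrays from the question. The second part is an assumption on my side rather than part of the answer: as far as I can tell, reduceat rejects an index equal to the array length, which can happen when the last row's right bound equals ny, and padding the flattened array with a single zero is one possible way around that.

```python
import numpy as np

# Small arrays from the question
a = np.ones((3, 5))
left_bnd = np.array([0, 1, 0])
right_bnd = np.array([2, 2, 4])
nx, ny = a.shape

# Row j starts at offset j * ny in the flattened array
linear_indices = np.c_[left_bnd, right_bnd] + ny * np.arange(nx)[:, None]
flat_idx = linear_indices.ravel()
print(flat_idx)  # [ 0  2  6  7 10 14]

# reduceat sums flat[idx[i]:idx[i+1]] for each i. The even positions are the
# wanted left:right ranges, the odd positions are the gaps between them.
all_sums = np.add.reduceat(a.ravel(), flat_idx)
print(all_sums)  # [2. 4. 1. 3. 4. 1.]
sums = all_sums[::2]
print(sums)  # [2. 1. 4.]

# Assumption: the last row's right bound equals ny, so the largest linear index
# is a.size. Padding the flat array with one zero keeps that index in range.
right_full = np.array([2, 2, 5])
idx_full = (np.c_[left_bnd, right_full] + ny * np.arange(nx)[:, None]).ravel()
padded = np.append(a.ravel(), 0.0)
sums_full = np.add.reduceat(padded, idx_full)[::2]
print(sums_full)  # [2. 1. 5.]
```

The empty-range correction from slice_sum_np (setting sums to 0 where left_bnd >= right_bnd) would still be needed on top of the padding; it is left out here because none of the example ranges are empty.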
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | user7138814 |
