Is it possible to vectorize a Pandas operation involving a condition on a sliced range of data?
In this operation, an array is sliced over a range of indices.
Given the array
arr = np.array([.1, .11, .21, .01, .5, .7, .91, .92, .95, .96, .1, .21, .23, .6, .7, .71, .72, .95, 0.96, 0.97])
and a range of values,
Step 1
drange = np.arange(start_, end_)
the slicing is performed as follows:
Step 2
select_val = arr[drange]
Then select_val is compared against a threshold, th, marking the values that are below it.
Step 3
bool_data = select_val<th
Finally, np.argmin returns the index of the first False in bool_data, i.e. the first selected value that is not below th.
Step 4
doutput = np.argmin(bool_data)
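For concreteness, the four steps can be traced for a single range. The sketch below assumes start_ = 1, end_ = 12 and th = 0.9, which are the values used for the first row later in the question.

```python
import numpy as np

arr = np.array([.1, .11, .21, .01, .5, .7, .91, .92, .95, .96,
                .1, .21, .23, .6, .7, .71, .72, .95, .96, .97])
start_, end_, th = 1, 12, 0.9        # assumed example values

drange = np.arange(start_, end_)     # Step 1: indices 1..11
select_val = arr[drange]             # Step 2: equivalent to arr[1:12]
bool_data = select_val < th          # Step 3: True where the value is below th
doutput = int(np.argmin(bool_data))  # Step 4: position of the first False

print(doutput)  # prints 5, because arr[1 + 5] = 0.91 is the first value >= th
```

Note that the result is an offset relative to start_, not an index into arr. If every selected value is below th, bool_data is all True and np.argmin returns 0, which is indistinguishable from a match at the first position.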
In my case, the start_ and end_ values are stored in a Pandas DataFrame:
df = pd.DataFrame(dict(s=[1, 10], e=[12, 19]))
whereas arr is a NumPy array.
Currently, I use Pandas' apply with a function that combines steps 1-4:
def fx(arr, st, en, th):
    return np.argmin(arr[np.arange(st, en)] < th)
However, is it possible to employ a vectorization approach instead?
The code for the current approach is below:
import numpy as np
import pandas as pd

def fx(arr, st, en, th):
    # offset (relative to st) of the first value in arr[st:en] that is not below th
    return np.argmin(arr[np.arange(st, en)] < th)

th = 0.9
arr = np.array([.1, .11, .21, .01, .5, .7, .91, .92, .95,  # range 1-12: first value >= th is arr[6] = 0.91
                .96, .1, .21, .23, .6, .7, .71, .72, .95, 0.96, 0.97])  # range 10-19: first value >= th is arr[17] = 0.95
df = pd.DataFrame(dict(s=[1, 10], e=[12, 19]))
df['opt'] = df.apply(lambda x: fx(arr, x['s'], x['e'], th), axis=1)
Solution 1:[1]
NumPy broadcasting
m1 = arr[:, None] > th
ix = np.arange(len(arr))[:, None]
m2 = (ix >= list(df.s)) & (ix < list(df.e))
df['opt'] = np.argmax(m1 & m2, axis=0) - df.s
Result
    s   e  opt
0   1  12    5
1  10  19    7
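The broadcasting approach can be checked against the row-wise version. The following is a sketch under a few assumptions rather than the answer's exact code: it uses >= th so that it mirrors the question's `< th` test when a value equals th, and it maps ranges with no match to 0 to reproduce what np.argmin returns on an all-True array. The names in_range, hits, first_hit and has_hit are illustrative.

```python
import numpy as np
import pandas as pd

th = 0.9
arr = np.array([.1, .11, .21, .01, .5, .7, .91, .92, .95, .96,
                .1, .21, .23, .6, .7, .71, .72, .95, .96, .97])
df = pd.DataFrame(dict(s=[1, 10], e=[12, 19]))

s = df["s"].to_numpy()
e = df["e"].to_numpy()

ix = np.arange(len(arr))[:, None]   # shape (20, 1): one row per array index
in_range = (ix >= s) & (ix < e)     # shape (20, 2): one column per DataFrame row
above = (arr >= th)[:, None]        # shape (20, 1): value is not below th
hits = above & in_range             # shape (20, 2)

first_hit = hits.argmax(axis=0)     # absolute index of the first hit per column
has_hit = hits.any(axis=0)          # False where a range contains no hit
# convert to an offset from s; use 0 when there is no hit, like np.argmin would
opt = np.where(has_hit, first_hit - s, 0)

# row-wise reference, equivalent to the question's apply
expected = df.apply(
    lambda r: np.argmin(arr[r["s"]:r["e"]] < th), axis=1).to_numpy()

print(opt, expected)
```

The intermediate masks have len(arr) × len(df) elements, so this trades memory for speed; for very long arrays or many rows that cost may outweigh the gain over apply.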
Solution 2:[2]
Another alternative, although not vectorized, slices the array directly:
df['opt'] = df.apply(
    lambda x: np.argmin(arr[x['s']:x['e']] < th), axis=1)
However, a potential issue is that exceptions are difficult to handle in this form.
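One way to make the failure cases explicit is to move the lambda body into a named function. The sketch below assumes the problematic cases are an empty range (np.argmin raises ValueError on an empty array) and a range in which no value reaches th (np.argmin silently returns 0). The helper name first_at_or_above and the sentinel value -1 are hypothetical choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

def first_at_or_above(arr, st, en, th, missing=-1):
    """Offset from st of the first value in arr[st:en] that is >= th.

    Returns `missing` when the range is empty or no value reaches th.
    """
    window = arr[st:en]
    below = window < th
    if window.size == 0 or below.all():
        return missing
    return int(np.argmin(below))

th = 0.9
arr = np.array([.1, .11, .21, .01, .5, .7, .91, .92, .95, .96,
                .1, .21, .23, .6, .7, .71, .72, .95, .96, .97])
# the last two rows are added here to exercise the failure cases
df = pd.DataFrame(dict(s=[1, 10, 0, 5], e=[12, 19, 5, 5]))
df["opt"] = [first_at_or_above(arr, s, e, th)
             for s, e in zip(df["s"], df["e"])]

print(df["opt"].tolist())  # [5, 7, -1, -1]
```

A list comprehension over zip is used instead of apply(axis=1) only to keep the example short; the same function can be passed to apply.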
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Peter Mortensen |
| Solution 2 | |
