'rolling most recent index where a value ocurred
I have a dataframe
pd.DataFrame([1,2,3,4,1,2,3])
0
0 1
1 2
2 3
3 4
4 1
5 2
6 3
I want to create another column, where it records the most recent index the value "1" occurred
d={'data':[1,2,3,4,1,2,3], 'desired_new_col': [0,0,0,0,4,4,4]}
pd.DataFrame(d)
data desired_new_col
0 1 0
1 2 0
2 3 0
3 4 0
4 1 4
5 2 4
6 3 4
I have some idea of using df.expand().apply(func), but not sure what would be an appropriate function to write for this.
Thanks
Solution 1:[1]
You can do cumsum with sub-group by key then we can groupby with transform idxmax
s = df['data'].eq(1)
df['out'] = s.groupby(s.cumsum())['data'].transform('idxmax')
Out[293]:
0 0
1 0
2 0
3 0
4 4
5 4
6 4
Name: data, dtype: int64
Solution 2:[2]
You can do this just by using list comprehension. :)
idx = [i for i in df.index if df[0][i] == 1][-1]
df['desired_new_col'] = [idx if idx <= df.index[i] else 0 for i in df.index]
Output:
df
0 desired_new_col
0 1 0
1 2 0
2 3 0
3 4 0
4 1 4
5 2 4
6 3 4
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 |
