'tag occurrences of a value multiple times inn column based on dates using pandas [duplicate]
Below is my dataframe:
df = pd.DataFrame({'ID':[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
'date': ['2020-12-1', '2020-12-2', '2020-12-3', '2020-12-4',
'2020-12-10', '2020-12-11', '2020-12-12', '2020-12-13',
'2020-12-25', '2020-12-26', '2020-12-27', '2020-12-28'],
'name':['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b' , 'a', 'a', 'a', 'a']})
Looks like this:
I want the output with a column added as shown:
The last column adds a unique id for each new occurrence of value 'a' in the column 'name'.
Solution 1:[1]
Chain cumsum
with shift
df['id2'] = df.name.ne(df.name.shift()).cumsum()
df
Out[456]:
ID date name id2
0 1 2020-12-1 a 1
1 1 2020-12-2 a 1
2 1 2020-12-3 a 1
3 1 2020-12-4 a 1
4 1 2020-12-10 b 2
5 1 2020-12-11 b 2
6 1 2020-12-12 b 2
7 1 2020-12-13 b 2
8 1 2020-12-25 a 3
9 1 2020-12-26 a 3
10 1 2020-12-27 a 3
11 1 2020-12-28 a 3
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | BENY |