Comparing ID and type day by day in pandas
I have the following df:
index, date, ID, type
0, 2022-04-04, 1, A
1, 2022-04-04, 2, B
2, 2022-04-03, 1, A
3, 2022-04-03, 2, A
I want the following df, which adds a new column ('change') that is True/False depending on whether the type of each ID changed between the current date and the date before:
index, date, ID, type, change
0, 2022-04-04, 1, A, False (False because the type didn't change from one day to the next: A-->A)
1, 2022-04-04, 2, B, True (True because the type changed from one day to the next: A-->B)
2, 2022-04-03, 1, A, --- (no information for the day before)
3, 2022-04-03, 2, A, --- (no information for the day before)
I need code that generates the new column "change" with True/False per date.
EDIT: This is my real df (150,000 IDs per date; 3 types: Enabled, Not eligible, Eligible (not enabled); and not the same number of IDs per date, because some IDs are activated/deactivated):
date,ID,type
2022-03-14,262199,Enabled
2022-03-14,267695,Enabled
2022-03-14,262708,Not eligible
2022-03-14,233,Enabled
2022-03-14,263652,Not eligible
2022-03-14,267908,Not eligible
2022-03-14,263992,Enabled
2022-03-14,267973,Not eligible
2022-03-14,1996,Enabled
2022-03-14,268216,Eligible (not enabled)
....
So it is the same idea as the simple example above, but with 3 types and a varying number of IDs per date.
Solution 1:[1]
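IIUC, use the following. It relies on the rows being ordered from newest to oldest date, as in the sample, so that shift(-1) within each ID picks up the earlier date; rows with no earlier date are filled with their own type and therefore come out as False: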
df['change'] = df['type'].ne(df.groupby('ID')['type'].shift(-1).fillna(df['type']))
print (df)
date ID type change
0 2022-04-04 1 A False
1 2022-04-04 2 B True
2 2022-04-03 1 A False
3 2022-04-03 2 A False
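If the rows are not guaranteed to be sorted, or if rows without an earlier date should stay empty (the "---" in the question) rather than False, a variant along these lines may help. This is only a sketch under stated assumptions: "the date before" is taken to mean the previous available row for that ID, not strictly the previous calendar day, and the nullable "boolean" dtype is my choice for representing the missing values, not part of the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-04-04", "2022-04-04", "2022-04-03", "2022-04-03"]),
    "ID": [1, 2, 1, 2],
    "type": ["A", "B", "A", "A"],
})

# Sort oldest to newest so shift(1) within each ID returns the
# previous observation for that ID, whatever the original row order.
ordered = df.sort_values(["ID", "date"])
prev_type = ordered.groupby("ID")["type"].shift(1)

# Compare with the previous type, then blank out rows that have
# no earlier observation (assumption: these should be missing, not False).
change = ordered["type"].ne(prev_type).astype("boolean").mask(prev_type.isna())

# Assignment aligns on the index, so df keeps its original row order.
df["change"] = change
print(df)
```

On the sample data, rows 0 and 1 get False and True, and rows 2 and 3 are left missing. Because the comparison is done per ID, a varying number of IDs per date and any number of distinct types should not need special handling.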
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jezrael |
