Set message for duplicate value pandas
I have a DataFrame like this:
A B
0 1 ABC
1 2 XYX
2 1 RTC
3 3 fds
4 2 rtv
5 4 rtoc
and I want output like this:
A B message
0 1 ABC
1 2 XYX
2 1 RTC Duplicated
3 3 fds
4 2 rtv Duplicated
5 4 rtoc
If a value in column A is duplicated, set the message "Duplicated" on its second occurrence.
Solution 1:[1]
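All duplicates (every occurrence after the first):
Use numpy.where with Series.duplicated, which by default (keep='first') flags every occurrence except the first. This needs import numpy as np: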
df['message'] = np.where(df['A'].duplicated(), 'Duplicated', '')
output:
A B message
0 1 ABC
1 2 XYX
2 1 RTC Duplicated
3 3 fds
4 2 rtv Duplicated
5 4 rtoc
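For reference, here is a self-contained sketch of the approach above. It assumes the sample DataFrame from the question is built by hand; the construction code is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "A": [1, 2, 1, 3, 2, 4],
    "B": ["ABC", "XYX", "RTC", "fds", "rtv", "rtoc"],
})

# duplicated() uses keep='first' by default: the first occurrence of each
# value is False, every later occurrence is True
df["message"] = np.where(df["A"].duplicated(), "Duplicated", "")

print(df)
```

In this sample no value appears more than twice, so flagging every later occurrence and flagging only the second occurrence give the same result.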
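Only the SECOND occurrence:
Use groupby + cumcount, which numbers the rows within each group of A starting from 0, so the second occurrence is the row numbered 1: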
df['message'] = np.where(df.groupby('A').cumcount().eq(1), 'Duplicated', '')
example:
A B message
0 1 first occurrence
1 2 XYX
2 1 second occurrence Duplicated
3 1 third occurrence
4 3 fds
5 2 rtv Duplicated
6 4 rtoc
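The example above can be reproduced with a short sketch like the following. The DataFrame is rebuilt by hand from the example table, and the intermediate name occurrence is only illustrative.

```python
import numpy as np
import pandas as pd

# Data reconstructed from the example table (value 1 appears three times)
df = pd.DataFrame({
    "A": [1, 2, 1, 1, 3, 2, 4],
    "B": ["first occurrence", "XYX", "second occurrence",
          "third occurrence", "fds", "rtv", "rtoc"],
})

# cumcount() numbers rows within each group of A: 0 for the first
# occurrence, 1 for the second, 2 for the third, and so on
occurrence = df.groupby("A").cumcount()

# Flag only the rows that are the second occurrence of their value
df["message"] = np.where(occurrence.eq(1), "Duplicated", "")

print(df)
```

Unlike duplicated(), this leaves the third occurrence of 1 unflagged, since its within-group counter is 2.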
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
