Drop duplicate IDs, keeping the row where value equals a certain value, otherwise keeping the first duplicate

>>> df = pd.DataFrame({'id': ['1', '1', '2', '2', '3', '4', '4', '5', '5'],
...                    'value': ['keep', 'y', 'x', 'keep', 'x', 'Keep', 'x', 'y', 'x']})
>>> print(df)
  id value
0  1  keep
1  1     y
2  2     x
3  2  keep
4  3     x
5  4  Keep
6  4     x
7  5     y
8  5     x

In this example, the idea is to keep index values 0, 3 and 5, since each belongs to a duplicated id and has the particular value 'keep' (written 'Keep' at index 5); index 4, since id 3 has no duplicates; and index 7, since it is the first of the duplicates for id 5.



Solution 1:[1]

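In your case, try idxmax. df['value'].eq('keep') gives a boolean Series; grouped by id, idxmax returns the index label of the first True in each group, or the group's first row when nothing matches. Note that the comparison is case-sensitive, so index 5 ('Keep') is returned only because it is the first row for id 4.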

out = df.loc[df['value'].eq('keep').groupby(df.id).idxmax()]
out
Out[24]: 
  id value
0  1  keep
3  2  keep
4  3     x
5  4  Keep
7  5     y
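
If it helps to see what the one-liner is doing, here is the same approach split into steps. This is only a sketch against the sample data from the question; the intermediate names is_keep and chosen are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'id': ['1', '1', '2', '2', '3', '4', '4', '5', '5'],
                   'value': ['keep', 'y', 'x', 'keep', 'x', 'Keep', 'x', 'y', 'x']})

# Step 1: boolean mask, True where value is exactly 'keep' (case-sensitive).
is_keep = df['value'].eq('keep')

# Step 2: per id, idxmax returns the index label of the first True.
# If a group has no True at all, every entry ties at False and the
# first row of the group is returned, which is the "keep first" fallback.
chosen = is_keep.groupby(df['id']).idxmax()

# Step 3: select the chosen rows by label.
out = df.loc[chosen]
print(out)
```

One row per id comes back, in the order of the sorted group keys, because groupby sorts keys by default.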

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 BENY