pandas drop_duplicates with a condition on the values of two other columns
I have a dataframe with columns A, B and C. Column A contains duplicates. Column B holds either an email value or NaN. Column C holds either the string 'wait' or a number. For each duplicated value in A, I would like to keep the non-NaN value in B and the non-'wait' value in C (i.e. the number). How can I do that on a dataframe df? I have tried df.drop_duplicates('A'), but I don't see any way to put conditions on the other columns.
Edit: sample data:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': [1, 1, 2, 2, 3, 3], 'B': ['[email protected]', np.nan, np.nan, '[email protected]', 'np.nan', np.nan], 'C': [123, 456, 567, 'wait', 'wait', 'wait']})
>>> df
A B C
0 1 [email protected] 123
1 1 NaN 456
2 2 NaN 567
3 2 [email protected] wait
4 3 np.nan wait
5 3 NaN wait
I would like the resulting dataframe to be:
>>> df
A B C
0 1 [email protected] 123
1 2 [email protected] 567
2 3 np.nan wait
Thank you. Best,
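One possible approach, sketched here rather than taken from an accepted answer: the desired output for A == 2 combines B from one row with C from another, so plain row filtering with drop_duplicates cannot produce it. Assuming that the goal is "first non-NaN B and first non-'wait' C per value of A, falling back to 'wait' when no number exists", the 'wait' entries can be masked to NaN and GroupBy.first() can pick the first non-null value in each column. The names masked and result are illustrative, and the email strings are the placeholders shown in the sample data.

```python
import numpy as np
import pandas as pd

# Sample data as given in the question (email strings are placeholders).
df = pd.DataFrame({
    'A': [1, 1, 2, 2, 3, 3],
    'B': ['[email protected]', np.nan, np.nan, '[email protected]', 'np.nan', np.nan],
    'C': [123, 456, 567, 'wait', 'wait', 'wait'],
})

# Treat 'wait' as missing so it is skipped the same way NaN is.
masked = df.assign(C=df['C'].where(df['C'] != 'wait'))

# GroupBy.first() takes the first non-null value of each column per group,
# so B and C may come from different rows of the same group.
result = masked.groupby('A', as_index=False).first()

# Groups that had no numeric C at all fall back to 'wait'.
result['C'] = result['C'].fillna('wait')

print(result)
```

Note that the string 'np.nan' in row 4 is an ordinary string, not a missing value, so it is kept as the B value for A == 3, which matches the desired output above.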
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
