How to move ALL duplicated rows into separate dataframe
My code removes all duplicates using drop_duplicates with keep=False.
The issue I'm having is that before I remove the duplicates, I want to move all of the removed rows to a separate dataframe. I've come up with the line of code below, but I think it's leaving one duplicate behind and not capturing ALL duplicates.
duplicates_df = combined_df.loc[combined_df.duplicated(subset='Unique_ID_Count'), :]
combined_df.drop_duplicates(subset='Unique_ID_Count', inplace=True, keep=False)
Do you have any ideas on how I can move all duplicates dropped in the second line of code to the duplicates_df dataframe?
Any help would be much appreciated, thanks!
Solution 1:[1]
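duplicated defaults to keep='first', which leaves the first occurrence of each value unmarked, so that row never reaches duplicates_df. Pass keep=False so every duplicated row is flagged, then use the negated mask for the rows to keep. Try this: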
duplicates_df = combined_df.loc[combined_df.duplicated(subset='Unique_ID_Count', keep=False)]
combined_df = combined_df.loc[~combined_df.duplicated(subset='Unique_ID_Count', keep=False)]
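To see the difference between the two masks, here is a minimal sketch on a made-up frame. The sample values and the names default_mask, all_mask and unique_df are assumptions for illustration only; just combined_df, duplicates_df and Unique_ID_Count come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
combined_df = pd.DataFrame({
    "Unique_ID_Count": [1, 2, 2, 3, 3, 3, 4],
    "value": list("abcdefg"),
})

# Default keep='first': the first occurrence of each value is NOT flagged.
default_mask = combined_df.duplicated(subset="Unique_ID_Count")

# keep=False: every row whose value appears more than once is flagged.
all_mask = combined_df.duplicated(subset="Unique_ID_Count", keep=False)

# Split the frame into the duplicated rows and the remaining unique rows.
duplicates_df = combined_df.loc[all_mask]
unique_df = combined_df.loc[~all_mask]

print(default_mask.sum(), all_mask.sum())
print(duplicates_df)
print(unique_df)
```

Computing the mask once and reusing it for both selections keeps the two frames complementary, so every original row ends up in exactly one of them.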
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | richardec |
