Pandas - Getting just the columns that changed between comparison
I have a DataFrame produced by an outer merge. Its _merge indicator column (left_only, right_only) shows which side each differing row came from.
But I want only the columns that changed, not the whole row. I'm filtering on left_only, but I need just the columns that actually changed. The ID column has to stay "locked" in place: it is the only column that must not be removed when I keep just the ones that changed.
For example:
| ID | Manufacturer | TAG | ID2 | _merge |
|---|---|---|---|---|
| 10003 | Apple | 334 | 999 | left_only |
| 10003 | Samsung | 223 | 999 | right_only |
| 10004 | Samsung | 253 | 567 | left_only |
| 10004 | Samsung | 253 | 999 | right_only |
The output should be:
| ID | Manufacturer | TAG | ID2 |
|---|---|---|---|
| 10003 | Apple | 334 | 999 |
| 10004 | | | 567 |
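One possible approach is to mask the left_only rows against the right_only rows. This is only a sketch, and it assumes that each ID has exactly one left_only row and one right_only row, and that "changed" means the left value differs from the right value for the same ID. The names merged, value_cols, left, right and changed are illustrative and not from the original post. Note that under this cell-by-cell rule ID2 for 10003 also comes out blank (NaN), because 999 appears on both sides.

```python
import pandas as pd

# Example rows copied from the merged table in the question (illustrative data).
merged = pd.DataFrame({
    "ID": [10003, 10003, 10004, 10004],
    "Manufacturer": ["Apple", "Samsung", "Samsung", "Samsung"],
    "TAG": [334, 223, 253, 253],
    "ID2": [999, 999, 567, 999],
    "_merge": ["left_only", "right_only", "left_only", "right_only"],
})

# Every column except the key and the indicator is a candidate for comparison.
value_cols = [c for c in merged.columns if c not in ("ID", "_merge")]

# Assumption: one left_only and one right_only row per ID.
left = merged[merged["_merge"] == "left_only"].set_index("ID")[value_cols]
right = merged[merged["_merge"] == "right_only"].set_index("ID")[value_cols]

# Align the right side to the left IDs; an ID missing on the right becomes NaN
# and is therefore treated as changed.
right = right.reindex(left.index)

# True where the left value differs from the right value.
changed = left.ne(right)

# Keep only changed cells (others become NaN), drop columns with no change at all,
# and bring ID back as a regular column. Integer columns turn into floats once
# NaN is introduced.
result = left.where(changed).dropna(axis=1, how="all").reset_index()
print(result)
```

If the blanks should be empty strings rather than NaN, `result.fillna("")` is one option, at the cost of mixed dtypes in those columns.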
And just for the record, I'm doing the merge like this:
df_final = (
    pd.merge(df, df2, how='outer', indicator=True)
    .query('_merge != "both"')
)
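If both source frames contain the same IDs and the same columns, DataFrame.compare (available since pandas 1.1) may be a shorter route that skips the merge. The sketch below rebuilds df and df2 from the example rows, which is an assumption about what the inputs look like. The names diff and left_changes are illustrative.

```python
import pandas as pd

# Assumed inputs, reconstructed from the left_only / right_only rows of the example.
df = pd.DataFrame({
    "ID": [10003, 10004],
    "Manufacturer": ["Apple", "Samsung"],
    "TAG": [334, 253],
    "ID2": [999, 567],
})
df2 = pd.DataFrame({
    "ID": [10003, 10004],
    "Manufacturer": ["Samsung", "Samsung"],
    "TAG": [223, 253],
    "ID2": [999, 999],
})

# compare() needs identically labelled frames, so index both by ID and sort.
a = df.set_index("ID").sort_index()
b = df2.set_index("ID").sort_index()

# The result has two-level columns: (column name, "self" or "other").
# Cells that are equal on both sides are NaN; fully unchanged rows and columns are dropped.
diff = a.compare(b)

# Keep only the left-hand ("self") values and restore ID as a regular column.
left_changes = diff.xs("self", axis=1, level=1).reset_index()
print(left_changes)
```

compare raises a ValueError when the two frames do not share the same index and columns, so this only fits when no ID exists on just one side.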
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
