Pandas - How to keep rows in a Dataframe that are different when comparing with another Dataframe
I have two Dataframes:
Dataframe 1:

Dataframe 2:

Notice that Dataframe 2 has some updated values. I want to create a new Dataframe that contains only the rows with updated values, no matter which column was updated.
Desired Dataframe:

Solution 1:[1]
Let's make sure we have the right dataframes:
In [270]: df1
Out[270]:
     a   b
0  aaa   5
1  bbb   4
2  ccc   7
3  ddd   9
4  eee  11
In [271]: df2
Out[271]:
       a    b
0  aaaaa    5
1    bbb   38
2    ccc    7
3  ddddd  104
4    eee   11
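To follow along, the two frames printed above can be rebuilt from plain dictionaries. This is only a minimal sketch: the values are copied from the output above, and the integer dtype of column b is an assumption.

```python
import pandas as pd

# Values copied from the printed frames above; dtypes are assumed.
df1 = pd.DataFrame({
    "a": ["aaa", "bbb", "ccc", "ddd", "eee"],
    "b": [5, 4, 7, 9, 11],
})
df2 = pd.DataFrame({
    "a": ["aaaaa", "bbb", "ccc", "ddddd", "eee"],
    "b": [5, 38, 7, 104, 11],
})

print(df1)
print(df2)
```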
We want df2's values, so left-join df1 onto df2, using column a as the key:
In [272]: df = df2.set_index('a').join(df1.set_index('a'), how='left', rsuffix="_r")
In [273]: df
Out[273]:
         b   b_r
a
aaaaa    5   NaN
bbb     38   4.0
ccc      7   7.0
ddddd  104   NaN
eee     11  11.0
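We only care about rows where the values differ. Rows whose a value changed have no match in df1, so their b_r is NaN; NaN never compares equal to anything, so those rows are kept as well: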
In [274]: df = df[df.b != df.b_r]
In [275]: df
Out[275]:
         b  b_r
a
aaaaa    5  NaN
bbb     38  4.0
ddddd  104  NaN
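The filter keeps the unmatched rows because an element-wise `!=` against NaN evaluates to True. A small sketch of that behaviour, using two hand-built Series that stand in for the b and b_r columns (the Series names here are illustrative, not part of the answer):

```python
import numpy as np
import pandas as pd

idx = ["aaaaa", "bbb", "ccc"]
b = pd.Series([5, 38, 7], index=idx)              # stands in for df.b
b_r = pd.Series([np.nan, 4.0, 7.0], index=idx)    # stands in for df.b_r

# NaN != anything is True, so the unmatched row is flagged as different.
differs = b != b_r
print(differs)
```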
We no longer need df1's values:
In [276]: df = df.drop(columns=['b_r'])
In [277]: df
Out[277]:
         b
a
aaaaa    5
bbb     38
ddddd  104
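The join above compares only column b and leaves a as the index. As a hedged alternative (not part of the original answer), the sketch below shows two other ways to get the changed rows with a and b both kept as columns. The helper names `changed_rows_by_merge` and `changed_rows_by_position` are hypothetical; the first relies on `DataFrame.merge` with `indicator=True`, the second assumes both frames have identical index and column labels.

```python
import pandas as pd

# Same sample data as in the answer above.
df1 = pd.DataFrame({"a": ["aaa", "bbb", "ccc", "ddd", "eee"],
                    "b": [5, 4, 7, 9, 11]})
df2 = pd.DataFrame({"a": ["aaaaa", "bbb", "ccc", "ddddd", "eee"],
                    "b": [5, 38, 7, 104, 11]})


def changed_rows_by_merge(new, old):
    """Rows of `new` with no exact match, across all shared columns, in `old`."""
    # With no `on=` argument, merge uses every column the frames have in common.
    # Dropping duplicates from `old` avoids multiplying rows of `new`.
    merged = new.merge(old.drop_duplicates(), how="left", indicator=True)
    return merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")


def changed_rows_by_position(new, old):
    """Rows of `new` that differ from the row with the same label in `old`.

    Assumes both frames share the same index and columns; pandas raises
    ValueError otherwise.
    """
    mask = (new != old).any(axis=1)
    return new.loc[mask]


print(changed_rows_by_merge(df2, df1))
print(changed_rows_by_position(df2, df1))
```

The merge variant ignores row order and treats a row as changed whenever no identical row exists in df1. The positional variant compares row by row, so it only makes sense when the two frames are aligned, and it flags a row if both frames hold NaN in the same cell, since NaN does not compare equal to itself.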
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | inspectorG4dget |
