Pandas - How to keep rows in a Dataframe that are different when comparing with another Dataframe

I have two Dataframes:

Dataframe 1:

dataframe 1

Dataframe 2:

Dataframe 2

Note that Dataframe 2 has some updated values. I want to create a new Dataframe that contains only the rows with updated values, regardless of which column was updated.

Desired Dataframe:

Desired Dataframe



Solution 1:[1]

Let's make sure we have the right dataframes:

In [270]: df1
Out[270]:
     a   b
0  aaa   5
1  bbb   4
2  ccc   7
3  ddd   9
4  eee  11

In [271]: df2
Out[271]:
       a    b
0  aaaaa    5
1    bbb   38
2    ccc    7
3  ddddd  104
4    eee   11
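To reproduce the session, the two frames can be built by hand. This is a minimal sketch that assumes the values shown in the output above; the original question only showed the data as images, so the column names `a` and `b` are the ones used in this answer, not necessarily the asker's.

```python
import pandas as pd

# Original data (values copied from the df1 output above)
df1 = pd.DataFrame({
    "a": ["aaa", "bbb", "ccc", "ddd", "eee"],
    "b": [5, 4, 7, 9, 11],
})

# Updated data (values copied from the df2 output above)
df2 = pd.DataFrame({
    "a": ["aaaaa", "bbb", "ccc", "ddddd", "eee"],
    "b": [5, 38, 7, 104, 11],
})

print(df1)
print(df2)
```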

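We want df2's values, so we left-join df1 onto df2 using column a as the key. Rows of df2 whose a value has no match in df1 get NaN in b_r: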

In [272]: df = df2.set_index('a').join(df1.set_index('a'), how='left', rsuffix="_r")

In [273]: df
Out[273]:
         b   b_r
a
aaaaa    5   NaN
bbb     38   4.0
ccc      7   7.0
ddddd  104   NaN
eee     11  11.0

We only care about the rows where the values differ. NaN never compares equal to anything, so rows with no match in df1 are kept as well:

In [274]: df = df[df.b != df.b_r]

In [275]: df
Out[275]:
         b  b_r
a
aaaaa    5  NaN
bbb     38  4.0
ddddd  104  NaN
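The filter keeps the `aaaaa` and `ddddd` rows because a comparison against NaN with `!=` is always True. A small sketch of that behaviour, using two standalone Series (the names `b` and `b_r` here are just illustrative stand-ins for the joined columns):

```python
import numpy as np
import pandas as pd

# Stand-ins for the first three rows of the joined columns
b = pd.Series([5, 38, 7])
b_r = pd.Series([np.nan, 4.0, 7.0])

# NaN != x is True for every x, so the unmatched row passes the filter
mask = b != b_r
print(mask.tolist())
```

Only the last pair (7 and 7.0) compares equal, so only that row would be dropped.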

We no longer need df1's values:

In [276]: df = df.drop(columns=['b_r'])

In [277]: df
Out[277]:
         b
a
aaaaa    5
bbb     38
ddddd  104
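The steps above can be collected into one script. This is a sketch under stated assumptions, not a definitive implementation: the helper names `changed_rows_by_join` and `changed_rows_by_position` are hypothetical, and the data is the same hand-built example as in the session. The second helper is an alternative that is not part of the answer above; it compares the frames cell by cell, so it covers every column, but it assumes both frames have identical index and column labels in the same order.

```python
import pandas as pd

df1 = pd.DataFrame({
    "a": ["aaa", "bbb", "ccc", "ddd", "eee"],
    "b": [5, 4, 7, 9, 11],
})
df2 = pd.DataFrame({
    "a": ["aaaaa", "bbb", "ccc", "ddddd", "eee"],
    "b": [5, 38, 7, 104, 11],
})


def changed_rows_by_join(old, new, key="a", value="b"):
    """Join approach from the answer: keep rows of `new` whose `value`
    differs from `old`, or whose `key` has no match in `old`."""
    joined = new.set_index(key).join(old.set_index(key), how="left", rsuffix="_r")
    # NaN != x is always True, so unmatched keys survive the filter
    changed = joined[joined[value] != joined[value + "_r"]]
    return changed.drop(columns=[value + "_r"])


def changed_rows_by_position(old, new):
    """Alternative (assumes identical index and column labels):
    keep rows of `new` where any cell differs from `old`."""
    mask = (old != new).any(axis=1)
    return new[mask]


result_join = changed_rows_by_join(df1, df2)
result_pos = changed_rows_by_position(df1, df2)

print(result_join)
print(result_pos)
```

Both helpers select the `aaaaa`, `bbb` and `ddddd` rows for this data. The join version returns them indexed by `a`, while the positional version keeps the original integer index and both columns.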

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  inspectorG4dget