Compare two dataframes and write the differences to another dataframe
I have two dataframes, df1 and df2:
import pandas as pd
df1 = pd.DataFrame({
'Buyer': ['Carl', 'Alex', 'Lauren'],
'Quantity': [18, 3, 8]})
df2 = pd.DataFrame({
'Buyer': ['Carl', 'Alex', 'Maya', 'Emily'],
'Quantity': [18, 3, 5, 5]})
I was wondering if there is a way to compare df1 with df2 and append whatever is in df1 but not in df2 to df2, so that I end up with a result like this:
df2 = pd.DataFrame({
'Buyer': ['Carl', 'Alex', 'Maya', 'Emily', 'Lauren'],
'Quantity': [18, 3, 5, 5, 8]})
Solution 1:[1]
You just need to concat the two dfs and then drop the duplicate rows. The original index labels are kept, so the label 2 appears twice in the result:
df2 = pd.concat([df1,df2]).drop_duplicates()
>>> df2
Buyer Quantity
0 Carl 18
1 Alex 3
2 Lauren 8
2 Maya 5
3 Emily 5
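One thing worth knowing about this approach: drop_duplicates compares whole rows by default, so a buyer who appears in both frames with different quantities is kept twice. The sketch below assumes a hypothetical df1_changed in which Carl's quantity differs (this case is not in the question) and shows how the subset and keep parameters could be used if Buyer alone should decide what counts as a duplicate.

```python
import pandas as pd

df2 = pd.DataFrame({
    'Buyer': ['Carl', 'Alex', 'Maya', 'Emily'],
    'Quantity': [18, 3, 5, 5]})

# Hypothetical variant of df1: Carl's quantity differs from the one in df2.
df1_changed = pd.DataFrame({
    'Buyer': ['Carl', 'Alex', 'Lauren'],
    'Quantity': [20, 3, 8]})

# df2 goes first so that its rows win when duplicates are dropped.
combined = pd.concat([df2, df1_changed], ignore_index=True)

# Default: whole rows are compared, so both Carl rows survive.
by_row = combined.drop_duplicates()

# Compare on Buyer only and keep the first occurrence (the df2 row).
by_buyer = combined.drop_duplicates(subset='Buyer', keep='first')
by_buyer = by_buyer.reset_index(drop=True)

print(by_row)
print(by_buyer)
```

Whether the df2 row or the df1 row should win is a design choice; keep='last' would prefer the df1 values instead.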
Solution 2:[2]
Combine the two DataFrames as you said, then drop the duplicates:
import pandas as pd
df1 = pd.DataFrame({
'Buyer': ['Carl', 'Alex', 'Lauren'],
'Quantity': [18, 3, 8]})
df2 = pd.DataFrame({
'Buyer': ['Carl', 'Alex', 'Maya', 'Emily'],
'Quantity': [18, 3, 5, 5]})
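# DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
# df2 goes first so its rows keep their positions in the result.
df = pd.concat([df2, df1])
# Drop repeated rows, then renumber the index from 0.
df = df.drop_duplicates().reset_index(drop=True)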
print(df)
output:
Buyer Quantity
0 Carl 18
1 Alex 3
2 Maya 5
3 Emily 5
4 Lauren 8
Solution 3:[3]
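This can also be achieved with an outer merge on both columns. Note that the row order may depend on the pandas version: recent releases sort the keys of an outer merge, so the rows may not come out in the order shown.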
res = pd.merge(df2,df1,on=['Buyer','Quantity'],how='outer')
output:
Buyer Quantity
0 Carl 18
1 Alex 3
2 Maya 5
3 Emily 5
4 Lauren 8
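If you also want the differences themselves as a separate dataframe, as the title suggests, merge's indicator parameter can tag where each row came from. The following is a minimal sketch, not the answerer's method; the names marked, only_in_df1 and result are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Buyer': ['Carl', 'Alex', 'Lauren'],
    'Quantity': [18, 3, 8]})
df2 = pd.DataFrame({
    'Buyer': ['Carl', 'Alex', 'Maya', 'Emily'],
    'Quantity': [18, 3, 5, 5]})

# Left-join df1 against df2 on all shared columns. indicator=True adds a
# _merge column saying whether each df1 row was also found in df2.
marked = df1.merge(df2, how='left', indicator=True)

# Rows tagged 'left_only' exist in df1 but not in df2.
only_in_df1 = marked.loc[marked['_merge'] == 'left_only'].drop(columns='_merge')

# Append just those rows to df2 and renumber the index.
result = pd.concat([df2, only_in_df1], ignore_index=True)

print(only_in_df1)
print(result)
```

Because only the missing rows are appended, df2 keeps its original row order and any duplicates it already contained.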
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Boskosnitch |
| Solution 2 | Simon |
| Solution 3 |
