Merge two DataFrames on the index, filling with NaN where one DataFrame is missing an index value
I want to merge two DataFrames on the index. If one of the DataFrames is missing an index value, the merged DataFrame should contain null (NaN) values in the columns that come from that DataFrame.
import pandas as pd
dict1 = {
'Short Name': ['SOO','BS', 'SOC'],
'File': ['r1','r2','r3'],
'acc1': ['321','321','321']
}
dict2 = {
'Short Name': ['S00','SOC'],
'File': ['r1','r2'],
'acc2': ['123','123']
}
df1 = pd.DataFrame(dict1)
df1.set_index('Short Name', inplace=True)
df1
df2 = pd.DataFrame(dict2)
df2.set_index('Short Name', inplace=True)
df2
new_df = pd.merge(df1,df2, on='Short Name')
The output that I'm trying to achieve is something that looks like this:
           File_x acc1 File_y acc2
Short Name
SOO            r1  321     r1  123
BS             r2  321    NaN  NaN
SOC            r3  321     r2  123
[DataFrame of dict1][1]
[DataFrame of dict2][2]

[1]: https://i.stack.imgur.com/u5g0y.png
[2]: https://i.stack.imgur.com/AwenX.png
Solution 1:[1]
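Try join with how='left'. It keeps every index value of df1 and fills the columns coming from df2 with NaN wherever df2 has no matching index value: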
out = df1.join(df2,lsuffix='_x',rsuffix='_y',how='left')
Out[934]:
           File_x acc1 File_y acc2
Short Name
SOO            r1  321    NaN  NaN
BS             r2  321    NaN  NaN
SOC            r3  321     r2  123
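Note that the SOO row comes out as NaN in the output above because dict2 in the question spells the key 'S00' (with zeros) while dict1 uses 'SOO' (with the letter O), so the two index values never match. The sketch below assumes that 'S00' was a typo for 'SOO'. With that one change, the left join appears to give the output the question asks for. The variable names left and merged are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Short Name': ['SOO', 'BS', 'SOC'],
    'File': ['r1', 'r2', 'r3'],
    'acc1': ['321', '321', '321'],
}).set_index('Short Name')

# Assumption: 'S00' (zeros) in the question was meant to be 'SOO' (letter O).
df2 = pd.DataFrame({
    'Short Name': ['SOO', 'SOC'],
    'File': ['r1', 'r2'],
    'acc2': ['123', '123'],
}).set_index('Short Name')

# Left join on the index: every row of df1 is kept, and the columns
# from df2 are NaN where df2 has no matching index value ('BS' here).
left = df1.join(df2, lsuffix='_x', rsuffix='_y', how='left')

# The same join written with pd.merge; the default suffixes are '_x' and '_y'.
merged = pd.merge(df1, df2, left_index=True, right_index=True, how='left')

print(left)
```

If index values that exist only in df2 should be kept as well, how='outer' can be used in place of how='left'.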
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | BENY |
