'How to concat multiple df side by side with common column?
I have N number of DataFrames. Every DataFrame has two columns, the first column is always constant with a date and another column name is dynamic. So for example
df1
date A
2017-12-31 5
2020-12-31 7
2021-12-31 8
df2
date B
2012-12-31 9
2020-12-31 6
and so on. So I want to have a resultant result_df which will have all the date along with the columns side-by-side like this
result_df
date A B ...
2012-12-31 - 9 ...
2017-12-31 5 - ...
2020-12-31 7 6 ...
2021-12-31 8 - ...
Special case - I have few empty DataFrames too. I would like that to merge into result_df, for example an empty DataFrame with columns ['date', 'C'] will be added like this
result_df
date A B C ...
2012-12-31 - 9 - ...
2017-12-31 5 - - ...
2020-12-31 7 6 - ...
2021-12-31 8 - - ...
Solution 1:[1]
We could concat + groupby + first:
dfX = pd.DataFrame(columns=['date','C'])
dfs = [df1, df2, dfX]
out = pd.concat(dfs).groupby('date', as_index=False).first().fillna('-')
Output:
date A B C
0 2012-12-31 - 9.0 -
1 2017-12-31 5.0 - -
2 2020-12-31 7.0 6.0 -
3 2021-12-31 8.0 - -
Solution 2:[2]
You might want to look at pd.merge: https://pandas.pydata.org/docs/reference/api/pandas.merge.html in your case the following code should work
df1.merge(df2, on='date', how='outer')
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | vjspranav |
