How to concat pandas dataframes so that they are sorted by year?
Say I have two dataframes.
One simply contains an index and a year, such as:
| (index) | Year |
|---|---|
| 1 | 2000 |
| 2 | 2001 |
| 3 | 2002 |
| 4 | 2003 |
Then I have a dataframe that consists of an index, a year, and some other data point, such as:
| (index) | Year | data |
|---|---|---|
| 1 | 2001 | 1.515 |
| 2 | 2003 | 2.631 |
How do I join them so that I only transfer over the relevant 'data' column and it properly aligns with the years 2001 and 2003 in the first dataframe? Of course, I will be using this method to import many more columns, e.g.:
| (index) | Year | data | different data |
|---|---|---|---|
| 1 | 2000 | potato | |
| 2 | 2001 | 1.515 | |
| 3 | 2002 | pickle | |
| 4 | 2003 | 2.631 | |
Solution 1:[1]
Do a left merge:
>>> df = df.merge(df2, how='left')
>>> df
   Year   data
0  2000    NaN
1  2001  1.515
2  2002    NaN
3  2003  2.631
# Optional: replace NaN with empty strings for display
>>> df = df.fillna('')
>>> df
   Year   data
0  2000
1  2001  1.515
2  2002
3  2003  2.631
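Since the question mentions importing many more columns, the same left merge can be repeated over a list of frames. The following is only a sketch under assumptions: the `other` frame and its values are hypothetical (the question gives no values for "different data"), and each frame is assumed to hold at most one row per year, otherwise a left merge would duplicate rows.

```python
from functools import reduce

import pandas as pd

# Frames reconstructed from the tables in the question.
base = pd.DataFrame({"Year": [2000, 2001, 2002, 2003]})
data = pd.DataFrame({"Year": [2001, 2003], "data": [1.515, 2.631]})
# Hypothetical second source; these values are made up for illustration.
other = pd.DataFrame({"Year": [2000, 2002], "different data": [10.0, 20.0]})

# Left-merge each frame onto the accumulated result, keyed explicitly on "Year".
# how="left" keeps every row of `base` and leaves NaN where a year has no match.
combined = reduce(
    lambda acc, frame: acc.merge(frame, how="left", on="Year"),
    [data, other],
    base,
)
print(combined)
```

Passing `on="Year"` names the join key instead of relying on whichever column names the frames happen to share, which matters once several frames with overlapping column names are involved.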
Solution 2:[2]
A possible solution is the following:
import pandas as pd
data1 = {"Year": [2000, 2001, 2002, 2003]}
data2 = {"Year": [2001, 2003], "data": [1.515, 2.631]}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df = pd.merge(df1, df2, how="outer", on="Year")
df = df.fillna("")
df
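Returns the same frame as in Solution 1: one row per year from 2000 to 2003, with `data` set for 2001 and 2003 and empty strings elsewhere.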
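One difference between the two solutions is worth checking: `how="outer"` and `how="left"` give the same rows here only because every year in `df2` also appears in `df1`. A small sketch of where they diverge, using a hypothetical frame `extra` that contains a year (2005) missing from the first frame:

```python
import pandas as pd

df1 = pd.DataFrame({"Year": [2000, 2001, 2002, 2003]})
# Hypothetical frame: 2005 does not exist in df1, and 9.999 is a made-up value.
extra = pd.DataFrame({"Year": [2001, 2005], "data": [1.515, 9.999]})

# A left merge keeps exactly the years of df1.
left = pd.merge(df1, extra, how="left", on="Year")
# An outer merge keeps the union of years from both frames.
outer = pd.merge(df1, extra, how="outer", on="Year")

print(len(left))   # 4 rows: only the years in df1
print(len(outer))  # 5 rows: 2005 is included as well
```

If the first frame is meant to define the full set of years, `how="left"` is probably the safer choice.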
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | richardec |
| Solution 2 | gremur |

