Concatenate two dataframes and drop duplicates in Pandas

df2 only has data for the year 2019:

  type  year  value
0    a  2019     13
1    b  2019      5
2    c  2019      5
3    d  2019     20

df1 has data for multiple years:

  type  year  value
0    a  2015     12
1    a  2016      2
2    a  2019      3
3    b  2018     50
4    b  2019     10
5    c  2017      1
6    c  2016      5
7    c  2019      8

I need to concatenate them while replacing df1's 2019 values with the values from df2 for the same year.

The expected result looks like this:

  type  date  value
0    a  2015     12
1    a  2016      2
2    b  2018     50
3    c  2017      1
4    c  2016      5
5    a  2019     13
6    b  2019      5
7    c  2019      5
8    d  2019     20

This is the result from pd.concat([df1, df2], ignore_index=True, sort=False), which clearly has multiple values in 2019 for a single type. How should I improve the code? Thank you.

   type  date  value
0     a  2019     13
1     b  2019      5
2     c  2019      5
3     d  2019     20
4     a  2015     12
5     a  2016      2
6     a  2019      3
7     b  2018     50
8     b  2019     10
9     c  2017      1
10    c  2016      5
11    c  2019      8


Solution 1:[1]

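Chain DataFrame.drop_duplicates after concat to keep only the last row for each type and date pair. Because df2 is passed last to concat, keep='last' retains its rows over the matching rows from df1.

This solution works if type and date pairs are unique within each DataFrame.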

df = (pd.concat([df1, df2], ignore_index=True, sort=False)
        .drop_duplicates(['type','date'], keep='last'))
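For anyone who wants to try this end to end, here is a minimal sketch built on the sample data from the question. It assumes the key column is named 'year', as in the input tables (swap in 'date' if that is what your frames use). The `keys` and `result` names and the final reset_index call are my additions for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question. The key column is assumed to be
# named 'year' here; use 'date' instead if that matches your frames.
df1 = pd.DataFrame({
    "type":  ["a", "a", "a", "b", "b", "c", "c", "c"],
    "year":  [2015, 2016, 2019, 2018, 2019, 2017, 2016, 2019],
    "value": [12, 2, 3, 50, 10, 1, 5, 8],
})
df2 = pd.DataFrame({
    "type":  ["a", "b", "c", "d"],
    "year":  [2019, 2019, 2019, 2019],
    "value": [13, 5, 5, 20],
})

keys = ["type", "year"]

# Precondition from the answer: each (type, year) pair appears
# at most once inside each frame.
assert not df1.duplicated(keys).any()
assert not df2.duplicated(keys).any()

result = (
    pd.concat([df1, df2], ignore_index=True, sort=False)
      # df2 rows come last, so keep="last" prefers them on a key clash
      .drop_duplicates(keys, keep="last")
      # optional: renumber the index 0..n-1 after rows were dropped
      .reset_index(drop=True)
)
print(result)
```

With this data the result has 9 rows, and the 2019 rows for a, b and c carry df2's values (13, 5, 5). Without reset_index the surviving rows keep their index labels from the concat, which leaves gaps.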

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 jezrael