Merging two datasets based on two columns, or finding the missing values in the first dataframe and filling them from the other

I have two pandas dataframes, each with two columns, namely index and date. Some of the dates are missing from the first dataframe, and those values can be obtained from the second dataframe using the corresponding index. I tried pd.concat, pd.merge, df.join, etc., but those don't seem to give me the results that I want. Here is the table.

df1 = data-frame 1

df2 = data-frame 2



Solution 1:[1]

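Have you tried df1.update(df2)? Note that update modifies df1 in place and returns None, so do not assign the result back to df1.

The update function will not increase the size of df1; it only changes values at index labels and columns that df1 already has. By default it overwrites existing values with any non-NA values from df2; pass overwrite=False to fill only the missing ones.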
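As a minimal sketch of the difference, the snippet below fills only the missing dates with overwrite=False. The sample values are made up for illustration, and it assumes the key has already been moved into the index of both frames and that the index labels are unique.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, indexed by the shared key (unique labels assumed)
df1 = pd.DataFrame({"date": ["15/05/2020", np.nan]}, index=[402, 408])
df2 = pd.DataFrame({"date": ["16/05/2020", "10/05/2020"]}, index=[402, 408])

# update() works in place and returns None, so the return value is not useful
result = df1.update(df2, overwrite=False)

print(result)
print(df1)
```

Here the existing date for 402 is kept and only the missing date for 408 is taken from df2. With the default overwrite=True, the date for 402 would be replaced by df2's value as well.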

Solution 2:[2]

You can try this solution:

import pandas as pd
import numpy as np

# initialize list of lists
df1 = [[402, '15/05/2020'], [408, np.nan], [408, '14/05/2020']]
df2 = [[402, '16/05/2020'], [408, '10/05/2020'], [409, '13/05/2020']]

# Create the pandas DataFrame
df1 = pd.DataFrame(df1, columns=['index', 'date'])
df2 = pd.DataFrame(df2, columns=['index', 'date'])

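# Use the shared key as the index of both frames
df1.set_index("index", inplace=True)
df2.set_index("index", inplace=True)

# Assigning to the row yielded by iterrows() may only change a copy,
# so fill the missing dates with a boolean mask instead
missing = df1["date"].isna()
df1.loc[missing, "date"] = df2["date"].reindex(df1.index[missing]).to_numpy()

print(df1)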

Output:

             date
index            
402    15/05/2020
408    10/05/2020
408    14/05/2020

With this solution, only the rows whose date is NaN or null are updated with the corresponding value from the other dataframe.
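The same fill can also be sketched without setting the index, which sidesteps the repeated 408 key in df1. This is one possible variant, not the answer's own method: it assumes each key appears only once in df2, and the name lookup is introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"index": [402, 408, 408],
                    "date": ["15/05/2020", np.nan, "14/05/2020"]})
df2 = pd.DataFrame({"index": [402, 408, 409],
                    "date": ["16/05/2020", "10/05/2020", "13/05/2020"]})

# Build a key -> date lookup from df2 (assumes keys in df2 are unique)
lookup = df2.set_index("index")["date"]

# Look up df2's date for every row of df1, then use it only where df1 is missing
df1["date"] = df1["date"].fillna(df1["index"].map(lookup))

print(df1)
```

Because fillna only touches missing entries, the dates that df1 already had are left unchanged, and keys that exist only in df2 (409 here) are not added to df1.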

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ralph Angelo Almoneda
Solution 2 claudia