'Mapping the values of a dataframe on another dataframe

I have two dataframes, with the same indexes and column names, as follows:

df1:

              col1   col2   col3    col4   col5   col6   col7   col8   col9   col10   col11   col12   
2020-03-31    nan    nan    nan     nan    nan    nan    nan    nan    nan    nan     nan     nan
2020-04-01    nan    nan    nan     nan    nan    nan    nan    nan    nan    nan     1.0     1.0   
2020-04-02    nan    nan    nan     nan    nan    nan    nan    nan    nan    nan     1.0     1.0   
2020-04-03    nan    nan    nan     nan    nan    nan    nan    nan    nan    nan     1.0     nan

df2:

              col1   col2   col3    col4   col5   col6   col7   col8   col9   col10   col11   col12   
2020-03-31    3.42   3.12   3.42    3.40   3.74   3.13   3.85   3.03   3.03   3.02    3.33    3.44   
2020-04-01    3.44   3.53   3.48    3.44   3.22   3.52   3.66   3.21   3.14   3.01    3.14    3.20  
2020-04-02    3.23   3.34   3.13    3.42   3.32   3.63   3.22   3.63   3.24   3.05    3.29    3.34  
2020-04-03    3.49   3.53   3.57    3.48   3.49   3.17   3.83   3.31   3.42   3.10    3.21    3.40   

So, what I want to do is create some sort of mapper that detects 1s in df1 and replaces the values in the df2 with nan in the exact spot (index and column). I'm looking for something like this:

result:

              col1   col2   col3    col4   col5   col6   col7   col8   col9   col10   col11   col12   
2020-03-31    3.42   3.12   3.42    3.40   3.74   3.13   3.85   3.03   3.03   3.02    3.33    3.44   
2020-04-01    3.44   3.53   3.48    3.44   3.22   3.52   3.66   3.21   3.14   3.01    nan     nan   
2020-04-02    3.23   3.34   3.13    3.42   3.32   3.63   3.22   3.63   3.24   3.05    nan     nan   
2020-04-03    3.49   3.53   3.57    3.48   3.49   3.17   3.83   3.31   3.42   3.10    nan     3.40   

is there a fast way to do that without going through columns or indexes ?



Solution 1:[1]

Since your two dataframes have the same indexes and column names, you can try df.mask

df2 = df2.mask(df1.eq(1), pd.NA)
print(df2)

            col1  col2  col3  col4  col5  col6  col7  col8  col9  col10  col11  col12
2020-03-31  3.42  3.12  3.42  3.40  3.74  3.13  3.85  3.03  3.03   3.02   3.33   3.44
2020-04-01  3.44  3.53  3.48  3.44  3.22  3.52  3.66  3.21  3.14   3.01    NaN    NaN
2020-04-02  3.23  3.34  3.13  3.42  3.32  3.63  3.22  3.63  3.24   3.05    NaN    NaN
2020-04-03  3.49  3.53  3.57  3.48  3.49  3.17  3.83  3.31  3.42   3.10    NaN   3.40

If your df1 lacks some indexes, say 2020-03-31, you can try df.update. The idea is to update the df2 with df1 not-nan value then mask the updated 1 in df2 to nan.

m = df2.ne(1) & df1.eq(1)
df2.update(df1)
df2 = df2.mask(m, pd.NA)
print(df2)

            col1  col2  col3  col4  col5  col6  col7  col8  col9  col10  col11  col12
2020-03-31  3.42  3.12  3.42  3.40  3.74  3.13  3.85  3.03  3.03   3.02   3.33   3.44
2020-04-01  3.44  3.53  3.48  3.44  3.22  3.52  3.66  3.21  3.14   3.01    NaN    NaN
2020-04-02  3.23  3.34  3.13  3.42  3.32  3.63  3.22  3.63  3.24   3.05    NaN    NaN
2020-04-03  3.49  3.53  3.57  3.48  3.49  3.17  3.83  3.31  3.42   3.10   3.21   3.40

If your df1 lacks some columns, you can fill the missing columns with nan first

cols = list(set(df2.columns) - set(df1.columns))
df1[cols] = pd.NA

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1