'Mapping the values of a dataframe on another dataframe
I have two dataframes, with the same indexes and column names, as follows:
df1:
col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11 col12
2020-03-31 nan nan nan nan nan nan nan nan nan nan nan nan
2020-04-01 nan nan nan nan nan nan nan nan nan nan 1.0 1.0
2020-04-02 nan nan nan nan nan nan nan nan nan nan 1.0 1.0
2020-04-03 nan nan nan nan nan nan nan nan nan nan 1.0 nan
df2:
col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11 col12
2020-03-31 3.42 3.12 3.42 3.40 3.74 3.13 3.85 3.03 3.03 3.02 3.33 3.44
2020-04-01 3.44 3.53 3.48 3.44 3.22 3.52 3.66 3.21 3.14 3.01 3.14 3.20
2020-04-02 3.23 3.34 3.13 3.42 3.32 3.63 3.22 3.63 3.24 3.05 3.29 3.34
2020-04-03 3.49 3.53 3.57 3.48 3.49 3.17 3.83 3.31 3.42 3.10 3.21 3.40
So, what I want to do is create some sort of mapper that detects 1s in df1 and replaces the values in the df2 with nan in the exact spot (index and column). I'm looking for something like this:
result:
col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11 col12
2020-03-31 3.42 3.12 3.42 3.40 3.74 3.13 3.85 3.03 3.03 3.02 3.33 3.44
2020-04-01 3.44 3.53 3.48 3.44 3.22 3.52 3.66 3.21 3.14 3.01 nan nan
2020-04-02 3.23 3.34 3.13 3.42 3.32 3.63 3.22 3.63 3.24 3.05 nan nan
2020-04-03 3.49 3.53 3.57 3.48 3.49 3.17 3.83 3.31 3.42 3.10 nan 3.40
is there a fast way to do that without going through columns or indexes ?
Solution 1:[1]
Since your two dataframes have the same indexes and column names, you can try df.mask
df2 = df2.mask(df1.eq(1), pd.NA)
print(df2)
col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11 col12
2020-03-31 3.42 3.12 3.42 3.40 3.74 3.13 3.85 3.03 3.03 3.02 3.33 3.44
2020-04-01 3.44 3.53 3.48 3.44 3.22 3.52 3.66 3.21 3.14 3.01 NaN NaN
2020-04-02 3.23 3.34 3.13 3.42 3.32 3.63 3.22 3.63 3.24 3.05 NaN NaN
2020-04-03 3.49 3.53 3.57 3.48 3.49 3.17 3.83 3.31 3.42 3.10 NaN 3.40
If your df1 lacks some indexes, say 2020-03-31, you can try df.update. The idea is to update the df2 with df1 not-nan value then mask the updated 1 in df2 to nan.
m = df2.ne(1) & df1.eq(1)
df2.update(df1)
df2 = df2.mask(m, pd.NA)
print(df2)
col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11 col12
2020-03-31 3.42 3.12 3.42 3.40 3.74 3.13 3.85 3.03 3.03 3.02 3.33 3.44
2020-04-01 3.44 3.53 3.48 3.44 3.22 3.52 3.66 3.21 3.14 3.01 NaN NaN
2020-04-02 3.23 3.34 3.13 3.42 3.32 3.63 3.22 3.63 3.24 3.05 NaN NaN
2020-04-03 3.49 3.53 3.57 3.48 3.49 3.17 3.83 3.31 3.42 3.10 3.21 3.40
If your df1 lacks some columns, you can fill the missing columns with nan first
cols = list(set(df2.columns) - set(df1.columns))
df1[cols] = pd.NA
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
