'How to do map of two columns of different data frame and if get some null values then again mapping from other data frame and only fill null values
df1 df2 df3
t_cell t_psc cell1 psc cell2 psc2
fh4 dh2 2 fh4 5
dh2 jk2 3 ik5 7
jk2 po6 2 it7 9
ik5 po6 3
po6
it7
output should be
t_cell t_psc
fh4 5
dh2 2
jk2 3
ik5 7
it7 9
po6 2
Solution 1:[1]
First, rename your columns of df2 and df3 as df1 then concat your 2 dataframes to create a dict mapping. Finally, map to the t_cell column:
df2.columns = ['t_cell', 't_psc']
df3.columns = ['t_cell', 't_psc']
df1['t_psc'] = df1['t_cell'].map(pd.concat([df2, df3]).set_index('t_cell')['t_psc'])
print(df1)
# Output
t_cell t_psc
0 fh4 5
1 dh2 2
2 jk2 3
3 ik5 7
4 po6 2
5 it7 9
Solution 2:[2]
Use Series.map with rename columnsnames in df2 for same column like in df3 with remove duplicates by cell2 column:
s = (df2.rename(columns={'cell1':'cell2', 'psc':'psc2'})
.append(df3)
.drop_duplicates(['cell2'])
.set_index('cell2')['psc2'])
df1['t_psc'] = df1['t_cell'].map(s)
print (df1)
t_cell t_psc
0 fh4 5
1 dh2 2
2 jk2 3
3 ik5 7
4 po6 2
5 it7 9
Another idea is use:
s1 = df1['t_cell'].map(df2.set_index('cell1')['psc'])
s2 = df1['t_cell'].map(df3.set_index('cell2')['psc2'])
df1['t_psc'] = s1.fillna(s2)
print (df1)
t_cell t_psc
0 fh4 5.0
1 dh2 2.0
2 jk2 3.0
3 ik5 7.0
4 po6 2.0
5 it7 9.0
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Corralien |
| Solution 2 |
