'Trouble when trying to do a VLOOKUP like with two pandas dataframes
I've read a lot of questions regarding this matter, but none of it solved my problem.
I have 2 dataframes, one containing a list of all students of graduation level in a country, each one (each row) with informations about the student itself, as well as the course code.
On another dataframe, i have a list of unique course codes containing the address of the university that is assigned to the course code.
df1
CodCourse|Student|Address
1 10 outdated address
2 11 outdated address
2 12 outdated address
3 13 outdated address
3 14 outdated address
4 15 outdated address
4 16 outdated address
df2:
CodCourse Address
1 Xth avenue
2 Yth avenue
3 Zth avenue
4 Nth avenue
Expected result:
df1
CodCourse|Student|Address
1 10 Xth avenue
2 11 Yth Street
2 12 Yth Street
3 13 Zth Street
3 14 Zth Street
4 15 Nth Street
4 16 Nth Street
I want to update the dataframe 1 address column with the address column of the dataframe 2.
I'm doing like this, but it's not working. I've tried with join and using a dictionary, but all I have is a failure.
df1=df1.merge(df2[['CodCourse','Address']], on='CodCourse', how='left')
Please, can anyone help me?
Thanks! Eduardo.
Solution 1:[1]
If you want to update the values in df1
's address you can use two methods.
in place modification:
df1.update(df1['CodCourse']
.map(df2.set_index('CodCourse')['Address'])
.rename('Address'))
assignment:
df1['Address'] = (df1['CodCourse']
.map(df2.set_index('CodCourse')['Address'])
.fillna(df1['Address'])
)
In both case this will replace any Address that has a new value in df2 while keeping old addresses that don't have a new value.
output:
CodCourse Student Address
0 1 10 Xth avenue
1 2 11 Yth avenue
2 2 12 Yth avenue
3 3 13 Zth avenue
4 3 14 Zth avenue
5 4 15 Nth avenue
6 4 16 Nth avenue
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | mozway |