'Replace substrings from a Dataframe column which correspond to values of another dataframe column with values of a third column

I have this huge dataset in which I have to replace each country's name with the corresponding ISO code. I have stored the ISO code of each country into another df. e.g.

df1:

TERRITORY
France, Italy
Italy
Spain, France
France, Spain, Italy

df2:

COUNTRY	ISO CODE
France	FR
Italy	IT
Spain	ES

Expected output:

TERRITORY
FR, IT
IT
ES, FR
FR, ES, IT

My last try was to convert the element into tuples and then replace, but it doesn't work (and I don't know if it makes sense, but here's my code anyway):

country = tuple(list(df2['COUNTRY']))
iso = tuple(list(df2['ISO CODE']))
z = zip(x, y)

for x, y in z:
  if x in df1['TERRITORY']:
    df1['TERRITORY'].str.replace(x, y)

But it doesn't change anything.

Solution 1:^[1]

You can use a combination of .str.split + .explode, then .replace + .set_index, and finally .groupby(level=0) + agg(list) + .str.join:

df1['TERRITORY'] = df1['TERRITORY'].str.split(', ').explode().replace(df2.set_index('COUNTRY')['ISO CODE']).groupby(level=0).agg(list).str.join(', ')

Output:

>>> df1
    TERRITORY
0      FR, IT
1          IT
2      ES, FR
3  FR, ES, IT

Solution 2:^[2]

If you have all the mapping you can also just use replace():

df1.replace(df2.set_index('COUNTRY').squeeze(), regex=True)

Output:

    Territory
0      FR, IT
1          IT
2      ES, FR
3  FR, ES, IT

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1
Solution 2	richardec

'Replace substrings from a Dataframe column which correspond to values of another dataframe column with values of a third column

Solution 1:[1]

Solution 2:[2]

Sources

Related Questions

Solution 1:^[1]

Solution 2:^[2]