Pandas create column based on values in other rows

This is my first question on Stack Overflow, so please forgive any formatting errors in my sample records.

I am trying to add a new column to a pandas DataFrame whose values are based on the contents of other rows in the same DataFrame.

For each row x, I want to look up the id_real value of the row y whose id equals the id_par of row x. If no such row y exists, the new value should be NaN. See the following example.

Input:

id_real id id_par
100     1  2
200     2  3
300     3  4

Expected output:

id_real id id_par new_col
100     1  2      200
200     2  3      300
300     3  4      NaN

I have tried a lot of things and the last thing I tried was the following:

df["new_col"] = df[df["id"] == df["id_par"]]["id_real"]

Unfortunately, the new column then only contains NaN entries. Can you help me?



Solution 1:[1]

Use Series.map:

df['new_col'] = df['id_par'].map(df.set_index('id')['id_real'])
print(df)
   id_real  id  id_par  new_col
0      100   1       2    200.0
1      200   2       3    300.0
2      300   3       4      NaN
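
For reference, here is a minimal self-contained sketch of the lookup above, using the sample data from the question. It also shows why the attempt in the question probably produced only NaN: `df["id"] == df["id_par"]` compares the two columns row by row, so no row matches, the filtered frame is empty, and assigning an empty Series aligns on the index and fills every row with NaN. The names `mask`, `lookup` and `new_col_int` are only illustrative, and the sketch assumes the values in `id` are unique (otherwise `map` cannot use the Series as a lookup table).

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id_real": [100, 200, 300],
    "id": [1, 2, 3],
    "id_par": [2, 3, 4],
})

# The original attempt compares id and id_par within the SAME row,
# so the mask is False everywhere and the selection is empty.
mask = df["id"] == df["id_par"]

# Build a lookup Series: index = id, values = id_real.
# Assumption: id is unique.
lookup = df.set_index("id")["id_real"]

# For each id_par, fetch the id_real of the row with that id.
# Values of id_par with no matching id become NaN.
df["new_col"] = df["id_par"].map(lookup)

# Optional: NaN forces a float column; the nullable integer dtype
# keeps whole numbers and uses <NA> for the missing entry.
df["new_col_int"] = df["new_col"].astype("Int64")

print(df)
```

If the float column is acceptable, the `astype("Int64")` step can simply be dropped.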

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 jezrael