Pandas create column based on values in other rows
This is my first question on Stack Overflow, so forgive any formatting errors in my sample records.
I am trying to add a new column to a pandas DataFrame whose values are based on the contents of other rows in the same DataFrame.
For each row x, I want to find the row y whose id equals the id_par of row x, and put the id_real value of row y into the new column of row x. See the following example.
Input:
id_real  id  id_par
    100   1       2
    200   2       3
    300   3       4
Expected output:
id_real  id  id_par  new_col
    100   1       2      200
    200   2       3      300
    300   3       4      NaN
I have tried a lot of things and the last thing I tried was the following:
df["new_col"] = df[df["id"] == df["id_par"]]["id_real"]
Unfortunately, the new column then only contains NaN entries. Can you help me?
Solution 1:[1]
Use Series.map:
df['new_col'] = df['id_par'].map(df.set_index('id')['id_real'])
print(df)
   id_real  id  id_par  new_col
0      100   1       2    200.0
1      200   2       3    300.0
2      300   3       4      NaN
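To make the answer easier to reproduce, here is a minimal, self-contained sketch built from the sample data in the question. It also shows why the attempt in the question returns only NaN: the comparison `df["id"] == df["id_par"]` is evaluated row by row, and no row has the same value in both columns. The names `mask`, `lookup`, and `new_col_int` are introduced here for illustration only, and the sketch assumes the values in `id` are unique, since `map` with a Series needs a unique index to look up against.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id_real": [100, 200, 300],
    "id": [1, 2, 3],
    "id_par": [2, 3, 4],
})

# The attempt in the question compares id and id_par within the same row.
# No row satisfies that, so the mask is all False and no values are selected.
mask = df["id"] == df["id_par"]

# Build a lookup Series: the index is id, the values are id_real.
lookup = df.set_index("id")["id_real"]

# For each id_par, fetch the id_real of the row whose id matches.
# id_par values with no matching id (here 4) become NaN.
df["new_col"] = df["id_par"].map(lookup)

# Optional: NaN forces a float column; the nullable Int64 dtype keeps integers.
df["new_col_int"] = df["new_col"].astype("Int64")

print(df)
```

The missing value in the last row is what turns `new_col` into a float column. If integer output matters, the nullable `Int64` dtype shown at the end is one way to keep it.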
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jezrael |
