Loop over two columns to check for identity of elements and create new column
I have the following dataset (this is just a small part of it):

- Right now each "product_id" corresponds to an "order_id".
- I have to create a new column with the "product_id" for each "order_id_OK".
- The majority of the elements of "order_id_OK" are also in "order_id", but in a different order.
So the objective is to have a column where each "product_id" corresponds to the row of "order_id_OK" and not of "order_id".
Right now I'm trying to set up a for loop:
l = []
for i in df["order_id_OK"]:
    for j in df["order_id"]:
        if i == j:
            for x in df["product_id"]:
                l.append(x)
Any ideas?
Solution 1:[1]
You can merge your dataframe with itself. The output is a dataframe containing the row pairs where data['order_id'][j] == data['order_id_OK'][i] (i and j have the same meaning as in your for loops).
merged_data = data.merge(data, left_on=['order_id'], right_on=['order_id_OK'], how='inner')
In the merged data you will find the new columns 'order_id_OK_y' and 'product_id_x', which correspond to your desired output.
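To make this concrete, here is a minimal sketch of the self-merge on a small toy dataframe. The values are made up for illustration, since the original dataset is not shown. It also includes an alternative that is not part of the merge approach above: a `Series.map` lookup, which keeps the original row order of "order_id_OK" but assumes every "order_id" is unique. The column name `product_id_OK` is a hypothetical choice.

```python
import pandas as pd

# Toy data (hypothetical values; the real dataset is not shown in the question).
data = pd.DataFrame({
    "order_id":    [10, 20, 30, 40],
    "product_id":  ["A", "B", "C", "D"],
    "order_id_OK": [30, 10, 40, 99],   # 99 does not appear in order_id
})

# Self-merge: a left row is paired with a right row whenever
# left order_id == right order_id_OK. Every column name exists on both
# sides, so pandas adds the suffixes _x (left) and _y (right).
merged_data = data.merge(data, left_on="order_id", right_on="order_id_OK", how="inner")

# The two columns of interest: each order_id_OK with its matching product_id.
result = merged_data[["order_id_OK_y", "product_id_x"]]
print(result)

# Alternative sketch (assumes order_id values are unique): look up each
# order_id_OK in an order_id -> product_id mapping. Unmatched values become NaN.
lookup = data.set_index("order_id")["product_id"]
data["product_id_OK"] = data["order_id_OK"].map(lookup)
print(data)
```

Note that the inner merge drops the values of "order_id_OK" that have no match in "order_id" and does not preserve the original row order of "order_id_OK". The map-based lookup keeps one row per original row and leaves a missing value where there is no match.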
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Triki Sadok |
