'assigning multiple values to one dataframe from another based on a key
I got some problem with joining DFs.
I got a sample dataframe main:
| id | some another columns... | key |
|---|---|---|
| 0 | ... | 3451 |
| 1 | ... | 7126 |
| 2 | ... | 6318 |
... and second one :
| id | key | values |
|---|---|---|
| 0 | 3451 | value1 |
| 1 | 3451 | value2 |
| 2 | 7126 | value3 |
I'd like to achieve the following output in main:
| id | some another columns... | values |
|---|---|---|
| 0 | ... | value1, value2 |
| 1 | ... | value3 |
| 2 | ... |
I recieved above output by using apply() on main DF with function using .isin(). But i think it is ineffective. Any other ideas how to achieve this? Please help.
Solution 1:[1]
Set the key as the index in df1 and use join
df.set_index('key').join(df2.groupby('key')['values'].agg(list))
id some another columns... values
key
3451 0 ... [value1, value2]
7126 1 ... [value3]
6318 2 ... NaN
What happens with duplicate values in the key column of df?
id some another columns... values
key
3451 0 ... [value1, value2]
3451 3 ... [value1, value2]
6318 2 ... NaN
7126 1 ... [value3]
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
