Comparing values between dataframes
I'm trying to find a way to compare the values contained in two different dataframes that have different column names.
import pandas as pd

label = {
'aoo' : ['a', 'b', 'c'],
'boo' : ['a', 'b', 'c'],
'coo' : ['a', 'b', 'c'],
'label': ['label', 'label', 'label']
}
unlabel = {
'unlabel1' : ['a', 'b', 'c'],
'unlabel2' : ['a', 'b', 'c'],
'unlabel3': ['a', 'b', 'hhh']
}
label = pd.DataFrame(label)
unlabel = pd.DataFrame(unlabel)
The desired output is a dataframe that contains the columns whose values are equal, plus the label column.
Where even a single value is not equal, as in unlabel['unlabel3'], I don't want to keep that column in the output.
desired_output = {
'unlabel1' : ['a', 'b', 'c'],
'unlabel2' : ['a', 'b', 'c'],
'label' : ['label', 'label', 'label']
}
If the labels were numbers I could try np.where, but I can't find a similar helper for strings.
Could you help? Thanks
Solution 1:[1]
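You can use pd.merge and specify the columns to merge on with left_on and right_on. With how='left', any row of unlabel that has no matching row in label gets NaN in the label column.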
out = (
    unlabel.merge(
        label,
        left_on=['unlabel1', 'unlabel2', 'unlabel3'],
        right_on=['aoo', 'boo', 'coo'],
        how='left',
    )
    .drop(['unlabel3', 'aoo', 'boo', 'coo'], axis=1)
)
print(out)
  unlabel1 unlabel2  label
0        a        a  label
1        b        b  label
2        c        c    NaN
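The merge above matches whole rows, so it is not quite the column-wise result the question asks for. As a rough alternative sketch (not part of the original answer), one could compare the columns pairwise and keep only those that match entirely. This assumes the columns are paired by position (unlabel1 with aoo, unlabel2 with boo, unlabel3 with coo) and that both dataframes share the same index; the names `pairs` and `keep` are only illustrative.

```python
import pandas as pd

label = pd.DataFrame({
    'aoo': ['a', 'b', 'c'],
    'boo': ['a', 'b', 'c'],
    'coo': ['a', 'b', 'c'],
    'label': ['label', 'label', 'label'],
})
unlabel = pd.DataFrame({
    'unlabel1': ['a', 'b', 'c'],
    'unlabel2': ['a', 'b', 'c'],
    'unlabel3': ['a', 'b', 'hhh'],
})

# Assumption: columns are paired by position, not by name.
pairs = zip(unlabel.columns, ['aoo', 'boo', 'coo'])

# Keep an unlabel column only if every value equals the paired label column.
keep = [u for u, l in pairs if unlabel[u].equals(label[l])]

# Attach the label column to the surviving columns (aligned on the index).
out = unlabel[keep].join(label['label'])
print(out)  # columns: unlabel1, unlabel2, label
```

Series.equals works for strings as well as numbers, so no string-specific helper is needed here. If the pairing between columns is different in the real data, the list passed to zip would have to be adjusted.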
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ynjxsjmh |
