Pandas: Create new column based on text values of other columns
My dataframe looks like this:
id text labels
0 447 glutamine synthetase [protein]
1 447 GS [protein]
2 447 hepatoma [indication]
3 447 NaN NaN
4 442 Metachromatic [indication]
I want to transform the dataframe and create two new columns named protein and indication. For each id, protein should contain the text values labelled protein and indication should contain the text values labelled indication.
Wanted output:
id protein indication
0 447 glutamine synthetase, GS hepatoma
1 442 NaN Metachromatic
Can someone help me do this?
Solution 1:[1]
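Use df.explode with GroupBy.agg and df.pivot (pass index and columns to pivot as keywords, since pandas 2.0 and later no longer accept them positionally):
In [417]: out = df.explode('labels').groupby(['id', 'labels'])['text'].agg(','.join).reset_index().pivot(index='id', columns='labels').reset_index().droplevel(0, axis=1).rename_axis(None, axis=1)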
In [423]: out.columns = ['id', 'indication', 'protein']
In [424]: out
Out[424]:
id indication protein
0 442 Metachromatic NaN
1 447 hepatoma glutamine synthetase,GS
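For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the labels column holds single-element lists (as the bracketed display in the question suggests) and that the NaN row is a real missing value. The sample frame is rebuilt by hand from the question. As a variation on the chain above, it uses unstack instead of pivot, which avoids the manual column renaming, and it joins with ", " to match the wanted output.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; list-valued labels are an assumption.
df = pd.DataFrame({
    "id": [447, 447, 447, 447, 442],
    "text": ["glutamine synthetase", "GS", "hepatoma", np.nan, "Metachromatic"],
    "labels": [["protein"], ["protein"], ["indication"], np.nan, ["indication"]],
})

out = (
    df.explode("labels")                    # one row per (id, label) pair
      .dropna(subset=["text", "labels"])    # drop the row with no text/label
      .groupby(["id", "labels"])["text"]
      .agg(", ".join)                       # concatenate texts sharing id and label
      .unstack("labels")                    # labels become columns
      .rename_axis(None, axis=1)            # remove the leftover 'labels' axis name
      .reset_index()
)

print(out)
```

Rows come back sorted by id, so 442 appears before 447. An id with no text for a given label gets NaN in that column.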
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
