How to use Python to get unique values from a column?
I have a dataframe that looks like this.
It is an order form.
| ORDER NUMBER | PROMOTION CODE | ORDER AMOUNT |
|---|---|---|
| abc1 | 128040;128040;128040;128040 | 3160 |
| abc2 | 128040;127497;128040;128040;134497 | 1381 |
| abc3 | 128406;128040;128040 | 345 |
| abc4 | NaN | 698 |
I want to get the unique PROMOTION CODE values for each order, so that the final dataframe looks like this:
| ORDER NUMBER | PROMOTION CODE | ORDER AMOUNT |
|---|---|---|
| abc1 | 128040 | 3160 |
| abc2 | 128040;127497;134497 | 1381 |
| abc3 | 128406;128040 | 345 |
| abc4 | NaN | 698 |
I don't know how to drop these duplicate promotion codes with Python.
Any help is highly appreciated.
Solution 1:[1]
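For a regex option, we could try using str.replace here. Pass regex=True explicitly, since pandas 2.0 and later treat the pattern as a literal string by default:
df["PROMOTION CODE"] = df["PROMOTION CODE"].str.replace(r';?(\d+)\b(?=.*;\1)', '', regex=True).str.lstrip(';')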
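To see what the pattern from Solution 1 does on the sample codes, here is a small sketch using only the standard `re` module, outside of pandas. The function name `dedupe_keep_last` is made up for this illustration. Because the lookahead removes a code whenever the same digits appear again later, the last occurrence of each code is the one that survives, so the order can differ from the expected output in the question.

```python
import re

# Pattern copied from Solution 1: drop a code (and its leading ';')
# whenever the same digits appear again later in the string.
PATTERN = re.compile(r';?(\d+)\b(?=.*;\1)')

def dedupe_keep_last(codes: str) -> str:
    """Remove repeated codes, keeping the last occurrence of each."""
    return PATTERN.sub('', codes).lstrip(';')

samples = [
    "128040;128040;128040;128040",
    "128040;127497;128040;128040;134497",
    "128406;128040;128040",
]
for s in samples:
    print(s, "->", dedupe_keep_last(s))
# 128040;128040;128040;128040 -> 128040
# 128040;127497;128040;128040;134497 -> 127497;128040;134497
# 128406;128040;128040 -> 128406;128040
```

For abc2 the result is 127497;128040;134497 rather than 128040;127497;134497; the set of codes is the same, only the order differs.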
Solution 2:[2]
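Use split, followed by pd.unique (which keeps the order of first appearance) and join. The isinstance check leaves NaN rows such as abc4 untouched, and wrapping the list in a Series avoids passing a plain list to pd.unique, which newer pandas versions deprecate:
df['PROMOTION CODE'] = df['PROMOTION CODE'].apply(lambda x: ';'.join(pd.unique(pd.Series(x.split(';')))) if isinstance(x, str) else x)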
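The same split / unique / join idea can also be written as a small named helper, which may be easier to read once missing values have to be handled. The sketch below is plain Python and uses `dict.fromkeys` in place of `pd.unique` to keep first-seen order; the name `unique_codes` is hypothetical, and it assumes that anything that is not a string (such as NaN) should be passed through unchanged.

```python
def unique_codes(value):
    """Deduplicate a ';'-separated string, keeping first-seen order."""
    # Non-strings (e.g. NaN for orders without a promotion) pass through as-is.
    if not isinstance(value, str):
        return value
    # dict keys are unique and preserve insertion order.
    return ';'.join(dict.fromkeys(value.split(';')))

rows = [
    "128040;128040;128040;128040",
    "128040;127497;128040;128040;134497",
    "128406;128040;128040",
    float('nan'),
]
cleaned = [unique_codes(v) for v in rows]
print(cleaned)
# ['128040', '128040;127497;134497', '128406;128040', nan]
```

On the dataframe from the question, this could be applied with `df['PROMOTION CODE'] = df['PROMOTION CODE'].map(unique_codes)`.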
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Tim Biegeleisen |
| Solution 2 | CHIDAMBARANATHAN M |
