Python: remove duplicate substrings separated by commas

I have an input pandas Series like this: [image: input]

I would like to remove duplicates in each row. For example, change M,S,S to M,S.

I tried
fifa22['player_positions'] = fifa22['player_positions'].str.split(',').apply(pd.unique)

But the result is a Series of ndarrays: [image: output]

I would like to convert each result to a simple string, without the square brackets. What should I do? Thanks!



Solution 1:[1]

If this is only needed for one column, you should use map.

import pandas as pd
df = pd.DataFrame({
    'player_positions' : "M,S,S S S,M M,M M,M M M,S S,M,M,S".split(' ')
})
print(df)

  player_positions
0            M,S,S
1                S
2              S,M
3              M,M
4              M,M
5                M
6              M,S
7          S,M,M,S

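# dict.fromkeys drops duplicates and keeps first-seen order (a set does not guarantee any order)
out = df['player_positions'].map(lambda x: ','.join(dict.fromkeys(x.split(','))))
print(out)

0    M,S
1      S
2    S,M
3      M
4      M
5      M
6    M,S
7    S,M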

If you want to join the values with a different separator, just change the ',' in ','.join(...) to another string.
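If the column might contain spaces after the commas or missing values, the lambda can be pulled out into a small helper. The following is only a minimal sketch: the name dedupe_positions and its sep parameter are hypothetical, and it assumes that the first occurrence of each value should be kept and that non-string cells should pass through unchanged.

```python
def dedupe_positions(value, sep=","):
    """Remove duplicate sep-separated tokens, keeping the first occurrence of each."""
    # Assumption: non-string cells (e.g. NaN or None) are returned unchanged.
    if not isinstance(value, str):
        return value
    # Strip whitespace around each token so "S, M" and "S,M" are treated alike.
    tokens = (part.strip() for part in value.split(sep))
    # dict.fromkeys keeps one key per distinct token, in insertion order;
    # empty tokens (as in "M,,S") are skipped.
    return sep.join(dict.fromkeys(t for t in tokens if t))


rows = ["M,S,S", "S", "S, M", "S,M,M,S", None]
print([dedupe_positions(r) for r in rows])
# ['M,S', 'S', 'S,M', 'S,M', None]
```

With pandas, the helper could then be passed to map in place of the lambda, for example fifa22['player_positions'].map(dedupe_positions).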

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1