'CSV Python: How to combine duplicates in column A and put A-attached values together in Column B?
In a CSV file, I have duplicate values in column A, but they have different values in column B. I would like the value (IP) in column A to be reduced to one line and the individual associated values to be listed next to it in column B, separated by a comma. I've tried different code examples but somehow nothing worked. I would be very happy if you could help me :)
I'm able to use every module working with Python 3.x
How it actually looks:
IP;Ports
192.168.3.1;53
192.168.3.2;22
192.168.3.2;80
192.168.3.2;2000
192.168.3.2;3000
How it should look:
IP;Ports
192.168.3.1;53
192.168.3.2;22, 80 ,2000, 3000
Solution 1:[1]
You can always use groupby with unique function:
df = pd.DataFrame(data={
'IP': ['192.168.3.1', '192.168.3.2', '192.168.3.2', '192.168.3.2', '192.168.3.2'],
'Port': [53, 22, 80, 2000, 3000]
})
grouped_df = df.groupby(['IP'])['Port'].unique().reset_index()
which will result as:
IP Port
0 192.168.3.1 [53]
1 192.168.3.2 [22, 80, 2000, 3000]
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Vladyslav Didukh |
