Split dataframe column by content
How can I split this DataFrame into separate DataFrames according to the value ('A', 'B', ...) in the second column? The first column must be retained as the index.
df = pd.DataFrame(data)
df = df[['seconds', 'marker', 'data1', 'data2', 'data3']]
seconds,marker,data1,data2,data3
00001,A,3,3,0,42,0
00002,B,3,3,0,34556,0
00003,C,3,3,0,42,0
00004,A,3,3,0,1833,0
00004,B,3,3,0,6569,0
00005,C,3,3,0,2454,0
00006,C,3,3,0,3256,0
00007,C,3,3,0,5423,0
00008,A,3,3,0,569,0
Solution 1:[1]
You can get the unique values in the letter column (that is what I called it) and then filter the full DataFrame once per unique value, keeping only the rows that match.
I am storing the newly created DataFrames in a dictionary keyed by letter here, but you could also store them in a list or any other container. I've used the input you provided, but named the first two columns index and letter because they were unnamed in your .csv.
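The same column names could also be assigned while loading the CSV text directly. This is only a minimal sketch: it assumes the file content matches the sample in the question, skips the original header row because it names five of the seven fields, and reuses index and letter as the names for the two unnamed columns. The variable names csv_text and df_from_csv are made up for this illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question; the header line is skipped below
# because it lists only five names for seven fields.
csv_text = """seconds,marker,data1,data2,data3
00001,A,3,3,0,42,0
00002,B,3,3,0,34556,0
00003,C,3,3,0,42,0
00004,A,3,3,0,1833,0
00004,B,3,3,0,6569,0
00005,C,3,3,0,2454,0
00006,C,3,3,0,3256,0
00007,C,3,3,0,5423,0
00008,A,3,3,0,569,0
"""

# Assumption: "index" and "letter" are the names chosen for the unnamed columns.
names = ["index", "letter", "seconds", "marker", "data1", "data2", "data3"]

# header=None plus skiprows=1 drops the original header and applies `names`.
df_from_csv = pd.read_csv(io.StringIO(csv_text), header=None, skiprows=1, names=names)
print(df_from_csv.head())
```

Note that read_csv parses values such as 00001 as the integer 1, which matches the hand-written dictionary used in this answer.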
import pandas as pd
df = pd.DataFrame({
'index': {0: 1, 1: 2, 2: 3, 3: 4, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8},
'letter': {0: 'A', 1: 'B', 2: 'C', 3: 'A', 4: 'B', 5: 'C', 6: 'C', 7: 'C', 8: 'A'},
'seconds': {0: 3, 1: 3, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3, 7: 3, 8: 3},
'marker': {0: 3, 1: 3, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3, 7: 3, 8: 3},
'data1': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0},
'data2': {0: 42, 1: 34556, 2: 42, 3: 1833, 4: 6569, 5: 2454, 6: 3256, 7: 5423, 8: 569},
'data3': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0}
})
# get unique values
unique_values = df["letter"].unique()
# filter the full DataFrame using one unique value at a time
split_dfs = {value: df[df["letter"] == value] for value in unique_values}
print(split_dfs["A"])
print(split_dfs["B"])
print(split_dfs["C"])
Expected output:
   index letter  seconds  marker  data1  data2  data3
0      1      A        3       3      0     42      0
3      4      A        3       3      0   1833      0
8      8      A        3       3      0    569      0
   index letter  seconds  marker  data1  data2  data3
1      2      B        3       3      0  34556      0
4      4      B        3       3      0   6569      0
   index letter  seconds  marker  data1  data2  data3
2      3      C        3       3      0     42      0
5      5      C        3       3      0   2454      0
6      6      C        3       3      0   3256      0
7      7      C        3       3      0   5423      0
As you can see, each subset keeps the row labels and the index column values of the original DataFrame.
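If the first column should become the actual row index rather than a regular column, the split could also be done with groupby. This is a sketch of an alternative, not the approach used above: it assumes the same index and letter column names, trims the sample to three columns for brevity, and the names indexed and split_list are introduced only for this example.

```python
import pandas as pd

# Same sample rows as above, trimmed to the columns needed for the illustration.
df = pd.DataFrame({
    "index": [1, 2, 3, 4, 4, 5, 6, 7, 8],
    "letter": ["A", "B", "C", "A", "B", "C", "C", "C", "A"],
    "data2": [42, 34556, 42, 1833, 6569, 2454, 3256, 5423, 569],
})

# Assumption: the first column should serve as the row index.
indexed = df.set_index("index")

# groupby yields (key, sub-DataFrame) pairs; each sub-DataFrame keeps its row labels.
split_dfs = {letter: group for letter, group in indexed.groupby("letter")}

# The same groups collected in a list instead of a dictionary.
split_list = [group for _, group in indexed.groupby("letter")]

print(split_dfs["A"])
```

With this variant, split_dfs["A"] is indexed by the values 1, 4 and 8 from the first column instead of the positional labels 0, 3 and 8.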
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
