'How to combine multiple set of columns in a dataframe to single one? [duplicate]
I have a dataframe as follows
Cycle | A_0 | A_1 | A_2 | A_3 | B_0 | B_1 | B_2 | B_3 |
---|---|---|---|---|---|---|---|---|
1 | 3 | 4 | 5 | 6 | 1 | 4 | 5 | |
1 | 8 | 5 | 3 | 1 | 0 | 8 | 6 | 4 |
2 | 7 | 9 | 1 | 6 | 1 | 0 | 2 | 3 |
3 | 5 | 9 | 1 | 0 | 3 | 8 | 3 |
this dataframe has to combined to two column A and B
Expected output
Cycle | A | B |
---|---|---|
1 | 3 | |
1 | 4 | 1 |
1 | 5 | 4 |
1 | 6 | 5 |
1 | 8 | 0 |
1 | 5 | 8 |
1 | 3 | 6 |
1 | 1 | 4 |
2 | 7 | 1 |
2 | 9 | 0 |
2 | 1 | 2 |
2 | 6 | 3 |
3 | 5 | 0 |
3 | 3 | |
3 | 9 | 8 |
3 | 1 | 3 |
What i did?
A = [f"A_{i}" for i in range(20)]
B = [f"B_{i}" for i in range(20)]
df2['A'] = df[A].bfill(axis=1).iloc[:, 0]
df2['B'] = df[B].bfill(axis=1).iloc[:, 0]
This line of code is givng me an output datframe by avoiding the nan. How can i get the expected output?
ADDON
added a new colum to the initial data and expected outcome
Solution 1:[1]
You can use pandas.wide_to_long
:
(pd.wide_to_long(df.reset_index(), stubnames=['A', 'B'], i=['index','Cycle'], j='x', sep='_')
.droplevel(['index', 'x'])
)
Output:
A B
Cycle
1 3.0 NaN
1 4.0 1.0
1 5.0 4.0
1 6.0 5.0
1 8.0 0.0
1 5.0 8.0
1 3.0 6.0
1 1.0 4.0
2 7.0 1.0
2 9.0 0.0
2 1.0 2.0
2 6.0 3.0
3 5.0 0.0
3 NaN 3.0
3 9.0 8.0
3 1.0 3.0
Solution 2:[2]
code part
columns = pd.Index(['A_0', 'A_1', 'A_2', 'A_3', 'B_0', 'B_1', 'B_2', 'B_3'], dtype='string')
values = np.array([[ 3., 4., 5., 6., np.nan, 1., 4., 5.],
[ 8., 5., 3., 1., 0., 8., 6., 4.],
[ 7., 9., 1., 6., 1., 0., 2., 3.],
[ 5., np.nan, 9., 1., 0., 3., 8., 3.]],
dtype=float)
## Or retrive from raw DataFrame if already exists
# columns = df_raw.columns
# values = df_raw.values
## Construct MultiIndex
mi = pd.MultiIndex.from_tuples((s.split("_") for s in columns))
## Construct DataFrame
df = pd.DataFrame(values, columns=mi)
## reshape: stack level=1 (2nd row) of columns to index
df_result = df.stack(level=1)
>>> df_result
A B
0 0 3.0 NaN
1 4.0 1.0
2 5.0 4.0
3 6.0 5.0
1 0 8.0 0.0
1 5.0 8.0
2 3.0 6.0
3 1.0 4.0
2 0 7.0 1.0
1 9.0 0.0
2 1.0 2.0
3 6.0 3.0
3 0 5.0 0.0
1 NaN 3.0
2 9.0 8.0
3 1.0 3.0
Explain
Steps:
Construct MultiIndex from flat Index
Pandas provides 4 builtin method to construct MultiIndex; Here use from_tuples form doc: https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.from_tuples.html
from_arrays :: input
[[x1, x2, ...], [y1, y2, ...]]
output[[x1, y1], [x2, y2], ...]
from_tuples :: input
[[x1, y1], [x2, y2], ...]
output samefrom_frame :: Transfer
DataFrames.values
toMultiIndex
from_product :: input like arrays, but zip them to output. e.g. input
[[x1, x2], [y1, y2, y3]]
output
MultiIndex([('x1', 'y1'), ('x1', 'y2'), ('x1', 'y3'), ('x2', 'y1'), ('x2', 'y2'), ('x2', 'y3')], )
Construct new DataFrame and reshape by stack
See User Guide on reshape/pivot topic: doc: https://pandas.pydata.org/docs/user_guide/reshaping.html
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | mozway |
Solution 2 | mathsping220 |