'How to concatenate values from many columns into one column when one doesn't know the number of columns will have
My application saves an indeterminate number of values in different columns. As a results, I have a data frame with a certain number of columns at the beginning but then from a particular column (that I know) I will have an uncertain number of columns saving same data
Example:
known1 known2 know3 unknow1 unknow2 unknow3 ...
1 3 3 data data2 data3
The result I would like to get should be something like this
known1 known2 know3 all_unknow
1 3 3 data,data2,data3
How can I do this when I don't know the number of unknown columns but what I do know is this will occur (in this example) from the 4th column.
Solution 1:[1]
IIUC, use filter to select the columns by keyword:
cols = list(df.filter(like='unknow'))
# ['unknow1', 'unknow2', 'unknow3']
df['all_unknow'] = df[cols].apply(','.join, axis=1)
df = df.drop(columns=cols)
or take all columns from the 4th one:
cols = df.columns[3:]
df['all_unknow'] = df[cols].apply(','.join, axis=1)
df = df.drop(columns=cols)
output:
known1 known2 know3 all_unknow
0 1 3 3 data,data2,data3
Solution 2:[2]
df['all_unknown'] = df.iloc[:, 3:].apply(','.join, axis=1)
if you also want to drop all columns after the 4th:
cols = df.columns[3:-1]
df.drop(cols, axis=1)
the -1 is to avoid dropping the new column
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 |
