How to concatenate values from many columns into one column when the number of columns is not known in advance

My application saves an indeterminate number of values in different columns. As a result, I have a data frame with a fixed number of columns at the beginning, but from a particular column onward (whose position I know) there is an unknown number of columns holding the same kind of data.

Example:

known1  known2 know3 unknow1 unknow2 unknow3 ...
1       3      3     data    data2   data3

The result I would like to get is something like this:

known1  known2 know3 all_unknow 
1       3      3     data,data2,data3

How can I do this when I don't know the number of unknown columns, but I do know that they start (in this example) at the 4th column?



Solution 1:[1]

IIUC, use filter to select the columns by keyword:

cols = list(df.filter(like='unknow'))
# ['unknow1', 'unknow2', 'unknow3']

df['all_unknow'] = df[cols].apply(','.join, axis=1)
df = df.drop(columns=cols)

or take all columns from the 4th one:

cols = df.columns[3:]

df['all_unknow'] = df[cols].apply(','.join, axis=1)
df = df.drop(columns=cols)

output:

   known1  known2  know3        all_unknow
0       1       3      3  data,data2,data3
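Since the number of saved values is indeterminate, some rows may have fewer values than others, which leaves missing entries in the trailing columns. In that case a plain ','.join raises a TypeError on rows that contain NaN or other non-string values. Below is a minimal sketch of one way to handle this, assuming missing values should simply be skipped; the two-row sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: the second row has one value fewer,
# so its last column holds a missing value.
df = pd.DataFrame({
    "known1": [1, 2],
    "known2": [3, 4],
    "know3": [3, 5],
    "unknow1": ["data", "a"],
    "unknow2": ["data2", "b"],
    "unknow3": ["data3", None],
})

cols = df.columns[3:]  # everything from the 4th column onward

# For each row: drop missing values, cast the rest to str, then join.
df["all_unknow"] = df[cols].apply(
    lambda row: ",".join(row.dropna().astype(str)), axis=1
)
df = df.drop(columns=cols)
print(df)
```

If the trailing columns are guaranteed to hold only strings with no gaps, the shorter ','.join version above is enough.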

Solution 2:[2]

df['all_unknown'] = df.iloc[:, 3:].apply(','.join, axis=1)

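if you also want to drop the original columns from the 4th onward:

cols = df.columns[3:-1]
df = df.drop(cols, axis=1)

the -1 keeps the new column, which is now the last one, from being dropped. Note that drop returns a new DataFrame, so the result has to be assigned back to df.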
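One way to sidestep the -1 bookkeeping is to capture the column labels before the new column is added, so the selection can never include it. The following is a rough sketch of that variant using the question's example row; the astype(str) call is an added assumption to cover non-string values, not part of the original answer.

```python
import pandas as pd

# The example row from the question.
df = pd.DataFrame(
    [[1, 3, 3, "data", "data2", "data3"]],
    columns=["known1", "known2", "know3", "unknow1", "unknow2", "unknow3"],
)

# Capture the labels first; the new column does not exist yet,
# so it cannot end up in this selection.
cols = df.columns[3:]

# Cast to str so numeric values do not break the join, then join row-wise.
df["all_unknown"] = df[cols].astype(str).agg(",".join, axis=1)

# drop returns a new DataFrame, so assign the result back.
df = df.drop(columns=cols)
print(df)
```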

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2