String-join pandas dataframe columns and skip nan values

I'm trying to join column values into a new column, but I want to skip NaN values:

df['col'] = df['col1'].map(str) + ',' + df['col2'].map(str) + ',' + df['col3'].map(str)

For example, if a col2 value is NaN, the corresponding col value becomes:

 val1,,val3
      ^

... but I want to suppress the unwanted comma corresponding to the NaN column:

val1,val3

Sample df:

col1   col2   col3
-------------------
val11  nan    val13
nan    val22  val23
nan    nan    val33

Desired output:

col1   col2   col3   col
--------------------------------
val11  nan    val13  val11,val13
nan    val22  val23  val22,val23
nan    nan    val33  val33


Solution 1:[1]

Try applying Series.str.cat row-wise; when na_rep is not given, it omits missing values from the result:

import numpy as np
import pandas as pd

data = {'col1': {0: 'val11', 1: np.nan, 2: np.nan},
        'col2': {0: np.nan, 1: 'val22', 2: np.nan},
        'col3': {0: 'val13', 1: 'val23', 2: 'val33'}}

df = pd.DataFrame(data)
print(df)
>>>
    col1    col2    col3
0   val11   NaN     val13
1   NaN     val22   val23
2   NaN     NaN     val33

df['col'] = df.apply(lambda s: s.str.cat(sep=','), axis=1)
print(df)
>>>
    col1    col2    col3    col
0   val11   NaN     val13   val11,val13
1   NaN     val22   val23   val22,val23
2   NaN     NaN     val33   val33
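
One thing to watch with the snippet above: df.apply runs over every column in the frame, so running it a second time (once col exists) would fold col into the join as well. A minimal sketch of restricting the join to named columns, assuming the same sample data; the list name cols_to_join is only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['val11', np.nan, np.nan],
    'col2': [np.nan, 'val22', np.nan],
    'col3': ['val13', 'val23', 'val33'],
})

# Hypothetical helper list: only these columns take part in the join
cols_to_join = ['col1', 'col2', 'col3']

# str.cat with no na_rep omits missing values, so no empty slots are joined
df['col'] = df[cols_to_join].apply(lambda s: s.str.cat(sep=','), axis=1)

print(df['col'].tolist())
```

Selecting the columns first also keeps unrelated columns (ids, numbers) out of the joined string.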

Solution 2:[2]

One-liner:

df['col'] = df.agg(lambda x: ','.join(x[~x.isnull()].values), axis=1)
print(df)

Output:

    col1   col2   col3          col
0  val11    NaN  val13  val11,val13
1    NaN  val22  val23  val22,val23
2    NaN    NaN  val33        val33

Solution 3:[3]

Improving on BeRT2me's one-liner, directly use .dropna() on each row's columns:

df.agg(lambda cols: ','.join(cols.dropna()), axis=1)

val11,val13
val22,val23
val33
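
','.join only accepts strings, so the one-liner above raises a TypeError if any of the joined columns holds numbers. A possible variation, sketched under the assumption that a numeric column is present (the float values in col2 here are made up for illustration), casts the surviving values to str before joining:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['val11', np.nan, np.nan],
    'col2': [2.5, np.nan, np.nan],      # hypothetical numeric column
    'col3': ['val13', 'val23', 'val33'],
})

# Drop NaN first, then cast what is left to str so join() accepts it
joined = df.agg(lambda cols: ','.join(cols.dropna().astype(str)), axis=1)

print(joined.tolist())
```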

Solution 4:[4]

When you read the dataframe from a CSV file, use:

df = pd.read_csv(path, na_filter=False)

If you already have the dataframe, you can replace NaN with an empty string this way:

df = df.fillna('')
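
Note that filling NaN with empty strings does not, on its own, remove the separators if the columns are then concatenated as in the question. A minimal sketch, assuming the sample data from the question (the names df_filled and naive are illustrative only):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['val11', np.nan, np.nan],
    'col2': [np.nan, 'val22', np.nan],
    'col3': ['val13', 'val23', 'val33'],
})

df_filled = df.fillna('')

# Plain concatenation still emits one comma per column boundary
naive = df_filled['col1'] + ',' + df_filled['col2'] + ',' + df_filled['col3']

print(naive.tolist())
```

The empty strings still occupy a slot between commas, so the empty values have to be filtered out before joining.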

Updated solution:

From what I understand of your question, you want to include only the column values that aren't NaN.

You can check each value before adding it to the result column col, row by row:

df['col'] = ""
for index, row in df.iterrows():
    # Collect only the non-NaN values of this row, in column order
    parts = [str(row[name]) for name in ('col1', 'col2', 'col3')
             if not pd.isnull(row[name])]
    # Join with commas; values that themselves contain spaces stay intact
    df.at[index, 'col'] = ','.join(parts)

Console output:

    col1   col2   col3          col
0  val11    NaN  val13  val11,val13
1    NaN  val22  val23  val22,val23
2    NaN    NaN  val33        val33


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ziying35
Solution 2 BeRT2me
Solution 3 smci
Solution 4