Assign multiple columns different values based on conditions in a pandas DataFrame

I have a DataFrame to which I need to add new columns based on conditions on existing column values, and I am looking for an efficient way to do it. For example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':[1,2,3],
                   'b':['x','y','x'],
                   's':['proda','prodb','prodc'],
                   'r':['oz1','0z2','oz3']})

I need to create two new columns, ['c', 'd'], based on the following conditions:

  if df['b'] == 'x':
      df['c'] = df['s']
      df['d'] = df['r']
  elif df['b'] == 'y':
      # assign different values to the c and d columns

We can use numpy.where and apply the condition to each new column separately, like:

df['c'] = np.where(condition, value, other)
df['d'] = np.where(condition, value, other)

But I am wondering whether there is a way to do this in a single statement, without a for loop and without multiple numpy.where or pandas apply calls.

Solution 1:^[1]

The exact output is unclear, but you can use numpy.where with 2D data.

For example:

cols = ['c', 'd']
df[cols] = np.where(df['b'].eq('x').to_numpy()[:,None],
                    df[['s', 'r']], np.nan)

output:

   a  b      s    r      c    d
0  1  x  proda  oz1  proda  oz1
1  2  y  prodb  0z2    NaN  NaN
2  3  x  prodc  oz3  prodc  oz3
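
The `[:, None]` step is what makes the single assignment work: it reshapes the 1D mask into a column so that it broadcasts across both source columns. The following is a minimal sketch of that idea using the question's data; the variable names `mask_1d`, `mask_2d`, `values`, and `picked` are illustrative only and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3],
                   'b': ['x', 'y', 'x'],
                   's': ['proda', 'prodb', 'prodc'],
                   'r': ['oz1', '0z2', 'oz3']})

mask_1d = df['b'].eq('x').to_numpy()   # shape (3,): one boolean per row
mask_2d = mask_1d[:, None]             # shape (3, 1): one boolean per row, as a column
values = df[['s', 'r']].to_numpy()     # shape (3, 2): the two source columns

# A (3, 1) mask broadcasts against the (3, 2) values, so each row's
# boolean applies to both columns at once.
picked = np.where(mask_2d, values, np.nan)

# The 1D mask cannot be broadcast against a (3, 2) array.
try:
    np.where(mask_1d, values, np.nan)
    one_d_works = True
except ValueError:
    one_d_works = False

print(mask_1d.shape, mask_2d.shape, picked.shape, one_d_works)
```

Without the reshape, NumPy tries to align the mask's length with the number of columns rather than the number of rows, which is why the 1D version fails here.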

If you want multiple conditions, use np.select:

cols = ['c', 'd']
df[cols] = np.select([df['b'].eq('x').to_numpy()[:,None],
                      df['b'].eq('y').to_numpy()[:,None]
                      ],
                     [df[['s', 'r']],
                      df[['r', 'a']]
                      ], np.nan)

If you have many conditions, however, it is easier to build the two lists with a loop:

cols = ['c', 'd']
df[cols] = np.select([df['b'].eq(c).to_numpy()[:,None] for c in ['x', 'y']],
                     [df[repl] for repl in (['s', 'r'], ['r', 'a'])],
                     np.nan)

output:

   a  b      s    r      c    d
0  1  x  proda  oz1  proda  oz1
1  2  y  prodb  0z2    0z2    2
2  3  x  prodc  oz3  prodc  oz3
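
If this pattern comes up often, the loop version can be wrapped in a small function. This is only a sketch: `assign_by_condition` and its parameters are hypothetical names invented for illustration (not a pandas or NumPy API), and it assumes each key of `mapping` is a value to match in the key column, paired with a list of source columns of the same length as `targets`.

```python
import numpy as np
import pandas as pd

def assign_by_condition(df, key, mapping, targets, default=np.nan):
    """Hypothetical helper: fill `targets` from the source columns that
    `mapping` associates with each value of df[key]."""
    # One column-shaped boolean mask per value to match.
    conditions = [df[key].eq(val).to_numpy()[:, None] for val in mapping]
    # One 2D block of replacement values per mask, in the same order.
    choices = [df[src].to_numpy(dtype=object) for src in mapping.values()]
    out = df.copy()  # leave the caller's frame untouched
    out[targets] = np.select(conditions, choices, default)
    return out

df = pd.DataFrame({'a': [1, 2, 3],
                   'b': ['x', 'y', 'x'],
                   's': ['proda', 'prodb', 'prodc'],
                   'r': ['oz1', '0z2', 'oz3']})

result = assign_by_condition(df, 'b',
                             {'x': ['s', 'r'], 'y': ['r', 'a']},
                             ['c', 'd'])
print(result)
```

Note that column `d` ends up holding both strings and an integer, because the replacement for `'y'` rows comes from the numeric column `a`.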

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	mozway
