ValueError: Series.replace cannot use dict-value and non-None to_replace when creating a conditional column

Given this dataframe named df:

Number  City    Country
one     Milan   Italy
two     Paris   France
three   London  UK
four    Berlin  Germany
five    Milan   Italy
six     Oxford  UK

I would like to create a new column called 'Classification' based on this condition:

if df['Country'] == "Italy" and df['City'] == "Milan": result = "zero", else: result = df['Number']

The result I want to achieve is this:

Number  City    Country Classification
one     Milan   Italy   zero
two     Paris   France  two
three   London  UK      three
four    Berlin  Germany four
five    Milan   Italy   zero
six     Oxford  UK      six

I tried to use this code:

condition = [(df['Country'] == "Italy") & (df['City'] == 'Milan'),]
values = ['zero']
df['Classification'] = np.select(condition, values)

This produces the following dataframe:

Number  City    Country Classification
one     Milan   Italy   zero
two     Paris   France  0
three   London  UK      0
four    Berlin  Germany 0
five    Milan   Italy   zero
six     Oxford  UK      0

Next, I tried to replace the 0 values in the 'Classification' column with the values from the 'Number' column:

df['Classification'].replace(0, df['Number'])

But instead I get this error:

ValueError: Series.replace cannot use dict-value and non-None to_replace

I would be very grateful for any suggestions on how to fix this.



Solution 1:[1]

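What you want is np.where, which does this in one step: it returns 'zero' where the condition holds and the value from df['Number'] everywhere else.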

df['Classification'] = np.where((df['Country'] == "Italy") & (df['City'] == 'Milan'), 'zero', df['Number'])
print(df)

  Number    City  Country Classification
0    one   Milan    Italy           zero
1    two   Paris   France            two
2  three  London       UK          three
3   four  Berlin  Germany           four
4   five   Milan    Italy           zero
5    six  Oxford       UK            six
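
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The question never shows how df was built, so the DataFrame constructor below is an assumption based on the table in the question, and the name is_milan_italy is only an illustrative helper.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame (values copied from its table).
df = pd.DataFrame({
    "Number": ["one", "two", "three", "four", "five", "six"],
    "City": ["Milan", "Paris", "London", "Berlin", "Milan", "Oxford"],
    "Country": ["Italy", "France", "UK", "Germany", "Italy", "UK"],
})

# Boolean mask: True only for rows that are both Italy and Milan.
is_milan_italy = (df["Country"] == "Italy") & (df["City"] == "Milan")

# Take "zero" where the mask is True, otherwise the row's own Number.
df["Classification"] = np.where(is_milan_italy, "zero", df["Number"])
print(df)
```

Storing the condition in a named variable is optional, but it keeps the np.where call short and makes the mask easy to inspect on its own.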

If you want to use np.select, you need to specify the default argument. It defaults to 0, which is why the non-matching rows were filled with 0 in your attempt.

condition = [(df['Country'] == "Italy") & (df['City'] == 'Milan'),]
values = ['zero']
df['Classification'] = np.select(condition, values, default=df['Number'])
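
If you would rather stay in pandas, Series.mask is a possible alternative to the NumPy calls above (it is not part of the original answer). The sketch below assumes the same frame as in the question, rebuilt by hand; is_milan_italy is again just an illustrative name.

```python
import pandas as pd

# Assumed reconstruction of the question's frame (values copied from its table).
df = pd.DataFrame({
    "Number": ["one", "two", "three", "four", "five", "six"],
    "City": ["Milan", "Paris", "London", "Berlin", "Milan", "Oxford"],
    "Country": ["Italy", "France", "UK", "Germany", "Italy", "UK"],
})

# True only for rows that are both Italy and Milan.
is_milan_italy = (df["Country"] == "Italy") & (df["City"] == "Milan")

# Series.mask replaces values where the condition is True and keeps the rest,
# so start from Number and overwrite the matching rows with "zero".
df["Classification"] = df["Number"].mask(is_milan_italy, "zero")
print(df)
```

This starts from the fallback column and overwrites only the matching rows, so no placeholder value ever needs to be replaced afterwards.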

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ynjxsjmh