Pandas function to rename certain column values based on a boolean condition in another column

I'm trying to clean a dataset that has demographic information for my company.

There is a text column for "Race" that contains the values ['White', 'Black', 'Asian', 'Two or More Races']. There is another boolean column for "Hispanic or Latino" that is either a 0 for no or a 1 for yes.

What I need to do is replace the value in the "Race" column with "Hispanic/Latino" whenever the "Hispanic or Latino" column is 1, unless the race is "Two or More Races", which should stay the same. Does anybody have a good solution for this? I'm relatively new to pandas, and I've tried using df.loc to solve this, but the examples I've seen aren't as specific as my case.



Solution 1:[1]

You can build a boolean mask that selects the rows to change, then assign the new value to those rows with df.loc:

mask = (df["Hispanic or Latino"] == 1) & (df['Race'] != 'Two or More Races')

df.loc[mask, 'Race'] = 'Hispanic/Latino'

Tested on a simple example:

import pandas as pd

df = pd.DataFrame({
    'Race': ['White', 'Black', 'Asian', 'Two or More Races'],
    "Hispanic or Latino": [0, 1, 0, 1],
})

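# True only for rows flagged as Hispanic/Latino whose race is not 'Two or More Races'
mask = (df["Hispanic or Latino"] == 1) & (df['Race'] != 'Two or More Races')

# Overwrite 'Race' in place for the selected rows only
df.loc[mask, 'Race'] = 'Hispanic/Latino'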

print(df)

Result:

                Race  Hispanic or Latino
0              White                   0
1    Hispanic/Latino                   1
2              Asian                   0
3  Two or More Races                   1
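
Since the question asks for a function, the same mask can be wrapped in a small helper. This is only a sketch: the name relabel_hispanic and its keyword arguments are made up for illustration and are not part of pandas. It uses Series.mask, which replaces values where the condition is True, and it returns a copy instead of modifying the original DataFrame.

```python
import pandas as pd


def relabel_hispanic(df, race_col="Race", flag_col="Hispanic or Latino",
                     keep="Two or More Races", new_label="Hispanic/Latino"):
    """Hypothetical helper: return a copy of df with race_col relabelled."""
    # True for rows flagged with 1 whose race is not the category to keep
    mask = (df[flag_col] == 1) & (df[race_col] != keep)
    out = df.copy()
    # Series.mask replaces values where the condition is True
    out[race_col] = out[race_col].mask(mask, new_label)
    return out


df = pd.DataFrame({
    "Race": ["White", "Black", "Asian", "Two or More Races"],
    "Hispanic or Latino": [0, 1, 0, 1],
})

cleaned = relabel_hispanic(df)
print(cleaned)
```

Working on a copy keeps the raw data available for comparison. If you prefer to update the DataFrame in place, the df.loc assignment above does the same job. It may also be worth checking how missing values in "Race" should be handled, because with a default object column a NaN compares as not equal to 'Two or More Races' and would be relabelled when the flag is 1.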

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 furas