Exclude rows containing a certain string

I have a dataset which looks like:

df.head()
applicationstartdate    segment fpd_30  fpd_90  fstpd_30
0   2020-01-01 00:04:10 3a.TBC Payroll with CB  0.0 0.0 0.0
1   2020-01-01 00:04:17 3a.TBC Payroll with CB  0.0 0.0 0.0
2   2020-01-01 00:14:25 1.TBC Payroll with CH (All) 0.0 0.0 0.0
3   2020-01-01 00:31:59 1.TBC Payroll with CH (All) 0.0 0.0 0.0
4   2020-01-01 00:41:49 1.TBC Payroll with CH (All) 0.0 0.0 0.

I want to exclude all rows that contain the word "Payroll" in the "segment" column.

I tried:

df2 = df[~df["segment"].str.contains('Payroll')]

which yielded:

TypeError: bad operand type for unary ~: 'float'

Help would be appreciated.



Solution 1:[1]

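You likely have NaNs in your column. For those rows str.contains returns NaN instead of True/False, so the mask is not boolean and ~ fails on the float NaN. To keep the NaN rows, you can use: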

df2 = df[~df["segment"].fillna('').str.contains('Payroll')]

Or, if you also want to filter out the NaNs:

df2 = df[~df["segment"].fillna('Payroll').str.contains('Payroll')]
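The two variants above can be compared on a small frame. This is only a sketch: the four-row DataFrame below is made up for illustration (the "2.Other segment" value and the NaN row are assumptions, not taken from the question's data).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two "Payroll" rows, one other segment, one NaN.
df = pd.DataFrame({
    "segment": [
        "3a.TBC Payroll with CB",
        "2.Other segment",          # made-up value for illustration
        np.nan,
        "1.TBC Payroll with CH (All)",
    ]
})

# Variant 1: NaN becomes '', which does not contain 'Payroll',
# so NaN rows survive the filter.
keep_nan = df[~df["segment"].fillna("").str.contains("Payroll")]

# Variant 2: NaN becomes 'Payroll', which matches,
# so NaN rows are dropped together with the real matches.
drop_nan = df[~df["segment"].fillna("Payroll").str.contains("Payroll")]

print(keep_nan.index.tolist())  # rows 1 and 2 remain
print(drop_nan.index.tolist())  # only row 1 remains
```

The fill value is used only to build the mask; the rows in the result still show their original NaN in "segment".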

Solution 2:[2]

You can pass na=True to str.contains. Because the condition is negated, treating NaN as a match means the NaN rows are filtered out as well.

df2 = df[~df['segment'].str.contains('Payroll', na=True)]
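As a rough illustration of how the na argument interacts with the negation, the sketch below runs both settings on a made-up frame (the sample values are assumptions for demonstration, not the asker's data). na=False is the counterpart that keeps the NaN rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two "Payroll" rows, one other segment, one NaN.
df = pd.DataFrame({
    "segment": [
        "3a.TBC Payroll with CB",
        "2.Other segment",          # made-up value for illustration
        np.nan,
        "1.TBC Payroll with CH (All)",
    ]
})

# na=True: NaN counts as a match, so after negation NaN rows are excluded.
without_nan = df[~df["segment"].str.contains("Payroll", na=True)]

# na=False: NaN counts as a non-match, so after negation NaN rows are kept.
with_nan = df[~df["segment"].str.contains("Payroll", na=False)]

print(without_nan.index.tolist())  # only row 1 remains
print(with_nan.index.tolist())     # rows 1 and 2 remain
```

Since 'Payroll' is a plain word, the default regex matching behaves the same as a literal search here; for a search term containing regex metacharacters, regex=False would treat it literally.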

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway
Solution 2 SomeDude