Replace whole string if it contains substring in pandas
I want to replace all strings that contain a specific substring. So for example if I have this dataframe:
import pandas as pd
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'football', 'basketball']})
I could replace football with the string 'ball sport' like this:
df.replace({'sport': {'football': 'ball sport'}})
What I want, though, is to replace every value that contains 'ball' (in this case football and basketball) with 'ball sport'. Something like this:
df.replace({'sport': {'[strings that contain ball]': 'ball sport'}})
Solution 1:[1]
You can use apply with a lambda. The x parameter of the lambda function will be each value in the 'sport' column:
df.sport = df.sport.apply(lambda x: 'ball sport' if 'ball' in x else x)
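One caveat with this approach: `'ball' in x` raises a TypeError when x is a missing value, because NaN is a float and not a string. A minimal sketch of a guarded variant follows; the helper name `replace_if_contains` and the extra row with a missing sport are assumptions added for illustration, not part of the original answer.

```python
import pandas as pd

def replace_if_contains(value, needle, replacement):
    # Hypothetical helper: only test real strings, so NaN/None pass through unchanged.
    if isinstance(value, str) and needle in value:
        return replacement
    return value

# Same data as the question, plus one invented row with a missing sport.
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice', 'Carol'],
                   'sport': ['tennis', 'football', 'basketball', None]})

# Series.apply forwards the extra positional arguments after each value.
df['sport'] = df['sport'].apply(replace_if_contains, args=('ball', 'ball sport'))
result = df['sport'].tolist()
```

Here 'football' and 'basketball' become 'ball sport', while 'tennis' and the missing entry are left as they were.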
Solution 2:[2]
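You can use str.replace with a regular expression that matches the whole string. Pass regex=True explicitly, because pandas 2.0 and later treat the pattern as a literal string by default:
df.sport.str.replace(r'^.*ball.*$', 'ball sport', regex=True)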
0 tennis
1 ball sport
2 ball sport
Name: sport, dtype: object
Assign the result back to the column to keep the change:
df['sport'] = df.sport.str.replace(r'^.*ball.*$', 'ball sport', regex=True)
df
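If the capitalisation of the values varies, the same whole-string pattern can be made case-insensitive. The following is a sketch only: the capitalised 'Football' entry is invented for illustration, and it relies on the `case` and `regex` parameters of `Series.str.replace`.

```python
import pandas as pd

# Invented sample: 'Football' is capitalised to show the case-insensitive match.
sports = pd.Series(['tennis', 'Football', 'basketball'], name='sport')

# regex=True makes pandas interpret the pattern as a regular expression;
# case=False makes the match ignore upper/lower case.
replaced = sports.str.replace(r'^.*ball.*$', 'ball sport', case=False, regex=True)
result = replaced.tolist()
```

Because the pattern is anchored with ^ and $ and padded with .* on both sides, any value containing the substring is replaced as a whole, not just the matching part.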
Solution 3:[3]
A different approach uses str.contains to build a boolean mask over the 'sport' column and assigns through .loc:
df.loc[df.sport.str.contains('ball'), 'sport'] = 'ball sport'
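To make the mask-based idea concrete, the sketch below builds the boolean mask as a separate step before assigning. The variable name `has_ball` is an assumption for illustration; `regex=False` and `na=False` are optional arguments of `Series.str.contains` that are used here to treat 'ball' as a literal and to count missing values as non-matches.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'football', 'basketball']})

# Boolean Series: True for every row whose sport contains the literal text 'ball'.
has_ball = df['sport'].str.contains('ball', regex=False, na=False)

# Assign through .loc so the original DataFrame is modified in one step,
# which avoids chained indexing such as df['sport'][mask] = ...
df.loc[has_ball, 'sport'] = 'ball sport'

mask_values = has_ball.tolist()
result = df['sport'].tolist()
```

Keeping the mask in its own variable also makes it easy to inspect which rows will change before overwriting them.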
Solution 4:[4]
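You can also use a lambda function. Here every value that contains 'IT' is replaced with 'IT' (the check is case-sensitive, so 'Digital' is left unchanged):
data = {"number": [1, 2, 3, 4, 5],
        "function": ['IT', 'IT application', 'IT digital', 'other', 'Digital']}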
df = pd.DataFrame(data)
df.function = df.function.apply(lambda x: 'IT' if 'IT' in x else x)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | DeepSpace |
| Solution 2 | piRSquared |
| Solution 3 | Axis |
| Solution 4 | prashangrg |

