Data cleaning: regex to replace numbers
I have this dataframe:
p=pd.DataFrame({'text':[2,'string']})
and I am trying to replace the digit 2 with an 'a' using this code:
p['text']=p['text'].str.replace('\d+', 'a')
But instead of the letter a, I get NaN.
What am I doing wrong here?
Solution 1:[1]
In your dataframe, the first value of the text column is actually a number, not a string. The .str methods only operate on strings and return NaN for any other value, which is why you get NaN rather than an error. Just convert the column to strings first:
p['text'] = p['text'].astype(str).str.replace(r'\d+', 'a', regex=True)
Output:
>>> p
     text
0       a
1  string
(Note that since pandas 2.0, .str.replace defaults to regex=False, so the pattern is only treated as a regular expression if you pass regex=True, e.g. .str.replace(r'\d+', 'a', regex=True).)
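To see the difference side by side, here is a minimal sketch that runs the replacement with and without the cast. The names broken and fixed are only illustrative and not part of the original question, and it assumes a pandas version that accepts the regex keyword.

```python
import pandas as pd

p = pd.DataFrame({'text': [2, 'string']})

# The column has dtype object and mixes an int with a str.
# .str methods leave non-string elements as NaN instead of raising.
broken = p['text'].str.replace(r'\d+', 'a', regex=True)

# Casting to str first turns every element into a string,
# so the pattern is applied to '2' as well.
fixed = p['text'].astype(str).str.replace(r'\d+', 'a', regex=True)

print(broken.tolist())  # first element is NaN, second is 'string'
print(fixed.tolist())   # both elements are strings
```

Note that astype(str) also turns any existing missing values into the literal string 'nan', so if the column may contain NaN it can be worth handling those separately.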
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | richardec |
