RegEx negation to handle decimal values in Pandas dataframe using .replace()
I have the following Pandas dataframe:
import numpy as np
import pandas as pd

foo = {
    'Sales': [200, 'bar', 400, 500],
    'Expenses': [70, 90, 'baz', 170],
    'Other': [2.5, 'spam', 70, 101.25],
}
df = pd.DataFrame(foo)
Sales  Expenses  Other
200    70        2.5
bar    90        spam
400    baz       70
500    170       101.25
I'd like to remove the non-numeric values and replace them with NaN. I do so as follows:
df['Other'] = df['Other'].replace(r'[^0-9\.]', np.nan, regex=True)
This gets me:
Sales  Expenses  Other
200    70        2.5
bar    90        NaN
400    baz       70
500    170       101.25
The decimal point does not seem to be handled. I expected [^0-9\.] to account for it, but the following pattern (without the escaped decimal point) produces the same output:
df['Other'] = df['Other'].replace(r'[^0-9]', np.nan, regex=True)
Sales  Expenses  Other
200    70        2.5
bar    90        NaN
400    baz       70
500    170       101.25
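A likely reason both patterns give the same result: with regex=True, .replace() only tests string elements against the pattern and leaves numeric objects untouched, so 2.5 and 101.25 never reach the regex at all. The sketch below illustrates this under that assumption. The names `other` and `as_strings` are illustrative and not from the question.

```python
import numpy as np
import pandas as pd

# Mixed column as in the question: real numbers plus one string.
other = pd.Series([2.5, 'spam', 70, 101.25], dtype=object)

# Only 'spam' is a string, so only 'spam' is tested against the pattern.
with_dot = other.replace(r'[^0-9.]', np.nan, regex=True)
without_dot = other.replace(r'[^0-9]', np.nan, regex=True)

# The same values stored as strings: here the '.' in the class matters,
# because '2.5' contains a character that [^0-9] matches.
as_strings = pd.Series(['2.5', 'spam', '70'], dtype=object)
strings_with_dot = as_strings.replace(r'[^0-9.]', np.nan, regex=True)
strings_without_dot = as_strings.replace(r'[^0-9]', np.nan, regex=True)

print(with_dot.isna().tolist())
print(without_dot.isna().tolist())
print(strings_with_dot.isna().tolist())
print(strings_without_dot.isna().tolist())
```

In the mixed column the two patterns behave identically. The difference only appears once the numbers are stored as text.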
How do I treat the decimals?
Thanks!
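One commonly used alternative to a regex here is pd.to_numeric with errors='coerce', which converts anything that cannot be parsed as a number to NaN. This is a minimal sketch, assuming the goal is to coerce every non-numeric cell in the frame. The name `cleaned` is illustrative.

```python
import pandas as pd

foo = {
    'Sales': [200, 'bar', 400, 500],
    'Expenses': [70, 90, 'baz', 170],
    'Other': [2.5, 'spam', 70, 101.25],
}
df = pd.DataFrame(foo)

# Apply pd.to_numeric column by column; unparseable values become NaN.
cleaned = df.apply(pd.to_numeric, errors='coerce')

print(cleaned)
```

For a single column, `df['Other'] = pd.to_numeric(df['Other'], errors='coerce')` does the same thing. The resulting columns are float64, since NaN is a floating-point value.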
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
