Python - Pandas CSV file with error when converting mixed data types string to number
I have a large CSV file. The column I am interested in, 'Result', contains integers, floats, NaN values, and numbers stored as text. I need to aggregate 'Result', but first I want to convert the column to a float data type. The text values have trailing spaces, for example: ["1.07 ", "8.22 ", "8.6 ", "11.41 ", "7.93 "]
The error I get is...
AttributeError: Can only use .str accessor with string values!
import pandas as pd
import os
import numpy as np
csv_file = 'c:/path/to/file/big.csv'
# ... more lines of code ...
df = pd.read_csv(csv_file, usecols=my_cols, parse_dates=['Date'])
df = df[df['Company ID'].str.contains(my_company)]
print('df of csv created')
# Above code works great.
# the below 2 tries did not work for me
# df['Result'] = pd.to_numeric(df['Result'].str.replace(' ', ''), errors='ignore')
# df['Result'] = df['Result'].str.strip() # causes an error
# now let's try np.where...
# the below causes AttributeError: Can only use .str accessor with string values!
df['Result'] = np.where(df['Result'].dtype == np.str, df['Result'].str.strip(),
                        df['Result'])
df['Result'] = pd.to_numeric(df['Result'], downcast="float", errors='raise')
How should I resolve this?
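A note on why the np.where attempt fails, followed by one possible workaround. Python evaluates every argument before np.where is called, so df['Result'].str.strip() runs regardless of the condition, and .str raises this AttributeError when the column does not hold string values (most likely because the column, after read_csv and filtering, is already numeric). The condition df['Result'].dtype == np.str is also a single True/False for the whole column rather than a per-row test, and np.str was only an alias for the built-in str that newer NumPy releases have removed. The sketch below assumes it is acceptable to cast every value to text first, strip it, and let pd.to_numeric turn anything unparseable into NaN; the sample DataFrame is hypothetical and only stands in for the real CSV data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the real 'Result' column:
# ints, floats, NaN, numbers stored as text with trailing spaces,
# and one value that is not a number at all.
df = pd.DataFrame({"Result": [1, 2.5, np.nan, "1.07 ", "8.22 ", "n/a"]})

# Cast everything to str so the .str accessor is always valid,
# then remove leading/trailing whitespace.
cleaned = df["Result"].astype(str).str.strip()

# errors="coerce" converts values that cannot be parsed (here "n/a")
# to NaN instead of raising.
df["Result"] = pd.to_numeric(cleaned, errors="coerce")

print(df["Result"].dtype)
print(df["Result"].tolist())
```

If unparseable values should stop the script rather than silently become NaN, errors='raise' (as in the original attempt) can be used instead of errors='coerce'.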
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
