pandas reading CSV data formatted with comma for thousands separator
I am trying to create a dataframe in pandas using a CSV that is semicolon-delimited, and uses commas for the thousands separator on numeric data. Is there a way to read this in so that the type of the column is float and not string?
Solution 1:[1]
Pass thousands=',' to read_csv so that the commas are treated as thousands separators and the column is parsed as numeric instead of string:
In [27]:
import pandas as pd
import io
t="""id;value
0;123,123
1;221,323,330
2;32,001"""
pd.read_csv(io.StringIO(t), thousands=r',', sep=';')
Out[27]:
   id      value
0   0     123123
1   1  221323330
2   2      32001
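Note that the value column in this output is parsed as an integer type, since none of the sample values has a fractional part. If a float column is specifically required, as the question asks, one option is to cast after reading. A minimal sketch, reusing the sample data from this answer (the variable name df is just for illustration):

```python
import io

import pandas as pd

# Same sample data as in the answer: semicolon-delimited,
# with commas as thousands separators.
t = """id;value
0;123,123
1;221,323,330
2;32,001"""

df = pd.read_csv(io.StringIO(t), sep=";", thousands=",")

# No value has a fractional part, so pandas infers an integer column here.
print(df["value"].dtype)

# Cast explicitly if a float column is required.
df["value"] = df["value"].astype(float)
print(df["value"].dtype)
```

If the real file already contains decimal values, the column should come back as float without the extra cast.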
Solution 2:[2]
The answer to this question should be short (sep=';' is needed as well, because the file in the question is semicolon-delimited):
df = pd.read_csv('filename.csv', sep=';', thousands=',')
Solution 3:[3]
Take a look at the read_csv documentation: there is a keyword argument, thousands, to which you can pass ','. Likewise, if you had European-style data that uses '.' as the thousands separator, you could pass thousands='.' instead (together with decimal=',' if the decimal mark is a comma).
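For the European-style case, a rough sketch might look like the following. The sample data and the names eu and df_eu are made up for illustration, and the sketch assumes the decimal mark is a comma, which is why decimal=',' is passed alongside thousands='.':

```python
import io

import pandas as pd

# Hypothetical sample: '.' groups thousands and ',' marks the decimal point.
eu = """id;value
0;1.234,56
1;12.345.678,90
2;32,5"""

df_eu = pd.read_csv(io.StringIO(eu), sep=";", thousands=".", decimal=",")

# The value column should be parsed as floats rather than strings.
print(df_eu)
print(df_eu["value"].dtype)
```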
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | EdChum |
| Solution 2 | Dimanjan |
| Solution 3 | Grr |
