Dask: "cannot perform std with type object"
Performing a normal calculation on a Dask DataFrame gives me an error:
x_std = x.std().compute()
Computing head:
x.head()
                 LocalTime     Ask     Bid
0  2004.10.25 00:01:01.975  86.837  86.877
1  2004.10.25 00:01:19.300  86.791  86.891
2  2004.10.25 00:01:30.759  86.812  86.842
3  2004.10.25 00:01:41.798  86.801  86.831
4  2004.10.25 00:01:42.213  86.794  86.824
Error:
TypeError: cannot perform std with type object
I was following the documentation ...
Solution 1:[1]
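The output of x.head() shows that one of the columns holds datetimes. Without an explicit conversion, it is likely stored as an object column, and std cannot be computed on object columns. To check the dtypes, run the following (here ddf is the Dask DataFrame, called x in the question):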
print(ddf.dtypes)
To convert, use dd.to_datetime as explained in this blog post:
from dask.dataframe import to_datetime
# note this overwrites the original column
ddf["LocalTime"] = to_datetime(ddf["LocalTime"])
If the other two columns, Ask and Bid, are also stored as object columns, then a second conversion, to numeric, is needed (see this blog post for details):
from dask.dataframe import to_numeric
ddf["Ask"] = to_numeric(ddf["Ask"], errors="coerce")
ddf["Bid"] = to_numeric(ddf["Bid"], errors="coerce")
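One thing to keep in mind with errors="coerce": values that cannot be parsed become NaN instead of raising an error, so bad rows can go unnoticed. The sketch below uses plain pandas on a made-up Series (the "n/a" entry is hypothetical, not from the question) to show the effect. It assumes that the Dask function applies the same pandas semantics to each partition.

```python
import pandas as pd

# Hypothetical sample: two valid prices and one unparseable entry.
raw = pd.Series(["86.837", "n/a", "86.812"])

# errors="coerce" turns anything that cannot be parsed into NaN.
converted = pd.to_numeric(raw, errors="coerce")

# Count the values that were silently coerced to NaN.
n_bad = int(converted.isna().sum())
print(converted.dtype, n_bad)
```

Checking the NaN count after conversion is a cheap way to confirm that the column really contained only numbers.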
After the conversion, ddf_std = ddf.std().compute() should work without error.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
