Using astype on a Koalas column gives a strange column dtype of <U0
I have a column called purchase_date in my Koalas dataframe. In a Databricks notebook on runtime 10.3, the code below reports the dtype of the purchase_date column as <U0 after I call astype. I am not able to understand why this is happening.
The code that produces this (on Databricks runtime 10.3) is as follows:
import databricks.koalas as ks
print("Datatype of purchase_date before astype:" , my_ks_dataframe['purchase_date'].dtype) # Datatype of purchase_date before astype: object
# Using the astype
my_ks_dataframe['purchase_date'] = my_ks_dataframe['purchase_date'].astype('str')
print("Datatype of purchase_date after astype:" , my_ks_dataframe['purchase_date'].dtype) # Datatype of purchase_date after astype: <U0
I am not sure why I see this behaviour on Databricks runtime 10.3. When I run the same code on Databricks runtime 8.1, the dtype of purchase_date is object both before and after the astype call, which is what I expect.
# print result in Databricks runtime 8.1
Datatype of purchase_date before astype: object
Datatype of purchase_date after astype: object
Solution 1:[1]
Koalas is included on clusters running Databricks Runtime 7.3 through 9.1. For clusters running Databricks Runtime 10.0 and above, use Pandas API on Spark instead.
Reference: https://docs.microsoft.com/en-us/azure/databricks/languages/koalas
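As background on the value itself: <U0 is how NumPy prints its Unicode string dtype when no length is given, which is what np.dtype('str') produces. One plausible reading, which is an assumption and not confirmed by the question or the linked documentation, is that the Koalas build on the newer runtime reports this NumPy dtype instead of object. A minimal sketch using only NumPy (it does not touch Koalas or Spark):

```python
import numpy as np

# NumPy maps the type name 'str' to its Unicode string dtype.
str_dtype = np.dtype('str')

# kind 'U' means Unicode; an itemsize of 0 means no fixed length was given,
# which is where the trailing 0 in '<U0' comes from.
print(str_dtype.kind, str_dtype.itemsize)

# For comparison, the generic object dtype that the question expected to see.
obj_dtype = np.dtype('object')
print(obj_dtype.kind)
```

This only shows what the <U0 label means; it does not demonstrate how Koalas derives the dtype internally.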
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | PratikLad-MT |
