Cannot cast LongType to DateType error
(New to PySpark)
I have done a lot of searching and tried many different approaches; I'm posting my latest attempt here. I have a dataframe that looks like:
txn_dt datetime64[ns] id int64
I'm trying to use Spark SQL to join txn_dt to another column of date type and get a type mismatch error, so I'm trying to convert txn_dt to a date using the following code:
from pyspark.sql.types import DateType

df = df.withColumn("txn_dt_tmp", df["txn_dt"].cast(DateType())) \
    .drop("txn_dt") \
    .withColumnRenamed("txn_dt_tmp", "txn_dt")
but I get:
org.apache.spark.sql.AnalysisException: cannot resolve 'CAST(`txn_dt` AS DATE)' due to data type mismatch: cannot cast LongType to DateType;
Please help
Solution 1:[1]
One solution is to use from_unixtime, which interprets your original column as seconds since the Unix epoch and returns a timestamp string, then use to_date to convert that to DateType.
import pyspark.sql.functions as f
df = df.withColumn("txn_dt", f.to_date(f.from_unixtime(f.col("txn_dt"))))
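One caveat, offered as an assumption rather than something the question confirms: from_unixtime reads its input as seconds, while a pandas datetime64[ns] value that ends up as a long is typically nanoseconds since the epoch. A quick plain-Python check on one sample value can suggest which unit the column holds. The helper name guess_epoch_unit, the plausible-year range, and the sample value are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical helper: try each candidate unit and return the first one
# whose interpretation lands in a plausible year range.
def guess_epoch_unit(value, min_year=1990, max_year=2100):
    for unit, divisor in (("s", 1), ("ms", 10**3), ("us", 10**6), ("ns", 10**9)):
        seconds = value // divisor
        try:
            year = datetime.fromtimestamp(seconds, tz=timezone.utc).year
        except (OverflowError, OSError, ValueError):
            continue  # far too large to be a timestamp in this unit
        if min_year <= year <= max_year:
            return unit, divisor
    return None

# Assumed sample: 2020-01-01 00:00:00 UTC expressed in nanoseconds.
sample = 1_577_836_800_000_000_000
print(guess_epoch_unit(sample))  # ('ns', 1000000000)
```

If the unit turns out not to be seconds, the column would need to be scaled down to seconds (for example, divided by the returned divisor and cast back to long) before from_unixtime is applied.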
Solution 2:[2]
Another solution in Scala, first casting to Timestamp and then getting the Date:
import org.apache.spark.sql.functions.{col, to_date}
import org.apache.spark.sql.types.TimestampType
df.withColumn("txn_dt", to_date(col("txn_dt").cast(TimestampType)))
Solution 3:[3]
Try converting txn_dt to "timestamp" first and then casting it to the "date" datatype.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Javier Montón |
| Solution 3 | Marioanzas |
