'Cannot cast LongType to DateType' error

(New to PySpark)

I have done a lot of searching and tried many different approaches; I'm posting my latest attempt here. I have a dataframe that looks like:

txn_dt    datetime64[ns]
id        int64

I'm trying to use Spark SQL to join txn_dt against a column of date type and I get a type mismatch error, so I'm trying to convert txn_dt to a date using the following code:

from pyspark.sql.types import DateType

df = df.withColumn("txn_dt_tmp", df["txn_dt"].cast(DateType())) \
       .drop("txn_dt") \
       .withColumnRenamed("txn_dt_tmp", "txn_dt")

but I get:

org.apache.spark.sql.AnalysisException: cannot resolve 'CAST(`txn_dt` AS DATE)' due to data type mismatch: cannot cast LongType to DateType;

Please help



Solution 1:[1]

One solution is to use from_unixtime to turn your original long column into a timestamp string, then use to_date to convert that to DateType.

import pyspark.sql.functions as f

# withColumn returns a new DataFrame, so assign the result back
df = df.withColumn("txn_dt", f.to_date(f.from_unixtime(f.col("txn_dt"))))
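
One thing worth checking before relying on this: from_unixtime interprets its input as seconds since the Unix epoch, while pandas datetime64[ns] stores nanoseconds. If the long values in txn_dt are nanoseconds (an assumption, the question does not show sample values), they would need scaling first. Below is a minimal plain-Python sketch of that scaling; the function name and the sample value are hypothetical and only there for illustration.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def epoch_nanos_to_date(nanos):
    """Convert nanoseconds since the Unix epoch to a UTC calendar date."""
    # Integer division drops the sub-second part, leaving whole seconds
    seconds = nanos // NANOS_PER_SECOND
    return datetime.fromtimestamp(seconds, tz=timezone.utc).date()


# Hypothetical sample: 2020-01-01 00:00:00 UTC expressed in nanoseconds
sample_nanos = 1_577_836_800 * NANOS_PER_SECOND
print(epoch_nanos_to_date(sample_nanos))  # prints 2020-01-01
```

In Spark the same scaling would be applied to the column before calling from_unixtime, for instance by dividing by 1000000000 and casting the result back to long. If the values are already in seconds, no scaling is needed.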

Solution 2:[2]

Another solution in Scala, first casting to Timestamp and then getting the Date:

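import org.apache.spark.sql.functions.{col, to_date}
import org.apache.spark.sql.types.TimestampType

// Cast the long to a timestamp first, then take the date part
df.withColumn("txn_dt", to_date(col("txn_dt").cast(TimestampType)))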

Solution 3:[3]

Try converting txn_dt to "timestamp" first, and then cast the result to the "date" datatype.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Javier Montón
Solution 3 Marioanzas