TypeError: Datetime subtraction can only be applied to datetime series

I am trying to replace pandas with the pyspark.pandas library. Here pdf is a pyspark.pandas DataFrame, and I tried this:

pdf["date_diff"] = pdf["date1"] - pdf["date2"] 

I got the error below:

File "C:\Users\abc\Anaconda3\envs\test\lib\site-packages\pyspark\pandas\data_type_ops\datetime_ops.py", line 75, in sub
raise TypeError("Datetime subtraction can only be applied to datetime series.")

TypeError: Datetime subtraction can only be applied to datetime series.



Solution 1:[1]

Convert both columns to datetime values before subtracting:

pdf['date1'] = pd.to_datetime(pdf['date1'])
pdf['date2'] = pd.to_datetime(pdf['date2'])
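
For illustration, here is a minimal sketch of the same idea as a small helper. The names add_date_diff, lib, and frame are hypothetical and not from the question. The sketch runs on plain pandas so it can be tried without a Spark session. It assumes that on Spark you would pass pyspark.pandas as lib, since that module provides its own to_datetime, and that the columns hold date strings, which is the likely cause of the error.

```python
import pandas as pd  # stand-in; on Spark, `import pyspark.pandas as ps` and pass lib=ps


def add_date_diff(df, lib=pd):
    """Convert date1/date2 to datetime with lib.to_datetime, then subtract."""
    out = df.copy()
    # String (object) columns are what trigger the TypeError, so convert first.
    out["date1"] = lib.to_datetime(out["date1"])
    out["date2"] = lib.to_datetime(out["date2"])
    out["date_diff"] = out["date1"] - out["date2"]
    return out


# Hypothetical sample data: dates stored as strings.
frame = pd.DataFrame(
    {
        "date1": ["2022-03-10", "2022-03-15"],
        "date2": ["2022-03-01", "2022-03-05"],
    }
)
result = add_date_diff(frame)
print(result.dtypes)
```

With plain pandas, date_diff comes back as timedelta64[ns]. In the pyspark.pandas versions I am aware of, timestamp subtraction instead returns the difference as integer seconds and emits a warning about it, so it is worth checking the dtype of the result before relying on it.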

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 pepijn