How to calculate Max(Date) and Min(Date) for DateType in pyspark dataframe?
The dataframe has a date column stored as a string, e.g. '2017-01-01'.
It is converted to DateType():
df = df.withColumn('date', col('date_string').cast(DateType()))
I would like to calculate the first day and last day of the column, i.e. its earliest and latest dates. I tried the following code, but none of it works. Can anyone give any suggestions? Thanks!
df.select('date').min()
df.select('date').max()
df.select('date').last_day()
df.select('date').first_day()
Solution 1:[1]
You can do it in one line with an aggregation:
import pyspark.sql.functions as F
df.agg(F.min("date"), F.max("date")).show()
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ivan M. |
