'pyspark sql - compare first day of month with string
I have a column in a dataframe firstday, in timestamp format, looks like this 2022-03-01 00:00:00 I'm trying to pass run_date a string parameter at run time as filter. But the sql below didn't work. How do you achieve this? Many thanks!
Trying to convert run_date , which looks like 2022-03-28 to the first day of the month and use it to compare with firstday
spark.sql("select * from df where firstday = date_trunc('mon','{}')".format(run_date))
Solution 1:[1]
Almost there, little formatting issues
orginal df
+---+----------+
| id| firstday|
+---+----------+
| 1|2022-03-01|
| 2|2022-03-17|
+---+----------+
Apply filter
run_date="'2022-03-21'"
filtereddf=spark.sql(("select * from df where firstday ==date_trunc('mon',{})").format(run_date))
filtereddf.show()
or
run_date="2022-03-21"
filtereddf=spark.sql(("select * from df where firstday ==date_trunc('mon','{}')").format(run_date))
filtereddf.show()
+---+----------+
| id| firstday|
+---+----------+
| 1|2022-03-01|
+---+----------+
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | wwnde |
