Spark - detailed explanation of the date_format function

Where is date_format explained in detail, such as which formats are accepted for the timestamp or expr argument?

date_format(timestamp, fmt) - Converts timestamp to a value of string in the format specified by the date format fmt.

  • timestamp - A date/timestamp or string to be converted to the given format.
  • fmt - Date/time format pattern to follow. See Datetime Patterns for valid date and time format patterns.

date_format(expr, fmt)

  • expr: A DATE, TIMESTAMP, or a STRING in a valid datetime format.
  • fmt: A STRING expression describing the desired format.

2007-11-13 is OK for the timestamp expression but 2007-NOV-13 is not. Where is the explanation of this behavior?

spark.sql("select date_format(date '2007-11-13', 'MMM') AS month_text").show()
+----------+
|month_text|
+----------+
|       Nov|
+----------+
spark.sql("select date_format(date '2007-JAN-13', 'MMM') AS month_text").show()
...
ParseException: 
Cannot parse the DATE value: 2007-JAN-13(line 1, pos 19)

I suppose the time expression needs to be ISO 8601 but it should be documented somewhere.

The timestamp expression can be written as date '2007-11-13' or timestamp '2007-11-13', but where is this expression format documented?



Solution 1:[1]

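date '...' is a datetime literal, and it has to follow a specific pattern (for dates, yyyy-[m]m-[d]d and its shorter forms, as listed under datetime literals in the Literals page of the Spark SQL reference). It's mostly used when you need to hardcode a date or you already have a datetime in ISO-8601 format.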

To parse a date from a string, use the to_date function. To convert a string representation from one format to another, first parse it with to_date, then format the result with date_format. In your case:

scala> spark.sql("select date_format(to_date('2007-JAN-13', 'yyyy-MMM-dd'), 'MMM') AS month_text").show()
+----------+
|month_text|
+----------+
|       Jan|
+----------+
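
The same parse-then-format idea can be sketched outside Spark. The snippet below is only an analogy in plain Python (standard library, no Spark involved): date.fromisoformat stands in for the strict date '...' literal, and strptime with an explicit pattern stands in for to_date. The helper names is_iso_date and month_text are made up for illustration, and the %b directive is assumed to run under the default C/English locale.

```python
from datetime import date, datetime


def is_iso_date(raw: str) -> bool:
    """Return True if raw is accepted by the strict ISO parser.

    Loosely analogous to Spark's date '...' literal, which only
    accepts a fixed pattern (this is an analogy, not Spark's parser).
    """
    try:
        date.fromisoformat(raw)
        return True
    except ValueError:
        return False


def month_text(raw: str, pattern: str) -> str:
    """Parse raw with an explicit pattern, then format the month name.

    Step 1 (parse) plays the role of to_date(raw, pattern);
    step 2 (format) plays the role of date_format(..., 'MMM').
    """
    parsed = datetime.strptime(raw, pattern).date()
    # %b is the abbreviated month name; it depends on the active locale.
    return parsed.strftime("%b")


print(is_iso_date("2007-11-13"))    # ISO form: accepted by the strict parser
print(is_iso_date("2007-JAN-13"))   # month name: rejected by the strict parser
print(month_text("2007-JAN-13", "%Y-%b-%d"))  # parsed via an explicit pattern
```

The point is the same as in the Spark query above: a fixed-pattern literal is the wrong tool for free-form input, so parse with an explicit pattern first and format afterwards.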

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Kombajn zbożowy