Date format issue in PySpark

I have an issue with formatting the date.

The actual value for the date is 2021/12/28

I am trying to format it as 20211228 with the following code:

awardTypeDFrame = awardTypeDFrame.withColumn('ACTIONDATE', sf.date_format(sf.col('pyresolvedtimestamp'), 'YYYYMMdd'))

which gives me 20221228. So basically it rounds the year up from 2021 to 2022. I have been trying to figure out why this happens but could not find an explanation. Do you have any idea why it does not pick the right year? Please let me know. Thanks in advance for your time.



Solution 1:[1]

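The code snippet you shared is structurally correct; the problem is the pattern. Uppercase `Y` means the week-based year, while lowercase `y` means the calendar year. 2021/12/28 falls in a week that is counted as belonging to 2022, so `YYYY` returns 2022. Use `yyyy` instead.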

The supported datetime patterns for formatting and parsing are documented at https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html

Date Format

awardTypeDFrame = awardTypeDFrame.withColumn('ACTIONDATE', sf.date_format(sf.col('pyresolvedtimestamp'), 'yyyyMMdd'))
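
To illustrate why an uppercase `Y` can produce 2022 for a date in late December 2021, here is a minimal plain-Python sketch (no Spark required). It is not Spark's implementation. It assumes the week rules that Java's legacy formatter applies in a US locale: weeks run Sunday to Saturday, and week 1 is the week containing January 1. The helper name `week_based_year` is made up for this example.

```python
from datetime import date, timedelta


def week_based_year(d: date) -> int:
    """Week-based year, ASSUMING Sunday-to-Saturday weeks and that
    week 1 of a year is the week containing January 1."""
    # date.weekday() returns Monday=0 ... Sunday=6, so shift to Sunday=0.
    days_since_sunday = (d.weekday() + 1) % 7
    # Under the assumed rules, a week belongs to the year of its Saturday.
    saturday = d + timedelta(days=6 - days_since_sunday)
    return saturday.year


d = date(2021, 12, 28)

# Week-based year, analogous to the 'YYYYMMdd' pattern
print(f"{week_based_year(d)}{d:%m%d}")

# Calendar year, analogous to the 'yyyyMMdd' pattern
print(d.strftime("%Y%m%d"))
```

Under these assumptions, the week containing 2021/12/28 runs from Sunday 2021/12/26 to Saturday 2022/01/01, so its week-based year is 2022, while the calendar year is still 2021.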

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Vaebhav