Unix timestamp granularity changed to hours instead of seconds

I have a Spark DataFrame with a timestamp column, and I need to derive an event_hour column from it, also as a unix timestamp. The problem is that timestamp has per-second granularity (the sample values below are epoch seconds), while I need it truncated to the hour.

Current values for timestamp:
1653192037
1653192026
1653192025
1653192024
1653192023
1653192022

Expected values:
1653192000
1653195600
1653199200
1653202800

How can I achieve that using Spark functions? I've already tried converting it to a timestamp and then formatting it, but I got null as the result:

inputDf
.withColumn("event_hour", unix_timestamp(date_format($"timestamp".cast(TimestampType), "MM-dd-yyyy HH")))
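Since the sample values are epoch seconds, truncating to the hour is just modular arithmetic: subtract the remainder modulo 3600. A minimal sketch outside Spark (plain Java, with a sample value taken from the question; the class name is illustrative):

```java
public class HourTruncation {
    public static void main(String[] args) {
        // Sample epoch-seconds value from the question
        long ts = 1653192037L;
        // Drop the minutes and seconds within the hour
        long eventHour = ts - (ts % 3600);
        System.out.println(eventHour); // prints 1653192000
    }
}
```

The same arithmetic can be expressed as a column expression in Spark; Spark also ships a built-in date_trunc function that truncates a timestamp column to a given unit such as "hour".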


Solution 1:[1]

You can use the DateUtils API:

import java.util.Calendar;
import org.apache.commons.lang3.time.DateUtils;

Long epochTimestamp_hour = DateUtils.truncate(Timestamp_column, Calendar.HOUR).getTime();
  • create a new column of type timestamp
  • use that column to truncate the timestamp to epochTimestamp_hour
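DateUtils.truncate comes from Apache Commons Lang, an external dependency, and it returns a Date whose getTime() is in milliseconds. If that library is not on the classpath, the JDK's own java.time API can perform the same hour truncation and go straight back to epoch seconds. A minimal stdlib sketch (class name and sample value are illustrative, not from the solution):

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class TruncateWithJavaTime {
    public static void main(String[] args) {
        // Sample epoch-seconds value from the question
        long epochSeconds = 1653192026L;
        // Truncate the instant to the start of its hour
        long eventHour = Instant.ofEpochSecond(epochSeconds)
                .truncatedTo(ChronoUnit.HOURS)
                .getEpochSecond();
        System.out.println(eventHour); // prints 1653192000
    }
}
```

Either variant could be wrapped in a UDF to populate the new column, though a built-in column expression is generally preferable to a UDF in Spark for performance.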

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Mayuri