'Unix timestamp granularity changed to hours instead of milliseconds
I have a Spark data frame with the column timestamp. I need to create event_hour in unix_timestamp format out of this column. The current issue is that the timestamp is in unix_timestamp format with a granularity of milliseconds while I need the granularity of hours.
Current values for timestamp:
1653192037
1653192026
1653192025
1653192024
1653192023
1653192022
Expected values:
1653192000
1653195600
1653199200
1653202800
How can I achieve that using Spark functions? I've already tried to convert it to timestamp and then format it but I got null as the result:
inputDf
.withColumn("event_hour", unix_timestamp(date_format($"timestamp".cast(TimestampType), "MM-dd-yyyy HH")))
Solution 1:[1]
You can use DateUtils API,
import org.apache.commons.lang3.time.DateUtils;
Long epochTimestamp_hour = DateUtils.truncate(Timestamp_column, Calendar.HOUR)).getTime();
- create new column of type timestamp
- use that column to truncate timestamp to epochTimestamp_hour
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Mayuri |
