How to aggregate timestamps by span (Spark DataFrame)
Input DataFrame
+-----------+-----------+----------+-------------------------+
|start_ts   |end_ts     |event_type|category                 |
+-----------+-----------+----------+-------------------------+
|1577865600 |1577869200 |buy       |Entertainment-equipment  |
|1577865660 |1577869260 |buy       |Everyday-jewelry         |
|1577865720 |1577869320 |view      |Cameras                  |
|1577865720 |1577869320 |view      |Luggage                  |
|1577865780 |1577869380 |view      |Mobile-phones            |
|1577865780 |1577869380 |buy       |Everyday-jewelry         |
+-----------+-----------+----------+-------------------------+
How can I aggregate these timestamps into spans of 3600 seconds?
Output DataFrame
+-----------+-----------+
|start_ts   |end_ts     |
+-----------+-----------+
|1577865600 |1577869200 |
|1577869200 |1577872800 |
+-----------+-----------+
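One possible reading of the expected output is that it lists every 3600-second window, aligned to multiples of the span, that is touched by at least one event's [start_ts, end_ts] interval. The sketch below assumes that reading. The function names `span_windows` and `span_windows_df` are hypothetical, and the columns are assumed to hold Unix timestamps in seconds. The pure-Python helper only documents the arithmetic; the PySpark function expresses the same idea with `floor`, `sequence` and `explode`.

```python
def span_windows(rows, span=3600):
    """Pure-Python reference: distinct span-aligned windows touched by any (start_ts, end_ts)."""
    starts = set()
    for start_ts, end_ts in rows:
        first = start_ts // span * span  # window containing start_ts
        last = end_ts // span * span     # window containing end_ts
        starts.update(range(first, last + span, span))
    return [(s, s + span) for s in sorted(starts)]


def span_windows_df(df, span=3600):
    """Same logic on a Spark DataFrame with start_ts / end_ts columns (assumed seconds)."""
    # Imported here so the helper above can be run without Spark installed.
    from pyspark.sql import functions as F

    step = F.lit(span).cast("long")
    # Round both timestamps down to the nearest multiple of the span.
    first = F.floor(F.col("start_ts") / step) * step
    last = F.floor(F.col("end_ts") / step) * step
    return (
        # One row per window start between first and last (inclusive).
        df.select(F.explode(F.sequence(first, last, step)).alias("start_ts"))
        .distinct()
        .withColumn("end_ts", F.col("start_ts") + step)
        .orderBy("start_ts")
    )
```

Calling `span_windows_df(df).show(truncate=False)` on the input DataFrame should print the resulting windows. If the goal is instead to count or group events per window, grouping by the floored `start_ts` would be the more direct approach.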
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
