How to aggregate timestamps by span (Spark DataFrame)

Input DataFrame

+-----------+-----------+----------+-------------------------+
|start_ts   |end_ts     |event_type|category                 |
+-----------+-----------+----------+-------------------------+
|1577865600 |1577869200 |buy       |Entertainment-equipment  |
|1577865660 |1577869260 |buy       |Everyday-jewelry         |
|1577865720 |1577869320 |view      |Cameras                  |
|1577865720 |1577869320 |view      |Luggage                  |
|1577865780 |1577869380 |view      |Mobile-phones            |
|1577865780 |1577869380 |buy       |Everyday-jewelry         |
+-----------+-----------+----------+-------------------------+

How can I aggregate the timestamps into spans of 3600 seconds?

Output DataFrame

+-----------+-----------+
|start_ts   |end_ts     |
+-----------+-----------+
|1577865600 |1577869200 |
|1577869200 |1577872800 |
+-----------+-----------+
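
One possible reading of the expected output is that every timestamp in either column is floored to a multiple of the span, and the distinct spans are then listed with their start and end. The following is a minimal PySpark sketch under that assumption, not a confirmed answer to the question. The helper names `span_bounds` and `aggregate_by_span` are hypothetical, and the sketch assumes `start_ts` and `end_ts` are integer epoch seconds that are non-negative.

```python
def span_bounds(ts, span=3600):
    """Return (span_start, span_end) of the fixed-width span containing ts.

    Assumes ts is a non-negative epoch timestamp in seconds.
    """
    span_start = ts - ts % span
    return span_start, span_start + span


def aggregate_by_span(df, span=3600):
    """Hypothetical helper: distinct spans touched by start_ts or end_ts."""
    # Imported here so span_bounds can be tried without a Spark installation.
    from pyspark.sql import functions as F

    # Stack both timestamp columns into a single column named "ts".
    all_ts = df.select(F.col("start_ts").alias("ts")).union(
        df.select(F.col("end_ts").alias("ts"))
    )
    # Same arithmetic as span_bounds, expressed on columns.
    span_start = F.col("ts") - F.col("ts") % span
    return (
        all_ts.select(span_start.alias("start_ts"))
        .distinct()
        .withColumn("end_ts", F.col("start_ts") + span)
        .orderBy("start_ts")
    )
```

Calling `aggregate_by_span(df)` on the input DataFrame should produce one row per 3600-second span. If the intent is instead to group events and count them per span, the same `span_start` expression could serve as a `groupBy` key.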


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow