'Resampling Session/Time Data (T SQL via Azure Synapse Serverless/OnDemand)
I am working with data which has a timestamp of activity (start timestamp) an identifier of content and then start second within that content alongside a duration viewed. e.g.
| start timestamp | identifier | start_second | duration(s) | type |
|---|---|---|---|---|
| 2022-01-01 12:00:00 | ABC | 0 | 720 | YouTube |
| 2022-01-01 12:01:00 | ABC | 600 | 100 | YouTube |
| 2022-01-01 12:05:00 | XYZ | 0 | 150 | Netflix |
Using T-SQL in Azure Synapse Serverless, I want to be able to split and resample this data by minute rather than a set duration e.g.
| start timestamp | identifier | start_second | duration(s) | type |
|---|---|---|---|---|
| 2022-01-01 12:00:00 | ABC | 0 | 60 | YouTube |
| 2022-01-01 12:01:00 | ABC | 60 | 60 | YouTube |
| 2022-01-01 12:02:00 | ABC | 120 | 60 | YouTube |
| 2022-01-01 12:03:00 | ABC | 180 | 60 | YouTube |
| 2022-01-01 12:04:00 | ABC | 240 | 60 | YouTube |
| 2022-01-01 12:05:00 | ABC | 300 | 60 | YouTube |
| 2022-01-01 12:06:00 | ABC | 360 | 60 | YouTube |
| 2022-01-01 12:07:00 | ABC | 420 | 60 | YouTube |
| 2022-01-01 12:08:00 | ABC | 480 | 60 | YouTube |
| 2022-01-01 12:09:00 | ABC | 540 | 60 | YouTube |
| 2022-01-01 12:10:00 | ABC | 600 | 60 | YouTube |
| 2022-01-01 12:11:00 | ABC | 660 | 60 | YouTube |
| 2022-01-01 12:01:00 | ABC | 600 | 60 | YouTube |
| 2022-01-01 12:02:00 | ABC | 660 | 40 | YouTube |
| 2022-01-01 12:05:00 | XYZ | 0 | 60 | Netflix |
| 2022-01-01 12:06:00 | XYZ | 60 | 60 | Netflix |
| 2022-01-01 12:07:00 | XYZ | 120 | 30 | Netflix |
I am a bit lost on how I can do this for the whole 24 hours in the day for the type of data I have above. The point being, aggregations to the timestamp can be carried out following this on different dimensions to calculate the total duration per minute or stats to that effect.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
