'Transforming corresponding Start-Stop Time in rows to columns
I have a table in this format:
| User | Time | status |
|---|---|---|
| User 1 | 2021-12-24 14:00:00 | connect |
| User 2 | 2021-12-24 14:05:23 | connect |
| User 2 | 2021-12-24 14:10:11 | disconnect |
| User 2 | 2021-12-24 14:10:17 | connect |
| User 1 | 2021-12-24 19:35:22 | disconnect |
| User 2 | 2021-12-25 01:10:40 | disconnect |
I'd like to have a table with connect & disconnect as columns, one row for each session from connect to next disconnect per user
| User | connect | disconnect |
|---|---|---|
| User 1 | 2021-12-24 14:00:00 | 2021-12-24 19:35:22 |
| User 2 | 2021-12-24 14:05:23 | 2021-12-24 14:10:11 |
| User 2 | 2021-12-24 14:10:17 | 2021-12-25 01:10:40 |
I could create this on MySQL, MariaDB or MSSQL, depending on where it's easier to do. Is it possible to do as a view? Great addon, but not absolutely necessary: column "duration" that shows the duration of each session from connect to disconnect.
If easier, connect/disconnect times could be (mili)seconds from 1970/1/1.
Solution 1:[1]
Try the following solution, it should work in most RDBMS, being careful to correctly delimit column names that clash with reserved words.
First you need to pair up the connects/disconnects, this can be done by numbering the rows and then duplicating every-other number using a modulo, then using conditional aggregation:
with rn as (
select *,
Row_Number() over(partition by user order by time) rn
from t
), gp as (
select *,
case when rn %2=0 then
Lag(rn) over(partition by user order by rn)
else rn end gp
from rn
)
select
[user],
Max(case when status = 'connect' then time end) Connect,
Max(case when status = 'disconnect' then time end) Disconnect
from gp
group by user, gp
Solution 2:[2]
The Conditional Aggregation can be used after applying ROW_NUMBER() window function if your available databases' versions are 8.0+, 10.2+ and 2005 for MySQL, MariaDB and SQL Server respectively such as
WITH t2 AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY user,status ORDER BY time) AS rn,
t.*
FROM t
)
SELECT User,
MAX(CASE WHEN status = 'connect' THEN time END) AS connect,
MAX(CASE WHEN status = 'disconnect' THEN time END) AS disconnect
FROM t2
GROUP BY user, rn
ORDER BY user, rn
where
the dataset is grouped by
userandstatuscolumns while sorted bytimecolumnuser, which's a reserved keyword, must be replaced by[user]as a column name for SQL Server DB
Solution 3:[3]
According to the general way of thinking, you just need to sort records arranged by user and time according to user, and, in each group, get the disconnect time from each even numbered row. SQL uses a roundabout method to do this. The method uses window function and specifically creates “row numbers” to achieve the goal. A convenient alternative is to import data out of database and handle it in Python or SPL. SPL, the open-source Java package, is easier to be integrated into a Java program and generate more concise code. It can accomplish the task with only two lines of code:
data=MYSQL.query("SELECT USER,TIME FROM t ORDER BY USER,TIME")
result=data.new(USER,TIME:CONNECT,if(#%2==1,TIME[+1]):DISCONNECT).select(DISCONNECT)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Stu |
| Solution 2 | Barbaros Özhan |
| Solution 3 | Miuccia |
