'Getting difference of a column from same table

I have a table with data that is similar to this table below

id session_id user_id action log_time
1 1 3 join 1642645048
2 1 3 left 1642645048
3 1 3 join 1642645552
4 1 3 left 1642646072
5 1 3 join 1642646632
6 1 3 left 1642646736
7 1 5 join 1642647083
8 1 5 join 1642649879
9 1 5 left 1642649951
10 1 5 join 1642650112
11 1 5 join 1642650159
12 1 5 join 1642651005

log_time is saved as a unix time

Question: Is it possible to retrieve the total amount of time that a user was in a session?

So it would do something like total_participation = ("1st left" - "1st join") + ("2nd left" - "2nd join") + ("3rd left - "3rd join")

I've already got the difference between first join and last left time by doing the following query:

SELECT s1.session_id as 'Id',
       u.name AS 'Participant',
       IFNULL(TIME_FORMAT(SEC_TO_TIME(s2.time_log - s1.time_log), '%Hh %im %ss'), 0) AS 'TotalParticipation'
FROM tblSessionLog AS s1
LEFT JOIN tblSessionLog AS s2 ON (
    s2.id = (
        SELECT id
        FROM tblSessionLog
        WHERE action = 'left'
            AND user_id = s1.user_id
            AND id > s1.id
        ORDER BY time_log DESC
        LIMIT 1
    )
)
LEFT JOIN tblUser AS u ON u.id = s1.user_id -- used only to get participant name
WHERE s1.action = 'join'
GROUP BY s1.session_id, s1.user_id
ORDER BY s1.session_id, s1.user_id;

But I can't seem to get how to remove the time in between the participant have left and join backed again. Or is that not possible on SQL and should be handled on backend code?

Not exactly a copy but more or less, the actual data is like this sample SQL Fiddle here: http://sqlfiddle.com/#!9/2d8f6c/1/0

Update:

@Akina's solution work well when the data is consistent which is very much appreciated but I found out it will not be suitable on my case as there are instances where it is possible to have more than one join action before having left action, as well as no left action after join action. I updated the example table above to further show the actual data. Updated the sample fiddle as well.

Any leads will be really appreciated. And apologies as for some reason I need to do this on MySQL without the help of backend code. Thanks!



Solution 1:[1]

Perhaps something like this:

WITH cte1 AS(
SELECT user_id, action, time_log,
       ROW_NUMBER() OVER (PARTITION BY user_id, action ORDER BY time_log) AS rn
FROM tblSessionLog
WHERE action='join'),
cte2 AS(
SELECT user_id, action, time_log,
       ROW_NUMBER() OVER (PARTITION BY user_id, action ORDER BY time_log) AS rn
FROM tblSessionLog
WHERE action='left')
SELECT *,
       cte2.time_log-cte1.time_log
FROM cte1
LEFT JOIN cte2 
ON cte1.user_id=cte2.user_id
AND cte1.rn=cte2.rn;

Which on your current data will return the following results:

user_id action time_log rn user_id action time_log rn cte2.time_log-cte1.time_log
3 join 1642645048 1 3 left 1642645048 1 0
3 join 1642645552 2 3 left 1642646072 2 520
3 join 1642646632 3 3 left 1642646736 3 104
5 join 1642647083 1 5 left 1642649951 1 2868
5 join 1642649879 2 NULL NULL NULL NULL NULL
5 join 1642650112 3 NULL NULL NULL NULL NULL
5 join 1642650159 4 NULL NULL NULL NULL NULL
5 join 1642651005 5 NULL NULL NULL NULL NULL

With two generated common table expressions (cte), each result from cte assigned with ROW_NUMBER() then LEFT JOIN them with matching user_id and the generated ROW_NUMBER(). As your sample data only have 1 left for user_id=5 then it only paired for the first found join for the same user_id and returned NULL for the rest. If a left action was added for user_id=5 afterwards, it will occupy as the left action for the join action that is NULL. For example, if we add:

INSERT INTO tblSessionLog (id, session_id, user_id, action, time_log) 
      VALUES (13, 1, 5, 'left', 1642652005);
INSERT INTO tblSessionLog (id, session_id, user_id, action, time_log) 
      VALUES (14, 1, 5, 'left', 1642652085);

then the results will be something like this:

user_id action time_log rn user_id action time_log rn cte2.time_log-cte1.time_log
3 join 1642645048 1 3 left 1642645048 1 0
3 join 1642645552 2 3 left 1642646072 2 520
3 join 1642646632 3 3 left 1642646736 3 104
5 join 1642647083 1 5 left 1642649951 1 2868
5 join 1642649879 2 5 left 1642652005 2 2126
5 join 1642650112 3 5 left 1642652085 3 1973
5 join 1642650159 4 NULL NULL NULL NULL NULL
5 join 1642651005 5 NULL NULL NULL NULL NULL

One thing to take note is that the ROW_NUMBER() I generated here is order by time_log and not by id (or which data was inserted first). If you wish to pair them by which data comes first, you can simply change the ORDER BY time_log to ORDER BY id on both of the ROW_NUMBER() assignment.

Demo fiddle

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 FanoFN