'MARIADB overlap function for time periods
We have multiple, connected tables that change over time and we need to do queries where we want all information over a certain time period.
Let's assume our database has these two tables tb1 and tb2 with the following corresponding columns and values in them:
tb1
pk fk start end
1 1 2020-01 2021-01
1 2 2021-01 inf
tb2
pk_refd val start end
1 1 2000-01 2022-01
1 2 2022-01 inf
2 4 2000-01 inf
When I do the following query:
SELECT tb1.pk, tb1.fk, tb2.val,
tb1.start_timestamp AS tb1_START, tb1.end_timestamp AS tb1_End,
tb2.start_timestamp AS tb2_START, tb2.end_timestamp AS tb2_End
FROM
tb1 FOR SYSTEM_TIME
FROM '2000-01-01 00:00:00' TO '2030-01-01 00:00:00',
tb2 FOR SYSTEM_TIME
FROM '2000-01-01 00:00:00' TO '2030-01-01 00:00:00'
WHERE tb1.fk = tb2.pk_refd;
I get:
pk fk val tb1_START tb1_End tb2_START tb2_End
1 1 1 2020-01-01 2021-01-01 2000-01-01 2022-01-01
1 1 2 2020-01-01 2021-01-01 2022-01-01 2038-01-19
1 2 3 2021-01-01 2038-01-19 2000-01-01 2038-01-19
(I deleted the exact time and just left the date for readability)
The problem is the entry
pk fk val tb1_START tb1_End tb2_START tb2_End
1 1 2 2020-01-01 2021-01-01 2022-01-01 2038-01-19
This is not a valid reference since the time periods of the entries don't overlap.
Is there some kind of overlap(period1, period2) function or some constraint, trigger or something I can define beforehand so queries like that don't return non overlapping references? I know I can manually check if the different start and end times of the tables overlap, but that becomes messy quickly as soon as there are 3 or more tables all connected in a row (tb1.fk=tb2.fk, tb2.fk2=tb3.fk2, ...).
Is e.g. MariaDB a good choice for this scenario? Is there any better database for this scenario that would also work (can also be NOSQL)?
Create table statemets:
CREATE TABLE tb1 (
pk int,
fk int,
start_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW START,
end_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW END,
PERIOD FOR SYSTEM_TIME(start_timestamp, end_timestamp),
PRIMARY KEY (pk, start_timestamp)
) WITH SYSTEM VERSIONING;
CREATE TABLE tb2 (
pk_refd int,
val varchar(255),
start_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW START,
end_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW END,
PERIOD FOR SYSTEM_TIME(start_timestamp, end_timestamp),
PRIMARY KEY(pk_refd, start_timestamp)
) WITH SYSTEM VERSIONING;
-- add foreign Key constraints
ALTER TABLE tb1
ADD CONSTRAINT FK_fk
FOREIGN KEY (fk)
REFERENCES tb2(pk_refd);
-- insert data
-- Beginning of Time
SET timestamp = UNIX_TIMESTAMP('2000-01-01 00:00:00');
INSERT INTO tb2(pk_refd, val) VALUES (1, '1');
INSERT INTO tb2(pk_refd, val) VALUES (2, '3');
-- 2. point in time
SET timestamp = UNIX_TIMESTAMP('2020-01-01 00:00:00');
INSERT INTO tb1(pk, fk) VALUES (1, 1);
-- 3. point in time
SET timestamp = UNIX_TIMESTAMP('2021-01-01 00:00:00');
UPDATE tb1 SET fk = 2 WHERE pk = 1;
-- 4. point in time
SET timestamp = UNIX_TIMESTAMP('2022-01-01 00:00:00');
Update tb2 SET val = '2' WHERE pk_refd = 1;
Solution 1:[1]
Step 1, Avoid , (almost always0 as a join operator and use explicit inner joins.
Step 2, use a join criteria that has overlaps
So:
SELECT tb1.pk, tb1.fk, tb2.val,
tb1.start_timestamp AS tb1_START, tb1.end_timestamp AS tb1_End,
tb2.start_timestamp AS tb2_START, tb2.end_timestamp AS tb2_End
FROM
tb1 FOR SYSTEM_TIME
FROM '2000-01-01 00:00:00' TO '2030-01-01 00:00:00'
JOIN
tb2 FOR SYSTEM_TIME
FROM '2000-01-01 00:00:00' TO '2030-01-01 00:00:00'
ON tb1.fk = tb2.pk_refd
AND tb2_START <= tb1_End
AND tb1_Start <= tb2_End
So yes, relational databases like MariaDB are good for this.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | danblack |
