Slowly changing dimension with two pairs of start/end dates
For my DWH I am considering implementing an SCD with two pairs of start/end dates: effective_from_dttm/effective_to_dttm and valid_from_dttm/valid_to_dttm. The reason is that I need to track changes using both the timestamp from the source and the timestamp that shows when a row was extracted into the staging area of the DWH. In the staging area I have tables with two timestamps: a reliable timestamp (processed_dttm) that I generate when extracting from the source, and an unreliable timestamp (last_upd_dttm) that comes from the source. The effective dates are derived from last_upd_dttm and the valid dates from processed_dttm.
Consider this example:
DDS Table
| id | changing_field | effective_from_dttm | effective_to_dttm | valid_from_dttm | valid_to_dttm |
|---|---|---|---|---|---|
| 1 | something | 01.01.2022 | null | 10.02.2022 | null |
INPUT TABLE
| id | changing_field | last_upd_dttm | processed_dttm |
|---|---|---|---|
| 1 | something_new! | 01.01.2022 | 11.02.2022 |
DDS Table (delta applied)
| id | changing_field | effective_from_dttm | effective_to_dttm | valid_from_dttm | valid_to_dttm |
|---|---|---|---|---|---|
| 1 | something | 01.01.2022 | null | 10.02.2022 | 11.02.2022 |
| 1 | something_new! | 01.01.2022 | null | 11.02.2022 | null |
As I said, last_upd_dttm is an unreliable timestamp: for example, a record in the source system can be changed while last_upd_dttm stays the same, due to possible fraud or a mistake, and I need to detect that. The next delta is an ordinary change, where last_upd_dttm moves forward as well:
INPUT TABLE
| id | changing_field | last_upd_dttm | processed_dttm |
|---|---|---|---|
| 1 | something_new2! | 12.02.2022 | 12.02.2022 |
DDS Table (delta applied)
| id | changing_field | effective_from_dttm | effective_to_dttm | valid_from_dttm | valid_to_dttm |
|---|---|---|---|---|---|
| 1 | something | 01.01.2022 | 12.02.2022 | 10.02.2022 | 11.02.2022 |
| 1 | something_new! | 01.01.2022 | 12.02.2022 | 11.02.2022 | 12.02.2022 |
| 1 | something_new2! | 12.02.2022 | null | 12.02.2022 | null |
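A minimal sketch of how the two deltas above might be applied, written in Python purely for illustration. The `DdsRow` class and the `apply_delta` function are hypothetical names, and the rules are inferred from the example tables rather than taken from a real implementation: every open valid interval is closed at processed_dttm, and an open effective interval is closed only when last_upd_dttm is later than that row's effective_from_dttm.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DdsRow:
    """One version of a record in the DDS table (hypothetical structure)."""
    id: int
    changing_field: str
    effective_from_dttm: date
    effective_to_dttm: Optional[date]
    valid_from_dttm: date
    valid_to_dttm: Optional[date]


def apply_delta(dds: List[DdsRow], id: int, changing_field: str,
                last_upd_dttm: date, processed_dttm: date) -> None:
    """Apply one input row to the DDS table in place (assumed rules)."""
    versions = [r for r in dds if r.id == id]
    current = next((r for r in versions if r.valid_to_dttm is None), None)
    if current is not None and current.changing_field == changing_field:
        return  # nothing changed, so no new version is needed

    for r in versions:
        # Close the source-side interval only if the source timestamp moved on.
        if r.effective_to_dttm is None and last_upd_dttm > r.effective_from_dttm:
            r.effective_to_dttm = last_upd_dttm
        # Always close the DWH-side interval of the version known so far.
        if r.valid_to_dttm is None:
            r.valid_to_dttm = processed_dttm

    dds.append(DdsRow(id, changing_field, last_upd_dttm, None,
                      processed_dttm, None))


# Replaying the example: initial row, then the two input rows.
dds = [DdsRow(1, "something", date(2022, 1, 1), None, date(2022, 2, 10), None)]
apply_delta(dds, 1, "something_new!", date(2022, 1, 1), date(2022, 2, 11))
apply_delta(dds, 1, "something_new2!", date(2022, 2, 12), date(2022, 2, 12))
```

Under these assumed rules the first call leaves effective_to_dttm open, because last_upd_dttm did not move, while the second call closes both intervals, which matches the tables above.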
Now I have a history of effectivity both in the source system and in the DWH. Is there a simpler approach for capturing such a history? I suspect that having two pairs of dates will make the process of building marts too complicated. Is there a type of SCD with this kind of double versioning, or somewhere I can read about such an approach?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
