'The ambiguity w.r.t date field in Dim_time

I have come across a fact table fact_trips - composed of columns like

 driver_id,
 vehicle_id,
 date ( in the int form  - 'YYYYMMDD')
 timestamp (in milliseconds -  bigint )
 miles,
 time_of_trip

I have another dim_time - composed of columns like

  date ( in the int form  - 'YYYYMMDD'),
  timestamp (in milliseconds -  bigint ),
  month,
  year,
  day_of_week
  day

Now when I want to see the trips grouped based on year, I have to join the two tables based on timestamp (in bigint) and then group by year from dim_time.

Why the hell do we keep date in int form then? Because ultimately, I have to join on timestamp. What needs to be changed?

Also, the dim_time does not have a primary key, hence there are multiple entries for the same date. So, when I join the tables, I get more rows in return than expected.



Solution 1:[1]

You should have 2 Dim tables:

  1. DIM_DATE: PK = YYYYMMDD
  2. DIM_TIME: PK = number. Will hold the same number of records as however many milliseconds there are in a day (assuming you are holding time at the millisecond grain rather than second, minute, etc)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 NickW