How to convert different date formats to a single one in Snowflake

I have a test table where the column sys_created_on (datatype varchar(15)) holds datetime values, and we receive them in two different formats, like below.

03-04-2022 12:49
2/28/2022 10:35

Expected Result is:
03-04-2022 12:49
02-28-2022 10:35

Could you please suggest a way to convert all values to one format? Any suggestions are appreciated. Please also let me know whether changing the datatype would help here.

Thank you!!



Solution 1:[1]

The best thing to do here would be to just convert your text timestamp column to a bona fide timestamp column. You could achieve this using the TO_TIMESTAMP() function along with a CASE expression:

SELECT
    ts,
    CASE WHEN REGEXP_LIKE(ts, '\\d{1,2}-\\d{2}-\\d{4} \\d{1,2}:\\d{2}')
         THEN TO_TIMESTAMP(ts, 'mm-dd-yyyy hh24:mi')
         ELSE TO_TIMESTAMP(ts, 'mm/dd/yyyy hh24:mi') END AS ts_real
FROM yourTable;

Assuming you had a new timestamp column, you could populate it using the ts text column as follows:

UPDATE yourTable
SET ts_real = CASE WHEN REGEXP_LIKE(ts, '\\d{1,2}-\\d{2}-\\d{4} \\d{1,2}:\\d{2}')
                   THEN TO_TIMESTAMP(ts, 'mm-dd-yyyy hh24:mi')
                   ELSE TO_TIMESTAMP(ts, 'mm/dd/yyyy hh24:mi') END;
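
The UPDATE above presupposes that the ts_real column already exists. A minimal sketch of adding it first, using the placeholder names from this answer; TIMESTAMP_NTZ is an assumption here, so pick whichever timestamp variant suits your data:

```sql
-- Add an empty timestamp column to receive the parsed values.
-- yourTable and ts_real are the placeholder names used in this answer.
ALTER TABLE yourTable ADD COLUMN ts_real TIMESTAMP_NTZ;
```

Note that TO_TIMESTAMP raises an error on a value that matches neither format, which makes the whole UPDATE fail. TRY_TO_TIMESTAMP returns NULL for such rows instead.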

Solution 2:[2]

TRY_TO_DATE returns NULL if parsing fails, so you can chain different formats together with COALESCE or NVL:

SELECT column1,
    TRY_TO_DATE(column1, 'dd-mm-yyyy hh:mi') as d1,
    TRY_TO_DATE(column1, 'mm/dd/yyyy hh:mi') as d2
    ,nvl(d1,d2) as answer
FROM VALUES ('03-04-2022 12:49'),('2/28/2022 10:35');

gives:

COLUMN1            D1           D2           ANSWER
03-04-2022 12:49   2022-04-03   NULL         2022-04-03
2/28/2022 10:35    NULL         2022-02-28   2022-02-28

which can be merged as

,nvl(TRY_TO_DATE(column1, 'dd-mm-yyyy hh:mi'),TRY_TO_DATE(column1, 'mm/dd/yyyy hh:mi')) as answer

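Ah, I didn't read the question closely enough. To make all the stored values the same, run an UPDATE and convert the parsed timestamp back to text with TO_CHAR, which without a format argument uses the session's output format: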

UPDATE test_table
   SET sys_created_on = to_char(nvl(
       TRY_TO_TIMESTAMP(sys_created_on , 'dd-mm-yyyy hh:mi'),
       TRY_TO_TIMESTAMP(sys_created_on , 'mm/dd/yyyy hh:mi')
   ));
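
If the goal is the exact layout shown in the question's expected result, TO_CHAR can be given an explicit format instead of relying on the session default. This is only a sketch: it assumes the dash-separated values are month-first (as the expected result implies) and that the table is named test_table.

```sql
-- Normalise both input formats to 'MM-DD-YYYY HH24:MI'.
-- Assumption: '03-04-2022' means March 4th (month first).
UPDATE test_table
   SET sys_created_on = TO_CHAR(
           COALESCE(
               TRY_TO_TIMESTAMP(sys_created_on, 'mm-dd-yyyy hh24:mi'),
               TRY_TO_TIMESTAMP(sys_created_on, 'mm/dd/yyyy hh24:mi')
           ),
           'MM-DD-YYYY HH24:MI'
       )
 -- Skip rows that parse under neither format so they are not overwritten with NULL.
 WHERE COALESCE(
           TRY_TO_TIMESTAMP(sys_created_on, 'mm-dd-yyyy hh24:mi'),
           TRY_TO_TIMESTAMP(sys_created_on, 'mm/dd/yyyy hh24:mi')
       ) IS NOT NULL;
```

The resulting strings are 16 characters long (for example 02-28-2022 10:35), so the column needs to be at least varchar(16) to hold them.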

Solution 3:[3]

Replace the separator using replace():

    update test_table 
    set sys_created_on = replace(sys_created_on,'/','-');

If you're also dealing with different day and month field order, look into regexp_replace() to swap their places:

    update test_table 
    set sys_created_on = regexp_replace(sys_created_on,
                                        '(.*)/(.*)/(.*)',
                                        '\\2-\\1-\\3');

That's in case your 03-04-2022 is in format dd-mm-yyyy making it April 3rd, not March 4th. It's good to know what exact format you're dealing with. In extreme cases you might even need to make sure whether your hour field is 24-h or 12-h-based but missing an am/pm meridiem indicator.
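
One way to find out which order the dash-separated values use is to parse them both ways and compare. A rough sketch, assuming the table and column names from the question:

```sql
-- Parse each dash-separated value both ways.
-- A NULL in one column means that reading is impossible for that row.
SELECT sys_created_on,
       TRY_TO_TIMESTAMP(sys_created_on, 'mm-dd-yyyy hh24:mi') AS month_first,
       TRY_TO_TIMESTAMP(sys_created_on, 'dd-mm-yyyy hh24:mi') AS day_first
FROM test_table
WHERE sys_created_on LIKE '%-%';
```

Rows where only one of the two columns is non-NULL (any value with a number above 12 in the first or second field) show which order the source actually uses. Rows where both parse remain ambiguous on their own.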

As suggested by Tim's and Simeon's answers, a matching data type is always encouraged. It takes less space, makes queries faster, enables type-specific functions and maintains the validity of the data (varchar doesn't care if you get February 30th, or the 32nd day of month 13 at 25:60).
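
Before switching types, it can help to check whether the varchar column already contains values that no timestamp type would accept. A small sketch, assuming the two month-first formats from the question:

```sql
-- List values that are not valid timestamps in either expected format.
SELECT sys_created_on
FROM test_table
WHERE sys_created_on IS NOT NULL
  AND TRY_TO_TIMESTAMP(sys_created_on, 'mm-dd-yyyy hh24:mi') IS NULL
  AND TRY_TO_TIMESTAMP(sys_created_on, 'mm/dd/yyyy hh24:mi') IS NULL;
```

Any rows returned here would need to be corrected or excluded before the column can be converted.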

If you want to keep the cookie and eat it too, here's how you can add one virtual column where you'll always see a standardised version of your sys_created_on, and another one, which will always interpret it as a proper timestamp. This way you don't need to touch anything in how the table is populated, keep the original, unprocessed data, see how it gets standardised, and also benefit from a timestamp data type, while not using up any additional space:

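alter table test_table 
    add column standardised_sys_created_on varchar(15)
        as replace(sys_created_on,'/','-'),
    add column timestamp_sys_created_on TIMESTAMP_NTZ
        as coalesce(
            -- try_to_timestamp_ntz keeps the time part; try_to_date would drop it
            -- month-first formats, matching the sample value 2/28/2022
            try_to_timestamp_ntz(sys_created_on, 'mm-dd-yyyy hh24:mi'),
            try_to_timestamp_ntz(sys_created_on, 'mm/dd/yyyy hh24:mi'));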

To make it faster at the expense of materializing them, you can turn those virtual columns into generated/computed using default.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tim Biegeleisen
Solution 2
Solution 3 Zegarek