Errors running Trino syntax code for Zeus

I am rewriting an Exasol query for Trino (I am using Trino for the first time). The problem is that I keep getting timeout errors, which makes the query difficult to test. I am not sure, but the timeouts might be caused by incorrect syntax. Below is the query after I tried to translate it into Trino.

    WITH timeseries AS (
        -- Creating timeseries
        SELECT calendar_date
        FROM revolut.calendar
        WHERE calendar_date BETWEEN current_date - INTERVAL '6' MONTH
            AND current_date - INTERVAL '1' DAY
    )
    SELECT --date(t.calendar_date) AS calendar_date,
        date_trunc('second', me.created_date) AS click_time,
        lower(
            -- Standardizing the principal_type to join with agent_logins CTE
            translate(
                trim(
                    regexp_replace(
                        ae.principal_type,
                        '[a-zA-Z ._\-0-9]+(?=(@|\[))?'
                    )
                ),
                ' ',
                '.'
            )
        ) AS agent_login,
        -- Required to join with the agent_logins CTE
        'identitycheck' AS click_type
    FROM revolut.model_events AS me
        JOIN revolut.action_events AS ae ON me.action_event_id = ae.id
        JOIN timeseries AS t ON date(me.created_date) = date(t.calendar_date) -- Limiting the data to the timeseries
    WHERE me.event_type IN (
            'AnonymousIdentityCheckDeclinedEvent',
            'AnonymousIdentityCheckApprovedEvent'
        )

The error I am getting:

TrinoQueryError: TrinoQueryError(type=INSUFFICIENT_RESOURCES, name=EXCEEDED_LOCAL_MEMORY_LIMIT, message="Query exceeded per-node memory limit of 75GB [Allocated: 74.97GB, Delta: 42.18MB, Top Consumers: {HashBuilderOperator=73.52GB, ScanFilterAndProjectOperator=1.14GB, PartitionedOutputOperator=212.48MB}]", query_id=20220329_121757_01776_rbgxn)
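
One possible reading of this error, offered as a sketch and not a confirmed diagnosis: HashBuilderOperator is the operator that builds the in-memory hash table for a join, so most of the 73.52GB appears to be join memory, not function cost. The variant below drops the timeseries join and filters model_events on the same date range directly. It assumes that revolut.calendar holds exactly one row per day and that created_date is a timestamp column; neither is stated above. The agent_login expression is left out to keep the sketch short.

```sql
-- Sketch only: assumes revolut.calendar has one row per calendar day,
-- so joining on it is equivalent to a plain range filter.
SELECT
    date_trunc('second', me.created_date) AS click_time,
    ae.principal_type,
    'identitycheck' AS click_type
FROM revolut.model_events AS me
    JOIN revolut.action_events AS ae ON me.action_event_id = ae.id
WHERE me.event_type IN (
        'AnonymousIdentityCheckDeclinedEvent',
        'AnonymousIdentityCheckApprovedEvent'
    )
    -- Same window as the timeseries CTE: from six months ago up to and including yesterday
    AND me.created_date >= CAST(current_date - INTERVAL '6' MONTH AS timestamp)
    AND me.created_date < CAST(current_date AS timestamp)
```

Comparing the raw column against constants, instead of wrapping it in date(), also leaves the engine free to push the filter down to the table scan where the connector supports that. If the memory error persists, the remaining join between model_events and action_events would be the next thing to look at.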

Some Trino questions I could not find answers to online:

  1. Does regexp_replace in Trino do the same thing as regexp_substr? It seems that agent_login is now NULL (see screenshot).
  2. To convert a timestamp to a date in Trino I have found date(created_date) and to_date('created_date', 'YYYY-MM-DD'). Which one is best?
  3. Is date_trunc('second', me.created_date) a heavy function in Trino? I ask because when I comment out the timeseries JOIN and uncomment the date_trunc() function, I also get the type=INSUFFICIENT_RESOURCES error.
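
For questions 1 and 2, a small self-contained sketch of how these functions behave in Trino may help. The sample values and the simplified pattern '^[^@\[]+' are made up for illustration and are not the pattern from the query above. In Trino, regexp_replace called with two arguments removes every match from the string, while regexp_extract returns the first match, which is closer to what regexp_substr does. date(x) is an alias for CAST(x AS date), whereas to_date(string, format) parses a string and so is not meant for timestamp columns.

```sql
-- Illustrative values only; the pattern is a simplified stand-in.
SELECT
    principal_type,
    -- Two-argument regexp_replace deletes every match of the pattern
    regexp_replace(principal_type, '^[^@\[]+') AS replaced,
    -- regexp_extract returns the first substring that matches
    regexp_extract(principal_type, '^[^@\[]+') AS extracted,
    lower(
        translate(trim(regexp_extract(principal_type, '^[^@\[]+')), ' ', '.')
    ) AS agent_login
FROM (
    VALUES 'John Smith@example.com', 'jane.doe[admin]'
) AS t(principal_type);

-- Two equivalent ways to turn a timestamp into a date
SELECT
    date(ts) AS via_date,
    CAST(ts AS date) AS via_cast
FROM (
    VALUES TIMESTAMP '2022-03-29 12:17:57'
) AS t(ts);
```

For the first sample value, replaced comes out as '@example.com' and extracted as 'John Smith', which shows why swapping regexp_substr for regexp_replace would change the result of agent_login.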

Thanks guys!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
