'Workaround on Sliding Window Function in Snowflake

I've stumbled upon a problem that is giving me huge headaches, which is the following:

I have a table Deals, that contains information about this entity from our Sales CRM. I also have a table Company, that contains information about the companies pegged to those deals.

I was asked to compute a metric called Pipeline Conversion Rate, which is calculated as: Won deals / Created Deals

Until here, everything is quite clear. Nevertheless, when computing this metric I was asked to do so in a sliding-window-function-fashion, which means to compute the metric only looking at the prior 90 days. Thing is that to look at the last 90 days of the numerator, we need to use one Date (created date); while when looking at the prior 90 days of the denominator, we should take into account the closed date (both dimensions are part of the Deals table).

There wouldn't be any problem if we could do this kind of window functions in Snowflake, as the following (I know syntax may not be exactly this one, but you get the idea):

count(deal_id) over (
  partition by is_inbound, sales_agent, sales_tier, country
  order by created_date range between 90 days preceding and current row
) as created_deals_last_90_days,
count(case when is_deal_won then deal_id end) over (
  partition by is_inbound, sales_agent, sales_tier, country
  order by created_date range between 90 days preceding and current row
) as won_deals_last_90_days

But we can't as far as I know. So my current workaround is the following (taken from this post):

select
  calendar_date,
  is_inbound,
  sales_tier,
  sales_agent,
  country,
  (
   select count(deal_id)
   from deals
   where d.is_inbound = is_inbound
     and d.sales_tier = sales_tier
     and d.sales_agent = sales_agent
     and d.country = country
     and created_date between cal.calendar_date - 90 and cal.calendar_date
   ) as created_deals_last_90_days,
   (
   select count(case when is_deal_won then deal_id end)
   from deals
   where d.is_inbound = is_inbound
     and d.sales_tier = sales_tier
     and d.sales_agent = sales_agent
     and d.country = country
     and closed_date between cal.calendar_date - 90 and cal.calendar_date
   ) as won_deals_last_90_days
from calendar as cal
left join deals as d on cal.calendar_date between d.created_date and d.closed_date

*Note that I am using a calendar table here as base table, in order to have visibility on all calendar dates since without it I might say I'd be missing on those dates where there are no new deals (could happen on weekends).

Problem is that I am not getting correct figures when I cross check the raw data and the output of this query, and I have no idea how to make this (ugly) workaround, well... work.

Any ideas are more than welcome!



Solution 1:[1]

Well, it turns out it was way easier than I expected. After some trial-and-error, I figured out the only thing that could be failing was the JOIN condition in the outer query:

on cal.calendar_date between d.created_date and d.closed_date

This was assuming that both dates needed to be in the range, while this assumption is wrong. By tweaking the above mentioned part of the outer query to:

on cal.calendar_date >= d.created_date

It captures all those Deals that were created on or before the calendar_date, and therefore all of them since it is a mandatory field.

Maintaining the rest of the query as is, and assuming that there will be no nulls in any of the partitions, the results are the ones I expected.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Aleix CC