'Multiple aggregations for the same data
I have two queries:
SELECT
COUNT(warehouseCode) as count,
warehouseCode
FROM 'sparePartEventsTable'
WHERE
sparePartCode = 'SP.0000' AND
sparePartConsumed = 'true'
GROUP BY warehouseCode
ORDER BY count DESC
and
SELECT
COUNT(*) as eventsCount,
DATE(TIMESTAMP_SECONDS(epochSeconds)) as day
FROM 'sparePartEventsTable'
WHERE
sparePartCode = 'SP.0000' AND
sparePartConsumed = 'true'
GROUP BY day
ORDER BY day
As you can see the underlying data is the same, but I'm returning two different aggregations. Is there a way in SQL to avoid hitting the disk twice?
How would you implement this is either BigQuery or Postgres?
In mongodb I would build a cursor on the underlying common data, and then write an aggregation pipeline that spits two results.
EDIT: It seems that UNION ALL could be a first solution, but at least in postgres it scans twice. Sergey in the comment suggested GROUPING SETS, but unfortunately they are not available in BigQuery. An answer in either Postgres or BigQuery dialect will be accepted, extra points if both solutions are posted :)
Solution 1:[1]
Consider below (BigQuery) option
with temp as (
select
warehouseCode,
count(*) over(partition by warehouseCode) as `count`,
date(timestamp_seconds(epochSeconds)) as day,
count(*) over(partition by unix_date(date(timestamp_seconds(epochSeconds)))) as eventsCount
from `sparePartEventsTable`
where sparePartCode = 'SP.0000'
and sparePartConsumed = 'true'
)
select distinct 'warehouseCode' type, '' || warehouseCode as value, `count` from temp union all
select distinct 'day', string(day), eventsCount from temp
with output like below
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Mikhail Berlyant |

