Count distinct by boolean value

Is there a better (prettier, more idiomatic, or even more performant) way to do the following?

Objective: count the distinct values of one column, split by the value of a boolean column.

Sample data:

| id | metadata_streaming_date | cols_exist |
| -- | ----------------------- | ---------- |
| 1  | 2022-02-20              | true       |
| 1  | 2022-02-20              | true       |
| 2  | 2022-02-20              | true       |
| 2  | 2022-02-20              | true       |
| 3  | 2022-02-20              | false      |
| 1  | 2022-02-19              | true       |
| 2  | 2022-02-19              | false      |
| 3  | 2022-02-19              | false      |
| 4  | 2022-02-19              | false      |
| 4  | 2022-02-19              | false      |

The expected result is the count of distinct id values grouped by metadata_streaming_date, split into wanted (only rows where cols_exist = false) and overall (all rows for that date).

Result table expected:

| metadata_streaming_date | wanted | overall |
| ----------------------- | -------| --------|
| 2022-02-20              | 1      | 3       |
| 2022-02-19              | 3      | 4       |

I can achieve this with two subqueries inner-joined on metadata_streaming_date:

select
  t1.metadata_streaming_date,
  overall,
  wanted,
  wanted / overall as perc
from
  (
    select
      metadata_streaming_date,
      count(distinct id) as overall
    from
      non_needed_fields_view
    where
      metadata_streaming_date >= '2022-02-19'
    group by
      metadata_streaming_date
  ) as t1
  inner join (
    select
      metadata_streaming_date,
      count(distinct id) as wanted
    from
      non_needed_fields_view
    where
      cols_exist is false
      and metadata_streaming_date >= '2022-02-19'
    group by
      metadata_streaming_date
  ) as t2 on t1.metadata_streaming_date = t2.metadata_streaming_date


Solution 1:[1]

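You can use conditional aggregation with DISTINCT: put the condition in a CASE WHEN expression. The CASE returns NULL for rows that do not match, and COUNT ignores NULLs, so only ids with cols_exist = false are counted as wanted.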

SELECT metadata_streaming_date,
       COUNT(DISTINCT CASE WHEN cols_exist = false THEN id END) AS wanted,
       COUNT(DISTINCT id) AS overall
FROM non_needed_fields_view
WHERE metadata_streaming_date >= '2022-02-19'
GROUP BY metadata_streaming_date
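
To check the query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. Several things here are assumptions and not part of the original post: the use of SQLite at all (the question does not name its database), a plain table standing in for the view, booleans stored as 0/1 (hence `cols_exist = 0` in place of `cols_exist = false`), the added ORDER BY, and the helper name `count_by_date`.

```python
import sqlite3

# Sample rows from the question: (id, metadata_streaming_date, cols_exist).
# cols_exist is stored as 1 (true) or 0 (false), which is how SQLite keeps booleans.
ROWS = [
    (1, "2022-02-20", 1),
    (1, "2022-02-20", 1),
    (2, "2022-02-20", 1),
    (2, "2022-02-20", 1),
    (3, "2022-02-20", 0),
    (1, "2022-02-19", 1),
    (2, "2022-02-19", 0),
    (3, "2022-02-19", 0),
    (4, "2022-02-19", 0),
    (4, "2022-02-19", 0),
]

# Conditional aggregation: the CASE yields NULL when cols_exist is true,
# and COUNT(DISTINCT ...) skips NULLs, so "wanted" only sees the false rows.
QUERY = """
SELECT metadata_streaming_date,
       COUNT(DISTINCT CASE WHEN cols_exist = 0 THEN id END) AS wanted,
       COUNT(DISTINCT id) AS overall
FROM non_needed_fields_view
WHERE metadata_streaming_date >= '2022-02-19'
GROUP BY metadata_streaming_date
ORDER BY metadata_streaming_date DESC
"""


def count_by_date(rows):
    """Load rows into an in-memory table and return (date, wanted, overall) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        # A plain table stands in for the view named in the question.
        conn.execute(
            "CREATE TABLE non_needed_fields_view ("
            "id INTEGER, metadata_streaming_date TEXT, cols_exist INTEGER)"
        )
        conn.executemany(
            "INSERT INTO non_needed_fields_view VALUES (?, ?, ?)", rows
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in count_by_date(ROWS):
        print(row)
```

On the sample data this should reproduce the expected result table: 1 and 3 for 2022-02-20, 3 and 4 for 2022-02-19. Unlike the inner join of two subqueries, a single grouped query also keeps dates that have no cols_exist = false rows, reporting wanted as 0 for them.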

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 D-Shih