'Using Count + STRING_AGG, to display distinct values
I am trying to use COUNT with STRING_AGG. The problem is that in a subquery it counts as needed and I have count 1 when it's 1 employee that is taking the course, but when I run the full query I got the result doubled, like this:
COMPLETED: 1, COMPLETED: 1
When I put DISTINCT to count in STRING_AGG it gives me an error. Any idea how I can achieve this?
SELECT s1.course_id
, s1.course_name
, STRING_AGG(CONCAT(ts.training_status_name
+ ': ' +
CAST(countt as varchar), ''),', ') AS num_of_attendies_and_status
FROM (
SELECT th.course_id
, c.course_name
, COUNT(DISTINCT e.employee_id) AS countt
FROM course as c
INNER JOIN training_history th ON c.course_id = th.course_id
INNER JOIN employee e ON th.employee_id = e.employee_id
GROUP BY th.course_id, course_name
) s1
INNER JOIN training_history th ON s1.course_id = th.course_id
INNER JOIN training_status ts ON th.training_status_id = ts.training_status_id
INNER JOIN employee e ON th.employee_id = e.employee_id
GROUP BY s1.course_id
, s1.course_name
Example of Output:
| Course_id | Course_name | num_of_attendies_and_status |
|---|---|---|
| 5 | React | COMPLETED: 1, COMPLETED: 1 |
| 7 | C# | COMPLETED: 1, COMPLETED: 1 |
| 9 | Selenium | REGISTERED: 1, REGISTERED: 1 |
| 13 | Finance | IN PROGRESS: 1, IN PROGRESS: 1 |
I have table employees with employee names and employee_id, courses with courses id, training_history which is a linkage between course_id and employee_id, and training status which shows course status name accordingly to a course id. I have 3 statuses here: COMPLETED, REGISTRED, IN PROGRESS.
The thing is, when I run the subquery:
SELECT th.course_id
, c.course_name
, COUNT(DISTINCT e.employee_id) AS countt
FROM course as c
INNER JOIN training_history th ON c.course_id = th.course_id
INNER JOIN employee e ON th.employee_id = e.employee_id
GROUP BY th.course_id
, course_name
I got the following output:
| Course_id | Course_name | num_of_attendies_and_status |
|---|---|---|
| 5 | React | 1 |
| 7 | C# | 1 |
| 9 | Selenium | 1 |
| 13 | Finance | 1 |
But the final one for the first query, it's doubled. The thing is that there are 2 same employees that are duplicated. So I need somehow to distinct them. In the subquery COUNT(DISCTINCT employee_id) works fine. However in the main query, it's doubled. I am sorry for hard understanding questions. Hope something is clear.
Solution 1:[1]
I've used a CTE to collect the information with count(distinct) .
NB I've added some lines to your sample data to have 2 statuses per course.
with training as
(SELECT
c.course_id,
c.course_name,
th.training_status_id as t_status_id,
count(distinct employee_id) as count_emp
from dbo.course c
join dbo.training_history th on c.course_id = th.course_id
group by
c.course_id,
c.course_name,
th.training_status_id)
select
t.course_id,
t.course_name,
string_agg(concat(ts.training_status_name,':',count_emp),',') as num_of_attendies_and_status
from training t
join dbo.training_status ts
on t.t_status_id = ts.training_status_id
group by
t.course_id,
t.course_name
;
course_id | course_name | num_of_attendies_and_status
--------: | :----------------------------------------------------------- | :-----------------------------
5 | React - The Complete Guide (incl Hooks, React Router, Redux) | COMPLETED: 1, COMPLETED: 1
7 | Learn Parallel Programming with C# and .NET | COMPLETED: 1, COMPLETED: 1
9 | Selenium WebDriver with Docker, Jenkins & AWS | REGISTERED: 1, REGISTERED: 1
13 | Finance for Non Finance Executives | IN PROGRESS: 1, IN PROGRESS: 1
*db<>fiddle here5dab7f3db67ba78a)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
