'Produce DISTINCT values in STRING_AGG
I'm using the STRING_AGG function in SQL Server 2017. I'd like to create the same effect as COUNT(DISTINCT <column>). I tried STRING_AGG(DISTINCT <column>,',') but that is not legal syntax.
I'd like to know if there is a T-SQL work-around. Here is my sample:
WITH Sitings
AS
(
SELECT * FROM (VALUES
(1, 'Florida', 'Orlando', 'bird'),
(2, 'Florida', 'Orlando', 'dog'),
(3, 'Arizona', 'Phoenix', 'bird'),
(4, 'Arizona', 'Phoenix', 'dog'),
(5, 'Arizona', 'Phoenix', 'bird'),
(6, 'Arizona', 'Phoenix', 'bird'),
(7, 'Arizona', 'Phoenix', 'bird'),
(8, 'Arizona', 'Flagstaff', 'dog')
) F (ID, State, City, Siting)
)
SELECT State, City, COUNT(DISTINCT Siting) [# Of Types], STRING_AGG(Siting,',') Animals
FROM Sitings
GROUP BY State, City
The above produces the following result:
+---------+-----------+--------------+-------------------------+
| State | City | # Of Types | Animals |
+---------+-----------+--------------+-------------------------+
| Arizona | Flagstaff | 1 | dog |
| Florida | Orlando | 2 | dog,bird |
| Arizona | Phoenix | 2 | bird,bird,bird,dog,bird |
+---------+-----------+--------------+-------------------------+
The output is exactly what I want, except I want the concatenated "Animals" listed for Phoenix Arizona to be DISTINCT, like this:
+---------+-----------+--------------+--------------------+
| State | City | # Of Types | Animals |
+---------+-----------+--------------+--------------------+
| Arizona | Flagstaff | 1 | dog |
| Florida | Orlando | 2 | dog,bird |
| Arizona | Phoenix | 2 | bird,dog |
+---------+-----------+--------------+--------------------+
Any ideas?
When I use my real data set, which is much larger, I get an error about the "Animals" column exceeding 8000 characters.
My question I think is the same as this one, except my example is much simpler.
Solution 1:[1]
just use sub-query
WITH Sitings
AS
(
SELECT * FROM (VALUES
(1, 'Florida', 'Orlando', 'bird'),
(2, 'Florida', 'Orlando', 'dog'),
(3, 'Arizona', 'Phoenix', 'bird'),
(4, 'Arizona', 'Phoenix', 'dog'),
(5, 'Arizona', 'Phoenix', 'bird'),
(6, 'Arizona', 'Phoenix', 'bird'),
(7, 'Arizona', 'Phoenix', 'bird'),
(8, 'Arizona', 'Flagstaff', 'dog')
) F (ID, State, City, Siting)
)
select State,City,count(*) as [# Of Types],STRING_AGG(Siting,',') AS Animals from
(
SELECT State, City, Siting
FROM Sitings
GROUP BY State, City,Siting
) as T group by State,City
http://sqlfiddle.com/#!18/ba4b8/11
State City # Of Types Animals
Arizona Flagstaff 1 dog
Florida Orlando 2 bird,dog
Arizona Phoenix 2 bird,dog
Solution 2:[2]
Here is one more way of doing it (sql fiddle):
WITH Sitings
AS
(
SELECT * FROM (VALUES
(1, 'Florida', 'Orlando', 'bird'),
(2, 'Florida', 'Orlando', 'dog'),
(3, 'Arizona', 'Phoenix', 'bird'),
(4, 'Arizona', 'Phoenix', 'dog'),
(5, 'Arizona', 'Phoenix', 'bird'),
(6, 'Arizona', 'Phoenix', 'bird'),
(7, 'Arizona', 'Phoenix', 'bird'),
(8, 'Arizona', 'Flagstaff', 'dog')
) F (ID, State, City, Siting)
)
select State,City,count(*) as [# Of Sitings],(select string_agg(value,', ') from (select distinct value from string_split(string_agg(Siting, ','),',')) t) AS Animals
FROM Sitings
GROUP BY State, City
You may easily convert the splitting and merging part into a reusable scalar valued function.
NOTE
This is NOT an optimal solution, if you group first and then do aggregate (like answers above) it is better. Also, it does not get # of Types, it gets # of Sitings instead. However, it becomes handy as a quick inline function.
Solution 3:[3]
Of course a very late reply.
This is another way. The Siting in STRING_AGG(Siting,',') can be subqueried to returned a DISTINCT list of SITTING where the grouped keys match STATE and CITIES.
WITH Sitings
AS
(
SELECT * FROM (VALUES
(1, 'Florida', 'Orlando', 'bird'),
(2, 'Florida', 'Orlando', 'dog'),
(3, 'Arizona', 'Phoenix', 'bird'),
(4, 'Arizona', 'Phoenix', 'dog'),
(5, 'Arizona', 'Phoenix', 'bird'),
(6, 'Arizona', 'Phoenix', 'bird'),
(7, 'Arizona', 'Phoenix', 'bird'),
(8, 'Arizona', 'Flagstaff', 'dog')
) F (ID, State, City, Siting)
)
SELECT
S.State,
S.City,
COUNT(DISTINCT S.Siting) AS [# Of Types],
--STRING_AGG(S.Siting,',') AS Animals
(
SELECT STRING_AGG(U.SITING, ',')
FROM
(
SELECT DISTINCT T.Siting
FROM Sitings AS T
WHERE
T.State = S.State AND
T.City = S.City
) AS U
) AS ANIMAL
FROM
Sitings AS S
GROUP BY
S.State,
S.City
ORDER BY
S.State,
S.City
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Zaynul Abadin Tuhin |
| Solution 2 | |
| Solution 3 | Kwok Hoi Man |
