'Disambiguating columns used in group by that come from select
Assume a table person exists with columns id, first_name, last_name and table 'dog' exists with columns id, owner_id, name exist.
Take the following query:
Select
first_name + last_name as name,
count(*) dog_count
from
person
join dogs on person.id = dog.owner_id
group by
person.id,
first_name + last_name;
This query works and for this purpose, but has 1 thing that erks me whenever I have to write group by clauses and I went searching for solutions today and didn't really ever find one that I liked.
name from the query select clause doesn't exist when it processes the group by (due to the order of execution). Many RDMS allow it anyways as an extension, but even most (I think all) of those will run afoul of the following problem: if I use name instead of first_name + last_name in the group by clause, instead of going to the select clause to use name from there it'll to go name from the dog table.
I don't know if it's part of the standard, but many RDMS allow you to use column numbers instead to specify columns from the select, so I can do group by 1, but this becomes problematic with larger queries that may be dynamically specifying columns so that the ordering may change.
So... the final question: does any RDMS that anybody is aware of have a way to disambiguate by column name (or other consistent attribute) when pulling columns from the select clause? An example of something that could work would be group by select.name to specify 'take the column name from the select clause and apply it to the group by clause`?
Does anybody know if discussion on the topic has ever been brought as something to make it into SQL standards?
Solution 1:[1]
You're right that GROUP BY happens before SELECT. Therefore, GROUP BY does not yet know that first_name + last_name is aliased as name in the SELECT, rather than expecting that name would be referring to dog.name.
If it really bothers you, you can write your group by as this.
group by
person.id,
person.first_name,
person.last_name;
However, in general, regarding ambiguity, you can always put the table name before the column name to specify which table you're referring to. Assuming that column name actually exists in the base table. That still would not work there, though, as mentioned. However, it would work in an ORDER BY clause, since that happens after the SELECT.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
