'Find NON-duplicate column value combinations
This is for a migration script.
CompanyTable:
| EmployeeId | DivisionId |
|---|---|
| abc | div1 |
| def | div1 |
| abc | div1 |
| abc | div2 |
| xyz | div2 |
In the below code I am Selecting duplicate EmployeeId-DivisionId combinations, that is, the records that have the same EmployeeId and DivisionId will be selected. So from the above table, the two rows that have abc-div1 combination will be selected by the below code.
How can I invert it? It seems so simple but I can't figure it out. I tried replacing with HAVING count(*) = 0 instead of > 1, I've tried fiddling with the equality signs in the ON and AND lines. Basically from the above table, I want to select the other three rows that don't have the abc-div1 combination. If there is a way to select all the unique EmployeeID-DivisionId combinations, let me know.
SELECT a.EmployeeID, a.DivisionId FROM CompanyTable a
JOIN ( SELECT EmployeeID, DivisionId
FROM CompanyTable
GROUP BY EmployeeID, DivisionId
HAVING count(*) > 1 ) b
ON a.EmployeeID = b.EmployeeID
AND a.DivisionId = b.DivisionId;
EmployeeId and DivisionId are both nvarchar(50) columns.
Solution 1:[1]
As already mentioned, you must replace > 1 by its real opposite <= 1, this works: db<>fiddle
Solution 2:[2]
A windowed count would seem a suitable method:
select employeeid, divisionid
from (
select *, Count(*) over(partition by employeeid, divisionid) ct
from t
)t
where ct = 1;
Solution 3:[3]
First, let's try rewriting your query using a common table expression (CTE), instead of a subquery:
WITH cteCompanyTableStats as (
SELECT
EmployeeID, DivisionId,
HasDuplicates = CASE WHEN count(*) > 1 THEN1 ELSE 0 END
FROM CompanyTable
GROUP BY EmployeeID, DivisionId
)
SELECT ct.*
FROM CompanyTable ct
inner join cteCompanyTableStats cts on
ct.EmployeeId = cts.EmployeeId
and ct.DivisionId = cts.DivisionId
and cts.HasDuplicates = 1
Notice how I've removed the HAVING clause & added a new HasDuplicates column? We're going to use that new column to find all of the table rows that -DON'T- have duplicates:
WITH cteCompanyTableStats as (
SELECT
EmployeeID, DivisionId,
HasDuplicates = CASE WHEN count(*) > 1 THEN1 ELSE 0 END
FROM CompanyTable
GROUP BY EmployeeID, DivisionId
)
SELECT ct.*
FROM CompanyTable ct
inner join cteCompanyTableStats cts on
ct.EmployeeId = cts.EmployeeId
and ct.DivisionId = cts.DivisionId
and cts.HasDuplicates = 0
The only character of SQL code that changed between the two queries was the last line, where and cts.HasDuplicates = ### is set.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jonas Metzler |
| Solution 2 | |
| Solution 3 | Mass Dot Net |
