Case insensitive duplicates SQL
So I have a users table where the user.username has many duplicates like:
username and Username and useRnAme
john and John and jOhn
That was a bug and these three records should have been only one.
I'm trying to come up with a SQL query that lists all of these cases ordered by their creation date, so ideally the result should be something like this:
username jan01
useRnAme jan02
Username jan03
john feb01
John feb02
jOhn feb03
Any suggestions will be much appreciated.
Solution 1:[1]
Try something like this:
SELECT UserName, CreatedDate
FROM User
WHERE LOWER(TRIM(UserName)) IN
(
SELECT LOWER(TRIM(UserName))
FROM User
GROUP BY LOWER(TRIM(UserName))
HAVING count(*) > 1
)
ORDER BY LOWER(TRIM(UserName)), CreatedDate
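To try the query above without touching a real database, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The table name `users`, the column names, the sample rows and the dates are all assumptions made up for illustration (the dates stand in for jan01 ... feb03 from the question). The query is run with an ORDER BY so that each group of colliding names comes out together, sorted by creation date.

```python
import sqlite3

# Hypothetical sample data modelled on the question; "alice" has no duplicates.
ROWS = [
    ("username", "2020-01-01"),
    ("useRnAme", "2020-01-02"),
    ("Username", "2020-01-03"),
    ("john", "2020-02-01"),
    ("John", "2020-02-02"),
    ("jOhn", "2020-02-03"),
    ("alice", "2020-03-01"),
]

# Keep only rows whose lowercased, trimmed name occurs more than once.
QUERY = """
SELECT UserName, CreatedDate
FROM users
WHERE LOWER(TRIM(UserName)) IN (
    SELECT LOWER(TRIM(UserName))
    FROM users
    GROUP BY LOWER(TRIM(UserName))
    HAVING COUNT(*) > 1
)
ORDER BY LOWER(TRIM(UserName)), CreatedDate
"""


def find_case_duplicates(rows):
    """Return (UserName, CreatedDate) rows whose name collides ignoring case."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (UserName TEXT, CreatedDate TEXT)")
        conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


duplicates = find_case_duplicates(ROWS)
for name, created in duplicates:
    print(name, created)
```

With this data the three john rows are listed first and the three username rows after them, each group in date order, and alice is left out. Note that SQLite's built-in LOWER() only folds ASCII letters, so names with non-ASCII characters may need different handling there.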
Solution 2:[2]
Use LOWER() (or your engine's equivalent function) in your SELECT, and order by that column.
Solution 3:[3]
In MySQL, a case-sensitive compare is done using a binary collation. So you could join the table on itself, looking for rows where the case sensitive compare is different from the case insensitive compare:
select *
from YourTable t1
inner join YourTable t2
on t1.name <> t2.name collate latin1_bin
and t1.name = t2.name
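The same idea can be tried outside MySQL. The sketch below swaps in SQLite (run from Python), where the default comparison is already case sensitive (BINARY) and `COLLATE NOCASE` gives the case-insensitive one, so the roles of the two conditions are mirrored compared to the MySQL query. This is an analogous technique, not the MySQL collations themselves; the table `YourTable`, its `name` column and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical rows: "john" and "John" differ only by case, "mary" is unique.
NAMES = [("john",), ("John",), ("mary",)]

# t1.name <> t2.name               -> case-sensitive (SQLite default, BINARY)
# t1.name = t2.name COLLATE NOCASE -> case-insensitive (ASCII letters only)
QUERY = """
SELECT t1.name, t2.name
FROM YourTable t1
INNER JOIN YourTable t2
    ON t1.name <> t2.name
   AND t1.name = t2.name COLLATE NOCASE
"""


def case_variants(names):
    """Return pairs of names that are equal ignoring case but not identical."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE YourTable (name TEXT)")
        conn.executemany("INSERT INTO YourTable VALUES (?)", names)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


variant_pairs = case_variants(NAMES)
print(sorted(variant_pairs))
```

Each colliding pair shows up in both directions (john/John and John/john), which is inherent to this kind of self-join.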
Solution 4:[4]
SELECT UserName, CreatedDate
FROM YourTable
WHERE UserName COLLATE UTF8_BIN != LOWER(UserName COLLATE UTF8_BIN)
GROUP BY UserName, CreatedDate
HAVING COUNT(*) > 1
Solution 5:[5]
This is what I came up with. It was written against a Postgres database but should still work on other SQL engines.
select * from users u join users u2
on upper(u.email) = upper(u2.email) where u.id != u2.id
order by u.email;
The query assumes that the emails are duplicated but the ids are not, so it pulls records whose email matches another record's email (case insensitive) while the ids differ.
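One thing to be aware of with this self-join: because the condition is `!=`, every colliding pair is returned twice, once in each direction. A minimal sketch of a variant that uses `<` on the ids to keep one row per pair is shown below, run against an in-memory SQLite database from Python. The table name `users`, the `id` and `email` columns and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical rows: ids 1 and 2 share an email up to case; id 3 is unique.
ROWS = [
    (1, "John@example.com"),
    (2, "john@example.com"),
    (3, "mary@example.com"),
]

# Same join as above, but "<" instead of "!=" keeps one row per pair.
QUERY = """
SELECT u.id, u.email, u2.id, u2.email
FROM users u
JOIN users u2
  ON UPPER(u.email) = UPPER(u2.email)
WHERE u.id < u2.id
ORDER BY UPPER(u.email), u.id
"""


def duplicate_pairs(rows):
    """Return one (id, email, id, email) row per pair of colliding emails."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
        conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


pairs = duplicate_pairs(ROWS)
print(pairs)
```

With three or more variants of the same email this still returns several rows (one per pair), so the GROUP BY approach from Solution 1 may be easier to read when the groups are large.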
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Christoph |
| Solution 2 | 3Dave |
| Solution 3 | Andomar |
| Solution 4 | cske |
| Solution 5 | Badmous |
