'Acceptable way to get around a UNIQUE w NULL constraint?
I have a table as follows:
I would like to add a UNIQUE KEY on (connection_id, parent_container_id, name). However, parent_container_id and connection_id are a disjoin union -- the record must have one or the other. Because of this, I thought perhaps using the value 1 as the 'null' value (that is, the first entry when using an auto-incrementing ID) and creating a BASE entry in the container table for the 1 record.
Example data that I want to enforce uniqueness on:
(connection_id=1, parent_container=null, name="hello")
(connection_id=1, parent_container=null, name="hello") # should fail
(connection_id=null, parent_container=10, name="goodbye")
(connection_id=null, parent_container=10, name="goodbye") # should fail
Is this a poor approach to use or does this seem like a good way to enforce the Unique-ness constraint here? If it's not a good way, what might be a better way?
Update: my current solution is now using a generated (virtual) column with an md5 hash:
ALTER TABLE container ADD unique_hash2 CHAR(32) GENERATED ALWAYS AS
(MD5(CONCAT(COALESCE(connection_id, '-1'), COALESCE(parent_container_id, '-1'), name))) VIRTUAL UNIQUE
Solution 1:[1]
Enforcing such requirements with:
two unique indexes
check constraint
Table definition:
CREATE TABLE tab(
container_id INT PRIMARY KEY AUTO_INCREMENT,
connection_id INT,
parent_container_id INT,
name VARCHAR(100),
UNIQUE INDEX (connection_id, name),
UNIQUE INDEX (parent_container_id, name),
CHECK ((connection_id IS NULL AND parent_container_id IS NOT NULL)
OR
(connection_id IS NOT NULL AND parent_container_id IS NULL))
);
Data test:
INSERT INTO tab(connection_id, parent_container_id, name)
VALUES (1, NULL, 'hello');
INSERT INTO tab(connection_id, parent_container_id, name)
VALUES (1, NULL, 'hello');
-- Duplicate entry '1-hello' for key 'tab.connection_id'
INSERT INTO tab(connection_id, parent_container_id, name)
VALUES (NULL, 10, 'goodbay');
INSERT INTO tab(connection_id, parent_container_id, name)
VALUES (NULL, 10, 'goodbay');
-- Duplicate entry '10-goodbay' for key 'tab.parent_container_id'
INSERT INTO tab(connection_id, parent_container_id, name)
VALUES (NULL, NULL, 'a');
-- Check constraint 'tab_chk_1' is violated.
INSERT INTO tab(connection_id, parent_container_id, name)
VALUES (2, 2, 'a');
-- Check constraint 'tab_chk_1' is violated.
Solution 2:[2]
Some things are best done in the application. Use it to do the enforcement. Or consider using Stored Procedure wrappers around the SQL code needed.
The one PRIMARY KEY for a table must include non-NULL column(s). But a UNIQUE secondary index may include NULL column(s).
Hence, neither of the two columns can be the PK. Nor can the combination of them.
Solution 3:[3]
The SQL92 standard defines:
A unique constraint is satisfied if and only if no two rows in a table have the same non-null values in the unique columns. In addition, if the unique constraint was defined with PRIMARY KEY, then it requires that none of the values in the specified column or columns be the null value.
For NULL, it can be understood as unknown. Although we don't know it now, it has many possibilities in the future, but we don't know it yet. Of course many people think it is a bug.
For the two fields parent_container and connection_id described above, why not from a business point of view, for NULL, set a value that is considered impossible in business instead, so as to avoid the limitation of MySQL unique key
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Rick James |
| Solution 3 | taigetco |

