In SQL, what is the right way to detect ASCII zero characters in a string?
What is the right way to detect ASCII zero characters?
I'm perplexed that in my example query a plain CHARINDEX call doesn't find the character in the string.
It only finds it if I manipulate the string first, for example by reversing it or by chopping it down to the last character.
This happens whether it is a trailing or interior character, but not when it is the initial character.
I need to do this to reliably eliminate them.
It has something to do with Unicode, because a strictly varchar variant of this works correctly.
My example:
declare @testval as nvarchar(max) = N'abcde' + nchar(0);
select
    charindex(char(0), @testval) where_is_char0
  , charindex(nchar(0), @testval) where_is_nchar0
  , charindex(nchar(0), reverse(@testval)) where_is_reversed_nchar0
  , charindex(nchar(0), right(@testval, 1)) where_is_lastchar_nchar0;
Here are the results:
where_is_char0       where_is_nchar0      where_is_reversed_nchar0 where_is_lastchar_nchar0
-------------------- -------------------- ------------------------ ------------------------
0                    0                    1                        1
Solution 1:[1]
CHARINDEX docs say:
0x0000 (char(0)) is an undefined character in Windows collations and cannot be included in CHARINDEX.
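Because the documentation ties this limitation to Windows collations, a commonly suggested workaround is to force a binary collation on the string being searched. The following is a minimal sketch rather than a verified fix: the choice of Latin1_General_100_BIN2 is an assumption (other binary collations may behave the same way), so check the output on your own instance before relying on it.

```sql
-- Sketch: find and strip NCHAR(0) under a binary collation.
-- Assumption: Latin1_General_100_BIN2 is available and compares 0x0000
-- as an ordinary code point, unlike the Windows collations quoted above.
DECLARE @testval nvarchar(max) = N'abcde' + NCHAR(0);

SELECT
    -- Character position of the first zero character (0 if none is found).
    CHARINDEX(NCHAR(0), @testval COLLATE Latin1_General_100_BIN2) AS where_is_nchar0,
    -- Length in characters once every zero character has been replaced away.
    LEN(REPLACE(@testval COLLATE Latin1_General_100_BIN2, NCHAR(0), N'')) AS len_after_removal;
```

If this behaves as expected on your server, the same REPLACE expression could be used in an UPDATE to clean a column.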
You can convert the nvarchar string to varbinary and search for the binary value 0x0000.
declare @testval as nvarchar(max) = N'abcde' + nchar(0)
declare @bin_testval as varbinary(max) = CAST(@testval as varbinary(max));
select
charindex(0x0000, @bin_testval) where_is_char0
,@bin_testval AS BinaryValue
;
Result
+----------------+----------------------------+
| where_is_char0 | BinaryValue |
+----------------+----------------------------+
| 10 | 0x610062006300640065000000 |
+----------------+----------------------------+
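Keep in mind that nvarchar stores each character as 2 bytes, so the result is a byte offset, not a character position. Here the match at byte 10 actually spans the high byte of 'e' (byte 10) and the low byte of the zero character (byte 11). The zero character itself occupies bytes 11 and 12, which makes it character 6 of the original string.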
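The conversion from byte offset to character position can be sketched as follows. A 0x0000 match can straddle two adjacent characters, as the match at byte 10 in the result above does, so the loop skips matches that start on an even byte and keeps searching. The variable names are illustrative, and positions are counted in 2-byte UTF-16 code units.

```sql
DECLARE @testval nvarchar(max) = N'abcde' + NCHAR(0);
DECLARE @bin varbinary(max) = CAST(@testval AS varbinary(max));

-- First 1-based byte offset at which two consecutive zero bytes occur.
DECLARE @bytepos bigint = CHARINDEX(0x0000, @bin);

-- Characters start on odd byte offsets (1, 3, 5, ...). An even offset means
-- the match spans the high byte of one character and the low byte of the
-- next, so resume the search one byte further along.
WHILE @bytepos > 0 AND @bytepos % 2 = 0
    SET @bytepos = CHARINDEX(0x0000, @bin, @bytepos + 1);

-- Convert the aligned byte offset to a character position; 0 means not found.
SELECT CASE WHEN @bytepos > 0 THEN (@bytepos + 1) / 2 ELSE 0 END AS where_is_nchar0;
```

The alignment check matters for real data: a character whose low byte is zero directly after a character whose high byte is zero produces two adjacent zero bytes even though no zero character is present.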
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
