'Using T-SQL, return nth delimited element from a string
I have a need to create a function the will return nth element of a delimited string.
For a data migration project, I am converting JSON audit records stored in a SQL Server database into a structured report using SQL script. Goal is to deliver a sql script and a sql function used by the script without any code.
(This is a short-term fix will be used while a new auditing feature is added the ASP.NET/MVC application)
There is no shortage of delimited string to table examples available. I've chosen a Common Table Expression example http://www.sqlperformance.com/2012/07/t-sql-queries/split-strings
Example: I want to return 67 from '1,222,2,67,888,1111'
Solution 1:[1]
Here is my initial solution... It is based on work by Aaron Bertrand http://www.sqlperformance.com/2012/07/t-sql-queries/split-strings
I simply changed the return type to make it a scalar function.
Example: SELECT dbo.GetSplitString_CTE('1,222,2,67,888,1111',',',4)
CREATE FUNCTION dbo.GetSplitString_CTE
(
@List VARCHAR(MAX),
@Delimiter VARCHAR(255),
@ElementNumber int
)
RETURNS VARCHAR(4000)
AS
BEGIN
DECLARE @result varchar(4000)
DECLARE @Items TABLE ( position int IDENTITY PRIMARY KEY,
Item VARCHAR(4000)
)
DECLARE @ll INT = LEN(@List) + 1, @ld INT = LEN(@Delimiter);
WITH a AS
(
SELECT
[start] = 1,
[end] = COALESCE(NULLIF(CHARINDEX(@Delimiter,
@List, @ld), 0), @ll),
[value] = SUBSTRING(@List, 1,
COALESCE(NULLIF(CHARINDEX(@Delimiter,
@List, @ld), 0), @ll) - 1)
UNION ALL
SELECT
[start] = CONVERT(INT, [end]) + @ld,
[end] = COALESCE(NULLIF(CHARINDEX(@Delimiter,
@List, [end] + @ld), 0), @ll),
[value] = SUBSTRING(@List, [end] + @ld,
COALESCE(NULLIF(CHARINDEX(@Delimiter,
@List, [end] + @ld), 0), @ll)-[end]-@ld)
FROM a
WHERE [end] < @ll
)
INSERT @Items SELECT [value]
FROM a
WHERE LEN([value]) > 0
OPTION (MAXRECURSION 0);
SELECT @result=Item
FROM @Items
WHERE position=@ElementNumber
RETURN @result;
END
GO
Solution 2:[2]
This is the easiest answer to rerieve the 67 (type-safe!!):
SELECT CAST('<x>' + REPLACE('1,222,2,67,888,1111',',','</x><x>') + '</x>' AS XML).value('/x[4]','int')
In the following you will find examples how to use this with variables for the string, the delimiter and the position (even for edge-cases with XML-forbidden characters)
The easy one
This question is not about a string split approach, but about how to get the nth element. The easiest, fully inlineable way would be this IMO:
This is a real one-liner to get part 2 delimited by a space:
DECLARE @input NVARCHAR(100)=N'part1 part2 part3';
SELECT CAST(N'<x>' + REPLACE(@input,N' ',N'</x><x>') + N'</x>' AS XML).value('/x[2]','nvarchar(max)')
Variables can be used with sql:variable()
or sql:column()
Of course you can use variables for delimiter and position (use sql:column
to retrieve the position directly from a query's value):
DECLARE @dlmt NVARCHAR(10)=N' ';
DECLARE @pos INT = 2;
SELECT CAST(N'<x>' + REPLACE(@input,@dlmt,N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)')
Edge-Case with XML-forbidden characters
If your string might include forbidden characters, you still can do it this way. Just use FOR XML PATH
on your string first to replace all forbidden characters with the fitting escape sequence implicitly.
It's a very special case if - additionally - your delimiter is the semicolon. In this case I replace the delimiter first to '#DLMT#', and replace this to the XML tags finally:
SET @input=N'Some <, > and &;Other äöü@€;One more';
SET @dlmt=N';';
SELECT CAST(N'<x>' + REPLACE((SELECT REPLACE(@input,@dlmt,'#DLMT#') AS [*] FOR XML PATH('')),N'#DLMT#',N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)');
UPDATE for SQL-Server 2016+
Regretfully the developers forgot to return the part's index with STRING_SPLIT
. But, using SQL-Server 2016+, there is JSON_VALUE
and OPENJSON
.
With JSON_VALUE
we can pass in the position as the index' array.
For OPENJSON
the documentation states clearly:
When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.
A string like 1,2,3
needs nothing more than brackets: [1,2,3]
.
A string of words like this is an example
needs to be ["this","is","an"," example"]
.
These are very easy string operations. Just try it out:
DECLARE @str VARCHAR(100)='Hello John Smith';
DECLARE @position INT = 2;
--We can build the json-path '$[1]' using CONCAT
SELECT JSON_VALUE('["' + REPLACE(@str,' ','","') + '"]',CONCAT('$[',@position-1,']'));
--See this for a position safe string-splitter (zero-based):
SELECT JsonArray.[key] AS [Position]
,JsonArray.[value] AS [Part]
FROM OPENJSON('["' + REPLACE(@str,' ','","') + '"]') JsonArray
In this post I tested various approaches and found, that OPENJSON
is really fast. Even much faster than the famous "delimitedSplit8k()" method...
UPDATE 2 - Get the values type-safe
We can use an array within an array simply by using doubled [[]]
. This allows for a typed WITH
-clause:
DECLARE @SomeDelimitedString VARCHAR(100)='part1|1|20190920';
DECLARE @JsonArray NVARCHAR(MAX)=CONCAT('[["',REPLACE(@SomeDelimitedString,'|','","'),'"]]');
SELECT @SomeDelimitedString AS TheOriginal
,@JsonArray AS TransformedToJSON
,ValuesFromTheArray.*
FROM OPENJSON(@JsonArray)
WITH(TheFirstFragment VARCHAR(100) '$[0]'
,TheSecondFragment INT '$[1]'
,TheThirdFragment DATE '$[2]') ValuesFromTheArray
Solution 3:[3]
How about:
CREATE FUNCTION dbo.NTH_ELEMENT (@Input NVARCHAR(MAX), @Delim CHAR = '-', @N INT = 0)
RETURNS NVARCHAR(MAX)
AS
BEGIN
RETURN (SELECT VALUE FROM STRING_SPLIT(@Input, @Delim) ORDER BY (SELECT NULL) OFFSET @N ROWS FETCH NEXT 1 ROW ONLY)
END
Solution 4:[4]
@a - the value (f.e. 'a/bb/ccc/dddd/ee/ff/....')
@p - the desired position (1,2,3...)
@d - the delimeter ( '/' )
trim(substring(replace(@a,@d,replicate(' ',len(@a))),(@p-1)*len(@a)+1,len(@a)))
only problem is - if desired part has trailing or leading blanks they get trimmed.
Completely Based on article from https://exceljet.net/formula/split-text-with-delimiter
Solution 5:[5]
In a rare moment of lunacy I just thought that split is far easier if we use XML to parse it out for us:
(Using the variables from @Gary Kindel's answer)
declare @xml xml
set @xml = '<split><el>' + replace(@list,@Delimiter,'</el><el>') + '</el></split>'
select
el = split.el.value('.','varchar(max)')
from @xml.nodes('/split/el') split(el))
This lists all elements of the string, split by the specified character.
We can use an xpath test to filter out empty values, and a further xpath test to restrict this to the element we're interested in. In full Gary's function becomes:
alter FUNCTION dbo.GetSplitString_CTE
(
@List VARCHAR(MAX),
@Delimiter VARCHAR(255),
@ElementNumber int
)
RETURNS VARCHAR(max)
AS
BEGIN
-- escape any XML https://dba.stackexchange.com/a/143140/65992
set @list = convert(VARCHAR(MAX),(select @list for xml path(''), type));
declare @xml xml
set @xml = '<split><el>' + replace(@list,@Delimiter,'</el><el>') + '</el></split>'
declare @ret varchar(max)
set @ret = (select
el = split.el.value('.','varchar(max)')
from @xml.nodes('/split/el[string-length(.)>0][position() = sql:variable("@elementnumber")]') split(el))
return @ret
END
Solution 6:[6]
you can put this select into UFN. if you need you can customize it for specifying delimiter as well. in that case your ufn will have two input. number Nth and delimiter to use.
DECLARE @tlist varchar(max)='10,20,30,40,50,60,70,80,90,100'
DECLARE @i INT=1, @nth INT=3
While len(@tlist) <> 0
BEGIN
IF @i=@nth
BEGIN
select Case when charindex(',',@tlist) <> 0 Then LEFT(@tlist,charindex(',',@tlist)-1)
Else @tlist
END
END
Select @tlist = Case when charindex(',',@tlist) <> 0 Then substring(@tlist,charindex(',',@tlist)+1,len(@tlist))
Else ''
END
SELECT @i=@i+1
END
Solution 7:[7]
I would rather create a temp table with an identity column and fill it up with output from the SPLIT
function.
CREATE TABLE #tblVals(Id INT IDENTITY(1,1), Val NVARCHAR(100))
INSERT INTO #tblVals (Val)
SELECT [value] FROM STRING_SPLIT('Val1-Val3-Val2-Val5', '-')
SELECT * FROM #tblVals
Now you can easily do something like below.
DECLARE @val2 NVARCHAR(100) = (SELECT TOP 1 Val FROM #tblVals WHERE Id = 2)
See the snapshot below:
Solution 8:[8]
I don't have enough reputation to comment, so I am adding an answer. Please adjust as appropriate.
I have a problem with Gary Kindel's answer for cases where there is nothing between the two delimiters
If you do select * from dbo.GetSplitString_CTE('abc^def^^ghi','^',3) you get ghi instead of an empty string
If you comment out the WHERE LEN([value]) > 0 line, you get the desired result
Solution 9:[9]
I cannot comment on Gary's solution because of my low reputation
I know Gary was referencing another link.
I have struggled to understand why we need this variable
@ld INT = LEN(@Delimiter)
I also don't understand why charindex has to start at the position of length of delimiter, @ld
I tested with many examples with a single character delimiter, and they work. Most of the time, delimiter character is a single character. However, since the developer included the ld as length of delimiter, the code has to work for delimiters that have more than one character
In this case, the following case will fail
11,,,22,,,33,,,44,,,55,,,
I cloned from the codes from this link. http://codebetter.com/raymondlewallen/2005/10/26/quick-t-sql-to-parse-a-delimited-string/
I have tested various scenarios including the delimiters that have more than one character
alter FUNCTION [dbo].[split1]
(
@string1 VARCHAR(8000) -- List of delimited items
, @Delimiter VARCHAR(40) = ',' -- delimiter that separates items
, @ElementNumber int
)
RETURNS varchar(8000)
AS
BEGIN
declare @position int
declare @piece varchar(8000)=''
declare @returnVal varchar(8000)=''
declare @Pattern varchar(50) = '%' + @Delimiter + '%'
declare @counter int =0
declare @ld int = len(@Delimiter)
declare @ls1 int = len (@string1)
declare @foundit int = 0
if patindex(@Pattern , @string1) = 0
return ''
if right(rtrim(@string1),1) <> @Delimiter
set @string1 = @string1 + @Delimiter
set @position = patindex(@Pattern , @string1) + @ld -1
while @position > 0
begin
set @counter = @counter +1
set @ls1 = len (@string1)
if (@ls1 >= @ld)
set @piece = left(@string1, @position - @ld)
else
break
if (@counter = @ElementNumber)
begin
set @foundit = 1
break
end
if len(@string1) > 0
begin
set @string1 = stuff(@string1, 1, @position, '')
set @position = patindex(@Pattern , @string1) + @ld -1
end
else
set @position = -1
end
if @foundit =1
set @returnVal = @piece
else
set @returnVal = ''
return @returnVal
Solution 10:[10]
Alternatively, one can use xml
, nodes()
and ROW_NUMBER
. We can order the elements based on their document order. For example:
DECLARE @Input VARCHAR(100) = '1a,2b,3c,4d,5e,6f,7g,8h'
,@Number TINYINT = 3
DECLARE @XML XML;
DECLARE @value VARCHAR(100);
SET @XML = CAST('<x>' + REPLACE(@Input,',','</x><x>') + '</x>' AS XML);
WITH DataSource ([rowID], [rowValue]) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY T.c ASC)
,T.c.value('.', 'VARCHAR(100)')
FROM @XML.nodes('./x') T(c)
)
SELECT @value = [rowValue]
FROM DataSource
WHERE [rowID] = @Number;
SELECT @value;
Solution 11:[11]
You can use STRING_SPLIT with ROW_NUMBER:
SELECT value, idx FROM
(
SELECT
value,
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) idx
FROM STRING_SPLIT('Lorem ipsum dolor sit amet.', ' ')
) t
WHERE idx=2
returns second element (idx=2): 'ipsum'
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow