'SQL Long String into Substrings by a length and whitespaces

I Want to make out of a String ("Hello this is a String That is very odd") Substrings by a defined legth (eg. 8) so that when string gets cut at index 8 but alway at the whitespaces not in a word.

lenth : 11 ("Hello this is a String That is very odd") --> ("Hello this"),("is a String"),("That is"),("very odd")

I alredy have an array of the indexes Of the whitespaces but i dont know further.

I appreciate if you would help me



Solution 1:[1]

There is no easy solution... So the simple answer is: Do not use SQL-Server for this issue. It's just the wrong tool.

Nevertheless this can be done (if you have to):

--Some declared table to mock your scenario

DECLARE @tbl TABLE(ID INT IDENTITY, YourString NVARCHAR(1000));
INSERT INTO @tbl VALUES('Hello this is a String That is very odd')
                      ,('blah')
                      ,('And one withaverylongword');

--use this to define the portion's length. 8 will be to little...

DECLARE @portionLenght INT = 12;

-the query

WITH cte AS
(
    SELECT t.ID
          ,A.[key] AS fragmentPosition
          ,A.[value] AS fragment
    FROM @tbl t
    CROSS APPLY OPENJSON(CONCAT('["',REPLACE(t.YourString,' ','","'),'"]')) A
)
,recCTE AS
(
    SELECT ID,fragmentPosition,fragment 
          ,0 AS growingIndex
          ,CAST(fragment AS NVARCHAR(MAX)) AS growingString

    FROM cte 
    WHERE fragmentPosition=0

    UNION ALL

    SELECT cte.ID
          ,cte.fragmentPosition
          ,cte.fragment
          ,recCTE.growingIndex + CASE WHEN B.newLength>@portionLenght THEN 1 ELSE 0 END
          ,CASE WHEN B.newLength>@portionLenght THEN cte.fragment ELSE CONCAT(recCTE.growingString,N' ',cte.fragment) END
    FROM recCTE 
    INNER JOIN cte ON cte.ID=recCTE.ID AND  cte.fragmentPosition=recCTE.fragmentPosition+1
    CROSS APPLY(SELECT LEN(CONCAT(recCTE.growingString,N' ',cte.fragment))) B(newLength)
)
,final AS
(
    SELECT *
          ,ROW_NUMBER() OVER(PARTITION BY ID,growingIndex ORDER BY fragmentPosition DESC) lastGrowing
    FROM recCTE
)
SELECT * FROM final
WHERE lastGrowing=1
ORDER BY ID,fragmentPosition; 

The result (with length=12)

1   Hello this
1   is a String
1   That is very
1   odd
2   blah
3   And one
3   withaverylongword

The idea in short

  • we use a trick with OPENJSON to transform your string into a json array and split it with a guaranteed sort order.
  • we use a recursive CTE to run through your fragments.
  • each iteration will calculate the total length of the former parts together with the new fragment
  • depending on this calculation the fragment will either be added or a new protion is opened.
  • the final CTE will add a partitioned ROW_NUMBER() to find the last entry per portion.

And no, you should not use this... :-)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Shnugo