'Is there a better way than using SQL Cursor for dynamic string replace on large volume set
I really need some help or direction making the below more optimised as I am not an SQL developer and the below takes too long to complete. I have looked into a few different alternatives to cursors as I read they are not the best to work on large data sets, but I haven't seen an alternative for what I need to do, as I don't believe nested REPLACE for example will work as what I need to replace is dynamic based on record lookups.
Essentially, what I am trying to do is replace any found ids that can be found within 3 columns inside a table with the new ids based on the mapping (old & new) that are in another 3 tables. I have no idea which row these ids are in and how frequent.
So what I wrote is to build a single mapping temporary table, iterate through that table and the do a string replace on the 3 columns for wherever the original Id is found (sourceId) and replace it with the new Id (TargetId).
--update: What I'm trying to do is replace old ids that are inside 3 large json fields (500+chars), with new Ids. There's no risk of incorrect mapping as each id is unique even across the source and target tables and are always 18 nchar long, so cannot be inside another, i.e cant be 10 inside 110.
Any help/guidance would be appreciated.
For info, after the 3 tables have fed into the RefIds temporary table then RefIds_TempTable is around 800,000 rows. The Order table that has the 3 replace functions applied is around 200,000 rows.
DECLARE @RefCursor CURSOR;
DECLARE @SourceID nchar(18);
DECLARE @TargetID nchar(18);
BEGIN
-- copy refids into temp table
-- declare temp table
Create Table #RefIds_TempTable (SourceID nchar(18) Primary Key, TargetID nchar(18))
Insert Into #RefIds_TempTable (SourceID, TargetID)
Select b.id, a.id
From [target database].dbo.[Product] a
Join [source database].dbo.[Product] b
On b.ProductCode = a.ProductCode
Insert Into #RefIds_TempTable (SourceID, TargetID)
Select b.id, a.id
From [target database].dbo.[AttributeCategory] a
Join [source database].dbo.[AttributeCategory] b
On b.Code = a.Code
Insert Into #RefIds_TempTable (SourceID, TargetID)
Select b.id, a.id
From [target database].dbo.[Attribute] a
Join [source database].dbo.[Attribute] b
On b.Code = a.Code
Insert Into #RefIds_TempTable (SourceID, TargetID)
Select b.id, a.id
From [target database].dbo.[Assignment] a
Join [source database].dbo.[Assignment] b
On b.Key = a.Key
BEGIN
SET @RefCursor = CURSOR FORWARD_ONLY STATIC FOR
select SourceID, TargetID from #RefIds_TempTable
OPEN @RefCursor
FETCH NEXT FROM @RefCursor INTO @SourceID, @TargetID;
WHILE @@FETCH_STATUS = 0
BEGIN
Update [target database].dbo.Orders
set JSONValue = REPLACE(JSONValue, @SourceID,@TargetID)
Update [target database].dbo.Orders
set JSONStr = REPLACE(JSONStr, @SourceID,@TargetID)
Update [target database].dbo.Orders
set FullJSON = REPLACE(FullJSON, @SourceID,@TargetID)
FETCH NEXT FROM @RefCursor INTO @SourceID, @TargetID
END;
End;
CLOSE @RefCursor ;
DEALLOCATE @RefCursor;
END;
Example of cut down JSONStr value (Usually contains many more attributes etc):
{
"ABC": [
{
"AttributDefStart": null,
"objId": "99t9Z1234561AbLPJK",
"attri": "44t9Z1234561AbLGTY",
"catId": "Y9V2U368714EUFFQWQ",
"cde": "ABC",
"catnme": "provs",
"AttributDefEnd": null
},
{
"AttributDefStart": null,
"objId": "99t9Z1234561AbLPJK",
"attri__c": "98G7Z136900XZB0FFT",
"catId": "Y9V2U368714EUFFQWQ",
"cde": "ABC",
"catnme": "provs",
"AttributDefEnd": null
}
]
}
And if I had these id mappings for example: SourceId = 99t9Z1234561AbLPJK and TargetId = 33J9Z1234561AbLLTG
then I would expect the JSON to contain the TargetId values wherever the sourceId existed.
Solution 1:[1]
When complex JSON modifications are involved, it's often easier to shred the JSON, make the joins and rebuild.
Unfortunately, this is made more complex because the JSON has dynamic keys and SQL Server does not have JSON_AGG so we need to hack it with STRING_AGG and STRING_ESCAPE
UPDATE t
SET JsonStr = j.newJson
FROM YourTable t
CROSS APPLY (
SELECT newJson = -- re-aggregate the JSON using dynamic keys
'{' +
STRING_AGG('"' + STRING_ESCAPE(j2.[key], 'json') + '":"' + newJson.json), ',') +
'}'
FROM OPENJSON(t.JsonStr) j1 -- shred the array into [key] (which is index) and [value]
CROSS APPLY OPENJSON(j1.value) -- shred the values
WITH (
AttributDefStart varchar(100),
objId varchar(100),
attri varchar(100),
catId varchar(100),
cde varchar(100),
catnme varchar(100),
AttributDefEnd varchar(100)
) j2
LEFT JOIN [source database].dbo.[Product] pS
JOIN [target database].dbo.[Product] pT ON pT.ProductCode = pS.ProductCode
ON pS.id = j2.objId
LEFT JOIN [target database].dbo.[AttributeCategory] cS
JOIN [source database].dbo.[AttributeCategory] cT ON cT.Code = cS.Code
ON cS.id = j2.catId
LEFT JOIN [target database].dbo.[Attribute] aS
JOIN [source database].dbo.[Attribute] aT ON aT.Code = aS.Code
ON aS.id = j2.attri
CROSS APPLY (
SELECT
j2.AttributDefStart,
objId = pT.id,
attri = aT.id,
catId = cT.id,
j2.cde,
j2.catnme,
j2.AttributDefEnd
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER -- rebuild as single objects
) newJson(json)
) j
You may want to change the data types of the OPENJSON schema.
You also may want to change the LEFT JOINs to INNER. Alternatively you may want to filter out cases where the join fails.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Charlieface |
