Azure SQL DB having performance issues, is 100% LOG/IO the cause?

For one of my clients, we maintain a large Azure SQL DB that is used daily to store data. A C# process creates and updates data multiple times each night; during this process, data is also inserted, deleted and updated in the DB. For some time now we have been having issues with this process: it often stops halfway through. Its logging shows that it usually stops on a DB operation, most often the DELETE step.

Because of this, we concluded that queries are taking too long, causing the process to kill itself. Other users of the DB, who query it for reporting and the like, have also been having issues lately, with queries taking too long to complete or not completing at all.

One thing I have noticed is that during processing the LOG/IO percentage often goes up to 100%. I have tried changing some scaling configurations on the DB, but that does not seem to make any difference.

It is worth noting that hundreds of millions of rows are updated/inserted/deleted each night.

The DB is currently Gen5, 8 vCores, with 1.5 TB of max storage (actual size at the moment is 280 GB, but it grows every day until the end of the year, when we normally archive the data and create a new DB).

I would really appreciate any tips and/or tricks on how to improve the performance of this DB and what we can do about the high LOG/IO percentage.
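For reference, the same LOG/IO figure shown in the portal can be pulled from the sys.dm_db_resource_stats DMV in Azure SQL Database (one row per roughly 15-second interval, covering about the last hour), for example:

-- Recent resource usage, one row per ~15 seconds (most recent first)
SELECT end_time,
       avg_log_write_percent,
       avg_cpu_percent,
       avg_data_io_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;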

I have already tried rebuilding the indexes, but fragmentation does not seem exceptionally high.
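(For anyone wanting to check the same thing, fragmentation can be read from the standard sys.dm_db_index_physical_stats DMF; the page_count threshold below is just an arbitrary cut-off to skip small indexes where fragmentation doesn't matter.)

-- Fragmentation per index, largest fragmentation first
SELECT OBJECT_NAME(ips.object_id)        AS table_name,
       i.name                            AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.page_count > 1000
ORDER BY ips.avg_fragmentation_in_percent DESC;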

[Screenshot: example of high LOG/IO during processing]



Solution 1:[1]

If your Log I/O is hitting 100%, then it'll be because you're hammering your database log files. Creating/Updating/Deleting millions of rows will do that.

Without details of your specific process it's difficult to say with certainty, but I'd hazard a guess that you don't batch your create/update/delete operations. That is to say, if you've got 50 million rows to create, 50 million rows to update and 50 million rows to delete, you do something like:

BEGIN TRANSACTION
DELETE FROM [Table] WHERE ID BETWEEN 0 AND 50000000
COMMIT

Until that operation completes, all of the changes are held in the transaction log; once it completes and SQL Server performs one of its periodic CHECKPOINTs, the data is flushed to disk and the log space can be reused.
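If you want to watch this happening, Azure SQL Database exposes current log usage through sys.dm_db_log_space_usage; polling it during the overnight run should show the log filling up and being freed, e.g.:

-- Current transaction log size and how much of it is in use
SELECT total_log_size_in_bytes / 1048576.0 AS total_log_size_mb,
       used_log_space_in_bytes / 1048576.0 AS used_log_space_mb,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage;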

So one suggestion would be to batch your overnight processes, so that each transaction only affects a smaller proportion of the total data set; taking my example above, you might do:

DECLARE @ID INT = 0
DECLARE @IDTop INT

WHILE @ID < 50000001
BEGIN

    -- Brief pause between batches so other workloads get a look-in
    WAITFOR DELAY '00:00:01'

    SET @IDTop = @ID + 10000

    -- Each batch touches at most 10000 rows, so each transaction
    -- only generates a small amount of log before it commits
    BEGIN TRANSACTION
    DELETE FROM [Table]
    WHERE ID >= @ID AND ID < @IDTop
    COMMIT

    SET @ID = @IDTop

END

This iterates through the table in 10,000-row blocks. It's obviously just an example, and the batch size and delay can be tweaked, but it should give you an idea of the structure.
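If the rows you need to remove aren't identified by a contiguous ID range, a similar effect can be had with a TOP-based loop; the table and filter below ([OldData], CreatedDate) are placeholders for whatever actually identifies the rows to delete:

-- Keep deleting in small batches until nothing matches the filter.
-- [OldData] and CreatedDate are placeholder names for this example.
DECLARE @BatchRows INT = 1

WHILE @BatchRows > 0
BEGIN
    DELETE TOP (10000) FROM [OldData]
    WHERE CreatedDate < '2022-01-01'

    SET @BatchRows = @@ROWCOUNT

    WAITFOR DELAY '00:00:01'
END

Each DELETE here is its own implicit transaction, so the log space used by one batch can be reclaimed before the next one runs.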

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1: Anthony Norwood