'Deadlock involving SELECT FOR UPDATE

I have transaction with several queries. First, a select rows with FOR UPDATE lock:

SELECT f.source_id FROM files AS f WHERE
    f.component_id = $1 AND
    f.archived_at IS NULL
FOR UPDATE

Next, there is an update query:

UPDATE files AS f SET archived_at = NOW()
WHERE
hw_component_id = $1 AND
f.source_id = ANY($2::text[])

And then there is an insert:

INSERT INTO files AS f (
    source_id,
    ...
)
VALUES (..)
ON CONFLICT (component_id, source_id) DO UPDATE
SET archived_at = null,
is_valid = excluded.is_valid

I have two application instances and sometimes I see deadlock errors in PostgreSQL log:

ERROR:  deadlock detected
DETAIL:  Process 3992939 waits for ShareLock on transaction 230221362; blocked by process 4108096.
Process 4108096 waits for ShareLock on transaction 230221365; blocked by process 3992939.
Process 3992939: SELECT f.source_id FROM files AS f WHERE f.component_id = $1 AND f.archived_at IS NULL FOR UPDATE
Process 4108096: INSERT INTO files AS f (source_id, ...) VALUES (..) ON CONFLICT (component_id, source_id) DO UPDATE SET archived_at = null, is_valid = excluded.is_valid
CONTEXT:  while locking tuple (41116,185) in relation \"files\"

I assume that it may be caused by ON CONFLICT DO UPDATE statement, which may update rows which are not locked by previous SELECT FOR UPDATE

But I can't understand how can SELECT ... FOR UPDATE query cause deadlock if it is the first query in transaction. There is not queries before it. Can SELECT ... FOR UPDATE statement lock several rows and then wait for other rows in condition to be unlocked?



Solution 1:[1]

SELECT FOR UPDATE is no safeguard against deadlocks. It just locks rows. Locks are acquired along the way, in the order instructed by ORDER BY, or in arbitrary order in the absence of ORDER BY. The best defense against deadlocks is to lock rows in consistent order across the whole transaction - and doing likewise in all concurrent transactions. Or, as the manual puts it:

The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order.

Else, this can happen (row1, row2, ... are rows numbered according to the virtual consistent order):

T1: SELECT FOR UPDATE ...          -- lock row2, row3
        T2: SELECT FOR UPDATE ...  -- lock row4, wait for T1 to release row2 
T1: INSERT ... ON CONFLICT ...     -- wait for T2 to release lock on row4

--> deadlock

Adding ORDER BY to your SELECT... FOR UPDATE may already avoid your deadlocks. (It would avoid the one demonstrated above.) Or this happens and you have to do more:

T1: SELECT FOR UPDATE ...          -- lock row2, row3
        T2: SELECT FOR UPDATE ...  -- lock row1, wait for T1 to release row2 
T1: INSERT ... ON CONFLICT ...     -- wait for T2 to release lock on row1

--> deadlock

Everything within the transaction must happen in consistent order to be absolutly sure.

Also, your UPDATE does not seem to be in line with the SELECT FOR UPDATE. component_id <> hw_component_id. Typo?
Also, f.archived_at IS NULL does not guarantee that the later SET archived_at = NOW() only affects these rows. You would have to add WHERE f.archived_at IS NULL to the UPDATE be in line. (Seems like a good idea in any case?)

I assume that it may be caused by ON CONFLICT DO UPDATE statement, which may update rows which are not locked by previous SELECT FOR UPDATE.

As long as the UPSERT (ON CONFLICT DO UPDATE) sticks to the consistent order, that wouldn't be a problem. But that may be hard or impossible to enforce.

Can SELECT ... FOR UPDATE statement lock several rows and then wait for other rows in condition to be unlocked?

Yes, as explained above, locks are acquired along the way. It can have to stop and wait half way through.

NOWAIT

If all that still can't resolve your deadlocks, the slow and sure method is to use Serializable Isolation Level. Then you have to be prepared for serialization failures and retry the transaction in this case. Considerably more expensive overall.

Or it might be enough to add NOWAIT:

SELECT FROM files
WHERE  component_id = $1
AND    archived_at IS NULL
ORDER  BY id   -- whatever you use for consistent, deterministic order
FOR    UPDATE NOWAIT;

The manual:

With NOWAIT, the statement reports an error, rather than waiting, if a selected row cannot be locked immediately.

You may even skip the ORDER BY clause with NOWAIT if you cannot establish consistent order with the UPSERT anyway.

Then you have to catch that error and retry the transaction. Similar to catching serialization failures, but much cheaper - and less reliable. For example, multiple transactions can still interlock with their UPSERT alone. But it gets less and less likely.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1