Postgres UPDATE with ORDER BY, how to do it?
Question
I need to do a Postgres update on a collection of records, and I'm trying to prevent a deadlock that appeared in stress tests.
The typical resolution is to update records in a fixed order, by ID for example, but it seems that Postgres doesn't allow ORDER BY on an UPDATE.
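To make the lock-ordering problem concrete, here is a minimal sketch of how two concurrent transactions can deadlock when they touch the same rows in opposite orders. The Balances table definition and the sample rows are assumptions made for illustration only; the real schema is not shown in the question.

```sql
-- Hypothetical schema and data, assumed for illustration.
CREATE TABLE Balances (
    ID      integer PRIMARY KEY,
    Balance numeric NOT NULL
);
INSERT INTO Balances (ID, Balance) VALUES (1, 100), (2, 100);

-- Run the statements below in two separate sessions, in the numbered order.

-- Session A, step 1: takes the row lock on ID 1.
BEGIN;
UPDATE Balances SET Balance = 0 WHERE ID = 1;

-- Session B, step 2: takes the row lock on ID 2.
BEGIN;
UPDATE Balances SET Balance = 0 WHERE ID = 2;

-- Session A, step 3: blocks, waiting for session B's lock on ID 2.
UPDATE Balances SET Balance = 0 WHERE ID = 2;

-- Session B, step 4: waits for session A's lock on ID 1, closing the cycle.
-- PostgreSQL detects the deadlock and aborts one of the two transactions.
UPDATE Balances SET Balance = 0 WHERE ID = 1;
```

If both sessions had touched ID 1 before ID 2, the second session would simply have waited for the first to finish instead of deadlocking.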
Assume I need to do an update such as:
UPDATE BALANCES SET BALANCE = 0 WHERE ID IN (SELECT ID FROM some_function() ORDER BY ID);
This results in deadlocks when you run 200 queries concurrently. What can I do?
I'm looking for a general solution, not case-specific workarounds like the ones in "UPDATE with ORDER BY".
It feels like there must be a better solution than writing a cursor function. Also, if there's no better way, what would that cursor function optimally look like? Would it update record by record?
Recommended answer
As far as I know, there's no way to accomplish this directly through the UPDATE statement; the only way to guarantee lock order is to explicitly acquire locks with a SELECT ... ORDER BY ID FOR UPDATE, e.g.:
UPDATE Balances
SET Balance = 0
WHERE ID IN (
SELECT ID FROM Balances
WHERE ID IN (SELECT ID FROM some_function())
ORDER BY ID
FOR UPDATE
)
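The same idea can also be written with a CTE, which some readers find easier to follow. This is only a sketch under the same assumptions as the query above (a Balances table with ID and Balance columns, and some_function() returning a set with an ID column); the name locked is introduced here purely for illustration. As far as I know, PostgreSQL does not inline a CTE that contains FOR UPDATE, but it is worth confirming the plan with EXPLAIN on your own version.

```sql
-- Lock the target rows in ascending ID order first,
-- then update exactly the rows that were locked.
WITH locked AS (
    SELECT ID
    FROM Balances
    WHERE ID IN (SELECT ID FROM some_function())
    ORDER BY ID
    FOR UPDATE
)
UPDATE Balances
SET Balance = 0
FROM locked
WHERE Balances.ID = locked.ID;
```

Like the subquery form, this still looks up each row by ID a second time when applying the update.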
This has the downside of repeating the ID index lookup on the Balances table. In your simple example, you can avoid this overhead by fetching the physical row address (represented by the ctid system column) during the locking query, and using that to drive the UPDATE:
UPDATE Balances
SET Balance = 0
WHERE ctid = ANY(ARRAY(
SELECT ctid FROM Balances
WHERE ID IN (SELECT ID FROM some_function())
ORDER BY ID
FOR UPDATE
))
(Be careful when using ctids, as the values are transient. We're safe here, as the locks will block any changes.)
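To see why ctid values are transient, one could compare a row's ctid before and after an update. This sketch assumes a Balances row with ID = 1 exists; the actual ctid values depend on the table's physical layout, so none are shown.

```sql
-- Physical location of the current row version.
SELECT ctid, ID FROM Balances WHERE ID = 1;

-- Even an update that leaves the value unchanged writes a new row version.
UPDATE Balances SET Balance = Balance WHERE ID = 1;

-- The same logical row now reports a different ctid.
SELECT ctid, ID FROM Balances WHERE ID = 1;
```

This is why a ctid should only be relied on while the row is locked within the same statement or transaction.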
Unfortunately, the planner will only utilise the ctid in a narrow set of cases (you can tell whether it's working by looking for a "Tid Scan" node in the EXPLAIN output). To handle more complicated queries within a single UPDATE statement, e.g. if your new balance were being returned by some_function() alongside the ID, you'll need to fall back to the ID-based lookup:
UPDATE Balances
SET Balance = Locks.NewBalance
FROM (
SELECT Balances.ID, some_function.NewBalance
FROM Balances
JOIN some_function() ON some_function.ID = Balances.ID
ORDER BY Balances.ID
FOR UPDATE
) Locks
WHERE Balances.ID = Locks.ID
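To check whether the planner picks a Tid Scan for the ctid-based statement shown earlier, one could prefix it with EXPLAIN. Plain EXPLAIN only plans the statement and does not execute it, so no rows are locked or updated. The resulting plan depends on your PostgreSQL version and data, so no expected output is given here.

```sql
-- Show the plan only; look for a "Tid Scan" node on Balances in the output.
EXPLAIN
UPDATE Balances
SET Balance = 0
WHERE ctid = ANY(ARRAY(
    SELECT ctid FROM Balances
    WHERE ID IN (SELECT ID FROM some_function())
    ORDER BY ID
    FOR UPDATE
));
```

If the plan shows a sequential scan on Balances instead, the ctid shortcut is not helping and the ID-based form is the safer choice.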
If the performance overhead is an issue, you'd need to resort to using a cursor, which would look something like this:
DO $$
DECLARE
c CURSOR FOR
SELECT Balances.ID, some_function.NewBalance
FROM Balances
JOIN some_function() ON some_function.ID = Balances.ID
ORDER BY Balances.ID
FOR UPDATE;
BEGIN
FOR row IN c LOOP
UPDATE Balances
SET Balance = row.NewBalance
WHERE CURRENT OF c;
END LOOP;
END
$$