Can multiple threads cause duplicate updates on constrained set?


Question

In PostgreSQL, if I run the following statement:

UPDATE test SET col = 1 WHERE col = 2;

In the default READ COMMITTED isolation level, from multiple concurrent sessions, am I guaranteed that:


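  1. In the case of a single match, only one thread will get a ROWCOUNT of 1 (meaning only one thread writes)

  2. In the case of multiple matches, only one thread will get a ROWCOUNT > 0 (meaning only one thread writes the batch)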


Recommended answer

Your stated guarantees apply in this simple case, but not necessarily in slightly more complex queries. See the end of the answer for examples.

Assuming that col is unique, has exactly one row with the value 2, or has a stable ordering so that every UPDATE matches the same rows in the same order:

What'll happen for this query is that the threads will find the row with col=2 and all try to grab a write lock on that tuple. Exactly one of them will succeed. The others will block waiting for the first thread's transaction to commit.

That first tx will write, commit, and return a rowcount of 1. The commit will release the lock.

The other tx's will again try to grab the lock. One by one they'll succeed. Each transaction will in turn go through the following process:


  • Obtain the write lock on the contested tuple.
  • Re-check the WHERE col=2 condition after getting the lock.
  • The re-check will show that the condition no longer matches so the UPDATE will skip that row.
  • The UPDATE has no other rows so it will report zero rows updated.
  • Commit, releasing the lock for the next tx trying to get hold of it.

In this simple case the row-level locking and the condition re-check effectively serialize the updates. In more complex cases, not so much.

You can easily demonstrate this. Open say four psql sessions. In the first, lock the table with BEGIN; LOCK TABLE test;*. In the rest of the sessions run identical UPDATEs - they'll block on the table level lock. Now release the lock by COMMITting your first session. Watch them race. Only one will report a row count of 1, the others will report 0. This is easily automated and scripted for repetition and scaling up to more connections/threads.
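
The experiment above can be written down as a short script. This is only a sketch: it assumes a table named test with an integer column col, and the id column and the single seed row are assumptions added for illustration rather than anything from the question.

```sql
-- Setup, run once. The id column is an assumed primary key.
CREATE TABLE test (id serial PRIMARY KEY, col integer NOT NULL);
INSERT INTO test (col) VALUES (2);

-- Session 1: take a table-level lock so the other sessions queue up behind it.
BEGIN;
LOCK TABLE test;

-- Sessions 2, 3 and 4: each runs the same statement and blocks on the lock.
UPDATE test SET col = 1 WHERE col = 2;

-- Session 1: release the lock and let the blocked sessions race.
COMMIT;

-- Per the answer, one session should print "UPDATE 1" and the rest "UPDATE 0".
```

The UPDATEs here run in autocommit mode, so whichever session wins commits immediately and the others proceed one after another.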

To learn more, read the rules for concurrent writing on page 11 of PostgreSQL concurrency issues (http://www.postgresql.org/files/developer/concurrency.pdf) - and then read the rest of that presentation.

As Kevin noted in the comments, if col isn't unique, so that you might match multiple rows, then different executions of the UPDATE could get different orderings. This can happen if they choose different plans (say one is run via PREPARE and EXECUTE and another is direct, or you're messing with the enable_* GUCs) or if the plan they all use uses an unstable sort of equal values. If they get the rows in a different order then tx1 will lock one tuple, tx2 will lock another, then they'll each try to get locks on each other's already-locked tuples. PostgreSQL will abort one of them with a deadlock exception. This is yet another good reason why all your database code should always be prepared to retry transactions.
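
The lock cycle described above can be reproduced by hand with two sessions that touch the same two rows in opposite order. This is a sketch, assuming test has an id primary key and rows with id 1 and 2 (both hypothetical). The explicit per-row statements stand in for a single UPDATE whose plan happens to visit the rows in a different order.

```sql
-- Step 1, session A:
BEGIN;
UPDATE test SET col = 1 WHERE id = 1;   -- A locks row 1

-- Step 2, session B:
BEGIN;
UPDATE test SET col = 1 WHERE id = 2;   -- B locks row 2

-- Step 3, session A:
UPDATE test SET col = 1 WHERE id = 2;   -- blocks, waiting for B

-- Step 4, session B:
UPDATE test SET col = 1 WHERE id = 1;   -- waits for A, completing the cycle

-- After deadlock_timeout, PostgreSQL aborts one of the two statements with
-- "ERROR: deadlock detected" (SQLSTATE 40P01) and the other proceeds.
-- The aborted session has to ROLLBACK and retry its transaction.
```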

If you're careful to make sure concurrent UPDATEs always get the same rows in the same order you can still rely on the behaviour described in the first part of the answer.

Frustratingly, PostgreSQL doesn't offer UPDATE ... ORDER BY so ensuring that your updates always select the same rows in the same order isn't as simple as you might wish. A SELECT ... FOR UPDATE ... ORDER BY followed by a separate UPDATE is often safest.
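
A minimal sketch of that two-step approach, assuming test has an id primary key (an assumption, not part of the question) to give the lock order something stable to sort on:

```sql
BEGIN;

-- Lock the matching rows in a deterministic order. Every session that runs
-- this takes its row locks in ascending id order, so they queue instead of
-- deadlocking against each other.
SELECT id FROM test WHERE col = 2 ORDER BY id FOR UPDATE;

-- These rows are already locked by this transaction, so the UPDATE does not
-- wait on other sessions running the same sequence.
UPDATE test SET col = 1 WHERE col = 2;

COMMIT;
```

Under READ COMMITTED the UPDATE takes a fresh snapshot, so it can also match rows with col = 2 that another session committed after the SELECT ran. If that matters, one option is to restrict the UPDATE to the ids returned by the SELECT.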

If you're doing queries with multiple phases, involving multiple tuples, or conditions other than equality you can get surprising results that differ from the results of a serial execution. In particular, concurrent runs of anything like:

UPDATE test SET col = 1 WHERE col = (SELECT t.col FROM test t ORDER BY t.col LIMIT 1);

or other efforts to build a simple "queue" system will *fail* to work how you expect. See the PostgreSQL docs on concurrency and this presentation for more info.

If you want a work queue backed by a database there are well-tested solutions that handle all the surprisingly complicated corner cases. One of the most popular is PgQ. There's a useful PgCon paper on the topic, and a Google search for 'postgresql queue' is full of useful results.
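
For a sense of what a hand-rolled dequeue can look like, here is a sketch using FOR UPDATE SKIP LOCKED, which is not mentioned in the answer above and is only available in PostgreSQL 9.5 and later. The job_queue table and its columns are hypothetical names chosen for illustration, and this is not a substitute for a tested queue such as PgQ.

```sql
-- Hypothetical queue table; all names are illustrative only.
CREATE TABLE job_queue (
    id      bigserial PRIMARY KEY,
    payload text NOT NULL,
    done    boolean NOT NULL DEFAULT false
);

-- Each worker claims one pending job. Rows already locked by another worker
-- are skipped rather than waited on, so two workers do not claim the same job.
BEGIN;

WITH next_job AS (
    SELECT id
    FROM job_queue
    WHERE NOT done
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
UPDATE job_queue q
SET done = true
FROM next_job
WHERE q.id = next_job.id
RETURNING q.id, q.payload;

-- Process the returned job here. If the worker fails and rolls back,
-- the row becomes claimable again.
COMMIT;
```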

* BTW, instead of a LOCK TABLE you can use SELECT 1 FROM test WHERE col = 2 FOR UPDATE; to obtain a write lock on just that one tuple. That'll block updates against it but not block writes to other tuples or block any reads. That allows you to simulate different kinds of concurrency issues.
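
A sketch of that row-level variant, assuming the same test table and, for the second statement, a second row with col = 3 (a hypothetical row added only to show an unaffected write):

```sql
-- Session 1: lock only the row(s) with col = 2.
BEGIN;
SELECT 1 FROM test WHERE col = 2 FOR UPDATE;

-- Other sessions:
UPDATE test SET col = 1 WHERE col = 2;   -- blocks on the row lock
UPDATE test SET col = 4 WHERE col = 3;   -- different row, not blocked
SELECT * FROM test;                      -- plain reads are not blocked

-- Session 1: release the row lock. Session 1 did not change the row, so the
-- first blocked UPDATE still matches col = 2 on re-check and updates it.
COMMIT;
```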
