Lock and transaction in postgres that should block a query
Problem description
Let's assume in SQL window 1 I do:
-- query 1
BEGIN TRANSACTION;
UPDATE post SET title = 'edited' WHERE id = 1;
-- note that there is no explicit commit
Then from another window (window 2) I do:
-- query 2
SELECT * FROM post WHERE id = 1;
I get:
1 | original title
That is fine: the default isolation level is READ COMMITTED, and because query 1 has not been committed, the change it performs is not readable until I explicitly commit from window 1.
In fact if I, in window 1, do:
COMMIT TRANSACTION;
I can then see the change if I re-run query 2.
1 | edited
My question is:
Why does query 2 return fine the first time I run it? I was expecting it to block, as the transaction in window 1 was not committed yet and the lock placed on the row with id = 1 was (should be) an unreleased exclusive one that should block a read like the one performed in window 2. All the rest makes sense to me, but I was expecting the SELECT to get stuck until an explicit commit was executed in window 1.
The behaviour you describe is normal and expected in PostgreSQL, and in any relational database that uses multi-version concurrency control (MVCC).
If PostgreSQL showed you the value edited for the first SELECT, it'd be wrong to do so: that's called a "dirty read", and is bad news in databases.
PostgreSQL would be allowed to wait at the SELECT until you committed or rolled back, but the SQL standard doesn't require it to, you haven't told it you want to wait, and it has no technical reason to wait, so it returns the data you asked for immediately. After all, until it's committed, that update only kind-of exists: it still might or might not happen.
If PostgreSQL always waited here, you'd quickly end up in a situation where only one connection could be doing anything with the database at a time. That is bad for performance, and totally unnecessary the vast majority of the time.
If you want to wait for a concurrent UPDATE (or DELETE), you'd use SELECT ... FOR SHARE. (But be aware that this won't work for INSERT.)
Details:
SELECT without a FOR UPDATE or FOR SHARE clause does not take any row-level locks. So it sees whatever is the current committed row, and is not affected by any in-flight transactions that might be modifying that row. The concepts are explained in the MVCC section of the docs. The general idea is that PostgreSQL is copy-on-write, with versioning that allows it to return the correct copy based on what the transaction or statement could "see" at the time it started, which is what PostgreSQL calls a "snapshot".
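One way to observe this row versioning from the second session is to read the system columns xmin and xmax alongside the row. This is a minimal sketch, assuming the post table from the question and that the transaction in window 1 is still open; the transaction IDs you see will differ on every system.

```sql
-- Window 2, while the UPDATE in window 1 is still uncommitted.
-- xmin: ID of the transaction that created this row version.
-- xmax: ID of a transaction that has deleted, updated or locked it
--       (0 if there is none).
SELECT xmin, xmax, id, title
FROM post
WHERE id = 1;
-- The row returned is still the old, committed version. A non-zero xmax on a
-- visible row usually means another transaction has a change pending on it
-- (or that an earlier attempt was rolled back).
```

The query does not wait, because a plain SELECT reads the committed version from its snapshot and takes no row-level lock.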
In the default READ COMMITTED isolation, snapshots are taken at the statement level, so if you SELECT a row, COMMIT a change to it from another transaction, and SELECT it again, you'll see different values even within one transaction. You can use REPEATABLE READ isolation (which PostgreSQL implements as snapshot isolation) if you don't want to see changes committed after the transaction begins, or SERIALIZABLE isolation to add further protection against certain kinds of transaction inter-dependencies.
See the transaction isolation chapter in the documentation.
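The difference between a per-statement and a per-transaction snapshot can be sketched as follows. In PostgreSQL the level that keeps one snapshot for the whole transaction is named REPEATABLE READ. The sketch assumes the post table from the question and another session that commits changes at the marked points; those commits are part of the illustration, not part of the script.

```sql
-- Window 2: READ COMMITTED (the default) takes a new snapshot per statement.
BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT title FROM post WHERE id = 1;  -- sees the currently committed title
-- ... window 1 commits its UPDATE here ...
SELECT title FROM post WHERE id = 1;  -- now sees the change window 1 committed
COMMIT;

-- Window 2: REPEATABLE READ keeps one snapshot for the whole transaction.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT title FROM post WHERE id = 1;  -- the snapshot is taken at this first query
-- ... another session commits a further UPDATE here ...
SELECT title FROM post WHERE id = 1;  -- still returns the same title as above
COMMIT;
```

Neither variant makes the SELECT wait; they only change which committed version it reads.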
If you want a SELECT to wait for in-progress transactions to commit or roll back changes to rows being selected, you must use SELECT ... FOR SHARE. This will block on the lock taken by an UPDATE or DELETE until the transaction that took the lock rolls back or commits.
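Applied to the scenario in the question, a blocking read could look like the sketch below. It assumes window 1 still holds its uncommitted UPDATE on the row with id = 1.

```sql
-- Window 2: request a shared row lock so the read waits for concurrent writers.
BEGIN;
SELECT * FROM post WHERE id = 1 FOR SHARE;
-- The statement blocks here while the UPDATE in window 1 is uncommitted and
-- returns once window 1 commits or rolls back.
COMMIT;  -- releases the shared row lock
```

Appending NOWAIT to the FOR SHARE clause makes the statement raise an error instead of waiting, which can be useful when a stuck session is worse than a failed query.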
INSERT is different, though: the tuples just don't exist to other transactions until commit. The only way to wait for concurrent INSERTs is to take an EXCLUSIVE table-level lock, so you know nobody else is changing the table while you read it. Usually the need to do that means you have a design problem in the application, though: your app should not care if there are uncommitted inserts still in flight.
See the explicit locking chapter of the documentation.
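For completeness, the table-level lock described above can be sketched like this, again assuming the post table from the question. It serializes all writers on the table, so treat it as an illustration of the mechanism and not as a recommended pattern.

```sql
BEGIN;
-- EXCLUSIVE mode conflicts with the ROW EXCLUSIVE lock that INSERT, UPDATE
-- and DELETE take, so this waits for in-flight writers and blocks new ones.
-- Plain SELECTs from other sessions are still allowed.
LOCK TABLE post IN EXCLUSIVE MODE;
SELECT * FROM post;  -- no other transaction is modifying post at this point
COMMIT;  -- the table lock is released when the transaction ends
```

LOCK TABLE only makes sense inside a transaction block, because the lock is held until the transaction ends.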