Predicate locking in PostgreSQL 9.2.1 with Serializable isolation


Question

I have read the PostgreSQL documentation on transaction isolation thoroughly, as suggested in another of my questions, but I still have not managed to understand the "predicate locking" stuff.

I hope somebody can enlighten me :-)

According to the documentation: Predicate locks in PostgreSQL, like in most other database systems, are based on data actually accessed by a transaction.

That sounds good, so why does the following happen?

CREATE TABLE mycustomer(cid integer PRIMARY KEY, licenses integer);
CREATE TABLE mydevice(id integer PRIMARY KEY, cid integer REFERENCES mycustomer (cid), status varchar(10));

INSERT INTO mycustomer(cid, licenses) VALUES (1, 5);
INSERT INTO mycustomer(cid, licenses) VALUES (2, 5);

    Request 1                            Request 2
BEGIN TRANSACTION ISOLATION 
LEVEL SERIALIZABLE;
                                         BEGIN TRANSACTION ISOLATION 
                                         LEVEL SERIALIZABLE;
SELECT * from mydevice where cid = 1;

                                         SELECT * from mydevice where cid = 2;
INSERT INTO mydevice(id, cid, status) 
VALUES (1, 1, 'ok');

                                         INSERT INTO mydevice(id, cid, status)         
                                         VALUES (2, 2, 'ok');
commit;
(=ok)                                 
                                         commit;
                                         (=rollback)

I understand that the inserts from request 1 and request 2 do not conflict with the previous reads, and thus no error should be raised. So why am I getting "ERROR: could not serialize access due to read/write dependencies among transactions"?

As you can imagine, I cannot live with the aforementioned behavior, since every concurrent request would be rolled back regardless of its details. In my business scenario I would like concurrent requests to be rolled back only when they insert data (devices, as in the example) for the same single customer.

These operations are performed from a Java application, and in principle I am thinking about creating a locking table to satisfy my needs. Any ideas?
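For illustration, this is roughly the kind of explicit per-customer locking I have in mind; just a sketch that reuses the existing mycustomer row as the lock instead of a separate locking table, and whether it could replace SERIALIZABLE depends on the rest of the business logic:

BEGIN;
-- Lock the parent customer row: a concurrent request for the SAME customer
-- blocks here until this transaction finishes, while requests for other
-- customers are unaffected.
SELECT licenses FROM mycustomer WHERE cid = 1 FOR UPDATE;
-- ... check the license count, then register the device ...
INSERT INTO mydevice(id, cid, status) VALUES (1, 1, 'ok');
COMMIT;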

Thanks a lot!

Answer

From the documentation on transaction isolation:

The particular locks acquired during execution of a query will depend on the plan used by the query, and multiple finer-grained locks (e.g., tuple locks) may be combined into fewer coarser-grained locks (e.g., page locks) during the course of the transaction to prevent exhaustion of the memory used to track the locks.

...

  • A sequential scan will always necessitate a relation-level predicate lock. This can result in an increased rate of serialization failures.

An EXPLAIN on that SELECT can tell you which query plan is being used, but if the table is small (or empty!), PostgreSQL will almost certainly pick a sequential scan instead of using the index. This will cause a predicate lock on the entire table, causing a serialization failure whenever another transaction does anything to the table.

On my system:

isolation=# EXPLAIN SELECT * from mydevice where cid = 1;
                        QUERY PLAN                        
----------------------------------------------------------
 Seq Scan on mydevice  (cost=0.00..23.38 rows=5 width=46)
   Filter: (cid = 1)
(2 rows)

You could try adding an index and forcing it to use that:

isolation=# CREATE INDEX mydevice_cid_key ON mydevice (cid);
CREATE INDEX
isolation=# SET enable_seqscan = off;
SET
isolation=# EXPLAIN SELECT * from mydevice where cid = 1;
                                    QUERY PLAN                                    
----------------------------------------------------------------------------------
 Index Scan using mydevice_cid_key on mydevice  (cost=0.00..8.27 rows=1 width=46)
   Index Cond: (cid = 1)
(2 rows)

However, this is not the correct solution. Let's back up a little bit.

Serializable is meant to guarantee that transactions will have exactly the same effect as if they were run one after another, despite the fact that you're actually running these transactions concurrently. PostgreSQL does not have infinite resources, so while it's true that it puts predicate locks on data that your query actually accesses, "data" can mean more than "rows returned".

PostgreSQL chooses to flag serialization failures when it thinks there might be a problem, not when it's certain. (Hence how it generalizes row locks to page locks.) This design choice causes false positives, such as the one in your example. False positives are less than ideal, but they don't affect the correctness of the isolation semantics.
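If you want to see this happening, you can inspect the predicate locks from a third session while one of the SERIALIZABLE transactions above is still open; a minimal query against the pg_locks view (predicate locks are reported there with mode SIReadLock):

SELECT locktype, relation::regclass AS relation, page, tuple
FROM pg_locks
WHERE mode = 'SIReadLock';

With the sequential-scan plan you should see a single relation-level entry for mydevice rather than tuple-level entries for the rows the query actually returned.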

The error message is:

ERROR:  could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during commit attempt.
HINT:  The transaction might succeed if retried.

That hint is key. Your application needs to catch serialization failures and retry the whole operation. This is true whenever SERIALIZABLE is in play -- it guarantees serial correctness despite concurrency, but it can't do that without the help of your application. Put another way, if you're actually doing concurrent modifications, the only way PostgreSQL can satisfy the isolation requirements is to ask your application to serialize itself. Thus:

It is important that an environment which uses this technique have a generalized way of handling serialization failures (which always return with a SQLSTATE value of '40001'), because it will be very hard to predict exactly which transactions might contribute to the read/write dependencies and need to be rolled back to prevent serialization anomalies.
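In this example the retry amounts to nothing more than re-issuing the whole failed transaction; for instance, once Request 1 has committed, running Request 2 again from psql succeeds (a sketch, reusing the values from the example above):

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM mydevice WHERE cid = 2;
INSERT INTO mydevice(id, cid, status) VALUES (2, 2, 'ok');
COMMIT;
-- commits cleanly this time: the conflicting transaction is no longer concurrent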

