Postgres 9.3:使用简单INSERT的Sharelock问题 [英] Postgres 9.3: Sharelock issue with simple INSERT

查看:112
本文介绍了Postgres 9.3:使用简单INSERT的Sharelock问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

更新:下面的潜在解决方案

Update: Potential solution below

我有大量的配置文件,其中包含键/值我正尝试推送到数据库中的数据对。在配置文件中重复了很多键和值,因此我使用3个表存储数据。一个用于所有唯一键值,一个用于所有唯一对值,一个列出每个文件的所有键/值对。

I have a large corpus of configuration files consisting of key/value pairs that I'm trying to push into a database. A lot of the keys and values are repeated across configuration files so I'm storing the data using 3 tables. One for all unique key values, one for all unique pair values, and one listing all the key/value pairs for each file.

问题:
我正在使用多个并发进程(并因此使用连接)将原始数据添加到数据库中。不幸的是,在尝试将值添加到键表和值表时,我发现了很多死锁。我尝试了几种不同的插入数据的方法(如下所示),但始终会遇到检测到死锁错误

Problem: I'm using multiple concurrent processes (and therefore connections) to add the raw data into the database. Unfortunately I get a lot of detected deadlocks when trying to add values to the key and value tables. I have a tried a few different methods of inserting the data (shown below), but always end up with a "deadlock detected" error


TransactionRollbackError:检测到死锁详细信息:进程26755
等待事务689456上的ShareLock;
进程26754等待事务689467上的ShareLock;被进程26754阻止。被
进程26755阻止。

TransactionRollbackError: deadlock detected
DETAIL: Process 26755 waits for ShareLock on transaction 689456; blocked by process 26754. Process 26754 waits for ShareLock on transaction 689467; blocked by process 26755.

我想知道是否有人可以确切地说明造成这些僵局的原因,可能会指出一些解决问题的方法。查看我正在使用的SQL语句(在下面列出),我真的看不出为什么存在任何相互依赖关系。

感谢阅读!

I was wondering if someone could shed some light on exactly what could be causing these deadlocks, and possibly point me towards some way of fixing the issue. Looking at the SQL statements I'm using (listed below), I don't really see why there is any co-dependency at all.

Thanks for reading!

示例配置文件:

example_key this_is_the_value
other_example other_value
third example yet_another_value

表定义:

    CREATE TABLE keys (
        id SERIAL PRIMARY KEY,
        hash UUID UNIQUE NOT NULL,
        key TEXT);

    CREATE TABLE values (
        id SERIAL PRIMARY KEY,
        hash UUID UNIQUE NOT NULL,
        key TEXT);

    CREATE TABLE keyvalue_pairs (
        id SERIAL PRIMARY KEY,
        file_id INTEGER REFERENCES filenames,
        key_id INTEGER REFERENCES keys,
        value_id INTEGER REFERENCES values);

SQL语句:

最初,我试图使用此语句来避免任何异常:

Initially I was trying to use this statement to avoid any exceptions:

    WITH s AS (
        SELECT id, hash, key FROM keys
            WHERE hash = 'hash_value';
    ), i AS (
        INSERT INTO keys (hash, key)
        SELECT 'hash_value', 'key_value'
        WHERE NOT EXISTS (SELECT 1 FROM s)
        returning id, hash, key
    )
    SELECT id, hash, key FROM i
    UNION ALL
    SELECT id, hash, key FROM s;

但是即使如此简单的操作也会导致死锁:

But even something as simple as this causes the deadlocks:

    INSERT INTO keys (hash, key)
        VALUES ('hash_value', 'key_value')
        RETURNING id;




  • 在两种情况下,如果由于插入的哈希而引发异常,
    的值不是唯一的,我使用保存点来回滚更改,而
    则使用另一条语句来选择我要使用的ID。

  • 我正在使用哈希唯一字段,因为某些键和值
    太长而无法索引

  • 完整示例带有保存点的python代码(使用psycopg2):

    key_value = 'this_key'
    hash_val = generate_uuid(value)
    try:
        cursor.execute(
            '''
            SAVEPOINT duplicate_hash_savepoint;
            INSERT INTO keys (hash, key)
                VALUES (%s, %s)
                RETURNING id;
            '''
            (hash_val, key_value)
        )
    
        result = cursor.fetchone()[0]
        cursor.execute('''RELEASE SAVEPOINT duplicate_hash_savepoint''')
        return result
    except psycopg2.IntegrityError as e:
        cursor.execute(
            '''
            ROLLBACK TO SAVEPOINT duplicate_hash_savepoint;
            '''
        )
    
        #TODO: Should ensure that values match and this isn't just
        #a hash collision
    
        cursor.execute(
            '''
            SELECT id FROM keys WHERE hash=%s LIMIT 1;
            '''
            (hash_val,)
        )
        return cursor.fetchone()[0]
    

    更新:
    因此,我相信我对另一个stackexchange网站的提示:

    具体来说:


    更新,删除,选择在搜索目标行方面,FOR UPDATE和SELECT FOR SHARE命令
    的行为与SELECT相同:它们
    仅查找从命令
    time1开始提交的目标行。但是,这样的目标行可能在发现
    时已被另一个并发事务更新(或删除或锁定了
    )。在这种情况下,可能的更新程序将等待第一个
    更新事务提交或回滚(如果它仍在
    进度中)。如果第一个更新程序回滚,则其作用被否定
    ,第二个更新程序可以继续更新最初找到的
    行。如果第一个更新程序提交,则第二个更新程序
    将忽略该行(如果第一个更新程序删除了该行)2,否则它
    将尝试将其操作应用于该行的更新版本。

    UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time1. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it2, otherwise it will attempt to apply its operation to the updated version of the row.

    虽然我仍然不确定确切的相互依赖关系,但似乎在不提交的情况下处理大量键/值对可能会导致这样的事情。果然,如果我在添加每个单独的配置文件后提交,则不会发生死锁。

    While I'm still not exactly sure where the co-dependency is, it seems that processing a large number of key/value pairs without commiting would likely result in something like this. Sure enough, if I commit after each individual configuration file is added, the deadlocks don't occur.

    推荐答案

    看起来像您在这种情况下:

    It looks like you're in this situation:


    1. 要插入的表具有主键(或任何类型的唯一索引)。 / li>
    2. 对该表的几个INSERT操作是在一个事务中执行的(而不是在每个事务之后立即提交)

    3. 要插入的行按随机顺序排列(关于主键)

    4. 将行插入并发事务中。

    此这种情况会为僵局带来以下机会:

    This situation creates the following opportunity for deadlock:

    假定有两个会话,每个会话都开始了一个事务。

    Assuming there are two sessions, that each started a transaction.


    1. 会话#1:插入具有PK'A'的行

    2. 会话#2:插入具有PK'B'的行

    3. 会话1:尝试插入带有PK'B'
      的行=>将会话1置于等待会话2提交或rollbac的状态ks

    4. 会话#2:尝试插入带有PK'A'的行
      =>将会话#2置于等待会话#1的位置。

    1. Session #1: insert row with PK 'A'
    2. Session #2: insert row with PK 'B'
    3. Session #1: try to insert row with PK 'B' => Session #1 is put to wait until Session #2 commits or rollbacks
    4. Session #2: try to insert row with PK 'A' => Session #2 is put to wait for Session #1.

    此后不久,死锁检测器意识到两个会话现在都在互相等待,并终止了其中一个致命的死锁错误。

    Shortly thereafter, the deadlock detector gets aware that both sessions are now waiting for each other, and terminates one of them with a fatal deadlock detected error.

    在这种情况下,最简单的解决方案是在插入新条目之后尝试将任何新行插入表中之前提交COMMIT。

    If you're in this scenario, the simplest solution is to COMMIT after a new entry is inserted, before attempting to insert any new row into the table.

    这篇关于Postgres 9.3:使用简单INSERT的Sharelock问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆