SELECT或INSERT是否在函数中容易出现竞争条件? [英] Is SELECT or INSERT in a function prone to race conditions?

查看:210
本文介绍了SELECT或INSERT是否在函数中容易出现竞争条件?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我写了一个函数来创建一个简单的博客引擎的帖子:

I wrote a function to create posts for a simple blogging engine:

CREATE FUNCTION CreatePost(VARCHAR, TEXT, VARCHAR[])
RETURNS INTEGER AS $$
    DECLARE
        InsertedPostId INTEGER;
        TagName VARCHAR;
    BEGIN
        INSERT INTO Posts (Title, Body)
        VALUES ($1, $2)
        RETURNING Id INTO InsertedPostId;

        FOREACH TagName IN ARRAY $3 LOOP
            DECLARE
                InsertedTagId INTEGER;
            BEGIN
                -- I am concerned about this part.
                BEGIN
                    INSERT INTO Tags (Name)
                    VALUES (TagName)
                    RETURNING Id INTO InsertedTagId;
                EXCEPTION WHEN UNIQUE_VIOLATION THEN
                    SELECT INTO InsertedTagId Id
                    FROM Tags
                    WHERE Name = TagName
                    FETCH FIRST ROW ONLY;
                END;

                INSERT INTO Taggings (PostId, TagId)
                VALUES (InsertedPostId, InsertedTagId);
            END;
        END LOOP;

        RETURN InsertedPostId;
    END;
$$ LANGUAGE 'plpgsql';

当多个用户同时删除标签和创建帖子时,这是否容易出现竞争情况? >
具体来说,做交易(从而防止这种竞争条件发生)?

我使用PostgreSQL 9.2.3。

Is this prone to race conditions when multiple users delete tags and create posts at the same time?
Specifically, do transactions (and thus functions) prevent such race conditions from happening?
I'm using PostgreSQL 9.2.3.

推荐答案

更新Postgres 9.5 +



使用新的UPSERT实现 INSERT ... ON CONFLICT。 。DO UPDATE ,我们可以大大简化。 SQL函数 INSERT 单个 标记:

Update for Postgres 9.5+

Using the new UPSERT implementation INSERT ... ON CONFLICT .. DO UPDATE, we can largely simplify. SQL function To INSERT a single tag:

CREATE OR REPLACE FUNCTION f_tag_id(_tag text, OUT _tag_id int) AS
$func$
   WITH ins AS (
      INSERT INTO tag AS t (tag)
      VALUES (_tag)
      ON     CONFLICT (tag) DO UPDATE
      SET    tag = t.tag WHERE FALSE  -- never executed, but locks the row
      RETURNING t.tag_id
      )
   SELECT tag_id FROM ins
   UNION  ALL
   SELECT tag_id FROM tag WHERE tag = _tag  -- only executed if no INSERT
   LIMIT  1
$func$ LANGUAGE sql;

或与plpgsql相同:

Or the same with plpgsql:

CREATE OR REPLACE FUNCTION f_tag_id(_tag text, OUT _tag_id int) AS
$func$
BEGIN
   INSERT INTO tag AS t (tag)
   VALUES (_tag)
   ON     CONFLICT (tag) DO UPDATE
   SET    tag = t.tag WHERE FALSE  -- never executed, but locks the row
   RETURNING t.tag_id
   INTO   _tag_id;

   IF NOT FOUND THEN
      SELECT tag_id  -- only if row existed before
      FROM   tag
      WHERE  tag = _tag
      INTO   _tag_id;
   END IF;
END
$func$ LANGUAGE plpgsql;

调用(对于任一变体都相同):

Call (same for either variant):

SELECT f_tag_id('possibly_new_tag');

The documentation:


只有该表达式返回true的行将被更新,但是当执行 ON CONFLICT DO UPDATE 操作时,所有行都将被锁定

与下面的解决方案相反,首先尝试 INSERT - 因为无论如何都要执行数据修改CTE。如果 INSERT 大多数时候都会发生冲突,那么两个单独的SQL命令与 SELECT

As opposed to the solutions below this tries the INSERT first - since a data-modifying CTE is executed anyway. If the INSERT conflicts most of the time, two separate SQL commands with the SELECT first (like below) might be faster.

相关:

  • How do I insert into this PostgresSQL with ON CONFLICT?

考虑对我们避免的竞争条件和 UNION ALL ... LIMIT 1的解释

Consider explanation for the race condition we avoid and for UNION ALL ... LIMIT 1 below.

这是

考虑到这个(稍微简化)表:

Given this (slightly simplified) table:

CREATE table tag (
  tag_id serial PRIMARY KEY
, tag    text   UNIQUE
);

... 几乎100%安全功能可插入新标记/选择现有的,可能看起来像这样。

为什么不是100%?请考虑有关注释> UPSERT
example :

... a practically 100% secure function to insert new tag / select existing one, could look like this.
Why not 100%? Consider the notes in the manual for the related UPSERT example:

CREATE OR REPLACE FUNCTION f_tag_id(_tag text, OUT tag_id int) AS
$func$
BEGIN

LOOP
   BEGIN

   WITH sel AS (SELECT t.tag_id FROM tag t WHERE t.tag = _tag FOR SHARE)
      , ins AS (INSERT INTO tag(tag)
                SELECT _tag
                WHERE  NOT EXISTS (SELECT 1 FROM sel)  -- only if not found
                RETURNING tag.tag_id)  -- qualified so no conflict with param
   SELECT sel.tag_id FROM sel
   UNION  ALL
   SELECT ins.tag_id FROM ins
   INTO   tag_id;

   EXCEPTION WHEN UNIQUE_VIOLATION THEN     -- insert in concurrent session?
      RAISE NOTICE 'It actually happened!'; -- hardly ever happens
   END;

   EXIT WHEN tag_id IS NOT NULL;            -- else keep looping
END LOOP;

END
$func$ LANGUAGE plpgsql;

SQL Fiddle。


  • 请先尝试 SELECT

使用 CTE 以尽量减少争用条件的(已经很小的)时间段。

Use a CTE to minimize the (already tiny) time slot for the race condition.

一个查询中的 SELECT INSERT / em>是超级小。如果你没有很强的并发负载,或者你每年可以有一次异常,你可以忽略这种情况,并使用更快的SQL语句。

The time window between the SELECT and the INSERT within one query is super tiny. If you don't have heavy concurrent load, or if you can live with an exception once a year, you could just ignore the case and use the SQL statement, which is faster.

不需要 FETCH FIRST ROW ONLY (= LIMIT 1 )。标签名称显然是 UNIQUE

No need for FETCH FIRST ROW ONLY (= LIMIT 1). The tag name is obviously UNIQUE.

删除FOR SHARE 在我的例子中,如果你通常不在表上有并发 DELETE UPDATE 标签

Remove FOR SHARE in my example if you don't usually have concurrent DELETE or UPDATE on the table tag. Costs a tiny bit of performance.

不要引用语言名称:'plpgsql' plpgsql 是一个标识符报价可能会导致问题,并且仅允许向后兼容性。

Never quote the language name: 'plpgsql'. plpgsql is an identifier. Quoting may cause problems and is only tolerated for backwards compatibility.

不要使用非描述性的栏名称,例如 id 。当加入几个表( 在关系数据库中)时,最终会出现多个相同的名称,并且必须使用别名。

Don't use non-descriptive column names like id or name. When joining a couple of tables (which is what you do in a relational DB) you end up with multiple identical names and have to use aliases.

使用此函数可以大大简化 FOREACH LOOP 到:

Using this function you could largely simplify your FOREACH LOOP to:

...
FOREACH TagName IN ARRAY $3
LOOP
   INSERT INTO taggings (PostId, TagId)
   VALUES   (InsertedPostId, f_tag_id(TagName));
END LOOP;
...

http://www.postgresql.org/docs/current/interactive/functions-array.html#ARRAY-FUNCTIONS-TABLE> unnest()

Faster, though, as a single SQL statement with unnest():

INSERT INTO taggings (PostId, TagId)
SELECT InsertedPostId, f_tag_id(tag)
FROM   unnest($3) tag;

取代整个循环。

此变体建立在 UNION ALL LIMIT 子句:只要找到足够的行,就不会执行其余行:

This variant builds on the behavior of UNION ALL with a LIMIT clause: as soon as enough rows are found, the rest is never executed:

  • Way to try multiple SELECTs till a result is available?

在此基础上,我们可以将 INSERT 外包到一个单独的函数。只有我们需要异常处理。与第一个解决方案一样安全。

Building on this, we can outsource the INSERT into a separate function. Only there we need exception handling. Just as safe as the first solution.

CREATE OR REPLACE FUNCTION f_insert_tag(_tag text, OUT tag_id int)
  RETURNS int AS
$func$
BEGIN
INSERT INTO tag(tag) VALUES (_tag) RETURNING tag.tag_id INTO tag_id;

EXCEPTION WHEN UNIQUE_VIOLATION THEN  -- catch exception, NULL is returned
END
$func$ LANGUAGE plpgsql;

在主函数中使用:

CREATE OR REPLACE FUNCTION f_tag_id(_tag text, OUT _tag_id int) AS
$func$
BEGIN
   LOOP
      SELECT tag_id FROM tag WHERE tag = _tag
      UNION  ALL
      SELECT f_insert_tag(_tag)  -- only executed if tag not found
      LIMIT  1  -- not strictly necessary, just to be clear
      INTO   _tag_id;

      EXIT WHEN _tag_id IS NOT NULL;  -- else keep looping
   END LOOP;
END
$func$ LANGUAGE plpgsql;




  • 需要 SELECT ,因为 INSERT 包含 EXCEPTION 子句很少输入。

    • This is a bit cheaper if most of the calls only need SELECT, because the more expensive block with INSERT containing the EXCEPTION clause is rarely entered. The query is also simpler.

      这里不支持FOR SHARE $ c> UNION 查询)。

      FOR SHARE is not possible here (not allowed in UNION query).

      LIMIT 1 必要(在pg 9.4中测试)。 Postgres从 INTO _tag_id 派生 LIMIT 1 ,并且只有执行,直到找到第一行。

      LIMIT 1 would not be necessary (tested in pg 9.4). Postgres derives LIMIT 1 from INTO _tag_id and only executes until the first row is found.

      这篇关于SELECT或INSERT是否在函数中容易出现竞争条件?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆