Return pre-UPDATE column values using SQL only

Question

I posted a related question, but this is another part of my puzzle.

I would like to get the OLD value of a column from a row that was UPDATEd, WITHOUT using triggers (nor stored procedures, nor any other extra non-SQL, non-query entities).

I have a query like this:

   UPDATE my_table
      SET processing_by = our_id_info  -- unique to this worker
    WHERE trans_nbr IN (
                        SELECT trans_nbr
                          FROM my_table
                         GROUP BY trans_nbr
                        HAVING COUNT(trans_nbr) > 1
                         LIMIT our_limit_to_have_single_process_grab
                       )
RETURNING row_id;

If I could do FOR UPDATE ON my_table at the end of the subquery, that'd be divine (and fix my other question/problem). But that won't work: can't combine this with GROUP BY (which is necessary for figuring out the count). Then I could just take those trans_nbr's and do a query first to get the (soon-to-be-) former processing_by values.

I tried this:

   UPDATE my_table
      SET processing_by = our_id_info -- unique to this worker
     FROM my_table old_my_table
     JOIN (
             SELECT trans_nbr
               FROM my_table
           GROUP BY trans_nbr
             HAVING COUNT(trans_nbr) > 1
              LIMIT our_limit_to_have_single_process_grab
          ) sub_my_table
       ON old_my_table.trans_nbr = sub_my_table.trans_nbr
    WHERE     my_table.trans_nbr = sub_my_table.trans_nbr
      AND my_table.processing_by = old_my_table.processing_by
RETURNING my_table.row_id, my_table.processing_by, old_my_table.processing_by

But that can't work; old_my_table is not visible outside the join; the RETURNING clause is blind to it.

I've long since lost count of all the attempts I've made; I have been researching this for literally hours.

If I could just find a bullet-proof way to lock the rows in my subquery - and ONLY those rows, and WHEN the subquery happens - all the concurrency issues I'm trying to avoid would disappear ...

UPDATE: I had a typo in the non-generic code of the above. I retried after Erwin Brandstetter suggested it should work. Since it took me so long to find this sort of solution, perhaps my embarrassment is worth it? At least this is on SO for posterity now... :>

What I now have (that works) is like this:

   UPDATE my_table
      SET processing_by = our_id_info -- unique to this worker
     FROM my_table AS old_my_table
    WHERE my_table.trans_nbr IN (     -- qualified: both table instances have trans_nbr
                          SELECT trans_nbr
                            FROM my_table
                        GROUP BY trans_nbr
                          HAVING COUNT(*) > 1
                           LIMIT our_limit_to_have_single_process_grab
                       )
      AND my_table.row_id = old_my_table.row_id
RETURNING my_table.row_id, my_table.processing_by, old_my_table.processing_by AS old_processing_by;

The COUNT(*) is per a suggestion from Flimzy in a comment on my other (linked above) question.

Please see my other question for correctly implementing concurrency and even a non-blocking version; THIS query merely shows how to get the old and new values from an update, ignore the bad/wrong concurrency bits.

Answer

Problem


The manual explains:


The optional RETURNING clause causes UPDATE to compute and return value(s) based on each row actually updated. Any expression using the table's columns, and/or columns of other tables mentioned in FROM, can be computed. The new (post-update) values of the table's columns are used. The syntax of the RETURNING list is identical to that of the output list of SELECT.

Bold emphasis mine. There is no way to access the old row in a RETURNING clause. You work around this restriction with a trigger or with a separate SELECT before the UPDATE, wrapped in a transaction as @Flimzy and @wildplasser commented, or wrapped in a CTE as @MattDiPasquale posted.
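
As a rough sketch of two of those workarounds (not necessarily what the commenters posted), assuming the same hypothetical table tbl(tbl_id, name) used in the answer's example: the first variant reads and locks the row in a separate SELECT inside one transaction, the second does the same read in a CTE. The two are alternatives, not steps to run in sequence.

```sql
-- Variant 1: separate SELECT before the UPDATE, in one transaction.
BEGIN;

-- Read the current (soon to be old) values and lock the row so that
-- no concurrent transaction can change it before our UPDATE.
SELECT tbl_id, name
FROM   tbl
WHERE  tbl_id = 3
FOR    UPDATE;

-- RETURNING only sees the new (post-update) values here.
UPDATE tbl
SET    name = 'New Guy'
WHERE  tbl_id = 3
RETURNING tbl_id, name;

COMMIT;

-- Variant 2: the same idea in a single statement, using a CTE.
WITH old AS (
   SELECT tbl_id, name
   FROM   tbl
   WHERE  tbl_id = 3
   FOR    UPDATE
   )
UPDATE tbl x
SET    name = 'New Guy'
FROM   old
WHERE  x.tbl_id = old.tbl_id
RETURNING old.name AS old_name, x.name AS new_name;
```

With variant 1 the client has to combine the two result sets itself; variant 2 returns old and new values in one row.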

However, what you are trying to achieve works perfectly fine if you join in another instance of the table in the FROM clause:

UPDATE tbl x
SET    tbl_id = 23
     , name = 'New Guy'
FROM   tbl y                -- using the FROM clause
WHERE  x.tbl_id = y.tbl_id  -- must be UNIQUE NOT NULL
AND    x.tbl_id = 3
RETURNING y.tbl_id AS old_id, y.name AS old_name
        , x.tbl_id          , x.name;

Returns:

 old_id | old_name | tbl_id |  name
--------+----------+--------+---------
  3     | Old Guy  | 23     | New Guy

The column(s) used to self-join must be UNIQUE NOT NULL. In the simple example, the WHERE condition is on the same column tbl_id, but that's just coincidence. Works for any conditions.
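
To try the example, a table along the following lines should do. The definition is an assumption (the answer does not show it), and the 'Old Gal' row is made up for illustration. A PRIMARY KEY satisfies the UNIQUE NOT NULL requirement.

```sql
-- Hypothetical setup; the original answer does not show the table definition.
CREATE TABLE tbl (
   tbl_id int PRIMARY KEY   -- PRIMARY KEY implies UNIQUE and NOT NULL
 , name   text
);

INSERT INTO tbl (tbl_id, name)
VALUES (3, 'Old Guy')
     , (4, 'Old Gal');      -- invented row, not from the original answer
```

The requirement matters because the self-join must pair each updated row with exactly one copy of itself. If the join column were not unique, a target row could join to several rows of the second instance, and PostgreSQL would use only one of them, without a predictable choice. If the column were NULL in some row, `x.tbl_id = y.tbl_id` would not be true for that row, so it would be silently skipped.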

I tested this with PostgreSQL versions from 8.4 to 13.

For INSERT, see:

  • INSERT INTO ... FROM SELECT ... RETURNING id mappings

There are various ways to avoid race conditions with concurrent write operations on the same rows. (Note that concurrent write operations on unrelated rows are no problem at all.) The simple, slow and sure (but expensive) method is to run the transaction with SERIALIZABLE isolation level:

BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE ... ;
COMMIT;

But that's probably overkill. And you need to be prepared to repeat the operation in case of a serialization failure.

Simpler and faster (and just as reliable with concurrent write load) is an explicit lock on the one row to be updated:

UPDATE tbl x
SET    tbl_id = 24
     , name = 'New Gal'
FROM  (SELECT tbl_id, name FROM tbl WHERE tbl_id = 4 FOR UPDATE) y 
WHERE  x.tbl_id = y.tbl_id
RETURNING y.tbl_id AS old_id, y.name AS old_name
        , x.tbl_id          , x.name;

Note how the WHERE condition moved to the subquery (again, can be anything), and only the self-join (on UNIQUE NOT NULL column(s)) remains in the outer query. This guarantees that only rows locked by the inner SELECT are processed. The WHERE conditions might resolve to a different set of rows a moment later.
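
Since the condition in the subquery "can be anything", the same pattern should carry over to a condition that matches several rows. A minimal sketch, again assuming the hypothetical table tbl(tbl_id, name); the LIKE filter and the 'Processed' value are invented for illustration:

```sql
UPDATE tbl x
SET    name = 'Processed'
FROM  (
   SELECT tbl_id, name
   FROM   tbl
   WHERE  name LIKE 'Old%'   -- hypothetical condition; may match many rows
   FOR    UPDATE             -- lock exactly the rows selected here
   ) y
WHERE  x.tbl_id = y.tbl_id   -- self-join on the UNIQUE NOT NULL column
RETURNING x.tbl_id, y.name AS old_name, x.name AS new_name;
```

Each returned row pairs the pre-update name from the locked subquery row with the post-update name from the updated row.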

See:

  • Atomic UPDATE .. SELECT in Postgres

db<>fiddle here

Old sqlfiddle
