重构功能,以便可以在CTE中使用它们 [英] Refactor functions so they can be used within a CTE

查看:81
本文介绍了重构功能,以便可以在CTE中使用它们的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个父母孩子的表,如下所示:

I have a parent and child table as following:

create table parent
(
  identifier serial primary key,
  name text
);
create table child
(
  identifier serial primary key,
  name text, parent_identifier integer references parent
);

我创建了两个实用程序函数来格式化 parent child 行到JSON对象:

I created two utility functions to format a parent and child row to a JSON object:

create function format(child) returns json
  as $$
  select json_build_object('identifier', $1.identifier, 'name', $1.name)
  $$ language sql stable;

create function format(parent) returns json
  as $$
  select json_build_object('identifier', $1.identifier, 'name', $1.name, 
                           'children', array(select format(child) from child where parent_identifier = $1.identifier))
  $$ language sql stable;

让我们测试一下:

insert into parent(name) values('first parent');
insert into parent(name) values('second parent');

insert into child(name, parent_identifier) values('first child first parent', (select identifier from parent where name = 'first parent'));
insert into child(name, parent_identifier) values('second child first parent', (select identifier from parent where name = 'first parent'));
insert into child(name, parent_identifier) values('first child second parent', (select identifier from parent where name = 'second parent'));

select format(parent) from parent;

这将返回以下JSON对象:

This returns the following JSON objects:

{
   "identifier":5,
   "name":"first parent",
   "children":[
      {
        "identifier":7,
        "name":"first child first parent"
      },
      {
        "identifier":8,
        "name":"second child first parent"
      }
   ]
}
{
   "identifier":6,
   "name":"second parent",
   "children":[
      {
        "identifier":9,
        "name":"first child second parent"
      }
   ]
}

太棒了!不过,这有一个大问题:如果另一个事务在我们的插入 select 查询之间做了一些更改,则 select 查询确实返回我们刚刚插入的内容。我们可以通过将事务隔离级别设置为可重复读取来解决此问题,但这有其性能成本和其他缺点(我们可能必须重试)。

Great! Though, this has one big problem: if another transaction does some changes in between our insert and select queries, the select query does exactly return what we just inserted. We can fix this by setting the transaction isolation level to repeatable read, but that has its performance costs and other drawbacks (we might have to retry).

因此,我想到了用一个CTE重写上述查询的问题。如果我没有记错的话,这不会遇到此类并发问题。我开始如下:

So I thought about rewriting the queries above in one single CTE. If I am not mistaken, this will not suffer from such concurrency issues. I started as following:

with
parents as
(
  insert into parent(name)
  select 'first parent'
  union all
  select 'second parent'
  returning parent.identifier, parent.name
),
children as
(
  insert into child(name, parent_identifier)
  select 'first child first parent', identifier from parents where name = 'first parent'
  union all
  select 'second child first parent', identifier from parents where name = 'first parent'
  union all
  select 'first child second parent', identifier from parents where name = 'second parent'
)
select format(parents::parent) from parents;

这无法正常工作。它返回以下JSON对象:

This does not work as expected. It returns the following JSON objects:

{
   "identifier":7,
   "name":"first parent",
   "children":[]
}
{
   "identifier":8,
   "name":"second parent",
   "children":[]
}

如您所见,没有包括儿童。经过阅读后,我了解发生了什么。 CTE处理在查询刚开始时创建的快照。在格式(父级)中,我们正在执行从子级中选择格式(子级),其中parent_identifier = $ 1.identifier),但这不会产生任何子行,因为这些子行不在快照中。因此,我的问题不是我所了解的。

As you can see, there are no children included. After some reading, I understand what is going on. The CTE works on a snapshot created right before when the query was started. In format(parent), we are doing select format(child) from child where parent_identifier = $1.identifier), but that does not yield any child rows, because the child rows are not in the snapshot. So my question is not about this, as I understand this.

当然,如果我简单地执行 json_build_object 的东西,与 format 函数中的东西完全相同,但是在主查询中,但是我正在复制代码。我也在其他查询中使用了这些 format 函数,与此问题无关。理想情况下,我想避免解决方案中的代码重复。因此,我想继续使用它们,可能需要先对其进行重构,以便可以在此问题中使用它们。

Of course, I could easily fix this if I simple do the json_build_object stuff, exactly the same as in the format functions, in the main query, but then I am duplicating code. I am using these format functions also in other queries, unrelated to this question. Ideally, I want to avoid code duplication in my solution. So I would like to keep using them and probably need to refactor them first so they can be used in the scenario in this question.

我现在很困。我真的很想继续使用CTE(这样就可以避免将事务隔离级别设置为可重复读取),但是我找不到方法来重构 format(父级) format(子级)函数和/或CTE,所以我不会以代码结尾在整个地方重复。 SO上有一些聪明的主意吗?

I'm quite stuck now. I really would like to continue with the CTE (so I can avoid having to set the transaction isolation level to repeatable read), but I cannot find a way to re-factor the format(parent) and format(child) functions and/or the CTE so I don't end up with code duplicates all over the place. Is there a smart soul on SO with some clever ideas?

请注意我正在使用PostgreSQL 10.1。请在此处找到一个小提琴: http://sqlfiddle.com/#!17/a251d/2

Note that I am using PostgreSQL 10.1. Please find a fiddle here: http://sqlfiddle.com/#!17/a251d/2

有关Laurenz Albe答案的更新

上下文: https://stackoverflow.com/revisions/48152380/1

在上面的问题,我确实是在简化我的情况。让我更详细地说明实际情况,而不必过多地考虑细节,否则会造成混淆。

In the question above, I was indeed simplifying my situation. Let me explain the real scenario more closely, without going into too much details which will be confusing.

在这种情况下,用户正在提供数据(= 父母及其对应的孩子)在特定日期范围内(例如,2018年1月)。而且,我不仅在做插入,我实际上正在做孤立行的更新和删除。因此,情况很简单:客户端将替换给定日期范围内的所有数据。

In the scenario, the user is providing data (= parents and their corresponding children) for a certain date range, for example the month January 2018. Also, I am not just doing inserts, I am actually doing upserts and deletes of orphaned rows. So, the scenario is simple: the client is replacing all data for a given date range.

如果我随后执行,请从父项,其中< parent在所提供的日期范围内在删除和删除之后,某些其他客户端可能已更改了两者之间的重叠日期范围。在这种情况下,我将返回客户端提供的不同结果,如果客户端未正确实现,则可能会引入错误。因此,这就是为什么我认为插入和选择需要属于同一事务的一部分,并且事务隔离级别设置为可重复读取

If I then do select format(parent) from parent where <parent is in date range as provided> after the upserts and deletes, some other client might have altered an overlapping date range in between. In that case, I am returning different results as provided by the client, which might introduce bugs if the clients are not properly implemented. Hence, that's why I believe the inserts and select need be part of the same transaction with transaction isolation level set to repeatable read.

但是后来,我开始考虑一个单一的胖CTE,因此是我的问题。

But then, I started thinking about one single, fat CTE, hence my question.

我希望这可以弄清楚情况。

I hope this clarifies the scenario.

推荐答案

正如您正确注意到的那样,您无法在主 SELECT 的CTE中看到修改过的行。 >

As you noticed correctly, you cannot see rows modified in the CTE in the main SELECT. This is documented:


WITH 中的子语句彼此同时执行
和与主要查询。因此,在 WITH 中使用数据修改语句
时,指定更新实际发生的顺序是
无法预测的。所有语句都使用相同的快照
执行(请参阅第13章),因此他们不能看到对方的
对目标桌的影响。这减轻了
实际行更新顺序的不可预测性的影响,并且意味着
RETURNING 数据是在不同版本之间传递更改的唯一方法
子语句和主查询。

The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot (see Chapter 13), so they cannot “see” one another's effects on the target tables. This alleviates the effects of the unpredictability of the actual order of row updates, and means that RETURNING data is the only way to communicate changes between different WITH sub-statements and the main query.

您应该使用 RETURNING

我想最简单的方法是不使用函数,而是执行 json_build_object 在主查询中,并使其对CTE 父母儿童

I guess the simplest way would be not to use a function, but to perform json_build_object in the main query and have it operate on the CTEs parents and children.

这篇关于重构功能,以便可以在CTE中使用它们的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆