循环检测与递归子查询分解 [英] Cycle detection with recursive subquery factoring

查看:142
本文介绍了循环检测与递归子查询分解的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Oracle v2可以使用自己的专有CONNECT BY语法从v2开始进行分层查询。在最新的11g版本2中,他们添加了递归子查询因子分解,也称为递归条款。这是ANSI标准,如果我理解正确,这一个也已经被其他RDBMS供应商实施。



当将连接与递归比较时,在使用循环检测时,我注意到结果集中有差异。结果的连接对我来说更直观,所以我想知道Oracle的实现是否包含一个错误,或者这是标准的ANSI和预期的行为。因此,我的问题是如果您可以使用MySQL,DB2,SQL Server等其他数据库检查递归。提供这些数据库当然支持递归条款。



这是它在Oracle 11.2.0.1.0上的工作原理。

  SQL>选择* 
2从t
3 /

ID PARENT_ID
---------- ----------
1 2
2 1

选择2行。

使用CONNECT BY语法查询

  SQL>选择id 
2,parent_id
3,connect_by_iscycle
4从t
5连接通过nocycle parent_id =先前的id
6从id = 1开始
7 /

ID PARENT_ID CONNECT_BY_ISCYCLE
---------- ---------- -------------- ----
1 2 0
2 1 1

2行选择。

对我来说看起来很直观。但是,使用新的ANSI语法,它返回另一行:

  SQL>以tr(id,parent_id)为
2(select id
3,parent_id
4从t
5其中id = 1
6 union all
7 select t.id
8,t.parent_id
9从t
10连接tr on t.parent_id = tr.id
11)循环id set is_cycle到'1'默认'0'
12 select id
13,parent_id
14,is_cycle
15从tr
16 /

ID PARENT_ID I
---------- ---------- -
1 2 0
2 1 0
1 2 1

选择3行。

这是您可以用来检查的脚本:

  create table t 
(id number
,parent_id number
);
插入t值(1,2);
插入t值(2,1);
提交;
with tr(id,parent_id)as
(select id
,parent_id
from t
where id = 1
union all
select t.id
,t.parent_id
从t
加入tr on t.parent_id = tr.id
)循环id设置is_cycle为'1'默认'0'
select id
,parent_id
,is_cycle
from tr;


解决方案

CONNECT_BY_ISCYCLE


CONNECT_BY_ISCYCLE pseudocolumn返回 1 如果当前行有一个也是其祖先


的小孩,而在 CYCLE


如果一个行的祖先行具有与循环列相同的值,则会将行视为形成一个循环。


在你的例子中,行 2 有一个也是它的祖先的孩子,但是它的 id 尚未退回。



换句话说, CONNECT_BY_ISCYCLE c hildren (尚未返回),而 CYCLE 检查当前行(已返回)



CONNECT BY 是基于行的,而递归 CTE



递归 CTE 中没有孩子的概念。这是一个基于集合的操作,可以完全从树中产生结果。一般来说,锚点部分和递归部分甚至可以使用不同的表格。



由于递归 CTE 通常用于构建层次树, Oracle 决定添加循环检查。但是由于基于set的方式,递归的 CTE 的操作,通常不可能告诉下一步是否生成一个循环。



要执行下一步步骤,整个当前集需要可用,但要生成当前集合的每一行(包括循环列),我们只需要具有下一步操作。这不是一个单行的问题(例如 CONNECT BY ),但整个集合是一个问题。



尚未查看 Oracle 11 SQL Server 实现递归 CTE 只需在其后面隐藏一个 CONNECT BY ,这需要放置许多限制(所有这些都有效地禁止所有基于集合的操作)



PostgreSQL 的实现,另一方面,是真正的基于设置的。 b
$ b

如前所述, MySQL 不实现 CTE (它不实现 HASH JOIN MERGE JOIN s,只有嵌套循环,所以不要'呃,讽刺的是,今天我收到了一封关于这个主题的信,我将在我的博客中介绍。



更新:



递归 CTE SQL Server 不是mor e比 CONNECT BY 伪装。在我的博客中看到这篇文章令人震惊的细节:




Oracle SQL can do hierarchical queries since v2, using their proprietary CONNECT BY syntax. In their latest 11g release 2, they added recursive subquery factoring, also known as the recursive with clause. This is the ANSI standard, and if I understand correctly, this one has been implemented by other RDBMS vendors as well.

When comparing the connect-by with the recursive with, I noticed a difference in the result set when using cycle detection. The connect by results are more intuitive to me, so I'm wondering if Oracle's implementation contains a bug, or if this is standard ANSI and expected behaviour. Therefore my question is if you can check the recursive with query using other databases like MySQL, DB2, SQL Server and others. Provided those databases support the recursive with clause of course.

Here is how it works on Oracle 11.2.0.1.0

SQL> select *
  2    from t
  3  /

        ID  PARENT_ID
---------- ----------
         1          2
         2          1

2 rows selected.

The query using CONNECT BY syntax:

SQL>  select id
  2        , parent_id
  3        , connect_by_iscycle
  4     from t
  5  connect by nocycle parent_id = prior id
  6    start with id = 1
  7  /

        ID  PARENT_ID CONNECT_BY_ISCYCLE
---------- ---------- ------------------
         1          2                  0
         2          1                  1

2 rows selected.

Which looks intuitive to me. However, using the new ANSI syntax it returns one more row:

SQL> with tr (id,parent_id) as
  2  ( select id
  3         , parent_id
  4      from t
  5     where id = 1
  6     union all
  7    select t.id
  8         , t.parent_id
  9      from t
 10           join tr on t.parent_id = tr.id
 11  ) cycle id set is_cycle to '1' default '0'
 12  select id
 13       , parent_id
 14       , is_cycle
 15    from tr
 16  /

        ID  PARENT_ID I
---------- ---------- -
         1          2 0
         2          1 0
         1          2 1

3 rows selected.

This is the script you can use to check:

create table t
( id        number
, parent_id number
);
insert into t values (1, 2);
insert into t values (2, 1);
commit;
with tr (id,parent_id) as
( select id
       , parent_id
    from t
   where id = 1
   union all
  select t.id
       , t.parent_id
    from t
         join tr on t.parent_id = tr.id
) cycle id set is_cycle to '1' default '0'
select id
     , parent_id
     , is_cycle
  from tr;

解决方案

From documentation on CONNECT_BY_ISCYCLE:

The CONNECT_BY_ISCYCLE pseudocolumn returns 1 if the current row has a child which is also its ancestor

and that on CYCLE:

A row is considered to form a cycle if one of its ancestor rows has the same values for the cycle columns.

In your example, row 2 does have a child which is also its ancestor, but its id has not been returned yet.

In other words, CONNECT_BY_ISCYCLE checks the children (which are yet to be returned), while CYCLE checks the current row (which is already returned).

CONNECT BY is row based, while recursive CTE's are set-based.

There is no concept of a "child" in a recursive CTE. It's a set based operation which can yield results completely out of the tree. Generally speaking, the anchor part and the recursive part can even use the different tables.

Since recursive CTE's are usually used to build hierarchy trees, Oracle decided to add a cycle check. But due the set-based way the recursive CTE's operate, it's generally impossible to tell will the next step generate a cycle or not.

To perform the "next" step, the whole "current" set needs to be available, but to generate each row of the current set (which includes the cycle column) we just need to have the results of the "next" operation. It's not a problem with a single row (like in CONNECT BY), but it is a problem with a set as a whole.

Didn't look into Oracle 11 yet, but SQL Server implements recursive CTE's by just hiding a CONNECT BY behind them, which requires placing numerous restrictions (all of which effectively forbid all set-based operations).

PostgreSQL's implementation, on the other hand, is truly set-based.

As was mentioned before, MySQL does not implement CTE's at all (it does not implement HASH JOIN's or MERGE JOINs as well, only the nested loops, so don't be surprised much).

Ironically, I received a letter today on this very subject, which I will cover in my blog.

Update:

Recursive CTE's in SQL Server are no more than CONNECT BY in disguise. See this article in my blog for shocking details:

这篇关于循环检测与递归子查询分解的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆