循环检测与递归子查询分解 [英] Cycle detection with recursive subquery factoring
问题描述
Oracle v2可以使用自己的专有CONNECT BY语法从v2开始进行分层查询。在最新的11g版本2中,他们添加了递归子查询因子分解,也称为递归条款。这是ANSI标准,如果我理解正确,这一个也已经被其他RDBMS供应商实施。
当将连接与递归比较时,在使用循环检测时,我注意到结果集中有差异。结果的连接对我来说更直观,所以我想知道Oracle的实现是否包含一个错误,或者这是标准的ANSI和预期的行为。因此,我的问题是如果您可以使用MySQL,DB2,SQL Server等其他数据库检查递归。提供这些数据库当然支持递归条款。
这是它在Oracle 11.2.0.1.0上的工作原理。
SQL>选择*
2从t
3 /
ID PARENT_ID
---------- ----------
1 2
2 1
选择2行。
使用CONNECT BY语法查询
SQL>选择id
2,parent_id
3,connect_by_iscycle
4从t
5连接通过nocycle parent_id =先前的id
6从id = 1开始
7 /
ID PARENT_ID CONNECT_BY_ISCYCLE
---------- ---------- -------------- ----
1 2 0
2 1 1
2行选择。
对我来说看起来很直观。但是,使用新的ANSI语法,它返回另一行:
SQL>以tr(id,parent_id)为
2(select id
3,parent_id
4从t
5其中id = 1
6 union all
7 select t.id
8,t.parent_id
9从t
10连接tr on t.parent_id = tr.id
11)循环id set is_cycle到'1'默认'0'
12 select id
13,parent_id
14,is_cycle
15从tr
16 /
ID PARENT_ID I
---------- ---------- -
1 2 0
2 1 0
1 2 1
选择3行。
这是您可以用来检查的脚本:
create table t
(id number
,parent_id number
);
插入t值(1,2);
插入t值(2,1);
提交;
with tr(id,parent_id)as
(select id
,parent_id
from t
where id = 1
union all
select t.id
,t.parent_id
从t
加入tr on t.parent_id = tr.id
)循环id设置is_cycle为'1'默认'0'
select id
,parent_id
,is_cycle
from tr;
CONNECT_BY_ISCYCLE
pseudocolumn返回1
如果当前行有一个也是其祖先
的小孩,而在 CYCLE
:
如果一个行的祖先行具有与循环列相同的值,则会将行视为形成一个循环。
在你的例子中,行2
有一个也是它的祖先的孩子,但是它的id
尚未退回。
换句话说,
CONNECT_BY_ISCYCLE
c hildren (尚未返回),而CYCLE
检查当前行(已返回)
CONNECT BY
是基于行的,而递归CTE
递归
CTE
中没有孩子的概念。这是一个基于集合的操作,可以完全从树中产生结果。一般来说,锚点部分和递归部分甚至可以使用不同的表格。
由于递归
CTE
通常用于构建层次树,Oracle
决定添加循环检查。但是由于基于set的方式,递归的CTE
的操作,通常不可能告诉下一步是否生成一个循环。
要执行下一步步骤,整个当前集需要可用,但要生成当前集合的每一行(包括循环列),我们只需要具有下一步操作。这不是一个单行的问题(例如
CONNECT BY
),但整个集合是一个问题。
尚未查看
Oracle 11
但SQL Server
实现递归CTE
只需在其后面隐藏一个CONNECT BY
,这需要放置许多限制(所有这些都有效地禁止所有基于集合的操作)
PostgreSQL
的实现,另一方面,是真正的基于设置的。 b
$ b如前所述,
MySQL
不实现CTE
(它不实现HASH JOIN
或MERGE JOIN
s,只有嵌套循环,所以不要'呃,讽刺的是,今天我收到了一封关于这个主题的信,我将在我的博客中介绍。
更新:
递归
CTE
的SQL Server
不是mor e比CONNECT BY
伪装。在我的博客中看到这篇文章令人震惊的细节:
Oracle SQL can do hierarchical queries since v2, using their proprietary CONNECT BY syntax. In their latest 11g release 2, they added recursive subquery factoring, also known as the recursive with clause. This is the ANSI standard, and if I understand correctly, this one has been implemented by other RDBMS vendors as well.
When comparing the connect-by with the recursive with, I noticed a difference in the result set when using cycle detection. The connect by results are more intuitive to me, so I'm wondering if Oracle's implementation contains a bug, or if this is standard ANSI and expected behaviour. Therefore my question is if you can check the recursive with query using other databases like MySQL, DB2, SQL Server and others. Provided those databases support the recursive with clause of course.
Here is how it works on Oracle 11.2.0.1.0
SQL> select * 2 from t 3 / ID PARENT_ID ---------- ---------- 1 2 2 1 2 rows selected.
The query using CONNECT BY syntax:
SQL> select id 2 , parent_id 3 , connect_by_iscycle 4 from t 5 connect by nocycle parent_id = prior id 6 start with id = 1 7 / ID PARENT_ID CONNECT_BY_ISCYCLE ---------- ---------- ------------------ 1 2 0 2 1 1 2 rows selected.
Which looks intuitive to me. However, using the new ANSI syntax it returns one more row:
SQL> with tr (id,parent_id) as 2 ( select id 3 , parent_id 4 from t 5 where id = 1 6 union all 7 select t.id 8 , t.parent_id 9 from t 10 join tr on t.parent_id = tr.id 11 ) cycle id set is_cycle to '1' default '0' 12 select id 13 , parent_id 14 , is_cycle 15 from tr 16 / ID PARENT_ID I ---------- ---------- - 1 2 0 2 1 0 1 2 1 3 rows selected.
This is the script you can use to check:
create table t ( id number , parent_id number ); insert into t values (1, 2); insert into t values (2, 1); commit; with tr (id,parent_id) as ( select id , parent_id from t where id = 1 union all select t.id , t.parent_id from t join tr on t.parent_id = tr.id ) cycle id set is_cycle to '1' default '0' select id , parent_id , is_cycle from tr;
解决方案From documentation on
CONNECT_BY_ISCYCLE
:The
CONNECT_BY_ISCYCLE
pseudocolumn returns1
if the current row has a child which is also its ancestorand that on
CYCLE
:A row is considered to form a cycle if one of its ancestor rows has the same values for the cycle columns.
In your example, row
2
does have a child which is also its ancestor, but itsid
has not been returned yet.In other words,
CONNECT_BY_ISCYCLE
checks the children (which are yet to be returned), whileCYCLE
checks the current row (which is already returned).
CONNECT BY
is row based, while recursiveCTE
's are set-based.There is no concept of a "child" in a recursive
CTE
. It's a set based operation which can yield results completely out of the tree. Generally speaking, the anchor part and the recursive part can even use the different tables.Since recursive
CTE
's are usually used to build hierarchy trees,Oracle
decided to add a cycle check. But due the set-based way the recursiveCTE
's operate, it's generally impossible to tell will the next step generate a cycle or not.To perform the "next" step, the whole "current" set needs to be available, but to generate each row of the current set (which includes the cycle column) we just need to have the results of the "next" operation. It's not a problem with a single row (like in
CONNECT BY
), but it is a problem with a set as a whole.Didn't look into
Oracle 11
yet, butSQL Server
implements recursiveCTE
's by just hiding aCONNECT BY
behind them, which requires placing numerous restrictions (all of which effectively forbid all set-based operations).
PostgreSQL
's implementation, on the other hand, is truly set-based.As was mentioned before,
MySQL
does not implementCTE
's at all (it does not implementHASH JOIN
's orMERGE JOIN
s as well, only the nested loops, so don't be surprised much).Ironically, I received a letter today on this very subject, which I will cover in my blog.
Update:
Recursive
CTE
's inSQL Server
are no more thanCONNECT BY
in disguise. See this article in my blog for shocking details:这篇关于循环检测与递归子查询分解的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!