典型递归时出现意外数据 [英] Unexpected data at typical recursion
问题描述
我很难用单词来描述,所以这里是示例:
It's hard for me to use words to describe this, so here's the sample:
select *
into t
from (values (10, 'A'),
(25, 'B'),
(30, 'C'),
(45, 'D'),
(52, 'E'),
(61, 'F'),
(61, 'G'),
(61, 'H'),
(79, 'I'),
(82, 'J')
) v(userid, name)
注意F,G和H如何具有相同的用户ID。
Notice how F,G and H have the same userid.
现在,考虑以下递归查询:
Now, consider the following recursive query:
with tn as
(
select t.userId,t.name, row_number() over (order by userid,newid()) as seqnum
from t
),
cte as
(
select userId, name, seqnum as seqnum
from tn
where seqnum = 1
union all
select tn.userId, tn.name,tn.seqnum
from
cte
inner join tn on tn.seqnum = cte.seqnum + 1
)
select *
from cte
第一个cte tn
创建一个row_number,其中包含一个随机成分,该成分将导致F / G / H以随机顺序出现。但是,它们仍将各自出现一次(通过将最后一个和最外面的 from
修改为 from tn
可以很容易地进行检查)
The first cte,tn
, creates a row_number which includes a randomization component that will cause F/G/H to appear in random order. However, they will all still appear once each(easily checkable by modifying the last and outermost from
to be from tn
)
第二个cte, cte
,递归扫描 tn
。没什么太复杂的,因为它来自最小化的示例。但是,很明显,将锚成员手动设置为 tn
的第一行,然后递归扫描所有其余行。
The second cte, cte
, recursively scans tn
. There's nothing too complicated in it because it comes from a minimized example. However, it is evident that the anchor member is manually set to be tn
's first line, and then the recursion scans all the rest lines.
但是,最终结果集没有F / G / H出现一次!它们总共以3行出现,但可以任意组合出现。 FGH是可能的,但是FFH甚至FFF也是可以的!这是一个FHH示例:
However, the final result-set does not have F/G/H appear once each! They all appear in 3 lines total, but in any combination. FGH is possible, but so is FFH, and even FFF! Here's an FHH example:
+--------+------+--------+
| userId | name | seqnum |
+--------+------+--------+
| 10 | A | 1 |
| 25 | B | 2 |
| 30 | C | 3 |
| 45 | D | 4 |
| 52 | E | 5 |
| 61 | F | 6 |
| 61 | H | 7 |
| 61 | H | 8 |
| 79 | I | 9 |
| 82 | J | 10 |
+--------+------+--------+
为什么?
我不认为 ctes中的分析函数行为与之相关,因为 tn $ c
I don't think analytic function behaviour within ctes has anything to do with it, because tn
which includes the row_number is not recursive.
记录下来,由于以下事件顺序,我得到了这个问题:
有人问了一个问题一个优秀贡献者回答的。
相同的OP问跟随问题,尽管我可以稍加修改一下即可回答。但是,经过一番挖掘,我发现了一种我无法理解的行为。我将示例尽可能地简化了。
For the record, I got this question due to the following sequence of events: Someone asked a question that was answered by an excellent contributor. The same OP asked a follow-up question, which I though I could answer with a bit of tampering on the original. However, after digging a bit, I found a behaviour I cannot understand. I made the example minimal as much as i could.
推荐答案
CTE未被假脱机。每次引用 tn
都可能导致重新运行查询并重新随机化结果。
CTE's aren't spooled. Every reference to tn
may result in re-running the query and re-randomizing the results.
要避免这种情况,运行一次随机化查询并加载临时表。例如
To avoid this, run the randomizing query once and load a temp table. eg
select t.userId,t.name, row_number() over (order by userid,newid()) as seqnum
into #tn
from t
,并在后续查询中引用
with tn as
(
select * from #tn
),
cte as
(
select userId, name, seqnum as seqnum
from tn
where seqnum = 1
union all
select tn.userId, tn.name,tn.seqnum
from
cte
inner join tn on tn.seqnum = cte.seqnum + 1
)
select *
from cte
这篇关于典型递归时出现意外数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!