典型递归时出现意外数据 [英] Unexpected data at typical recursion

查看:72
本文介绍了典型递归时出现意外数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我很难用单词来描述,所以这里是示例:

It's hard for me to use words to describe this, so here's the sample:

select *
into t
from (values (10, 'A'),
             (25, 'B'),
             (30, 'C'),
             (45, 'D'),
             (52, 'E'),
             (61, 'F'),
             (61, 'G'),
             (61, 'H'),
             (79, 'I'),
             (82, 'J')
) v(userid, name)

注意F,G和H如何具有相同的用户ID。

Notice how F,G and H have the same userid.

现在,考虑以下递归查询:

Now, consider the following recursive query:

with tn as 
(
    select t.userId,t.name, row_number() over (order by userid,newid()) as seqnum
    from t
),
cte as 
(
        select userId, name,  seqnum as seqnum
        from tn 
        where seqnum = 1
    union all
        select tn.userId, tn.name,tn.seqnum
        from 
            cte 
            inner join tn on tn.seqnum = cte.seqnum + 1
)
select *
from cte

第一个cte tn 创建一个row_number,其中包含一个随机成分,该成分将导致F / G / H以随机顺序出现。但是,它们仍将各自出现一次(通过将最后一个和最外面的 from 修改为 from tn 可以很容易地进行检查)

The first cte,tn, creates a row_number which includes a randomization component that will cause F/G/H to appear in random order. However, they will all still appear once each(easily checkable by modifying the last and outermost from to be from tn)

第二个cte, cte ,递归扫描 tn 。没什么太复杂的,因为它来自最小化的示例。但是,很明显,将锚成员手动设置为 tn 的第一行,然后递归扫描所有其余行。

The second cte, cte, recursively scans tn. There's nothing too complicated in it because it comes from a minimized example. However, it is evident that the anchor member is manually set to be tn's first line, and then the recursion scans all the rest lines.

但是,最终结果集没有F / G / H出现一次!它们总共以3行出现,但可以任意组合出现。 FGH是可能的,但是FFH甚至FFF也是可以的!这是一个FHH示例:

However, the final result-set does not have F/G/H appear once each! They all appear in 3 lines total, but in any combination. FGH is possible, but so is FFH, and even FFF! Here's an FHH example:

+--------+------+--------+
| userId | name | seqnum |
+--------+------+--------+
|     10 | A    |      1 |
|     25 | B    |      2 |
|     30 | C    |      3 |
|     45 | D    |      4 |
|     52 | E    |      5 |
|     61 | F    |      6 |
|     61 | H    |      7 |
|     61 | H    |      8 |
|     79 | I    |      9 |
|     82 | J    |     10 |
+--------+------+--------+

为什么?

我不认为 ctes中的分析函数行为与之相关,因为 tn

I don't think analytic function behaviour within ctes has anything to do with it, because tn which includes the row_number is not recursive.

记录下来,由于以下事件顺序,我得到了这个问题:
有人问了一个问题优秀贡献者回答的。
相同的OP问跟随问题,尽管我可以稍加修改一下即可回答。但是,经过一番挖掘,我发现了一种我无法理解的行为。我将示例尽可能地简化了。

For the record, I got this question due to the following sequence of events: Someone asked a question that was answered by an excellent contributor. The same OP asked a follow-up question, which I though I could answer with a bit of tampering on the original. However, after digging a bit, I found a behaviour I cannot understand. I made the example minimal as much as i could.

推荐答案

CTE未被假脱机。每次引用 tn 都可能导致重新运行查询并重新随机化结果。

CTE's aren't spooled. Every reference to tn may result in re-running the query and re-randomizing the results.

要避免这种情况,运行一次随机化查询并加载临时表。例如

To avoid this, run the randomizing query once and load a temp table. eg

select t.userId,t.name, row_number() over (order by userid,newid()) as seqnum
into #tn
from t

,并在后续查询中引用

with tn as 
(
    select * from #tn
),
cte as 
(
        select userId, name,  seqnum as seqnum
        from tn
        where seqnum = 1
    union all
        select tn.userId, tn.name,tn.seqnum
        from 
            cte 
            inner join tn on tn.seqnum = cte.seqnum + 1
)
select *
from cte

这篇关于典型递归时出现意外数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆