为什么递归CTE会按程序运行分析功能(ROW_NUMBER)? [英] Why do Recursive CTEs run analytic functions (ROW_NUMBER) procedurally?

查看:44
本文介绍了为什么递归CTE会按程序运行分析功能(ROW_NUMBER)?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

昨天我回答了一个递归CTE,该CTE暴露了在SQL Server中实现这些问题的方式(可能也在其他RDBMS中实现)的问题.基本上,当我尝试针对当前递归级别使用 ROW_NUMBER 时,它将针对当前递归级别的每一行子集运行.我希望这会在真正的SET逻​​辑中起作用,并且会违反整个当前递归级别.

I answered a recursive CTE yesterday that exposed an issue with the way that these are implemented in SQL Server (possibly in other RDBMS, too?). Basically, when I try to use ROW_NUMBER against the current recursive level, it runs against each row subset of the current recursive level. I would expect that this would work in true SET logic, and run against the entire current recursive level.

看来,从此MSDN文章,发现的问题是预期的功能:

It appears that, from this MSDN article, the issue I have found is intended functionality:

CTE的递归部分中的分析函数和集合函数是应用于当前递归级别的集合,而不应用于集合CTE. ROW_NUMBER之类的功能仅在以下子集上起作用当前递归级别而不是整个递归级别传递给他们的数据粘贴到CTE递归部分的一组数据.欲了解更多有关信息,请参阅J.在递归CTE中使用分析函数.

Analytic and aggregate functions in the recursive part of the CTE are applied to the set for the current recursion level and not to the set for the CTE. Functions like ROW_NUMBER operate only on the subset of data passed to them by the current recursion level and not the entire set of data pased to the recursive part of the CTE. For more information, see J. Using analytical functions in a recursive CTE.

在我的挖掘工作中,我找不到任何地方可以解释为什么选择这种方式来工作吗?这更像是一种基于集合语言的过程方法,因此这与我的SQL思考过程相反,并且在我看来非常令人困惑.是否有人知道和/或可以解释为什么递归CTE以程序方式在递归级别上处理分析功能?

In my digging, I could find nowhere that explains why this was chosen to work the way it does? This is more of a procedural approach in a set based language, so this works against my SQL thought process and is quite confusing in my opinion. Does anybody know and/or can anybody explain why the recursive CTE treats analytic functions at the recursion level in a procedural fashion?

以下是有助于形象化的代码:

Here is the code to help visualize this:

注意,这些代码输出的每一个中的 RowNumber 列.

Notice, the RowNumber column in each one of these code outputs.

此处是CTE的SQLFiddle(仅显示递归的第二级)

Here is the SQLFiddle for the CTE (only showing the 2nd level of the recursion)

WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentId IS NULL

  UNION ALL

  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID
 )
SELECT *
FROM myCTE
WHERE RecurseLevel = 2;

这里是我希望CTE要做的第二个SQLFiddle(同样,仅需要第二级即可显示问题)

Here is the second SQLFiddle for what I would expect the CTE to do (again only need the 2nd level to display the issue)

WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentId IS NULL
 )
  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID;

我一直设想SQL递归CTE可以像 while循环 那样运行

I always envisioned the SQL recursive CTE to run more like this while loop

DECLARE @RecursionLevel INT
SET @RecursionLevel = 0
SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, @RecursionLevel AS recurseLevel
INTO #RecursiveTable
FROM tblGroups
WHERE ParentId IS NULL

WHILE EXISTS( SELECT tblGroups.* FROM tblGroups JOIN #RecursiveTable ON #RecursiveTable.GroupID = tblGroups.ParentID WHERE recurseLevel = @RecursionLevel)
BEGIN

    INSERT INTO #RecursiveTable
    SELECT tblGroups.*, 
        ROW_NUMBER() OVER (ORDER BY #RecursiveTable.RowNumber , tblGroups.Score desc) AS RowNumber, 
        recurseLevel + 1 AS recurseLevel
    FROM tblGroups
        JOIN #RecursiveTable
            ON #RecursiveTable.GroupID = tblGroups.ParentID
    WHERE recurseLevel = @RecursionLevel
    SET @RecursionLevel = @RecursionLevel + 1
END

SELECT * FROM #RecursiveTable ORDER BY RecurseLevel;

推荐答案

解析函数的特殊之处在于,它们需要已知的结果集来进行解析.它们取决于以下,先前或完整的结果集以计算当前值.就是说,绝不允许在包含分析功能的视图上合并视图.为什么?那会改变结果.

Analytic functions are special in the way that they need a known resultset to resolve. They depend on the following, preceding or full resultset to caculate current value. That said, merging view is never allowed on a view that contains an analytic function. Why? That will change the result.

例如:

    Select * from (
      select row_number() over (partition by c1 order by c2) rw, c3 from t) z
    where c3=123

    select row_number() over (partition by c1 order by c2) rw, c3 from t 
    where c3=123

这2将为rw返回不同的值.因此,包含解析函数的子查询将始终被完全解析,而永远不会与其余查询合并.

These 2 will return different values for rw. That's why sub-queries containing analytic functions will always be fully resolved before and never be merged with the rest.

更新

查看第二个查询:

WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentId IS NULL
 )
  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID;

它的工作方式就像是写的(相同的执行计划和结果):

It works exactly as if it was written like (Same execution plan and result) :

SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
FROM tblGroups
JOIN (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
    FROM tblGroups
    WHERE ParentId IS NULL
    )myCTE ON myCTE.GroupID = tblGroups.ParentID;

该分区需要重新设置行号.

This one needs to be partitioned to reset the rownumber.

递归查询在while循环中不起作用,它们不是过程性的.从根本上讲,它们就像递归函数一样工作,但是根据表,查询和索引,它们可以被优化为以一种或另一种方式运行.

Recursive queries don't work in a while loop, they are not procedural. At the base, they work like a recursive function, but depending on the tables, the query, the indexes, they can be optimized to run one way or the other.

如果我们遵循这样的概念,那就是在使用解析函数并查看查询1时无法合并视图.视图只能运行一次,并且处于嵌套循环中.

If we do follow the concept that view cannot be merged when using analytic functions, and looking at query 1. It can only run once way, and it's in nested loop.

WITH myCTE
AS
( /*Cannot be merged*/
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel,
  cast(0 as bigint) n
  FROM tblGroups
  WHERE ParentId IS NULL

  UNION ALL

/*Cannot be merged*/
  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber, tblGroups.Score desc) AS RowNumber,       RecurseLevel + 1 AS RecurseLevel,
  myCTE.RowNumber
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID
 )
SELECT *
FROM myCTE;

因此,第一个选择不能合并,第二个也不能合并.运行此查询的唯一方法是在每个级别中返回的每个项目的嵌套循环中,因此需要进行重置.再次,这不是程序性问题,而仅仅是可能的执行计划问题.

So 1st select, cannot be merged 2nd, neither. The only way to run this query is in nested loop for each item returned in each level, hence the reset. Again, it's not a question of procedural or not, just a question of possible execution plan.

希望这能回答您的问题,如果不可以,请允许我:)

Hope this answers you question, let me if it doesn't:)

y

这篇关于为什么递归CTE会按程序运行分析功能(ROW_NUMBER)?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆