CTE真正如何运作? [英] How CTE really works?

查看:63
本文介绍了CTE真正如何运作?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我遇到了这种连接行元素的CTE解决方案,我认为它很棒,我意识到CTE的功能是多么强大。


但是,为了有效地使用这种工具,我需要知道


因此,我尝试放慢上述片段的流程,在这里,它如何在内部建立起对于像我这样的初学者而言必不可少的心理形象。


是代码

 使用[NORTHWIND] 
GO
/ ******对象:表[dbo]。 [Products2]脚本日期:2011年10月18日08:55:07 ****** /

GO
上设置ANSI_NULLS在
GO $ b $上设置QUOTED_IDENTIFIER b如果OBJECT_ID('Products2','U')不是空表[Products2]
创建表[dbo]。[Products2](
[ProductID] [int] IDENTITY(1,1) NOT NULL,
[ProductName] [nvarchar](40)NOT NULL,
[SupplierID] [int] NULL,
[C ategoryID] [int] NULL,
[QuantityPerUnit] [nvarchar](20)NULL,
[UnitPrice] [金钱] NULL,
[UnitsInStock] [smallint] NULL,
[UnitsOnOrder] [smallint] NULL,
[ReorderLevel] [smallint] NULL,
[已停产] [bit]非NULL
)ON [主]
GO
设置IDENTITY_INSERT [dbo]。[Products2]开启
INSERT [dbo]。[Products2]([ProductID],[ProductName],[SupplierID],[CategoryID],[QuantityPerUnit],[UnitPrice],[UnitsInStock] ,[UnitsOnOrder],[ReorderLevel],[Discontinued])值(1,N'vcbcbvcbvc',1、4,N'10盒x 20袋',18.0000、39、0、10、0)
INSERT [dbo]。[Products2]([ProductID],[ProductName],[SupplierID],[CategoryID],[QuantityPerUnit],[UnitPrice],[UnitsInStock],[UnitsOnOrder],[ReorderLevel],[Discontinued])值( 2,N'Changassad',1,1,N'24-12 oz瓶',19.0000,17,40,25,0)
插入[dbo]。[Products2]([ProductID],[ProductName] ,[SupplierID],[CategoryID],[QuantityPerUnit],[UnitPrice],[UnitsInStock],[UnitsOnOrde r],[ReorderLevel],[Discontinued])值(3,N'茴香糖浆',1、2,N'12-550毫升瓶',10.0000、13、70、25、0)
INSERT [ dbo]。[Products2]([ProductID],[ProductName],[SupplierID],[CategoryID],[QuantityPerUnit],[UnitPrice],[UnitsInStock],[UnitsOnOrder],[ReorderLevel],[Discontinued])值(4 ,N'Chef Anton's Cajun Seasoning',2、2,N'48-6 oz jars',22.0000、53、0、0、0)
插入[dbo]。[Products2]([ProductID ],[ProductName],[SupplierID],[CategoryID],[QuantityPerUnit],[UnitPrice],[UnitsInStock],[UnitsOnOrder],[ReorderLevel],[Discontinued])值(5,N'Chef Anton's Gumbo Mix',10、2,N'36个box',21.3500、0、0、0、1)
SET IDENTITY_INSERT [dbo]。[Products2] OFF
GO
IF OBJECT_ID(' DELAY_EXEC','FN')不为空DROP函数DELAY_EXEC
GO
创建函数DELAY_EXEC()返回日期时间
AS
开始
声明@I INT = 0
当@ I< 99999
开始
选择@ I + = 1
END
RETURN GETDATE()
END
GO

有CTE(EXEC_TIME,CategoryID,product_list,product_name,长度)
AS(SELECT dbo.DELAY_EXEC(),
CategoryID,
CAST(''AS VARCHAR( 8000)),
CAST(''AS VARCHAR(8000)),
0
from Northwind..Products2
GROUP BY CategoryID
UNION ALL
SELECT dbo.DELAY_EXEC(),
p.CategoryID,
CAST(product_list + CASE
当长度= 0时''''
ELSE','
END + ProductName AS VARCHAR(8000)),
CAST(产品名称AS VARCHAR(8000)),
长度+ 1
FROM CTE c
内联接Northwind..Products2 p
在c.CategoryID = p.CategoryID
的情况下,p.ProductName> c.product_name)
SELECT *
FROM CTE
在EXEC_TIME之前订购

--SELECT CategoryId,product_list
-FROM(SELECT CategoryId,product_list,
-RANK()OVER(分区按CategoryId排序或按长度DESC排序)
-FROM CTE)D(CategoryId,product_list,rank)
-WHERE rank = 1;

注释的块是级联问题的理想输出,但这不是这里的问题。


我添加了EXEC_TIME列,以了解首先添加了哪一行。
输出对我来说看起来不正确,有两个原因


  1. 我认为由于条件 p.ProductName> c.product_name 换句话说,CTE的第一部分的空行总是少于Product2表中的值,因此每次运行时,它应该再次带来一组新的已添加行。



  2. 数据层次确实很奇怪,最后一项应该是最长的,而最后一项又是什么呢?项 length = 1



任何专家营救?


样本结果


  EXEC_TIME CategoryID product_list产品名称长度
---- ------------------- ----------- -------------------- ----------------------------------------------- --- ------------------------------ -----------
2011-10-18 12:46:14.930 1 0
2011-10-18 12:46:14.990 2 0
2011-10-18 12:46:15.050 4 0
2011-10-18 12: 46:15.107 4 vcbcbvcbvc vcbcbvcbvc 1
2011-10-18 12:46:15.167 2大茴香Sy rup茴香糖浆1
2011-10-18 12:46:15.223 2厨师安东的调味酱厨师安东的调味酱1
2011-10-18 12:46:15.280 2厨师安东的浓汤厨师安东的浓汤混合1
2011-10-18 12:46:15.340 2厨师安东的印第安调味料,厨师安东的浓汤厨师安东的浓汤混合2
2011-10-18 12:46:15.400 2茴香糖浆,厨师Anton's Cajun调味厨师Anton's Cajun调味厨师2
2011-10-18 12:46:15.463 2茴香糖浆,厨师Anton的浓汤混合厨师Anton's浓汤混合2
2011-10-18 12:46:15.520 2茴香糖浆,安东厨师的调味料,安东厨师的浓汤Mi厨师安东的浓汤3
2011-10-18 12:46:15.580 1 Changassad Changassad 1


解决方案

页面使用公用表表达式的递归查询描述了CTE的逻辑:


递归执行的语义如下:


  1. CTE表达式分为锚成员和递归成员。


  2. 运行锚成员以创建第一个调用或基本结果集(T0)。


  3. 使用Ti作为输入并以Ti + 1作为输出运行递归成员。


  4. 重复步骤3,直到返回空集。


  5. 返回结果集。这是T0到Tn的UNION ALL。



但是,这仅仅是逻辑流程。与往常一样,如果使用SQL,则服务器可以自由地对其认为合适的操作进行重新排序,如果结果将是相同的,则可以认为重新排序可以更有效地提供结果。



函数的存在副作用(导致延迟,然后返回 GETDATE()),通常在决定是否执行以下操作时不会考虑



对查询进行重新排序的一种明显方式是,它可以决定开始处理结果集 Ti + 1 完全创建结果集 Ti 之前-与完全构造 Ti 首先,因为新行肯定已经在内存中,并且最近已被访问。


I came across this CTE solution for concatenating row elements and I thought it's brilliant and I realized how powerful CTEs can be.

However, in order to use such a tool effectively I need to know how it works internally to build that mental image which is essential for beginners, like me, to use it in different scenarios.

So I tried to slow motion the process of the above snippet and here is the code

USE [NORTHWIND]
GO
/****** Object:  Table [dbo].[Products2]  Script Date: 10/18/2011 08:55:07 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF OBJECT_ID('Products2','U') IS NOT NULL  DROP TABLE [Products2]
CREATE TABLE [dbo].[Products2](
  [ProductID] [int] IDENTITY(1,1) NOT NULL,
  [ProductName] [nvarchar](40) NOT NULL,
  [SupplierID] [int] NULL,
  [CategoryID] [int] NULL,
  [QuantityPerUnit] [nvarchar](20) NULL,
  [UnitPrice] [money] NULL,
  [UnitsInStock] [smallint] NULL,
  [UnitsOnOrder] [smallint] NULL,
  [ReorderLevel] [smallint] NULL,
  [Discontinued] [bit] NOT NULL
) ON [PRIMARY]
GO
SET IDENTITY_INSERT [dbo].[Products2] ON
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (1, N'vcbcbvcbvc', 1, 4, N'10 boxes x 20 bags', 18.0000, 39, 0, 10, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (2, N'Changassad', 1, 1, N'24 - 12 oz bottles', 19.0000, 17, 40, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (3, N'Aniseed Syrup', 1, 2, N'12 - 550 ml bottles', 10.0000, 13, 70, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (4, N'Chef Anton''s Cajun Seasoning', 2, 2, N'48 - 6 oz jars', 22.0000, 53, 0, 0, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (5, N'Chef Anton''s Gumbo Mix', 10, 2, N'36 boxes', 21.3500, 0, 0, 0, 1)
SET IDENTITY_INSERT [dbo].[Products2] OFF
GO
IF OBJECT_ID('DELAY_EXEC','FN') IS NOT NULL  DROP FUNCTION DELAY_EXEC
GO
CREATE FUNCTION DELAY_EXEC() RETURNS DATETIME
AS
BEGIN
  DECLARE @I INT=0
  WHILE @I<99999
  BEGIN
  SELECT @I+=1
  END
  RETURN GETDATE()
END
GO

WITH CTE (EXEC_TIME, CategoryID, product_list, product_name, length)
     AS (SELECT dbo.DELAY_EXEC(),
                CategoryID,
                CAST('' AS VARCHAR(8000)),
                CAST('' AS VARCHAR(8000)),
                0
         FROM   Northwind..Products2
         GROUP  BY CategoryID
         UNION ALL
         SELECT dbo.DELAY_EXEC(),
                p.CategoryID,
                CAST(product_list + CASE
                                      WHEN length = 0 THEN ''
                                      ELSE ', '
                                    END + ProductName AS VARCHAR(8000)),
                CAST(ProductName AS VARCHAR(8000)),
                length + 1
         FROM   CTE c
                INNER JOIN Northwind..Products2 p
                  ON c.CategoryID = p.CategoryID
         WHERE  p.ProductName > c.product_name)
SELECT *
FROM   CTE
ORDER  BY EXEC_TIME  

--SELECT CategoryId, product_list
--  FROM ( SELECT CategoryId, product_list,
--  RANK() OVER ( PARTITION BY CategoryId ORDER BY length DESC )
--   FROM CTE ) D ( CategoryId, product_list, rank )
--   WHERE rank = 1 ;

The commented block is the desired output for the concatenation problem but it's not the question here.

I've added a column EXEC_TIME to know which row got added first. The output doesn’t look right to me for two reasons

  1. I thought there would be a redundant data because of the condition p.ProductName > c.product_name in another word the first part of the CTE the empty rows are always less then values in the Product2 table so each time it runs it should bring a new set of already added rows once again. Does this make any sense?

  2. The hierarchy of data is really weird the last item should be the longest and look what is the last item? An item with length=1?

Any expert to the rescue? Thanks in advance.

Sample Results

EXEC_TIME               CategoryID  product_list                                                        product_name                      length
----------------------- ----------- ------------------------------------------------------------------- --------------------------------- -----------
2011-10-18 12:46:14.930 1                                                                                                                 0
2011-10-18 12:46:14.990 2                                                                                                                 0
2011-10-18 12:46:15.050 4                                                                                                                 0
2011-10-18 12:46:15.107 4           vcbcbvcbvc                                                          vcbcbvcbvc                        1
2011-10-18 12:46:15.167 2           Aniseed Syrup                                                       Aniseed Syrup                     1
2011-10-18 12:46:15.223 2           Chef Anton's Cajun Seasoning                                        Chef Anton's Cajun Seasoning      1
2011-10-18 12:46:15.280 2           Chef Anton's Gumbo Mix                                              Chef Anton's Gumbo Mix            1
2011-10-18 12:46:15.340 2           Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mix                Chef Anton's Gumbo Mix            2
2011-10-18 12:46:15.400 2           Aniseed Syrup, Chef Anton's Cajun Seasoning                         Chef Anton's Cajun Seasoning      2
2011-10-18 12:46:15.463 2           Aniseed Syrup, Chef Anton's Gumbo Mix                               Chef Anton's Gumbo Mix            2
2011-10-18 12:46:15.520 2           Aniseed Syrup, Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mi  Chef Anton's Gumbo Mix            3
2011-10-18 12:46:15.580 1           Changassad                                                          Changassad                        1

解决方案

The page Recursive Queries Using Common Table Expressions describes the logic of CTEs:

The semantics of the recursive execution is as follows:

  1. Split the CTE expression into anchor and recursive members.

  2. Run the anchor member(s) creating the first invocation or base result set (T0).

  3. Run the recursive member(s) with Ti as an input and Ti+1 as an output.

  4. Repeat step 3 until an empty set is returned.

  5. Return the result set. This is a UNION ALL of T0 to Tn.

However, that's only the logical flow. As always, with SQL, the server is free to reorder operations as it sees fit, if the result will be "the same", and the reordering is perceived to provide the results more efficiently.

The presence of your function with side effects (causing a delay, then returning GETDATE()) isn't something that would normally be considered when deciding whether to reorder operations.

One obvious way in which the query may be reordered is that it may decide to start working on result set Ti+1 before it has fully created result set Ti - it may be more efficient to do this than to fully construct Ti first, since the new rows are definitely already in memory and have been accessed recently.

这篇关于CTE真正如何运作?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆