CTE与T-SQL循环用于确定对象层次结构的深度 [英] CTE vs. T-SQL loop for determining depth of object hierarchy

查看:62
本文介绍了CTE与T-SQL循环用于确定对象层次结构的深度的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含约70,000行和两列的表(均为 VARCHAR(16)): id parent_id

I have a table consisting of about 70,000 rows and two columns (both VARCHAR(16)): id and parent_id.

我想填充一个深度列,该列显示特定记录的距离

I'd like to populate a 'depth' column that shows how far a particular record is from the 'root' node.

例如

id,parent_id,depth
A,NULL,0
B,A,1
C,A,1
D,B,2
E,D,3

等。

我首先编写了一个基于在此答案上针对类似问题:

I started by writing a query based on this answer to a similar question:

WITH myCTE(id, depth) AS
(
    SELECT id, 0 FROM objects where id = 'A'
    UNION ALL
    SELECT objects.id, depth + 1 FROM myCTE JOIN objects ON objects.parent_id = myCTE.id
)
SELECT id, depth FROM myCTE

使用我的数据集(约80,000行),上面的代码要花几乎两个小时才能执行!

With my dataset (~80,000 rows) the above takes almost two hours to execute!

然后我将查询写为循环d的性能要好得多:

I then wrote my query as a loop and got far better performance:

ALTER TABLE objects ADD depth INT NULL
DECLARE @counter int
DECLARE @total int
SET @counter = 0
UPDATE objects SET depth = 0 WHERE id = 'A'

SELECT @total = COUNT(*) FROM objects WHERE depth IS NULL

WHILE (@total > 0)
BEGIN
    UPDATE objects SET depth = @counter + 1 WHERE parent_id IN (
        SELECT id FROM objects WHERE depth = @counter
    )
    SELECT @total = COUNT(*) FROM objects WHERE depth IS NULL
    SET @counter = @counter + 1
END

上面的代码只需要几分钟(它的好处是将结果添加到现有表中)

The above code only takes a couple of minutes (and it has the benefit of adding the results to the existing table)

我的问题是我的结果是使用CTE解决此问题的典型结果,还是我忽略了一些可以解释的问题?索引,也许? (我现在不在桌子上)

My question is whether my results are typical of using a CTE for this problem or whether there is something I have overlooked that might explain it? Indexes, maybe? (I have none on the table right now)

推荐答案

您将需要 parent_id上的索引。 CTE的递归部分将始终使用嵌套循环连接,并且不会渗透连接提示(结果添加到堆栈假脱机,并且按LIFO顺序逐行处理行)

You would need an index on parent_id. The recursive part of a CTE will always use a nested loops join and is impervious to join hints (Results are added to a stack spool and the rows are processed one by one in LIFO order)

没有索引在 parent_id 上,它将需要在嵌套循环的内侧多次扫描表。性能将随着行数成倍下降。

Without an index on parent_id it will need to scan the table multiple times on the inner side of the nested loops. Performance will degrade exponentially with number of rows.

您无需递归的查询将能够使用不同的联接类型(哈希或合并),每个联接类型仅对表进行两次扫描递归。在这种情况下,很可能是哈希联接,因为您没有避免排序的有用索引。

Your query without recursion will be able to use different join types (hash or merge) that only scan the table twice for each level of recursion. Most likely hash join in this case as you have no useful indexes that would avoid a sort.

这篇关于CTE与T-SQL循环用于确定对象层次结构的深度的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆