为什么 CTE(递归)没有并行化(MAXDOP=8)? [英] Why CTE (recursive) is not parallelized (MAXDOP=8)?

查看：47 发布时间：2021/6/15 19:28:14 sql sql-server performance sql-server-2008

本文介绍了为什么 CTE(递归)没有并行化(MAXDOP=8)?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我们有相当大的机器 100GB+ 内存和 8+ 个内核.服务器范围 MAXDOP=8.

We have rather Big machine 100GB+ memory and 8+ cores in it. Server wide MAXDOP=8.

T_SEQ_FF rowcount = 61692209, size = 2991152 KB

UPD 1:表T_SEQ_FF有两个索引:

1) create index idx_1 on T_SEQ_FF (first_num)
2) create index idx_2 on T_SEQ_FF (second_num)

表 T_SEQ_FF 有 first_num、second_num 对 应该在 cte 之后提供序列的数字:

Table T_SEQ_FF have first_num, second_num pairs of nums that should provide a sequence after cte:

;with first_entity as ( 
    select first_num from  T_SEQ_FF a  where not exists (select 1 from  T_SEQ_FF b  where a.first_num = b.second_num) 
) ,
cte as ( 
select a.first_num, a.second_num, a.first_num as first_key, 1 as sequence_count 
from  T_SEQ_FF a  inner join first_entity b on a.first_num = b.first_num 
union all 
select a.first_num, a.second_num, cte.first_key, cte.sequence_count + 1 
from  T_SEQ_FF a  
inner join cte on a.first_num = cte.second_num 
) 
select * 
from cte 
option (maxrecursion 0);

但是当我运行这个查询时 - 我只看到没有并行的串行查询计划.如果我从上面的查询中删除 CTE 的第二部分:

But when I run this query - I only see serial query plan without Parallelism. If I remove 2nd part of CTE from query above:

union all 
    select a.first_num, a.second_num, cte.first_key, cte.sequence_count + 1 
    from  T_SEQ_FF a  
    inner join cte on a.first_num = cte.second_num

然后我可以看到查询计划使用重新分区和收集流变得并行化.

then I could see that query plan becomes Parallelized using Repartition and Gather Streams.

所以我可以总结一下，这是因为递归 CTE SQL Server 在处理此查询时没有使用并行.

So I can summarize that it is because of recurisve CTE SQL Server is not using Parallelism when processing this query.

我相信在拥有大量免费资源的大型机器上，并行应该有助于更快地完成查询.

I believe that on such big machine with tons of free resources Parallelism should help to finish query faster.

现在它运行了大约 40-50 分钟.

For now it runs for ~40-50mins.

您能否建议我们如何使用尽可能多的资源来更快地完成查询?

Could you advice how to use as much resources as we can to finish the query faster?

CTE 是唯一的选择，因为我们需要从 first_num - second_num 对中填充序列，并且这些序列可以是任意长度.

CTE is the only option because we need to populate sequences from first_num - second_num pairs and those sequences could be of any length.

推荐答案

我会尝试重写 CTE 以删除其中一个步骤，即

I would try rewriting the CTE to remove one of the steps i.e.

;cte as ( 
select a.first_num, a.second_num, a.first_num as first_key, 1 as sequence_count 
from  T_SEQ_FF a  where not exists (select 1 from  T_SEQ_FF b  where a.first_num = b.second_num) 
union all 
select a.first_num, a.second_num, cte.first_key, cte.sequence_count + 1 
from  T_SEQ_FF a  
inner join cte on a.first_num = cte.second_num 
) 
select * 
from cte 
option (maxrecursion 0);

如果只有一个根元素，最好将其作为变量传递到查询中，以便查询优化器可以使用该值.

If there is only one root element it would be better to pass this into the query as a variable so the value can be used by the query optimizer.

要尝试的另一件事是更改查询以获取没有子查询的根元素，即 second_num 为 null 或 first_num = second_num.

Another thing to try is change the query to get the root elements without a subquery i.e. second_num is null or first_num = second_num.

这篇关于为什么 CTE(递归)没有并行化(MAXDOP=8)?的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

为什么 CTE(递归)没有并行化(MAXDOP=8)? [英] Why CTE (recursive) is not parallelized (MAXDOP=8)?

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录关闭

为什么 CTE(递归)没有并行化(MAXDOP=8)? [英] Why CTE (recursive) is not parallelized (MAXDOP=8)?

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录 关闭

登录关闭