提高递归CTE的性能 [英] Improve performance of Recursive CTE
问题描述
我有一个非常简单的递归CTE,它在包含大约4百万条记录的单个源表( REP.INVENTMOVEMENTS
)上运行.该表已被索引很多.
I have a pretty simple recursive CTE running on a single sourcetable (REP.INVENTMOVEMENTS
) containing around 4mln records. The table is pretty heavily indexed.
with
dataset as (
select imv.sourceBatch,
imv.targetBatch,
imv.sourceDataArea,
imv.targetDataArea,
sum(Weight) as Weight
from REP.INVENTMOVEMENTS imv
where imv.sourceBatch <> ''
Group By imv.sourceBatch,
imv.targetBatch,
imv.sourceDataArea,
imv.targetDataArea
),
result as (
select targetBatch as Batch,
targetDataArea as DataArea,
sourceBatch,
targetBatch,
sourceDataArea,
targetDataArea,
1 as level,
Weight
from dataset
where sourceBatch <> targetBatch
union all
select result.Batch,
result.DataArea,
dataset.sourceBatch,
dataset.targetBatch,
dataset.sourceDataArea,
dataset.targetDataArea,
result.level + 1 as level,
dataset.Weight
from dataset inner join result on dataset.targetBatch = result.sourceBatch
and dataset.targetDataArea = result.sourceDataArea
and dataset.targetBatch <> dataset.sourceBatch
)
select * from result
union all
select targetBatch as Batch,
targetDataArea as DataArea,
sourceBatch,
targetBatch,
sourceDataArea,
targetDataArea,
0 as level,
Weight
from dataset
where sourceBatch = targetBatch
;
在没有选择的情况下运行初始查询会花费数据库122秒钟,返回517.947条记录.
running the initial query without selections takes the database 122 seconds returning 517.947 records.
运行相同的查询返回一个批处理所花费的数据库少于第二个查询所返回的5条记录.
Running that same query returning a single batch takes the database less then a second returning 5 records.
但是,如果我对1个批处理运行CTE,则数据库需要28秒才能完成2次递归并返回7条记录.
But if I run the CTE with a selection on 1 batch it takes the database 28 seconds to complete 2 recursions and return 7 records.
我需要用此视图的结果填充一张表,以获取15万个批次.因此,如果所有人都需要半分钟才能完成,那么将需要52天才能完成该任务.
I need to fill a table with the result from this view for 150k batches. so if all of them take half a minute to complete it would take 52 days to finish that task.
这是执行计划
只是为了阐明我的目标.批次可以合并为新批次,因此2个或更多源批次可以创建一个新批次.在这种合并中创建的两个批处理可用于创建新的批处理...等等.
Just to clarify my objective. Batches can merge into new batches so 2 or more source batches can create an new batch. Two batches created in such a merge can be used to create a new batch... and so forth.
我希望能够选择一个批次并找到用于创建此新批次的所有批次.
I want to be able to select a batch and find all the batches used to create this new batch.
请注意,一个批次可以在其他多个批次中使用.
Please take into account that one batch can be used in multiple other batches.
希望您能在这里为我提供帮助.
I hope you can help me out here.
推荐答案
我已经解决了这个问题,方法是创建一个内部表,并用执行递归查询所需的数据集填充该内部表.
I've solved the issue by creating an internal table and filling that with the dataset needed to execute the recursive query.
DECLARE @BatchSequence as table(Batch nvarchar(100),
SourceBatch nvarchar(100),
TargetBatch nvarchar(100),
Weight decimal(18,3));
insert into @BatchSequence
select ReportingBatch, SourceBatch,TargetBatch, SUM(Weight) as Weight
from REP.INVENTMOVEMENTS
WHERE sourceBatch <> ''
Group By ReportingBatch, SourceBatch,TargetBatch;
with
result as (
select targetBatch as Batch,
sourceBatch,
targetBatch,
1 as level,
Weight
from @BatchSequence dataset
where sourceBatch <> targetBatch
union all
select result.Batch,
dataset.sourceBatch,
dataset.targetBatch,
result.level + 1 as level,
dataset.Weight
from @BatchSequence dataset inner join result on dataset.targetBatch = result.sourceBatch
and dataset.targetBatch <> dataset.sourceBatch
)
这将在1分钟内返回25万条记录.
This returns 250k records in 1 minute.
希望这可以帮助其他人.
Hope this can help someone else.
这篇关于提高递归CTE的性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!