提高递归CTE的性能 [英] Improve performance of Recursive CTE

查看:59
本文介绍了提高递归CTE的性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个非常简单的递归CTE,它在包含大约4百万条记录的单个源表( REP.INVENTMOVEMENTS )上运行.该表已被索引很多.

I have a pretty simple recursive CTE running on a single sourcetable (REP.INVENTMOVEMENTS) containing around 4mln records. The table is pretty heavily indexed.

with 
dataset as (
    select imv.sourceBatch, 
           imv.targetBatch,
           imv.sourceDataArea, 
           imv.targetDataArea,
           sum(Weight) as Weight
    from REP.INVENTMOVEMENTS imv 
    where imv.sourceBatch <> ''
    Group By imv.sourceBatch, 
             imv.targetBatch,
             imv.sourceDataArea, 
             imv.targetDataArea
    ),
result as (
    select  targetBatch as Batch,
            targetDataArea as DataArea, 
            sourceBatch, 
            targetBatch,   
            sourceDataArea,
            targetDataArea, 
            1 as level,
            Weight
    from dataset
    where sourceBatch <> targetBatch

    union all 

    select result.Batch,
           result.DataArea, 
           dataset.sourceBatch, 
           dataset.targetBatch, 
           dataset.sourceDataArea,
           dataset.targetDataArea, 
           result.level + 1 as level,
           dataset.Weight
    from dataset inner join result on dataset.targetBatch       = result.sourceBatch 
                                  and dataset.targetDataArea    = result.sourceDataArea
                                  and dataset.targetBatch       <> dataset.sourceBatch
    )

select * from result
union all
select      targetBatch as Batch,
            targetDataArea as DataArea, 
            sourceBatch, 
            targetBatch,   
            sourceDataArea,
            targetDataArea, 
            0 as level,
            Weight
    from dataset
    where sourceBatch = targetBatch
;

在没有选择的情况下运行初始查询会花费数据库122秒钟,返回517.947条记录.

running the initial query without selections takes the database 122 seconds returning 517.947 records.

运行相同的查询返回一个批处理所花费的数据库少于第二个查询所返回的5条记录.

Running that same query returning a single batch takes the database less then a second returning 5 records.

但是,如果我对1个批处理运行CTE,则数据库需要28秒才能完成2次递归并返回7条记录.

But if I run the CTE with a selection on 1 batch it takes the database 28 seconds to complete 2 recursions and return 7 records.

我需要用此视图的结果填充一张表,以获取15万个批次.因此,如果所有人都需要半分钟才能完成,那么将需要52天才能完成该任务.

I need to fill a table with the result from this view for 150k batches. so if all of them take half a minute to complete it would take 52 days to finish that task.

这是执行计划

执行计划

只是为了阐明我的目标.批次可以合并为新批次,因此2个或更多源批次可以创建一个新批次.在这种合并中创建的两个批处理可用于创建新的批处理...等等.

Just to clarify my objective. Batches can merge into new batches so 2 or more source batches can create an new batch. Two batches created in such a merge can be used to create a new batch... and so forth.

我希望能够选择一个批次并找到用于创建此新批次的所有批次.

I want to be able to select a batch and find all the batches used to create this new batch.

请注意,一个批次可以在其他多个批次中使用.

Please take into account that one batch can be used in multiple other batches.

希望您能在这里为我提供帮助.

I hope you can help me out here.

推荐答案

我已经解决了这个问题,方法是创建一个内部表,并用执行递归查询所需的数据集填充该内部表.

I've solved the issue by creating an internal table and filling that with the dataset needed to execute the recursive query.

DECLARE @BatchSequence as table(Batch           nvarchar(100),
                                SourceBatch nvarchar(100),
                                TargetBatch nvarchar(100),
                                Weight          decimal(18,3));

insert into @BatchSequence 
select ReportingBatch, SourceBatch,TargetBatch, SUM(Weight) as Weight
from REP.INVENTMOVEMENTS
WHERE sourceBatch <> ''
Group By ReportingBatch, SourceBatch,TargetBatch;

with 
result as (
    select  targetBatch as Batch,
            sourceBatch, 
            targetBatch,   
            1 as level,
            Weight
    from @BatchSequence dataset
    where sourceBatch <> targetBatch

    union all 

    select result.Batch,
           dataset.sourceBatch, 
           dataset.targetBatch, 
           result.level + 1 as level,
           dataset.Weight
    from @BatchSequence dataset inner join result on dataset.targetBatch        = result.sourceBatch 
                                  and dataset.targetBatch       <> dataset.sourceBatch
    )

这将在1分钟内返回25万条记录.

This returns 250k records in 1 minute.

希望这可以帮助其他人.

Hope this can help someone else.

这篇关于提高递归CTE的性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆