What is the best way to group, aggregate, and sum tree data?

Question

Given a self-referencing table

Item 
-------------
Id (pk)
ParentId (fk)

and a related table with associated values

ItemValue
-------------
ItemId (fk)
Amount

and some sample data

Item                       ItemValues 
Id      ParentId           ItemId      Amount
--------------------       ----------------------
1       null               1           10
2       1                  3           40
3       1                  3           20
4       2                  4           10
5       2                  5           30
6       null
7       6
8       7
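
To experiment with the sample data above, a setup script along these lines could be used. It is only a sketch assuming SQL Server 2008 or later: the column types and constraints are assumptions, and the value table is named ItemValues to match the sample data heading.

```sql
-- Assumed schema: int keys and an int Amount; adjust the types to the real tables.
create table Item (
    Id       int not null primary key,
    ParentId int null references Item (Id)   -- self-reference; null marks a root
);

create table ItemValues (
    ItemId int not null references Item (Id),
    Amount int not null
);

-- Rows copied from the sample data table.
insert into Item (Id, ParentId) values
    (1, null), (2, 1), (3, 1), (4, 2), (5, 2), (6, null), (7, 6), (8, 7);

insert into ItemValues (ItemId, Amount) values
    (1, 10), (3, 40), (3, 20), (4, 10), (5, 30);
```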

I need a stored procedure that takes an Item.Id and returns its direct children, each with the sum of all ItemValue.Amount values for that child and all of its descendants, all the way down the tree.

For example, if 1 is passed in, the tree is 2, 3, 4, 5 and the direct children are 2 and 3, so the output would be

 ItemId    Amount
 ------------------
 2         40     (values from ItemIds 4 & 5)
 3         60     (values from ItemId 3)

What sort of approach should be used to achieve this behavior?

I am considering using a CTE, but I am wondering if there is a better/faster approach.

Recommended answer

A recursive CTE like this would work, assuming your hierarchy doesn't go too deep:

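declare @ParentId int;
set @ParentId = 1;

;with 
  Recurse as (
    -- Anchor: the direct children of @ParentId, each tagged with its own Id
    select 
      a.Id as DirectChildId
    , a.Id
    from Item a 
    where a.ParentId = @ParentId
    union all
    -- Recursive step: descendants inherit the DirectChildId of their ancestor
    select
      b.DirectChildId
    , a.Id
    from Item a 
    join Recurse b on b.Id = a.ParentId
    )
-- Sum the amounts of every node under each direct child (the child itself included)
select
  a.DirectChildId, sum(b.Amount) as Amount
from Recurse a
left join ItemValues b on a.Id = b.ItemId
group by
  a.DirectChildId;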
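
Since the question asks for a stored procedure, the query above could be wrapped along these lines. The procedure name dbo.GetChildTotals is hypothetical, and the MAXRECURSION hint is an optional addition rather than part of the original answer: SQL Server stops a recursive CTE after 100 levels by default, which is one reason for the "not too deep" caveat.

```sql
-- Hypothetical procedure name; the body is the recursive CTE from the answer.
create procedure dbo.GetChildTotals
    @ParentId int
as
begin
    set nocount on;

    with Recurse as (
        select a.Id as DirectChildId, a.Id
        from Item a
        where a.ParentId = @ParentId
        union all
        select b.DirectChildId, a.Id
        from Item a
        join Recurse b on b.Id = a.ParentId
    )
    select a.DirectChildId as ItemId, sum(b.Amount) as Amount
    from Recurse a
    left join ItemValues b on b.ItemId = a.Id
    group by a.DirectChildId
    -- Raise the default limit of 100 recursion levels; 0 would remove the limit entirely.
    option (maxrecursion 1000);
end;
```

With the sample data, exec dbo.GetChildTotals @ParentId = 1; should return the two rows shown in the question (2 with 40, 3 with 60). A direct child with no values anywhere beneath it comes back with a null Amount because of the left join.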

A non-CTE method would require some form of iteration, cursor-based or otherwise. Since this is a stored procedure, that is a possibility, and if there is a lot of data to recurse through, it would probably scale better, so long as you slice the data appropriately.
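
One way such an iteration might look is a loop that collects the tree one level per pass into a working table. This is a sketch of the general idea, not the answerer's own code; the table variable @Tree, its Depth column, and the @Rows counter are illustrative names.

```sql
declare @ParentId int;
set @ParentId = 1;

-- Working set: every descendant found so far, tagged with its direct child.
-- The primary key also makes a cyclic hierarchy fail fast instead of looping forever.
declare @Tree table (
    DirectChildId int not null,
    Id            int not null primary key,
    Depth         int not null
);

declare @Depth int;
declare @Rows int;
set @Depth = 0;

-- Level 0: the direct children of @ParentId.
insert into @Tree (DirectChildId, Id, Depth)
select i.Id, i.Id, @Depth
from Item i
where i.ParentId = @ParentId;

set @Rows = @@rowcount;

-- Add one level per pass until a pass finds no new rows.
while @Rows > 0
begin
    set @Depth = @Depth + 1;

    insert into @Tree (DirectChildId, Id, Depth)
    select t.DirectChildId, i.Id, @Depth
    from Item i
    join @Tree t on t.Id = i.ParentId
    where t.Depth = @Depth - 1;   -- only expand the level added last time

    set @Rows = @@rowcount;
end;

-- Same aggregation as the CTE version.
select t.DirectChildId as ItemId, sum(v.Amount) as Amount
from @Tree t
left join ItemValues v on v.ItemId = t.Id
group by t.DirectChildId;
```

Each pass is a single set-based insert, so the number of statements executed grows with the depth of the tree rather than with the number of rows. For large trees, a temporary table with its own indexes may be a better fit than a table variable.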

If the clustered index is on Id, add a non-clustered index on ParentId. Because a non-clustered index also carries the clustering key (Id), it acts as a covering index and will satisfy the initial seek without a bookmark lookup. The clustered index will then help with the recursive join.

If the clustered index is already on ParentId instead, add a non-clustered index on Id. Together, they will be virtually equivalent to the above. For ItemValues, you may want an index on (ItemId) INCLUDE (Amount), if the actual table is wider than this.
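
As a sketch, the indexes described above for the first case (clustered index on Item.Id) might be created like this. The index names are hypothetical.

```sql
-- Case 1: the clustered index (primary key) is on Item.Id.
-- Every non-clustered index row carries the clustering key Id, so this index
-- covers "where ParentId = @ParentId" together with the Id the query returns.
create nonclustered index IX_Item_ParentId
    on Item (ParentId);

-- For a wider ItemValues table: seek on ItemId and read Amount from the index itself.
create nonclustered index IX_ItemValues_ItemId_Amount
    on ItemValues (ItemId)
    include (Amount);
```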
