在 SQL Server 中计算运行总计 [英] Calculate a Running Total in SQL Server

查看:37
本文介绍了在 SQL Server 中计算运行总计的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

想象下表(称为TestTable):

id     somedate    somevalue
--     --------    ---------
45     01/Jan/09   3
23     08/Jan/09   5
12     02/Feb/09   0
77     14/Feb/09   7
39     20/Feb/09   34
33     02/Mar/09   6

我想要一个按日期顺序返回运行总计的查询,例如:

I would like a query that returns a running total in date order, like:

id     somedate    somevalue  runningtotal
--     --------    ---------  ------------
45     01/Jan/09   3          3
23     08/Jan/09   5          8
12     02/Feb/09   0          8
77     14/Feb/09   7          15  
39     20/Feb/09   34         49
33     02/Mar/09   6          55

我知道有 在 SQL Server 2000/2005/2008 中执行此操作的各种方法.

我对这种使用聚合集语句技巧的方法特别感兴趣:

I am particularly interested in this sort of method that uses the aggregating-set-statement trick:

INSERT INTO @AnotherTbl(id, somedate, somevalue, runningtotal) 
   SELECT id, somedate, somevalue, null
   FROM TestTable
   ORDER BY somedate

DECLARE @RunningTotal int
SET @RunningTotal = 0

UPDATE @AnotherTbl
SET @RunningTotal = runningtotal = @RunningTotal + somevalue
FROM @AnotherTbl

...这是非常有效的,但我听说这方面存在一些问题,因为您不一定能保证 UPDATE 语句会以正确的顺序处理行.也许我们可以得到一些关于这个问题的明确答案.

... this is very efficient but I have heard there are issues around this because you can't necessarily guarantee that the UPDATE statement will process the rows in the correct order. Maybe we can get some definitive answers about that issue.

但也许人们可以提出其他建议?

But maybe there are other ways that people can suggest?

现在使用 SqlFiddle 与设置和更新技巧"示例上面

edit: Now with a SqlFiddle with the setup and the 'update trick' example above

推荐答案

更新,如果您运行的是 SQL Server 2012,请参阅:https://stackoverflow.com/a/10309947

Update, if you are running SQL Server 2012 see: https://stackoverflow.com/a/10309947

问题是Over子句的SQL Server实现是有些限制.

The problem is that the SQL Server implementation of the Over clause is somewhat limited.

Oracle(和 ANSI-SQL)允许您执行以下操作:

Oracle (and ANSI-SQL) allow you to do things like:

 SELECT somedate, somevalue,
  SUM(somevalue) OVER(ORDER BY somedate 
     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
          AS RunningTotal
  FROM Table

SQL Server 没有为您提供针对此问题的干净解决方案.我的直觉告诉我,这是游标速度最快的罕见情况之一,尽管我必须对大结果进行一些基准测试.

SQL Server gives you no clean solution to this problem. My gut is telling me that this is one of those rare cases where a cursor is the fastest, though I will have to do some benchmarking on big results.

更新技巧很方便,但我觉得它相当脆弱.似乎如果您正在更新一个完整的表,那么它将按主键的顺序进行.因此,如果您将日期设置为主键升序,则可能 是安全的.但是您依赖于未记录的 SQL Server 实现细节(如果查询最终由两个进程执行,我想知道会发生什么,请参阅:MAXDOP):

The update trick is handy but I feel its fairly fragile. It seems that if you are updating a full table then it will proceed in the order of the primary key. So if you set your date as a primary key ascending you will probably be safe. But you are relying on an undocumented SQL Server implementation detail (also if the query ends up being performed by two procs I wonder what will happen, see: MAXDOP):

完整的工作示例:

drop table #t 
create table #t ( ord int primary key, total int, running_total int)

insert #t(ord,total)  values (2,20)
-- notice the malicious re-ordering 
insert #t(ord,total) values (1,10)
insert #t(ord,total)  values (3,10)
insert #t(ord,total)  values (4,1)

declare @total int 
set @total = 0
update #t set running_total = @total, @total = @total + total 

select * from #t
order by ord 

ord         total       running_total
----------- ----------- -------------
1           10          10
2           20          30
3           10          40
4           1           41

你要求一个基准,这是内幕.

You asked for a benchmark this is the lowdown.

最快的 SAFE 方法是 Cursor,它比交叉连接的相关子查询快一个数量级.

The fastest SAFE way of doing this would be the Cursor, it is an order of magnitude faster than the correlated sub-query of cross-join.

绝对最快的方法是 UPDATE 技巧.我唯一担心的是,我不确定在所有情况下更新都会以线性方式进行.查询中没有任何内容明确说明.

The absolute fastest way is the UPDATE trick. My only concern with it is that I am not certain that under all circumstances the update will proceed in a linear way. There is nothing in the query that explicitly says so.

最重要的是,对于生产代码,我会使用光标.

Bottom line, for production code I would go with the cursor.

测试数据:

create table #t ( ord int primary key, total int, running_total int)

set nocount on 
declare @i int
set @i = 0 
begin tran
while @i < 10000
begin
   insert #t (ord, total) values (@i,  rand() * 100) 
    set @i = @i +1
end
commit

测试 1:

SELECT ord,total, 
    (SELECT SUM(total) 
        FROM #t b 
        WHERE b.ord <= a.ord) AS b 
FROM #t a

-- CPU 11731, Reads 154934, Duration 11135 

测试 2:

SELECT a.ord, a.total, SUM(b.total) AS RunningTotal 
FROM #t a CROSS JOIN #t b 
WHERE (b.ord <= a.ord) 
GROUP BY a.ord,a.total 
ORDER BY a.ord

-- CPU 16053, Reads 154935, Duration 4647

测试 3:

DECLARE @TotalTable table(ord int primary key, total int, running_total int)

DECLARE forward_cursor CURSOR FAST_FORWARD 
FOR 
SELECT ord, total
FROM #t 
ORDER BY ord


OPEN forward_cursor 

DECLARE @running_total int, 
    @ord int, 
    @total int
SET @running_total = 0

FETCH NEXT FROM forward_cursor INTO @ord, @total 
WHILE (@@FETCH_STATUS = 0)
BEGIN
     SET @running_total = @running_total + @total
     INSERT @TotalTable VALUES(@ord, @total, @running_total)
     FETCH NEXT FROM forward_cursor INTO @ord, @total 
END

CLOSE forward_cursor
DEALLOCATE forward_cursor

SELECT * FROM @TotalTable

-- CPU 359, Reads 30392, Duration 496

测试 4:

declare @total int 
set @total = 0
update #t set running_total = @total, @total = @total + total 

select * from #t

-- CPU 0, Reads 58, Duration 139

这篇关于在 SQL Server 中计算运行总计的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆