计算SQL Server中的运行总计 [英] Calculate a Running Total in SQL Server

查看:53
本文介绍了计算SQL Server中的运行总计的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

想象一下下表(称为 TestTable ):

Imagine the following table (called TestTable):

id     somedate    somevalue
--     --------    ---------
45     01/Jan/09   3
23     08/Jan/09   5
12     02/Feb/09   0
77     14/Feb/09   7
39     20/Feb/09   34
33     02/Mar/09   6

我想要一个按日期顺序返回运行总计的查询,例如:

I would like a query that returns a running total in date order, like:

id     somedate    somevalue  runningtotal
--     --------    ---------  ------------
45     01/Jan/09   3          3
23     08/Jan/09   5          8
12     02/Feb/09   0          8
77     14/Feb/09   7          15  
39     20/Feb/09   34         49
33     02/Mar/09   6          55

我知道有我对这种使用aggregating-set-statement技巧的方法特别感兴趣:

I am particularly interested in this sort of method that uses the aggregating-set-statement trick:

INSERT INTO @AnotherTbl(id, somedate, somevalue, runningtotal) 
   SELECT id, somedate, somevalue, null
   FROM TestTable
   ORDER BY somedate

DECLARE @RunningTotal int
SET @RunningTotal = 0

UPDATE @AnotherTbl
SET @RunningTotal = runningtotal = @RunningTotal + somevalue
FROM @AnotherTbl

...这是非常有效的,但是我听说周围存在一些问题,因为您不一定能保证 UPDATE 语句将以正确的顺序处理行.也许我们可以获得有关该问题的明确答案.

... this is very efficient but I have heard there are issues around this because you can't necessarily guarantee that the UPDATE statement will process the rows in the correct order. Maybe we can get some definitive answers about that issue.

但是也许人们可以提出其他建议?

But maybe there are other ways that people can suggest?

edit:现在使用 SqlFiddle ,其中包含设置和更新技巧"示例以上

edit: Now with a SqlFiddle with the setup and the 'update trick' example above

推荐答案

更新,如果您正在运行SQL Server 2012,请参见:

Update, if you are running SQL Server 2012 see: https://stackoverflow.com/a/10309947

问题在于Over子句的SQL Server实现是

The problem is that the SQL Server implementation of the Over clause is somewhat limited.

Oracle(和ANSI-SQL)允许您执行以下操作:

Oracle (and ANSI-SQL) allow you to do things like:

 SELECT somedate, somevalue,
  SUM(somevalue) OVER(ORDER BY somedate 
     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
          AS RunningTotal
  FROM Table

SQL Server无法为您解决此问题.我的直觉告诉我,这是极少数情况下光标最快的情况之一,尽管我必须对大结果进行一些基准测试.

SQL Server gives you no clean solution to this problem. My gut is telling me that this is one of those rare cases where a cursor is the fastest, though I will have to do some benchmarking on big results.

更新技巧很方便,但我觉得它相当脆弱.看来,如果您要更新一个完整的表,它将按照主键的顺序进行.因此,如果将日期设置为升序的主键,则可能是安全的.但是,您所依赖的是未公开的SQL Server实现细节(同样,如果查询最终由两个proc执行,我想知道会发生什么,请参阅:MAXDOP):

The update trick is handy but I feel its fairly fragile. It seems that if you are updating a full table then it will proceed in the order of the primary key. So if you set your date as a primary key ascending you will probably be safe. But you are relying on an undocumented SQL Server implementation detail (also if the query ends up being performed by two procs I wonder what will happen, see: MAXDOP):

完整的工作示例:

drop table #t 
create table #t ( ord int primary key, total int, running_total int)

insert #t(ord,total)  values (2,20)
-- notice the malicious re-ordering 
insert #t(ord,total) values (1,10)
insert #t(ord,total)  values (3,10)
insert #t(ord,total)  values (4,1)

declare @total int 
set @total = 0
update #t set running_total = @total, @total = @total + total 

select * from #t
order by ord 

ord         total       running_total
----------- ----------- -------------
1           10          10
2           20          30
3           10          40
4           1           41

您要求提供基准,这是最低要求.

You asked for a benchmark this is the lowdown.

最快的SAFE方式是游标,它比交叉联接的相关子查询快一个数量级.

The fastest SAFE way of doing this would be the Cursor, it is an order of magnitude faster than the correlated sub-query of cross-join.

绝对最快的方法是UPDATE技巧.我唯一关心的是,我不确定在所有情况下更新都会以线性方式进行.查询中没有明确说明的内容.

The absolute fastest way is the UPDATE trick. My only concern with it is that I am not certain that under all circumstances the update will proceed in a linear way. There is nothing in the query that explicitly says so.

最底线,对于生产代码,我将使用光标.

Bottom line, for production code I would go with the cursor.

测试数据:

create table #t ( ord int primary key, total int, running_total int)

set nocount on 
declare @i int
set @i = 0 
begin tran
while @i < 10000
begin
   insert #t (ord, total) values (@i,  rand() * 100) 
    set @i = @i +1
end
commit

测试1:

SELECT ord,total, 
    (SELECT SUM(total) 
        FROM #t b 
        WHERE b.ord <= a.ord) AS b 
FROM #t a

-- CPU 11731, Reads 154934, Duration 11135 

测试2:

SELECT a.ord, a.total, SUM(b.total) AS RunningTotal 
FROM #t a CROSS JOIN #t b 
WHERE (b.ord <= a.ord) 
GROUP BY a.ord,a.total 
ORDER BY a.ord

-- CPU 16053, Reads 154935, Duration 4647

测试3:

DECLARE @TotalTable table(ord int primary key, total int, running_total int)

DECLARE forward_cursor CURSOR FAST_FORWARD 
FOR 
SELECT ord, total
FROM #t 
ORDER BY ord


OPEN forward_cursor 

DECLARE @running_total int, 
    @ord int, 
    @total int
SET @running_total = 0

FETCH NEXT FROM forward_cursor INTO @ord, @total 
WHILE (@@FETCH_STATUS = 0)
BEGIN
     SET @running_total = @running_total + @total
     INSERT @TotalTable VALUES(@ord, @total, @running_total)
     FETCH NEXT FROM forward_cursor INTO @ord, @total 
END

CLOSE forward_cursor
DEALLOCATE forward_cursor

SELECT * FROM @TotalTable

-- CPU 359, Reads 30392, Duration 496

测试4:

declare @total int 
set @total = 0
update #t set running_total = @total, @total = @total + total 

select * from #t

-- CPU 0, Reads 58, Duration 139

这篇关于计算SQL Server中的运行总计的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆