Partial sum between different records using SQL 2008
Question
I'm trying to solve this problem in SQL Server 2008. I have a table like this:
DECLARE @table TABLE (
TimeStamp DATETIME,
val INT,
typerow VARCHAR(3)
);
INSERT INTO @table(TimeStamp, val, typerow)
VALUES
('2018-06-03 13:30:00.000', 6, 'out'),
('2018-06-03 14:10:00.000', 8, 'out'),
('2018-06-03 14:30:00.000', 3, 'in'),
('2018-06-03 15:00:00.000', 9, 'out'),
('2018-06-03 15:30:00.000', 4, 'out'),
('2018-06-03 16:00:00.000', 2, 'out'),
('2018-06-03 17:05:00.000', 8, 'in'),
('2018-06-03 17:30:00.000', 0, 'out'),
('2018-06-03 18:15:00.000', 7, 'out'),
('2018-06-03 18:30:00.000', 1, 'in'),
('2018-06-03 19:00:00.000', 5, 'out')
This table contains distinct TimeStamp values, each with an associated value val, and a binary column typerow ('in'/'out').
Considering @table sorted by TimeStamp ascending, I need a way to get a table in which every row with typerow = 'in' contains, in the val column, its current value plus the sum of all previous val values where typerow = 'out', going back to the previous typerow = 'in' record. Naturally, for the first record with typerow = 'in', the sum extends back to the first record of @table:
2018-06-03 13:30:00.000 6 out
2018-06-03 14:10:00.000 8 out
2018-06-03 14:30:00.000 17 in -- 6 + 8 + 3
2018-06-03 15:00:00.000 9 out
2018-06-03 15:30:00.000 4 out
2018-06-03 16:00:00.000 2 out
2018-06-03 17:05:00.000 23 in -- 9 + 4 + 2 + 8
2018-06-03 17:30:00.000 0 out
2018-06-03 18:15:00.000 7 out
2018-06-03 18:30:00.000 8 in -- 0 + 7 + 1
2018-06-03 19:00:00.000 5 out
Since @table will have hundreds of records like this, my first idea is to create a new id column and assign the same id to all records involved in the same summation (maybe this can be done with a recursive CTE?) to get this result:
2018-06-03 13:30:00.000 6 out 1
2018-06-03 14:10:00.000 8 out 1
2018-06-03 14:30:00.000 17 in 1
2018-06-03 15:00:00.000 9 out 2
2018-06-03 15:30:00.000 4 out 2
2018-06-03 16:00:00.000 2 out 2
2018-06-03 17:05:00.000 23 in 2
2018-06-03 17:30:00.000 0 out 3
2018-06-03 18:15:00.000 7 out 3
2018-06-03 18:30:00.000 8 in 3
2018-06-03 19:00:00.000 5 out don't care for this element
and then add a new column like
SELECT SUM(vals) OVER (PARTITION BY id ORDER BY id) AS partial_sum
updating the val column with partial_sum where typerow = 'in'.
I don't know how to create the new id column correctly, or whether this is a good solution given my SQL Server version.
Thanks in advance for your support, any suggestion is appreciated.
Recommended answer
This is a gaps-and-islands problem, where each island ends with an "in" record, and you want to sum the values in each island.
Here is one approach that uses the count of following "in"s to define the group, and then a window sum over each group.
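select timestamp,
       case when typerow = 'out'
            then val
            -- 'in' row: running sum up to this row, which is the last row of its island
            else sum(val) over(partition by grp order by timestamp)
       end as val,
       typerow
from (
    select t.*,
           -- grp = number of 'in' rows at or after this row, so each island ends with its 'in' row
           sum(case when typerow = 'in' then 1 else 0 end) over(order by timestamp desc) grp
    from @table t
) t
order by timestamp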
timestamp | val | typerow
:---------------------- | --: | :------
2018-06-03 13:30:00.000 | 6 | out
2018-06-03 14:10:00.000 | 8 | out
2018-06-03 14:30:00.000 | 17 | in
2018-06-03 15:00:00.000 | 9 | out
2018-06-03 15:30:00.000 | 4 | out
2018-06-03 16:00:00.000 | 2 | out
2018-06-03 17:05:00.000 | 23 | in
2018-06-03 17:30:00.000 | 0 | out
2018-06-03 18:15:00.000 | 7 | out
2018-06-03 18:30:00.000 | 8 | in
2018-06-03 19:00:00.000 | 5 | out
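
One caveat about versions: ORDER BY inside OVER() for aggregates such as SUM was introduced in SQL Server 2012, so the query above may not run on SQL Server 2008 as written. The following is a minimal sketch of the same idea for 2008, not part of the original answer. It replaces the window sums with OUTER APPLY and a correlated subquery; the aliases x, p, o and the column name out_sum are only illustrative. It assumes @table is declared and populated in the same batch, as in the question.

```sql
-- Sketch for SQL Server 2008: no windowed ORDER BY needed.
-- Assumes @table (TimeStamp, val, typerow) exists in the same batch.
SELECT t.TimeStamp,
       CASE WHEN t.typerow = 'in'
            THEN t.val + ISNULL(o.out_sum, 0)   -- own value plus preceding 'out' values
            ELSE t.val                          -- 'out' rows are returned unchanged
       END AS val,
       t.typerow
FROM @table AS t
OUTER APPLY (
    -- Sum the 'out' rows that precede the current row and are not
    -- separated from it by another 'in' row.
    SELECT SUM(x.val) AS out_sum
    FROM @table AS x
    WHERE x.typerow = 'out'
      AND x.TimeStamp < t.TimeStamp
      AND NOT EXISTS (SELECT 1
                      FROM @table AS p
                      WHERE p.typerow = 'in'
                        AND p.TimeStamp > x.TimeStamp
                        AND p.TimeStamp < t.TimeStamp)
) AS o
ORDER BY t.TimeStamp;
```

On the sample data this should give 17, 23 and 8 for the three 'in' rows. The correlated subquery rescans the table for each row, which is likely acceptable for a few hundred records but would need an index on TimeStamp, or a different approach, for much larger tables.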