Partial sum between different records using SQL 2008


Problem description

I'm trying to solve this issue in SQL Server 2008. I have a table like this:

DECLARE @table TABLE (
    TimeStamp        DATETIME,
    val              INT,
    typerow          VARCHAR(3)
);

INSERT INTO @table(TimeStamp, val, typerow)
VALUES
   ('2018-06-03 13:30:00.000', 6, 'out'),
   ('2018-06-03 14:10:00.000', 8, 'out'),
   ('2018-06-03 14:30:00.000', 3, 'in'),
   ('2018-06-03 15:00:00.000', 9, 'out'),
   ('2018-06-03 15:30:00.000', 4, 'out'),
   ('2018-06-03 16:00:00.000', 2, 'out'),
   ('2018-06-03 17:05:00.000', 8, 'in'),
   ('2018-06-03 17:30:00.000', 0, 'out'),
   ('2018-06-03 18:15:00.000', 7, 'out'),
   ('2018-06-03 18:30:00.000', 1, 'in'),
   ('2018-06-03 19:00:00.000', 5, 'out')

This table contains distinct TimeStamp values, each with an associated value val and a two-valued column typerow ('in'/'out').

With @table sorted by TimeStamp ascending, I need a way to get a table in which every row with typerow = 'in' holds, in its val column, its current value plus the sum of val over all preceding rows with typerow = 'out', going back to the previous typerow = 'in' record. For the first record with typerow = 'in', the sum extends back to the first record of @table:

2018-06-03 13:30:00.000    6      out
2018-06-03 14:10:00.000    8      out
2018-06-03 14:30:00.000    17     in  -- 6 + 8 + 3
2018-06-03 15:00:00.000    9      out
2018-06-03 15:30:00.000    4      out
2018-06-03 16:00:00.000    2      out
2018-06-03 17:05:00.000    23     in  -- 9 + 4 + 2 + 8
2018-06-03 17:30:00.000    0      out
2018-06-03 18:15:00.000    7      out
2018-06-03 18:30:00.000    8      in  -- 0 + 7 + 1
2018-06-03 19:00:00.000    5      out

Since @table will have hundreds of records of this kind, my first idea is to create a new id column and assign the same id to all records involved in the same summation (maybe this is possible with a recursive CTE?) to get this result:

2018-06-03 13:30:00.000    6      out    1
2018-06-03 14:10:00.000    8      out    1
2018-06-03 14:30:00.000    17     in     1
2018-06-03 15:00:00.000    9      out    2
2018-06-03 15:30:00.000    4      out    2
2018-06-03 16:00:00.000    2      out    2
2018-06-03 17:05:00.000    23     in     2
2018-06-03 17:30:00.000    0      out    3
2018-06-03 18:15:00.000    7      out    3
2018-06-03 18:30:00.000    8      in     3
2018-06-03 19:00:00.000    5      out    don't care for this element

and then have a new column such as

SELECT SUM(val) OVER (PARTITION BY id ORDER BY id) AS partial_sum

updating the val column with partial_sum where typerow = 'in'. I don't know how to create the new id column correctly, or whether this is a good solution given my SQL Server version.

Thanks in advance for your support; any suggestion is appreciated.

Recommended answer

This is a gaps-and-islands problem, where each island ends with an "in" record, and you want to sum the values in each island.

Here is one approach: a running count of "in" records taken in descending timestamp order (that is, the number of "in" records at or after each row) defines the group, and a window sum over each group gives the total.

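select timestamp,
    -- 'out' rows keep their own value
    case when typerow = 'out'
        then val
        -- the 'in' row is the last row of its group, so the running sum is the group total
        else sum(val) over(partition by grp order by timestamp)
    end as val,
    typerow
from (
    select t.*,
        -- count of 'in' rows at or after this row: each 'in' row and the
        -- 'out' rows that precede it share the same grp value
        sum(case when typerow = 'in' then 1 else 0 end) over(order by timestamp desc) grp
    from @table t
) t
order by timestamp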

Demo on DB Fiddle:


timestamp               | val | typerow
:---------------------- | --: | :------
2018-06-03 13:30:00.000 |   6 | out    
2018-06-03 14:10:00.000 |   8 | out    
2018-06-03 14:30:00.000 |  17 | in     
2018-06-03 15:00:00.000 |   9 | out    
2018-06-03 15:30:00.000 |   4 | out    
2018-06-03 16:00:00.000 |   2 | out    
2018-06-03 17:05:00.000 |  23 | in     
2018-06-03 17:30:00.000 |   0 | out    
2018-06-03 18:15:00.000 |   7 | out    
2018-06-03 18:30:00.000 |   8 | in     
2018-06-03 19:00:00.000 |   5 | out    
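
One caveat for the version named in the question: an ORDER BY inside the OVER clause of an aggregate such as SUM is only accepted from SQL Server 2012 onward, so the running sums above would not run on SQL Server 2008 itself. The following is a minimal sketch of the same grouping idea that avoids that syntax. It assumes @table is declared and populated in the same batch as in the question and that TimeStamp values are distinct, as the question states; the alias names (g, t2) are only illustrative.

```sql
-- Sketch for SQL Server 2008: the group number comes from a correlated
-- subquery instead of a running SUM() OVER (ORDER BY ...).
SELECT g.[TimeStamp],
       CASE WHEN g.typerow = 'in'
            -- SUM() OVER (PARTITION BY ...) without ORDER BY is valid on 2008;
            -- it returns the total of the whole group (the 'in' row plus the
            -- 'out' rows before it)
            THEN SUM(g.val) OVER (PARTITION BY g.grp)
            ELSE g.val
       END AS val,
       g.typerow
FROM (
    SELECT t.[TimeStamp], t.val, t.typerow,
           -- grp = number of 'in' rows at or after this row
           (SELECT COUNT(*)
            FROM @table AS t2
            WHERE t2.typerow = 'in'
              AND t2.[TimeStamp] >= t.[TimeStamp]) AS grp
    FROM @table AS t
) AS g
ORDER BY g.[TimeStamp];
```

With the sample data this should give 17, 23 and 8 on the three 'in' rows and leave every 'out' row unchanged, including the trailing 19:00 row, which falls in group 0. The correlated subquery scans the table once per row, which is likely acceptable for the few hundred records mentioned in the question but may need revisiting for much larger tables.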
