Group rows by contiguous date ranges for groups of values


Problem description

Consider some table T, ordered by Col1, Col2, Date1, Date2:

Col1    Col2    Date1         Date2          rate
ABC     123     11/4/2014     11/5/2014      -90
ABC     123     11/4/2014     11/6/2014      -55
ABC     123     11/4/2014     11/7/2014      -90
ABC     123     11/4/2014     11/10/2014     -90

I want to group the data so that changes are easy to audit and repetition is reduced, so that I have

Col1    Col2    Date1         start_Date2    end_Date2      rate
ABC     123     11/4/2014     11/5/2014      11/5/2014      -90
ABC     123     11/4/2014     11/6/2014      11/6/2014      -55
ABC     123     11/4/2014     11/7/2014      11/10/2014     -90

I can easily do that if I can get another column with the rows numbered as 1 2 3 3 (the actual values don't matter, only that each group gets a distinct number), and then GROUP BY that column.

My attempt at a query:

SELECT *, DENSE_RANK() OVER (ORDER BY rate) island
FROM T
ORDER BY Date2

It doesn't give what I'm looking for:

Col1    Col2    Date1         Date2          rate     island
ABC     123     11/4/2014     11/5/2014      -90      1
ABC     123     11/4/2014     11/6/2014      -55      2
ABC     123     11/4/2014     11/7/2014      -90      1
ABC     123     11/4/2014     11/10/2014     -90      1

I want the query to recognize that the second group of -90 values should be treated as a new group, since they appear after a group with a different rate.

The [gaps-and-islands] SQL tag was pretty helpful, but I'm not quite able to figure out how to handle the case where the rate reverts to a previous value. How should I modify my query?

Recommended answer

You can identify the groups by using the difference of two row_number() values. Within a run of consecutive rows that share the same rate, that difference is constant.

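select col1, col2, date1,
       min(date2) as start_date2, max(date2) as end_date2, rate
from (select t.*,
             -- position among all rows minus position among rows with the same rate;
             -- the difference stays constant within a run of consecutive equal rates
             (row_number() over (partition by col1, col2, date1 order by date2) -
              row_number() over (partition by col1, col2, date1, rate order by date2)
             ) as grp
      from T t
     ) t
group by col1, col2, date1, rate, grp
order by col1, col2, date1, start_date2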
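
To see why the difference works, it can help to print the two row numbers next to each other for the sample data. The following is a minimal sketch rather than part of the original answer: it assumes Python's bundled SQLite supports window functions (SQLite 3.25 or later), stores the dates as ISO strings so that text ordering matches date ordering, and uses the hypothetical column names `rn_all` and `rn_rate` for the two row numbers.

```python
import sqlite3

# In-memory copy of the question's table. Dates are stored as ISO strings
# (an assumption of this sketch) so that ORDER BY on text sorts chronologically.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T (Col1 TEXT, Col2 INTEGER, Date1 TEXT, Date2 TEXT, rate INTEGER)"
)
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?, ?)",
    [
        ("ABC", 123, "2014-11-04", "2014-11-05", -90),
        ("ABC", 123, "2014-11-04", "2014-11-06", -55),
        ("ABC", 123, "2014-11-04", "2014-11-07", -90),
        ("ABC", 123, "2014-11-04", "2014-11-10", -90),
    ],
)

# Step 1: show both row numbers. rn_all counts every row in the partition,
# rn_rate counts only rows that share the same rate.
trace = conn.execute(
    """
    SELECT Date2, rate,
           row_number() OVER (PARTITION BY Col1, Col2, Date1 ORDER BY Date2) AS rn_all,
           row_number() OVER (PARTITION BY Col1, Col2, Date1, rate ORDER BY Date2) AS rn_rate
    FROM T
    ORDER BY Date2
    """
).fetchall()

for date2, rate, rn_all, rn_rate in trace:
    print(date2, rate, rn_all, rn_rate, rn_all - rn_rate)

# Step 2: group by the difference together with rate to collapse each island.
islands = conn.execute(
    """
    SELECT Col1, Col2, Date1,
           min(Date2) AS start_Date2, max(Date2) AS end_Date2, rate
    FROM (SELECT T.*,
                 row_number() OVER (PARTITION BY Col1, Col2, Date1 ORDER BY Date2)
               - row_number() OVER (PARTITION BY Col1, Col2, Date1, rate ORDER BY Date2) AS grp
          FROM T) AS x
    GROUP BY Col1, Col2, Date1, rate, grp
    ORDER BY start_Date2
    """
).fetchall()

for row in islands:
    print(row)
```

For the four sample rows the differences come out as 0, 1, 1, 1. The -55 row and the second run of -90 rows share the difference 1, so the difference alone does not identify an island. This is why the answer groups by both rate and grp.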
