查询以识别连续范围 [英] Query to identify contiguous ranges

查看:29
本文介绍了查询以识别连续范围的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试对以下数据集编写查询,以添加具有某种period_id_group"的新列.

I'm trying to write a query on the below data set to add a new column which has some sort of "period_id_group".

contiguous  new_period  row_nr  new_period_starting_id
0           0           1       0
1           1           2       2
1           0           3       0
1           0           4       0
1           1           5       5
1           0           6       0

我想得到的是:

contiguous  new_period  row_nr  new_period_starting_id    period_id_group
0           0           1       0                         0
1           1           2       2                         2
1           0           3       0                         2
1           0           4       0                         2
1           1           5       5                         5
1           0           6       0                         5

逻辑是对于 new_period_starting_id 中的每个 0 值,它必须从上面的行中获取 >0 值.

The logic is that for each 0 value in the new_period_starting_id, it has to get the >0 value from the row above.

因此,对于 row_nr = 1,因为在它之前没有行,period_id_group 为 0.

So, for row_nr = 1 since there is no row before it, period_id_group is 0.

对于 row_nr = 2 由于这是一个新的 perid(由 new_period = 1 标记),period_id_group 是 2(这一行).

For row_nr = 2 since this is a new perid (marked by new_period = 1), the period_id_group is 2 (the id of this row).

对于 row_nr = 3 因为它是连续范围的一部分(因为 contiguous = 1),但不是范围的开始,因为它不是 new_period (new_period = 0),它的 period_id_group 应该继承前一行的值(它是连续范围的开始)——在这种情况下 period_id_group = 2 也是.

For row_nr = 3 since it's part of a contiguous range (because contiguous = 1), but is not the start of the range, because it's not a new_period (new_period = 0), its period_id_group should inherit the value from the previous row (which is the start of the contiguous range) - in this case period_id_group = 2 also.

我尝试了多个版本,但无法为 SQL Server 2008R2 找到好的解决方案,因为我无法使用 LAG().

I've tried multiple versions but couldn't get a good solution for SQL Server 2008R2, since I can't use LAG().

到目前为止,我所拥有的是可耻的:

What I have, so far, is a shameful:

select *
from #temp2 t1
left join (select distinct new_period_starting_id from #temp2) t2 
    on t1.new_period_starting_id >= t2.new_period_starting_id
where 1 = case 
            when contiguous = 0 
                then 1
            when contiguous = 1 and t2.new_period_starting_id > 0
                then 1
            else 1
        end
order by t1.rn

示例数据脚本:

declare @tmp2 table (contiguous int
                   , new_period int
                   , row_nr int
                   , new_period_starting_id int);

insert into @tmp2 values (0, 0, 1, 0)
                        , (1, 1, 2, 2)
                        , (1, 0, 3, 0)
                        , (1, 0, 4, 0)
                        , (1, 1, 5, 5)
                        , (1, 0, 6, 0);

感谢任何帮助.

推荐答案

找到解决方案:

select *
    , case
        when contiguous = 0
            then f1
        when contiguous = 1 and new_periods = 1
            then f1
        when contiguous = 1 and new_periods = 0
            then v
        else NULL
    end [period_group]
from (
    select *
        , (select max(f1) from #temp2 where new_period_starting_id > 0 and rn < t1.rn) [v]
    from #temp2 t1
    ) rs
order by rn

这篇关于查询以识别连续范围的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆