查询以识别连续范围 [英] Query to identify contiguous ranges
问题描述
我正在尝试对以下数据集编写查询,以添加具有某种period_id_group"的新列.
I'm trying to write a query on the below data set to add a new column which has some sort of "period_id_group".
contiguous new_period row_nr new_period_starting_id
0 0 1 0
1 1 2 2
1 0 3 0
1 0 4 0
1 1 5 5
1 0 6 0
我想得到的是:
contiguous new_period row_nr new_period_starting_id period_id_group
0 0 1 0 0
1 1 2 2 2
1 0 3 0 2
1 0 4 0 2
1 1 5 5 5
1 0 6 0 5
逻辑是对于 new_period_starting_id
中的每个 0 值,它必须从上面的行中获取 >0
值.
The logic is that for each 0 value in the new_period_starting_id
, it has to get the >0
value from the row above.
因此,对于 row_nr = 1
,因为在它之前没有行,period_id_group
为 0.
So, for row_nr = 1
since there is no row before it, period_id_group
is 0.
对于 row_nr = 2
由于这是一个新的 perid(由 new_period = 1
标记),period_id_group
是 2(这一行).
For row_nr = 2
since this is a new perid (marked by new_period = 1
), the period_id_group
is 2 (the id of this row).
对于 row_nr = 3
因为它是连续范围的一部分(因为 contiguous = 1
),但不是范围的开始,因为它不是 new_period (new_period = 0
),它的 period_id_group
应该继承前一行的值(它是连续范围的开始)——在这种情况下 period_id_group = 2
也是.
For row_nr = 3
since it's part of a contiguous range (because contiguous = 1
), but is not the start of the range, because it's not a new_period (new_period = 0
), its period_id_group
should inherit the value from the previous row (which is the start of the contiguous range) - in this case period_id_group = 2
also.
我尝试了多个版本,但无法为 SQL Server 2008R2 找到好的解决方案,因为我无法使用 LAG()
.
I've tried multiple versions but couldn't get a good solution for SQL Server 2008R2, since I can't use LAG()
.
到目前为止,我所拥有的是可耻的:
What I have, so far, is a shameful:
select *
from #temp2 t1
left join (select distinct new_period_starting_id from #temp2) t2
on t1.new_period_starting_id >= t2.new_period_starting_id
where 1 = case
when contiguous = 0
then 1
when contiguous = 1 and t2.new_period_starting_id > 0
then 1
else 1
end
order by t1.rn
示例数据脚本:
declare @tmp2 table (contiguous int
, new_period int
, row_nr int
, new_period_starting_id int);
insert into @tmp2 values (0, 0, 1, 0)
, (1, 1, 2, 2)
, (1, 0, 3, 0)
, (1, 0, 4, 0)
, (1, 1, 5, 5)
, (1, 0, 6, 0);
感谢任何帮助.
推荐答案
找到解决方案:
select *
, case
when contiguous = 0
then f1
when contiguous = 1 and new_periods = 1
then f1
when contiguous = 1 and new_periods = 0
then v
else NULL
end [period_group]
from (
select *
, (select max(f1) from #temp2 where new_period_starting_id > 0 and rn < t1.rn) [v]
from #temp2 t1
) rs
order by rn
这篇关于查询以识别连续范围的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!