根据sql server 2012中后续行索引之间的差异创建分区 [英] create partition based on the difference between subsequent row indices in sql server 2012

查看:36
本文介绍了根据sql server 2012中后续行索引之间的差异创建分区的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用的是 SQL Server 2012.

I am using SQL Server 2012.

我想根据后续行中的索引是否增加 1 或更多来创建一个 row_number.例如,假设我有一个看起来像这样的表:

I want to create a row_number based on whether the index in subsequent rows are increasing by 1 or more. For example, say I have a table that looks like:

  event row_index
    1       24          
    2       25  
    3       26
    4       30
    5       31
    6       42 
    7       43
    8       44
    9       45

然后我想做的是在最后创建一个列,名为seq_ID:

Then what I want to do is create a column at the end, called seq_ID:

event    row_index   seq_id
    1       24         1         
    2       25         1
    3       26         1
    4       30         2
    5       31         2
    6       42         3
    7       43         3
    8       44         3
    9       45         3

基本上,seq_id 仅在后续行索引之间的差异 > 1 时才有机会.我尝试使用:

basically, the seq_id only chances if the difference between subsequent row indexes is > 1. I have tried to use:

<代码>rank() over(按不知道该放什么"分区)

推荐答案

试试这个:

    ;with cte as
    (select *, 
     coalesce(row_index - (lag(row_index) over (order by event)),1) diff
     from tbl
    ),

    cte2 as
    (select *, 
     (select max(diff) 
      from cte c 
      where c.row_index <= d.row_index
      ) minri
     from cte d
     )

    select event, row_index, minri, 
    dense_rank() over (order by minri) rn 
    from cte2

  • 第一个 CTE 使用 lag 函数获取差异(从 SQL Server 2012 开始可用).
  • 下一个 CTE 计算当差值超过 1 并分配此后的所有记录时指向组",直到找到下一个差异 <> 1.这是分组的关键步骤.
  • 最后一步是使用 dense_rank在上一步中计算的指标以获取行号根据需要.
    • The first CTE gets the differences using the lag function (available from SQL Server 2012 onwards).
    • The next CTE calculates when the difference exceeds 1 and assigns all records after that point to a 'group', until the next difference <> 1 is found. This is the key step in grouping.
    • The last step is to use dense_rank over the indicator calculated in the previous step to get the row numbers as required.
    • 此解决方案确实有一个限制,因为如果差异不是按递增顺序排列的,即如果样本数据中还有两个值(例如 52 和 53),它将失败,它将对它们进行分类进入第 3 组,而不是创建一个新组.

      This solution does have a limitation in that it will fail if the differences are not in increasing order i.e. if you have two more values in the sample data such as 52 and 53, it will classify them into group 3 instead of creating a new group.

      演示

      更新:以下方法可以克服上述限制:

      Update: The below approach can overcome the above limitation:

          ;with cte as
          (select *, 
           coalesce(row_index - (lag(row_index) over (order by event)),1) diff
           from tbl)
          ,cte2 as
          (select *,
           diff - coalesce(diff - (lag(diff) over (order by event)),0) tmp
           from cte d)
      
           select event,row_index, 
           1 + sum(case when tmp >= diff then 0 else 1 end) over (order by event) risum
           from cte2
      

      同样,第一步保持不变.但是在第 2 步中,我们只检查连续值之间差异的不同值的转换,而不是使用 min/max 函数.然后排名使用条件总和为原始数据中的每个值分配一个组.

      Again the first step remains the same. But in step 2, we check only for transition to a different value of the difference between successive values, instead of using a min/max function. The ranking then uses a conditional sum to assign a group for each value in the original data.

      演示

      这可以进一步简化为:

      select event, row_index, 
      sum(case when diff <= 1 then 0 else 1 end) over (order by event) as rb
      from
      (select *, 
       row_index - (lag(row_index) over (order by event)) diff
       from tbl
      ) s
      

      这篇关于根据sql server 2012中后续行索引之间的差异创建分区的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆