组内重复序列中缺少间隙 [英] Missing gaps in recurring series within a group

查看:30
本文介绍了组内重复序列中缺少间隙的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们有一个包含以下数据的表格

Id,ItemId,SeqNumber;DateTimeTrx1,100,254,2011-12-01 09:00:002,100,1,2011-12-01 09:10:003,200,7,2011-12-02 11:00:004,200,5,2011-12-02 10:00:005,100,255,2011-12-01 09:05:006,200,3,2011-12-02 09:00:007,300,0,2011-12-03 10:00:008,300,255,2011-12-03 11:00:009,300,1,2011-12-03 10:30:00

Id 是标识列.ItemId 的序列从 0 开始,一直到 255,然后重置为 0.所有这些信息都存储在名为 Item 的表中.序列号的顺序由 DateTimeTrx 确定,但此类数据可以随时输入系统.预期输出如下图-

ItemId,PrevorNext,SeqNumber,DateTimeTrx,MissingNumber100,Previous,255,2011-12-01 09:05:00,0100,Next,1,2011-12-01 09:10:00,0200,Previous,3,2011-12-02 09:00:00,4200,Next,5,2011-12-02 10:00:00,4200,Previous,5,2011-12-02 10:00:00,6200,Next,7,2011-12-02 11:00:00,6300,Previous,1,2011-12-03 10:30:00,2300,下一个,255,2011-12-03 16:30:00,2

我们需要在缺失序列之前和之后获得那些行.在上面的 ItemId 300 示例中 - 序列为 1 的记录首先输入 (2011-12-03 10:30:00) 然后是 255(2011-12-03 16:30:00),因此这里缺少的数字是2. 所以 1 是前一个,255 是下一个,2 是第一个缺失的数字.来到 ItemId 100,序列为 255 的记录首先进入 (2011-12-02 09:05:00) 然后是 1 (2011-12-02 09:10:00),因此 255 在前,然后是 1,因此0 是第一个缺失的数字.

在上面的预期结果中,MissingNumber 列是第一个出现的缺失数字,只是为了说明示例.

我们不会有一次完整的系列重置的情况,即它可以是从 255 到 0 的系列纲要,如 itemid 100 或 0 到 255,如 ItemId 300.因此我们需要按升序 (0,1,...255) 或降序 (254,254,0,2) 等识别序列缺失

我们如何在 t-sql 中完成此操作?

解决方案

可以这样工作:

;WITH b AS (选择 *,row_number() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn从表), x AS (选择出价,b.ItemId AS prev_Itm,b.SeqNumber AS prev_Seq,c.ItemId AS next_Itm,c.SeqNumber AS next_Seq从 bJOIN b c ON c.rn = b.rn + 1 -- 下一行WHERE c.ItemId = b.ItemId -- 仅具有相同的 ItemIdAND c.SeqNumber <>(b.SeqNumber + 1)%256 -- Seq 周期模 256)SELECT Id, prev_Itm, 'Previous' AS PrevNext, prev_Seq发件人联合所有SELECT Id, next_Itm ,'Next', next_Seq发件人ORDER BY Id, PrevNext DESC

准确地产生请求的结果.
请参阅关于数据的完整工作演示..>

此解决方案考虑了 Id 列中的间隙,因为问题中没有提及无间隙 Id 序列.


Edit2:对更新问题的回答:

我更新了上面查询中的 CTE 以匹配您的最新版本 - 或者我认为是这样.

使用那些定义行顺序的列.根据需要在 ORDER BY 子句中添加尽可能多的列以打破关系.

我对您最新更新的解释并不完全清楚,但我认为您只需挤入 DateTimeTrx 即可实现您想要的.我在 ORDER BY 中还有 SeqNumber 以打破相同的 DateTimeTrx 留下的联系.我编辑了上面的查询.

We have a table with following data

Id,ItemId,SeqNumber;DateTimeTrx
1,100,254,2011-12-01 09:00:00
2,100,1,2011-12-01 09:10:00
3,200,7,2011-12-02 11:00:00
4,200,5,2011-12-02 10:00:00
5,100,255,2011-12-01 09:05:00
6,200,3,2011-12-02 09:00:00
7,300,0,2011-12-03 10:00:00
8,300,255,2011-12-03 11:00:00
9,300,1,2011-12-03 10:30:00

Id is an identity column. The sequence for an ItemId starts from 0 and goes till 255 and then resets to 0. All this information is stored in a table called Item. The order of sequence number is determined by the DateTimeTrx but such data can enter any time into the system. The expected output is as shown below-

ItemId,PrevorNext,SeqNumber,DateTimeTrx,MissingNumber
100,Previous,255,2011-12-01 09:05:00,0
100,Next,1,2011-12-01 09:10:00,0
200,Previous,3,2011-12-02 09:00:00,4
200,Next,5,2011-12-02 10:00:00,4
200,Previous,5,2011-12-02 10:00:00,6
200,Next,7,2011-12-02 11:00:00,6
300,Previous,1,2011-12-03 10:30:00,2
300,Next,255,2011-12-03 16:30:00,2

We need to get those rows one before and one after the missing sequence. In the above example for ItemId 300 - the record with sequence 1 has entered first (2011-12-03 10:30:00) and then 255(2011-12-03 16:30:00), hence the missing number here is 2. So 1 is previous and 255 is next and 2 is the first missing number. Coming to ItemId 100, the record with sequence 255 has entered first (2011-12-02 09:05:00) and then 1 (2011-12-02 09:10:00), hence 255 is previous and then 1, hence 0 is the first missing number.

In the above expected result, MissingNumber column is the first occuring missing number just to illustrate the example.

We will not have a case where we would have a complete series reset at one time i.e. it can be either a series rundown from 255 to 0 as in for itemid 100 or 0 to 255 as in ItemId 300. Hence we need to identify sequence missing when in ascending order (0,1,...255) or either in descending order (254,254,0,2) etc.

How can we accomplish this in a t-sql?

解决方案

Could work like this:

;WITH b AS (
   SELECT *
         ,row_number() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn
   FROM   tbl
   ), x AS (
   SELECT
       b.Id
      ,b.ItemId    AS prev_Itm
      ,b.SeqNumber AS prev_Seq
      ,c.ItemId    AS next_Itm
      ,c.SeqNumber AS next_Seq
   FROM   b
   JOIN   b c ON c.rn = b.rn + 1                -- next row
   WHERE  c.ItemId = b.ItemId                   -- only with same ItemId
   AND    c.SeqNumber <> (b.SeqNumber + 1)%256  -- Seq cycles modulo 256
   )
SELECT Id, prev_Itm, 'Previous' AS PrevNext, prev_Seq
FROM   x
UNION  ALL
SELECT Id, next_Itm ,'Next', next_Seq
FROM   x
ORDER  BY Id, PrevNext DESC

Produces exactly the requested result.
See a complete working demo on data.SE.

This solution takes gaps in the Id column into consideration, as there is no mention of a gapless sequence of Ids in the question.


Edit2: Answer to updated question:

I updated the CTE in the query above to match your latest verstion - or so I think.

Use those columns that define the sequence of rows. Add as many columns to your ORDER BY clause as necessary to break ties.

The explanation to your latest update is not entirely clear to me, but I think you only need to squeeze in DateTimeTrx to achieve what you want. I have SeqNumber in the ORDER BY additionally to break ties left by identical DateTimeTrx. I edited the query above.

这篇关于组内重复序列中缺少间隙的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆