具有连续日期范围记录的数据集的最小和最大日期的日期范围 [英] Date range with minimum and maximum dates from dataset having records with continuous date range

查看:46
本文介绍了具有连续日期范围记录的数据集的最小和最大日期的日期范围的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含员工 id、状态和日期范围的数据集.
下面给出的输入数据集是一名员工的详细信息.
记录中的日期范围是连续的(按精确顺序),这样第二行的开始日期将是第一行结束日期的下一个日期.

如果员工在不同月份连续休假,则该表将存储不同月份的日期范围信息.
例如:在输入集中,员工已从16-10-2016"休病假到31-12-2016",并在1-1-2017"重新加入.
所以这个项目有 3 条记录,但日期是连续的.在输出中,我需要将其作为一条记录,如预期的输出数据集中所示.

输入

Id 状态 StartDate EndDate1 活跃 1-9-2007 15-10-20161 生病 16-10-2016 31-10-20161 生病 1-11-2016 30-11-20161 生病 1-12-2016 31-12-20161 活跃 1-1-2017 4-2-20171 未付 5-2-2017 9-2-20171 活跃 10-2-2017 11-2-20171 未付 12-2-2017 28-2-20171 未付 1-3-2017 31-3-20171 未付 1-4-2017 30-4-20171 活跃 1-5-2017 13-10-20171 生病 14-10-2017 11-11-20171 活跃 12-11-2017 NULL

预期输出

Id 状态 StartDate EndDate1 活跃 1-9-2007 15-10-20161 生病 16-10-2016 31-12-20161 活跃 1-1-2017 4-2-20171 未付 5-2-2017 9-2-20171 活跃 10-2-2017 11-2-20171 未付 12-2-2017 30-4-20171 活跃 1-5-2017 13-10-20171 生病 14-10-2017 11-11-20171 活跃 12-11-2017 NULL

我不能按 id,status 取 min(startdate) 和 max(EndDate) 组,因为如果同一个员工又休了一次病假,那么结束日期(示例中的11-11-2017")将会到来作为结束日期.

谁能帮我在 SQL Server 2014 中查询?

解决方案

这是一个 GROUPING AND WINDOW 的例子.

  • 首先为每个状态设置一个重置点
  • 求和设置组
  • 然后获取每个组的最大/最小日期.
<块引用>

;以 x 为(选择 ID、状态、开始日期、结束日期、iif (lag(Status) over (order by Id, StartDate) = Status, null, 1) rst来自 emp), y 为(选择 ID、状态、开始日期、结束日期、sum(rst) over (order by Id, StartDate) grp从 x)选择 ID,MIN(Status) 作为状态,MIN(StartDate) 开始日期,MAX(结束日期)结束日期从 y按 ID 分组,grp按 ID 排序,grp走

<前>身份证 |状态 |开始日期 |结束日期-: |:----- |:------------------ |:------------------1 |活跃 |01/09/2007 00:00:00 |15/10/2016 00:00:001 |生病了|16/10/2016 00:00:00 |31/12/2016 00:00:001 |活跃 |01/01/2017 00:00:00 |04/02/2017 00:00:001 |未付 |05/02/2017 00:00:00 |09/02/2017 00:00:001 |活跃 |10/02/2017 00:00:00 |11/02/2017 00:00:001 |未付 |12/02/2017 00:00:00 |30/04/2017 00:00:001 |活跃 |01/05/2017 00:00:00 |13/10/2017 00:00:001 |生病了|14/10/2017 00:00:00 |11/11/2017 00:00:001 |活跃 |12/11/2017 00:00:00 |

dbfiddle 这里

I have a dataset with id ,Status and date range of employees.
The input dataset given below are the details of one employee.
The date ranges in the records are continuous(in exact order) such that startdate of second row will be the next date of enddate of first row.

If an employee takes leave continuously for different months, then the table is storing the info with date range as separated for different months.
For example: In the input set, the employee has taken Sick leave from '16-10-2016' to '31-12-2016' and joined back on '1-1-2017'.
So there are 3 records for this item but the dates are continuous. In the output I need this as one record as shown in the expected output dataset.

INPUT

Id  Status   StartDate   EndDate

1   Active   1-9-2007    15-10-2016
1   Sick     16-10-2016  31-10-2016
1   Sick     1-11-2016   30-11-2016
1   Sick     1-12-2016   31-12-2016
1   Active   1-1-2017    4-2-2017  
1   Unpaid   5-2-2017    9-2-2017  
1   Active   10-2-2017   11-2-2017 
1   Unpaid   12-2-2017   28-2-2017 
1   Unpaid   1-3-2017    31-3-2017 
1   Unpaid   1-4-2017    30-4-2017 
1   Active   1-5-2017    13-10-2017
1   Sick     14-10-2017  11-11-2017
1   Active   12-11-2017  NULL   

EXPECTED OUTPUT

Id   Status    StartDate    EndDate

1    Active    1-9-2007     15-10-2016
1    Sick      16-10-2016   31-12-2016
1    Active    1-1-2017     4-2-2017  
1    Unpaid    5-2-2017     9-2-2017  
1    Active    10-2-2017    11-2-2017 
1    Unpaid    12-2-2017    30-4-2017 
1    Active    1-5-2017     13-10-2017
1    Sick      14-10-2017   11-11-2017
1    Active    12-11-2017   NULL  

I can't take min(startdate) and max(EndDate) group by id,status because if the same employee has taken another Sick leave then that end date ('11-11-2017' in the example) will come as the End date.

can anyone help me with the query in SQL server 2014?

解决方案

It's an example of GROUPING AND WINDOW.

  • First you set a reset point for each Status
  • Sum to set a group
  • Then get max/min dates of each group.

;with x as
(
    select Id, Status, StartDate,  EndDate,
           iif (lag(Status) over (order by Id, StartDate) = Status, null, 1) rst
    from   emp
), y as
 (
    select Id, Status, StartDate, EndDate,
           sum(rst) over (order by Id, StartDate) grp
    from   x
 )
 
 select Id, 
        MIN(Status) as Status, 
        MIN(StartDate) StartDate, 
        MAX(EndDate) EndDate
 from   y
 group by Id, grp
 order by Id, grp

GO

Id | Status | StartDate           | EndDate            
-: | :----- | :------------------ | :------------------
 1 | Active | 01/09/2007 00:00:00 | 15/10/2016 00:00:00
 1 | Sick   | 16/10/2016 00:00:00 | 31/12/2016 00:00:00
 1 | Active | 01/01/2017 00:00:00 | 04/02/2017 00:00:00
 1 | Unpaid | 05/02/2017 00:00:00 | 09/02/2017 00:00:00
 1 | Active | 10/02/2017 00:00:00 | 11/02/2017 00:00:00
 1 | Unpaid | 12/02/2017 00:00:00 | 30/04/2017 00:00:00
 1 | Active | 01/05/2017 00:00:00 | 13/10/2017 00:00:00
 1 | Sick   | 14/10/2017 00:00:00 | 11/11/2017 00:00:00
 1 | Active | 12/11/2017 00:00:00 | null               

dbfiddle here

这篇关于具有连续日期范围记录的数据集的最小和最大日期的日期范围的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆