Determine contiguous dates in SQL gaps and islands


Question

I have a situation where a single patient can receive multiple services. These services can have overlapping dates, and there can be gaps and islands. I am trying to write a query that shows each contiguous stretch of time during which the patient was receiving some kind of service.

The table is as follows:

CREATE TABLE #tt
(Patient    VARCHAR(10), StartDate DATETIME, EndDate DATETIME)
INSERT INTO #tt
VALUES
('Smith',   '2014-04-13',   '2014-06-04'),
('Smith',   '2014-05-07',   '2014-05-08'),
('Smith',   '2014-06-21',   '2014-09-19'),
('Smith',   '2014-08-27',   '2014-08-27'),
('Smith',   '2014-08-28',   '2014-09-19'),
('Smith',   '2014-10-30',   '2014-12-16'),
('Smith',   '2015-05-21',   '2015-07-03'),
('Smith',   '2015-05-22',   '2015-07-03'),
('Smith',   '2015-05-26',   '2015-11-30'),
('Smith',   '2015-06-25',   '2016-06-08'),
('Smith',   '2015-07-22',   '2015-10-22'),
('Smith',   '2016-08-11',   '2016-09-02'),
('Smith',   '2017-06-02',   '2050-01-01'),
('Smith',   '2017-12-22',   '2017-12-22'),
('Smith',   '2018-03-25',   '2018-06-30')

As you can see, many of the dates overlap. Ultimately, I want to see the following results, which show the date ranges during which the patient was receiving at least one service:

Patient     |StartDate        |EndDate
--------------------------------------
Smith       |2014-04-13       |2014-06-04
Smith       |2014-06-21       |2014-09-19
Smith       |2014-10-30       |2014-12-16
Smith       |2015-05-21       |2016-06-08
Smith       |2016-08-11       |2016-09-02
Smith       |2017-06-02       |2050-01-01

I've gotten bleary-eyed from looking at the various gaps-and-islands SQL code. I started out with this CTE, but obviously it isn't working; if its output were what I wanted, I could simply have used SELECT PHN, MIN(StartDate), MAX(EndDate).

WITH HCC_PAT 
AS 
(
    SELECT DISTINCT
    PHN,
    StartDate,
    EndDate,
    MIN (StartDate) OVER (  PARTITION BY  PHN ORDER BY StartDate
                                        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PreviousStartDate,
    MAX (EndDate) OVER (    PARTITION BY  PHN ORDER BY EndDate
                                        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PreviousEndDate 

FROM    #tt)

SELECT  DISTINCT --hcc_Pat.HCCClientKey,
        hcc_pat.PHN,
        hcc_pat.StartDate,
        ISNULL (LEAD (PreviousEndDate) OVER (PARTITION BY PHN ORDER BY ENDDATE), 'January 1, 2050') AS EndDate
FROM    HCC_PAT
WHERE   PreviousEndDate > StartDate 
AND     (StartDate < PreviousStartDate OR PreviousStartDate IS NULL)

Any help at this point would be greatly appreciated.

Recommended answer

One method spreads the dates out into one row per start or end date, with an indicator of whether a service is starting (+1) or ending (-1). A cumulative sum of the indicator, ordered by date, can then be used to define the groups: a zero in the cumulative sum marks the date on which a period ends, because no service is open after it.

The final step is aggregation:

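-- Reads the #tt temp table from the question (the original answer queried a table named tt).
with d as (
      -- one row per boundary: +1 when a service starts, -1 when it ends
      select patient, startdate as dte, 1 as inc from #tt
      union all
      select patient, enddate as dte, -1 as inc from #tt
     ),
     dd as (
       -- running count of open services per patient, after netting same-day events
       select patient, dte,
              sum(sum(inc)) over (partition by patient order by dte) as cume_inc
       from d
       group by patient, dte
      ),
     ddd as (
       -- counting zeros from the latest date backwards puts each period-ending row in the island it closes
       select dd.*,
              sum(case when cume_inc = 0 then 1 else 0 end) over (partition by patient order by dte desc) as grp
       from dd
      )
select patient, min(dte) as startdate, max(dte) as enddate
from ddd
group by patient, grp
order by patient, startdate;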

Here is a SQL Fiddle.
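
To check the running-sum idea by hand, the same three steps can be traced outside the database. The following is a minimal sketch in Python rather than SQL, assuming the sample rows from the question; the names ROWS and merge_islands are made up for this illustration and are not part of the original answer.

```python
from collections import defaultdict

# Sample rows from the question: (patient, start, end) as ISO date strings,
# which sort chronologically when compared as plain text.
ROWS = [
    ("Smith", "2014-04-13", "2014-06-04"),
    ("Smith", "2014-05-07", "2014-05-08"),
    ("Smith", "2014-06-21", "2014-09-19"),
    ("Smith", "2014-08-27", "2014-08-27"),
    ("Smith", "2014-08-28", "2014-09-19"),
    ("Smith", "2014-10-30", "2014-12-16"),
    ("Smith", "2015-05-21", "2015-07-03"),
    ("Smith", "2015-05-22", "2015-07-03"),
    ("Smith", "2015-05-26", "2015-11-30"),
    ("Smith", "2015-06-25", "2016-06-08"),
    ("Smith", "2015-07-22", "2015-10-22"),
    ("Smith", "2016-08-11", "2016-09-02"),
    ("Smith", "2017-06-02", "2050-01-01"),
    ("Smith", "2017-12-22", "2017-12-22"),
    ("Smith", "2018-03-25", "2018-06-30"),
]


def merge_islands(rows):
    """Return (patient, start, end) islands using the +1/-1 running sum."""
    # Step 1 (like CTE d plus the GROUP BY in dd): net change in the number
    # of open services for each patient and date.
    net = defaultdict(int)
    for patient, start, end in rows:
        net[(patient, start)] += 1
        net[(patient, end)] -= 1

    islands = []
    current_patient = None
    running = 0
    island_start = None
    # Step 2 (like the window sum in dd): walk the dates in order and keep
    # a running total that restarts for each patient.
    for patient, dte in sorted(net):
        if patient != current_patient:
            current_patient, running, island_start = patient, 0, None
        if island_start is None:
            island_start = dte
        running += net[(patient, dte)]
        # Step 3 (like ddd plus the final aggregate): a running total of
        # zero means no service is open, so this date closes the island.
        if running == 0:
            islands.append((patient, island_start, dte))
            island_start = None
    return islands


if __name__ == "__main__":
    for island in merge_islands(ROWS):
        print(*island, sep=" | ")
```

On the sample data this yields six islands, the first running from 2014-04-13 to 2014-06-04 and the last from 2017-06-02 to 2050-01-01. A service that starts and ends on the same day nets to zero for that date, so it neither opens nor closes an island that is already in progress.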
