Derive EndDate of Current Row From StartDate of Next Row

Problem description

Can someone please help me create an end date from a start date?

Products are referred to a company for testing. While a product is with the company, they carry out multiple tests on different dates and record each test date to establish the product's condition (OutcomeID). I need to establish the StartDate, which is the TestDate, and the EndDate, which is the start date of the next row. But if multiple consecutive tests resulted in the same OutcomeID, I need to return only one row, with the StartDate of the first test and the EndDate of the last test. In other words, I need one row for as long as the OutcomeID did not change over consecutive tests. Here is my data set:


DECLARE @ProductTests TABLE
(
  RequestID int not null,
  ProductID int not null,
  TestID    int not null,
  TestDate  datetime null,
  OutcomeID int
)

insert into @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
select 1,2,22,'2005-01-21',10
union all select 1,2,42,'2007-03-17',10
union all select 1,2,45,'2010-12-25',10
union all select 1,2,325,'2011-01-14',13
union all select 1,2,895,'2011-08-10',15
union all select 1,2,111,'2011-12-23',15
union all select 1,2,636,'2012-05-02',10
union all select 1,2,554,'2012-11-08',17

--select * from @ProductTests


RequestID   ProductID   TestID    TestDate        OutcomeID
1               2           22    2005-01-21         10
1               2           42    2007-03-17         10
1               2           45    2010-12-25         10
1               2           325   2011-01-14         13
1               2           895   2011-08-10         15
1               2           111   2011-12-23         15
1               2           636   2012-05-02         10
1               2           554   2012-11-08         17

And this is what I need to achieve.


RequestID ProductID  StartDate        EndDate           OutcomeID
1            2       2005-01-21       2011-01-14        10
1            2       2011-01-14       2011-08-10        13
1            2       2011-08-10       2012-05-02        15
1            2       2012-05-02       2012-11-08        10
1            2       2012-11-08       NULL              17

As you can see from the data set, the first three tests (22, 42, and 45) all resulted in OutcomeID 10, so in my result I only need the start date of test 22 and the end date of test 45, which is the start date of test 325. In test 636 the OutcomeID has gone back to 10 from 15, so that row needs to be returned too.

This is what I have managed to achieve so far, using the following script:


select T1.RequestID, T1.ProductID, T1.TestDate AS StartDate,
       MIN(T2.TestDate) AS EndDate, T1.OutcomeID
from @ProductTests T1
left join @ProductTests T2
  on T1.RequestID = T2.RequestID
 and T1.ProductID = T2.ProductID
 and T2.TestDate > T1.TestDate
group by T1.RequestID, T1.ProductID, T1.OutcomeID, T1.TestDate
order by T1.TestDate

Result:


RequestID   ProductID   StartDate   EndDate       OutcomeID
1                  2    2005-01-21  2007-03-17         10
1                  2    2007-03-17  2010-12-25         10
1                  2    2010-12-25  2011-01-14         10
1                  2    2011-01-14  2011-08-10         13
1                  2    2011-08-10  2011-12-23         15
1                  2    2011-12-23  2012-05-02         15
1                  2    2012-05-02  2012-11-08         10
1                  2    2012-11-08  NULL               17

Solution

Actually, there seem to be two problems in your question. One is how to group sequential (based on specific criteria) rows containing the same value. The other is the one actually spelled out in your title, i.e. how to use the next row's StartDate as the current row's EndDate.

Personally, I would solve these two problems in the order I mentioned them, so I would first address the grouping problem. One way to group the data properly in this case would be to use double ranking like this:

WITH partitioned AS (
  SELECT
    *,
    grp = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
        - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
  FROM @ProductTests
)
, grouped AS (
  SELECT
    RequestID,
    ProductID,
    StartDate = MIN(TestDate),
    OutcomeID
  FROM partitioned
  GROUP BY
    RequestID,
    ProductID,
    OutcomeID,
    grp
)
SELECT *
FROM grouped
;

This should give you the following output for your data sample:

RequestID  ProductID  StartDate   OutcomeID
---------  ---------  ----------  ---------
1          2          2005-01-21  10
1          2          2011-01-14  13
1          2          2011-08-10  15
1          2          2012-05-02  10
1          2          2012-11-08  17
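
To see why the difference of the two row numbers identifies each run of consecutive tests, it can help to look at the intermediate values. The following is a minimal sketch, not part of the original answer: it repeats the sample data so the batch runs on its own, and the names numbered, rn_all and rn_outcome are aliases introduced here purely for illustration.

```sql
-- Sample data from the question. Multi-row VALUES needs SQL Server 2008+;
-- 'YYYYMMDD' literals are used because they do not depend on DATEFORMAT.
DECLARE @ProductTests TABLE
(
  RequestID int NOT NULL,
  ProductID int NOT NULL,
  TestID    int NOT NULL,
  TestDate  datetime NULL,
  OutcomeID int
);

INSERT INTO @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
VALUES
  (1, 2,  22, '20050121', 10),
  (1, 2,  42, '20070317', 10),
  (1, 2,  45, '20101225', 10),
  (1, 2, 325, '20110114', 13),
  (1, 2, 895, '20110810', 15),
  (1, 2, 111, '20111223', 15),
  (1, 2, 636, '20120502', 10),
  (1, 2, 554, '20121108', 17);

-- rn_all numbers every test of a product by date;
-- rn_outcome numbers the tests separately within each OutcomeID.
-- (Illustrative aliases, not names used in the answer.)
WITH numbered AS (
  SELECT
    TestID,
    TestDate,
    OutcomeID,
    rn_all     = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate),
    rn_outcome = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
  FROM @ProductTests
)
SELECT
  TestID,
  TestDate,
  OutcomeID,
  rn_all,
  rn_outcome,
  grp = rn_all - rn_outcome   -- constant while the outcome stays the same
FROM numbered
ORDER BY TestDate;
```

For the sample data, tests 22, 42 and 45 all get grp 0, test 325 gets 3, tests 895 and 111 get 4, test 636 gets 3 and test 554 gets 7. Tests 325 (OutcomeID 13) and 636 (OutcomeID 10) happen to share grp 3, which is why the grouped CTE groups by OutcomeID as well as grp: the difference is only unique within one outcome.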

One thing is still missing, namely EndDate, and now is the right time to take care of it. Use ROW_NUMBER() once again to rank the result set of the grouped CTE, then use the rankings in the join condition when joining the result set with itself (using an outer join, so that the last row is kept with a NULL EndDate):

WITH partitioned AS (
  SELECT
    *,
    grp = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
        - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
  FROM @ProductTests
)
, grouped AS (
  SELECT
    RequestID,
    ProductID,
    StartDate = MIN(TestDate),
    OutcomeID,
    rnk = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID ORDER BY MIN(TestDate))
  FROM partitioned
  GROUP BY
    RequestID,
    ProductID,
    OutcomeID,
    grp
)
SELECT
  g1.RequestID,
  g1.ProductID,
  g1.StartDate,
  g2.StartDate AS EndDate,
  g1.OutcomeID
FROM grouped g1
LEFT JOIN grouped g2
  ON g1.RequestID = g2.RequestID
 AND g1.ProductID = g2.ProductID
 AND g1.rnk = g2.rnk - 1
;

You can try this query at SQL Fiddle to verify that it returns the output you are after.
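
As a possible alternative to the self-join, and assuming SQL Server 2012 or later (where the LEAD window function is available), the last step could be sketched as below. This is a variation on the answer, not the approach it uses: the partitioned and grouped CTEs are the same as in the first query, only the final SELECT differs, and the sample data is repeated so the batch runs on its own.

```sql
-- Sample data from the question ('YYYYMMDD' literals do not depend on DATEFORMAT).
DECLARE @ProductTests TABLE
(
  RequestID int NOT NULL,
  ProductID int NOT NULL,
  TestID    int NOT NULL,
  TestDate  datetime NULL,
  OutcomeID int
);

INSERT INTO @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
VALUES
  (1, 2,  22, '20050121', 10),
  (1, 2,  42, '20070317', 10),
  (1, 2,  45, '20101225', 10),
  (1, 2, 325, '20110114', 13),
  (1, 2, 895, '20110810', 15),
  (1, 2, 111, '20111223', 15),
  (1, 2, 636, '20120502', 10),
  (1, 2, 554, '20121108', 17);

WITH partitioned AS (
  SELECT
    *,
    grp = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
        - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
  FROM @ProductTests
)
, grouped AS (
  -- One row per run of consecutive tests with the same OutcomeID.
  SELECT
    RequestID,
    ProductID,
    StartDate = MIN(TestDate),
    OutcomeID
  FROM partitioned
  GROUP BY
    RequestID,
    ProductID,
    OutcomeID,
    grp
)
SELECT
  RequestID,
  ProductID,
  StartDate,
  -- LEAD reads StartDate from the next row of the same product;
  -- it yields NULL for the last row, like the outer join does.
  EndDate = LEAD(StartDate) OVER (PARTITION BY RequestID, ProductID ORDER BY StartDate),
  OutcomeID
FROM grouped
ORDER BY RequestID, ProductID, StartDate;
```

For the sample data this should return the five rows the question asks for, with NULL as the EndDate of the final row. With LEAD, the rnk column and the second reference to grouped are no longer needed.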
