Finding the total duration of multiple overlapping "start date" and "end date" entries, before a cut-off date?


Question

I have a list of subjects with multiple overlapping entries in the following format:

     ID     startdate    stopdate     cutoffdate
1    101    07MAR2014    07MAR2014    14MAR2014
2    105    30MAR2017    03APR2017    07APR2017
3    105    03APR2017    09APR2017    07APR2017

I have previously used SAS to count the total duration for each subject. I used the code described in the SAS documentation here, as adapted in another SO question here. With this method, the output is 1 day for subject 101 and 11 days for subject 105.

Now I have a cut-off date in the rightmost column. I want my code to disregard days beyond it; that is, the output should be 1 day for subject 101 and 9 days for subject 105.

How do I calculate the duration of these overlapping date entries for each subject while disregarding any dates that fall after the cut-off date?

Code from the previous answer:

 data want;
  set have;
  by id;

  retain episode;

  * assumes startdate and stopdate are character strings such as 2017-03-30;
  start_date = input(startdate, yymmdd10.);
  end_date = input(stopdate, yymmdd10.);
  prev_stop_date = lag(end_date);

  if first.id then do;
       episode = 0;
       call missing(prev_stop_date);
  end;

  * a new episode starts when this record does not overlap the previous one;
  if not (start_date <= prev_stop_date <= end_date) then episode+1;

  *could add in logic to calculate dates and durations as well depending....;

  run;

Recommended answer

The sample data below covers more possible cases:

data have;
  input ID startdate : date9. stopdate : date9. cutoffdate : date9.;
  format startdate stopdate cutoffdate date9.;
  datalines;
  101 07MAR2014 07MAR2014 14MAR2014
  105 30MAR2017 03APR2017 07APR2017
  105 03APR2017 09APR2017 07APR2017
  106 12MAY2018 18MAY2018 01JUL2018
  106 15MAY2018 20MAY2018 01JUL2018
  106 25MAY2018 28MAY2018 01JUL2018
  107 01JAN2005 09JAN2005 01FEB2005
  107 05JAN2005 20JAN2005 01FEB2005 
  107 16JAN2005 18JAN2005 01FEB2005 
  107 26JAN2005 31JAN2005 01FEB2005 
  ;
run;

There are three things to handle. First, days after cutoffdate must not count, so each interval ends at min(stopdate,cutoffdate). Second, a record that falls entirely inside a previous record adds no days and is skipped. Third, when startdate is not later than the previous stopdate, counting starts the day after that stopdate so the overlap is not counted twice; this is the '_stop+1' argument of the ifn function. For subject 105, the first record contributes 5 days (30MAR2017 to 03APR2017) and the second contributes 4 (04APR2017 to 07APR2017), for a total of 9.

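data want;
   set have;
   by id startdate notsorted;
   retain total _stop;                  /* _stop = last day already counted for this ID */
   if first.id then do;
      total = 0;
      _stop = startdate - 1;            /* nothing counted yet */
   end;
   _end = min(stopdate, cutoffdate);    /* ignore days after the cut-off date */
   /* skip records that add no new days: already covered, or starting after the cut-off */
   if _end > _stop and startdate <= _end then do;
      total = total + _end - ifn(_stop < startdate, startdate, _stop + 1) + 1;
      _stop = _end;
   end;
   if last.id then output;              /* one row per ID; rows must be in startdate order within each ID */
   drop _:;
run;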


The SAS System

Obs    ID     startdate    stopdate     cutoffdate    total
1      101    07MAR2014    07MAR2014    14MAR2014     1
2      105    03APR2017    09APR2017    07APR2017     9
3      106    25MAY2018    28MAY2018    01JUL2018     13
4      107    26JAN2005    31JAN2005    01FEB2005     26
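
One way to sanity-check these totals is a brute-force count: expand each record into one row per day, stop at the cut-off date, and count the distinct days per ID. The sketch below assumes the `have` dataset from this answer; the dataset names `days` and `check` and the variable `day` are hypothetical, chosen only for illustration.

```sas
/* Expand every record into one row per covered day, stopping at the cut-off. */
/* If startdate is after the cut-off, the loop runs zero times.               */
data days;
   set have;
   do day = startdate to min(stopdate, cutoffdate);
      output;
   end;
   keep id day;
run;

/* Count each calendar day once per ID, however many records cover it. */
proc sql;
   create table check as
   select id, count(distinct day) as total
   from days
   group by id;
quit;
```

For the sample data this should agree with the totals listed above (1, 9, 13 and 26). An ID whose records all start after its cut-off date produces no rows in `days` and is therefore absent from `check`. The approach is simple but creates one row per day, so it may be slow for very long intervals.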


   
