How to create a dummy variable in R for dates that lie between a certain interval?

Problem description

I have some hospital data that looks like this:

patient_id  treatment_1  treatment_2  date_dummy
3           2012-01-04   2012-03-27   0
3           2012-07-11   2012-10-20   0
3           2013-04-04   2013-06-22   0
12          2012-12-09   2013-11-09   0
18          2012-02-25   2012-03-26   0
25          2012-10-06   2013-12-29   1
25          2013-04-06   2013-07-07   0
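
To try the code in the answer, the table can be entered as a data frame. This is only a sketch: it assumes the name `df` (the name the answer uses) and takes the first date of the second row as 2012-07-11, matching the answer's printed output. The dates are kept as character strings because the answer converts them with `as.Date`, and `date_dummy` is left out because it is the column to be re-created.

```r
# Sample data retyped from the question; `df` is an assumed name.
df <- data.frame(
  patient_id  = c(3L, 3L, 3L, 12L, 18L, 25L, 25L),
  treatment_1 = c("2012-01-04", "2012-07-11", "2013-04-04", "2012-12-09",
                  "2012-02-25", "2012-10-06", "2013-04-06"),
  treatment_2 = c("2012-03-27", "2012-10-20", "2013-06-22", "2013-11-09",
                  "2012-03-26", "2013-12-29", "2013-07-07"),
  stringsAsFactors = FALSE
)

# Quick look at the structure: 7 rows, 3 columns
str(df)
```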

I need to re-create the date_dummy variable so that it equals 1 if the patient was treated again between that row's two treatment dates, and 0 otherwise. Patient 25 is the clearest example: the second treatment period (2013-04-06 to 2013-07-07) falls inside the first (2012-10-06 to 2013-12-29), so the first row gets 1.

If anyone knows how to do this with the dplyr package in R, that would be awesome. Thanks for any help.

Recommended answer

Building on @Rex Parsons's answer, you can do:

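library(dplyr)
library(lubridate)
library(purrr)

df %>%
  # Convert both treatment columns to Date and build one interval per row
  mutate(across(starts_with('treatment'), as.Date), 
         interval = interval(treatment_1, treatment_2)) %>%
  group_by(patient_id) %>%
  # For each row, check whether any other interval of the same patient
  # lies within this row's interval
  mutate(date_dummy = map_int(row_number(), 
                       ~as.integer(any(interval[-.x] %within% interval[.x])))) %>%
  ungroup()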

#  patient_id treatment_1 treatment_2 date_dummy interval                      
#       <int> <date>      <date>           <int> <Interval>                    
#1          3 2012-01-04  2012-03-27           0 2012-01-04 UTC--2012-03-27 UTC
#2          3 2012-07-11  2012-10-20           0 2012-07-11 UTC--2012-10-20 UTC
#3          3 2013-04-04  2013-06-22           0 2013-04-04 UTC--2013-06-22 UTC
#4         12 2012-12-09  2013-11-09           0 2012-12-09 UTC--2013-11-09 UTC
#5         18 2012-02-25  2012-03-26           0 2012-02-25 UTC--2012-03-26 UTC
#6         25 2012-10-06  2013-12-29           1 2012-10-06 UTC--2013-12-29 UTC
#7         25 2013-04-06  2013-07-07           0 2013-04-06 UTC--2013-07-07 UTC

If you don't need the interval column, you can drop it from the final output, for example by adding select(-interval) to the end of the pipeline.
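
If you would rather not create an interval column at all, the same containment check can be written with plain date comparisons in base R. The following is a sketch and not part of the original answer: `flag_contained` is a hypothetical helper, the sample data is retyped from the question (with 2012-07-11 in the second row, as in the output above), and the inclusive comparisons are assumed to match what `%within%` does for two intervals.

```r
# Hypothetical helper: for each row i, return 1L if some other row's
# [start, end] range lies entirely inside row i's range, else 0L.
flag_contained <- function(start, end) {
  n <- length(start)
  vapply(seq_len(n), function(i) {
    others <- seq_len(n)[-i]
    as.integer(any(start[others] >= start[i] & end[others] <= end[i]))
  }, integer(1))
}

df <- data.frame(
  patient_id  = c(3L, 3L, 3L, 12L, 18L, 25L, 25L),
  treatment_1 = as.Date(c("2012-01-04", "2012-07-11", "2013-04-04", "2012-12-09",
                          "2012-02-25", "2012-10-06", "2013-04-06")),
  treatment_2 = as.Date(c("2012-03-27", "2012-10-20", "2013-06-22", "2013-11-09",
                          "2012-03-26", "2013-12-29", "2013-07-07"))
)

# ave() runs the function once per patient_id group on that group's row
# indices, so the helper only ever compares rows of the same patient.
df$date_dummy <- ave(
  seq_len(nrow(df)), df$patient_id,
  FUN = function(idx) flag_contained(df$treatment_1[idx], df$treatment_2[idx])
)

print(df)
```

Patients with a single row have no other rows to compare against, so `any()` receives an empty vector and the flag is 0.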
