I need to create a new column that sums the number of instances a particular observation happened

Problem description

I have a data set of hospital visits and I need to tally emergency room visits that happened within a certain time frame. The code below mimics what my dataset looks like. The hospital account is a unique number for each encounter and the MRN is unique to the patient. One patient can have many hospital accounts but only one MRN.

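# Simulated data: 20 encounters; MRN 61212 appears three times (rows 7, 14 and 20)
hosp.acct <- sample(10000:30000, 20, replace = FALSE)
MRN <- c(sample(40000:60000, 6, replace = FALSE), 61212,
         sample(40000:60000, 6, replace = FALSE), 61212,
         sample(40000:60000, 5, replace = FALSE), 61212)
# The seed is set only here, so hosp.acct and MRN above differ between runs
set.seed(41)
# Admission dates are relative to the day the code is run
admit.date <- Sys.Date() + sort(sample(1:30, 20))
disch.date <- admit.date
# The two inpatient stays (rows 15 and 20) last 2 and 3 days
disch.date[15] <- disch.date[15] + 2
disch.date[20] <- disch.date[20] + 3
admit.type <- c(rep("ED",14), "Inpatient",rep("ED",4),"Inpatient")
patients <- data.frame(hosp.acct,MRN,admit.date,disch.date,admit.type)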

The data looks like this:

   hosp.acct   MRN admit.date disch.date admit.type
1      12265 43685 2018-02-17 2018-02-17         ED
2      26536 58422 2018-02-18 2018-02-18         ED
3      15569 59675 2018-02-20 2018-02-20         ED
4      24261 50523 2018-02-21 2018-02-21         ED
5      27088 42435 2018-02-23 2018-02-23         ED
6      20027 49420 2018-02-28 2018-02-28         ED
7      17513 61212 2018-03-01 2018-03-01         ED
8      15140 49624 2018-03-02 2018-03-02         ED
9      19561 56721 2018-03-03 2018-03-03         ED
10     19380 53224 2018-03-05 2018-03-05         ED
11     22578 54514 2018-03-06 2018-03-06         ED
12     21358 59142 2018-03-07 2018-03-07         ED
13     24919 53604 2018-03-08 2018-03-08         ED
14     29646 61212 2018-03-09 2018-03-09         ED
15     26191 44328 2018-03-10 2018-03-12  Inpatient
16     25820 56705 2018-03-11 2018-03-11         ED
17     21598 41922 2018-03-12 2018-03-12         ED
18     13708 43277 2018-03-13 2018-03-13         ED
19     29753 46511 2018-03-14 2018-03-14         ED
20     28573 61212 2018-03-17 2018-03-20  Inpatient

My goal is to have R count the ER visits a patient had within a specific time frame (i.e. 30 days) before an encounter whose admit type is "Inpatient". The goal result would look like this.

   hosp.acct   MRN admit.date disch.date admit.type ed.visits
1      12265 43685 2018-02-17 2018-02-17         ED         0 << Maybe NA HERE on the ED observations
2      26536 58422 2018-02-18 2018-02-18         ED         0
3      15569 59675 2018-02-20 2018-02-20         ED         0
4      24261 50523 2018-02-21 2018-02-21         ED         0
5      27088 42435 2018-02-23 2018-02-23         ED         0
6      20027 49420 2018-02-28 2018-02-28         ED         0
7      17513 61212 2018-03-01 2018-03-01         ED         0
8      15140 49624 2018-03-02 2018-03-02         ED         0
9      19561 56721 2018-03-03 2018-03-03         ED         0
10     19380 53224 2018-03-05 2018-03-05         ED         0
11     22578 54514 2018-03-06 2018-03-06         ED         0
12     21358 59142 2018-03-07 2018-03-07         ED         0
13     24919 53604 2018-03-08 2018-03-08         ED         0
14     29646 61212 2018-03-09 2018-03-09         ED         0
15     26191 44328 2018-03-10 2018-03-12  Inpatient         0  <<Only the inpatient encounters check previous ED visits
16     25820 56705 2018-03-11 2018-03-11         ED         0
17     21598 41922 2018-03-12 2018-03-12         ED         0
18     13708 43277 2018-03-13 2018-03-13         ED         0
19     29753 46511 2018-03-14 2018-03-14         ED         0
20     28573 61212 2018-03-17 2018-03-20  Inpatient         2 << This patient had two ED visits in last 30 days
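
The expected value of 2 in row 20 can be checked by hand. The following is a minimal base R sketch for MRN 61212 only, using the dates printed in the table; the variable names (`ed.dates`, `inpt.date`, `days.before`) are made up for illustration, and the window is assumed to include the day of the inpatient admission.

```r
# ED and inpatient admissions for MRN 61212, copied from the table above
ed.dates  <- as.Date(c("2018-03-01", "2018-03-09"))
inpt.date <- as.Date("2018-03-17")

# Days between each ED visit and the inpatient admission
days.before <- as.numeric(inpt.date - ed.dates)
days.before
# 16 8

# Count ED visits that fall in the 30 days up to the inpatient admission
ed.visits <- sum(days.before >= 0 & days.before <= 30)
ed.visits
# 2
```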

I think dplyr is probably the tool to use, but beyond arranging and grouping the rows by MRN and date, I do not know how to have dplyr count how many ED visits happened in the 30 days prior to the current inpatient visit.

Thanks,

Brad

Thanks for everyone's contributions. With everyone's help, I was able to gather ideas and put together what I believe works. If anyone sees problems, please chime in. My solution is below.

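library(tidyverse)   # dplyr for the pipeline, purrr for map_int()
library(lubridate)   # as_date()
patients.temp <- patients %>%
    mutate(admit.date = as_date(admit.date)) %>%
    filter(admit.type %in% c("Inpatient","ED")) %>%
    arrange(MRN,admit.date) %>%
    group_by(MRN) %>%
    # For each admission, count this patient's ED admissions in the 30 days up to and including that date.
    # map_int() returns an integer column; map() would return a list column.
    # Note: this is computed for every row, so an ED row counts itself.
    mutate(ED.30day = map_int(admit.date, ~sum(admit.type == "ED" & between(admit.date, .x - 30, .x)))) %>%
    ungroup()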


Recommended answer

I was able to get the results I needed with the help of dplyr. Below is what I used. Thanks to everyone who contributed.

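library(tidyverse)   # dplyr for the pipeline, purrr for map_int()
library(lubridate)   # as_date()
patients.temp <- patients %>%
    mutate(admit.date = as_date(admit.date)) %>%
    filter(admit.type %in% c("Inpatient","ED")) %>%
    arrange(MRN,admit.date) %>%
    group_by(MRN) %>%
    # For each admission, count this patient's ED admissions in the 30 days up to and including that date.
    # map_int() returns an integer column; map() would return a list column.
    # Note: this is computed for every row, so an ED row counts itself.
    mutate(ED.30day = map_int(admit.date, ~sum(admit.type == "ED" & between(admit.date, .x - 30, .x)))) %>%
    ungroup()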
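
The goal table in the question only fills in a count on the "Inpatient" rows and suggests NA for the ED rows. The following is one possible variant that does this, offered as a sketch and not as the asker's code. The `visits` data frame and its values are hypothetical, the column name `ed.visits` is taken from the goal table, and the window is assumed to include ED visits on the admission day itself.

```r
library(dplyr)
library(purrr)

# Small hand-made example (hypothetical values, same column names as `patients`)
visits <- data.frame(
  MRN        = c(61212, 61212, 61212, 44328, 61212),
  admit.date = as.Date(c("2018-01-05", "2018-03-01", "2018-03-09",
                         "2018-03-10", "2018-03-17")),
  admit.type = c("ED", "ED", "ED", "Inpatient", "Inpatient"),
  stringsAsFactors = FALSE
)

result <- visits %>%
  arrange(MRN, admit.date) %>%
  group_by(MRN) %>%
  mutate(
    ed.visits = map2_int(admit.date, admit.type, function(d, type) {
      # ED rows are not evaluated; they get NA
      if (type != "Inpatient") return(NA_integer_)
      # ED admissions for this patient in the 30 days up to this admission
      sum(admit.type == "ED" & admit.date >= d - 30 & admit.date <= d)
    })
  ) %>%
  ungroup()

result$ed.visits
# 0 NA NA NA 2
```

After sorting, the first row is the inpatient stay for MRN 44328 (no ED visits, so 0). The ED visit on 2018-01-05 falls outside the 30-day window of the 2018-03-17 admission, which leaves 2 for MRN 61212.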
