I need to create a new column that counts the number of times a particular observation happened
Problem description
I have a data set of hospital visits and I need to tally emergency room utilization within a certain time frame. The code below mimics what my data set looks like. The hospital account is a unique number for each encounter, and the MRN is unique to the patient. One patient can have many hospital accounts but only one MRN.
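set.seed(41)  # set the seed before any sampling so the random draws are repeatable
hosp.acct <- sample(10000:30000, 20, replace = FALSE)
# MRN 61212 is placed at rows 7, 14 and 20 so that one patient has repeat visits
MRN <- c(sample(40000:60000, 6, replace = FALSE), 61212,
         sample(40000:60000, 6, replace = FALSE), 61212,
         sample(40000:60000, 5, replace = FALSE), 61212)
# Dates are relative to today, so the printed dates depend on when this is run
admit.date <- Sys.Date() + sort(sample(1:30, 20))
disch.date <- admit.date
# The two inpatient stays (rows 15 and 20) last more than one day
disch.date[15] <- disch.date[15] + 2
disch.date[20] <- disch.date[20] + 3
admit.type <- c(rep("ED", 14), "Inpatient", rep("ED", 4), "Inpatient")
patients <- data.frame(hosp.acct, MRN, admit.date, disch.date, admit.type)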
The data looks like this:
hosp.acct MRN admit.date disch.date admit.type
1 12265 43685 2018-02-17 2018-02-17 ED
2 26536 58422 2018-02-18 2018-02-18 ED
3 15569 59675 2018-02-20 2018-02-20 ED
4 24261 50523 2018-02-21 2018-02-21 ED
5 27088 42435 2018-02-23 2018-02-23 ED
6 20027 49420 2018-02-28 2018-02-28 ED
7 17513 61212 2018-03-01 2018-03-01 ED
8 15140 49624 2018-03-02 2018-03-02 ED
9 19561 56721 2018-03-03 2018-03-03 ED
10 19380 53224 2018-03-05 2018-03-05 ED
11 22578 54514 2018-03-06 2018-03-06 ED
12 21358 59142 2018-03-07 2018-03-07 ED
13 24919 53604 2018-03-08 2018-03-08 ED
14 29646 61212 2018-03-09 2018-03-09 ED
15 26191 44328 2018-03-10 2018-03-12 Inpatient
16 25820 56705 2018-03-11 2018-03-11 ED
17 21598 41922 2018-03-12 2018-03-12 ED
18 13708 43277 2018-03-13 2018-03-13 ED
19 29753 46511 2018-03-14 2018-03-14 ED
20 28573 61212 2018-03-17 2018-03-20 Inpatient
My goal is to have R count the ER visits a patient had within a specific time frame (e.g. the 30 days before admission) whenever their admit type is "Inpatient". The target result would look like this:
hosp.acct MRN admit.date disch.date admit.type ed.visits
1 12265 43685 2018-02-17 2018-02-17 ED 0 << Maybe NA HERE on the ED observations
2 26536 58422 2018-02-18 2018-02-18 ED 0
3 15569 59675 2018-02-20 2018-02-20 ED 0
4 24261 50523 2018-02-21 2018-02-21 ED 0
5 27088 42435 2018-02-23 2018-02-23 ED 0
6 20027 49420 2018-02-28 2018-02-28 ED 0
7 17513 61212 2018-03-01 2018-03-01 ED 0
8 15140 49624 2018-03-02 2018-03-02 ED 0
9 19561 56721 2018-03-03 2018-03-03 ED 0
10 19380 53224 2018-03-05 2018-03-05 ED 0
11 22578 54514 2018-03-06 2018-03-06 ED 0
12 21358 59142 2018-03-07 2018-03-07 ED 0
13 24919 53604 2018-03-08 2018-03-08 ED 0
14 29646 61212 2018-03-09 2018-03-09 ED 0
15 26191 44328 2018-03-10 2018-03-12 Inpatient 0 <<Only the inpatient encounters check previous ED visits
16 25820 56705 2018-03-11 2018-03-11 ED 0
17 21598 41922 2018-03-12 2018-03-12 ED 0
18 13708 43277 2018-03-13 2018-03-13 ED 0
19 29753 46511 2018-03-14 2018-03-14 ED 0
20 28573 61212 2018-03-17 2018-03-20 Inpatient 2 << This patient had two ED visits in last 30 days
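To make the target concrete, the value of 2 on row 20 can be checked by hand. The snippet below is a minimal sketch that uses only the three visits for MRN 61212 copied from the table above; the names `visits`, `index.date`, `in.window` and `ed.count` are illustrative and not part of the original data set. It assumes the window includes both the admission date itself and the date 30 days earlier.

```r
# Visits for MRN 61212, copied from the table above
visits <- data.frame(
  admit.date = as.Date(c("2018-03-01", "2018-03-09", "2018-03-17")),
  admit.type = c("ED", "ED", "Inpatient"),
  stringsAsFactors = FALSE
)

# The inpatient admission that anchors the look-back window
index.date <- visits$admit.date[visits$admit.type == "Inpatient"]

# ED visits admitted in the 30 days up to and including the index date
in.window <- visits$admit.type == "ED" &
  visits$admit.date >= index.date - 30 &
  visits$admit.date <= index.date

ed.count <- sum(in.window)
ed.count
```

Both ED visits (March 1 and March 9) fall between February 15 and March 17, which is where the 2 comes from.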
I think dplyr is probably the tool to use, but beyond arranging and grouping the rows by MRN and date, I do not know how to have dplyr count how many ED visits happened in the 30 days prior to the current inpatient visit.
Thanks,
Brad
Thanks to everyone for contributing. With everyone's help, I was able to gather ideas and put together what I believe works. If anyone sees problems, please chime in. My solution is below.
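library(tidyverse)
library(lubridate)
patients.temp <- patients %>%
  mutate(admit.date = as_date(admit.date)) %>%
  filter(admit.type %in% c("Inpatient", "ED")) %>%
  arrange(MRN, admit.date) %>%
  group_by(MRN) %>%
  # For each visit, count this patient's ED admissions in the preceding 30 days.
  # map_int() returns an integer column; plain map() would return a list column.
  mutate(ED.30day = map_int(admit.date, ~ sum(admit.type == "ED" & between(admit.date, .x - 30, .x))))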
Recommended answer
I was able to get the results I needed with the help of dplyr. Below is what I used. Thanks to everyone who contributed.
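library(tidyverse)
library(lubridate)
patients.temp <- patients %>%
  mutate(admit.date = as_date(admit.date)) %>%
  filter(admit.type %in% c("Inpatient", "ED")) %>%
  arrange(MRN, admit.date) %>%
  group_by(MRN) %>%
  # For each visit, count this patient's ED admissions in the preceding 30 days.
  # map_int() returns an integer column; plain map() would return a list column.
  mutate(ED.30day = map_int(admit.date, ~ sum(admit.type == "ED" & between(admit.date, .x - 30, .x))))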
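One thing to be aware of: the pipeline above computes a count for every row, ED rows included, and an ED row counts itself because the window is closed at the visit date. The question asked for counts on inpatient rows only. The following is a base R sketch of that stricter reading, not the poster's method; `count_prior_ed`, `window` and `patients.small` are hypothetical names introduced here, and the sample rows are rows 7, 14, 15 and 20 of the question's table.

```r
# Count ED visits in the `window` days before each inpatient admission.
# ED rows are left as NA, as suggested in the question's target output.
count_prior_ed <- function(df, window = 30) {
  df$ed.visits <- NA_integer_
  for (i in which(df$admit.type == "Inpatient")) {
    same.patient <- df$MRN == df$MRN[i]
    is.ed <- df$admit.type == "ED"
    in.window <- df$admit.date >= df$admit.date[i] - window &
      df$admit.date <= df$admit.date[i]
    df$ed.visits[i] <- sum(same.patient & is.ed & in.window)
  }
  df
}

# Small fixed sample: rows 7, 14, 15 and 20 of the question's table
patients.small <- data.frame(
  hosp.acct  = c(17513, 29646, 26191, 28573),
  MRN        = c(61212, 61212, 44328, 61212),
  admit.date = as.Date(c("2018-03-01", "2018-03-09", "2018-03-10", "2018-03-17")),
  admit.type = c("ED", "ED", "Inpatient", "Inpatient"),
  stringsAsFactors = FALSE
)

result <- count_prior_ed(patients.small)
result$ed.visits
```

The explicit loop is slower than a grouped dplyr call on large data, but it keeps the three conditions (same patient, ED visit, inside the window) easy to read and adjust.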