Finding closest matching time for each patient

Posted: 2017/3/26 0:21:31. Tags: r, dataframe, time, logic


Question

I have two sets of data.

The first:

 patient<-c("A","A","B","B","C","C","C","C")
 arrival<-c("11:00","11:00","13:00","13:00","14:00","14:00","14:00","14:00")
 lastRow<-c("","Yes","","Yes","","","","Yes")

 data1<-data.frame(patient,arrival,lastRow)

The second:

 patient<-c("A","A","A","A","B","B","B","C","C","C")
 availableSlot<-c("11:15","11:35","11:45","11:55","12:55","13:55","14:00","14:00","14:10","17:00")

 data2<-data.frame(patient, availableSlot)

I want to add a column to the first dataset such that, for the last row of each patient, it shows the available slot that is closest to the arrival time.

The result would be:

  patient arrival lastRow availableSlot
       A   11:00        
       A   11:00     Yes     11:15
       B   13:00        
       B   13:00     Yes     12:55
       C   14:00        
       C   14:00        
       C   14:00        
       C   14:00     Yes     14:00

I would appreciate it if anyone could tell me how to implement this in R.

Recommended answer

I'd use data.table, first cleaning up by converting to ITime and ignoring redundant rows:

library(data.table)
setDT(data1)[, arrival := as.ITime(as.character(arrival))]
setDT(data2)[, availableSlot := as.ITime(as.character(availableSlot))]
DT1 = unique(data1, by="patient", fromLast=TRUE)

Then you can do a "rolling join":

res = data2[DT1, on=.(patient, availableSlot = arrival), roll="nearest", 
  .(patient, availableSlot = x.availableSlot)]

#    patient availableSlot
# 1:       A      11:15:00
# 2:       B      12:55:00
# 3:       C      14:00:00
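
As a sanity check on the output above, the same nearest-slot lookup can be sketched in base R. This is not the method used in the answer, only a brute-force comparison on the question's data. The helper `to_minutes` and the objects `arrivals`, `slots` and `nearest` are names made up for this illustration.

```r
# Hypothetical helper: convert "HH:MM" strings to minutes since midnight.
to_minutes <- function(hhmm) {
  parts <- strsplit(hhmm, ":", fixed = TRUE)
  vapply(parts, function(p) as.integer(p[1]) * 60L + as.integer(p[2]), integer(1))
}

# One arrival time per patient (the value on each patient's last row).
arrivals <- c(A = "11:00", B = "13:00", C = "14:00")

# The available slots from the question.
slots <- data.frame(
  patient = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"),
  availableSlot = c("11:15", "11:35", "11:45", "11:55", "12:55",
                    "13:55", "14:00", "14:00", "14:10", "17:00"),
  stringsAsFactors = FALSE
)

# For each patient, keep the slot with the smallest absolute time difference.
nearest <- vapply(names(arrivals), function(p) {
  cand <- slots$availableSlot[slots$patient == p]
  cand[which.min(abs(to_minutes(cand) - to_minutes(arrivals[[p]])))]
}, character(1))

print(nearest)
# A = "11:15", B = "12:55", C = "14:00"
```

`which.min` returns the first minimum, so in this sketch a tie between an earlier and a later slot goes to whichever appears first in `slots`.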

How it works

The syntax is x[i, on=, roll=, j].

on= are the merge-by columns.
It's a join: for each row of i, we are looking for matches in x.
With roll="nearest", the final column in the on= is "rolled" to its nearest match.
The on= columns in the original tables can be referenced with x.* and i.* prefixes.
The j argument should give a list of columns, and .() is an alias for list() here.
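
The points above can be seen on a small toy example. This is a minimal sketch that assumes the data.table package is installed. The tables `lookup` and `query` and their columns `id`, `time` and `val` are invented for illustration and do not come from the question.

```r
library(data.table)

# lookup plays the role of x (the table being searched).
lookup <- data.table(
  id   = c("a", "a", "b"),
  time = c(1L, 10L, 5L),
  val  = c("x1", "x2", "x3")
)

# query plays the role of i (one lookup per row).
query <- data.table(id = c("a", "b"), time = c(4L, 100L))

# Exact match on id; the last on= column (time) is rolled to the nearest value.
# i.time is the value asked for, x.time is the value actually matched in lookup.
out <- lookup[query, on = .(id, time), roll = "nearest",
              .(id, asked = i.time, matched = x.time, val)]

print(out)
# id "a": asked 4,   matched 1 (closer than 10), val "x1"
# id "b": asked 100, matched 5 (the only candidate), val "x3"
```

Note that the x.* and i.* prefixes refer to the roles in x[i], not to the variable names of the tables.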

Check out the package's introductory materials at http://r-datatable.com/Getting-started and type ?data.table for the docs relevant to rolling joins.

I would stop at res, but if you really want it back in your original table...
# a very nonstandard step:
data1[lastRow == "Yes", availableSlot := res$availableSlot ]

#    patient  arrival lastRow availableSlot
# 1:       A 11:00:00                  <NA>
# 2:       A 11:00:00     Yes      11:15:00
# 3:       B 13:00:00                  <NA>
# 4:       B 13:00:00     Yes      12:55:00
# 5:       C 14:00:00                  <NA>
# 6:       C 14:00:00                  <NA>
# 7:       C 14:00:00                  <NA>
# 8:       C 14:00:00     Yes      14:00:00

Now, data1 has availableSlot in a new column, similar to when you do data1$col <- val. The assignment is positional: it relies on res having exactly one row per patient, in the same order as the lastRow == "Yes" rows of data1.
