How can I tell if a time point exists between a set of before and after times

Problem description

I was trying to answer a question on Stack Overflow (Mapping multiple IDs using R) when I got stuck on how to finish it. Namely, how can I test whether a time point falls between a pair of before and after times?

The user in that post did not provide a reproducible example, but here is what I came up with. I want to test each time point in hidenic_file$hidenic_time against the before and after times in the data frame emtek_file and return the emtek_ids whose time frame contains the time point of each hidenic_id. The poster didn't mention it, but it seems possible for multiple emtek_ids to be returned for a single hidenic_id.

library(zoo)

## emtek_file: 10 stays, each with an entry time and an exit time 24 hours later
date_string <- paste("2001", sample(12, 10, replace = TRUE), sample(28, 10), sep = "-")
time_string <- c("23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26",
                 "23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26")

entry_emtek <- strptime(paste(date_string, time_string), "%Y-%m-%d %H:%M:%S")
entry_emtek <- entry_emtek[order(entry_emtek)]
exit_emtek <- entry_emtek + 3600 * 24
emtek_file <- data.frame(emtek_id = 1:10, entry_emtek, exit_emtek)

## hidenic_file: 100 ids, each with a single time point
hidenic_id <- 110380:110479
date_string <- paste("2001", sample(12, 100, replace = TRUE), sample(28, 100, replace = TRUE), sep = "-")
time_string <- rep(c("23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26",
                     "23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26"), 10)
hidenic_time <- strptime(paste(date_string, time_string), "%Y-%m-%d %H:%M:%S")
hidenic_time <- hidenic_time[order(hidenic_time)]
hidenic_file <- data.frame(hidenic_id, hidenic_time)

## Here is where I fail to write concise, working code to find what I want.
combined_file <- list()
for(i in seq(hidenic_file[,1])) {
  for(j in seq(emtek_file[,1])) {
    if(length(zoo(1, emtek_file[j,2:3]) + zoo(1,hidenic_file[i,2])) == 0) {next}
    if(length(zoo(1, emtek_file[j,2:3]) + zoo(1,hidenic_file[i,2])) == 1) {combined_file[[i]] <- c(combined_file[[i]],emtek_file[j,1])}
  }
  names(combined_file)[i] <- hidenic_file[i,1]
}

Recommended answer

I am not sure I understand everything you want to do, since you did not provide the expected result. Here is a solution using the IRanges package. It may not be easy to follow at first reading, but it is extremely useful for finding overlaps between intervals.

library(IRanges)
## emtek stays as intervals (times converted to seconds since the epoch)
subject <- IRanges(as.numeric(emtek_file$entry_emtek),
                   as.numeric(emtek_file$exit_emtek))
## hidenic time points as single-point intervals (start = end)
query <- IRanges(as.numeric(hidenic_file$hidenic_time),
                 as.numeric(hidenic_file$hidenic_time))
## find the overlaps once, then extract the matching rows of both data frames
hits <- findOverlaps(query, subject)
hid.ids <- queryHits(hits)
emt.ids <- subjectHits(hits)
cbind(hidenic_file[hid.ids, ], emtek_file[emt.ids, ])

   hidenic_id        hidenic_time emtek_id         entry_emtek          exit_emtek
8      110387 2001-03-13 22:29:56        3 2001-03-13 22:29:56 2001-03-14 22:29:56
9      110388 2001-03-14 01:03:30        3 2001-03-13 22:29:56 2001-03-14 22:29:56
41     110420 2001-06-09 16:56:26        7 2001-06-09 16:56:26 2001-06-10 16:56:26

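PS: to install the package (current Bioconductor releases use BiocManager; the biocLite script that was in use when this answer was written has been retired):

  if (!requireNamespace("BiocManager", quietly = TRUE))
      install.packages("BiocManager")
  BiocManager::install("IRanges")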
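
For readers who would rather not install a Bioconductor package, the same inclusive test (entry <= time point <= exit) can probably be sketched in base R with outer(). This is only an illustration and not part of the original answer: the data frames emtek and hidenic and all their values are hand-made assumptions standing in for emtek_file and hidenic_file, chosen so that one time point falls inside two intervals and one falls inside none.

```r
# Hypothetical, hand-made data (not the random data from the question)
emtek <- data.frame(
  emtek_id    = 1:3,
  entry_emtek = as.POSIXct(c("2001-03-13 22:29:56", "2001-06-09 16:56:26",
                             "2001-06-10 01:00:00"), tz = "UTC")
)
emtek$exit_emtek <- emtek$entry_emtek + 24 * 3600  # each stay lasts 24 hours

hidenic <- data.frame(
  hidenic_id   = c(110387, 110388, 110420, 110421),
  hidenic_time = as.POSIXct(c("2001-03-13 22:29:56", "2001-03-14 01:03:30",
                              "2001-06-10 02:00:00", "2001-12-01 00:00:00"),
                            tz = "UTC")
)

# inside[i, j] is TRUE when time point i lies within interval j (bounds included)
tp     <- as.numeric(hidenic$hidenic_time)
inside <- outer(tp, as.numeric(emtek$entry_emtek), ">=") &
          outer(tp, as.numeric(emtek$exit_emtek),  "<=")

# Turn the TRUE cells into (row, col) index pairs, ordered by time point
hits <- which(inside, arr.ind = TRUE)
hits <- hits[order(hits[, "row"], hits[, "col"]), , drop = FALSE]

matches <- data.frame(hidenic_id = hidenic$hidenic_id[hits[, "row"]],
                      emtek_id   = emtek$emtek_id[hits[, "col"]])

# One vector of emtek_ids per hidenic_id that had at least one match
by_hidenic <- split(matches$emtek_id, matches$hidenic_id)
print(matches)
```

The comparison matrix has one cell per (time point, interval) pair, so this is fine for a few thousand rows but grows quadratically; for large inputs an interval-based tool such as findOverlaps is likely the better choice.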
