How can I tell if a time point falls between a set of before and after times?
Problem description
I was trying to answer a question on Stack Overflow (Mapping multiple IDs using R) when I got stuck on how to finish it. Namely, how can I test whether a time point falls between a set of before and after time points?
The user in that post did not provide a reproducible example, but here is what I came up with. I want to test the time points in hidenic_file$hidenic_time against the before and after times in the data frame emtek_file, and return the emtek_id values whose time frame matches each hidenic_id. The poster didn't mention it, but it seems possible that multiple emtek_id values are returned for each hidenic_id.
library(zoo)
date_string <- paste("2001", sample(12, 10, replace = TRUE), sample(28, 10), sep = "-")
time_string <- c("23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26",
"23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26")
entry_emtek <- strptime(paste(date_string, time_string), "%Y-%m-%d %H:%M:%S")
entry_emtek <- entry_emtek[order(entry_emtek)]
exit_emtek <- entry_emtek + 3600 * 24
emtek_file <- data.frame(emtek_id = 1:10, entry_emtek, exit_emtek)
hidenic_id <- 110380:110479
date_string <- paste("2001", sample(12, 100, replace = TRUE), sample(28, 100, replace = TRUE), sep = "-")
time_string <- rep(c("23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26",
"23:03:20", "22:29:56", "01:03:30", "18:21:03", "16:56:26"),10)
hidenic_time <- strptime(paste(date_string, time_string), "%Y-%m-%d %H:%M:%S")
hidenic_time <- hidenic_time[order(hidenic_time)]
hidenic_file <- data.frame(hidenic_id, hidenic_time)
## This is where I fail to write concise, working code that finds what I want.
combined_file <- list()
for (i in seq(hidenic_file[, 1])) {
  for (j in seq(emtek_file[, 1])) {
    if (length(zoo(1, emtek_file[j, 2:3]) + zoo(1, hidenic_file[i, 2])) == 0) {next}
    if (length(zoo(1, emtek_file[j, 2:3]) + zoo(1, hidenic_file[i, 2])) == 1) {combined_file[[i]] <- c(combined_file[[i]], emtek_file[j, 1])}
  }
  names(combined_file)[i] <- hidenic_file[i, 1]
}
Recommended answer
I am not sure I understand everything you want to do, since you don't provide the expected result. Here is a solution using the IRanges package. It may not be easy to understand at first reading, but it is extremely useful for finding overlaps between continuous intervals.
library(IRanges)
## create the time intervals (entry to exit), as seconds since the epoch
subject <- IRanges(as.numeric(emtek_file$entry_emtek),
                   as.numeric(emtek_file$exit_emtek))
## create single-point intervals for the time points (start = end here)
query <- IRanges(as.numeric(hidenic_file$hidenic_time),
                 as.numeric(hidenic_file$hidenic_time))
## find the overlaps once, then extract the matching rows (time points and intervals)
hits <- findOverlaps(query, subject)
hid.ids <- queryHits(hits)
emt.ids <- subjectHits(hits)
cbind(hidenic_file[hid.ids, ], emtek_file[emt.ids, ])
   hidenic_id        hidenic_time emtek_id         entry_emtek          exit_emtek
8      110387 2001-03-13 22:29:56        3 2001-03-13 22:29:56 2001-03-14 22:29:56
9      110388 2001-03-14 01:03:30        3 2001-03-13 22:29:56 2001-03-14 22:29:56
41     110420 2001-06-09 16:56:26        7 2001-06-09 16:56:26 2001-06-10 16:56:26
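If installing IRanges is not an option, the same containment test can be sketched in base R. This is only a minimal illustration, not part of the original answer: it uses a small hand-made data set (the dates below are made-up values, not the random data generated in the question), treats each interval as closed (entry <= t <= exit), and returns a list with one element per hidenic_id so that several emtek_id values can be reported for the same time point. The name `matches` is introduced here for illustration only.

```r
# Toy data: hypothetical values chosen for illustration only
emtek_file <- data.frame(
  emtek_id    = 1:3,
  entry_emtek = as.POSIXct(c("2001-03-13 22:29:56",
                             "2001-03-14 12:00:00",
                             "2001-06-09 16:56:26"), tz = "UTC")
)
# Each stay lasts 24 hours, as in the question's example
emtek_file$exit_emtek <- emtek_file$entry_emtek + 3600 * 24

hidenic_file <- data.frame(
  hidenic_id   = c(110387, 110388, 110420, 110421),
  hidenic_time = as.POSIXct(c("2001-03-13 22:29:56",
                              "2001-03-14 13:00:00",
                              "2001-06-10 01:00:00",
                              "2001-09-01 00:00:00"), tz = "UTC")
)

# For each time point, collect the ids of every interval that contains it
matches <- lapply(seq_len(nrow(hidenic_file)), function(i) {
  t <- hidenic_file$hidenic_time[i]
  inside <- emtek_file$entry_emtek <= t & t <= emtek_file$exit_emtek
  emtek_file$emtek_id[inside]
})
names(matches) <- as.character(hidenic_file$hidenic_id)

print(matches)
```

This loop compares every time point with every interval, so it is fine for small inputs like the example here, while an interval-based approach such as findOverlaps is likely the better choice for large tables.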
PS: to install the package:
## current Bioconductor releases install through BiocManager (biocLite() is no longer supported)
install.packages("BiocManager")
BiocManager::install("IRanges")