How to summarise overlaps between data points
Question
I have a data set of animals passing an RFID reader. It looks like this:
ID date_time
A 2019-11-02 08:07:47
B 2019-11-02 08:07:48
A 2019-11-02 08:07:49
A 2019-11-02 08:07:50
A 2019-11-02 08:09:12
A 2019-11-02 08:09:13
B 2019-11-02 08:09:17
I recently asked a related question (combine multiple rows into one time interval), and my data, now organised into ten-second intervals, looks like this:
ID start_date_time end_date_time
A 2019-11-02 08:07:47 2019-11-02 08:07:50
B 2019-11-02 08:07:48 2019-11-02 08:07:48
A 2019-11-02 08:09:12 2019-11-02 08:09:13
B 2019-11-02 08:09:17 2019-11-02 08:09:47
I have also added a column that stores each row's interval:
dat$Interval = interval(dat$start_date_time,dat$end_date_time)
I now need to find where these intervals intersect and summarise the result as a count of how many times each pair of animals interacts (that is, is present at the RFID reader at the same time), without repeating reversed pairs (A-B and B-A should be counted once). Something like this:
ID ID2 Interactions(n)
A A 0
A B 1
A C 3
Any help is appreciated.
Recommended answer
This question is not easy to answer, but this might be a good starting point:
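library(tidyverse)
library(lubridate)

# Pair every row with each later row, keeping only ID and Interval (columns 1 and 4)
cbind(
  dat[rep(1:nrow(dat[-1, ]), nrow(dat[-1, ]):1), c(1, 4)],
  setNames(
    dat[unlist(sapply(2:nrow(dat), seq, to = nrow(dat))), c(1, 4)],
    c('ID2', 'Interval2')
  )
) %>%
  mutate(
    # lubridate's intersect() returns NA when two intervals do not overlap
    interacts = intersect(Interval, Interval2),
    Interval = NULL,
    Interval2 = NULL
  ) %>%
  # Keep only the overlapping pairs, then count them per ID pair
  filter(!is.na(as.numeric(interacts))) %>%
  count(ID, ID2)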
# # A tibble: 1 x 3
# ID ID2 n
# <chr> <chr> <int>
# 1 A B 1
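The same pairwise idea can also be sketched in base R, which may be easier to adapt. This is only a minimal sketch under stated assumptions, not the answer's method: the helper name `count_overlaps` and the column names `start` and `end` are hypothetical, the sample data is retyped by hand from the question, intervals are treated as closed (touching end points count as an overlap), and IDs are sorted within each pair so that B-A is counted as A-B.

```r
# Sample intervals from the question, retyped by hand (base R only).
dat <- data.frame(
  ID = c("A", "B", "A", "B"),
  start = as.POSIXct(c("2019-11-02 08:07:47", "2019-11-02 08:07:48",
                       "2019-11-02 08:09:12", "2019-11-02 08:09:17"), tz = "UTC"),
  end = as.POSIXct(c("2019-11-02 08:07:50", "2019-11-02 08:07:48",
                     "2019-11-02 08:09:13", "2019-11-02 08:09:47"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count overlapping intervals per unordered pair of IDs.
count_overlaps <- function(d) {
  empty <- data.frame(ID = character(0), ID2 = character(0), n = integer(0),
                      stringsAsFactors = FALSE)
  if (nrow(d) < 2) return(empty)

  # Every unordered pair of row indices (i < j).
  pairs <- combn(nrow(d), 2)
  i <- pairs[1, ]
  j <- pairs[2, ]

  # Assumption: closed intervals, so two intervals overlap when each one
  # starts no later than the other one ends.
  hit <- d$start[i] <= d$end[j] & d$start[j] <= d$end[i]

  # Ignore an animal overlapping with itself.
  a <- as.character(d$ID[i])
  b <- as.character(d$ID[j])
  hit <- hit & a != b
  if (!any(hit)) return(empty)

  # Sort the two IDs within each pair so B-A is reported as A-B.
  swap <- a > b
  out <- data.frame(
    ID  = ifelse(swap, b, a)[hit],
    ID2 = ifelse(swap, a, b)[hit],
    n   = 1L,
    stringsAsFactors = FALSE
  )

  # Sum the hits per ID pair.
  aggregate(n ~ ID + ID2, data = out, FUN = sum)
}

res <- count_overlaps(dat)
print(res)
```

On the sample data this yields a single A-B row with a count of 1, in line with the output above. The approach compares every row with every other row, so the work grows quadratically with the number of intervals; for large data sets a dedicated overlap join is probably a better fit.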