Calculating time lag between sequential events after grouping for subsets

Views: 78 Published: 2020/4/26 15:04:44 r datetime dplyr lag lubridate


Problem description

I am trying to calculate the time between sequential observations for different combinations of my columns. I have attached a sample of my data here.

A subset of my data looks like this:

head(d1) #visualize the first few lines of the data

date       time   year    km sps      pp datetime          prev  timedif   seque
<fct>      <fct> <int> <dbl> <fct> <dbl> <chr>            <dbl>    <dbl>   <chr>
2012/06/09 2:22   2012   110 MICRO     0 2012-06-09 02:22     0   260.    00
2012/06/19 2:19   2012    80 MICRO     0 2012-06-19 02:19     1  4144     01
2012/06/19 22:15  2012   110 MICRO     0 2012-06-19 22:15     0   100.    00
2012/06/21 23:23  2012    80 MUXX      1 2012-06-21 23:23     0 33855     10
2012/06/24 2:39   2012   110 MICRO     0 2012-06-24 02:39     0   120.    00
2012/06/29 2:14   2012   110 MICRO     0 2012-06-29 02:14     0    43.7   00

Where:

pp: which species (sps) are predators (coded as 1) and which are prey (coded as 0)
prev: the pp of the very next observation after the current one
timedif: time difference (in seconds?) between the current observation and the next one
seque: the sequence order, where the first digit is the current pp and the second digit is the next pp

To generate the datetime column, I did this:

d1$datetime=strftime(paste(d1$date,d1$time),'%Y-%m-%d %H:%M',usetz=FALSE) #converting the date/time into a new format
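
As a side note, strftime() returns character strings, so the datetime column above has to be converted again before any arithmetic. A minimal sketch, assuming the date and time columns are factors in the formats shown in the sample, parses them straight to POSIXct and takes a difference in explicit units. The toy data frame and the "UTC" time zone are assumptions made for illustration.

```r
# Toy data mimicking the date/time factor columns (values copied from the sample rows)
toy <- data.frame(
  date = factor(c("2012/06/09", "2012/06/19")),
  time = factor(c("2:22", "22:15"))
)

# Parse with an explicit format; tz = "UTC" is an assumption
toy$datetime <- as.POSIXct(
  paste(toy$date, toy$time),
  format = "%Y/%m/%d %H:%M",
  tz = "UTC"
)

# Time difference with the unit stated explicitly
gap_hours <- as.numeric(difftime(toy$datetime[2], toy$datetime[1], units = "hours"))
gap_hours
```

These two timestamps are 10 days, 19 hours and 53 minutes apart, about 259.88 hours. That matches the first timedif value in the sample, which suggests that row may be expressed in hours rather than seconds.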

To make the other columns I used the following code:

library(dplyr)

d1 = d1 %>% 
    ungroup() %>% 
    group_by(km, year) %>% #group by km and year because I don't want time differences calculated between different years or km (i.e., locations)
    arrange(datetime) %>%
    mutate(prev = dplyr::lead(pp)) %>% #`next` is a reserved word in R; the column is called prev in the data
    mutate(timedif = as.numeric(difftime(lead(as.POSIXct(datetime)), as.POSIXct(datetime), units = "secs"))) #explicit units so every group is measured on the same scale
d1 = d1[2:nrow(d1),] %>% mutate(seque = as.factor(paste0(pp,prev)))

I can then extract the average (geometric mean) time between sequences:

library(psych)
geo_avg = d1 %>% group_by(seque) %>% summarise(geometric.mean(timedif))

geo_avg 
# A tibble: 6 x 2
  seque `geometric.mean(timedif)`
  <chr>           <dbl>
1 00             58830. #prey followed by a prey
2 01            147062. #prey followed by a predator
3 0NA               NA  #prey followed by nothing (end of time series)
4 10            178361. #predator followed by prey
5 11              1820. #predator followed by predator
6 1NA               NA  #predator followed by nothing (end of time series)
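
For reference, the geometric mean is the exponential of the mean of the logs. A minimal base R sketch is below, assuming all lags are positive. The helper name geo_mean and the example values are hypothetical, and this is not necessarily how psych computes it internally.

```r
# Geometric mean as exp(mean(log(x))); na.rm drops the NA that lead() leaves
# at the end of each series
geo_mean <- function(x, na.rm = TRUE) {
  exp(mean(log(x), na.rm = na.rm))
}

lags <- c(2, 8, NA)    # made-up lags; NA marks the end of a series
result <- geo_mean(lags)
result                 # 4, since sqrt(2 * 8) = 4
```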

I have one question that can be divided into three parts.

How can I calculate the time difference between:

individuals of the same sps (for example, how long does it take for one MICRO to be followed by the next MICRO?)
opposite-classification prey-predator sequences (01 or 10), species-specific for each prey (pp = 0) or predator (pp = 1) sps (for example, how long does it take for the prey MICRO to be followed by each other predator (pp = 1)?)
same-classification sequences (00 or 11), species-specific for each prey (pp = 0) or predator (pp = 1) sps (for example, how long does it take for the prey MICRO to be followed by any other prey (pp = 0), MICRO or otherwise?)

I would like to be able to do something along these lines:

sps    pp    same_sps     same_class   opposite_class   
MICRO  0     10 days      5 days       2 days    
MUXX   1     15 days      20 days      12 days
etc
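
One possible way to arrive at a table of this shape is sketched below on made-up data. It rests on several assumptions: "same_class" and "opposite_class" are read as the lag to the immediately following observation (as in seque), "same_sps" is the lag to the next record of the same species at the same km and year, and an arithmetic mean is used where a geometric mean could be swapped in. The toy tibble, the helper lag_days, and the intermediate column names are hypothetical.

```r
library(dplyr)

# Made-up observations at a single location and year
toy <- tibble(
  km = 110, year = 2012L,
  sps = c("MICRO", "MUXX", "MICRO", "MICRO", "MUXX"),
  pp  = c(0, 1, 0, 0, 1),
  datetime = as.POSIXct("2012-06-01 00:00", tz = "UTC") + c(0, 1, 3, 6, 10) * 86400
)

# Lag from each timestamp to the following one, in days
lag_days <- function(t) as.numeric(difftime(lead(t), t, units = "days"))

res <- toy %>%
  group_by(km, year) %>%
  arrange(datetime, .by_group = TRUE) %>%
  mutate(next_pp = lead(pp),              # class of the next observation
         lag_any = lag_days(datetime)) %>% # lag to the next observation of any species
  group_by(km, year, sps) %>%
  mutate(lag_same_sps = lag_days(datetime)) %>% # lag to the next record of the same species
  group_by(sps, pp) %>%
  summarise(
    same_sps       = mean(lag_same_sps, na.rm = TRUE),
    same_class     = mean(lag_any[next_pp == pp], na.rm = TRUE),
    opposite_class = mean(lag_any[next_pp != pp], na.rm = TRUE),
    .groups = "drop"
  )

res
```

With this toy data, MUXX is never directly followed by another predator, so its same_class entry comes out as NaN. If the intended meaning is instead "time until the next predator, however many records intervene", the lag would have to be computed differently.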

Just in case, here is the output for dput(d1[1:10,]):

structure(list(
 date = structure(c(11L, 21L, 21L, 23L, 26L, 31L,32L, 37L, 38L, 39L), .Label = c("2012/05/30", "2012/05/31", "2012/06/01", "2015/08/19", "2015/08/20"), class = "factor"), 
 time = structure(c(742L, 739L, 915L, 983L, 759L, 734L, 897L, 769L, 901L, 14L), .Label = c("0:00", "0:01", "0:02", "0:03", "9:58", "9:59"), class = "factor"), 
 year = c(2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L), 
 km = c(110, 80, 110, 80, 110, 110, 110, 110, 110, 110), 
 sps = structure(c(9L, 9L, 9L, 11L, 9L, 9L, 9L, 9L, 9L, 9L), .Label = c("CACA", "ERDO", "FEDO", "LEAM", "LOCA", "MAAM", "MAMO", "MEME", "MICRO", "MUVI", "MUXX", "ONZI", "PRLO", "TAHU", "TAST", "URAM", "VUVU"), class = "factor"), 
 pp = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0), 
 datetime = c("2012-06-09 02:22", "2012-06-19 02:19", "2012-06-19 22:15", "2012-06-21 23:23"), 
 prev = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0), 
 timedif = c(259.883333333333, 4144, 100.4, 43.2, 2.2, 453.083333333333), 
 seque = c("00", "01", "00", "10", "00", "00", "00", "00", "00", "00")), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), 
groups = structure(list(km = c(80, 110), year = c(2012L, 2012L), .rows = list(c(2L, 4L), c(1L, 3L, 5L, 6L, 7L, 8L, 9L, 10L))), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))
