R: Calculate time difference between events


Problem description

I have the following data:

Timestamp              Tag    OldValue  NewValue
5/4/2015 8:45   MD_LL46001_BYP  NORMAL  MAN_BYP
5/7/2015 20:46  MD_LL46001_BYP  MAN_BYP NORMAL
5/4/2015 2:58   MD_LL46101_BYP  MAN_BYP NORMAL
5/7/2015 20:47  MD_LL47401_BYP  MAN_BYP NORMAL
5/7/2015 20:47  MD_LL47401_BYP  NORMAL  MAN_BYP
5/7/2015 20:47  MD_LL47401_BYP  MAN_BYP NORMAL
5/11/2015 1:37  MD_LL47401_BYP  NORMAL  MAN_BYP
5/11/2015 2:04  MD_LL47401_BYP  MAN_BYP NORMAL
5/11/2015 11:03 MD_LL47401_BYP  NORMAL  MAN_BYP
5/11/2015 11:11 MD_LL47401_BYP  MAN_BYP NORMAL
5/10/2015 13:16 MD_LXL21102_BYP NORMAL  MAN_BYP
5/10/2015 19:42 MD_LXL21102_BYP MAN_BYP NORMAL
5/4/2015 11:40  MD_PL41201_BYP  NORMAL  MAN_BYP
5/4/2015 11:57  MD_PL41201_BYP  MAN_BYP NORMAL

I'd like to calculate the time difference between events ("change from NORMAL to MAN_BYP" and "change from MAN_BYP to NORMAL") for the same tag. Is that possible in R? The output I'd like to see:

Tag             Bypass put in   Duration
MD_LL46001_BYP  5/4/2015 8:45   xxx minutes
MD_LL47401_BYP  5/7/2015 20:47  xx minutes

The events and tags are not necessarily placed consecutively in the table.

Recommended answer

Welcome to SO! I'm deliberately using dplyr to refresh my skills, so be aware that some things may not be very efficient or idiomatic. Anyway, here's the workflow:

  1. Build proper date-time objects and encode the status changes
  2. Drop the redundant columns and sort by tag and time
  3. Apply the described logic to each tag
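
Step 1 can be tried in isolation before running the whole pipeline. The following is a minimal base R sketch using two timestamps from the question. The `tz = "UTC"` argument is an assumption added here so that the result does not depend on the local time zone, and the names `t_in`, `t_out` and `bypass_mins` are illustrative only.

```r
# Parse two timestamps from the question's data (month/day/year hour:minute).
# tz = "UTC" is an assumption; without it the local time zone is used.
t_in  <- as.POSIXct("5/4/2015 8:45",  format = "%m/%d/%Y %H:%M", tz = "UTC")
t_out <- as.POSIXct("5/7/2015 20:46", format = "%m/%d/%Y %H:%M", tz = "UTC")

# difftime() returns the elapsed time in the requested units;
# as.numeric() drops the units attribute and leaves a plain number.
bypass_mins <- as.numeric(difftime(t_out, t_in, units = "mins"))
print(bypass_mins)  # 3 days, 12 hours and 1 minute, i.e. 5041 minutes
```
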

The code is as follows:

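library(dplyr)

# Read the copied data rows (without the header line). Reading from
# "clipboard" is platform-dependent. The timestamp splits on whitespace
# into separate date and time columns, giving five columns in total.
df <- read.table("clipboard")
names(df) <- c("date", "time", "tag", "status1", "status2")

# Step 1: build date-time objects and encode the direction of each change:
# 1 = NORMAL -> MAN_BYP (bypass put in), 2 = MAN_BYP -> NORMAL (bypass removed)
df$datetime <- as.POSIXct(strptime(paste(df$date, df$time), "%m/%d/%Y %H:%M"))
df$status_change <- 
  ifelse(df$status1 == "NORMAL" & df$status2 == "MAN_BYP", 1, 
         ifelse(df$status2 == "NORMAL" & df$status1 == "MAN_BYP", 2, NA))

# Step 2: keep only the needed columns and sort by tag and time
df2 <- df %>% 
  select(datetime, tag, status_change) %>% 
  arrange(tag, datetime)

# Step 3: within each tag, record the minutes elapsed since the previous
# event wherever the status code differs from the previous row
df2 %>% 
  group_by(tag) %>% 
  do({
    k <- nrow(.)
    res <- rep(NA, k)
    # TRUE where status_change differs from the previous row; first row is FALSE
    ind <- as.logical(c(0, .$status_change[-1] - .$status_change[-k]))
    # minutes between each row and the one before it
    dmins <- c(NA, difftime(.$datetime[-1], .$datetime[-k], units = "mins"))
    res[ind] <- dmins[ind]
    data.frame(res)
  }) %>% 
  (function(x) cbind(df2, x[, -1]))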

Result:

              datetime             tag status_change  res
1  2015-05-04 08:45:00  MD_LL46001_BYP             1   NA
2  2015-05-07 20:46:00  MD_LL46001_BYP             2 5041
3  2015-05-04 02:58:00  MD_LL46101_BYP             2   NA
4  2015-05-07 20:47:00  MD_LL47401_BYP             2   NA
5  2015-05-07 20:47:00  MD_LL47401_BYP             1    0
6  2015-05-07 20:47:00  MD_LL47401_BYP             2    0
7  2015-05-11 01:37:00  MD_LL47401_BYP             1 4610
8  2015-05-11 02:04:00  MD_LL47401_BYP             2   27
9  2015-05-11 11:03:00  MD_LL47401_BYP             1  539
10 2015-05-11 11:11:00  MD_LL47401_BYP             2    8
11 2015-05-10 13:16:00 MD_LXL21102_BYP             1   NA
12 2015-05-10 19:42:00 MD_LXL21102_BYP             2  386
13 2015-05-04 11:40:00  MD_PL41201_BYP             1   NA
14 2015-05-04 11:57:00  MD_PL41201_BYP             2   17

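Note that 0 and NA do not mean the same thing: 0 is a real zero-minute gap between two events logged in the same minute, whereas NA means no difference was computed (for example, for the first event of a tag).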
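
To get closer to the output format requested in the question (tag, time the bypass was put in, duration), one possible approach is to pair each "back to NORMAL" event with the event directly before it for the same tag. The sketch below is not part of the original answer. It uses `dplyr::lag()`, enters a few rows from the question inline instead of reading the clipboard, and assumes `tz = "UTC"`. The names `events`, `bypasses`, `prev_value`, `bypass_put_in` and `duration_mins` are made up for illustration.

```r
library(dplyr)

# A few rows from the question, entered inline; the last two are
# deliberately out of order to show that sorting handles it.
events <- data.frame(
  timestamp = c("5/4/2015 8:45", "5/7/2015 20:46", "5/10/2015 13:16",
                "5/10/2015 19:42", "5/4/2015 11:57", "5/4/2015 11:40"),
  tag = c("MD_LL46001_BYP", "MD_LL46001_BYP", "MD_LXL21102_BYP",
          "MD_LXL21102_BYP", "MD_PL41201_BYP", "MD_PL41201_BYP"),
  new_value = c("MAN_BYP", "NORMAL", "MAN_BYP", "NORMAL", "NORMAL", "MAN_BYP"),
  stringsAsFactors = FALSE
)

bypasses <- events %>%
  # tz = "UTC" is an assumption made for reproducibility
  mutate(datetime = as.POSIXct(timestamp, format = "%m/%d/%Y %H:%M",
                               tz = "UTC")) %>%
  arrange(tag, datetime) %>%
  group_by(tag) %>%
  mutate(
    prev_value    = lag(new_value),   # state set by the previous event
    bypass_put_in = lag(datetime),    # time of the previous event
    duration_mins = as.numeric(difftime(datetime, bypass_put_in,
                                        units = "mins"))
  ) %>%
  ungroup() %>%
  # keep "back to NORMAL" events that directly follow a "to MAN_BYP" event
  filter(new_value == "NORMAL", prev_value == "MAN_BYP") %>%
  select(tag, bypass_put_in, duration_mins)

print(bypasses)
```

For these six rows the durations come out as 5041, 386 and 17 minutes, in line with the result table above. When several events of one tag share the same minute, as for MD_LL47401_BYP at 20:47, the timestamps alone do not determine their order, so the pairing then depends on the order of the input rows.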
