Expand periods to regularly occurring timestamps


Question

I have a tibble of time-based data with a start time, an end time, and a class variable, of the following general form:

Code to create the table:

library(lubridate)
library(tibble)  # provides tibble()
st <- c(ymd_hms("2016-01-01 00:35:00"),
        ymd_hms("2016-01-01 00:39:00"),
        ymd_hms("2016-01-01 00:54:00"),
        ymd_hms("2016-01-01 00:56:00"),
        ymd_hms("2016-01-01 00:57:00"))

en <- c(ymd_hms("2016-01-01 00:36:00"),
        ymd_hms("2016-01-01 00:45:00"),
        ymd_hms("2016-01-01 00:55:00"),
        ymd_hms("2016-01-01 00:57:00"),
        ymd_hms("2016-01-01 00:58:00"))

cl <- c("a","a","a","b","b")

df <- tibble(st,en,cl)

The periods vary in length, and there is a hidden class in the data: in this example, any time not explicitly listed in the data belongs to a third class.

I need a way to expand this table into regular 1-minute periods so that I can assign the missing class to the periods that are not listed. The goal is to get to:

I am sure this can be done with dplyr and lubridate, but I have not been able to accomplish it. Keep in mind that my data set is huge, so an approach without explicit loops would be preferable.

Thanks in advance,

MR

Recommended answer

Try this:

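# Regular 1-minute grid of start times; each end time is the start plus 60 seconds
df_exp <- tibble(st = seq.POSIXt(from = min(df$st), to = max(df$st), by = "min"),
                 en = st + 60)
# merge() joins on the shared columns st and en, so only periods that are
# exactly one minute long line up with a grid row
merge(df_exp, df, all = TRUE)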

First, create all start times. The end time is just the start time plus one minute. Then merge the result with the data frame containing the class info. By the way: your start and end times overlap at the boundaries (for example, 00:57:00 is both the end of one period and the start of the next), which might be an issue for some tasks...
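To make the boundary remark concrete, here is a minimal base R sketch that compares each period's end with the next period's start. It rebuilds the question's timestamps with `as.POSIXct()` so that it needs no packages, and it assumes the rows are sorted by start time. The names `shared_boundary` and `true_overlap` are made up for this illustration.

```r
# Same start/end times as in the question, built with base R only
st <- as.POSIXct(c("2016-01-01 00:35:00", "2016-01-01 00:39:00",
                   "2016-01-01 00:54:00", "2016-01-01 00:56:00",
                   "2016-01-01 00:57:00"), tz = "UTC")
en <- as.POSIXct(c("2016-01-01 00:36:00", "2016-01-01 00:45:00",
                   "2016-01-01 00:55:00", "2016-01-01 00:57:00",
                   "2016-01-01 00:58:00"), tz = "UTC")

# Compare each period's end with the start of the following period
# (assumes the rows are already sorted by start time)
n <- length(st)
shared_boundary <- en[-n] == st[-1]  # end coincides with the next start
true_overlap    <- en[-n] >  st[-1]  # end falls inside the next period

which(shared_boundary)  # 4: row 4 ends exactly where row 5 starts
any(true_overlap)       # FALSE: no period runs into the next one
```

For this sample the periods only touch and never truly overlap, so treating each period as half-open (start included, end excluded) gives every minute at most one class.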

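library(tidyr)
library(dplyr)

# Regular 1-minute grid covering the whole observed range
df_exp <- tibble(st = seq.POSIXt(from = min(df$st), to = max(df$en), by = "min"),
                 en = st + 60)

# with tidyr 0.8
# Expand each period into its 1-minute start times. head(x, -1) drops the
# last element of the sequence, which is the period's end, not a start.
df_n <- df %>% 
  rowwise() %>% 
  mutate(st = list(as.character(head(seq.POSIXt(from = st, to = en, by = "min"), -1)))) %>% 
  unnest() %>%                                  # tidyr >= 1.0 expects unnest(cols = st)
  select(-en) %>% 
  mutate(st = as.POSIXct(st, tz = "UTC"))       # ymd_hms() produced UTC times

# Minutes that belong to no listed period get NA in cl
df_exp %>% left_join(df_n, by = "st")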

# with tidyr 0.8.1 (untested)
# Same expansion, but keeping the POSIXct values instead of converting to character
df_n <- df %>% 
  rowwise() %>% 
  mutate(st = list(head(seq.POSIXt(from = st, to = en, by = "min"), -1))) %>% 
  unnest() %>% 
  select(-en)

df_exp %>% left_join(df_n, by = "st")
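
Since the question mentions a huge data set and a preference for avoiding loops, the expansion can also be sketched without `rowwise()`, using only vectorised base R calls. This is an alternative sketch, not the method of the answer above. It assumes every period is a whole number of minutes long, it ends the grid one minute before the last end time so that no grid period extends past the data, and it uses "c" as a placeholder name for the hidden class. The names `n_min`, `idx` and `grid` are made up for this illustration.

```r
# Data from the question, rebuilt with base R so the sketch needs no packages
st <- as.POSIXct(c("2016-01-01 00:35:00", "2016-01-01 00:39:00",
                   "2016-01-01 00:54:00", "2016-01-01 00:56:00",
                   "2016-01-01 00:57:00"), tz = "UTC")
en <- as.POSIXct(c("2016-01-01 00:36:00", "2016-01-01 00:45:00",
                   "2016-01-01 00:55:00", "2016-01-01 00:57:00",
                   "2016-01-01 00:58:00"), tz = "UTC")
cl <- c("a", "a", "a", "b", "b")
df <- data.frame(st, en, cl, stringsAsFactors = FALSE)

# Number of whole minutes in each period (assumption: whole-minute periods)
n_min <- as.integer(round(as.numeric(difftime(df$en, df$st, units = "mins"))))

# Repeat each row once per minute and shift the start time by 0, 1, 2, ... minutes
idx <- rep(seq_len(nrow(df)), times = n_min)
df_n <- data.frame(st = df$st[idx] + 60 * (sequence(n_min) - 1),
                   cl = df$cl[idx], stringsAsFactors = FALSE)

# Regular 1-minute grid; the last grid period ends at the last observed end time
grid <- data.frame(st = seq(min(df$st), max(df$en) - 60, by = "min"))
grid$en <- grid$st + 60

# Look up the class of each grid minute; unmatched minutes get the
# placeholder class "c" (hypothetical name for the hidden class)
grid$cl <- df_n$cl[match(as.numeric(grid$st), as.numeric(df_n$st))]
grid$cl[is.na(grid$cl)] <- "c"

table(grid$cl)
```

With the sample data this gives 23 one-minute rows: 8 in class a, 2 in class b, and 13 in the placeholder class. The timestamps are compared as plain seconds through `as.numeric()`, which is safe here because every value is a whole number of seconds.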
