Concatenate periods to get time sequences, simultaneously for different starting points

Published: 2020/10/15 20:48:24. Tags: r, data.table, sequence, purrr

Problem description

I have the following example data:

library(data.table)
set.seed(42)
t <- data.table(time=1:1000, period=round(runif(100,1,5)))
p <- data.table(id=1:10, cut=sample(1:100,5))


> t[62:71]
    time period
 1:   62      5
 2:   63      4
 3:   64      3
 4:   65      4
 5:   66      2
 6:   67      2
 7:   68      4
 8:   69      4
 9:   70      2
10:   71      1

> head(p)
   id cut
1:  1  63
2:  2  22
3:  3  99
4:  4  38
5:  5  91
6:  6  63

where t gives a vector of periods associated with time points, and p gives a cutoff in time for each person.

For each person in p, I would like to start at that person's cutoff and create a sequence of 4 time points by concatenating the periods. For example, for person 1, starting at time 63, the sequence would be 63, 63+4=67, 67+2=69 and 69+4=73.
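The chaining in this example can be traced by hand. The sketch below uses only base R and hard-codes the periods from the t[62:71] printout above; the names period_at and next_time are illustrative and not part of the question.

```r
# Periods for time points 62..71, copied from the t[62:71] printout.
period_at <- c(5, 4, 3, 4, 2, 2, 4, 4, 2, 1)
names(period_at) <- 62:71

# Hypothetical helper: the next time point is the current one plus its period.
next_time <- function(time) time + unname(period_at[as.character(time)])

t1 <- 63
t2 <- next_time(t1)  # 63 + 4
t3 <- next_time(t2)  # 67 + 2
t4 <- next_time(t3)  # 69 + 4
c(t1, t2, t3, t4)
```

Each step depends on the result of the previous one, so the sequence cannot be produced by a plain cumulative sum over consecutive rows.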

Ideally, the output would then be:

> head(res)
   id  t1   t2   t3   t4
    1  63   67   69   73
    2  22   24   29   32
    3  99  103  105  109
    4  38   40   43   44
    5  91   95  100  103
    6  63   67   69   73

I previously learned how to create such sequences with purrr::accumulate (an iterative cumulative sum in which the running sum determines the next position to add). However, I wonder whether this can be done for all persons simultaneously using data.table or another package, while avoiding for-loops, as the datasets are rather large.
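For a single starting point, the iterative approach described here might look roughly like the following. This sketch uses base R's Reduce(accumulate = TRUE), which plays a role similar to purrr::accumulate; the toy period vector and the function name chain_times are assumptions made for illustration.

```r
# Toy period vector (an assumption): period[i] belongs to time point i.
period <- c(2, 1, 3, 2, 1, 2, 4, 1, 1, 3)

# Hypothetical helper: start at `start` and hop n - 1 times, each hop adding
# the period found at the current time point. `step` is only a counter.
chain_times <- function(start, period, n = 4) {
  Reduce(function(current, step) current + period[current],
         x = seq_len(n - 1), init = start, accumulate = TRUE)
}

chain_times(1, period)
chain_times(3, period)
```

Calling such a helper once per person works, but it repeats the hops separately for every id, which is the cost the question hopes to avoid.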

Edit: a version where the time values do not coincide with the row indices

library(data.table)
set.seed(42)
t <- data.table(time=1001:2000, period=round(runif(100,1,5)))
p <- data.table(id=1:10, cut=sample(1001:1100,5))  # cutoffs must be actual values of t$time

Similar to the above, except that

> t[62:71]
    time period
 1: 1062      5
 2: 1063      4
 3: 1064      3
 4: 1065      4
 5: 1066      2
 6: 1067      2
 7: 1068      4
 8: 1069      4
 9: 1070      2
10: 1071      1

where t$time[i] does not equal i, which rules out Jaap's first solution.

Recommended answer

For-loops aren't necessarily bad or inefficient. Used correctly, they can be an efficient solution to your problem.

For your current problem, I would use a for-loop with the data.table package. This is efficient because the data.table is updated by reference:

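# Start from each person's cutoff as the first time point.
res <- p[, .(id, t1 = cut)]

# Column i of res holds the previous time point; use it as a row index into t.
for(i in 2:4) {
  res[, paste0("t",i) := t[res[[i]], time + period] ]
}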

which gives:

> res
    id t1  t2  t3  t4
 1:  1 63  67  69  73
 2:  2 22  24  29  32
 3:  3 99 103 105 109
 4:  4 38  40  43  44
 5:  5 91  95 100 103
 6:  6 63  67  69  73
 7:  7 22  24  29  32
 8:  8 99 103 105 109
 9:  9 38  40  43  44
10: 10 91  95 100 103
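The reason a loop is cheap here is that it runs over the steps of the sequence (three iterations), not over the persons; each iteration handles all persons at once through vectorised indexing. A minimal base R sketch of the same idea, with a made-up period vector and cutoffs, and a plain matrix standing in for the data.table:

```r
# Toy data (assumptions): period[i] belongs to time i; one cutoff per person.
period <- c(2, 1, 3, 2, 1, 2, 4, 1, 1, 3)
cut    <- c(1, 3, 1, 2)

res <- matrix(NA_real_, nrow = length(cut), ncol = 4,
              dimnames = list(NULL, paste0("t", 1:4)))
res[, 1] <- cut

# Three iterations in total, regardless of how many persons there are.
for (i in 2:4) {
  prev <- res[, i - 1]
  res[, i] <- prev + period[prev]  # vectorised lookup for all persons
}
res
```

The number of loop iterations depends only on the sequence length, so the cost per iteration is a single vectorised lookup even for large datasets.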

Alternatively, you can update p directly:

for(i in 2:4) {
  p[, paste0("t",i) := t[p[[i]], time + period]]
}
setnames(p, "cut", "t1")

which gives the same result.

For the updated example data, change the above method to:

for(i in 2:4) {
  p[, paste0("t",i) := t[match(p[[i]], t$time), time + period]]
}
setnames(p, "cut", "t1")
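One edge case worth keeping in mind with a match()-based lookup: when a chain runs past the last time value in the lookup table, match() returns NA and the remaining time points become NA. The sketch below illustrates this on toy data; the tables tt and pp and their values are assumptions, named so as not to mask base::t, and data.table's set() is used here as an alternative way to add columns by reference.

```r
library(data.table)

# Toy data (assumed for illustration): time values offset from row indices.
tt <- data.table(time = 1001:1010,
                 period = c(2, 1, 3, 2, 1, 2, 4, 1, 1, 3))
pp <- data.table(id = 1:3, t1 = c(1001, 1003, 1009))

for (i in 2:4) {
  prev <- pp[[paste0("t", i - 1)]]
  row  <- match(prev, tt$time)  # NA when prev is not a time value in tt
  set(pp, j = paste0("t", i), value = prev + tt$period[row])
}
pp
```

Here the third person reaches 1013, which is beyond the last time value, so their t4 is NA rather than an error.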
