Rolling joins: roll forwards and backwards

Question

data.table is awesome, because I can do rolling joins, and even do rolling joins within groups!

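library(data.table)
set.seed(42)
# Three IDs observed over different, partly overlapping time ranges
metrics <- data.frame(
  ID=c(rep(1, 10), rep(2,5), rep(3,5)),
  Time=c(1:10, 4:8, 8:12),
  val1=runif(20),
  val2=runif(20),
  val3=runif(20),
  val4=runif(20)
  )
# Keep a random 15 of the 20 rows so there are gaps, and key by ID and Time
metrics <- data.table(metrics[sample(1:nrow(metrics), 15),], key=c('ID', 'Time'))
# Complete ID x Time grid to join against
calendar <- data.table(expand.grid(ID=1:3, Time=1:12), key=c('ID', 'Time'))

# Rolling join: within each ID, carry the last observation forward
metrics[calendar,roll=TRUE]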

However, this isn't awesome enough for me. The resulting data.table still has NAs:

> metrics[calendar,roll=TRUE]
    ID Time      val1      val2      val3        val4
 1:  1    1 0.9148060 0.9040314 0.3795592 0.675607275
 2:  1    2 0.9370754 0.1387102 0.4357716 0.982817198
 3:  1    3 0.9370754 0.1387102 0.4357716 0.982817198
 4:  1    4 0.8304476 0.9466682 0.9735399 0.566488424
 5:  1    5 0.8304476 0.9466682 0.9735399 0.566488424
 6:  1    6 0.5190959 0.5142118 0.9575766 0.189473935
 7:  1    7 0.7365883 0.3902035 0.8877549 0.271286615
 8:  1    8 0.7365883 0.3902035 0.8877549 0.271286615
 9:  1    9 0.6569923 0.4469696 0.9709666 0.693204820
10:  1   10 0.7050648 0.8360043 0.6188382 0.240544740
11:  1   11 0.7050648 0.8360043 0.6188382 0.240544740
12:  1   12 0.7050648 0.8360043 0.6188382 0.240544740
13:  2    1        NA        NA        NA          NA
14:  2    2        NA        NA        NA          NA
15:  2    3        NA        NA        NA          NA
16:  2    4 0.4577418 0.7375956 0.3334272 0.042988796
17:  2    5 0.7191123 0.8110551 0.3467482 0.140479094
18:  2    6 0.9346722 0.3881083 0.3984854 0.216385415
19:  2    7 0.2554288 0.6851697 0.7846928 0.479398564
20:  2    8 0.2554288 0.6851697 0.7846928 0.479398564
21:  2    9 0.2554288 0.6851697 0.7846928 0.479398564
22:  2   10 0.2554288 0.6851697 0.7846928 0.479398564
23:  2   11 0.2554288 0.6851697 0.7846928 0.479398564
24:  2   12 0.2554288 0.6851697 0.7846928 0.479398564
25:  3    1        NA        NA        NA          NA
26:  3    2        NA        NA        NA          NA
27:  3    3        NA        NA        NA          NA
28:  3    4        NA        NA        NA          NA
29:  3    5        NA        NA        NA          NA
30:  3    6        NA        NA        NA          NA
31:  3    7        NA        NA        NA          NA
32:  3    8 0.9400145 0.8329161 0.7487954 0.719355838
33:  3    9 0.9400145 0.8329161 0.7487954 0.719355838
34:  3   10 0.1174874 0.2076590 0.1712643 0.375489965
35:  3   11 0.4749971 0.9066014 0.2610880 0.514407708
36:  3   12 0.5603327 0.6117786 0.5144129 0.001570554
    ID Time      val1      val2      val3        val4

I could fill these NAs using zoo::na.locf with fromLast=TRUE, but that's not very fun. Can anyone think of an elegant way to roll the NAs backward (after rolling them forward) during the data.table join?

Recommended answer

This is possible as of data.table version 1.8.8, released in March 2013:

metrics[calendar, roll=TRUE, rollends=c(TRUE, TRUE)]
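To see what rollends changes, here is a minimal sketch on a tiny hand-made dataset. The tables `obs` and `grid` and their values are made up for illustration and are not taken from the question. It compares the default forward roll with the same join plus rollends=c(TRUE, TRUE).

```r
library(data.table)

# Hypothetical toy data: two IDs observed at only a few time points.
obs <- data.table(
  ID   = c(1L, 1L, 2L),
  Time = c(3L, 5L, 4L),
  val  = c(10, 20, 30),
  key  = c("ID", "Time")
)

# Full ID x Time grid; CJ() returns a keyed cross join.
grid <- CJ(ID = 1:2, Time = 1:6)

# Forward roll only: rows before each ID's first observation stay NA.
fwd <- obs[grid, roll = TRUE]

# Forward roll, plus each ID's first observation rolled backwards.
both <- obs[grid, roll = TRUE, rollends = c(TRUE, TRUE)]

print(cbind(grid, fwd = fwd$val, both = both$val))
```

Because ID is matched exactly and only Time is rolled, the backward fill should stay within each ID rather than borrowing values from a neighbouring group.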

From the data.table NEWS archive:

In addition to TRUE/FALSE, 'roll' may now be a positive number (roll forwards/LOCF) or negative number (roll backwards/NOCB). A finite number limits the distance a value is rolled (limited staleness). roll=TRUE and roll=+Inf are equivalent. 'rollends' is a new parameter holding two logicals. The first observation is rolled backwards if the first value of rollends is TRUE. The last observation is rolled forwards if the second value of rollends is TRUE. If roll is a finite number, the same limit applies to the ends. New value roll='nearest' joins to the nearest value (either backwards or forwards) when the value falls in a gap, and to the end value according to 'rollends'. 'rolltolast' has been deprecated. For backwards compatibility it is converted to {roll=TRUE; rollends=c(FALSE,FALSE)}.
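The roll values described in this NEWS entry can be compared side by side. The sketch below uses a made-up single-key example: `obs` and `grid` are hypothetical, with observations at Time 2 and 7 only. Each join relies on the default rollends for its roll value.

```r
library(data.table)

# Hypothetical toy data: observations at Time 2 and Time 7 only.
obs  <- data.table(Time = c(2L, 7L), val = c(10, 20), key = "Time")
grid <- data.table(Time = 1:10, key = "Time")

locf    <- obs[grid, roll = TRUE]       # roll forwards (same as roll = +Inf)
nocb    <- obs[grid, roll = -Inf]       # roll backwards
limited <- obs[grid, roll = 2]          # roll forwards, at most 2 time units
nearest <- obs[grid, roll = "nearest"]  # join to the closest observation

print(data.table(
  Time    = grid$Time,
  locf    = locf$val,
  nocb    = nocb$val,
  limited = limited$val,
  nearest = nearest$val
))
```

In this sketch the forward roll leaves Time 1 as NA, the backward roll leaves Time 8 to 10 as NA, and roll="nearest" fills every row because its ends are rolled in both directions by default.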

As always, to download the most up-to-date version of data.table, see Installation.
