data.table roll join not working correctly

Question

I'm trying to roll join two data.tables. Here's an example:

library(data.table)

tmp1 <- data.table(structure(list(Code = c("AED", "AED", "AED", "AED", "AED"),
                                  Date = structure(c(97286400, 97372800, 97459200, 97545600, 97632000),
                                                   class = c("POSIXct", "POSIXt"), tzone = "UTC")),
                             .Names = c("Code", "Date"), row.names = c(NA, -5L), class = "data.frame"))

tmp2 <- data.table(structure(list(Date = structure(c(97286400, 99705600, 102297600), tzone = "UTC",
                                                   class = c("POSIXct", "POSIXt")),
                                  Val = c(4.39, 3.96, 3.9474), Code = c("AED", "AED", "AED")),
                             .Names = c("Date", "Val", "Code"), row.names = c(NA, -3L), class = "data.frame"))

> tmp1
   Code       Date
1:  AED 1973-01-31
2:  AED 1973-02-01
3:  AED 1973-02-02
4:  AED 1973-02-03
5:  AED 1973-02-04

> tmp2
         Date    Val Code
1: 1973-01-31 4.3900  AED
2: 1973-02-28 3.9600  AED
3: 1973-03-30 3.9474  AED

> setkey(tmp1,Code,Date)

> setkey(tmp2,Code,Date)

> tmp2[tmp1,roll=TRUE]
         Date  Val Code
1: 1973-01-31 4.39  AED
2: 1973-02-01 4.39  AED
3: 1973-02-02 4.39  AED
4: 1973-02-03 4.39  AED
5: 1973-02-04 4.39  AED

> tmp2[tmp1,roll=2]
         Date  Val Code
1: 1973-01-31 4.39  AED
2: 1973-02-01   NA  AED
3: 1973-02-02   NA  AED
4: 1973-02-03   NA  AED
5: 1973-02-04   NA  AED

The first roll works correctly. In the second example, I would expect 4.39 to be carried forward to 1973-02-02, as per the documentation: "When roll is a positive number, this limits how far values are carried forward." I'd expect to see:

> tmp2[tmp1,roll=2]
         Date  Val Code
1: 1973-01-31 4.39  AED
2: 1973-02-01 4.39  AED
3: 1973-02-02 4.39  AED
4: 1973-02-03   NA  AED
5: 1973-02-04   NA  AED

Is this a bug or am I misinterpreting the functionality?

Recommended answer

You're interpreting it correctly. The reason is that your Date column is POSIXct, so the roll value is measured in seconds, not days. Set roll to 2 days, expressed in seconds:

> class(tmp1$Date)
[1] "POSIXct" "POSIXt"
> tmp2[tmp1, roll=2*3600*24]
         Date  Val Code
1: 1973-01-31 4.39  AED
2: 1973-02-01 4.39  AED
3: 1973-02-02 4.39  AED
4: 1973-02-03   NA  AED
5: 1973-02-04   NA  AED
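
To see why roll=2 falls short, it can help to look at the numbers underneath the POSIXct column. The following is a minimal sketch on a cut-down version of the tables from the question; the names dates, lookup, gap_seconds and two_days are illustrative and not from the original post.

```r
library(data.table)

# Cut-down stand-ins for tmp1 and tmp2 from the question (illustrative names).
dates <- data.table(
  Code = "AED",
  Date = as.POSIXct(c("1973-01-31", "1973-02-01", "1973-02-02", "1973-02-03"), tz = "UTC")
)
lookup <- data.table(
  Code = "AED",
  Date = as.POSIXct("1973-01-31", tz = "UTC"),
  Val  = 4.39
)
setkey(dates, Code, Date)
setkey(lookup, Code, Date)

# POSIXct stores seconds since the epoch, so consecutive days differ by 86400.
gap_seconds <- diff(as.numeric(dates$Date))

# Express the two-day window in the column's own unit (seconds).
two_days <- 2 * 24 * 3600

# Val is carried forward for rows within two days of 1973-01-31, NA after that.
res <- lookup[dates, roll = two_days]
```

Since one day is 86400 in this representation, roll=2 only reaches two seconds past the last match, which is why every row after 1973-01-31 came back NA.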

Or coerce the Date column in both tables via Date := as.Date(Date) and use roll=2, depending on your preference.
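
The as.Date route might look like the sketch below. It assumes both tables are converted, so that the join columns share the same unit, and it sets the keys again after the conversion as a precaution, since the key columns have been modified. The names d1, d2 and res_days are illustrative.

```r
library(data.table)

# Cut-down stand-ins for tmp1 and tmp2 from the question (illustrative names).
d1 <- data.table(
  Code = "AED",
  Date = as.POSIXct(c("1973-01-31", "1973-02-01", "1973-02-02", "1973-02-03"), tz = "UTC")
)
d2 <- data.table(
  Code = "AED",
  Date = as.POSIXct("1973-01-31", tz = "UTC"),
  Val  = 4.39
)

# Convert both sides to Date so the join columns are counted in days.
d1[, Date := as.Date(Date)]
d2[, Date := as.Date(Date)]

# Set the keys after the conversion, since the Date column has changed.
setkey(d1, Code, Date)
setkey(d2, Code, Date)

# With Date columns, roll = 2 means "carry forward at most two days".
res_days <- d2[d1, roll = 2]
```

This keeps the roll argument readable at the cost of dropping the time-of-day component, which is fine for daily data like the example here.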
