Join inherited scope with by in data.table


Question

I'm on data.table 1.9.3, and maybe I'm wrong, but I don't recall the following behaviour being expected in earlier versions.

I build two data.tables, dta and dtb:

> dta
   idx vala fdx
1:   1    2   a
2:   2    4   a
3:   3    6   b

> dtb
   idx valb
1:   1    3
2:   4    6

> dput(x = dta)
structure(list(idx = c(1, 2, 3), vala = c(2, 4, 6), fdx = c("a",
"a", "b")), .Names = c("idx", "vala", "fdx"), row.names = c(NA,
-3L), class = c("data.table", "data.frame"), .internal.selfref =
<pointer: 0x0000000000110788>, sorted = "idx")

> dput(x = dtb)
structure(list(idx = c(1, 4), valb = c(3, 6)), .Names = c("idx",
"valb"), row.names = c(NA, -2L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x0000000000110788>, sorted = "idx")

The key is idx in both cases.

The following works, of course:

> dta[dtb, sum(valb)]
[1] 9

But this does not:

> dta[dtb, sum(valb), by = fdx]
Error in `[.data.table`(dta, dtb, sum(valb), by = fdx) :
  object 'valb' not found

But this does:

> dta[dtb][, sum(valb), by = fdx]
   fdx V1
1:   a  3
2:  NA  6

If we look at the intermediate step:

> dta[dtb]
   idx vala fdx valb
1:   1    2   a    3
2:   4   NA  NA    6

I would have expected the two forms to give the same result:

dta[dtb, sum(valb), by = fdx] == dta[dtb][, sum(valb), by = fdx]

Where am I going wrong?

Recommended answer

This is just a guess. First, the setup:

library(data.table)

dta <- data.frame(idx=c(1,2,3), 
                  vala=c(2,4,6),
                  fdx=c('a','a','b'))
dta <- data.table(dta)

dtb <- data.frame(idx=c(1,4),
                  valb=c(3,6))
dtb <- data.table(dtb)

setkey(dta,idx)
setkey(dtb,idx)

So when you call

dta[dtb, sum(valb)]

it is a bit like calling

tmp <- dta[dtb]
attach(tmp)
sum(valb)
detach(tmp)
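
The attach()/detach() analogy above can also be written with with(), which evaluates an expression using a table's columns as variables without modifying the search path. This is only a sketch of the scoping idea, not a description of how data.table evaluates j internally. The tables repeat the ones from the question, and the names tmp and total are purely illustrative.

```r
library(data.table)

# Same data as in the question, keyed on idx.
dta <- data.table(idx = c(1, 2, 3), vala = c(2, 4, 6), fdx = c("a", "a", "b"))
dtb <- data.table(idx = c(1, 4), valb = c(3, 6))
setkey(dta, idx)
setkey(dtb, idx)

tmp <- dta[dtb]                # join result with columns idx, vala, fdx, valb
total <- with(tmp, sum(valb))  # valb is looked up among tmp's columns
print(total)                   # 3 + 6 = 9, matching dta[dtb, sum(valb)] in the question
```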

However, if you call

dta[dtb, sum(valb), by=fdx]

that is a bit like calling

tmp <- dta[dtb]
# attach(tmp) : attach doesn't happen
sum(valb)
# detach(tmp) : detach doesn't happen

The function doesn't know what to do with the additional arguments. For instance, this would also throw an error:

dta[dtb, class(fdx), sum(valb)]

However, this works

dta[dtb][, sum(valb), by=fdx]

which is a bit like

tmp <- dta[dtb]
tmp[, sum(valb), by=fdx]
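
The two-step form can be run end to end as below. This is a minimal sketch that assumes a data.table version supporting the .() alias for list() in j. The names joined, result and total are illustrative and do not appear in the original post.

```r
library(data.table)

# Same data as in the question, keyed on idx.
dta <- data.table(idx = c(1, 2, 3), vala = c(2, 4, 6), fdx = c("a", "a", "b"))
dtb <- data.table(idx = c(1, 4), valb = c(3, 6))
setkey(dta, idx)
setkey(dtb, idx)

# Step 1: materialise the join so that valb becomes an ordinary column.
joined <- dta[dtb]

# Step 2: group the joined table; fdx and valb are now both plain columns.
result <- joined[, .(total = sum(valb)), by = fdx]
print(result)
```

The row for idx 4 has no match in dta, so its fdx is NA and it forms a group of its own, as in the output shown in the question.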

Like I said, this is just a guess as to why the function may not be working as expected.
