Using data.table package inside my own package
Question
I am trying to use the data.table package inside my own package. MWE is as follows:
I create a function, test.fun, that simply creates a small data.table object and then sums the "Val" column, grouping by the "A" column. The code is:
test.fun <- function()
{
    library(data.table)
    testdata <- data.table(A=rep(seq(1,5), 5), Val=rnorm(25))
    setkey(testdata, A)
    res <- testdata[, {list(Ct=length(Val), Total=sum(Val), Avg=mean(Val))}, "A"]
    return(res)
}
When I create this function in a regular R session, and then run the function, it works as expected.
> res<-test.fun()
data.table 1.8.0 For help type: help("data.table")
> res
A Ct Total Avg
[1,] 1 5 -0.5326444 -0.1065289
[2,] 2 5 -4.0832062 -0.8166412
[3,] 3 5 0.9458251 0.1891650
[4,] 4 5 2.0474791 0.4094958
[5,] 5 5 2.3609443 0.4721889
When I put this function into a package, install the package, load the package, and then run the function, I get an error message.
> library(testpackage)
> res<-test.fun()
data.table 1.8.0 For help type: help("data.table")
Error in `[.data.frame`(x, i, j) : object 'Val' not found
Can anybody explain to me why this is happening and what I can do to fix it? Any help is very much appreciated.
Recommended answer
Andrie's guess is right, +1. There is a FAQ on it (see vignette("datatable-faq")), as well as a new vignette on importing data.table:
FAQ 6.9: I have created a package that depends on data.table. How do I ensure my package is data.table-aware so that inheritance from data.frame works?
Either i) include data.table in the Depends: field of your DESCRIPTION file, or ii) include data.table in the Imports: field of your DESCRIPTION file AND import(data.table) in your NAMESPACE file.
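As a rough illustration of option ii), the two files for the testpackage from the question might contain fragments like the following. This is only a sketch: the rest of each file (Title, Version, exports and so on) is omitted and would depend on your package.

```
# --- DESCRIPTION (fragment) ---
Package: testpackage
Imports:
    data.table

# --- NAMESPACE (fragment) ---
import(data.table)
```

With both in place, the library(data.table) call inside test.fun should no longer be needed, since the package namespace already imports data.table.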
Further background: at the top of [.data.table (and other data.table functions), you'll see a switch depending on the result of a call to cedta(). This stands for Calling Environment Data Table Aware. Typing data.table:::cedta reveals how it's done. It relies on the calling package having a namespace, and on that namespace importing or depending on data.table. This is how a data.table can be passed to non-data.table-aware packages (such as functions in base), and those packages can use absolutely standard [.data.frame syntax on the data.table, blissfully unaware that the data.frame is() a data.table, too.
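To make the idea concrete, here is a minimal sketch of such an awareness check, written with base R only. The function name is_dt_aware is hypothetical, and the logic is a simplified assumption about the approach, not data.table's actual code; the real cedta() may handle further cases (for example packages that list data.table under Depends).

```r
# Hypothetical, simplified stand-in for an awareness check.
# NOT the real data.table:::cedta; it only covers the Imports case.
is_dt_aware <- function(env = parent.frame()) {
  # Find the top-level environment of the caller:
  # the global environment or a package namespace.
  ns <- topenv(env)

  # Code run outside any namespace (e.g. at the prompt) counts as aware.
  if (!isNamespace(ns)) return(TRUE)

  # data.table itself is trivially aware.
  ns_name <- unname(getNamespaceName(ns))
  if (identical(ns_name, "data.table")) return(TRUE)

  # Otherwise the namespace must import data.table.
  "data.table" %in% names(getNamespaceImports(ns))
}

# Usage: check the global environment and a namespace that
# does not import data.table.
aware_global <- is_dt_aware(globalenv())
aware_stats  <- is_dt_aware(asNamespace("stats"))
```

Under this sketch, a call made from the global environment counts as aware, while a call made from a namespace that does not import data.table (such as stats) does not, which mirrors why test.fun worked at the prompt but failed inside testpackage.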
This is also why data.table inheritance did not use to be compatible with namespaceless packages, and why, upon user request, we had to ask the authors of such packages to add a namespace to make their package compatible. Happily, now that R adds a default namespace for packages missing one (from v2.14.0), that problem has gone away:
CHANGES IN R VERSION 2.14.0
* All packages must have a namespace, and one is created on installation if not supplied in the sources.