Using data.table package inside my own package


Question

I am trying to use the data.table package inside my own package. A minimal working example (MWE) follows:

I create a function, test.fun, that simply creates a small data.table object and then summarises the "Val" column (count, total and mean), grouped by the "A" column. The code is:

test.fun<-function ()
{
    library(data.table)
    testdata<-data.table(A=rep(seq(1,5), 5), Val=rnorm(25))
    setkey(testdata, A)
    res<-testdata[,{list(Ct=length(Val),Total=sum(Val),Avg=mean(Val))},"A"]
    return(res)
}

When I create this function in a regular R session, and then run the function, it works as expected.

> res<-test.fun()
data.table 1.8.0  For help type: help("data.table")
> res
     A Ct      Total        Avg
[1,] 1  5 -0.5326444 -0.1065289
[2,] 2  5 -4.0832062 -0.8166412
[3,] 3  5  0.9458251  0.1891650
[4,] 4  5  2.0474791  0.4094958
[5,] 5  5  2.3609443  0.4721889

When I put this function into a package, install the package, load the package, and then run the function, I get an error message.

> library(testpackage)
> res<-test.fun()
data.table 1.8.0  For help type: help("data.table")
Error in `[.data.frame`(x, i, j) : object 'Val' not found

Can anybody explain why this is happening and what I can do to fix it? Any help is very much appreciated.

Recommended answer

Andrie's guess is right, +1. There is a FAQ on it (see vignette("datatable-faq")), as well as a new vignette on importing data.table:

FAQ 6.9: I have created a package that depends on data.table. How do I ensure my package is data.table-aware so that inheritance from data.frame works?

Either i) include data.table in the Depends: field of your DESCRIPTION file, or ii) include data.table in the Imports: field of your DESCRIPTION file AND import(data.table) in your NAMESPACE file.
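As a rough sketch of option ii), the two files could contain entries along these lines. The package name testpackage comes from the question; everything else a real DESCRIPTION file needs (Title, Version, and so on) is omitted here, so treat these as fragments rather than complete files.

DESCRIPTION (fragment):

```text
Package: testpackage
Imports: data.table
```

NAMESPACE (fragment):

```r
# Make data.table's exported functions visible to the package's own code
import(data.table)
```

With the import in place, the library(data.table) call inside test.fun should no longer be needed, since the package namespace already resolves data.table() and setkey().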

Further background ... at the top of [.data.table (and other data.table functions), you'll see a switch depending on the result of a call to cedta(). This stands for Calling Environment Data Table Aware. Typing data.table:::cedta reveals how it's done. It relies on the calling package having a namespace, and, that namespace Import'ing or Depend'ing on data.table. This is how data.table can be passed to non-data.table-aware packages (such as functions in base) and those packages can use absolutely standard [.data.frame syntax on the data.table, blissfully unaware that the data.frame is() a data.table, too.
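To make the two behaviours concrete, here is a small sketch that can be run in an ordinary R session. It does not call cedta() itself; it only contrasts the data.table semantics with the plain data.frame semantics that a non-aware caller ends up with. The variable names DT, DF, aware and unaware_msg are made up for this illustration.

```r
library(data.table)

# Same shape of data as in the question
DT <- data.table(A = rep(seq(1, 5), 5), Val = rnorm(25))

# A data.table is also a data.frame, so non-aware code can still receive it
stopifnot(inherits(DT, "data.frame"))

# data.table semantics: j is evaluated inside the table, so Val is the column
aware <- DT[, list(Total = sum(Val)), by = "A"]

# data.frame semantics: j is an ordinary argument evaluated in the caller,
# where no object named Val exists, so the call fails
DF <- as.data.frame(DT)
unaware_msg <- tryCatch(
    DF[, sum(Val)],
    error = function(e) conditionMessage(e)
)

print(aware)
print(unaware_msg)
```

The second call fails for the same reason as the error in the question: once the subset is handled with data.frame rules, Val is looked up as a regular R object rather than as a column.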

This is also why data.table inheritance didn't use to be compatible with namespaceless packages, and why, upon user request, we had to ask the authors of such packages to add a namespace to make them compatible. Happily, now that R adds a default namespace for packages missing one (from v2.14.0), that problem has gone away:

CHANGES IN R VERSION 2.14.0
* All packages must have a namespace, and one is created on installation if not supplied in the sources.
