get(x) does not work in R data.table when x is also a column in the data table

Problem description

I noticed that get(x) does not work in R data table when x is also a column in the same data table. See the code snippet below. This is hard to avoid completely when writing an R function which takes the data table as an input. Is this a bug in the R data.table package? Thanks!

library(data.table)

dt = data.table(x=1:3, y=2:4)

var = 'y'
x = 'y'

dt[, 3*get(var)]      # [1] 6 9 12
dt[, 3*get(x)]        # Error in get(x): invalid first argument

Recommended answer

In general, when there is a naming conflict between columns and variables, columns take precedence. Since v1.10.2 (31 Jan 2017) of data.table, the preferred way to make clear that a name is not a column name is to use the .. prefix [1]:

When j is a symbol prefixed with .. it will be looked up in calling scope and its value taken to be column names or numbers. When you see the .. prefix think one-level-up, like the directory .. in all operating systems means the parent directory. In future the .. prefix could be made to work on all symbols appearing anywhere inside DT[...]. ...

Our main focus here which we believe .. achieves is to resolve the more common ambiguity when var is in calling scope and var is a column name too. Further, we have not forgotten that in the past we recommended prefixing the variable in calling scope with .. yourself. If you did that and ..var exists in calling scope, that still works, provided neither var exists in calling scope nor ..var exists as a column name. Please now remove the .. prefix on ..var in calling scope to tidy this up. In future data.table will start to warn/error on such usage.

In your case, you can get(..x) to force the name x to be resolved in calling scope rather than within the data.table environment:

library(data.table)

dt = data.table(x=1:3, y=2:4)

var = 'y'
x = 'y'

dt[, 3*get(var)]      # [1] 6 9 12
dt[, 3*get(x)]        # Error in get(x): invalid first argument
dt[, 3*get(..x)]      # [1]  6  9 12
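
Since the question mentions functions that take a data table as input, here is a minimal sketch of the same idea inside a function. The helper name triple_col is hypothetical and not part of data.table or the original post; its argument is deliberately named x so that it clashes with the column x.

```r
library(data.table)

# Hypothetical helper (an illustration only): multiplies the column whose
# name is stored in the argument `x` by 3.
triple_col <- function(dt, x) {
  # ..x refers to the function argument `x` in the calling scope,
  # not to the column dt$x
  dt[, 3 * get(..x)]
}

dt <- data.table(x = 1:3, y = 2:4)
scaled <- triple_col(dt, "y")
print(scaled)
```

Without the .. prefix, get(x) inside the function would pick up the integer column x and fail in the same way as in the question.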

The .. prefix is still somewhat experimental and thus has limited documentation, but it is mentioned briefly on the help page for data.table:

By default with=TRUE and j is evaluated within the frame of x; column names can be used as variables. In case of overlapping variables names inside dataset and in parent scope you can use double dot prefix ..cols to explicitly refer to cols variable parent scope and not from your dataset.

This is less a bug and more an unfortunate but natural consequence of with = TRUE, which allows columns to be used as variables in a data environment. Indeed, you could avoid this issue in a more base R way by using the pos or envir argument of get().
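
As a rough sketch of that base R route (one possible way to do it, not necessarily the only one), you can capture the calling environment before the dt[...] call and pass it to get() through envir. The variable name caller_env is an assumption made for this example.

```r
library(data.table)

dt <- data.table(x = 1:3, y = 2:4)
x <- "y"

# Capture the calling environment outside of dt[...]
# (caller_env is just an illustrative name)
caller_env <- environment()

# The inner get() looks up "x" in the captured environment and returns the
# string "y"; the outer get() then resolves "y" as a column of dt.
res <- dt[, 3 * get(get("x", envir = caller_env))]
print(res)
```

This is more verbose than get(..x), but it relies only on the documented envir argument of base R's get().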
