get(x) does not work in R data.table when x is also a column in the data table
Question
I noticed that get(x) does not work in R data table when x is also a column in the same data table. See the code snippet below. This is hard to avoid completely when writing an R function which takes the data table as an input. Is this a bug in the R data.table package? Thanks!
library(data.table)
dt = data.table(x=1:3, y=2:4)
var = 'y'
x = 'y'
dt[, 3*get(var)] # [1] 6 9 12
dt[, 3*get(x)] # Error in get(x): invalid first argument
Recommended answer
In general, when there is a naming conflict between columns and variables, columns take precedence. Since v1.10.2 (31 Jan 2017) of data.table, the preferred way to make clear that a name is not a column name is to use the .. prefix [1]:
When j is a symbol prefixed with .. it will be looked up in calling scope and its value taken to be column names or numbers. When you see the .. prefix think one-level-up, like the directory .. in all operating systems means the parent directory. In future the .. prefix could be made to work on all symbols appearing anywhere inside DT[...]. ...
Our main focus here, which we believe .. achieves, is to resolve the more common ambiguity when var is in calling scope and var is a column name too. Further, we have not forgotten that in the past we recommended prefixing the variable in calling scope with .. yourself. If you did that and ..var exists in calling scope, that still works, provided neither var exists in calling scope nor ..var exists as a column name. Please now remove the .. prefix on ..var in calling scope to tidy this up. In future data.table will start to warn/error on such usage.
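The column-selection case described in the release note can be sketched as follows. This is a minimal illustration that reuses dt and x from the question; the name selected is only illustrative. Because the prefixed symbol is looked up in the calling scope, the variable x, not the column x, decides which column is returned.

```r
library(data.table)

dt <- data.table(x = 1:3, y = 2:4)
x <- "y"  # deliberately the same name as a column of dt

# ..x is resolved in the calling scope, so its value "y" is used as a column name
selected <- dt[, ..x]
names(selected)
```

The result is a one-column data.table rather than a vector, since j here selects columns by name.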
In your case, you can use get(..x) to force the name x to be resolved in calling scope rather than within the data.table environment:
library(data.table)
dt = data.table(x=1:3, y=2:4)
var = 'y'
x = 'y'
dt[, 3*get(var)] # [1] 6 9 12
dt[, 3*get(x)] # Error in get(x): invalid first argument
dt[, 3*get(..x)] # [1] 6 9 12
The .. prefix is still somewhat experimental and thus has limited documentation, but it is mentioned briefly on the help page for data.table:
By default with=TRUE and j is evaluated within the frame of x; column names can be used as variables. In case of overlapping variable names inside the dataset and in parent scope you can use the double dot prefix ..cols to explicitly refer to the cols variable in parent scope and not from your dataset.
This is less a bug and more an unfortunate but natural consequence of with = TRUE, which allows columns to be used as variables in the data environment. Indeed, you could avoid this issue in a more base R way by using the pos or envir argument of get().
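A rough sketch of the envir approach inside a function, which is the situation the question describes. The helper name triple_column and the variable caller_env are hypothetical, and the sketch assumes dt has no column called caller_env. Note that get(x, envir = ...) alone would not help, because the argument x is itself evaluated inside the data.table first and resolves to the column; the variable therefore has to be fetched by its quoted name.

```r
library(data.table)

# Hypothetical helper: multiplies the column named by the argument x by 3.
triple_column <- function(dt, x) {
  caller_env <- environment()  # this function's frame, where the argument x lives
  # Inner get(): fetch the string held by the variable x, bypassing the column x.
  # Outer get(): look up the column with that name inside the data.table.
  dt[, 3 * get(get("x", envir = caller_env))]
}

dt <- data.table(x = 1:3, y = 2:4)
result <- triple_column(dt, "y")
```

Compared with get(..x) this is more verbose, but it relies only on base R lookup rules rather than on the experimental prefix.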