Select a sequence of columns: `:` works but not `seq`
Problem description
I'm trying to subset a dataset by selecting some columns from a data.table. However, my code does not work with some variations.
Here is a sample data.table:
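library(data.table)
# 50-row example table; length.out keeps State the same length as the other columns
DT <- data.table(ID = 1:50,
                 Capacity = sample(100:1000, size = 50, replace = FALSE),
                 Code = sample(LETTERS[1:4], 50, replace = TRUE),
                 State = rep(c("Alabama", "Indiana", "Texas", "Nevada"), length.out = 50))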
Here is subsetting code that works, where a numeric sequence of columns is specified using `:`:
DT[ , 1:2]
However, specifying the same sequence of columns using `seq` does not work:
DT[ , seq(1:2)]
Note that this works with a data.frame but not with a data.table.
I need something along the lines of the second format because I'm subsetting based on the output of grep(), which gives the same kind of output as the second format. What am I doing incorrectly?
Thanks!
Recommended answer
On recent versions of data.table, numbers can be used in j to specify columns. This behaviour includes formats such as DT[ , 1:2] to specify a numeric range of columns. (Note that this syntax does not work on older versions of data.table.)
So why does DT[ , 1:2] work, but DT[ , seq(1:2)] does not? The answer is buried in the code for data.table:::`[.data.table`, which includes the lines:
if (!missing(j)) {
    jsub = replace_dot_alias(substitute(j))
    root = if (is.call(jsub)) as.character(jsub[[1L]])[1L] else ""
    if (root == ":" ||
        (root %chin% c("-", "!") && is.call(jsub[[2L]]) &&
         jsub[[2L]][[1L]] == "(" && is.call(jsub[[2L]][[2L]]) &&
         jsub[[2L]][[2L]][[1L]] == ":") ||
        (!length(all.vars(jsub)) &&
         root %chin% c("", "c", "paste", "paste0", "-", "!") &&
         missing(by))) {
        with = FALSE
    }
    # ... (excerpt ends here; the rest of the function is omitted)
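To make the role of `root` in the excerpt above more concrete, here is a minimal base-R sketch of the same idea. The helper `root_of` is hypothetical (it is not part of data.table) and it skips the `replace_dot_alias()` step; it only shows how the name of the outermost function in an unevaluated expression can be read off.

```r
# Hypothetical helper, not a data.table function: returns the name of the
# outermost function in the unevaluated argument, or "" if it is not a call.
root_of <- function(j) {
  jsub <- substitute(j)  # capture the expression without evaluating it
  if (is.call(jsub)) as.character(jsub[[1L]])[1L] else ""
}

root_of(1:2)       # ":"
root_of(seq(1:2))  # "seq"
root_of(c(1, 2))   # "c"
root_of(2)         # ""  (a bare number is not a call)
```

Under this sketch, `1:2` has the root ":" while `seq(1:2)` has the root "seq", which is the distinction the condition in the excerpt relies on.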
The excerpt from `[.data.table` shows that data.table automatically sets the with = FALSE parameter for you when it detects the use of the function `:` in j. It doesn't have the same functionality built in for seq, so DT[ , seq(1:2)] is evaluated as an ordinary expression and returns the integer vector 1 2 rather than the first two columns. If we want to use the seq syntax, we have to specify with = FALSE ourselves:
DT[ , seq(1:2), with = FALSE]
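Since the question mentions selecting columns from the output of grep(), the same approach can be sketched for that case. This is an illustrative example, not code from the question: the small table and the pattern "^C" are assumptions, and the `..` prefix form assumes a reasonably recent data.table release.

```r
library(data.table)

# Small illustrative table; column names follow the question's example.
DT <- data.table(ID = 1:4,
                 Capacity = c(100L, 250L, 400L, 900L),
                 Code = c("A", "B", "C", "D"),
                 State = c("Alabama", "Indiana", "Texas", "Nevada"))

# grep() returns the integer positions of the matching column names.
cols <- grep("^C", names(DT))   # positions of Capacity and Code

by_with   <- DT[, cols, with = FALSE]                   # stored positions
by_prefix <- DT[, ..cols]                               # .. prefix looks up cols in the calling scope
by_call   <- DT[, grep("^C", names(DT)), with = FALSE]  # grep() directly in j
```

All three forms should select the Capacity and Code columns. Storing the positions in a variable first keeps the j expression short and makes it easy to inspect what grep() matched.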