Filter data.table on same condition for multiple columns
Question
I am using a vector of column names to select a subset of columns of a data.table. I wondered whether it is possible to define a condition in i that is then applied to all the selected columns.
For example, using the mtcars dataset: I would like to select the columns cyl (cylinders) and gear, and then filter for all cars that have four cylinders and four gears. Of course I would also need to define whether the filter combines the columns with "and" or with "or", but I am mainly interested in whether the idea can be applied somehow in the data.table context.
# Working code
library(data.table)
sel.col <- c("cyl", "gear")
dt <- data.table(mtcars[1:4, ])
dt[, ..sel.col]
dt[cyl == 4 & gear == 4, ..sel.col]
# Non-working code
dt[ sel.col == 4 , ..sel.col]
Recommended answer
We can use get:
sel.col <- "cyl"
dt[get(sel.col) == 4, ..sel.col]
# cyl
#1: 4
Or eval(as.name(...)):
dt[eval(as.name(sel.col)) == 4, ..sel.col]
# cyl
#1: 4
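To make the difference between comparing a column name and comparing the column itself concrete, here is a minimal sketch on the same four mtcars rows. The variable name filter.col is hypothetical and only used for illustration; the sketch covers the single-column case only.

```r
library(data.table)

dt <- data.table(mtcars[1:4, ])
filter.col <- "cyl"  # hypothetical name, holds the column name as a string

# Comparing the name itself tests the string "cyl" against 4,
# so it says nothing about the values stored in the column.
name.test <- filter.col == 4

# get() looks the string up as a column inside the data.table,
# giving one logical value per row.
row.test <- dt[, get(filter.col) == 4]

# The per-row logical vector is what i needs in order to filter.
rows <- dt[get(filter.col) == 4]
```

Here name.test is a single FALSE, while row.test has one element per row of dt, which is why the get() version can be used for filtering.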
The examples above assume that only a single column is evaluated. If there is more than one column, specify the columns in .SDcols, loop through the Subset of Data.table (.SD), compare each column with the value of interest (4), and Reduce the results to a single logical vector with |, i.e. TRUE for a row if any of the selected columns matches. Then subset the rows based on this vector:
sel.col <- c("cyl", "gear")
dt[dt[, Reduce(`|`, lapply(.SD, `==`, 4)), .SDcols = sel.col], ..sel.col]
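The question also asks about combining the columns with "and" rather than "or". The following is a minimal sketch of both variants, assuming the same four mtcars rows; the intermediate names hits, keep.any and keep.all are hypothetical and only serve to make the steps visible. Swapping | for & in Reduce is the only change between the two filters.

```r
library(data.table)

sel.col <- c("cyl", "gear")
dt <- data.table(mtcars[1:4, ])

# One logical vector per selected column: TRUE where the column equals 4
hits <- lapply(dt[, ..sel.col], `==`, 4)

# "or": keep rows where at least one selected column equals 4
keep.any <- Reduce(`|`, hits)

# "and": keep rows where every selected column equals 4
keep.all <- Reduce(`&`, hits)

res.any <- dt[keep.any, ..sel.col]
res.all <- dt[keep.all, ..sel.col]
```

On these rows the "or" filter keeps the three cars with four gears, while the "and" filter keeps only the single car that has both four cylinders and four gears.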