How do I select rows by two criteria in data.table in R?
Question
Let's say I have a data.table and I want to select all the rows where the variable x has a value of b. That is easy:
library(data.table)
DT <- data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
setkey(DT,x) # set a 1-column key
DT["b"]
By the way, it appears that a key has to be set: if the key is not set to x, this does not work. What would happen if I set two columns as keys?
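A minimal sketch of both points, assuming a reasonably recent data.table release in which the `on` argument is available. The variable names `no_key`, `first_col`, and `both_cols` are only illustrative.

```r
library(data.table)
DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)

# Without any key, name the join column explicitly with `on`
no_key <- DT["b", on = "x"]

# With a two-column key, a bare character value is matched
# against the first key column only
setkey(DT, x, y)
first_col <- DT["b"]

# .() supplies one value per key column: x == "b" and y == 3
both_cols <- DT[.("b", 3)]
```

Here `no_key` and `first_col` both hold the three rows where x is b, while `both_cols` holds the single row where x is b and y is 3.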
Anyway, moving along, let's say that I want to select all the rows where the variable x is a or b:
DT["b"|"a"]
That does not work.
But the following works:
DT[x=="a"|x=="b"]
But that uses vector scanning, as with data frames. It does not use binary search. I guess for smaller data sets it will not matter.
Is that what I should do, or am I ignorant of data.table syntax?
And one more thing: are there any examples of more complex Boolean multi-variable selection (or subset) procedures with data.table?
I know I could always revert to using the subset() function, since a data.table will behave as a data.frame if it must.
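For illustration, a few multi-variable selections on the same table might look like the sketch below. The result names are hypothetical, and `nomatch = NULL` assumes data.table 1.12.0 or later (older releases use `nomatch = 0L`).

```r
library(data.table)
DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)

# Logical expressions on several columns combine with & and |
in_ab_large_v <- DT[x %in% c("a", "b") & v > 2]
mixed         <- DT[(x == "a" & y >= 3) | x == "c"]

# Keyed alternative for equality conditions: CJ() builds every
# combination of the supplied x and y values, joined on the key
setkey(DT, x, y)
combos <- DT[CJ(c("a", "b"), c(1, 6)), nomatch = NULL]
```

The first two forms are ordinary logical subsets. The last one is a join on the two-column key, so it only covers equality conditions.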
Recommended answer
Here is a way that only crossed my mind after I asked the question. It works, but I do not know how it does in benchmarks, as I am not currently at a computer with R installed. I guess I should use a cloud instance. Anyway, I like the syntax:
DT[c("a","b")]
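As a rough check rather than a benchmark, the sketch below compares this keyed lookup with two other spellings of the same selection. The names `keyed`, `scanned`, and `ad_hoc` are only illustrative, and the `on` form assumes a data.table release that supports that argument.

```r
library(data.table)
DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
setkey(DT, x)

keyed   <- DT[c("a", "b")]                # join on the key column x
scanned <- DT[x %in% c("a", "b")]         # logical subset on x
ad_hoc  <- DT[.(c("a", "b")), on = "x"]   # join that names its column explicitly

# All three select the same six rows
same_rows <- identical(keyed$v, scanned$v) && identical(keyed$v, ad_hoc$v)
```

One difference to keep in mind: a join returns rows in the order of the values supplied, so DT[c("b","a")] would list the b rows first, whereas the logical subset keeps the table's own row order.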