How do I select rows by two criteria in data.table in R

Question

Let's say I have a data.table and I want to select all the rows where the variable x has a value of b. That is easy:

library(data.table)
DT <- data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
setkey(DT,x)               # set a 1-column key
DT["b"]

By the way, it appears that one has to set a key: if the key is not set to x, this does not work. What would happen if I set two columns as keys?
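On the two-column key, a minimal sketch of how it behaves, assuming a reasonably recent data.table in which the `on=` argument is available. The object names `no_key`, `first_col_only` and `both_cols` are made up for illustration:

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)

# Without any key, a join-style lookup can name its column explicitly via on=
no_key <- DT[.("b"), on = "x"]

# Two-column key: rows are sorted by x, then by y within x
setkey(DT, x, y)

# A single value is matched against the first key column (x) only
first_col_only <- DT["b"]

# Two values are matched against both key columns: x == "b" and y == 3
both_cols <- DT[.("b", 3)]

print(no_key)
print(first_col_only)
print(both_cols)
```

With a two-column key, supplying fewer values than key columns matches on the leading column(s), so `DT["b"]` keeps working as before.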

Anyway, moving along, let's say that I want to select all the rows where the variable x is a or b:

DT["b"|"a"]

That did not work, but the following does:

DT[x=="a"|x=="b"]

But that uses vector scanning, as with data frames. It does not use binary search. I guess for smaller data sets it will not matter.
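To make the comparison concrete, here is a small sketch that writes the same selection three ways and checks that they return the same rows. It is not a benchmark, and the names `scan_or`, `scan_in` and `key_join` are only illustrative. Newer data.table versions may optimise expressions such as `==` and `%in%` internally, but that is not assumed here:

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
setkey(DT, x)

# Logical expression evaluated over the whole column
scan_or <- DT[x == "a" | x == "b"]

# The same filter written more compactly with %in%
scan_in <- DT[x %in% c("a", "b")]

# Keyed lookup: each value is located in the sorted key column
key_join <- DT[c("a", "b")]

# All three forms should pick the same rows
print(identical(scan_or$v, scan_in$v))
print(identical(scan_or$v, key_join$v))
```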

Is that what I should do, or am I ignorant of data.table syntax?

And one more thing: are there any examples of more complex Boolean multi-variable selection (or subset) procedures with data.table?
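A few hedged examples of multi-variable selections on the same toy table follow. The result names are made up for illustration, and the last form assumes a data.table version that supports `on=`; `CJ()` builds the cross product of the values to look up:

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)

# AND across two columns
and_rows <- DT[x %in% c("a", "b") & y > 1]

# OR across two columns
or_rows <- DT[x == "b" | v > 7]

# Negation combined with a second condition
not_rows <- DT[!(x == "c") & y != 3]

# Every combination of x in {a, b} and y in {3, 6}, expressed as a join
combo_rows <- DT[CJ(x = c("a", "b"), y = c(3, 6)), on = c("x", "y")]

print(and_rows)
print(or_rows)
print(not_rows)
print(combo_rows)
```

The first three are ordinary logical expressions in `i`; the last one trades the expression for a join, which is the form that can use a key or `on=` lookup.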

I know I could always revert to using the subset() function, since a data.table will behave as a data.frame if it must.

Recommended answer

Here is a way that only crossed my mind after I asked the question. It works, but I do not know how it does in benchmarks. I am not currently at a computer with R installed; I guess I should use a cloud instance. Anyway, I like the syntax:

DT[c("a","b")]
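One thing worth knowing about this form is how it treats values that are not in the key. A small sketch, assuming the default `nomatch` behaviour (the value "d" and the result names are only for illustration):

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
setkey(DT, x)

# Both values exist in x: three rows each
both <- DT[c("a", "b")]

# "d" is absent: by default it still yields one row, filled with NA
with_missing <- DT[c("a", "d")]

# nomatch = 0L drops values that have no match
only_matches <- DT[c("a", "d"), nomatch = 0L]

print(nrow(both))
print(with_missing)
print(only_matches)
```

A logical filter such as `x %in% c("a", "d")` never produces the NA row, so the two forms differ when a lookup value is missing.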
