How do I do a negative / nomatch / inverse search in data.table?
What happens if I want to select all the rows in a data.table that do not contain a particular value in the key variable using binary search? By the way, what is the correct jargon for what I want to do? Is it "nojoin"? Is it "negative selection"?
library(data.table)
DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
setkey(DT,x)
Let's do a positive selection for all rows where x=="a", using binary search:
DT["a"]
That's beautiful, but I want the opposite. I want all the rows that are not "a", in other words where x!="a":
DT[x!="a"]
The above line works, but it uses vector scanning. I want to use binary search. I was expecting the following to work, but alas...
DT[!"a"]
DT[-"a"]
Neither of the above two works, and trying to play with nomatch got me nowhere.
The idiom is this:
DT[-DT["a", which=TRUE]]
   x y v
1: b 1 4
2: b 3 5
3: b 6 6
4: c 1 7
5: c 3 8
6: c 6 9
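The idiom works in two steps, which the following minimal sketch spells out. It assumes the same DT as in the question (with y written out in full rather than recycled); the names rows_a and not_a are only illustrative and are not part of data.table.

```r
library(data.table)

# Same table as in the question; y is written out with rep() instead of
# relying on recycling.
DT <- data.table(x = rep(c("a", "b", "c"), each = 3),
                 y = rep(c(1, 3, 6), 3),
                 v = 1:9)
setkey(DT, x)

# Step 1: binary search on the key, returning row numbers instead of rows.
rows_a <- DT["a", which = TRUE]
print(rows_a)

# Step 2: negative integer indexing drops exactly those rows.
not_a <- DT[-rows_a]
print(not_a)

# Sanity check against the vector-scan version from the question.
stopifnot(identical(not_a$v, DT[x != "a"]$v))
```

Only the lookup of "a" uses the key; dropping the rows is ordinary negative indexing.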
Inspiration from:
- The mailing list posting Return Select/Join that does NOT match?
- The previous question non-joins with data.tables
- Matthew Dowle's answer to Porting set operations from R's data frames to data tables: How to identify duplicated rows?
Update. New in v1.8.3 is not-join syntax. Farrel's first expectation (! rather than -) has been implemented:
DT[-DT["a",which=TRUE,nomatch=0],...] # old idiom
DT[!"a",...] # same result, now preferred.
See the NEWS item for more detailed info and an example.
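As a quick check, the sketch below compares the old idiom with the not-join syntax on the table from the question. It assumes data.table v1.8.3 or later; the names old_idiom and not_join are only illustrative, and the comparison is made on the v column so that differences in attributes such as the key do not matter.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3),
                 y = rep(c(1, 3, 6), 3),
                 v = 1:9)
setkey(DT, x)

# Old idiom: look up the matching row numbers, then drop them.
old_idiom <- DT[-DT["a", which = TRUE, nomatch = 0]]

# Not-join: the ! prefix on i selects the rows that do NOT match.
not_join <- DT[!"a"]
print(not_join)

# Both approaches should keep the same rows.
stopifnot(identical(old_idiom$v, not_join$v))
stopifnot(all(not_join$x != "a"))
```

In more recent versions of data.table the same not-join can be written without setting a key, for example DT[!"a", on = "x"].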