How to count occurrences of combinations in a data.table in R
Question
I have two data.tables. For each combination of values (each row) in one table, I would like to count the number of matching rows in the other table. I have checked the data.table documentation but have not found an answer. I am using data.table 1.9.2.
DT1 <- data.table(a=c(3,2), b=c(8,3))
DT2 <- data.table(w=c(3,3,3,2,3), x=c(8,8,8,3,7), z=c(2,6,7,2,2))
DT1
# a b
# 1: 3 8
# 2: 2 3
DT2
# w x z
# 1: 3 8 2
# 2: 3 8 6
# 3: 3 8 7
# 4: 2 3 2
# 5: 3 7 2
Now I would like to count the number of (3, 8) pairs and (2, 3) pairs in DT2.
setkey(DT2, w, x)
nrow(DT2[J(3, 8), nomatch=0])
# [1] 3 ## OK !
nrow(DT2[J(2, 3), nomatch=0])
# [1] 1 ## OK !
DT1[,count_combination_in_dt2 := nrow(DT2[J(a, b), nomatch=0])]
DT1
# a b count_combination_in_dt2
# 1: 3 8 4 ## not ok.
# 2: 2 3 4 ## not ok.
Expected result:
# a b count_combination_in_dt2
# 1: 3 8 3
# 2: 2 3 1
Recommended answer
You just need to add by=list(a,b).
DT1[,count_combination_in_dt2:=nrow(DT2[J(a,b),nomatch=0]), by=list(a,b)]
DT1
##
## a b count_combination_in_dt2
## 1: 3 8 3
## 2: 2 3 1
Some more details: in your original version, J(a,b) was evaluated once with all a, b combinations at the same time, so the call was equivalent to nrow(DT2[DT1, nomatch=0]). That returns the total number of matching rows (3 + 1 = 4), and the single value is recycled into every row of DT1. If you want to use J(a,b) for each a, b combination separately, you need to use the by argument. The data.table is then grouped by a, b and nrow(...) is evaluated within each group.
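As a possible alternative to grouping DT1 by a, b, the same per-row counts can be obtained with a single join. The sketch below is not part of the original answer: it assumes a newer data.table release than the 1.9.2 used in the question (by=.EACHI was added in 1.9.4 and the on= argument in 1.9.6), and the intermediate name counts is only illustrative.

```r
library(data.table)

DT1 <- data.table(a = c(3, 2), b = c(8, 3))
DT2 <- data.table(w = c(3, 3, 3, 2, 3), x = c(8, 8, 8, 3, 7), z = c(2, 6, 7, 2, 2))

# Join DT2 to DT1 on (w, x) = (a, b) and count the matching rows of DT2
# separately for each row of DT1 (that is what by = .EACHI does).
# Assumption: data.table >= 1.9.6, which provides on= and by = .EACHI.
counts <- DT2[DT1, on = c(w = "a", x = "b"), .N, by = .EACHI]

# counts has one row per row of DT1, in the same order,
# so the N column can be assigned back directly.
DT1[, count_combination_in_dt2 := counts$N]
DT1
```

Because the join does not rely on a key, this version does not need the setkey(DT2, w, x) call, and DT2 is not reordered as a side effect.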