R: By group, check if for each unique value of one var, there is at least one observation where the value of the var equals the value of another var


Problem description



I think I am on the right track with this code, but I am not quite there yet.

I tried finding something useful on Google and SE, but I did not seem to be able to formulate the question in a way that gets me the answer I am looking for.

I could write a for-loop for this, comparing for each id and for each unique value of a per row, but I strive to achieve a higher level of R-understanding and thus want to avoid loops.

id <- c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5)
a <- c(1,1,1,2,2,2,3,3,4,4,4,5,5,5,6)
b <- c(1,2,3,3,3,4,3,4,5,4,4,5,6,7,8)

require(data.table)
dt <- data.table(id, a, b)

dt
dt[,unique(a) %in% b, by=id]
tmp <- dt[,unique(a) %in% b, by=id]
tmp$id[tmp$V1 == FALSE]

In my example, IDs 2, 3 and 5 should be the result, the decision rule being: "By id, check for each unique value of a whether there is at least one observation where the value of b equals the value of a."

However, my code only outputs IDs 2 and 5, but not 3. This is because for ID 3, the 4 is matched with the 4 of the previous observation.
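
To see the mismatch concretely, here is a minimal base R sketch using only the three rows of ID 3 from the example (the names a3, b3, in_b and rowwise are made up for illustration). `%in%` asks whether a value occurs anywhere in b, whereas the rule needs a and b to agree within the same row.

```r
# Rows of id 3 from the example data
a3 <- c(3, 3, 4)
b3 <- c(3, 4, 5)

# Membership test: is each unique value of a found anywhere in b?
in_b <- unique(a3) %in% b3
print(in_b)      # TRUE TRUE: 4 is found, but in the row where a is 3

# Row-wise test: does a equal b within the same observation?
rowwise <- a3 == b3
print(rowwise)   # TRUE FALSE FALSE: no row has a == 4 and b == 4
```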

The result should either output the IDs for which the condition is not met, or add a dummy variable to the original table that indicates whether the condition is met for the ID.

Solution

How about

dt[, all(sapply(unique(a), function(i) any(a == i & b == i))), by = id]

#   id    V1
#1:  1  TRUE
#2:  2 FALSE
#3:  3 FALSE
#4:  4  TRUE
#5:  5 FALSE

If you want to add a dummy variable to the original table, you can modify it like

dt[, check:=all(sapply(unique(a), function(i) any(a == i & b == i))), by = id]
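
The question also asks for the IDs that fail the rule. One possible variant, offered as a sketch rather than as part of the answer above, restricts a to the rows where a equals b and then checks that every unique value of a is still present. It uses base R only and assumes the id, a and b vectors from the question; check_group, ok and failing are hypothetical names.

```r
# Example data from the question
id <- c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5)
a  <- c(1,1,1,2,2,2,3,3,4,4,4,5,5,5,6)
b  <- c(1,2,3,3,3,4,3,4,5,4,4,5,6,7,8)

# TRUE if every unique value of a occurs in at least one row where a == b
check_group <- function(a, b) all(unique(a) %in% a[a == b])

# Apply the check to each id by splitting a and b on id
ok <- mapply(check_group, split(a, id), split(b, id))

# IDs for which the condition is not met
failing <- as.numeric(names(ok)[!ok])
print(failing)   # 2 3 5
```

The same expression can be used per group inside data.table, for example dt[, all(unique(a) %in% a[a == b]), by = id].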
