How to find values shared between groups in a data frame?


Question

I have a tidy data.frame with two columns: exp and val. I want to find which values of val are shared among all the different experiments, that is, which values appear in every exp group.

df <- data.frame(exp = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                 val = c(10, 20, 15, 10, 10, 15, 99, 2, 15, 20, 10, 4))
df

   exp val
1    A  10
2    A  20
3    A  15
4    A  10
5    B  10
6    B  15
7    B  99
8    B   2
9    C  15
10   C  20
11   C  10
12   C   4

The expected result could be either a vector of values:

10, 15

or a column on the data frame telling whether that value is shared:

   exp     val shared
   <fct> <dbl> <lgl> 
 1 A        10 TRUE  
 2 A        20 FALSE 
 3 A        15 TRUE  
 4 A        10 TRUE  
 5 B        10 TRUE  
 6 B        15 TRUE  
 7 B        99 FALSE 
 8 B         2 FALSE 
 9 C        15 TRUE  
10 C        20 FALSE 
11 C        10 TRUE  
12 C         4 FALSE 

I was able to find an answer (see the self-answer below), but this seems like a common enough question that there must be a better way than the really hacky solution I came up with.

I tried to solve this problem in dplyr since that's what I'm familiar with, but I'm interested in any kind of solution.

Recommended answer

Or you can group by val and then check whether the number of distinct exp values for that val equals the number of distinct exp values in the whole data frame:

library(dplyr)

df %>%
    group_by(val) %>%
    # inside mutate(), exp is the exp column of the current val group,
    # while .$exp is the full exp column of the data frame piped in
    mutate(shared = n_distinct(exp) == n_distinct(.$exp))

# A tibble: 12 x 3
# Groups:   val [6]
#   exp     val shared
#   <fct> <dbl> <lgl> 
# 1 A        10 TRUE  
# 2 A        20 FALSE 
# 3 A        15 TRUE  
# 4 A        10 TRUE  
# 5 B        10 TRUE  
# 6 B        15 TRUE  
# 7 B        99 FALSE 
# 8 B         2 FALSE 
# 9 C        15 TRUE  
#10 C        20 FALSE 
#11 C        10 TRUE  
#12 C         4 FALSE
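
Since the question is open to any kind of solution, here is a minimal base R sketch of the same idea. It is not part of the original answer. It assumes the df defined in the question; the name shared_vals is introduced here purely for illustration. The approach splits val by exp and intersects the resulting per-experiment vectors, which yields the vector form of the result, then flags rows by membership to get the column form.

```r
# Data frame from the question
df <- data.frame(exp = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'),
                 val = c(10, 20, 15, 10, 10, 15, 99, 2, 15, 20, 10, 4))

# split() gives one vector of val per experiment;
# Reduce(intersect, ...) keeps only the values present in every one of them
shared_vals <- Reduce(intersect, split(df$val, df$exp))
shared_vals

# Flag each row according to whether its val is among the shared values
df$shared <- df$val %in% shared_vals
df
```

For this data, shared_vals contains 10 and 15, and the shared column matches the one produced by the dplyr version above.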
