Conditional statement in dplyr/tidyverse function to exclude comparisons among same levels of a factor

Problem description

I have a dataframe like so:

 data = read.table(text = "region     plot    species
 1          1A      A_B  
 1          1A      A_B
 1          1B      B_C
 1          1C      A_B
 1          1D      C_D
 2          2A      B_C
 2          2A      B_C
 2          2A      E_F
 2          2B      B_C
 2          2B      E_F     
 2          2C      E_F
 2          2D      B_C
 3          3A      A_B
 3          3B      A_B", stringsAsFactors = FALSE, header = TRUE)

I want to compare every pair of plots within a region and count the unique species that the two plots share. However, I do not want comparisons of a plot with itself (i.e. exclude 1A_1A, 1B_1B, 2C_2C, etc.). The output for this example should look like this:

output<-
  region  plot   freq
  1     1A_1B     0     
  1     1A_1C     1
  1     1A_1D     0
  1     1B_1C     0    
  1     1B_1D     0 
  1     1C_1D     0
  2     2A_2B     2     
  2     2A_2C     1
  2     2A_2D     1
  2     2B_2C     1    
  2     2B_2D     1 
  2     2C_2D     0
  3     3A_3B     1  

I adapted the following code from @HubertL ("Convert list of matrices to a single data frame"), but I am struggling to incorporate an appropriate if/else statement to meet this condition:

library(tidyverse)

data %>% group_by(region, species) %>% 
    filter(n() > 1) %>%
    summarize(y = list(combn(plot, 2, paste, collapse="_"))) %>% 
    unnest %>%
    group_by(region, y) %>% 
    summarize(ifelse(plot[i] = plot[i], freq = 
    length(unique((species),)

Recommended answer

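When a plot appears more than once within a region/species group (for example, 1A is listed twice for A_B), combn() pairs it with itself and produces 1A_1A. You can drop those repeats before pairing by adding filter(!duplicated(plot)):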

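library(tidyverse)

data %>%
  group_by(region, species) %>%
  filter(!duplicated(plot)) %>%   # keep each plot once per region/species
  filter(n() > 1) %>%             # combn() needs at least two plots
  summarize(y = list(combn(plot, 2, paste, collapse = "_"))) %>%
  unnest(y) %>%                   # one row per plot pair
  group_by(region, y) %>%
  summarize(freq = n())           # number of species shared by each pair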

  region     y  freq
   <int> <chr> <int>
1      1 1A_1C     1
2      2 2A_2B     2
3      2 2A_2C     1
4      2 2A_2D     1
5      2 2B_2C     1
6      2 2B_2D     1
7      3 3A_3B     1
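
The result above lists only plot pairs that share at least one species, whereas the desired output in the question also includes pairs with a count of 0. A minimal base R sketch that enumerates every plot pair within a region and counts the shared species might look like the following. The helper name `count_shared` is hypothetical and not part of the original answer, and the data is re-created here so the block runs on its own.

```r
# Re-create the example data from the question.
data <- read.table(text = "region plot species
1 1A A_B
1 1A A_B
1 1B B_C
1 1C A_B
1 1D C_D
2 2A B_C
2 2A B_C
2 2A E_F
2 2B B_C
2 2B E_F
2 2C E_F
2 2D B_C
3 3A A_B
3 3B A_B", stringsAsFactors = FALSE, header = TRUE)

# Hypothetical helper: for one region, count the species shared by
# every pair of distinct plots (a plot is never paired with itself).
count_shared <- function(df) {
  # Unique species observed in each plot, named by plot.
  sets <- lapply(split(df$species, df$plot), unique)
  # A region with fewer than two plots has no pairs to compare.
  if (length(sets) < 2) return(NULL)
  # 2 x k matrix; each column is one pair of distinct plots.
  pairs <- combn(names(sets), 2)
  data.frame(
    region = df$region[1],
    plot = paste(pairs[1, ], pairs[2, ], sep = "_"),
    freq = apply(pairs, 2, function(p) {
      length(intersect(sets[[p[1]]], sets[[p[2]]]))
    }),
    stringsAsFactors = FALSE
  )
}

# Apply the helper region by region and stack the pieces.
result <- do.call(rbind, lapply(split(data, data$region), count_shared))
rownames(result) <- NULL
print(result)
```

Because combn() only ever returns pairs of distinct plot names, no explicit if/else is needed to exclude same-plot comparisons, and pairs with no species in common are kept with freq equal to 0.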
