How do I select all unique combinations of two columns in an R data frame?
Problem description
I have a correlation matrix that I put in a data frame like so:
row | var1 | var2 | cor
1 | A | B | 0.6
2 | B | A | 0.6
3 | A | C | 0.4
4 | C | A | 0.4
Each result appears in two rows, once for each ordering of "var1" and "var2". I only need one row per pair, preferably the one with the alphabetically lower variable first (e.g. rows 1 and 3).
I've been playing with dplyr for two hours and reading old threads, but I haven't found what I need.
library(tidyverse)  # provides %>%, select(), rownames_to_column(), gather()

# get correlation of every concept versus every concept
data.cor <- data.jobs %>%
  select(-y, -X) %>%
  as.matrix %>%
  cor %>%
  as.data.frame %>%
  rownames_to_column(var = 'var1') %>%
  gather(var2, value, -var1)
I would like the output to look like this:
row | var1 | var2 | cor
1 | A | B | 0.6
3 | A | C | 0.4
I am trying to do this without resorting to a loop.
Recommended answer
Here is a tidyverse option:
dat2 <- dat %>%
filter(!duplicated(paste0(pmax(var1, var2), pmin(var1, var2))))
# A tibble: 2 x 3
var1 var2 cor
<chr> <chr> <dbl>
1 A B 0.600
2 A C 0.400
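One caveat worth noting: the filter above keeps whichever row of a pair happens to come first, so it only puts the lower variable in var1 when that ordering appears first in the data. If the ordering has to be guaranteed, a minimal sketch (assuming the same column names; `lo`, `hi` and `dat_unique` are names made up for this illustration) could normalise each pair before de-duplicating:

```r
library(dplyr)

# Example data matching the answer (tibble() is the current replacement
# for the deprecated data_frame()).
dat <- tibble(
  var1 = LETTERS[c(1, 2, 1, 3)],
  var2 = LETTERS[c(2, 1, 3, 1)],
  cor  = c(0.6, 0.6, 0.4, 0.4)
)

dat_unique <- dat %>%
  # lo/hi are temporary columns: computing them from the original var1/var2
  # avoids overwriting var1 before var2 has been derived from it.
  transmute(
    lo  = pmin(var1, var2),
    hi  = pmax(var1, var2),
    cor = cor
  ) %>%
  rename(var1 = lo, var2 = hi) %>%
  # Keep one row per unordered pair.
  distinct(var1, var2, .keep_all = TRUE)
```

Because the pair is reordered first, `distinct()` can compare the two columns directly and no pasted key is needed, which also sidesteps accidental collisions between multi-character names.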
Data:
# tibble() replaces data_frame(), which is deprecated in current tibble/dplyr
dat <- tibble(
  var1 = LETTERS[c(1, 2, 1, 3)],
  var2 = LETTERS[c(2, 1, 3, 1)],
  cor  = c(0.6, 0.6, 0.4, 0.4))
Note: cleaned up the logic thanks to @tmfmnk
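Since the duplicates come from the symmetry of the correlation matrix, another possibility is to drop them before reshaping. The following is only a sketch in base R, not part of the original answer: the matrix `m` is random stand-in data for the result of `cor()` in the question, and `keep` and `pairs` are names invented here.

```r
# Stand-in for the question's correlation matrix (random data, three variables).
set.seed(1)
m <- cor(matrix(rnorm(30), ncol = 3,
                dimnames = list(NULL, c("A", "B", "C"))))

# TRUE only for cells strictly above the diagonal, so each pair occurs once
# and the self-correlations on the diagonal are dropped.
keep <- upper.tri(m)

pairs <- data.frame(
  var1 = rownames(m)[row(m)[keep]],  # row name of each kept cell
  var2 = colnames(m)[col(m)[keep]],  # column name of each kept cell
  cor  = m[keep]
)
```

With columns in alphabetical order, the row index of every kept cell is smaller than its column index, so var1 ends up as the lower variable of each pair.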