How do I select all unique combinations of two columns in an R data frame?

Question

I have a correlation matrix that I put in a dataframe like so:

row | var1 | var2 | cor
1   | A    | B    | 0.6
2   | B    | A    | 0.6
3   | A    | C    | 0.4
4   | C    | A    | 0.4

These results are duplicated into 2 rows each, with both combinations of "var1" and "var2". I only need one, preferably with the lower variable first (e.g. rows 1 and 3).

I've been playing with dplyr for two hours and reading old threads, but not finding what I need.

# get correlation of every concept versus every concept
data.cor <- data.jobs %>% 
  select(-y,-X) %>%
  as.matrix %>%
  cor %>%
  as.data.frame %>%
  rownames_to_column(var = 'var1') %>%
  gather(var2, value, -var1)

I would like output to look like so:

row | var1 | var2 | cor
1   | A    | B    | 0.6
3   | A    | C    | 0.4

I am trying to do this without resorting to a loop.

Recommended answer

Here's one way with tidyverse -

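library(dplyr)

# Build an order-independent key for each (var1, var2) pair and keep only the first row per key
dat2 <- dat %>% 
  filter(!duplicated(paste0(pmax(var1, var2), pmin(var1, var2))))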


# A tibble: 2 x 3
  var1  var2    cor
  <chr> <chr> <dbl>
1 A     B     0.600
2 A     C     0.400
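
The filter above keeps whichever orientation of a pair appears first in the data. If the lower variable should always come first, as the question prefers, one possible variation is to normalize each pair with pmin()/pmax() before de-duplicating. This is only a sketch: the helper columns lo and hi are names invented for illustration, and the sample data is deliberately ordered with the "wrong" orientation first.

```r
library(dplyr)

# Sample data where the reversed pair (B, A) appears before (A, B)
dat <- tibble(
  var1 = c("B", "A", "C", "A"),
  var2 = c("A", "B", "A", "C"),
  cor  = c(0.6, 0.6, 0.4, 0.4)
)

res <- dat %>%
  # lo/hi are hypothetical helper columns: the smaller and larger name of each pair
  mutate(lo = pmin(var1, var2), hi = pmax(var1, var2)) %>%
  # keep one row per unordered pair
  distinct(lo, hi, .keep_all = TRUE) %>%
  # write the normalized order back so the lower variable is in var1
  mutate(var1 = lo, var2 = hi) %>%
  select(var1, var2, cor)
```

Note that string comparison in R follows the current locale's collation order, so "lower" here means "sorts first" in that locale.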

Data -

dat <- tibble(
  var1 = LETTERS[c(1,2,1,3)],
  var2 = LETTERS[c(2,1,3,1)],
  cor = c(0.6,0.6,0.4,0.4))

Note: cleaned up the logic thanks to @tmfmnk
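
Since the data comes from a correlation matrix, another option is to avoid creating the duplicates in the first place by reading only the upper triangle of the matrix. The following base R sketch assumes a small matrix built from the built-in mtcars data set (not the asker's data.jobs, which is not available here); the names m, idx and res are illustrative.

```r
# Correlation matrix for three columns of the built-in mtcars data set
m <- cor(mtcars[, c("mpg", "disp", "hp")])

# Row/column positions of the cells strictly above the diagonal
idx <- which(upper.tri(m), arr.ind = TRUE)

# One row per unordered pair; var1 is always the earlier column of the matrix
res <- data.frame(
  var1 = rownames(m)[idx[, "row"]],
  var2 = colnames(m)[idx[, "col"]],
  cor  = m[idx],
  stringsAsFactors = FALSE
)
```

Here "first" means earlier in the matrix's column order rather than alphabetical order, and the diagonal (each variable against itself) is dropped as well.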
