Compare group of two columns and return index matches R

Views: 85 Published: 2020/10/6 18:48:31 Tags: r dataframe compare


Problem description

Many thanks for reading. Apologies for what I'm sure is a simple task.

I have a dataframe (edited: added an extra column that is not to be included in the comparison):

b = c(5, 6, 7, 8, 10, 11) 
c = c('david','alan','pete', 'ben', 'richard', 'edd') 
d = c('alex','edd','ben','pete','raymond', 'alan')
df = data.frame(b, c, d) 
df
   b       c       d
1  5   david    alex
2  6    alan     edd
3  7    pete     ben
4  8     ben    pete
5 10 richard raymond
6 11     edd    alan

I want to compare the group of columns c and d with the group of columns d and c. That is, for one row, I want to compare the combined values in c and d with the combined values in d and c for all other rows.

(Note that the values could be either characters or integers.)

Where these match, I want to return the indexes of the rows that match, preferably as a list of lists. I need to be able to access the indexes without referring to the values in column c or d.

I.e. for the above dataframe, my expected output would be:

c(c(2, 6), c(3, 4))
((2,6), (3,4))

As:

Row 2: (c + d == alan + edd) = row 6: (d + c == edd + alan)
Row 3: (c + d == pete + ben) = row 4: (d + c == ben + pete)
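
One caveat about the notation used for the expected output, shown as a minimal sketch in base R: c() concatenates its arguments into one flat vector, so c(c(2, 6), c(3, 4)) does not keep the two pairs separate, whereas list() does. The names flat and pairs are introduced here purely for illustration.

```r
# c() flattens nested vectors into a single numeric vector of length 4.
flat <- c(c(2, 6), c(3, 4))

# list() keeps each pair as its own element, giving the "list of lists" shape.
pairs <- list(c(2, 6), c(3, 4))

length(flat)   # number of elements in the flattened vector
length(pairs)  # number of pairs kept as separate elements
pairs[[1]]     # the first pair of row indices
```

So a result of the form list(c(2, 6), c(3, 4)) is probably the closer match to what is being asked for.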

I understand how to determine the match case for two separate columns using match or melt, but not if they are joined together and iterating over all possible row combinations.

I envisage something like:
lapply(1:6, function(x), ifelse((df$a & df$b) == (df$b & df$a), index(x), 0))

But obviously that is incorrect and won't work.

I consulted the following questions but have been unable to formulate an answer. I have no idea where to begin.
  

Matching multiple columns on different data frames and getting other column as result
Match two columns with two other columns (https://stackoverflow.com/questions/6880450/match-two-columns-with-two-other-columns)
Compare a data frame across many rows
R compare each value of all pairwise columns

How can I achieve the above?

Recommended answer

You could do something like this. It splits the row indices seq_len(nrow(df)) according to unique sorted strings formed from columns c and d of df (the extra column b is left out, since the question excludes it from the comparison). The sorting ensures that A,B and B,A are treated identically.

duplist <- split(seq_len(nrow(df)), apply(df[, c("c", "d")], 1, function(r) paste(sort(r), collapse = " ")))

duplist
$`alan edd`
[1] 2 6

$`alex david`
[1] 1

$`ben pete`
[1] 3 4

$`raymond richard`
[1] 5
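
The result above still contains the rows that have no partner (rows 1 and 5) and is named by the values in c and d. As a minimal sketch, assuming only base R, the groups can be filtered down to those with more than one row and the names dropped, which gives the index-only list of pairs the question asks for. The names key, groups and matches are illustrative and not part of the original answer.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  b = c(5, 6, 7, 8, 10, 11),
  c = c("david", "alan", "pete", "ben", "richard", "edd"),
  d = c("alex", "edd", "ben", "pete", "raymond", "alan"),
  stringsAsFactors = FALSE
)

# Order-independent key per row, built from columns c and d only.
key <- apply(df[, c("c", "d")], 1, function(r) paste(sort(r), collapse = " "))

# Group row indices by key, keep groups with more than one row, drop the names.
groups  <- split(seq_len(nrow(df)), key)
matches <- unname(Filter(function(idx) length(idx) > 1, groups))

matches[[1]]  # row indices of the first matching group
```

With the example data, matches should hold two elements, the index pairs 2, 6 and 3, 4, and each can be reached as matches[[i]] without referring to the values in c or d. If the values themselves may contain spaces, a separator that cannot occur in the data would be a safer choice for collapse.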

