Compare two data frames, column-order-independent, to get non-duplicated rows

Views: 132 Published: 2016/12/21 15:34:38 Tags: r dataframe compare


Problem description

 
I want to compare two data frames and find the rows that are not duplicated between them.
Assume the order of the columns doesn't matter, so if df1 looks like this:
V2 V3
71 78
90 13
12 67
56 32
and df2 like this:
V2 V3
89 45
77 88
78 71
90 13
then the non-duplicated rows from both data frames will be:
12 67
56 32
89 45
77 88
How can I achieve this in a simple way?
Solution
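Here's a dplyr solution, which should be fairly quick on larger datasets:
library(dplyr)

# Example data from the question (tibble() replaces the deprecated data_frame())
df1 <- tibble(v1 = c(71, 90, 12, 56), v2 = c(78, 13, 67, 32))
df2 <- tibble(v1 = c(89, 77, 78, 90), v2 = c(45, 88, 71, 13))

# Stack both data frames into one
df3 <- bind_rows(df1, df2)

df3 %>%
  rowwise() %>%
  # Order-independent key; the separator keeps pairs such as (1, 23) and (12, 3) distinct
  mutate(key = paste(min(v1, v2), max(v1, v2), sep = "_")) %>%
  group_by(key) %>%
  mutate(size = n()) %>%
  # Keep only rows whose key occurs exactly once
  filter(size == 1)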
This solution only works for two key columns. To extend it to more columns, you mainly need to change how the key is built.
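
One way the key could be built for any number of columns is to sort the values within each row before pasting them together. The following is a minimal base R sketch of that idea, not the answer's dplyr code; the helper name row_key is hypothetical, and it assumes all columns are numeric and that the "_" separator never occurs in the values.

```r
# Hypothetical helper (not part of the original answer): build an
# order-independent key per row by sorting the row's values.
row_key <- function(df) {
  apply(as.matrix(df), 1, function(r) paste(sort(r), collapse = "_"))
}

# Example data from the question
df1 <- data.frame(v1 = c(71, 90, 12, 56), v2 = c(78, 13, 67, 32))
df2 <- data.frame(v1 = c(89, 77, 78, 90), v2 = c(45, 88, 71, 13))

# Stack both data frames and compute one key per row
df3 <- rbind(df1, df2)
key <- row_key(df3)

# Keep rows whose key occurs exactly once across both data frames
keep <- !(duplicated(key) | duplicated(key, fromLast = TRUE))
result <- df3[keep, , drop = FALSE]
print(result)
```

Because the key is derived from the sorted row values, the same helper should work unchanged for three or more columns. Note that, like the dplyr version, it also drops rows that are duplicated within a single data frame.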

Edit: I misunderstood the problem as per comments below. 
