R: find duplicated rows, regardless of order


Question

I've been thinking about this problem all night. Here is my matrix:

'a' '#' 3
'#' 'a' 3
 0  'I am' 2
'I am' 0 2

.....

I want to treat rows like the first two as the same, because they differ only in the order of 'a' and '#'. In my case, I want to delete such rows. The toy example is simple: the first two rows are the same, and the third and the fourth are the same. But in my real data set, I don't know where the 'same' rows are.

I'm writing this in R. Thanks.

Recommended answer

Perhaps something like this would work for you. It is not clear what your desired output is, though.

x <- structure(c("a", "#", "0", "I am", "#", "a", "I am", "0", "3", 
                 "3", "2", "2"), .Dim = c(4L, 3L))
x
#      [,1]   [,2]   [,3]
# [1,] "a"    "#"    "3" 
# [2,] "#"    "a"    "3" 
# [3,] "0"    "I am" "2" 
# [4,] "I am" "0"    "2" 


# Sort each row so element order is ignored, then flag repeated rows.
duplicated(
  lapply(seq_len(nrow(x)), function(i) {
    row_i <- x[i, ]
    row_i[order(row_i)]
  }))
# [1] FALSE  TRUE FALSE  TRUE

This basically splits the matrix up by row, then sorts each row. `duplicated` works on lists too, so you just wrap the whole thing in `duplicated` to find which items (rows) are duplicated.
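Since the question asks to delete such rows, the logical vector can be used to subset the matrix. The following is only a sketch under a few assumptions: it uses `apply` with `sort` instead of the `lapply` call (a swapped-in variant, not the answerer's exact code), and the names `sorted_rows`, `dup`, `x_unique`, `key`, and `dup_key` are made up for illustration. The second half assumes that only the first two columns should be order-insensitive and that the third column is a value that must match as-is, which the question does not state.

```r
# Same toy matrix as in the answer (filled column by column).
x <- matrix(c("a", "#", "0", "I am",
              "#", "a", "I am", "0",
              "3", "3", "2", "2"), nrow = 4)

# Sort each row so that order within a row no longer matters.
# apply() returns the sorted rows as columns, hence the t().
# na.last = TRUE keeps any NA values instead of dropping them.
sorted_rows <- t(apply(x, 1, function(r) sort(r, na.last = TRUE)))

# duplicated() on a matrix compares whole rows by default.
dup <- duplicated(sorted_rows)
dup
# [1] FALSE  TRUE FALSE  TRUE

# Keep only the first occurrence of each unordered row.
x_unique <- x[!dup, , drop = FALSE]

# Assumed variant: ignore order only within columns 1 and 2,
# and require column 3 to match exactly.
key <- t(apply(x[, 1:2, drop = FALSE], 1, sort))
dup_key <- duplicated(cbind(key, x[, 3]))
```

With the toy data both approaches flag rows 2 and 4, so `x_unique` keeps rows 1 and 3. The two can differ on real data: sorting the whole row would also treat a value in the third column as interchangeable with the first two, which may or may not be what is wanted.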
