R: find duplicated rows, regardless of order


Problem description

I've been thinking about this problem all night. Here is my matrix:

'a' '#' 3
'#' 'a' 3
 0  'I am' 2
'I am' 0 2

.....

I want to treat rows like the first two as the same row, because they differ only in the order of 'a' and '#'. In my case, I want to delete such rows. The toy example is simple: the first two rows are the same, and the third and fourth are the same. In my real data set, however, I don't know where the 'same' rows are.

I'm writing in R. Thanks.

Recommended answer

Perhaps something like this would work for you, although it is not clear what your desired output is.

x <- structure(c("a", "#", "0", "I am", "#", "a", "I am", "0", "3", 
                 "3", "2", "2"), .Dim = c(4L, 3L))
x
#      [,1]   [,2]   [,3]
# [1,] "a"    "#"    "3" 
# [2,] "#"    "a"    "3" 
# [3,] "0"    "I am" "2" 
# [4,] "I am" "0"    "2" 


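# Sort the values within each row, then flag rows whose sorted
# contents have already appeared earlier in the matrix.
duplicated(
  lapply(seq_len(nrow(x)), function(y){
    A <- x[y, ]     # extract row y as a vector
    A[order(A)]     # put its values in a canonical order
  }))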
# [1] FALSE  TRUE FALSE  TRUE

This splits the matrix into its rows and sorts the values within each row. duplicated() also works on lists, so wrapping the whole expression in duplicated() shows which items (rows) are duplicates.
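Since the question asks to delete such rows, the logical vector can be used to subset the matrix. The following is a minimal sketch rather than part of the original answer: it swaps the lapply() call for apply() with sort(), assumes the matrix contains no NA values (sort() drops them), and the names sorted_rows, is_dup, and x_unique are made up for illustration.

```r
# Toy matrix from the answer above (filled column by column)
x <- matrix(c("a", "#", "0", "I am",
              "#", "a", "I am", "0",
              "3", "3", "2", "2"), nrow = 4)

# Sort the values within each row. apply() returns the sorted rows as
# columns, so transpose to get one row per original row again.
# Assumption: no NA values, since sort() would silently drop them.
sorted_rows <- t(apply(x, 1, sort))

# On a matrix, duplicated() compares whole rows
is_dup <- duplicated(sorted_rows)

# Keep only the first occurrence of each unordered row;
# drop = FALSE keeps the result a matrix even if one row remains
x_unique <- x[!is_dup, , drop = FALSE]
x_unique
```

With the toy data, rows 1 and 3 of x are kept. Note that this approach, like the answer above, sorts all three columns together, so a value in the third column could in principle trade places with one in the first two. If only the first two columns should be order-insensitive, one option is to sort x[, 1:2] alone and compare it together with the untouched third column.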
