R: find duplicated rows, regardless of order
Question
I've been thinking about this problem for a whole night. Here is my matrix:
'a' '#' 3
'#' 'a' 3
0 'I am' 2
'I am' 0 2
.....
I want to treat rows like the first two as the same, because they differ only in the order of 'a' and '#'. In my case, I want to delete such rows. The toy example is simple: the first two rows are the same, and the third and fourth are the same. But in my data set, I don't know where the 'same' rows are.
I'm working in R. Thanks.
Recommended answer
Perhaps something like this would work for you, although it is not clear what your desired output is.
x <- structure(c("a", "#", "0", "I am", "#", "a", "I am", "0", "3",
"3", "2", "2"), .Dim = c(4L, 3L))
x
#      [,1]   [,2]   [,3]
# [1,] "a"    "#"    "3"
# [2,] "#"    "a"    "3"
# [3,] "0"    "I am" "2"
# [4,] "I am" "0"    "2"
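# Sort the values within each row, then flag rows whose sorted
# contents have already been seen.
duplicated(
  lapply(seq_len(nrow(x)), function(y) {
    A <- x[y, ]
    A[order(A)]
  }))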
# [1] FALSE TRUE FALSE TRUE
This basically splits the matrix up by row, then sorts each row. `duplicated` works on lists too, so you just wrap the whole thing in `duplicated` to find which items (rows) are duplicated.
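Since the question asks to delete such rows, the logical vector can be used to subset the matrix. The following is only a sketch of one way to do that, not part of the original answer. It assumes that keeping the first occurrence of each unordered row is what is wanted, and it sorts every column of the row (including the third one), as the answer above does. The names `sorted_rows`, `dup`, and `deduped` are made up for illustration.

```r
# Same toy matrix as in the answer above.
x <- matrix(c("a", "#", "0", "I am",
              "#", "a", "I am", "0",
              "3", "3", "2", "2"), nrow = 4)

# Sort the values within each row. apply() returns the sorted rows as
# columns, so transpose back to one row per original row.
# na.last = TRUE keeps any NA values so every row stays the same length.
sorted_rows <- t(apply(x, 1, function(r) sort(r, na.last = TRUE)))

# duplicated() on a matrix compares whole rows by default.
dup <- duplicated(sorted_rows)

# Keep the first occurrence of each unordered row; drop = FALSE keeps
# the result a matrix even if only one row survives.
deduped <- x[!dup, , drop = FALSE]
deduped
```

If only the first two columns should be compared regardless of order, one option would be to sort just those columns, for example with `apply(x[, 1:2], 1, ...)`, and bind the remaining columns back on before calling `duplicated`.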