Find the most repeated row in a matrix

Views: 180 | Published: 2017/7/20 23:58:58 | Tags: r matrix duplicates row


Problem description


I have about 10000 replicates of a sample in a matrix. My matrix has 10000 rows and 6 columns. The numbers in each column range from 0 to 58, depending on the sample.

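# Probability of drawing each of the six categories
actual.prob <- c(.14, .14, .16, .13, .19, .24)
# 10000 replicates, each a sample of 58 draws (one replicate per column)
million.rep <- replicate(10000, sample(1:6, 58, replace = TRUE, prob = actual.prob))
new.matrix <- matrix(nrow = 10000, ncol = 6)
for (i in 1:10000) {
  # Count how often each category 1..6 occurs in replicate i
  new.matrix[i, ] <- as.vector(table(factor(million.rep[, i], levels = 1:6)))
}
new.matrix[1:10, ]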

          [,1] [,2] [,3] [,4] [,5] [,6]
     [1,]    3    7   11   11   11   15
     [2,]    7    6   12    5   19    9
     [3,]   12    7    6    8   11   14
     [4,]    6    7   16    6   11   12
     [5,]    5    9   12    5   14   13
     [6,]    9    4   14    7   10   14
     [7,]    6    9    9    6   15   13
     [8,]    9    4    8    8   11   18
     [9,]    6   11    7    5   12   17
     [10,]    7    6    9    9   15   12

I want to find out whether any samples are repeated. I tried duplicated(), which tells me which rows are duplicates, but I want to view the rows themselves without having to go back and look them up manually. Any suggestions?
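For context, here is a minimal base R sketch of what the question describes, assuming the function meant is base R's duplicated(). The logical vector it returns can index the matrix directly, so the repeated rows are shown without a manual lookup. The small matrix m is hypothetical toy data, not the poster's new.matrix.

```r
# Toy matrix (hypothetical data): rows 1 and 3 are identical
m <- rbind(c(1, 2),
           c(3, 4),
           c(1, 2),
           c(5, 6))

# duplicated() alone only flags the later copies of a row; combining it with
# fromLast = TRUE also flags the first occurrence
dup <- duplicated(m) | duplicated(m, fromLast = TRUE)

# Row numbers that belong to a repeated pattern
dup.idx <- which(dup)

# The repeated rows themselves
dup.rows <- m[dup, , drop = FALSE]
dup.rows
```

This shows which rows are repeated, but not how many times each pattern occurs.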

Solution

Here is a data.table implementation:

library(data.table)
dt <- data.table(new.matrix)
# Group by all six columns; .N is the group size, .I[1] the first row index in the group
head(dt[, list(repeats = .N, id = .I[1]), by = names(dt)][order(repeats, decreasing = TRUE)], 20)
#     V1 V2 V3 V4 V5 V6 repeats   id
#  1:  5  7 11  8 13 14       4  543
#  2:  5 11 13  5 10 14       4  579
#  3:  6  8  6 10 12 16       4 1433
#  4:  6  9  9  9  9 16       4 1688
#  5:  8  8  9  7 10 16       4 2382
#  6:  6 10  8  7 11 16       4 2965
#  7:  7  9 11  8 11 12       4 3114
#  8:  8  8 10  7 10 15       4 3546
#  9:  7  8 12  9  9 13       4 5759
# 10:  7  7 13  9 10 12       4 9021
# 11:  8 10  8  8 12 12       3   81
# 12:  9 10  7  7 11 14       3  110
# 13:  7 11  8  6 12 14       3  130
# 14: 11  9  7  7  9 15       3  143
# 15:  8 10 10  7 11 12       3  330
# 16:  8  9 10  8 13 10       3  480
# 17:  9 10  7 10 11 11       3  542
# 18:  8  6 11  9 11 13       3  555
# 19:  7 10  7  6 10 18       3  577
# 20:  7  8 10  5 12 16       3  601

Here repeats is how many times a row shows up, and id is the index of the first row in the matrix that matches that pattern.
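If data.table is not available, the same idea (group identical rows, count them, and record the first matching row index) can be sketched in base R. This is one possible alternative rather than the method from the answer. The toy matrix m and the names key, counts, first.id, and res are hypothetical, chosen only for illustration.

```r
# Toy matrix standing in for new.matrix (hypothetical data, not from the post)
m <- rbind(c(1, 2, 3),
           c(4, 5, 6),
           c(1, 2, 3),
           c(7, 8, 9),
           c(1, 2, 3),
           c(4, 5, 6))

# Collapse each row into one string key so identical rows share the same key
key <- apply(m, 1, paste, collapse = ",")

# Count how often each key occurs and find the first row carrying each key
counts <- table(key)
first.id <- match(names(counts), key)

# One line per distinct row, most repeated first
res <- data.frame(m[first.id, , drop = FALSE],
                  repeats = as.vector(counts),
                  id = first.id)
res <- res[order(-res$repeats, res$id), ]
rownames(res) <- NULL
res
```

Pasting rows into string keys is simple but builds one string per row, so on very large matrices the grouped data.table approach is likely to be faster.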
