Find the most repeated row in a matrix
Problem description
I have about 10000 replicates of a sample in a matrix. My matrix has 10000 rows and 6 columns. Each entry is a count between 0 and 58, depending on the sample.
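# Cell probabilities for the six categories
actual.prob <- c(.14, .14, .16, .13, .19, .24)
# Each column of million.rep is one sample of 58 draws from 1:6
million.rep <- replicate(10000, sample(1:6, 58, replace = TRUE, prob = actual.prob))
# Row i of new.matrix holds the counts of the values 1..6 in sample i
new.matrix <- matrix(nrow = 10000, ncol = 6)
for (i in 1:10000) {
  new.matrix[i, ] <- as.vector(table(factor(million.rep[, i], levels = 1:6)))
}
new.matrix[1:10, ]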
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 3 7 11 11 11 15
[2,] 7 6 12 5 19 9
[3,] 12 7 6 8 11 14
[4,] 6 7 16 6 11 12
[5,] 5 9 12 5 14 13
[6,] 9 4 14 7 10 14
[7,] 6 9 9 6 15 13
[8,] 9 4 8 8 11 18
[9,] 6 11 7 5 12 17
[10,] 7 6 9 9 15 12
I want to find out whether any samples are repeated. I tried duplicated(), which tells me which rows are duplicates, but I want to view the repeated rows themselves without having to go back and look each one up manually. Any suggestions?
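For reference, here is a minimal base-R sketch of how duplicated() can be used to pull out the repeated rows directly, rather than only flagging them. The small matrix m is a hypothetical stand-in for new.matrix, and the names is.rep, rep.idx and rep.rows are chosen only for illustration.

```r
# Small hypothetical matrix standing in for new.matrix
m <- rbind(c(1, 2, 3),
           c(4, 5, 6),
           c(1, 2, 3),
           c(7, 8, 9),
           c(4, 5, 6),
           c(1, 2, 3))

# duplicated() flags only the 2nd, 3rd, ... occurrence of a row;
# OR-ing with fromLast = TRUE flags the first occurrence as well
is.rep <- duplicated(m) | duplicated(m, fromLast = TRUE)

# Indices and contents of every row that occurs more than once
rep.idx  <- which(is.rep)
rep.rows <- m[is.rep, , drop = FALSE]
```

This shows which rows repeat, but not how often each pattern occurs, which is what a grouped count provides.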
Solution
Here is a data.table implementation:
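library(data.table)
dt <- data.table(new.matrix)
# Group by all six columns: .N counts the rows in each group, .I[1] is the first matching row index
head(dt[, list(repeats = .N, id = .I[1]), by = names(dt)][order(repeats, decreasing = TRUE)], 20)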
# V1 V2 V3 V4 V5 V6 repeats id
# 1: 5 7 11 8 13 14 4 543
# 2: 5 11 13 5 10 14 4 579
# 3: 6 8 6 10 12 16 4 1433
# 4: 6 9 9 9 9 16 4 1688
# 5: 8 8 9 7 10 16 4 2382
# 6: 6 10 8 7 11 16 4 2965
# 7: 7 9 11 8 11 12 4 3114
# 8: 8 8 10 7 10 15 4 3546
# 9: 7 8 12 9 9 13 4 5759
# 10: 7 7 13 9 10 12 4 9021
# 11: 8 10 8 8 12 12 3 81
# 12: 9 10 7 7 11 14 3 110
# 13: 7 11 8 6 12 14 3 130
# 14: 11 9 7 7 9 15 3 143
# 15: 8 10 10 7 11 12 3 330
# 16: 8 9 10 8 13 10 3 480
# 17: 9 10 7 10 11 11 3 542
# 18: 8 6 11 9 11 13 3 555
# 19: 7 10 7 6 10 18 3 577
# 20: 7 8 10 5 12 16 3 601
where repeats is how many times a row shows up, and id is the index of the first row in the matrix that matches that pattern.
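If data.table is not available, the same two quantities can be approximated in base R. The following is only a sketch of an alternative technique (one string key per row, tallied with table()), not the method from the answer above. The matrix m and the names key, tab and res are hypothetical.

```r
# Small hypothetical matrix standing in for new.matrix
m <- rbind(c(1, 2, 3),
           c(4, 5, 6),
           c(1, 2, 3),
           c(7, 8, 9),
           c(4, 5, 6),
           c(1, 2, 3))

# Collapse each row into one string key, e.g. "1 2 3"
key <- apply(m, 1, paste, collapse = " ")

# How many times each distinct row pattern occurs
tab <- table(key)

# Index of the first row that matches each pattern
id <- match(names(tab), key)

# One line per distinct pattern, most repeated first
res <- data.frame(m[id, , drop = FALSE],
                  repeats = as.vector(tab),
                  id = id)
res <- res[order(-res$repeats, res$id), ]
```

Here res$repeats and res$id play the same roles as the repeats and id columns in the data.table output. On a 10000-row matrix the string keys add some overhead, so the grouped data.table call is likely the faster choice.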