Getting rows of a matrix which coincide with a series of vectors, without using apply


Question

My question is somewhat related to my previous question.

Suppose I have one matrix and 4 vectors (these can be considered another matrix, since the order of the vectors matters), and I want to get the row numbers that coincide with each vector, in order. I would like the solution to avoid repeating vectors and to be as efficient as possible, since the problem is large scale.

Example:

set.seed(1)

M = matrix(rpois(50,5),5,10)
v1 = c(3, 2, 7, 7, 4, 4, 7, 4, 5, 6)
v2 = c(8, 6, 4, 4, 3, 8, 3, 6, 5, 6)
v3 = c(4, 8, 3, 5, 9, 4, 5, 6, 7, 7)
v4 = c(4, 9, 3, 6, 3, 1, 5, 7, 6, 1)

Vmat = cbind(v1,v2,v3,v4)

M
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    4    8    3    5    9    4    5    6    7     7
[2,]    4    9    3    6    3    1    5    7    6     1
[3,]    5    6    6   11    6    4    5    2    7     5
[4,]    8    6    4    4    3    8    3    6    5     6
[5,]    3    2    7    7    4    4    7    4    5     6

Vmat
      v1 v2 v3 v4
 [1,]  3  8  4  4
 [2,]  2  6  8  9
 [3,]  7  4  3  3
 [4,]  7  4  5  6
 [5,]  4  3  9  3
 [6,]  4  8  4  1
 [7,]  7  3  5  5
 [8,]  4  6  6  7
 [9,]  5  5  7  6
[10,]  6  6  7  1

The output should be...

5 4 1 2

Answer

Similar to @user295691's answer, we merge, but now with a data.table join (M[V, on=...]) and its which=TRUE option, which returns the matching row numbers of M instead of the joined rows:

set.seed(1)
matdata  <- create_data(1e6,20,1e5) # create_data() comes from @user295691's answer and is not reproduced here

library(data.table)
M = as.data.table(matdata$M)
V = as.data.table(matdata$V)

r <- M[V, on=names(V), which=TRUE]

To verify that it is correct...

V[1,]
#    V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
# 1:  7  5  3  2  5  6  3  3  5   5   3   2   4   9   4   4   3   6   4   3
M[r[1],]
#    V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
# 1:  7  5  3  2  5  6  3  3  5   5   3   2   4   9   4   4   3   6   4   3
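
To make the join easier to follow, here is a minimal sketch of the same call on the small example from the question. The matrix is typed in by hand from the printed output rather than regenerated with rpois, and the names M_dt, V_dt and r_small are introduced here for illustration only.

```r
library(data.table)

# The 5 x 10 matrix printed in the question, entered row by row
M <- matrix(c(
  4, 8, 3,  5, 9, 4, 5, 6, 7, 7,
  4, 9, 3,  6, 3, 1, 5, 7, 6, 1,
  5, 6, 6, 11, 6, 4, 5, 2, 7, 5,
  8, 6, 4,  4, 3, 8, 3, 6, 5, 6,
  3, 2, 7,  7, 4, 4, 7, 4, 5, 6
), nrow = 5, byrow = TRUE)

# The four query vectors, stored as columns as in the question
Vmat <- cbind(
  v1 = c(3, 2, 7, 7, 4, 4, 7, 4, 5, 6),
  v2 = c(8, 6, 4, 4, 3, 8, 3, 6, 5, 6),
  v3 = c(4, 8, 3, 5, 9, 4, 5, 6, 7, 7),
  v4 = c(4, 9, 3, 6, 3, 1, 5, 7, 6, 1)
)

# Both tables get the default column names V1..V10, so they can be
# joined on every column; each row of V_dt is one query vector
M_dt <- as.data.table(M)
V_dt <- as.data.table(t(Vmat))

# which = TRUE returns row numbers of M_dt, in the order of V_dt's rows
r_small <- M_dt[V_dt, on = names(V_dt), which = TRUE]
print(r_small)  # 5 4 1 2, the output requested in the question
```

Note that the vectors have to be transposed into rows first, because the join compares rows of V_dt against rows of M_dt.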


Benchmarks

OP's example data (in a deleted answer):

set.seed(1)

NM    = 1e6
NV    = 1e5
Ncols = 20
MM = matrix(rpois(NM*Ncols,Ncols),NM,Ncols)

rows=sample(NM,NV,replace = FALSE)

Vmat=t(MM[rows,])

# converted to data.frames, because why not?
M = as.data.frame(MM)
V = as.data.frame(t(Vmat))

# converted to data.tables
M2 = setDT(copy(M))
V2 = setDT(copy(V))

Functions to test:

match_strings <- function(){
  m = do.call(function(...) paste(...,sep="_"), M)
  v = do.call(function(...) paste(...,sep="_"), V)
  match(v,m)
}

merge_df <- function(){ # from @user295691's answer
  M$mid = seq(nrow(M))
  V$vid = seq(nrow(V))
  with(merge(M,V), mid[order(vid)])
}

merge_dt <- function(){
  M2[V2, on=names(V2), which=TRUE]
}
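
As a rough illustration of what match_strings does, the sketch below applies the same idea to a tiny data frame: each row is collapsed into one string key, and match() then looks up the keys of V among the keys of M. The objects M_toy, V_toy, m_keys and v_keys are made up for this example and use only base R.

```r
# Toy data: three candidate rows and two query rows (hypothetical values)
M_toy <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
V_toy <- data.frame(a = c(3, 1), b = c(6, 4))

# Paste the columns of each row together into a single key per row
m_keys <- do.call(paste, c(M_toy, sep = "_"))
v_keys <- do.call(paste, c(V_toy, sep = "_"))

# For each query key, match() gives the position of its first occurrence
row_ids <- match(v_keys, m_keys)
print(row_ids)  # 3 1
```

The separator matters for this approach: without one, rows such as (1, 23) and (12, 3) would collapse into the same key.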

Results:

system.time({r_strings = match_strings()})
#    user  system elapsed 
#   10.40    0.06   10.49     
system.time({r_merge_df = merge_df()})
#    user  system elapsed 
#   14.71    0.10   14.84
system.time({r_merge_dt = merge_dt()})
#    user  system elapsed 
#    0.39    0.00    0.40 

identical(r_strings,r_merge_df) # TRUE
identical(r_strings,r_merge_dt) # TRUE
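
The benchmark data has no repeated rows in M and every row of V is present in M, so all three functions agree. If that does not hold for your data, the sketch below shows one way the join could be made to behave like match(); it assumes the mult and which arguments of data.table's `[` as documented, and M_dup, V_q, all_hits and first_hits are hypothetical names used only here.

```r
library(data.table)

# Rows 1 and 3 of M_dup are identical; the second row of V_q is not in M_dup
M_dup <- data.table(V1 = c(1L, 2L, 1L), V2 = c(9L, 8L, 9L))
V_q   <- data.table(V1 = c(1L, 5L),     V2 = c(9L, 5L))

# Default join: one entry per match, so the result can be longer than V_q
all_hits <- M_dup[V_q, on = names(V_q), which = TRUE]

# mult = "first" keeps only the first matching row per query row,
# and unmatched query rows come back as NA
first_hits <- M_dup[V_q, on = names(V_q), which = TRUE, mult = "first"]
print(first_hits)  # 1 NA

# Same answer as the string-key approach on this data
key_hits <- match(do.call(paste, c(V_q, sep = "_")),
                  do.call(paste, c(M_dup, sep = "_")))
```

With mult = "first" the result has exactly one entry per row of V_q, which keeps it aligned with the order of the query vectors.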
