Fast way to create a binary matrix with a known number of 1s in each row in R


Question

I have a vector that gives the number of 1s in each row of a matrix. Now I need to build that matrix from the vector.

For example, let's say I want to create a 4 x 9 matrix out from the vector v <- c(2,6,3,9). The result should look like this:

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    1    0    0    0    0    0    0    0
[2,]    1    1    1    1    1    1    0    0    0
[3,]    1    1    1    0    0    0    0    0    0
[4,]    1    1    1    1    1    1    1    1    1

I've done this with a for loop but my solution is slow for a large matrix (100,000 x 500):

out <- NULL
for(i in 1:length(v)){
  out <- rbind(out,c(rep(1, v[i]),rep(0,9-v[i])))
}

Does anyone have an idea for a faster way to create such a matrix?

Recommended answer

Here is my approach using sapply and do.call and some timings on a small sample.

library(microbenchmark)
library(Matrix)

v <- c(2,6,3,9)
microbenchmark(
  roman = {
    xy <- sapply(v, FUN = function(x, ncols) {
      c(rep(1, x), rep(0, ncols - x))
    }, ncols = 9, simplify = FALSE)

    xy <- do.call("rbind", xy)
  },
  fourtytwo = {
    t(vapply(v, function(y) { x <- numeric( length=9); x[1:y] <- 1;x}, numeric(9) ) )
  },
  akrun = {
    m1 <- sparseMatrix(i = rep(seq_along(v), v), j = sequence(v), x = 1)
    m1 <- as.matrix(m1)
  })

Unit: microseconds
      expr      min        lq       mean    median       uq
     roman   26.436   30.0755   36.42011   36.2055   37.930
 fourtytwo   43.676   47.1250   55.53421   54.7870   57.852
     akrun 1261.634 1279.8330 1501.81596 1291.5180 1318.720
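
Before comparing timings, it can help to confirm that the three expressions build the same matrix. The following is a minimal sketch and not part of the original answer. Two details are assumptions added here: `ncols` is an illustrative name for the column count, and `dims` is passed to `sparseMatrix` so the result keeps `ncols` columns even when no element of `v` equals `ncols`.

```r
library(Matrix)

v <- c(2, 6, 3, 9)
ncols <- 9  # illustrative name for the number of columns

# roman: build each row as a vector, then bind the rows together
roman <- do.call("rbind", lapply(v, function(x) c(rep(1, x), rep(0, ncols - x))))

# fourtytwo: fill a zero vector per row; vapply returns the rows as columns,
# so the result is transposed. seq_len(y) is used instead of 1:y so that a
# row with zero 1s would also be handled correctly.
fourtytwo <- t(vapply(v, function(y) {
  x <- numeric(ncols)
  x[seq_len(y)] <- 1
  x
}, numeric(ncols)))

# akrun: list the (row, column) positions of the 1s, then convert to dense
akrun <- as.matrix(sparseMatrix(i = rep(seq_along(v), v), j = sequence(v),
                                x = 1, dims = c(length(v), ncols)))

# dimnames may differ between approaches, so strip them before comparing
stopifnot(identical(unname(roman), unname(fourtytwo)),
          isTRUE(all.equal(unname(roman), unname(akrun))))
```

If the check passes silently, the timings compare equivalent results.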

And with a larger sample:

v <- sample(2:9, size = 10e3, replace = TRUE)

Unit: milliseconds
      expr      min       lq     mean   median       uq
     roman 33.52430 35.80026 37.28917 36.46881 37.69137
 fourtytwo 37.39502 40.10257 41.93843 40.52229 41.52205
     akrun 10.00342 10.34306 10.66846 10.52773 10.72638

As the object size grows, the benefits of sparseMatrix come to light.
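
When a dense result is needed anyway, another option is a fully vectorized base R comparison of each row's count against the column index. This is a minimal sketch that was not part of the benchmark above, so no timing claim is made for it; `ncols` is an illustrative name.

```r
v <- c(2, 6, 3, 9)
ncols <- 9  # illustrative name for the number of columns

# outer() builds a length(v) x ncols logical matrix:
# entry [i, j] is TRUE when column j falls within the first v[i] columns
out <- outer(v, seq_len(ncols), ">=")

# convert TRUE/FALSE to 1/0 while keeping the matrix shape
storage.mode(out) <- "integer"
out
```

Adding a `outer = { ... }` entry to the `microbenchmark()` call would show how it compares on your own data.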
