How to divide each row of a matrix by elements of a vector in R

Question

I would like to divide each row of a matrix by a fixed vector. For example

mat<-matrix(1,ncol=2,nrow=2,TRUE)
dev<-c(5,10)

Evaluating mat/dev divides each column by dev:

     [,1] [,2]
[1,]  0.2  0.2
[2,]  0.1  0.1

However, I would like to get the following result, i.e. perform the operation row-wise:

rbind(mat[1,]/dev, mat[2,]/dev)

     [,1] [,2]
[1,]  0.2  0.1
[2,]  0.2  0.1

Is there an explicit command to get there?

Recommended answer

Here are a few ways, in order of increasing code length:

t(t(mat) / dev)

mat / dev[col(mat)] #  @DavidArenburg & @akrun

mat %*% diag(1 / dev)

sweep(mat, 2, dev, "/")

t(apply(mat, 1, "/", dev))

plyr::aaply(mat, 1, "/", dev)

mat / rep(dev, each = nrow(mat))

mat / t(replace(t(mat), TRUE, dev))

mapply("/", as.data.frame(mat), dev)  # added later

mat / matrix(dev, nrow(mat), ncol(mat), byrow = TRUE)  # added later

do.call(rbind, lapply(as.data.frame(t(mat)), "/", dev))

mat2 <- mat; for(i in seq_len(nrow(mat2))) mat2[i, ] <- mat2[i, ] / dev
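
The shorter forms rely on R's recycling rule: a matrix is stored column by column, so in a plain mat / dev the vector is recycled down each column and dev[i] ends up paired with row i. Transposing first, or expanding dev so that it lines up with the columns, fixes the alignment. The sketch below uses a small made-up 2 x 3 matrix m and divisor d (illustrative names and values, not from the question) to check that a few of the solutions above agree.

```r
# Illustrative data (not from the question): a 2 x 3 matrix, one divisor per column.
m <- matrix(1:6, nrow = 2)     # columns are (1,2), (3,4), (5,6)
d <- c(1, 2, 4)

# Reference result: divide column j by d[j], written out explicitly.
expected <- cbind(m[, 1] / d[1], m[, 2] / d[2], m[, 3] / d[3])

r1 <- t(t(m) / d)                  # after t(), recycling runs along the original rows
r2 <- m / d[col(m)]                # col(m) gives each element's column index
r3 <- sweep(m, 2, d, "/")          # MARGIN = 2: d lines up with the columns
r4 <- m / rep(d, each = nrow(m))   # expand d to match column-major storage

# TRUE only if every variant matches the explicit reference.
results_agree <- all(
  isTRUE(all.equal(r1, expected)),
  isTRUE(all.equal(r2, expected)),
  isTRUE(all.equal(r3, expected)),
  isTRUE(all.equal(r4, expected))
)
print(results_agree)
```

Using a non-square matrix with distinct entries makes a row/column mix-up visible, which the all-ones 2 x 2 example in the question cannot do.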

Data Frames

All the solutions that begin with mat / also work if mat is a data frame, and they produce a data frame result. The same is true of the sweep solution and of the last (i.e. mat2) solution. The mapply solution works with data frames but produces a matrix.
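
As a quick check of these claims, the sketch below converts the question's matrix to a data frame (the name df is illustrative; the column names V1 and V2 are R's defaults) and inspects the type of each result.

```r
# Data from the question, held in a data frame instead of a matrix.
mat <- matrix(1, ncol = 2, nrow = 2)
dev <- c(5, 10)
df  <- as.data.frame(mat)

by_rep    <- df / rep(dev, each = nrow(df))   # one of the "mat /" solutions
by_sweep  <- sweep(df, 2, dev, "/")           # the sweep solution
by_mapply <- mapply("/", df, dev)             # mapply simplifies its result

print(is.data.frame(by_rep))      # data frame in, data frame out
print(is.data.frame(by_sweep))
print(is.matrix(by_mapply))       # matrix, not a data frame
```

If a data frame is needed from the mapply route, wrapping the result in as.data.frame() is one option.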

If mat is a plain vector rather than a matrix, then either of these returns a one-column matrix:

t(t(mat) / dev)
mat / t(replace(t(mat), TRUE, dev))

and this returns a vector:

plyr::aaply(mat, 1, "/", dev)

The others give an error, warning or not the desired answer.
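
A minimal sketch of the vector case, using made-up values (v is an illustrative name), shows where the one-column matrix comes from. The use of drop() to get a plain vector back is a suggestion added here, not part of the original answer.

```r
# A plain numeric vector standing in for a single "row" (illustrative values).
v   <- c(10, 20)
dev <- c(5, 10)

# t(v) is a 1 x 2 matrix, so the division pairs v[j] with dev[j];
# the outer t() then turns the 1 x 2 result into a 2 x 1 matrix.
res <- t(t(v) / dev)
print(dim(res))

# drop() removes the length-one dimension and returns a plain vector.
print(drop(res))
```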

The brevity and clarity of the code may be more important than speed, but for completeness here are some benchmarks, first using 10 repetitions and then 100 repetitions.

library(microbenchmark)
library(plyr)

set.seed(84789)

mat<-matrix(runif(1e6),nrow=1e5)
dev<-runif(10)

microbenchmark(times=10L,
  "1" = t(t(mat) / dev),
  "2" = mat %*% diag(1/dev),
  "3" = sweep(mat, 2, dev, "/"),
  "4" = t(apply(mat, 1, "/", dev)),
  "5" = mat / rep(dev, each = nrow(mat)),
  "6" = mat / t(replace(t(mat), TRUE, dev)),
  "7" = aaply(mat, 1, "/", dev),
  "8" = do.call(rbind, lapply(as.data.frame(t(mat)), "/", dev)),
  "9" = {mat2 <- mat; for(i in seq_len(nrow(mat2))) mat2[i, ] <- mat2[i, ] / dev},
 "10" = mat/dev[col(mat)])

giving:

Unit: milliseconds
 expr         min          lq       mean      median          uq        max neval
    1    7.957253    8.136799   44.13317    8.370418    8.597972  366.24246    10
    2    4.678240    4.693771   10.11320    4.708153    4.720309   58.79537    10
    3   15.594488   15.691104   16.38740   15.843637   16.559956   19.98246    10
    4   96.616547  104.743737  124.94650  117.272493  134.852009  177.96882    10
    5   17.631848   17.654821   18.98646   18.295586   20.120382   21.30338    10
    6   19.097557   19.365944   27.78814   20.126037   43.322090   48.76881    10
    7 8279.428898 8496.131747 8631.02530 8644.798642 8741.748155 9194.66980    10
    8  509.528218  524.251103  570.81573  545.627522  568.929481  821.17562    10
    9  161.240680  177.282664  188.30452  186.235811  193.250346  242.45495    10
   10    7.713448    7.815545   11.86550    7.965811    8.807754   45.87518    10

Re-running the test on all those that took <20 milliseconds with 100 repetitions:

microbenchmark(times=100L,
  "1" = t(t(mat) / dev),
  "2" = mat %*% diag(1/dev),
  "3" = sweep(mat, 2, dev, "/"),
  "5" = mat / rep(dev, each = nrow(mat)),
  "6" = mat / t(replace(t(mat), TRUE, dev)),
 "10" = mat/dev[col(mat)])

giving:

Unit: milliseconds
 expr       min        lq      mean    median        uq       max neval
    1  8.010749  8.188459 13.972445  8.560578 10.197650 299.80328   100
    2  4.672902  4.734321  5.802965  4.769501  4.985402  20.89999   100
    3 15.224121 15.428518 18.707554 15.836116 17.064866  42.54882   100
    5 17.625347 17.678850 21.464804 17.847698 18.209404 303.27342   100
    6 19.158946 19.361413 22.907115 19.772479 21.142961  38.77585   100
   10  7.754911  7.939305  9.971388  8.010871  8.324860  25.65829   100

So in both of these tests #2 (using diag) is the fastest. The reason may lie in its almost direct appeal to the BLAS, whereas #1 relies on the costlier t.
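
The diag solution works because right-multiplying by a diagonal matrix scales column j by the j-th diagonal entry, here 1 / dev[j]. Since it multiplies by a reciprocal instead of dividing, its results are not guaranteed to match the division-based solutions bit for bit, so a tolerance-based comparison such as all.equal is the safer check. A small sketch with made-up data (the seed, sizes and names m and d are arbitrary assumptions):

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
m <- matrix(runif(12), nrow = 4)     # small 4 x 3 stand-in for the benchmark matrix
d <- runif(3) + 1                    # divisors kept away from zero

by_diag  <- m %*% diag(1 / d)        # column j is multiplied by 1 / d[j]
by_sweep <- sweep(m, 2, d, "/")      # column j is divided by d[j]

# Compare with a tolerance rather than identical(), since x * (1 / d) and
# x / d need not round to the same floating-point value.
same <- isTRUE(all.equal(by_diag, by_sweep))
print(same)
```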
