Vectorize min() for matrix


Problem description


I'm hoping to vectorize the following loop:

for (i in 1:n) {
    for (j in 1:m) {
        temp_mat[i,j]=min(temp_mat[i,j],1);
    }
}  


I thought I could do temp_mat=min(temp_mat,1), but this is not giving me the desired result. Is there a way to vectorize this loop to make it much faster?

Recommended answer


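Just use temp_mat <- pmin(temp_mat, 1). min() reduces all of its arguments to a single number, whereas pmin() takes the minimum element by element. See ?pmin for more on parallel minima.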

Example:

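# Output below is as posted in the original answer; R 3.6.0 changed the
# default algorithm of sample(), so newer versions may print different values.
set.seed(0)
A <- matrix(sample(1:3, 25, replace = TRUE), 5)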
#> A
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    3    1    1    3    3
#[2,]    1    3    1    2    3
#[3,]    2    3    1    3    1
#[4,]    2    2    3    3    2
#[5,]    3    2    2    2    1
B <- pmin(A, 2)
#> B
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    2    1    1    2    2
#[2,]    1    2    1    2    2
#[3,]    2    2    1    2    1
#[4,]    2    2    2    2    2
#[5,]    2    2    2    2    1
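
To make the difference between min() and pmin() in the question concrete, here is a minimal sketch. The matrix values and the names scalar_result and capped are illustrative assumptions, not taken from the original post.

```r
# Small matrix with values on both sides of the cap (illustrative data).
temp_mat <- matrix(c(0.2, 1.5, 3.0, 0.7, 1.0, 2.4), nrow = 2)

# min() reduces over all of its arguments and returns one number,
# which is why temp_mat = min(temp_mat, 1) destroys the matrix.
scalar_result <- min(temp_mat, 1)

# pmin() compares element by element, recycling the scalar 1,
# and keeps the dimensions of its first argument.
capped <- pmin(temp_mat, 1)

print(scalar_result)
print(capped)
```

Assigning the result of min() back to temp_mat would replace the whole matrix with a single number, while pmin() returns a matrix of the same shape with every entry capped at 1.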




Update

Since you have a background in computational science, I would like to provide more information.


pmin is fast, but it is far from high performance. Its prefix "parallel" only means element-wise. "Vectorization" in R does not mean the same thing as "SIMD vectorization" in HPC. R is an interpreted language, so "vectorization" in R means opting for a C-level loop rather than an R-level loop. pmin is therefore just coded as a plain C loop.
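
The R-level versus C-level distinction can be checked with a rough sketch like the following. The matrix size, the random data, and the helper name cap_loop are assumptions made for illustration; timings will vary by machine, so none are quoted here.

```r
set.seed(1)
n <- 200
m <- 200
temp_mat <- matrix(runif(n * m, min = 0, max = 2), n, m)

# R-level double loop, equivalent to the loop in the question
# (cap_loop is a hypothetical helper name).
cap_loop <- function(x) {
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(ncol(x))) {
      x[i, j] <- min(x[i, j], 1)
    }
  }
  x
}

# Time both versions; pmin() runs its loop in compiled code.
loop_time <- system.time(by_loop <- cap_loop(temp_mat))
pmin_time <- system.time(by_pmin <- pmin(temp_mat, 1))

# Both approaches should produce the same capped matrix.
print(all.equal(by_loop, by_pmin))
print(loop_time)
print(pmin_time)
```

Both versions do the same element-wise work; the only difference is whether the loop runs in the R interpreter or in compiled C code.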


Real high-performance computing should benefit from SIMD vectorization. I believe you know the SSE/AVX intrinsics. If you write simple C code using _mm_min_pd from SSE2, you will get a ~2x speedup over pmin; if you use _mm256_min_pd from AVX, you will get a ~4x speedup over pmin.


Unfortunately, R itself cannot do any SIMD. I discuss this in my answer to the post "Does R leverage SIMD when doing vectorized calculations?". For your question, even if you link R to an HPC BLAS, pmin will not benefit from SIMD, simply because pmin does not involve any BLAS operations. So a better bet is to write the compiled code yourself.

