Adding a scalar to a matrix efficiently in Julia

Question

I need to add a scalar to every element of a huge matrix. The matrix will be as large as memory allows. The example below uses 2 GiB, but in my real computation it will be much larger.

A = rand(2^14, 2^14)

If I execute

A += 1.0

Julia allocates an additional 2 GiB of memory, and the operation takes about 1 s. I could use a for loop instead:

for jj = 1:size(A, 2), ii = 1:size(A, 1)
  A[ii, jj] = A[ii, jj] + 1.0
end

This does not allocate any memory, but it takes about a minute. Neither approach is viable for me: the first violates my memory constraints and the second is clearly too slow. For element-wise multiplication there is scal!, which uses BLAS. Is there a way to perform the addition as efficiently as multiplication with scal!?
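
For context, A += 1.0 is shorthand for A = A + 1.0: the right-hand side builds a new array and the name A is then rebound to it, which is where the extra 2 GiB comes from. The small sketch below illustrates this with an identity check. It assumes a recent Julia release, where adding a scalar to an array has to be written with a dot (.+), and the variable names are illustrative only.

```julia
A = rand(4, 4)        # small stand-in for the huge matrix
original = A          # second reference to the same array

A = A .+ 1.0          # allocating form: a new array is created and A is rebound

# The name A now refers to a different array; the original data is untouched.
rebound = A !== original
```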

Recommended answer

@DSM's answer is a good one, but there are a few more things going on here that I'd like to address. Your for loop is slow because A is a non-constant global variable and your code mutates that global directly. Since A is non-constant, the code has to guard against the possibility that A becomes a different value with a different type at any point while the loop is running. On every iteration it has to look up the type and location of A and dynamically dispatch the method calls in the expression A[ii, jj] = A[ii, jj] + 1.0, which involves calls to getindex, + and setindex!, all of which depend on the statically unknown type of A. You can immediately get much better performance just by doing this work in a function:

julia> A = rand(2^10, 2^10);

julia> @time for jj = 1:size(A, 2), ii = 1:size(A, 1)
           A[ii, jj] += 1
       end
elapsed time: 0.288340785 seconds (84048040 bytes allocated, 15.59% gc time)

julia> function inc!(A)
           for jj = 1:size(A, 2), ii = 1:size(A, 1)
               A[ii, jj] += 1
           end
       end
inc! (generic function with 1 method)

julia> @time inc!(A)
elapsed time: 0.006076414 seconds (171336 bytes allocated)

julia> @time inc!(A)
elapsed time: 0.000888457 seconds (80 bytes allocated)

Avoiding non-constant globals like this is the first recommendation in the Performance Tips section of the manual. You'll probably want to read the rest of that section as well.
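
The same idea extends to an arbitrary scalar. The following is a minimal sketch, not part of the original answer: add_scalar! is a hypothetical name, and the matrix is passed as an argument so that its concrete type is known inside the function.

```julia
# Hypothetical helper: add the scalar `c` to every element of `A` in place.
function add_scalar!(A::Matrix, c::Number)
    # Julia arrays are column-major, so the inner loop runs over rows (ii).
    for jj in 1:size(A, 2), ii in 1:size(A, 1)
        A[ii, jj] += c
    end
    return A
end

A = rand(2^10, 2^10)
B = copy(A)            # kept only to compare against the result
add_scalar!(A, 1.0)    # mutates A; no second matrix is created
```

Returning A is a common convention for mutating functions, so the call can be chained or used in an expression.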

We can further improve the performance of inc! by using the @inbounds annotation to indicate that bounds checks aren't necessary for this code, and by using linear indexing instead of two-dimensional indexing:

julia> function inc!(A)
           @inbounds for i = 1:length(A)
               A[i] += 1
           end
       end
inc! (generic function with 1 method)

julia> @time inc!(A)
elapsed time: 0.000637934 seconds (80 bytes allocated)

Most of the speedup comes from the @inbounds annotation rather than from linear indexing, although the latter does give a small boost. The @inbounds annotation should be used sparingly, however, and only where you are certain that the indexing cannot go out of bounds and performance is of the utmost importance. As you can see, the additional improvement is real but not overwhelming. Most of the benefit comes from not directly mutating global variables.
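
One way to keep @inbounds safe is to iterate with eachindex, which only produces valid indices for the array. The sketch below (inc_eachindex! is a hypothetical name) also shows the in-place broadcast form A .+= 1, which in recent Julia versions updates the existing array rather than allocating a new one. Neither appears in the original answer, so treat this as an illustration under those assumptions.

```julia
# Hypothetical variant of inc!: eachindex(A) yields only valid indices,
# so skipping bounds checks with @inbounds cannot read or write out of range.
function inc_eachindex!(A::AbstractArray)
    @inbounds for i in eachindex(A)
        A[i] += 1
    end
    return A
end

A = rand(4, 4)
B = copy(A)
B_before = B          # second reference, used to show B is not rebound

inc_eachindex!(A)     # explicit loop inside a function
B .+= 1               # in-place broadcast: writes into the existing array
```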
