How efficient are push!() and append!() methods in Julia?


Question

On this page it says that methods push!() and append!() are very efficient.

My question is: exactly how efficient are they?

If one knows the size of the final array, is it still faster to preallocate the array, or would growing it incrementally using append!() / push!() be just as efficient?

Now consider the case where one does not know the size of the final array, for example when merging multiple arrays into one big array (call it A).

Two ways to implement this:

  1. append!()-ing each array to A, whose size has not been preallocated.
  2. First summing the dimensions of each array to find the final size of the merged array A, then preallocating A and copying over the contents of each array.

Which one would be more efficient in this case?
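
For concreteness, the two approaches might look like the sketch below. It assumes the arrays are plain vectors of Int (not matrices), and the function names merge_append and merge_prealloc are made up for illustration.

```julia
# Approach 1: grow A incrementally with append!
function merge_append(arrays::Vector{Vector{Int}})
    A = Int[]
    for a in arrays
        append!(A, a)
    end
    return A
end

# Approach 2: sum the lengths, preallocate A, then copy each array into place
function merge_prealloc(arrays::Vector{Vector{Int}})
    total = mapreduce(length, +, arrays; init = 0)
    A = Vector{Int}(undef, total)
    pos = 1
    for a in arrays
        # copy all of `a` into A, starting at index `pos`
        copyto!(A, pos, a, 1, length(a))
        pos += length(a)
    end
    return A
end

arrays = [collect(1:3), collect(4:5), collect(6:10)]
merged_by_append = merge_append(arrays)
merged_by_prealloc = merge_prealloc(arrays)
```

Both functions should return the same merged vector; only the allocation pattern differs.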

Recommended answer

The answer to a question like this is usually: "it depends". For example, what size array are you trying to make? What is the element-type of the array?

But if you're just after a heuristic, why not run a simple speed test? For example, the following snippet:

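function f1(N::Int)
    # Preallocate an uninitialized vector of length N, then fill it
    x = Vector{Int}(undef, N)
    for n = 1:N
        x[n] = n
    end
    return x
end

function f2(N::Int)
    # Start from an empty vector and grow it one element at a time
    x = Int[]
    for n = 1:N
        push!(x, n)
    end
    return x
end

# Call each function once so compilation time is not included in the timings
f1(2)
f2(2)

# With 8-byte Ints a vector of this length needs about 40 GB; reduce N if memory is limited
N = 5000000000
@time f1(N)
@time f2(N)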

suggests that using push! is about 6 times slower than pre-allocating. If you were using append! to add larger blocks in fewer steps, the multiplier would almost certainly be smaller.
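
If you want to test the block-wise idea yourself, here is a minimal sketch of two variants of f2. The names f3 and f4 and the blocksize parameter are hypothetical, not part of the test above, and I make no claim about what timings they give. f3 adds values in blocks with append!, and f4 uses Julia's sizehint! to reserve capacity before calling push!, which is an option when the final size is known in advance.

```julia
# Hypothetical variant: add values in blocks of `blocksize` using append!
function f3(N::Int, blocksize::Int)
    x = Int[]
    for start in 1:blocksize:N
        stop = min(start + blocksize - 1, N)
        append!(x, start:stop)
    end
    return x
end

# Hypothetical variant: reserve capacity for N elements, then push! one at a time
function f4(N::Int)
    x = Int[]
    sizehint!(x, N)
    for n in 1:N
        push!(x, n)
    end
    return x
end

f3(10, 3)
f4(10)
```

Both can be timed with @time in the same way as f1 and f2.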

When interpreting these numbers, resist the knee-jerk reaction of "What!? 6-times slower!?". This number needs to be placed in the context of how important array building is to your entire program/function/subroutine. For example, if array building comprises only 1% of the run-time of your routine (for most typical routines, array building would comprise much less than 1%), then if your routine runs for 100 seconds, 1 second is spent building arrays. Multiply that by 6 to get 6 seconds. 99 seconds + 6 seconds = 105 seconds. So, using push! instead of pre-allocating increases the runtime of your whole program by 5%. Unless you work in high-frequency trading, you're probably not going to care about that.
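
The arithmetic above can be written as a small helper, in case you want to plug in your own numbers. The function name overall_runtime and its arguments are made up for illustration.

```julia
# total:    total runtime of the routine in seconds
# fraction: share of that runtime spent building arrays
# slowdown: factor by which array building becomes slower
function overall_runtime(total, fraction, slowdown)
    building = total * fraction
    rest = total - building
    return rest + building * slowdown
end

# The example from the text: 100 seconds, 1% array building, 6-times slowdown
new_total = overall_runtime(100.0, 0.01, 6.0)
relative_increase = (new_total - 100.0) / 100.0
```

With these inputs the new total comes out at about 105 seconds, which is the 5% increase quoted above.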

For myself, my usual rule is this: if I can pre-allocate easily, then I pre-allocate. But if push! makes the routine much easier to code, with lower possibility of introducing bugs, and less messing around trying to pre-determine the appropriate array size, then I use push! without a second thought.

Final note: if you want to actually look at the specifics of how push! works, you'll need to delve into the C routines, since the Julia source just wraps a ccall.

UPDATE: OP questioned in the comments the difference between push! and an operation like array(end+1) = n in MATLAB. I haven't coded in MATLAB recently, but I do keep a copy on my machine since the code for all my older papers is in MATLAB. My current version is R2014a. My understanding is that in this version of MATLAB, adding to the end of the array will re-allocate the entire array. In contrast, push! in Julia works, to the best of my knowledge, much like lists in .NET: the memory allocated to the vector is added dynamically in blocks as the size of the vector grows. This massively reduces the amount of re-allocation that needs to be performed, although my understanding is that some re-allocation is still necessary (I'm happy to be corrected on this point). So push! should work much faster than adding to an array in MATLAB. We can check this by running the following MATLAB code:

N = 10000000;
tic
x = ones(N, 1);
for n = 1:N
    x(n) = n;
end
toc


N = 10000000;
tic
x = [];
for n = 1:N
    x(end+1) = n;
end
toc

I get:

Elapsed time is 0.407288 seconds.
Elapsed time is 1.802845 seconds.

So, about a 5-times slowdown. Given the extreme non-rigour of the timing methodology, one might be tempted to say this is equivalent to the Julia case. But wait: if we re-run the exercise in Julia with N = 10000000, the timings are 0.01 and 0.07 seconds. The sheer difference in magnitude between these numbers and the MATLAB numbers makes me very nervous about making claims about what is actually happening under the hood, and about whether it is legitimate to compare the 5-times slowdown in MATLAB to the 6-times slowdown in Julia. Basically, I'm now out of my depth. Maybe someone who knows more about what MATLAB actually does under the hood can offer more insight. Regarding Julia, I'm not much of a C-coder, so I doubt I'll get much insight from looking through the source (which is publicly available, unlike MATLAB's).
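
To illustrate why adding memory in blocks reduces re-allocation, here is a toy model that only counts re-allocations. It does not reflect the actual growth policy of either Julia or MATLAB: the doubling rule and all the names in it are assumptions made purely for illustration.

```julia
# Count how many re-allocations a toy growable array needs to reach N elements.
# `grow` maps the current capacity to the new capacity whenever the array is full.
function count_reallocations(N::Int, grow)
    capacity = 0
    reallocations = 0
    for len in 1:N
        if len > capacity
            capacity = grow(capacity)
            reallocations += 1
        end
    end
    return reallocations
end

grow_by_one(c) = c + 1            # re-allocate on every single append
grow_by_doubling(c) = max(1, 2c)  # assumed geometric growth, not Julia's actual policy

realloc_one = count_reallocations(1000, grow_by_one)            # 1000
realloc_doubling = count_reallocations(1000, grow_by_doubling)  # 11
```

In this model, growing by one element re-allocates on every append, while doubling needs only 11 re-allocations to reach 1000 elements.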
