Efficient Array Preallocation in MATLAB

Question

One of the first things one learns about programming efficiently in MATLAB is to avoid dynamically resizing arrays. The standard example is as follows.

N = 1000;

% Method 0: Bad
clear a
for i=1:N
    a(i) = cos(i);
end

% Method 1: Better
clear a; a = zeros(N,1);
for i=1:N
    a(i) = cos(i);
end

The 'Bad' variant here requires O(N^2) time to run, as it must allocate a new array and copy the old values at each iteration of the loop.
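
The quadratic cost can be made concrete with a small counting sketch. It assumes the simplest possible model, in which every append reallocates the array and copies all existing elements; the variable name copied is hypothetical, and MATLAB's actual memory manager may well behave differently.

```matlab
N = 1000;
copied = 0;                      % total elements copied under the assumed naive model
for k = 1:N
    copied = copied + (k - 1);   % growing from k-1 to k elements copies the k-1 old values
end
fprintf('N = %d: %d element copies, closed form N*(N-1)/2 = %d\n', N, copied, N*(N-1)/2);
```

Under this model the number of copies grows as N*(N-1)/2, which is where the O(N^2) estimate comes from.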

My own preferred practice when debugging is to preallocate the array with NaN, which is harder to confuse with a valid value than 0.

% Method 2: Easier to Debug
clear a; a = NaN(N,1);
for i=1:N
    a(i) = cos(i);
end
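
A short sketch of why NaN helps when debugging: if a loop bound is off by one, the element that was never written still holds NaN and can be located with isnan. The deliberately wrong bound N-1 and the name missing are assumptions made for illustration.

```matlab
N = 1000;
a = NaN(N,1);
for i = 1:N-1                 % deliberate off-by-one bug: last element is never set
    a(i) = cos(i);
end
missing = find(isnan(a));     % indices that were never assigned
fprintf('%d element(s) never assigned, first at index %d\n', numel(missing), missing(1));
```

With zeros(N,1) the same bug would leave a 0 in the last slot, which can easily pass for a computed result.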

However, one would naively think that once the code is debugged, we are wasting time by allocating an array and then filling it with 0 or NaN. As noted here, you can perhaps create an uninitialized array as follows:

% Method 3 : Even Better?
clear a; a(N,1) = 0;
for i=1:N
    a(i) = cos(i);
end

However, in my own tests (MATLAB R2013a), I notice no appreciable difference between methods 1 and 3, while method 2 takes more time. This suggests that MATLAB has avoided explicitly initializing the array to zero when a = zeros(N,1) is called.
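
A rough way to reproduce this comparison is sketched below. The array size, the repetition count, and the choice of taking the minimum over repetitions are assumptions, not the setup used for the observation above; a single write to a(1) is included so that the array is actually used.

```matlab
N = 1e6;                        % assumed array size
reps = 20;                      % assumed number of repetitions
t = zeros(reps, 3);             % columns: zeros, NaN, a(N,1) = 0
for r = 1:reps
    clear a; tic; a = zeros(N,1); a(1) = 1; t(r,1) = toc;
    clear a; tic; a = NaN(N,1);   a(1) = 1; t(r,2) = toc;
    clear a; tic; a(N,1) = 0;     a(1) = 1; t(r,3) = toc;
end
best = min(t, [], 1);           % best time per method, less sensitive to outliers
fprintf('zeros: %.3g s, NaN: %.3g s, a(N,1)=0: %.3g s\n', best(1), best(2), best(3));
```

The absolute numbers depend on the machine and the MATLAB release, so only the relative ordering is worth reading off.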

I am therefore curious:

  • What is the best way to preallocate an (uninitialized) array in MATLAB? (Most importantly, large arrays)
  • Does this also hold in Octave?

Recommended answer

Test

Using MATLAB 2013b on an Intel Xeon 3.6GHz with 16GB RAM, I ran the code below to profile the methods. I distinguished 3 methods and only considered 1D arrays, i.e. vectors. Methods 1 and 2 were tested with both row vectors and column vectors, i.e. (1,n) and (n,1).

Method 1 (M1R, M1C)

a = zeros(1,n);

Method 2 (M2R, M2C)

a = NaN(1,n);

Method 3 (M3)

a(n) = 0;
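
What Method 3 does can be checked on a small example. Assigning to an index beyond the current size makes MATLAB grow the array and pad it with zeros, so the result is not truly uninitialized; and if the variable already exists with at least n elements, the assignment only overwrites one element. The variable names below are chosen for illustration.

```matlab
n = 5;
clear a b
a(n) = 0;            % variable did not exist: creates a 1-by-n row vector of zeros
b(n,1) = 0;          % column variant: n-by-1 vector of zeros
c = ones(1,8);
c(n) = 0;            % variable already exists and is longer: size stays 1-by-8
disp(size(a)); disp(size(b)); disp(size(c));
```

This is why a clear is needed before each use of this method when it is timed in a loop.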

Results

The timing results and the number of elements have been plotted on a double logarithmic scale in figure timing1D.

As shown, the allocation time of the third method is almost independent of vector size, while the times of the other methods increase steadily, suggesting that the third method only defines the vector implicitly.

Discussion

MATLAB does a lot of code optimization using JIT (just-in-time) compilation, i.e. code optimization at run time. So it is a valid question whether the faster-running part of the code is faster because of how it is programmed (which stays the same whether or not the code is optimized) or because of the optimization. To test this, the optimization can be turned off using feature('accel','off'). The results of running the code again are rather interesting:
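
A minimal sketch of such an on/off comparison follows. feature('accel',...) is undocumented, so the call is wrapped in try/catch and skipped under Octave; the Octave check via OCTAVE_VERSION and the size n are assumptions added here, and newer MATLAB releases may ignore the setting.

```matlab
n = 2^20;                                   % assumed vector length
isOctave = exist('OCTAVE_VERSION', 'builtin') ~= 0;
modes = {'on', 'off'};
t = zeros(1, 2);
for m = 1:2
    if ~isOctave
        try
            feature('accel', modes{m});     % undocumented switch used in this answer
        catch
            % switch not available here: both timings then use the default setting
        end
    end
    clear e
    tic; e(n) = 0; e(1) = 1; t(m) = toc;    % Method 3 under the current setting
end
if ~isOctave
    try
        feature('accel', 'on');             % restore the default
    catch
    end
end
fprintf('accel on: %.3g s, accel off: %.3g s\n', t(1), t(2));
```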

They show that Method 1 is now optimal, for both row and column vectors, and that Method 3 behaves like the other methods did in the first test.

Conclusion

Optimizing memory pre-allocation is useless and a waste of time, since MATLAB will optimize it for you anyway.

Note that memory should be pre-allocated, but the way in which you do it doesn't matter. The performance of pre-allocating memory depends largely on whether or not MATLAB's JIT compiler chooses to optimize your code. This depends entirely on the rest of your .m-file, since the compiler considers chunks of code at a time and then tries to optimize them (it even has a memory, so running a file several times might result in an even lower execution time). Also, memory pre-allocation is usually a very short process compared to the calculation performed afterwards.
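
The last point can be checked by timing the allocation and the subsequent computation separately. The sketch below reuses the cos(i) loop from the question as a stand-in workload; the size N and the names tAlloc and tFill are assumptions.

```matlab
N = 1e5;                                   % assumed array size
tic; a = zeros(N,1); tAlloc = toc;         % pre-allocation only
tic;
for i = 1:N
    a(i) = cos(i);                         % stand-in for the real computation
end
tFill = toc;
fprintf('allocation: %.3g s, fill loop: %.3g s\n', tAlloc, tFill);
```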

In my opinion, memory should be pre-allocated using either method 1 or method 2, to keep the code readable and to use the functions that the MATLAB help suggests, since these are the most likely to be improved in the future.

Code used

clear all
clc
feature('accel','on')

number1D=30;

nn1D=2.^(1:number1D);

timings1D=zeros(5,number1D);

for ii=1:length(nn1D);
    n=nn1D(ii);
    % 1D
    tic
    a = zeros(1,n);
    a(randi(n,1))=1;
    timings1D(1,ii)=toc;
    fprintf('1D row vector method1 took: %f\n',timings1D(1,ii))
    clear a

    tic
    b = zeros(n,1);
    b(randi(n,1))=1;
    timings1D(2,ii)=toc;
    fprintf('1D column vector method1 took: %f\n',timings1D(2,ii))
    clear b

    tic
    c = NaN(1,n);
    c(randi(n,1))=1;
    timings1D(3,ii)=toc;
    fprintf('1D row vector method2 took: %f\n',timings1D(3,ii))
    clear c

    tic
    d = NaN(n,1);
    d(randi(n,1))=1;
    timings1D(4,ii)=toc;
    fprintf('1D column vector method2 took: %f\n',timings1D(4,ii))
    clear d

    tic
    e(n) = 0;
    e(randi(n,1))=1;
    timings1D(5,ii)=toc;
    fprintf('1D row vector method3 took: %f\n',timings1D(5,ii))
    clear e
end
logtimings1D = log10(timings1D);
lognn1D=log10(nn1D);
figure(1)
clf()
hold on
plot(lognn1D,logtimings1D(1,:),'-k','LineWidth',2)
plot(lognn1D,logtimings1D(2,:),'--k','LineWidth',2)
plot(lognn1D,logtimings1D(3,:),'-.k','LineWidth',2)
plot(lognn1D,logtimings1D(4,:),'-','Color',[0.6 0.6 0.6],'LineWidth',2)
plot(lognn1D,logtimings1D(5,:),'--','Color',[0.6 0.6 0.6],'LineWidth',2)
xlabel('Number of elements (log10[-])')
ylabel('Timing of each method (log10[s])')
legend('M1R','M1C','M2R','M2C','M3','Location','NW')
title({'Various methods of pre-allocation in 1D','nr. of elements vs timing'})
hold off

Note

The lines containing c(randi(n,1))=1; do nothing except assign the value one to a random element of the pre-allocated array, so that the array is actually used and the JIT compiler is challenged a bit. These lines do not affect the pre-allocation measurement significantly, i.e. their cost is not measurable and they do not affect the test.
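
This claim can be checked by timing the allocation and the single random write separately, as in the sketch below; the size n and the names tAlloc and tTouch are assumptions. No particular result is implied, since the outcome depends on the machine and release.

```matlab
n = 2^20;                                  % assumed vector length
tic; c = NaN(1,n);       tAlloc = toc;     % allocation only
tic; c(randi(n,1)) = 1;  tTouch = toc;     % single write to a random element
fprintf('allocation: %.3g s, random write: %.3g s\n', tAlloc, tTouch);
```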
