Looping statements performance and pre-allocating the looping statement itself
Question
This observation is not that important, because the time spent on the statements inside a loop will probably be much higher than the time spent on the looping itself. But I will share it anyway, since I searched and couldn't find a topic about this. I always had the impression that pre-allocating the array I want to loop over, and then looping over it, would be faster than putting the expression directly in the for statement, so I decided to check. The code below compares the efficiency of these two for loops:
disp('Pure for with column on statement:')
tic
for k=1:N
end
toc
disp('Pure for with column declared before statement:')
tic
m=1:N;
for k=m
end
toc
But the results I got are:
Pure for with column on statement:
Elapsed time is 0.003309 seconds.
Pure for with column declared before statement:
Elapsed time is 0.208744 seconds.
Why the hell is that? Shouldn't pre-allocating be faster?
In fact, the MATLAB help for the for statement (help for) says:
Long loops are more memory efficient when the colon expression appears in the FOR statement since the index vector is never created.
So, contrary to my expectations, the colon expression in the for statement is better: it does not allocate the vector and, because of that, is faster.
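To make the cost of the stored index vector concrete, here is a minimal sketch that checks how much memory the variable from the second loop occupies. The variable name info is only illustrative. In MATLAB a double takes 8 bytes, so a vector of one million doubles should come to roughly 8 MB; other environments may store ranges differently.

```matlab
% Sketch: how much memory does the explicitly stored index vector take?
N = 1000000;
m = 1:N;              % the index vector is materialized in the workspace
info = whos('m');     % query size information for the variable m
fprintf('m holds %d elements and occupies %d bytes\n', numel(m), info.bytes);
```

When the colon expression stays inside the for statement, no such variable is created, which is what the help text refers to.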
I wrote the following script to test other cases that I also expected to be faster:
% For comparison:
N=1000000;
disp('Pure for loop on cell declared on statement:')
tic
for k=repmat({1},1,N)
end
toc
disp('Pure for loop on cell declared before statement:')
tic
mcell=repmat({1},1,N);
for k=mcell
end
toc
disp('Pure for loop calculating length on statement:')
tic
for k=1:length(mcell)
end
toc
disp('Pure for loop calculating length before statement:')
tic
lMcell = length(mcell);
for k=1:lMcell
end
toc
disp('Pure while loop using le:')
% While comparison:
tic
k=1;
while (k<=N)
    k=k+1;
end
toc
disp('Pure while loop using lt+1:')
% While comparison:
tic
k=1;
while (k<N+1)
    k=k+1;
end
toc
disp('Pure while loop using lt+1 pre allocated:')
tic
k=1;
myComp = N+1;
while (k<myComp)
    k=k+1;
end
toc
And the timings are:
Pure for loop on cell declared on statement:
Elapsed time is 0.259250 seconds.
Pure for loop on cell declared before statement:
Elapsed time is 0.260368 seconds.
Pure for loop calculating length on statement:
Elapsed time is 0.012132 seconds.
Pure for loop calculating length before statement:
Elapsed time is 0.003027 seconds.
Pure while loop using le:
Elapsed time is 0.005679 seconds.
Pure while loop using lt+1:
Elapsed time is 0.006433 seconds.
Pure while loop using lt+1 pre allocated:
Elapsed time is 0.005664 seconds.
Conclusions:
- You can gain a bit of performance just by putting the colon expression directly in the for statement, but the gain is likely negligible compared to the time spent inside the for loop.
- For cells, the difference seems to be negligible.
- It is better to compute the length before the loop than inside the for statement.
- The while loop has about the same efficiency as the for loop that does not pre-allocate the vector, which makes sense given the help text quoted above.
- As expected, it is better to calculate fixed expressions before the while statement.
But the question I can't answer is: what about the cell case, why is there no time difference? Could the overhead be much smaller than the one observed for doubles? Or does MATLAB have to allocate the cells anyway, since a cell is not a basic type like a double?
If you know other tricks on this topic, feel free to add them.
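A single tic/toc pair can be noisy, so the comparison may be easier to trust when each variant is repeated and summarized. The following is only a sketch under that assumption: the names numRuns, tInline and tStored are made up for illustration, and the median is just one reasonable summary.

```matlab
% Sketch: repeat both loop variants and compare median timings.
N = 1000000;
numRuns = 10;                   % hypothetical number of repetitions
tInline = zeros(1, numRuns);    % colon expression inside the for statement
tStored = zeros(1, numRuns);    % index vector created before the loop

for r = 1:numRuns
    tic
    for k = 1:N
    end
    tInline(r) = toc;           % toc with an output returns elapsed seconds

    tic
    m = 1:N;
    for k = m
    end
    tStored(r) = toc;
end

fprintf('Colon in statement:  median %.6f s\n', median(tInline));
fprintf('Vector built before: median %.6f s\n', median(tStored));
```

The first repetition often includes warm-up effects, which is one reason to look at the median rather than a single run.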
Just adding the timings to show the results of running feature('accel','off'), as said in @Magla's answer.
Pure for with column on statement:
Elapsed time is 0.181592 seconds.
Pure for with column declared before statement:
Elapsed time is 0.180011 seconds.
Pure for loop on cell declared on statement:
Elapsed time is 0.242995 seconds.
Pure for loop on cell declared before statement:
Elapsed time is 0.228705 seconds.
Pure for loop calculating length on statement:
Elapsed time is 0.178931 seconds.
Pure for loop calculating length before statement:
Elapsed time is 0.178486 seconds.
Pure while loop using le:
Elapsed time is 1.138081 seconds.
Pure while loop using lt+1:
Elapsed time is 1.241420 seconds.
Pure while loop using lt+1 pre allocated:
Elapsed time is 1.162546 seconds.
The results are now as expected…
Answer
This finding has nothing to do with preallocating or not: it is about whether MATLAB is able to compute things on several cores. When you put the colon operator inside the for statement, it tells MATLAB to use several cores (i.e. multithreading).
If you restrict MATLAB to one core only with feature('accel','off'), the observed difference with doubles vanishes. With cells, MATLAB does not make use of multithreading, so no difference can be observed (whatever the status of accel is).
The for loop is multithreaded when a colon is used, and only if a colon is used. The following vectors of similar length do not engage several cores:
for k = randperm(N)
for k = linspace(1,N,N)
but
for k = 1:0.9999:N
is multithreaded.
One explanation can be found on this MATLAB support page. It states that multi-core processing can be done when "The operations in the algorithm carried out by the function are easily partitioned into sections that can be executed concurrently." With a colon operator, MATLAB knows that the for loop can be partitioned.
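One way to probe this explanation is to limit the number of computational threads and time the loops again. The sketch below does not come from the answer: it assumes the documented function maxNumCompThreads is available, whereas feature('accel', ...) is undocumented, and the variable names are illustrative. Note that the linspace timing includes building the vector.

```matlab
% Sketch: time a colon loop and a linspace loop with the default thread
% count and again with a single computational thread.
N = 1000000;
prevThreads = maxNumCompThreads;    % remember the current setting

for s = [prevThreads, 1]
    maxNumCompThreads(s);           % limit the computational threads to s

    tic
    for k = 1:N
    end
    tColon = toc;

    tic
    for k = linspace(1, N, N)
    end
    tLinspace = toc;

    fprintf('threads = %d: colon %.6f s, linspace %.6f s\n', s, tColon, tLinspace);
end

maxNumCompThreads(prevThreads);     % restore the original setting
```

If the colon timing stays roughly the same with a single thread, the difference would have to come from something other than multithreading.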