Looping statements performance and pre-allocating the looping statement itself


Problem description



    This observation is not that important, because the time spent on the statements inside a loop will probably be much higher than the overhead of the looping itself. But I will share it anyway, since I searched and couldn't find a topic about this. I always had the impression that pre-allocating the array I would loop over, and then looping on it, would be better than putting the colon expression directly in the for statement, so I decided to check. The code below compares the efficiency of these two for loops:

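    N=1000000;  % same N as in the longer script further down
    
    % Colon expression written directly in the for statement
    disp('Pure for with column on statement:')
    tic
    for k=1:N
    end
    toc
    
    % Index vector created first, then looped over
    disp('Pure for with column declared before statement:')
    tic
    m=1:N;
    for k=m
    end
    toc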
    

    But the results I got are:

    Pure for with column on statement:
    Elapsed time is 0.003309 seconds.
    Pure for with column declared before statement:
    Elapsed time is 0.208744 seconds.
    

    Why the hell is that? Shouldn't pre-allocating be faster?

    In fact, the MATLAB help for the for statement says:

    Long loops are more memory efficient when the colon expression appears in the FOR statement since the index vector is never created.

    So, contrary to my expectations, the colon expression in the for statement is better: it does not allocate the vector and, because of that, it is faster.
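A single tic/toc pair can be noisy, so the comparison above can be repeated a few times and the best run kept. The following is only a sketch of that idea, not part of the original measurements; the names `nRuns`, `tInline` and `tVector` are made up for illustration, and the timings will differ by machine and MATLAB version.

```matlab
% Sketch: repeat the two loop forms and keep the best time of each.
N = 1000000;              % same loop length as in the question
nRuns = 5;                % hypothetical number of repetitions
tInline = zeros(1, nRuns);
tVector = zeros(1, nRuns);

for r = 1:nRuns
    % Colon expression directly in the for statement
    tic
    for k = 1:N
    end
    tInline(r) = toc;

    % Index vector created first, then looped over
    tic
    m = 1:N;
    for k = m
    end
    tVector(r) = toc;
end

fprintf('Colon in statement:     %.6f s (best of %d)\n', min(tInline), nRuns);
fprintf('Vector built beforehand: %.6f s (best of %d)\n', min(tVector), nRuns);
```

Taking the minimum rather than the mean is a design choice here: it reduces the influence of one-off delays such as the first run being slower.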

    I made the following script to test other cases where I also expected one form to be faster:

    % For comparison:
    N=1000000;
    
    disp('Pure for loop on cell declared on statement:')
    tic
    for k=repmat({1},1,N)
    end
    toc
    
    disp('Pure for loop on cell declared before statement:')
    tic
    mcell=repmat({1},1,N);
    for k=mcell
    end
    toc
    
    disp('Pure for loop calculating length on statement:')
    tic 
    for k=1:length(mcell)
    end
    toc
    
    disp('Pure for loop calculating length before statement:')
    tic
    lMcell = length(mcell);
    for k=1:lMcell
    end
    toc
    
    disp('Pure while loop using le:')
    % While comparison:
    tic
    k=1;
    while (k<=N)
      k=k+1;
    end
    toc
    
    disp('Pure while loop using lt+1:')
    % While comparison:
    tic
    k=1;
    while (k<N+1)
      k=k+1;
    end
    toc
    
    
    disp('Pure while loop using lt+1 pre allocated:')
    tic
    k=1;
    myComp = N+1;
    while (k<myComp)
      k=k+1;
    end
    toc
    

    And the timings are:

    Pure for loop on cell declared on statement:
    Elapsed time is 0.259250 seconds.
    Pure for loop on cell declared before statement:
    Elapsed time is 0.260368 seconds.
    Pure for loop calculating length on statement:
    Elapsed time is 0.012132 seconds.
    Pure for loop calculating length before statement:
    Elapsed time is 0.003027 seconds.
    Pure while loop using le:
    Elapsed time is 0.005679 seconds.
    Pure while loop using lt+1:
    Elapsed time is 0.006433 seconds.
    Pure while loop using lt+1 pre allocated:
    Elapsed time is 0.005664 seconds.
    

    Conclusions:

    • You can gain a bit of performance just by looping on a colon expression in the for statement, but the gain is negligible compared to the time spent inside the for loop.
    • For cells the difference seems to be negligible.
    • It is better to compute the length before the loop than inside the for statement.
    • The while loop has the same efficiency as the for loop without pre-allocating the vector, which makes sense given what was stated before.
    • As expected, it is better to calculate fixed expressions before the while statement.

    But the question I can't answer is: what about the cell, why is there no time difference? Could the overhead be much smaller than the one observed for doubles? Or does it have to allocate the cells anyway, since a cell is not a basic type like a double?

    If you know other tricks concerning this topic, feel free to add them.


    Just adding the timings to show the results after calling feature('accel','off'), as suggested in @Magla's answer.

    Pure for with column on statement:
    Elapsed time is 0.181592 seconds.
    Pure for with column declared before statement:
    Elapsed time is 0.180011 seconds.
    Pure for loop on cell declared on statement:
    Elapsed time is 0.242995 seconds.
    Pure for loop on cell declared before statement:
    Elapsed time is 0.228705 seconds.
    Pure for loop calculating length on statement:
    Elapsed time is 0.178931 seconds.
    Pure for loop calculating length before statement:
    Elapsed time is 0.178486 seconds.
    Pure while loop using le:
    Elapsed time is 1.138081 seconds.
    Pure while loop using lt+1:
    Elapsed time is 1.241420 seconds.
    Pure while loop using lt+1 pre allocated:
    Elapsed time is 1.162546 seconds.
    

    The results are now as expected…
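For anyone who wants to reproduce the comparison, here is a rough sketch of timing the same loop with the setting switched off and then back on. `feature('accel', ...)` is the undocumented call quoted in this thread, so its behaviour may vary between MATLAB releases; the variable names `tOff` and `tOn` are only illustrative.

```matlab
% Sketch: time the same empty loop with feature('accel') off and on.
% feature('accel', ...) is undocumented; treat this as an experiment only.
N = 1000000;

feature('accel', 'off');
tic
for k = 1:N
end
tOff = toc;

feature('accel', 'on');   % restore the setting afterwards
tic
for k = 1:N
end
tOn = toc;

fprintf('accel off: %.6f s\n', tOff);
fprintf('accel on:  %.6f s\n', tOn);
```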

    Solution

    This finding has nothing to do with preallocating or not: it is about whether MATLAB is able to compute things with several cores. When you insert the colon operator within the for statement, it tells MATLAB to use several cores (i.e. multithreading).

    If you set MATLAB to one core only with feature('accel','off'), the observed difference with doubles vanishes. Concerning cells, MATLAB does not make use of multithreading, so no difference can be observed (whatever the status of accel is).

    The for loop is multithreaded when a colon is used, and only if a colon is used. The following vectors of similar length do not engage several cores:

    • for k = randperm(N)
    • for k = linspace(1,N,N)

    but for k = 1:0.9999:N is multithreaded.

    One explanation can be found on this MATLAB support page. It states that multi-core processing can be done when "the operations in the algorithm carried out by the function are easily partitioned into sections that can be executed concurrently." With a colon operator, MATLAB knows that the for loop can be partitioned.
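The loop forms mentioned in this answer can be timed side by side with a short script. This is only a sketch under the same tic/toc approach used in the question; the variable names are made up for illustration. Note that for `randperm` and `linspace` the measured time also includes building the vector, so the numbers show the total cost of each form rather than the loop overhead alone.

```matlab
% Sketch: time the four loop headers discussed in the answer.
N = 1000000;

tic
for k = 1:N                 % plain colon expression
end
tColon = toc;

tic
for k = 1:0.9999:N          % colon expression with a non-integer step
end
tStep = toc;

tic
for k = randperm(N)         % vector from a function call (includes its creation)
end
tPerm = toc;

tic
for k = linspace(1, N, N)   % vector from a function call (includes its creation)
end
tLin = toc;

fprintf('1:N            %.6f s\n', tColon);
fprintf('1:0.9999:N     %.6f s\n', tStep);
fprintf('randperm(N)    %.6f s\n', tPerm);
fprintf('linspace(1,N,N) %.6f s\n', tLin);
```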
