证明在循环中使用范围运算符不会占用额外的内存 [英] Prove that using a range operator in a loop does not use additional memory

查看:69
本文介绍了证明在循环中使用范围运算符不会占用额外的内存的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

范围运算符.. 的当前文档指出不会为计数循环:

...范围运算符对于编写foreach (1..10)循环和对数组执行切片操作很有用. 在当前实现中,当将range运算符用作foreach循环中的表达式时,不会创建临时数组,但是当您编写如下代码时,旧版本的Perl可能会消耗大量内存:

... The range operator is useful for writing foreach (1..10) loops and for doing slice operations on arrays. In the current implementation, no temporary array is created when the range operator is used as the expression in foreach loops, but older versions of Perl might burn a lot of memory when you write something like this:

    1.   for (1 .. 1_000_000) {
    2.       # code
    3.   }

由于上述for (MIN .. MAX)的早期实现,我仍然来自那些对使用计数循环持谨慎态度的专家,因为他们认为计数循环等同于:

Because of the aforementioned early implementations of for (MIN .. MAX), I still come by experts who are wary of using counting loops because they believe it is equivalent to:

my @temp_array = (MIN .. MAX);       # Needlessly using up memory
for (@temp_array) {

相对于逻辑和内存效率更高:

Versus the more logical and memory efficient:

for ($_ = MIN; $_ <= MAX; $_++) {    # Logical counting from MIN to MAX

问题:

  • 有没有办法证明计数循环不会浪费内存?

  • Is there a way that one could go about proving that a counting loop does not waste memory?

有人知道哪个版本的Perl存在内存问题以及何时修复?

Does anyone know which versions of Perl had the memory issue and when it was fixed?

我能够向我自己证明,使用下面的单线计数循环不会浪费内存,如果它实际上正在创建一个临时数组,那肯定会使我的系统崩溃.但是,如果有关于该主题的更多结论性信息,那就太好了,这样我们就可以平息这个老妇的故事了.

I'm able to prove to myself that counting loops don't waste memory using the below one-liner which would certainly crash my system if it was actually creating a temporary array. However, it would be nice if there was more conclusive information on the subject so that we could put this old-wives tale to rest.

$ perl -e 'for (1 .. 1_000_000_000_000_000) { print "$_\n"; last if $_ == 5 }'
1
2
3
4
5


解决方案

以下三个答案中的每一个都是解释此问题的部分方法:


Solution

Each of the three below answers go part of the way to explaining this issue:

  1. ikegami 答案彻底分解了不同类型的循环,并演示了计数循环在前端和后端有何不同. ☆
  2. friedo 答案显示了如何使用top监视内存使用情况.
  3. Borodin 答案解决了我的查询的第二部分,即何时不再成为问题:
    • perlop v5.4_68 (1998年6月23日发布)警告使用了临时数组.
    • perlop v5.4_69 (1998年6月29日发布)指出不再使用临时数组.
    • (1998年7月9日发布)指出,计数循环已得到优化.
  1. ikegami's answer thoroughly breaks down different types loops and demonstrates how counting loops differ on the front and back end. ☆
  2. friedo's answer shows how to use top to monitor memory usage.
  3. Borodin's answer addresses the second part of my query about when this no longer became an issue:
    • perlop v5.4_68 (released on 23 June 1998) warns a temporary array is used.
    • perlop v5.4_69 (released on 29 June 1998) states a temporary array is no longer used.
    • perldelta v5.4_71 (released on 9 July 1998) states that counting loops are optimized.

我可能会在某个时候做一些特定的版本测试,但是鉴于这显然是一个16年的历史了,我相信perlop中的警告可以被搁置.

I might do some specific version testing at some point, but given this is apparently a 16 year old issue, I'm confident that the warning in perlop can be put to rest.

推荐答案

首先,这里列出了不同类型的for循环以及可以应用的优化.从5.6到5.20(包括现在)的所有版本的Perl中都存在所有这些,我相信它是全面的.

First of all, here's a list of different types of for loops and the optimizations that can be applied. All of these are present in every version of Perl from 5.6 to 5.20 (present) inclusive, and I believe it's comprehensive.

  • for (EXPR; EXPR; EXPR)
    ⇒ "C样式的循环",一个增强的while循环.
  • for (EXPRX..EXPRY)
    ⇒范围和其他所有内容都没有优化到计数循环中.
  • for (@ARRAY)
    ⇒数组,没有其他东西可以直接访问而不是被展平.
  • for (reverse LIST)
    ⇒以上优化都不适用,但是列表是反向遍历的,而不是反向遍历的.
  • for (LIST)
    ⇒在通用的foreach循环中,LIST表达式在循环开始之前求值.
  • for (EXPR; EXPR; EXPR)
    ⇒ "C-style for loop", an augmented while loop.
  • for (EXPRX..EXPRY)
    ⇒ A range and nothing else is optimized into a counting loop.
  • for (@ARRAY)
    ⇒ An array and nothing else is accessed directly rather than being flattened.
  • for (reverse LIST)
    ⇒ None of the above optimizations apply, but the list is traversed in reverse rather than being reversed.
  • for (LIST)
    ⇒ In a generic foreach loop, the LIST expression is evaluated before the loop starts.

CONSTX..CONSTY展平时(即,除了for (CONSTX..CONSTY)中的任何地方),将在编译时(而不是运行时)展平.

When CONSTX..CONSTY is flattened (i.e. anywhere other than in for (CONSTX..CONSTY)), it is flattened at compile-time rather than run-time.

基准内存使用情况:

$ perl -e'system(ps, ho, rss, 0+$$);'
 1540         # 1.5 MiB

一般情况趋于平缓.

$ perl -e'$y=2_000_000; for ((),1..$y) { system(ps, ho, rss, 0+$$); last }'
80208         # 78 MiB

或更糟. (除了正常的堆栈使用率,它还在编译时展平为一个数组.)

Or worse. (It flattens into an array at compile-time in addition to the normal stack usage.)

$ perl -e'for ((),1..2_000_000) { system(ps, ho, rss, 0+$$); last }'
143224        # 140 MiB

for (CONST..CONST)不会变平.

$ perl -e'for (1..2_000_000) { system(ps, ho, rss, 0+$$); last }'
 1540         # 1.5 MiB

实际上,for (EXPR..EXPR)通常不会变平.

In fact, for (EXPR..EXPR) in general doesn't flatten.

$ perl -e'$y=2_000_000; for (1..$y) { system(ps, ho, rss, 0+$$); last }'
 1540         # 1.5 MiB

即使没有工具,您也可以分辨出编译时间的差异.

Even without tools, you could tell the the difference in compilation time.

$ time perl -c -e'1 for 1..2_000_000'
-e syntax OK

real    0m0.010s
user    0m0.004s
sys     0m0.000s

$ time perl -c -e'1 for (),1..2_000_000'
-e syntax OK

real    0m1.197s
user    0m0.952s
sys     0m0.232s


白盒证明

未优化的情况在 l 的上下文中使用范围运算符.内存中的完整列表.


White box proof

The unoptimized case uses a range operator in list context. Full list in memory.

$ perl -MO=Concise,-exec -e'$y=1_000_000; 1 for (),1..$y;'
...
8  <|> range(other->9)[t3] lK/1                      <-- Range operator
9      <#> gvsv[*y] s
a      <1> flop lKM
           goto b
i  <$> const[IV 1] s
j  <1> flip[t4] lK/LINENUM
b  <#> gv[*_] s
c  <{> enteriter(next->d last->g redo->d) lK/8       <-- No S
...

这是在编译时展平的范围:

This is what a range flattened at compile-time looks like:

$ perl -MO=Concise,-exec -e'1 for (),1..1_000_000;'
...
4  <$> const[AV ] s                                  <-- Constant array
5  <1> rv2av lKPM/1
6  <#> gv[*_] s
7  <{> enteriter(next->8 last->b redo->8) lK/8       <-- No S
...

您会看到for (CONST..CONST)创建带有"S"标志的enteriter.在enteriter上,"S"标志表示它是一个计数循环.

You can see that for (CONST..CONST) creates an enteriter with the "S" flag. On enteriter, the "S" flag means it's a counting loop.

$ perl -MO=Concise,-exec -e'1 for 1..1_000_000;'
...
4  <$> const[IV 1] s
5  <$> const[IV 1000000] s
6  <#> gv[*_] s
7  <{> enteriter(next->8 last->b redo->8) lKS/8       <-- S
...

for (EXPR..EXPR)大致相同.

$ perl -MO=Concise,-exec -e'$y=1_000_000; 1 for 1..$y;'
...
8  <$> const[IV 1] s
9  <#> gvsv[*y] s
a  <#> gv[*_] s
b  <{> enteriter(next->c last->f redo->c) lKS/8       <-- S
...

甚至for (@a)都没有展平!

$ perl -MO=Concise,-exec -e'1 for @a;'
...
4  <#> gv[*a] s
5  <1> rv2av[t2] sKRM/1
6  <#> gv[*_] s
7  <{> enteriter(next->8 last->b redo->8) lKS/8       <-- S
...

仔细检查

$ perl -MO=Concise,-exec -e'1 for (),@a;'
...
4  <#> gv[*a] s
5  <1> rv2av[t2] lKM/1
6  <#> gv[*_] s
7  <{> enteriter(next->8 last->b redo->8) lK/8        <-- No S
...

查找"S"标志的代码将确认所有这些信息.

Looking up the code for the "S" flag will confirm all of this.

  • pp_enteriter (PL_op->op_flags & OPf_STACKED checks for "S")
  • pp_iter.

这篇关于证明在循环中使用范围运算符不会占用额外的内存的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆