Memory use of Apply vs Map. Virtual memory use and lock-ups
Question
I needed to find the sum of all pairs of numbers in a long list of pairs. Lots of ways to do this in Mathematica, but I was thinking of using either Plus or Total. Since Total works on lists, Map is the functional programming instrument to use there and Apply at level 1 (@@@) is the one to use for Plus, as Plus takes the numbers to be added as arguments.
Here is some demo code (warning: save all your work before executing this!):
pairs = Tuples[Range[6000], {2}]; (* toy example *)
TimeConstrained[Plus @@@ pairs; // Timing, 30]
(* Out[4]= {21.73, Null} *)
Total /@ pairs; // Timing
(* Out[5]= {3.525, Null} *)
You might have noticed that I've added TimeConstrained to the code for Plus. This is a protective measure I included for you because the bare code brought my PC almost to its knees. In fact, the above code works for me, but if I increase the range in the first line to 7000 my computer just locks up and never gets back. Nothing works, no alt-period, program switching, ctrl-alt-delete, attempts to fire up the process manager using the taskbar, closing the laptop lid to let it sleep, etc., really nothing.
The problem is caused by the extreme memory use of the Plus @@@ pairs line. While 'pairs' itself takes up about 288 MB, and the list of totals half of that, the Plus line quickly consumes about 7 GB for its calculations. This is the end of my free physical memory and anything bigger causes the use of virtual memory on disk. And Mathematica and/or Windows apparently don't play nice when virtual memory is used (BTW, do MacOS and Linux behave better?). In contrast, the Total line doesn't have a noticeable impact on the memory usage graph.
I have two questions:
- Given the equivalence between Plus and Total as stated in the documentation ("Total[list] is equivalent to Apply[Plus,list]."), how can the extreme difference in behavior be explained? I assume this has to do with the differences between Apply and Map, but I'm curious as to the internal mechanisms involved.
- I know I can restrict the memory footprint of a command by using MemoryConstrained, but it is a pain to have to use this everywhere you suspect Mathematica might usurp all of your system resources. Is there a global setting that I can use to tell Mathematica to use physical memory only (or, preferably, a certain fraction thereof) for all of its operations? This would be extremely helpful as this behavior has caused a handful of lockups the last couple of weeks and it's really starting to annoy me.
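For reference, the per-command approach mentioned in the second question looks roughly like the sketch below. The 200 MB budget and the name budget are assumptions chosen only for illustration. Range[10^8] needs on the order of 800 MB as a packed array on a 64-bit system, so the call should give up and return the failure expression rather than start swapping.

```mathematica
(* Assumption: 200 MB is an arbitrary illustrative budget, not a recommended value *)
budget = 200*10^6;

(* Returns $Failed instead of "finished" if the evaluation needs more than budget bytes *)
result = MemoryConstrained[Range[10^8]; "finished", budget, $Failed]

(* Kernel memory currently in use and the session peak so far, both in bytes *)
{MemoryInUse[], MaxMemoryUsed[]}
```

The limit applies only to the wrapped expression, which is why it has to be repeated around every risky call.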
Recommended answer
I just want to add a couple of observations that may clarify the situation a bit more. As noted in the answer by @Joshua (see also the comments to this post for a similar discussion), the reason for inefficiency is related to unpacking. My guess is that the general reason why Apply unpacks is that the compiler (Compile) has a very limited support for Apply - namely, only 3 heads can be used - List, Plus and Times. For this reason, in the SystemOptions["CompileOptions"], we can see that the compile length for Apply is set to infinity - it just does not make sense in general to even attempt auto-compiling Apply. And then probably, when the compilation length is larger than the real array dimension, it unpacks. When we set the "ApplyCompileLength" to a finite length, the behavior does change:
On["Packing"]
pairs=Tuples[Range[2000],{2}];
SetSystemOptions["CompileOptions"->"ApplyCompileLength"->100];
TimeConstrained[Plus@@@pairs;//Timing,30]
{0.594,Null}
Changing it back again restores the observed initial behavior:
In[34]:=
SetSystemOptions["CompileOptions" -> "ApplyCompileLength" -> Infinity];
TimeConstrained[Plus @@@ pairs; // Timing, 30]
During evaluation of In[34]:= Developer`FromPackedArray::punpack1: Unpacking
array with dimensions {4000000,2}. >>
Out[35]= {2.094, Null}
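The unpacking described above can be observed directly on a smaller list. The sketch below is only illustrative: the names small, r1, r2 and r3 are not from the original answer, and whether a given operation unpacks may depend on the Mathematica version and on the current "CompileOptions" settings.

```mathematica
(* Assumption: a small stand-in for the list of pairs used above *)
small = Tuples[Range[300], {2}];

(* Check whether the input is stored as a packed array *)
Developer`PackedArrayQ[small]

(* With packing messages on, an operation that unpacks its input reports it *)
On["Packing"];
r1 = Plus @@@ small;      (* may print an unpacking message *)
r2 = Total /@ small;
r3 = Total[small, {2}];   (* sums each row while working on the array as a whole *)
Off["Packing"];

(* The three approaches should agree on the values *)
r1 == r2 == r3

(* Compare the storage cost of the packed and the unpacked form *)
ByteCount[small]
ByteCount[Developer`FromPackedArray[small]]
```

Comparing the two ByteCount values gives a feel for why an unpacked copy of a very large array, plus the intermediate expressions built from it, can exhaust physical memory.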
Regarding your second question: perhaps, the systematic way to constrain the memory is along the lines of what @Alexey Popkov did, by using the master kernel to control the slave kernel that is restarted once the memory is low. I can offer a hack that is far less sophisticated but may still be of some use. The following function
ClearAll[totalMemoryConstrained];
SetAttributes[totalMemoryConstrained, HoldRest];
Module[{memException},
totalMemoryConstrained[max_, body_, failexpr_] :=
Catch[MemoryConstrained[body,
Evaluate[
If[# < 0, Throw[failexpr, memException], #] &@(max -
MemoryInUse[])], failexpr], memException]];
will attempt to constrain the total memory used by the kernel, not just in a given particular computation. So, you can try wrapping it around your top-level function call, just once. Since it relies on MemoryConstrained and MemoryInUse, it is only as good as they are. More details on how it can be used can be found in this Mathgroup post. You can use $Pre to automate the application of this to your input, and reduce the amount of boilerplate code.
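A possible way to use it, both directly and through $Pre, is sketched below. This is an assumption about how the pieces fit together rather than the code from the Mathgroup post; the 2 GB cap and the name maxBytes are illustrative, and totalMemoryConstrained is the function defined above.

```mathematica
(* Assumption: cap the kernel at about 2 GB in total; adjust to your machine *)
maxBytes = 2*10^9;

(* Direct use: wrap one top-level call; yields $Failed if the cap would be exceeded *)
totalMemoryConstrained[maxBytes, Length[Total /@ Tuples[Range[1000], {2}]], $Failed]

(* Automatic use: $Pre is applied to every input before evaluation.
   HoldAll keeps the input unevaluated until it is inside the wrapper. *)
$Pre = Function[expr, totalMemoryConstrained[maxBytes, expr, $Failed], HoldAll];

(* Remove the wrapper again *)
$Pre =.
```

Because the limit passed to MemoryConstrained is computed as the cap minus MemoryInUse[] at the moment of the call, the headroom shrinks as the session accumulates data, which is what makes the cap apply to the kernel as a whole.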