Memory use of Apply vs Map. Virtual memory use and lock-ups


Question

I needed to find the sum of all pairs of numbers in a long list of pairs. Lots of ways to do this in Mathematica, but I was thinking of using either Plus or Total. Since Total works on lists, Map is the functional programming instrument to use there and Apply at level 1 (@@@) is the one to use for Plus, as Plus takes the numbers to be added as arguments.

Here is some demo code (warning: save all your work before executing this!):

pairs = Tuples[Range[6000], {2}]; (* toy example *)

TimeConstrained[Plus @@@ pairs; // Timing, 30]

(* Out[4]= {21.73, Null} *)

Total /@ pairs; // Timing

(* Out[5]= {3.525, Null} *)

You might have noticed that I've added TimeConstrained to the code for Plus. This is a protective measure I included for you because the bare code brought my PC almost to its knees. In fact, the above code works for me, but if I increase the range in the first line to 7000 my computer just locks up and never gets back. Nothing works, no alt-period, program switching, ctrl-alt-delete, attempts to fire up the process manager using the taskbar, closing the laptop lid to let it sleep, etc., really nothing.

The problem is caused by the extreme memory use of the Plus @@@ pairs line. While 'pairs' itself takes up about 288 MB, and the list of totals half of that, the Plus line quickly consumes about 7 GB for its calculations. This is the end of my free physical memory and anything bigger causes the use of virtual memory on disk. And Mathematica and/or Windows apparently don't play nice when virtual memory is used (BTW, do MacOS and Linux behave better?). In contrast, the Total line doesn't have a noticeable impact on the memory usage graph.

I have two questions:

  1. Given the equivalence between Plus and Total as stated in the documentation ("Total[list] is equivalent to Apply[Plus,list].") how can the extreme difference in behavior be explained? I assume this has to do with the differences between Apply and Map, but I'm curious as to the internal mechanisms involved.
  2. I know I can restrict the memory footprint of a command by using MemoryConstrained, but it is a pain to have to use this everywhere where you suspect Mathematica might usurp all of your system resources. Is there a global setting that I can use to tell Mathematica to use physical memory only (or, preferably, a certain fraction thereof) for all of its operations? This would be extremely helpful as this behavior has caused a handful of lockups the last couple of weeks and it's really starting to annoy me.

Recommended answer

I just want to add a couple of observations that may clarify the situation a bit more. As noted in the answer by @Joshua (see also the comments to this post for a similar discussion), the reason for the inefficiency is related to unpacking. My guess is that the general reason why Apply unpacks is that the compiler (Compile) has very limited support for Apply: only 3 heads can be used, namely List, Plus and Times. For this reason, in SystemOptions["CompileOptions"] we can see that the compile length for Apply is set to infinity, since in general it does not make sense to even attempt auto-compiling Apply. And then probably, when the compilation length is larger than the real array dimension, it unpacks. When we set "ApplyCompileLength" to a finite length, the behavior does change:

On["Packing"]
pairs=Tuples[Range[2000],{2}];
SetSystemOptions["CompileOptions"->"ApplyCompileLength"->100];
TimeConstrained[Plus@@@pairs;//Timing,30]

{0.594,Null}

Changing it back again restores the observed initial behavior:

In[34]:= 
SetSystemOptions["CompileOptions" -> "ApplyCompileLength" -> Infinity];
TimeConstrained[Plus @@@ pairs; // Timing, 30]

During evaluation of In[34]:= Developer`FromPackedArray::punpack1: Unpacking 
array with dimensions  {4000000,2}. >>

Out[35]= {2.094, Null}
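
As a side illustration of the unpacking point, here is a minimal sketch (not part of the original answer) that checks whether the list is packed and computes the pair sums with operations that act on the array as a whole. The small range of 200 and the names unpacked, sums1, sums2 and sums3 are chosen only for this example; whether these forms avoid unpacking on your version can be checked by running them with On["Packing"] as above.

```mathematica
(* Small list so the demonstration is cheap to run; the size is arbitrary *)
pairs = Tuples[Range[200], {2}];

(* Check whether the list is stored as a packed array *)
Developer`PackedArrayQ[pairs]

(* Compare the storage of the packed array with an explicitly unpacked copy *)
unpacked = Developer`FromPackedArray[pairs];
{ByteCount[pairs], ByteCount[unpacked]}

(* Three whole-array ways to obtain the sum of each pair *)
sums1 = Total[pairs, {2}];                 (* total at level 2, i.e. per row *)
sums2 = pairs[[All, 1]] + pairs[[All, 2]]; (* add the two columns *)
sums3 = pairs . {1, 1};                    (* dot product with a vector of ones *)

(* All three should agree with the Apply-based result *)
sums1 === sums2 === sums3 === (Plus @@@ pairs)
```

The ByteCount comparison gives a rough idea of how much larger the unpacked representation is, which is one plausible contributor to the memory growth described in the question.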

Regarding your second question: perhaps, the systematic way to constrain the memory is along the lines of what @Alexey Popkov did, by using the master kernel to control the slave kernel that is restarted once the memory is low. I can offer a hack that is far less sophisticated but may still be of some use. The following function

ClearAll[totalMemoryConstrained];
SetAttributes[totalMemoryConstrained, HoldRest];
Module[{memException},
  totalMemoryConstrained[max_, body_, failexpr_] :=
   Catch[MemoryConstrained[body,
     Evaluate[
       If[# < 0, Throw[failexpr, memException], #] &@(max -
         MemoryInUse[])], failexpr], memException]]; 

will attempt to constrain the total memory used by the kernel, not just in a given particular computation. So, you can try wrapping it around your top-level function call, just once. Since it relies on MemoryConstrained and MemoryInUse, it is only as good as they are. More details on how it can be used can be found in this Mathgroup post. You can use $Pre to automate the application of this to your input, and reduce the amount of boilerplate code.
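
The $Pre suggestion could look roughly like the following sketch. It assumes that totalMemoryConstrained has already been defined as shown above; the wrapper name memoryGuard, the 2 GB cap and the $Failed return value are hypothetical choices made for illustration only.

```mathematica
(* Assumes totalMemoryConstrained from the definition above has been evaluated. *)
(* memoryGuard, the 2*10^9 byte cap and $Failed are illustrative assumptions.   *)
ClearAll[memoryGuard];
SetAttributes[memoryGuard, HoldAll]; (* keep the input unevaluated until it is guarded *)
memoryGuard[expr_] := totalMemoryConstrained[2*10^9, expr, $Failed];

(* Apply the guard to every subsequent input of the session *)
$Pre = memoryGuard;

(* To switch the guard off again: *)
(* $Pre =. *)
```

One caveat with this design: if the kernel's memory in use already exceeds the cap, totalMemoryConstrained returns the failure value without evaluating its body, so every input, including an attempt to clear $Pre, would return $Failed. In that situation restarting the kernel is probably the simplest way out.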
