Memory-efficient way to truncate large array in Matlab
Question
I have a large (multi-GB) array in Matlab, that I want to truncate¹. Naively, I thought that truncating can't need much memory, but then I realised that it probably can:
>> Z = zeros(628000000, 1, 'single');
>> Z(364000000:end) = [];
Out of memory. Type HELP MEMORY for your options.
Unless Matlab does some clever optimisations, before truncating Z, this code actually creates an array (of type double!) 364000000:628000000. I don't need this array, so I can do instead:
>> Z = Z(1:363999999);
In this case, the second example works, and is fine for my purpose. But why does it work? If Z(364000000:end) = [] fails due to the memory needed for the intermediate array 364000000:628000000, then why does Z = Z(1:363999999) not fail due to the memory needed for the intermediate array 1:363999999, which is larger? Of course, I don't need this intermediate array, and would be happy either with a solution that truncates my array without any intermediate array or, failing that, with knowing whether Matlab optimises a particular method.
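To put numbers on the comparison above, here is a minimal sketch of how large each index range would be if Matlab really did materialise it as a double array. That "if" is the open question here, so these are worst-case figures under that assumption, not measurements. The variable names are made up for illustration.

```matlab
% Worst-case sizes, ASSUMING the colon ranges are materialised as double arrays.
N        = 628000000;                  % number of elements in Z (from the question)
firstCut = 364000000;                  % first element to be deleted

bytesZ       = 4 * N;                  % Z is single: 4 bytes per element
bytesTailIdx = 8 * (N - firstCut + 1); % 364000000:628000000 as double
bytesHeadIdx = 8 * (firstCut - 1);     % 1:363999999 as double

fprintf('Z itself:            %.2f GB\n', bytesZ / 1e9);
fprintf('index 364000000:end: %.2f GB\n', bytesTailIdx / 1e9);
fprintf('index 1:363999999:   %.2f GB\n', bytesHeadIdx / 1e9);
```

Under that assumption the index for the deletion would take about 2.11 GB and the index for the sub-array about 2.91 GB, next to 2.51 GB for Z itself, which is why it is surprising that the second form is the one that succeeds.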
- Is there any way to truncate an array without creating an intermediate indexing array?
- If not, is either of the aforementioned methods more memory-efficient than the other (it appears one is)? If so, why? Does Matlab really create intermediate arrays in both examples?
¹Reason: I'm processing data but don't know how much to preallocate. I make an educated guess, and often I allocate too much. I choose the chunk size based on available memory, because splitting into fewer chunks means faster code. So I want to avoid any needless memory usage. See also this post on allocating by chunk.
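The workflow described in this footnote can be sketched roughly as follows. This is only an illustration with toy sizes: `capacity`, `buf`, `n` and the loop that fabricates chunks are hypothetical stand-ins for the real buffer and data source, which the question does not show.

```matlab
% Preallocate a guessed upper bound, fill it, then truncate to what was used.
capacity = 1000;                        % hypothetical over-estimate
buf = zeros(capacity, 1, 'single');     % preallocated buffer
n = 0;                                  % number of elements actually written

for k = 1:37                            % stand-in for the real data source
    chunk = k * ones(10, 1, 'single');  % pretend each read yields 10 values
    m = numel(chunk);
    buf(n+1:n+m) = chunk;               % write into the preallocated space
    n = n + m;
end

buf = buf(1:n);                         % the truncation step the question asks about
fprintf('kept %d of %d preallocated elements\n', n, capacity);
```

The last assignment is the step whose memory cost is at issue; with multi-GB buffers the unused tail can be a large fraction of the allocation.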
Answer
I ran both examples on a machine with 24GB of RAM with profile('-memory','on'). This profiler option shows memory allocated and freed on each line of code. These are supposed to be gross, not net, amounts. I checked with a simple function whose allocations and frees net to 0, and it reported the gross amounts. However, it seems likely that builtin commands with no .m code to back them do not give fine-grained memory reporting to the profiler.
I ran a couple tests for the following code:
% truncTest.m
N = 628000000;
M = 364000000;
clear Z
Z = zeros(N,1,'single');
Z(M:end) = [];
Z(1) % just because
clear Z
Z = zeros(N,1,'single');
Z = Z(1:M);
Z(1)
For what they are worth, the memory profiling results for this N and M are:
Well, both lines look the same in terms of memory allocated and freed. Maybe that's not the whole truth.
So, out of curiosity I decreased M to 200 (just 200!) without changing N, did profile clear and reran. Profiling claims:
Interestingly, Z=Z(1:M) is practically instantaneous now, and Z(M:end)=[] is a little faster. Both free about 2.4GB of memory, as expected.
Finally, if we go the other direction and set M=600000000:
Now even Z=Z(1:M) is slow, but about twice as fast as Z(M:end)=[].
This suggests:
- Z=Z(1:M) just grabs the indicated elements, stores them in a new buffer or temporary variable, releases the old buffer and assigns the new/temporary to the array Z. I was able to make my weaker 4GB machine go from 2.45 seconds to thrashing the page file for 5 minutes just by increasing M and leaving N alone. Definitely prefer this option for small M/N, probably in all cases.
- Z(M:end)=[] always rewrites the buffer, and execution time increases with M too. Actually always slower, and seems to increase exponentially, unlike Z=Z(1:M).
- Memory profiling does not give fine-grained information about these builtin operations and should not be misinterpreted as giving a total of memory freed and allocated over the command's execution, but rather a net change.
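If the first bullet is right, and Z=Z(1:M) holds the old buffer and the new one at the same time, the peak footprint should be roughly 4*(N+M) bytes for a single array. The sketch below simply evaluates that expression for the three values of M tried above; it is arithmetic under that assumption, not a measurement.

```matlab
% Estimated peak memory for Z = Z(1:M), ASSUMING old and new buffers coexist.
N  = 628000000;                    % elements in the original array
Ms = [200 364000000 600000000];    % the three values of M used in the tests

peakBytes = 4 * (N + Ms);          % single precision: 4 bytes per element
ratio     = peakBytes / (4 * N);   % peak relative to the size of Z alone

for k = 1:numel(Ms)
    fprintf('M = %9d: peak ~ %.2f GB (%.2f x Z)\n', ...
            Ms(k), peakBytes(k) / 1e9, ratio(k));
end
```

The ratio is 1 + M/N, so it approaches 2 as M approaches N, which would be consistent with the 4GB machine starting to thrash once M was increased.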
UPDATE 1: Just for fun I timed the tests at a range of values of M:
Clearly more informative than the profiling. Neither method is a no-op, but Z=Z(1:M) is the faster of the two, although it can use almost double the memory of Z for M/N near 1.
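A timing comparison of this kind could be set up along the following lines. This is a sketch rather than the script used for the results above: N is deliberately tiny so it runs anywhere, the `fracs` grid is arbitrary, and the deletion uses M+1:end so that both methods keep exactly M elements (the test script above keeps M-1 in one case and M in the other).

```matlab
% Time both truncation methods over a range of M/N (toy sizes, illustrative only).
N     = 2000000;                      % far smaller than the 628000000 used above
fracs = [0.1 0.5 0.9];                % hypothetical grid of M/N values
tAssign = zeros(size(fracs));         % timings for Z = Z(1:M)
tDelete = zeros(size(fracs));         % timings for Z(M+1:end) = []

for k = 1:numel(fracs)
    M = round(fracs(k) * N);

    Z = zeros(N, 1, 'single');
    tic; Z = Z(1:M); tAssign(k) = toc;
    assert(numel(Z) == M);

    Z = zeros(N, 1, 'single');
    tic; Z(M+1:end) = []; tDelete(k) = toc;
    assert(numel(Z) == M);
end

for k = 1:numel(fracs)
    fprintf('M/N = %.1f: Z=Z(1:M) %.4f s, Z(M+1:end)=[] %.4f s\n', ...
            fracs(k), tAssign(k), tDelete(k));
end
```

At this size the timings are dominated by noise; the differences reported above only become visible with arrays in the hundreds of millions of elements.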
UPDATE 2:
A relatively unknown feature called mtic (and mtoc) was available in 32-bit Windows prior to R2008b. I still have it installed on one machine, so I decided to see if it provides any more insight, with the understanding that (a) much has changed since then and (b) 32-bit MATLAB uses a completely different memory manager. Still, I reduced the test size to N=128000000; M=101000000; and had a look. First, feature mtic for Z=Z(1:M-1):
>> tic; feature mtic; Z=Z(1:M-1); feature mtoc, toc
ans =
TotalAllocated: 808011592
TotalFreed: 916009628
LargestAllocated: 403999996
NumAllocs: 86
NumFrees: 77
Peak: 808002024
Elapsed time is 0.951283 seconds.
Clearing up and re-creating Z, then the other way:
>> tic; feature mtic; Z(M:end) = []; feature mtoc, toc
ans =
TotalAllocated: 1428019588
TotalFreed: 1536018372
LargestAllocated: 512000000
NumAllocs: 164
NumFrees: 157
Peak: 1320001404
Elapsed time is 4.533953 seconds.
In every metric (TotalAllocated, TotalFreed, NumAllocs, etc.), Z(M:end) = [] is less efficient than Z=Z(1:M-1). I expect it is possible to discern what is going on in memory by examining these numbers for these values of N and M, but we'd be guessing about an old MATLAB.
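As one example of that kind of guessing, the figures reported by feature mtic can be compared with the byte sizes of the full array (4*N) and of the kept part (4*(M-1)). The sketch below only does the arithmetic. The particular combinations tried (two kept-size buffers, or two full-size buffers plus one kept-size buffer, and so on) are assumptions chosen because they happen to fit, not a confirmed account of what that MATLAB version did.

```matlab
% Compare the mtic figures above with multiples of the array sizes.
N = 128000000;  M = 101000000;
bytesFull = 4 * N;          % full single array
bytesKept = 4 * (M - 1);    % the M-1 elements that survive truncation

% Z = Z(1:M-1): TotalAllocated and Peak versus two kept-size buffers.
resAssign = [808011592 - 2*bytesKept, 808002024 - 2*bytesKept];

% Z(M:end) = []: TotalAllocated versus two full buffers plus one kept buffer,
% and Peak versus one full buffer plus two kept buffers.
resDelete = [1428019588 - (2*bytesFull + bytesKept), ...
             1320001404 - (bytesFull + 2*bytesKept)];

fprintf('full = %d bytes, kept = %d bytes\n', bytesFull, bytesKept);
fprintf('residuals for Z=Z(1:M-1):  %d and %d bytes\n', resAssign(1), resAssign(2));
fprintf('residuals for Z(M:end)=[]: %d and %d bytes\n', resDelete(1), resDelete(2));
```

The two LargestAllocated values match bytesKept and bytesFull exactly, and every residual comes out under 20 kB, which is small enough to be bookkeeping overhead. Which buffers those allocations actually correspond to remains a guess.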