Memory-efficient way to truncate large array in Matlab

Question

I have a large (multi-GB) array in Matlab, that I want to truncate¹. Naively, I thought that truncating can't need much memory, but then I realised that it probably can:

>> Z = zeros(628000000, 1, 'single');
>> Z(364000000:end) = [];
Out of memory. Type HELP MEMORY for your options.

Unless Matlab does some clever optimisations, before truncating Z, this code actually creates an array (of type double!) 364000000:628000000. I don't need this array, so I can do instead:

>> Z = Z(1:363999999);

In this case, the second example works, and is fine for my purpose. But why does it work? If Z(364000000:end) = [] fails due to the memory needed for the intermediate array 364000000:628000000, then why doesn't Z = Z(1:363999999) fail due to the memory needed for the intermediate array 1:363999999, which is larger? Of course, I don't need this intermediate array, and would be happy with either a solution that truncates my array without any intermediate array or, failing that, knowing whether Matlab optimises a particular method.
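For a rough sense of scale, the sizes involved can be worked out without allocating anything. The sketch below assumes 4 bytes per single and 8 bytes per double element, and assumes (as the question does, without confirmation) that each colon index would be materialised as a full double array. The variable names are illustrative only.

```matlab
% Back-of-envelope sizes for the arrays discussed in the question.
% Nothing large is allocated here; only byte counts are computed.
N = 628000000;              % number of elements in Z
M = 364000000;              % first index to delete
bytesPerSingle = 4;         % assumption: IEEE single
bytesPerDouble = 8;         % assumption: IEEE double

zBytes         = N * bytesPerSingle;            % Z itself
idxDeleteBytes = (N - M + 1) * bytesPerDouble;  % M:N as doubles, if materialised
idxKeepBytes   = (M - 1) * bytesPerDouble;      % 1:M-1 as doubles, if materialised

fprintf('Z:                  %.2f GB\n', zBytes / 1e9);
fprintf('index M:end:        %.2f GB\n', idxDeleteBytes / 1e9);
fprintf('index 1:M-1:        %.2f GB\n', idxKeepBytes / 1e9);
```

Under those assumptions the index for the form that works would be the larger of the two, which is exactly the puzzle raised above.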

  • Is there any way to truncate an array without creating an intermediate indexing array?
  • If not, is either of the aforementioned methods more memory-efficient than the other (it appears one is)? If so, why? Does Matlab really create intermediate arrays in both examples?

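¹ Reason: I'm processing data, but don't know in advance how much to preallocate. I make an educated guess, and I often allocate too much. I choose the chunk size based on available memory, because splitting into fewer chunks means faster code. So I want to avoid any needless memory use. See also this post on allocating by chunks.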
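The pattern described in the footnote might look like the following minimal sketch. The guess size, the loop, and the chunk contents are hypothetical stand-ins for the real data source, and the sizes are tiny so that it runs anywhere.

```matlab
% Preallocate with a guess, fill chunk by chunk, then trim the unused tail.
guess = 1000;                          % hypothetical over-estimate
Z = zeros(guess, 1, 'single');
n = 0;                                 % number of elements actually filled
for k = 1:7                            % stand-in for reading real data
    chunk = k * ones(50, 1, 'single'); % hypothetical chunk of 50 values
    Z(n+1 : n+numel(chunk)) = chunk;   % write into the preallocated buffer
    n = n + numel(chunk);
end
Z = Z(1:n);                            % truncate to the filled part
```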

Recommended answer

I ran both examples on a machine with 24GB of RAM with profile('-memory','on'); enabled. This profiler option shows the memory allocated and freed on each line of code. These are supposed to be gross, not net, amounts: I checked with a simple function whose allocations and frees net to 0, and it reported the gross amounts. However, it seems likely that builtin commands with no .m code to back them do not give fine-grained memory reporting to the profiler.

I ran a couple tests for the following code:

% truncTest.m
N = 628000000;
M = 364000000;

clear Z
Z = zeros(N,1,'single');
Z(M:end) = [];
Z(1) % just because

clear Z
Z = zeros(N,1,'single');
Z = Z(1:M);
Z(1)

For what they are worth, the memory profiling results for this N and M are:

Well, both lines look the same in terms of memory allocated and freed. Maybe that's not the whole truth.

So, out of curiosity I decreased M to 200 (just 200!) without changing N, did profile clear and reran. Profiling claims:

Interestingly, Z=Z(1:M); is practically instantaneous now, and Z(M:end)=[]; is a little faster. Both free about 2.4GB of memory, as expected.

Finally, if we go the other direction and set M=600000000;:

Now even Z=Z(1:M); is slow, but about twice as fast as Z(M:end)=[];.

This suggests:

  1. Z=Z(1:M); just grabs the indicated elements, stores them in a new buffer or temporary variable, releases the old buffer and assigns the new/temporary to the array Z. I was able to make my weaker 4GB machine go from 2.45 seconds to thrashing the page file for 5 minutes just by increasing M and leaving N alone. Definitely prefer this option for small M/N, and probably in all cases.
  2. Z(M:end)=[]; always rewrites the buffer, and its execution time increases with M too. It was always slower in these tests, and its run time seems to increase exponentially, unlike that of Z=Z(1:M);.
  3. Memory profiling does not give fine-grained information about these builtin operations. The reported figures should be read as a net change, not as the total memory freed and allocated over the command's execution.
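One detail worth keeping in mind when comparing the two forms: Z(M:end)=[] removes element M, whereas Z(1:M) keeps it, so the matching keep form is Z(1:M-1). A small sketch with toy sizes (A and B are hypothetical names) confirms that the results then agree:

```matlab
% Compare the two truncation forms on a small array.
N = 20;
M = 8;
Z = single(1:N)';     % column vector 1, 2, ..., N

A = Z;
A(M:end) = [];        % delete elements M..N in place

B = Z(1:M-1);         % keep elements 1..M-1 in a new array

sameResult = isequal(A, B);   % both hold elements 1..M-1
disp(sameResult)
```

This only shows that the results match; it says nothing about how much memory either form uses internally.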

UPDATE 1: Just for fun I timed the tests at a range of values of M:

Clearly more informative than the profiling. Neither method is a no-op. Z=Z(1:M); is the faster of the two, but it can use almost double the memory of Z when M/N is near 1.
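The timing script itself is not shown. A minimal sketch of such a sweep, assuming plain tic/toc timing and a much smaller N than in the tests above so that it runs on any machine, could look like this (Ms, tKeep and tDel are hypothetical names):

```matlab
% Time both truncation forms over a range of M for a fixed N.
N  = 2e6;                               % far smaller than the answer's N
Ms = round(N * [0.1 0.5 0.9]);          % values of M to try
tKeep = zeros(size(Ms));                % seconds for Z = Z(1:M-1)
tDel  = zeros(size(Ms));                % seconds for Z(M:end) = []

for k = 1:numel(Ms)
    M = Ms(k);

    Z = zeros(N, 1, 'single');
    tic; Z = Z(1:M-1); tKeep(k) = toc;
    nKeep = numel(Z);

    Z = zeros(N, 1, 'single');
    tic; Z(M:end) = []; tDel(k) = toc;

    assert(numel(Z) == nKeep);          % both forms leave M-1 elements
end

disp([Ms(:) tKeep(:) tDel(:)])          % columns: M, keep time, delete time
```

Single tic/toc measurements at this size are noisy, so each point would need repeating before drawing conclusions from it.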

UPDATE 2:

A relatively unknown feature called mtic (and mtoc) was available in 32-bit Windows prior to R2008b. I still have it installed on one machine, so I decided to see if it provides any more insight, with the understanding that (a) much has changed since then and (b) 32-bit MATLAB used a completely different memory manager. Still, I reduced the test size to N=128000000; M=101000000; and had a look. First, feature mtic for Z=Z(1:M-1);:

>> tic; feature mtic; Z=Z(1:M-1); feature mtoc, toc

ans = 

      TotalAllocated: 808011592
          TotalFreed: 916009628
    LargestAllocated: 403999996
           NumAllocs: 86
            NumFrees: 77
                Peak: 808002024

Elapsed time is 0.951283 seconds.

Clearing up, recreating Z, the other way:

>> tic; feature mtic; Z(M:end) = []; feature mtoc, toc

ans = 

      TotalAllocated: 1428019588
          TotalFreed: 1536018372
    LargestAllocated: 512000000
           NumAllocs: 164
            NumFrees: 157
                Peak: 1320001404

Elapsed time is 4.533953 seconds.

In every metric (TotalAllocated, TotalFreed, NumAllocs, etc.), Z(M:end) = []; is less efficient than Z=Z(1:M-1);. I expect it is possible to discern what is going on in memory by examining these numbers for these values of N and M, but we'd be guessing about an old MATLAB.
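One piece of arithmetic can be checked directly against the reported figures, assuming 4 bytes per single element. What it implies about MATLAB's internals remains a guess; the sketch only compares byte counts with the LargestAllocated values printed above.

```matlab
% Compare array sizes in bytes with the LargestAllocated values from mtoc.
N = 128000000;
M = 101000000;
bytesPerSingle = 4;                       % assumption: IEEE single

keptBytes = (M - 1) * bytesPerSingle;     % size of the truncated array
fullBytes = N * bytesPerSingle;           % size of the original Z

largestKeep   = 403999996;                % reported for Z = Z(1:M-1)
largestDelete = 512000000;                % reported for Z(M:end) = []

matchKeep   = (keptBytes == largestKeep);
matchDelete = (fullBytes == largestDelete);
disp([matchKeep matchDelete])
```

So the largest single allocation equals the size of the truncated array for Z=Z(1:M-1);, and the size of the full original array for Z(M:end) = [];.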
