We need to preallocate. But MATLAB does not preallocate the preallocation?

Question

While testing whether any() short-circuits (it does!), I found the following interesting behavior when preallocating the test variable:

>> test=zeros(1e7,1);
>> tic;any(test);toc
Elapsed time is 2.444690 seconds.
>> test(2)=1;
>> tic;any(test);toc
Elapsed time is 0.000034 seconds.

However, if I do this:

>> test=ones(1e7,1);
>> test(1:end)=0;
>> tic;any(test);toc
Elapsed time is 0.642413 seconds.
>> test(2)=1;
>> tic;any(test);toc
Elapsed time is 0.000021 seconds.

It turns out that this happens because the variable is not really in RAM until it is completely filled with data, so the first test takes longer because it has to allocate it. I checked this by looking at the memory used in the Windows Task Manager.

While this may make some sense (do not initialize until it's needed), what confused me a bit more is the following test, where the variable is filled in a for loop and execution is paused at some point.

test=zeros(1e7,1);

for ii=1:1e7
    test(ii)=1;
    if ii==1e7/2
        pause
    end
end

When checking the memory used by MATLAB, I could see that while paused it was using only 50% of the memory that test would need if it were full. This can be reproduced quite reliably with different percentages.

Interestingly, the following does not allocate the entire matrix either.

test=zeros(1e7,1);
test(end)=1;

I know that MATLAB is not dynamically allocating and growing test in the loop, as that would make the final iterations very slow (because of the large memory copies it would need), and it would also have to allocate the entire array in the last test I proposed. So my question is:

What is going on?

Someone suggested that this may be related to virtual vs. physical memory, and to how the OS sees memory. I'm not sure how that links to the first test proposed here, though. Any further explanation would be ideal.

Win 10 x64, MATLAB 2017a

Recommended answer

This behavior is not unique to MATLAB. In fact, MATLAB has no control over it, as it is Windows that causes it. Linux and MacOS show the same behavior.

I had noticed this exact same thing in a C program many years ago. It turns out that this is well-documented behavior. This excellent answer explains in gory detail how memory management works in most modern OSes (thanks Amro for sharing the link!). Read it if this answer doesn't have enough detail for you.

First, let's repeat Ander's experiment in C:

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>

int main (void) {

   const int size = 1e8;

   /* For Linux: */
   // const char* ps_command = "ps --no-headers --format \"rss vsz\" -C so";
   /* For MacOS: */
   char ps_command[128];
   snprintf(ps_command, sizeof ps_command, "ps -o rss,vsz -p %d", (int)getpid());

   puts("At program start:");
   system(ps_command);

   /* Allocate large chunk of memory */

   char* mem = malloc(size);
   if (mem == NULL) {
      perror("malloc");
      return EXIT_FAILURE;
   }

   puts("After malloc:");
   system(ps_command);

   for(int ii = 0; ii < size/2; ++ii) {
      mem[ii] = 0;
   }

   puts("After writing to half the array:");
   system(ps_command);

   for(int ii = size/2; ii < size; ++ii) {
      mem[ii] = 0;
   }

   puts("After writing to the whole array:");
   system(ps_command);

   char* mem2 = calloc(size, 1);
   if (mem2 == NULL) {
      perror("calloc");
      free(mem);
      return EXIT_FAILURE;
   }

   puts("After calloc:");
   system(ps_command);

   free(mem);
   free(mem2);
}

The code above works on a POSIX-compliant OS (i.e. practically any OS except Windows); on Windows you can use Cygwin to get a (mostly) POSIX-compliant environment. You might need to change the ps command syntax depending on your OS. Compile with gcc so.c -o so, run with ./so. I see the following output on MacOS:

At program start:
   RSS      VSZ
   800  4267728
After malloc:
   RSS      VSZ
   816  4366416
After writing to half the array:
   RSS      VSZ
 49648  4366416
After writing to the whole array:
   RSS      VSZ
 98476  4366416
After calloc:
   RSS      VSZ
 98476  4464076

There are two columns displayed, RSS and VSZ. RSS stands for "Resident set size", it is the amount of physical memory (RAM) that the program is using. VSZ stands for "Virtual size", it is the size of the virtual memory assigned to the program. Both quantities are in KiB.

The VSZ column shows 4 GiB at program start. I'm not sure what that is about; it seems over the top. But the value grows after malloc and again after calloc, both times by approximately 98,000 KiB (slightly over the 1e8 bytes we allocated).

In contrast, the RSS column shows an increase of only 16 KiB after we allocated 1e8 bytes. After writing to half the array, we have a bit over 5e7 bytes of memory in use, and after writing to the full array we have a bit over 1e8 bytes in use. Thus, the memory gets assigned as we use it, not when we first ask for it. Next, we allocate another 1e8 bytes using calloc, and see no change in the RSS. Note that calloc returns a memory block that is initialized to 0, exactly like MATLAB's zeros does.

I am talking about calloc because it is likely that MATLAB's zeros is implemented through calloc.

Explanation:

Modern computer architectures separate virtual memory (the memory space that a process sees) from physical memory. The process (i.e. a program) uses pointers to access memory; these pointers are addresses in virtual memory. These addresses are translated by the system into physical addresses when used. This has many advantages. For example, it is impossible for one process to address memory assigned to another process, since none of the addresses it can generate will ever be translated to physical memory not assigned to that process. It also allows the OS to swap out memory of an idling process to let another process use that physical memory. Note that the physical memory for a contiguous block of virtual memory doesn't need to be contiguous!

The key is the phrase "when used" in the paragraph above. Memory assigned to a process might not actually exist until the process tries to read from or write to it. This is why we don't see any change in RSS when allocating a large array. Used memory is mapped to physical memory in pages (blocks of typically 4 KiB, sometimes up to 1 MiB), so when we write to one byte of our new memory block, only one page gets assigned.

Some OSes, like Linux, will even "overcommit" memory. Linux will assign more virtual memory to processes than it has the capacity to put into physical memory, under the assumption that those processes will not use all the memory they are assigned anyway. This answer will tell you more about overcommitting than you will want to know.

So what happens with calloc, which returns zero-initialized memory? This is also explained in the answer I linked earlier. For small arrays, malloc and calloc return a block of memory from a larger pool obtained from the OS at the start of the program. In this case, calloc will write zeros to all bytes to make sure the block is zero-initialized. But for larger arrays, a new block of memory is obtained directly from the OS. The OS always gives out memory that is zeroed out (again, this prevents one program from seeing data from another program). But because the memory doesn't get physically assigned until it is used, the zeroing out is also delayed until a memory page is put into physical memory.

Back to MATLAB:

The experiment above shows that it is possible to obtain a zeroed-out block of memory in constant time and without changing the physical size of a program's memory. This is how MATLAB's function zeros allocates memory without you seeing any change in MATLAB's memory footprint.

The experiment also shows that zeros allocates the full array (likely through calloc), and that memory footprint only increases as this array is used, one page at a time.

MathWorks' advice on preallocation states that

you can improve code execution time by preallocating the maximum amount of space required for the array.

If we allocate a small array, then want to increase its size, a new array has to be allocated and data copied over. How the array is associated to RAM has no influence on this, MATLAB only sees virtual memory, it has no control (or even knowledge?) of where in the physical memory (RAM) these data are stored. All that matters for an array from MATLAB's point of view (or that of any other program) is that the array is a contiguous block of virtual memory. Enlarging an existing block of memory is not always (usually not?) possible, and so a new block is obtained and data copied over. For example, see the graph in this other answer: when the array is enlarged (this happens at the large vertical spikes) data is copied; the larger the array, the more data needs to be copied.

Preallocating avoids enlarging the array, as we make it large enough to begin with. In fact, it is more efficient to make an array that is way too large for what we need, as the portion of the array that we don't use is actually never really given to the program. That is, if we allocate a very large block of virtual memory, and only use the first 1000 elements, we'll only really use a few pages of physical memory.

The behavior of calloc described above also explains this other strange behavior of the zeros function: for small arrays, zeros is more expensive than for large arrays, because small arrays need to be zeroed explicitly by the program, whereas large arrays are implicitly zeroed by the OS.
