Understanding glibc malloc trimming
Question
A program that I am currently working on consumes much more memory than I think it should, so I am trying to understand how glibc malloc trimming works. I wrote the following test:
#include <malloc.h>   /* mallopt, malloc_trim, malloc_stats (glibc) */
#include <stdio.h>    /* printf, needed once the malloc_trim lines are enabled */
#include <stdlib.h>   /* malloc, free */
#include <unistd.h>   /* sleep */

#define NUM_CHUNKS 1000000
#define CHUNK_SIZE 100

int main(void)
{
    // disable fast bins
    mallopt(M_MXFAST, 0);

    void **array = malloc(sizeof(void *) * NUM_CHUNKS);

    // allocate memory
    for (unsigned int i = 0; i < NUM_CHUNKS; i++)
    {
        array[i] = malloc(CHUNK_SIZE);
    }

    // release ALMOST all memory: the last chunk stays allocated
    for (unsigned int i = 0; i < NUM_CHUNKS - 1; i++)
    {
        free(array[i]);
    }

    // when enabled, memory consumption is reduced
    //int ret = malloc_trim(0);
    //printf("ret=%d\n", ret);

    malloc_stats();

    // keep the process alive so it can be inspected with ps and smaps
    sleep(100000);
}
Test output (without calling malloc_trim):
Arena 0:
system bytes = 112054272
in use bytes = 112
Total (incl. mmap):
system bytes = 120057856
in use bytes = 8003696
max mmap regions = 1
max mmap bytes = 8003584
Even though almost all memory was released, this test code consumes much more resident memory than expected:
[root@node0-b3]# ps aux | grep test
root 14662 1.8 0.4 129736 **118024** pts/10 S 20:19 0:00 ./test
Process smaps:
0245e000-08f3b000 rw-p 00000000 00:00 0 [heap]
Size: 109428 kB
Rss: 109376 kB
Pss: 109376 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 109376 kB
Referenced: 109376 kB
Anonymous: 109376 kB
AnonHugePages: 0 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me ac
7f1c60720000-7f1c60ec2000 rw-p 00000000 00:00 0
Size: 7816 kB
Rss: 7816 kB
Pss: 7816 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 7816 kB
Referenced: 7816 kB
Anonymous: 7816 kB
AnonHugePages: 0 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
When I enable the call to malloc_trim the output of the test stays almost the same:
ret=1
Arena 0:
system bytes = 112001024
in use bytes = 112
Total (incl. mmap):
system bytes = 120004608
in use bytes = 8003696
max mmap regions = 1
max mmap bytes = 8003584
However, the RSS decreases significantly:
[root@node0-b3]# ps aux | grep test
root 15733 0.6 0.0 129688 **8804** pts/10 S 20:20 0:00 ./test
Process smaps (after malloc_trim):
01698000-08168000 rw-p 00000000 00:00 0 [heap]
Size: 109376 kB
Rss: 8 kB
Pss: 8 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 8 kB
Referenced: 8 kB
Anonymous: 8 kB
AnonHugePages: 0 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me ac
7f508122a000-7f50819cc000 rw-p 00000000 00:00 0
Size: 7816 kB
Rss: 7816 kB
Pss: 7816 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 7816 kB
Referenced: 7816 kB
Anonymous: 7816 kB
AnonHugePages: 0 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
After calling malloc_trim, the heap shrank. I assume the 8MB mmap segment is still there because of the last piece of memory, which wasn't released.
Why isn't heap trimming performed automatically by malloc? Is there a way to configure malloc so that trimming is done automatically (when it can save that much memory)?
I am using glibc version 2.17.
Recommended answer
Largely for historical reasons, memory for small allocations comes from a pool managed with the brk system call. This is a very old system call (at least as old as Version 6 Unix), and the only thing it can do is change the size of an "arena" whose position in memory is fixed. What that means is, the brk pool cannot shrink past a block that is still allocated.
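To make the "only the size can change" point concrete, here is a minimal sketch that moves the program break up and back down with sbrk, the library wrapper around brk. The function name break_moves_only_at_the_end is made up for this illustration, and the sketch assumes Linux with glibc, where sbrk accepts nonzero increments.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: grow the data segment by `bytes`, then shrink it
   back. Returns 1 if the break moved by exactly `bytes` and was then
   restored, 0 otherwise. Nothing else about the segment can be changed:
   its start address stays fixed, only the end moves. */
int break_moves_only_at_the_end(intptr_t bytes)
{
    char *before = sbrk(0);            /* sbrk(0) reports the current break */

    if (sbrk(bytes) == (void *)-1)     /* move the end of the segment up */
        return 0;
    char *grown = sbrk(0);

    if (sbrk(-bytes) == (void *)-1)    /* move it back down */
        return 0;
    char *after = sbrk(0);

    return grown == before + bytes && after == before;
}
```

Because the break is a single boundary, anything still in use just below it keeps everything underneath it mapped as well.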
Your program allocates N blocks of memory and then deallocates N-1 of them. The one block it doesn't deallocate is the one located at the highest address. That is the worst-case scenario for brk: the size can't be reduced at all, even though 99.99% of the pool is unused! If you change your program so that the block it doesn't free is array[0] instead of array[NUM_CHUNKS-1], you should see both RSS and address space shrink upon the final call to free.
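The experiment suggested above can be sketched as a small helper that runs the allocate/free pattern and reports how far the program break still sits above its starting point. The names break_growth_after_free, DEMO_CHUNKS, DEMO_CHUNK_SIZE and demo_ptrs are invented for this sketch. It assumes glibc on Linux with default malloc tunables and allocations served from the main (brk-based) arena; other allocators or settings may behave differently.

```c
#define _DEFAULT_SOURCE
#include <malloc.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

#define DEMO_CHUNKS     200000   /* hypothetical: smaller than the question's 1000000 */
#define DEMO_CHUNK_SIZE 100

/* Static array, so the pointer table itself never touches the heap. */
static void *demo_ptrs[DEMO_CHUNKS];

/* Allocates DEMO_CHUNKS small blocks, frees all of them except index
   `keep`, and returns how many bytes the program break is still above
   where it started. The kept block is freed before returning. */
long break_growth_after_free(size_t keep)
{
    mallopt(M_MXFAST, 0);              /* disable fast bins, as in the question */
    char *start = sbrk(0);

    for (size_t i = 0; i < DEMO_CHUNKS; i++)
        demo_ptrs[i] = malloc(DEMO_CHUNK_SIZE);  /* a failed malloc leaves NULL; free(NULL) is a no-op */

    for (size_t i = 0; i < DEMO_CHUNKS; i++)
        if (i != keep)
            free(demo_ptrs[i]);

    char *end = sbrk(0);               /* measure while `keep` is still allocated */
    free(demo_ptrs[keep]);
    return (long)(end - start);
}
```

Calling break_growth_after_free(DEMO_CHUNKS - 1) reproduces the pattern from the question, where the surviving block pins the top of the pool. Calling break_growth_after_free(0) keeps the lowest block instead, so the final free can hand the upper part of the pool back.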
When you explicitly call malloc_trim, it attempts to work around this limitation using a Linux extension, madvise(MADV_DONTNEED), which releases the physical RAM, but not the address space (as you observed). I don't know why this only happens upon an explicit call to malloc_trim.
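The effect of madvise(MADV_DONTNEED) can be seen in isolation on a private anonymous mapping. This is only a sketch of the system call's behavior on Linux, not of how malloc_trim is implemented, and the helper name dontneed_keeps_mapping is made up for this example.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: returns 1 if the range is still addressable after
   MADV_DONTNEED and reads back as zeros, 0 if not, -1 on a system call
   failure. `len` should be a nonzero multiple of the page size. */
int dontneed_keeps_mapping(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);              /* touch every page so it becomes resident */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* The address range is still mapped, so these reads do not fault.
       On Linux the kernel has dropped the old pages, and a private
       anonymous mapping is refilled with zero pages on the next access. */
    int ok = (p[0] == 0 && p[len - 1] == 0);

    munmap(p, len);
    return ok;
}
```

This matches the smaps output in the question: the Size of the [heap] mapping is almost unchanged after malloc_trim, while its Rss drops to 8 kB.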
Incidentally, the 8MB mmap segment is for your initial allocation of array.
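The figures in the malloc_stats output can be checked with a little arithmetic. The sketch below assumes a typical 64-bit glibc build: an 8-byte size field per chunk, 16-byte alignment, a 32-byte minimum chunk, and 4096-byte pages. These constants and the function names assumed_chunk_size and assumed_mmap_size are assumptions made for illustration and are not taken from the question.

```c
#include <stddef.h>

/* Assumed layout constants for a typical 64-bit glibc build. */
#define ASSUMED_SIZE_SZ    8u      /* per-chunk size field */
#define ASSUMED_ALIGN_MASK 15u     /* 16-byte alignment */
#define ASSUMED_MIN_CHUNK  32u     /* smallest possible chunk */
#define ASSUMED_PAGE_MASK  4095u   /* 4096-byte pages */

/* Chunk size used for a request of `request` bytes: the request plus the
   size field, rounded up to the alignment, with a lower bound. */
size_t assumed_chunk_size(size_t request)
{
    size_t padded = (request + ASSUMED_SIZE_SZ + ASSUMED_ALIGN_MASK)
                    & ~(size_t)ASSUMED_ALIGN_MASK;
    return padded < ASSUMED_MIN_CHUNK ? ASSUMED_MIN_CHUNK : padded;
}

/* Size of the mapping created when a request is large enough to be
   served by mmap: the chunk plus one more size field, rounded up to a
   whole number of pages. */
size_t assumed_mmap_size(size_t request)
{
    return (assumed_chunk_size(request) + ASSUMED_SIZE_SZ + ASSUMED_PAGE_MASK)
           & ~(size_t)ASSUMED_PAGE_MASK;
}
```

Under these assumptions, a 100-byte request occupies 112 bytes, which is the "in use bytes = 112" left in Arena 0 for the one chunk that was not freed, and 1000000 such chunks account for roughly the 112 MB of "system bytes". The array itself is sizeof(void*) * NUM_CHUNKS = 8000000 bytes, which yields a mapping of 8003584 bytes, the "max mmap bytes" value in the output.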