munmap() failure with ENOMEM with private anonymous mapping

Question

I recently discovered that Linux does not guarantee that memory allocated with mmap can be freed with munmap if doing so would push the number of VMA (virtual memory area) structures above vm.max_map_count. The man page states this (almost) clearly:

 ENOMEM The process's maximum number of mappings would have been exceeded.
 This error can also occur for munmap(), when unmapping a region
 in the middle of an existing mapping, since this results in two
 smaller mappings on either side of the region being unmapped.

The problem is that the Linux kernel always tries to merge VMA structures when possible, which makes munmap fail even for mappings that were created separately. I was able to write a small program to confirm this behavior:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

#include <sys/mman.h>

// value of vm.max_map_count
#define VM_MAX_MAP_COUNT        (65530)

// number of vma for the empty process linked against libc - /proc/<id>/maps
#define VMA_PREMAPPED           (15)

#define VMA_SIZE                (4096)
#define VMA_COUNT               ((VM_MAX_MAP_COUNT - VMA_PREMAPPED) * 2)

int main(void)
{
    static void *vma[VMA_COUNT];

    for (int i = 0; i < VMA_COUNT; i++) {
        vma[i] = mmap(0, VMA_SIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

        if (vma[i] == MAP_FAILED) {
            printf("mmap() failed at %d\n", i);
            return 1;
        }
    }

    for (int i = 0; i < VMA_COUNT; i += 2) {
        if (munmap(vma[i], VMA_SIZE) != 0) {
            printf("munmap() failed at %d (%p): %m\n", i, vma[i]);
        }
    }
}

It allocates a large number of pages (twice the default allowed maximum) using mmap, then munmaps every second page so that each remaining page ends up in its own VMA structure. On my machine the last munmap call always fails with ENOMEM.

Initially I thought that munmap never fails when it is called with the same address and size that were used to create the mapping. Apparently this is not the case on Linux, and I was not able to find information about similar behavior on other systems.

At the same time, in my opinion, partially unmapping the middle of a mapped region can be expected to fail on any OS with any sane implementation, but I haven't found any documentation that says such a failure is possible.

I would usually consider this a bug in the kernel, but knowing how Linux deals with memory overcommit and OOM, I am almost sure this is a "feature" that exists to improve performance and decrease memory consumption.

Other information I was able to find:

  • Similar APIs on Windows do not have this "feature" due to their design (see MapViewOfFile, UnmapViewOfFile, VirtualAlloc, VirtualFree) - they simply do not support partial unmapping.
  • The glibc malloc implementation does not create more than 65536 mappings by default (the M_MMAP_MAX limit), falling back to sbrk when this limit is reached: https://code.woboq.org/userspace/glibc/malloc/malloc.c.html. This looks like a workaround for this issue, but it is still possible to make free silently leak memory.
  • jemalloc ran into this as well and tried to avoid using mmap/munmap because of it (I don't know how that ended for them).

Do other operating systems really guarantee deallocation of memory mappings? I know Windows does, but what about other Unix-like operating systems? FreeBSD? QNX?

I am adding an example that shows how glibc's free can leak memory when its internal munmap call fails with ENOMEM. Use strace to see that munmap fails:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

#include <sys/mman.h>

// value of vm.max_map_count
#define VM_MAX_MAP_COUNT        (65530)

#define VMA_MMAP_SIZE           (4096)
#define VMA_MMAP_COUNT          (VM_MAX_MAP_COUNT)

// glibc's malloc default mmap_threshold is 128 KiB
#define VMA_MALLOC_SIZE         (128 * 1024)
#define VMA_MALLOC_COUNT        (VM_MAX_MAP_COUNT)

int main(void)
{
    static void *mmap_vma[VMA_MMAP_COUNT];

    for (int i = 0; i < VMA_MMAP_COUNT; i++) {
        mmap_vma[i] = mmap(0, VMA_MMAP_SIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

        if (mmap_vma[i] == MAP_FAILED) {
            printf("mmap() failed at %d\n", i);
            return 1;
        }
    }

    for (int i = 0; i < VMA_MMAP_COUNT; i += 2) {
        if (munmap(mmap_vma[i], VMA_MMAP_SIZE) != 0) {
            printf("munmap() failed at %d (%p): %m\n", i, mmap_vma[i]);
            return 1;
        }
    }

    static void *malloc_vma[VMA_MALLOC_COUNT];

    for (int i = 0; i < VMA_MALLOC_COUNT; i++) {
        malloc_vma[i] = malloc(VMA_MALLOC_SIZE);

        if (malloc_vma[i] == NULL) {
            printf("malloc() failed at %d\n", i);
            return 1;
        }
    }

    for (int i = 0; i < VMA_MALLOC_COUNT; i += 2) {
        free(malloc_vma[i]);
    }
}

Recommended answer

One way to work around this problem on Linux is to mmap more than one page at once (e.g. 1 MB at a time), and also map a separator page after it. So you actually call mmap on 257 pages of memory (assuming 4 KiB pages), then remap the last page with PROT_NONE so that it cannot be accessed. This should defeat the VMA merging optimization in the kernel. Since you are allocating many pages at once, you should not run into the maximum mapping limit. The downside is that you have to manually manage how you slice up the large mmap.

As for your questions:

  1. System calls can fail on any system for a variety of reasons. Documentation is not always complete.

  2. You are allowed to munmap part of an mmap'd region as long as the address passed in lies on a page boundary; the length argument is rounded up to the next multiple of the page size.
