Malloc vs custom allocator: Malloc has a lot of overhead. Why?

Question

I have an image compression application that now has two different versions of its memory allocation system. In the original one, malloc is used everywhere. In the second one, I implemented a simple pool allocator that just allocates a chunk of memory and returns parts of that memory to myalloc() calls.
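The question does not show the pool allocator itself. The following is only a minimal sketch of what such a scheme could look like, assuming a bump-pointer design with no per-block headers. Only the name myalloc() comes from the question; the Pool struct, pool_init(), pool_destroy(), the extra pool parameter, and the 8-byte rounding are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool state: one large chunk and a bump offset into it. */
typedef struct {
    unsigned char *base;  /* start of the chunk obtained from malloc */
    size_t capacity;      /* total bytes in the chunk */
    size_t used;          /* bytes handed out so far */
} Pool;

/* Grab the whole chunk up front; returns 1 on success, 0 on failure. */
static int pool_init(Pool *pool, size_t capacity)
{
    assert(pool != NULL);
    pool->base = malloc(capacity);
    pool->capacity = capacity;
    pool->used = 0;
    return pool->base != NULL;
}

/* Hand out the next `size` bytes, rounded up to a multiple of 8 so that
   every returned pointer stays 8-byte aligned. No header is stored. */
static void *myalloc(Pool *pool, size_t size)
{
    assert(pool != NULL && pool->base != NULL);
    size_t rounded = (size + 7u) & ~(size_t)7u;
    if (rounded < size || rounded > pool->capacity - pool->used)
        return NULL;  /* size overflowed or the pool is exhausted */
    void *block = pool->base + pool->used;
    pool->used += rounded;
    return block;
}

/* Individual blocks are never freed; the whole pool is released at once. */
static void pool_destroy(Pool *pool)
{
    assert(pool != NULL);
    free(pool->base);
    pool->base = NULL;
    pool->capacity = 0;
    pool->used = 0;
}
```

In a design like this, an 8-byte request consumes exactly 8 bytes of the pool, because there is nothing to store per block. The trade-off is that single blocks cannot be returned early.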

We've been noticing a huge memory overhead when malloc is used: At the height of its memory usage, the malloc() code requires about 170 megabytes of memory for a 1920x1080x16bpp image, while the pool allocator allocates just 48 megabytes, of which 47 are used by the program.

In terms of memory allocation patterns, the program allocates a lot of 8-byte (most), 32-byte (many) and 1080-byte (some) blocks with the test image. Apart from these, there are no dynamic memory allocations in the code.

The OS of the testing system is Windows 7 (64 Bit).

How did we test memory usage?

With the custom allocator, we could see how much memory is used because all malloc calls are deferred to the allocator. With malloc(), in Debug mode we just stepped through the code and watched the memory usage in the Task Manager. In Release mode we did the same, but less fine-grained, because the compiler optimizes a lot of code away and we couldn't step through it piece by piece (the memory difference between Release and Debug was about 20 MB, which I would attribute to optimization and the lack of debug information in Release mode).
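Watching a process-level number only shows what the allocator takes from the OS, not what the program asked for. One way to get the second number for the malloc() build would be a thin counting wrapper, sketched below. The names AllocStats, g_stats, counted_malloc() and average_request() are hypothetical and do not appear in the question, and the counters are not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counters for the malloc() build; not thread-safe. */
typedef struct {
    size_t calls;            /* number of successful allocations */
    size_t bytes_requested;  /* sum of the sizes the program asked for */
} AllocStats;

static AllocStats g_stats = {0, 0};

/* Stand-in for malloc that records what was requested.
   It cannot see the allocator's own bookkeeping or padding. */
static void *counted_malloc(size_t size)
{
    void *block = malloc(size);
    if (block != NULL) {
        g_stats.calls += 1;
        g_stats.bytes_requested += size;
    }
    return block;
}

/* Average request size in bytes, or 0 when nothing was allocated. */
static size_t average_request(const AllocStats *stats)
{
    assert(stats != NULL);
    return stats->calls ? stats->bytes_requested / stats->calls : 0;
}
```

Comparing bytes_requested at the peak with the figure shown for the process gives a rough idea of how much memory goes to bookkeeping and padding rather than to the program's data.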

Could malloc alone be the cause of such a huge overhead? If so, what exactly causes this overhead inside malloc?

Recommended answer

First of all, malloc aligns the pointers it returns to 16-byte boundaries. Furthermore, implementations store at least one pointer (or the allocated length) in the addresses just before the returned block. They probably also add a magic value or release counter to detect a broken linked list or a memory block that has been released twice (free asserts on double frees).

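#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  /* Two tiny allocations made back to back. */
  int *foo = malloc(4);
  int *bar = malloc(4);
  if (foo == NULL || bar == NULL)
    return 1;

  /* Compare the addresses as integers; an int cast would truncate
     64-bit pointers. */
  printf("%lld\n", (long long)((intptr_t)bar - (intptr_t)foo));

  free(bar);
  free(foo);
  return 0;
}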

Output: 32
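A spacing of 32 bytes between two 4-byte requests fits a simple model in which each block carries a header and is padded to the alignment. The sketch below works through that arithmetic. The 16-byte header used in the example (say, a length plus a magic value) is an assumption for illustration, not the documented layout of any particular malloc, and round_up() and block_footprint() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Round `value` up to the next multiple of `align` (align must be > 0). */
static size_t round_up(size_t value, size_t align)
{
    assert(align > 0);
    return (value + align - 1) / align * align;
}

/* Modelled footprint of one block: the request plus a hypothetical
   per-block header, padded so the next block starts aligned. */
static size_t block_footprint(size_t request, size_t header, size_t align)
{
    return round_up(request + header, align);
}
```

With an assumed 16-byte header and 16-byte alignment, block_footprint(4, 16, 16) and block_footprint(8, 16, 16) both give 32, block_footprint(32, 16, 16) gives 48, and block_footprint(1080, 16, 16) gives 1104. Under this model the smallest blocks pay the highest relative price, which matters when most allocations are 8 bytes.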
