Why use _mm_malloc? (as opposed to _aligned_malloc, aligned_alloc, or posix_memalign)


Question

There are a few options for acquiring an aligned block of memory, but they are very similar, and the choice mostly boils down to which language standard and platforms you are targeting.

C11

void * aligned_alloc (size_t alignment, size_t size)

POSIX

int posix_memalign (void **memptr, size_t alignment, size_t size)

Windows

void * _aligned_malloc(size_t size, size_t alignment);

And of course it is always an option to align by hand.
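Aligning by hand usually means over-allocating with malloc and rounding the address up. The following is a minimal sketch of that idea, not a production allocator. The names manual_aligned_alloc and manual_aligned_free are hypothetical, and the sketch assumes the alignment is a power of two and that size plus the padding does not overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not a library function): over-allocate with malloc,
   round the address up to `alignment`, and store the pointer malloc returned
   just below the aligned block so it can be released later.
   Assumes `alignment` is a power of two. */
void *manual_aligned_alloc(size_t alignment, size_t size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Keep the slot for the saved pointer itself properly aligned. */
    if (alignment < sizeof(void *))
        alignment = sizeof(void *);

    /* Room for the payload, the worst-case shift, and the saved pointer. */
    void *raw = malloc(size + alignment - 1 + sizeof(void *));
    if (raw == NULL)
        return NULL;

    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

    ((void **)aligned)[-1] = raw;   /* remember what malloc gave us */
    return (void *)aligned;
}

/* Must be used instead of free() for blocks from manual_aligned_alloc. */
void manual_aligned_free(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

Note that this approach has the same drawback the question raises about _mm_malloc: the block has to be released with its matching function, because the pointer handed to the caller is not the one malloc returned.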

Intel offers another option.

Intel

void* _mm_malloc (int size, int align)
void _mm_free (void *p)

Based on source code released by Intel, this seems to be the method of allocating aligned memory that their engineers prefer, but I can't find any documentation comparing it to the other methods. The closest I found simply acknowledges that other aligned memory allocation routines exist.

https://software.intel.com/en-us/articles/memory-management-for-optimal-performance-on-intel-xeon-phi-coprocessor-alignment-and

To dynamically allocate a piece of aligned memory, use posix_memalign, which is supported by GCC as well as the Intel Compiler. The benefit of using it is that you don't have to change the memory disposal API. You can use free() as you always do. But pay attention to the parameter profile:

int posix_memalign (void **memptr, size_t align, size_t size);
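A minimal sketch of the call described above, assuming a POSIX system. The helper name alloc_floats_64 is hypothetical. It shows the two points that are easy to get wrong: the output pointer comes first, and the result is an error number returned directly rather than through errno.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* ask for the posix_memalign declaration */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a zero-filled, 64-byte aligned array of
   `count` floats, or NULL on failure. Release it with plain free(). */
float *alloc_floats_64(size_t count)
{
    assert(count > 0);

    void *p = NULL;
    /* Argument order: output pointer, alignment, size. The alignment must be
       a power of two and a multiple of sizeof(void *). */
    int rc = posix_memalign(&p, 64, count * sizeof(float));
    if (rc != 0) {
        /* rc is the error number itself; errno is not set. */
        fprintf(stderr, "posix_memalign: %s\n", strerror(rc));
        return NULL;
    }

    memset(p, 0, count * sizeof(float));
    return (float *)p;
}
```

A caller would write `float *a = alloc_floats_64(8);`, use the array, and then call `free(a);`.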


The Intel Compiler also provides another set of memory allocation APIs. C/C++ programmers can use _mm_malloc and _mm_free to allocate and free aligned blocks of memory. For example, the following statement requests a 64-byte aligned memory block for 8 floating point elements.

farray = (float *)_mm_malloc(8*sizeof(float), 64);

Memory that is allocated using _mm_malloc must be freed using _mm_free. Calling free on memory allocated with _mm_malloc or calling _mm_free on memory allocated with malloc will result in unpredictable behavior.
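The pairing rule can be illustrated with a short sketch. It assumes an x86 compiler (GCC, Clang, ICC, or MSVC) where including immintrin.h makes _mm_malloc and _mm_free available. The function name sum_first_n is hypothetical.

```c
#include <assert.h>
#include <immintrin.h>   /* brings in _mm_malloc and _mm_free on x86 compilers */
#include <stddef.h>

/* Hypothetical helper: fills a 64-byte aligned scratch buffer with
   1, 2, ..., n and returns the sum, or -1.0f if allocation fails. */
float sum_first_n(size_t n)
{
    float *farray = (float *)_mm_malloc(n * sizeof(float), 64);
    if (farray == NULL)
        return -1.0f;

    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        farray[i] = (float)(i + 1);
        total += farray[i];
    }

    _mm_free(farray);   /* not free(): the release must match _mm_malloc */
    return total;
}
```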

From a user's perspective, the clear differences are that _mm_malloc requires direct CPU and compiler support, and that memory allocated with _mm_malloc must be freed with _mm_free. Given these drawbacks, what is the reason for ever using _mm_malloc? Can it have a slight performance advantage? Is it a historical accident?

Recommended answer

Intel compilers support both POSIX (Linux) and non-POSIX (Windows) operating systems, and hence cannot rely on either the POSIX or the Windows function. Thus, a compiler-specific but OS-agnostic solution was chosen.

C11 is a great solution, but Microsoft doesn't even support C99 yet, so who knows if they will ever support C11.

Update: Unlike the C11/POSIX/Windows allocation functions, the ICC intrinsics include a deallocation function. This allows the API to use a separate heap manager from the default one. I don't know if or when it actually does that, but it can be useful to support this model.
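To make the OS-agnostic idea concrete, here is a rough sketch of how an allocate/free pair could be layered over the platform functions. It is not taken from Intel's implementation, and the names portable_aligned_malloc and portable_aligned_free are hypothetical. One detail worth noting is that on Windows, memory from _aligned_malloc has to be released with _aligned_free, so a dedicated release function is needed there in any case.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200112L   /* ask for the posix_memalign declaration */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>               /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical wrapper: same argument order as _mm_malloc (size, alignment).
   `alignment` is assumed to be a power of two; the POSIX branch also needs
   it to be a multiple of sizeof(void *). Returns NULL on failure. */
void *portable_aligned_malloc(size_t size, size_t alignment)
{
#if defined(_WIN32)
    return _aligned_malloc(size, alignment);
#else
    void *p = NULL;
    return posix_memalign(&p, alignment, size) == 0 ? p : NULL;
#endif
}

/* Hypothetical wrapper: callers always release through this function, so the
   backing allocator can change without touching any call site. */
void portable_aligned_free(void *p)
{
#if defined(_WIN32)
    _aligned_free(p);
#else
    free(p);
#endif
}
```

Because callers never call free directly, the pair could later be pointed at a different heap without changing user code, which is the flexibility the update describes.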

Disclaimer: I work for Intel but have no special knowledge of these decisions, which were made long before I joined the company.
