How to allocate 16-byte memory-aligned data


Question

I am trying to implement SSE vectorization on a piece of code, for which I need my 1D array to be 16-byte aligned in memory. I have tried several ways to allocate 16-byte aligned data, but it always ends up being 4-byte aligned.

I have to work with the Intel icc compiler. This is the sample code I am testing with:

  #include <stdio.h>
  #include <stdlib.h>
  #include <malloc.h>   /* declares memalign */

  void error(const char *str)
  {
      printf("Error:%s\n", str);
      exit(-1);
  }

  int main(void)
  {
      int i;
      //float *A=NULL;
      float *A = (float *) memalign(16, 20 * sizeof(float));

      //align
      // if (posix_memalign((void **)&A, 16, 20*sizeof(void*)) != 0)
      //   error("Cannot align");

      /* Print the address of every element to inspect its alignment. */
      for (i = 0; i < 20; i++)
          printf("&A[%d] = %p\n", i, (void *)&A[i]);

      free(A);

      return 0;
  }

This is the output I get:

 &A[0] = 0x11fe010
 &A[1] = 0x11fe014
 &A[2] = 0x11fe018
 &A[3] = 0x11fe01c
 &A[4] = 0x11fe020
 &A[5] = 0x11fe024
 &A[6] = 0x11fe028
 &A[7] = 0x11fe02c
 &A[8] = 0x11fe030
 &A[9] = 0x11fe034
 &A[10] = 0x11fe038
 &A[11] = 0x11fe03c
 &A[12] = 0x11fe040
 &A[13] = 0x11fe044
 &A[14] = 0x11fe048
 &A[15] = 0x11fe04c
 &A[16] = 0x11fe050
 &A[17] = 0x11fe054
 &A[18] = 0x11fe058
 &A[19] = 0x11fe05c

It is 4-byte aligned every time. I have used both memalign and posix_memalign. Since I am working on Linux, I cannot use _mm_malloc, nor can I use _aligned_malloc. I get a memory corruption error when I try to use _aligned_attribute (which I think is suitable for gcc alone).

Can anyone help me correctly generate 16-byte aligned data with icc on Linux?

Recommended answer

The memory you allocate is 16-byte aligned. See:
&A[0] = 0x11fe010
The address ends in hexadecimal 0, so it is a multiple of 16. But in an array of float, each element is 4 bytes, so the second element is only 4-byte aligned. Only every fourth element (A[0], A[4], A[8], ...) falls on a 16-byte boundary.
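To confirm this without reading hex addresses by eye, the check can be written in code. The following is a minimal sketch, assuming a POSIX system where posix_memalign is available; is_aligned and alloc_floats_16 are hypothetical helper names introduced here for illustration, not part of any library.

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign in <stdlib.h> */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if ptr is a multiple of `alignment`
   bytes, 0 otherwise. */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}

/* Hypothetical helper: allocates n floats starting on a 16-byte
   boundary. Returns NULL on failure; release the result with free(). */
static float *alloc_floats_16(size_t n)
{
    void *p = NULL;
    if (posix_memalign(&p, 16, n * sizeof(float)) != 0)
        return NULL;
    assert(is_aligned(p, 16));  /* the base address is what gets aligned */
    return p;
}
```

With A = alloc_floats_16(20), is_aligned(&A[0], 16) and is_aligned(&A[4], 16) hold, while is_aligned(&A[1], 16) does not, which matches the addresses printed in the question.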

You can use an array of structures, each containing a single float, with the aligned attribute:

struct x {
    float y;
} __attribute__((aligned(16)));
struct x *A = memalign(...);
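The snippet above leaves the allocation arguments elided. One way to fill it out is sketched below, assuming a compiler that accepts the GCC-style aligned attribute and a POSIX system with posix_memalign; alloc_padded is a hypothetical helper name, and the choice of posix_memalign over memalign is an assumption of this sketch rather than something the answer specifies.

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign in <stdlib.h> */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One float per 16-byte slot: the attribute raises the alignment of the
   struct to 16, and its size is padded up to a multiple of that. */
struct x {
    float y;
} __attribute__((aligned(16)));

/* Compile-time check (C11) that each array element occupies 16 bytes. */
static_assert(sizeof(struct x) == 16, "struct x should be padded to 16 bytes");

/* Hypothetical helper: allocates n padded elements on a 16-byte
   boundary. Returns NULL on failure; release the result with free(). */
static struct x *alloc_padded(size_t n)
{
    void *p = NULL;
    if (posix_memalign(&p, 16, n * sizeof(struct x)) != 0)
        return NULL;
    return p;
}
```

With this layout every &A[i] is 16-byte aligned, at the cost of four times the memory. The floats are also no longer adjacent in memory, so this suits code that needs each element aligned individually rather than code that loads four consecutive floats at once.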
