How to use this macro to test if memory is aligned?

Question

I'm a SIMD beginner. I've read this article about the topic (since I'm using an AVX2-compatible machine).

I've also read in this question how to check whether a pointer is aligned.

I'm testing it with this toy example, main.cpp:

#include <cstdint>   // uintptr_t, used by the is_aligned macro
#include <iostream>
#include <immintrin.h>

#define is_aligned(POINTER, BYTE_COUNT) \
    (((uintptr_t)(const void *)(POINTER)) % (BYTE_COUNT) == 0)


int main()
{
  float a[8];
  for(int i=0; i<8; i++){
    a[i]=i;
  }
  __m256 evens = _mm256_set_ps(2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0);
  std::cout<<is_aligned(a, 16)<<" "<<is_aligned(&evens, 16)<<std::endl;   
  std::cout<<is_aligned(a, 32)<<" "<<is_aligned(&evens, 32)<<std::endl;   

}

I compile it with icpc -std=c++11 -o main main.cpp

The printed result is:

1 1
1 1

However, if I add these 3 lines before the 4 prints:

for(int i=0; i<8; i++)
  std::cout<<a[i]<<" ";
std::cout<<std::endl;

The result is:

0 1 2 3 4 5 6 7 
1 1
0 1

In particular, I don't understand that last 0. Why is it different from the previous run? What am I missing?

Recommended answer

Your is_aligned (which is a macro, not a function) determines whether an object happens to be aligned to a particular boundary. It does not determine the alignment requirement of the object's type.

The compiler guarantees that a float array is aligned to at least the alignment requirement of float, which is typically 4 bytes. Being 4-byte aligned does not imply being 32-byte aligned, so there is no guarantee that the array starts on a 32-byte boundary. However, every address divisible by 32 is also divisible by 4, so an address on a 4-byte boundary can happen to fall on a 32-byte boundary as well. That is what happened in your first test, but as explained, nothing guarantees it. In your second test the added code changed the layout of the local variables, and the array ended up at a different address. That address happened not to be on a 32-byte boundary.
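To make the difference between "guaranteed" and "happens to be" concrete, here is a minimal sketch. It assumes sizeof(float) is 4, and the names count_aligned and Demo are made up for illustration. The buffer's base is pinned to a 32-byte boundary only so that the counts are reproducible.

```cpp
#include <cstddef>
#include <cstdint>

// Function form of the question's macro: true if p sits on an n-byte boundary.
inline bool is_aligned(const void* p, std::size_t n) {
    return reinterpret_cast<std::uintptr_t>(p) % n == 0;
}

// Counts how many of the first `count` floats starting at `base`
// sit on an n-byte boundary.
inline int count_aligned(const float* base, int count, std::size_t n) {
    int hits = 0;
    for (int i = 0; i < count; ++i) {
        if (is_aligned(base + i, n)) ++hits;
    }
    return hits;
}

// Hypothetical demo type: a float buffer whose first element is forced
// onto a 32-byte boundary, so the counts below do not depend on stack layout.
struct Demo {
    alignas(32) float buf[16];
};
```

For a Demo d, count_aligned(d.buf, 16, alignof(float)) counts every element, because each float must satisfy its own alignment requirement. With 4-byte floats, count_aligned(d.buf, 16, 32) counts only elements 0 and 8: a 4-byte-aligned address lands on a 32-byte boundary only some of the time.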

To request a stricter alignment, such as the one SIMD instructions may require, you can use the alignas specifier:

alignas(32) float a[8];
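As a rough sketch of how that could be checked, the snippet below applies alignas both to a local array and to a wrapper type. The names Float8, is_aligned32, and demo_alignas are hypothetical and not part of the original question or answer.

```cpp
#include <cstdint>

// Hypothetical wrapper type: every Float8 object must start on a 32-byte boundary.
struct alignas(32) Float8 {
    float v[8];
};
static_assert(alignof(Float8) == 32, "Float8 should require 32-byte alignment");

// True if p sits on a 32-byte boundary (same test as the question's macro).
inline bool is_aligned32(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 32 == 0;
}

// Returns true when both over-aligned locals really start on a 32-byte boundary.
inline bool demo_alignas() {
    alignas(32) float a[8] = {};  // alignas applied to a single variable
    Float8 b = {};                // alignment requirement carried by the type
    return is_aligned32(a) && is_aligned32(&b);
}
```

Applying alignas to a variable affects only that object, whereas applying it to a type makes the stricter requirement part of the type, so alignof reports it for every instance.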
