How to determine if memory is aligned?
Question
I am new to optimizing code with SSE/SSE2 instructions, and so far I have not gotten very far. To my knowledge, a common SSE-optimized function looks like this:
void sse_func(const float* const ptr, int len){
    if( ptr is aligned )
    {
        for( ... ){
            // unroll loop by 4 or 2 elements
        }
        for( ... ){
            // handle the rest
            // (non-optimized code)
        }
    } else {
        for( ... ){
            // regular C code to handle non-aligned memory
        }
    }
}
However, how do I correctly determine whether the memory ptr points to is aligned to, e.g., 16 bytes? I think I have to include the regular C code path for non-aligned memory, as I cannot make sure that every block of memory passed to this function will be aligned. And using the intrinsics to load data from unaligned memory into the SSE registers seems to be horribly slow (even slower than regular C code).
Thanks in advance...
Recommended answer
Casting to long is a cheap way to protect oneself against what is nowadays the most likely problem: int and pointers being different sizes.
As pointed out in the comments below, there are better solutions if you are willing to include a header...
A pointer p is aligned on a 16-byte boundary iff ((unsigned long)p & 15) == 0.
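The answer does not say which header it has in mind. One common choice is <stdint.h>, whose uintptr_t type is wide enough to hold a pointer value on platforms that provide it, which unsigned long is not guaranteed to be (for example, where long is 32 bits and pointers are 64 bits). A minimal sketch under that assumption follows; the helper name is_aligned and its parameters are hypothetical and not taken from the answer:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the original answer.
 * Returns true when ptr lies on an `alignment`-byte boundary.
 * Assumes `alignment` is a nonzero power of two (e.g. 16 for SSE),
 * so that alignment - 1 is a mask of the low address bits. */
static bool is_aligned(const void *ptr, size_t alignment)
{
    /* The pointer-to-integer mapping is implementation-defined; on
     * mainstream platforms the integer is the numeric address. */
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

In the function from the question, the check would then read if (is_aligned(ptr, 16)) { ... } else { ... }. The mask alignment - 1 generalizes the constant 15 used for the 16-byte case.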