How to determine if memory is aligned?


Question

I am new to optimizing code with SSE/SSE2 instructions and have not gotten very far yet. As far as I know, a typical SSE-optimized function looks like this:

void sse_func(const float* const ptr, int len) {
    if (ptr is aligned) {
        for (...) {
            // unroll loop by 4 or 2 elements
        }
        for (...) {
            // handle the rest
            // (non-optimized code)
        }
    } else {
        for (...) {
            // regular C code to handle non-aligned memory
        }
    }
}

However, how do I correctly determine whether the memory ptr points to is aligned to, e.g., 16 bytes? I think I have to include the regular C code path for non-aligned memory, because I cannot guarantee that every block of memory passed to this function will be aligned. And using the intrinsics to load data from unaligned memory into the SSE registers seems to be horribly slow (even slower than regular C code).

Thank you in advance...

Recommended answer

Casting to long is a cheap way to guard against what is nowadays the most likely problem: int and pointers having different sizes.

As pointed out in the comments below, there are better solutions if you are willing to include a header...

A pointer p is aligned on a 16-byte boundary if and only if ((unsigned long)p & 15) == 0.
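The answer does not name the header it has in mind. One plausible reading, and an assumption here, is <stdint.h>, whose uintptr_t type is wide enough to hold a pointer value wherever it is provided. Below is a minimal sketch of the same bit test written that way. The helper name is_aligned and its alignment parameter are hypothetical and not part of the original answer. The result of converting a pointer to an integer is implementation-defined, so this relies on the usual flat-address behavior of mainstream platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the original answer): returns nonzero when
   p is aligned to `alignment` bytes. `alignment` must be a power of two. */
static int is_aligned(const void *p, uintptr_t alignment)
{
    /* A power of two has exactly one bit set. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Same test as ((unsigned long)p & 15) == 0, generalized:
       the low bits of the address must all be zero. */
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

In the function from the question, the check would read if (is_aligned(ptr, 16)). For a buffer buf that starts on a 16-byte boundary, is_aligned(buf, 16) is true and is_aligned(buf + 4, 16) is false when buf is a pointer to bytes.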
