gcc memory alignment pragma


Question

Does gcc have a memory alignment pragma, akin to #pragma vector aligned in the Intel compiler? I would like to tell the compiler to optimize a particular loop using aligned load/store instructions. To avoid possible confusion, this is not about struct packing.

For example:

#if defined (__INTEL_COMPILER)
#pragma vector aligned
#endif
        for (int a = 0; a < int(N); ++a) {
            q10 += Ix(a,0,0)*Iy(a,1,1)*Iz(a,0,0);
            q11 += Ix(a,0,0)*Iy(a,0,1)*Iz(a,1,0);
            q12 += Ix(a,0,0)*Iy(a,0,0)*Iz(a,0,1);
            q13 += Ix(a,1,0)*Iy(a,0,0)*Iz(a,0,1);
            q14 += Ix(a,0,0)*Iy(a,1,0)*Iz(a,0,1);
            q15 += Ix(a,0,0)*Iy(a,0,0)*Iz(a,1,1);
        }

Thanks.

Recommended answer

From http://gcc.gnu.org/onlinedocs/gcc/Type-Attributes.html:

typedef double aligned_double __attribute__((aligned (16)));
// Note: sizeof(aligned_double) is 8, not 16
void some_function(aligned_double *x, aligned_double *y, int n)
{
    for (int i = 0; i < n; ++i) {
        // math!
    }
}

This won't make aligned_double 16 bytes wide. It only tells the compiler that the type is aligned to a 16-byte boundary, or rather that the first element of an array will be. Looking at the disassembly on my computer, as soon as I use the alignment directive I start to see a lot of vector ops. I am using a Power architecture computer at the moment, so it's AltiVec code, but I think this does what you want.
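The size-versus-alignment distinction can be checked directly. The following is a minimal sketch, assuming GCC or a compatible compiler that accepts __attribute__((aligned)) and C11 _Static_assert; the helper names is_aligned16 and sum are hypothetical and not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same typedef as in the answer: 16-byte alignment, size of a plain double. */
typedef double aligned_double __attribute__((aligned(16)));

/* The attribute raises the alignment requirement but leaves the size alone. */
_Static_assert(sizeof(aligned_double) == sizeof(double), "size is unchanged");
_Static_assert(__alignof__(aligned_double) == 16, "alignment is raised to 16");

/* Hypothetical helper: returns 1 if p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Hypothetical example loop: the parameter type promises the compiler
   that x points to 16-byte-aligned storage. */
double sum(const aligned_double *x, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; ++i) {
        s += x[i];
    }
    return s;
}
```

The caller remains responsible for actually providing aligned storage, for instance a buffer declared with __attribute__((aligned(16))). The typedef is a promise to the compiler and does not align anything by itself when applied to a pointer parameter.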

(Note: I wasn't using double when I tested this, because AltiVec doesn't support double-precision floats.)

You can see some other examples of autovectorization using the type attributes here: http://gcc.gnu.org/projects/tree-ssa/vectorization.html
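For a hint scoped to a single loop rather than a type, GCC (4.7 and later) also provides the __builtin_assume_aligned builtin. This is a different technique from the typedef used in the answer, shown here only as a hedged sketch; the function name dot_aligned is hypothetical, and whether aligned load/store instructions are actually emitted depends on the target and optimization flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: dot product of x and y.
   Both pointers are assumed to be 16-byte aligned; passing a
   misaligned pointer breaks the promise made to the compiler. */
double dot_aligned(const double *x, const double *y, size_t n)
{
    /* Debug-time check of the assumption (compiled out with NDEBUG). */
    assert((uintptr_t)x % 16 == 0 && (uintptr_t)y % 16 == 0);

    /* The builtin returns its first argument, annotated as 16-byte aligned. */
    const double *xa = __builtin_assume_aligned(x, 16);
    const double *ya = __builtin_assume_aligned(y, 16);

    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += xa[i] * ya[i];
    }
    return s;
}
```

Because the alignment claim is attached to the local pointers xa and ya, it affects only this function and leaves the types used elsewhere in the program unchanged.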
