Use both SSE2 intrinsics and gcc inline assembler

Question

I am trying to mix SSE2 intrinsics and inline assembler in gcc. But if I specify a register such as xmm0 as the constraint for an input variable, I get a compiler error in some cases. Example:

#include <emmintrin.h>
int main() {
  __m128i test = _mm_setzero_si128(); 
  asm ("pxor %%xmm0, %%xmm0" : : "xmm0" (test) : );
}

When compiled with gcc version 4.6.1 I get:

>gcc asm_xmm.c
asm_xmm.c: In function ‘main’:
asm_xmm.c:10:3: error: matching constraint references invalid operand number
asm_xmm.c:7:5: error: matching constraint references invalid operand number

The strange thing is that in some cases where I have other input variables/registers, it suddenly works with xmm0 as input but not xmm1, etc. And in another case I was able to specify xmm0-xmm4 but nothing above that. I'm a little confused/frustrated about this :S

Thanks :)

Recommended answer

You should let the compiler do the register assignment. Here's an example of pshufb (for a gcc too old to provide tmmintrin.h for SSSE3):

static inline __m128i __attribute__((always_inline))
_mm_shuffle_epi8(__m128i xmm, __m128i xmm_shuf)
{
    __asm__("pshufb %1, %0" : "+x" (xmm) : "xm" (xmm_shuf));
    return xmm;
}

Note the "x" constraint on the operands, and that the assembly itself simply refers to %0 and %1, which the compiler replaces with the registers it selected.
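
As a rough sketch of what that means for the snippet in the question: as far as I can tell, gcc reads a constraint string one character at a time, so "xmm0" is not a register name but the constraints x, m, m followed by the matching constraint 0, which has to refer to an output operand. With no outputs that gives the "invalid operand number" error, and it would also explain why xmm0 starts working once other operands are present. The helper below (zero_with_pxor is a made-up name, not from the question) uses "+x" and %0 instead, assuming an x86 target with SSE2:

```c
#include <emmintrin.h>

/* Hypothetical helper (not from the original post): XOR a vector with
   itself through inline asm, letting gcc choose the XMM register. */
static inline __m128i __attribute__((always_inline))
zero_with_pxor(__m128i v)
{
    /* "+x": v is kept in an SSE register of the compiler's choice and is
       both read and written; %0 expands to that register's name. */
    __asm__("pxor %0, %0" : "+x" (v));
    return v;
}
```

Called as zero_with_pxor(test), this does what the original asm statement was apparently aiming for, without naming a register by hand.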

Be careful to use the right modifiers. "+x" means xmm is both an input and an output operand. If you are sloppy with these modifiers (e.g. using "=x", meaning output only, when you needed "+x"), the compiler is free to assume the old value is never read, and you will run into cases where the code sometimes works and sometimes doesn't.
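
To make the difference concrete, here is a minimal sketch with one operand of each kind. The function names add_epi32_asm and copy_asm are invented for illustration, and the code assumes gcc on x86 with SSE2:

```c
#include <emmintrin.h>

/* Hypothetical helpers for illustration; the names are not from the post. */

/* Read-modify-write: paddd both reads and writes its destination, so the
   destination operand needs "+x". With "=x" gcc would be told that the
   incoming value of acc is dead and need not be loaded into the register. */
static inline __m128i add_epi32_asm(__m128i acc, __m128i addend)
{
    __asm__("paddd %1, %0" : "+x" (acc) : "xm" (addend));
    return acc;
}

/* Write-only: movdqa overwrites its destination without reading it,
   so "=x" is the correct modifier here. */
static inline __m128i copy_asm(__m128i src)
{
    __m128i dst;
    __asm__("movdqa %1, %0" : "=x" (dst) : "x" (src));
    return dst;
}
```

A quick rule of thumb: if the instruction's result depends on the previous contents of the destination register, use "+x"; if it only overwrites the destination, "=x" is enough.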
