Flipping sign on packed SSE floats

Question

I'm looking for the most efficient method of flipping the sign on all four floats packed in an SSE register.

I have not found an intrinsic for doing this in the Intel Architecture software dev manual. Below are the things I've already tried.

For each case I looped over the code 10 billion times and got the wall time indicated. I'm trying to at least match the 4 seconds my non-SIMD approach takes, which uses just the unary minus operator.


[48 sec]

_mm_sub_ps( _mm_setzero_ps(), vec );


[32 sec]

_mm_mul_ps( _mm_set1_ps( -1.0f ), vec );


[9 sec]

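/* Build a float whose only set bit is the IEEE-754 sign bit. */
union NegativeMask {
    unsigned int intRep;   /* unsigned, so 0x80000000 fits without conversion */
    float        fltRep;
} negMask;
negMask.intRep = 0x80000000;

/* XOR with the mask flips the sign bit of all four floats. */
_mm_xor_ps( _mm_set1_ps( negMask.fltRep ), vec );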
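The XOR approach can also be written without the union, since the float constant -0.0f has exactly the sign bit set. The following is only a minimal sketch, not a benchmarked result: the helper name negate_ps is hypothetical, and it assumes an x86 target with SSE enabled.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_set1_ps, _mm_xor_ps */

/* Hypothetical helper (name chosen for illustration only).
   Flips the sign of all four packed floats by XOR-ing each lane
   with -0.0f, whose bit pattern is 0x80000000 (sign bit only). */
static inline __m128 negate_ps(__m128 vec)
{
    const __m128 sign_mask = _mm_set1_ps(-0.0f);
    return _mm_xor_ps(vec, sign_mask);
}
```

Calling negate_ps( vec ) inside the timing loop would take the place of the union-based expression; whether it changes the measured time depends on how the compiler hoists the mask constant.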


The compiler is gcc 4.2 with -O3. The CPU is an Intel Core 2 Duo.

Recommended answer

Just to complete your own answer, here is what the gcc documentation says about these builtin vector types:

The types defined in this manner can be used with a subset of normal C
operations.  Currently, GCC will allow using the following operators on
these types: `+, -, *, /, unary minus, ^, |, &, ~'.

It is probably a good idea to stick to these operators whenever possible. In all likelihood gcc will generate the most efficient code for these SSE operations.
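As an illustration of the unary minus operator from the quoted list, here is a minimal sketch using a GCC vector type. The typedef name v4sf and the function name negate4 are assumptions made for this example, and the code relies on GCC's vector_size extension, so it is not portable to compilers that lack it.

```c
/* Four packed floats (16 bytes), declared with GCC's vector_size
   attribute. The name v4sf is only an illustrative choice. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: negates all four lanes with plain unary minus,
   leaving the choice of instruction to the compiler. */
static inline v4sf negate4(v4sf v)
{
    return -v;
}
```

With gcc, __m128 from xmmintrin.h is itself defined as a vector type of this kind, so the same -vec expression should also be accepted on an __m128 value there.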

For your compiler options, add something more specific to your architecture; -march=native will do in most cases.
