gcc, simd intrinsics and fast-math concepts


Problem description

Hi all :)
I'm trying to get the hang of a few concepts regarding floating point, SIMD/math intrinsics and the fast-math flag for gcc. More specifically, I'm using MinGW with gcc v4.5.0 on an x86 CPU.

I've searched around for a while now, and this is what I (think I) understand at the moment:

When I compile with no flags, any fp code will be standard x87, no simd intrinsics, and the math.h functions will be linked from msvcrt.dll.

When I use -mfpmath, -msseN and/or -march so that mmx/sse/avx code gets enabled, gcc actually uses simd instructions only if I also specify some optimization flags, like -On or -ftree-vectorize. In that case the intrinsics are chosen automagically by gcc, and some math functions (I'm still talking about the standard math funcs in math.h) will become intrinsics or be replaced by inline code, while others will still come from msvcrt.dll. If I don't specify optimization flags, does any of this change?

When I use specific simd data types (those available as gcc extensions, like v4si or v8qi), I have the option to call intrinsic funcs directly, or again leave the automagic decision to gcc. Gcc can still choose standard x87 code if I don't enable simd instructions via the proper flags. Again, if I don't specify optimization flags, does any of this change?
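To make the "leave the decision to gcc" option concrete, here is a minimal sketch using the vector extension directly with ordinary operators. The function name add_v4si is made up for illustration; only the vector_size attribute and the v4si naming come from the gcc manual. Which instructions gcc emits for the addition depends on the -m flags in effect, so this shows the source-level syntax, not a guaranteed code generation result.

```c
#include <string.h>

/* Four 32-bit ints packed in one 16-byte vector (gcc vector extension). */
typedef int v4si __attribute__((vector_size(16)));

/* Hypothetical helper: element-wise a + b using the plain '+' operator.
 * gcc picks the instructions; with SSE2 enabled this can become a single
 * packed add, otherwise it is lowered to scalar code. */
void add_v4si(const int a[4], const int b[4], int out[4])
{
    v4si va, vb, vc;

    /* memcpy avoids alignment and aliasing assumptions about the arrays. */
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);

    vc = va + vb;

    memcpy(out, &vc, sizeof vc);
}
```

Calling add_v4si on {1, 2, 3, 4} and {10, 20, 30, 40} fills out with the four element-wise sums, whether or not any simd flag was given.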

Please correct me if any of my statements are wrong :p

Now the questions:

  1. Do I ever have to include x86intrin.h to use intrinsics?
  2. Do I ever have to link libm?
  3. What does fast-math have to do with all this? I understand it relaxes the IEEE standard, but, specifically, how? Are other standard functions used? Is some other lib linked? Or are just a couple of flags set somewhere so that the standard lib behaves differently?
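On question 1, a small sketch may help. As far as I know, the _mm_* style intrinsics are declared in headers (xmmintrin.h for SSE, or x86intrin.h to pull in everything the current flags enable), while the __builtin_ia32_* built-ins need no header. The function name add4 and the scalar fallback are my own additions for illustration, so the file still compiles when SSE is not enabled.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* declares __m128 and the _mm_*_ps intrinsics */
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for four floats. */
void add4(const float *a, const float *b, float *out)
{
#if defined(__SSE__)
    /* Unaligned loads/stores, so callers need no special alignment. */
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
#else
    /* Scalar fallback when gcc was not told to enable SSE. */
    size_t i;
    for (i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

On 32-bit x86 the SSE branch is only taken when something like -msse is passed, because gcc defines __SSE__ only when the instruction set is enabled.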

Thanks to anybody who is going to help :D

Solution

Ok, I'm answering for anyone who is struggling a bit to grasp these concepts like me.

Optimizations with -Ox work on any kind of code, fpu or sse.

fast-math seems to work only on x87 code. Also, it doesn't seem to change the fpu control word o_O
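For what it's worth, the gcc manual describes -ffast-math as an umbrella flag: it turns on -fno-math-errno, -funsafe-math-optimizations, -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans and -fcx-limited-range, and makes the preprocessor define __FAST_MATH__. So it changes what the compiler is allowed to do with the expressions rather than swapping in another library. The sketch below shows one such freedom, reassociation; the function names are made up for illustration.

```c
/* Returns 1 when the file was compiled with -ffast-math, else 0. */
int fast_math_enabled(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}

/* (a + b) + c, evaluated left to right. */
float sum_left(float a, float b, float c)
{
    return (a + b) + c;
}

/* a + (b + c): mathematically the same, but not in floating point.
 * Without fast-math gcc must keep the two groupings distinct; with
 * -ffast-math it is allowed to treat them as interchangeable. */
float sum_right(float a, float b, float c)
{
    return a + (b + c);
}
```

With a = 1e20f, b = -1e20f and c = 1.0f, strict IEEE evaluation gives 1.0f for sum_left and 0.0f for sum_right, because 1.0f is lost when added to -1e20f. Under -ffast-math the compiler is no longer obliged to preserve that difference.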

Builtins are always included. This behavior can be avoided for some builtins with some flags, like the strict ISO modes (-ansi, -std=c99) or -fno-builtin.

libm.a provides some math functions that are not included in glibc itself, but with MinGW it's just a dummy file, so at the moment it's useless to link to it.
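A tiny sketch of what "always included" means in practice: the __builtin_ prefixed names are available without math.h, and as far as I can tell the two used here are expanded by the compiler itself rather than turned into library calls. The wrapper names make_nan and is_nan are my own.

```c
/* No #include <math.h> needed: these are compiler builtins. */

/* Hypothetical helper: produce a quiet NaN at compile time. */
double make_nan(void)
{
    return __builtin_nan("");
}

/* Hypothetical helper: non-zero when x is a NaN. */
int is_nan(double x)
{
    return __builtin_isnan(x);
}
```

Note that -ffinite-math-only (part of -ffast-math) lets gcc assume NaNs never occur, so a check like is_nan is not reliable under that flag.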

Using the special vector types of gcc seems useful only when calling the intrinsics directly, otherwise the code gets vectorized anyway.
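For comparison, this is the kind of plain loop that the auto-vectorizer can handle on its own, assuming something like -O3 (or -O2 -ftree-vectorize) together with a simd flag such as -msse2. The function saxpy is just an illustrative example, and whether the loop really gets vectorized depends on the gcc version and flags.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i] over n elements, with no vector types or
 * intrinsics. __restrict tells gcc that x and y do not overlap, which
 * makes it easier for the vectorizer to prove the loop is safe. */
void saxpy(size_t n, float a, const float *__restrict x, float *__restrict y)
{
    size_t i;
    for (i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Passing -ftree-vectorizer-verbose=1 on gcc 4.x (or -fopt-info-vec on newer versions) should report whether the loop was vectorized.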

Any correction is welcome :)

Useful links:
fpu / sse control
gcc math
and the gcc manual on "Vector Extensions", "X86 Built-in functions" and "Other Builtins"
