Fused multiply add and default rounding modes

Views: 294 Published: 2016/8/17 22:53:21 Tags: c gcc clang ieee-754 fma

Problem description

With GCC 5.3 the following code compiled with -O3 -mfma

float mul_add(float a, float b, float c) {
  return a*b + c;
}

produces the following assembly

vfmadd132ss     %xmm1, %xmm2, %xmm0
ret

I noticed GCC doing this with -O3 already in GCC 4.8.

Clang 3.7 with -O3 -mfma produces

vmulss  %xmm1, %xmm0, %xmm0
vaddss  %xmm2, %xmm0, %xmm0
retq

but Clang 3.7 with -Ofast -mfma produces the same code as GCC with -O3.

I am surprised that GCC does this with -O3, because this answer says:

The compiler is not allowed to fuse a separated add and multiply unless you allow for a relaxed floating-point model.

This is because an FMA has only one rounding, while an ADD + MUL has two. So the compiler will violate strict IEEE floating-point behaviour by fusing.
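
The one-rounding versus two-roundings difference can be made visible with a small sketch. The function names below are hypothetical, and the second function is only a stand-in for a hardware FMA: it relies on the product of two floats being exact in double, which gives a single final rounding for the inputs discussed here but is not a correctly rounded fmaf replacement in general. Round-to-nearest (the default mode) is assumed.

```c
/* Two roundings: the product is rounded to float, then the sum is
   rounded again. The volatile store forces the intermediate rounding
   and keeps the compiler from contracting the expression into an FMA. */
float mul_add_separate(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

/* Stand-in for a fused operation (an assumption for illustration):
   the product of two floats is exact in double, because 24 + 24
   significand bits fit into 53. For the inputs in the note below the
   double sum is exact too, so only the final conversion rounds. */
float mul_add_single_rounding(float a, float b, float c) {
    double exact_product = (double)a * (double)b;
    return (float)(exact_product + (double)c);
}
```

With a = b = 0x1.001p0f (1 + 2^-12) and c = -0x1.002p0f, the exact product is 1 + 2^-11 + 2^-24. Rounding that product to float is a tie that resolves to 1 + 2^-11, so mul_add_separate returns 0, while mul_add_single_rounding keeps the low bit and returns 0x1p-24f. This is the kind of discrepancy the quoted answer refers to.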

However, this link says

Regardless of the value of FLT_EVAL_METHOD, any floating-point expression may be contracted, that is, calculated as if all intermediate results have infinite range and precision.

So now I am confused and concerned.

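Is GCC justified in using FMA with -O3?

Does fusing violate strict IEEE floating-point behaviour?

If fusing does violate IEEE floating-point behaviour, and since GCC defines __STDC_IEC_559__ (http://stackoverflow.com/questions/31181897/status-of-stdc-iec-559-with-modern-c-compilers), isn't this a contradiction?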

Since FMA can be emulated in software, it seems there should be two compiler switches for FMA: one to tell the compiler to use FMA in calculations and one to tell the compiler that the hardware has FMA.
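
The "hardware has FMA" half of that distinction can be inspected from inside a program. A minimal sketch, with hypothetical function names: GCC and Clang define the __FMA__ macro when the target is declared to have FMA instructions (for example via -mfma), and C99 lets <math.h> define FP_FAST_FMAF when fmaf is expected to be about as fast as a separate multiply and add. Neither macro says whether the compiler will actually contract a*b + c.

```c
#include <math.h>

/* Returns 1 if the compiler was told the target supports FMA
   instructions (GCC/Clang define __FMA__ under -mfma), else 0. */
int target_has_fma(void) {
#ifdef __FMA__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the C library advertises fmaf() as fast
   (C99 optional macro FP_FAST_FMAF), else 0. */
int fmaf_is_fast(void) {
#ifdef FP_FAST_FMAF
    return 1;
#else
    return 0;
#endif
}
```

Both functions only report how the translation unit was built; whether a given expression gets fused is a separate decision.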

Apparently this can be controlled with the option -ffp-contract. With GCC the default is -ffp-contract=fast and with Clang it's not. Other options such as -ffp-contract=on and -ffp-contract=off do not produce the FMA instruction.

For example, Clang 3.7 with -O3 -mfma -ffp-contract=fast produces vfmadd132ss.

I checked some permutations of #pragma STDC FP_CONTRACT set to ON and OFF with -ffp-contract set to on, off, and fast. In all cases I also used -O3 -mfma.

With GCC the answer is simple. #pragma STDC FP_CONTRACT ON or OFF makes no difference. Only -ffp-contract matters.

With GCC it uses FMA with

-ffp-contract=fast (default).

With Clang it uses FMA

with -ffp-contract=fast.
with -ffp-contract=on (default) and #pragma STDC FP_CONTRACT ON (default is OFF).

In other words, with Clang you can get FMA with #pragma STDC FP_CONTRACT ON (since -ffp-contract=on is the default) or with -ffp-contract=fast. -ffast-math (and hence -Ofast) sets -ffp-contract=fast.
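
For reference, the pragma is placed either at file scope or at the start of a compound statement. The sketch below uses hypothetical function names. Going by the observations above, Clang with -O3 -mfma would be expected to fuse only the first function, while GCC ignores the pragma and decides from -ffp-contract alone (and may warn about an ignored pragma when warnings are enabled).

```c
/* Contraction explicitly permitted inside this function body. */
float mul_add_contract_on(float a, float b, float c) {
    #pragma STDC FP_CONTRACT ON
    return a * b + c;
}

/* Contraction explicitly forbidden inside this function body. */
float mul_add_contract_off(float a, float b, float c) {
    #pragma STDC FP_CONTRACT OFF
    return a * b + c;
}
```

Comparing the assembly generated for the two functions is a quick way to see which setting a given compiler honours.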

I looked into MSVC and ICC.

With MSVC it uses the fma instruction with /O2 /arch:AVX2 /fp:fast. With MSVC /fp:precise is the default.

With ICC it uses fma with -O3 -march=core-avx2 (actually -O1 is sufficient). This is because by default ICC uses -fp-model fast. But ICC uses fma even with -fp-model precise. To disable fma with ICC use -fp-model strict or -no-fma.

So by default GCC and ICC use fma when fma is enabled (with -mfma for GCC/Clang or -march=core-avx2 with ICC) but Clang and MSVC do not.
