Argument order to std::min changes compiler output for floating-point

Problem description

I was fiddling in Compiler Explorer, and I found that the order of arguments passed to std::min changes the emitted assembly.

Here's the example on Godbolt Compiler Explorer:

#include <algorithm>

double std_min_xy(double x, double y) {
    return std::min(x, y);
}

double std_min_yx(double x, double y) {
    return std::min(y, x);
}

This is compiled (with -O3 on clang 9.0.0, for example), to:

std_min_xy(double, double):                       # @std_min_xy(double, double)
        minsd   xmm1, xmm0
        movapd  xmm0, xmm1
        ret
std_min_yx(double, double):                       # @std_min_yx(double, double)
        minsd   xmm0, xmm1
        ret

This persists if I change the std::min to an old-school ternary operator. It also persists across all the modern compilers I tried out (clang, gcc, icc).

The underlying instruction is minsd. Reading the documentation, the first argument of minsd is also the destination for the answer. Apparently xmm0 is where my function is supposed to put its return value, so if xmm0 is used as the first argument, there is no movapd needed. But if xmm0 is the second argument, then it has to movapd xmm0, xmm1 to get the value into xmm0. (editor's note: yes, x86-64 System V passes FP args in xmm0, xmm1, etc., and returns in xmm0.)

My question: why doesn't the compiler switch the order of the arguments itself, so that this movapd isn't necessary? It surely must know that the order of arguments to minsd does not change the answer? Is there some side-effect that I'm not appreciating?

Solution

minsd a,b is not commutative for some special FP values, and neither is std::min, unless you use -ffast-math.

minsd a,b exactly implements (a<b) ? a : b, including everything that implies about signed zero and NaN in strict IEEE-754 semantics (i.e. it keeps the source operand, b, when the operands are unordered [Footnote 1] or equal). As Artyer points out, -0.0 and +0.0 compare equal (i.e. -0. < 0. is false), but they are distinct values.
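The non-commutativity is easy to reproduce in plain C++. The sketch below is only an illustration: select_min, is_negative_zero and kNaN are hypothetical names (not from the question or any library), select_min simply spells out the (a<b) ? a : b rule, and the code assumes it is compiled without -ffast-math.

```cpp
#include <cmath>
#include <limits>

// Hypothetical constant for illustration: a quiet NaN.
const double kNaN = std::numeric_limits<double>::quiet_NaN();

// Hypothetical helper mirroring the rule attributed to `minsd a, b`:
// return a only if a is strictly less than b, otherwise keep b.
inline double select_min(double a, double b) {
    return (a < b) ? a : b;
}

// True only for -0.0 (compares equal to zero but has the sign bit set).
inline bool is_negative_zero(double v) {
    return v == 0.0 && std::signbit(v);
}
```

With these definitions, select_min(0.0, -0.0) returns -0.0 but select_min(-0.0, 0.0) returns +0.0, and select_min(1.0, kNaN) returns NaN but select_min(kNaN, 1.0) returns 1.0. Swapping the operands changes the result, which is why the compiler cannot swap them for free.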

std::min is defined in terms of a < comparison (cppreference), with (b<a) ? b : a as a possible implementation, so it returns its first argument whenever that comparison is false. Its result therefore depends on operand order for NaN and signed zero, unlike std::fmin, which returns the non-NaN operand when exactly one operand is NaN, regardless of order. (fmin originally came from the C math library, not a C++ template.)
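As a rough illustration of the difference, the following sketch wraps both functions so they can be called with a NaN in either position. The wrapper names min_template and min_libm are hypothetical, chosen only for this example, and the code assumes strict FP semantics (no -ffast-math).

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical wrapper: the C++ template, defined via operator<.
// When the comparison is false (e.g. an operand is NaN), it returns a.
inline double min_template(double a, double b) {
    return std::min(a, b);
}

// Hypothetical wrapper: the C math library function.
// If exactly one operand is NaN, it returns the other operand.
inline double min_libm(double a, double b) {
    return std::fmin(a, b);
}
```

Calling min_template(NaN, 1.0) gives NaN while min_template(1.0, NaN) gives 1.0, so the result follows the argument order. min_libm gives 1.0 for both orders, which costs extra instructions on x86 because a single minsd cannot express it.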

See the Q&A "What is the instruction that gives branchless FP min and max on x86?" for much more detail about minss/minsd and maxss/maxsd (and the corresponding intrinsics, which follow the same non-commutative rules except in some GCC versions).

Footnote 1: Remember that NaN<b is false for any b, and the same holds for every comparison predicate except !=. E.g. NaN == b is false, and so is NaN > b. Even NaN == NaN is false. When one or both operands of a pair are NaN, they are "unordered" with respect to each other.
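The footnote can be checked directly. This is a minimal sketch under the assumption of strict IEEE-754 behaviour (no -ffast-math); nan_is_unordered_with is a hypothetical helper name used only here.

```cpp
#include <limits>

// Hypothetical helper: returns true when NaN is "unordered" with respect
// to b, i.e. every ordering and equality comparison against b is false.
inline bool nan_is_unordered_with(double b) {
    const double nan_v = std::numeric_limits<double>::quiet_NaN();
    return !(nan_v < b) && !(nan_v > b) &&
           !(nan_v <= b) && !(nan_v >= b) &&
           !(nan_v == b);
}
```

The helper returns true for any b, including when b is itself NaN. The one predicate that behaves differently is !=, which is true whenever either operand is NaN.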


With -ffast-math (which tells the compiler to assume no NaNs, along with other assumptions and approximations), compilers will optimize either function to a single minsd. https://godbolt.org/z/a7oK91

For GCC, see https://gcc.gnu.org/wiki/FloatingPointMath
clang supports similar options, including -ffast-math as a catch-all.

Some of those options should be enabled by almost everyone, except for weird legacy codebases, e.g. -fno-math-errno. (See this Q&A for more about recommended math optimizations.) gcc -fno-trapping-math is also a good idea, because -ftrapping-math doesn't fully work anyway despite being on by default: some optimizations can still change the number of FP exceptions that would be raised if exceptions were unmasked, including sometimes even from 1 to 0 or 0 to non-zero, IIRC. gcc -ftrapping-math also blocks some optimizations that are 100% safe even with respect to exception semantics, so it's pretty bad. In code that doesn't use fenv.h, you'll never know the difference.

But treating std::min as commutative can only be accomplished with options that assume no NaNs (and make similar assumptions), so it definitely can't be called "safe" for code that cares about exactly what happens with NaNs. E.g. -ffinite-math-only assumes no NaNs (and no infinities).

clang -funsafe-math-optimizations -ffinite-math-only will do the optimization you're looking for. (-funsafe-math-optimizations implies a bunch of more specific options, including not caring about signed-zero semantics.)
