Why does Clang optimize away x * 1.0 but NOT x + 0.0?


Question


          Why does Clang optimize away the loop in this code

          #include <time.h>
          #include <stdio.h>
          
          static size_t const N = 1 << 27;
          static double arr[N] = { /* initialize to zero */ };
          
          int main()
          {
              clock_t const start = clock();
              for (int i = 0; i < N; ++i) { arr[i] *= 1.0; }
              printf("%u ms\n", (unsigned)(clock() - start) * 1000 / CLOCKS_PER_SEC);
          }
          

          but not the loop in this code?

          #include <time.h>
          #include <stdio.h>
          
          static size_t const N = 1 << 27;
          static double arr[N] = { /* initialize to zero */ };
          
          int main()
          {
              clock_t const start = clock();
              for (int i = 0; i < N; ++i) { arr[i] += 0.0; }
              printf("%u ms\n", (unsigned)(clock() - start) * 1000 / CLOCKS_PER_SEC);
          }
          

          (Tagging as both C and C++ because I would like to know if the answer is different for each.)

          Solution

          The IEEE 754-2008 Standard for Floating-Point Arithmetic and the ISO/IEC 10967 Language Independent Arithmetic (LIA) Standard, Part 1 answer why this is so.

          IEEE 754 § 6.3 The sign bit

          When either an input or result is NaN, this standard does not interpret the sign of a NaN. Note, however, that operations on bit strings — copy, negate, abs, copySign — specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicate totalOrder is also affected by the sign bit of a NaN operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation.

          When neither the inputs nor result are NaN, the sign of a product or quotient is the exclusive OR of the operands’ signs; the sign of a sum, or of a difference x − y regarded as a sum x + (−y), differs from at most one of the addends’ signs; and the sign of the result of conversions, the quantize operation, the roundTo-Integral operations, and the roundToIntegralExact (see 5.3.1) is the sign of the first or only operand. These rules shall apply even when operands or results are zero or infinite.

          When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0 in all rounding-direction attributes except roundTowardNegative; under that attribute, the sign of an exact zero sum (or difference) shall be −0. However, x + x = x − (−x) retains the same sign as x even when x is zero.

          The Case of Addition

          Under the default rounding mode (Round-to-Nearest, Ties-to-Even), we see that x+0.0 produces x, EXCEPT when x is -0.0: In that case we have a sum of two operands with opposite signs whose sum is zero, and §6.3 paragraph 3 rules this addition produces +0.0.

          Since +0.0 is not bitwise identical to the original -0.0, and -0.0 is a legitimate value that may occur as input, the compiler is obliged to emit code that will transform potential negative zeros to +0.0.

          The summary: Under the default rounding mode, in x+0.0, if x

          • is not -0.0, then x itself is an acceptable output value.
          • is -0.0, then the output value must be +0.0, which is not bitwise identical to -0.0.

          The Case of Multiplication

          Under the default rounding mode, no such problem occurs with x*1.0. If x:

          • is a (sub)normal number, x*1.0 == x always.
          • is +/- infinity, then the result is +/- infinity of the same sign.
          • is NaN, then according to

            IEEE 754 § 6.2.3 NaN Propagation

            An operation that propagates a NaN operand to its result and has a single NaN as an input should produce a NaN with the payload of the input NaN if representable in the destination format.

            which means that the exponent and mantissa (though not the sign) of NaN*1.0 are recommended to be unchanged from the input NaN. The sign is unspecified in accordance with §6.3p1 above, but an implementation may specify it to be identical to the source NaN.

          • is +/- 0.0, then the result is a 0 with its sign bit XORed with the sign bit of 1.0, in agreement with §6.3p2. Since the sign bit of 1.0 is 0, the output value is unchanged from the input. Thus, x*1.0 == x even when x is a (negative) zero.

          The Case of Subtraction

          Under the default rounding mode, the subtraction x-0.0 is also a no-op, because it is equivalent to x + (-0.0). If x

          • is NaN, then §6.3p1 and §6.2.3 apply in much the same way as for addition and multiplication.
          • is +/- infinity, then the result is +/- infinity of the same sign.
          • is a (sub)normal number, x-0.0 == x always.
          • is -0.0, then by §6.3p2 we have "[...] the sign of a sum, or of a difference x − y regarded as a sum x + (−y), differs from at most one of the addends’ signs;". This forces us to assign -0.0 as the result of (-0.0) + (-0.0), because -0.0 differs in sign from none of the addends, while +0.0 differs in sign from two of the addends, in violation of this clause.
          • is +0.0, then this reduces to the addition case (+0.0) + (-0.0) considered above in The Case of Addition, which by §6.3p3 is ruled to give +0.0.

          Since for all cases the input value is legal as the output, it is permissible to consider x-0.0 a no-op, and x == x-0.0 a tautology.

          Value-Changing Optimizations

          The IEEE 754-2008 Standard has the following interesting quote:

          IEEE 754 § 10.4 Literal meaning and value-changing optimizations

          [...]

          The following value-changing transformations, among others, preserve the literal meaning of the source code:

          • Applying the identity property 0 + x when x is not zero and is not a signaling NaN and the result has the same exponent as x.
          • Applying the identity property 1 × x when x is not a signaling NaN and the result has the same exponent as x.
          • Changing the payload or sign bit of a quiet NaN.
          • [...]

          Since all NaNs and all infinities share the same exponent, and the correctly rounded result of x+0.0 and x*1.0 for finite x has exactly the same magnitude as x, their exponent is the same.

          sNaNs

          Signaling NaNs are floating-point trap values: they are special NaN values whose use as a floating-point operand results in an invalid operation exception (delivered as SIGFPE when trapping is enabled). If a loop that triggers an exception were optimized out, the software would no longer behave the same.

          However, as user2357112 points out in the comments, the C11 Standard explicitly leaves undefined the behaviour of signaling NaNs (sNaN), so the compiler is allowed to assume they do not occur, and thus that the exceptions that they raise also do not occur. The C++11 standard omits describing a behaviour for signaling NaNs, and thus also leaves it undefined.

          Rounding Modes

          In alternate rounding modes, the permissible optimizations may change. For instance, under Round-to-Negative-Infinity mode, the optimization x+0.0 -> x becomes permissible, but x-0.0 -> x becomes forbidden.

          To prevent GCC from assuming default rounding modes and behaviours, the experimental flag -frounding-math can be passed to GCC.

          Conclusion

          Clang and GCC, even at -O3, remain IEEE-754 compliant. This means they must keep to the above rules of the IEEE-754 standard. x+0.0 is not bit-identical to x for all x under those rules, but x*1.0 may be chosen to be so: namely, when we

          1. Obey the recommendation to pass unchanged the payload of x when it is a NaN.
          2. Leave the sign bit of a NaN result unchanged by * 1.0.
          3. Obey the order to XOR the sign bit during a quotient/product, when x is not a NaN.

          To enable the IEEE-754-unsafe optimization (x+0.0) -> x, the flag -ffast-math needs to be passed to Clang or GCC.
