Adding two floating-point numbers

Views: 263 | Posted: 2016/11/22 21:19:10 | Tags: gcc floating-point clang c99 fenv


Problem description

 
I would like to compute the sum, rounded up, of two IEEE 754 binary64 numbers. To that end I wrote the C99 program below:
#include <stdio.h>
#include <fenv.h>
#pragma STDC FENV_ACCESS ON

int main(int c, char *v[]){
  fesetround(FE_UPWARD);
  printf("%a\n", 0x1.0p0 + 0x1.0p-80);
}
However, if I compile and run my program with various compilers:
$ gcc -v
…
gcc version 4.2.1 (Apple Inc. build 5664)
$ gcc -Wall -std=c99 add.c && ./a.out 
add.c:3: warning: ignoring #pragma STDC FENV_ACCESS
0x1p+0
$ clang -v
Apple clang version 1.5 (tags/Apple/clang-60)
Target: x86_64-apple-darwin10
Thread model: posix
$ clang -Wall -std=c99 add.c && ./a.out 
add.c:3:14: warning: pragma STDC FENV_ACCESS ON is not supported, ignoring
      pragma [-Wunknown-pragmas]
#pragma STDC FENV_ACCESS ON
             ^
1 warning generated.
0x1p+0
It doesn't work! (I expected the result 0x1.0000000000001p0).
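
For reference, the expected value follows from the binary64 format: the exact sum 1 + 2^-80 lies strictly between 1 and the next representable double, 1 + 2^-52, so rounding upward should give 0x1.0000000000001p0, whereas round-to-nearest gives 1. A minimal sketch of that neighbour, assuming `double` is IEEE 754 binary64 (the helper name `next_double_above_one` is only an illustrative choice, not part of the program above):

```c
#include <float.h>

/* Illustrative helper (an assumption, not from the original program):
   returns the smallest binary64 value strictly greater than 1.0.
   For binary64, DBL_EPSILON is 2^-52, the spacing of doubles in [1, 2). */
double next_double_above_one(void) {
    return 1.0 + DBL_EPSILON;   /* exact: 1 + 2^-52 is representable */
}
```

Under that assumption the returned value compares equal to 0x1.0000000000001p0, which is what an upward-rounded 0x1.0p0 + 0x1.0p-80 should print.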

Indeed, the computation was done at compile-time in the default round-to-nearest mode:
$ clang -Wall -std=c99 -S add.c && cat add.s
add.c:3:14: warning: pragma STDC FENV_ACCESS ON is not supported, ignoring
      pragma [-Wunknown-pragmas]
#pragma STDC FENV_ACCESS ON
             ^
1 warning generated.
…
LCPI1_0:
    .quad   4607182418800017408
…
    callq   _fesetround
    movb    $1, %cl
    movsd   LCPI1_0(%rip), %xmm0
    leaq    L_.str(%rip), %rdx
    movq    %rdx, %rdi
    movb    %cl, %al
    callq   _printf
…
L_.str:
    .asciz   "%a\n"
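
The constant in the listing can be decoded to confirm this. The sketch below reinterprets a 64-bit integer as a binary64 value; the helper name `double_from_bits` is hypothetical, and it assumes `double` is 64 bits wide with the same byte order as `uint64_t`, as on x86-64:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the 64 bits of an integer as a double.
   Assumes sizeof(double) == sizeof(uint64_t) and matching byte order.
   memcpy is used because it is the portable way to copy an object
   representation without violating aliasing rules. */
double double_from_bits(uint64_t bits) {
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Under these assumptions, `double_from_bits(4607182418800017408ULL)` is exactly 1.0 (bit pattern 0x3FF0000000000000). A sum rounded upward would have been stored as 4607182418800017409 instead.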
Yes, I did see the warning emitted by each compiler. I understand that turning the applicable optimizations on or off at the scale of the line may be tricky. I would still like, if that was at all possible, to turn them off at the scale of the file, which would be enough to resolve my question.

My question is: what command-line option(s) should I use with GCC or Clang so as to compile a C99 compilation unit that contains code intended to be executed with an FPU rounding mode other than the default?

Digression

While researching this question, I found this GCC C99 compliance page, containing the entry below, that I will just leave here in case someone else finds it funny. Grrrr.
floating-point      |     |
environment access  | N/A | Library feature, no compiler support required.
in <fenv.h>         |     |

Solution
I couldn't find any command line options that would do what you wanted. However, I did find a way to rewrite your code so that even with maximum optimizations (even architectural optimizations), neither GCC nor Clang compute the value at compile time. Instead, this forces them to output code that will compute the value at runtime.

C:

#include <fenv.h>
#include <stdio.h>

#pragma STDC FENV_ACCESS ON

// add with rounding up
double __attribute__ ((noinline)) addrup (double x, double y) {
  int round = fegetround ();
  fesetround (FE_UPWARD);
  double r = x + y;
  fesetround (round);   // restore old rounding mode
  return r;
}

int main(int c, char *v[]){
  printf("%a\n", addrup (0x1.0p0, 0x1.0p-80));
}
This results in these outputs from GCC and Clang, even when using maximum and architectural optimizations:

gcc -S -x c -march=corei7 -O3 (Godbolt GCC):

addrup:
        push    rbx
        sub     rsp, 16
        movsd   QWORD PTR [rsp+8], xmm0
        movsd   QWORD PTR [rsp], xmm1
        call    fegetround
        mov     edi, 2048
        mov     ebx, eax
        call    fesetround
        movsd   xmm1, QWORD PTR [rsp]
        mov     edi, ebx
        movsd   xmm0, QWORD PTR [rsp+8]
        addsd   xmm0, xmm1
        movsd   QWORD PTR [rsp], xmm0
        call    fesetround
        movsd   xmm0, QWORD PTR [rsp]
        add     rsp, 16
        pop     rbx
        ret
.LC2:
        .string "%a\n"
main:
        sub     rsp, 8
        movsd   xmm1, QWORD PTR .LC0[rip]
        movsd   xmm0, QWORD PTR .LC1[rip]
        call    addrup
        mov     edi, OFFSET FLAT:.LC2
        mov     eax, 1
        call    printf
        xor     eax, eax
        add     rsp, 8
        ret
.LC0:
        .long   0
        .long   988807168
.LC1:
        .long   0
        .long   1072693248
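
GCC emits each double here as two 32-bit `.long` words, low word first on little-endian x86-64, so the constants above can be checked by hand. A small sketch, using a hypothetical helper `double_from_words` and assuming a 64-bit IEEE 754 `double` with the same byte order as `uint64_t`:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rebuild a double from the two 32-bit words that
   GCC prints as consecutive .long directives (low word first). */
double double_from_words(uint32_t low, uint32_t high) {
    uint64_t bits = ((uint64_t)high << 32) | low;  /* high word holds sign and exponent */
    double value;
    memcpy(&value, &bits, sizeof value);           /* copy the bit pattern unchanged */
    return value;
}
```

With these assumptions, `.LC0` (0, 988807168) decodes to 0x1p-80 and `.LC1` (0, 1072693248) decodes to 1.0, so both operands reach `addrup` separately and the addition is left for `addsd` to do at runtime.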


clang -S -x c -march=corei7 -O3 (Godbolt Clang):

addrup:                                 # @addrup
        push    rbx
        sub     rsp, 16
        movsd   qword ptr [rsp], xmm1   # 8-byte Spill
        movsd   qword ptr [rsp + 8], xmm0 # 8-byte Spill
        call    fegetround
        mov     ebx, eax
        mov     edi, 2048
        call    fesetround
        movsd   xmm0, qword ptr [rsp + 8] # 8-byte Reload
        addsd   xmm0, qword ptr [rsp]   # 8-byte Folded Reload
        movsd   qword ptr [rsp + 8], xmm0 # 8-byte Spill
        mov     edi, ebx
        call    fesetround
        movsd   xmm0, qword ptr [rsp + 8] # 8-byte Reload
        add     rsp, 16
        pop     rbx
        ret

.LCPI1_0:
        .quad   4607182418800017408     # double 1
.LCPI1_1:
        .quad   4246894448610377728     # double 8.2718061255302767E-25
main:                                   # @main
        push    rax
        movsd   xmm0, qword ptr [rip + .LCPI1_0] # xmm0 = mem[0],zero
        movsd   xmm1, qword ptr [rip + .LCPI1_1] # xmm1 = mem[0],zero
        call    addrup
        mov     edi, .L.str
        mov     al, 1
        call    printf
        xor     eax, eax
        pop     rcx
        ret

.L.str:
        .asciz  "%a\n"




Now for the more interesting part: why does that work?

Well, when GCC and Clang compile code, they try to find expressions whose values can be computed at compile time and replace them with the results. This is known as constant propagation. If you simply move the computation into another function, constant propagation stops at the call, since by itself it does not cross function boundaries.

However, if the compiler sees a call to a function whose body it could substitute in place of the call, it may do so. This is known as function inlining. If inlining can be applied to a function, we say that the function is (surprise) inlinable.

If a function always returns the same result for a given set of inputs, it is considered pure. We also say that it has no side effects (meaning it makes no changes to the environment).

Now, if a function is fully inlinable (meaning that it makes no calls to external libraries, apart from a few defaults that GCC and Clang know about, such as libc and libm) and is pure, then the compilers will apply constant propagation to the function.

In other words, if we don't want them to propagate constants through a function call, we can do one of two things:


1. Make the function appear impure:
   - Use the filesystem
   - Do some bullshit magic with some random input from somewhere
   - Use the network
   - Use some syscall of some sort
   - Call something from an external library unknown to GCC and/or Clang
2. Make the function not fully inlinable:
   - Call something from an external library unknown to GCC and/or Clang
   - Use __attribute__ ((noinline))



Now, that last one is the easiest. As you may have surmised, __attribute__ ((noinline)) marks the function as non-inlinable. Since we can take advantage of this, all we have to do is make another function that does whatever computation we want, mark it with __attribute__ ((noinline)), and then call it.

When the program is compiled, the compilers cannot inline the call and, by extension, cannot propagate the constants through it. The value is therefore computed at runtime with the appropriate rounding mode set.
