确定是否在C/C ++中对浮点运算进行了舍入 [英] Determine if rounding occurred for a floating-point operation in C/C++
问题描述
我正在尝试提出一种有效的方法来确定何时/将对IEEE-754操作进行舍入.不幸的是,我不能简单地检查硬件标志.它必须在几个不同的平台上运行.
I am trying to come up with an efficient method to determine when rounding will/did occur for IEEE-754 operations. Unfortunately I am not able to simply check hardware flags. It would have to run on a few different platforms.
我想到的一种方法是在不同的舍入模式下执行运算以比较结果.
One of the approaches I thought of was to perform the operation in different rounding modes to compare the results.
添加示例:
double result = operand1 + operand2;
// save rounding mode
int savedMode = fegetround();
fesetround(FE_UPWARD);
double upResult = operand1 + operand2;
fesetround(FE_DOWNWARD);
double downResult = operand1 + operand2;
// restore rounding mode
fesetround(savedMode);
return (result != upResult) || (result != downResult);
但这显然效率不高,因为它必须执行3次操作.
but this is obviously inefficient because it's having to perform the operation 3 times.
推荐答案
您的示例不一定会通过优化获得正确的结果
级别-O1
或更高.参见以下 Godbolt链接:
编译器仅生成一个加法vaddsd
.
Your example does not necessarily give the right results with optimization
levels -O1
or higher. See this Godbolt link:
only one addition vaddsd
is generated by the compiler.
经过优化
级别-O0
的汇编看起来不错,但这将导致代码效率低下.
此外,调用fegetround
和fesetround
相对昂贵,
与一些浮点运算的成本相比.
With optimization
level -O0
the assembly looks ok, but that would lead to inefficient code.
Moreover calling fegetround
and fesetround
is relatively expensive,
compared to the cost of a few floating point operations.
下面的(自我解释)代码可能是一个有趣的选择. 它使用著名的算法2Sum和2ProdFMA.在没有硬件fma或fma仿真的系统上,可以使用2Prod算法代替2ProdFMA, 例如,参见准确的浮点乘积和幂, 由Stef Graillat撰写.
The (self explaining) code below is probably an interesting alternative. It uses the well-known algorithms 2Sum and 2ProdFMA. On systems without hardware fma, or fma emulation, you can use the 2Prod algorithm instead of 2ProdFMA, See, for example, Accurate Floating Point Product and Exponentiation, by Stef Graillat.
/*
gcc -m64 -Wall -O3 -march=haswell round_ex.c -lm
or with fma emulation on systems without hardware fma support, for example:
gcc -m64 -Wall -O3 -march=nehalem round_ex.c -lm
*/
#include<math.h>
#include<float.h>
#include<stdio.h>
int add_is_not_exact(double operand1, double operand2){
double a = operand1;
double b = operand2;
double s, t, a_1, b_1, d_a, d_b;
/* Algorithm 2Sum computes s and t such that a + b = s + t, exactly. */
/* Here t is the error of the floating-point addition s = a + b. */
/* See, for example, On the robustness of the 2Sum and Fast2Sum algorithms */
/* by Boldo, Graillat, and Muller */
s = a + b;
a_1 = s - b;
b_1 = s - a_1;
d_a = a - a_1;
d_b = b - b_1;
t = d_a + d_b;
return (t!=0.0);
}
int sub_is_not_exact(double operand1, double operand2){
return add_is_not_exact(operand1, -operand2);
}
int mul_is_not_exact(double operand1, double operand2){
double a = operand1;
double b = operand2;
double s, t;
/* Algorithm 2ProdFMA computes s and t such that a * b = s + t, exactly. */
/* Here t is the error of the floating-point multiplication s = a * b. */
/* See, for example, Accurate Floating Point Product and Exponentiation */
/* by Graillat */
s = a * b;
t = fma(a, b, -s);
if (s!=0) return (t!=0.0); /* No underflow of a*b */
else return (a!=0.0)&&(b!=0.0); /* Underflow: inexact if s=0, but (a!=0.0)&&(b!=0.0) */
}
int div_is_not_exact(double operand1, double operand2){
double a = operand1;
double b = operand2;
double s, t;
s = a / b;
t = fma(s, b, -a); /* fma(x,y,z) computes x*y+z with infinite intermediate precision */
return (t!=0.0);
}
int main(){
printf("add_is_not_exact(10.0, 1.0) = %i\n", add_is_not_exact(10.0, 1.0));
printf("sub_is_not_exact(10.0, 1.0) = %i\n", sub_is_not_exact(10.0, 1.0));
printf("mul_is_not_exact( 2.5, 2.5) = %i\n", mul_is_not_exact( 2.5, 2.5));
printf("div_is_not_exact( 10, 2.5) = %i\n", div_is_not_exact( 10, 2.5));
printf("add_is_not_exact(10.0, 0.1) = %i\n", add_is_not_exact(10.0, 0.1));
printf("sub_is_not_exact(10.0, 0.1) = %i\n", sub_is_not_exact(10.0, 0.1));
printf("mul_is_not_exact( 2.6, 2.6) = %i\n", mul_is_not_exact( 2.6, 2.6));
printf("div_is_not_exact( 10, 2.6) = %i\n", div_is_not_exact( 10, 2.6));
printf("\n0x1.0p-300 = %20e, 0x1.0p-600 = %20e \n", 0x1.0p-300 , 0x1.0p-600 );
printf("mul_is_not_exact( 0x1.0p-300, 0x1.0p-300) = %i\n", mul_is_not_exact( 0x1.0p-300, 0x1.0p-300));
printf("mul_is_not_exact( 0x1.0p-600, 0x1.0p-600) = %i\n", mul_is_not_exact( 0x1.0p-600, 0x1.0p-600));
}
输出为:
$ ./a.out
add_is_not_exact(10.0, 1.0) = 0
sub_is_not_exact(10.0, 1.0) = 0
mul_is_not_exact( 2.5, 2.5) = 0
div_is_not_exact( 10, 2.5) = 0
add_is_not_exact(10.0, 0.1) = 1
sub_is_not_exact(10.0, 0.1) = 1
mul_is_not_exact( 2.6, 2.6) = 1
div_is_not_exact( 10, 2.6) = 1
0x1.0p-300 = 4.909093e-91, 0x1.0p-600 = 2.409920e-181
mul_is_not_exact( 0x1.0p-300, 0x1.0p-300) = 0
mul_is_not_exact( 0x1.0p-600, 0x1.0p-600) = 1
如评论中所述,也可以直接阅读 控制和状态寄存器:
As noted in the comments, it is also possible to directly read the control and status register:
#include <fenv.h>
#pragma STDC FENV_ACCESS ON
int add_is_not_exact_v2(double a, double b)
{
fexcept_t excepts;
feclearexcept(FE_ALL_EXCEPT);
double c = a+b;
int tst = fetestexcept(FE_INEXACT);
return (tst!=0);
}
但是,请注意,这可能不适用于编译器优化级别-O1或更高.
在这种情况下,addsd
双重加法指令有时会被完全优化掉,
导致错误的结果.
例如,使用gcc 8.2 gcc -m64 -O1 -march=nehalem
:
Note, however, that this may not work with compiler optimization level -O1 or higher.
In that case the addsd
double add instruction is sometimes optimized away completely,
leading to wrong results.
For example, with gcc 8.2 gcc -m64 -O1 -march=nehalem
:
add_is_not_exact_v2:
sub rsp, 8
mov edi, 61
call feclearexcept
mov edi, 32
call fetestexcept
test eax, eax
setne al
movzx eax, al
add rsp, 8
ret
具有优化级别-O0
,具有2个函数调用,并且相对
扩展指令来修改控制和状态寄存器,这不一定是最有效的解决方案.
With optimization level -O0
, with 2 function calls, and with relatively
expansive instructions to modify the control and status register, this is not necessarily the most efficient solution.
这篇关于确定是否在C/C ++中对浮点运算进行了舍入的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!