Floating point arithmetic and machine epsilon


Problem description

I'm trying to compute an approximation of the epsilon value for the float type (and I know it's already in the standard library).

The epsilon values on this machine are (printed with some approximation):

 FLT_EPSILON = 1.192093e-07
 DBL_EPSILON = 2.220446e-16
LDBL_EPSILON = 1.084202e-19

FLT_EVAL_METHOD is 2, so everything is evaluated in long double precision, and float, double and long double are 32, 64 and 96 bits wide.
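As a cross-check on the values above, the following sketch prints the evaluation method and derives each epsilon from the significand width reported in <float.h>. The helper names epsilon_from_digits and print_float_environment are made up for illustration, and the code assumes a binary floating point format (FLT_RADIX == 2).

```c
#include <float.h>
#include <stdio.h>

// Hypothetical helper: epsilon of a binary format with `digits` significand
// bits, that is 2^(1 - digits). Assumes FLT_RADIX == 2.
long double epsilon_from_digits(int digits)
{
    long double eps = 1.0L;
    for (int i = 1; i < digits; ++i)
        eps /= 2; // exact: halving only changes the exponent
    return eps;
}

// Hypothetical helper: print the evaluation method and, for each type,
// the library epsilon next to the one derived from the significand width.
void print_float_environment(void)
{
    printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
    printf(" FLT_EPSILON = %e, 2^(1-%d) = %Le\n",
           FLT_EPSILON, FLT_MANT_DIG, epsilon_from_digits(FLT_MANT_DIG));
    printf(" DBL_EPSILON = %e, 2^(1-%d) = %Le\n",
           DBL_EPSILON, DBL_MANT_DIG, epsilon_from_digits(DBL_MANT_DIG));
    printf("LDBL_EPSILON = %Le, 2^(1-%d) = %Le\n",
           LDBL_EPSILON, LDBL_MANT_DIG, epsilon_from_digits(LDBL_MANT_DIG));
}
```

Calling print_float_environment() from main shows the two columns side by side. With a 64-bit significand, 2^-63 is about 1.084202e-19, which matches the LDBL_EPSILON listed above.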

I tried to get an approximation of the value by starting from 1 and dividing it by 2 until it becomes too small, doing all operations with the float type:

#include <stdio.h>

int main(void)
{
    float floatEps = 1;

    // Halve floatEps until adding half of it to 1 no longer changes the sum.
    while (1 + floatEps / 2 != 1)
        floatEps /= 2;

    printf("float epsilon = %e\n", floatEps);
    return 0;
}

The output is not what I was looking for:

float epsilon = 1.084202e-19

Intermediate operations are done with the greatest precision (due to the value of FLT_EVAL_METHOD), so this result seems legit.

However, this:

// 2.0 is a double literal
while ((float) (1 + floatEps / 2.0) != 1)
    floatEps /= 2;

gives this output, which is the right one:

float epsilon = 1.192093e-07

but this one:

// no double literals
while ((float) (1 + floatEps / 2) != 1)
    floatEps /= 2;

leads again to a wrong result, as the first one:

float epsilon = 1.084202e-19

These last two versions should be equivalent on this platform. Is this a compiler bug? If not, what's happening?
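One way to narrow this down is to evaluate both loop conditions side by side for every power of two the loop visits. The sketch below does that; casts_agree and first_disagreement are hypothetical names, and it has not been run on the gcc 4.4.5 setup described here, so treat it as a diagnostic idea rather than a confirmed reproduction.

```c
// Hypothetical check: returns 1 if the two loop conditions give the same
// answer for this eps, 0 otherwise.
int casts_agree(float eps)
{
    int int_literal_differs = ((float)(1 + eps / 2) != 1);      // 2 is an int literal
    int double_literal_differs = ((float)(1 + eps / 2.0) != 1); // 2.0 is a double literal
    return int_literal_differs == double_literal_differs;
}

// Returns the first power of two (starting from 1 and halving) at which the
// two conditions disagree, or 0 if they never do.
float first_disagreement(void)
{
    for (float eps = 1.0f; eps > 1e-30f; eps /= 2)
        if (!casts_agree(eps))
            return eps;
    return 0.0f;
}
```

If the casts are applied as written, first_disagreement() should return 0, because both expressions round 1 + eps/2 to the same float for every power of two. A non-zero return value would point at the value of eps where one of the casts is not taking effect.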

Code is compiled with:

gcc -O0 -std=c99 -pedantic file.c

The gcc version is pretty old, but I'm at university and I can't update it:

$ gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-8'
--with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.4
--enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls
--enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc
--enable-targets=all --with-arch-32=i586 --with-tune=generic
--enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu
--target=i486-linux-gnu
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-8)

The current version of gcc, 4.7, behaves correctly on my home computer. There are also comments saying that different versions give different results.

After some answers and comments clarified what is behaving as expected and what is not, I changed the question a little to make it clearer.

Solution

The compiler is allowed to evaluate float expressions in any higher precision it likes, so it looks like the first expression is evaluated in long double precision. In the second expression, the cast forces the result to be rounded back down to float.

In answer to some of your additional questions and the discussion below: you are basically looking for the smallest non-zero difference from 1 for some floating point type. Depending on the setting of FLT_EVAL_METHOD, a compiler may decide to evaluate all floating point expressions in a higher precision than the types involved. On a Pentium, the internal registers of the floating point unit are traditionally 80 bits wide, and it is convenient to use that precision for all the smaller floating point types. So in the end your test depends on the precision of the != comparison. In the absence of an explicit cast, the precision of this comparison is determined by your compiler, not by your code. With the explicit cast you scale the comparison down to the type you desire.
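Another way to pin down the precision of the comparison is to store the sum in a float object before testing it. The sketch below uses a volatile float for that; the function names are hypothetical, and it assumes the compiler performs the volatile store as a genuine float write, which has not been checked on the gcc 4.4.5 installation from the question.

```c
#include <float.h>

// Hypothetical helper: approximate the float epsilon by halving, rounding
// each sum to float through a volatile object before comparing.
float float_epsilon_by_halving(void)
{
    float eps = 1.0f;

    for (;;) {
        // The volatile store has to produce an actual float value, so any
        // extra precision of the intermediate sum is discarded here.
        volatile float sum = 1.0f + eps / 2.0f;
        if (sum == 1.0f)
            break;
        eps /= 2.0f;
    }
    return eps;
}

// Hypothetical helper: returns 1 if the loop result equals FLT_EPSILON.
int matches_library_epsilon(void)
{
    return float_epsilon_by_halving() == FLT_EPSILON;
}
```

The comparison now reads back a value that has already been rounded to float, so its outcome no longer depends on how wide the intermediate registers are.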

As you confirmed, your compiler has set FLT_EVAL_METHOD to 2, so it uses the highest precision for any floating point calculation.

As a conclusion to the discussion below, we are confident in saying that there is a bug relating to the implementation of the FLT_EVAL_METHOD=2 case in gcc prior to version 4.5, and that it is fixed as of at least version 4.6. If the integer constant 2 is used in the expression instead of the floating point constant 2.0, the cast to float is omitted in the generated assembly. It is also worth noting that from optimization level -O1 upwards the right results are produced on these older compilers, but the generated assembly is quite different and contains only a few floating point operations.
