How to deal with excess precision in floating-point computations?

Problem description

In my numerical simulation I have code similar to the following snippet

double x;
do {
  x = /* some computation */;
} while (x <= 0.0);
/* some algorithm that requires x to be (precisely) larger than 0 */

With certain compilers (e.g. gcc) on certain platforms (e.g. linux, x87 math) it is possible that x is computed in higher than double precision ("with excess precision"). (Update: When I talk of precision here, I mean precision /and/ range.) Under these circumstances it is conceivable that the comparison (x <= 0) returns false even though the next time x is rounded down to double precision it becomes 0. (And there's no guarantee that x isn't rounded down at an arbitrary point in time.)

Is there any way to perform this comparison that


  • is portable,

  • works for code that gets inlined,

  • has no performance impact, and

  • does not exclude some arbitrary range (0, eps)?

I tried to use (x < std::numeric_limits<double>::denorm_min()) but that seemed to significantly slow down the loop when working with SSE2 math. (I know that denormals can slow down a computation, but I didn't expect them to be slower to just move around and compare.)
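
For reference, a minimal sketch of that attempt; the helper name and the stand-in computation are hypothetical, only the loop condition is taken from the text above:

#include <cstdlib>
#include <limits>

// Hypothetical helper; the real computation from the question is not shown.
double next_positive_sample()
{
    double x;
    do {
        x = std::rand() / (double)RAND_MAX - 0.5;               /* stand-in computation */
    } while (x < std::numeric_limits<double>::denorm_min());    /* accept only x >= denorm_min */
    /* x is at least denorm_min here, so rounding it to double cannot yield 0 */
    return x;
}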

Update: An alternative is to use volatile to force x into memory before the comparison, e.g. by writing

} while (*((volatile double*)&x) <= 0.0);

However, depending on the application and the optimizations applied by the compiler, this solution can introduce a noticeable overhead too.

Update: The problem with any tolerance is that it's quite arbitrary, i.e. it depends on the specific application or context. I'd prefer to just do the comparison without excess precision, so that I don't have to make any additional assumptions or introduce some arbitrary epsilons into the documentation of my library functions.

Recommended answer

As Arkadiy stated in the comments, an explicit cast ((double)x) <= 0.0 should work - at least according to the standard.

C99:TC3, 5.2.4.2.2 §8:

Except for assignment and cast (which remove all extra range and precision), the values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. [...]
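
A minimal sketch of how that cast might be applied to the loop from the question, assuming the compiler actually honors the quoted paragraph; the helper name and the stand-in computation are hypothetical:

#include <cstdlib>

// Hypothetical helper; the cast in the loop condition is the point of the example.
double next_positive_sample()
{
    double x;
    do {
        x = std::rand() / (double)RAND_MAX - 0.5;   /* stand-in computation */
    } while ((double)x <= 0.0);                      /* cast removes any excess range and precision */
    return x;
}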


If you're using GCC on x86, you can use the flags -mpc32, -mpc64 and -mpc80 to set the precision of floating-point operations to single, double and extended double precision.
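
For example, a build forcing double-precision rounding of x87 operations might look like the following (the source file name is hypothetical):

gcc -mpc64 -O2 simulation.c -o simulation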
