Precision of calculations after exponent function


Problem description



I have a question regarding the precision of calculations; it is more about the mathematical theory behind programming.

I have a given float number X and a rounding of that number, X', which is accurate to the 10^(-n) decimal place. Now, I would like to know whether, after calculating the exponential function y = 2^(x), the difference between my number and the rounded number stays at the same level of precision. I mean:

|2^(X)-2^(X')| is at the level of 10^(-n-1)

Solution

Exponentiation magnifies relative error and, by extension, ulp error. Consider this illustrative example:

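#include <math.h>
#include <stdio.h>
int main (void) {
    float x = 0x1.fffffep6f;        /* largest float below 128 */
    printf ("x=%a %15.8e  exp2(x)=%a %15.8e\n", x, x, exp2f (x), exp2f (x));
    x = nextafterf (x, 0.0f);       /* step one ulp toward zero */
    printf ("x=%a %15.8e  exp2(x)=%a %15.8e\n", x, x, exp2f (x), exp2f (x));
}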

This will print something like

x=0x1.fffffep+6 1.27999992e+02  exp2(x)=0x1.ffff4ep+127 3.40280562e+38
x=0x1.fffffcp+6 1.27999985e+02  exp2(x)=0x1.fffe9ep+127 3.40278777e+38

The maximum ulp error in the result will be on the same order of magnitude as 2^(exponent bits) of the floating-point format used. In this particular example, there are 8 exponent bits in an IEEE-754 float, and a 1 ulp difference in the input translates into an 88 ulp difference in the result: the two %a outputs differ by 0xb0 = 176 in their last hex digits, and one float ulp is 2 units of that last digit. The relative difference in the arguments is about 6.0e-8, while the relative difference in the results is about 5.3e-6.

A simplified, intuitive way of thinking about this magnification is that, out of the finite number of bits in the significand (mantissa) of the floating-point argument, some contribute only to the magnitude, and thus the exponent bits, of the result (in the example, these would be the bits representing the integral portion, 127), while the remaining bits contribute to the significand bits of the result.

If you look at it mathematically, if the original argument is x = n*(1+ε), then e^x = e^(n*(1+ε)) = e^n * e^(n*ε) ≈ e^n * (1 + n*ε). So if n ≈ 128 and ε ≈ 1e-7, the expected maximum relative error is around 1.28e-5.
