Sum of two "np.longdouble"s yields big numerical error


Question

Good morning,

I'm reading two numbers from a FITS file (representing the integer and fractional parts of a single number), converting them to long doubles (128-bit on my machine), and then summing them.

The result is not as precise as I would expect from 128-bit floats. Here is the code:

a_int = np.longdouble(read_header_key(fits_file, 'I'))
print "I %.25f" % a_int, type(a_int)
a_float = np.longdouble(read_header_key(fits_file, 'F'))
print "F %.25f" % a_float, a_float.dtype
a = a_int + a_float
print "TOT %.25f" % a, a.dtype

This is the output I get:

I 55197.0000000000000000000000000 <type 'numpy.float128'>
F 0.0007660185200000000195833 float128
TOT 55197.0007660185219720005989075 float128

The result departs from what I would expect (55197.0007660185200000000195833) after just 11 decimal places (16 significant digits in total). I would expect much better precision from 128-bit floats. What am I doing wrong?

This result was reproduced on a Mac and on a 32-bit Linux machine (in that case the dtype was float96, but the values were exactly the same).

Thanks in advance for your help!

Matteo

Recommended answer

The problem lies in how you print the np.longdouble. When you format with %f, Python converts the value to a Python float (64 bits) before printing, so the extra digits are lost in the output even though the sum itself is computed in long double precision.

For example:
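One way to see the digits that %f hides is to let NumPy do the formatting. The following is a minimal sketch, assuming NumPy 1.14 or later (where np.format_float_positional is available) and using the same example values as this answer. The variable names are illustrative, and the exact digits printed depend on how wide long double is on your platform.

```python
import numpy as np

# Same example values as in the answer.
a_int = np.longdouble(55197)
a_float = np.longdouble(76601852) / np.longdouble(10**11)
b = a_int + a_float

# %-formatting goes through float(), so the result is identical
# to formatting a plain 64-bit double.
via_percent = '%.25f' % b
via_float = '%.25f' % float(b)

# NumPy formats the long double directly, without the detour
# through a Python float. unique=False requests exactly
# `precision` digits after the decimal point.
via_numpy = np.format_float_positional(b, precision=25, unique=False)

print(via_percent)
print(via_numpy)
```

On platforms where long double is wider than double, the two printed lines should differ in their trailing digits. Where long double is the same as double, they will match.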

>>> a_int = np.longdouble(55197)
>>> a_float = np.longdouble(76601852) / 10**11
>>> b = a_int + a_float
>>> '%.25f' % b
'55197.0007660185219720005989075'
>>> '%.25f' % float(b)
'55197.0007660185219720005989075'
>>> b * 10**18
5.5197000766018519998e+22

Note that on my machine I only get a bit more precision with longdouble than with an ordinary double (roughly 20 significant digits instead of 15). So it may be worth checking whether the decimal module is better suited to your application. Decimal provides decimal floating-point arithmetic with user-configurable precision, so decimal values such as the ones above are stored and summed exactly.
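Both points can be checked with a short sketch. It assumes the two header values are available as strings (how they come out of the FITS header is not shown here), and the variable names are made up for illustration. np.finfo reports how many decimal digits each float type can hold, and Decimal sums the two parts without any binary rounding.

```python
from decimal import Decimal, getcontext

import numpy as np

# Approximate number of reliable decimal digits for each type.
# The longdouble figure is platform dependent.
double_digits = np.finfo(np.float64).precision
longdouble_digits = np.finfo(np.longdouble).precision
print(double_digits, longdouble_digits)

# Assumption: the integer and fractional parts are read as strings.
# Building a Decimal from a string keeps the decimal value exactly;
# building it from a float would carry over the binary rounding error.
getcontext().prec = 40  # significant digits used for arithmetic
int_part = Decimal("55197")
frac_part = Decimal("0.00076601852")
total = int_part + frac_part
print(total)  # 55197.00076601852
```

The precision of 40 digits is an arbitrary choice for this sketch. The default context precision of 28 digits would also be enough for these values.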
