Sum of two "np.longdouble"s yields big numerical error
Problem description
Good morning,
I'm reading two numbers from a FITS file (representing the integer and fractional parts of a single number), converting them to long doubles (128-bit on my machine), and then summing them.
The result is not as precise as I would expect from using 128-bit floats. Here is the code:
import numpy as np

a_int = np.longdouble(read_header_key(fits_file, 'I'))
print "I %.25f" % a_int, type(a_int)
a_float = np.longdouble(read_header_key(fits_file, 'F'))
print "F %.25f" % a_float, a_float.dtype
a = a_int + a_float
print "TOT %.25f" % a, a.dtype
Here is the output I get:
I 55197.0000000000000000000000000 <type 'numpy.float128'>
F 0.0007660185200000000195833 float128
TOT 55197.0007660185219720005989075 float128
The result departs from what I would expect (55197.0007660185200000000195833) after just 11 decimal digits (16 significant digits in total). I would expect much better precision from 128-bit floats. What am I doing wrong?
This result was reproduced on a Mac and on a 32-bit Linux machine (in that case the dtype was float96, but the values were exactly the same).
Thanks in advance for your help!
Matteo
Recommended answer
The problem lies in how you print the np.longdouble. When you format with %f, Python converts the value to a Python float (a 64-bit double) before printing, so any digits beyond double precision are dropped from the output.
For example:
>>> a_int = np.longdouble(55197)
>>> a_float = np.longdouble(76601852) / 10**11
>>> b = a_int + a_float
>>> '%.25f' % b
'55197.0007660185219720005989075'
>>> '%.25f' % float(b)
'55197.0007660185219720005989075'
>>> b * 10**18
5.5197000766018519998e+22
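One way to see the extra digits without the multiply-by-10**18 trick is to let NumPy format the value itself. The following is only a sketch: it assumes Python 3 and NumPy 1.14 or later (where np.format_float_positional is available), and format_longdouble is a hypothetical helper name, not something from the question. How many extra digits show up depends on the platform's long double; on some systems it is the same as a 64-bit double.

```python
import numpy as np

def format_longdouble(x, digits=25):
    """Hypothetical helper: format x with `digits` decimals, no cast to float."""
    # unique=False prints the stored value out to `digits` decimals
    return np.format_float_positional(x, precision=digits, unique=False)

a_int = np.longdouble(55197)
a_float = np.longdouble(76601852) / 10**11
b = a_int + a_float

# The % operator goes through float(b), so both lines show the same digits.
print('%.25f' % b)
print('%.25f' % float(b))

# Formatting the longdouble directly keeps whatever extra digits it carries.
print(format_longdouble(b))

# Approximate number of decimal digits the platform's long double can hold.
print(np.finfo(np.longdouble).precision)
```

If the last line reports the same value as np.finfo(np.float64).precision, long double brings no extra precision on that machine.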
Note that on my machine, longdouble gives only a little more precision than an ordinary double (about 20 significant digits instead of about 15). It may therefore be worth checking whether the decimal module suits your application better: Decimal stores decimal values exactly and lets you set the working precision as high as you need.
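A minimal sketch of the Decimal approach, assuming the two header values can be obtained as text rather than as floats (the string literals below are stand-ins for whatever the FITS header actually holds, and the variable names are made up for illustration):

```python
from decimal import Decimal, getcontext

# Working precision in significant digits; 40 is an arbitrary choice here.
getcontext().prec = 40

# Hypothetical header values, kept as text so no binary rounding happens.
int_part_text = '55197'
frac_part_text = '0.00076601852'

total = Decimal(int_part_text) + Decimal(frac_part_text)
print(total)

# Building a Decimal from a float copies the float's binary rounding error,
# so the two values below are not equal.
from_float = Decimal(0.00076601852)
print(from_float == Decimal(frac_part_text))
```

The key point is to construct each Decimal from a string; once a value has passed through a Python float, the rounding error is already baked in.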