Numpy arange floating point inconsistency
Problem description
I have a rather simple numpy task: create a long array with each element incremented by 0.001. Of course, np.arange is the answer. I am limiting myself to the default precision (float64). A simple check for the result is that every 1000th element of the array should have the same fractional part. I checked that with a plot (see the blue line in the attached figure), and that is not the case.
import numpy as np
import matplotlib.pyplot as plt

tmin = 212990552.75436273
tmax = 213001474.74473435
tbins = np.arange(tmin, tmax, 0.001)
plt.plot(tbins[::1000] % 1)  # fractional part of every 1000th element
Hmm, I think... the floating point monster strikes again. My start value is large, but not so large that it screws up 64-bit floats. On a hunch, I tried the following, which I think should mean the same thing:
nbins = tmin + np.arange(0, tmax-tmin, 0.001)
plt.plot(nbins[::1000] % 1)
Tada! There's a discrepancy right there. The difference monotonically creeps up to 0.14 over the ~10^7 elements in the array. Note that since tmin is x.xxx36273, I expect all numbers to be of the form x.xxx36273. nbins has that, tbins does not.
In [68]: tbins[-1]
Out[68]: 213001474.60374644
In [69]: nbins[-1]
Out[69]: 213001474.74436274
A call to the numpy gurus out there: why is this happening?
Recommended answer
You are basically correct; use the second method if you care about the exact decimals of the elements of the array.
In your first attempt, tbins = np.arange(tmin, tmax, 0.001), you are mixing large and small floats in a single computation. Conceptually, the value of a given element is the sum of the previous element and 0.001. This previous value is always huge compared to 0.001, so the summation is not very accurate: the result has to be rounded to the nearest float64 near tmin, and part of the 0.001 is lost in that rounding. (For best accuracy in floating point addition, the two operands should be of the same order of magnitude.)
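To put a rough number on that loss, here is a minimal sketch. It does not depend on how np.arange is implemented internally; it only measures what is left of 0.001 after it has been added to a value as large as tmin. The names `effective_step`, `error_per_step` and `n_steps` are mine, and `n_steps` is only a round approximation of the array length in the question.

```python
import numpy as np

tmin = 212990552.75436273
step = 0.001

# Spacing between adjacent float64 values near tmin (about 3e-8).
ulp = np.spacing(tmin)

# The step that survives when 0.001 is added to a number this large:
# the sum is rounded to the nearest representable float64.
effective_step = (tmin + step) - tmin
error_per_step = step - effective_step

# Assumption: roughly the number of elements in the question's array.
n_steps = 10_000_000

print("float64 spacing near tmin:", ulp)
print("error per 0.001 step:     ", error_per_step)
print("drift after n_steps steps:", error_per_step * n_steps)
```

The per-step error is tiny, but it has the same sign on every step, so over about 10^7 elements it adds up to a drift of the same order as the 0.14 seen in the plot.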
In your second attempt, nbins = tmin + np.arange(0, tmax-tmin, 0.001), the summations in the np.arange(0, tmax-tmin, 0.001) part are all very accurate, because the huge number tmin is left out and only added on at the end. This last addition of tmin to each element will have poor accuracy, meaning that in the end, each element will have gone through one operation with poor accuracy. Compare that to the first attempt, where the value of a given element has the accumulated error of all previous elements. That is, the further on in the array an element is located, the worse off it is (as your plot confirms).
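The comparison can be checked numerically with a sketch like the following. It uses a shorter array than the one in the question (`n` is my choice, to keep it quick), and `ibins` is a third variant I am adding for illustration: it scales an integer index by the step, so each element also goes through only one inaccurate addition.

```python
import numpy as np

tmin = 212990552.75436273
step = 0.001
n = 1_000_000  # assumption: shorter than the question's array, for speed

# First approach: the large start value and the small step are mixed in arange.
tbins = np.arange(tmin, tmin + n * step, step)

# Second approach: build the small offsets first, then add tmin once.
nbins = tmin + np.arange(0, n * step, step)

# Illustrative variant: integer index times step, plus tmin.
ibins = tmin + step * np.arange(n)

# Compare the same late element of each array against a directly computed value.
k = min(len(tbins), len(nbins), len(ibins)) - 1
reference = tmin + k * step

print("arange(tmin, ...) error:      ", abs(tbins[k] - reference))
print("tmin + arange(0, ...) error:  ", abs(nbins[k] - reference))
print("tmin + step*arange(n) error:  ", abs(ibins[k] - reference))
```

On the setups I would expect this to run on, the first error grows with the index while the other two stay near the float64 spacing around tmin, which is the behaviour described above.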