Numpy rounds in a different way than python

Question

The code

import numpy as np
a = 5.92270987499999979065
print(round(a, 8))
print(round(np.float64(a), 8))

gives

5.92270987
5.92270988

Any idea why?

Found nothing relevant in numpy sources.

Update:
I know that the proper way to deal with this problem is to construct programs in such a way that this difference is irrelevant. Which I do. I stumbled into it in regression testing.

Update2:
Regarding the @VikasDamodar comment. One shouldn't trust the repr() function:

>>> np.float64(5.92270987499999979065)
5.922709875
>>> '%.20f' % np.float64(5.92270987499999979065)
'5.92270987499999979065'

Update3:
Tested on python3.6.0 x32, numpy 1.14.0, win64. Also on python3.6.4 x64, numpy 1.14.0, debian.

Update4:
Just to be sure:

import numpy as np
a = 5.92270987499999979065
print('%.20f' % round(a, 8))
print('%.20f' % round(np.float64(a), 8))

5.92270987000000026512
5.92270988000000020435

Update5:
The following code demonstrates on which stage the difference takes place without using str:

>>> np.float64(a) - 5.922709874
1.000000082740371e-09
>>> a - 5.922709874
1.000000082740371e-09
>>> round(np.float64(a), 8) - 5.922709874
6.000000496442226e-09
>>> round(a, 8) - 5.922709874
-3.999999442783064e-09

Clearly, before applying 'round' they were the same number.

Update6:
In contrast to @user2357112's answer, np.round is roughly 4 times slower than round:

%%timeit a = 5.92270987499999979065
round(a, 8)

1.18 µs ± 26.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)  

%%timeit a = np.float64(5.92270987499999979065)
round(a, 8)

4.05 µs ± 43.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Also, in my opinion, np.round did a better job of rounding to the nearest even than the builtin round: I originally got this 5.92270987499999979065 number by dividing 11.84541975 by two.

Recommended answer

float.__round__ takes special care to produce correctly-rounded results, using a correctly-rounded double-to-string algorithm.

NumPy does not. The NumPy docs mention that

Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R9] and errors introduced when scaling by powers of ten.

This is faster, but produces more rounding error. It leads to errors like what you've observed, as well as errors where numbers even more unambiguously below the cutoff still get rounded up:

>>> import numpy
>>> from decimal import Decimal
>>> x = 0.33499999999999996
>>> x
0.33499999999999996
>>> x < 0.335
True
>>> x < Decimal('0.335')
True
>>> x < 0.67/2
True
>>> round(x, 2)
0.33
>>> numpy.round(x, 2)
0.34000000000000002
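
The "scaling by powers of ten" effect mentioned in the docs can be reproduced in plain Python. The following is only a minimal sketch of the scale, round, and divide idea, not NumPy's actual implementation; `scaled_round` is a hypothetical helper introduced here for illustration. It shows that multiplying the question's number by 1e8 already rounds it up to an exact tie, which round-half-to-even then resolves upward:

```python
from decimal import Decimal

def scaled_round(x, decimals):
    """Hypothetical helper: scale up, round to an integer, scale back down."""
    scale = float(10 ** decimals)
    scaled = x * scale             # the product is itself rounded to a double
    return round(scaled) / scale   # one-argument round() is round-half-to-even

a = 5.92270987499999979065
scaled = a * 1e8

print(Decimal(a))          # exact value stored in the double, just below ...875
print(Decimal(scaled))     # the scaled product has become an exact .5 tie
print(scaled_round(a, 8))  # the tie goes to the even neighbour, ...988
print(round(a, 8))         # float.__round__ works from the unscaled value
```

Doubles near 5.9e8 are spaced about 1.2e-7 apart, so the product cannot represent a value 2e-8 below the half and lands on 592270987.5 instead. The rounding step then sees a tie that was never present in the original number.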


You're getting a slower time for NumPy's rounding, but that doesn't have anything to do with which rounding algorithm is slower. Any time comparison between NumPy and regular Python math will boil down to the fact that NumPy is optimized for whole-array operations. Doing math on single NumPy scalars has a lot of overhead, but rounding an entire array with numpy.round easily beats rounding a list of floats with round:

In [6]: import numpy

In [7]: l = [i/7 for i in range(100)]

In [8]: a = numpy.array(l)

In [9]: %timeit [round(x, 1) for x in l]
59.6 µs ± 408 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [10]: %timeit numpy.round(a, 1)
5.27 µs ± 145 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

As for which one is more accurate, that's definitely float.__round__. Your number is closer to 5.92270987 than to 5.92270988, and it's round-ties-to-even, not round-everything-to-even. There's no tie here.
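
One way to check the "no tie" claim independently is with the standard `decimal` module, which can hold the double's exact value. This is a small sketch using the number from the question; the variable names are chosen here for illustration:

```python
from decimal import Decimal, ROUND_HALF_EVEN

a = 5.92270987499999979065

exact = Decimal(a)                  # exact binary value of the double, no rounding
midpoint = Decimal("5.922709875")   # halfway between ...987 and ...988

print(exact < midpoint)             # strictly below the midpoint, so not a tie

# Rounding the exact value to 8 places with round-half-to-even
rounded = exact.quantize(Decimal("1e-8"), rounding=ROUND_HALF_EVEN)
print(rounded)
```

Because the stored value is strictly below the midpoint, the tie-breaking rule never comes into play, and the result agrees with the builtin round.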
