Why does "numpy.any" have no short-circuit mechanism?
Problem description
I don't understand why such a basic optimization has not yet been done:
In [1]: one_million_ones = np.ones(10**6)
In [2]: %timeit one_million_ones.any()
100 loops, best of 3: 693µs per loop
In [3]: ten_millions_ones = np.ones(10**7)
In [4]: %timeit ten_millions_ones.any()
10 loops, best of 3: 7.03 ms per loop
The whole array is scanned, even though the result is already evident from the first item.
Recommended answer
It's an unfixed performance regression, tracked as NumPy issue 3446. There actually is short-circuiting logic, but a change to the ufunc.reduce machinery introduced an unnecessary chunk-based outer loop around the short-circuiting logic, and that outer loop doesn't know how to short-circuit. You can see some explanation of the chunking machinery here.
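As a workaround for the missing early exit, the chunking can be done by hand in Python, stopping at the first chunk that reports a truthy value. The following is only a minimal sketch: `chunked_any` and its `chunk_size` default are illustrative names and values, not part of NumPy, and whether it pays off depends on where the first truthy element sits.

```python
import numpy as np

def chunked_any(arr, chunk_size=65536):
    """Hypothetical helper: return True as soon as one chunk contains a truthy value."""
    # reshape(-1) gives a 1-D view when possible; it may copy non-contiguous input.
    flat = np.asarray(arr).reshape(-1)
    for start in range(0, flat.size, chunk_size):
        # Each chunk is still scanned by NumPy, but later chunks are skipped
        # once a truthy value has been found.
        if flat[start:start + chunk_size].any():
            return True
    return False
```

For an array whose first element is truthy, only the first chunk is inspected; for an all-false array, every chunk is scanned and the Python loop adds overhead.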
The short-circuiting effects wouldn't have shown up in your test even without the regression, though. First, you're timing the array creation, and second, I don't think they ever put in the short-circuit logic for any input dtype but boolean. From the discussion, it sounds like the details of the ufunc reduction machinery behind numpy.any would have made that difficult.
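To check both points on a given NumPy build, the arrays can be created outside the timed statement and the float and boolean dtypes compared side by side. This is a rough sketch using the standard `timeit` module; `time_any` is a hypothetical helper name, and no particular timings are implied because they vary by version and machine.

```python
import timeit
import numpy as np

def time_any(n=10**6, repeats=5):
    """Hypothetical benchmark helper: best-of-`repeats` time in seconds for .any()."""
    cases = {
        "float64, all ones": np.ones(n),
        "bool, all True": np.ones(n, dtype=bool),
        "bool, all False": np.zeros(n, dtype=bool),
    }
    # The arrays are built before timing starts, so only .any() is measured.
    return {
        name: min(timeit.repeat(arr.any, number=1, repeat=repeats))
        for name, arr in cases.items()
    }

if __name__ == "__main__":
    for name, seconds in time_any().items():
        print(f"{name}: {seconds * 1e6:.1f} us")
```

If short-circuiting were effective for booleans, the "bool, all True" case would be expected to finish much faster than "bool, all False".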
The discussion does bring up the surprising point that the argmin and argmax methods appear to short-circuit for boolean input. A quick test shows that as of NumPy 1.12 (not quite the most recent version, but the version currently on Ideone), x[x.argmax()] short-circuits, and it outcompetes x.any() and x.max() for 1-dimensional boolean input no matter whether the input is small or large and no matter whether the short-circuiting pays off. Weird!
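The x[x.argmax()] trick works because argmax returns the index of the first maximum: for a boolean array that is the first True if one exists, and index 0 otherwise. A small sketch wrapping it follows; `any_via_argmax` is a hypothetical name, and the speed advantage reported above was observed on NumPy 1.12, so it is an assumption that it holds on other versions.

```python
import numpy as np

def any_via_argmax(x):
    """Hypothetical helper: same result as x.any() for 1-D boolean arrays."""
    x = np.asarray(x)
    if x.dtype != np.bool_ or x.ndim != 1:
        raise TypeError("expected a 1-D boolean array")
    if x.size == 0:
        # argmax raises ValueError on empty input, while any() of empty is False.
        return False
    # argmax gives the first True if there is one, otherwise index 0 (a False).
    return bool(x[x.argmax()])
```

The dtype and dimension checks keep the helper from silently giving a different answer than any() on non-boolean or multi-dimensional input.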