Why doesn't Numba improve this iteration ...?
Question
I am trying out Numba to speed up a function that computes a minimum conditional probability of joint occurrence.
import numpy as np
from numba import double
from numba.decorators import jit, autojit

X = np.random.random((100,2))

def cooccurance_probability(X):
    P = X.shape[1]
    CS = np.sum(X, axis=0)                  # Column sums
    D = np.empty((P, P), dtype=np.float)    # Return matrix
    for i in range(P):
        for j in range(P):
            D[i, j] = (X[:,i] * X[:,j]).sum() / max(CS[i], CS[j])
    return D

cooccurance_probability_numba = autojit(cooccurance_probability)
However, I am finding that the performance of cooccurance_probability and cooccurance_probability_numba is pretty much the same.
%timeit cooccurance_probability(X)
1 loops, best of 3: 302 ms per loop
%timeit cooccurance_probability_numba(X)
1 loops, best of 3: 307 ms per loop
Why is this? Could it be due to the NumPy element-by-element operation?
I am following this example:
http://nbviewer.ipython.org/github/ellisonbg/talk-sicm2-2013/blob/master/NumbaCython.ipynb
[Note: I could halve the execution time due to the symmetric nature of the problem, but that isn't my main concern]
Recommended answer
My guess is that you're hitting the object layer instead of generating native code because of the calls to sum, which means that Numba isn't going to speed things up significantly. It just doesn't know how to optimize/translate sum (at this point). Additionally, it's usually better to unroll vectorized operations into explicit loops with Numba. Notice that the ipynb you link to only calls out to np.sqrt, which I believe does get translated to machine code, and it operates on elements, not slices. I would try expanding the sum in the inner loop into an explicit additional loop over elements, rather than taking slices and using the sum method.
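As a rough illustration of that suggestion, here is a minimal sketch of the same computation with the slice-and-sum replaced by explicit element loops. It is not the asker's code: the function name cooccurrence_probability_loops is hypothetical, and it assumes a recent Numba release where njit is available (the autojit used in the question comes from an older API). The try/except fallback is only there so the sketch still runs, without any speedup, when Numba is not installed.

```python
import numpy as np

try:
    # Assumption: a recent Numba release that provides njit.
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (as plain Python) without Numba.
    def njit(func):
        return func


@njit
def cooccurrence_probability_loops(X):
    # Hypothetical rewrite of the question's function using explicit loops.
    n_rows, n_cols = X.shape

    # Column sums, accumulated element by element instead of np.sum(X, axis=0).
    col_sums = np.zeros(n_cols)
    for k in range(n_rows):
        for i in range(n_cols):
            col_sums[i] += X[k, i]

    D = np.empty((n_cols, n_cols))
    for i in range(n_cols):
        for j in range(n_cols):
            # Explicit inner loop replaces (X[:, i] * X[:, j]).sum().
            acc = 0.0
            for k in range(n_rows):
                acc += X[k, i] * X[k, j]
            D[i, j] = acc / max(col_sums[i], col_sums[j])
    return D


X = np.random.random((100, 2))
D = cooccurrence_probability_loops(X)
```

Whether this actually runs faster depends on the Numba version and the size of X, so it is worth timing both variants on realistic input, and timing the second call rather than the first, since the first call includes compilation.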
My experience is that Numba can work wonders sometimes, but it doesn't speed up arbitrary Python code. You need to get a sense of its limitations and of what it can optimize effectively. Also note that v0.11 is a bit different in this regard compared to 0.12 and 0.13, due to the major refactoring that Numba went through between those versions.