Speeding up NumPy Kronecker products
Question
I am working on my first large Python project. I have one function which contains the following code:
# EXPAND THE EXPECTED VALUE TO APPLY TO ALL STATES,
# THEN UPDATE fullFnMat
EV_subset_expand = np.kron(EV_subset, np.ones((nrows, 1)))
fullFnMat[key] = staticMat[key] + EV_subset_expand
In my code profiler, it seems like this Kronecker product is actually taking up a huge amount of time.
Function was called by...
ncalls tottime cumtime
/home/stevejb/myhg/dpsolve/ootest/tests/ddw2011/profile_dir/BellmanEquation.py:17(bellmanFn) <- 19 37.681 38.768 /home/stevejb/myhg/dpsolve/ootest/tests/ddw2011/profile_dir/dpclient.py:467(solveTheModel)
{numpy.core.multiarray.concatenate} <- 342 27.319 27.319 /usr/lib/pymodules/python2.7/numpy/lib/shape_base.py:665(kron)
/home/stevejb/myhg/dpsolve/ootest/tests/ddw2011/profile_dir/dpclient.py:467(solveTheModel) <- 1 11.041 91.781 <string>:1(<module>)
{method 'argsort' of 'numpy.ndarray' objects} <- 19 7.692 7.692 /usr/lib/pymodules/python2.7/numpy/core/fromnumeric.py:597(argsort)
/usr/lib/pymodules/python2.7/numpy/core/numeric.py:789(outer) <- 171 2.526 2.527 /usr/lib/pymodules/python2.7/numpy/lib/shape_base.py:665(kron)
{method 'max' of 'numpy.ndarray' objects} <- 209 2.034 2.034 /home/stevejb/myhg/dpsolve/ootest/tests/ddw2011/profile_dir/dpclient.py:391(getValPolMatrices)
Is there a way to get faster Kronecker products in NumPy? It seems like it shouldn't take as long as it does.
Recommended answer
You can certainly take a look at the source for np.kron. It can be found in numpy/lib/shape_base.py, and you can see whether there are improvements or simplifications that might make it more efficient. Alternatively, you could write your own using Cython or some other binding to a low-level language to try to eke out better performance.
Or, as @matt suggested, something like the following might be natively faster:
import numpy as np
nrows = 10
a = np.arange(100).reshape(10,10)
b = np.tile(a,nrows).reshape(nrows*a.shape[0],-1) # equiv to np.kron(a,np.ones((nrows,1)))
Or:
b = np.repeat(a, nrows*np.ones(a.shape[0], int), axis=0)  # builtin int: the np.int alias was removed in NumPy 1.24
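As a quick sanity check, the sketch below verifies that both rewrites produce the same values as np.kron for this example. It uses the builtin int in place of np.int, and it also tries a plain scalar repeat count, which should give the same result here because every row is repeated the same number of times. The names b_kron, b_tile, b_repeat and b_scalar are illustrative only.

```python
import numpy as np

nrows = 10
a = np.arange(100).reshape(10, 10)

# Reference result: each row of `a` stacked nrows times (float dtype, from np.ones).
b_kron = np.kron(a, np.ones((nrows, 1)))

# Tile along the last axis, then fold the copies back into rows.
b_tile = np.tile(a, nrows).reshape(nrows * a.shape[0], -1)

# Per-row repeat counts, as in the answer, using the builtin int.
b_repeat = np.repeat(a, nrows * np.ones(a.shape[0], int), axis=0)

# A scalar count repeats every row the same number of times.
b_scalar = np.repeat(a, nrows, axis=0)

# Compare values only; the dtypes differ (see the note below).
all_equal = all(np.array_equal(b_kron, b) for b in (b_tile, b_repeat, b_scalar))
print(all_equal)
```

Note that np.tile and np.repeat keep the integer dtype of a, whereas np.kron with np.ones returns floats, so the results match in value but not in dtype.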
Timings:
In [80]: %timeit np.tile(a,nrows).reshape(nrows*a.shape[0],-1)
10000 loops, best of 3: 25.5 us per loop
In [81]: %timeit np.kron(a,np.ones((nrows,1)))
10000 loops, best of 3: 117 us per loop
In [91]: %timeit np.repeat(a,nrows*np.ones(a.shape[0],np.int),0)
100000 loops, best of 3: 12.8 us per loop
Using np.repeat for arrays of the size in the above example gives a pretty nice speed-up of roughly 9x over np.kron (12.8 us versus 117 us per loop), which isn't too shabby.
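Applied to the code in the question, the same idea might look like the sketch below. The sizes and random data are assumptions made for illustration, since the question does not state them: EV_subset is taken to have shape (n_states, n_cols), and static_block, which stands in for staticMat[key], to have shape (n_states * nrows, n_cols). The second variant uses broadcasting instead of np.repeat. It is not from the answer, just an alternative that avoids building the expanded array, and whether it is faster for a given workload would need to be measured.

```python
import numpy as np

# Hypothetical sizes and data; the question does not give the real ones.
nrows, n_states, n_cols = 4, 3, 5
rng = np.random.default_rng(0)
EV_subset = rng.random((n_states, n_cols))
static_block = rng.random((n_states * nrows, n_cols))  # stands in for staticMat[key]

# Original approach from the question.
expected = static_block + np.kron(EV_subset, np.ones((nrows, 1)))

# Drop-in replacement: repeat each row of EV_subset nrows times.
via_repeat = static_block + np.repeat(EV_subset, nrows, axis=0)

# Alternative (not from the answer): view the static block as
# (n_states, nrows, n_cols) and let broadcasting do the expansion.
via_broadcast = (
    static_block.reshape(n_states, nrows, n_cols) + EV_subset[:, None, :]
).reshape(n_states * nrows, n_cols)

print(np.allclose(expected, via_repeat), np.allclose(expected, via_broadcast))
```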