numpy vectorization of double python for loop
Question
V is an (n, p) numpy array, with typical dimensions n ~ 10, p ~ 20000.
My current code looks like this:
import numpy as np

# Accumulate F[i, j] * V[i] * V[j] over the lower triangle (j <= i).
A = np.zeros(p)
for i in range(n):
    for j in range(i + 1):
        A += F[i, j] * V[i, :] * V[j, :]
How would I go about rewriting this to avoid the double Python for loop?
Recommended answer
While Isaac's answer seems promising, since it removes the two nested for loops, it has to create an intermediate array M that is n times the size of your original V array. Python for loops are not cheap, but memory access isn't free either:
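import numpy as np

n = 10
p = 20000
V = np.random.rand(n, p)
F = np.random.rand(n, n)

def op_code(V, F):
    # The original double loop over the lower triangle (j <= i).
    n, p = V.shape
    A = np.zeros(p)
    for i in range(n):
        for j in range(i + 1):
            A += F[i, j] * V[i, :] * V[j, :]
    return A

def isaac_code(V, F):
    n, p = V.shape
    F = F.copy()
    # Zero the strict upper triangle so only the j <= i terms contribute.
    F[np.triu_indices(n, 1)] = 0
    # Broadcasting builds an (n, n, p) intermediate before summing it away.
    M = (V.reshape(n, 1, p) * V.reshape(1, n, p)) * F.reshape(n, n, 1)
    return M.sum((0, 1))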
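To put a number on the "n times the size" remark, here is a minimal sketch that compares the memory footprint of V with that of an (n, n, p) intermediate. It assumes float64 data, which is what np.random.rand returns, and it computes the larger size from the shape alone instead of allocating the array. The names v_bytes and m_bytes are illustrative only.

```python
import numpy as np

n, p = 10, 20000
V = np.random.rand(n, p)

# Size of V itself: n * p float64 values.
v_bytes = V.nbytes

# Size of an (n, n, p) intermediate like M, derived from its shape
# so that nothing that large is actually allocated here.
m_bytes = n * n * p * V.itemsize

print(v_bytes / 1e6, "MB for V")
print(m_bytes / 1e6, "MB for M")
print(m_bytes // v_bytes, "times larger")
```

With these dimensions V occupies 1.6 MB while the intermediate needs 16 MB, so the vectorized version moves far more data through memory than the loop does.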
If you now take both for a test ride:
In [20]: np.allclose(isaac_code(V, F), op_code(V, F))
Out[20]: True
In [21]: %timeit op_code(V, F)
100 loops, best of 3: 3.18 ms per loop
In [22]: %timeit isaac_code(V, F)
10 loops, best of 3: 24.3 ms per loop
So removing the for loops costs you roughly an 8x slowdown (24.3 ms versus 3.18 ms). Not a very good thing... At this point you may even want to consider whether a function that takes about 3 ms to evaluate requires any further optimization. In case it does, there's a small improvement to be had by using np.einsum:
def einsum_code(V, F):
    n, p = V.shape
    F = F.copy()
    F[np.triu_indices(n, 1)] = 0
    return np.einsum('ij,ik,jk->k', F, V, V)
And now:
In [23]: np.allclose(einsum_code(V, F), op_code(V, F))
Out[23]: True
In [24]: %timeit einsum_code(V, F)
100 loops, best of 3: 2.53 ms per loop
So that's roughly a 20% speedup (2.53 ms versus 3.18 ms), bought with code that may very well not be as readable as your for loops. I would say it's not worth it...
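If the subscript string 'ij,ik,jk->k' looks opaque, the sketch below spells it out as explicit loops on deliberately tiny arrays and checks that both give the same result. It is an illustration, not part of the benchmark above: the small sizes, the fixed seed, and the use of np.tril as a shorthand for the copy-and-zero masking step are all assumptions made for readability.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, illustrative only
n, p = 3, 4                     # tiny sizes so the loops stay readable
V = rng.rand(n, p)
F = rng.rand(n, n)

# np.tril keeps the lower triangle (j <= i) and zeroes the rest, the same
# mask the answer builds with F.copy() and np.triu_indices(n, 1).
L = np.tril(F)

# 'ij,ik,jk->k': multiply L[i, j] * V[i, k] * V[j, k], sum over i and j
# (they are absent from the output), and keep k.
A_einsum = np.einsum('ij,ik,jk->k', L, V, V)

# The same sum written out as explicit loops over i, j and k.
A_loops = np.zeros(p)
for i in range(n):
    for j in range(n):
        for k in range(p):
            A_loops[k] += L[i, j] * V[i, k] * V[j, k]

print(np.allclose(A_einsum, A_loops))
```

Indices that appear on the left of the arrow but not on the right are summed over, which is why no (n, n, p) array ever has to be written out by hand.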