Numpy with Combinatoric generators: How does one speed up Combinations?
Question
It is my understanding that the itertools functions are written in C. If I wanted to speed up this example code:
import numpy as np
from itertools import combinations_with_replacement

def combinatorics(LargeArray):
    # np.empty does not initialize memory; only cells with x <= y are written below
    newArray = np.empty((LargeArray.shape[0], LargeArray.shape[0]))
    # range is used instead of the Python 2-only xrange
    for x, y in combinations_with_replacement(range(LargeArray.shape[0]), r=2):
        z = LargeArray[x] + LargeArray[y]
        newArray[x, y] = z
    return newArray
Since combinations_with_replacement is written in C, does that imply that it can't be sped up? Please advise.
Thanks.
Recommended answer
It's true that combinations_with_replacement is written in C, which means that you're not likely to speed up the implementation of that part of the code. But most of your run time isn't spent on finding the combinations: it's spent in the for loop that does the additions. You really, really, really want to avoid that kind of loop if at all possible when you're using numpy. This version will do almost the same thing, through the magic of broadcasting:
def sums(large_array):
    return large_array.reshape((-1, 1)) + large_array.reshape((1, -1))
For example:
>>> ary = np.arange(5).astype(float)
>>> np.triu(combinatorics(ary))
array([[ 0., 1., 2., 3., 4.],
[ 0., 2., 3., 4., 5.],
[ 0., 0., 4., 5., 6.],
[ 0., 0., 0., 6., 7.],
[ 0., 0., 0., 0., 8.]])
>>> np.triu(sums(ary))
array([[ 0., 1., 2., 3., 4.],
[ 0., 2., 3., 4., 5.],
[ 0., 0., 4., 5., 6.],
[ 0., 0., 0., 6., 7.],
[ 0., 0., 0., 0., 8.]])
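To make the broadcasting step more concrete, here is a minimal sketch of what happens to the shapes. The variable names (col, row, table) are only illustrative, and the two alternative spellings at the end are shown as equivalents, not as something the answer above relies on.

```python
import numpy as np

a = np.arange(5).astype(float)

col = a.reshape((-1, 1))   # shape (5, 1): one value per row
row = a.reshape((1, -1))   # shape (1, 5): one value per column

# Broadcasting stretches both operands to (5, 5),
# so table[i, j] == a[i] + a[j] for every pair (i, j).
table = col + row

# Two other ways to write the same pairwise sum.
same_via_newaxis = a[:, None] + a[None, :]
same_via_outer = np.add.outer(a, a)
```

All three arrays hold the same values; which spelling to use is mostly a matter of taste.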
The difference is that combinatorics leaves the lower triangle as random gibberish (it is uninitialized memory from np.empty), whereas sums makes the matrix symmetric. If you really wanted to avoid adding everything twice, you probably could, but I can't think of how to do it off the top of my head.
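One possible way to add each pair only once is to index the upper triangle directly with np.triu_indices. This is only a sketch: the function name upper_sums is made up for illustration, and filling the lower triangle with zeros is an assumption about what you would want there. It does roughly half the additions, but building and applying the index arrays has its own cost, so it is not guaranteed to beat the plain broadcast version; measure before relying on it.

```python
import numpy as np

def upper_sums(large_array):
    """Pairwise sums for x <= y only; the lower triangle is left as zeros."""
    n = large_array.shape[0]
    # Row and column indices of the upper triangle, diagonal included.
    rows, cols = np.triu_indices(n)
    out = np.zeros((n, n), dtype=large_array.dtype)
    # One vectorized addition over the n * (n + 1) / 2 index pairs.
    out[rows, cols] = large_array[rows] + large_array[cols]
    return out
```

For the small ary used above, upper_sums(ary) should match np.triu(sums(ary)).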
Oh, and the other difference:
>>> big_ary = np.random.random(1000)
>>> %timeit combinatorics(big_ary)
1 loops, best of 3: 482 ms per loop
>>> %timeit sums(big_ary)
1000 loops, best of 3: 1.7 ms per loop
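The %timeit lines above are IPython magics. If you want to repeat the comparison in a plain Python script, something along these lines should work with the standard timeit module. The array size and repeat counts here are arbitrary choices, and the numbers you get will depend on your machine, so no expected output is shown.

```python
import timeit
from itertools import combinations_with_replacement

import numpy as np

def combinatorics(large_array):
    n = large_array.shape[0]
    new_array = np.empty((n, n))
    # Python-level loop over all index pairs with x <= y.
    for x, y in combinations_with_replacement(range(n), r=2):
        new_array[x, y] = large_array[x] + large_array[y]
    return new_array

def sums(large_array):
    # Single broadcast addition, no Python-level loop.
    return large_array.reshape((-1, 1)) + large_array.reshape((1, -1))

big_ary = np.random.random(300)

# Best of 3 runs, reported as seconds per call.
loop_time = min(timeit.repeat(lambda: combinatorics(big_ary), number=1, repeat=3))
broadcast_time = min(timeit.repeat(lambda: sums(big_ary), number=100, repeat=3)) / 100

print("loop:      %.6f s per call" % loop_time)
print("broadcast: %.6f s per call" % broadcast_time)
```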