Speeding up loops over a Numpy array
Question
In my code I have a for loop that iterates over a multidimensional numpy array and does some operation on the sub-array obtained at each iteration. It looks like this:

    for sub in Arr:
        # do stuff using sub

The work done with sub is fully vectorized, so it should be efficient. On the other hand, this loop iterates about 10^5 times and is the bottleneck. Do you think I will get an improvement by offloading this part to C? I am somewhat reluctant to do so because the "do stuff using sub" part uses broadcasting, slicing, and smart-indexing tricks that would be tedious to write in plain C. I would also welcome thoughts and suggestions on how to deal with broadcasting, slicing, and smart indexing when offloading computation to C.

Solution
You can take a look at scipy.weave. You can use scipy.weave.blitz to transparently translate your expression into C++ code and run it. It will handle slicing automatically and get rid of temporaries, but you claim that the body of your for loop does not create temporaries, so your mileage may vary.

However, if you want to replace your entire for loop with something more efficient, then you could make use of scipy.weave.inline. The drawback is that you have to write C++ code. This should not be too hard because you can use Blitz++ syntax, which is very close to numpy array expressions. Slicing is directly supported; broadcasting, however, is not.

There are two workarounds:
The first is to use the numpy C API and its multi-dimensional iterators. They handle broadcasting transparently. However, you are invoking the Numpy runtime, so there might be some overhead.

The other option, and possibly the simpler one, is to use the usual matrix notation for broadcasting. Broadcast operations can be written as outer products with a vector of all ones. The good thing is that Blitz++ will not actually create these temporary broadcasted arrays in memory; it will figure out how to wrap them into an equivalent loop.

For the second option, take a look at http://www.oonumerics.org/blitz/docs/blitz_3.html#SEC88 for index placeholders. As long as your matrix has fewer than 11 dimensions you are fine. This link shows how they can be used to form outer products: http://www.oonumerics.org/blitz/docs/blitz_3.html#SEC99 (search for "outer products" to go to the relevant part of the document).
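
To make the two workarounds a bit more concrete, here is a minimal sketch in plain numpy (not weave or Blitz++ code). It checks that a broadcast addition gives the same result as an outer product with a vector of all ones, and that numpy's multi-dimensional iterator, reached here through np.nditer rather than the C API, broadcasts its operands on its own. The arrays A and v and their shapes are made up for illustration.

```python
import numpy as np

# Hypothetical data: a 3x4 matrix and a length-4 row vector.
A = np.arange(12, dtype=float).reshape(3, 4)
v = np.array([10.0, 20.0, 30.0, 40.0])

# Ordinary numpy broadcasting: v is implicitly repeated along axis 0.
broadcast_sum = A + v

# Workaround 2: the same operation written as an outer product with a
# vector of all ones. np.outer(ones, v)[i, j] == 1 * v[j], so every row
# of the outer product is a copy of v.
ones = np.ones(A.shape[0])
outer_sum = A + np.outer(ones, v)

# Workaround 1, seen from Python: np.nditer broadcasts v against A and
# walks all operands element by element.
iter_sum = np.empty_like(A)
with np.nditer(
    [A, v, iter_sum],
    op_flags=[["readonly"], ["readonly"], ["writeonly"]],
) as it:
    for a, b, out in it:
        out[...] = a + b
```

Note that np.outer does allocate the full temporary in numpy; the point of the answer is that Blitz++ can evaluate the equivalent expression without materializing it.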