Speeding up loops over a Numpy array

Problem description

In my code I have a for loop that iterates over a multidimensional numpy array and performs some operation on the sub-array obtained at each iteration. It looks like this:

    for sub in Arr:
        # do stuff using sub

The work done with sub is fully vectorized, so it should be efficient. On the other hand, this loop iterates about 10^5 times and is the bottleneck. Do you think I will get an improvement by offloading this part to C? I am somewhat reluctant to do so because the "do stuff using sub" part uses broadcasting, slicing, and smart-indexing tricks that would be tedious to write in plain C. I would also welcome thoughts and suggestions on how to deal with broadcasting, slicing, and smart indexing when offloading computation to C.
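For concreteness, the kind of loop being described might look like the sketch below. The array shape and the loop body are purely hypothetical (the question does not show them); the point is only that iterating over a multidimensional array yields one sub-array per index of the leading axis, and that the body combines broadcasting, slicing and fancy indexing.

```python
import numpy as np

def process(arr):
    """Loop over the leading axis; each `sub` has one dimension fewer than `arr`."""
    results = np.empty(arr.shape[0])
    for i, sub in enumerate(arr):
        # Hypothetical vectorized body (an assumption, not the asker's code):
        centered = sub - sub.mean(axis=0)     # broadcasting: (k, m) - (m,)
        picked = centered[::2][:, [0, -1]]    # slicing, then fancy indexing
        results[i] = picked.sum()
    return results

# Small stand-in for the real data; the question has about 10^5 iterations.
arr = np.arange(24, dtype=float).reshape(4, 3, 2)
out = process(arr)
```

Each iteration does little numerical work, so with many iterations the Python-level overhead of the loop itself can dominate, which is the situation the question is asking about.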

Solution

You can take a look at scipy.weave. You can use scipy.weave.blitz to transparently translate your expression into C++ code and run it. It will handle slicing automatically and get rid of temporaries, but you claim that the body of your for loop does not create temporaries, so your mileage may vary.

However, if you want to replace your entire for loop with something more efficient, you could make use of scipy.weave.inline. The drawback is that you have to write C++ code. This should not be too hard, because you can use Blitz++ syntax, which is very close to numpy array expressions. Slicing is directly supported; broadcasting, however, is not.

There are two workarounds:

1. Use the NumPy C API and its multi-dimensional iterators. They handle broadcasting transparently. However, you are invoking the NumPy runtime, so there might be some overhead.

2. The other option, and possibly the simpler one, is to use the usual matrix notation for broadcasting. Broadcast operations can be written as outer products with a vector of all ones. The good thing is that Blitz++ will not actually create these temporary broadcast arrays in memory; it will figure out how to wrap them into an equivalent loop. For this option, take a look at http://www.oonumerics.org/blitz/docs/blitz_3.html#SEC88 for index placeholders. As long as your matrix has fewer than 11 dimensions you are fine. This link shows how they can be used to form outer products: http://www.oonumerics.org/blitz/docs/blitz_3.html#SEC99 (search for "outer products" to go to the relevant part of the document).
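Both workarounds can be previewed from Python before writing any C++. The sketch below is an illustration in plain NumPy, not the Blitz++ code itself: np.nditer is the Python-level counterpart of the C API's multi-dimensional iterator, and the second function writes a broadcast sum as outer products with vectors of ones. The function names and the test vectors are made up for this example.

```python
import numpy as np

def broadcast_add_nditer(col, row):
    """Add two broadcastable arrays element by element with np.nditer."""
    # nditer broadcasts its operands against each other; passing None as the
    # last operand asks it to allocate an output of the broadcast shape.
    with np.nditer([col, row, None]) as it:
        for x, y, out in it:
            out[...] = x + y
        return it.operands[2]

def broadcast_add_outer(u, v):
    """Same sum for 1-D u and v, written as outer products with ones."""
    ones_n = np.ones(u.shape[0])
    ones_m = np.ones(v.shape[0])
    # outer(u, ones_m)[i, j] == u[i] and outer(ones_n, v)[i, j] == v[j]
    return np.outer(u, ones_m) + np.outer(ones_n, v)

u = np.array([1.0, 2.0, 3.0])
v = np.array([10.0, 20.0])

expected = u[:, None] + v[None, :]                        # ordinary broadcasting
via_iter = broadcast_add_nditer(u[:, None], v[None, :])
via_outer = broadcast_add_outer(u, v)
```

Note that np.outer does materialize the intermediate arrays in NumPy; according to the answer above, the advantage of the Blitz++ formulation is that it avoids creating them.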
