Speed up for loop with numpy

Views: 882 | Published: 2016/6/1 22:04:33 | Tags: python arrays performance optimization numpy

Question

How can the following for loop be sped up with numpy? I guess some fancy-indexing trick can be used here, but I have no idea which one (can einsum be used here?).

a=0
for i in range(len(b)):
    a+=numpy.mean(C[d,e,f+b[i]])*g[i]

Edit: C is a numpy 3D array of shape comparable to (20, 1600, 500). d, e, f are indices of points that are "interesting" (d, e and f all have the same length, around 900). b and g have the same length (around 50). The mean is taken over all the points in C with the indices d, e, f+b[i].
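
To make the setup concrete, here is a minimal sketch that builds stand-in arrays with the shapes described in the edit and runs the original loop on them. The random data, the seed, and the value ranges (small non-negative offsets in b, and f kept low enough that f + b stays inside the last axis) are assumptions for illustration only, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Hypothetical stand-ins matching the shapes described in the question.
C = rng.random((20, 1600, 500))
n_points, n_offsets = 900, 50
d = rng.integers(0, C.shape[0], size=n_points)
e = rng.integers(0, C.shape[1], size=n_points)
b = rng.integers(0, 10, size=n_offsets)               # assumed small offsets
f = rng.integers(0, C.shape[2] - 10, size=n_points)   # keeps f + b in bounds
g = rng.random(n_offsets)

# The loop from the question, using the np alias.
a = 0.0
for i in range(len(b)):
    # C[d, e, f + b[i]] picks 900 scalars, one per "interesting" point.
    a += np.mean(C[d, e, f + b[i]]) * g[i]
```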

Recommended answer

numpy 1.9.0

In [7]: %timeit C[d,e,f + b[:,np.newaxis]].mean(axis=1).dot(g)
1000 loops, best of 3: 942 µs per loop

In [8]: %timeit C[d[:,np.newaxis],e[:, np.newaxis],f[:, np.newaxis] + b].mean(axis=0).dot(g)
1000 loops, best of 3: 762 µs per loop

In [9]: %%timeit                                               
   ...: a = 0
   ...: for i in range(len(b)):                                     
   ...:     a += np.mean(C[d, e, f + b[i]]) * g[i]
   ...: 
100 loops, best of 3: 2.25 ms per loop

In [10]: np.__version__
Out[10]: '1.9.0'

In [11]: %%timeit
   ....: (C.ravel()[np.ravel_multi_index((d[:,np.newaxis],
   ....:                                  e[:,np.newaxis],
   ....:                                  f[:,np.newaxis] + b), dims=C.shape)]
   ....:  .mean(axis=0).dot(g))
   ....: 
1000 loops, best of 3: 940 µs per loop

numpy 1.8.2

In [7]: %timeit C[d,e,f + b[:,np.newaxis]].mean(axis=1).dot(g)
100 loops, best of 3: 2.81 ms per loop

In [8]: %timeit C[d[:,np.newaxis],e[:, np.newaxis],f[:, np.newaxis] + b].mean(axis=0).dot(g)
100 loops, best of 3: 2.7 ms per loop

In [9]: %%timeit                                               
   ...: a = 0
   ...: for i in range(len(b)):                                     
   ...:     a += np.mean(C[d, e, f + b[i]]) * g[i]
   ...: 
100 loops, best of 3: 4.12 ms per loop

In [10]: np.__version__
Out[10]: '1.8.2'

In [51]: %%timeit
   ....: (C.ravel()[np.ravel_multi_index((d[:,np.newaxis],
   ....:                                  e[:,np.newaxis],
   ....:                                  f[:,np.newaxis] + b), dims=C.shape)]
   ....:  .mean(axis=0).dot(g))
   ....: 
1000 loops, best of 3: 1.4 ms per loop
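
The last variant in both sessions goes through np.ravel_multi_index, which turns the three broadcast index arrays into flat offsets into C.ravel(). The sketch below checks, on made-up data, that this picks the same elements as indexing C directly. The array contents, the seed, and the reduced shape of C (chosen only to keep the example light) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)           # arbitrary seed
C = rng.random((20, 160, 500))           # smaller than in the question
d = rng.integers(0, C.shape[0], size=900)
e = rng.integers(0, C.shape[1], size=900)
f = rng.integers(0, C.shape[2] - 10, size=900)
b = rng.integers(0, 10, size=50)         # assumed small offsets

# Column vectors of shape (900, 1) broadcast against b to shape (900, 50).
flat_idx = np.ravel_multi_index(
    (d[:, np.newaxis], e[:, np.newaxis], f[:, np.newaxis] + b),
    dims=C.shape,
)

# For a C-ordered array the flat offset is (d * n1 + e) * n2 + f.
manual_idx = (d[:, np.newaxis] * C.shape[1] + e[:, np.newaxis]) * C.shape[2] \
    + (f[:, np.newaxis] + b)

via_flat = C.ravel()[flat_idx]
direct = C[d[:, np.newaxis], e[:, np.newaxis], f[:, np.newaxis] + b]
same_elements = np.array_equal(via_flat, direct)
```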

Description

You can use a coordinate broadcasting trick to build the full 50x900 array up front:

In [158]: C[d,e,f + b[:, np.newaxis]].shape
Out[158]: (50, 900)

From that point, mean and dot will take you to the destination:

In [159]: C[d,e,f + b[:, np.newaxis]].mean(axis=1).dot(g)
Out[159]: 13.582349962518611

In [160]: 
a = 0
for i in range(len(b)):       
    a += np.mean(C[d, e, f + b[i]]) * g[i]
print(a)
   .....: 
13.5823499625
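
A minimal sketch of why the two results agree, using made-up data (the seed, the value ranges and the reduced shape of C are assumptions). Row i of the broadcast result is exactly the slice the loop takes in iteration i, so the mean over axis 1 followed by a dot with g reproduces the loop. Since the question asked about einsum, the sketch also shows one way the mean and dot could be folded into a single einsum call; that is an alternative formulation, not something the answer itself uses.

```python
import numpy as np

rng = np.random.default_rng(1)           # arbitrary seed
C = rng.random((20, 160, 500))           # smaller than in the question
d = rng.integers(0, C.shape[0], size=900)
e = rng.integers(0, C.shape[1], size=900)
f = rng.integers(0, C.shape[2] - 10, size=900)
b = rng.integers(0, 10, size=50)         # assumed small offsets
g = rng.random(50)

# b[:, np.newaxis] has shape (50, 1); adding f (shape (900,)) gives (50, 900).
# d and e broadcast against that, so vals[i] == C[d, e, f + b[i]].
vals = C[d, e, f + b[:, np.newaxis]]

vectorized = vals.mean(axis=1).dot(g)

looped = 0.0
for i in range(len(b)):
    looped += np.mean(C[d, e, f + b[i]]) * g[i]

# sum_i g[i] * mean_j vals[i, j] == (sum_ij vals[i, j] * g[i]) / n_points
via_einsum = np.einsum('ij,i->', vals, g) / vals.shape[1]
```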

And it's about 3.3x faster than the loop version:

In [161]: %timeit C[d,e,f + b[:, np.newaxis]].mean(axis=1).dot(g)
1000 loops, best of 3: 585 µs per loop

In [162]: %%timeit
   .....: a = 0
   .....: for i in range(len(b)):
   .....:     a += np.mean(C[d, e, f + b[i]]) * g[i]
   .....: 
1000 loops, best of 3: 1.95 ms per loop
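
Outside IPython, a comparable measurement could be sketched with the standard timeit module, as below. The data is synthetic (seed and value ranges are assumptions), the function names are hypothetical, and the numbers it prints will differ by machine and numpy version, so no particular ratio should be expected.

```python
import timeit

import numpy as np

rng = np.random.default_rng(3)           # arbitrary seed
C = rng.random((20, 1600, 500))          # shape from the question
d = rng.integers(0, C.shape[0], size=900)
e = rng.integers(0, C.shape[1], size=900)
f = rng.integers(0, C.shape[2] - 10, size=900)
b = rng.integers(0, 10, size=50)         # assumed small offsets
g = rng.random(50)


def loop_version():
    a = 0.0
    for i in range(len(b)):
        a += np.mean(C[d, e, f + b[i]]) * g[i]
    return a


def broadcast_version():
    return C[d, e, f + b[:, np.newaxis]].mean(axis=1).dot(g)


# Best of 3 repeats of 20 calls each, converted to microseconds per call.
n_calls = 20
t_loop = min(timeit.repeat(loop_version, number=n_calls, repeat=3)) / n_calls * 1e6
t_vec = min(timeit.repeat(broadcast_version, number=n_calls, repeat=3)) / n_calls * 1e6
print(f"loop: {t_loop:.0f} us, broadcast: {t_vec:.0f} us, ratio: {t_loop / t_vec:.1f}x")
```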

The array is large, so CPU cache effects come into play. I can't say I know how np.sum traverses the array, but with 2D arrays there is always a slightly better way (when the next element picked is adjacent in memory) and a slightly worse way (when the next element lies a full stride away). Let's see if we can gain something more by transposing the array during indexing:

In [196]: C[d[:,np.newaxis], e[:,np.newaxis], f[:,np.newaxis] + b].mean(axis=0).dot(g)
Out[196]: 13.582349962518608

In [197]: %timeit C[d[:,np.newaxis], e[:,np.newaxis], f[:,np.newaxis] + b].mean(axis=0).dot(g)
1000 loops, best of 3: 461 µs per loop

That's 4.2x faster than the loop.
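
To illustrate what "transposing during indexing" means, the sketch below builds both gathered arrays on made-up data (seed, value ranges and the reduced shape of C are assumptions) and checks that one is the transpose of the other and that both reductions give the same number. It only demonstrates equivalence; whether one layout is faster depends on the machine and numpy version.

```python
import numpy as np

rng = np.random.default_rng(4)           # arbitrary seed
C = rng.random((20, 160, 500))           # smaller than in the question
d = rng.integers(0, C.shape[0], size=900)
e = rng.integers(0, C.shape[1], size=900)
f = rng.integers(0, C.shape[2] - 10, size=900)
b = rng.integers(0, 10, size=50)         # assumed small offsets
g = rng.random(50)

# Offsets along axis 0, points along axis 1: shape (50, 900).
rows = C[d, e, f + b[:, np.newaxis]]

# Points along axis 0, offsets along axis 1: shape (900, 50).
cols = C[d[:, np.newaxis], e[:, np.newaxis], f[:, np.newaxis] + b]

same_values = np.array_equal(cols, rows.T)

# The mean always runs over the 900 points; only the axis number changes.
res_rows = rows.mean(axis=1).dot(g)
res_cols = cols.mean(axis=0).dot(g)
```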
