Speed up a for loop with numpy

Question

How can the following for loop be sped up with numpy? I guess some fancy-indexing trick can be used here, but I have no idea which one (can einsum be used here?).

a=0
for i in range(len(b)):
    a+=numpy.mean(C[d,e,f+b[i]])*g[i]

Edit: C is a 3D numpy array with a shape comparable to (20, 1600, 500). d, e and f are index arrays for the points that are "interesting"; they all have the same length, around 900. b and g also share a length, around 50. The mean is taken over all points in C with the indices d, e, f+b[i].

Recommended answer

Both sessions below were initialized with:

In [1]: C = np.random.rand(20,1600,500)

In [2]: d = np.random.randint(0, 20, size=900)

In [3]: e = np.random.randint(1600, size=900)

In [4]: f = np.random.randint(400, size=900)

In [5]: b = np.random.randint(100, size=50)

In [6]: g = np.random.rand(50)
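
A small side check, not part of the original answer: the ranges chosen above keep every shifted index f + b[i] inside the last axis of C (length 500), so the fancy indexing never goes out of bounds. The sketch below assumes a fixed seed (`RandomState(0)`) purely for reproducibility; the name `last_axis` is introduced here for illustration.

```python
import numpy as np

rs = np.random.RandomState(0)  # arbitrary seed, only for reproducibility
last_axis = 500                # length of the last axis of C in the setup above

f = rs.randint(400, size=900)  # values in [0, 400)
b = rs.randint(100, size=50)   # values in [0, 100)

# Every combination f[j] + b[i]; the largest possible value is 399 + 99 = 498.
largest = (f[:, np.newaxis] + b).max()
in_bounds = largest < last_axis
print(largest, in_bounds)
```

If your own f and b can exceed the axis length, the indexing would raise an IndexError, so a check like this may be worth running first.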

numpy 1.9.0

In [7]: %timeit C[d,e,f + b[:,np.newaxis]].mean(axis=1).dot(g)
1000 loops, best of 3: 942 µs per loop

In [8]: %timeit C[d[:,np.newaxis],e[:, np.newaxis],f[:, np.newaxis] + b].mean(axis=0).dot(g)
1000 loops, best of 3: 762 µs per loop

In [9]: %%timeit                                               
   ...: a = 0
   ...: for i in range(len(b)):                                     
   ...:     a += np.mean(C[d, e, f + b[i]]) * g[i]
   ...: 
100 loops, best of 3: 2.25 ms per loop

In [10]: np.__version__
Out[10]: '1.9.0'

In [11]: %%timeit
(C.ravel()[np.ravel_multi_index((d[:,np.newaxis],
                                 e[:,np.newaxis],
                                 f[:,np.newaxis] + b), dims=C.shape)]
 .mean(axis=0).dot(g))
   ....: 
1000 loops, best of 3: 940 µs per loop

numpy 1.8.2

In [7]: %timeit C[d,e,f + b[:,np.newaxis]].mean(axis=1).dot(g)
100 loops, best of 3: 2.81 ms per loop

In [8]: %timeit C[d[:,np.newaxis],e[:, np.newaxis],f[:, np.newaxis] + b].mean(axis=0).dot(g)
100 loops, best of 3: 2.7 ms per loop

In [9]: %%timeit                                               
   ...: a = 0
   ...: for i in range(len(b)):                                     
   ...:     a += np.mean(C[d, e, f + b[i]]) * g[i]
   ...: 
100 loops, best of 3: 4.12 ms per loop

In [10]: np.__version__
Out[10]: '1.8.2'

In [51]: %%timeit
(C.ravel()[np.ravel_multi_index((d[:,np.newaxis],
                                 e[:,np.newaxis],
                                 f[:,np.newaxis] + b), dims=C.shape)]
 .mean(axis=0).dot(g))
   ....: 
1000 loops, best of 3: 1.4 ms per loop
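
The `ravel_multi_index` variant timed above is not explained in the answer, so here is a minimal sketch of what it appears to do: convert the broadcast (d, e, f + b) coordinates into positions in the flattened array and index `C.ravel()` with them. The array sizes are shrunk for readability and the seed is arbitrary; both are assumptions of this example.

```python
import numpy as np

rs = np.random.RandomState(0)   # arbitrary seed
C = rs.rand(4, 6, 10)           # small stand-in for the (20, 1600, 500) array
d = rs.randint(0, 4, size=7)
e = rs.randint(0, 6, size=7)
f = rs.randint(0, 5, size=7)
b = rs.randint(0, 5, size=3)

# Column-shaped d and e broadcast against the (7, 3) array f[:, None] + b.
coords = (d[:, np.newaxis], e[:, np.newaxis], f[:, np.newaxis] + b)
flat = np.ravel_multi_index(coords, C.shape)

# For a C-ordered array the flat position is (d * n1 + e) * n2 + k.
manual = (coords[0] * C.shape[1] + coords[1]) * C.shape[2] + coords[2]

same_index = np.array_equal(flat, manual)
same_values = np.array_equal(C.ravel()[flat], C[coords])
print(flat.shape, same_index, same_values)
```

Both routes select the same elements; as the timings above suggest, which one is faster can depend on the numpy version.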

Explanation

You can use a coordinate broadcasting trick to build the full 50x900 array in a single indexing step:

In [158]: C[d,e,f + b[:, np.newaxis]].shape
Out[158]: (50, 900)
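
To see where the (50, 900) shape comes from, the following sketch repeats the indexing on deliberately tiny arrays (7 points and 3 offsets instead of roughly 900 and 50). The sizes and the seed are assumptions made for this illustration only.

```python
import numpy as np

rs = np.random.RandomState(0)   # arbitrary seed
C = rs.rand(4, 6, 10)           # small stand-in for the (20, 1600, 500) array
d = rs.randint(0, 4, size=7)    # 7 "interesting" points
e = rs.randint(0, 6, size=7)
f = rs.randint(0, 5, size=7)
b = rs.randint(0, 5, size=3)    # 3 offsets

# b[:, np.newaxis] has shape (3, 1); adding f with shape (7,) broadcasts to (3, 7).
k = f + b[:, np.newaxis]

# d and e (shape (7,)) are broadcast against k, so the result is also (3, 7).
picked = C[d, e, k]

# Row i holds exactly the values the loop body reads for offset b[i].
rows_match = all(
    np.array_equal(picked[i], C[d, e, f + b[i]]) for i in range(len(b))
)
print(k.shape, picked.shape, rows_match)
```

Each row of the result corresponds to one iteration of the original loop, which is why a mean along axis 1 reproduces the per-iteration `np.mean`.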

From there, mean and dot get you to the result:

In [159]: C[d,e,f + b[:, np.newaxis]].mean(axis=1).dot(g)
Out[159]: 13.582349962518611

In [160]: 
a = 0
for i in range(len(b)):       
    a += np.mean(C[d, e, f + b[i]]) * g[i]
print(a)
   .....: 
13.5823499625

And it is about 3.3x faster than the loop version:

In [161]: %timeit C[d,e,f + b[:, np.newaxis]].mean(axis=1).dot(g)
1000 loops, best of 3: 585 µs per loop

In [162]: %%timeit                                               
a = 0
for i in range(len(b)):                                     
    a += np.mean(C[d, e, f + b[i]]) * g[i]
   .....: 
1000 loops, best of 3: 1.95 ms per loop
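
If you want to reproduce the comparison outside IPython, a script along these lines may help. It is a sketch under stated assumptions: the second axis of C is shrunk from 1600 to 160 to keep memory use small, the seed is arbitrary, and the function names `loop_version` and `vectorized_version` are made up for this example. Absolute timings will differ between machines and numpy versions.

```python
import timeit
import numpy as np

rs = np.random.RandomState(0)          # arbitrary seed
C = rs.rand(20, 160, 500)              # second axis reduced from 1600
d = rs.randint(0, 20, size=900)
e = rs.randint(0, 160, size=900)
f = rs.randint(0, 400, size=900)
b = rs.randint(0, 100, size=50)
g = rs.rand(50)

def loop_version():
    # The original loop from the question.
    a = 0.0
    for i in range(len(b)):
        a += np.mean(C[d, e, f + b[i]]) * g[i]
    return a

def vectorized_version():
    # One (50, 900) gather, a mean per row, then a dot product with g.
    return C[d, e, f + b[:, np.newaxis]].mean(axis=1).dot(g)

same = np.isclose(loop_version(), vectorized_version())

# Best of 3 repeats, 20 calls each, reported as seconds per call.
t_loop = min(timeit.repeat(loop_version, number=20, repeat=3)) / 20
t_vec = min(timeit.repeat(vectorized_version, number=20, repeat=3)) / 20
print(same, t_loop, t_vec)
```

The two results agree only up to floating-point rounding, which is why the check uses `np.isclose` rather than exact equality.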

The array is large, so CPU cache effects have to be taken into account. I cannot say I know how np.sum traverses the array, but for 2D arrays there is always a slightly better order (when the next element you pick is adjacent in memory) and a slightly worse one (when the next element lies a stride away). Let's see if we can gain a bit more by transposing the array during indexing:

In [196]: C[d[:,np.newaxis], e[:,np.newaxis], f[:,np.newaxis] + b].mean(axis=0).dot(g)
Out[196]: 13.582349962518608

In [197]: %timeit C[d[:,np.newaxis], e[:,np.newaxis], f[:,np.newaxis] + b].mean(axis=0).dot(g)
1000 loops, best of 3: 461 µs per loop

That is 4.2x faster than the loop.
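
The question also asked whether einsum can be used. The answer does not address this, so the following is only a sketch of one possibility: einsum cannot perform the gather itself, but it can replace the `mean(...).dot(g)` reduction once the (900, 50) array has been indexed out. The reduced array size and the seed are assumptions, and no claim is made here about whether this is faster.

```python
import numpy as np

rs = np.random.RandomState(0)          # arbitrary seed
C = rs.rand(20, 160, 500)              # second axis reduced from 1600
d = rs.randint(0, 20, size=900)
e = rs.randint(0, 160, size=900)
f = rs.randint(0, 400, size=900)
b = rs.randint(0, 100, size=50)
g = rs.rand(50)

# Transposed gather from the answer: shape (900, 50).
X = C[d[:, np.newaxis], e[:, np.newaxis], f[:, np.newaxis] + b]

via_mean = X.mean(axis=0).dot(g)

# Sum X[i, j] * g[j] over both axes, then divide by the number of points
# to turn the sum over i into a mean.
via_einsum = np.einsum('ij,j->', X, g) / X.shape[0]
print(via_mean, via_einsum)
```

Time both forms on your own data before picking one.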
