Numpy with Combinatoric generators: How does one speed up Combinations?


Question

It is my understanding that the itertools functions are written in C. Suppose I wanted to speed this example code up:

import numpy as np
from itertools import combinations_with_replacement

def combinatorics(LargeArray):
    n = LargeArray.shape[0]
    # np.empty does not initialize memory; only entries with x <= y are written below.
    newArray = np.empty((n, n))
    # range replaces the Python 2 xrange so the code also runs on Python 3.
    for x, y in combinations_with_replacement(range(n), r=2):
        newArray[x, y] = LargeArray[x] + LargeArray[y]
    return newArray

Since combinations_with_replacement is written in C, does that imply that it can't be sped up? Please advise.

Thanks.

Recommended answer

It's true that combinations_with_replacement is written in C, which means you're not likely to speed up that part of the code. But most of the run time isn't spent on finding the combinations: it goes to the Python-level for loop that does the additions. You really, really, really want to avoid that kind of loop if at all possible when you're using numpy. This version will do almost the same thing, through the magic of broadcasting:

def sums(large_array):
    return large_array.reshape((-1, 1)) + large_array.reshape((1, -1))

For example:

>>> ary = np.arange(5).astype(float)
>>> np.triu(combinatorics(ary))
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 0.,  2.,  3.,  4.,  5.],
       [ 0.,  0.,  4.,  5.,  6.],
       [ 0.,  0.,  0.,  6.,  7.],
       [ 0.,  0.,  0.,  0.,  8.]])
>>> np.triu(sums(ary))
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 0.,  2.,  3.,  4.,  5.],
       [ 0.,  0.,  4.,  5.,  6.],
       [ 0.,  0.,  0.,  6.,  7.],
       [ 0.,  0.,  0.,  0.,  8.]])

The difference is that combinatorics leaves the lower triangle as random gibberish (np.empty never initializes it), whereas sums makes the matrix symmetric. If you really wanted to avoid adding everything twice, you probably could, but I can't think of how to do it off the top of my head.
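One possible way to add each pair only once, offered as a sketch rather than as part of the original answer, is to index the upper triangle directly with np.triu_indices. The function name upper_sums is made up for illustration, and the lower triangle is filled with zeros here instead of being left uninitialized. Whether this beats the full broadcast is untested; fancy indexing has its own overhead, so measure before relying on it.

```python
import numpy as np

def upper_sums(large_array):
    """Pairwise sums for index pairs x <= y only (hypothetical helper)."""
    n = large_array.shape[0]
    # Row and column indices of the upper triangle, diagonal included.
    rows, cols = np.triu_indices(n)
    # Assumption: zeros are an acceptable filler for the lower triangle.
    out = np.zeros((n, n), dtype=large_array.dtype)
    out[rows, cols] = large_array[rows] + large_array[cols]
    return out

ary = np.arange(5).astype(float)
upper = upper_sums(ary)
# Reference result: full broadcast sum with the lower triangle zeroed out.
reference = np.triu(ary.reshape((-1, 1)) + ary.reshape((1, -1)))
```

Here upper and reference should hold the same values, matching the np.triu output shown above.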

Oh, and the other difference is:

>>> big_ary = np.random.random(1000)
>>> %timeit combinatorics(big_ary)
1 loops, best of 3: 482 ms per loop
>>> %timeit sums(big_ary)
1000 loops, best of 3: 1.7 ms per loop
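The %timeit lines above only work inside IPython. Outside it, a comparison along the same lines could be sketched with the standard timeit module. The array size of 200 and the repeat count of 3 are arbitrary choices for this sketch, and the absolute numbers will differ from the ones quoted above depending on machine and library versions.

```python
import timeit
from itertools import combinations_with_replacement

import numpy as np

def combinatorics(large_array):
    # Loop version from the question: fills only entries with x <= y.
    n = large_array.shape[0]
    new_array = np.empty((n, n))
    for x, y in combinations_with_replacement(range(n), r=2):
        new_array[x, y] = large_array[x] + large_array[y]
    return new_array

def sums(large_array):
    # Broadcasting version from the answer: fills the whole symmetric matrix.
    return large_array.reshape((-1, 1)) + large_array.reshape((1, -1))

big_ary = np.random.random(200)  # assumed size, smaller than the 1000 used above

# Total seconds for 3 calls of each version.
loop_time = timeit.timeit(lambda: combinatorics(big_ary), number=3)
vec_time = timeit.timeit(lambda: sums(big_ary), number=3)
print("loop: %.6f s, broadcast: %.6f s" % (loop_time, vec_time))

# Sanity check: both versions agree on the upper triangle.
same_upper = np.allclose(np.triu(combinatorics(big_ary)), np.triu(sums(big_ary)))
```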
