Fastest way to populate a matrix with a function on pairs of elements in two numpy vectors?


Question

I have two 1-dimensional numpy vectors, va and vb, which are used to populate a matrix by passing every pair combination to a function.

import numpy as np

# foo is the pairwise function; va and vb are the two 1-D input vectors
na = len(va)
nb = len(vb)
D = np.zeros((na, nb))
for i in range(na):
    for j in range(nb):
        D[i, j] = foo(va[i], vb[j])

As it stands, this code takes a very long time to run because va and vb are relatively large (4626 and 737 elements). However, I am hoping it can be improved, since a similar procedure is performed by the cdist method from scipy with very good performance.

D = cdist(va, vb, metric)

I am aware that scipy has the benefit of running this code in C rather than in Python, but I'm hoping there is some numpy function I'm unaware of that can do this quickly.

Recommended answer

One of the least-known numpy functions in what the docs call functional programming routines is np.frompyfunc. It creates a numpy ufunc from a Python function: not some other object that closely simulates a numpy ufunc, but a proper ufunc with all its bells and whistles. While its behavior is in many respects very similar to np.vectorize, it has some distinct advantages, which the following code should highlight:

In [2]: def f(a, b):
   ...:     return a + b
   ...:

In [3]: f_vec = np.vectorize(f)

In [4]: f_ufunc = np.frompyfunc(f, 2, 1)  # 2 inputs, 1 output

In [5]: a = np.random.rand(1000)

In [6]: b = np.random.rand(2000)

In [7]: %timeit np.add.outer(a, b)  # a baseline for comparison
100 loops, best of 3: 9.89 ms per loop

In [8]: %timeit f_vec(a[:, None], b)  # 50x slower than np.add
1 loops, best of 3: 488 ms per loop

In [9]: %timeit f_ufunc(a[:, None], b)  # ~15% faster than np.vectorize...
1 loops, best of 3: 425 ms per loop

In [10]: %timeit f_ufunc.outer(a, b)  # ...and you get to use ufunc methods
1 loops, best of 3: 427 ms per loop

So while it is still clearly inferior to a properly vectorized implementation, it is a little faster than np.vectorize: the looping is done in C, but you still pay the Python function call overhead for every pair of elements.
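Applied to the matrix from the question, the approach might look like the sketch below. The function foo here is a hypothetical stand-in (absolute difference), since the question does not show the real one, and the vector sizes are kept small for illustration. Note that a ufunc created by np.frompyfunc returns object-dtype arrays, so the result is cast back to float. The last variant, plain broadcasting, is only available when foo can be written with numpy operations, which is an assumption about your function.

```python
import numpy as np


def foo(a, b):
    # Hypothetical stand-in for the question's pairwise function.
    return abs(a - b)


rng = np.random.default_rng(0)
va = rng.random(50)  # small illustrative sizes, not the 4626 and 737 from the question
vb = rng.random(30)

# Reference result: the nested Python loop from the question.
D_loop = np.zeros((len(va), len(vb)))
for i in range(len(va)):
    for j in range(len(vb)):
        D_loop[i, j] = foo(va[i], vb[j])

# np.frompyfunc builds a ufunc with 2 inputs and 1 output.
foo_ufunc = np.frompyfunc(foo, 2, 1)

# .outer evaluates foo on every (va[i], vb[j]) pair; the raw result has
# dtype=object, so convert it to a float array.
D_ufunc = foo_ufunc.outer(va, vb).astype(float)

# Broadcasting version: possible only because this particular foo
# is expressible with numpy operations.
D_bcast = np.abs(va[:, None] - vb[None, :])

assert np.allclose(D_loop, D_ufunc)
assert np.allclose(D_loop, D_bcast)
```

If your function can be rewritten in the broadcasting style, that is likely the better option; the frompyfunc version is a fallback for functions that only work on scalars.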
