Is it possible to speed up this loop in Python?
Question
The usual ways to map a function over a numpy.ndarray, such as np.array(list(map(some_func, x))) or np.vectorize(f)(x), don't give the function access to the index.
The following code is a simple example of a pattern that shows up in many applications.
dis_mat = np.zeros([feature_mat.shape[0], feature_mat.shape[0]])
for i in range(feature_mat.shape[0]):
    for j in range(i, feature_mat.shape[0]):
        dis_mat[i, j] = np.linalg.norm(
            feature_mat[i, :] - feature_mat[j, :]
        )
        dis_mat[j, i] = dis_mat[i, j]
Is there any way to speed this up?
Thank you for your help! The fastest way to speed up this code turned out to be the function that @user2357112 mentioned in the comments:
from scipy.spatial.distance import pdist,squareform
dis_mat = squareform(pdist(feature_mat))
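As a small sanity check on the two lines above, the sketch below compares the pdist/squareform result with the original double loop on toy data. The array size and the random seed are assumptions chosen only for illustration. pdist (Euclidean by default) returns a condensed vector with one entry per unordered pair of rows, and squareform expands it into the full symmetric matrix.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Toy data: 5 samples with 3 features each (sizes are arbitrary).
rng = np.random.default_rng(0)
feature_mat = rng.random((5, 3))
n = feature_mat.shape[0]

# Condensed form: one distance per unordered pair, n * (n - 1) / 2 entries.
condensed = pdist(feature_mat)
# Full symmetric n x n matrix with zeros on the diagonal.
dis_mat = squareform(condensed)

# Reference result from the double loop in the question.
expected = np.zeros((n, n))
for i in range(n):
    for j in range(i, n):
        expected[i, j] = np.linalg.norm(feature_mat[i, :] - feature_mat[j, :])
        expected[j, i] = expected[i, j]

print(condensed.shape, dis_mat.shape)
print(np.allclose(dis_mat, expected))
```

If only the pairwise distances are needed and not the square layout, keeping the condensed vector avoids storing every distance twice.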
@Julien's broadcasting method is also good when feature_mat is small, but when feature_mat is 1000 by 2000 it needs nearly 40 GB of memory, because broadcasting materializes a 1000 × 1000 × 2000 intermediate array.
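A back-of-the-envelope estimate shows why memory grows so quickly for a 1000 by 2000 input. The helper names below are hypothetical, and float64 (8 bytes per element) is assumed. The estimate covers only the broadcast difference array. Temporaries created while computing the norm add to it, so the real peak can be noticeably higher and depends on the NumPy version.

```python
def broadcast_diff_bytes(n_rows, n_cols, itemsize=8):
    """Bytes held by the (n_rows, n_rows, n_cols) difference array.

    Hypothetical helper; itemsize=8 assumes float64 data.
    """
    return n_rows * n_rows * n_cols * itemsize


def condensed_bytes(n_rows, itemsize=8):
    """Bytes held by a condensed distance vector (one entry per pair)."""
    return n_rows * (n_rows - 1) // 2 * itemsize


diff_gb = broadcast_diff_bytes(1000, 2000) / 1e9
condensed_mb = condensed_bytes(1000) / 1e6

print(f"broadcast difference array: {diff_gb:.1f} GB")   # 16.0 GB
print(f"condensed distance vector:  {condensed_mb:.1f} MB")  # 4.0 MB
```

The cost of the broadcast intermediate scales with the square of the number of rows times the number of columns, while the final distances only scale with the square of the number of rows.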
Recommended answer
You can create a new axis and broadcast:
dis_mat = np.linalg.norm(feature_mat[:,None] - feature_mat, axis=-1)
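To make the broadcasting step easier to follow, the sketch below splits the one-liner into its intermediate arrays and prints their shapes. The toy size (4 rows, 3 columns) and the seed are assumptions for illustration only.

```python
import numpy as np

# Toy data: 4 samples with 3 features each (sizes are arbitrary).
rng = np.random.default_rng(0)
feature_mat = rng.random((4, 3))

# Indexing with None inserts a new axis: (4, 3) -> (4, 1, 3).
expanded = feature_mat[:, None]

# Broadcasting (4, 1, 3) against (4, 3) yields every row-pair difference:
# diff[i, j] == feature_mat[i] - feature_mat[j], with shape (4, 4, 3).
diff = expanded - feature_mat

# Taking the norm over the last axis collapses the feature dimension.
dis_mat = np.linalg.norm(diff, axis=-1)

print(expanded.shape, diff.shape, dis_mat.shape)
```

The intermediate diff array holds one full feature vector per pair of rows, which is why this approach is fast for small inputs but memory hungry for large ones.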
Timings:
import numpy as np

feature_mat = np.random.rand(100, 200)

# Original double loop over the upper triangle.
def a():
    dis_mat = np.zeros([feature_mat.shape[0], feature_mat.shape[0]])
    for i in range(feature_mat.shape[0]):
        for j in range(i, feature_mat.shape[0]):
            dis_mat[i, j] = np.linalg.norm(
                feature_mat[i, :] - feature_mat[j, :]
            )
            dis_mat[j, i] = dis_mat[i, j]

# Broadcasting version.
def b():
    dis_mat = np.linalg.norm(feature_mat[:, None] - feature_mat, axis=-1)
%timeit a()
100 loops, best of 3: 20.5 ms per loop
%timeit b()
100 loops, best of 3: 11.8 ms per loop
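The %timeit lines above are IPython magics. Outside IPython, a comparable measurement can be sketched with the standard timeit module, as below. The function names, the smaller array size, and the repeat count are assumptions chosen to keep the script quick. Absolute numbers will differ by machine, so no expected timings are given.

```python
import timeit

import numpy as np

# Smaller than the 100 x 200 array above so the script finishes quickly.
rng = np.random.default_rng(0)
feature_mat = rng.random((50, 20))


def loop_version():
    """Double loop from the question, returning the distance matrix."""
    n = feature_mat.shape[0]
    dis_mat = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            dis_mat[i, j] = np.linalg.norm(feature_mat[i, :] - feature_mat[j, :])
            dis_mat[j, i] = dis_mat[i, j]
    return dis_mat


def broadcast_version():
    """Broadcasting one-liner from the answer."""
    return np.linalg.norm(feature_mat[:, None] - feature_mat, axis=-1)


# Total seconds for 10 calls of each function.
t_loop = timeit.timeit(loop_version, number=10)
t_broadcast = timeit.timeit(broadcast_version, number=10)

print(f"loop:      {t_loop / 10 * 1e3:.2f} ms per call")
print(f"broadcast: {t_broadcast / 10 * 1e3:.2f} ms per call")
```

Returning the matrix from each function also makes it easy to confirm that both versions agree before comparing their speed.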