Vectorizing the outer loop of a Euclidean distance computation using NumPy on multi-dimensional data
Question
I have a 2D matrix of values. Each row is a data point.
import numpy as np

data = np.array(
    [[2, 2, 3],
     [4, 2, 4],
     [1, 1, 4]])
Now, if my test point is a single 1D NumPy array like:
test = np.array([2,3,3])
I can do something simple like np.sqrt(np.sum((test-data)**2,axis=1))
to calculate the distance of the test point relative to all three data points.
However, if test is itself a 2D array of points to be tested, the above doesn't work, and I have been using something like:
test = np.array([[2,3,3],[4,1,2]])
for i in range(len(test)):
    print(np.sqrt(np.sum((test[i]-data)**2, axis=1)))

[ 1. 2.44948974 2.44948974]
[ 2.44948974 2.23606798 3.60555128]
This computes the distance from each point in my test set to every point in the data set. It seems like there should be a way to vectorize the whole operation, so that I get back a (2,3) matrix of the corresponding distances without the outer for loop.
(Note: While this particular example is about Euclidean distance, I often run into similar operations where I would like to combine every element of one matrix with each individual element of another matrix, so I'm hoping there's a generalized way to set up problems of this nature using NumPy.)
Recommended answer
Use broadcasting to do this:
from numpy.linalg import norm
norm(data-test[:,None],axis=2)
which gives:
[ 1. 2.44948974 2.44948974]
[ 2.44948974 2.23606798 3.60555128]
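As a quick sanity check, the broadcast expression can be compared against the loop from the question. This is only a sketch that reuses the question's data and test arrays; the names looped, vectorized and same are introduced here for illustration.

```python
import numpy as np
from numpy.linalg import norm

data = np.array([[2, 2, 3],
                 [4, 2, 4],
                 [1, 1, 4]])
test = np.array([[2, 3, 3],
                 [4, 1, 2]])

# Loop version from the question: one row of distances per test point.
looped = np.array([np.sqrt(np.sum((t - data) ** 2, axis=1)) for t in test])

# Broadcast version: (3, 3) minus (2, 1, 3) gives (2, 3, 3),
# and the norm over axis 2 collapses the coordinate axis.
vectorized = norm(data - test[:, None], axis=2)

# True when both approaches agree up to floating-point tolerance.
same = np.allclose(looped, vectorized)
print(vectorized.shape, same)
```

Row i of the result holds the distances from test[i] to every row of data, matching the order in which the loop printed them.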
Some explanation. It is easier to understand with two sets of different sizes, for example four points and two points:
ens1 = np.array(
[[2, 2, 3],
[4, 2, 4],
[1, 1, 4],
[2, 4, 5]])
ens2 = np.array([[2,3,3],
[4,1,2]])
In [16]: ens1.shape
Out[16]: (4, 3)
In [17]: ens2.shape
Out[17]: (2, 3)
Then:
In [21]: ens2[:,None].shape
Out[21]: (2, 1, 3)
This adds a new dimension. Now we can perform the 2 × 4 = 8 subtractions:
In [22]: (ens1-ens2[:,None]).shape
Out[22]: (2, 4, 3)
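To see why the result has shape (2, 4, 3), it may help to check how the two shapes line up and what each entry of the difference contains. A minimal sketch using the same ens1 and ens2; result_shape, diff and pair_ok are illustrative names, not part of the original answer.

```python
import numpy as np

ens1 = np.array([[2, 2, 3],
                 [4, 2, 4],
                 [1, 1, 4],
                 [2, 4, 5]])
ens2 = np.array([[2, 3, 3],
                 [4, 1, 2]])

# Shapes are aligned from the right: (4, 3) against (2, 1, 3).
# The length-1 axis is stretched to 4, and ens1 gains a leading axis of 2.
result_shape = np.broadcast(ens1, ens2[:, None]).shape

diff = ens1 - ens2[:, None]

# diff[i, j] is the coordinate-wise difference ens1[j] - ens2[i].
pair_ok = all(
    np.array_equal(diff[i, j], ens1[j] - ens2[i])
    for i in range(len(ens2))
    for j in range(len(ens1))
)
print(result_shape, pair_ok)
```

So the first axis indexes the points of ens2, the second the points of ens1, and the third the coordinates.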
Finally, take the norm along the last axis, giving 8 distances:
In [23]: norm(ens1-ens2[:,None],axis=2)
Out[23]:
array([[ 1. , 2.44948974, 2.44948974, 2.23606798],
[ 2.44948974, 2.23606798, 3.60555128, 4.69041576]])
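Regarding the more general case raised in the question, the same pattern (insert an axis, combine, then reduce over the last axis) should carry over to other pairwise operations. The sketch below is one possible way to package it; pairwise is a hypothetical helper written for this example, not a NumPy function, and the Manhattan distance is used only as a second illustration.

```python
import numpy as np

def pairwise(points_a, points_b, reduce_fn):
    """Hypothetical helper: reduce every pairwise difference a - b.

    points_a has shape (m, d) and points_b has shape (n, d);
    reduce_fn receives an (m, n, d) array and should return (m, n).
    """
    diff = points_a[:, None, :] - points_b[None, :, :]  # shape (m, n, d)
    return reduce_fn(diff)

ens1 = np.array([[2, 2, 3],
                 [4, 2, 4],
                 [1, 1, 4],
                 [2, 4, 5]])
ens2 = np.array([[2, 3, 3],
                 [4, 1, 2]])

# Euclidean distance: square root of the summed squared differences.
euclidean = pairwise(ens2, ens1, lambda d: np.sqrt((d ** 2).sum(axis=-1)))

# Manhattan distance: sum of absolute differences, same broadcasting pattern.
manhattan = pairwise(ens2, ens1, lambda d: np.abs(d).sum(axis=-1))

print(euclidean.shape)
print(manhattan)
```

Note that this builds the full (m, n, d) intermediate array, so memory use grows with the product of both set sizes and the number of coordinates.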