Vectorizing the outer loop of a Euclidean distance calculation using NumPy on multi-dimensional data

Question

I have a 2D matrix of values. Each row is a data point.

import numpy as np

data = np.array(
   [[2, 2, 3],
    [4, 2, 4],
    [1, 1, 4]])

Now if my test point is a single 1D numpy array like:

test = np.array([2,3,3])

I can do something simple like np.sqrt(np.sum((test-data)**2,axis=1)) to calculate the distance of the test point relative to all three data points.

However, if test is itself a 2D array of points to be tested, the above doesn't work, and I have been using something like:

test = np.array([[2,3,3],[4,1,2]])
for i in range(len(test)):
    # distances from test point i to every row of data
    print(np.sqrt(np.sum((test[i]-data)**2,axis=1)))

[ 1.          2.44948974  2.44948974]
[ 2.44948974  2.23606798  3.60555128]

This calculates the distance from each point in my test set to all the points in the data set. It seems like there should be a way to vectorize this whole operation so that I get a (2,3) matrix of corresponding distances back without the outer for loop.

(Note: While this particular example is about Euclidean Distance, I find myself with similar type operations where I would like to perform an operation on all elements of one matrix with the individual elements of another matrix, so I'm hoping there's a generalized way to set up problems of this nature using Numpy.)

Recommended answer

Use broadcasting to do this:

from numpy.linalg import norm
norm(data-test[:,None],axis=2)

[ 1.          2.44948974  2.44948974]
[ 2.44948974  2.23606798  3.60555128]
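
As a quick sanity check, the broadcast expression can be compared against the explicit loop from the question. This is a minimal sketch; the names looped and vectorized are introduced here for illustration only and do not appear in the original post.

```python
import numpy as np
from numpy.linalg import norm

data = np.array([[2, 2, 3],
                 [4, 2, 4],
                 [1, 1, 4]])
test = np.array([[2, 3, 3],
                 [4, 1, 2]])

# Explicit outer loop from the question: one row of distances per test point.
looped = np.array([np.sqrt(np.sum((t - data) ** 2, axis=1)) for t in test])

# Broadcast version: (3, 3) - (2, 1, 3) -> (2, 3, 3), then the norm over the last axis.
vectorized = norm(data - test[:, None], axis=2)

print(vectorized.shape)                 # (2, 3)
print(np.allclose(looped, vectorized))  # True
```

Both approaches compute the same six distances; the broadcast form simply moves the loop over test points into NumPy.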

Some explanation. It is easier to understand when the two arrays have different shapes, for example four points and two points:

ens1 = np.array(
   [[2, 2, 3],
    [4, 2, 4],
    [1, 1, 4],
    [2, 4, 5]])


ens2 = np.array([[2,3,3],
                 [4,1,2]])  


In [16]: ens1.shape
Out[16]: (4, 3)

In [17]: ens2.shape
Out[17]: (2, 3)   

Then:

In [21]: ens2[:,None].shape 
Out[21]: (2, 1, 3) 

Indexing with None adds a new axis of length 1. Broadcasting that (2, 1, 3) array against ens1, which has shape (4, 3), then performs the 2 x 4 = 8 row subtractions:

In [22]: (ens1-ens2[:,None]).shape
Out[22]: (2, 4, 3)       

Finally, take the norm along the last axis to get the 8 distances:

In [23]: norm(ens1-ens2[:,None],axis=2)
Out[23]: 
array([[ 1.        ,  2.44948974,  2.44948974,  2.23606798],
       [ 2.44948974,  2.23606798,  3.60555128,  4.69041576]])     
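
The question also asks whether this pattern generalizes to other pairwise operations. One possible sketch, under the assumption that the operation can be written as an elementwise function followed by a reduction over the last axis, is shown below. The helper pairwise and its parameters op and reduce are hypothetical names introduced here; they are not part of NumPy or of the original answer.

```python
import numpy as np

def pairwise(a, b, op, reduce):
    """Apply op to every (row of a, row of b) pair, then reduce the last axis.

    Hypothetical helper for illustration only; not a NumPy function.
    """
    # a[:, None, :] has shape (m, 1, d) and b has shape (n, d),
    # so op broadcasts to an intermediate array of shape (m, n, d).
    return reduce(op(a[:, None, :], b), axis=-1)

ens1 = np.array([[2, 2, 3],
                 [4, 2, 4],
                 [1, 1, 4],
                 [2, 4, 5]])
ens2 = np.array([[2, 3, 3],
                 [4, 1, 2]])

# Euclidean distance: sum of squared differences, then a square root.
euclidean = np.sqrt(pairwise(ens2, ens1, lambda x, y: (x - y) ** 2, np.sum))

# Manhattan distance: sum of absolute differences.
manhattan = pairwise(ens2, ens1, lambda x, y: np.abs(x - y), np.sum)

# Chebyshev distance: largest absolute difference.
chebyshev = pairwise(ens2, ens1, lambda x, y: np.abs(x - y), np.max)

print(euclidean.shape)  # (2, 4)
print(manhattan)
print(chebyshev)
```

Note that the intermediate array has shape (m, n, d), so memory use grows with the product of the two point counts; for large inputs it may be worth processing the test points in chunks.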
