Efficiently Calculating a Euclidean Distance Matrix Using Numpy

Question

I have a set of points in 2-dimensional space and need to calculate the distance from each point to each other point.

I have a relatively small number of points, maybe at most 100. But since I need to do it often and rapidly in order to determine the relationships between these moving points, and since I'm aware that iterating through the points could be as bad as O(n^2) complexity, I'm looking for ways to take advantage of numpy's matrix magic (or scipy).

As it stands in my code, the coordinates of each object are stored in its class. However, I could also update them in a numpy array when I update the class coordinate.

class Cell(object):
    """Represents one object in the field."""
    def __init__(self,id,x=0,y=0):
        self.m_id = id
        self.m_x = x
        self.m_y = y

It occurs to me to create a Euclidean distance matrix to prevent duplication, but perhaps you have a cleverer data structure.

I'm open to pointers to nifty algorithms as well.

Also, I note that there are similar questions dealing with Euclidean distance and numpy but didn't find any that directly address this question of efficiently populating a full distance matrix.

Recommended answer

You can take advantage of the complex type:

import numpy as np

# build a complex array from your cells (x as the real part, y as the imaginary part)
z = np.array([complex(c.m_x, c.m_y) for c in cells])
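
The trick works because the absolute value of a complex number x + yj is sqrt(x^2 + y^2), so the modulus of the difference of two complex numbers is the Euclidean distance between the corresponding points. A minimal sketch of that idea, assuming a Cell class like the one in the question and three made-up sample cells (the sample coordinates and the names d_complex and d_hypot are illustrative only):

```python
import math
import numpy as np

class Cell(object):
    """Represents one object in the field (same layout as in the question)."""
    def __init__(self, id, x=0, y=0):
        self.m_id = id
        self.m_x = x
        self.m_y = y

# hypothetical sample data, for illustration only
cells = [Cell(0, 0, 0), Cell(1, 2, 1), Cell(2, -1, 4)]

# one complex number per cell: real part = x, imaginary part = y
z = np.array([complex(c.m_x, c.m_y) for c in cells])

# distance between cell 0 and cell 1, computed two ways
d_complex = abs(z[1] - z[0])
d_hypot = math.hypot(cells[1].m_x - cells[0].m_x,
                     cells[1].m_y - cells[0].m_y)
```

Both values should agree up to floating-point rounding, which is what lets abs() stand in for an explicit square-root-of-squares computation.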

First solution

# mesh this array so that you will have all combinations
m, n = np.meshgrid(z, z)
# get the distance via the norm
out = abs(m-n)

Second solution

Meshing is the main idea, but thanks to NumPy broadcasting you don't have to build m and n explicitly. Just subtract z from a version of itself with an extra trailing axis (in effect, z as a column vector); the mesh is formed automatically:

out = abs(z[..., np.newaxis] - z)
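
What makes this work is broadcasting: z[..., np.newaxis] has shape (n, 1) and z has shape (n,), so the subtraction expands both operands to shape (n, n). A small sketch with made-up points that inspects the shapes and compares the result against the explicit meshgrid version (the names col and out_mesh are illustrative):

```python
import numpy as np

# hypothetical sample points, encoded as complex numbers (x + yj)
z = np.array([0 + 0j, 2 + 1j, -1 + 4j])

col = z[..., np.newaxis]      # shape (3, 1): one point per row
out = abs(col - z)            # broadcasts to shape (3, 3)

# same distances via the explicit mesh from the first solution
m, n = np.meshgrid(z, z)
out_mesh = abs(m - n)

print(col.shape, out.shape)
print(np.allclose(out, out_mesh))
```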

Third solution

And if z is created as a 2-dimensional array (a single row) in the first place, you can use z.T instead of the more awkward z[..., np.newaxis]. Your final code then looks like this:

z = np.array([[complex(c.m_x, c.m_y) for c in cells]]) # notice the [[ ... ]]
out = abs(z.T-z)

Example

>>> z = np.array([[0.+0.j, 2.+1.j, -1.+4.j]])
>>> abs(z.T-z)
array([[ 0.        ,  2.23606798,  4.12310563],
       [ 2.23606798,  0.        ,  4.24264069],
       [ 4.12310563,  4.24264069,  0.        ]])

In addition, since the matrix is symmetric, you may want to drop the duplicate entries afterwards by keeping only the upper triangle:

>>> np.triu(out)
array([[ 0.        ,  2.23606798,  4.12310563],
       [ 0.        ,  0.        ,  4.24264069],
       [ 0.        ,  0.        ,  0.        ]])
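
np.triu keeps the full square shape and just zeroes the lower triangle. If a flat list of unique pairs is more convenient, one option (not part of the original answer) is np.triu_indices with k=1, which returns the row and column indices strictly above the diagonal. A sketch using the same three points as the example; the name pair_distances is illustrative:

```python
import numpy as np

z = np.array([[0 + 0j, 2 + 1j, -1 + 4j]])
out = abs(z.T - z)

# indices (i, j) with i < j, so each unordered pair appears exactly once
i, j = np.triu_indices(out.shape[0], k=1)
pair_distances = out[i, j]    # one distance per pair: (0,1), (0,2), (1,2)

for a, b, d in zip(i, j, pair_distances):
    print(a, b, d)
```

This also skips the zero diagonal, so the result holds n*(n-1)/2 distances instead of n*n entries.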

Some benchmarks

>>> timeit.timeit('abs(z.T-z)', setup='import numpy as np;z = np.array([[0.+0.j, 2.+1.j, -1.+4.j]])')
4.645645342274779
>>> timeit.timeit('abs(z[..., np.newaxis] - z)', setup='import numpy as np;z = np.array([0.+0.j, 2.+1.j, -1.+4.j])')
5.049334864854522
>>> timeit.timeit('m, n = np.meshgrid(z, z); abs(m-n)', setup='import numpy as np;z = np.array([0.+0.j, 2.+1.j, -1.+4.j])')
22.489568296184686
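
These timings are for three points with timeit's default of 1,000,000 runs, so per-call overhead probably dominates them. To compare the variants at the size mentioned in the question, a sketch like the following could be used. The 100 random points, the seed, the run count, and the helper names are arbitrary choices, and no timings are quoted here because they depend on the machine:

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed
xy = rng.random((100, 2))             # 100 hypothetical points in the unit square
z = xy[:, 0] + 1j * xy[:, 1]          # 1-D complex array, shape (100,)
z2 = z[np.newaxis, :]                 # 2-D row vector, shape (1, 100)

def via_transpose():
    return abs(z2.T - z2)

def via_newaxis():
    return abs(z[..., np.newaxis] - z)

def via_meshgrid():
    m, n = np.meshgrid(z, z)
    return abs(m - n)

variants = {
    "transpose": via_transpose,
    "newaxis": via_newaxis,
    "meshgrid": via_meshgrid,
}

# total time for 1000 calls of each variant
timings = {name: timeit.timeit(fn, number=1000) for name, fn in variants.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s per 1000 calls")
```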
