scipy cdist or pdist on arrays of complex numbers


Question

Computing the Euclidean distance between two complex numbers with scipy.spatial.distance.euclidean works:

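import numpy
import scipy.spatial.distance
# Two points on the unit circle, built with the built-in complex type
z1 = complex(numpy.cos(0), numpy.sin(0))
z2 = complex(numpy.cos(3*numpy.pi/2), numpy.sin(3*numpy.pi/2))
# Wrapped in one-element lists, since euclidean operates on 1-D vectors
print(scipy.spatial.distance.euclidean([z1], [z2]))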

gives:

1.4142135623730951

However, computing the pairwise distance matrix, or the distance between each pair of points from two input arrays, does not work:

A = numpy.random.uniform(size=(5,1)) + numpy.random.uniform(size=(5,1))*1j
print(scipy.spatial.distance.pdist(A))

This returns a warning and the distances between the real parts only:

lib/python2.7/site-packages/scipy/spatial/distance.py:107: ComplexWarning: Casting complex values to real discards the imaginary part
X = X.astype(np.double)
array([ 0.78016544,  0.66201108,  0.8330932 ,  0.54355982,  0.11815436,
        0.05292776,  0.23660562,  0.17108212,  0.11845125,  0.28953338])

The same happens with scipy.spatial.distance.cdist(A, A).

Is it possible to compute the pairwise distance matrix, or the distance between each pair of points from the two input arrays, using cdist or pdist? A for loop over scipy.spatial.distance.euclidean is too slow for my problem.

Recommended answer

The Euclidean norm of a complex number is defined as its modulus, so the distance between two complex numbers can be defined as the modulus of their difference.
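As a quick check of this definition, here is a minimal sketch that uses the two numbers from the question and only Python built-ins (the variable name d is chosen for illustration):

```python
import math

# The two complex numbers from the question, built with the built-in complex type.
z1 = complex(math.cos(0), math.sin(0))
z2 = complex(math.cos(3 * math.pi / 2), math.sin(3 * math.pi / 2))

# Distance between two complex numbers = modulus of their difference.
d = abs(z1 - z2)
print(d)  # approximately sqrt(2)
```

The built-in abs returns the modulus of a complex number, so no SciPy call is needed for a single pair.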

The warning appears because pdist and cdist are designed for N-dimensional real-valued spaces: they cast the input to double, which discards the imaginary part. Extending them to complex coordinates is not straightforward. (How should several dimensions, each holding a complex number, be handled? For real scalars it is easy, but for complex values there are a few options.)

Given two collections of points:

A = numpy.random.uniform(size=(5)) + numpy.random.uniform(size=(5))*1j
B = numpy.random.uniform(size=(5)) + numpy.random.uniform(size=(5))*1j

the distance between each point of A and each point of B can be calculated as

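# Repeat A along columns and B along rows so both become (A.size, B.size)
MA = numpy.tile(A[:, numpy.newaxis], B.size)
MB = numpy.tile(B[:, numpy.newaxis], A.size)
# Modulus of the element-wise difference: dist[i][j] = |A[i] - B[j]|
dist = numpy.abs(MA - MB.T)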

For example, dist[2][3] then holds the distance between the third point of collection A and the fourth point of collection B.

This is very efficient, and even more so if done in a single step with broadcasting, as @ali_m suggests in the comments:

dist = numpy.abs(A[:, None] - B[None, :])
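To see that the broadcasting expression really produces every pairwise distance, the following sketch compares it against an explicit double loop. The array sizes (4 and 6), the seed, and the name ref are arbitrary choices made for this illustration, not part of the original answer.

```python
import numpy

rng = numpy.random.default_rng(0)  # seed chosen arbitrarily for repeatability
A = rng.uniform(size=4) + rng.uniform(size=4) * 1j
B = rng.uniform(size=6) + rng.uniform(size=6) * 1j

# Broadcasting: shapes (4, 1) and (1, 6) combine into a (4, 6) result.
dist = numpy.abs(A[:, None] - B[None, :])

# Reference computed with explicit loops, one pair at a time.
ref = numpy.empty((A.size, B.size))
for i in range(A.size):
    for j in range(B.size):
        ref[i, j] = abs(A[i] - B[j])

print(dist.shape)                 # (4, 6)
print(numpy.allclose(dist, ref))  # True
```

Because the two collections have different sizes here, the check also shows that the broadcasting form does not require A and B to be the same length.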

If you only want the pairwise distance matrix of a single collection A, substitute A for B in the code above. The matrix dist will be symmetric, with zeros on the diagonal. This means you perform about twice as many operations as a loop over unique pairs would, and you use about twice the memory strictly required. It is still likely to be faster than a solution with a loop, since the loop would iterate over pairs of numbers in Python.
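If the memory overhead of the full symmetric matrix matters, one alternative that is not part of the original answer is to treat each complex number as a point (real, imag) in the plane and pass that real-valued array to pdist, which returns only the unique pairs. A minimal sketch, assuming a 1-D complex array A and names (X, condensed, full) chosen for illustration:

```python
import numpy
from scipy.spatial.distance import pdist, squareform

rng = numpy.random.default_rng(0)  # seed chosen arbitrarily
A = rng.uniform(size=5) + rng.uniform(size=5) * 1j

# View each complex number as a 2-D real point (real part, imaginary part).
X = numpy.column_stack((A.real, A.imag))  # shape (5, 2)

# Condensed distances: one entry per unique pair, n*(n-1)/2 values.
condensed = pdist(X)

# Expand to the full symmetric matrix only if it is actually needed.
full = squareform(condensed)

# Same result as the modulus of the pairwise differences.
expected = numpy.abs(A[:, None] - A[None, :])
print(numpy.allclose(full, expected))  # True
```

This works because the Euclidean distance between (a, b) and (c, d) equals the modulus of (a + bj) - (c + dj). The same column-stacked arrays could be passed to cdist for two different collections.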
