Data type problem using scipy.spatial

Question

I want to use scipy.spatial's KDTree to find nearest-neighbor pairs in a two-dimensional array (essentially a list of lists where each nested list has length 2). I generate my list of lists, pass it to numpy's array, and then create the KDTree instance. However, whenever I try to run "query" on it, I get answers I don't expect. For example, when I type:

import numpy as np
from scipy.spatial import KDTree
tree = KDTree(array)
nearest = tree.query(np.array([1, 1]))

nearest prints out (0.0, 0). Currently, I'm using an array that is basically y = x over the range (1,50), so I expect to get (2,2) as the nearest neighbor of (1,1).

What am I doing wrong, scipy gurus?

Alternatively, if someone can point me to a KDTree package for Python that they've used for nearest-neighbor searches of a given point, I would love to hear about it.

Recommended answer

I have used scipy.spatial before, and it appears to be a nice improvement over scikits.ann, especially with respect to the interface.

In this case I think you have misread what your tree.query(...) call returns. From the scipy.spatial.KDTree.query docs:

Returns
-------

d : array of floats
    The distances to the nearest neighbors.
    If x has shape tuple+(self.m,), then d has shape tuple if
    k is one, or tuple+(k,) if k is larger than one.  Missing
    neighbors are indicated with infinite distances.  If k is None,
    then d is an object array of shape tuple, containing lists
    of distances. In either case the hits are sorted by distance
    (nearest first).
i : array of integers
    The locations of the neighbors in self.data. i is the same
    shape as d.

So in this case, when you query for the nearest point to [1,1], you are getting:

distance to nearest: 0.0
index of nearest in original array: 0

This means that [1,1] is the first row of your original data in array, which is expected given that your data is y = x on the range [1,50].
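To make the return value concrete, here is a minimal sketch. It assumes the data is the set of points (1,1), (2,2), ..., (50,50), which is a guess based on the question's description; the name points is made up for this example.

```python
import numpy as np
from scipy.spatial import KDTree

# Assumed stand-in for the asker's data: (1,1), (2,2), ..., (50,50).
points = np.array([[i, i] for i in range(1, 51)])
tree = KDTree(points)

# query returns a (distance, index) pair, not the neighbour's coordinates.
distance, index = tree.query(np.array([1, 1]))

# Use the index to look the neighbour up in the original data.
nearest_point = points[index]
print(distance, index, nearest_point)
```

The index refers to a row of the array used to build the tree, so indexing that array gives back the coordinates.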

The scipy.spatial.KDTree.query function has lots of other options. For example, if you want to make sure you get the nearest neighbour that isn't the query point itself, try:

tree.query([1,1], k=2)

This will return the two nearest neighbours. You could then apply further logic so that, when the first distance returned is zero (i.e. the queried point is one of the data items used to build the tree), the second nearest neighbour is taken rather than the first.
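One way that logic might look is sketched below. The helper nearest_other and the sample data are hypothetical, not part of scipy, and the sketch assumes the data contains no duplicate points (with duplicates, the second hit could also be at distance zero).

```python
import numpy as np
from scipy.spatial import KDTree

# Assumed sample data: (1,1), (2,2), ..., (50,50).
points = np.array([[i, i] for i in range(1, 51)], dtype=float)
tree = KDTree(points)

def nearest_other(tree, point):
    """Hypothetical helper: nearest neighbour, skipping an exact match of point."""
    distances, indices = tree.query(point, k=2)
    # Hits are sorted nearest first; a zero distance means point is in the tree.
    pick = 1 if distances[0] == 0.0 else 0
    return distances[pick], indices[pick]

dist, idx = nearest_other(tree, [1, 1])
neighbour = points[idx]
print(dist, idx, neighbour)
```

For a query point that is not in the tree, the helper simply returns the first hit.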
