Efficient matching of two arrays (how to use KDTree)

Views: 91  Published: 2020/5/18 20:09:35  Tags: python numpy pandas scipy kdtree

Question

I have two 2D arrays, obs1 and obs2. They represent two independent measurement series; both have dim0 = 2 and slightly different dim1, say obs1.shape = (2, 250000) and obs2.shape = (2, 250050). obs1[0] and obs2[0] hold times, and obs1[1] and obs2[1] hold a spatial coordinate. Both arrays are (more or less) sorted by time. The times and coordinates should be identical between the two measurement series, but in reality they aren't. Also, not every measurement in obs1 has a corresponding value in obs2, and vice versa. Another problem is that there might be a slight offset in the times.

I'm looking for an efficient algorithm that associates the best-matching value from obs2 with each measurement in obs1. Currently, I do it like this:

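import numpy as np

dt = some_maximum_time_difference  # placeholder: set to the largest time gap you accept
dx = 3                             # half-width of the index window searched in obs2
n1, n2 = obs1.shape[1], obs2.shape[1]
matchresults = np.empty(n1, dtype=int)
i = 0
for j in range(n1):
    # Assumed intent: advance i while obs2 still lags obs1 by more than dt
    while i < n2 - 1 and obs1[0, j] - obs2[0, i] > dt:
        i += 1
    # Clip the search window so it stays inside obs2
    lo, hi = max(i - dx, 0), min(i + dx + 1, n2)
    # Within the window, pick the obs2 coordinate closest to obs1[1, j]
    matchresults[j] = lo + np.argmin(np.abs(obs1[1, j] - obs2[1, lo:hi]))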

This yields good results. However, it is extremely slow because it runs in a Python loop.

I would be very thankful for ideas on how to speed up this algorithm, e.g. using a KDTree or something similar.

Recommended answer

Using cKDTree for this case would look like:

from scipy.spatial import cKDTree

# obs2: array with shape (2, m)
# obs1: array with shape (2, n)

kdt = cKDTree(obs2.T)              # build the tree from obs2's (time, coordinate) points
dist, indices = kdt.query(obs1.T)  # nearest obs2 point for every obs1 point

where indices contains, for each observation in obs1, the index of the matching column in obs2, and dist contains the corresponding distances. Note that I had to transpose obs1 and obs2, because cKDTree expects points as rows, i.e. an array of shape (n_points, n_dims).
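Two points from the question are not covered by the plain query call: time and coordinate are in different units, and some observations have no counterpart. A rough sketch of one way to handle both is given here. It is not part of the original answer. The helper name match_observations and its parameters time_scale, coord_scale and max_dist are made up for illustration, and the sample data are synthetic. The tree measures Euclidean distance over (time, coordinate), so each axis is divided by a scale you consider "one unit of mismatch". The distance_upper_bound argument of query marks points with no neighbour inside that radius by returning an infinite distance and an index equal to the number of tree points.

```python
import numpy as np
from scipy.spatial import cKDTree


def match_observations(obs1, obs2, time_scale=1.0, coord_scale=1.0, max_dist=np.inf):
    """For each column of obs1, find the nearest column of obs2.

    Hypothetical helper: returns (dist, idx), where idx is -1 for
    obs1 columns that have no obs2 neighbour within max_dist.
    """
    # Rescale both axes so that time and coordinate are comparable
    scale = np.array([time_scale, coord_scale], dtype=float)
    pts1 = np.asarray(obs1, dtype=float).T / scale  # shape (n, 2)
    pts2 = np.asarray(obs2, dtype=float).T / scale  # shape (m, 2)

    tree = cKDTree(pts2)
    dist, idx = tree.query(pts1, k=1, distance_upper_bound=max_dist)

    # query() reports "nothing within max_dist" as dist == inf, idx == m
    idx = np.where(np.isfinite(dist), idx, -1)
    return dist, idx


# Synthetic data, only to exercise the helper
t = np.arange(500, dtype=float)
x = 50.0 + 10.0 * np.sin(t)
obs2 = np.vstack([t, x])                                # shape (2, 500)

keep = np.arange(0, 500, 2)                             # obs1 sees every 2nd point
shifted = obs2[:, keep] + np.array([[0.01], [-0.02]])   # small time/coordinate offset
stray = np.array([[10_000.0], [0.0]])                   # no counterpart in obs2
obs1 = np.hstack([shifted, stray])                      # shape (2, 251)

dist, idx = match_observations(obs1, obs2, max_dist=0.1)
```

Here idx[:-1] recovers keep, and the stray observation gets -1. With real data, the scales and max_dist have to be chosen from what you know about the measurement errors. A systematic time offset should be subtracted before building the tree.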
