How to ignore implicit zeros with scipy.sparse.csr_matrix.min?

Problem description

I have a list of about 500K points in 3D space. I want to find the two points whose first-nearest-neighbor distance is the largest.

I am using scipy to calculate a sparse distance matrix:

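from scipy.spatial import cKDTree

# points: array of shape (N, 3); the second argument is the leaf size
tree = cKDTree(points, 40)
# keep only pairs closer than 0.01; all other entries stay implicit zeros
spd = tree.sparse_distance_matrix(tree, 0.01)
spo = spd.tocsr()
# drop explicitly stored zeros (the self-distances on the diagonal)
spo.eliminate_zeros()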

I eliminate explicit zeros to deal with the diagonal elements, where the distance between each point and itself is calculated.
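To make the distinction concrete, here is a small illustrative sketch (not from the original question) of the difference between explicit zeros, which are stored, and implicit zeros, which are not. The tiny 2x2 matrix is made up for the example.

```python
from scipy.sparse import csr_matrix

# Hypothetical toy data: three stored entries, one of which is an explicit 0.0.
data = [0.0, 2.0, 3.0]
rows = [0, 0, 1]
cols = [0, 1, 0]
toy = csr_matrix((data, (rows, cols)), shape=(2, 2))

stored_before = toy.nnz   # counts every stored entry, including the explicit zero
toy.eliminate_zeros()     # removes stored zeros in place
stored_after = toy.nnz    # only the non-zero entries remain stored

# Position (1, 1) was never stored: it is an implicit zero both before and after.
```

After eliminate_zeros(), the former explicit zero at (0, 0) becomes an implicit zero like (1, 1), which is why reductions that consider implicit zeros still see it.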

I now want to find the coordinates of the minimum distance in each row/column, which should correspond to the first nearest neighbor of each point, with something like:

spo.argmin(axis=0)

By finding the maximum distance among the elements of this array, I should be able to find the two points with the largest first-nearest-neighbor distance.

The issue is that the min and argmin methods of scipy.sparse.csr_matrix also take the implicit zeros into account, which I do not want for this application. How do I solve this? With a matrix this large, both performance and memory are concerns. Or is there an entirely different approach to what I want to do?
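For reference, one possible way to stay with the sparse matrix is to reduce over only the stored values of each CSR row. The following is a minimal sketch rather than a tested solution for the 500K-point case: the helper name row_min_stored is hypothetical, and it assumes a CSR matrix without duplicate entries from which explicit zeros have already been eliminated. Rows with no stored entries are reported as infinity.

```python
import numpy as np
from scipy.sparse import csr_matrix

def row_min_stored(mat):
    """Per-row minimum over stored entries only (hypothetical helper).

    Rows without any stored entry get np.inf.
    """
    mat = mat.tocsr()
    data = mat.data[:mat.indptr[-1]]
    counts = np.diff(mat.indptr)          # number of stored entries per row
    out = np.full(mat.shape[0], np.inf)
    nonempty = counts > 0
    if data.size:
        # Start offsets of the non-empty rows; consecutive starts delimit
        # exactly one row's stored values, so reduceat yields per-row minima.
        starts = mat.indptr[:-1][nonempty]
        out[nonempty] = np.minimum.reduceat(data, starts)
    return out

# Made-up example: the middle row has no stored entries at all.
demo = csr_matrix(np.array([[0.0, 3.0, 2.0],
                            [0.0, 0.0, 0.0],
                            [5.0, 0.0, 1.0]]))
stored_min = row_min_stored(demo)
```

This avoids densifying the matrix, but it only yields the minimum values, not their column positions, so it does not replace argmin on its own.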

Recommended answer

I didn't find a solution using the distance matrix, but it turns out I had overlooked the most obvious solution: the query method of the tree.

So, to find the maximum distance between first nearest neighbors, I did the following (where vectors is a numpy array of shape (N, 3)):

import numpy as np
from scipy.spatial import cKDTree

def max_nn_distance(vectors, leaf_size):
    tree = cKDTree(vectors, leaf_size)
    # get the indexes of the first nearest neighbor of each vertex
    # we use k=2 because k=1 are the points themselves with distance 0
    nn1 = tree.query(vectors, k=2)[1][:, 1]
    # get the vectors corresponding to those indexes. Basically this is "vectors"
    # reordered so that row i holds the first nearest neighbor of vectors[i].
    nn1_vec = vectors[nn1]
    # the distance between each point and its first nearest neighbor
    nn_dist = np.sqrt(np.sum((vectors - nn1_vec)**2, axis=1))
    # maximum distance
    return np.max(nn_dist)
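Since query already returns the distances alongside the indexes, the same result can probably be obtained without recomputing them, and the indexes of the two points involved can be recovered as well. The following is a sketch under that assumption; the function name farthest_nn_pair and the four sample points are made up for illustration, and the tree uses the default leaf size.

```python
import numpy as np
from scipy.spatial import cKDTree

def farthest_nn_pair(vectors):
    """Return (i, j, d): point i, its first nearest neighbor j, and their
    distance d, for the point whose nearest neighbor is farthest away.
    Hypothetical helper, not part of the original answer."""
    tree = cKDTree(vectors)
    # Column 0 is each point itself (distance 0); column 1 is its nearest neighbor.
    dist, idx = tree.query(vectors, k=2)
    nn_dist = dist[:, 1]
    i = int(np.argmax(nn_dist))
    j = int(idx[i, 1])
    return i, j, float(nn_dist[i])

# Made-up sample: the last point is isolated, and its closest neighbor is point 1.
sample = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [10.0, 0.0, 0.0]])
i, j, d = farthest_nn_pair(sample)
```

Note that if the data contain exact duplicate points, column 0 is not guaranteed to be the query point itself, although the column 1 distance is still 0 in that case.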
