有效计算邻居之间的距离 [英] Calculate distance between neighbors efficiently

查看:134
本文介绍了有效计算邻居之间的距离的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的数据在地理上没有任何图案,因此我需要创建一个图像,其中每个像素的值是该像素的邻居的平均值小于X米.

I have data geographically scattered without any kind of pattern and I need to create an image where the value of each pixel is an average of the neighbors of that pixel that are less than X meters.

为此,我使用库scipy.spatial来生成包含数据(cKDTree)的KDTree.生成数据结构后,我将在地理位置上定位像素并找到最近的地理位置.

For this I use the library scipy.spatial to generate a KDTree with the data (cKDTree). Once the data structure is generated, I locate the pixel geographically and locate the geographic points that are closest.

# Generate scattered data points
coord_cart= [
    [
        feat.geometry().GetY(),
        feat.geometry().GetX(),
        feat.GetField(feature),
    ] for feat in layer
]

# Create KDTree structure
tree = cKDTree(coord_cart)

# Get raster image dimensions
pixel_size = 5
source_layer = shapefile.GetLayer()
x_min, x_max, y_min, y_max = source_layer.GetExtent()

x_res = int((x_max - x_min) / pixel_size)
y_res = int((y_max - y_min) / pixel_size)

# Create grid
x = np.linspace(x_min, x_max, x_res)
y = np.linspace(y_min, y_max, y_res)

X, Y = np.meshgrid(x, y)
grid = np.array(zip(Y.ravel(), X.ravel()))

# Get points that are less than 10 meters away
inds = tree.query_ball_point(grid, 10)

# inds is an np.array of lists of different length, so I need to convert it into an array of n_points x maximum number of neighbors
ll = np.array([len(l) for l in inds])
maxlen = max(ll)
arr = np.zeros((len(ll), maxlen), int)

# I don't know why but inds is an array of list, so I convert it into an array of array to use grid[inds]
# I THINK THIS IS A LITTLE INEFFICIENT
for i in range(len(inds)):
    inds[i].extend([i] * (maxlen - len(inds[i])))
    arr[i] = np.array(inds[i], dtype=int)

# AND THIS DOESN'T WORK
d = np.linalg.norm(grid - grid[inds])

是否有更好的方法可以做到这一点?我正在尝试使用 IDW 在两点之间进行插值.我发现了这个代码段,该函数使用的功能是获取N个最近的点,但是对我不起作用,因为我需要如果半径R中没有点,则像素的值为0.

Is there a better way to do this? I'm trying to use IDW to perform the interpolation between the points. I found this snippet that uses a function that gets the N nearest points but it does not work for me because I need that if there is no point in a radius R, the value of the pixel is 0.

d, inds = tree.query(zip(xt, yt, zt), k = 10)
w = 1.0 / d**2
air_idw = np.sum(w * air.flatten()[inds], axis=1) / np.sum(w, axis=1)
air_idw.shape = lon_curv.shape

提前谢谢!

推荐答案

这可能是KDTree不是一个好的解决方案的情况之一.这是因为您要映射到网格,这是一个非常简单的结构,这意味着KDTree的复杂性没有任何好处.可以通过简单的算法找到最近的网格点和距离.

This may be one of the cases where KDTrees are not a good solution. This is because you are mapping to a grid, which is a very simple structure meaning there is nothing to gain from the KDTree's sophistication. Nearest grid point and distance can be found by simple arithmetic.

下面是一个简单的示例实现.我使用的是高斯内核,但如果您愿意的话,将其更改为IDW应该很简单.

Below is a simple example implementation. I'm using a Gaussian kernel but changing that to IDW if you prefer should be straight-forward.

import numpy as np
from scipy import stats

def rasterize(coords, feature, gu, cutoff, kernel=stats.norm(0, 2.5).pdf):
    # compute overlap (filter size / grid unit)
    ovlp = int(np.ceil(cutoff/gu))

    # compute raster dimensions
    mn, mx = coords.min(axis=0), coords.max(axis=0)
    reso = np.ceil((mx - mn) / gu).astype(int)
    base = (mx + mn - reso * gu) / 2

    # map coordinates to raster, the residual is the distance
    grid_res = coords - base
    grid_coords = np.rint(grid_res / gu).astype(int)
    grid_res -= gu * grid_coords
    # because of overlap we must add neighboring grid points to the nearest
    gcovlp = np.c_[-ovlp:ovlp+1, np.zeros(2*ovlp+1, dtype=int)]
    grid_coords = (gcovlp[:, None, None, :] + gcovlp[None, :, None, ::-1]
                   + grid_coords).reshape(-1, 2)
    # the corresponding residuals have the same offset with opposite sign
    gdovlp = -gu * (gcovlp+1/2)
    grid_res = (gdovlp[:, None, None, :] + gdovlp[None, :, None, ::-1]
                + grid_res).reshape(-1, 2)
    # discard off fov grid points and points outside the cutoff
    valid, = np.where(((grid_coords>=0) & (grid_coords<=reso)).all(axis=1) & (
        np.einsum('ij,ij->i', grid_res, grid_res) <= cutoff*cutoff))
    grid_res = grid_res[valid]
    feature = feature[valid // (2*ovlp+1)**2]
    # flatten grid so we can use bincount
    grid_flat = np.ravel_multi_index(grid_coords[valid].T, reso+1)
    return np.bincount(
        grid_flat,
        feature * kernel(np.sqrt(np.einsum('ij,ij->i', grid_res, grid_res))),
        (reso + 1).prod()).reshape(reso+1)



gu = 5
cutoff = 10
coords = np.random.randn(10_000, 2) * (100, 20)
coords[:, 1] += 80 * np.sin(coords[:, 0] / 40)
feature = np.random.uniform(0, 1000, (10_000,))

from timeit import timeit

print(timeit("rasterize(coords, feature, gu, cutoff)", globals=globals(), number=100)*10, 'ms')

pic = rasterize(coords, feature, gu, cutoff)

import pylab
pylab.pcolor(pic, cmap=pylab.cm.jet)
pylab.colorbar()
pylab.show()

这篇关于有效计算邻居之间的距离的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆