Filling gaps in a numpy array

Problem description

I just want to interpolate, in the simplest possible terms, a 3D dataset. Linear interpolation, nearest neighbour, all that would suffice (this is to start off some algorithm, so no accurate estimate is required).

In newer scipy versions, things like griddata would be useful, but currently I only have scipy 0.8. So I have a "cube" (data[:,:,:], of shape Ni x Nj x Nk) array, and an array of flags (flags[:,:,:], True or False) of the same size. I want to interpolate my data for the elements of data where the corresponding element of flags is False, using e.g. the nearest valid datapoint in data, or some linear combination of "close by" points.

There can be large gaps in the dataset in at least two dimensions. Other than coding a full-blown nearest neighbour algorithm using kdtrees or similar, I can't really find a generic, N-dimensional nearest-neighbour interpolator.

Recommended answer

You can set up a crystal-growth-style algorithm that shifts a view alternately along each axis, replacing only data that is flagged False but has a True neighbor. This gives a "nearest-neighbor"-like result (but not in Euclidean or Manhattan distance; I think it might be nearest-neighbor if you count pixels, treating all pixels that share a corner as connected). This should be fairly efficient with NumPy, as it iterates only over the axes and the convergence passes, not over small slices of the data.

Crude, fast and stable. I think that's what you were after:

import numpy as np
# -- setup --
shape = (10,10,10)
dim = len(shape)
data = np.random.random(shape)
flag = np.zeros(shape, dtype=bool)
t_ct = int(data.size/5)
flag.flat[np.random.randint(0, flag.size, t_ct)] = True
# True flags the data
# -- end setup --

slcs = [slice(None)]*dim

# as long as there are any False's in flag
# (needs at least one True to start from, otherwise this never terminates)
while np.any(~flag):
    for i in range(dim): # do each axis
        # make slices to shift view one element along the axis
        slcs1 = slcs[:]
        slcs2 = slcs[:]
        slcs1[i] = slice(0, -1)
        slcs2[i] = slice(1, None)
        # index with tuples: recent NumPy versions reject a list of slices
        s1, s2 = tuple(slcs1), tuple(slcs2)

        # replace from the right
        repmask = np.logical_and(~flag[s1], flag[s2])
        data[s1][repmask] = data[s2][repmask]
        flag[s1][repmask] = True

        # replace from the left
        repmask = np.logical_and(~flag[s2], flag[s1])
        data[s2][repmask] = data[s1][repmask]
        flag[s2][repmask] = True
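As a point of comparison, if a true Euclidean nearest neighbour is wanted, one alternative (not the method above, and assuming scipy.ndimage is available in your installation) is to let distance_transform_edt return, for every cell, the indices of the closest valid cell. The function name fill_nearest below is made up for illustration; treat this as a sketch rather than a drop-in replacement.

```python
import numpy as np
from scipy import ndimage


def fill_nearest(data, flag):
    """Return a copy of data where entries with flag == False take the value
    of the Euclidean-nearest entry with flag == True (hypothetical helper)."""
    if not flag.any():
        raise ValueError("flag must contain at least one True entry")
    # distance_transform_edt measures the distance to the nearest zero
    # element, so pass ~flag: valid points become the zeros.
    idx = ndimage.distance_transform_edt(
        ~flag, return_distances=False, return_indices=True
    )
    # idx has shape (ndim,) + data.shape; use it as one index array per axis.
    return data[tuple(idx)]


# small demo on the same kind of setup as above
shape = (10, 10, 10)
data = np.random.random(shape)
flag = np.zeros(shape, dtype=bool)
flag.flat[np.random.randint(0, flag.size, data.size // 5)] = True
filled = fill_nearest(data, flag)
```

Unlike the growth loop, this does not modify data in place, and valid points map to themselves, so their values are left unchanged.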

For good measure, here's a visualization (2D) of the zones seeded by the data originally flagged True.
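To reproduce that kind of picture yourself, a rough sketch is to run the same growth loop on integer seed labels instead of data values, so each cell ends up carrying the label of the seed that reached it. The helper name grow_zones, the grid size, and the number of seeds are arbitrary choices for illustration, not part of the original answer.

```python
import numpy as np


def grow_zones(flag):
    """Label every cell with the index (1, 2, ...) of the seed that filled it,
    using the same axis-by-axis growth as above (hypothetical helper)."""
    if not flag.any():
        raise ValueError("flag must contain at least one True seed")
    labels = np.zeros(flag.shape, dtype=int)
    labels[flag] = np.arange(1, flag.sum() + 1)  # one label per seed
    filled = flag.copy()
    base = [slice(None)] * flag.ndim
    while not filled.all():
        for axis in range(flag.ndim):
            lo, hi = base[:], base[:]
            lo[axis] = slice(0, -1)
            hi[axis] = slice(1, None)
            lo, hi = tuple(lo), tuple(hi)
            # fill unfilled cells from their higher-index neighbour
            rep = ~filled[lo] & filled[hi]
            labels[lo][rep] = labels[hi][rep]
            filled[lo][rep] = True
            # fill unfilled cells from their lower-index neighbour
            rep = ~filled[hi] & filled[lo]
            labels[hi][rep] = labels[lo][rep]
            filled[hi][rep] = True
    return labels


rng = np.random.RandomState(0)
flag2d = np.zeros((12, 24), dtype=bool)
flag2d.flat[rng.randint(0, flag2d.size, 6)] = True  # up to 6 seeds
zones = grow_zones(flag2d)

# crude text rendering: one letter per zone
for row in zones:
    print("".join(chr(ord("a") + v - 1) for v in row))
```

If matplotlib is installed, passing zones to an image plot should give a coloured version of the same map.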
