Fill in missing values with nearest neighbour in Python numpy masked arrays?
Problem description
I am working with a 2D Numpy masked_array in Python. I need to change the data values in the masked area such that they equal the nearest unmasked value.
NB. If there is more than one nearest unmasked value, any of them may be used (whichever turns out to be easiest to code).
For example:
import numpy
import numpy.ma as ma
a = numpy.arange(100).reshape(10,10)
fill_value=-99
a[2:4,3:8] = fill_value
a[8,8] = fill_value
a = ma.masked_array(a,a==fill_value)
>>> a
[[0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[20 21 22 -- -- -- -- -- 28 29]
[30 31 32 -- -- -- -- -- 38 39]
[40 41 42 43 44 45 46 47 48 49]
[50 51 52 53 54 55 56 57 58 59]
[60 61 62 63 64 65 66 67 68 69]
[70 71 72 73 74 75 76 77 78 79]
[80 81 82 83 84 85 86 87 -- 89]
[90 91 92 93 94 95 96 97 98 99]]
I need it to look like this:
>>> a.data
[[0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[20 21 22 ? 14 15 16 ? 28 29]
[30 31 32 ? 44 45 46 ? 38 39]
[40 41 42 43 44 45 46 47 48 49]
[50 51 52 53 54 55 56 57 58 59]
[60 61 62 63 64 65 66 67 68 69]
[70 71 72 73 74 75 76 77 78 79]
[80 81 82 83 84 85 86 87 ? 89]
[90 91 92 93 94 95 96 97 98 99]]
NB. where "?" could take any of the adjacent unmasked values.
What is the most efficient way to do this?
Thanks for your help.
Recommended answer
You could use np.roll to make shifted copies of a, then use boolean logic on the masks to identify the spots to be filled in:
import numpy as np
import numpy.ma as ma
a = np.arange(100).reshape(10,10)
fill_value=-99
a[2:4,3:8] = fill_value
a[8,8] = fill_value
a = ma.masked_array(a,a==fill_value)
print(a)
# [[0 1 2 3 4 5 6 7 8 9]
# [10 11 12 13 14 15 16 17 18 19]
# [20 21 22 -- -- -- -- -- 28 29]
# [30 31 32 -- -- -- -- -- 38 39]
# [40 41 42 43 44 45 46 47 48 49]
# [50 51 52 53 54 55 56 57 58 59]
# [60 61 62 63 64 65 66 67 68 69]
# [70 71 72 73 74 75 76 77 78 79]
# [80 81 82 83 84 85 86 87 -- 89]
# [90 91 92 93 94 95 96 97 98 99]]
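for shift in (-1,1):
    for axis in (0,1):
        # Shift the array one step along this axis (np.roll wraps around at the edges).
        a_shifted=np.roll(a,shift=shift,axis=axis)
        # Cells that are masked in a but whose shifted neighbour is unmasked.
        idx=~a_shifted.mask & a.mask
        a[idx]=a_shifted[idx]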
print(a)
# [[0 1 2 3 4 5 6 7 8 9]
# [10 11 12 13 14 15 16 17 18 19]
# [20 21 22 13 14 15 16 28 28 29]
# [30 31 32 43 44 45 46 47 38 39]
# [40 41 42 43 44 45 46 47 48 49]
# [50 51 52 53 54 55 56 57 58 59]
# [60 61 62 63 64 65 66 67 68 69]
# [70 71 72 73 74 75 76 77 78 79]
# [80 81 82 83 84 85 86 87 98 89]
# [90 91 92 93 94 95 96 97 98 99]]
If you'd like to use a larger set of nearest neighbors, you could do something like this:
neighbors=((0,1),(0,-1),(1,0),(-1,0),(1,1),(-1,1),(1,-1),(-1,-1),
(0,2),(0,-2),(2,0),(-2,0))
Note that the order of the elements in neighbors is important. You probably want to fill in missing values with the nearest neighbor, not just any neighbor. There's probably a smarter way to generate the neighbors sequence, but I'm not seeing it at the moment.
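a_copy=a.copy()
for hor_shift,vert_shift in neighbors:
    # Stop as soon as nothing is masked any more.
    if not np.any(a.mask): break
    # Always shift the untouched copy so that filled-in values do not propagate.
    a_shifted=np.roll(a_copy,shift=hor_shift,axis=1)
    a_shifted=np.roll(a_shifted,shift=vert_shift,axis=0)
    idx=~a_shifted.mask & a.mask
    a[idx]=a_shifted[idx]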
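One possible way to generate the neighbors sequence instead of typing it by hand is to enumerate all offsets in a square window and sort them by distance. The sketch below is only an illustration: the helper name neighbor_offsets and its radius parameter are made up here and are not part of the original answer. Unlike the hand-written tuple above, it includes every offset in the square, corners included.

```python
import itertools

def neighbor_offsets(radius):
    """Return (hor_shift, vert_shift) pairs within a square window, nearest first.

    Hypothetical helper, not from the original answer.
    """
    rng = range(-radius, radius + 1)
    # Every offset in the (2*radius+1) x (2*radius+1) window except (0, 0).
    offsets = [(dx, dy) for dx, dy in itertools.product(rng, rng)
               if (dx, dy) != (0, 0)]
    # Sort by squared Euclidean distance; list.sort is stable, so ties keep
    # the order in which itertools.product produced them.
    offsets.sort(key=lambda o: o[0] ** 2 + o[1] ** 2)
    return offsets

neighbors = neighbor_offsets(2)
print(neighbors[:4])   # the four offsets at distance 1 come first
```

The resulting list can be passed to the loop above in place of the hand-written neighbors. Which of several equally distant neighbors wins is decided only by the tie order, which the question says is acceptable.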
Note that np.roll happily rolls the lower edge to the top, so a missing value at the top may be filled in by a value from the very bottom. If this is a problem, I'd have to think more about how to fix it. The obvious but not very clever solution would be to use if statements and feed the edges a different sequence of admissible neighbors...
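One way to sidestep the wrap-around without special-casing the edges might be to roll the data and the mask separately and then mark the rows or columns that np.roll brought in from the opposite edge as masked, so they are never used as a source. This is a sketch under that assumption, not the answer author's method; shift_no_wrap and fill_nearest are hypothetical names introduced here.

```python
import numpy as np
import numpy.ma as ma

def shift_no_wrap(data, mask, hor_shift, vert_shift):
    """Roll data and mask, then mask out the cells that wrapped around."""
    shifted_data = np.roll(np.roll(data, hor_shift, axis=1), vert_shift, axis=0)
    shifted_mask = np.roll(np.roll(mask, hor_shift, axis=1), vert_shift, axis=0)
    # Rows that np.roll brought in from the opposite edge are not real neighbours.
    if vert_shift > 0:
        shifted_mask[:vert_shift, :] = True
    elif vert_shift < 0:
        shifted_mask[vert_shift:, :] = True
    # Same for wrapped columns.
    if hor_shift > 0:
        shifted_mask[:, :hor_shift] = True
    elif hor_shift < 0:
        shifted_mask[:, hor_shift:] = True
    return shifted_data, shifted_mask

def fill_nearest(a, neighbors):
    """Fill masked cells of a 2D masked array from the first valid neighbour offset."""
    orig_data = ma.getdata(a)
    orig_mask = ma.getmaskarray(a)
    data, mask = orig_data.copy(), orig_mask.copy()
    for hor_shift, vert_shift in neighbors:
        if not mask.any():
            break
        s_data, s_mask = shift_no_wrap(orig_data, orig_mask, hor_shift, vert_shift)
        idx = mask & ~s_mask          # still masked here, valid in the shifted copy
        data[idx] = s_data[idx]
        mask[idx] = False
    return ma.masked_array(data, mask=mask)

# A masked cell in the top row: with plain np.roll and vert_shift=1 it would be
# filled from the bottom row (13); here that offset is skipped for the top row.
m = np.zeros((4, 4), dtype=bool)
m[0, 1] = True
a = ma.masked_array(np.arange(16).reshape(4, 4), mask=m)
filled = fill_nearest(a, ((0, 1), (0, -1), (1, 0), (-1, 0)))
print(filled[0, 1])   # 5, taken from the cell directly below
```

Cells that have no unmasked neighbour among the given offsets stay masked, so a longer neighbors sequence may be needed for large masked regions.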