Fill in missing values with nearest neighbour in Python numpy masked arrays?

Question

I am working with a 2D NumPy masked_array in Python. I need to change the data values in the masked area so that they equal the nearest unmasked value.

NB. If there is more than one nearest unmasked value, any of them may be used (whichever turns out to be easiest to code).

For example:

import numpy
import numpy.ma as ma

a = numpy.arange(100).reshape(10,10)
fill_value=-99
a[2:4,3:8] = fill_value
a[8,8] = fill_value
a = ma.masked_array(a,a==fill_value)

>>> a
[[0 1 2 3 4 5 6 7 8 9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 -- -- -- -- -- 28 29]
 [30 31 32 -- -- -- -- -- 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 -- 89]
 [90 91 92 93 94 95 96 97 98 99]]

I need it to look like this:

>>> a.data
[[0 1 2 3 4 5 6 7 8 9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 ? 14 15 16 ? 28 29]
 [30 31 32 ? 44 45 46 ? 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 ? 89]
 [90 91 92 93 94 95 96 97 98 99]]
    

NB. where "?" could take any of the adjacent unmasked values.

What is the most efficient way to do this?

Thanks for your help.

Recommended answer

You could use np.roll to make shifted copies of a, then use boolean logic on the masks to identify the spots to be filled in:

    import numpy as np
    import numpy.ma as ma
    
    a = np.arange(100).reshape(10,10)
    fill_value=-99
    a[2:4,3:8] = fill_value
    a[8,8] = fill_value
    a = ma.masked_array(a,a==fill_value)
    print(a)
    
    # [[0 1 2 3 4 5 6 7 8 9]
    #  [10 11 12 13 14 15 16 17 18 19]
    #  [20 21 22 -- -- -- -- -- 28 29]
    #  [30 31 32 -- -- -- -- -- 38 39]
    #  [40 41 42 43 44 45 46 47 48 49]
    #  [50 51 52 53 54 55 56 57 58 59]
    #  [60 61 62 63 64 65 66 67 68 69]
    #  [70 71 72 73 74 75 76 77 78 79]
    #  [80 81 82 83 84 85 86 87 -- 89]
    #  [90 91 92 93 94 95 96 97 98 99]]
    
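    # Try each of the four direct neighbours (up, down, left, right) in turn.
    for shift in (-1,1):
        for axis in (0,1):
            a_shifted=np.roll(a,shift=shift,axis=axis)
            # Cells still masked in a whose shifted counterpart is unmasked.
            idx=~a_shifted.mask & a.mask
            a[idx]=a_shifted[idx]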
    
    print(a)
    
    # [[0 1 2 3 4 5 6 7 8 9]
    #  [10 11 12 13 14 15 16 17 18 19]
    #  [20 21 22 13 14 15 16 28 28 29]
    #  [30 31 32 43 44 45 46 47 38 39]
    #  [40 41 42 43 44 45 46 47 48 49]
    #  [50 51 52 53 54 55 56 57 58 59]
    #  [60 61 62 63 64 65 66 67 68 69]
    #  [70 71 72 73 74 75 76 77 78 79]
    #  [80 81 82 83 84 85 86 87 98 89]
    #  [90 91 92 93 94 95 96 97 98 99]]
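
If an exact nearest neighbour (in the Euclidean sense) is wanted, rather than whichever adjacent value the shifts happen to reach first, one option is a brute-force distance search in plain NumPy. This is only a sketch and not part of the np.roll approach above: the name fill_nearest_bruteforce is made up for illustration, and it builds a full table of distances between masked and unmasked cells, so it is probably only reasonable for small arrays.

```python
import numpy as np
import numpy.ma as ma


def fill_nearest_bruteforce(a):
    """Return a plain ndarray where each masked cell of `a` takes the value
    of its nearest unmasked cell (hypothetical helper, Euclidean distance)."""
    data = ma.getdata(a).copy()
    mask = ma.getmaskarray(a)
    if not mask.any() or mask.all():
        return data  # nothing to fill, or nothing to fill from

    masked_pts = np.argwhere(mask)    # (M, 2) coordinates of masked cells
    valid_pts = np.argwhere(~mask)    # (V, 2) coordinates of unmasked cells

    # Squared distance from every masked cell to every unmasked cell: (M, V).
    diff = masked_pts[:, None, :] - valid_pts[None, :, :]
    d2 = (diff ** 2).sum(axis=2)

    # For each masked cell pick the closest unmasked cell; on ties argmin
    # keeps the first candidate in row-major order.
    nearest = valid_pts[d2.argmin(axis=1)]
    data[tuple(masked_pts.T)] = data[tuple(nearest.T)]
    return data


a = np.arange(100).reshape(10, 10)
fill_value = -99
a[2:4, 3:8] = fill_value
a[8, 8] = fill_value
a = ma.masked_array(a, a == fill_value)

filled = fill_nearest_bruteforce(a)
print(filled)
```

Because ties are broken arbitrarily, the values at the corners of the hole may differ from the np.roll result, which the question explicitly allows.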
    


If you'd like to use a larger set of nearest neighbors, you could perhaps do something like this:

    neighbors=((0,1),(0,-1),(1,0),(-1,0),(1,1),(-1,1),(1,-1),(-1,-1),
               (0,2),(0,-2),(2,0),(-2,0))
    

Note that the order of the elements in neighbors is important. You probably want to fill in missing values with the nearest neighbor, not just any neighbor. There's probably a smarter way to generate the neighbors sequence, but I'm not seeing it at the moment.

    a_copy=a.copy()
    for hor_shift,vert_shift in neighbors:
        if not np.any(a.mask): break
        a_shifted=np.roll(a_copy,shift=hor_shift,axis=1)
        a_shifted=np.roll(a_shifted,shift=vert_shift,axis=0)
        idx=~a_shifted.mask*a.mask
        a[idx]=a_shifted[idx]
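
One possible way to generate the neighbors sequence instead of typing it out is to enumerate every offset in a square window and sort the offsets by distance from the centre. The sketch below assumes a hypothetical helper called neighbor_offsets with a radius parameter; neither name comes from the answer above. It returns more offsets than the hand-written list (the full window), always nearest first.

```python
import itertools


def neighbor_offsets(radius):
    """List all (row_shift, col_shift) offsets within a square window of the
    given radius, excluding (0, 0), ordered from nearest to farthest."""
    window = range(-radius, radius + 1)
    offsets = [(dy, dx)
               for dy, dx in itertools.product(window, repeat=2)
               if (dy, dx) != (0, 0)]
    # Sort by squared Euclidean distance; the sort is stable, so offsets at
    # the same distance keep their enumeration order.
    offsets.sort(key=lambda o: o[0] ** 2 + o[1] ** 2)
    return offsets


neighbors = neighbor_offsets(2)
print(neighbors[:8])
```

The result can be passed to the loop above in place of the hand-written tuple. Since the window is symmetric, it should not matter much that the loop unpacks each pair as (hor_shift, vert_shift); only the order among equally distant offsets changes.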
    

Note that np.roll happily rolls the lower edge to the top, so a missing value at the top may be filled in by a value from the very bottom. If this is a problem, I'd have to think more about how to fix it. The obvious but not very clever solution would be to use if statements and feed the edges a different sequence of admissible neighbors...
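
One way around the wrap-around, sketched here under some assumptions, is to replace np.roll with a slice-based shift that marks everything shifted in from outside the array as masked, so that edge cells can only be filled from real neighbors. The helpers shift_no_wrap and fill_from_neighbors are hypothetical names, offsets are written as (row_shift, col_shift), and each shift is assumed to be smaller than the array size.

```python
import numpy as np
import numpy.ma as ma


def shift_no_wrap(data, mask, dy, dx):
    """Shift data and mask by (dy, dx) like np.roll would, but treat cells
    that come from outside the array as masked instead of wrapping."""
    h, w = data.shape
    shifted_data = np.zeros_like(data)
    shifted_mask = np.ones((h, w), dtype=bool)  # masked unless copied below
    dst_y, src_y = slice(max(dy, 0), h + min(dy, 0)), slice(max(-dy, 0), h + min(-dy, 0))
    dst_x, src_x = slice(max(dx, 0), w + min(dx, 0)), slice(max(-dx, 0), w + min(-dx, 0))
    shifted_data[dst_y, dst_x] = data[src_y, src_x]
    shifted_mask[dst_y, dst_x] = mask[src_y, src_x]
    return shifted_data, shifted_mask


def fill_from_neighbors(a, neighbors):
    """Fill masked cells of `a` from the first unmasked neighbor found,
    trying the offsets in `neighbors` in order and never wrapping."""
    orig_data = ma.getdata(a)
    orig_mask = ma.getmaskarray(a)
    data, mask = orig_data.copy(), orig_mask.copy()
    for dy, dx in neighbors:
        if not mask.any():
            break
        s_data, s_mask = shift_no_wrap(orig_data, orig_mask, dy, dx)
        idx = mask & ~s_mask          # still masked here, valid in the shift
        data[idx] = s_data[idx]
        mask[idx] = False
    return ma.masked_array(data, mask=mask)


b = ma.masked_array(np.arange(16).reshape(4, 4))
b[0, 1] = ma.masked                   # a missing value on the top edge
filled = fill_from_neighbors(b, ((1, 0), (-1, 0), (0, 1), (0, -1)))
print(filled)
```

In this small example the first offset would have pulled 13 from the bottom row under np.roll; here that candidate is skipped and the cell is filled with 5 from the row directly below it.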
