缺少数据的python中的2d卷积 [英] 2d convolution in python with missing data

查看:95
本文介绍了缺少数据的python中的2d卷积的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我知道有scipy.signal.convolve2d函数来处理2d numpy数组的2维卷积,并且有numpy.ma模块来处理丢失的数据,但是这两种方法似乎彼此不兼容(意味着即使您屏蔽了numpy中的2d数组,convolve2d中的过程也不会受到影响).有什么方法可以仅使用numpy和scipy包来处理卷积中的缺失值?

I know there is scipy.signal.convolve2d function to handle 2 dimension convolution for 2d numpy array, and there is numpy.ma module to handle missing data, but these two methods don't seem to compatible with each other (which means even if you mask a 2d array in numpy, the process in convolve2d won't be affected). Is there any way to handle missing values in convolution using only numpy and scipy packages?

例如:

            1 - 3 4 5
            1 2 - 4 5
   Array =  1 2 3 - 5
            - 2 3 4 5
            1 2 3 4 -

  Kernel =  1  0
            0 -1

所需的卷积结果(数组,内核,边界='wrap'):

Desired result for convolution(Array, Kernel, boundary='wrap'):

               -1  - -1 -1 4
               -1 -1  - -1 4
    Result =   -1 -1 -1  - 5
                - -1 -1  4 4
                1 -1 -1 -1 -


感谢Aguy的建议,这是帮助卷积后计算结果的一种非常好的方法.现在,我们可以从Array.mask中获取Array的遮罩,这将为我们提供


Thanks for the suggestion from Aguy, that is a really good way to help the calculation of result after convolution. Now let's say we can get the mask of Array from Array.mask, which would give us a result of

                   False True  False False False                       
                   False False True  False False
    Array.mask ==  False False False True  False
                   True  False False False False
                   False False False False True

在卷积成蒙版数组后,如何使用此蒙版将结果转换为?

How can I use this mask to convert the result after convolution into a masked array?

推荐答案

我不认为用0代替是正确的方法,您将已累计的值推向0.这些缺失实际上应被视为缺失" .因为它们代表丢失的信息,因此没有理由仅仅假设它们可以为0,并且根本不应该参与任何计算.

I dont think replacing with 0s is the correct way of doing this, you are nudging the covolved values toward 0. These missings should literally be treated as "missings". Because they represent the missing pieces of information and there is no justification to just assume they could be 0s, and they shouldn't be involved in any calcuation at all.

我尝试将缺失值设置为numpy.nan,然后进行卷积,结果表明内核之间的任何重叠和任何缺失都会在结果中给出nan,即使内核中的重叠为0,所以您会在结果中看到更大的缺失洞.根据您的应用程序,这可能是理想的结果.

I tried setting missing values to numpy.nan and then convolve, the result suggest that any overlap between the kernel and any missing gives an nan in the result, even if the overlap is with an 0 in the kernel, so you get an enlarged hole of missings in the result. Depending on your application, this could be the desired result.

但是在某些情况下,您不想只为1个缺失就丢弃这么多信息(也许< = 50%的缺失仍然可以容忍).在这种情况下,我发现了另一个具有更好实现的模块 astropy :numpy.nan被忽略(或替换为内插值?).

But in some cases, you don't want to discard so much information just for 1 single missing (maybe <= 50% of missing is still tolerable). In such cases, I've found another module astropy that has a better implementation: numpy.nans are ignored (or replaced with interpolated values?).

因此,使用astropy,您将执行以下操作:

So using astropy, you would do the following:

from astropy.convolution import convolve
inarray=numpy.where(inarray.mask,numpy.nan,inarray) # masking still doesn't work, has to set to numpy.nan
result=convolve(inarray,kernel)

但是,您仍然无法控制可忍受的丢失量.为此,我创建了一个使用scipy.ndimage.convolve()进行初始卷积的函数,但是在涉及缺失(numpy.nan)时手动重新计算值:

But still, you don't have control over how much of missing is tolerable. To achieve that, I've created a function that uses the scipy.ndimage.convolve() for the initial convolution, but manually re-compute values whenever missings (numpy.nan) are involved:

def convolve2d(slab,kernel,max_missing=0.5,verbose=True):
    '''2D convolution with missings ignored

    <slab>: 2d array. Input array to convolve. Can have numpy.nan or masked values.
    <kernel>: 2d array, convolution kernel, must have sizes as odd numbers.
    <max_missing>: float in (0,1), max percentage of missing in each convolution
                   window is tolerated before a missing is placed in the result.

    Return <result>: 2d array, convolution result. Missings are represented as
                     numpy.nans if they are in <slab>, or masked if they are masked
                     in <slab>.

    '''

    from scipy.ndimage import convolve as sciconvolve

    assert numpy.ndim(slab)==2, "<slab> needs to be 2D."
    assert numpy.ndim(kernel)==2, "<kernel> needs to be 2D."
    assert kernel.shape[0]%2==1 and kernel.shape[1]%2==1, "<kernel> shape needs to be an odd number."
    assert max_missing > 0 and max_missing < 1, "<max_missing> needs to be a float in (0,1)."

    #--------------Get mask for missings--------------
    if not hasattr(slab,'mask') and numpy.any(numpy.isnan(slab))==False:
        has_missing=False
        slab2=slab.copy()

    elif not hasattr(slab,'mask') and numpy.any(numpy.isnan(slab)):
        has_missing=True
        slabmask=numpy.where(numpy.isnan(slab),1,0)
        slab2=slab.copy()
        missing_as='nan'

    elif (slab.mask.size==1 and slab.mask==False) or numpy.any(slab.mask)==False:
        has_missing=False
        slab2=slab.copy()

    elif not (slab.mask.size==1 and slab.mask==False) and numpy.any(slab.mask):
        has_missing=True
        slabmask=numpy.where(slab.mask,1,0)
        slab2=numpy.where(slabmask==1,numpy.nan,slab)
        missing_as='mask'

    else:
        has_missing=False
        slab2=slab.copy()

    #--------------------No missing--------------------
    if not has_missing:
        result=sciconvolve(slab2,kernel,mode='constant',cval=0.)
    else:
        H,W=slab.shape
        hh=int((kernel.shape[0]-1)/2)  # half height
        hw=int((kernel.shape[1]-1)/2)  # half width
        min_valid=(1-max_missing)*kernel.shape[0]*kernel.shape[1]

        # dont forget to flip the kernel
        kernel_flip=kernel[::-1,::-1]

        result=sciconvolve(slab2,kernel,mode='constant',cval=0.)
        slab2=numpy.where(slabmask==1,0,slab2)

        #------------------Get nan holes------------------
        miss_idx=zip(*numpy.where(slabmask==1))

        if missing_as=='mask':
            mask=numpy.zeros([H,W])

        for yii,xii in miss_idx:

            #-------Recompute at each new nan in result-------
            hole_ys=range(max(0,yii-hh),min(H,yii+hh+1))
            hole_xs=range(max(0,xii-hw),min(W,xii+hw+1))

            for hi in hole_ys:
                for hj in hole_xs:
                    hi1=max(0,hi-hh)
                    hi2=min(H,hi+hh+1)
                    hj1=max(0,hj-hw)
                    hj2=min(W,hj+hw+1)

                    slab_window=slab2[hi1:hi2,hj1:hj2]
                    mask_window=slabmask[hi1:hi2,hj1:hj2]
                    kernel_ij=kernel_flip[max(0,hh-hi):min(hh*2+1,hh+H-hi), 
                                     max(0,hw-hj):min(hw*2+1,hw+W-hj)]
                    kernel_ij=numpy.where(mask_window==1,0,kernel_ij)

                    #----Fill with missing if not enough valid data----
                    ksum=numpy.sum(kernel_ij)
                    if ksum<min_valid:
                        if missing_as=='nan':
                            result[hi,hj]=numpy.nan
                        elif missing_as=='mask':
                            result[hi,hj]=0.
                            mask[hi,hj]=True
                    else:
                        result[hi,hj]=numpy.sum(slab_window*kernel_ij)

        if missing_as=='mask':
            result=numpy.ma.array(result)
            result.mask=mask

    return result

下图是显示输出的图.左侧是30x30随机地图,其中有3个numpy.nan孔,尺寸为:

Below is a figure demonstrating the output. On the left is a 30x30 random map with 3 numpy.nan holes with sizes of:

  1. 1x1
  2. 3x3
  3. 5x5

在右边是5x5内核(全为1)的卷积输出,公差水平为50%(max_missing=0.5).

On the right is the convolved output, by a 5x5 kernel (with all 1s), and a tolerance level of 50% (max_missing=0.5).

因此,使用附近的值填充前2个较小的孔,而在最后一个孔中,因为缺失的数量> 0.5x5x5 = 12.5,所以放置numpy.nan来表示缺失的信息.

So the first 2 smaller holes are filled using nearby values, and in the last one, because the number of missings > 0.5x5x5 = 12.5, numpy.nans are placed to represent missing information.

这篇关于缺少数据的python中的2d卷积的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆