Python: get the average of neighbours in a matrix with NA values


Problem description

I have a very large matrix, so I don't want to compute the sums by looping over every row and column.

import numpy as np

a = [[1,2,3],[3,4,5],[5,6,7]]
def neighbors(i,j,a):
    # West, east, north and south neighbours, wrapping around at the edges
    return [a[i][j-1], a[i][(j+1)%len(a[0])], a[i-1][j], a[(i+1)%len(a)][j]]
[[np.mean(neighbors(i,j,a)) for j in range(len(a[0]))] for i in range(len(a))]

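This code works well for a 3x3 or similarly small matrix, but it is not feasible for a large matrix such as 2k x 2k. It also does not work if any value in the matrix is missing (NA). If any of the neighbour values is NA, that neighbour should be skipped when computing the average.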

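Solution

Shot #1

This assumes you are looking for sliding-window averages over the input array with a 3 x 3 window, considering only the north, west, east and south neighbours.

For such a case, signal.convolve2d with an appropriate kernel can be used. At the end, divide the sums by the number of ones in the kernel, i.e. kernel.sum(), since only those elements contribute to the sums. Here's the implementation -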

import numpy as np
from scipy import signal

# Inputs
a = [[1,2,3],[3,4,5],[5,6,7],[4,8,9]]

# Convert to numpy array
arr = np.asarray(a,float)    

# Define kernel for convolution                                         
kernel = np.array([[0,1,0],
                   [1,0,1],
                   [0,1,0]]) 

# Perform 2D convolution with input data and kernel 
out = signal.convolve2d(arr, kernel, boundary='wrap', mode='same')/kernel.sum()
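
As a cross-check on the convolution above, the same wrapped four-neighbour mean can be sketched with plain NumPy by shifting the array one step in each direction with np.roll. This is an illustrative alternative, not part of the original answer; the names neighbour_sum and roll_out are made up for the example. Because the kernel is symmetric and the boundary wraps, the result is expected to match out.

```python
import numpy as np

a = [[1, 2, 3], [3, 4, 5], [5, 6, 7], [4, 8, 9]]
arr = np.asarray(a, float)

# Sum the four wrapped neighbours by shifting the whole array one step
# in each direction, then divide by the number of neighbours.
neighbour_sum = (
    np.roll(arr, 1, axis=0)     # value from the row above (north)
    + np.roll(arr, -1, axis=0)  # value from the row below (south)
    + np.roll(arr, 1, axis=1)   # value from the column to the left (west)
    + np.roll(arr, -1, axis=1)  # value from the column to the right (east)
)
roll_out = neighbour_sum / 4.0
```

Both variants avoid the Python-level double loop, which is what makes the original list-comprehension version slow on a 2k x 2k input.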

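Shot #2

This makes the same assumptions as Shot #1, except that we compute neighbourhood averages only for the zero elements, with the intention of replacing them with those averages.

Approach #1: Here's one way to do it, using a manual selective convolution approach -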

import numpy as np

# Convert to numpy array (a is the input list of lists)
arr = np.asarray(a,float)

# Pad around the input array to take care of boundary conditions
arr_pad = np.pad(arr, (1,1), 'wrap')

R,C = np.where(arr==0)   # Row, column indices for zero elements in input array
N = arr_pad.shape[1]     # Number of columns in padded array (flat-index row stride)

# Flat-index offsets of the north, west, east and south neighbours
offset = np.array([-N, -1, 1, N])
idx = np.ravel_multi_index((R+1,C+1),arr_pad.shape)[:,None] + offset

# Replace each zero with the mean of its four neighbours
arr_out = arr.copy()
arr_out[R,C] = arr_pad.ravel()[idx].sum(1)/4

Sample input, output -

In [587]: arr
Out[587]: 
array([[ 4.,  0.,  3.,  3.,  3.,  1.,  3.],
       [ 2.,  4.,  0.,  0.,  4.,  2.,  1.],
       [ 0.,  1.,  1.,  0.,  1.,  4.,  3.],
       [ 0.,  3.,  0.,  2.,  3.,  0.,  1.]])

In [588]: arr_out
Out[588]: 
array([[ 4.  ,  3.5 ,  3.  ,  3.  ,  3.  ,  1.  ,  3.  ],
       [ 2.  ,  4.  ,  2.  ,  1.75,  4.  ,  2.  ,  1.  ],
       [ 1.5 ,  1.  ,  1.  ,  1.  ,  1.  ,  4.  ,  3.  ],
       [ 2.  ,  3.  ,  2.25,  2.  ,  3.  ,  2.25,  1.  ]])
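
The flat-index arithmetic in Approach #1 may be easier to follow on a tiny array. The sketch below uses a made-up 2 x 3 input and looks up the neighbours of a single zero element; the variable names flat, neighbour_idx and neighbours are illustrative only.

```python
import numpy as np

arr = np.array([[4., 0., 3.],
                [2., 4., 0.]])
arr_pad = np.pad(arr, 1, 'wrap')   # padded shape is (4, 5)
N = arr_pad.shape[1]               # padded row length: 5

# Element (0, 1) of arr sits at (1, 2) in the padded array.
flat = np.ravel_multi_index((1, 2), arr_pad.shape)   # 1*5 + 2 = 7

# Moving one row up or down changes the flat index by N;
# moving one column left or right changes it by 1.
neighbour_idx = flat + np.array([-N, -1, 1, N])      # north, west, east, south
neighbours = arr_pad.ravel()[neighbour_idx]
```

Here neighbours holds the wrapped north, west, east and south values of the zero at (0, 1), so their sum divided by 4 is the replacement value for that element.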

To handle the boundary conditions differently, other padding modes are available. See numpy.pad for more information.
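
As a rough illustration of what a few of the numpy.pad modes do, the sketch below pads a single made-up row by one element on each side. The variable names are illustrative only.

```python
import numpy as np

row = np.array([[1., 2., 3.]])

# Pad only the column axis, by one element on each side.
pad_width = ((0, 0), (1, 1))

wrapped = np.pad(row, pad_width, mode='wrap')      # [[3., 1., 2., 3., 1.]]
edge = np.pad(row, pad_width, mode='edge')         # [[1., 1., 2., 3., 3.]]
reflect = np.pad(row, pad_width, mode='reflect')   # [[2., 1., 2., 3., 2.]]
zeros = np.pad(row, pad_width, mode='constant', constant_values=0)  # [[0., 1., 2., 3., 0.]]
```

Note that with constant zero padding, dividing by 4 would count the padded zeros as real neighbours, so the divisor would need adjusting at the borders.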

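Approach #2: This is a modified version of the convolution-based approach listed earlier in Shot #1. It is the same as that approach, except that at the end we selectively replace the zero elements with the convolution output. Here's the code -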

import numpy as np
from scipy import signal

# Inputs
a = [[1,2,3],[3,4,5],[5,6,7],[4,8,9]]

# Convert to numpy array
arr = np.asarray(a,float)

# Define kernel for convolution                                         
kernel = np.array([[0,1,0],
                   [1,0,1],
                   [0,1,0]]) 

# Perform 2D convolution with input data and kernel 
conv_out = signal.convolve2d(arr, kernel, boundary='wrap', mode='same')/kernel.sum()

# Initialize output array as a copy of input array
arr_out = arr.copy()

# Setup a mask of zero elements in input array and 
# replace those in output array with the convolution output
mask = arr==0
arr_out[mask] = conv_out[mask]

Remarks: Approach #1 is preferable when the input array has only a few zero elements; otherwise go with Approach #2.
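
The approaches above treat zeros as the missing entries and always divide by 4. For the NA case described in the question, where missing neighbours should be skipped, one possible sketch is to sum the valid neighbours and divide by the count of valid neighbours. The code below assumes missing values are stored as np.nan and uses np.roll for the wrapped neighbour sums; the same masking idea should carry over to signal.convolve2d. The helper names wrapped_neighbour_sum and nan_neighbour_mean are hypothetical, introduced only for this example.

```python
import numpy as np

def wrapped_neighbour_sum(x):
    """Sum of the north, south, west and east neighbours with wrap-around."""
    return (np.roll(x, 1, axis=0) + np.roll(x, -1, axis=0)
            + np.roll(x, 1, axis=1) + np.roll(x, -1, axis=1))

def nan_neighbour_mean(arr):
    """Mean of the four wrapped neighbours, skipping NaN neighbours."""
    arr = np.asarray(arr, float)
    valid = ~np.isnan(arr)                  # True where a value is present
    filled = np.where(valid, arr, 0.0)      # NaNs contribute 0 to the sums
    nbr_sum = wrapped_neighbour_sum(filled)
    nbr_cnt = wrapped_neighbour_sum(valid.astype(float))
    # Leave NaN where a cell has no valid neighbour at all
    out = np.full(arr.shape, np.nan)
    np.divide(nbr_sum, nbr_cnt, out=out, where=nbr_cnt > 0)
    return out

arr = np.array([[1., 2., np.nan],
                [3., np.nan, 5.],
                [5., 6., 7.]])
nan_mean = nan_neighbour_mean(arr)

# Fill only the missing entries, keeping the known values as they are
arr_out = np.where(np.isnan(arr), nan_mean, arr)
```

Dividing by the per-cell count rather than by a fixed 4 is what lets each cell average over however many of its neighbours are actually present.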
