如何检查一个二维 NumPy 数组中是否包含特定模式的值? [英] How can I check if one two-dimensional NumPy array contains a specific pattern of values inside it?

查看:32
本文介绍了如何检查一个二维 NumPy 数组中是否包含特定模式的值?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个大的 NumPy.array field_array 和一个较小的数组 match_array,两者都由 int 值组成.使用以下示例,如何检查 field_array 的任何 match_array 形状的段是否包含与 match_array 中的值完全对应的值?

I have a large NumPy.array field_array and a smaller array match_array, both consisting of int values. Using the following example, how can I check if any match_array-shaped segment of field_array contains values that exactly correspond to the ones in match_array?

import numpy
raw_field = ( 24,  25,  26,  27,  28,  29,  30,  31,  23, \
              33,  34,  35,  36,  37,  38,  39,  40,  32, \
             -39, -38, -37, -36, -35, -34, -33, -32, -40, \
             -30, -29, -28, -27, -26, -25, -24, -23, -31, \
             -21, -20, -19, -18, -17, -16, -15, -14, -22, \
             -12, -11, -10,  -9,  -8,  -7,  -6,  -5, -13, \
              -3,  -2,  -1,   0,   1,   2,   3,   4,  -4, \
               6,   7,   8,   4,   5,   6,   7,  13,   5, \
              15,  16,  17,   8,   9,  10,  11,  22,  14)
field_array = numpy.array(raw_field, int).reshape(9,9)
match_array = numpy.arange(12).reshape(3,4)

这些示例应该返回 True,因为 match_array 描述的模式与 [6:9,3:7] 对齐.>

These examples ought to return True since the pattern described by match_array aligns over [6:9,3:7].

推荐答案

方法 #1

这种方法源自一个解决方案在python中实现Matlab的im2col'sliding',旨在将滑动块从二维数组重新排列成列.因此,为了解决我们这里的情况,来自 field_array 的那些滑动块可以堆叠为列,并与 match_array 的列向量版本进行比较.

This approach derives from a solution to Implement Matlab's im2col 'sliding' in python that was designed to rearrange sliding blocks from a 2D array into columns. Thus, to solve our case here, those sliding blocks from field_array could be stacked as columns and compared against column vector version of match_array.

这里是重排/堆叠函数的正式定义 -

Here's a formal definition of the function for the rearrangement/stacking -

def im2col(A,BLKSZ):   

    # Parameters
    M,N = A.shape
    col_extent = N - BLKSZ[1] + 1
    row_extent = M - BLKSZ[0] + 1

    # Get Starting block indices
    start_idx = np.arange(BLKSZ[0])[:,None]*N + np.arange(BLKSZ[1])

    # Get offsetted indices across the height and width of input array
    offset_idx = np.arange(row_extent)[:,None]*N + np.arange(col_extent)

    # Get all actual indices & index into input array for final output
    return np.take (A,start_idx.ravel()[:,None] + offset_idx.ravel())

为了解决我们的情况,这里是基于im2col的实现-

To solve our case, here's the implementation based on im2col -

# Get sliding blocks of shape same as match_array from field_array into columns
# Then, compare them with a column vector version of match array.
col_match = im2col(field_array,match_array.shape) == match_array.ravel()[:,None]

# Shape of output array that has field_array compared against a sliding match_array
out_shape = np.asarray(field_array.shape) - np.asarray(match_array.shape) + 1

# Now, see if all elements in a column are ONES and reshape to out_shape. 
# Finally, find the position of TRUE indices
R,C = np.where(col_match.all(0).reshape(out_shape))

问题中给定样本的输出将是 -

The output for the given sample in the question would be -

In [151]: R,C
Out[151]: (array([6]), array([3]))

方法#2

鉴于 opencv 已经具有做平方差的模板匹配功能,您可以使用它并寻找零差异,这将是您的匹配位置.因此,如果您有权访问 cv2(opencv 模块),则实现将如下所示 -

Given that opencv already has template matching function that does square of differences, you can employ that and look for zero differences, which would be your matching positions. So, if you have access to cv2 (opencv module), the implementation would look something like this -

import cv2
from cv2 import matchTemplate as cv2m

M = cv2m(field_array.astype('uint8'),match_array.astype('uint8'),cv2.TM_SQDIFF)
R,C = np.where(M==0)

给我们 -

In [204]: R,C
Out[204]: (array([6]), array([3]))

<小时>

基准测试

本节比较了为解决问题而建议的所有方法的运行时间.本节中列出的各种方法的功劳归于他们的贡献者.


Benchmarking

This section compares runtimes for all the approaches suggested to solve the question. The credit for the various methods listed in this section goes to their contributors.

方法定义 -

def seek_array(search_in, search_for, return_coords = False):
    si_x, si_y = search_in.shape
    sf_x, sf_y = search_for.shape
    for y in xrange(si_y-sf_y+1):
        for x in xrange(si_x-sf_x+1):
            if numpy.array_equal(search_for, search_in[x:x+sf_x, y:y+sf_y]):
                return (x,y) if return_coords else True
    return None if return_coords else False

def skimage_based(field_array,match_array):
    windows = view_as_windows(field_array, match_array.shape)
    return (windows == match_array).all(axis=(2,3)).nonzero()

def im2col_based(field_array,match_array):   
    col_match = im2col(field_array,match_array.shape)==match_array.ravel()[:,None]
    out_shape = np.asarray(field_array.shape) - np.asarray(match_array.shape) + 1  
    return np.where(col_match.all(0).reshape(out_shape))

def cv2_based(field_array,match_array):
    M = cv2m(field_array.astype('uint8'),match_array.astype('uint8'),cv2.TM_SQDIFF)
    return np.where(M==0)

运行时测试 -

案例#1(问题的样本数据):

Case # 1 (Sample data from question):

In [11]: field_array
Out[11]: 
array([[ 24,  25,  26,  27,  28,  29,  30,  31,  23],
       [ 33,  34,  35,  36,  37,  38,  39,  40,  32],
       [-39, -38, -37, -36, -35, -34, -33, -32, -40],
       [-30, -29, -28, -27, -26, -25, -24, -23, -31],
       [-21, -20, -19, -18, -17, -16, -15, -14, -22],
       [-12, -11, -10,  -9,  -8,  -7,  -6,  -5, -13],
       [ -3,  -2,  -1,   0,   1,   2,   3,   4,  -4],
       [  6,   7,   8,   4,   5,   6,   7,  13,   5],
       [ 15,  16,  17,   8,   9,  10,  11,  22,  14]])

In [12]: match_array
Out[12]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [13]: %timeit seek_array(field_array, match_array, return_coords = False)
1000 loops, best of 3: 465 µs per loop

In [14]: %timeit skimage_based(field_array,match_array)
10000 loops, best of 3: 97.9 µs per loop

In [15]: %timeit im2col_based(field_array,match_array)
10000 loops, best of 3: 74.3 µs per loop

In [16]: %timeit cv2_based(field_array,match_array)
10000 loops, best of 3: 30 µs per loop

案例#2(更大的随机数据):

Case #2 (Bigger random data):

In [17]: field_array = np.random.randint(0,4,(256,256))

In [18]: match_array = field_array[100:116,100:116].copy()

In [19]: %timeit seek_array(field_array, match_array, return_coords = False)
1 loops, best of 3: 400 ms per loop

In [20]: %timeit skimage_based(field_array,match_array)
10 loops, best of 3: 54.3 ms per loop

In [21]: %timeit im2col_based(field_array,match_array)
10 loops, best of 3: 125 ms per loop

In [22]: %timeit cv2_based(field_array,match_array)
100 loops, best of 3: 4.08 ms per loop

这篇关于如何检查一个二维 NumPy 数组中是否包含特定模式的值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆