Fill mask efficiently based on start indices

Question

I have a 2D array (2D for this example; it can actually be N-dimensional), and I would like to create a mask for it that masks the end of each row. For example:

import numpy as np

np.random.seed(0xBEEF)
a = np.random.randint(10, size=(5, 6))
mask_indices = np.argmax(a, axis=1)

I would like to convert mask_indices to a boolean mask. Currently, I can't think of a better way than

# np.bool was removed in NumPy 1.24; the builtin bool is the portable spelling
mask = np.zeros(a.shape, dtype=bool)
for r, m in enumerate(mask_indices):
    mask[r, m:] = True

So, for

a = np.array([[6, 5, 0, 2, 1, 2],
              [8, 1, 3, 7, 1, 9],
              [8, 7, 6, 7, 3, 6],
              [2, 7, 0, 3, 1, 7],
              [5, 4, 0, 7, 6, 0]])

mask_indices = np.array([0, 5, 0, 1, 3])

I would like to see

mask = np.array([[ True,  True,  True,  True,  True,  True],
                 [False, False, False, False, False,  True],
                 [ True,  True,  True,  True,  True,  True],
                 [False,  True,  True,  True,  True,  True],
                 [False, False, False,  True,  True,  True]])

Is there a vectorized form of this operation?

In general, I would like to be able to do this across all the dimensions besides the one that defines the index points.

Recommended answer

I. Ndim array-masking along last axis (rows)

For an n-dim array, to mask along rows (the last axis), we could do -

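def mask_from_start_indices(a, mask_indices):
    # column positions 0..n-1 along the last axis
    r = np.arange(a.shape[-1])
    # broadcast each start index against every column position
    return mask_indices[..., None] <= r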

Sample run -

In [177]: np.random.seed(0)
     ...: a = np.random.randint(10, size=(2, 2, 5))
     ...: mask_indices = np.argmax(a, axis=-1)

In [178]: a
Out[178]: 
array([[[5, 0, 3, 3, 7],
        [9, 3, 5, 2, 4]],

       [[7, 6, 8, 8, 1],
        [6, 7, 7, 8, 1]]])

In [179]: mask_indices
Out[179]: 
array([[4, 0],
       [2, 3]])

In [180]: mask_from_start_indices(a, mask_indices)
Out[180]: 
array([[[False, False, False, False,  True],
        [ True,  True,  True,  True,  True]],

       [[False, False,  True,  True,  True],
        [False, False, False,  True,  True]]])
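
As a quick sanity check, the broadcasted comparison can be compared against the explicit loop from the question. The sketch below reuses the 2D example array from the question; the names loop_mask and vec_mask are only illustrative and do not appear in the original post.

```python
import numpy as np

def mask_from_start_indices(a, mask_indices):
    r = np.arange(a.shape[-1])
    return mask_indices[..., None] <= r

# 2D example array from the question
a = np.array([[6, 5, 0, 2, 1, 2],
              [8, 1, 3, 7, 1, 9],
              [8, 7, 6, 7, 3, 6],
              [2, 7, 0, 3, 1, 7],
              [5, 4, 0, 7, 6, 0]])
mask_indices = np.argmax(a, axis=1)

# reference result built with the loop from the question
loop_mask = np.zeros(a.shape, dtype=bool)
for row, m in enumerate(mask_indices):
    loop_mask[row, m:] = True

# vectorized result
vec_mask = mask_from_start_indices(a, mask_indices)
print(np.array_equal(vec_mask, loop_mask))  # True
```

The comparison works because mask_indices[..., None] has shape (5, 1) and r has shape (6,), so NumPy broadcasts them to the full (5, 6) mask.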

II. Ndim array-masking along generic axis

For n-dim arrays, masking along a generic axis would be -

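def mask_from_start_indices_genericaxis(a, mask_indices, axis):
    # axis is expected to be non-negative; r gets trailing length-1 dims for broadcasting
    r = np.arange(a.shape[axis]).reshape((-1,)+(1,)*(a.ndim-axis-1))
    # re-insert the reduced axis into mask_indices as a length-1 dimension
    mask_indices_nd = mask_indices.reshape(np.insert(mask_indices.shape,axis,1))
    return mask_indices_nd<=r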

Sample run -

Data array setup:

In [288]: np.random.seed(0)
     ...: a = np.random.randint(10, size=(2, 3, 5))

In [289]: a
Out[289]: 
array([[[5, 0, 3, 3, 7],
        [9, 3, 5, 2, 4],
        [7, 6, 8, 8, 1]],

       [[6, 7, 7, 8, 1],
        [5, 9, 8, 9, 4],
        [3, 0, 3, 5, 0]]])

Along axis=1 -

In [290]: mask_indices = np.argmax(a, axis=1)

In [291]: mask_indices
Out[291]: 
array([[1, 2, 2, 2, 0],
       [0, 1, 1, 1, 1]])

In [292]: mask_from_start_indices_genericaxis(a, mask_indices, axis=1)
Out[292]: 
array([[[False, False, False, False,  True],
        [ True, False, False, False,  True],
        [ True,  True,  True,  True,  True]],

       [[ True, False, False, False, False],
        [ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True]]])

Along axis=2 -

In [293]: mask_indices = np.argmax(a, axis=2)

In [294]: mask_indices
Out[294]: 
array([[4, 0, 2],
       [3, 1, 3]])

In [295]: mask_from_start_indices_genericaxis(a, mask_indices, axis=2)
Out[295]: 
array([[[False, False, False, False,  True],
        [ True,  True,  True,  True,  True],
        [False, False,  True,  True,  True]],

       [[False, False, False,  True,  True],
        [False,  True,  True,  True,  True],
        [False, False, False,  True,  True]]])
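
The reshape arithmetic in the generic-axis function assumes a non-negative axis. One possible variant, sketched here under that observation, normalizes the axis first and uses np.expand_dims, so that axis=-1 is accepted as well. The function name mask_from_start_indices_anyaxis is hypothetical and not part of the original answer.

```python
import numpy as np

def mask_from_start_indices_anyaxis(a, mask_indices, axis):
    # hypothetical variant: map a negative axis to its non-negative equivalent
    axis = axis % a.ndim
    # positions along `axis`, with length-1 dims everywhere else
    shape = [1] * a.ndim
    shape[axis] = a.shape[axis]
    r = np.arange(a.shape[axis]).reshape(shape)
    # put the reduced axis back as a length-1 dimension, then broadcast
    return np.expand_dims(mask_indices, axis) <= r

# data array from the sample run above
a = np.array([[[5, 0, 3, 3, 7],
               [9, 3, 5, 2, 4],
               [7, 6, 8, 8, 1]],

              [[6, 7, 7, 8, 1],
               [5, 9, 8, 9, 4],
               [3, 0, 3, 5, 0]]])

mask_mid = mask_from_start_indices_anyaxis(a, np.argmax(a, axis=1), axis=1)
mask_last = mask_from_start_indices_anyaxis(a, np.argmax(a, axis=-1), axis=-1)
print(mask_mid.shape, mask_last.shape)  # (2, 3, 5) (2, 3, 5)
```

For non-negative axes this should give the same masks as the function above; the only intended difference is the handling of negative axis values.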


Other scenarios

A. Extending to given end/stop-indices for masking

To extend the solutions to cases where we are given end/stop indices for masking, i.e. we are looking to vectorize mask[r, :m] = True, we just need to change the final comparison in the posted solutions to the following -

return mask_indices_nd>r

B. Outputting an integer array

There might be cases where we want an int array. For those, simply view the output as such. Hence, if out is the output of the posted solutions, we can do out.view('i1') or out.view('u1') for int8 and uint8 dtype outputs respectively.

For other datatypes, we would need to use .astype() for dtype conversions.
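
A small sketch of the difference between the two conversions, using the start indices from the question. The variable names as_u1 and as_i8 are illustrative only. A view works for int8/uint8 because those dtypes have the same one-byte item size as bool, so no data is copied; astype allocates a new array.

```python
import numpy as np

# boolean mask for the start indices from the question, 6 columns
out = np.array([0, 5, 0, 1, 3])[:, None] <= np.arange(6)

as_u1 = out.view('u1')        # reinterprets the same buffer as uint8
as_i8 = out.astype(np.int64)  # converts into a newly allocated int64 array

print(as_u1.dtype, np.shares_memory(out, as_u1))  # uint8 True
print(as_i8.dtype, np.shares_memory(out, as_i8))  # int64 False
```

Since the view shares memory with the boolean mask, writing to one is visible through the other; use .astype() or .copy() when an independent array is needed.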

C. Index-inclusive masking for stop-indices

For index-inclusive masking, i.e. where the index itself is to be included in the stop-indices case, we simply need to include equality in the comparison. Hence, the last step would be -

return mask_indices_nd>=r

D. Index-exclusive masking for start-indices

This is the case where the start indices are given and those indices are not to be masked; masking starts only from the next element and runs until the end. So, following the same reasoning as in the previous section, for this case the last step would be modified to -

return mask_indices_nd<r
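
To see the four comparison variants side by side, here is a minimal sketch on a single row of six elements with index 3. The dictionary name variants is illustrative and not from the original answer.

```python
import numpy as np

r = np.arange(6)
m = np.array([3])[:, None]  # one row whose start/stop index is 3

variants = {
    "start, inclusive (<=)": m <= r,
    "start, exclusive (<)":  m < r,
    "stop, exclusive (>)":   m > r,
    "stop, inclusive (>=)":  m >= r,
}
for name, mask in variants.items():
    print(name, mask.astype(int))
# start, inclusive (<=) [[0 0 0 1 1 1]]
# start, exclusive (<) [[0 0 0 0 1 1]]
# stop, exclusive (>) [[1 1 1 0 0 0]]
# stop, inclusive (>=) [[1 1 1 1 0 0]]
```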
