Is the mask of a structured array supposed to be structured itself?


Question

I was looking into NumPy issue #2972 and several related problems. It turns out that all of them arise in the situation where the array itself is structured, but its mask is not:

In [38]: R = numpy.zeros(10, dtype=[("A", "<f2"), ("B", "<f4")])

In [39]: Rm = numpy.ma.masked_where(R["A"]<5, R)

In [41]: Rm.dtype
Out[41]: dtype([('A', '<f2'), ('B', '<f4')])

In [42]: Rm.mask.dtype
Out[42]: dtype('bool')

# Now both `__getitem__` and `__repr__` raise errors; see issue #2972

If I create a masked array differently, the mask dtype is structured like the dtype of the array itself:

In [44]: Q.dtype
Out[44]: dtype([('A', '<f4'), ('B', '<f4')])

In [45]: Q.mask.dtype
Out[45]: dtype([('A', '?'), ('B', '?')])

The former situation exposes several problems. For example, Rm.__repr__() and Rm["A"] both raise IndexError, although they used to raise ValueError.

By design, should it be possible for A.dtype to be structured while A.mask.dtype is not?

In other words: is the bug in the __repr__ and __getitem__ methods of numpy.ma.core.MaskedArray, or does the real bug occur earlier, in permitting such a masked structured array to exist in the first place?

Recommended answer

The errors in your first case indicate that the methods expect the mask to have the same number (and names) of fields as the base array:

__getitem__:  dout._mask = _mask[indx]
_recursive_printoption: (curdata, curmask) = (result[name], mask[name])

If the masked array is made with the 'main' constructor, the mask has the same structure:

Rn = np.ma.masked_array(R, mask=R['A']>5)
Rn.mask.dtype: dtype([('A', '?'), ('B', '?')])

In other words, there is a mask value for each field of each element.
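As a minimal sketch of the point above, the snippet below builds a small structured array and passes a plain boolean condition to the constructor. The field values are made up for illustration (the original R was all zeros), and np.ma.make_mask_descr is used here only as a convenient way to spell "the same fields, all bool".

```python
import numpy as np

# Structured data with the same fields as R in the question;
# the values in "A" are illustrative only.
R = np.zeros(4, dtype=[("A", "<f2"), ("B", "<f4")])
R["A"] = [1, 3, 6, 9]

# A plain boolean condition, one value per record.
cond = R["A"] > 5

# The 'main' constructor converts the condition to a per-field mask.
Rn = np.ma.masked_array(R, mask=cond)

# make_mask_descr maps every field of a dtype to bool.
expected = np.ma.make_mask_descr(R.dtype)
print(Rn.mask.dtype == expected)
print(Rn.mask["A"], Rn.mask["B"])
```

Each record's single boolean is copied into every field of the mask, which is why both fields show the same pattern.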

The masked_array doc evidently intends 'same shape' to include the dtype structure. For mask it says: Must be convertible to an array of booleans with the same shape as 'data'.

If I try to set the mask the same way that masked_where does:

Rn._mask=R['A']>5

I get the same print error. The structured mask gets overwritten with the new boolean array, changing its dtype. In contrast, if I use

Rn.mask=R['A']<5

Rn prints fine. .mask is a property whose set method evidently handles the structured mask correctly.
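The contrast between the two assignments can be checked by looking only at the resulting mask dtype. This is a sketch under assumptions: the data values are made up, the names bad and good are illustrative, and since _mask is a private attribute its handling may differ between NumPy versions. The snippet deliberately avoids printing bad, because that is the step reported to fail.

```python
import numpy as np

R = np.zeros(4, dtype=[("A", "<f2"), ("B", "<f4")])
R["A"] = [1, 3, 6, 9]          # illustrative values
cond = R["A"] < 5              # plain boolean array

# Private attribute: whatever mask was there is replaced wholesale,
# so the mask takes on the dtype of cond.
bad = np.ma.masked_array(R)
bad._mask = cond
print(bad.mask.dtype)

# Public property: cond is converted to a per-field mask.
good = np.ma.masked_array(R)
good.mask = cond
print(good.mask.dtype)
```

Assigning through the property is the safer habit in user code, since it goes through the conversion logic instead of bypassing it.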

Without digging into the code history (on GitHub), my guess is that masked_where is a convenience function that wasn't updated when structured dtypes were added to other parts of the ma code. Compared to ma.masked_array it's a simple function that does not look at the dtype at all. Other convenience functions like ma.masked_greater use masked_where. Changing result._mask = cond to result.mask = cond might be all that is needed to correct this issue.

How thoroughly have you tested the consequences of an unstructured mask?

Rm.flatten()

returns an array with a structured mask, even when it started with an unstructured one. That's because it uses Rm.__setmask__, which is sensitive to fields. And that's the set function for the mask property.

Rm.tolist()  # same error as str()

masked_where begins with:

cond = make_mask(condition)

make_mask returns the simple 'bool' dtype. It can also be called with a dtype, producing a structured mask: np.ma.make_mask(R['A']<5, dtype=R.dtype). But such a structured mask gets flattened when used in masked_where. masked_where not only allows an unstructured mask, it forces it to be unstructured.
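The two ways of calling make_mask described above can be compared side by side. This is only a sketch with made-up values in field "A"; the variable names plain and structured are illustrative.

```python
import numpy as np

R = np.zeros(4, dtype=[("A", "<f2"), ("B", "<f4")])
R["A"] = [1, 3, 6, 9]          # illustrative values
cond = R["A"] < 5

# Default call: a simple boolean mask.
plain = np.ma.make_mask(cond)

# Passing the data dtype: a mask with one boolean per field.
structured = np.ma.make_mask(cond, dtype=R.dtype)

print(plain.dtype)
print(structured.dtype)
```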

Your unstructured mask is already partly implemented, as the recordmask property:

recordmask = property(fget=_get_recordmask)

I say partly because it has a get method, but the set method is not yet implemented. See def _set_recordmask(self).
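To illustrate the get side only, here is a small sketch with a hand-built per-field mask. The array name arr and the mask values are assumptions for the example; as documented, recordmask on a structured array reports a record as masked only when all of its fields are masked.

```python
import numpy as np

data = np.zeros(3, dtype=[("A", "<f4"), ("B", "<f4")])

# Per-field mask: record 0 fully masked, record 1 partly, record 2 not at all.
mask = np.array([(True, True), (True, False), (False, False)],
                dtype=np.ma.make_mask_descr(data.dtype))

arr = np.ma.masked_array(data, mask=mask)

# One boolean per record, collapsed from the per-field mask.
print(arr.recordmask)
```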

The more I look at this, the more I'm convinced that masked_where is wrong. It could be changed to set a structured mask, but then it's not much different from masked_array. It might be better if it raised an error when the array is structured (has dtype.names). That way masked_where would remain useful for unstructured numeric arrays, while preventing misapplication to structured ones.
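The guard suggested above could look roughly like the wrapper below. This is a hypothetical user-side helper, not NumPy's code: the name masked_where_unstructured, the choice of TypeError, and the message text are all assumptions made for the sketch.

```python
import numpy as np

def masked_where_unstructured(condition, a, copy=True):
    """Hypothetical wrapper: reject structured input, else defer to masked_where."""
    a = np.asanyarray(a)
    # dtype.names is None for plain dtypes and a tuple of field names otherwise.
    if a.dtype.names is not None:
        raise TypeError(
            "structured arrays are not supported here; "
            "use np.ma.masked_array(a, mask=...) instead"
        )
    return np.ma.masked_where(condition, a, copy=copy)

# Plain numeric input passes straight through.
x = np.array([1.0, 6.0, 3.0])
out = masked_where_unstructured(x < 5, x)
print(out)

# Structured input is rejected before any mask is built.
R = np.zeros(3, dtype=[("A", "<f2"), ("B", "<f4")])
try:
    masked_where_unstructured(R["A"] < 5, R)
except TypeError as exc:
    print("rejected:", exc)
```

Failing early keeps the mistake visible at the call site, instead of surfacing later as an error in printing or indexing.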

I should also look at the test code.
