numpy.ma (masked) array mean method has inconsistent return type


Question

I noticed that the numpy masked-array mean method returns different types when it probably should not:

import numpy as np

A = np.ma.masked_equal([1,1,0], value=0)
B = np.ma.masked_equal([1,1,1], value=0) # no masked values

type(A.mean())
#numpy.float64
type(B.mean())
#numpy.ma.core.MaskedArray

The other numpy.ma.core.MaskedArray methods seem to be consistent:

type(A.sum()) == type(B.sum())
# True
type(A.prod()) == type(B.prod())
# True
type(A.std()) == type(B.std())
# True
type(A.mean()) == type(B.mean())
# False

Can someone explain this?

UPDATE: As pointed out in the comments:

C = np.ma.masked_array([1, 1, 1], mask=[False, False, False])
type(C.mean()) == type(A.mean())
# True 

Answer

The source of the MaskedArray.mean method starts with:

    if self._mask is nomask:
        result = super(MaskedArray, self).mean(axis=axis, dtype=dtype)

np.ma.nomask is False.

This is the case for your B:

masked_array(data = [1 1 1],
             mask = False,
       fill_value = 0)

For A the mask is an array that matches the data in size. In B it is a scalar, False, and mean handles that as a special case.
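The difference between the two masks can be inspected directly. The sketch below uses np.ma.getmask and np.ma.getmaskarray, both part of numpy.ma. The variable names are illustrative, and whether B's mask is shrunk to nomask may depend on the NumPy version, so treat the comments as describing the behaviour discussed in this answer, not a guarantee.

```python
import numpy as np

A = np.ma.masked_equal([1, 1, 0], value=0)
B = np.ma.masked_equal([1, 1, 1], value=0)  # no masked values

# getmask returns the stored mask: a boolean array for A,
# and (assuming the mask was shrunk) the nomask constant for B
mask_a = np.ma.getmask(A)
b_is_nomask = np.ma.getmask(B) is np.ma.nomask

# getmaskarray always returns a full boolean array, even for nomask
full_b = np.ma.getmaskarray(B)

print(mask_a, b_is_nomask, full_b)
```

Using getmaskarray is one way to avoid special-casing nomask in your own code.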

I need to dig a bit more to see what this implies.

In [127]: np.mean(B)
Out[127]: 
masked_array(data = 1.0,
             mask = False,
       fill_value = 0)

In [141]: super(np.ma.MaskedArray,B).mean()
Out[141]: 
masked_array(data = 1.0,
             mask = False,
       fill_value = 0)

I'm not sure that helps; there's some circular referencing between the np.ndarray methods, the np functions and the np.ma methods, which makes it hard to identify exactly what code is being used. It looks like it is using the compiled mean method, but it isn't obvious how that handles the masking.

I wonder if the intent is to use

    np.mean(B.data) # or
    B.data.mean()

and the super method fetch isn't the right approach.

In any case, the same array, but with a vector mask, returns the scalar.

In [132]: C
Out[132]: 
masked_array(data = [1 1 1],
             mask = [False False False],
       fill_value = 0)

In [133]: C.mean()
Out[133]: 1.0
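Building on that observation, one possible workaround is to expand the mask to a full array before calling mean. This is only a sketch: mean_with_full_mask is a hypothetical helper name, not part of numpy, and it assumes a copy of the data is acceptable.

```python
import numpy as np

def mean_with_full_mask(arr):
    """Hypothetical helper: rebuild arr with an explicit element-wise mask, then take the mean."""
    full = np.ma.masked_array(
        np.ma.getdata(arr),            # underlying ndarray, without the mask
        mask=np.ma.getmaskarray(arr),  # full boolean mask, even if arr's mask is nomask
    )
    return full.mean()

A = np.ma.masked_equal([1, 1, 0], value=0)
B = np.ma.masked_equal([1, 1, 1], value=0)  # no masked values

mean_a = mean_with_full_mask(A)
mean_b = mean_with_full_mask(B)
print(type(mean_a), type(mean_b))
```

Both arrays now go through the same code path, so the two results should have the same type.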

====================

Trying this method without the nomask shortcut raises an error after

        dsum = self.sum(axis=axis, dtype=dtype)
        cnt = self.count(axis=axis)
        if cnt.shape == () and (cnt == 0):
            result = masked
        else:
            result = dsum * 1. / cnt

self.count returns a plain scalar in the nomask case, but an np.int32 with a regular mask. So the cnt.shape test chokes on the former.
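One way to make such a test tolerate both kinds of count is np.ndim, which accepts plain Python numbers as well as numpy scalars. The sketch below is not numpy's actual fix; safe_mean is a hypothetical name, and it only covers the axis=None case.

```python
import numpy as np

def safe_mean(arr):
    """Hypothetical sketch of the sum/count logic for axis=None only."""
    dsum = arr.sum()
    cnt = arr.count()
    # np.ndim works for Python ints and numpy scalars alike,
    # so there is no need to access cnt.shape
    if np.ndim(cnt) == 0 and cnt == 0:
        return np.ma.masked
    return dsum * 1. / cnt

A = np.ma.masked_equal([1, 1, 0], value=0)
B = np.ma.masked_equal([1, 1, 1], value=0)  # no masked values
all_masked = np.ma.masked_array([1, 2], mask=[True, True])

print(safe_mean(A), safe_mean(B), safe_mean(all_masked))
```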

trace is the only other masked method that tries this super(MaskedArray...) 'shortcut'. There's clearly something kludgy about the mean code.

====================

Related bug issue: https://github.com/numpy/numpy/issues/5769

According to that, the same question was raised here last year: Testing equivalence of means of Numpy MaskedArray instances raises attribute error

It looks like there are a lot of masking issues, not just with mean. There may be fixes in the development master now, or in the near future.
