Unexpected nan behaviour when summing a numpy array full of nan
Question
This is an interesting topic, given that it could lead to unexpected results in code. Suppose I had an array as follows:
import numpy as np
X = np.array([np.nan,np.nan,np.nan,np.nan,np.nan])
np.nanmean(X)
np.nanmean(X) rightly emits a warning that the averaging slice is empty and returns nan. However, when summing the array, np.nansum(X) returns 0.0. While this is mathematically true (the sum of nothing is 0), the result one expects might be np.nan.
For example, I have a function that, if a file of ice data doesn't exist, creates an array of nans (180x360 points, each representing one lat/lon degree). This array is then passed to a function which sums over it to find the total amount of ice. If the expected value is 9-10 million km², and nansum returns 0, this can be misleading. This is especially difficult if ice extents are around 0 anyway. In the plot below this is clearly a missing data file leading to an ice sum of 0.0, but not all cases are so clear.
I've seen this discussed on development websites, and want to know (A) why there isn't a kwarg option for np.nansum() to return np.nan if required, and (B) whether there is a function which returns True/False if the entire matrix is full of nan.
Recommended answer
From the documentation:
In NumPy versions <= 1.8.0 Nan is returned for slices that are all-NaN or empty. In later versions zero is returned.
Workaround:
def nansumwrapper(a, **kwargs):
    # Return nan when every element is nan; otherwise defer to np.nansum.
    if np.isnan(a).all():
        return np.nan
    else:
        return np.nansum(a, **kwargs)

a = np.array([np.nan, np.nan])
b = np.array([np.nan, 1., 2.])
nansumwrapper(a)
# nan
nansumwrapper(b)
# 3.0
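The np.isnan(a).all() test used in the wrapper also covers the second part of the question, since it returns True/False depending on whether the whole array is nan. A minimal sketch follows; is_all_nan is a hypothetical helper name, not a NumPy function, and the conversion to float is an assumption so that plain lists are accepted:

```python
import numpy as np

def is_all_nan(a, axis=None):
    """Return True where every element (along axis, if given) is nan.

    Hypothetical helper, not part of NumPy.
    """
    # Assumption: input is numeric; convert so plain lists work too.
    a = np.asarray(a, dtype=float)
    return np.isnan(a).all(axis=axis)

print(is_all_nan([np.nan, np.nan]))    # True
print(is_all_nan([np.nan, 1.0, 2.0]))  # False
```

Note that an empty array also counts as "all nan" here, because all() over no elements is True.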
You can pass kwargs to np.nansum():
c = np.arange(12, dtype=np.float64).reshape(4,3)
c[2:4, 1] = np.nan
nansumwrapper(c, axis=1)
# array([ 3., 12., 14., 20.])
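One caveat: nansumwrapper checks the whole array, so when an axis is passed, a single row or column that is entirely nan still sums to 0.0. If nan is wanted per slice, something along these lines might work. This is only a sketch: nansum_keep_nan is a hypothetical name, and it deliberately supports just axis and keepdims rather than every np.nansum keyword.

```python
import numpy as np

def nansum_keep_nan(a, axis=None, keepdims=False):
    """Like np.nansum, but an all-nan slice sums to nan instead of 0.0.

    Hypothetical helper, not part of NumPy.
    """
    a = np.asarray(a, dtype=float)
    total = np.nansum(a, axis=axis, keepdims=keepdims)
    # Mark the slices (or the whole array when axis is None) that hold only nan.
    all_nan = np.isnan(a).all(axis=axis, keepdims=keepdims)
    # [()] turns a 0-d result back into a scalar and leaves arrays unchanged.
    return np.where(all_nan, np.nan, total)[()]

d = np.array([[1., np.nan],
              [np.nan, np.nan]])
print(nansum_keep_nan(d, axis=1))  # first row sums to 1.0, second row is nan
print(nansum_keep_nan(d))          # 1.0
```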