Unexpected nan behaviour when summing a numpy array full of nan


Question

This is an interesting topic because it can lead to unexpected results in code. Suppose I have the following array:

import numpy as np

X = np.array([np.nan,np.nan,np.nan,np.nan,np.nan])

np.nanmean(X) correctly emits a warning that the slice being averaged is empty and returns nan. However, summing the array with np.nansum(X) returns 0.0. While this is mathematically defensible (the sum of nothing is 0), the result a caller expects may well be np.nan.
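The difference between the two functions can be reproduced with a short sketch like the one below. It assumes a NumPy version later than 1.8.0, and it silences the RuntimeWarning from np.nanmean only so that the output stays readable; the variable names are illustrative.

```python
import warnings

import numpy as np

X = np.array([np.nan] * 5)

# np.nanmean warns about an empty slice on all-NaN input; suppress the
# warning here so only the returned values are shown.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    mean_result = np.nanmean(X)

# np.nansum treats every NaN as zero, so an all-NaN array sums to 0.0.
sum_result = np.nansum(X)

print(mean_result, sum_result)
```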

For example, I have a function that, if a file of ice data doesn't exist, creates an array filled with nans (180x360 points, each point representing one degree of latitude/longitude). This array is then passed to a function that sums over it to find the total amount of ice. If the expected value is 9-10 million km², a nansum result of 0 is misleading. It is especially hard to spot when ice extents are close to 0 anyway. In the plot below, the ice sum of 0.0 is clearly caused by a missing data file, but not all cases are so clear.

I've seen this discussed on development websites, and I'd like to know (A) why there isn't a kwarg option for np.nansum() to return np.nan when required, and (B) whether there is a function that returns True/False depending on whether the entire matrix is full of nan.

Recommended answer

From the documentation:

In NumPy versions <= 1.8.0 Nan is returned for slices that are all-NaN or empty. In later versions zero is returned.

Workaround:

def nansumwrapper(a, **kwargs):
    if np.isnan(a).all():
        return np.nan
    else:
        return np.nansum(a, **kwargs)

a = np.array([np.nan, np.nan])
b = np.array([np.nan, 1., 2.])


nansumwrapper(a)
# nan

nansumwrapper(b)
# 3.0

You can pass kwargs through to np.nansum():

c = np.arange(12, dtype=np.float64).reshape(4,3)  # np.float_ was removed in NumPy 2.0
c[2:4, 1] = np.nan

nansumwrapper(c, axis=1)
# array([  3.,  12.,  14.,  20.])
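Note that nansumwrapper only returns nan when the whole array is nan. When an axis is given, an individual row or column that is entirely nan still sums to 0. One possible variant, sketched below, applies the all-NaN check along the same axis. The name nansum_keep_nan is hypothetical and not part of NumPy, and this is only one way to approach it.

```python
import numpy as np


def nansum_keep_nan(a, axis=None):
    """Hypothetical helper: like np.nansum, but all-NaN slices yield nan."""
    a = np.asarray(a, dtype=float)
    total = np.nansum(a, axis=axis)
    # True for every slice along `axis` that contains nothing but NaN.
    # An empty slice also counts as all-NaN here, because all() of an
    # empty array is True.
    all_nan = np.isnan(a).all(axis=axis)
    return np.where(all_nan, np.nan, total)


d = np.array([[1.0, np.nan],
              [np.nan, np.nan]])

# Row 0 has one valid value; row 1 is entirely NaN and stays nan.
row_sums = nansum_keep_nan(d, axis=1)
print(row_sums)
```

The check np.isnan(a).all() used here is also the direct answer to part (B) of the question: it evaluates to True only when every element is nan.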
