Interpolate NaN values in a numpy array

Question

Is there a quick way of replacing all NaN values in a numpy array with (say) the linearly interpolated values?

For example,

[1 1 1 nan nan 2 2 nan 0]

would be converted into

[1 1 1 1.3 1.6 2 2  1  0]

Solution

Let's first define a simple helper function to make it more straightforward to handle indices and logical indices of NaNs:

import numpy as np

def nan_helper(y):
    """Helper to handle indices and logical indices of NaNs.

    Input:
        - y, 1d numpy array with possible NaNs
    Output:
        - nans, logical indices of NaNs
        - index, a function, with signature indices= index(logical_indices),
          to convert logical indices of NaNs to 'equivalent' indices
    Example:
        >>> # linear interpolation of NaNs
        >>> nans, x= nan_helper(y)
        >>> y[nans]= np.interp(x(nans), x(~nans), y[~nans])
    """

    return np.isnan(y), lambda z: z.nonzero()[0]

The nan_helper(.) function can now be used like this:

>>> y = np.array([1, 1, 1, np.nan, np.nan, 2, 2, np.nan, 0])
>>>
>>> nans, x = nan_helper(y)
>>> y[nans] = np.interp(x(nans), x(~nans), y[~nans])
>>>
>>> print(y.round(2))
[1.   1.   1.   1.33 1.67 2.   2.   1.   0.  ]
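The same steps can be wrapped into a single function that leaves the input untouched. This is only a sketch: the name interpolate_nans and the all-NaN check are assumptions added for illustration, not part of the original answer. Note that np.interp holds the nearest valid value constant for NaNs at the start or end of the array, since there is nothing to interpolate against on one side.

```python
import numpy as np

def nan_helper(y):
    """Return the NaN mask of y and a function mapping a mask to integer indices."""
    return np.isnan(y), lambda z: z.nonzero()[0]

def interpolate_nans(y):
    """Hypothetical wrapper: return a float copy of y with NaNs linearly interpolated."""
    y = np.array(y, dtype=float)  # np.array copies, so the caller's data is not modified
    nans, x = nan_helper(y)
    if nans.all():
        # Assumption: treat an all-NaN input as an error, since there is nothing to interpolate from
        raise ValueError("cannot interpolate an array that contains only NaNs")
    y[nans] = np.interp(x(nans), x(~nans), y[~nans])
    return y

result = interpolate_nans([1, 1, 1, np.nan, np.nan, 2, 2, np.nan, 0])
print(result.round(2))
```

Returning a copy instead of assigning in place is a design choice made here; the in-place form shown above works just as well when the original array is no longer needed.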

---
Although it may at first seem like overkill to define a separate function just for something like this:

>>> nans, x= np.isnan(y), lambda z: z.nonzero()[0]

it will eventually pay dividends.

So, whenever you are working with NaN-related data, just encapsulate all the (new NaN-related) functionality you need in one or more specific helper functions. Your code base will be more coherent and readable, because it follows easily understandable idioms.

Interpolation, indeed, is a nice context to see how NaN handling is done, but similar techniques are utilized in various other contexts as well.
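As one possible illustration of such another context, the same boolean-mask idiom can fill NaNs with the mean of the valid values instead of interpolating. The function name fill_nans_with_mean and the choice of the mean as the fill value are assumptions made for this sketch, not something the answer prescribes.

```python
import numpy as np

def fill_nans_with_mean(y):
    """Hypothetical helper: return a float copy of y with NaNs replaced by the mean of the non-NaN values."""
    y = np.array(y, dtype=float)  # work on a copy
    nans = np.isnan(y)            # logical indices of NaNs, as in nan_helper
    if nans.all():
        # Assumption: an all-NaN input has no meaningful mean, so raise instead of returning NaNs
        raise ValueError("no valid values to compute a mean from")
    y[nans] = y[~nans].mean()
    return y

filled = fill_nans_with_mean([1.0, np.nan, 3.0])
print(filled)
```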
