How to z-score normalize a pandas column with NaNs?

Views: 40 | Published: 2021/12/31 12:06:03 | Tags: python numpy pandas scipy


Question

I have a pandas DataFrame with a column of real values that I want to z-score normalize:

>>> import numpy as np
>>> import pandas
>>> a = np.array([np.nan, 0.0767, 0.4383, 0.7866, 0.8091, 0.1954, 0.6307,
...               0.6599, 0.1065, 0.0508])
>>> df = pandas.DataFrame({"a": a})

The problem is that a single NaN value turns the entire result into NaN:

>>> from scipy.stats import zscore
>>> zscore(df["a"])
array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan])
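The all-NaN result follows from how NaN propagates through ordinary NumPy reductions: if the mean and standard deviation are NaN, every element of `(x - mean) / std` is NaN as well. A minimal sketch of that behavior, using a small made-up array `x` rather than the data above:

```python
import numpy as np

# Illustrative data only: one missing value among three ordinary ones.
x = np.array([np.nan, 1.0, 2.0, 3.0])

# Ordinary reductions return NaN as soon as any input is NaN.
plain_mean = np.mean(x)
plain_std = np.std(x)

# Subtracting and dividing by NaN makes every element NaN.
naive_z = (x - plain_mean) / plain_std

print(plain_mean, plain_std)
print(naive_z)
```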

What is the correct way to apply zscore (or an equivalent function not from scipy) to a column of a pandas DataFrame and have it ignore the NaN values? I'd like the result to have the same dimensions as the original column, with np.nan for values that can't be normalized.

Edit: maybe the best solution is to use scipy.stats.nanmean and scipy.stats.nanstd? I don't see why the degrees of freedom of std would need to be changed for this purpose:

zscore = lambda x: (x - scipy.stats.nanmean(x)) / scipy.stats.nanstd(x)
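The `scipy.stats.nanmean` and `scipy.stats.nanstd` functions named in the edit are no longer available in recent SciPy releases. A sketch of the same idea, assuming NumPy's `np.nanmean` and `np.nanstd` are acceptable substitutes, could look like this (`nan_zscore` is a hypothetical helper name, not part of any library):

```python
import numpy as np

a = np.array([np.nan, 0.0767, 0.4383, 0.7866, 0.8091, 0.1954, 0.6307,
              0.6599, 0.1065, 0.0508])

def nan_zscore(x, ddof=0):
    """Z-score that skips NaN when computing the mean and std.

    Hypothetical helper: NaN inputs stay NaN in the output, so the
    result has the same shape as the input.
    """
    x = np.asarray(x, dtype=float)
    return (x - np.nanmean(x)) / np.nanstd(x, ddof=ddof)

z = nan_zscore(a)          # population std (ddof=0)
z1 = nan_zscore(a, ddof=1)  # sample std (ddof=1)
print(z)
```

On the degrees-of-freedom question: with n non-NaN values, switching from `ddof=0` to `ddof=1` only rescales every score by the constant factor sqrt((n - 1) / n), so it matters mainly when the numbers must match another implementation exactly.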

Recommended answer

The pandas versions of mean and std skip NaN values, so you can compute the z-score directly. To get the same result as scipy's zscore, I think you need to pass ddof=0 to std, because pandas defaults to ddof=1:

df['zscore'] = (df.a - df.a.mean()) / df.a.std(ddof=0)
print(df)

        a    zscore
0     NaN       NaN
1  0.0767 -1.148329
2  0.4383  0.071478
3  0.7866  1.246419
4  0.8091  1.322320
5  0.1954 -0.747912
6  0.6307  0.720512
7  0.6599  0.819014
8  0.1065 -1.047803
9  0.0508 -1.235699
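The same expression extends to several columns at once, because DataFrame arithmetic aligns the per-column means and standard deviations by column label. A minimal sketch, in which column `b` is made-up data added purely for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [np.nan, 0.0767, 0.4383, 0.7866, 0.8091,
          0.1954, 0.6307, 0.6599, 0.1065, 0.0508],
    # "b" is a hypothetical second column, not from the question.
    "b": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0, np.nan, 8.0, 9.0, 10.0],
})

# mean() and std() skip NaN per column (skipna=True is the default);
# ddof=0 is the population standard deviation used in the answer above.
z = (df - df.mean()) / df.std(ddof=0)
print(z)
```

Each column is normalized with its own statistics, and NaN entries stay NaN in place, so `z` has the same shape as `df`.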
