How to count the NaN values in a column in pandas DataFrame
Question
I want to find the number of NaN values in each column of my data so that I can drop a column if it has fewer NaN than some threshold. I looked but wasn't able to find any function for this. value_counts is too slow for me because most of the values are distinct and I'm only interested in the NaN count.
Recommended answer
You can use the isna() method (or its alias isnull(), which is also compatible with pandas versions older than 0.21.0) and then call sum() to count the NaN values. For one column:
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: s = pd.Series([1, 2, 3, np.nan, np.nan])
In [4]: s.isna().sum()  # or s.isnull().sum() for older pandas versions
Out[4]: 2
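Since a threshold is often easier to state as a share of rows than as a raw count, here is a minimal sketch of that variant. It relies on the fact that the mean of a boolean mask is the fraction of True values, and it cross-checks the count against count(), which skips NaN. The variable names are illustrative only and do not come from the original answer.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, np.nan, np.nan])

# Number of NaN values: True counts as 1 when summed.
nan_count = int(s.isna().sum())

# Fraction of NaN values: the mean of the boolean mask.
nan_fraction = float(s.isna().mean())

# count() ignores NaN, so len(s) - count() is another way to get the NaN count.
non_nan_count = int(s.count())

print(nan_count, nan_fraction, non_nan_count)
```

Both routes should agree: the NaN count from the mask equals the Series length minus the non-NaN count.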
The same isna().sum() call also works for multiple columns:
In [5]: df = pd.DataFrame({'a':[1,2,np.nan], 'b':[np.nan,1,np.nan]})
In [6]: df.isna().sum()
Out[6]:
a    1
b    2
dtype: int64
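To connect this back to the question's goal of dropping columns based on their NaN count, the sketch below uses the per-column counts as a boolean column selector. The extra column c, the threshold name max_nans, and the direction of the comparison (keep columns with at most max_nans NaN values) are assumptions made for illustration; flip the comparison if you want the opposite behaviour.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, np.nan],
    "b": [np.nan, 1, np.nan],
    "c": [1, 2, 3],  # extra column with no NaN, added for illustration
})

# Per-column NaN counts, computed as in the answer.
nan_counts = df.isna().sum()

# Hypothetical threshold: keep only columns with at most this many NaN values.
max_nans = 1
filtered = df.loc[:, nan_counts <= max_nans]

print(nan_counts.to_dict())
print(list(filtered.columns))
```

Because nan_counts is indexed by column name, the comparison yields a boolean Series that df.loc can use directly to select columns, without looping over them.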