How to count the NaN values in a column in pandas DataFrame
Question
I have data in which I want to find the number of NaN values, so that if it is less than some threshold, I will drop the column. I looked but wasn't able to find any function for this. There is value_counts, but it would be slow for me because most of the values are distinct and I only want the count of NaN.
Recommended answer
You can use the isna() method (or its alias isnull(), which is also compatible with older pandas versions < 0.21.0) and then sum the result to count the NaN values. isna() returns a boolean mask, and summing it counts the True entries. For one column:
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: s = pd.Series([1, 2, 3, np.nan, np.nan])
In [4]: s.isna().sum()  # or s.isnull().sum() for older pandas versions
Out[4]: 2
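Since the question is about a column inside a DataFrame, the same call can be applied after selecting that column. The following is a minimal sketch; the frame and the column name "b" are made up for illustration. It also shows an equivalent count obtained from count(), which skips missing values.

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the question).
df = pd.DataFrame({"a": [1, 2, np.nan], "b": [np.nan, 1, np.nan]})

# Number of missing values in a single column.
nan_in_b = int(df["b"].isna().sum())

# Equivalent: total rows minus non-missing values, since count() ignores NaN.
nan_in_b_alt = len(df) - int(df["b"].count())

print(nan_in_b, nan_in_b_alt)
```

Both approaches scan the column once, so neither needs to enumerate the distinct values the way value_counts does.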
It also works for several columns, returning one count per column:
In [5]: df = pd.DataFrame({'a':[1,2,np.nan], 'b':[np.nan,1,np.nan]})
In [6]: df.isna().sum()
Out[6]:
a    1
b    2
dtype: int64
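The per-column counts can then drive the threshold-based drop mentioned in the question. This is only a sketch under assumptions: the threshold value max_nan and the extra column "c" are hypothetical, and the comparison here drops columns with more NaN values than the threshold. Flip the comparison if the opposite rule is wanted.

```python
import numpy as np
import pandas as pd

# Illustrative data; column "c" is added so that one column has no NaN.
df = pd.DataFrame({
    "a": [1, 2, np.nan],
    "b": [np.nan, 1, np.nan],
    "c": [1, 2, 3],
})

nan_counts = df.isna().sum()   # NaN count per column
nan_share = df.isna().mean()   # fraction of NaN per column

max_nan = 1  # hypothetical threshold, chosen for this example only

# Columns whose NaN count exceeds the threshold (assumed direction).
to_drop = nan_counts[nan_counts > max_nan].index
cleaned = df.drop(columns=to_drop)

print(list(cleaned.columns))
```

Using nan_share instead of nan_counts gives a threshold that does not depend on the number of rows.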