How to check if any value is NaN in a Pandas DataFrame

Question

In Python Pandas, what's the best way to check whether a DataFrame has one (or more) NaN values?

I know about the function pd.isna (also available as pd.isnull), but it returns a DataFrame of booleans, one per element. This post right here doesn't exactly answer my question either.

Recommended answer

jwilner's response is spot on. I was exploring to see if there's a faster option, since in my experience, summing flat arrays is (strangely) faster than counting. This code seems faster:

df.isnull().values.any()

For example:

In [1]: import numpy as np; import pandas as pd

In [2]: df = pd.DataFrame(np.random.randn(1000,1000))

In [3]: df[df > 0.9] = np.nan

In [4]: %timeit df.isnull().any().any()
100 loops, best of 3: 14.7 ms per loop

In [5]: %timeit df.isnull().values.sum()
100 loops, best of 3: 2.15 ms per loop

In [6]: %timeit df.isnull().sum().sum()
100 loops, best of 3: 18 ms per loop

In [7]: %timeit df.isnull().values.any()
1000 loops, best of 3: 948 µs per loop
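
As a quick sanity check of what the fastest expression above returns, here is a minimal sketch on a tiny frame. The toy data and the helper name has_nan are hypothetical, introduced only for illustration; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only to illustrate the check.
clean = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
dirty = clean.copy()
dirty.loc[0, "b"] = np.nan  # introduce a single missing value


def has_nan(frame: pd.DataFrame) -> bool:
    """Return True if at least one cell in the frame is missing."""
    # .values gives the underlying boolean NumPy array, so .any()
    # reduces over every cell at once instead of column by column.
    return bool(frame.isnull().values.any())


clean_has_nan = has_nan(clean)
dirty_has_nan = has_nan(dirty)
print(clean_has_nan, dirty_has_nan)
```

The bool(...) wrapper is only there to turn the NumPy boolean into a plain Python bool, which is convenient in if statements and assertions.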

df.isnull().sum().sum() is a bit slower, but of course, has additional information -- the number of NaNs.
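
To make the "additional information" concrete, the sketch below counts NaNs on a small frame. The data is made up for illustration, and the per-column count is an extra step that the original answer does not show.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one NaN in column "a", two in column "b".
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

# The first .sum() counts NaNs per column; the second adds those counts up.
nans_per_column = df.isnull().sum()
total_nans = int(nans_per_column.sum())

print(nans_per_column.to_dict())
print(total_nans)
```

Stopping after the first .sum() is often enough when the goal is to find out which columns contain missing values.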
