Filter out rows with more than a certain number of NaN

Question

In a Pandas dataframe, I would like to filter out all the rows that have more than 2 NaNs.

Essentially, I have 4 columns and I would like to keep only those rows where at least 2 columns have finite values.

Can somebody advise on how to achieve this?

Recommended answer

The following should work:

df.dropna(thresh=2)

See the online documentation for DataFrame.dropna.

What we are doing here is dropping every row that has fewer than 2 non-NaN values; rows with 2 or more non-NaN values are kept.

Example:

In [25]:

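import numpy as np
import pandas as pd

# NaN is not a Python builtin, so use np.nan for the missing values
df = pd.DataFrame({'a':[1,2,np.nan,4,5], 'b':[np.nan,2,np.nan,4,5], 'c':[1,2,np.nan,np.nan,np.nan], 'd':[1,2,3,np.nan,5]})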

df

Out[25]:

    a   b   c   d
0   1 NaN   1   1
1   2   2   2   2
2 NaN NaN NaN   3
3   4   4 NaN NaN
4   5   5 NaN   5

[5 rows x 4 columns]

In [26]:

df.dropna(thresh=2)

Out[26]:

   a   b   c   d
0  1 NaN   1   1
1  2   2   2   2
3  4   4 NaN NaN
4  5   5 NaN   5

[4 rows x 4 columns]

Edit

This works for the example above, but note that you have to know the number of columns and set the thresh value accordingly. I originally thought thresh meant the number of NaN values, but it actually means the minimum number of non-NaN values a row must have to be kept.
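Since thresh counts non-NaN values, one way to express "at most N NaNs per row" without hard-coding the column count is to derive thresh from the frame's width, or to count the NaNs per row directly. The following is a minimal sketch along those lines, reusing the data from the example above; the names max_nan, kept_thresh and kept_mask are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

# Same data as in the example above
df = pd.DataFrame({
    'a': [1, 2, np.nan, 4, 5],
    'b': [np.nan, 2, np.nan, 4, 5],
    'c': [1, 2, np.nan, np.nan, np.nan],
    'd': [1, 2, 3, np.nan, 5],
})

# Hypothetical name: maximum number of NaNs a row may contain
max_nan = 2

# Option 1: convert "at most max_nan NaNs" into "at least
# (number of columns - max_nan) non-NaN values" for thresh
kept_thresh = df.dropna(thresh=df.shape[1] - max_nan)

# Option 2: count the NaNs in each row and filter with a boolean mask
nan_counts = df.isna().sum(axis=1)
kept_mask = df[nan_counts <= max_nan]
```

Both options keep rows 0, 1, 3 and 4 here and drop row 2, which has three NaNs. The mask version states the condition in terms of NaN counts, which may read closer to the original question.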
