Filter out rows with more than a certain number of NaN values

Problem description

In a Pandas dataframe, I would like to filter out all the rows that have more than 2 NaNs.

Essentially, I have 4 columns and I would like to keep only those rows where at least 2 columns have finite values.

Can somebody advise on how to achieve this?

Recommended answer

The following should work:

df.dropna(thresh=2)

See the online docs.

What we are doing here is keeping only the rows that have 2 or more non-NaN values; any row with fewer than 2 non-NaN values is dropped.

Example:

In [25]:

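import numpy as np
import pandas as pd

# np.nan marks the missing entries (a bare NaN name is not defined in Python)
df = pd.DataFrame({'a':[1,2,np.nan,4,5], 'b':[np.nan,2,np.nan,4,5], 'c':[1,2,np.nan,np.nan,np.nan], 'd':[1,2,3,np.nan,5]})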

df

Out[25]:

    a   b   c   d
0   1 NaN   1   1
1   2   2   2   2
2 NaN NaN NaN   3
3   4   4 NaN NaN
4   5   5 NaN   5

[5 rows x 4 columns]

In [26]:

df.dropna(thresh=2)

Out[26]:

   a   b   c   d
0  1 NaN   1   1
1  2   2   2   2
3  4   4 NaN NaN
4  5   5 NaN   5

[4 rows x 4 columns]

Edit

This works for the example above, but note that you have to know the number of columns and set the thresh value accordingly. I originally thought thresh meant the number of NaN values, but it actually means the number of non-NaN values a row must have to be kept.
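If you would rather state the limit as "at most N NaNs per row", one way to avoid hard-coding the column count is to derive thresh from the frame's shape. The following is only a sketch: the helper name drop_rows_with_too_many_nans and its max_nans parameter are made up for illustration and are not part of pandas. It also shows a boolean-mask version that counts NaNs per row directly, which should give the same rows.

```python
import numpy as np
import pandas as pd


def drop_rows_with_too_many_nans(frame, max_nans):
    """Keep rows with at most `max_nans` missing values (hypothetical helper)."""
    # thresh counts non-NaN values, so convert the NaN limit into
    # the minimum number of non-NaN values a row must have.
    min_non_nan = frame.shape[1] - max_nans
    return frame.dropna(thresh=min_non_nan)


df = pd.DataFrame({
    'a': [1, 2, np.nan, 4, 5],
    'b': [np.nan, 2, np.nan, 4, 5],
    'c': [1, 2, np.nan, np.nan, np.nan],
    'd': [1, 2, 3, np.nan, 5],
})

# With 4 columns and max_nans=2 this is the same as df.dropna(thresh=2)
kept = drop_rows_with_too_many_nans(df, max_nans=2)

# Equivalent boolean mask: count NaNs per row and keep rows within the limit
kept_by_mask = df[df.isna().sum(axis=1) <= 2]
```

Both versions drop only row 2 here, since it is the only row with more than 2 NaNs. The mask form reads closer to the original question, while the thresh form stays a single dropna call.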
