Filter out rows with more than a certain number of NaN values
Problem description
In a Pandas dataframe, I would like to filter out all the rows that have more than 2 NaNs.
Essentially, I have 4 columns and I would like to keep only those rows where at least 2 columns have finite values.
Can somebody advise on how to achieve this?
Recommended answer
The following should work:
df.dropna(thresh=2)
See the online documentation for DataFrame.dropna.
What we are doing here is keeping only the rows that have 2 or more non-NaN values; any row with fewer than 2 non-NaN values is dropped.
Example:
In [25]:
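import numpy as np
import pandas as pd
# np.nan marks the missing values (a bare NaN name is not defined by default)
df = pd.DataFrame({'a':[1,2,np.nan,4,5], 'b':[np.nan,2,np.nan,4,5], 'c':[1,2,np.nan,np.nan,np.nan], 'd':[1,2,3,np.nan,5]})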
df
Out[25]:
     a    b    c    d
0    1  NaN    1    1
1    2    2    2    2
2  NaN  NaN  NaN    3
3    4    4  NaN  NaN
4    5    5  NaN    5
[5 rows x 4 columns]
In [26]:
df.dropna(thresh=2)
Out[26]:
     a    b    c    d
0    1  NaN    1    1
1    2    2    2    2
3    4    4  NaN  NaN
4    5    5  NaN    5
[4 rows x 4 columns]
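The same filter can also be written by counting the NaN values in each row directly, which may read closer to the wording of the question ("no more than 2 NaNs"). This is only a sketch that rebuilds the example frame; the names nan_counts and filtered are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same example data as above, with np.nan for the missing values.
df = pd.DataFrame({
    'a': [1, 2, np.nan, 4, 5],
    'b': [np.nan, 2, np.nan, 4, 5],
    'c': [1, 2, np.nan, np.nan, np.nan],
    'd': [1, 2, 3, np.nan, 5],
})

# Count the NaN values in each row (axis=1 sums across the columns).
nan_counts = df.isna().sum(axis=1)

# Keep only the rows that have at most 2 NaN values.
filtered = df[nan_counts <= 2]
```

With 4 columns, "at most 2 NaNs" and "at least 2 non-NaN values" describe the same rows, so for this frame the mask should select the same rows as df.dropna(thresh=2).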
Edit
For the above example it works, but you should note that you have to know the number of columns and set the thresh value appropriately. I originally thought it meant the number of NaN values, but it actually means the number of non-NaN values. To allow at most k NaN values per row, set thresh to the number of columns minus k; here that is 4 - 2 = 2.
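To avoid hard-coding the column count, the thresh value can be derived from the shape of the frame. The helper below is a hypothetical convenience wrapper (its name and the max_nan parameter are assumptions made for illustration, not pandas API); it only converts a NaN limit into the non-NaN count that dropna expects.

```python
import numpy as np
import pandas as pd


def drop_rows_with_too_many_nan(df, max_nan):
    """Keep the rows of df that contain at most max_nan NaN values."""
    # thresh is the required number of non-NaN values per row,
    # so convert the NaN limit using the number of columns.
    min_non_nan = df.shape[1] - max_nan
    return df.dropna(thresh=min_non_nan)


df = pd.DataFrame({
    'a': [1, 2, np.nan, 4, 5],
    'b': [np.nan, 2, np.nan, 4, 5],
    'c': [1, 2, np.nan, np.nan, np.nan],
    'd': [1, 2, 3, np.nan, 5],
})

# Equivalent to df.dropna(thresh=2) for this 4-column frame.
result = drop_rows_with_too_many_nan(df, max_nan=2)
```

Written this way, the call keeps expressing "at most 2 NaNs" even if columns are later added to or removed from the frame.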