pandas dataframe: drop columns by number of NaN


Problem description

I have a dataframe in which some columns contain NaN. I'd like to drop the columns that contain at least a certain number of NaN values. For example, in the following code, I'd like to drop any column with 2 or more NaN. In this case, column 'C' will be dropped and only 'A' and 'B' will be kept. How can I implement this?

import pandas as pd
import numpy as np

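# 10 rows of random data in columns A, B and C
dff = pd.DataFrame(np.random.randn(10,3), columns=list('ABC'))
dff.iloc[3,0] = np.nan     # one NaN in column A
dff.iloc[6,1] = np.nan     # one NaN in column B
dff.iloc[5:8,2] = np.nan   # three NaN in column C (rows 5, 6, 7)

print(dff)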

Recommended answer

There is a thresh param for dropna: it is the minimum number of non-NA values a column must have in order to be kept. You just need to pass the length of your df minus the number of NaN values you are willing to tolerate as your threshold:

In [13]:

dff.dropna(thresh=len(dff) - 2, axis=1)
Out[13]:
          A         B
0  0.517199 -0.806304
1 -0.643074  0.229602
2  0.656728  0.535155
3       NaN -0.162345
4 -0.309663 -0.783539
5  1.244725 -0.274514
6 -0.254232       NaN
7 -1.242430  0.228660
8 -0.311874 -0.448886
9 -0.984453 -0.755416

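So the above will drop any column that has fewer than len(dff) - 2 non-NA values, where len(dff) is the number of rows. Here that means a column needs at least 8 non-NA values to survive, so 'C' (7 non-NA values) is dropped. Note that a column with exactly 2 NaN would still be kept by this threshold; to drop columns with 2 or more NaN, as the question asks, pass thresh=len(dff) - 1 instead. The result is the same for this example.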
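Counting the NaN values per column directly makes the cut-off explicit. The following is a minimal sketch, not part of the original answer: `drop_columns_with_nans` is a hypothetical helper name, and the data is a deterministic stand-in with the same NaN layout as the question. It drops columns with n or more NaN and checks the result against dropna with thresh=len(dff) - n + 1.

```python
import numpy as np
import pandas as pd


def drop_columns_with_nans(df, n):
    """Return df without the columns that contain n or more NaN values.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    nan_counts = df.isna().sum()      # number of NaN values in each column
    return df.loc[:, nan_counts < n]  # keep columns with fewer than n NaN


# Deterministic data with the same NaN layout as in the question
dff = pd.DataFrame(np.arange(30, dtype=float).reshape(10, 3), columns=list("ABC"))
dff.iloc[3, 0] = np.nan
dff.iloc[6, 1] = np.nan
dff.iloc[5:8, 2] = np.nan

n = 2
by_count = drop_columns_with_nans(dff, n)
# Same rule expressed through dropna: keep columns with at least
# len(dff) - n + 1 non-NA values, i.e. at most n - 1 NaN values.
by_thresh = dff.dropna(axis=1, thresh=len(dff) - n + 1)

print(dff.isna().sum().tolist())   # NaN counts for A, B, C: [1, 1, 3]
print(list(by_count.columns))      # ['A', 'B']
print(by_count.equals(by_thresh))  # True
```

The boolean-mask version is a little longer than dropna, but the comparison `nan_counts < n` states the rule in terms of NaN counts, which avoids the off-by-one that is easy to make when converting to a non-NA threshold.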
