Drop Columns with More Than 60 Percent of "empty" Values in Pandas
Problem description
I have a dataframe like this:
import pandas as pd
data = {
'c1': ['Test1','Test2','NULL','Test3',' ','Test4','Test4','Test1',"Test3"],
'c2': [' ','Test1',' ','NULL',' ','NULL','NULL','NULL','NULL'],
'c3': [0,0,0,0,0,1,5,0,0],
'c4': ['NULL', 'Test2', 'Test1','Test1', 'Test2', 'Test2','Test1','Test1','Test2']
}
df = pd.DataFrame(data)
df
The dataframe looks like this:
      c1     c2  c3     c4
0  Test1          0   NULL
1  Test2  Test1   0  Test2
2   NULL          0  Test1
3  Test3   NULL   0  Test1
4                 0  Test2
5  Test4   NULL   1  Test2
6  Test4   NULL   5  Test1
7  Test1   NULL   0  Test1
8  Test3   NULL   0  Test2
I want to drop all columns that have more than 60% "empty" values. In my case, "empty" means values such as ' ', 'NULL' or 0. There are string columns (c1, c2, c4) as well as an integer column (c3).
The result should be a dataframe with columns c1 and c4 only.
      c1     c4
0  Test1   NULL
1  Test2  Test2
2   NULL  Test1
3  Test3  Test1
4         Test2
5  Test4  Test2
6  Test4  Test1
7  Test1  Test1
8  Test3  Test2
I have no idea how to handle this problem. The only thing that comes to mind is something like
df.loc[:, (df != 0).any(axis=0)]
to delete all columns where all values are 0, 'NULL' and so on.
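As a side note, here is a quick sketch of what that expression does on the sample data. It rebuilds the dataframe from the question; the names mask and kept are introduced here for illustration only. The expression tests only against 0, and a column survives as soon as a single value differs from 0, so it cannot express a 60% threshold.

```python
import pandas as pd

data = {
    'c1': ['Test1', 'Test2', 'NULL', 'Test3', ' ', 'Test4', 'Test4', 'Test1', 'Test3'],
    'c2': [' ', 'Test1', ' ', 'NULL', ' ', 'NULL', 'NULL', 'NULL', 'NULL'],
    'c3': [0, 0, 0, 0, 0, 1, 5, 0, 0],
    'c4': ['NULL', 'Test2', 'Test1', 'Test1', 'Test2', 'Test2', 'Test1', 'Test1', 'Test2'],
}
df = pd.DataFrame(data)

# True for a column if at least one of its values is different from 0.
# Strings such as 'NULL' or ' ' are never equal to 0, so they count as non-empty here.
mask = (df != 0).any(axis=0)

# Every column has at least one value != 0, so nothing is dropped.
kept = df.loc[:, mask]
print(list(kept.columns))
```

On this data all four columns are kept, including c2 and c3.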
Recommended answer
Use DataFrame.isin to check for all the "empty" formats, then take the mean to compare against the threshold and filter by ...
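A minimal sketch of this idea, assuming the "empty" markers are exactly ' ', 'NULL' and 0 as in the question. The names empty_markers, empty_share and result are illustrative, and the final selection with loc is one possible way to apply the threshold, not necessarily the one the answer goes on to use.

```python
import pandas as pd

data = {
    'c1': ['Test1', 'Test2', 'NULL', 'Test3', ' ', 'Test4', 'Test4', 'Test1', 'Test3'],
    'c2': [' ', 'Test1', ' ', 'NULL', ' ', 'NULL', 'NULL', 'NULL', 'NULL'],
    'c3': [0, 0, 0, 0, 0, 1, 5, 0, 0],
    'c4': ['NULL', 'Test2', 'Test1', 'Test1', 'Test2', 'Test2', 'Test1', 'Test1', 'Test2'],
}
df = pd.DataFrame(data)

# Assumption: these exact values are the ones treated as "empty".
empty_markers = [' ', 'NULL', 0]

# isin gives a boolean frame; the column-wise mean of booleans is the
# fraction of "empty" values in each column.
empty_share = df.isin(empty_markers).mean()
print(empty_share)

# Keep columns whose share of "empty" values is at most 60%,
# i.e. drop those with more than 60%.
result = df.loc[:, empty_share <= 0.6]
print(result)
```

On the sample data the shares are 2/9 for c1, 8/9 for c2, 7/9 for c3 and 1/9 for c4, so only c1 and c4 remain. Note that isin matches exact values only; a cell holding several spaces or a lowercase 'null' would need an extra normalization step first.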