How do I delete a column that contains only zeros in Pandas?


Question

I have a DataFrame whose columns contain only 1s and 0s. I would like to iterate through the columns and delete the ones that consist entirely of 0s. Here is what I have tried so far:

ones = []
zeros = []
for year in years:
    for i in range(0,599):
        if year[str(i)].values.any() == 1:
            ones.append(i)
        if year[str(i)].values.all() == 0:
            zeros.append(i)
    for j in ones:
        if j in zeros:
            zeros.remove(j)
    for q in zeros:
        del year[str(q)]

Here, years is a list of DataFrames for the various years I am analyzing, ones is a list of the columns that contain a one, and zeros is a list of the columns that contain only zeros. Is there a better way to delete a column based on a condition? For some reason I have to check whether the columns in ones also appear in zeros, and remove them from zeros, to obtain the list of all-zero columns.

Recommended answer

df.loc[:, (df != 0).any(axis=0)]


Here is a breakdown of how it works:

In [74]: import pandas as pd

In [75]: df = pd.DataFrame([[1,0,0,0], [0,0,1,0]])

In [76]: df
Out[76]: 
   0  1  2  3
0  1  0  0  0
1  0  0  1  0

[2 rows x 4 columns]

df != 0 creates a boolean DataFrame that is True wherever df is nonzero:

In [77]: df != 0
Out[77]: 
       0      1      2      3
0   True  False  False  False
1  False  False   True  False

[2 rows x 4 columns]

(df != 0).any(axis=0) returns a boolean Series indicating which columns have nonzero entries. (The any operation aggregates values along the 0-axis, i.e. down the rows, into a single boolean value. Hence the result is one boolean value for each column.)

In [78]: (df != 0).any(axis=0)
Out[78]: 
0     True
1    False
2     True
3    False
dtype: bool
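
The distinction between any and all also seems to explain why the loop in the question needed its extra clean-up pass. The sketch below reuses the example df from above; the variable names has_a_zero, is_all_zero and all_zero_columns are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([[1, 0, 0, 0], [0, 0, 1, 0]])

col = df[0]  # holds the values 1 and 0, so it is NOT an all-zero column

# values.all() asks "is every value truthy?". Comparing that result to 0
# therefore asks "is at least one value zero?", which is what the
# question's second test actually checks.
has_a_zero = col.values.all() == 0

# Compare element-wise first, then aggregate: True only if every value is 0.
is_all_zero = (col == 0).all()

# The same test applied to every column at once (axis=0: one result per column).
all_zero_columns = df.columns[(df == 0).all(axis=0).to_numpy()].tolist()

print(has_a_zero, is_all_zero, all_zero_columns)
```

Under this reading, zeros in the question collected every column containing at least one 0, which is why the columns that also contained a 1 had to be removed afterwards.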

df.loc can be used to select those columns:

In [79]: df.loc[:, (df != 0).any(axis=0)]
Out[79]: 
   0  2
0  1  0
1  0  1

[2 rows x 2 columns]


To remove the all-zero columns, reassign df:

df = df.loc[:, (df != 0).any(axis=0)]
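
For the list of DataFrames described in the question, the same expression can be applied to each element in turn. This is a minimal sketch: the helper name drop_all_zero_columns and the sample data standing in for years are assumptions made for illustration.

```python
import pandas as pd


def drop_all_zero_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return df restricted to the columns that hold at least one nonzero value."""
    return df.loc[:, (df != 0).any(axis=0)]


# Hypothetical stand-in for the question's `years` list of DataFrames,
# with string column labels as in the question's year[str(i)] lookups.
years = [
    pd.DataFrame({"0": [1, 0], "1": [0, 0], "2": [0, 1]}),
    pd.DataFrame({"0": [0, 0], "1": [0, 1], "2": [0, 0]}),
]

# Rebuild the list, since each call returns a new DataFrame rather than
# modifying the original in place.
years = [drop_all_zero_columns(year) for year in years]

for year in years:
    print(list(year.columns))
```

One caveat: NaN != 0 evaluates to True, so a column that holds only zeros and missing values is kept by this test.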
