Deleting multiple columns based on column names in Pandas

Question

I have some data, and when I import it I get the following unneeded columns. I'm looking for an easy way to delete all of them:

   'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',
   'Unnamed: 28', 'Unnamed: 29', 'Unnamed: 30', 'Unnamed: 31',
   'Unnamed: 32', 'Unnamed: 33', 'Unnamed: 34', 'Unnamed: 35',
   'Unnamed: 36', 'Unnamed: 37', 'Unnamed: 38', 'Unnamed: 39',
   'Unnamed: 40', 'Unnamed: 41', 'Unnamed: 42', 'Unnamed: 43',
   'Unnamed: 44', 'Unnamed: 45', 'Unnamed: 46', 'Unnamed: 47',
   'Unnamed: 48', 'Unnamed: 49', 'Unnamed: 50', 'Unnamed: 51',
   'Unnamed: 52', 'Unnamed: 53', 'Unnamed: 54', 'Unnamed: 55',
   'Unnamed: 56', 'Unnamed: 57', 'Unnamed: 58', 'Unnamed: 59',
   'Unnamed: 60'

The columns are zero-indexed, so I tried something like:

    df.drop(df.columns[[22, 23, 24, 25,
    26, 27, 28, 29, 30, 31, 32, 55]], axis=1, inplace=True)

But this isn't very efficient. I tried writing some for loops, but that struck me as poor Pandas practice, so I'm asking here.

I've seen some similar examples (e.g. "Drop multiple columns pandas"), but they don't answer my question.

Recommended answer

I don't know what you mean by inefficient, but if you mean in terms of typing, it could be easier to just select the columns of interest and assign the result back to df:

df = df[cols_of_interest]

Here cols_of_interest is a list of the columns you care about.
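As a minimal sketch of the selection approach above: the DataFrame contents and the column names "a" and "foo" are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical frame standing in for the imported data.
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=["a", "foo", "Unnamed: 24", "Unnamed: 25"],
)

# Columns to keep; these names are assumptions for the example.
cols_of_interest = ["a", "foo"]

# Indexing with a list of labels returns a new DataFrame
# containing only those columns, in the order given.
df = df[cols_of_interest]

print(list(df.columns))
```

This works well when the set of wanted columns is small and known; it does not scale if you would have to type out dozens of names to keep.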

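Or you can slice the columns by label and pass the result to drop. The original answer used the .ix indexer, which has since been removed from pandas; .loc does the same label-based slicing:

df.drop(df.loc[:,'Unnamed: 24':'Unnamed: 60'].head(0).columns, axis=1)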

The call to head(0) selects zero rows, since we only need the column names, not the data. Note that drop returns a new DataFrame here, so assign the result back to df if you want to keep it.
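The label-slice approach can be sketched as follows. The frame is synthetic (two made-up columns "a" and "foo" followed by 'Unnamed: 24' through 'Unnamed: 60'), and the sketch assumes the column labels are unique, which label slicing on an unsorted index needs.

```python
import pandas as pd

# Hypothetical frame: two real columns followed by the unwanted run.
columns = ["a", "foo"] + [f"Unnamed: {i}" for i in range(24, 61)]
df = pd.DataFrame([list(range(len(columns)))], columns=columns)

# Label slices with .loc include BOTH endpoints, so this picks up
# every column from 'Unnamed: 24' through 'Unnamed: 60' by position.
unwanted = df.loc[:, "Unnamed: 24":"Unnamed: 60"].columns

# drop returns a new DataFrame; the original df is left unchanged.
trimmed = df.drop(columns=unwanted)

print(list(trimmed.columns))
```

Because the slice is positional between the two labels, it also removes any differently named column that happens to sit between them, which may or may not be what you want.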

Update

Another, simpler method is to take the boolean mask from str.contains and invert it to select the columns you want to keep:

In [2]:
df = pd.DataFrame(columns=['a','Unnamed: 1', 'Unnamed: 1','foo'])
df

Out[2]:
Empty DataFrame
Columns: [a, Unnamed: 1, Unnamed: 1, foo]
Index: []

In [4]:
~df.columns.str.contains('Unnamed:')

Out[4]:
array([ True, False, False,  True], dtype=bool)

In [5]:
df[df.columns[~df.columns.str.contains('Unnamed:')]]

Out[5]:
Empty DataFrame
Columns: [a, foo]
Index: []
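The same mask idea, sketched on a small frame that actually holds data. The column names and values are invented for the example, and regex=False is an added choice here so that 'Unnamed:' is matched as a literal substring.

```python
import pandas as pd

# Hypothetical frame with two unwanted 'Unnamed:' columns.
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=["a", "Unnamed: 1", "Unnamed: 2", "foo"],
)

# True for every column whose name contains the literal text 'Unnamed:'.
mask = df.columns.str.contains("Unnamed:", regex=False)

# Invert the mask to keep everything else; .loc accepts a boolean
# array along the column axis.
cleaned = df.loc[:, ~mask]

print(list(cleaned.columns))
```

Unlike the index-based or slice-based versions, this does not depend on where the unwanted columns sit in the frame, only on their names.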
