Python Pandas - Dropping multiple columns through list

Question

I tried searching for the answer to this question but was not able to find it... so here it goes.

I have a dataset with 23987 columns. I only want the information in 35 of those columns, which are spread throughout the dataset. I have put the names of these 35 columns in a list. Is there a quick way to drop all the columns except those 35 by passing the list?

I tried:

df1.drop(df1.columns.difference([ALTJ_genes]), axis=1, inplace=True)

ALTJ_genes is the list with the 35 items. The error I get is:

TypeError: unhashable type: 'list'

Is there a way to do this? I know I can reach my goal by passing the individual columns, but I want to know whether it is possible with the list, since that would make the code much clearer.

Thanks anyway!

I have provided some screenshots in case they are useful.

Now, this is the complete error I get when passing the list with all the genes.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
----> 1 df1[ALTJ_genes]

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2984             if is_iterator(key):
   2985                 key = list(key)
-> 2986             indexer = self.loc._convert_to_indexer(key, axis=1, raise_missing=True)
   2987
   2988         # take() does not accept boolean indexers

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter, raise_missing)
   1283                 # When setting, missing keys are not allowed, even with .loc:
   1284                 kwargs = {"raise_missing": True if is_setter else raise_missing}
-> 1285                 return self._get_listlike_indexer(obj, axis, **kwargs)[1]
   1286         else:
   1287             try:

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
   1090
   1091         self._validate_read_indexer(
-> 1092             keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
   1093         )
   1094         return keyarr, indexer

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1175                 raise KeyError(
   1176                     "None of [{key}] are in the [{axis}]".format(
-> 1177                         key=key, axis=self.obj._get_axis_name(axis)
   1178                     )
   1179                 )

KeyError: "None of [Index([ ('APEX1',), ('ASF1A',), ('CDKN2D',), ('CIB1',), ('DNA2',),\n ('FAAP24',), ('FANCM',), ('GEN1',), ('HRAS',), ('LIG1',),\n ('LIG3',), ('MEN1',), ('MRE11',), ('MSH3',), ('MSH6',),\n ('NUDT1',), ('MTOR',), ('NABP2',), ('NTHL1',), ('PALB2',),\n ('PARP1',), ('PARP3',), ('POLA1',), ('POLM',), ('POLQ',),\n ('PRPF19',), ('RAD51D',), ('RBBP8',), ('RRM2',), ('RUVBL2',),\n ('SOD1',), ('KAT5',), ('UNG',), ('WRN',), ('XRCC1',)],\n dtype='object', name='Gene_Name')] are in the [columns]"

Recommended answer

I think you need to remove the [] because ALTJ_genes is already a list, so [ALTJ_genes] is a nested list:

df1.drop(df1.columns.difference(ALTJ_genes), axis=1, inplace=True)

A simpler option is to select the columns by the list:

df1 = df1[ALTJ_genes]
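
To see that the two approaches give the same result, here is a minimal sketch on a small made-up frame. The column names X1, X2 and X3 and the two-item ALTJ_genes list are hypothetical stand-ins for the real 23987-column data, and it assumes ALTJ_genes is a flat list of strings:

```python
import pandas as pd

# Hypothetical stand-in for the real data: 5 columns, of which 2 are wanted.
df1 = pd.DataFrame([[1, 2, 3, 4, 5]],
                   columns=["APEX1", "X1", "ASF1A", "X2", "X3"])
ALTJ_genes = ["APEX1", "ASF1A"]  # flat list of the column names to keep

# Approach 1: drop every column whose name is NOT in the list.
dropped = df1.drop(columns=df1.columns.difference(ALTJ_genes))

# Approach 2: select the wanted columns directly.
selected = df1[ALTJ_genes]

print(list(dropped.columns))
print(list(selected.columns))
```

Note that selecting with df1[ALTJ_genes] returns the columns in the order of the list, while drop keeps the original column order of the frame; here both orders happen to coincide.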

I think the problem is that the columns were defined with a nested list, which produces a non-standard one-level MultiIndex:

import pandas as pd

df1 = pd.DataFrame([[1, 2, 3, 4]])
# nested list: creates a one-level MultiIndex
df1.columns = [['APEX1', 'ASF1A', 'CDKN2D', 'AAA']]
print(df1)
  APEX1 ASF1A CDKN2D AAA
0     1     2      3   4

print (df1.columns)
MultiIndex([( 'APEX1',),
            ( 'ASF1A',),
            ('CDKN2D',),
            (   'AAA',)],
           )

If you pass a non-nested list:

df1 = pd.DataFrame([[1, 2, 3, 4]])
# flat (non-nested) list: creates a regular Index
df1.columns = ['APEX1', 'ASF1A', 'CDKN2D', 'AAA']
print(df1)
   APEX1  ASF1A  CDKN2D  AAA
0      1      2       3    4

print (df1.columns)
Index(['APEX1', 'ASF1A', 'CDKN2D', 'AAA'], dtype='object')
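
If the frame already has such a one-level MultiIndex and cannot easily be rebuilt, one possible fix is to flatten the columns before selecting. This is only a sketch under that assumption; the two-item ALTJ_genes list is a shortened, hypothetical version of the real one:

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3, 4]])
# Nested list: the columns become a one-level MultiIndex of 1-tuples.
df1.columns = [["APEX1", "ASF1A", "CDKN2D", "AAA"]]

# Flatten: take level 0 of the MultiIndex to get a plain Index of strings.
if isinstance(df1.columns, pd.MultiIndex):
    df1.columns = df1.columns.get_level_values(0)

ALTJ_genes = ["APEX1", "CDKN2D"]  # hypothetical shortened list
subset = df1[ALTJ_genes]          # selecting by a flat list now works

print(list(subset.columns))
```

The isinstance check keeps the flattening step harmless when the columns are already a regular Index.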
