Select multiple columns by labels (pandas)


Question

I've been looking through the Python documentation and the forums for ways to select columns, but every example of indexing columns is too simplistic.

Suppose I have a 10 x 10 DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 10), index=range(10), columns=list('ABCDEFGHIJ'))

So far, all the documentation gives is simple indexing examples like

subset = df.loc[:,'A':'C']

subset = df.loc[:,'C':]

But I get an error when I try to index multiple, non-sequential columns, like this:

subset = df.loc[:,('A':'C', 'E')]

How would I index in pandas if I wanted to select columns A to C, E, and G to I? It appears that this logic will not work:

subset = df.loc[:,('A':'C', 'E', 'G':'I')]

I feel that the solution is pretty simple, but I can't get around this error. Thanks!

Recommended answer

Name- or label-based (using regular expression syntax)

df.filter(regex='[A-CEG-I]')   # does NOT depend on the column order
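One caveat with the regex approach: filter(regex=...) matches with re.search, so the character class keeps any column whose name contains one of those letters anywhere. With the single-letter names from the question this makes no difference, but with longer names an anchored pattern may be safer. A minimal sketch, assuming a hypothetical extra column named 'Alpha' that is not part of the original question:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the single-letter columns from the question plus one
# longer name ('Alpha') added purely for illustration.
cols = list('ABCDEFGHIJ') + ['Alpha']
df = pd.DataFrame(np.random.randn(3, len(cols)), columns=cols)

# Unanchored: re.search finds 'A' inside 'Alpha', so that column is kept too.
loose = df.filter(regex='[A-CEG-I]')

# Anchored: only names that consist of exactly one of the listed letters.
strict = df.filter(regex='^[A-CEG-I]$')

print(list(loose.columns))
print(list(strict.columns))
```

If the column names are all single letters, as in the question, both patterns select the same columns.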

Location-based (depends on the column order)

df[ list(df.loc[:,'A':'C']) + ['E'] + list(df.loc[:,'G':'I']) ]

Note that, unlike the regex method, this returns the columns you expect only if your columns are in alphabetical order, because a slice such as 'A':'C' takes every column positioned between those two labels. This is not necessarily a problem, however. For example, if your columns go ['A','C','B'], then you could replace 'A':'C' above with 'A':'B'.
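The order dependence described in the note above can be seen in a small sketch. The three-column frame here is a hypothetical example built only to mirror the ['A','C','B'] case mentioned in the text:

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose columns are not in alphabetical order.
df = pd.DataFrame(np.random.randn(3, 3), columns=['A', 'C', 'B'])

# A label slice runs from the first label to the second in column order,
# so 'A':'C' stops at 'C' and never reaches 'B'.
a_to_c = list(df.loc[:, 'A':'C'].columns)

# Slicing to the last label in the frame picks up all three columns.
a_to_b = list(df.loc[:, 'A':'B'].columns)

print(a_to_c)
print(a_to_b)
```

Both endpoints of a .loc label slice are inclusive, which is why 'A':'B' reaches the final column here.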

And for completeness, you always have the option shown by @Magdalena of simply listing each column individually, although it becomes much more verbose as the number of columns increases:

df[['A','B','C','E','G','H','I']]   # does NOT depend on the column order

Results for any of the above methods:

          A         B         C         E         G         H         I
0 -0.814688 -1.060864 -0.008088  2.697203 -0.763874  1.793213 -0.019520
1  0.549824  0.269340  0.405570 -0.406695 -0.536304 -1.231051  0.058018
2  0.879230 -0.666814  1.305835  0.167621 -1.100355  0.391133  0.317467
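As a quick sanity check, the three approaches can be compared directly. This is only a sketch on freshly generated random data with the question's alphabetically ordered single-letter columns; the variable names are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 10), columns=list('ABCDEFGHIJ'))

# Regex on the column names.
by_regex = df.filter(regex='[A-CEG-I]')

# Label slices converted to lists of column names, then concatenated.
by_slices = df[list(df.loc[:, 'A':'C']) + ['E'] + list(df.loc[:, 'G':'I'])]

# Every column listed explicitly.
by_list = df[['A', 'B', 'C', 'E', 'G', 'H', 'I']]

# DataFrame.equals compares values, dtypes, and labels.
same = by_regex.equals(by_slices) and by_slices.equals(by_list)
print(same)
```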
