pandas: Boolean indexing with multi index

Problem description

There are many questions here with similar titles, but I couldn't find one that addresses this issue.

I have dataframes from many different origins, and I want to filter one by another. Boolean indexing works well when the boolean series has the same length as the dataframe being filtered, but not when the series instead matches one of the higher levels of that dataframe's index.

In short, let's say I have this dataframe:

In [3]: df = pd.DataFrame({'a':[1,1,1,2,2,2,3,3,3],
                           'b':[1,2,3,1,2,3,1,2,3],
                           'c':range(9)}).set_index(['a', 'b'])

In [4]: df
Out[4]: 
     c
a b   
1 1  0
  2  1
  3  2
2 1  3
  2  4
  3  5
3 1  6
  2  7
  3  8

And this series:

In [5]: filt = pd.Series({1:True, 2:False, 3:True})

In [6]: filt
Out[6]: 
1     True
2    False
3     True
dtype: bool

The output I want is this:

     c
a b   
1 1  0
  2  1
  3  2
3 1  6
  2  7
  3  8

I am not looking for solutions that do not use the filt series, such as:

df[df.index.get_level_values('a') != 2]
df[df.index.get_level_values('a').isin([1,3])]

I want to know whether I can use my input filt series as is, the same way I would use a filter on c:

filt = df.c < 7
df[filt]

Recommended answer

If you turn the index level 'a' back into a column, you can do it as follows:

>>> df = pd.DataFrame({'a':[1,1,1,2,2,2,3,3,3], 
                       'b':[1,2,3,1,2,3,1,2,3], 
                       'c':range(9)})
>>> filt = pd.Series({1:True, 2:False, 3:True})
>>> df[filt[df['a']].values]
   a  b  c
0  1  1  0
1  1  2  1
2  1  3  2
6  3  1  6
7  3  2  7
8  3  3  8
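
The `.values` call in the line above is what makes this work: looking up `filt` by the values of column 'a' returns a series indexed by those values (1, 1, 1, 2, ...), not by the dataframe's own row labels, so the index has to be dropped before the result can serve as a positional mask. A minimal sketch of that step, assuming a pandas version where `.loc` and `.to_numpy()` can stand in for the plain `[]` lookup and `.values`; the names `looked_up` and `mask` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'b': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'c': range(9)})
filt = pd.Series({1: True, 2: False, 3: True})

# Look up one flag per row by label. The result is indexed by the
# values of 'a' (1, 1, 1, 2, ...), not by df's row labels (0..8).
looked_up = filt.loc[df['a']]

# Dropping the index leaves a plain boolean array with one entry per row,
# which df[...] applies by position.
mask = looked_up.to_numpy()
result = df[mask]
```

Without that step, pandas would try to align the looked-up series with the dataframe's index by label, and those labels do not match.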

Edit: As suggested by @joris, this also works with indices. Here is the code for your sample data:

>>> df[filt[df.index.get_level_values('a')].values]
     c
a b   
1 1  0
  2  1
  3  2
3 1  6
  2  7
  3  8
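
The index-based version can be wrapped in a small function so the same series-per-level filter is reusable across dataframes. This is only a sketch: `filter_by_level` is a hypothetical helper name, not a pandas function, and it assumes every label in the chosen level also appears in the index of `filt` (a missing label would raise a `KeyError`).

```python
import pandas as pd

def filter_by_level(frame, flags, level):
    """Hypothetical helper: keep rows whose `level` label maps to True in `flags`."""
    # One label per row, taken from the requested index level.
    labels = frame.index.get_level_values(level)
    # Look up a flag for each label, then drop the index so the mask is positional.
    return frame[flags.loc[labels].to_numpy()]

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'b': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'c': range(9)}).set_index(['a', 'b'])
filt = pd.Series({1: True, 2: False, 3: True})

result = filter_by_level(df, filt, 'a')
```

The `filt` series is used as is, which is what the question asks for; only the lookup by level label is added.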
