pandas: Boolean indexing with multi index
Question
There are many questions here with similar titles, but I couldn't find one that addresses this issue.
I have dataframes from many different origins, and I want to filter one by another. Boolean indexing works well when the boolean series is the same size as the dataframe being filtered, but not when the series is the same size as a higher level of the dataframe's index.
In short, let's say I have this dataframe:
In [4]: df = pd.DataFrame({'a':[1,1,1,2,2,2,3,3,3],
   ...:                    'b':[1,2,3,1,2,3,1,2,3],
   ...:                    'c':range(9)}).set_index(['a', 'b'])
Out[4]:
     c
a b
1 1  0
  2  1
  3  2
2 1  3
  2  4
  3  5
3 1  6
  2  7
  3  8
And this series:
In [5]: filt = pd.Series({1:True, 2:False, 3:True})
Out[6]:
1     True
2    False
3     True
dtype: bool
The output I want is this:
     c
a b
1 1  0
  2  1
  3  2
3 1  6
  2  7
  3  8
I am not looking for solutions that do not use the filt series, such as:
df[df.index.get_level_values('a') != 2]
df[df.index.get_level_values('a').isin([1,3])]
I want to know if I can use my input filt series as is, the way I would use a filter on c:
filt = df.c < 7
df[filt]
Recommended answer
If you transform your index 'a' back to a column, you can do it as follows:
>>> df = pd.DataFrame({'a':[1,1,1,2,2,2,3,3,3],
...                    'b':[1,2,3,1,2,3,1,2,3],
...                    'c':range(9)})
>>> filt = pd.Series({1:True, 2:False, 3:True})
>>> df[filt[df['a']].values]
   a  b  c
0  1  1  0
1  1  2  1
2  1  3  2
6  3  1  6
7  3  2  7
8  3  3  8
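To see why the .values step matters, here is a minimal sketch that splits the one-liner above into its parts. It uses .loc and .to_numpy() in place of the plain [] lookup and .values from the answer; that substitution is a stylistic assumption, and the names looked_up and mask are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'b': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'c': range(9)})
filt = pd.Series({1: True, 2: False, 3: True})

# Look up one boolean per row, keyed by that row's value in column 'a'.
looked_up = filt.loc[df['a']]

# The result is still labelled by the 'a' values (1, 1, 1, 2, ...),
# not by df's own row labels (0..8), so pandas cannot align it with df.
print(looked_up.index.tolist())

# Dropping the labels leaves a plain positional mask with one entry per row.
mask = looked_up.to_numpy()
result = df[mask]
print(result)
```

Because the mask is applied by position, it has to contain exactly one entry per row of df, in row order.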
Edit: As suggested by @joris, this also works with indices. Here is the code for your sample data:
>>> df[filt[df.index.get_level_values('a')].values]
     c
a b
1 1  0
  2  1
  3  2
3 1  6
  2  7
  3  8
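If the same pattern is needed for several dataframes, it could be wrapped in a small helper. The sketch below is one possible way to do that; filter_by_level is a hypothetical name, not a pandas function, and it assumes every label on the chosen level appears in filt.

```python
import pandas as pd

def filter_by_level(df, filt, level):
    """Keep the rows of df whose label on index level `level` maps to True in filt.

    Hypothetical helper for illustration; raises KeyError if a label on that
    level is missing from filt's index.
    """
    labels = df.index.get_level_values(level)
    # One boolean per row, converted to an array so it is applied by position.
    mask = filt.loc[labels].to_numpy()
    return df[mask]

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'b': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'c': range(9)}).set_index(['a', 'b'])
filt = pd.Series({1: True, 2: False, 3: True})

result = filter_by_level(df, filt, 'a')
print(result)
```

The filt series is used as is, and the MultiIndex on the result is preserved.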