How to slice one MultiIndex DataFrame with the MultiIndex of another


Problem description

I have a pandas dataframe with 3 levels of a MultiIndex. I am trying to pull out rows of this dataframe according to a list of values that correspond to two of the levels.

I have something like this:

import numpy as np
import pandas as pd

ix = pd.MultiIndex.from_product([[1, 2, 3], ['foo', 'bar'], ['baz', 'can']], names=['a', 'b', 'c'])
data = np.arange(len(ix))
df = pd.DataFrame(data, index=ix, columns=['hi'])
print(df)

           hi
a b   c      
1 foo baz   0
      can   1
  bar baz   2
      can   3
2 foo baz   4
      can   5
  bar baz   6
      can   7
3 foo baz   8
      can   9
  bar baz  10
      can  11

Now I want to take all rows where index levels 'b' and 'c' are in this index:

ix_use = pd.MultiIndex.from_tuples([('foo', 'can'), ('bar', 'baz')], names=['b', 'c'])

i.e. values of hi having ('foo', 'can') or ('bar', 'baz') in levels b and c respectively: (1, 2, 5, 6, 9, 10).

So I'd like to take a slice(None) on the first level, and pull out specific tuples on the second and third levels.

Initially I thought that passing a multi-index object to .loc would pull out the values / levels that I wanted, but this isn't working. What's the best way to do something like this?

Solution

Here is a way to get this slice:

df.sort_index(inplace=True)
idx = pd.IndexSlice
df.loc[idx[:, ('foo','bar'), 'can'], :]

yielding

           hi
a b   c      
1 bar can   3
  foo can   1
2 bar can   7
  foo can   5
3 bar can  11
  foo can   9

Note that you might need to sort the MultiIndex before you can slice it. pandas is kind enough to tell you when that is needed, by raising an error like this:

KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (3), lexsort depth (1)'

You can read more on how to use slicers in the pandas documentation on MultiIndex / advanced indexing.
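
As a minimal sketch of the sorting step, you can check whether the index is already sorted and, if not, sort into a new frame instead of mutating the original. This assumes a recent pandas, where `MultiIndex.is_monotonic_increasing` is a reasonable proxy for "lexsorted". The names `needs_sort`, `df_sorted` and `sliced` are illustrative, and the level b selection is written as a list rather than a tuple:

```python
import numpy as np
import pandas as pd

ix = pd.MultiIndex.from_product(
    [[1, 2, 3], ['foo', 'bar'], ['baz', 'can']], names=['a', 'b', 'c'])
df = pd.DataFrame(np.arange(len(ix)), index=ix, columns=['hi'])

# 'foo' comes before 'bar' in level b, so the index is not lexsorted yet.
needs_sort = not df.index.is_monotonic_increasing

# sort_index() returns a sorted copy; the original frame is left untouched.
df_sorted = df.sort_index() if needs_sort else df

# Any a, b in {'bar', 'foo'}, c == 'can' (illustrative selection).
idx = pd.IndexSlice
sliced = df_sorted.loc[idx[:, ['bar', 'foo'], 'can'], :]
print(sliced)
```

Sorting into a copy keeps the original row order available, which matters if later steps rely on it.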

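If for some reason using slicers is not an option, here is a way to select on the same levels using the .isin() method:

df[df.index.get_level_values('b').isin(ix_use.get_level_values(0)) & df.index.get_level_values('c').isin(ix_use.get_level_values(1))]

This is clearly not as concise. Note also that it tests each level independently, so it keeps every combination of the listed b and c values rather than only the specific pairs in ix_use.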

UPDATE:

For the updated conditions in your question, here is one way to do it:

cond1 = (df.index.get_level_values('b').isin(['foo'])) & (df.index.get_level_values('c').isin(['can']))
cond2 = (df.index.get_level_values('b').isin(['bar'])) & (df.index.get_level_values('c').isin(['baz']))
df[cond1 | cond2]

producing:

           hi
a b   c      
1 foo can   1
  bar baz   2
2 foo can   5
  bar baz   6
3 foo can   9
  bar baz  10
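
To avoid writing one condition per pair, one possible generalization is to drop level a and test the remaining (b, c) tuples against ix_use directly. This is a sketch rather than part of the original answer: it assumes `MultiIndex.droplevel` and `MultiIndex.isin` behave as in current pandas, and `pair_mask`, `levelwise_mask` and `result` are illustrative names. It also builds the level-by-level mask for comparison:

```python
import numpy as np
import pandas as pd

ix = pd.MultiIndex.from_product(
    [[1, 2, 3], ['foo', 'bar'], ['baz', 'can']], names=['a', 'b', 'c'])
df = pd.DataFrame(np.arange(len(ix)), index=ix, columns=['hi'])
ix_use = pd.MultiIndex.from_tuples(
    [('foo', 'can'), ('bar', 'baz')], names=['b', 'c'])

# Keep only levels b and c, then test each (b, c) tuple for membership in ix_use.
pair_mask = df.index.droplevel('a').isin(ix_use)
result = df[pair_mask]

# Level-by-level check: each level is tested on its own, so any combination
# of the listed b values and c values passes, not just the listed pairs.
levelwise_mask = (
    df.index.get_level_values('b').isin(ix_use.get_level_values('b'))
    & df.index.get_level_values('c').isin(ix_use.get_level_values('c'))
)

print(result)
print(pair_mask.sum(), levelwise_mask.sum())
```

Because the mask is a plain boolean array aligned with the rows of df, no sorting of the index is needed for this approach.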
