How do you update the levels of a pandas MultiIndex after slicing its DataFrame?

Question

I have a DataFrame with a pandas MultiIndex:

In [1]: import pandas as pd
In [2]: multi_index = pd.MultiIndex.from_product([['CAN','USA'],['total']],names=['country','sex'])
In [3]: df = pd.DataFrame({'pop':[35,318]},index=multi_index)
In [4]: df
Out[4]:
               pop
country sex
CAN     total   35
USA     total  318

Then I remove some rows from that DataFrame:

In [5]: df = df.query('pop > 100')

In [6]: df
Out[6]:
               pop
country sex
USA     total  318

But when I inspect the MultiIndex, its levels still contain both countries:

In [7]: df.index.levels[0]
Out[7]: Index([u'CAN', u'USA'], dtype='object')

I can fix this myself in a rather roundabout way, by resetting the index and setting it again:

In [8]: idx_names = df.index.names

In [9]: df = df.reset_index(drop=False)

In [10]: df = df.set_index(idx_names)

In [11]: df
Out[11]:
               pop
country sex
USA     total  318

In [12]: df.index.levels[0]
Out[12]: Index([u'USA'], dtype='object')

But this seems rather messy. Is there a better way that I'm missing?

Recommended answer

This is something that has bitten me before. Dropping columns or rows does NOT change the levels of the underlying MultiIndex, for performance and philosophical reasons, and this is officially not considered a bug (read more here). The short answer is that the developers say "that's not what the MultiIndex is for". If you need the contents of a MultiIndex level after modification, for example to iterate over them or to check whether something is included, you can use:

df.index.get_level_values(<levelname>)

This returns the values currently in use within that index level, one per row, so repeated values appear more than once.
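As a minimal sketch of the difference, the snippet below rebuilds the frame from the question and compares `.levels` with `get_level_values` after filtering. It uses boolean indexing instead of `query` purely for illustration, and the variable names (`filtered`, `stale_levels`, `active_values`) are my own, not from the original post.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
multi_index = pd.MultiIndex.from_product(
    [['CAN', 'USA'], ['total']], names=['country', 'sex']
)
df = pd.DataFrame({'pop': [35, 318]}, index=multi_index)

# Keep only the rows with a population above 100 (drops CAN).
filtered = df[df['pop'] > 100]

# .levels still lists every value the index was built with.
stale_levels = list(filtered.index.levels[0])

# get_level_values returns one entry per remaining row.
active_values = list(filtered.index.get_level_values('country'))

# Call .unique() on the result if duplicates should be collapsed.
unique_active = list(filtered.index.get_level_values('country').unique())

print(stale_levels)    # ['CAN', 'USA']
print(active_values)   # ['USA']
```

Because the result has one entry per row, `.unique()` is usually what you want when iterating over the distinct values of a level.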

So I guess the "trick" here is that the API-native way to see which values are still in use is to call get_level_values on .index or .columns, rather than reading their .levels directly.
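If the goal is the one stated in the question, namely a MultiIndex whose levels contain only the values still in use, pandas also provides `MultiIndex.remove_unused_levels()`. This is not part of the answer above, and to the best of my knowledge it requires pandas 0.20 or later, so treat the following as a sketch under that assumption rather than the answer's own method. The variable names are again illustrative.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
multi_index = pd.MultiIndex.from_product(
    [['CAN', 'USA'], ['total']], names=['country', 'sex']
)
df = pd.DataFrame({'pop': [35, 318]}, index=multi_index)

# Filter, then copy so that reassigning the index is unambiguous.
filtered = df[df['pop'] > 100].copy()

# remove_unused_levels returns a new MultiIndex; it does not modify in place.
filtered.index = filtered.index.remove_unused_levels()

trimmed_country = list(filtered.index.levels[0])
trimmed_sex = list(filtered.index.levels[1])

print(trimmed_country)  # ['USA']
print(trimmed_sex)      # ['total']
```

This replaces the `reset_index` / `set_index` round trip from the question with a single call, while leaving the data and the index names untouched.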
