Sorting a multi-index while respecting its index structure


Problem description

How can I sort a multi-index dataframe while respecting the organization of levels?

For example, given the following df, suppose we sort it by C in descending order:

                   C         D  E
A    B                           
bar  one   -0.346528  1.528538  1
     three -0.136710 -0.147842  1
flux six    0.795641 -1.610137  1
     three  1.051926 -1.316725  2
foo  five   0.906627  0.717922  0
     one   -0.152901 -0.043107  2
     two    0.542137 -0.373016  2
     two    0.329831  1.067820  1

We should get:

                   C         D  E
A    B                           
bar  three -0.136710 -0.147842  1
     one   -0.346528  1.528538  1
flux three  1.051926 -1.316725  2
     six    0.795641 -1.610137  1
foo  five   0.906627  0.717922  0
     two    0.542137 -0.373016  2
     two    0.329831  1.067820  1
     one   -0.152901 -0.043107  2

Note that what I mean by "respecting its index structure" is sorting the leaves of the dataframe without changing the ordering of higher-level indices. In other words, I want to sort the rows within each first-level group while keeping the ordering of the first level untouched.

What about doing the same in ascending order?

I read these two threads (yes, with the same title):

but they sort the dataframes according to different criteria (e.g. index names, or a specific column in a group).

Solution

Call .reset_index, sort on columns A and C, then set the index back. This is more efficient than the earlier groupby solution further down:

>>> df.reset_index().sort_values(by=['A', 'C'], ascending=[True, False]).set_index(['A', 'B'])
                C      D  E
A    B                     
bar  three -0.137 -0.148  1
     one   -0.347  1.529  1
flux three  1.052 -1.317  2
     six    0.796 -1.610  1
foo  five   0.907  0.718  0
     two    0.542 -0.373  2
     two    0.330  1.068  1
     one   -0.153 -0.043  2
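
For readers who want to run this end to end, here is a minimal sketch that rebuilds the question's frame and applies the same reset/sort/set-index idea in both directions. It uses `sort_values`, since `DataFrame.sort` has been removed from pandas. The helper name `sort_within_first_level` is made up for illustration, and the sketch assumes every index level has a name.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the post).
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "three"),
     ("flux", "six"), ("flux", "three"),
     ("foo", "five"), ("foo", "one"), ("foo", "two"), ("foo", "two")],
    names=["A", "B"],
)
df = pd.DataFrame(
    {
        "C": [-0.346528, -0.136710, 0.795641, 1.051926,
              0.906627, -0.152901, 0.542137, 0.329831],
        "D": [1.528538, -0.147842, -1.610137, -1.316725,
              0.717922, -0.043107, -0.373016, 1.067820],
        "E": [1, 1, 1, 2, 0, 2, 2, 1],
    },
    index=index,
)


def sort_within_first_level(frame, column, ascending):
    """Hypothetical helper: sort by `column` inside each first-level label.

    The first level itself is sorted ascending, as in the answer above.
    """
    levels = list(frame.index.names)  # assumes all levels are named
    return (
        frame.reset_index()
        .sort_values(by=[levels[0], column], ascending=[True, ascending])
        .set_index(levels)
    )


descending = sort_within_first_level(df, "C", ascending=False)
ascending = sort_within_first_level(df, "C", ascending=True)
print(descending)
print(ascending)
```

Passing `ascending=True` covers the "same thing in ascending order" part of the question: only the flag for C changes, while A stays ascending in both calls.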


Earlier solution: .groupby(...).apply is relatively slow and may not scale well:

>>> df['arg-sort'] = df.groupby(level='A', group_keys=False)['C'].apply(pd.Series.argsort)
>>> f = lambda obj: obj.iloc[obj.loc[::-1, 'arg-sort'], :]
>>> df.groupby(level='A', group_keys=False).apply(f)
                C      D  E  arg-sort
A    B                               
bar  three -0.137 -0.148  1         1
     one   -0.347  1.529  1         0
flux three  1.052 -1.317  2         1
     six    0.796 -1.610  1         0
foo  five   0.907  0.718  0         1
     two    0.542 -0.373  2         2
     two    0.330  1.068  1         0
     one   -0.153 -0.043  2         3
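
Both approaches above return the first level in sorted order (`sort_values` sorts A ascending, and `groupby` sorts its keys by default), which matches the original order here only because A is already sorted. If the first level must stay in its order of first appearance, one possible sketch is to number the first-level labels with `pd.factorize` and order rows with `np.lexsort`. The helper name `sort_leaves` and the small frame below are invented for illustration, and negating the values for descending order assumes the sort column is numeric.

```python
import numpy as np
import pandas as pd

# Made-up frame whose first level is NOT alphabetically sorted (foo before bar).
index = pd.MultiIndex.from_tuples(
    [("foo", "five"), ("foo", "one"), ("bar", "one"), ("bar", "three")],
    names=["A", "B"],
)
df = pd.DataFrame({"C": [0.9, -0.2, -0.3, -0.1]}, index=index)


def sort_leaves(frame, column, ascending=False, level=0):
    """Hypothetical helper: sort rows by `column` within each label of `level`,
    keeping the labels in their order of first appearance."""
    # factorize assigns 0, 1, 2, ... to labels in order of first appearance
    group_order, _ = pd.factorize(frame.index.get_level_values(level))
    values = frame[column].to_numpy()
    keys = values if ascending else -values  # numeric column assumed
    # np.lexsort uses the LAST key as the primary sort key
    positions = np.lexsort((keys, group_order))
    return frame.iloc[positions]


desc = sort_leaves(df, "C", ascending=False)
asc = sort_leaves(df, "C", ascending=True)
print(desc)
print(asc)
```

This avoids both the index round trip and the per-group Python call; whether it is faster in practice depends on the data and has not been measured here.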
