Python Pandas sorting multiindex by column, but retain tree structure
Question
Using pandas 0.20.3, I am trying to sort the n levels of a DataFrame's MultiIndex by the values of a column ('D'), in descending order, such that the hierarchy of the groups is maintained.
Sample input:
                    D
A     B    C
Gran1 Par1 Child1   3
           Child2   7
           Child3   2
      Par2 Child1   9
           Child2   2
      Par3 Child1   6
Gran2 Par1 Child1   3
      Par2 Child1   8
           Child2   2
           Child3   3
      Par3 Child1   6
           Child2   8
Desired result:
                    D
A     B    C
Gran2 Par3 Child2   8
           Child1   6
      Par2 Child1   8
           Child3   3
           Child2   2
      Par1 Child1   3
Gran1 Par1 Child2   7
           Child1   3
           Child3   2
      Par2 Child1   9
           Child2   2
      Par3 Child1   6
Solutions to other problems related to sorting and ordering multilevel indices seem to focus on sorting the actual levels of the index, or on keeping the index in order while sorting a column. I did not find a multilevel sort where the values of a column are used to sort the index by the aggregate value at each specific level. Any suggestions are greatly appreciated.
Recommended answer
You need to create three separate arrays and sort by the combination of all of them: the values of D, the sum of D within each (A, B) group, and the sum of D within each A group. In this example, I use NumPy's np.lexsort to do the sorting, and then I use iloc to reorder the rows according to that sort. At the end, I use a[::-1] to get the reverse (descending) order.
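    import numpy as np

    # np.lexsort treats the last key as the primary sort key.
    a = np.lexsort([
        df.D.values,                                         # tie-breaker: row value
        df.groupby(level=[0, 1]).D.transform('sum').values,  # secondary: sum per (A, B)
        df.groupby(level=0).D.transform('sum').values        # primary: sum per A
    ])
    # lexsort sorts ascending, so reverse the order for a descending result.
    df.iloc[a[::-1]]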
                    D
A     B    C
Gran2 Par3 Child2   8
           Child1   6
      Par2 Child1   8
           Child3   3
           Child2   2
      Par1 Child1   3
Gran1 Par1 Child2   7
           Child1   3
           Child3   2
      Par2 Child1   9
           Child2   2
      Par3 Child1   6
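
Since the question asks about n levels, the same idea can be sketched for an index of arbitrary depth. The following is a minimal, self-contained sketch rather than part of the original answer: the helper name sort_tree_desc is hypothetical, and it assumes that "sum" is the aggregate wanted at every level. It rebuilds the sample input so it can be run on its own.

```python
import numpy as np
import pandas as pd

# Rebuild the sample input from the question.
index = pd.MultiIndex.from_tuples(
    [
        ("Gran1", "Par1", "Child1"), ("Gran1", "Par1", "Child2"),
        ("Gran1", "Par1", "Child3"), ("Gran1", "Par2", "Child1"),
        ("Gran1", "Par2", "Child2"), ("Gran1", "Par3", "Child1"),
        ("Gran2", "Par1", "Child1"), ("Gran2", "Par2", "Child1"),
        ("Gran2", "Par2", "Child2"), ("Gran2", "Par2", "Child3"),
        ("Gran2", "Par3", "Child1"), ("Gran2", "Par3", "Child2"),
    ],
    names=["A", "B", "C"],
)
df = pd.DataFrame({"D": [3, 7, 2, 9, 2, 6, 3, 8, 2, 3, 6, 8]}, index=index)


def sort_tree_desc(frame, column):
    """Hypothetical helper: sort rows by `column`, descending, level by level."""
    n_levels = frame.index.nlevels
    # np.lexsort treats the LAST key as the primary one, so start with the
    # row values (least significant) and finish with the outermost group sums.
    keys = [frame[column].values]
    for depth in range(n_levels - 1, 0, -1):
        levels = list(range(depth))  # e.g. [0, 1], then [0]
        group_sum = frame.groupby(level=levels)[column].transform("sum")
        keys.append(group_sum.values)
    order = np.lexsort(keys)
    # lexsort is ascending; reverse it to get descending order.
    return frame.iloc[order[::-1]]


result = sort_tree_desc(df, "D")
print(result)
```

One caveat with this approach: if two sibling groups happen to have identical sums, their rows can interleave, because the group sum alone no longer tells them apart. The sample data has no such ties; for data that does, the level labels would need to be added as extra tie-breaking keys.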