Python Pandas sorting by multiindex and column

Views: 4234 | Published: 2018/8/2 13:55:49 | Tags: python sorting pandas indexing

Problem description

In Pandas 0.17 I am trying to sort by a specific column while keeping the hierarchical index (levels A and B). B is a running number created when the dataframe was set up through concatenation. My data looks like this:

          C      D
A   B
bar one   shiny  10
    two   dull   5
    three glossy 8
foo one   dull   3
    two   shiny  9
    three matt   12
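
For anyone who wants to reproduce this, the frame above can be rebuilt roughly as follows. This is only a sketch: the question does not show its setup code, so building the index with `pd.MultiIndex.from_tuples` is an assumption here, not how the original data was concatenated.

```python
import pandas as pd

# Hypothetical reconstruction of the sample data shown above;
# the original frame was built through concatenation, which is not shown.
index = pd.MultiIndex.from_tuples(
    [
        ("bar", "one"), ("bar", "two"), ("bar", "three"),
        ("foo", "one"), ("foo", "two"), ("foo", "three"),
    ],
    names=["A", "B"],
)
df = pd.DataFrame(
    {
        "C": ["shiny", "dull", "glossy", "dull", "shiny", "matt"],
        "D": [10, 5, 8, 3, 9, 12],
    },
    index=index,
)
print(df)
```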

This is what I need:

          C      D
A   B
bar two   dull   5
    three glossy 8
    one   shiny  10
foo one   dull   3
    three matt   12
    two   shiny  9

Below is the code I am using and the result. Note: Pandas 0.17 warns that DataFrame.sort will be deprecated, so I use sort_values.

df.sort_values(by="C", ascending=True)
          C      D
A   B
bar two   dull   5
foo one   dull   3
bar three glossy 8
foo three matt   12
bar one   shiny  10
foo two   shiny  9

Adding .groupby produces the same result:

df.sort_values(by="C", ascending=True).groupby(axis=0, level=0, as_index=True)

Similarly, sorting the index first and then grouping by the column does not help:

df.sort_index(axis=0, level=0).groupby("C", as_index=True)

I am not sure about reindexing. I need to keep the first index level A; the second level B can be reassigned, but does not have to be. It would surprise me if there were no easy solution; I guess I just can't find it. Any suggestions are appreciated.

Edit: In the meantime I dropped the second index level B, turned the first index level A into a column, sorted by multiple columns, and then set A as the index again:

df.index = df.index.droplevel(1)
df.reset_index(level=0, inplace=True)
df_sorted = df.sort_values(["A", "C"], ascending=[True, True])  # A is a column here, not an index level.
df_reindexed = df_sorted.set_index("A")

Still pretty verbose.

Recommended answer

Feels like there could be a better way, but here's one approach: group by the first index level and sort each group by C.

In [163]: def sorter(sub_df):
     ...:     sub_df = sub_df.sort_values('C')
     ...:     sub_df.index = sub_df.index.droplevel(0)
     ...:     return sub_df

In [164]: df.groupby(level='A').apply(sorter)
Out[164]: 
                C   D
A   B                
bar two      dull   5
    three  glossy   8
    one     shiny  10
foo one      dull   3
    three    matt  12
    two     shiny   9
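
As a possible shorter alternative on newer versions: from pandas 0.23 onward, `sort_values` accepts index level names alongside column labels in `by`, so the grouping step may not be needed at all. This does not apply to the 0.17 release used in the question, and the sample frame below is a hypothetical reconstruction of the asker's data, so treat it as a sketch rather than a drop-in replacement.

```python
import pandas as pd

# Hypothetical reconstruction of the sample data from the question.
index = pd.MultiIndex.from_tuples(
    [
        ("bar", "one"), ("bar", "two"), ("bar", "three"),
        ("foo", "one"), ("foo", "two"), ("foo", "three"),
    ],
    names=["A", "B"],
)
df = pd.DataFrame(
    {
        "C": ["shiny", "dull", "glossy", "dull", "shiny", "matt"],
        "D": [10, 5, 8, 3, 9, 12],
    },
    index=index,
)

# Requires pandas >= 0.23: "A" is an index level, "C" is a column.
# Rows are ordered by A first, then by C within each A; the MultiIndex is kept.
result = df.sort_values(["A", "C"])
print(result)
```

Both index levels survive, so nothing has to be dropped and re-set as in the workaround from the question's edit.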
