Custom sort order function for groupby pandas python

Question

Let's say I have a grouped DataFrame like the one below, obtained through an initial df.groupby(df["A"]).apply(some_func), where some_func itself returns a DataFrame. The second column is the second level of the MultiIndex created by the groupby.

A   B C
1 0 1 8
  1 3 3
2 0 1 2
  1 2 2
3 0 1 3
  1 2 4

I would like to order the groups by the result of a custom function applied to each group.

Assume for this example that the function is:

def my_func(group):
    return sum(group["B"]*group["C"])

The sort operation should then return:

A   B C
2 0 1 2
  1 2 2
3 0 1 3
  1 2 4
1 0 1 8
  1 3 3
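To see where this order comes from, the per-group values of my_func can be computed directly. The sketch below rebuilds the example frame; the second index level is unnamed in the question, so the name "i" is an assumption made for illustration.

```python
import pandas as pd

# Rebuild the example frame. The level name "i" is assumed;
# the question leaves the second index level unnamed.
index = pd.MultiIndex.from_product([[1, 2, 3], [0, 1]], names=["A", "i"])
df = pd.DataFrame({"B": [1, 3, 1, 2, 1, 2],
                   "C": [8, 3, 2, 2, 3, 4]}, index=index)

def my_func(group):
    return sum(group["B"] * group["C"])

# One value per group: A=1 -> 1*8 + 3*3 = 17, A=2 -> 6, A=3 -> 11
scores = df.groupby(level="A").apply(my_func)
print(scores.sort_values())
```

Sorting these values in ascending order gives the groups in the order 2, 3, 1, which matches the desired output.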

Recommended answer

This is based on @Wen-Ben's excellent answer, but uses sort_values to preserve the order both within and between groups.

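# `groups` is not defined in the answer; grouping on index level 'A' is assumed
groups = df.groupby(level='A')
# Broadcast each group's my_func result to every row of that group
df['func'] = (groups.apply(my_func)
              .reindex(df.index.get_level_values(0))
              .values)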

# Assumes the index levels are named 'A' and 'i'
(df.reset_index()
 .sort_values(['func','A','i'])
 .drop('func', axis=1)
 .set_index(['A','i']))
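The two steps can be put together as a minimal end-to-end sketch. It assumes the index levels are named "A" and "i" and adds a hypothetical fourth group whose score ties with group 2, to show that including 'A' and 'i' in the sort keys keeps tied groups contiguous and their rows in order.

```python
import pandas as pd

# Example frame plus a hypothetical group A=4 that ties with A=2 (both score 6).
index = pd.MultiIndex.from_product([[1, 2, 3, 4], [0, 1]], names=["A", "i"])
df = pd.DataFrame({"B": [1, 3, 1, 2, 1, 2, 2, 1],
                   "C": [8, 3, 2, 2, 3, 4, 2, 2]}, index=index)

def my_func(group):
    return (group["B"] * group["C"]).sum()

# Per-group score, repeated for every row of its group.
groups = df.groupby(level="A")
df["func"] = (groups.apply(my_func)
              .reindex(df.index.get_level_values("A"))
              .values)

# Sort by score, break ties by group key and then by position within the group.
result = (df.reset_index()
            .sort_values(["func", "A", "i"])
            .drop("func", axis=1)
            .set_index(["A", "i"]))
print(result)
```

Here the groups come out in the order 2, 4, 3, 1. Sorting on 'func' alone would not guarantee that the rows of groups 2 and 4 stay together.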

Note: quicksort, the default algorithm for idx.argsort(), is not stable. That is why @Wen-Ben's answer fails on more complicated datasets. You can use idx.argsort(kind='mergesort') for a stable sort, i.e., one that keeps the original order among tied values.
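The effect of a stable sort on ties can be illustrated with a small NumPy sketch. The `idx` object from @Wen-Ben's answer is not shown here, so the array below is a hypothetical stand-in for per-row scores.

```python
import numpy as np

# Hypothetical per-row scores with three tied values (6 at positions 0, 2 and 4).
scores = np.array([6, 17, 6, 11, 6])

# A stable sort keeps tied elements in their original relative order.
order = np.argsort(scores, kind="mergesort")
print(order)  # [0 2 4 3 1]
```

With the default kind, the relative order of the tied positions is not guaranteed, which is what can scramble rows in larger datasets.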
