Custom sort order function for groupby pandas python
Question
Let's say I have a grouped dataframe like the one below, obtained through an initial df.groupby(df["A"]).apply(some_func), where some_func itself returns a dataframe. The second column is the second level of the MultiIndex created by the groupby.
    A     B  C
    1  0  1  8
       1  3  3
    2  0  1  2
       1  2  2
    3  0  1  3
       1  2  4
I would like to order the groups by the result of a custom function that I apply to each group.
Let's assume for this example that the function is:
    def my_func(group):
        return sum(group["B"]*group["C"])
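To make the expected ordering concrete, here is a minimal sketch that rebuilds the example frame by hand and evaluates my_func once per group. The index level name "i" and the direct MultiIndex construction are assumptions for illustration (the question obtains the frame from groupby.apply), and the vectorized .sum() is a stand-in for the built-in sum used above.

```python
import pandas as pd

# Rebuild the example frame; the second index level is named "i" here (an assumption).
index = pd.MultiIndex.from_product([[1, 2, 3], [0, 1]], names=["A", "i"])
df = pd.DataFrame({"B": [1, 3, 1, 2, 1, 2], "C": [8, 3, 2, 2, 3, 4]}, index=index)

def my_func(group):
    # Same value as sum(group["B"] * group["C"]), computed with the pandas method.
    return (group["B"] * group["C"]).sum()

# One score per value of the first index level.
scores = df.groupby(level="A").apply(my_func)
print(scores.sort_values())
```

Group 1 scores 1*8 + 3*3 = 17, group 2 scores 1*2 + 2*2 = 6, and group 3 scores 1*3 + 2*4 = 11, so sorting in ascending order puts the groups in the order 2, 3, 1.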
I would then like the result of the sort operation to return:
    A     B  C
    2  0  1  2
       1  2  2
    3  0  1  3
       1  2  4
    1  0  1  8
       1  3  3
Recommended answer
This is based on @Wen-Ben's excellent answer, but uses sort_values to maintain the order both within and between groups.
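    # Assumption: df has a MultiIndex with levels named 'A' and 'i',
    # and groups is the groupby over the first level, e.g. df.groupby(level=0).
    # Compute one score per group and broadcast it to every row of that group.
    df['func'] = (groups.apply(my_func)
                        .reindex(df.index.get_level_values(0))
                        .values)

    # Sort by score, then by the original index levels to keep a deterministic order.
    (df.reset_index()
       .sort_values(['func', 'A', 'i'])
       .drop('func', axis=1)
       .set_index(['A', 'i']))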
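For readers who want to run the approach end to end, the following is a self-contained sketch on the question's sample data. It assumes the index levels are named "A" and "i" and that groups stands for df.groupby(level="A"). Neither is spelled out in the answer, so treat both as illustrative choices. It uses assign rather than adding a column in place, so the original frame is left untouched.

```python
import pandas as pd

# Sample data from the question; the level name "i" is an assumption.
index = pd.MultiIndex.from_product([[1, 2, 3], [0, 1]], names=["A", "i"])
df = pd.DataFrame({"B": [1, 3, 1, 2, 1, 2], "C": [8, 3, 2, 2, 3, 4]}, index=index)

def my_func(group):
    return (group["B"] * group["C"]).sum()

# One score per group, then repeated so that each row carries its group's score.
scores = df.groupby(level="A").apply(my_func)
func = scores.reindex(df.index.get_level_values("A")).to_numpy()

result = (
    df.assign(func=func)                 # temporary sort key
      .reset_index()                     # turn "A" and "i" into ordinary columns
      .sort_values(["func", "A", "i"])   # score first, then original index order
      .drop(columns="func")
      .set_index(["A", "i"])
)
print(result)
```

Including "A" and "i" as secondary sort keys is what keeps rows in their original order inside each group and gives tied groups a deterministic order.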
Note: the default algorithm for idx.argsort(), quicksort, is not stable. That's why @Wen-Ben's answer fails for complicated datasets. You can use idx.argsort(kind='mergesort') for a stable sort, i.e., one that keeps tied values in their original order.
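As a small illustration of what stability means here, the sketch below applies a stable argsort to a hypothetical array of tied keys (the array is made up for this example and is not from the answer). With kind='mergesort', equal keys keep their original relative positions. The default kind gives no such guarantee, even when it happens to produce the same result on a small input.

```python
import numpy as np

# Hypothetical per-row sort keys containing ties.
keys = np.array([1, 0, 1, 0, 1])

# Stable sort: among equal keys, the earlier position always comes first.
order = np.argsort(keys, kind="mergesort")
print(order)  # [1 3 0 2 4]
```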