pandas: sorting observations within groupby groups
Question
According to the answer to "pandas groupby sort within groups", in order to sort observations within each group one needs to do a second groupby on the results of the first groupby. Why is a second groupby needed? I would have assumed that the observations are already arranged into groups after running the first groupby, and that all that would be needed is a way to enumerate those groups (and run apply with order).
Recommended answer
Because once you apply a function after a groupby, the results are combined back into a normal, ungrouped DataFrame. Using groupby together with a groupby method such as sort should be thought of as a split-apply-combine operation.
The groupby splits the original data frame and the method is applied to each group, but then the results are combined again implicitly.
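To make the split-apply-combine point concrete, here is a minimal sketch. The sample data and the column names job, source and count are assumptions chosen to resemble the other question, not taken from it. It shows that the result of the first groupby is an ordinary Series, so a second groupby is needed to work per job again.

```python
import pandas as pd

# Hypothetical sample data (not from the original question).
df = pd.DataFrame({
    "job":    ["sales", "sales", "sales", "sales", "market", "market", "market"],
    "source": ["A", "A", "B", "C", "A", "B", "C"],
    "count":  [2, 3, 4, 1, 5, 3, 6],
})

# Split: groupby alone only builds a lazy GroupBy object.
grouped = df.groupby(["job", "source"])

# Apply + combine: the aggregation returns a plain Series
# indexed by (job, source); the grouping itself is gone.
totals = grouped["count"].sum()

# To work "within each job" again, regroup on the job index level.
top2 = totals.groupby(level="job", group_keys=False).nlargest(2)
print(top2)
```

The second groupby is not redundant work: it re-establishes the per-job split that the combine step of the first aggregation discarded.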
In that other question, they could have reversed the order of operations (sorting first) and avoided the second groupby:
# df.sort was removed in pandas 0.20; sort_values is the current equivalent
df.sort_values(['job', 'count'], ascending=False).groupby('job').head(3)
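As a rough illustration of the sort-first approach, the sketch below runs it on hypothetical data, with the same assumed column names job, source and count. It uses sort_values, the current pandas name for the sorting method, and assumes each row already holds the count to rank by (no aggregation step).

```python
import pandas as pd

# Hypothetical sample data (not from the original question).
df = pd.DataFrame({
    "job":    ["sales", "sales", "sales", "market", "market", "market", "market"],
    "source": ["A", "B", "C", "A", "B", "C", "D"],
    "count":  [2, 4, 7, 5, 3, 1, 6],
})

# Sort once over the whole frame, then keep the first two rows of each job.
# head() on a GroupBy keeps rows in the order of the sorted frame.
top2 = (
    df.sort_values(["job", "count"], ascending=False)
      .groupby("job")
      .head(2)
)
print(top2)
```

Because the frame is sorted before it is split, a single groupby is enough: head only has to pick the leading rows of each group.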