pandas groupby sort within groups
Problem description
I want to group my dataframe by two columns and then sort the aggregated results within the groups.
In [167]: df
Out[167]:
   count     job source
0      2   sales      A
1      4   sales      B
2      6   sales      C
3      3   sales      D
4      7   sales      E
5      5  market      A
6      3  market      B
7      2  market      C
8      4  market      D
9      1  market      E
In [168]: df.groupby(['job','source']).agg({'count':'sum'})
Out[168]:
               count
job    source
market A           5
       B           3
       C           2
       D           4
       E           1
sales  A           2
       B           4
       C           6
       D           3
       E           7
I would now like to sort the count column in descending order within each of the groups, and then take only the top three rows, to get something like:
               count
job    source
market A           5
       D           4
       B           3
sales  E           7
       C           6
       B           4
What you want to do is actually again a groupby (on the result of the first groupby): sort and take the first three elements per group.
Starting from the result of the first groupby:
In [60]: df_agg = df.groupby(['job','source']).agg({'count':'sum'})
We group by the first level of the index:
In [63]: g = df_agg['count'].groupby(level=0, group_keys=False)
Then we want to sort each group in descending order and take the first three elements (Series.order has since been replaced by Series.sort_values):
In [64]: res = g.apply(lambda x: x.sort_values(ascending=False).head(3))
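For anyone who wants to run the steps so far, here is a minimal self-contained sketch. It rebuilds the example frame from the question (the constructor call is my own reconstruction of the printed table) and uses `sort_values`, since `Series.order` is no longer available in newer pandas releases.

```python
import pandas as pd

# Reconstruction of the example frame shown in the question.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# First groupby: one summed count per (job, source) pair.
df_agg = df.groupby(["job", "source"]).agg({"count": "sum"})

# Second groupby on the first index level; group_keys=False keeps
# the (job, source) index instead of adding 'job' as an extra level.
g = df_agg["count"].groupby(level=0, group_keys=False)

# Sort each group in descending order and keep its first three rows.
res = g.apply(lambda x: x.sort_values(ascending=False).head(3))
print(res)
```

The result is a Series indexed by (job, source), holding the three largest counts of each job in descending order.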
However, there is a shortcut for exactly this, nlargest:
In [65]: g.nlargest(3)
Out[65]:
job     source
market  A         5
        D         4
        B         3
sales   E         7
        C         6
        B         4
dtype: int64
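As a runnable sketch of the shortcut, the snippet below applies `nlargest(3)` to the same grouped Series. It also shows one possible alternative that is not part of the original answer: sorting the aggregated frame and then calling `groupby(...).head(3)`, which keeps the result as a DataFrame with a `count` column, as in the desired output. The frame construction and the names `top3` and `top3_df` are my own, chosen for illustration.

```python
import pandas as pd

# Reconstruction of the example frame shown in the question.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})
df_agg = df.groupby(["job", "source"]).agg({"count": "sum"})

# Shortcut from the answer: three largest counts per job, as a Series.
g = df_agg["count"].groupby(level=0, group_keys=False)
top3 = g.nlargest(3)
print(top3)

# Alternative (an assumption, not from the answer): sort by job
# ascending and count descending, then keep the first three rows of
# each job. The result stays a DataFrame.
top3_df = (
    df_agg.sort_values(["job", "count"], ascending=[True, False])
          .groupby(level="job")
          .head(3)
)
print(top3_df)
```

Both variants should give the same six rows. `nlargest` is shorter when only one column matters, while the `sort_values` plus `head` form may be more convenient when other columns of the frame need to be carried along.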