How to select the top n rows from each group after a groupby in pandas?
Question
I have a pandas DataFrame with the following columns:
open_year, open_month, type, col1, col2, ....
I'd like to find the top type in each (year, month), so I first count the occurrences of each type in each (year, month):
freq_df = df.groupby(['open_year','open_month','type']).size().reset_index()
freq_df.columns = ['open_year','open_month','type','count']
Then I want to find the top n types by frequency (i.e. count) within each (year, month). How can I do that?
I can use nlargest, but then I'm missing the type:
freq_df.groupby(['open_year','open_month'])['count'].nlargest(5)
The result does not include the type column.
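For context, here is a minimal sketch of why the column goes missing. Selecting ['count'] before calling nlargest leaves only a Series of counts, so type is no longer part of the result. The sample data and the value n = 2 are made up for illustration. The sketch also assumes that the last index level of the nlargest result holds the original row labels of freq_df, which is how current pandas versions behave as far as I know.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "open_year":  [2020] * 6 + [2021] * 4,
    "open_month": [1] * 6 + [2] * 4,
    "type":       ["a", "a", "a", "b", "b", "c", "x", "y", "y", "y"],
})

freq_df = df.groupby(["open_year", "open_month", "type"]).size().reset_index()
freq_df.columns = ["open_year", "open_month", "type", "count"]

# nlargest on the selected 'count' column returns a Series of counts only.
largest = freq_df.groupby(["open_year", "open_month"])["count"].nlargest(2)

# Assumption: the last index level carries the original row labels,
# so the full rows (including 'type') can be looked up again.
row_labels = largest.index.get_level_values(-1)
rows_with_type = freq_df.loc[row_labels]
print(rows_with_type)
```

This lookup is only a workaround; it shows that the information is still reachable through the index rather than being lost outright.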
Recommended answer
I'd recommend sorting your counts in descending order first, and then calling GroupBy.head:
(freq_df.sort_values('count', ascending=False)
.groupby(['open_year','open_month'], sort=False).head(5)
)
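The approach above can be tried end to end with a small example. This is only a sketch: the sample data and n = 2 are invented for illustration, and reset_index(name="count") is used as a shorthand for renaming the columns afterwards.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "open_year":  [2020] * 6 + [2021] * 4,
    "open_month": [1] * 6 + [2] * 4,
    "type":       ["a", "a", "a", "b", "b", "c", "x", "y", "y", "y"],
})

# Count each type within each (year, month).
freq_df = (
    df.groupby(["open_year", "open_month", "type"])
      .size()
      .reset_index(name="count")
)

# Sort by count (largest first), then keep the first n rows of every group.
# head() returns whole rows, so the 'type' column is preserved.
n = 2
top_n = (
    freq_df.sort_values("count", ascending=False)
           .groupby(["open_year", "open_month"], sort=False)
           .head(n)
)
print(top_n)
```

Because head keeps rows in the order they appear within each group, sorting first is what makes the first n rows the n largest. Rows with equal counts may come out in either order, so if ties matter it is probably worth adding a second sort key.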