How to select the top n rows from each group after a groupby in pandas?

Question

I have a pandas DataFrame with the following columns:

 open_year, open_month, type, col1, col2, ....

I'd like to find the top type in each (year, month), so I first compute the count of each type in each (year, month):

freq_df = df.groupby(['open_year','open_month','type']).size().reset_index()
freq_df.columns = ['open_year','open_month','type','count']

Then I want to find the top n types by frequency (i.e. count) for each (year, month). How can I do that?

I can use nlargest:

freq_df.groupby(['open_year','open_month'])['count'].nlargest(5)

but the result is missing the type column.

Recommended answer

I'd recommend sorting your counts in descending order first, then calling GroupBy.head, which keeps the first n rows of each group along with all of their columns:

(freq_df.sort_values('count', ascending=False)
        .groupby(['open_year','open_month'], sort=False).head(5)
)
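
To see the approach end to end, here is a minimal sketch on made-up data. Only the column names (open_year, open_month, type) come from the question; the sample rows and the choice of n = 2 are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "open_year":  [2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021],
    "open_month": [1, 1, 1, 1, 1, 1, 2, 2, 2],
    "type":       ["a", "a", "a", "b", "b", "c", "x", "y", "y"],
})

# Count how often each type occurs within each (year, month).
freq_df = (
    df.groupby(["open_year", "open_month", "type"])
      .size()
      .reset_index(name="count")
)

n = 2  # illustrative value; the question uses 5

# Sort all rows by count, then keep the first n rows of each (year, month).
# head() returns whole rows, so the type column is preserved.
top_n = (
    freq_df.sort_values("count", ascending=False)
           .groupby(["open_year", "open_month"], sort=False)
           .head(n)
)
print(top_n)
```

Because head simply takes the first n rows per group in their current order, any ties at the cutoff are broken by whatever order the sort leaves them in. If that matters, one option is to add a secondary sort key to sort_values.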
