How to select the top n rows from each group after a group by in pandas?

Views: 69 | Published: 2020/5/23 23:25:03 | Tags: python, pandas


I have a pandas DataFrame with the following columns:

 open_year, open_month, type, col1, col2, ....

I'd like to find the top type in each (year, month), so I first compute the count of each type in each (year, month):

freq_df = df.groupby(['open_year','open_month','type']).size().reset_index()
freq_df.columns = ['open_year','open_month','type','count']
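To make the counting step concrete, here is a minimal sketch on a small made-up frame. Only the column names come from the question; the rows are hypothetical. It also passes `name="count"` to `reset_index`, which names the count column in one step instead of reassigning `freq_df.columns` afterwards.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "open_year":  [2019, 2019, 2019, 2019, 2020, 2020],
    "open_month": [1, 1, 1, 2, 1, 1],
    "type":       ["a", "a", "b", "a", "c", "c"],
})

# size() counts the rows in each (year, month, type) group;
# reset_index(name=...) turns the result into a frame with a named count column.
freq_df = (
    df.groupby(["open_year", "open_month", "type"])
      .size()
      .reset_index(name="count")
)
print(freq_df)
```

Each row of `freq_df` is one (year, month, type) combination together with the number of times it occurs.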

Then I want to find the top n types by frequency (i.e. count) for each (year, month). How can I do that?

I can use nlargest, but then I lose the type:

freq_df.groupby(['open_year','open_month'])['count'].nlargest(5)

The result is a Series holding only the count values, so the type column is missing.

I'd recommend sorting your counts in descending order first, and then calling GroupBy.head:

(freq_df.sort_values('count', ascending=False)
        .groupby(['open_year','open_month'], sort=False).head(5)
)
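The following is a small end-to-end sketch of the sort-then-head approach. The counts table is made up for illustration, and the final re-sort and `reset_index` are optional additions for readability, not part of the answer above.

```python
import pandas as pd

# Hypothetical counts table in the shape of freq_df from the question.
freq_df = pd.DataFrame({
    "open_year":  [2019, 2019, 2019, 2020, 2020],
    "open_month": [1, 1, 1, 1, 1],
    "type":       ["a", "b", "c", "a", "b"],
    "count":      [5, 9, 2, 4, 7],
})

n = 2  # number of top types to keep per (year, month)

top_n = (
    freq_df.sort_values("count", ascending=False)
           # head(n) keeps the first n rows of each group, i.e. the n largest
           # counts, and returns whole rows, so the type column survives.
           .groupby(["open_year", "open_month"], sort=False)
           .head(n)
           # Optional: order the output by group, largest count first.
           .sort_values(["open_year", "open_month", "count"],
                        ascending=[True, True, False])
           .reset_index(drop=True)
)
print(top_n)
```

Because `head` works on whole rows, every column of `freq_df` is retained. If several types tie on count, which of them makes the cut depends on the order produced by `sort_values`; passing `kind="stable"` there keeps tied rows in their original order.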
