How can I filter a Pandas GroupBy object and obtain a GroupBy object back?

Question

When I call filter on the result of a Pandas groupby operation, it returns a DataFrame. If I then want to perform further group-wise computations, I have to call groupby again, which seems rather roundabout. Is there a more idiomatic way of doing this?

To illustrate what I mean:

We shamelessly steal a toy dataframe from the Pandas docs, and group:

>>> import numpy as np
>>> import pandas as pd
>>> dff = pd.DataFrame({'A': np.arange(8), 'B': list('aabbbbcc')})
>>> grouped = dff.groupby('B')
>>> type(grouped)
<class 'pandas.core.groupby.DataFrameGroupBy'>

This returns a groupby object over which we can iterate, perform group-wise operations, etc. But if we filter:

>>> filtered = grouped.filter(lambda x: len(x) > 2)
>>> type(filtered)
<class 'pandas.core.frame.DataFrame'>

We get back a dataframe. Is there a nice idiomatic way of obtaining the filtered groups back, instead of just the original rows which belonged to the filtered groups?

Recommended answer

If you want to combine a filter and an aggregate, the best way I can think of is to put a conditional expression inside apply that returns None for the groups to be filtered out, and then call dropna to remove those rows from the final result:

grouped.apply(lambda x: x.sum() if len(x) > 2 else None).dropna()
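An alternative that avoids returning None from apply is to aggregate every group first and then mask the result by group size. The sketch below is not the one-liner above but a variant of the same idea. It assumes the toy frame from the question, and it restricts the sum to column A so the output stays numeric; the names sums, sizes and result are illustrative.

```python
import numpy as np
import pandas as pd

# Toy frame from the question.
dff = pd.DataFrame({'A': np.arange(8), 'B': list('aabbbbcc')})
grouped = dff.groupby('B')

# Aggregate all groups, then keep only those that pass the size filter.
sums = grouped['A'].sum()   # one value per group, indexed by B
sizes = grouped.size()      # number of rows per group, indexed by B
result = sums[sizes > 2]    # boolean mask aligned on the group keys

print(result)
```

Because both Series share the group keys as their index, the mask lines up without any extra bookkeeping.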

If you want to iterate through the groups, say to join them back together, you could use a generator expression:

pd.concat(g for _, g in grouped if len(g) > 2)
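As a runnable sketch of that approach, again assuming the toy frame from the question: the concatenated frame can be regrouped to carry on with group-wise operations, which is the workaround the question describes. The names kept and regrouped are illustrative.

```python
import numpy as np
import pandas as pd

# Toy frame from the question.
dff = pd.DataFrame({'A': np.arange(8), 'B': list('aabbbbcc')})
grouped = dff.groupby('B')

# Keep only the groups with more than 2 rows and stitch them back together.
kept = pd.concat(g for _, g in grouped if len(g) > 2)

# Regroup the surviving rows to continue with group-wise computations.
regrouped = kept.groupby('B')

print(regrouped['A'].mean())
```

Note that pd.concat raises a ValueError when no group passes the condition, so an empty result needs to be handled separately.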

Ultimately I think it would be better if groupby.filter had an option to return a groupby object.
