How can I filter a Pandas GroupBy object and obtain a GroupBy object back?
Question
When I call filter on the result of a Pandas groupby operation, it returns a DataFrame. If I then want to perform further group-wise computations, I have to call groupby again, which seems roundabout. Is there a more idiomatic way of doing this?
To illustrate what I mean, we shamelessly steal a toy dataframe from the Pandas docs and group it:
>>> import numpy as np
>>> import pandas as pd
>>> dff = pd.DataFrame({'A': np.arange(8), 'B': list('aabbbbcc')})
>>> grouped = dff.groupby('B')
>>> type(grouped)
<class 'pandas.core.groupby.DataFrameGroupBy'>
This returns a groupby object that we can iterate over, apply group-wise operations to, and so on. But if we filter:
>>> filtered = grouped.filter(lambda x: len(x) > 2)
>>> type(filtered)
<class 'pandas.core.frame.DataFrame'>
We get back a DataFrame. Is there a nice idiomatic way of getting the filtered groups back, rather than just the original rows that belonged to those groups?
Recommended answer
If you want to combine a filter and an aggregate, the best way I can think of is to do both inside apply: use a ternary if that returns the aggregate for groups you want to keep and None for groups you want to filter out, then call dropna to remove the filtered-out rows from the final result:
grouped.apply(lambda x: x.sum() if len(x) > 2 else None).dropna()
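As a minimal, self-contained sketch of this pattern, the snippet below rebuilds the toy dataframe from the question and applies the same idea to column A only. Selecting a single numeric column is an assumption made here to keep the result easy to read; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Toy dataframe from the question: group 'a' has 2 rows, 'b' has 4, 'c' has 2.
dff = pd.DataFrame({'A': np.arange(8), 'B': list('aabbbbcc')})
grouped = dff.groupby('B')

# Aggregate only the groups with more than two rows; the other groups
# yield None, which becomes a missing value that dropna() then removes.
result = grouped['A'].apply(lambda x: x.sum() if len(x) > 2 else None).dropna()

print(result)
```

Only group b has more than two rows, so it is the only entry left after dropna.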
If you want to iterate through the groups, say to join them back together, you could use a generator expression:
pd.concat(g for _, g in grouped if len(g) > 2)
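A runnable version of the iteration approach might look like the sketch below, again using the toy dataframe from the question. The variable name kept is chosen here for illustration.

```python
import numpy as np
import pandas as pd

dff = pd.DataFrame({'A': np.arange(8), 'B': list('aabbbbcc')})
grouped = dff.groupby('B')

# Iterating over a GroupBy yields (key, sub-DataFrame) pairs.
# Keep the sub-frames with more than two rows and concatenate them.
kept = pd.concat(g for _, g in grouped if len(g) > 2)

print(kept)
```

On this data the result contains the four rows of group b, the same rows that grouped.filter(lambda x: len(x) > 2) selects. It is still a plain DataFrame, not a GroupBy object.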
Ultimately, I think it would be better if groupby.filter had an option to return a groupby object.
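Until such an option exists, the re-grouping step can at least be hidden behind a small wrapper. The function filter_groups below is a hypothetical helper written for this illustration, not part of the pandas API; it simply filters and then groups again by the same key.

```python
import numpy as np
import pandas as pd


def filter_groups(df, by, func):
    """Hypothetical helper (not a pandas function): drop the groups for
    which func returns False, then re-group the remaining rows by `by`."""
    return df.groupby(by).filter(func).groupby(by)


dff = pd.DataFrame({'A': np.arange(8), 'B': list('aabbbbcc')})

# The result is a GroupBy object again, so group-wise operations chain on.
regrouped = filter_groups(dff, 'B', lambda x: len(x) > 2)
sums = regrouped['A'].sum()

print(sums)
```

This still groups twice internally, so it tidies the calling code rather than saving any work.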