Python 3 pandas.groupby.filter

Question

I am trying to perform a groupby filter that is very similar to this example from the documentation (pandas groupby filter):

>>> import pandas as pd
>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
...                           'foo', 'bar'],
...                    'B' : [1, 2, 3, 4, 5, 6],
...                    'C' : [2.0, 5., 8., 1., 2., 9.]})
>>> grouped = df.groupby('A')
>>> grouped.filter(lambda x: x['B'].mean() > 3.)
     A  B    C
1  bar  2  5.0
3  bar  4  1.0
5  bar  6  9.0

I am trying to return a DataFrame that has all 3 columns but only 2 rows: for each group in column A, the row that contains the minimum value of column B. I tried the following line of code:

grouped.filter(lambda x: x['B'] == x['B'].min())

But this doesn't work, and I get this error:

TypeError: filter function returned a Series, but expected a scalar bool

The DataFrame I am trying to return should look like this:

    A   B   C
0  foo  1  2.0
1  bar  2  5.0

I would appreciate any help you can provide. Thank you in advance.

Recommended answer

>>> # sort=False to return the rows in the order they originally occurred
>>> df.loc[df.groupby("A", sort=False)["B"].idxmin()]

     A  B    C
0  foo  1  2.0
1  bar  2  5.0
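
For context, the original attempt likely fails because `filter` keeps or drops whole groups, so its function has to return a single boolean per group rather than one value per row. The sketch below is a self-contained version of the answer above, alongside one possible alternative that builds a row-level mask with `transform("min")`. The variable names (`group_min`, `by_mask`, `by_idxmin`) are illustrative only and do not come from the original post.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar"],
    "B": [1, 2, 3, 4, 5, 6],
    "C": [2.0, 5.0, 8.0, 1.0, 2.0, 9.0],
})

# Approach from the answer: idxmin() returns the index label of the
# minimum B in each group, and .loc selects those rows.
# sort=False keeps groups in order of first appearance (foo, then bar).
by_idxmin = df.loc[df.groupby("A", sort=False)["B"].idxmin()]

# Alternative sketch (an assumption, not part of the original answer):
# transform("min") broadcasts each group's minimum back to every row,
# giving a row-level boolean mask. Unlike idxmin, this keeps every row
# that ties for the minimum within its group.
group_min = df.groupby("A")["B"].transform("min")
by_mask = df[df["B"] == group_min]

print(by_idxmin)
print(by_mask)
```

On this data both approaches select the same two rows. They would differ if a group had several rows sharing the minimum value of B: `idxmin` returns only the first such row, while the mask keeps all of them.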
