Pandas `agg` to list, "AttributeError / ValueError: Function does not reduce"

Problem description

Often, when we perform groupby operations using pandas, we may wish to apply several functions across multiple series.

groupby.agg seems the natural way to perform these groupings and calculations.

However, there seems to be a discrepancy between how groupby.agg and groupby.apply are implemented, because I cannot aggregate to a list using agg. Tuple and set work fine, which suggests to me that you can only aggregate to immutable types via agg. Via groupby.apply, I can aggregate one series to a list directly with no issues.

Below is a complete example. Calls (1), (2) and (3) complete successfully. Call (4) fails with ValueError: Function does not reduce.

import pandas as pd

df = pd.DataFrame([['Bob', '1/1/18', 'AType', 'blah', 'test', 'test2'],
                   ['Bob', '1/1/18', 'AType', 'blah2', 'test', 'test3'],
                   ['Bob', '1/1/18', 'BType', 'blah', 'test', 'test2']],
                  columns=['NAME', 'DATE', 'TYPE', 'VALUE A', 'VALUE B', 'VALUE C'])


def grouper(df, func):
    f = {'VALUE A': lambda x: func(x), 'VALUE B': 'last', 'VALUE C': 'last'}
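    # Select several columns with a list (double brackets); the bare
    # 'a', 'b', 'c' form is rejected by newer pandas versions.
    return df.groupby(['NAME', 'DATE', 'TYPE'])[['VALUE A', 'VALUE B', 'VALUE C']]\
             .agg(f).reset_index()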

# (1) SUCCESS
grouper(df, set)

# (2) SUCCESS
grouper(df, tuple)

# (3) SUCCESS
df.groupby(['NAME', 'DATE', 'TYPE', 'VALUE B', 'VALUE C'])['VALUE A']\
  .apply(list).reset_index()

# (4) FAIL
grouper(df, list)

# AttributeError
# ValueError: Function does not reduce

Recommended answer

After much investigation, I have discovered that this is a bug, which will be fixed in a future release of pandas.

The offending code in groupby.py in 0.22.x; note the isinstance(res, list) check:

def _aggregate_series_pure_python(self, obj, func):

    group_index, _, ngroups = self.group_info

    counts = np.zeros(ngroups, dtype=int)
    result = None

    splitter = get_splitter(obj, group_index, ngroups, axis=self.axis)

    for label, group in splitter:
        res = func(group)
        if result is None:
            if (isinstance(res, (Series, Index, np.ndarray)) or
                    isinstance(res, list)):
                raise ValueError('Function does not reduce')
            result = np.empty(ngroups, dtype='O')

        counts[label] = group.shape[0]
        result[label] = res

    result = lib.maybe_convert_objects(result, try_float=0)
    return result, counts
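
The effect of that extra isinstance(res, list) test can be sketched outside pandas. The following is only a simplified stand-in, not the pandas implementation: the function name aggregate_groups, the reject_lists flag and the plain-dict groups are all hypothetical, and unlike the original it tests every group's result rather than only the first.

```python
def aggregate_groups(groups, func, reject_lists):
    """Hypothetical, simplified stand-in for the reduction check.

    groups maps a group label to the list of values in that group.
    reject_lists=True mimics the 0.22.x behaviour quoted above.
    """
    results = {}
    for label, values in groups.items():
        res = func(values)
        # A list result is treated as "not reduced" when the flag is set.
        if reject_lists and isinstance(res, list):
            raise ValueError('Function does not reduce')
        results[label] = res
    return results


groups = {'AType': ['blah', 'blah2'], 'BType': ['blah']}

# Tuples pass the check even when lists are rejected.
as_tuples = aggregate_groups(groups, tuple, reject_lists=True)

# Without the list check, aggregating to lists goes through.
as_lists = aggregate_groups(groups, list, reject_lists=False)

# With the list check, the same call raises the error from the question.
try:
    aggregate_groups(groups, list, reject_lists=True)
    old_error = None
except ValueError as exc:
    old_error = str(exc)
```

This suggests why set and tuple succeeded in the question while list failed: the check singled out list by type rather than testing mutability.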

On the master branch, groupby.py omits the isinstance(res, list) check:

def _aggregate_series_pure_python(self, obj, func):

    group_index, _, ngroups = self.group_info

    counts = np.zeros(ngroups, dtype=int)
    result = None

    splitter = get_splitter(obj, group_index, ngroups, axis=self.axis)

    for label, group in splitter:
        res = func(group)
        if result is None:
            if (isinstance(res, (Series, Index, np.ndarray))):
                raise ValueError('Function does not reduce')
            result = np.empty(ngroups, dtype='O')

        counts[label] = group.shape[0]
        result[label] = res

    result = lib.maybe_convert_objects(result, try_float=0)
    return result, counts
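
On a pandas version that still carries the list check, one possible workaround (my own suggestion, not part of the original answer) is to aggregate to tuple, which the question shows succeeding, and convert to lists afterwards. A minimal sketch, assuming the same df as in the question; the names agg_spec and result are illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    [['Bob', '1/1/18', 'AType', 'blah', 'test', 'test2'],
     ['Bob', '1/1/18', 'AType', 'blah2', 'test', 'test3'],
     ['Bob', '1/1/18', 'BType', 'blah', 'test', 'test2']],
    columns=['NAME', 'DATE', 'TYPE', 'VALUE A', 'VALUE B', 'VALUE C'])

# Aggregate 'VALUE A' to a tuple instead of a list; the other columns
# keep the 'last' aggregation used in the question.
agg_spec = {'VALUE A': lambda x: tuple(x), 'VALUE B': 'last', 'VALUE C': 'last'}

result = (df.groupby(['NAME', 'DATE', 'TYPE'])[['VALUE A', 'VALUE B', 'VALUE C']]
            .agg(agg_spec)
            .reset_index())

# Convert each tuple to a list once the aggregation is done.
result['VALUE A'] = result['VALUE A'].map(list)
```

The conversion step costs one extra pass over the aggregated column, but it keeps every aggregation in a single agg call.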
