Applying different aggregate functions to different columns (now that dict with renaming is deprecated)

Problem description

I had asked this question before: python pandas: applying different aggregate functions to different columns but the latest changes to pandas https://github.com/pandas-dev/pandas/pull/15931 mean that what I thought was an elegant and pythonic solution is deprecated, for reasons I genuinely fail to understand.

The question was, and still is: when doing a groupby, how can I apply different aggregate functions to different fields (e.g. sum of x, avg of x, min of y, max of z, etc.) and rename the resulting fields, all in one go, or at least in a possibly pythonic and not-too-cumbersome way? I.e. sum_x won't do, I need to rename the fields explicitly.

This approach, which I liked:

df.groupby('qtr').agg({"realgdp": {"mean_gdp": "mean", "std_gdp": "std"},
                       "unemp": {"mean_unemp": "mean"}})

will be deprecated and now produces this warning:

FutureWarning: using a dict with renaming is deprecated and will be removed in a future version

Thanks!

Solution

agg() is not deprecated but renaming using agg is.

Do go through the documentation: https://pandas.pydata.org/pandas-docs/stable/whatsnew.html#deprecate-groupby-agg-with-a-dictionary-when-renaming

What is deprecated:
1. Passing a dict to a grouped/rolled/resampled Series that allowed one to rename the resulting aggregation.
2. Passing a dict-of-dicts to a grouped/rolled/resampled DataFrame.
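To make the distinction concrete, here is a minimal sketch of the form that is not deprecated: a dict mapping each column to a function name or a list of function names, with no renaming. The sample data is hypothetical; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "qtr": [1, 1, 2, 2],
    "realgdp": [100.0, 110.0, 120.0, 140.0],
    "unemp": [5.0, 6.0, 7.0, 9.0],
})

# Column -> function(s), without renaming: this form is still supported.
agg = df.groupby("qtr").agg({"realgdp": ["mean", "std"], "unemp": "mean"})

# Because one column uses a list of functions, the result carries
# two-level (column, function) labels rather than flat names.
print(agg.columns.tolist())
```

The two-level labels are why a separate flattening or renaming step is needed when specific output names are required.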

This will work, though it's not a single line of code:

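# Keep the aggregated frame; agg() returns a new object.
result = df.groupby('qtr').agg({"realgdp": ["mean", "std"], "unemp": "mean"})

# Flatten the two-level column labels, e.g. ('realgdp', 'mean') -> 'realgdp_mean'.
result.columns = result.columns.map('_'.join)

result = result.rename(columns={'realgdp_mean': 'mean_gdp', 'realgdp_std': 'std_gdp', 'unemp_mean': 'mean_unemp'})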
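For comparison, the sketch below runs the aggregate, flatten and rename steps end to end on hypothetical sample data, and then shows one possible single-call alternative. The alternative is not part of the answer above: it assumes pandas 0.25 or later, where agg() accepts keyword arguments of the form output_name=(column, function), known as named aggregation.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "qtr": [1, 1, 2, 2],
    "realgdp": [100.0, 110.0, 120.0, 140.0],
    "unemp": [5.0, 6.0, 7.0, 9.0],
})

# Aggregate, flatten the two-level labels, then rename.
flat = df.groupby("qtr").agg({"realgdp": ["mean", "std"], "unemp": "mean"})
flat.columns = flat.columns.map("_".join)
flat = flat.rename(columns={
    "realgdp_mean": "mean_gdp",
    "realgdp_std": "std_gdp",
    "unemp_mean": "mean_unemp",
})

# Assumption: pandas >= 0.25. Each keyword is the output column name,
# and each value is a (source column, aggregation function) pair.
named = df.groupby("qtr").agg(
    mean_gdp=("realgdp", "mean"),
    std_gdp=("realgdp", "std"),
    mean_unemp=("unemp", "mean"),
)

print(flat)
print(named)
```

Both frames should end up with the same three columns, so on a recent enough pandas the named form may be the closest replacement for the dict-of-dicts call in the question.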
