Multiple aggregations of the same column using pandas GroupBy.agg()
Question
Is there a pandas built-in way to apply two different aggregating functions f1, f2 to the same column df["returns"], without having to call agg() multiple times?
Example dataframe:
import datetime as dt

import numpy as np
import pandas as pd

np.random.seed(0)  # seed NumPy directly; the pd.np alias is gone from recent pandas
df = pd.DataFrame({
    "date": [dt.date(2012, x, 1) for x in range(1, 11)],
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})
The syntactically wrong, but intuitively right, way to do it would be:
# Assume `f1` and `f2` are defined for aggregating.
df.groupby("dummy").agg({"returns": f1, "returns": f2})
Obviously, Python doesn't allow duplicate keys in a dictionary literal (the second "returns" entry silently overwrites the first). Is there any other way to express the input to agg()? Perhaps a list of tuples [(column, function)] would work better, to allow multiple functions to be applied to the same column? But agg() seems to accept only a dictionary.
Is there a workaround for this besides defining an auxiliary function that just applies both functions inside it? (How would that work with aggregation anyway?)
Recommended answer
You can simply pass the functions as a list:
In [20]: df.groupby("dummy").agg({"returns": [np.mean, np.sum]})
Out[20]:
           mean       sum
dummy
1      0.036901  0.369012
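As a minimal sketch of the list form with a user-defined aggregator: the list can mix built-in aggregation names (given as strings) with your own callables, and each output column is named after the function. The helper `value_range` below is a hypothetical stand-in for the question's `f1`/`f2`, not something from the original post.

```python
import datetime as dt

import numpy as np
import pandas as pd


def value_range(x):
    """Hypothetical custom aggregator: spread between max and min of a group."""
    return x.max() - x.min()


np.random.seed(0)
df = pd.DataFrame({
    "date": [dt.date(2012, x, 1) for x in range(1, 11)],
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})

# Select the column first, then pass every aggregation in a single list.
# Strings refer to built-in aggregations; callables are named after the function.
result = df.groupby("dummy")["returns"].agg(["mean", "sum", value_range])
print(result)
```

Selecting the column before calling agg() gives flat column names (mean, sum, value_range) rather than a two-level column index.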
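Or as a dictionary (note that this nested-dictionary renaming form was deprecated in pandas 0.20 and removed in pandas 1.0, where it raises a SpecificationError):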
In [21]: df.groupby('dummy').agg({'returns':
                                  {'Mean': np.mean, 'Sum': np.sum}})
Out[21]:
        returns
           Mean       Sum
dummy
1      0.036901  0.369012
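On newer pandas versions (0.25 and later), custom output names are usually obtained with named aggregation instead of a nested dictionary. The following is a sketch under that assumption; the output column names `returns_mean` and `returns_sum` are chosen here purely for illustration.

```python
import datetime as dt

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({
    "date": [dt.date(2012, x, 1) for x in range(1, 11)],
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})

# Named aggregation: each keyword is an output column name, and each value is
# a (source column, aggregation) pair, so the same column can appear many times.
result = df.groupby("dummy").agg(
    returns_mean=("returns", "mean"),
    returns_sum=("returns", "sum"),
)
print(result)
```

Because the output names are keyword arguments rather than dictionary keys on the source column, the duplicate-key problem from the question does not arise.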